Problem
I have the following password policy:
- At least 8 characters in length
- At least one digit
- At least one special* character
*I consider any character that is not a letter, digit or space to be a special character.
I’ve come up with the following regex for enforcing this policy, and it seems to be working.
^(?=.*?[0-9])(?=.*?[^a-zA-Z0-9 ]).{8,}$
Are there any drawbacks or pitfalls with my regex?
Solution
I find the problem with regular expressions is how hard they are to read, and how quickly they become very complicated.
Instead of using a single regex to check all your conditions, I would define a class for the password and register a series of simple rules on it.
This does use static methods, which is not to everyone’s taste.
interface Rule
{
    public function check(string $value): bool;
}
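class LengthRule implements Rule
{
    protected $length;
    /**
     * LengthRule constructor.
     *
     * @param int $length Minimum number of characters required.
     */
    public function __construct(int $length)
    {
        $this->length = $length;
    }
    public function check(string $value): bool
    {
        // Compare against the configured minimum instead of a hard-coded 8.
        return mb_strlen($value) >= $this->length;
    }
}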
class DigitRule implements Rule
{
    public function check(string $value): bool
    {
        // \d matches any digit; a bare "d" would only match the letter d.
        return preg_match('/\d/', $value) === 1;
    }
}
class SpecialCharacterRule implements Rule
{
    public function check(string $value): bool
    {
        // Special means: anything that is not a letter, a digit or a space.
        return preg_match('/[^a-zA-Z\d ]/', $value) === 1;
    }
}
class Password
{
    /** @var Rule[] */
    protected static $rules = [];
    protected $value;
    /**
     * @param string $value
     *
     * @throws InvalidArgumentException if the value violates a registered rule
     */
    public function __construct(string $value)
    {
        foreach (static::$rules as $rule) {
            if (false === $rule->check($value)) {
                throw new InvalidArgumentException("The value doesn't" .
                    " match all rules.");
            }
        }
        $this->value = $value;
    }
    /**
     * @param Rule $rule
     */
    public static function registerRule(Rule $rule)
    {
        // Use static:: here as well, so reads and writes hit the same property.
        static::$rules[] = $rule;
    }
}
Password::registerRule(new LengthRule(8));
Password::registerRule(new DigitRule());
Password::registerRule(new SpecialCharacterRule());
new Password('12343.4654j');
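One thing the rule objects make easy is telling the user which rules failed, rather than throwing a single generic exception. The following is only a sketch of that idea, not part of the classes above: DescribedRule, CallbackRule and violatedRules() are hypothetical names introduced for illustration, and the messages are made up.

```php
<?php
// Hypothetical variant of the Rule interface: each rule also carries a message.
interface DescribedRule
{
    public function check(string $value): bool;
    public function message(): string;
}

// Illustrative rule that wraps any callable returning true when the value passes.
final class CallbackRule implements DescribedRule
{
    private $message;
    private $predicate;

    public function __construct(string $message, callable $predicate)
    {
        $this->message = $message;
        $this->predicate = $predicate;
    }

    public function check(string $value): bool
    {
        return (bool) ($this->predicate)($value);
    }

    public function message(): string
    {
        return $this->message;
    }
}

/**
 * Returns the messages of every violated rule (an empty array means valid).
 *
 * @param DescribedRule[] $rules
 * @return string[]
 */
function violatedRules(string $value, array $rules): array
{
    $messages = [];
    foreach ($rules as $rule) {
        if (!$rule->check($value)) {
            $messages[] = $rule->message();
        }
    }
    return $messages;
}

$rules = [
    new CallbackRule('at least 8 characters', function (string $v): bool {
        return mb_strlen($v) >= 8;
    }),
    new CallbackRule('at least one digit', function (string $v): bool {
        return preg_match('/\d/', $v) === 1;
    }),
    new CallbackRule('at least one special character', function (string $v): bool {
        return preg_match('/[^a-zA-Z\d ]/', $v) === 1;
    }),
];

// Prints one line per violated rule; prints nothing for a valid password.
foreach (violatedRules('abcdefgh1', $rules) as $message) {
    echo "Password needs $message\n";
}
```

Collecting all violations in one pass lets a form show every problem at once instead of one per submission.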
You can maintain the same level of accuracy and reduce the “step” count (improve efficiency) by replacing the dot (.) with character classes or negated character classes where appropriate. As a result of using character classes, you can use greedy quantifiers without negatively impacting accuracy.
Also, by adding the i flag at the end of your pattern, you can reduce your upper- and lowercase alphabetical character ranges to one or the other. \d is also shorter than 0-9 (albeit only by 1 character).
I would recommend this pattern:
/^(?=[^\d]*\d)(?=[A-Z\d ]*[^A-Z\d ]).{8,}$/i
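To check the "same level of accuracy" claim rather than take it on trust, a quick script can run both patterns over the same inputs. This is a minimal sketch: the recommended pattern is written with \d for the digit class, and the sample passwords are my own illustrative inputs, not the test strings from the demo.

```php
<?php
// The question's pattern and the recommended pattern.
$original    = '/^(?=.*?[0-9])(?=.*?[^a-zA-Z0-9 ]).{8,}$/';
$recommended = '/^(?=[^\d]*\d)(?=[A-Z\d ]*[^A-Z\d ]).{8,}$/i';

// Illustrative inputs mapped to the result the policy should give.
$samples = [
    '12343.4654j' => true,
    'abcdefgh'    => false, // no digit, no special character
    'abcdefg1'    => false, // no special character
    'abcdef!1'    => true,
    'a!1'         => false, // too short
    'pass word1'  => false, // a space does not count as special
    'pass word1!' => true,
];

foreach ($samples as $password => $expected) {
    $a = preg_match($original, $password) === 1;
    $b = preg_match($recommended, $password) === 1;
    printf(
        "%-12s original=%s recommended=%s expected=%s\n",
        $password,
        var_export($a, true),
        var_export($b, true),
        var_export($expected, true)
    );
}
```

Single-line inputs like these do not exercise the newline difference between the dot and a negated character class, so that case would need its own test.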
I’ve prepared a regex pattern comparison to display the efficiency gains (in terms of steps). Note that I had to slightly modify both your pattern and my recommended pattern to prevent unintended newline matching within the negated character classes. I have used the same 8 test strings as in my PHP demo.
My pattern: /^(?=[^\d\s]*\d)(?=[A-Z\d ]*[^A-Z\d\s]).{8,}$/img (83 steps)
OP’s pattern: /^(?=.*?[0-9])(?=.*?[^a-zA-Z0-9\s]).{8,}$/mg (145 steps)
Instead of using the zero-width lookaheads, you could also use three separate checks. Then you are able to report the violated rule to the user:
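function validate_password($password) {
    // Returns true when the password is valid, otherwise a message naming the violated rule.
    if (strlen($password) < 8)
        return "too short";
    if (!preg_match('/\d/', $password))
        return "must contain a digit";
    // The space is listed here because the policy does not count it as special.
    if (!preg_match('/[^A-Za-z0-9 ]/', $password))
        return "must contain a special character";
    return true;
}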
By the way, the third rule rejects autogenerated passwords from password managers whenever they consist only of letters and digits, so you should allow such passwords if the length is at least 32.
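A minimal sketch of that exemption follows. The function name validate_password_relaxed and the constant LONG_PASSWORD_LENGTH are hypothetical, and it assumes that only the special-character rule is waived for long passwords while the length and digit rules still apply.

```php
<?php
// Hypothetical threshold, taken from the suggestion above.
const LONG_PASSWORD_LENGTH = 32;

/**
 * Returns true when the password is acceptable, otherwise a message
 * naming the first violated rule.
 */
function validate_password_relaxed(string $password)
{
    $length = strlen($password); // counts bytes, like the function above

    if ($length < 8) {
        return "too short";
    }
    if (!preg_match('/\d/', $password)) {
        return "must contain a digit";
    }
    // Assumption: long passwords skip only the special-character rule.
    if ($length < LONG_PASSWORD_LENGTH
        && !preg_match('/[^A-Za-z0-9 ]/', $password)) {
        return "must contain a special character";
    }
    return true;
}

var_dump(validate_password_relaxed('abcdefg1'));             // special character missing
var_dump(validate_password_relaxed(str_repeat('aB3', 11)));  // 33 letters and digits
```

Whether the digit rule should also be waived for long passwords is a policy decision; the sketch keeps it to stay close to the wording above.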