What 10 million passwords reveal about the people who choose them


A team at WP Engine have conducted an interesting analysis of some 10 million passwords collected from various sources, such as password leaks and dumps. Virtually none of the passwords were still in use, so the researchers considered it ethical to use the dataset in their research.

The analysis highlights that people tend to choose passwords based on defined patterns and on whatever comes to mind when they are asked for a password. So it is not surprising that, among the 50 most used passwords, the most common text-based password is the word “password” itself. However, the use of patterns often makes passwords very easy to guess, especially for password-cracking software such as Hashcat, which can make up to 300,000 guesses at a password per second.

Other patterns identified included people adding their year of birth to their name to create a password. An interesting sex difference was that the word “love” appeared in women’s passwords more often than in men’s. Keyboard patterns (e.g. qwerty) also feature prominently in the passwords. These can appear random, but again they are easy to predict using software.
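To illustrate why keyboard patterns are easy for software to predict, here is a minimal sketch that flags passwords made of adjacent keys along a single row. It assumes a US QWERTY layout, and the function name `is_row_walk` and the minimum length of 4 are illustrative choices, not anything taken from the WP Engine analysis. It only catches single-row runs, so patterns that zigzag across rows would need a fuller adjacency map.

```python
# Rows of a US QWERTY keyboard (an assumption; other layouts differ).
ROWS = ("1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm")


def is_row_walk(password: str, min_len: int = 4) -> bool:
    """Return True if the password is a run of adjacent keys along one
    keyboard row, typed left-to-right or right-to-left."""
    p = password.lower()
    if len(p) < min_len:
        return False
    # A row walk is simply a substring of a row or of the reversed row.
    return any(p in row or p in row[::-1] for row in ROWS)


for candidate in ("qwerty", "123456", "poiuy", "password"):
    print(candidate, is_row_walk(candidate))
```

A cracking tool can run the same idea in reverse: rather than testing a password for the pattern, it generates every row walk up front and tries those first.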

Password Entropy

The WP Engine team highlight that the strength of a password increases with its entropy, which is a measure of the variation of the characters in the password. Entropy increases most significantly with the length of the password; however, passwords that appear to have a lot of entropy when an entropy calculation is applied may in practice have none. For example, “password” has an entropy score of 37.6 bits; however, in practice its score is zero because every word list used by password crackers includes the word “password”.
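The 37.6-bit figure is consistent with a common back-of-envelope estimate: password length multiplied by log2 of the size of the character pool (8 lowercase letters × log2(26) ≈ 37.6). The sketch below assumes that formula; the article summarised here does not spell out the exact calculation, and the function names and the tiny `COMMON` word list are hypothetical stand-ins.

```python
import math
import string


def naive_entropy_bits(password: str) -> float:
    """Estimate entropy as length * log2(pool size), where the pool is the
    union of the ASCII character classes that appear in the password."""
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)
    if pool == 0:
        return 0.0
    return len(password) * math.log2(pool)


# Hypothetical stand-in for a cracker's word list.
COMMON = {"password", "123456", "qwerty"}


def effective_entropy_bits(password: str) -> float:
    """Treat anything already on the word list as having no entropy."""
    if password.lower() in COMMON:
        return 0.0
    return naive_entropy_bits(password)


print(round(naive_entropy_bits("password"), 1))  # 37.6
print(effective_entropy_bits("password"))        # 0.0
```

The gap between the two numbers is the point of the paragraph above: the formula measures how the password looks, not how likely an attacker is to guess it.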

Interestingly, adding a number to a password will increase its entropy, but the increase may not be as significant as it initially appears. This is because both the act of adding a number and the actual number added (the most common being 1) are predictable and are therefore easily incorporated into a password-cracking program.
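A rough sketch of how a cracking program might fold this habit into its guesses: take each word on a list and append a handful of likely suffixes, most likely first. Only the observation that 1 is the most common added number comes from the article; the other suffixes, the word list and the function names are illustrative assumptions.

```python
# Illustrative suffixes; only "1" being the most common is from the article.
ILLUSTRATIVE_SUFFIXES = ("", "1", "2", "123")


def candidates(words, suffixes=ILLUSTRATIVE_SUFFIXES):
    """Yield every word with every suffix, trying likelier suffixes first."""
    for suffix in suffixes:
        for word in words:
            yield word + suffix


def guesses_needed(target, words, suffixes=ILLUSTRATIVE_SUFFIXES):
    """Return how many guesses it takes to hit the target, or None."""
    for n, guess in enumerate(candidates(words, suffixes), start=1):
        if guess == target:
            return n
    return None


words = ["password", "qwerty", "love"]
print(guesses_needed("password1", words))  # 4
print(guesses_needed("love123", words))    # 12
```

With a three-word list, the appended digit costs the attacker only a few extra guesses, which is why the real gain is much smaller than the entropy formula suggests.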

Overall, the WP Engine article is a recommended read, if only to make sure that any passwords you are using are not amongst the 50 most used passwords!


Thoughts on “What 10 million passwords reveal about the people who choose them”

  • cloudstarer

    Pick something nice and short, someone’s name, a car registration, your favourite animal, something you won’t forget.

    Then replace each letter with a word: use the NATO alphabet (A-Alpha, B-Bravo…), a child’s alphabet (A-Apple, B-Biscuit…) or something else (A-Aardvark, B-Building…). As long as you remember the sequence it’s OK; you are the only one who needs to remember it.

    So if you pick ‘Fred1’ as your password & use NATO you get

    FoxtrotRomeoEchoDeltaOne

    Using something else you could get

    FlyingRumbleExistentialDromedaryFirst

    As I say, as long as you know the substitutions the actual words aren’t vital.

    So now you’ve got a long password with mixed upper & lower case. Run through it one more time and replace specific letters with symbols. You could use 1337 speak or something else, say swapping a letter for a symbol whose shape looks like it. As long as you remember the way you swap, that’s all you need.

    So ‘fred’ becomes ‘FlyingRumbleExistentialDromedaryFirst’ which now could become

    ‘F1y1n£Rum813e*1$t3nt141Dr0m3d4ryF1r$t’

    That should be pretty secure, but you can add more rules if you want. Two passes turned a trivial 4-character password into something a supercomputer can’t crack in your lifetime.

    All you need to remember is a nice short password like ‘Fred1’ & you can generate a 30 odd character secure password using a couple of rules which only you know.

    The biggest problem I’ve found with this is a lot of web sites won’t accept really long passwords.
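The two passes described in the comment above can be sketched in a few lines. This is only an illustration under stated assumptions: the word table uses the “Alpha” spelling from the comment, the digit words and the symbol swaps are inferred from the comment’s own examples, swaps are applied to lowercase letters only, and the names `expand` and `swap_symbols` are made up for this sketch.

```python
# Pass 1 table: NATO-style words, using the "Alpha" spelling from the comment.
WORDS = (
    "Alpha Bravo Charlie Delta Echo Foxtrot Golf Hotel India Juliett Kilo "
    "Lima Mike November Oscar Papa Quebec Romeo Sierra Tango Uniform Victor "
    "Whiskey Xray Yankee Zulu"
).split()
LETTERS = {word[0].lower(): word for word in WORDS}
DIGITS = dict(zip("0123456789",
                  "Zero One Two Three Four Five Six Seven Eight Nine".split()))
TABLE = {**LETTERS, **DIGITS}

# Pass 2 table: swaps inferred from the comment's example (an assumption).
SWAPS = str.maketrans({"l": "1", "i": "1", "g": "£", "b": "8", "e": "3",
                       "x": "*", "s": "$", "a": "4", "o": "0"})


def expand(seed: str) -> str:
    """Pass 1: replace each letter or digit of the seed with a word."""
    return "".join(TABLE.get(ch.lower(), ch) for ch in seed)


def swap_symbols(text: str) -> str:
    """Pass 2: replace selected lowercase letters with look-alike symbols."""
    return text.translate(SWAPS)


long_form = expand("Fred1")
print(long_form)                # FoxtrotRomeoEchoDeltaOne
print(swap_symbols(long_form))  # F0*tr0tR0m30Ech0D31t4On3
```

Swapping in a private word table only means replacing `TABLE`; the two functions stay the same.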
