Here I would like to present some real-life statistics. Although they are based on only ~48000 samples, they should give a good view of password selection habits. Only the actual results are shown; drawing conclusions is left to the reader.
Background information:
- The source data is a list of 48595 username-password pairs, coming partly from a public discussion board (28595) and partly from a corporate network resource (20000). The users' awareness of information security is unknown, but we can assume with a great deal of certainty that their expertise spans the complete spectrum from 'casual user' to 'technically inclined'.
- We can also assume that the average age for the 20000 list is 20+ (people working in the company have most likely already been through college, the army, etc.).
- Alphanumeric and general characters are allowed. The minimum password length is 6.
- The initial password generated by the administrator is 10 characters long and consists of interleaved uppercase letters, lowercase letters and digits, e.g. UaI7VyijSt
- For passwords from the public discussion board: users whose last access date was no more than a week after their registration date were removed. This cleans the list of one-time users, who presumably chose a common, easy-to-remember combination. It should remove a large share of non-representative passwords and give us better statistics.
- Overall distribution by length (X axis - length, Y axis - distribution percentage):

- Combination match in a publicly available wordlist (~3349730 words): 5.12%
Distribution by length:
- Consists solely of numbers: 11.91%
Distribution by length:
- Top 30 most frequently occurring passwords:

- Has a numerical suffix (remaining characters are alphabetic): 19.83%
Has a numerical prefix (remaining characters are alphabetic): 2.81%
Top 30 suffixes/prefixes:
- Original passwords assigned by the server retained (under the assumption that passwords of the form UaI7VyijSt are indeed system-assigned and not user-chosen): 1.44%
- Capitalized (remaining characters are lowercase/numbers/general): 2.41%
- All letters are uppercase (remaining characters are either numbers or general): 0.19%
- Consists solely of the same repeated character (e.g. aaaaaaa, 33333333): 0.74%
- A double pattern (e.g. funkyfunky): 2.84%
- Password is a username derivative (e.g. username: vikk -> password: Zvikk007): 1.52%
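
For readers who want to run similar counts on their own list, here is a minimal Python sketch of how categories like the ones above could be computed. It is not the script used to produce these figures. The function names, the label names and the exact matching rules are assumptions made for illustration: for example, "username derivative" is approximated as a case-insensitive substring match on usernames of at least 3 characters, and "system assigned" as any 10-character alphanumeric string that mixes uppercase, lowercase and digits. The categories are allowed to overlap.

```python
import re
from collections import Counter


def classify(username, password):
    """Return the set of (hypothetical) category labels matching one password."""
    labels = set()
    letters = [c for c in password if c.isalpha()]

    if re.fullmatch(r"[0-9]+", password):
        labels.add("numbers_only")
    if re.fullmatch(r"[A-Za-z]+[0-9]+", password):
        labels.add("numeric_suffix")
    if re.fullmatch(r"[0-9]+[A-Za-z]+", password):
        labels.add("numeric_prefix")

    # Capitalized: first character uppercase, no other uppercase,
    # and at least one lowercase letter (assumed rule).
    rest = password[1:]
    if (password[:1].isupper()
            and not any(c.isupper() for c in rest)
            and any(c.islower() for c in rest)):
        labels.add("capitalized")

    # All letters uppercase; digits and other characters are ignored.
    if letters and all(c.isupper() for c in letters):
        labels.add("all_upper")

    # One repeated character, otherwise check for a doubled half.
    half = len(password) // 2
    if len(password) > 1 and len(set(password)) == 1:
        labels.add("same_char")
    elif len(password) % 2 == 0 and half > 0 and password[:half] == password[half:]:
        labels.add("double_pattern")

    # Username derivative: assumed to mean the username appears in the password.
    if len(username) >= 3 and username.lower() in password.lower():
        labels.add("username_derivative")

    # Heuristic for passwords of the form UaI7VyijSt (assumed rule).
    if (re.fullmatch(r"[A-Za-z0-9]{10}", password)
            and any(c.isupper() for c in password)
            and any(c.islower() for c in password)
            and any(c.isdigit() for c in password)):
        labels.add("system_assigned")

    return labels


def category_percentages(pairs):
    """Percentage of (username, password) pairs falling into each category."""
    counts = Counter()
    total = 0
    for username, password in pairs:
        total += 1
        counts.update(classify(username, password))
    return {label: 100.0 * n / total for label, n in counts.items()}


def length_distribution(passwords):
    """Percentage of passwords at each length, ordered by length."""
    counts = Counter(len(p) for p in passwords)
    total = sum(counts.values())
    return {length: 100.0 * n / total for length, n in sorted(counts.items())}
```

Calling `category_percentages` on a list of `(username, password)` tuples gives one percentage per label, and `length_distribution` gives the data for a length chart. Tightening or loosening any of the rules (for instance the minimum username length) will shift the numbers, so results are only comparable when the same rules are used.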
1 comment:
trustno1, donkey, unreal, samsung???? wow! some really weird passwords there