If it’s on Facebook, it must have happened. If it’s on Twitter, someone will disagree. If it’s on Wikipedia, it must be true. Then why, on Wikipedia, is there a section stating that “The Password is Dead”, but a password was just used to post this blog? Claims about the demise of the password have been popping up since 2004, but more than a decade later it is still actively used. And the only victims are the users.
I want to explore how passwords are used to the detriment of secure systems and actively lead to user error in information systems. Has this been discussed before? Of course, but until the day when we can say that the use of passwords is declining, we should continue advocating for their downfall.
Users and Passwords
Humans are lazy. In a very broad sense, this has scientifically been proven. When it comes to passwords, it becomes even more obvious. I could probably fill the pages of this post with news articles and advisories about password reuse and secure passwords. Security firm SplashData keeps track of (mostly stolen) password lists and annually releases a list of that year’s top passwords. Every year these lists show that users are blatantly ignoring any advice.
In 2016, the top 5 passwords were:
The password “password” is luckily only #2 on this list, with “1234” only at #8 on the list (As an interesting aside, #25 on this list is “starwars”.) At the end of 2017, we should probably not expect this to change much. Ultimately, this list should be shocking, albeit not surprising. But, why do people do this?
If we want to know why people use such simple passwords, we should attempt to understand the psychology behind passwords. A survey released by LastPass, a well-known password manager app, found a clear cognitive dissonance in our users.
The survey showed that 82% of respondents clearly understand that a password that consists of a “combination of letters, numbers and symbols” is more secure, yet 47% still used initials or the names of friends or family as passwords. Respondents created stronger passwords for systems they deemed to be more important: 69% had strong passwords for financial accounts, yet only 20% had strong passwords for entertainment accounts.
Finally, 91% of respondents knew there was a risk in reusing passwords, but 61% continued to do so (55% fully understanding the risk in doing so).
Why are we still using passwords if our users are blatantly brushing off their insecurity? Still, we can argue that “not all users” use easy-to-guess passwords, and that regardless of what they use, we should still protect their passwords.
Protection of Passwords
It should go without saying that passwords are not supposed to be stored as plaintext in any database. The most common best practice is to store the password as a hash digest. This is not necessarily to protect the information system but to protect users who reused passwords elsewhere.
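To make the idea of storing a digest rather than the password concrete, here is a minimal sketch in Python. It assumes a raw, unsalted SHA-256 digest purely for illustration; the function names `hash_password` and `verify_password` are hypothetical and not taken from any particular system.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the hex SHA-256 digest of a password (raw and unsalted)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def verify_password(candidate: str, stored_digest: str) -> bool:
    """Hash the candidate and compare the result with the stored digest."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(hash_password(candidate), stored_digest)


# Only the digest is written to the database, never the password itself.
stored = hash_password("correct horse")
print(verify_password("correct horse", stored))  # True
print(verify_password("wrong guess", stored))    # False
```

The database only ever sees `stored`, so a leak exposes digests that still have to be guessed rather than the passwords themselves.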
Microsoft Research had suggested that systems should stop hashing passwords and rather use (two-way) encryption. They argue that hashing has done more harm than good and that the decrypted passwords should be available for (offline) analysis in the interest of “social good”. Their goal is to study user behaviour in terms of password usage. This practice has definitely not caught on, and so we will look at passwords in their hashed form.
The cryptographic hash function was first defined in 1976, but it was MD5 (published in 1992) that really took hold as the standard for stored passwords. By 1997 MD5 was considered broken, and the recommendation moved to SHA-1 (first published in 1995) at the very least, or any algorithm from the SHA-2 set. The use of SHA-1 has since been deprecated as well (due to collisions found), and the SHA-2 and SHA-3 suites of cryptographic hashing algorithms are now suggested.
However, even though the use of MD5 and SHA-1 has been strongly discouraged, this hasn’t outright stopped their use. Legacy implementations still make use of MD5 and some newer implementations use SHA-1.
Storing the password as a hash digest is not so much to protect the information system, but the user of the password. If a malicious actor has (unauthorised) access to the database to retrieve passwords in the first place, the information system has bigger problems than a couple of leaked passwords. Although users who reuse passwords on different information systems might have a different opinion…
But, as mentioned earlier, we see that the most common passwords are still the same common passwords from a decade ago (well, except it counts to 8 instead of 6). So, hashed or not, it will probably easily be guessed.
Is there a way to put a value on passwords?
Password Crack: The Price of a Password
Perhaps if we started giving monetary value to passwords, it would resonate better with users. How snobbish would it be if your password was “worth more” than your neighbour’s? But, how can we determine the monetary value of a password? The best way would be to determine how long it would take to guess a user’s password. And, as time is money, we should be able to come up with a discernible worth of the password.
When it comes to cracking passwords, malicious actors do not go for the brute force attempt from the get-go. The past decade has seen enough large password databases leak to provide a good basis for quickly finding passwords. The best known of these password lists was the ROCKYOU list, released between 2009 and 2012. Since then, a variety of large data breaches have added to these password lists.
When calculating its worth, if the password can be found in something like the ROCKYOU password list, it should not be worth anything, as iterating through such a list is trivial. But, what if we brute force a password?
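To show why a password that appears in a leaked list is effectively worthless, here is a rough sketch of a wordlist lookup in Python. The three-entry `sample_wordlist` is a hypothetical stand-in for a real list such as ROCKYOU, and `dictionary_lookup` is an illustrative name, not part of any cracking tool.

```python
import hashlib
from typing import Iterable, Optional


def dictionary_lookup(target_digest: str, wordlist: Iterable[str],
                      algorithm: str = "md5") -> Optional[str]:
    """Return the first word whose raw digest matches the target, if any."""
    for word in wordlist:
        digest = hashlib.new(algorithm, word.encode("utf-8")).hexdigest()
        if digest == target_digest:
            return word
    return None


# Hypothetical stand-in for a leaked password list.
sample_wordlist = ["1234", "password", "starwars"]

# Pretend this raw MD5 digest came out of a stolen database.
leaked_digest = hashlib.md5(b"password").hexdigest()
print(dictionary_lookup(leaked_digest, sample_wordlist))  # password
```

Each candidate costs one hash computation, so even a list with millions of entries is exhausted long before any brute-force run would get going.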
For this exercise, we will consider three popular types of hashes (MD5, SHA-1 and SHA-256). Note that these hashes are raw hashes and not salted in any way. As we saw from the top 5 passwords, they only consisted of letters and numbers. So, we will consider a search space of 62 characters (a-z + A-Z + 0-9).
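The size of that search space is easy to work out. The sketch below assumes we count candidates of exactly a given length; `search_space` is a hypothetical helper name.

```python
import string

# a-z + A-Z + 0-9
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def search_space(length: int, alphabet_size: int = len(ALPHABET)) -> int:
    """Number of candidate passwords of exactly `length` characters."""
    return alphabet_size ** length


print(len(ALPHABET))    # 62
print(search_space(6))  # 56800235584
print(search_space(8))  # 218340105584896
```

Two extra characters multiply the number of candidates by 62 × 62 = 3844, which is why length matters so much in a brute-force attempt.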
Now we need somewhere to crack these hashes but also determine a monetary value. Amazon’s EC2 (Elastic Compute Cloud) can give a value of running an instance per hour. It has been found that cracking hashes on a GPU (as opposed to a CPU) is much faster, so we will take an instance of Amazon’s g2.2xlarge with access to high-performance NVIDIA GPUs with 1536 CUDA cores. The current cost of running a g2.2xlarge instance is $0.65 per hour (as at December 2016). That is less than a cup of coffee, per hour.
Finally, we need a tool to do the brute forcing with. The two popular tools are “John the Ripper” and “hashcat”. Since we’re fonder of cats than puns on London murderers, we will use oclHashcat (the GPU-based version of hashcat). David Um from the blog Rockfish Sec already did some benchmarking of hashcat on Amazon’s g2.2xlarge. The benchmark came down to the following:
|Hashes per Second|
Obviously, MD5 is the easiest and fastest to brute force.
We now have our platform, our hashes, and our search space. The last thing we need for our exercise is a search depth. Since the top 5 passwords were 8 characters or fewer (with the number one being 6 characters), we will use 6 and 8 characters as our search depth. With some number crunching, we calculated the following worst-case scenario costs:
|Hash|Time to Crack (6 Characters)|Cost (6 Characters)|Time to Crack (8 Characters)|Cost (8 Characters)|
|MD5|23 secs|$0.004153|1 d 42 mins 29 secs||
|SHA-1|1 min 23 secs|$0.014986|3 d 17 hrs 33 mins 36 secs|$58.214|
|SHA-256|3 mins 15 secs|$0.035208|8 d 16 hrs 15 mins 27 secs|$135.3674|
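The number crunching behind these costs can be sketched roughly as follows. The $0.65 per hour price comes from the g2.2xlarge figure above; the helper names are hypothetical, and the hash rate is left as a parameter because the benchmark figures are not reproduced here (the rate in the last line is a made-up round number).

```python
PRICE_PER_HOUR_USD = 0.65  # g2.2xlarge price quoted above (December 2016)


def to_seconds(days: int = 0, hours: int = 0,
               minutes: int = 0, seconds: int = 0) -> int:
    """Convert a duration such as '3 d 17 hrs 33 mins 36 secs' to seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def cracking_cost(duration_seconds: float,
                  price_per_hour: float = PRICE_PER_HOUR_USD) -> float:
    """Cost of running the instance for the given duration."""
    return duration_seconds / 3600 * price_per_hour


def worst_case_seconds(search_space: int, hashes_per_second: float) -> float:
    """Time to try every candidate at a given (assumed) hash rate."""
    return search_space / hashes_per_second


# Reproduce two cells from the table above.
sha1_8 = to_seconds(days=3, hours=17, minutes=33, seconds=36)
sha256_8 = to_seconds(days=8, hours=16, minutes=15, seconds=27)
print(round(cracking_cost(sha1_8), 3))    # 58.214
print(round(cracking_cost(sha256_8), 4))  # 135.3674

# Hypothetical rate of one billion hashes per second, for illustration only.
print(round(worst_case_seconds(62 ** 6, 1_000_000_000), 1))  # 56.8
```

These are worst-case figures: on average a brute-force run finds the password after searching about half the space, so the expected cost is roughly half of what the table shows.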
Great! We now have a monetary value for our users’ passwords. A 6-character word hashed with MD5 will cost less than a cent to crack, whereas an 8-character word hashed with SHA-256 will cost roughly $135.
This should have come as no surprise and this is why we advocate for more complex passwords. There’s a great rant in that too, but I’ll leave that for next time.