It's actually reasonable to store passwords in a database in plaintext, if your threat model excludes the possibility that an attacker could gain access to your database.
However, the reason for hashing and salting passwords is to account for the scenario in which an attacker gains a copy of the database or read-only access to it. Have I Been Pwned lists many site breaches that have resulted in plaintext passwords being shared or sold, many of which were due to unprotected databases and backups, and poor access controls.
With a site user's password, an attacker can easily:
- Gain access to the user's account on the site
- Gain access to the user's accounts on other sites
Note that the first objective is also achievable if the attacker has write access to the accounts database.
A simple defence against exposure of the user's password is to apply an irreversible (one-way) operation to the password before storage. A hash function is such an operation. Hash functions produce a digest from data: `hash(data) -> digest`. In the user database, the digest (`hash(password)`) is stored instead of the password.
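As a minimal sketch of storing a digest instead of the password, the snippet below uses SHA-256 from Python's standard library. The `user_db` dictionary and the `check_login` helper are hypothetical names used only for illustration, and a bare unsalted hash like this is shown only to make the idea concrete, not as a recommended scheme.

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Return the hex SHA-256 digest of a password (unsalted, illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical in-memory "user database": username -> stored digest.
user_db = {"alice": hash_password("hunter2")}

def check_login(username: str, password: str) -> bool:
    """Hash the submitted password and compare it with the stored digest."""
    stored = user_db.get(username)
    if stored is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(stored, hash_password(password))
```

The database never holds the password itself, only a value from which the password cannot be directly derived.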
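The properties of a strong cryptographic hash function prevent deriving the password directly from `hash(password)`. However, if the set of candidate passwords is either known or small enough, each candidate can be hashed and compared against the stored digest to recover the password. Because most hash functions are really fast, it takes a very short time to compute `hash(password)` for each guess.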
A list of known or small passwords is called a "dictionary". An attacker would use a dictionary as a list of possible passwords to determine if one of them produces the same hash.
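These dictionaries can be turned into "rainbow tables", which are lists of precomputed `password -> hash(password)` associations. (Strictly speaking, a rainbow table is a more compact structure built from hash chains, but the effect is the same.) If the digest you are looking for is in the table, you can easily obtain the password.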
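The precomputation idea can be illustrated with a plain lookup table. This is a simplification: an actual rainbow table stores hash chains to save space, whereas the sketch below simply maps every digest to its password. The passwords are made-up examples.

```python
import hashlib

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def build_lookup_table(dictionary: list[str]) -> dict[str, str]:
    """Precompute digest -> password once, so later lookups cost almost nothing."""
    return {sha256_hex(password): password for password in dictionary}

table = build_lookup_table(["123456", "password", "qwerty"])

# Digests as they might appear in a leaked, unsalted database (hypothetical).
leaked = [sha256_hex("qwerty"), sha256_hex("a-password-not-in-the-dictionary")]
cracked = [table.get(digest) for digest in leaked]
```

The table is built once and can be reused against every unsalted database that uses the same hash function.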
To make the use of "rainbow tables" ineffective, a salt can be added. A salt is a random value associated with each password. The password would be stored as `hash(salt, password)`, with the salt stored alongside it. A specialised salting scheme like Bcrypt should be used rather than a plain fast hash. Salting schemes make the time it takes to perform `salting_scheme(salt, password)` significant, which slows down every guess an attacker makes.
Since the salt is a random value, it will be different every time the password is set or changed, and different for every website. This prevents the use of "rainbow tables" because it is not known in advance what the salt is.
A good salting scheme takes a significant, non-trivial amount of time to compute. Bcrypt goes further by making this computation time adaptive: the cost is chosen when the password is set, so it can be raised as hardware gets faster.
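The adaptive cost can be sketched by storing the work factor next to the salt and digest, in the same spirit as Bcrypt embedding its cost in the stored string. The record format `iterations$salt$digest`, the helper names and the iteration counts are all assumptions for this example, and PBKDF2 again stands in for Bcrypt.

```python
import hashlib
import hmac
import os

def make_record(password: str, iterations: int) -> str:
    """Build a self-describing record: the cost travels with the digest."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"

def verify(password: str, record: str) -> bool:
    """Verify using the cost and salt stored in the record itself."""
    iterations, salt_hex, digest_hex = record.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(digest.hex(), digest_hex)

def needs_rehash(record: str, current_iterations: int) -> bool:
    """True when the record was created with a lower cost than the current policy."""
    return int(record.split("$")[0]) < current_iterations

old_record = make_record("letmein", iterations=50_000)
```

After a successful login, a record for which `needs_rehash` returns true could be replaced with one created at the current cost, since that is the one moment the server sees the plaintext password.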
Handling credentials and code
Databases that store salted passwords still need to be guarded as carefully as those that store plaintext passwords.
So should your application, and whatever code runs on clients. An attacker with means of injecting code into the client (for example, through XSS) or server can easily steal plaintext passwords.
It might be tempting to hash or salt passwords on the client instead of on the server. However, this turns the resulting digest into merely a plaintext password: whoever obtains it can log in with it, and thus the same flaws apply. NT hashes in NTLM suffered from a similar problem, where the hash became functionally equivalent to the plaintext password. (Implementing a challenge-response authentication protocol on top of HTTPS brings no benefit.)
Note that it's also possible to perform dictionary attacks against an online system rather than against an (offline) salted or hashed password. However, the time and memory it takes to make such a quantity of network requests is significantly higher than it takes to do the computation offline. Additionally, most sites enforce strict rate limits on login attempts, and it is important to implement such limits yourself.
If your password is a commonly used or short password (therefore having too little entropy), an attacker can easily perform a dictionary attack. With only hashing, it would take little to no time to derive the password. With salting, it would take a larger amount of time.
If your password is unique, an attacker might take significantly more time to guess it, as it may not exist in a dictionary. With only hashing, it would take a large amount of time to derive the password. With salting, it would take considerably longer still.
A long password is not necessarily harder to guess. A smart attacker could make intelligent brute force attacks on a password such as "password-very-secure-wow" or "buritos2000arethebest", especially if the attacker knows some information about the user. It's less likely but possible.
A password should not be used when it is either previously discovered in a data breach or used by many people. Additionally, passwords that are merely variations of common "concepts" should be avoided. Common concepts are likely to be chosen by anyone.
`P@5sW0rD!` is an example of a variation of a common concept, in this case "password". Many people enter passwords every day, so don't use "password". Many people like movies, so don't use "movielover".
A good password would involve multiple "concepts", with a good amount of variation. Mix things that you can remember with things that you can't. It might even work to trade off some randomness for a longer but memorable password.
"correct horse battery staple" is a good idea, but you should pick the words randomly from a dictionary and use more words. You could even make use of poetry to increase memorability.
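Picking words at random can be sketched with Python's `secrets` module. The eight-word `SAMPLE_WORDS` list is a placeholder for illustration only; a real word list would hold thousands of entries (the EFF long word list, for instance, has 7776). Since words are drawn independently, the entropy is the number of words times log2 of the list size.

```python
import math
import secrets

def generate_passphrase(wordlist: list[str], num_words: int = 6, separator: str = " ") -> str:
    """Join num_words words, each drawn uniformly at random from wordlist."""
    return separator.join(secrets.choice(wordlist) for _ in range(num_words))

def passphrase_entropy_bits(num_words: int, wordlist_size: int) -> float:
    """Entropy in bits of a passphrase of independently, uniformly chosen words."""
    return num_words * math.log2(wordlist_size)

# Placeholder list; far too small for real use.
SAMPLE_WORDS = ["correct", "horse", "battery", "staple",
                "window", "orange", "marble", "cactus"]
passphrase = generate_passphrase(SAMPLE_WORDS, num_words=6)
```

With a 7776-word list, six words give roughly 77.5 bits of entropy, while the tiny sample list above gives only 18 bits.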
Or if you prefer, use a reputable password manager like LastPass or 1Password, which allows you to use fully random passwords different for every site, without needing to remember all of them.
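For reference, a fully random password of the kind a password manager generates can be sketched in a few lines. The alphabet and the default length of 24 are assumptions for this example, not settings taken from any particular product.

```python
import secrets
import string

# Assumed alphabet: ASCII letters, digits and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length: int = 24) -> str:
    """Return a password whose characters are drawn uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

site_password = generate_password()
```

The `secrets` module is used rather than `random` because it draws from a cryptographically secure source.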
To those implementing password handling in an application and those managing password requirements, I really hope you can follow the 2017 NIST guidelines (SP 800-63B) and:
- Allow passwords of at least 64 characters. Don't impose a short maximum length.
- Don't enforce composition rules. Allow passwords to have no symbols or numbers.
- Allow pasting of passwords. Some people use password managers.
- If you can, provide a password strength meter.
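The first two points can be sketched as a validation function that checks only length and a denylist, with no composition rules. The constants `MIN_LENGTH` and `MAX_LENGTH` and the function name are assumptions for this example; the upper bound only needs to be comfortably above 64.

```python
MIN_LENGTH = 8     # assumed minimum length
MAX_LENGTH = 256   # assumed generous upper bound, well above 64

def validate_new_password(password: str, denylist: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the password is accepted.

    Deliberately no rules about symbols, digits or mixed case.
    """
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append(f"must be at least {MIN_LENGTH} characters")
    if len(password) > MAX_LENGTH:
        problems.append(f"must be at most {MAX_LENGTH} characters")
    if password.lower() in denylist:
        problems.append("is too common")
    return problems

# Hypothetical denylist of common passwords, stored in lowercase.
COMMON = {"password", "movielover", "12345678"}
```

A long all-lowercase passphrase passes this check, while a short or commonly used password is rejected with a reason that can be shown to the user.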
And for the best results, test the passwords users enter against a dictionary or the Pwned Passwords list. I really like the latter, a mechanism provided by Have I Been Pwned.
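A check against Pwned Passwords can be sketched using its range endpoint: only the first five hex characters of the password's SHA-1 digest are sent, and the response lists matching suffixes with counts in `SUFFIX:COUNT` form. The function names, timeout and User-Agent string below are assumptions for this example, and error handling is left out.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"

def split_sha1(password: str) -> tuple[str, str]:
    """Return the (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]

def count_from_range_body(suffix: str, body: str) -> int:
    """Find suffix in a 'SUFFIX:COUNT' response body; return 0 when it is absent."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0

def pwned_count(password: str) -> int:
    """Query the range endpoint; the full digest never leaves this machine."""
    prefix, suffix = split_sha1(password)
    request = urllib.request.Request(
        RANGE_URL + prefix, headers={"User-Agent": "pwned-check-example"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_from_range_body(suffix, body)
```

A non-zero result from `pwned_count` means the password has appeared in a known breach and should be refused.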