Griff’s Grumblings

Writing about things that interest me.

Hashing: Securely Storing Passwords

Cryptographic hash functions, or hashes for short, take an input of a variable length and produce a unique, fixed size output. Hashes are used in modern computing in a variety of ways, including password verification, file integrity validation, and file identification. Hash functions are easy to compute based on the input, but are difficult to re-compute given only the output. Because of this property, hashes are great for storing sensitive information, like passwords, in a database.

Basic Example - MD5 Hash

Let’s take a look at a quick example so you can get a basic understanding of how hashes work. My primary language of choice is Ruby, so all of the examples will use Ruby code.

Hashing a password with MD5
1
2
3
4
5
require "openssl"
 => true

@password = OpenSSL::Digest::MD5.hexdigest("password")
 => "5f4dcc3b5aa765d61d8327deb882cf99"

In this example, the input (referred to as the message), is “password.” This message is run through the MD5 hashing algorithm and returns a string of characters called the digest, in this case “5f4dcc3b5aa765d61d8327deb882cf99”. The digest is what you store in the database so that if your database is compromised, the attacker could not easily obtain the password from the digest alone.

Testing a Valid Password
1
2
@password == OpenSSL::Digest::MD5.hexdigest("password")
 => true

Now, we make a second call to the system checking whether the hash of password matches the stored value of @password. This returns true and would authenticate the user.

Testing an Invalid Password
1
2
@password == OpenSSL::Digest::MD5.hexdigest("idunno")
 => false

Here, we make a call to the system checking whether the hash of “idunno” matches @password. In this case, the program returns false, and would not authenticate the user.

Cracking Hashed Passwords

The MD5 algorithm, as we have implemented it above, is better than simply storing “password” directly in to the database, but is generally not accepted as a secure method of storing passwords. To understand why, let’s take a look at how attackers might try to crack your password hashes.

The simplest way to crack a password hash is to simply try to guess the password. If you guess the password correctly, you will obtain the same hash and gain access to the system. To automate this process, attackers use a variety of tools. Two of the most common tools are called Lookup Tables and Rainbow Tables. Lookup Tables take a file of common words and phrases used in passwords and precompute the hash for each input. Thus, your table would look something like this:

Lookup Table

Input Hash
password 5f4dcc3b5aa765d61d8327deb882cf99
monkey d0763edaa9d9bd2a9516280e9044d885
etc abcdef0123456789abcdef0123456789

Rainbow tables take a similar approach, however, instead of using a precompiled dictionary they will try to compute all the hashes for a subset of characters. For example, an 8 character password consisting of all numeric characters.

Rainbow Table

Input Hash
00000000 dd4b21e9ef71e1291183a46b913ae6f2
00000001 ced165163e51e06e01dc44c35fea3eaf
etc abcdef0123456789abcdef0123456789

Strengthening our Password Algorithm

By now, you are probably feeling a bit discouraged about storing passwords securely. Luckily, cryptographers have designed a special type of hash function called a key derivation function that make Lookup and Rainbow Tables more difficult for attackers. We are going to talk about three of those hashes. They are called PBKDF2, BCrypt, and SCrypt.

Compared to the basic MD5 hash above, key derivation functions add salts and iterations to the process. A salt is a random string of data that is used as additional input into the function. Each time a new password is stored in the database, a random salt is generated and stored along with the password. This additional field requires the attacker to generate a custom lookup or rainbow table for each individual password! Each of these functions also allow you to configure the number of hashing iterations you go through before finalizing the value. Doing this one time to store or lookup the user’s password doesn’t affect your system much, but if your attacker is trying to generate Lookup or Rainbow Tables for all of the possible solutions, it adds a significant amount of time to their process.

PBKDF2 Algorithm

First, let’s take a look at the PBKDF2 key derivation function. Here, we need to set a few variables. Notably, PBKDF2 takes inputs of the password, salt, number of iterations, the digest, and digest length to create the output.

Set Variables for use with PBKDF2
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
require "securerandom"
 => true

@password = "password"
 => "password"

@salt = SecureRandom.hex
 => "ca8ec945aae7fac402c8012cbf78b616"

@iterations = 10000
 => 10000

@digest = OpenSSL::Digest::SHA256.new
 => #<OpenSSL::Digest::SHA256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855> 

@key_length = digest.digest_length
 => 32

Now, let’s pass these variables into the PBKDF2 algorithm.

Generating the PBKDF2 Digest
1
2
@password = OpenSSL::PKCS5.pbkdf2_hmac(@password, @salt, @iterations, @key_length, @digest).unpack("H*")
 => ["43f1fab16632a21fdcd033d0a2c29835154d4827e21bb893d66eafc7a33c61fc"]

To make this a little easier, there is a ruby implementation of PBKDF2 that you can use. First install the ruby gem ‘pbkdf2.’

Install SCrypt Gem
1
gem install pbkdf2

Now, in the ruby shell, require the PBKDF2 library.

Requiring the Ruby Implementation of PBKDF2
1
2
3
4
5
require 'securerandom'
=> true

require 'pbkdf2'
=> true

If you are using Ruby 1.9 or newer, you need to add the following code in your console.

1.9 Fix for PBKDF2
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
class String
  if RUBY_VERSION >= "1.9"
    def xor_impl(other)
      result = "".encode("ASCII-8BIT")
      o_bytes = other.bytes.to_a
      bytes.each_with_index do |c, i|
        result << (c ^ o_bytes[i])
      end
      result
    end
  else
    def xor_impl(other)
      result = (0..self.length-1).collect { |i| self[i] ^ other[i] }
      result.pack("C*")
    end
  end

  private :xor_impl

  def ^(other)
    raise ArgumentError, "Can't bitwise-XOR a String with a non-String" \
      unless other.kind_of? String
    raise ArgumentError, "Can't bitwise-XOR strings of different length" \
      unless self.length == other.length

    xor_impl(other)
  end
end

In this implementation, you do not have to create the digest and digest length yourself. It is done behind the scenes.

Generating the PBKDF2 Digest with the PBKDF2 Class
1
2
3
4
5
@password_digest = PBKDF2.new(:password=>@password, :salt=>@salt, :iterations=>@iterations)
 => #<PBKDF2:0x00000001717c20 @hash_function=#<OpenSSL::Digest::Digest: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855>, @value=nil, @password="password", @salt="ca8ec945aae7fac402c8012cbf78b616", @iterations=10000, @key_length=32>

@password_digest = @password_digest.hex_string
 => "43f1fab16632a21fdcd033d0a2c29835154d4827e21bb893d66eafc7a33c61fc"

To test password success and failure, we need to know both the password and the salt.

Testing the Password with PBKDF2
1
2
3
4
5
@password_digest == PBKDF2.new(:password=>"password", :salt=>"ca8ec945aae7fac402c8012cbf78b616", :iterations=>@iterations).hex_string
 => true

@password_digest == PBKDF2.new(:password=>"idunno", :salt=>"ca8ec945aae7fac402c8012cbf78b616", :iterations=>@iterations).hex_string
 => false

BCrypt Algorithm

BCrypt is a key derivation function that adds a layer of complexity to the computations from PBKDF2. The bcrypt-ruby library is widely used in Ruby on Rails applications as the default secure password library. It handles the salts for you so that you only have to store one string in your database. Let’s take a look at how it is used in practice.

Hashing a password with BCrypt
1
2
3
4
5
6
7
8
9
10
require 'bcrypt'

@password = BCrypt::Password.create("my password")
  => "$2a$10$qRVVULjD1mnTCR5n3WefsuEaODLX5gG1xl5XLowpcpYj.jBVW3DjC"

@password == "my password"
 => true

@password == "idunno"
 => false

Here, you can see that the string output from creating the password is divided up by the ‘$’. 2a is the version. 10 is the cost. The next 22 characters (“qRVVULjD1mnTCR5n3Wefsu”) is the salt and the remainder (“EaODLX5gG1xl5XLowpcpYj.jBVW3DjC”) is the digest.

SCrypt Algorithm

So you think you’ve got it solved? Many people believe that using BCrypt is sufficient for storing passwords, however, the designers of SCrypt have upped the ante.

The designers of scrypt estimate that “on modern (2009) hardware, if 5 seconds are spent computing a derived key, the cost of a hardware brute-force attack against scrypt is roughly 4000 times greater than the cost of a similar attack against bcrypt (to find the same password), and 20000 times greater than a similar attack against PBKDF2.

Install SCrypt Gem
1
gem install scrypt

Now, lets load the SCrypt gem and see what it can do. Here, we create a password hash using the SCrypt algorithm and subsequently check whether our string matches the hash.

Hashing a password with SCrypt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
require "scrypt"

password = SCrypt::Password.create("my password")
 => "400$8$17$62ef0843665ee94c$9eed5dfe8ea6c9938af8f75c1bb0d07e8836fc457a8c1fb22bbbadecf152e73e"

password == "my password"
 => true

password == "a random guess"
 => false

password.cost
 => "400$8$30$"

password.salt
 => "451300823c6c72ee"

password.digest
 => "28b4fb7ce75b0caa1b2da67f4e3c7d76ded40cb93bb10cc087f6c0afb25d6a0c"

For comparison, the chart below compares the time to crack hashes generated from a variety of common algorithms:

[Update] Thanks to @gilbes for pointing out that this table is based on 2009 hardware. When you read through this table, please take into consideration that this is based on 5 year old hardware.

KDF comparison

Comments