This is an attempt to understand the basics of cryptography. The very basics 🙂
Beware: this is a link-intensive post, meant (like many others) to serve me as a reference.
It started with the recent iCloud privacy problems, then the article about hashing of secrets intrigued me a bit and made me curious to read more about this field. So here it is.
Hashing vs Encrypting vs Encoding
Hashing – irreversible; used to check the integrity of data, to store secrets such as passwords in a form that cannot be reversed, and (combined with a secret key, as in HMAC) to authenticate data.
Encrypting – reversible; used for maintaining data confidentiality
Encoding – reversible; used for usability (e.g. Base64)
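As a rough illustration of the difference, here is a minimal Python sketch (Python is my choice for the examples, not something the post's tools require) using only the standard library. It covers encoding and hashing; encryption is left out because the standard library ships no cipher. The sample message is made up.

```python
import base64
import hashlib

message = b"attack at dawn"  # made-up sample input

# Encoding: anyone can reverse it, no key is involved.
encoded = base64.b64encode(message)
decoded = base64.b64decode(encoded)

# Hashing: one-way. hashlib offers no inverse of this operation.
digest = hashlib.sha256(message).hexdigest()

print(encoded)             # b'YXR0YWNrIGF0IGRhd24='
print(decoded == message)  # True
print(len(digest))         # 64 (hex characters, i.e. 32 bytes)
```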
Update 16/Dec/2014: There is a small debate about whether applying ROT13 to a string counts as encryption. ROT13 is a very simple substitution cipher: a Caesar shift (one of the 26 possible shifts, counting the trivial shift by 0) that replaces each letter with the one placed 13 positions further in the alphabet, wrapping around at the end.
I would say that ROT13 is a form of encryption; true, a very very weak one. But it has an algorithm (substitution of letters) and a key (13 positions). So in theory it encodes a message so that only authorized parties can read it. In practice, almost anyone with a basic motivation can read it.
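To make the "algorithm plus key" view concrete, here is a small Python sketch. The `caesar` helper is a hypothetical name of mine; it treats the shift as the key, so ROT13 is simply `shift=13`. The last lines show why the cipher is so weak: trying every possible key takes 26 attempts.

```python
import codecs

def caesar(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions, wrapping around the alphabet."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)  # digits, spaces and punctuation pass through unchanged
    return "".join(out)

secret = caesar("Hello, World!", 13)
print(secret)              # Uryyb, Jbeyq!
print(caesar(secret, 13))  # Hello, World!  (13 + 13 = 26, a full turn)

# Sanity check against the ROT13 codec that ships with Python.
print(secret == codecs.encode("Hello, World!", "rot_13"))  # True

# Brute force: the whole key space is 26 shifts.
candidates = [caesar(secret, key) for key in range(26)]
print("Hello, World!" in candidates)  # True
```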
Hashing vs HMAC vs KDF
1. Hashing algorithms
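A hashing algorithm converts a variable-length string to a fixed-length string that can act as a “fingerprint” for the original string. The fingerprint is not guaranteed to be unique (different inputs can collide), but for a good cryptographic hash finding such a collision is computationally infeasible. It is likewise infeasible to convert the hash result back to the source string.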
ColdFusion: Hash(string [, algorithm [, encoding ]])
PHP: string hash ( string $algo, string $data [, bool $raw_output = false ] )
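The fixed-length property is easy to check. The sketch below uses Python's `hashlib`; `fingerprint` is a hypothetical wrapper of mine that loosely mirrors the "string plus algorithm name" shape of the signatures above.

```python
import hashlib

def fingerprint(text: str, algorithm: str = "sha256") -> str:
    """Return the hex digest of `text` under the named hash algorithm."""
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()

# Inputs of very different lengths all map to digests of the same length.
for text in ["", "a", "a" * 10_000]:
    print(len(text), len(fingerprint(text)), len(fingerprint(text, "sha1")))
# 0 64 40
# 1 64 40
# 10000 64 40

# A one-character change in the input gives a different digest.
print(fingerprint("hello") == fingerprint("hellp"))  # False
```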
2. HMAC (Hash-Based Message Authentication Codes)
HMAC is used to verify both the integrity and the authenticity of a transmitted message. It combines a cryptographic hash function with a secret key.
According to the official specifications, HMAC is defined as:
H(K XOR opad, H(K XOR ipad, text))
H is a cryptographic hash function where data is hashed by iterating a basic compression function on blocks of data
B is the byte-length of such blocks (B=64 for MD5, SHA-1)
L is the byte-length of hash outputs (L=16 for MD5, L=20 for SHA-1)
K is the authentication key and can be of any length up to B, the block length of the hash function.
Applications that use keys longer than B bytes will first hash the key using H and then use the resultant L-byte string as the actual key to HMAC. In any case, the minimal recommended length for K is L bytes (the hash output length). »» This is an interesting fact that leads to potential problems (a key longer than B bytes and the hash of that key produce the same HMAC), but it does not make PBKDF2-HMAC-SHA1 insecure.
ipad, opad (inner/outer pad) are two fixed and different strings defined as
ipad = the byte 0x36 repeated B times
opad = the byte 0x5C repeated B times.
0x5C? “Their values have been arbitrarily chosen by the HMAC designers, and any pair (opad,ipad) could have been selected, as long as opad≠ipad.”
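The definition above can be followed almost line by line in code. This is a minimal Python sketch, assuming SHA-1 as H; `hmac_by_hand` is a hypothetical name, and the result is checked against Python's own `hmac` module rather than trusted on its own. The second check illustrates the long-key rule: a key longer than B bytes and its hash yield the same MAC.

```python
import hashlib
import hmac

def hmac_by_hand(key: bytes, text: bytes, hash_name: str = "sha1") -> bytes:
    """Compute H(K XOR opad, H(K XOR ipad, text)) step by step."""
    def H(data: bytes) -> bytes:
        return hashlib.new(hash_name, data).digest()

    B = hashlib.new(hash_name).block_size  # 64 for MD5 and SHA-1
    if len(key) > B:
        key = H(key)                       # long keys are hashed first
    key = key.ljust(B, b"\x00")            # then padded with zeros up to B bytes

    k_ipad = bytes(b ^ 0x36 for b in key)  # K XOR ipad
    k_opad = bytes(b ^ 0x5C for b in key)  # K XOR opad
    return H(k_opad + H(k_ipad + text))

key, text = b"Jefe", b"what do ya want for nothing?"
print(hmac_by_hand(key, text) == hmac.new(key, text, "sha1").digest())  # True

# A key longer than B bytes behaves exactly like its own hash.
long_key = b"k" * 100
short_key = hashlib.sha1(long_key).digest()
print(hmac_by_hand(long_key, text) == hmac_by_hand(short_key, text))    # True
```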
PHP: string hash_hmac ( string $algo , string $data , string $key [, bool $raw_output = false ] )
In ColdFusion, the hmac() function has been available since ColdFusion 10, while in the Open Source world Railo introduced it in version 4 (see cfml.io)
hmac(object message,object key,[string algorithm,[string encoding]]):string
Custom implementations of the function: here, here and here
3. Password-based Key Derivation Function (PBKDF)
A key derivation function (or KDF) derives the encryption key from a master password. Specifications
PBKDF2 applies HMAC to the input password along with a salt value and repeats the process many times to produce a derived key, which can then be used as a cryptographic key in subsequent operations. The added computational work makes password cracking much more difficult, and is known as key stretching. When the standard was written in 2000, the recommended minimum number of iterations was 1000, but the parameter is intended to be increased over time as CPU speeds increase.
Having a salt added to the password reduces the ability to use precomputed hashes (rainbow tables) for attacks, and means that multiple passwords have to be tested individually, not all at once. The standard recommends a salt length of at least 64 bits.
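For comparison, here is roughly what salting and key stretching look like with Python's built-in `hashlib.pbkdf2_hmac`. This is a sketch under assumptions: `derive_key` is a hypothetical helper, and the iteration count and the 128-bit salt are example values I picked, not recommendations from the standard.

```python
import hashlib
import hmac
import os

def derive_key(password: str, salt: bytes,
               iterations: int = 100_000, length: int = 32) -> bytes:
    """PBKDF2 using HMAC-SHA1 as the underlying pseudorandom function."""
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"),
                               salt, iterations, dklen=length)

salt = os.urandom(16)  # random 128-bit salt, stored alongside the derived key
stored = derive_key("correct horse", salt)

# Verification: derive again with the same salt, compare in constant time.
print(hmac.compare_digest(stored, derive_key("correct horse", salt)))  # True
print(hmac.compare_digest(stored, derive_key("wrong guess", salt)))    # False

# The same password under a fresh salt gives an unrelated key,
# so a table precomputed for one salt is useless for another.
print(stored == derive_key("correct horse", os.urandom(16)))           # False
```

Raising `iterations` makes every guess proportionally more expensive for an attacker, at the cost of making each legitimate derivation slower too.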
In ColdFusion the PBKDF support was introduced very recently (April 2014) – with ColdFusion 11:
GeneratePBKDFKey(algorithm, inputString, salt, iterations, keysize) (algorithm can be ‘PBKDF2WithHmacSHA1’)
Same story with PHP, which only supports PBKDF2 starting with version 5.5.0:
string hash_pbkdf2 ( string $algo , string $password , string $salt , int $iterations [, int $length = 0 [, bool $raw_output = false ]] )