Some security basics

Written on 24 September 2014, 04:33pm

Tagged with: ,

This is an attempt to understand the basics of cryptography. The very basics 🙂
Beware of a link-intensive post, it is meant (as many others) to serve me as a reference.
It started with the recent iCloud privacy problems, then the article about hashing of secrets intrigued me a bit and made me curious to read more about this field. So here it is.

Hashing vs Encrypting vs Encoding

Hashing – irreversible; used to check integrity of data, to irreversibly encode data (passwords) and also to sign data (in conjunction with HMAC).
Encrypting – reversible; used for maintaining data confidentiality
Encoding – reversible, for usability (ex Base64Encode) #

Update 16/Dec/2014: There is a small debate whether applying ROT13 to a string is considered encryption or not. ROT13 is a very simple substitution cipher (one of the 26 possible ones) – which substitutes each letter by another one placed 13 positions further in the alphabet.
I would say that ROT13 is a form of encryption; true, a very very weak one. But it has an algorithm (substitution of letters) and a key (13 positions). So in theory it encodes a message so that only authorized parties can read it. In practice, almost anyone with a basic motivation can read it.

Hashing vs HMAC vs KDF

1. Hashing algorithms

A hashing algorithm converts a variable-length string to a fixed-length string that can act as a “fingerprint” or unique identifier for the original string. It is not possible to convert the hash result back to the source string.
In ColdFusion:
Hash(string [, algorithm [, encoding ]])
In PHP:
string hash ( string $algo, string $data [, bool $raw_output = false ] )

2. HMAC (Hash-Based Message Authentication Codes)

HMAC is used to verify the data integrity and authenticity of a message transmitted. It involves a cryptographic hash function in combination with a secret key.

According to the official specifications, HMAC is defined as:
H(K XOR opad, H(K XOR ipad, text))
where:
H is a cryptographic hash function where data is hashed by iterating a basic compression function on blocks of data
B is the byte-length of such blocks (B=64 for MD5, SHA-1)
L is the byte-length of hash outputs (L=16 for MD5, L=20 for SHA-1)
K is the authentication key and can be of any length up to B, the block length of the hash function.
Applications that use keys longer than B bytes will first hash the key using H and then use the resultant L byte string as the actual key to HMAC. In any case the minimal recommended length for K is L bytes (as the hash output length). »» this is an interesting fact leading to potential problems, but it does not make pbkdf-hmac-sha1 unsecure
ipad, opad (inner/outer pad) are two fixed and different strings defined as
ipad = the byte 0x36 repeated B times
opad = the byte 0x5C repeated B times.
Why 0x36 and 0x5C? “Their values have been arbitrarily chosen by the HMAC designers, and any pair (opad,ipad) could have been selected, as long as opad≠ipad. #

In PHP:
string hash_hmac ( string $algo , string $data , string $key [, bool $raw_output = false ] )

In ColdFusion, the hmac() function exists starting ColdFusion 10, while in the Open Source world Railo had introduced it with version 4 (see cfml.io)
hmac(object message,object key,[string algorithm,[string encoding]]):string
Custom implementations of the function: here, here and here

3. Password-based Key Derivation Function (PBKDF)

A key derivation function (or KDF) derives the encryption key from a master password. Specifications

PBKDF2 applies HMAC to the input password along with a salt value and repeats the process many times to produce a derived key, which can then be used as a cryptographic key in subsequent operations. The added computational work makes password cracking much more difficult, and is known as key stretching. When the standard was written in 2000, the recommended minimum number of iterations was 1000, but the parameter is intended to be increased over time as CPU speeds increase.
Having a salt added to the password reduces the ability to use precomputed hashes (rainbow tables) for attacks, and means that multiple passwords have to be tested individually, not all at once. The standard recommends a salt length of at least 64 bits.
http://en.wikipedia.org/wiki/PBKDF2

In ColdFusion the PBKDF support was introduced very recently (April 2014) – with ColdFusion 11:
GeneratePBKDFKey(algorithm, inputString, salt, iterations, keysize) (algorithm can be ‘PBKDF2WithHmacSHA1’)

Same story with PHP, only supporting PBKDF starting version 5.5.0:
string hash_pbkdf2 ( string $algo , string $password , string $salt , int $iterations [, int $length = 0 [, bool $raw_output = false ]] )

Use cases

(more…)

Algorithm complexity – easy understanding

Written on 23 May 2013, 05:02pm

Tagged with:

O(1) – Time to complete is the same regardless of the size of input set. An example is accessing an array element by index.
O(Log N) – Time to complete increases roughly in line with the log2(n). binary search / divide and conquer
O(N) – Time to complete that scales linearly with the size of the input set. In other words if you double the number of items in the input set, the algorithm takes roughly twice as long. An example is counting the number of items in a linked list.
O(N Log N) – Time to complete increases by the number of items times the result of Log2(N). An example of this is heap sort and quick sort.
O(N^2) – Time to complete is roughly equal to the square of the number of items. An example of this is bubble sort.
O(N!) – Time to complete is the factorial of the input set. An example of this is the traveling salesman problem brute-force solution.

http://stackoverflow.com/a/7310676/517717

And the Phone book example:

O(1): Given the page that a person’s name is on and their name, find the phone number.

O(log n): Given a person’s name, find the phone number by picking a random point about halfway through the part of the book you haven’t searched yet, then checking to see whether the person’s name is at that point. Then repeat the process about halfway through the part of the book where the person’s name lies. (This is a binary search for a person’s name.)

O(n): Given a phone number, find the person or business with that number (reverse look up).

O(n log n): There was a mix-up at the printer’s office, and our phone book had all its pages inserted in a random order. Fix the ordering so that it’s correct by looking at the first name on each page and then putting that page in the appropriate spot in a new, empty phone book. (n moving pages * log(n) comparisons in the new address book)

For the below examples, we’re now at the printer’s office. Phone books are waiting to be mailed to each resident or business, and there’s a sticker on each phone book identifying where it should be mailed to. Every person or business gets one phone book.

O(n log n): We want to personalize the phone book, so we’re going to find each person or business’s name in their designated copy, then circle their name in the book and write a short thank-you note for their patronage. (n books, log n for names search)

O(n^2): A mistake occurred at the office, and every entry in each of the phone books has an extra “0” at the end of the phone number. Take some white-out and remove each zero. (n books, n names)

http://stackoverflow.com/a/2307314/517717

Update: Big O Cheatsheet.com/