Saturday, May 09, 2009

Colin Percival's scrypt: a new chapter in the history of encryption


Today at the 2009 BSDCan conference, Colin Percival (of FreeBSD and Tarsnap) announced a new key derivation function called scrypt. I am not a math person by any means - and the slides from the presentation explain it pretty well - but what Percival has done here is phenomenal. I feel lucky to have been there in person for the presentation.

The basic concept is that sequential memory-hard algorithms make dedicated encryption ASICs significantly more costly because they are sequential (difficult to run in parallel) and memory-hard (require a significant amount of RAM relative to the size of the input.)

In other words, making the math hard and difficult to run on ten thousand cores increases the time required, and making the math need a lot of room to be worked on increases the money required to crack the password. This is because it's relatively cheap to put 10,000 encryption-specific CPU mini-cores on a single die, but it's expensive to feed them piles of fast RAM in a small space.

This is nothing short of historic.

Brute-force attacks against encrypted passwords will change overnight - for passwords created after scrypt becomes available, that is. What was once a trivial password to crack becomes decidedly less trivial - literally by an order of magnitude. By Percival's back-of-the-slide-rule math, an 8-character password with good entropy (pretty random-looking) encrypted in .1 seconds (good enough for authentication speeds) will take on the rough order of CAD$4.8M to crack. By contrast, the previous best method (bcrypt) would cost $130K.

Longer strings (source text longer than passwords) become decidedly stronger by much greater factors because the function can spend more time on them, as the user has less of an interactive expectation for file encryption. If allowed to chew on the data for 5 seconds or less, a relatively short English text string encrypted using scrypt would take CAD$210B (vs $47M for bcrypt(!))

This table from the presentation shows the rough cost estimates to crack a string in 1 year (click to enlarge):


The last column may look odd because I've taken this slide out of context. Because the 40-character string is English text, it has decidedly less entropy, and is therefore easier to crack.

This isn't just theory; Percival has published version 1.0 of working FreeBSD demonstration code.

I decided to check it out in its early state. Out of the box, the demo code makes some quick checks of your system to see how much RAM you have, and tries to sanely limit how much it uses - because it can use a lot of RAM, very very quickly. There are also limits on how large a passphrase it will currently accept, as well as other parameters automatically sized based your system's memory size.

On a quick check of the code, the options are:

  • m - maximum memory fraction (default: .5, or 50% of RAM as reported by the hw.usermem sysctl for decryption; .128 for encryption)

  • M - maximum memory in bytes (default: 0)

  • t - maximum time in seconds (default: 5 for encryption; 300 for decryption)

In my testing on a 512M system, I couldn't immediately get it to eat more than 128M of RAM; not sure if this is fully by design, or if I'm passing the parameters incorrectly.
royce@heffalump$ time ./scrypt enc -m 1 -t 20 rand.in rand-scrypt.out
Please enter passphrase:
Please confirm passphrase:

real 0m25.090s
user 0m17.017s
sys 0m0.143s
The resulting file's header may not survive in future versions ... but for now, you can tell where it came from in the first six bytes:
royce@heffalump$ cut -b1-6 rand-scrypt.out | head -1
scrypt
At the presentation, FreeBSD Core Team member Brooks Davis asked a good question about the impact of large volumes of users trying to authenticate simultaneously on a system without huge amounts of RAM. Percival replied that the memory and other options could be tuned to fit the profile of a particular system. These parameters would be stored in the shadow password field, much as the salt and rounds are stored there today.

I look forward to seeing a full-fledged scrypt(3) show up in FreeBSD and OpenSSL ... and I'm sure that the Three Letter Agencies do not.
blog comments powered by Disqus