Thursday, May 14, 2009

The Bill Paul clause to the BSD license

One of the lesser-known variants. I use "clause" here loosely; it's really in the disclaimer.

THIS SOFTWARE IS PROVIDED BY Bill Paul AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL Bill Paul OR THE VOICES IN HIS HEAD
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
THE POSSIBILITY OF SUCH DAMAGE.

I'm attributing this to Bill Paul based solely on the frequency count in the FreeBSD source tree, though there are others. This sample is from 7.2-RELEASE:

147 ENT SHALL Bill Paul OR THE
28 ENT SHALL THE AUTHOR OR THE
10 ENT SHALL NICK HIBMA OR THE
9 ENT SHALL ICHIRO FUKUHARA OR THE
1 ENT SHALL Ivan Sharov OR THE
1 ENT SHALL David Hulton OR THE

The "THE AUTHOR OR THE VOICES IN HIS HEAD" instances are mostly in some ACL-related code, primarily by Chris Faulhaber. Hard to say which of these are cut-and-paste, and which are deliberate.

Either way, I'm using it from now on. :-)

Saturday, May 09, 2009

Colin Percival's scrypt: a new chapter in the history of encryption


Today at the 2009 BSDCan conference, Colin Percival (of FreeBSD and Tarsnap) announced a new key derivation function called scrypt. I am not a math person by any means - and the slides from the presentation explain it pretty well - but what Percival has done here is phenomenal. I feel lucky to have been there in person for the presentation.

The basic concept is that sequential memory-hard algorithms make dedicated encryption ASICs significantly more costly because they are sequential (difficult to run in parallel) and memory-hard (require a significant amount of RAM relative to the size of the input.)

In other words, making the math hard and difficult to run on ten thousand cores increases the time required, and making the math need a lot of room to be worked on increases the money required to crack the password. This is because it's relatively cheap to put 10,000 encryption-specific CPU mini-cores on a single die, but it's expensive to feed them piles of fast RAM in a small space.

This is nothing short of historic.

Brute-force attacks against encrypted passwords will change overnight - for passwords created after scrypt becomes available, that is. What was once a trivial password to crack becomes decidedly less trivial - literally by an order of magnitude. By Percival's back-of-the-slide-rule math, an 8-character password with good entropy (pretty random-looking) encrypted in .1 seconds (good enough for authentication speeds) will take on the rough order of CAD$4.8M to crack. By contrast, the previous best method (bcrypt) would cost $130K.

Longer strings (source text longer than passwords) become decidedly stronger by much greater factors because the function can spend more time on them, as the user has less of an interactive expectation for file encryption. If allowed to chew on the data for 5 seconds or less, a relatively short English text string encrypted using scrypt would take CAD$210B (vs $47M for bcrypt(!))

This table from the presentation shows the rough cost estimates to crack a string in 1 year (click to enlarge):


The last column may look odd because I've taken this slide out of context. Because the 40-character string is English text, it has decidedly less entropy, and is therefore easier to crack.

This isn't just theory; Percival has published version 1.0 of working FreeBSD demonstration code.

I decided to check it out in its early state. Out of the box, the demo code makes some quick checks of your system to see how much RAM you have, and tries to sanely limit how much it uses - because it can use a lot of RAM, very very quickly. There are also limits on how large a passphrase it will currently accept, as well as other parameters automatically sized based your system's memory size.

On a quick check of the code, the options are:

  • m - maximum memory fraction (default: .5, or 50% of RAM as reported by the hw.usermem sysctl for decryption; .128 for encryption)

  • M - maximum memory in bytes (default: 0)

  • t - maximum time in seconds (default: 5 for encryption; 300 for decryption)

In my testing on a 512M system, I couldn't immediately get it to eat more than 128M of RAM; not sure if this is fully by design, or if I'm passing the parameters incorrectly.
royce@heffalump$ time ./scrypt enc -m 1 -t 20 rand.in rand-scrypt.out
Please enter passphrase:
Please confirm passphrase:

real 0m25.090s
user 0m17.017s
sys 0m0.143s
The resulting file's header may not survive in future versions ... but for now, you can tell where it came from in the first six bytes:
royce@heffalump$ cut -b1-6 rand-scrypt.out | head -1
scrypt
At the presentation, FreeBSD Core Team member Brooks Davis asked a good question about the impact of large volumes of users trying to authenticate simultaneously on a system without huge amounts of RAM. Percival replied that the memory and other options could be tuned to fit the profile of a particular system. These parameters would be stored in the shadow password field, much as the salt and rounds are stored there today.

I look forward to seeing a full-fledged scrypt(3) show up in FreeBSD and OpenSSL ... and I'm sure that the Three Letter Agencies do not.