Fixing ECB hard disk encryption
- From: Paul Rubin <http://phr.cx@xxxxxxxxxxxxxx>
- Date: 29 Mar 2009 20:49:54 -0700
I see there are some low cost SATA disk enclosures (e.g. addonics.com
has a bunch) containing a SATA-to-SATA bridge chip that does AES
encryption, so the enclosure encrypts the entire hard drive with
"256-bit AES" (insert usual marketing drool about how unbreakable this
is). Unfortunately it looks like the bridge chip uses ECB mode--oops.
The data sheet says the chip can do CBC mode on receipt of a signed
agreement, making it sound insecure on purpose! Anyway, let's assume
ECB mode that means the chip encrypts each 16-byte data block with a
fixed secret key K (K is stored on an EEPROM hardware token that you
plug into the enclosure).
I'm wondering if it's still possible to use such an enclosure to
securely encrypt a disk, by doing a little bit of software
preprocessing but avoiding the overhead of running an actual software
block or stream cipher (the fastest of those are still several cycles
per byte). Does the following plan (including the proposed security
criteria) sound ok?
1. The proposed threat model is simply that the hard drive is stolen
and examined (but the user notices the disappearance), or
alternatively that the drive someday malfunctions and can't be spun up
(and therefore can't be erased) by normal means, creating a disposal
problem. In particular, we don't worry about attacks involving
examining the drive contents multiple times while the drive is in
service: any attacker with that kind of hardware access can intercept
the plaintext directly and not worry about the encryption.
2. Let's further treat the disk contents as 16-byte blocks of data (==
the AES blocksize). Call the plaintext P0, P1, ... P_N. N can be on
the order of 2**40 (that would be a 16 terabyte RAID). We are
concerned that correlations in the plaintext may be detected in the
ciphertext. With pure ECB mode, this obviously happens when two
plaintext blocks are equal (say, all zero), so we want to design an
alternative mode using ECB as a building block. What other
correlations could happen? We'll take the worst case: the plaintext
is chosen by the attacker. On the other hand, from our threat model,
it's not an adaptive choice--the attacker supplies ONE plaintext
vector P0...P_n and gets back ONE ciphertext vector C_0...C_n. The
system is secure if the attacker can't distinguish the ciphertext from
a random string with better than small advantage.
Question: is the above criterion reasonable?
3. Let R be a random, secret 128-bit number. Our proposed encryption
function is: C_i = AES_K(P_i + R*i). That is, we're going to modify
the plaintext vector by pointwise adding the arithmetic sequence
(0,R,2R,3R,...) before sending it through AES. This should be very
fast in software--to access a bunch of consecutive blocks (e.g. for a
given 4KB disk sector number n) we just do one 128-bit multiplication
to get the initial offset, then one 128-bit addition per AES block,
all very fast using the x86 XMM instructions or something similar.
Now we want to know what this does to the ciphertext.
4) There are three possibilities:
(a) Some two ciphertext blocks are identical (C_i == C_j for some i,j).
(b) All ciphertext blocks are distinct and appear uncorrelated
(c) All ciphertext blocks are distinct but contain some
detectable correlation
Looking at the above in order:
(a) Since AES is a permutation, (a) can only happen if (P_i - P_j) ==
R*(i-j). So there must be a pair (i,j) where ((P_i-P_j) / (i-j))
exactly coincides with R. There are about 2**80 possible i,j pairs
and 2**128 possible R's, so the chance of such a pair occurring is no
higher than about 2**-48, which we deem sufficiently small for this
application. [Am I missing something here? This doesn't seem rigorous].
(b) If the ciphertext blocks appear uncorrelated, they look random, so
we win. (They will almost always be distinct because the chance of a
birthday collision on this number of blocks is quite small).
(c) This should not happen, because if the ciphertext blocks are
distinct, then the AES inputs are distinct, and since the number of
AES inputs involved is much less than sqrt(2**128), from the
PRP-PRF switching lemma, we shouldn't be able to tell the AES
outputs from the output of a random function.
So if the above is correct, we have a way to combine a very fast
software transformation (just adding a simple arithmetic sequence to
the plaintext) combined with this deficient but cheap AES hardware, to
result in a secure system.
Does this sound reasonable? Does it sound worth hacking the Linux
kernel to support that device?
Thanks for any thoughts. I like to think there is a simple security
proof inside the above somewhere, but I'm not yet good at writing them
out explicitly.
.
- Follow-Ups:
- Re: Fixing ECB hard disk encryption
- From: Kristian Gjøsteen
- Re: Fixing ECB hard disk encryption
- Prev by Date: Re: Conficker C and Ron Rivest
- Next by Date: Re: Fixing ECB hard disk encryption
- Previous by thread: Best IPsec opensource code & Key Management tool
- Next by thread: Re: Fixing ECB hard disk encryption
- Index(es):
Relevant Pages
|