5.12 Precomputing Keystream in OFB, CTR, CCM, or CWC Modes (or with Stream Ciphers)

out += SPC_BLOCK_SZ;

}

SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr, ctx->ksm);

ctr_increment(ctx->ctr);

for (i = 0;  i < il;  i++) *out++ = ctx->ksm[ctx->ix++];

return 1;

}



Note that we simply remove the in argument along with the XOR operation whenever we write to the output buffer.



5.13 Parallelizing Encryption and Decryption in Modes That Allow It (Without Breaking Compatibility)

Problem

You want to parallelize encryption, decryption, or keystream generation.



Solution

Only some cipher modes are naturally parallelizable in a way that doesn’t break compatibility. In particular, CTR mode is naturally parallelizable, as is decryption in both CBC and CFB modes. There are two basic strategies: one is to treat the message in an interleaved fashion, and the other is to break it up into a single chunk for each parallel process.

The first strategy is generally more practical. However, it is often difficult to make either technique result in a speed gain when processing messages in software.



Discussion

Parallelizing encryption and decryption does not necessarily result in a

speed improvement. To provide any chance of a speedup, you’ll certainly need to ensure that multiple processors are working in parallel.

Even in such an environment, data sets may be too small to run faster

when they are processed in parallel.



Some cipher modes allow different parts of the message to be operated on independently; in such cases, there is the potential for parallelization. For example, with CTR mode, the keystream is computed in blocks, where each block of keystream is generated by encrypting a unique counter block. Those blocks can be computed in any order.
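
As a sketch of that independence (assuming the Recipe 5.5 preliminaries are in scope; ctr_keystream_block( ) is a hypothetical helper rather than part of this chapter's recipes, and the big-endian counter placement is illustrative only), the keystream block for any index can be produced without touching any other block:

#include <string.h>

/* Hypothetical sketch: keystream block number blockix depends only on the
 * key schedule, the nonce, and blockix itself, so blocks can be generated
 * in any order or by different threads.                                    */
static void ctr_keystream_block(SPC_KEY_SCHED *ks, unsigned char *nonce,
                                size_t noncelen, unsigned long blockix,
                                unsigned char ksblock[SPC_BLOCK_SZ]) {
  unsigned char ctrblock[SPC_BLOCK_SZ];
  int           i;

  memcpy(ctrblock, nonce, noncelen);
  for (i = SPC_BLOCK_SZ - 1;  i >= (int)noncelen;  i--) {
    ctrblock[i] = (unsigned char)(blockix & 0xff);  /* counter = block index */
    blockix >>= 8;
  }
  SPC_DO_ENCRYPT(ks, ctrblock, ksblock);
}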




In CBC, CFB, and OFB modes, encryption can’t really be parallelized because the

ciphertext for a block is necessary to create the ciphertext for the next block; thus,

we can’t compute ciphertext out of order. However, for CBC and CFB, when we

decrypt, things are different. Because we only need the ciphertext of a block to

decrypt the next block, we can decrypt the next block before we decrypt the first

one.
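
To make that concrete, here is a sketch of decrypting a single CBC block in isolation. It assumes the Recipe 5.5 preliminaries also provide a raw block-decryption macro, called SPC_DO_DECRYPT here; that name is an assumption, and cbc_decrypt_block( ) itself is purely illustrative. Because each plaintext block depends only on its own ciphertext block and the one before it, any number of threads can run this on different block indices in any order:

/* Sketch: P[i] = D(C[i]) ^ C[i-1], with C[-1] = IV.  Nothing else is needed,
 * so different block indices can be processed independently.               */
void cbc_decrypt_block(SPC_KEY_SCHED *ks, unsigned char *ct, unsigned char *iv,
                       size_t blockix, unsigned char *ptblock) {
  unsigned char *prev = blockix ? ct + (blockix - 1) * SPC_BLOCK_SZ : iv;
  size_t         j;

  SPC_DO_DECRYPT(ks, ct + blockix * SPC_BLOCK_SZ, ptblock);
  for (j = 0;  j < SPC_BLOCK_SZ;  j++) ptblock[j] ^= prev[j];
}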

There are two reasonable strategies for parallelizing the work. When a message

shows up all at once, you might divide it roughly into equal parts and handle each

part separately. Alternatively, you can take an interleaved approach, where alternating blocks are handled by different threads. That is, the actual message is separated

into two different plaintexts, as shown in Figure 5-5.

Figure 5-5. Encryption through interleaving (the original message M1, M2, M3, M4, M5 is split into a first plaintext M1, M3, M5 and a second plaintext M2, M4)



If done correctly, both approaches will result in the correct output. We generally prefer the interleaving approach, because all threads can do work with just a little bit of

data available. This is particularly true in hardware, where buffers are small.

With a noninterleaving approach, you must wait at least until the length of the message is known, which is often when all of the data is finally available. Even if the message length is known in advance, you must still wait for a large percentage of the data to show up before the second thread can be launched.

Even the interleaved approach is a lot easier when the size of the message is known

in advance because it makes it easier to get the message all in one place. If you need

the whole message to come in before you know the length, parallelization may not be

worthwhile, because in many cases, waiting for an entire message to come in before

beginning work can introduce enough latency to thwart the benefits of parallelization.

If you aren’t generally going to get an entire message all at once, but you are able to

determine the biggest message you might get, another reasonably easy approach is to

allocate a result buffer big enough to hold the largest possible message.

For the sake of simplicity, let’s assume that the message arrives all at once and you

might want to process a message with two parallel threads. The following code provides an example API that can handle CTR mode encryption and decryption in parallel (remember that encryption and decryption are the same operation in CTR mode).




Because we assume the message is available up front, all of the information we need

to operate on a message is passed into the function spc_pctr_setup( ), which requires

a context object (here, the type is SPC_CTR2_CTX), the key, the key length in bytes, a

nonce SPC_BLOCK_SZ - SPC_CTR_BYTES in length, the input buffer, the length of the

message, and the output buffer. This function does not do any of the encryption and

decryption, nor does it copy the input buffer anywhere.

To process the first block, as well as every second block after that, call spc_pctr_do_odd( ), passing in a pointer to the context object. Nothing else is required because the input and output buffers used are the ones passed to the spc_pctr_setup( ) function. Likewise, call spc_pctr_do_even( ) to process the second block and every second block after that; the two calls can run in separate threads. When both have finished, call spc_pctr_final( ) to erase the context.

If you test, you’ll notice that the results are exactly the same as with the CTR mode implementation from Recipe 5.9.

This code requires the preliminaries from Recipe 5.5, as well as the spc_memset( )

function from Recipe 13.2.

#include <stdlib.h>
#include <string.h>

typedef struct {
  SPC_KEY_SCHED ks;
  size_t        len;
  unsigned char ctr_odd[SPC_BLOCK_SZ];
  unsigned char ctr_even[SPC_BLOCK_SZ];
  unsigned char *inptr_odd;
  unsigned char *inptr_even;
  unsigned char *outptr_odd;
  unsigned char *outptr_even;
} SPC_CTR2_CTX;

static void pctr_increment(unsigned char *ctr) {
  unsigned char *x = ctr + SPC_CTR_BYTES;

  while (x-- != ctr) if (++(*x)) return;
}

void spc_pctr_setup(SPC_CTR2_CTX *ctx, unsigned char *key, size_t kl,
                    unsigned char *nonce, unsigned char *in, size_t len,
                    unsigned char *out) {
  SPC_ENCRYPT_INIT(&(ctx->ks), key, kl);
  spc_memset(key, 0, kl);
  memcpy(ctx->ctr_odd, nonce, SPC_BLOCK_SZ - SPC_CTR_BYTES);
  spc_memset(ctx->ctr_odd + SPC_BLOCK_SZ - SPC_CTR_BYTES, 0, SPC_CTR_BYTES);
  memcpy(ctx->ctr_even, nonce, SPC_BLOCK_SZ - SPC_CTR_BYTES);
  spc_memset(ctx->ctr_even + SPC_BLOCK_SZ - SPC_CTR_BYTES, 0, SPC_CTR_BYTES);
  pctr_increment(ctx->ctr_even);
  ctx->inptr_odd   = in;
  ctx->inptr_even  = in + SPC_BLOCK_SZ;
  ctx->outptr_odd  = out;
  ctx->outptr_even = out + SPC_BLOCK_SZ;
  ctx->len         = len;
}




void spc_pctr_do_odd(SPC_CTR2_CTX *ctx) {
  size_t        i, j;
  unsigned char final[SPC_BLOCK_SZ];

  for (i = 0;  i + SPC_BLOCK_SZ < ctx->len;  i += 2 * SPC_BLOCK_SZ) {
    SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_odd, ctx->outptr_odd);
    pctr_increment(ctx->ctr_odd);
    pctr_increment(ctx->ctr_odd);
    for (j = 0;  j < SPC_BLOCK_SZ / sizeof(int);  j++)
      ((int *)ctx->outptr_odd)[j] ^= ((int *)ctx->inptr_odd)[j];
    ctx->outptr_odd += SPC_BLOCK_SZ * 2;
    ctx->inptr_odd  += SPC_BLOCK_SZ * 2;
  }
  if (i < ctx->len) {
    SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_odd, final);
    for (j = 0;  j < ctx->len - i;  j++)
      ctx->outptr_odd[j] = final[j] ^ ctx->inptr_odd[j];
  }
}

void spc_pctr_do_even(SPC_CTR2_CTX *ctx) {
  size_t        i, j;
  unsigned char final[SPC_BLOCK_SZ];

  for (i = SPC_BLOCK_SZ;  i + SPC_BLOCK_SZ < ctx->len;  i += 2 * SPC_BLOCK_SZ) {
    SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_even, ctx->outptr_even);
    pctr_increment(ctx->ctr_even);
    pctr_increment(ctx->ctr_even);
    for (j = 0;  j < SPC_BLOCK_SZ / sizeof(int);  j++)
      ((int *)ctx->outptr_even)[j] ^= ((int *)ctx->inptr_even)[j];
    ctx->outptr_even += SPC_BLOCK_SZ * 2;
    ctx->inptr_even  += SPC_BLOCK_SZ * 2;
  }
  if (i < ctx->len) {
    SPC_DO_ENCRYPT(&(ctx->ks), ctx->ctr_even, final);
    for (j = 0;  j < ctx->len - i;  j++)
      ctx->outptr_even[j] = final[j] ^ ctx->inptr_even[j];
  }
}

int spc_pctr_final(SPC_CTR2_CTX *ctx) {
  spc_memset(ctx, 0, sizeof(SPC_CTR2_CTX));
  return 1;
}
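
Here is a usage sketch showing how the two halves might be driven from POSIX threads (the wrapper names are hypothetical, and error handling is minimal); remember that spc_pctr_setup( ) erases the key you pass in:

#include <pthread.h>

static void *pctr_odd_thread(void *arg)  { spc_pctr_do_odd((SPC_CTR2_CTX *)arg);  return 0; }
static void *pctr_even_thread(void *arg) { spc_pctr_do_even((SPC_CTR2_CTX *)arg); return 0; }

/* Encrypt (or decrypt) msg into out with two threads; nonce must be
 * SPC_BLOCK_SZ - SPC_CTR_BYTES bytes long.                            */
int pctr_run_parallel(unsigned char *key, size_t kl, unsigned char *nonce,
                      unsigned char *msg, size_t len, unsigned char *out) {
  SPC_CTR2_CTX ctx;
  pthread_t    t1, t2;

  spc_pctr_setup(&ctx, key, kl, nonce, msg, len, out);
  if (pthread_create(&t1, 0, pctr_odd_thread, &ctx)) return 0;
  if (pthread_create(&t2, 0, pctr_even_thread, &ctx)) return 0;
  pthread_join(t1, 0);
  pthread_join(t2, 0);
  return spc_pctr_final(&ctx);
}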



See Also

Recipes 5.5, 5.9, 13.2




5.14 Parallelizing Encryption and Decryption in Arbitrary Modes (Breaking Compatibility)

Problem

You are using a cipher mode that is not intrinsically parallelizable, but you have a

large data set and want to take advantage of multiple processors at your disposal.



Solution

Treat the data as multiple streams of interleaved data.



Discussion

Parallelizing encryption and decryption does not necessarily result in a

speed improvement. To provide any chance of a speedup, you will certainly need to ensure that multiple processors are working in parallel.

Even in such an environment, data sets may be too small to run faster

when they are processed in parallel.



Recipe 5.13 demonstrates how to parallelize CTR mode encryption on a per-block

level using a single encryption context. Instead of having spc_pctr_do_even( ) and

spc_pctr_do_odd( ) share a key and nonce, you could use two separate encryption

contexts. In such a case, there is no need to limit your choice of mode to one that is

intrinsically parallelizable. However, note that you won’t get the same results when

using two separate contexts as you do when you use a single context, even if you use

the same key and IV or nonce (remembering that IV/nonce reuse is a bad idea—and

that certainly applies here).

One consideration is how much to interleave. There’s no need to interleave on a block

level. For example, if you are using two parallel encryption contexts, you could encrypt

the first 1,024 bytes of data with the first context, then alternate every 1,024 bytes.

Generally, it is best to use a different key for each context. You can derive multiple

keys from a single base key, as shown in Recipe 4.11.

It’s easiest to consider interleaving only at the plaintext level, particularly if you’re

using a block-based mode, where padding will generally be added for each cipher

context. In such a case, you would send the encrypted data in multiple independent

streams and reassemble it after decryption.
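
As a rough sketch of the bookkeeping (the chunk size, the encrypt_chunk callback type, and interleave_encrypt( ) are all hypothetical, not part of this book's API), alternating fixed-size chunks between two independently keyed contexts might look like the following; in a real implementation each context's output would go to its own stream, as described above, and the two contexts could be driven from separate threads:

#include <stddef.h>

#define CHUNK_SZ 1024

/* Hypothetical incremental-encryption callback wrapping whatever mode and
 * context type you are actually using (CBC, CFB, etc.).                   */
typedef void (*encrypt_chunk_fn)(void *cipher_ctx, unsigned char *in,
                                 size_t len, unsigned char *out);

/* Feed alternating CHUNK_SZ-byte pieces of msg to two independent contexts,
 * each of which was initialized with its own derived key.                  */
void interleave_encrypt(void *ctx_a, void *ctx_b, encrypt_chunk_fn encrypt_chunk,
                        unsigned char *msg, size_t len, unsigned char *out) {
  size_t off, n;
  int    which = 0;

  for (off = 0;  off < len;  off += n, which ^= 1) {
    n = (len - off < CHUNK_SZ) ? len - off : CHUNK_SZ;
    encrypt_chunk(which ? ctx_b : ctx_a, msg + off, n, out + off);
  }
}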



See Also

Recipes 4.11, 5.13




5.15 Performing File or Disk Encryption

Problem

You want to encrypt a file or a disk.



Solution

If you’re willing to use a nonce or an initialization vector, standard modes such as

CBC and CTR are acceptable. For file-at-a-time encryption, you can avoid the use of

a nonce or IV altogether by using the LION construction, described in the “Discussion” section.

Generally, keys will be generated from a password. For that, use PKCS #5, as discussed in Recipe 4.10.



Discussion

Disk encryption is usually done in fixed-size chunks at the operating system level.

File encryption can be performed in chunks so that random access to an encrypted

file doesn’t require decrypting the entire file. This also has the benefit that part of a

file can be changed without reencrypting the entire file.

CBC mode is commonly used for this purpose, and it is used on chunks that are a

multiple of the block size of the underlying block cipher, so that padding is never

necessary. This eliminates any message expansion that one would generally expect

with CBC mode.

However, when people are doing disk or file encryption with CBC mode, they often use a fixed initialization vector. That’s a bad idea, because the initialization vector must be random for CBC mode to achieve its security goals. Using a fixed IV opens the door to dictionary-like attacks that can often recover, at the very least, the beginning of a file.

Other modes that require only a nonce (not an initialization vector) tend to be

streaming modes. These fail miserably when used for disk encryption if the nonce

does not change every single time the contents associated with that nonce change.

Keys for disk encryption are generally created from a password. Such

keys will be only as strong as the password. See Recipe 4.10 for a discussion of turning a password into a cryptographic key.



For example, if you’re encrypting file-by-file in 8,192-byte chunks, you need a separate nonce for each 8,192-byte chunk, and you need to select a new nonce every single time you want to protect a modified version of that chunk. You cannot just make incremental changes, then reencrypt with the same nonce.

In fact, even for modes where sequential nonces are possible, they really don’t make

much sense in the context of file encryption. For example, some people think they

can use just one CTR mode nonce for the entire disk. But if you ever reuse the same

piece of keystream, there are attacks. Therefore, any time you change even a small

piece of data, you will have to reencrypt the entire disk using a different nonce to

maintain security. Clearly, that isn’t practical.

Therefore, no matter what mode you choose to use, you should choose random initial values.

Many people don’t like IVs or nonces for file encryption because of storage space

issues. They believe they shouldn’t “waste” space on storing an IV or nonce. When

you’re encrypting fixed-size chunks, there are not any viable alternatives; if you want

to ensure security, you must use an IV.
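
For illustration, here is a minimal sketch of encrypting one fixed-size chunk with a freshly generated random IV stored in front of the resulting ciphertext. It uses OpenSSL's EVP interface directly rather than the SPC wrappers used elsewhere in this chapter, assumes AES-128 in CBC mode, and assumes the chunk length is already a multiple of the 16-byte block size so that padding can be disabled:

#include <openssl/evp.h>
#include <openssl/rand.h>

/* Encrypt one block-aligned chunk; the output is iv || ciphertext, so it is
 * 16 bytes longer than the input.  Returns 1 on success, 0 on failure.      */
int encrypt_chunk_cbc(unsigned char *key, unsigned char *in, int chunklen,
                      unsigned char *out) {
  EVP_CIPHER_CTX ctx;
  int            outlen, tmplen;

  if (!RAND_bytes(out, 16)) return 0;          /* fresh random IV, stored with the chunk */
  EVP_CIPHER_CTX_init(&ctx);
  if (!EVP_EncryptInit_ex(&ctx, EVP_aes_128_cbc(), 0, key, out)) return 0;
  EVP_CIPHER_CTX_set_padding(&ctx, 0);         /* chunk is already block-aligned */
  if (!EVP_EncryptUpdate(&ctx, out + 16, &outlen, in, chunklen) ||
      !EVP_EncryptFinal_ex(&ctx, out + 16 + outlen, &tmplen)) {
    EVP_CIPHER_CTX_cleanup(&ctx);
    return 0;
  }
  EVP_CIPHER_CTX_cleanup(&ctx);
  return 1;
}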

If you’re willing to accept message expansion, you might want to consider a high-level mode such as CWC, so that you can also incorporate integrity checks. In practice, though, integrity checks are usually ignored on filesystems, which trust that the operating system’s access control will ensure integrity.

If you’re willing to encrypt and decrypt on a per-file basis, where you cannot decrypt the file in parts, you can get rid of the need for an initialization vector altogether by using LION, a construction that takes a stream cipher and a hash function and turns them into a block cipher with an arbitrary block size. Essentially, LION turns those constructs into a single block cipher with a variable block length, and you use that cipher in ECB mode.

Throughout this book, we repeatedly advise against using raw block cipher operations for things like file encryption. However, when the block size is always the same

length as the message you want to encrypt, ECB mode isn’t so bad. The only problem is that, given a {key, plaintext} pair, an unchanged file will always encrypt to the

same value. Therefore, an attacker who has seen a particular file encrypted once can

find any unchanged versions of that file encrypted with the same key. A single

change in the file thwarts this problem, however. In practice, most people probably

won’t be too concerned with this kind of problem.

Using raw block cipher operations with LION is useful only if the block size really is

the size of the file. You can’t break the file up into 8,192-byte chunks or anything

like that, which can have a negative impact on performance, particularly as the file

size gets larger.

Considering what we’ve discussed, something like CBC mode with a randomly chosen IV per block is probably the best solution for pretty much any use, even if it does

take up some additional disk space. Nonetheless, we recognize that people may want

to take an approach where they only need to have a key, and no IV or nonce.




Therefore, we’ll show you LION, built out of the RC4 implementation from Recipe

5.23 and SHA1 (see Recipe 6.7). The structure of LION is shown in Figure 5-6.

While we cover RC4 because it is popular, we strongly recommend

you use SNOW 2.0 instead, because it seems to have a much more

comfortable security margin.



The one oddity of this technique is that files must be longer than the output size of

the message digest function (20 bytes in the case of SHA1). Therefore, if you have

files that small, you will either need to come up with a nonambiguous padding

scheme, which is quite complicated to do securely, or you’ll need to abandon LION

(either just for small messages or in general).

LION requires a key that is twice as long as the output size of the message digest

function. As with regular CBC-style encryption for files, if you’re using a cipher that

takes fixed-size keys, we expect you’ll generate a key of the appropriate length from a

password.

Figure 5-6. The structure of LION (the plaintext is split into a 20-byte left half L and a right half R; round 1 encrypts R with RC4 keyed by L XOR K1, round 2 XORs SHA1(R) into L, and round 3 encrypts R again with RC4 keyed by the new L XOR K2)



We also assume a SHA1 implementation with a very standard API. Here, we use an

API that works with OpenSSL, which should be easily adaptable to other libraries.




To switch hash functions, replace the SHA1 calls as appropriate, and change the

value of HASH_SZ to be the digest size of the hash function that you wish to use.

The function spc_lion_encrypt( ) encrypts its first argument, putting the result into

the memory pointed to by the second argument. The third argument specifies the

size of the message, and the last argument is the key. Again, note that the input size

must be larger than the hash function’s output size.

The spc_lion_decrypt( ) function takes a similar argument set as spc_lion_encrypt( ),

merely performing the inverse operation.

#include <stdlib.h>
#include <openssl/rc4.h>
#include <openssl/sha.h>

#define HASH_SZ   20
#define NUM_WORDS (HASH_SZ / sizeof(int))

void spc_lion_encrypt(char *in, char *out, size_t blklen, char *key) {
  int     i, tmp[NUM_WORDS];
  RC4_KEY k;

  /* Round 1: R = R ^ RC4(L ^ K1) */
  for (i = 0;  i < NUM_WORDS;  i++)
    tmp[i] = ((int *)in)[i] ^ ((int *)key)[i];
  RC4_set_key(&k, HASH_SZ, (char *)tmp);
  RC4(&k, blklen - HASH_SZ, in + HASH_SZ, out + HASH_SZ);

  /* Round 2: L = L ^ SHA1(R) */
  SHA1(out + HASH_SZ, blklen - HASH_SZ, out);
  for (i = 0;  i < NUM_WORDS;  i++)
    ((int *)out)[i] ^= ((int *)in)[i];

  /* Round 3: R = R ^ RC4(L ^ K2) */
  for (i = 0;  i < NUM_WORDS;  i++)
    tmp[i] = ((int *)out)[i] ^ ((int *)key)[i + NUM_WORDS];
  RC4_set_key(&k, HASH_SZ, (char *)tmp);
  RC4(&k, blklen - HASH_SZ, out + HASH_SZ, out + HASH_SZ);
}

void spc_lion_decrypt(char *in, char *out, size_t blklen, char *key) {
  int     i, tmp[NUM_WORDS];
  RC4_KEY k;

  for (i = 0;  i < NUM_WORDS;  i++)
    tmp[i] = ((int *)in)[i] ^ ((int *)key)[i + NUM_WORDS];
  RC4_set_key(&k, HASH_SZ, (char *)tmp);
  RC4(&k, blklen - HASH_SZ, in + HASH_SZ, out + HASH_SZ);

  SHA1(out + HASH_SZ, blklen - HASH_SZ, out);
  for (i = 0;  i < NUM_WORDS;  i++) {
    ((int *)out)[i] ^= ((int *)in)[i];
    tmp[i] = ((int *)out)[i] ^ ((int *)key)[i];
  }
  RC4_set_key(&k, HASH_SZ, (char *)tmp);
  RC4(&k, blklen - HASH_SZ, out + HASH_SZ, out + HASH_SZ);
}
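
As a usage sketch (lion_encrypt_file( ) is a hypothetical helper, file handling is simplified, and key is assumed to be 2 * HASH_SZ bytes, for instance derived from a password as in Recipe 4.10):

#include <stdio.h>
#include <stdlib.h>

/* Read a whole file of flen bytes (flen must exceed HASH_SZ) and encrypt it
 * with LION.  Returns a malloc'd ciphertext buffer of flen bytes, or NULL.  */
char *lion_encrypt_file(FILE *f, size_t flen, char *key) {
  char *pt, *ct = NULL;

  if (flen <= HASH_SZ || !(pt = malloc(flen))) return NULL;
  if (fread(pt, 1, flen, f) == flen && (ct = malloc(flen)))
    spc_lion_encrypt(pt, ct, flen, key);
  spc_memset(pt, 0, flen);        /* wipe the plaintext copy */
  free(pt);
  return ct;
}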



See Also

Recipes 4.10, 5.23, 6.7



5.16 Using a High-Level, Error-Resistant Encryption and Decryption API

Problem

You want to do encryption or decryption without the hassle of worrying about

choosing an encryption algorithm, performing an integrity check, managing a nonce,

and so on.



Solution

Use the following “Encryption Queue” implementation, which relies on the reference CWC mode implementation (discussed in Recipe 5.10) and the key derivation

function from Recipe 4.11.



Discussion

Be sure to take into account the fact that functions in this API can fail,

particularly the decryption functions. If a decryption function fails,

you need to fail gracefully. In Recipe 9.12, we discuss many issues that

help ensure robust network communication that we don’t cover here.



This recipe provides an easy-to-use interface to symmetric encryption. The two ends

of communication must set up cipher queues in exactly the same configuration.

Thereafter, they can exchange messages easily until the queues are destroyed.

This code relies on the reference CWC implementation discussed in Recipe 5.10. We

use CWC mode because it gives us both encryption and integrity checking using a

single key with a minimum of fuss.

We add a new data type, SPC_CIPHERQ, which is responsible for keeping track of

queue state. Here’s the declaration of the SPC_CIPHERQ data type:

typedef struct {
  cwc_t         ctx;
  unsigned char nonce[SPC_BLOCK_SZ];
} SPC_CIPHERQ;




SPC_CIPHERQ objects are initialized by calling spc_cipherq_setup( ), which requires

the code from Recipe 5.5, as well as an implementation of the randomness API discussed in Recipe 11.2:

#include <stdlib.h>
#include <string.h>
#include <cwc.h>

#define MAX_KEY_LEN (32)     /* 256 bits */



size_t spc_cipherq_setup(SPC_CIPHERQ *q, unsigned char *basekey, size_t keylen,

size_t keyuses) {

unsigned char dk[MAX_KEY_LEN];

unsigned char salt[5];

spc_rand(salt, 5);

spc_make_derived_key(basekey, keylen, salt, 5, 1, dk, keylen);

if (!cwc_init(&(q->ctx), dk, keylen * 8)) return 0;

memcpy(q->nonce, salt, 5);

spc_memset(basekey, 0, keylen);

return keyuses + 1;

}



The function has the following arguments:

q
    SPC_CIPHERQ context object.

basekey
    Shared key used by both ends of communication (the “base key” that will be used to derive session keys).

keylen
    Length of the shared key in bytes, which must be 16, 24, or 32.

keyuses
    Indicates how many times the current key has been used to initialize a SPC_CIPHERQ object. If you are going to reuse keys, it is important that this argument be used properly.

On error, spc_cipherq_setup() returns 0. Otherwise, it returns the next value it would expect to receive for the keyuses argument. Be sure to save this value if you ever plan to reuse keys. Note also that basekey is erased upon successful initialization.
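
As an illustrative call sequence (variable names are hypothetical; in a real application the keyuses counter would be loaded from, and written back to, wherever the base key is stored):

SPC_CIPHERQ   q;
unsigned char basekey[16];   /* 128-bit shared base key, obtained elsewhere */
size_t        keyuses = 0;   /* number of times this base key has been used */

/* ... fill basekey and load the saved keyuses value ... */
keyuses = spc_cipherq_setup(&q, basekey, sizeof(basekey), keyuses);
if (!keyuses) abort();       /* setup failed */
/* ... persist the updated keyuses value before using the queue ... */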



Every time you initialize an SPC_CIPHERQ object, a key specifically for use with that

queue instance is generated, using the basekey and the keyuses arguments. To derive

the key, we use the key derivation function discussed in Recipe 4.11. Note that this is

useful when two parties share a long-term key that they wish to keep reusing. However, if you exchange a session key at connection establishment (i.e., using one of the


