Re: DES implementation questions (FinalPermutation code)

From: David G. Koontz (david_koontz_at_xtra.co.nz)
Date: 06/15/05


Date: Thu, 16 Jun 2005 09:25:33 +1200

Brian Ross wrote:
>
> I do not end up with the same result as I started with - yet my DES test
> vectors indicate the complete encrypt/decrypt code is functioning correctly.
>
> It 'looks' like there is some sort of Left/Right swap taking place in the FP
> code as well as some of the inner bits pairs are swapped. I don't see how
> this even remotely resembles the standard final permutation but it must be
> correct since the overall code functions correctly. As far as I can tell the
> output from the DES rounds is normal with the exception of the rotate left 1
> bit still present and (I think) an extra L/R swap.
>
> Can you provide the little hint that I need to complete my understanding?
>
> Thanks
>

There is a left/right swap implied between the Final Permuation and
Initial Permutations. The purpose is to allow symmetrical decryption.
You'll also notice that the Key Schedule is symmetrical, and the
scheduled key order is reversed for decryption.

The Initial Permutation

The Initial Permuation (IP) is a description of how a byte wide
interface is connected to a 64 bit block comprised of two 32 bit blocks
(L and R). Consider a byte wide interface with the bits numbered 1-8.
The event numbered bits go to the L Block and the odd numbered bits go
to the R block. Note that the bit order is big endian, where bit 1 is
most significant and bit 8 is least least significant. The input block
is typically loaded as 8 successive byte
loads:

Port MSB7 Input (LR) Left
  Bit Bit Block (64 bits) Block (32 bits)

   2------6-------58 50 42 34 26 18 10 2 1 2 3 4 5 6 7 8
   4------4-------60 52 44 36 28 20 12 4 9 10 11 12 13 14 15 16
   6------2-------62 54 46 38 30 22 14 6 17 18 19 20 21 22 23 24
   8------0-------64 56 48 40 32 24 16 8 25 26 27 28 29 30 31 32

                                                   Right
                                                   Block (32 bits)
   1------7-------57 49 41 33 25 17 9 1 1 2 3 4 5 6 7 8
   3------5-------59 51 43 35 27 19 11 3 9 10 11 12 13 14 15 16
   5------3-------61 53 45 37 29 21 13 5 17 18 19 20 21 22 23 24
   7------1-------63 55 47 39 31 23 15 7 25 26 27 28 29 30 31 32

Input Byte 8 7 6 5 4 3 2 1

The Final Permutation

The Final Permutation (IP-1) provides the inverse, it standarizes the
output of the R16L16 output block to a byte wide interface. The Output
block is ordered Right then Left to allow complementary operation for
subsequen decryption. Were one to perform an IP followed by IP-1
without any intervening round iteration operations, one would end up
with odd and even bits swapped:

Right Output (R16L16) Standard Port
Block (32 bits) Block (64 bits) Bit Bit

  1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8-------6--------2
  9 10 11 12 13 14 15 16 9 10 11 12 13 14 15 16-------4--------4
17 18 19 20 21 22 23 24 17 18 19 20 21 22 23 24-------2--------6
25 26 27 28 29 30 31 32 25 26 27 28 29 30 31 32-------0--------8

Left
Block (32 bits)

  1 2 3 4 5 6 7 8 33 34 35 36 37 38 39 40-------7--------1
  9 10 11 12 13 14 15 16 41 42 43 44 45 46 47 48-------5--------3
17 18 19 20 21 22 23 24 49 50 51 52 53 54 55 56-------3--------5
25 26 27 28 29 30 31 32 57 58 59 60 61 62 63 64-------1--------7

Output Byte 8 7 6 5 4 3 2 1

>>From FIPS Pub 46-2:

Final Permuation IP-1:
                                    Output Byte
40 8 48 16 56 24 64 32 1
39 7 47 15 55 23 63 31 2
38 6 46 14 54 22 62 30 3
37 5 45 13 53 21 61 29 4
36 4 44 12 52 20 60 28 5
35 3 43 11 51 19 59 27 6
34 2 42 10 50 18 58 26 7
33 1 41 9 49 17 57 25 8

  1 2 3 4 5 6 7 8 Port Bit
  7 6 5 4 3 2 1 0 MSB7 Bits

In the simplest hardware implementation of DES, the Left and Right
blocks are comprised in hardware of four 8 bit register each. Each 8
bit register can be serially loaded (IP), serially unloaded (IP-1), or
parallel output and parallel loaded (round interation). DES is an
encryption algorithm originally required to be implemented in hardware,
specified in 1977 - predating 16 or 32 bit microprocessor peripherals.

Permuted Choice 1

PC1 performs a similar function loading the C and D 28 bit registers
(comprised of three 8 bit bidirectional shift register and 1 4 bit
bidirectional shift register, all with parallel outputs). The C and D
registers can be serially loaded (shifting right), or serially shifted
left or right in a closed ring for encryption or decryption.

Port MSB7
Bits Bits

                 Input (CD) C
                         Block, 64 bits Block (28 bits)

1--------7------57 49 41 33 25 17 9 1 MS 1 2 3 4 5 6 7 8
2--------6------58 50 42 34 26 18 10 2 9 10 11 12 13 14 15 16
3--------5------59 51 43 35 27 19 11 3 17 18 19 20 21 22 23 24
4--------4------60 52 44 36 ----- -- (C(28)) 25 26 27 28

                                                     D
                                                     Block (28 bits)

7--------1------63 55 47 39 31 23 15 7 1 2 3 4 5 6 7 8
6--------2------62 54 46 38 30 22 14 6 9 10 11 12 13 14 15 16
5--------3------61 53 45 37 29 21 33 5 17 18 19 20 21 22 23 24
4-------(D(25)--------------28 20 12 4 25 26 27 28

8--------0------64 56 48 40 32 24 16 8 LS (parity)

Input Byte 8 7 6 5 4 3 2 1

Note that bit 4 is used as input for both C and D. This implies that
C(28)output is used as the serial input to D(25). The least significant
bit is used for odd parity.

Permuted Choice 2

Permuted Choice 2 is the selection of 48 bits, each of 24 bits from each
of the C and D register. The definition for PC2 specifies the selected
values for selected key Ks[16] in the order they are used for the 8 SBox
cells of f(R,K). The Key Scheduler defines a preshift to obtain Ks[1].
You only need one set of tap-offs for the two 28 bit shift registers.
You either preshift (encryp) or postshift(decrypt) after selecting
values according to PC2. When you have completed the key shift
schedule, C and D are in their original position. The shift direction
is either left or right dependent on the encrypt/decrypt mode. In
software you can assemble the 16 selected key values or in a more
optimal hardware implementation you can use the above table and have a
16 to 1 mux for all 48 selected key bits allowing 16 clocks for round
iteration as opposed to 28 (or 29). You can likewise get the C and D
registers to shift two positions as well as one.
;



Relevant Pages

  • Re: DES implementation questions (FinalPermutation code)
    ... > output from the DES rounds is normal with the exception of the rotate left 1 ... The Initial Permutation ... blocks are comprised in hardware of four 8 bit register each. ... You only need one set of tap-offs for the two 28 bit shift registers. ...
    (sci.crypt)
  • Re: Random permutation of 2 bit system. Checking on what trandom permutation means?
    ... It had automatic numbering from word. ... Anyway Extending to what was said about DES and permutations. ... If you have not seen all of the output of the permutation generation ... idea of a block cipher as a random permutation? ...
    (sci.crypt)
  • Re: Thanks for the tips: Re: Random permutation of 2 bit system. Checking on what trandom permutatio
    ... I also got that no table is physically used in DES. ... But I must admit that the permutation generation process driven by the key ... It is not a random permutation so I cannot prove ... The algorithm I used below is clearly true for the first, ...
    (sci.crypt)
  • Re: few questions
    ... >> Q) what is the purpose of the INITIAL permutation in DES? ... > require twice the computational time / space as breaking DES. ... > This requires twice as much time and very large amounts of memory as I ...
    (sci.crypt)
  • Re: x86 setup code rewrite in C - revised
    ... write protected register is written without setting ... Register 0x11 has the Protect bit in it, ... or 8x8 chars - some VGA adapters do not hide the bottom graphic lines. ... Profitez des connaissances, des opinions et des expériences des internautes sur Yahoo! ...
    (Linux-Kernel)