Re: Verifying CRCs

From: Rob Warnock (rpw3_at_rpw3.org)
Date: 07/12/04


Date: Sun, 11 Jul 2004 22:23:08 -0500

Francois Grieu <fgrieu@francenet.fr> wrote:
+---------------
| "Skybuck Flying" wrote:
| > Well for hardware I don't know... Seems a bit weird to me... Since you
| > need to receive bytes first anyway ?
| > Or can a crc calculation be done, bit by bit ?
|
| Yes, CRC calculation in hardware can be done while bits are received
| (after removing "transparency" elements)
+---------------

Moreover, the CRC calculation can be done in hardware one bit at a
time, or 8 bits at a time, or 32, 64, 128 bits &c. at a time, with
fairly small amounts of logic. Doing the Autodin-II CRC-32 (Ethernet,
FDDI, ATM/AAL5, etc.) in parallel 32 bits at a time (that is, consuming
32 input bits per clock) takes about 500 XOR-gate-equivalents (or maybe
less with multi-input XORs).

[Little-known fact: An M-bit CRC can be done N bits at a time even
if N >> M. E.g., a CRC-32 can be done 128 bits at a time (per clock).
Just takes a little more hardware...]

+---------------
| > I am not sure... but maybe ethernet adds a crc32 to the end of the frame.
|
| It does.
+---------------

And it pre-initializes the result to all one's *and* it one's-complements
the result before appending it to the outgoing frame [thus improving its
ability to detect lost runs of zeros at points in the frame where the
current running CRC happens to be zero].

+---------------
| > That would be unfortunate since frames can have variable sizes.
|
| No problem at all with a canonical CRC, which can be entered in the
| CRC engine as data.
+---------------

And at least for the Autodin-II CRC-32, the "magic" constant that
results from a valid CRC at the receiver does *not* depend on the
frame length.

+---------------
| > A better technique at least for calculating the crc32... is to put the crc
| > in the header of the frame. That way the location of the crc32 is always
| > known and can be calculated... as soon as the payload bytes arrive.
|
| And then the sender must compute the CRC on the whole frame before
| sending it, making an extra pass in the data, adding latency...
+---------------

Not only *that*, but it can double (or in the worst case, triple!)
the required memory bandwidth!

+---------------
| CRC at start of data is never used in any data transmission context,
| to the best of my knowledge.
+---------------

Although putting multiple CRCs in a packet is done by some protocols.
E.g., DEC's DDCMP link-level protocol had a short, fixed-length header
which contained both the length of the following variable-length data
section *and* a CRC-16 over just the header (including that data length).
Then the data section which followed had a completely separate CRC-16
over just the variable-length data. The result was that the receiver
saw a good header CRC, it could (with high confidence) trust the length
field and dynamically set up the DMA transfer for the following variable-
length section to be just that length.

+---------------
| Computing a 16-bit CRC in software on an 8-bit CPU only cost 2 table
| lookups, a XOR, and few load/store, for each byte of data.
+---------------

A bit of ancient trivia: With the standard BSC/SDLC CRC-16, the table
can be compressed into 12 bits in width by stripping out "bit columns"
that are all zeros, resulting in a table that is just 256 12-bit words.
Thus you need only one table lookup per byte (though you must do a few
shifts/masks/XORs to regenerate the 16 bits from the 12-bit value).
This is how we did CRC-16's in software on the DEC PDP-8 circa 1972.
[Two of them per packet: see note above about DDMCP... ;-} ]

Similarly, computing a CRC-32 in software on a 32-bit CPU costs only
one table lookup, two XORs, and a byte-fetch for each byte of data
(assuming you're doing the whole packet in a tight loop, otherwise
add another 32-bit store & load per byte).

+---------------
| > So instead the packet can be stored and crc32 calculations done later.
|
| Adding latency, and implying an extra pass in the data.
+---------------

And extra memory bandwidth and CPU consumption.

-Rob

-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607



Relevant Pages

  • Re: Need a sysRPL CRC16 routine.
    ... The 2nd subroutine call does the actual CRC calculation, ... So to exclude the ob's length field from CRC calculation, ... no matter what actual length hex string. ...
    (comp.sys.hp48)
  • Re: Problems with Spartan6 CRC calculation
    ... One interesting deviation in Spartan6 compared to other architectures: The Auto-CRC after the FDRI write does not reset the CRC. ... /// \brief Length of the CRC calculation. ... I am also assuming that only "payload words" (after the packet header) factor into the calculation, just as with the other architectures, and UG380 explicitly shows the address length as 6 bits. ...
    (comp.arch.fpga)
  • Re: CRC of Preferred Roaming List problem
    ... > I have a problem with the crc calculation for the preferred roaming ... > the 16 bit preferred roaming list crc field PR_LIST_CRC is calculated ... > algorithm to calculate the CRC of the preferred roaming list? ...
    (comp.arch.embedded)
  • Re: Query in 32 bit Parallel CRC...urgent
    ... The CRC calculation is just an LFSR. ... is only the result of a mathematical simplification. ... The mathematics underlying *why* CRCs work as ...
    (comp.lang.vhdl)
  • Re: CRC of Preferred Roaming List problem
    ... > I have a problem with the crc calculation for the preferred roaming ... > the 16 bit preferred roaming list crc field PR_LIST_CRC is calculated ... > algorithm to calculate the CRC of the preferred roaming list? ...
    (comp.arch.embedded)