Re: Cohen's paper on byte order

From: Mok-Kong Shen (mok-kong.shen_at_t-online.de)
Date: 05/08/03


Date: Thu, 08 May 2003 11:49:14 +0200


"Douglas A. Gwyn" wrote:
>
> Mok-Kong Shen wrote:
> > It is also fortunate that a byte variable in C is
> > of type designated 'unsigned char' and not something
> > like 'unsigned extra-short int' which would have implied
> > an inherent integer characteristics. Yes, in C one can
> > do arithmetic operations on such a variable. But that's
> > a 'convenience' matter in my view. In other PLs, e.g.
> > Fortran 95, a character variable is just that. You are
> > not allowed to do arithmetic operations on it. So
> > a byte variable is (primarily) just a 'container' of
> > 8 bits, if you don't exploit the functionalities that
> > C 'allows' you.
>
> That's just nonsense. In C, type unsigned char is an
> integer type capable of representing all integer values
> from 0 through at least 255, possibly higher. It is not
> necessarily capable of representing all characters, and
> these days typically it cannot do so. That's why there
> is a wchar_t (also an integer type via typedef). Bits
> within *any* integer type in C can be accessed using the
> same operations, e.g. w&MASK, 1<<b, etc. The fact is
> that there is a well-defined numerical value for the bit
> pattern in the representation for an integer type and
> conversely (except for padding and trap representations),
> which is fortunate for use with character codes since a
> character code *is* typically an integer value
> (determined by some character encoding standard such as
> EBCDIC or ISO 10646). C's unsigned char type is unique
> in having no padding and no trap representations, making
> it the unit of choice for examining representations of
> other types. When a C program inputs or outputs an
> unsigned char datum, it's not necessarily as a character
> code (usually if the stream is open in text mode it is,
> and usually if the stream is open in binary mode it is
> not). And AES does not apply only to arrays of
> character codes, but to arrays of arbitrary bit data
> (not necessarily stored in 8-bit units).
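
For reference, the C behaviour described in the quoted paragraph can be sketched roughly as follows. The function names (add_wrapping, bit_is_set, copy_representation) are made up for illustration and come from no standard or library; this is a sketch, not a definitive treatment.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* unsigned char is an integer type: arithmetic is allowed, and the
   result wraps modulo UCHAR_MAX + 1 when converted back. */
unsigned char add_wrapping(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}

/* Bits are reached with the same operators as for any integer type.
   Bit 0 is the least significant bit; b must be < CHAR_BIT. */
int bit_is_set(unsigned char w, unsigned b)
{
    return (int)((w >> b) & 1u);
}

/* Copy the object representation of any object into an array of
   unsigned char, the type C guarantees to have no padding bits.
   The order of the bytes in 'out' is whatever the machine uses. */
size_t copy_representation(const void *obj, size_t n, unsigned char *out)
{
    assert(obj != NULL && out != NULL);
    memcpy(out, obj, n);
    return n;
}
```

Assuming a 32-bit unsigned int, calling copy_representation on the value 0x01020304 would typically yield the bytes 04 03 02 01 on a little-endian machine and 01 02 03 04 on a big-endian one, which is where the byte order question enters.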

Then try to do it with the character data type in other
PLs. There, a character is just a certain sequence of
bits on which no integer operations can be done and
which thus has no inherent connotations of an integer
whatsoever. Or do you think that AES is to be
'exclusively' implemented in C? That could well be
the intention of the authors of the document, but,
using the same type of argumentation as yours, where
does the document 'say' that? (The document only
provides an 'abstract' algorithm, right?)

>
> > One also doesn't have difficulty with the hex notation.
> > Gwyn claimed previously that what is used in the
> > document is 'only' applicable within the document,
> > which is an argument un-understandable in my humble
> > view.
>
> The notation is *defined* in the document for use in
> a specific context, just as terms like "index" and
> "byte" are *defined* in the document with specific
> contexts. For purposes of interpreting such a
> document, one necessarily uses the definitions it
> provides for technical terms and notation. And in
> order to understand the connection between two levels
> of description, as in external data and internal field
> elements, it is important not to misapply to another
> context what is specified for a particular context.

What I meant is that {pq} is just AES's way of
saying 0xpq (or even pq when the context is clear).
So it is inherently the common hex notation, applicable
to both internal and external entities, both inside and
outside the document. In this sense it has universal
'applicability', which apparently contradicts some of
the sentences you wrote in the past.
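
To make the notation point concrete, here is a minimal C sketch, assuming the FIPS-197 convention that {pq} names the byte with hex digits p and q. The same 8-bit pattern written 0x57 can be read as the integer 87 or as a finite-field element. The names xtime and gf_mul are conventional but are my own choice here, and the code is an illustration, not a reference implementation.

```c
#include <stdint.h>

/* Multiply a field element by {02}: shift left, then reduce by the
   AES polynomial x^8 + x^4 + x^3 + x + 1 (low byte 0x1B) if the
   high bit was set before the shift. */
uint8_t xtime(uint8_t b)
{
    return (uint8_t)((b << 1) ^ ((b & 0x80u) ? 0x1Bu : 0x00u));
}

/* Multiply two field elements by shift-and-add, using XOR as the
   field addition. */
uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    while (b != 0) {
        if (b & 1u) {
            p ^= a;        /* add the current multiple of a */
        }
        a = xtime(a);      /* next multiple: a times {02} */
        b >>= 1;
    }
    return p;
}
```

With this sketch, gf_mul(0x57, 0x83) gives 0xC1, matching the worked example {57} . {83} = {c1} in the AES document. As an ordinary integer product, 0x57 * 0x83 would be something else entirely, so the hex spelling is shared while the arithmetic applied to it depends on context.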

M. K. Shen


