Re: Public disclosure of discovered vulnerabilities

From: Douglas A. Gwyn (DAGwyn_at_null.net)
Date: 05/29/05


Date: Sun, 29 May 2005 16:31:32 -0400


> Douglas A. Gwyn wrote:
> > Bryan Olson wrote:
> >> Negative character codes are consistent with the C standard,
> >> and common systems,
> > No. The value '0xC0' (EBCDIC code for the character '0',
> > as I recall) is the *positive* decimal value 192 on *all*
> > conforming C implementations.

To avoid confusion, I meant the numerical value 0xC0;
I shouldn't have put quotes around it since I didn't
intend to suggest a C "integer character constant",
but rather (as I said and illustrated by an example)
the code value obtained from an external file.

(The C standard's wording about the value of such an
integer character constant, which has type int, is
contradictory, but an example indicates the intent:
the value of an *integer character constant* might be
negative. The contradiction is that the value is
first specified as the hex value, i.e. decimal 192,
but the semantics then say that the value of the
integer character constant is what is obtained by
converting a char that has that value to int. Of
course, when 8-bit char is signed there is no way for
a char to represent that value. The peculiar semantics
are again meant to allow the accidental behavior of
some existing pre-standard implementations.)
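
As a rough illustration of the parenthetical above (a
sketch only, not a quotation from the standard): the
helper names constant_value and char_converted are
invented here. On an implementation where plain char is
8 bits and signed, both would typically yield a negative
value rather than 192; where plain char is unsigned,
both yield 192. Either way the two agree, which is what
the "semantics" wording asks for.

```c
#include <limits.h>

/* Value of the integer character constant '\xC0', which has type int.
   Hypothetical helper, named here only for illustration. */
int constant_value(void)
{
    return '\xC0';
}

/* Value obtained by storing the same code in a plain char object and
   converting that char to int. Negative if plain char is signed and
   8 bits wide; 192 if plain char is unsigned. */
int char_converted(void)
{
    char c = '\xC0';
    return c;
}

/* Nonzero when plain char is a signed type on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Comparing constant_value() with 192 is therefore not
portable, whereas comparing (unsigned char)constant_value()
with 0xC0 is.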

> So quote the standard and show me wrong.

I have copies of several character coding standards,
and none of them assign negative values to code points.

The C standard's specification of fgetc is quite
explicit about avoiding sign extension, which is why you
get the unmolested code value upon input. (This is
essential anyway, as allowing a "signed char knothole"
to occur here would allow two distinct external code
values to be mapped to the same integer value upon
input, on any platform other than two's-complement.)
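
A minimal sketch of the fgetc behavior described above,
assuming tmpfile() is usable on the host. The helper
names roundtrip_byte and via_plain_char are hypothetical.
The first returns what fgetc delivers for a byte written
to a scratch file; the second shows the "knothole" that
opens only if the program itself squeezes the code
through a plain char.

```c
#include <stdio.h>

/* Writes one byte to a temporary file, reads it back with fgetc, and
   returns fgetc's result (EOF if the scratch file cannot be used). */
int roundtrip_byte(unsigned char byte)
{
    FILE *fp = tmpfile();
    int c;

    if (fp == NULL)
        return EOF;
    if (fputc(byte, fp) == EOF) {
        fclose(fp);
        return EOF;
    }
    rewind(fp);
    /* fgetc obtains the character as an unsigned char converted to
       int, so the code value arrives without sign extension. */
    c = fgetc(fp);
    fclose(fp);
    return c;
}

/* Passes a code through a plain char object. If plain char is signed
   and the code exceeds CHAR_MAX, the result is implementation-defined
   (commonly negative), so this is NOT what fgetc does. */
int via_plain_char(int code)
{
    char ch = (char)code;
    return ch;
}
```

With these, roundtrip_byte(0xC0) gives 192 whenever the
scratch file works, while via_plain_char(0xC0) depends on
the signedness of plain char.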


