RE: safe strcpy()?

From: Daniel Reed (n@cs.rpi.edu)
Date: 01/29/03

    Date: Wed, 29 Jan 2003 17:04:58 -0500 (EST)
    From: Daniel Reed <n@cs.rpi.edu>
    To: "Hall, Philip" <phall@spss.com>
    
    

    On 2003-01-28T22:00-0600, Hall, Philip wrote:
    ) > Of course, the real way to build secure software is not
    ) > to use "safe" functions, but to check data validity :-)
    ) Hang on, that sounds akin to not having locks (safe functions) on your
    ) front door, but posting a guard (data validation) at the end of your
    ) drive way...hmmmmm I think I'll stick to my eXtreme Defensive
    ) Programming (XDP) and be paranoid about everything...unless you meant
    ) that by *adding* the data validity to the 'safe' functions to beef them
    ) up...?

    This discussion, while bringing up some interesting points, largely misses
    the point of what "safe" programming involves.

    For example, in one package I maintain I have to deal with converting
    printable strings into HTML entities. Due to protocol constraints, it's
    possible that the encoded string might be too large to send whole, so I
    split such strings at the entity boundary (see below).

    For each character in a printable string, I check whether it needs to be
    encoded or not. Once I have determined what entity will be sent in place of
    the original character, I check whether adding that entity to my buffer
    would push it past its limit. If so, I stop copying and send the buffer
    along before attempting to add the next character.

    This way, if the protocol I was dealing with limited strings to (let's say)
    22 characters, a string such as:
            "1 < 3 - That's the truth"
    might be split into:

             1234567890123456789012
            "1 &lt; 3 &nbsp;- "
            "&nbsp;That's the truth"

    instead of the less desirable:

             1234567890123456789012
            "1 &lt; 3 &nbsp;- &nbsp"
            ";That's the truth"

    The former would decode into "1 < 3 - " + " That's the truth", and could be
    glued back to the original "1 < 3 - That's the truth", whereas the latter
    would decode into "1 < 3 - &nbsp" + ";That's the truth" and be recombined
    into "1 < 3 - &nbsp;That's the truth". Whoops.
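
    To make the failure concrete, here is a minimal sketch of a splitter that
    cuts the already-encoded string every 22 characters without regard to
    entity boundaries. The names naive_chunk and LIMIT are hypothetical, chosen
    for illustration; they do not come from the package described above.

```c
#include <stddef.h>
#include <string.h>

#define LIMIT 22  /* assumed per-message limit, matching the example above */

/* Copy at most LIMIT characters of src, starting at *pos, into chunk.
   Entity boundaries are ignored on purpose, to show the problem.
   Returns the number of characters copied and advances *pos. */
static size_t naive_chunk(const char *src, size_t *pos, char chunk[LIMIT + 1])
{
        size_t left = strlen(src + *pos);
        size_t n = left < LIMIT ? left : LIMIT;

        memcpy(chunk, src + *pos, n);
        chunk[n] = '\0';
        *pos += n;
        return n;
}
```

    Called repeatedly on "1 &lt; 3 &nbsp;- &nbsp;That's the truth", this yields
    "1 &lt; 3 &nbsp;- &nbsp" followed by ";That's the truth": every copy is
    within bounds, yet the second entity is cut in half.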

    Now, the code in question was originally written with a blind fear of buffer
    overflows clouding the original author's style, and worked something like:

            if (input[i] == ' ') {
                    strncpy(output+outputpos, "&nbsp;", sizeof(output)-outputpos);
                    outputpos += sizeof("&nbsp;")-1;
            }

    This would allow a space occurring near the end of "output" to be truncated
    into "&nbsp", as in the example above. The new code is similar to:

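            if (input[i] == ' ') {
                    /* sizeof("&nbsp;") counts the terminating NUL, so the
                       whole entity is known to fit before strcpy() runs */
                    if ((outputpos + sizeof("&nbsp;")) < sizeof(output)) {
                            strcpy(output+outputpos, "&nbsp;");
                            outputpos += sizeof("&nbsp;")-1;
                    } else
                            break;
            }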

    This allows the loop to break once the "output" buffer has become filled,
    for all intents and purposes, and will allow the procedure to empty "output"
    and start from where it left off (so the space wouldn't appear at all in the
    current line, and would instead appear whole in the next line).
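
    The check-then-copy approach generalizes to every character, not just
    spaces. What follows is a sketch under stated assumptions, not the code
    from the package: entity_for and encode_chunk are hypothetical names, and
    the entity table covers only '<' and the space handling shown above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: the entity sent in place of c, or NULL if c is
   sent unchanged. Only two mappings are included for illustration. */
static const char *entity_for(char c)
{
        switch (c) {
        case '<': return "&lt;";
        case ' ': return "&nbsp;";
        default:  return NULL;
        }
}

/* Encode input, starting at *pos, into out (outsize bytes, NUL included).
   Stops before any character whose encoding would not fit whole, so the
   caller can send out and call again from where it left off.
   Returns the number of bytes written, not counting the NUL. */
static size_t encode_chunk(const char *input, size_t *pos,
                           char *out, size_t outsize)
{
        size_t outpos = 0;

        if (outsize == 0)
                return 0;
        while (input[*pos] != '\0') {
                char literal[2] = { input[*pos], '\0' };
                const char *piece = entity_for(input[*pos]);
                size_t len;

                if (piece == NULL)
                        piece = literal;
                len = strlen(piece);
                if (outpos + len + 1 > outsize)
                        break;  /* would not fit whole: flush first */
                memcpy(out + outpos, piece, len);
                outpos += len;
                (*pos)++;
        }
        out[outpos] = '\0';
        return outpos;
}
```

    A caller would loop until encode_chunk returns 0. If it returns 0 while
    input remains, the buffer is too small to hold even one entity, and the
    caller should treat that as an error rather than retry.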

    Security is indeed very important, and if more people made secure code-
    writing a priority, a lot of our lives would become much easier. However,
    there are no magic wands in programming:
            Replacing strcpy() calls with strncpy() calls will not solve all
    problems, and may in fact introduce new ones. In the above example,
    strncpy() did not itself cause a problem, but its ignorant usage led to a
    misbehaviour.
            Using manipulation routines that ensure the string is large enough
    to "hold" everything can lead to its own problems. A quick example: reading
    data from the network. All someone need do is feed your service a constant
    stream of characters, and eventually the program will fill all available
    memory trying to store the string. Again, the fault would lie with a
    programmer ignorantly feeding a network socket directly into a string (as
    I've seen provided in examples on this very list). In all of these cases,
    programmer failure is the common thread; there is no intrinsic flaw in the
    methods or implementations being used.
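
    One way to avoid the unbounded-growth problem is to cap how much pending
    input is ever stored. This is a minimal sketch, assuming a fixed maximum
    line length; struct linebuf, linebuf_append and MAX_LINE are hypothetical
    names, and the cap of 512 is an arbitrary illustrative value.

```c
#include <stddef.h>
#include <string.h>

#define MAX_LINE 512  /* assumed cap; use whatever the protocol allows */

struct linebuf {
        char   data[MAX_LINE + 1];  /* pending line plus terminating NUL */
        size_t len;                 /* bytes currently stored */
};

/* Append n bytes received from the peer. Returns 0 on success, or -1 if
   the pending line would exceed MAX_LINE; in that case nothing is stored
   and the caller should discard the line or drop the connection. */
static int linebuf_append(struct linebuf *lb, const char *chunk, size_t n)
{
        if (n > MAX_LINE - lb->len)
                return -1;
        memcpy(lb->data + lb->len, chunk, n);
        lb->len += n;
        lb->data[lb->len] = '\0';
        return 0;
}
```

    The decision about what to do with oversized input is left to the caller,
    which is the point: the limit is a property of the protocol being handled,
    not of the string routine.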

    -- 
    Daniel Reed <n@cs.rpi.edu>
    Real computer scientists like having a computer on their desk, else how could they read their mail?
    naim FAQ: http://128.113.139.111/~n/naim/FAQ
    

