Re: TCHAR and buffer overflows

From: David Hopwood (david.hopwood@zetnet.co.uk)
Date: 08/26/02


Date: Mon, 26 Aug 2002 17:21:11 +0000
From: David Hopwood <david.hopwood@zetnet.co.uk>


-----BEGIN PGP SIGNED MESSAGE-----

"Anon E. Maus" wrote:
> "David Hopwood" <david.hopwood@zetnet.co.uk> wrote:
> > Edward Elliott wrote:
> > > strcpy, strcat, sprintf, and scanf are also common offenders. Even
> > > strncpy, strncat, etc don't eliminate buffer overflows; they just make
> > > them harder. If you pass a length that's too long, you can get an
> > > overflow.
> > >
> > > MS had this problem in a lot of their programs. They would write code
> > > like
> > >
> > > WCHAR buf[256];
> > > strncpy(buf, src, sizeof(buf));
>
> This would be a problem if you hadn't defined UNICODE in your build. Hint:
> Copy characters of one string to another.
>
> char *strncpy( char *strDest, const char *strSource, size_t count );
>
> wchar_t *wcsncpy( wchar_t *strDest, const wchar_t *strSource, size_t
> count );
>
> unsigned char *_mbsncpy( unsigned char *strDest, const unsigned char
> *strSource, size_t count );
>
> Routine Required Header Compatibility
> strncpy <string.h> ANSI, Win 95, Win NT
> wcsncpy <string.h> or <wchar.h> ANSI, Win 95, Win NT
> _mbsncpy <mbstring.h> Win 95, Win NT
>
> > >
> > > The problem arises when WCHAR is defined as a Unicode character (the
> > > macro WCHAR was introduced to switch between ascii and unicode chars as
> > > needed).
> >
> > I think you mean TCHAR (WCHAR always represents a UTF-16 code unit).
> > TCHAR is an abomination. For any program that needs to support Unicode,
> > decide on a specific UTF for the internal representation and use it
> > consistently. This is likely to require additional libraries for
> > platforms that have poor OS and/or C library support for Unicode, but
> > such libraries are available as open source (e.g. ICU).
> >
> > > Since Unicode chars are two bytes long, and sizeof returns the
> > > size in bytes, sizeof(buf) is actually 512! The fix of course is to use
> > > sizeof(buf) / sizeof WCHAR, but the bug is very subtle. Even to most
> > > security people, the code looks fine at first.
>
> The code _is_ OK, because the compiler (assuming you are using MS VC++ or
> VS) predefines which lib routine to use based on the Unicode settings
> (compiler switches).

No, not for the <string.h> (str*) functions. You're thinking of the functions
from <tchar.h> (_tcs*). Not even Microsoft would make gratuitously incompatible
changes to <string.h>.

In any case, the code is not OK, because '[const] WCHAR *' is not compatible
with '[const] char *', and therefore it invokes undefined behaviour (not that
it would work even if casts to '[const] char *' were added).

> Check your factory headers for TCHAR defs and such, you'll see what I mean.

Like I said, don't use TCHAR or <tchar.h>; if you're going to support Unicode
then support it on all platforms.

> > It shouldn't look fine to either security people or i18n people.
> > strncpy treats the first zero byte in the source string as terminating
> > the string, so it simply doesn't work for copying UTF-16 strings. Anyway,
> > it will cause a compiler warning due to incompatible types.
>
> //
> // Neutral ANSI/UNICODE types and macros
> //
> #ifdef UNICODE // r_winnt
>
> #ifndef _TCHAR_DEFINED
> typedef WCHAR TCHAR, *PTCHAR;
> typedef WCHAR TBYTE , *PTBYTE ;
> #define _TCHAR_DEFINED
> #endif /* !_TCHAR_DEFINED */

But strncpy does not take 'TCHAR *', it takes 'char *', and also the OP's code
used WCHAR, not TCHAR.

'WCHAR *' is never compatible with 'char *', whether UNICODE is defined or not.

> > Perhaps you meant to say "memcpy(buf, src, sizeof(buf));", but that works
> > correctly (assuming sizeof(src) >= sizeof(buf), and that src and buf don't
> > overlap) even for TCHAR.
>
> In the MS factory libs, memcpy() memmove(), etc. detect overlapping buffers
> and do reverse copy as appropriate.

memmove() is required to do that in Standard C, but memcpy() isn't. I haven't
checked whether the MSVCRT memcpy() implementation actually detects overlapping
buffers, but their docs don't say that it does:

# The memcpy function copies count bytes of src to dest. If the source and
# destination overlap, this function does not ensure that the original source
# bytes in the overlapping region are copied before being overwritten. Use
# memmove to handle overlapping regions.

> > IME, the skills required to write secure internationalised code are just
> > those required to write internationalised code, plus those required to
> > write secure code. As long as you're doing both correctly, there are no
> > hidden gotchas in the interaction between them.
>
> True; I might add one, though: you have to be familiar with your tools and
> how they work. Your work can be only as good as your tools. Lousy tools,
> lousy work. Or good tools, lousy user, lousy work. True quality comes
> from pride of workmanship.

Agreed.

- --
David Hopwood <david.hopwood@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBPWpjFzkCAxeYt5gVAQE0Iwf9FV0ssguhKgcosxNISYjGfoojW+sn7/lz
bjVHKYZG4/8n5qgq28a23r+EDYzdbsoYN+xwDeF2LoXZaminGd6cSI3nq4vDsS9f
VehlzT0yTnLKdmKdAbgR/8NUxF0ChG8YM3GnaxcPN6dYhWqzrmtCGhq+ASVO6pON
gmKuM43QK9dBAXeRKQRDTZpk1AfHI2lqXuEaPym2Zu1OM/okztaFzgkL0/nBxJdK
494l9CA4FhTNWX3S3iH8pFyjEZMx/yy9+jyWSr4rQYpJpf6dIsO4YXLMrvgZ9ptL
0BJpCY+n6fSvd3jMfJ3bizbbMXmYms8W6hr7ViIvIpkIhrGuUYxWpA==
=8MiI
-----END PGP SIGNATURE-----



Relevant Pages

  • Re: TCHAR and buffer overflows
    ... > This would be a problem if you hadn't defined UNICODE in your build. ... > Check your factory headers for TCHAR defs and such, ... checked whether the MSVCRT memcpyimplementation actually detects overlapping ... Or good tools, lousy user, lousy work. ...
    (comp.security.misc)
  • Re: Window types
    ... typedef WCHAR TCHAR, * PTCHAR; ... typedef char TCHAR, * PTCHAR; ... Under a Unicode build, ... If the only text your app needs to deal with can be handled with ASCII ...
    (microsoft.public.vc.language)
  • Re: VC2005 Pro: IDE (Compiler ?) cant find Stdafx.h
    ... For new code I would obviously use UNICODE but for existing stuff ... > Use Multi-Byte Character Set ... > If you use TCHAR etc., and want your code to run on another platform, then ... > typedef char TCHAR; ...
    (microsoft.public.vc.language)
  • Re: TCHAR and buffer overflows
    ... This would be a problem if you hadn't defined UNICODE in your build. ... unsigned char *_mbsncpy(unsigned char *strDest, ... >> macro WCHAR was introduced to switch between ascii and unicode chars as ... > I think you mean TCHAR. ...
    (comp.os.ms-windows.nt.admin.security)
  • Re: TCHAR and buffer overflows
    ... This would be a problem if you hadn't defined UNICODE in your build. ... unsigned char *_mbsncpy(unsigned char *strDest, ... >> macro WCHAR was introduced to switch between ascii and unicode chars as ... > I think you mean TCHAR. ...
    (comp.security.misc)