Re: Thou shalt have no other gods before the ANSI C standard

From: Trevor L. Jackson, III (tlj3_at_comcast.net)
Date: 02/28/05


Date: Sun, 27 Feb 2005 19:00:50 -0500

David Wagner wrote:

> Trevor L. Jackson, III wrote:
>
>>Your list is sufficient to trigger something like an "Aha!". Consider
>>that in any executable the examples you gave are examples of
>>non-application-accesible addresses. We can color them red. The set of
>>program-declared and system-supplied variables are the blue adresses.
>>In theory programs have access only to blue addresses. But proving that
>>a program stays within the blue zone is very difficult.
>
>
> Yes, I think this is exactly it. I find this a very useful way to think
> about things. I was introduced to this style of thinking by Drew Dean, but
> I've never heard it articulated by anyone else.
>
> Caveat: Buffer overruns can still be problematic even if they only give
> the adversary the ability to overwrite blue addresses. For instance, the
> attacker might be able to change the behavior of the program by tampering
> with some critical internal variable. Nonetheless, many buffer overruns
> today take the form of tampering with red addresses, and most of the time
> it seems sufficient to confine our thinking to detecting cases where the
> attacker gains the power to write to red addresses.
>
>
>>Add to the above a form of automated analysis similar to interval
>>arithmetic as applied to program variables and you have the possibility
>>of a tool that could warn of constructs that have the potential of
>>interacting with red addresses.
>
>
> You may enjoy the following paper:
> David Wagner, Jeffrey S. Foster, Eric A. Brewer, and Alexander Aiken.
> A First Step Towards Automated Detection of Buffer Overrun Vulnerabilities.
> NDSS 2000. http://www.cs.berkeley.edu/~daw/papers/overruns-ndss00.pdf
> http://www.cs.berkeley.edu/~daw/boon/
> It reports experience with an attempt to use a form of interval analysis
> to detect, for every program variable, the maximum and minimum values that
> variable could take on. It then attempts to detect cases where a pointer
> can be used to write outside the memory object that it points into. We
> associate each memory object (e.g., any object allocated on the stack or
> heap, or any stretch of memory returned by a call to malloc()) with a
> variable that represents its length, and we look for accesses to memory
> objects that may violate bounds.
>
> One major problem with the analysis described above is that leads to too
> many false alarms. It is unable to deal well with dynamically allocated
> memory objects. For instance, if we have the code snippet:
> char *p = malloc(n);
> ... p[i] ...
> then what we need to know to be sure that this is safe is to know that
> 0 <= i < n. This requires tracking the correlation between a pair of
> variables, not just the upper bound of either.
>
> Another problem with the analysis described above is that it is unsound:
> it can miss bugs. This is because dealing with pointers and aliasing in
> C is very painful. The main source of the problem is the "strong update"
> problem. Consider the following snippet of code:
> int f(int *p, int *q) {
> *p = 5;
> *q = 7;
> return *p;
> }
> If you want to claim that f() returns 5, you need to know that p and q
> don't alias. Accurate aliasing information is hard to infer. And without
> accurate aliasing information, you're stuck with doing a "weak update",
> which means that whenever you see something like "*p = 0;", then each int
> variable that p could point to is updated to contain either the value it
> previously hard or 0 (we don't know which; both are possible). The above
> example looks artificial, but the "strong update" problems comes up all
> over the place in real code and is highly non-trivial to deal with.

This has been a compiler optimization problem for basically forever. I
do not propose to solve it. :-) I don;t think they would affect the
tool I am considering. I propose a run-time analysis rather than a
static analysis.

>
> I don't think I know how to build a tool to detect buffer overruns in
> legacy code that could be both sound (detect all bugs) and precise (not
> have too many false positives). But if you have any ideas on how to do
> that, such a tool would be terribly useflu.

There is no way I would aim to detect all bugs. But I would be happy to
detect all possible writes to red memory. As for positives, there will
be a residue of code that will be reported as problematic, but a human
might be able to tell, by having information that is not represented in
the source code, that those problems are not going to occur. I would
expect the false positive rate to be 50% and the false negative rate to
be hard to measure.

>
>
>>I'd be interested in working on such a tool. It would be terribly slow,
>>but _you only have to run it once per module, component, or subsystem_.
>> So it it takes an hour or a day to run who cares?
>
>
> Absolutely. CPU time is a lot cheaper than programmer time.
>
>
>>I think such a tool is within the current state of the art. If it
>>existed, would it be useful enough to spend the time to run it and
>>interpret the results (a list of software that has the capacity to write
>>to red addresses)?
>
>
> That all depends on how much work it is to interpret the results. If
> the output is essentially a list of all places in the program where any
> pointer write occurs, and the onus is on the human to check every single
> one manually, that would probably require so much effort that such a tool
> would have almost no impact. But if the tool was precise enough that its
> "real bug" to "false positive" rate was not too terribly (a ratio of 1:10
> might be tolerable), then I think that would be extremely useful and
> most definitely worth building.

I've invested several personal man-years into developing tools to
validate legacy numerical software. The extension to memory management
software is not too hard to imagine.

I will read your paper and reflect on it a bit.

/tj3



Relevant Pages

  • Re: How to develop a random number generation device
    ... Perhaps buffer overruns can't be ... required to make safe code possible. ... A buffer overrun is where a process trashes its own memory. ... The links to extended explanatory data seem to be good as well. ...
    (sci.electronics.design)
  • Re: How to develop a random number generation device
    ... Perhaps buffer overruns can't be prevented ... to make safe code possible. ... related to process isolation (preventing one process from trashing another ... A buffer overrun is where a process trashes its own memory. ...
    (sci.electronics.design)
  • RE: checking and improving performance
    ... Handles - maps to the Process Object, Handle Count counter, Instance - ... Total - maps to Memory Object, ... PLEASE NOTE the newsgroup SECURE CODE and PASSWORD were ...
    (microsoft.public.windows.server.sbs)
  • Re: [PATCH 2.6.17-rc4 6/6] Remove some of the kmemleak false positives
    ... are not memory leaks. ... How many false positives are you expecting? ... The tool scans the data section and kernel stacks for pointers. ... allocations during the booting process). ...
    (Linux-Kernel)
  • Re: WDF memory object operation between bus and child driver
    ... Upper driver (child) issue a request to its bus driver. ... Bus driver allocate WDF memory and record the ... Child hadle the memory object in its completion routine and free ...
    (microsoft.public.development.device.drivers)