Re: IDS Correlation
To: "Marcus J. Ranum" <email@example.com>
Date: Wed, 03 Apr 2002 16:23:42 -0800
From: "Stephen P. Berry" <firstname.lastname@example.org>
-----BEGIN PGP SIGNED MESSAGE-----
Marcus J. Ranum writes:
>Incident models are independent of the data representation used.
There's no hard, integral connexion...but I think there are nontrivial
incidental correlations. I won't bother trying to construct a formal
defence of this proposition---I'm not sure that one could be constructed---but
I think it's reasonable to imagine that the way data are presented to
us can affect how we conceptualise those data. And, conversely, the
way we present data tells us something about how we're modelling it.
One of the NIDS widgets I wrote, for example, includes a filter rule that
logs a random sample of traffic that doesn't match any of the other
signatures. Most NIDS don't do this sort of thing, and an independent
observer would probably (correctly) guess that the author of a tool
that does implement it is one of those damn statistical analysis goons.
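For concreteness, the spirit of such a filter rule might be sketched as follows. This is a minimal illustration, not any real NIDS's API; the `process` function, the signature predicates, and the sample rate are all assumptions made up for the example:

```python
import random

def process(packet, signatures, log, sample_rate=0.01, rng=random):
    """Apply signature rules; failing those, log a random sample of
    otherwise-unmatched traffic so the analyst keeps a statistical
    baseline of background traffic."""
    if any(sig(packet) for sig in signatures):
        log.append(("alert", packet))
    elif rng.random() < sample_rate:
        log.append(("sample", packet))
```

The point of the final clause is exactly the statistical-analysis bias mentioned above: traffic that matches nothing still contributes, at a controlled rate, to the picture the analyst sees.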
The thing that strikes me about XML schemata for incident data is that
they look very much like the analysis schemes that output them, and the
analysis schemes in turn look an awful lot like the data on the wire. I.e.,
we still as a whole have a view of an incident as a packet or a clump of
packets zipping by a fixed point in the network.
In many cases this is a serviceable model, and in some cases it might be
functionally complete (i.e., the incident might consist entirely of
a couple packets which match some signature in your NIDS). My thinking
is that while incidents will continue to look like this on the wire
(they'll always be describable as discrete packets), our models for
them will eventually make descriptions on this level less important.
If you look at any reasonably mature behavioural model---traffic patterns,
economics, quality failures in industrial processes, u.s.w.---the
terms in which inputs, results, and metrics are stated are typically
abstracted from the actual quanta which they model.
NIDS methodology is still a long way from being concerned about this
sort of thing in average day-to-day operation, and I don't think anything
will ever entirely supplant packetgazing as an analysis/reporting tool.
My contention is merely that such abstracted techniques will eventually
become more important than they are today; that XML-based representations
(such as IDMEF) are ill-suited to handling the data used by such techniques;
and that as a result, IDMEF fails as a universal format for handling
incident data. The corollary is that if this is true, then the general
perception that things like the IDMEF -are- universal indicates a bias
in how we (as a whole) think about incidents and incident data.
I don't think this bias is an indication of collective stupidity, a
vendor conspiracy, covert intervention by the Mossad, or anything wonky
like that. I think it's a -mistake-...but it doesn't even come close
to making the top ten list of mistakes collectively made by the
information security community (rated by severity or alphabetically).
I -do- think that it is one of the reasons that IDS technologies haven't
progressed further than they have, but that's a topic for another day.
>That being said, consider that you can take a series of events that are
>represented simply and coalesce them into meta-events that contain
>references to their ancestors. It's not hard to do some pretty impressive
>stuff with it once you've done that.
This is true, but it is also just about the limit of what you can do
with such a system. I go into this in somewhat greater detail in my
response to Mr McAlerney and I won't reiterate it all here. The punchline
is that inheritance and aggregation, while great, don't exhaust the
possibilities in terms of relationships between the data that an
analyst might find profitable to determine/track/report.
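The coalescing described in the quoted passage can be sketched simply. The `Event` and `coalesce` names here are hypothetical, but the shape (events carrying references to their ancestors) is the point:

```python
import itertools

_ids = itertools.count()

class Event:
    """A simply-represented event; a meta-event is just an Event
    holding references to the constituent events it was built from."""
    def __init__(self, description, ancestors=()):
        self.id = next(_ids)
        self.description = description
        self.ancestors = list(ancestors)

def coalesce(events, description):
    """Aggregate a series of events into a meta-event."""
    return Event(description, ancestors=events)

def lineage(event):
    """Walk the ancestry tree, yielding every constituent event."""
    for a in event.ancestors:
        yield a
        yield from lineage(a)
```

Inheritance and aggregation fall out of this naturally; what it cannot express is the sort of cross-sensor relationship discussed next.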
The examples that immediately leap to mind involve nonparametric
numeric methods, heuristic rule generation systems, and suchlike.
But more simply, think about the case where you have a sensor
foo on 10.1.1.0/24 and a sensor bar on 192.168.1.0/24. Bar
sees a packet inbound from 10.1.1.25...but foo doesn't see a
matching outbound packet. You also find some ICMP_NET_UNREACHables
on sensor baz (on 172.16.1.0/24), with 10.1.1.25 in the encapsulated
header...but no `stimulus' packets matching the encapsulated header.
Now, try to imagine a model that records the relevant relationships
without loss of information. I.e., you could simply lump
all the data together based on the 10.1.1.25 address being involved
in all the traffic. Recording the null event from foo would be
a little trickier, but not impossible. But if you're fiddling
around with things like comparing the TTL of packets seen by
bar and the TTL encapsulated in the ICMP traffic seen by baz...and
`triangulating' a plausible source based on your knowledge of the
surrounding topology...well, maybe I'm just unimaginative, but
I have trouble envisioning a way of recording all the relevant
information using something like the IDMEF---not counting some
trivial solution like appending an `AdditionalData' bit including
the mail you send to the ISP describing your interpretation of the
incident.
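To make the difficulty concrete, here is a rough sketch of the kind of record the scenario seems to call for: typed relationships among observations at different sensors, including absences (foo seeing no matching stimulus) and derived comparisons (the TTL `triangulation'). Every class and field name here is hypothetical, assumed purely for illustration:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Observation:
    sensor: str
    summary: str
    ttl: Optional[int] = None    # None marks a null (expected-but-absent) event

@dataclass
class Relation:
    kind: str        # e.g. "no-matching-stimulus", "ttl-comparison"
    members: tuple   # the observations the relationship holds between

@dataclass
class Incident:
    observations: list = field(default_factory=list)
    relations: list = field(default_factory=list)

# bar saw inbound traffic claiming to be from 10.1.1.25 ...
seen = Observation("bar", "inbound from 10.1.1.25", ttl=52)
# ... but foo, in front of 10.1.1.0/24, saw no matching outbound packet
absent = Observation("foo", "no matching outbound from 10.1.1.25")
# baz saw ICMP net-unreachables quoting 10.1.1.25 in the inner header
icmp = Observation("baz", "ICMP_NET_UNREACH quoting 10.1.1.25", ttl=47)

incident = Incident(
    observations=[seen, absent, icmp],
    relations=[
        Relation("no-matching-stimulus", (seen, absent)),
        Relation("ttl-comparison", (seen, icmp)),
    ],
)
```

The relations, not the packets, carry the analysis; lumping everything together on the 10.1.1.25 address would discard exactly that structure.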
A similar `problem case' that I can think of off the top of my
head would involve the case where an employee's desktop uses
a modem to connect to a remote network, functions briefly as
a router between the `internal' network and the remote network,
and then disconnects.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.3 (GNU/Linux)
Comment: For info see http://www.gnupg.org
-----END PGP SIGNATURE-----