Re: Statistical Anomaly Analysis?
From: Vern Paxson (vern@icir.org)
Date: 03/19/02
To: "Marcus J. Ranum" <mjr@nfr.com>
Date: Mon, 18 Mar 2002 22:33:22 -0800
From: Vern Paxson <vern@icir.org>
(finally have my head [briefly!] above water - interesting thread!)
> >Interesting. I thought that it would be harder for small scale networks.
> >For large scale networks, the aggregation of traffic should smooth the
> >variations and turn it to more statistically expectable distributions.
>
> Right! Smoother distributions mean that you're losing detail - the "edge"
> cases that you're looking for for an IDS capability. Basically, what you're
> dealing with is a signal processing problem - you're trying to pick a signal
> out of a lot of other signals (not noise: other signals). So if you start
> doing standard distributions what you're doing is eliminating the weaker
> signals (the trojan horse remote control channel) and favoring the stronger
> signals (HTTP traffic)...
>
> I keep waiting for Vern to chime in on this one and tell me I'm an idiot...
> Vern? ;)
'Fraid I have to disappoint :-), I agree with you. Traffic does smooth
out in a statistical sense as it's aggregated - *but*, and this is the
key, events that were 6-sigma outliers for a small network, and hence
almost never happened, remain 6-sigma outliers for a large network, but
now *do* happen, because you have such a high volume of traffic. Here
I'm referring to statistical "anomalies" that in fact are benign.
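The arithmetic behind that point can be sketched roughly as follows. This is only an illustration under loud assumptions: it treats the measured quantity as Gaussian (real traffic is usually heavier-tailed, so this likely understates how often benign outliers occur), and EVENTS_PER_HOST_PER_DAY is a made-up figure, not a measurement. The host counts are the approximate sizes of the three networks mentioned in this message.

```python
import math


def two_sided_tail(k_sigma: float) -> float:
    """Probability that a Gaussian sample lies more than k_sigma from the mean."""
    return math.erfc(k_sigma / math.sqrt(2))


def expected_outliers_per_day(hosts: int, events_per_host_per_day: int,
                              k_sigma: float = 6.0) -> float:
    """Expected number of benign k-sigma events per day across the whole network.

    The per-event probability does not change with network size; only the
    number of events (and hence the expected count) grows.
    """
    return hosts * events_per_host_per_day * two_sided_tail(k_sigma)


# Hypothetical per-host event volume, chosen only for illustration.
EVENTS_PER_HOST_PER_DAY = 10_000

# Approximate host counts quoted in this message.
NETWORKS = {"ICSI": 150, "LBL": 6_000, "UC Berkeley": 45_000}

for name, hosts in NETWORKS.items():
    rate = expected_outliers_per_day(hosts, EVENTS_PER_HOST_PER_DAY)
    print(f"{name:12s} {hosts:6d} hosts  ~{rate:.4f} benign 6-sigma events/day")
```

Under these assumptions the per-event probability is identical for all three networks, and the expected daily count scales linearly with the number of hosts, which is why an event that is effectively never seen on a small network shows up routinely on a large one.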
One facet of this is discussed in (the journal version of) the Bro paper
as the problem of "Crud" (section 7.3 of
http://www.icir.org/vern/papers/bro-CN99.html).
I also experience this every day, through three different Bro systems
I'm involved in operating. One watches ICSI's network, which has ~150
hosts that are centrally managed and running just a few different OS's.
I can imagine being able to develop useful statistical profiles for that
network; something we may try at some point, though ICSI's cross-section
for being attacked is pretty low (this is **not** an invitation to anyone
to increase it!).
Another watches LBL's network, which has ~6,000 hosts that are diversely
managed. The traffic is *much* more variable and pretty much every day
we see new-but-benign weird things.
Another watches UC Berkeley's network, which has ~45,000 hosts. That traffic
likewise has an immense amount of variability; but its bulk statistics
(total packets or TB per day, traffic mix) are more stable than LBL's, in
line with the smoothing you discuss above. But I agree with the follow-on
discussion in this thread - it's not clear you can do much with that sort
of regularity beyond detecting huge-spike attacks like worms or DDOS (which
certainly make themselves apparent without anomaly detection anyway).
Vern