Re: Statistical Anomaly Analysis?

From: Vern Paxson (vern@icir.org)
Date: 03/19/02


To: "Marcus J. Ranum" <mjr@nfr.com>
Date: Mon, 18 Mar 2002 22:33:22 -0800


(finally have my head [briefly!] above water - interesting thread!)

> >Interesting. I thought that it would be harder for small scale networks.
> >For large scale networks, the aggregation of traffic should smooth the
> >variations and turn it to more statistically expectable distributions.
>
> Right! Smoother distributions mean that you're losing detail - the "edge"
> cases that you're looking for for an IDS capability. Basically, what you're
> dealing with is a signal processing problem - you're trying to pick a signal
> out of a lot of other signals (not noise: other signals). So if you start
> doing standard distributions what you're doing is eliminating the weaker
> signals (the trojan horse remote control channel) and favoring the stronger
> signals (HTTP traffic)...
>
> I keep waiting for Vern to chime in on this one and tell me I'm an idiot...
> Vern? ;)

'Fraid I have to disappoint :-), I agree with you. Traffic does smooth
out in a statistical sense as it's aggregated - *but*, and this is the
key, events that were 6-sigma outliers for a small network, and hence
almost never happened, remain 6-sigma outliers for a large network, but
now *do* happen, because you have such a high volume of traffic. Here
I'm referring to statistical "anomalies" that in fact are benign.
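To put rough numbers on that, here is a back-of-the-envelope sketch.
It assumes, purely for illustration, that the measured quantity is
Gaussian and that events are independent; real traffic need not be
either. The per-day event counts are hypothetical and do not describe
any particular network.

```python
import math


def two_sided_tail(k: float) -> float:
    """P(|Z| >= k) for a standard normal Z (Gaussian assumption)."""
    return math.erfc(k / math.sqrt(2.0))


def expected_outliers(events: float, k: float = 6.0) -> float:
    """Expected number of k-sigma outliers among `events` independent events."""
    return events * two_sided_tail(k)


if __name__ == "__main__":
    # Hypothetical daily event volumes, chosen only to show the scaling.
    for label, events_per_day in [("small", 1e5), ("medium", 1e7), ("large", 1e10)]:
        n = expected_outliers(events_per_day)
        print(f"{label:>6}: {events_per_day:.0e} events/day -> "
              f"{n:.2e} expected 6-sigma outliers/day")
```

The per-event probability is the same in every row; only the number of
draws changes, so the expected count of benign outliers grows linearly
with volume.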

One facet of this is discussed in (the journal version of) the Bro paper
as the problem of "Crud" (section 7.3 of
http://www.icir.org/vern/papers/bro-CN99.html).

I also experience this every day, through three different Bro systems
I'm involved in operating. One watches ICSI's network, which has ~150
hosts that are centrally managed and running just a few different OS's.
I can imagine being able to develop useful statistical profiles for that
network; something we may try at some point, though ICSI's cross-section
for being attacked is pretty low (this is *not* an invitation to anyone
to increase it!).

Another watches LBL's network, which has ~6,000 hosts that are diversely
managed. The traffic is *much* more variable and pretty much every day
we see new-but-benign weird things.

Another watches UC Berkeley's network, which has ~45,000 hosts. That traffic
likewise has an immense amount of variability; but its bulk statistics
(total packets or TB per day, traffic mix) are more stable than LBL's, in
line with the smoothing you discuss above. But I agree with the follow-on
discussion in this thread - it's not clear you can do much with that sort
of regularity beyond detecting huge-spike attacks like worms or DDOS (which
certainly make themselves apparent without anomaly detection anyway).
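As a minimal sketch of what using that regularity might look like, the
following flags days whose bulk total sits far above the typical level,
using a median/MAD robust z-score. The function name, the threshold,
and the sample totals are all hypothetical; this is not how Bro works,
and it would only catch the huge-spike case, not anything subtle.

```python
import statistics


def spike_days(daily_totals, threshold=6.0):
    """Return indices of days whose total is far above the median.

    Uses a robust z-score: 0.6745 * (x - median) / MAD, where MAD is the
    median absolute deviation. The threshold is an arbitrary assumption.
    """
    med = statistics.median(daily_totals)
    mad = statistics.median([abs(x - med) for x in daily_totals])
    if mad == 0:
        return []  # no spread at all; nothing can be scored
    return [i for i, x in enumerate(daily_totals)
            if 0.6745 * (x - med) / mad > threshold]


if __name__ == "__main__":
    # Made-up daily totals (arbitrary units) with one obvious spike.
    totals = [100, 104, 98, 101, 99, 103, 97, 250, 102, 100]
    print(spike_days(totals))
```

The median and MAD are used rather than mean and standard deviation so
that the spike itself does not inflate the baseline it is compared to.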

                Vern


