Re: [fw-wiz] Handling large log files



On Tue, 5 May 2009, Nate Hausrath wrote:

> Hello everyone,
>
> I have a central log server set up in our environment that would
> receive around 200-300 MB of messages per day from various devices
> (switches, routers, firewalls, etc). With this volume, logcheck was
> able to effectively parse the files and send out a nice email. Now,
> however, the volume has increased to around 3-5 GB per day and will
> continue growing as we add more systems. Unfortunately, the old
> logcheck solution now spends hours trying to parse the logs, and even
> if it finishes, it will generate an email that is too big to send.
>
> I'm somewhat new to log management, and I've done quite a bit of
> googling for solutions. However, my problem is that I just don't have
> enough experience to know what I need. Should I try to work with
> logcheck/logsentry in hopes that I can improve its efficiency more?
> Should I use filters on syslog-ng to cut out some of the messages I
> don't want to see as they reach the box?
>
> I have also thought that it would be useful to cut out all the
> duplicate messages and just simply report on the number of times per
> day I see each message. After this, it seems likely that logcheck
> would be able to effectively parse through the remaining logs and
> report the items that I need to see (as well as new messages that
> could be interesting).
>
> Are there other solutions that would be better suited to log volumes
> like this? Should I look at commercial products?

I don't like the idea of filtering out messages completely. The number of times that an otherwise 'uninteresting' message shows up can be significant (if the number of requests for a web image per day suddenly jumps to 100 times what it was before, that's a significant thing to know).
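
To illustrate why the counts are worth keeping, here is a minimal Python sketch that compares per-message counts from two days and flags large jumps. The function name, the `factor` and `min_count` parameters, and the sample messages are all assumptions made for illustration; they are not part of the setup described in this mail.

```python
def find_jumps(yesterday, today, factor=100, min_count=10):
    """Return (message, previous, current) for messages whose daily count
    grew by at least `factor`, most frequent first.

    `yesterday` and `today` map a summarized message to its daily count.
    Messages seen fewer than `min_count` times today are ignored so that
    a handful of brand-new messages does not flood the result.
    """
    jumps = []
    for message, count in today.items():
        previous = yesterday.get(message, 0)
        # Treat a previously unseen message as if it had been seen once.
        if count >= min_count and count >= factor * max(previous, 1):
            jumps.append((message, previous, count))
    return sorted(jumps, key=lambda j: j[2], reverse=True)


# Hypothetical counts for two consecutive days.
yesterday = {"GET /logo.png": 50, "login ok": 200}
today = {"GET /logo.png": 6000, "login ok": 210, "new msg": 5}
jumps = find_jumps(yesterday, today)
```

With these made-up numbers only the image request is flagged; a filter that had dropped that message at the syslog level would have hidden the jump entirely.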

The key is to categorize and summarize the data. I have not found a good commercial tool to do this job (there are good tools for drilling down and querying the logs); the task of summarizing the data is just too site specific. I currently get 40-80 GB of logs per day and have a nightly process that summarizes them.

I first have a process (perl script) that goes through the logs and splits them into separate files based on the program name in the logs. Internally it does a lookup of the program name to a bucket name and then outputs the message to that bucket (this lets me combine all the mail logs into one file, no matter which OS they are from and all the different ways that the mail software identifies itself). For things that I haven't defined a specific bucket for, I have a bucket called 'other'.
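
The splitting step could be sketched roughly as follows. This is Python rather than the perl script mentioned above, and it is not that script: the `BUCKETS` table, the regular expression (which assumes classic BSD-style syslog lines), and the function names are hypothetical and would need to be adapted to each site.

```python
import re
from collections import Counter

# Hypothetical program-name -> bucket table; the real one is site specific.
BUCKETS = {
    "sendmail": "mail",
    "sm-mta": "mail",
    "postfix": "mail",
    "sshd": "auth",
    "named": "dns",
}

# Assumes lines like "May  5 10:00:01 host program[pid]: message".
PROGRAM_RE = re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+([^\s:\[]+)")


def bucket_for(line):
    """Map a log line to a bucket name, falling back to 'other'."""
    m = PROGRAM_RE.match(line)
    if not m:
        return "other"
    # "postfix/smtpd" and "postfix/qmgr" both count as "postfix".
    program = m.group(1).split("/")[0].lower()
    return BUCKETS.get(program, "other")


def split_logs(lines, write):
    """Send each line to write(bucket, line) and return per-bucket counts."""
    counts = Counter()
    for line in lines:
        bucket = bucket_for(line)
        write(bucket, line)
        counts[bucket] += 1
    return counts


sample = [
    "May  5 10:00:01 mx1 postfix/smtpd[123]: connect from unknown",
    "May  5 10:00:02 mx2 sendmail[456]: stat=Sent",
    "May  5 10:00:03 fw1 kernel: link up",
]
collected = []
counts = split_logs(sample, lambda bucket, line: collected.append((bucket, line)))
```

The `write` callback is a stand-in so the routing logic stays separate from the output; at tens of gigabytes per day it would append to one open file per bucket rather than collect lines in memory. The returned counts are the kind of per-bucket message tally that a splitting program can produce as a side effect.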

I then run separate processes against each of these buckets to create summary reports of the information in that bucket. Some of these processes are home-grown scripts, some are log summary scripts that came with specific programs.

One of the reports is how many log messages there are in each bucket (this report is generated by my splitlogs program).

For the 'other' bucket, I have a sed line from hell that filters out 'uninteresting' details in the log messages (timestamps, port numbers, etc) and then run them through a sort | uniq -c | sort -rn to produce a report that shows how many times a log message that looks like this shows up (the sed line works hard to collapse similar messages together).
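
As a rough Python equivalent of that normalize-then-count pipeline (not the actual sed line, which is not shown here), the idea looks something like the sketch below. The four substitution rules are hypothetical examples; a real rule set grows much longer as it gets tuned.

```python
import re
from collections import Counter

# Hypothetical normalization rules, applied in order.
RULES = [
    (re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+"), ""),     # leading syslog timestamp
    (re.compile(r"\[\d+\]"), "[PID]"),                   # process IDs
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "IP"),  # IPv4 addresses
    (re.compile(r"\bport \d+\b"), "port N"),             # port numbers
]


def normalize(line):
    """Strip details that make otherwise identical messages look different."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line.strip()


def summarize(lines):
    """Count normalized messages, most frequent first
    (the role played by sort | uniq -c | sort -rn)."""
    return Counter(normalize(line) for line in lines).most_common()


sample = [
    "May  5 10:00:01 fw1 kernel: drop from 10.0.0.1 port 1234",
    "May  5 10:00:07 fw1 kernel: drop from 10.0.0.2 port 4321",
    "May  5 10:01:00 fw1 foo[77]: started",
]
report = summarize(sample)
```

Here the two firewall drops collapse into a single entry with a count of 2, while the unrelated message stays separate. Unlike the shell pipeline, this keeps every distinct message in memory, which is fine once normalization has collapsed the bucket but is worth keeping in mind at high volume.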

I then have a handful of scripts that assemble e-mails from these reports (different e-mails reporting on different things going to different groups). For a lot of the summaries I don't put the entire report in the e-mail, but instead just do a head -X (X=20-50 in many cases) to show the most common items.
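
A minimal sketch of the "head -X" step, assuming a summary is already available as a list of (message, count) pairs sorted by count. The function name and the layout of the output are made up for illustration.

```python
def top_n_report(title, counts, n=20):
    """Format the first n entries of a sorted (message, count) list as the
    plain-text body of a summary e-mail."""
    lines = [title, "-" * len(title)]
    for message, count in counts[:n]:
        lines.append(f"{count:>8} {message}")
    remaining = len(counts) - n
    if remaining > 0:
        # Say how much was cut so the reader knows the full report is longer.
        lines.append(f"... {remaining} more distinct entries omitted")
    return "\n".join(lines)


# Hypothetical summary data.
counts = [("a", 5), ("b", 3), ("c", 1)]
body = top_n_report("Top", counts, n=2)
```

Noting how many entries were left out is a small addition to a plain head; it makes it obvious when a report has grown far beyond what the e-mail shows.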

For example, I have a report that shows all the websites that were hit by people on the desktop network. I have another report that shows the hits by desktop -> website. I generate an e-mail showing the top 50 entries in each of these reports and send it to the folks looking for unusual activity on the desktop network (it's amazing how accurately a simple report like this can pinpoint a problem desktop machine).

Getting this set up takes a bit of time and tuning, but with a bit of effort you can quickly knock out a LOT of your messages, and then you start finding interesting things (machines that are misconfigured and generating errors on a regular basis, etc). As you fix some of these problems, the 'other' report goes from an overwhelming tens of thousands of lines to a much smaller report. Just concentrate on killing the big items and don't try to deal with the entire report at once (the nightly e-mail to me shows the top several hundred lines of this report so that I can work on tuning it. When I can keep up on the tuning it's not unusual for this to be the entire report).

With this approach (and a reasonably beefy log reporting machine), it takes about 3-6 hours to generate the report (6 hours being the 80 GB days).

I have other tools watching the logs in real time for known bad things (to generate alerts), and am installing Splunk to let me go searching in the logs when I find something in the reports that I want to investigate further (with this sort of log volume, just doing a grep through the logs can take days).

hope this helps.

David Lang
_______________________________________________
firewall-wizards mailing list
firewall-wizards@xxxxxxxxxxxxxxxxxxxxx
https://listserv.icsalabs.com/mailman/listinfo/firewall-wizards


