Re: Features for a monitoring tool

From: Juha Laiho <Juha.Laiho@iki.fi>
Date: Sun, 16 Feb 2003 13:32:01 GMT

Felinux <felinux@supereva.it> said:
>I am writing a C/C++ client/server monitoring tool for Linux. It's just a
>hobby, and it's still on the starting grid (client and server sum up to
>about 300 lines). The rough (and certainly not original) idea is: you run
>the server on a host you want to remotely keep an eye on, then you run the
>client on your, say, home workstation.

No, it's the other way around: the clients run on the monitored machine(s)
and the server runs somewhere else, possibly with a separate user interface
for the server. Some monitoring is best done by having the server poll the
client, but there are also cases where you want the client to originate
notifications without waiting for the server to poll for the info.
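
A minimal sketch of the two styles, using my terminology (agent/client on the
monitored host, server elsewhere). The class names, the "disk_full" event and
the 90% limit are made up for illustration, and the network is replaced by
plain method calls:

```python
class Server:
    """Collects whatever the monitored hosts report (hypothetical class)."""

    def __init__(self):
        self.events = []

    def receive(self, event):
        # Called by an agent that decided on its own to report something.
        self.events.append(event)


class Agent:
    """Runs on the monitored host (hypothetical class)."""

    def __init__(self, host, notify):
        self.host = host
        self.notify = notify          # stands in for a connection to the server
        self.disk_used_pct = 40

    def poll(self):
        # Pull style: the server asks, the agent answers with current values.
        return {"host": self.host, "disk_used_pct": self.disk_used_pct}

    def set_disk_used(self, pct, limit=90):
        # Push style: report immediately when a limit is crossed,
        # without waiting for the next poll.
        self.disk_used_pct = pct
        if pct >= limit:
            self.notify({"host": self.host, "event": "disk_full", "value": pct})


server = Server()
agent = Agent("web1", server.receive)
snapshot = agent.poll()       # server-initiated
agent.set_disk_used(95)       # agent-initiated notification
```

Polling suits slowly changing values that you want to graph; notifications
suit events that should not wait for the next polling interval.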

>At the moment, the password is stored in a config file on the server
>and is sent clear-text by the client, but I will encrypt it in the near
>future.

Ok, and then (using your terminology on server and client here) the
server will still need to present the password in clear to the client,
so the server will need the means to decrypt the password. Those means
will then reside on the server, so you're not gaining much by encrypting
the password and leaving the tools to decrypt it right next to it.
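
For comparison, here is a sketch of one common way to avoid sending the
password over the wire at all: an HMAC challenge-response. This is a
swapped-in technique, not something from the original design, and the
function names are invented for the example. Note that it does not escape the
point above: the verifying side still has to hold the shared secret in usable
form.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    # Verifier side: a fresh random value for every login attempt.
    return secrets.token_bytes(16)


def respond(shared_secret, challenge):
    # Prover side: prove knowledge of the secret without transmitting it.
    return hmac.new(shared_secret, challenge, hashlib.sha256).hexdigest()


def verify(shared_secret, challenge, response):
    # Verifier side: recompute the expected answer and compare in
    # constant time. The secret must be available here in the clear.
    expected = respond(shared_secret, challenge)
    return hmac.compare_digest(expected, response)


secret = b"example-secret"            # assumed to be configured on both ends
challenge = make_challenge()
ok = verify(secret, challenge, respond(secret, challenge))
bad = verify(secret, challenge, respond(b"wrong-secret", challenge))
```

An eavesdropper sees only the challenge and the response, and a replayed
response fails against the next, different challenge.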

>The reason I am writing this is I have no idea what exactly the data
>sent by the server should be. To get to the point(s):
>1) what info do you think are REALLY critical to have a useful snapshot of
>the situation of a typical server?
>2) what features would YOU like to see on such a tool?

- configurable monitoring, because I probably am running things that you
  didn't know about
- a good user interface: a summary of the current situation at a glance,
  plus the ability to drill down into a number of different details
  whenever needed
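
A rough sketch of what "configurable" could mean in practice: each check is
just a named probe plus thresholds, so a user can add checks for things the
tool author never heard of. The `Check` class, the status names and the
example values are all assumptions for illustration:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Check:
    name: str
    probe: Callable[[], float]    # returns the current measured value
    warn: float                   # value at which the check turns WARNING
    crit: float                   # value at which the check turns CRITICAL

    def status(self) -> str:
        value = self.probe()
        if value >= self.crit:
            return "CRITICAL"
        if value >= self.warn:
            return "WARNING"
        return "OK"


def summary(checks: List[Check]) -> Dict[str, str]:
    # The "at a glance" view: one status per configured check.
    return {check.name: check.status() for check in checks}


# Example configuration with fixed probe values standing in for real probes.
checks = [
    Check("disk_root_pct", lambda: 93.0, warn=80, crit=90),
    Check("my_app_queue_len", lambda: 12.0, warn=100, crit=500),
]
overview = summary(checks)
```

The drill-down view would then show the raw probe values and history behind
each status.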

Examples for very basic monitoring ("these on all hosts"):
- disk space utilization
- disk I/O utilization
- memory utilization
- swap utilization
- CPU utilization
- kernel data structure utilization
  - file table
  - process table
- process monitoring
  - syslogd
  - ntpd
  - sshd
  - crond
  - processes in "zombie" state
- log files
  - login failures
  - administrative (root) logons
  - security events (su/sudo usage)
  - ntpd status
  - cron job status
  - kernel distress messages
- active checking
  - ntpd status
  - DNS operation
  - network link (but then, with only one link, the problem cannot
                  be reported immediately)
- environmental
  - UPS status (if there is a UPS)
  - temperature (if possible)
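
A few of the basic items above can be probed with very little code. The
sketch below covers disk space and swap utilization only; it assumes a Linux
host where /proc/meminfo has "SwapTotal" and "SwapFree" lines, and the
function names are made up for the example:

```python
import shutil


def disk_used_pct(path="/"):
    # Percentage of the filesystem holding `path` that is in use.
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def parse_meminfo(text):
    # Turn /proc/meminfo-style text ("Name:   12345 kB") into {name: kB}.
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_pct(meminfo):
    # Swap utilization from the parsed values; 0.0 if no swap is configured.
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * (total - meminfo.get("SwapFree", 0)) / total


# Sample text in /proc/meminfo format, used here instead of the real file.
sample = "MemTotal: 1000 kB\nMemFree: 250 kB\nSwapTotal: 2000 kB\nSwapFree: 1500 kB\n"
swap_pct = swap_used_pct(parse_meminfo(sample))
```

On a real host the text would come from reading /proc/meminfo, and each
function could be plugged in as a probe for a threshold check.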

... and on top of these the monitoring of the actual function of the host.

But this (the monitoring targets) is the easy part. Not losing messages
and getting the UI right are the hard part. Security isn't easy, either.
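
To make "not losing messages" a bit more concrete, here is one possible
approach, sketched under my own assumptions: the sender numbers every message
and keeps it until the receiver acknowledges it, and the receiver discards
duplicates caused by resends. The `Outbox` and `Inbox` names are invented,
and the network is again replaced by direct calls:

```python
class Outbox:
    """Sender side: keep every message until it has been acknowledged."""

    def __init__(self):
        self.next_seq = 1
        self.pending = {}             # seq -> message awaiting an ack

    def queue(self, message):
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = message
        return seq

    def ack(self, seq):
        # Only now is it safe to forget the message.
        self.pending.pop(seq, None)

    def unacked(self):
        # Everything that still has to be (re)sent, in order.
        return sorted(self.pending.items())


class Inbox:
    """Receiver side: accept each sequence number once."""

    def __init__(self):
        self.seen = set()
        self.messages = []

    def receive(self, seq, message):
        if seq not in self.seen:      # ignore duplicates from resends
            self.seen.add(seq)
            self.messages.append(message)
        return seq                    # the acknowledgement


outbox, inbox = Outbox(), Inbox()
outbox.queue("disk full on web1")
outbox.queue("ntpd died on web1")

# First attempt: message 1 arrives, but its ack is lost; message 2 is lost.
inbox.receive(1, "disk full on web1")

# Retry: resend everything unacknowledged and process the acks this time.
for seq, message in outbox.unacked():
    outbox.ack(inbox.receive(seq, message))
```

A real implementation would also have to keep the pending messages on disk,
so that a restart of the sender does not lose them.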

-- 
Wolf  a.k.a.  Juha Laiho     Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
         PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)