[volunteers] Monitoring SVLUG Processes
mark weisler
mark at weisler-saratoga-ca.us
Sun Jul 19 09:24:26 PDT 2009
On Jul 19, 2009, at 2:20 AM, Rick Moen wrote:
> Quoting Mark Weisler (mark at weisler-saratoga-ca.us):
>
>> Would it be worthwhile to use some automated monitoring tools on our
>> processes? Like Nagios for example?
>
> You mean automated monitoring of those processes from a remote host?
> Sure, go ahead.
Wondering about appropriate ways to do this considering...
a. the decidedly fragile nature of the ten-year-old system we are
running (I am not slamming the system; on the contrary, it is a well
configured system considering the state of the art ten years ago).
b. that this is a volunteer operation.
c. that our communications are not life-and-death matters.
I was thinking of something very simple and unobtrusive, a solution
that does not require much in the way of resources to run.
Maybe a cron job on the server, which would run every, say, five minutes.
It would be something like...
If load < 0.3 send "i'm alive" message to external listener
(optionally with some system stats).
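
To make the server side concrete, here is a rough sketch in Python of what such a cron job might look like. It is only an illustration under some assumptions: the addresses are made-up placeholders, a local MTA is assumed to accept mail on localhost, and the 0.3 threshold is simply the figure mentioned above.

```python
#!/usr/bin/env python3
"""Heartbeat sender sketch, meant to be run from cron every five minutes."""
import os
import smtplib
import socket
import sys
from email.message import EmailMessage

LOAD_THRESHOLD = 0.3                   # figure taken from the proposal
SENDER = "heartbeat@example.org"       # hypothetical address
RECIPIENT = "listener@example.org"     # hypothetical address


def build_heartbeat(load1, threshold=LOAD_THRESHOLD):
    """Return an "i'm alive" message, or None if the load is too high."""
    if load1 >= threshold:
        return None  # stay silent; the listener's timeout raises the alarm
    msg = EmailMessage()
    msg["From"] = SENDER
    msg["To"] = RECIPIENT
    msg["Subject"] = "i'm alive"
    msg.set_content(
        f"host: {socket.gethostname()}\nload (1 min): {load1:.2f}\n"
    )
    return msg


def main():
    load1, _, _ = os.getloadavg()      # 1-, 5- and 15-minute load averages
    msg = build_heartbeat(load1)
    if msg is not None:
        with smtplib.SMTP("localhost") as smtp:  # assumes a local MTA
            smtp.send_message(msg)


# Only send when explicitly asked, so a dry run or import does nothing.
if __name__ == "__main__" and "--send" in sys.argv[1:]:
    main()
```

A crontab entry along the lines of "*/5 * * * * /usr/local/bin/heartbeat.py --send" (the path is hypothetical) would run it every five minutes. Note that a busy server simply stays quiet, so a sustained high load looks the same to the listener as an outage.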
The listener (a volunteer's computer on the network) would listen for
mail from the svlug server and, if none arrives within, say, eleven
minutes, it would send a text/SMS message to a list of appropriate
volunteers.
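
The listener's timeout logic could be sketched roughly as follows, again in Python. This covers only the "has it been more than eleven minutes?" part; how the heartbeat mail is actually detected (mbox, IMAP, procmail hook) and how the SMS goes out are left open, so the notify callable here is a hypothetical stand-in for whatever sends the text message.

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=11)   # from the proposal: just over two missed beats


class Watchdog:
    """Track the last heartbeat and alert once per outage."""

    def __init__(self, notify, timeout=TIMEOUT, now=None):
        self.notify = notify          # hypothetical callable taking a string
        self.timeout = timeout
        self.last_seen = now or datetime.now()
        self.alerted = False

    def heartbeat(self, when=None):
        """Record that an "i'm alive" mail arrived."""
        self.last_seen = when or datetime.now()
        self.alerted = False          # server is back; re-arm the alert

    def check(self, now=None):
        """Call periodically; return True if the heartbeat is overdue."""
        now = now or datetime.now()
        overdue = now - self.last_seen > self.timeout
        if overdue and not self.alerted:
            self.notify(
                f"No heartbeat from server since {self.last_seen:%H:%M}"
            )
            self.alerted = True       # avoid re-texting on every check
        return overdue
```

The idea would be to call heartbeat() whenever a message from the server turns up and check() every minute or so. Alerting only once per outage keeps the volunteers' phones from being flooded while the server stays down.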
Does this make sense? Suggestions? If it does make sense can anyone
suggest tools that can help with this, especially on the listener
side? (Nagios maybe? http://en.wikipedia.org/wiki/Nagios)
Thanks.
Mark
>
>
> You'll notice if you read the system cronjobs that it, itself, tries to
> do monitoring of its own process list and relaunch essential daemons if
> they're for some reason not running. Of course, if the root cause is
> that the system's run itself clean out of RAM/swap, then self-monitoring
> is going to tend to be a little fallible.
>
> On the flip side of that, external monitoring of network service
> availability may be immune to out-of-memory problems on the monitored
> host, but can't accomplish much beyond notification -- which is of
> course better sooner than later.
>
> Meanwhile, I've just attempted to better tune system performance by
> considerably reducing the number of running Apache httpd processes
> (as it needs far fewer, now that Apache's used only for Mailman and not
> SVLUG's other Web content), and cutting the number of spamd processes
> from 32 to 24. We'll see.