[svlug] Protecting and recovering from high system load?
Scott Hess
scott at doubleu.com
Mon Aug 28 14:41:33 PDT 2006
On 8/28/06, DzM <svlug at dzm.com> wrote:
> I have sometimes been fortunate enough to be logged into the box when
> this happens and be able to immediately begin kill -9 the PIDs that seem
> to be on top of top. More often than not though I've had to call the ISP
> and have the power cycled in order to recover the machine.
You'd probably be better off running "ps auxmww" and capturing the
output somewhere safe (by which I mean probably not on the box with the
load). A given process (well, thread) counts for at most 1 in the load
average, so the processes at the top of "top" may or may not be
contributing to it, unless they are spawning off a bunch of
subprocesses.
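As a rough sketch of how to see what is actually counted: on Linux the
load average counts tasks that are runnable (state R) or in
uninterruptible sleep (state D). The helper name below is made up for
illustration; it just filters ps-style lines whose first field is the
state.

```shell
# Hypothetical helper (not a standard tool): reads lines on stdin whose
# first field is a ps state code and keeps those starting with
# R (runnable) or D (uninterruptible sleep).
filter_load_contributors() {
    awk '$1 ~ /^[RD]/'
}

# Example use, one line per thread (assumes procps ps):
#   ps -eLo stat=,pid=,lwp=,comm= | filter_load_contributors
```

If that list is long and mostly in state D, the load is more likely
I/O wait than CPU, and killing whatever is on top of "top" probably
won't help.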
pstree output might also be useful, as might "ps ajxmww". It might
also be useful to spin up something to monitor things over time, to
see if this is happening frequently or not. This might be as simple
as a cron job running once a minute and logging the output of "top -b
-n1" to a file.
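A minimal sketch of that cron idea, assuming a POSIX shell. The
function name, script path and log path are placeholders of mine, not
anything standard, so adjust to taste.

```shell
#!/bin/sh
# Hypothetical helper: append a timestamped snapshot of a command's
# output (stdout and stderr) to a log file.
# Usage: log_snapshot LOGFILE COMMAND [ARGS...]
log_snapshot() {
    logfile=$1
    shift
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        "$@" 2>&1
        printf '\n'
    } >> "$logfile"
}

# In a script run from cron (paths are assumptions):
#   log_snapshot /var/log/top-snapshots.log top -b -n1
#
# Matching crontab entry, once a minute:
#   * * * * * /usr/local/bin/log-top.sh
```

The log grows without bound, so rotate it or prune it if you leave
this running for long.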
-scott