[volunteers] Why you are careful about /dev/null operations as the root user

Rick Moen rick at linuxmafia.com
Wed Dec 16 17:29:42 PST 2015


(I started writing this about 9am, and it's in chronological order.
To skip to the good bits, search for '/dev/null'.)


I'm just back from my flight out of the country, and am dead tired.  
Noticed that a post I made to Volunteers didn't show up (in my mail or
the Web archive), so I ssh'ed into lists.svlug.org, did 'su -' to
gain root, and rebooted.  Yes, quick and sleazy, but I'm tired and this
sometimes makes symptoms go away.  

Did this from my laptop:

Litte-Datamaskin:$ ssh lists.svlug.org
Permission denied (publickey).
Litte-Datamaskin:Desktop rick$ telnet lists.svlug.org 25
Trying 71.19.144.13...
Connected to lists.svlug.org.
Escape character is '^]'.
220 svlug.org ESMTP Exim 4.44 #1 Wed, 16 Dec 2015 09:55:28 -0800 - mm9
HELO linuxmafia.com
250 svlug.org Hello linuxmafia.com [198.144.195.190]
MAIL FROM: rick at linuxmafia.com
250 OK
RCPT TO: test at lists.svlug.org
250 Accepted
DATA
354 Enter message, ending with "." on a line by itself
421 svlug.org SMTP incoming data timeout - closing connection.
Connection closed by foreign host.
Litte-Datamaskin:$

And later:

Litte-Datamaskin:$ telnet lists.svlug.org 25
Trying 71.19.144.13...
Connected to lists.svlug.org.
Escape character is '^]'.
220 svlug.org ESMTP Exim 4.44 #1 Wed, 16 Dec 2015 13:34:20 -0800 - mm9
HELO linuxmafia.com
250 svlug.org Hello linuxmafia.com [198.144.195.190]
MAIL FROM: rick at linuxmafia.com
250 OK
RCPT TO: test at lists.svlug.org
250 Accepted
DATA
354 Enter message, ending with "." on a line by itself
From: Rick Moen <rick at linuxmafia.com>
To: test at lists.svlug.org
Subject: Test #2 Dec. 15, 2015

Test #2 Dec. 15, 2015
.
250 OK id=1a9JiW-0000ka-Kx
quit
221 svlug.org closing connection
Connection closed by foreign host.
Litte-Datamaskin:$


The two test messages don't show up in http://lists.svlug.org/archives/test/, 
nor Mailman's admin queue.


After deleting my MTA's retry database, and forcing a delivery run of
outbound mail:

linuxmafia:~# exim -q -v
LOG: queue_run MAIN
  Start queue run: pid=32247
delivering 1a8mOV-0006z4-Cq (queue run pid 32247)
R: dnslookup for up at trpx.net
LOG: MAIN
  == up at trpx.net R=dnslookup defer (-1): host lookup did not complete
delivering 1a9Fut-0006xt-HN (queue run pid 32247)
R: dnslookup for volunteers at lists.svlug.org
T: remote_smtp for volunteers at lists.svlug.org
Connecting to svlug.org [71.19.144.13]:25 ... connected
  SMTP<< 220 svlug.org ESMTP Exim 4.44 #1 Wed, 16 Dec 2015 13:44:12 -0800 - mm9
  SMTP>> EHLO linuxmafia.com
  SMTP<< 250-svlug.org Hello linuxmafia.com [198.144.195.186]
         250-SIZE 10485760
         250-EXPN
         250-PIPELINING
         250-AUTH PLAIN LOGIN CRAM-MD5
         250-STARTTLS
         250 HELP
  SMTP>> STARTTLS
  SMTP<< Fatal: no entropy gathering module detected
LOG: MAIN
  Malformed SMTP reply from svlug.org [71.19.144.13] in response to STARTTLS: Fatal: no entropy gathering module detected
LOG: MAIN
  == volunteers at lists.svlug.org R=dnslookup T=remote_smtp defer (-19): Malformed SMTP reply from svlug.org [71.19.144.13] in response to STARTTLS: Fatal: no entropy gathering module detected
delivering 1a8pft-0007wE-05 (queue run pid 32247)
R: dnslookup for williamslamar6196 at advfinsa-previsora.telconet.net
T: remote_smtp for williamslamar6196 at advfinsa-previsora.telconet.net
Connecting to advfinsa-previsora.telconet.net [190.95.190.233]:25 ... failed: Connection refused
LOG: MAIN
  advfinsa-previsora.telconet.net [190.95.190.233] Connection refused
LOG: MAIN
  == williamslamar6196 at advfinsa-previsora.telconet.net R=dnslookup T=remote_smtp defer (111): Connection refused
LOG: queue_run MAIN
  End queue run: pid=32247
linuxmafia:~# 


A few minutes later, trying again:

linuxmafia:~# exim -q -v
LOG: queue_run MAIN
  Start queue run: pid=32358
delivering 1a8mOV-0006z4-Cq (queue run pid 32358)
LOG: retry_defer MAIN
  == up at trpx.net routing defer (-51): retry time not reached
delivering 1a8pft-0007wE-05 (queue run pid 32358)
R: dnslookup for williamslamar6196 at advfinsa-previsora.telconet.net
T: remote_smtp for williamslamar6196 at advfinsa-previsora.telconet.net
LOG: retry_defer MAIN
  == williamslamar6196 at advfinsa-previsora.telconet.net R=dnslookup T=remote_smtp defer (-53): retry time not reached for any host
delivering 1a9Fut-0006xt-HN (queue run pid 32358)
R: dnslookup for volunteers at lists.svlug.org
T: remote_smtp for volunteers at lists.svlug.org
LOG: retry_defer MAIN
  == volunteers at lists.svlug.org R=dnslookup T=remote_smtp defer (-53): retry time not reached for any host
LOG: queue_run MAIN
  End queue run: pid=32358
linuxmafia:~#


Entries in lists.svlug.org's /var/log/exim4/mainlog for the second 'test' delivery attempt:
2015-12-16 13:34:20 SMTP connection from [198.144.195.190]:41032 I=[71.19.144.13]:25 (TCP/IP connection count = 1)
2015-12-16 13:34:20 no IP address found for host emperor.deirdre.org (during SMTP connection from [198.144.195.190]:41032 I=[71.19.144.13]:25)
2015-12-16 13:34:56 1a9JiW-0000ka-Kx SA: Debug: SAEximRunCond expand returned: '1'
2015-12-16 13:34:56 1a9JiW-0000ka-Kx SA: Debug: check succeeded, running spamc
2015-12-16 13:34:59 1a9JiW-0000ka-Kx SA: Action: SA didn't successfully run against message, accepting (time: 3/3 secs | Message-Id: 1a9JiW-0000ka-Kx). From 
<rick at linuxmafia.com> (host=NULL [198.144.195.190]) for test at lists.svlug.org
2015-12-16 13:34:59 1a9JiW-0000ka-Kx <= rick at linuxmafia.com H=(linuxmafia.com) [198.144.195.190]:41032 I=[71.19.144.13]:25 P=smtp S=610 T="Test #2 Dec. 15, 2
015" from <rick at linuxmafia.com> for test at lists.svlug.org

Earlier entries for first 'test' message:

2015-12-16 10:03:07 1a9GOx-0000GW-Dx SA: Debug: check succeeded, running spamc
2015-12-16 10:03:10 1a9GOx-0000GW-Dx SA: Action: SA didn't successfully run against message, accepting (time: 3/3 secs | Message-Id: 1a9GOx-0000GW-Dx). From 
<rick at linuxmafia.com> (host=NULL [198.144.195.190]) for test at lists.svlug.org
2015-12-16 10:03:10 1a9GOx-0000GW-Dx <= rick at linuxmafia.com H=(linuxmafia.com) [198.144.195.190]:42886 I=[71.19.144.13]:25 P=smtp S=610 T="Test #1 Dec. 15, 2
015" from <rick at linuxmafia.com> for test at lists.svlug.org


Hmm, /var/log/exim4/paniclog exists.  That's bad.  It needs to not exist.  Contents:

2015-12-16 06:28:46 1a9D4L-000873-NQ failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 06:31:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 07:01:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 07:01:42 1a9DaE-0008CL-Ru failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 07:05:02 1a9DdW-0008D1-Np failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 07:31:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 07:33:18 1a9E4p-0008JY-0N failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 07:34:55 1a9E6R-0008Jn-Fw failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 07:36:15 1a9E7j-0008KE-MH failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 07:50:59 1a9ELv-0008MP-Cm failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:00:02 1a9EUk-0008Ny-H3 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:00:02 1a9EUk-0008O0-K3 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:00:02 1a9EUk-0008O2-Lj failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:00:02 1a9EUk-0008O4-NR failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:00:02 1a9EUk-0008O6-Ou failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:00:02 1a9EUk-0008O8-QO failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:00:02 1a9EUk-0008OA-RX failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:01:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:05:01 1a9EZZ-0008P2-Dy failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:05:19 1a9EZn-0008P4-Tc failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:13:12 1a9EgM-0008Q0-2P failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:23:27 1a9ErP-0008Rw-MX failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:25:37 1a9EtQ-0008Ry-Vh failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 08:31:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 09:01:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 09:01:09 1a9FRp-00005x-Tv failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 09:05:00 1a9FVc-00006Q-UP failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 09:05:01 1a9FVd-00006j-Oz failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 09:05:56 1a9FWS-00006o-Se failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 09:08:41 1a9FZ8-00007G-0Y failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 09:31:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 09:37:07 1a9G0h-0000Bp-Fa failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 09:39:29 1a9G2v-0000C7-Q4 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 09:47:57 1a9GB7-0000DM-Ty failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 10:01:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 10:03:10 1a9GOx-0000GW-Dx failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 10:05:02 1a9GRh-0000HE-VF failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 10:19:18 1a9GfW-0000JL-QY failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 10:31:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 10:36:04 1a9Gvk-0000MI-5i failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 11:01:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 11:05:02 1a9HNm-0000Qu-70 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 11:09:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 11:11:45 1a9HU8-0000NZ-EG failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 11:12:29 1a9HUv-0000OI-Ko failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 11:27:22 1a9HjL-0000Qa-1l failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 11:39:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 11:52:46 1a9I7m-0000UD-SF failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 12:05:01 1a9IJp-0000WB-FN failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 12:09:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 12:11:27 1a9IPz-0000X6-Ii failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 12:38:52 1a9IqW-0000b7-FH failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 12:39:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 12:54:49 1a9J5y-0000do-7u failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 12:57:31 1a9J8Z-0000eF-VL failed to open /dev/null: Permission denied (euid=111 egid=111)
[...]
2015-12-16 13:17:41 1a9JS5-0000ho-4V failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 13:34:59 1a9JiW-0000ka-Kx failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 13:37:11 1a9Jkx-0000l7-FZ failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 13:39:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 13:49:57 1a9JxN-0000n0-Qr failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 14:05:01 1a9KBx-0000tT-Vx failed to open /dev/null: Permission denied (euid=111 egid=111)


Holy mother of god.  Someone/something overwrote /dev/null.



lists:/var/log/exim4# ls -al /dev/null
ls: unrecognized prefix: rs
ls: unparsable value for LS_COLORS environment variable
-rw-r--r--    1 root     root           45 Dec 16 14:06 /dev/null
lists:/var/log/exim4# 
lists:/var/log/exim4# cd /dev
lists:/dev# cat null
1287
2756
2903
3025
3033
3034
3332
3333
3457
lists:/dev# 

/dev/null should be a character device, not a regular file.  Compare my server:

$ ls -al /dev/null
crw-rw-rw- 1 root root 1, 3 Dec  2 01:48 /dev/null
$

Probably either a person or script did 'cp' or equivalent to /dev/null, rather than 
directing stdout or stderr to /dev/null as intended.

Fixing:


lists:/dev# rm null 
rm: remove regular file `null'? y
lists:/dev# mknod -m 666 null c 1 3 
lists:/dev# ls -al null 
ls: unrecognized prefix: rs
ls: unparsable value for LS_COLORS environment variable
crw-rw-rw-    1 root     root       1,   3 Dec 16 14:12 null
lists:/dev# 

Being lazy, I'm rebooting it again:

ists:/dev# exit
root at lists:~# shutdown -r now

Broadcast message from rick at lists.svlug.org
        (/dev/pts/0) at 22:13 ...

The system is going down for reboot NOW!
root at lists:~# 



Suddenly, on linuxmafia.com, I'm getting a huge number of delayed pieces of 
postmaster mail from lists.svlug.org (from before the above reboot), like this one:


Date: Wed, 16 Dec 2015 14:15:50 -0800
>From MAILER-DAEMON Wed Dec 16 14: 3:59 2015
From: Mail Delivery System <Mailer-Daemon at svlug.org>
To: postmaster at svlug.org
Subject: Message frozen

Message 1a9KME-0000eZ-GR has been frozen.
The sender is <svlug-bounces+rogerchrisman=gmail.com at lists.svlug.org>.

The following address(es) have yet to be delivered:
  rogerchrisman at gmail.com: smtp transport process returned non-zero status 0x0200: exit code 2


Suddenly, I am also getting held-mail notices from Mailman on lists.svlug.org.

And suddenly, linuxmafia.com is extremely busy (doubtless processing a metric tonne of mail):

[rick at linuxmafia]
~ $ uptime
 14:29:33 up 14 days, 12:41,  7 users,  load average: 14.91, 16.01, 10.70
[rick at linuxmafia]
~ $ 

Suddenly, I am also getting delayed mail on linuxmafia.com like this, that (if 
lists.svlug.org hadn't been rendered poleaxed by /dev/null being clobbered) would 
have pointed directly to the problem:

Date: Wed, 16 Dec 2015 13:05:01 -0800
>From root at svlug.org Wed Dec 16 14: 8:52 2015
From: Cron Daemon <root at svlug.org>
To: root at svlug.org
Subject: Cron <root at lists> /etc/init.d/spamassassin restart >/dev/null

unix dgram connect: No such file or directory at /usr/sbin/spamd line 1748
Can't write to /dev/null: Permission denied



OK, my system load has climbed back down, after being blitzed by the backlog:

[rick at linuxmafia]
~ $ uptime
 14:37:23 up 14 days, 12:49,  7 users,  load average: 0.44, 7.36, 9.30
[rick at linuxmafia]
~ $ 

Meanwhile, lists.svlug.org is back up.  No recent entries in paniclog, which is 
a step forward:

lists:/var/log/exim4# tail paniclog
2015-12-16 13:02:51 1a9JDj-0000fN-4v failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 13:05:01 1a9JFt-0000fw-OT failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 13:09:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 13:17:41 1a9JS5-0000ho-4V failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 13:34:59 1a9JiW-0000ka-Kx failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 13:37:11 1a9Jkx-0000l7-FZ failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 13:39:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 13:49:57 1a9JxN-0000n0-Qr failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 14:05:01 1a9KBx-0000tT-Vx failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-16 14:09:07 failed to open /dev/null: Permission denied (euid=111 egid=111)
lists:/var/log/exim4# date
Wed Dec 16 14:44:11 PST 2015
lists:/var/log/exim4# 

I'll soon delete the /var/log/exim4/paniclog* files, as they're distracting and 
useless after fix whatever catstrophe they document.

For the record, the other, much earlier entries were:

lists:/var/log/exim4# zcat paniclog.10.gz 
2010-01-08 01:46:17 queue run: process 31156 crashed with signal 9 while delivering 1NS2ni-00052H-JK
2010-01-08 02:25:26 daemon: accept process fork failed: Cannot allocate memory
lists:/var/log/exim4# zcat paniclog.9.gz  
2010-01-08 09:05:26 1NTIC5-0001yr-RC <= 46fce4cb.5040502 at isionline.net H=([109.86.48.113]) [109.86.48.113]:20220 I=[157.22.20.227]:25 P=esmtp S=3202 id=7097717137.QD9TSOGH293346 at surheueqj.aerxbwgxexik.net T="Cialis 80% OFF on Pfizer." from <46fce4cb.5040502 at isionline.net> for stevegt at svlug.org web-team at svlug.org
2010-01-08 09:05:54 1NTIC5-0001yr-RC failed to write to main log: length=314 result=-1 errno=12 (Cannot allocate memory)
2010-01-08 09:20:07 daemon: accept process fork failed: Cannot allocate memory
2010-01-08 09:20:25 failed to write to main log: length=79 result=-1 errno=12 (Cannot allocate memory)
lists:/var/log/exim4# zcat paniclog.8.gz 
2010-01-10 10:44:32 failed to open /etc/exim4/acls/destwhitelist when checking "/etc/exim4/acls/destwhitelist": Cannot allocate memory
lists:/var/log/exim4# zcat paniclog.7.gz 
2010-01-14 14:25:44 1NVY2O-000122-QN failed to read delivery status for svlug at lists.svlug.org from delivery subprocess
2010-01-14 14:25:45 1NVY2O-000122-QN pipe transport process returned non-zero status 0x0009: terminated by signal 9
lists:/var/log/exim4# zcat paniclog.6.gz 
2010-01-15 11:03:48 H=(187-56-202-219.dsl.telesp.net.br) [187.56.202.219]:1519 I=[157.22.20.227]:25 sender verify fail for <whackieste93 at resetters.com>: response to "RCPT TO:<whackieste93 at resetters.com>" from resetters.com [77.222.33.138] was: 550
2010-01-15 11:04:19 failed to write to reject log: length=248 result=-1 errno=12 (Cannot allocate memory)
2010-01-15 11:04:22 1NVrSF-0006xf-UO SA: PANIC: wait on spamc child yielded, 9 (but message was accepted)
2010-01-15 11:04:24 1NVrSI-0006xo-BO SA: PANIC: wait on spamc child yielded, 9 (but message was accepted)
2010-01-15 21:50:24 1NW1Xc-0003I5-Uv failed to read delivery status for volunteers-owner at lists.svlug.org from delivery subprocess
2010-01-15 21:50:24 1NW1Xc-0003I5-Uv pipe transport process returned non-zero status 0x0009: terminated by signal 9
lists:/var/log/exim4# zcat paniclog.5.gz 
2010-01-16 09:40:21 1NWCYt-0006wg-6X SA: Action: spamd took more than 240 secs to run, accepting message (scanned in 0/0 secs | Message-Id: 1NWCYt-0006wg-6X). From <nievesr at all-yours.net> (host=7.156.148.190.dsl.intelnet.net.gt [190.148.156.7]) for svlug-announce at lists.svlug.org
Envelope-from: <nievesr at all-yours.net>
Envelope-to: <svlug-announce at lists.svlug.org>
P Received: from 7.156.148.190.dsl.intelnet.net.gt ([190.148.156.7]:52397 helo=62.11.148.190.dsl.intelnet.net.gt)
        by svlug.org with smtp (Exim 4.44 #1)
        id 1NWCYt-0006wg-6X
        for <svlug-announce at lists.svlug.org>; Sat, 16 Jan 2010 09:35:48 -0800
* From: Approved VIAGRA® Store <svlug-announce at svlug.org>
F From: Approved VIAGRA® Store <svlug-announce at lists.svlug.org>
  Subject: Confirmation Mail
* To: <svlug-announce at svlug.org>
T To: <svlug-announce at lists.svlug.org>
  MIME-Version: 1.0
  Content-Type: text/html
* X-rewrote-original-recipient: svlug-announce at svlug.org
* X-rewrote-original-recipient: svlug-announce at svlug.org
  X-SA-Exim-Connect-IP: 190.148.156.7
  X-SA-Exim-Rcpt-To: svlug-announce at lists.svlug.org
  X-SA-Exim-Mail-From: nievesr at all-yours.net
  X-SA-Exim-Scanned: No (on svlug.org); SA Timed out after 240 secs
2010-01-16 09:42:19 1NWCYt-0006wg-6X failed to write to reject log: length=1198 result=-1 errno=12 (Cannot allocate memory)
2010-01-16 09:54:26 daemon: accept process fork failed: Cannot allocate memory
2010-01-16 16:13:52 daemon: accept process fork failed: Cannot allocate memory
lists:/var/log/exim4# lists:/var/log/exim4# zcat paniclog.10.gz 
2010-01-08 01:46:17 queue run: process 31156 crashed with signal 9 while delivering 1NS2ni-00052H-JK
2010-01-08 02:25:26 daemon: accept process fork failed: Cannot allocate memory
lists:/var/log/exim4# zcat paniclog.9.gz  
2010-01-08 09:05:26 1NTIC5-0001yr-RC <= 46fce4cb.5040502 at isionline.net H=([109.86.48.113]) [109.86.48.113]:20220 I=[157.22.20.227]:25 P=esmtp S=3202 id=7097717137.QD9TSOGH293346 at surheueqj.aerxbwgxexik.net T="Cialis 80% OFF on Pfizer." from <46fce4cb.5040502 at isionline.net> for stevegt at svlug.org web-team at svlug.org
2010-01-08 09:05:54 1NTIC5-0001yr-RC failed to write to main log: length=314 result=-1 errno=12 (Cannot allocate memory)
2010-01-08 09:20:07 daemon: accept process fork failed: Cannot allocate memory
2010-01-08 09:20:25 failed to write to main log: length=79 result=-1 errno=12 (Cannot allocate memory)
lists:/var/log/exim4# zcat paniclog.8.gz 
2010-01-10 10:44:32 failed to open /etc/exim4/acls/destwhitelist when checking "/etc/exim4/acls/destwhitelist": Cannot allocate memory
lists:/var/log/exim4# zcat paniclog.7.gz 
2010-01-14 14:25:44 1NVY2O-000122-QN failed to read delivery status for svlug at lists.svlug.org from delivery subprocess
2010-01-14 14:25:45 1NVY2O-000122-QN pipe transport process returned non-zero status 0x0009: terminated by signal 9
lists:/var/log/exim4# zcat paniclog.6.gz 
2010-01-15 11:03:48 H=(187-56-202-219.dsl.telesp.net.br) [187.56.202.219]:1519 I=[157.22.20.227]:25 sender verify fail for <whackieste93 at resetters.com>: response to "RCPT TO:<whackieste93 at resetters.com>" from resetters.com [77.222.33.138] was: 550
2010-01-15 11:04:19 failed to write to reject log: length=248 result=-1 errno=12 (Cannot allocate memory)
2010-01-15 11:04:22 1NVrSF-0006xf-UO SA: PANIC: wait on spamc child yielded, 9 (but message was accepted)
2010-01-15 11:04:24 1NVrSI-0006xo-BO SA: PANIC: wait on spamc child yielded, 9 (but message was accepted)
2010-01-15 21:50:24 1NW1Xc-0003I5-Uv failed to read delivery status for volunteers-owner at lists.svlug.org from delivery subprocess
2010-01-15 21:50:24 1NW1Xc-0003I5-Uv pipe transport process returned non-zero status 0x0009: terminated by signal 9
lists:/var/log/exim4# zcat paniclog.5.gz 
2010-01-16 09:40:21 1NWCYt-0006wg-6X SA: Action: spamd took more than 240 secs to run, accepting message (scanned in 0/0 secs | Message-Id: 1NWCYt-0006wg-6X). From <nievesr at all-yours.net> (host=7.156.148.190.dsl.intelnet.net.gt [190.148.156.7]) for svlug-announce at lists.svlug.org
Envelope-from: <nievesr at all-yours.net>
Envelope-to: <svlug-announce at lists.svlug.org>
P Received: from 7.156.148.190.dsl.intelnet.net.gt ([190.148.156.7]:52397 helo=62.11.148.190.dsl.intelnet.net.gt)
        by svlug.org with smtp (Exim 4.44 #1)
        id 1NWCYt-0006wg-6X
        for <svlug-announce at lists.svlug.org>; Sat, 16 Jan 2010 09:35:48 -0800
* From: Approved VIAGRA® Store <svlug-announce at svlug.org>
F From: Approved VIAGRA® Store <svlug-announce at lists.svlug.org>
  Subject: Confirmation Mail
* To: <svlug-announce at svlug.org>
T To: <svlug-announce at lists.svlug.org>
  MIME-Version: 1.0
  Content-Type: text/html
* X-rewrote-original-recipient: svlug-announce at svlug.org
* X-rewrote-original-recipient: svlug-announce at svlug.org
  X-SA-Exim-Connect-IP: 190.148.156.7
  X-SA-Exim-Rcpt-To: svlug-announce at lists.svlug.org
  X-SA-Exim-Mail-From: nievesr at all-yours.net
  X-SA-Exim-Scanned: No (on svlug.org); SA Timed out after 240 secs
2010-01-16 09:42:19 1NWCYt-0006wg-6X failed to write to reject log: length=1198 result=-1 errno=12 (Cannot allocate memory)
2010-01-16 09:54:26 daemon: accept process fork failed: Cannot allocate memory
2010-01-16 16:13:52 daemon: accept process fork failed: Cannot allocate memory
lists:/var/log/exim4# zcat paniclog.4.gz
2010-01-18 07:42:32 SMTP connection from [122.168.101.202]:2708 I=[157.22.20.227]:25 (TCP/IP connection count = 5)
2010-01-18 07:43:14 failed to write to main log: length=115 result=-1 errno=12 (Cannot allocate memory)
lists:/var/log/exim4#


The first of the recent series was this one in paniclog.3.gz, and it's
all /dev/null complaints from then until today:

2015-12-13 22:38:53 failed to open /dev/null: Permission denied (euid=111 egid=111)
2015-12-13 22:42:17 1a8Mpn-0000Nc-92 failed to open /dev/null: Permission denied (euid=111 egid=111)

So, whatever happened, happened some shortly (probably) before 10:38pm this past Monday (two days ago).



Meanwhile, with lists.svlug.org back up, my linuxmafia.com host is
getting clobbered again with hundreds of 'Message frozen' messages from
lists.svlug'org's exim daemon, which was understandably unable to
process outbound mail at all for several days.

[rick at linuxmafia]
~ $ uptime
 15:15:19 up 14 days, 13:27,  8 users,  load average: 29.94, 29.64, 24.81
[rick at linuxmafia]
~ $ 

Load continuing to rise, and lots more 'Message frozen' advisories
making it through the gauntlet.  I'm betting that almost all of the load
is spamd (SpamAssassin system daemon) getting hammered, as it's a large,
slow Perl process.

[rick at linuxmafia]
~ $ uptime
 15:23:54 up 14 days, 13:36,  7 users,  load average: 32.21, 27.60, 25.39
[rick at linuxmafia]
~ $ 

If nothing else, this omnishambles is torture-testing my system nicely.
;->


Right (he says, after a couple of hours more of this nonsense), I don't
like to have to do it, but I'm killing spamd on linuxmafia.com, just so 
performance isn't destroyed for _days_.  Load is now dropping:


linuxmafia:~# uptime
 16:12:22 up 14 days, 14:24,  7 users,  load average: 20.54, 24.88, 24.79
linuxmafia:~# uptime
 16:12:24 up 14 days, 14:24,  7 users,  load average: 19.30, 24.55, 24.68
linuxmafia:~# uptime
 16:12:26 up 14 days, 14:24,  7 users,  load average: 19.30, 24.55, 24.68
linuxmafia:~# uptime
 16:12:32 up 14 days, 14:24,  7 users,  load average: 17.83, 24.16, 24.55
linuxmafia:~# 


And, heh, many hundreds more 'Message frozen' messages:  The madness
continues, but at least it's much faster madness.

I'm not happy about this, but we'll deal.

Trying to force delivery of the gobsmackingly large set of frozen
outbound messages on lists.svlug.org:

lists:/var/log/exim4# exim -qff
Fatal: no entropy gathering module detected

It's taking a _long_ time.

I noticed that, just before I did that, a significant amount of normal
queued-up mail transited the mailing lists and went out to subscribers.
However, by making exim concentrate on clearing the frozen messages,
I've probably in effect deferred normal operation a bit.

Half an hour later, hmm, taking too long, so I'm interrupting that
operation.

lists:/var/log/exim4# exim -qff
Fatal: no entropy gathering module detected

^C
lists:/var/log/exim4# 

How many messages in the queue?

lists:/var/log/exim4# exim -bpc
1737
lists:/var/log/exim4# 


Deleting all queued mail from the null sender:

lists:/var/log/exim4# exiqgrep -i -f "<>" | xargs exim -Mrm
Message 1a9LwA-0004Pb-Bu has been removed
Message 1a9M6R-0004Zr-OH has been removed
[...]
lists:/var/log/exim4# 

Sadly, that put only a small dent in the accumulation:

lists:/var/log/exim4# exim -bpc
1675
lists:/var/log/exim4# 

Deleting all outbound mail from 'mailman-owner-bounce' kills some more:

lists:/var/log/exim4# exiqgrep -i -f mailman-owner-bounces | xargs exim -Mrm


lists:/var/log/exim4# exim -bpc 
1562
lists:/var/log/exim4# 

Deleting some more junk:

lists:/var/log/exim4# exiqgrep -i -f officers-bounces | xargs exim -Mrm
lists:/var/log/exim4# exiqgrep -i -f smaug-bounces | xargs exim -Mrm

But that wasn't much reduction.

Thawing all 1500 frozen messages, so normal attempts to deliver will
re-commence:

lists:/var/log/exim4# exiqgrep -z -i | xargs exim -Mt

Starting a manual queue run:

lists:/var/log/exim4# exim -qf
[waiting, waiting, waiting...]

Cancelling.

Forcing delivery first of all pending outbound mail to the volunteers
mailing list:

lists:/var/log/exim4# exiqgrep -i  -f volunteers-bounces  | xargs exim -M
[waiting, waiting, waiting...]

Done.  That took about four minutes to grind through about 50 pending
messages.

I'm going to send _this_ mail to volunteers, force it out, and then 
let exim handle the rest of the ~1500 messages in its own time.


Well, this cost me much of my first day back from vacation.  Need I
mention:  _Please_ be bloody careful what y'all do with root authority.  
Please?  Please?


Useful page:  http://bradthemad.org/tech/notes/exim_cheatsheet.php




More information about the volunteers mailing list