[svlug] Procmail (spam) rule for junk tag gaps?
Karsten M. Self
kmself at ix.netcom.com
Wed May 14 11:54:02 PDT 2003
I'm getting spam slipping through SpamAssassin, formatted as HTML email
with junk tags, e.g.:
maz>eme<kbmd107sgr8u>nt Pi<kagxhc6btb55l1n>ll On The Ma<kz69gfa28awdh>rk
...that's a typical "enlargement" spam message.
Are there any procmail geniuses who could give a tip on how to filter
same? The mail has a very high tag-to-text ratio, and the tags seem not
to have much/any whitespace. Hmm...
One source of inspiration is the Chinese character filters I'm using
(the original site is now offline), example:
# To allow _more_ high-bit chars, *decrease* the weight for high-bit lines.
# To allow _fewer_ high-bit chars, *increase* the weight for high-bit lines.
# Weight is 1/(fraction high-bit), e.g. for 5%: 1/(0.05) = 20.
# Arbitrarily require message to be at least 3200 bytes to trip filter
# (to exclude short messages w/funky sigs). This is about 4 lines of
* > 3200
* -1^1 .
* 2^1 =[0-9A-F][0-9A-F]
* 10^1 [ ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿]
* 10^1 [ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß]
* 10^1 [àáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ]
* 10^1 =[A-F][0-9A-F]
(The above includes extended-charset characters and may not display
correctly.)
This uses weighting to count characters in a message, and trips only if
the proportion of high-bit characters exceeds a minimum.
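The same weighting idea might carry over to the junk tags. What follows is
an untested sketch, not a known-good recipe. It assumes the junk tags always
look like the three in the sample above (letters, then at least one digit,
no whitespace or attributes), and both the weight of 40 and the junk-tags
folder name are placeholders I made up:

```
# Untested sketch: weigh junk-looking tags against total body text.
# The weight (40) and the folder name (junk-tags) are hypothetical.
:0 B
# Every body character subtracts 1 from the score.
* -1^1 .
# Every tag made of letters followed by a digit adds 40.
# [<] is used instead of a bare < because a condition that
# starts with < is read as a size test, not a regexp.
* 40^1 [<][a-z]+[0-9][a-z0-9]*>
junk-tags
```

With these numbers the recipe should fire only when the body averages more
than one such tag per 40 characters. The sample line above has three
complete junk tags in roughly 70 characters, so it would score positive.
Ordinary tags such as <br> or <p> contain no digit and are not counted,
though <h1> through <h6> would be, so the weight probably needs tuning
against real mail before it is trusted to do anything but file to a folder.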
Karsten M. Self <kmself at ix.netcom.com> http://kmself.home.netcom.com/
What Part of "Gestalt" don't you understand?
Sick of mal-formed websites? A stylesheet to override poor design: