AUTHOR: Declan Moriarty <junk _ mail AT iol.ie>

DATE: 2005-11-12

LICENSE: GNU Free Documentation License Version 1.2

SYNOPSIS: Setting up an Open Source Anti-Spam kit on an LFS box

DESCRIPTION: With an emphasis on configuration, this provides
Installation & Configuration Instructions for Mail-SpamAssassin-3.1.0
and its helper tools.

ATTACHMENTS:

spamstuff.tar.bz2  A config file and init script.

PREREQUISITES: A basic understanding of unix, and a hatred of spam. This
hint does _not_ apply to earlier versions of SpamAssassin, but you
should be OK with most recent (or future) versions of other programs.
Perl5 is required. A configurable mail server also helps. I would
suggest postfix instead of qmail, but whatever you know well will
probably do. If your mail is relayed to you, get procmail also, or some
other MDA, otherwise calling all these will be difficult. I also give
instructions for formail (part of the procmail package), although any
similar mail handling utility will do.

HINT:

SECTION 1: INTRODUCTION.

This is long. The only consolation is that it's about all the reading
you have to do. Some jargon first:

Spam = Unsolicited Bulk Email, that is, mail that the user did
not subscribe to. People who subscribe to a mailing list agree to
receive bulk mail. That is solicited. Spam is not. The word comes from
the Monty Python "Spam" sketch, in which a chorus of Vikings drowns out
all conversation by repeating the word spam.

Ham = Good mail
Hit = A test that fires on a mail, identifying something in it
False hit = A test that hits ham
False Positive = Good mail wrongly marked as spam
False Negative = Spam wrongly let through
Lint = Test the validity of a setup

Set your goals. Set your spam policy. I don't want bulk mail, I
don't want any spam in my mail, and I will accept false positives.
Relying on an ISP for relaying mail, I cannot reject at SMTP level, so I
silently delete spam, after checking the subjects and senders. Others
will be different, and your policy will differ accordingly.

In fighting spam, you have many tools. Collect your first one.

1. From this moment on, start keeping your spam. You need every bit of
it you can hold onto, for testing. Don't read it, just store it in a
mailbox somewhere. About a Meg or two is enough. Collect a few
mailboxes with 50 or so messages, and at least one with a hundred.

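To see how the collection is coming along, the messages in a mailbox
can be counted. This is only a minimal sketch: it assumes the mailboxes
are in mbox format, where every message starts with a 'From ' line, and
count_msgs is a name made up for this example.

```shell
# count_msgs FILE: print the number of messages in an mbox file.
# Assumption: mbox format, one "From " separator line per message.
count_msgs() {
    grep -c '^From ' "$1"
}
```

Something like 'count_msgs ~/Mail/spam1' then tells you whether that
mailbox has reached its 50 or 100 messages yet.
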
http://razor.sourceforge.net/

2. Razor-agents. This operates by sending checksums of mail to a central
server. If they have been reported as spam, the mail can be marked as
spam. If not, the checksums are discarded and you are told the mail is
OK. It's very good, but relies on reporting. For commercial use, send
an email (explaining your linux installation) to partners@cloudmark.com

http://www.rhyolite.com/anti-spam/dcc

3. DCC, the Distributed Checksum Clearinghouse. This operates as above,
sending checksums, but the DCC counts how many times it has received
that checksum. That is what it reports. The DCC also keeps all
checksums, so the server database is bigger. It goes back about six
months. The DCC is an effective check for bulk mail. I believe
Commtouch offer a commercial service.

http://spamassassin.apache.org/downloads.cgi

4. SpamAssassin-3.1.0 is a major revision of previous versions. It
offers heuristic or rule-based vetting of email and employs blocklists
and several novel and unusual features. Very configurable - the
workhorse, and the PITA. Unlike most Perl applications, this one is
inclined to land 'jam side down' or in a mess, and sorting out is
necessary.

5. Others exist, notably Amavisd-new and clamav. The set above is a
sensible balance for a home user. You may want clamav if you are
processing mail for windoze clients. Amavisd-new is a sort of sweeper
process. The trouble is, all run on perl, and there's a limit to any
box's workload. I may include them later.

Ownerships:

Preferred practice is not to run anything as root, and most of the mail
programs will become user 'nobody' if they find themselves running with
uid 0. Also, you do not want to make a 'super-luser' who has everything
set up for him, as then if any process is breached, the intruder has
access to the whole box. So mail is handled by restricted users with few
privileges until the delivery, which is done as the user to whom mail is
delivered. The ultimate in this is qmail, which has a Mexican wave of
processes owned by users with shells like /bin/true, appearing and
disappearing, playing pass-the-parcel while your mail goes through.

Installation instructions specify a recommended user. Make your choice.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SECTION 2. INSTALLING:

Spam:

1. The spam seems to land naturally. If it doesn't, I can probably send
you some. But if you really want pain, register a domain. You instantly
go on every spammer's list. Then you get email from spammers offering
you a mailing list to spam with _every_ address from registered domains
:-/. If spam doesn't land, what are you doing here?

Razor-agent:

2. Razor agents. You need razor-agents and razor-agents-sdk. You also
need to know that this service is marketed to windoze users at a profit,
and the open source community receives it free, or cheap: free for
individuals, cheap for business use under linux.

This is a perl program. To avoid messing I recommend a symlink
between /usr/lib/perl5 and /usr/local/lib/perl5. Presuming you
are following LFS instructions, only one of those directories
should exist now. If perl libs are in /usr/local, it will never
check /usr/lib, and vice versa. The symlink makes sure that what you
install will be found.

Install razor-agents-sdk first with

    perl Makefile.PL &&
    make &&
    make test

These should pass, then install with

    make install

Repeat for razor-agents.

You get 4 tools, razor-check, razor-report, razor-revoke, and
razor-admin, each with its own man page. I have the default log in
/var/log/razor-agent.log instead of a homedir, but it should be owned
and writable by the configured user.

After install, change to the razor user, and run 'razor-admin
-create'. You should now have a ~/.razor subdir.

'razor-admin -register' registers an identity with Cloudmark, which you
need for reporting & revoking. Follow the prompts. Razor attaches a
seriousness level to your reports. If you report spam that nobody else
ever does, you're an idiot. If you report what others subsequently do,
that's good. Your revokes are also examined; if you revoke what isn't
spam, that's good. If you revoke the wrong stuff, you're a twit. That's
all in their software, so don't worry. As good netizens receiving a
free service, however, we want to provide feedback.

Tart up ~/.razor/razor-agent.conf to suit your site, then copy the
entire ~/.razor subdir to /etc/razor for a sitewide config. To allow
other users to report, let them copy /etc/razor to ~/.razor and the same
identity is used.

With the config done as above, you should be able to save off a spam
email as its own mailbox (save to a mailbox called 'test' or
something). In a terminal, type

    cat test | razor-check -d

to check it, and type

    cat test | razor-report

to report it.

If this doesn't work, check the firewall. Open outgoing TCP port 2703
(Razor2) and TCP port 7 (Echo), then try again. Presuming trouble,

    cat test | razor-report -d > somefile.txt

gives you verbose output of actions, and you can spot problems that way.

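Checking saved messages one at a time gets tedious, so here is a rough
sketch of a loop over several single-message files. The CHECKER
variable and the classify function are inventions for this example, and
it assumes that razor-check exits with status 0 when the message is
listed as spam; confirm that against 'man razor-check' on your box
before trusting the output.

```shell
# CHECKER is hypothetical plumbing for this sketch. It defaults to
# razor-check, which must be installed for real use.
CHECKER=${CHECKER:-razor-check}

# classify FILE...: feed each single-message file to the checker.
# Assumption: exit status 0 means "listed as spam".
classify() {
    for f in "$@"; do
        if $CHECKER < "$f" >/dev/null 2>&1; then
            echo "$f: spam"
        else
            echo "$f: not listed"
        fi
    done
}
```

Run it as 'classify msg1 msg2 msg3'. It only checks; nothing is
reported, which keeps it in line with reporting by hand.
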
Vipul does not want any automatic reporting set up. One exception is if
you have mail addresses which you know are going to receive 100% spam,
such as seeded spamtraps; those you may indeed forward. We will want to
report manually, being good netizens. Be aware that the checksums are on
the body, as the headers will differ anyhow. Further, if you report spam
sent to a mailing list, you're a twit, because lists usually add a
footer, making the mailing list copy different from the original. The
list owner can report it, as he gets an unmodified copy.


DCC:

3. This is a bit trickier to play with.

    tar -zxvf dcc.tar.Z

opens the archive.

There is also dccm, a 'milter' for sendmail. If you use sendmail, and
figure this out, please send me an appropriate chunk of hint on it, and
I'll include it.

This is a small (< 1000 messages per day) setup using anonymous
settings. Over that, contact somebody for a service (e.g.
Commtouch). Over 100k messages, you start to save bandwidth by
running your own servers.

Select a user:group for this to live as and insert
them in lines 2422 & 2423 of the configure script instead of
'bin:bin'. No matter what options you provide, manpages will not
install without this mod. Find that user's uid (in /etc/passwd)
and put it in for UID in this line

    ./configure --disable-server --disable-dccm --with-uid=UID \
        --with-rundir=/tmp &&
    make

Then 'make install' as root.

--disable-server does just that; --disable-dccm disables building the
sendmail milter; --with-rundir=/tmp puts the dccifd.pid in /tmp.
Otherwise it wants a user-writable /var/run/dcc/ for the pid, and some
shutdown script clears out /var/run anyhow, removing /var/run/dcc/. This
is all a pain in LFS.

cd to /var/dcc and edit dcc_conf. You need to change

    DCC_RUNDIR  = /tmp
    GREY_ENABLE = 'off' (or blank) unless you know what you're up to.
    DCCIFD      = on
    DCCIFD_ARGS = -m /var/dcc/map -t cmn,20 -S mail_host -x

The syslog facility in LFS is not mail.err, but mail.log. Fix that also,
and anything else to suit your site. Check the final lines. Razor finds
its own servers - dcc wants you to specify yours. Presuming you have a
small private installation within their license, connect to the
internet, back up /var/dcc/map and enter the config shell by typing (as
root)

    cd /var/dcc
    mv map map.orig
    cdcc                  # This gives a cdcc shell. Enter the following:

    cdcc> load map.txt    # Takes in their map.txt of default servers
    cdcc> trace default   # This delays, and returns information.
    cdcc> info            # This should show resolved dcc servers. If it
                          # doesn't, check your internet connection. If
                          # 127.0.0.1 is your server, it's no use to you.
    cdcc> new map         # Should write /var/dcc/map, a map of servers
    cdcc> quit

You have built

1. cdcc - a setup program
2. dccproc - executable checker - mainly for you
3. dccifd - the daemon used by spamassassin's spamd/spamc.

Start the daemon with

    /var/libexec/dccifd -I user:group

It returns one line about changing uids and then retires into the
background. 'pgrep dccifd' shows me 2 pids. There should be a (newly
created) socket in /tmp, or maybe /var/dcc. 'pkill dccifd' should remove
socket and pids. The user chosen should be able to write to the socket
(i.e. touch should succeed).

Other Configuration:

1. There is a whitelist, /var/dcc/whiteclnt. Whitelist everyone
you can think of - linuxfromscratch.org, ebay, paypal, and any other
list server you may be on. The '-S mail_host' bit told dccifd to
check mail_host in the header. This allows you to add mail_hosts
to /var/dcc/whiteclnt in the appropriate section. Putting in IPs is no
use. You can specify any header, but it only passes one, so don't
specify mail_host if you want to use some other header.

2. There is a blacklist file, which isn't a lot of use as the
spammers have to keep hopping from one place to another anyhow. If
certain weirdos stay stuck in the same place, they belong in a
blacklist.

3. Greylisting is also an option. You may theoretically lose a
small percentage of mail with this. It works as follows. In every mail
transaction where this is done, your mail server says "Not right now -
I'm busy. Send it in half an hour." Proper mail servers will send it
later. Poorly set up mail servers may lose mail, either by not
resending, or by resending immediately and then giving up. Spammers will
not resend in 99% of cases, seeing as they can't easily hold messages
back while relaying illegally through other servers. So you don't get
spammed, and your name comes off their list. That's the theory.
Forget this if you have pop or imap. You'll reject nothing -
just leave the mail on your server. This is for directly connected boxes
receiving their mail by smtp only.

Some words on querying: dccproc is like razor-check, except it reports
as well by default. If you check & report ham repeatedly with dcc, the
count keeps going up. Use the -Q option for repeat tests to avoid
reporting again. Each user is supposed to report each mail only once.
For your tests, 'cat message | dccproc -QC' checks and computes
checksums without reporting.
I would suggest a startup script for dcc and spamd (the server end of
spamassassin). Mine is available.

The threshold figure is set by -t. The three checksums are body, fuz1
and fuz2. All are covered by the 'cmn' setting. DCC say to set them at
'many'. I found results disappointing, and set it to 20, where things
worked better. My dccifd options are

    -I luser:group     # Who it runs as. A real person, please.
                       # You need this or it runs as root!
    -p /tmp/dccifd     # Location of socket.
    -m /var/dcc/map    # Location of map [Default /var/dcc/map]
    -d -B set:debug    # Debug (both options)
    -x                 # Try extra hard to connect to a server (I needed that)
    -t cmn,20          # Set all thresholds to 20

Make sure to finish the 'stop' section with 'rm -f /tmp/dccifd' to
remove a stray socket if it exists. An old socket or pid will prevent
dccifd from restarting.

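The cleanup at the end of a 'stop' section might look something like
the sketch below. It is not lifted from the attached init script: the
function name is made up, and the pid file path assumes that
--with-rundir=/tmp leaves a dccifd.pid in /tmp as described earlier.

```shell
# Defaults follow the options above; override them to suit your site.
SOCKET=${SOCKET:-/tmp/dccifd}
PIDFILE=${PIDFILE:-/tmp/dccifd.pid}

# clean_stale: remove a leftover socket and pid file, so that the
# next start of dccifd is not blocked by them.
clean_stale() {
    rm -f "$SOCKET" "$PIDFILE"
}
```

Call clean_stale only after the daemon has been stopped (with 'pkill
dccifd', say), never while it is still running.
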
To test,

    cat test | dccproc -QC

It should return something like this

    X-DCC-CollegeOfNewCaledonia-Metrics: genius 1189; Body=47 Fuz1=84
        Fuz2=84
    reported: 0 checksum server
    env_From:   5469b142 22af2632 54c4c668 28e32b2e
    From:       55e30375 f82be1b7 c4cd63f1 1a942cc3
    Message-ID: 70489480 1a6e3c39 561ad9e9 5d9d6b1d
    Received:   d6b6cd69 a686160f 3a6cbc4b 0680596e
    Body:       213f0668 14a13b4f de8a25e1 3ebf5548 47
    Fuz1:       965e5582 e856e858 e775658e 00321ffd 84
    Fuz2:       4f6dc268 7b2844ec 6444c79a e3508371 84


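To compare those counts against a threshold by hand, something along
these lines may help. It is a sketch only: over_threshold is a made-up
name, and it simply looks for the Body=, Fuz1= and Fuz2= fields as they
appear in the sample output above. DCC can also print the word 'many'
in place of a number, and that is treated here as over any limit.

```shell
# over_threshold LIMIT: read dccproc-style output on stdin and succeed
# if any of the Body/Fuz1/Fuz2 counts reaches LIMIT (or is "many").
over_threshold() {
    tr ' \t' '\n\n' | awk -F= -v limit="$1" '
        $1 == "Body" || $1 == "Fuz1" || $1 == "Fuz2" {
            if ($2 == "many" || $2 + 0 >= limit) hit = 1
        }
        END { exit (hit ? 0 : 1) }'
}
```

For example, 'cat test | dccproc -QC | over_threshold 20 && echo bulk'
prints 'bulk' for the sample above, since 47 and 84 are both over 20.
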
You should not see 127.0.0.1. If you don't see the count, drop the -Q
once. Lastly, run your startup command for dccifd. Stdout should show

    getpwnam(genius:users): Success

The socket should be created, thusly

    srw-rw-rw- 1 root root 0 2005-11-21 06:49 /tmp/dccifd=

A favourite failure mode is to start & exit, leaving the socket, & maybe
even the pid file, thus preventing future startups. Permissions!

Once dccifd is running, you need to use spamassassin to check that it is
working, but results from dccproc are a very good indicator.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SPAMASSASSIN:

4. Cancel the day's appointments and buy yourself in some alcoholic
tranquilizer. You may need it. Open the archive. Become root.
If you had a previous version of Spamassassin, read the UPGRADE
file. Heavy going.


REQUIREMENTS:

    IPV6 in kernel  (Some module 'requires' ipv6, which needs kernel support)
    OpenSSL-0.9.7   For the SSL modules and fancy encrypted stuff
    DB-4.3.27       For the database stuff. Perhaps mysql would do... tell me.
    Perl + Modules  As outlined below. I had version 5.8.5, and gcc-3.3.1

OPTIONAL:

    pcre            Check mail with Perl Compatible Regular Expressions
    Formail         For playing with mailboxes
    Mysql           Depending on your database preferences.
    mboxsplit       The spamassassin substitute for formail. A real puzzle.


GOTCHA:

Installs will decide for you whether your perl libs are in
/usr/local/lib/perl5 or /usr/lib/perl5. Only one of those should exist.
If both exist, modules previously installed have created the lib in the
wrong place, and you have a problem there. Prevent it happening by
symlinking.

    ln -s /usr/<existing>/lib/perl5 /usr/<non-existing>/lib

That way, all files end up in one location. Some will reference it as
/usr/local/lib/perl5, and some (including spamassassin) as
/usr/lib/perl5.

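If you would rather not work out by eye which directory exists, the
decision can be scripted roughly as follows. This is a sketch, not part
of any package: link_perl5 is a made-up name, and it deliberately
refuses to act when both directories (or neither) exist, since that
case needs sorting out by hand.

```shell
# link_perl5 DIR_A DIR_B: if exactly one of the two directories
# exists, make the other a symlink to it. Fails without touching
# anything if both or neither exist.
link_perl5() {
    a=$1 b=$2
    if [ -d "$a" ] && [ ! -e "$b" ]; then
        ln -s "$a" "$b"
    elif [ -d "$b" ] && [ ! -e "$a" ]; then
        ln -s "$b" "$a"
    else
        echo "both or neither of $a and $b exist; fix by hand" >&2
        return 1
    fi
}
```

As root, 'link_perl5 /usr/lib/perl5 /usr/local/lib/perl5' would then do
the same job as the ln command above.
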
INSTALLATION:

Open the Mail-SpamAssassin archive, log in as a luser and open
the INSTALL file in one console (1), while you raid CPAN as root in the
second (2). I would recommend another root console (3), to sort things
out. The commands you need in (2) are

    perl -MCPAN -e shell              # open a perl shell
    o conf prerequisites_policy ask   # get prerequisites

That sets you up. Then

    i <Module::Name>          # What's the story with <Module::Name>

    install <Module::Name>    # guess!

In the spamassassin INSTALL file (1) find the section "Modules".
Optional modules are not really optional. Below is my list from
/usr/local/lib/perl5/5.8.5/i686-linux/perllocal.pod. The order is left
to right, top to bottom. That will minimize the hitches.

    Module::Info            Digest::SHA1
    HTML::Tagset            HTML::Parser
    Digest::HMAC            Net::IP
    Net::DNS                Net::CIDR::Lite
    Sys::Hostname::Long     Mail::SPF::Query
    IP::Country             Time::HiRes
    Business::ISBN::Data    Business::ISBN
    Compress::Zlib          MIME::Base64
    Archive::Tar            Algorithm::Diff
    Text::Diff              Net::SSLeay
    IO::Socket::SSL         Crypt::OpenSSL::Random
    Crypt::OpenSSL::RSA     Mail::DomainKeys

Razor-agents-sdk also installs some of these modules, and some other
ones. Above is the Spamassassin list.

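Before raiding CPAN it can save time to see which of these perl
already has. The following is a small sketch with made-up function
names; it relies only on 'perl -MModule::Name -e 1' failing when a
module cannot be loaded.

```shell
# have_module NAME: succeed if perl can load the named module.
have_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# missing_modules NAME...: print each module that perl cannot load.
missing_modules() {
    for m in "$@"; do
        have_module "$m" || echo "$m"
    done
}
```

'missing_modules Net::DNS Digest::SHA1 HTML::Parser' then lists only
the ones still to be installed, in the order given.
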
If you have anything of value in /usr/share/spamassassin or
/usr/local/share/spamassassin, _back_it_up_! It will get overwritten or
wiped. Any bizarre rulesets can go in /etc/mail/spamassassin.

Finally, install Spamassassin with

    perl Makefile.PL &&
    make &&
    make test &&        # Bless your patience :)
    make install

If you install it before updating perl, it barfs over some modules.
Now you will probably have /usr/share/spamassassin full of the latest
rules.


CONFIG:

Here's where I hope you have pcregrep and formail. This is actually
basically operable usually, but in a mess. I would suggest surfing to

http://www.rulesemporium.com/rules.htm

and downloading whatever rule sets you choose. Pop them in
/etc/mail/spamassassin. As root, mv the original local.cf (if it exists)
aside and download mine. Pop it likewise in /etc/mail/spamassassin.
Download 70_sare_sc_top200.cf also. Don't install it, just keep it handy.

Enable all plugins. The plan apparently is to keep adding .pre files for
plugins. I suggest leaving init.pre untouched and enabling all plugins
in v310.pre. The lines are

in init.pre:

    loadplugin Mail::SpamAssassin::Plugin::URIDNSBL
    loadplugin Mail::SpamAssassin::Plugin::Hashcash
    loadplugin Mail::SpamAssassin::Plugin::SPF

in v310.pre:

    loadplugin Mail::SpamAssassin::Plugin::RelayCountry
    loadplugin Mail::SpamAssassin::Plugin::Razor2
    loadplugin Mail::SpamAssassin::Plugin::TextCat
    loadplugin Mail::SpamAssassin::Plugin::AntiVirus
    loadplugin Mail::SpamAssassin::Plugin::Pyzor
    loadplugin Mail::SpamAssassin::Plugin::DCC
    loadplugin Mail::SpamAssassin::Plugin::SpamCop
    loadplugin Mail::SpamAssassin::Plugin::AutoLearnThreshold
    loadplugin Mail::SpamAssassin::Plugin::AccessDB
    loadplugin Mail::SpamAssassin::Plugin::WhiteListSubject
    loadplugin Mail::SpamAssassin::Plugin::DomainKeys
    loadplugin Mail::SpamAssassin::Plugin::MIMEHeader
    loadplugin Mail::SpamAssassin::Plugin::ReplaceTags


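Enabling the lot by hand is a matter of removing comment marks. As a
sketch, and assuming the disabled plugins sit in v310.pre as lines of
the form '#loadplugin ...', sed can do it in one pass. enable_plugins
is a name invented for this example; it writes to stdout and leaves
the original file alone.

```shell
# enable_plugins FILE: print FILE with every commented-out
# "loadplugin" line uncommented. Other comments are left as they are.
enable_plugins() {
    sed 's/^#[[:space:]]*\(loadplugin[[:space:]]\)/\1/' "$1"
}
```

Something like 'enable_plugins v310.pre > v310.pre.new' lets you look
over the result before moving it into place.
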
Download my init script or write your own. You need to start dccifd
(because spamc/spamd use that) and spamd. Spamassassin wants to be a
user, but not a real one. I added the user spamc in the group postfix.
I have a pause (5 seconds) in the restart option so spamd will let go
of ports before it tries to take hold again. My spamd options are:

    -d          # Daemonize = get lost in the background
    -l          # Allow learning, thus facilitating bayes
    -m 10       # Max processes. These are seriously memory hungry.
                # I only have 10 to facilitate mass tests. 5 is plenty.
    -u spamc    # Run as user spamc. Otherwise it's nobody, and
                # things fall over, because nobody can't write.

Now I presume you will copy in my available config file and edit
that, rather than your own. I describe a sitewide config, but user
configs can be created and maintained by different users. The same
process applies. 'spamassassin -c' creates a user config. You can test
your setup (as anybody) with

    cat test | spamc -R

You should get a report, and an extract.

root is a positive disadvantage for all mail tests, as these programs
refuse to hold onto root privileges, and drop to a specified user, or to
nobody. They are all called by the user _receiving_ the mail, so they
can write in his maildir, which typically has 0600 permissions. Root
will never receive mail this way, as user nobody certainly can't write
to root's directory! Alias root to a user. You need root for starting
these tools, however.

Sorting out the bugs in things (there will be many) is achieved
by these commands.

1. 'spamassassin -D --lint > debug.txt 2>&1'. Examine this file for
negatives.

2. Change the -d to -D for spamd and restart from a root
terminal. It will hold the terminal, and spew information.

3. Poring over the entrails of /var/log/mail.log. All mail
programs write to mail.log. If someone knows how to set up a separate
syslog facility, let me know and I'll stuff one in for spam. I did have
a go myself, but things fell over so I reverted.

Look for the things that didn't happen, and config lines not parsed.
Your rulesets, I presume, will be different from mine. Here's mine:

    [root@genius ~]# ls /etc/mail/spamassassin
    20_dec.cf  70_sare_html1.cf  72_sare_bml_post25x.cf
    99_DEC_Tripwire.cf  70_sare_adult.cf  70_sare_obfu.cf  82_antidrug.cf
    99_FVGT_meta.cf  init.pre  70_sare_genlsubj0.cf  70_sare_oem.cf
    88_FVGT_body.cf  local.cf  70_sare_genlsubj1.cf  70_sare_spoof.cf
    88_FVGT_headers.cf  local.orig  70_sare_header0.cf  70_sare_uri0.cf
    88_FVGT_rawbody.cf  nohits/  70_sare_header1.cf  70_sare_uri1.cf
    88_FVGT_subject.cf  spam@  70_sare_html0.cf  70_sare_uri_eng.cf
    88_FVGT_uri.cf  v310.pre

20_dec.cf are my own rules, nohits/ sidelines dud rulesets, and spam@ is
a symlink to /usr/share/spamassassin.

    ln -s /usr/share/spamassassin /etc/mail/spamassassin/spam

Spamassassin ignores subdirs, so you can have an archive. The bigger
your throughput, the fewer rules you want, to avoid loading the system.
The best ones of the above lot are the sare header, html, uri, drug &
adult. The FVGT rules are very efficient by comparison with some sare
rules. The higher the number, the later it is read, and the more
priority it has. Presuming you sort your bugs, you now have an
integrated sitewide anti-spam setup.

You now need one other item of information. Are your mails being
checked against blacklists (like spamcop, sorbs.net) upstream? To find
out, use 70_sare_sc_top200.cf. View it in one console and cd to your
subdir with the spam mailboxes (I am presuming they are named spam1,
spam2, etc). The first entry in 70_sare_sc_top200.cf today is

    Received =~ /\b12\.(?:210\.176\.205|211\.4\.79|217\.81\.151)\b/

Now you can check for that with pcregrep. You cannot restrict your
search to the Received line too handily, but you can do this

    pcregrep '\b12\.(?:210\.176\.205|211\.4\.79|217\.81\.151)\b' spam?

Any instances will show. You will notice I removed the /regex/
delimiters and replaced them with 'regex'. Just one other word of
warning: pcregrep appears not to like the /i at the end of most regexes
in the rules. Use pcregrep -i and remove the /i. You can also use -c to
count the matches. I do not get any instances of the top200
spammers, so I presume the top 200 are not getting through directly to
me. The ruleset is therefore unnecessary for me. I can get hits from
the more obscure dns blocklists, so not all are being checked.

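Stripping the delimiters by hand gets old after a few rules. As a
sketch, sed can pull the bare regex out of a line of the form shown
above. rule_regex is a made-up name, and it handles only simple
'Header =~ /regex/flags' lines; you still have to add -i to pcregrep
yourself when the rule ended in /i.

```shell
# rule_regex: read "Header =~ /regex/flags" lines on stdin and print
# only the regex, without the /.../ delimiters or trailing flags.
rule_regex() {
    sed -n 's|^[^/]*=~[[:space:]]*/\(.*\)/[a-z]*[[:space:]]*$|\1|p'
}
```

The output can be handed on directly, for example with
pcregrep "$(echo 'Subject =~ /some words/i' | rule_regex)" spam?
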
If you haven't got pcre, 'egrep -e' will apply posix rules, which are
close, but different. The main weakness is in unusual character types
like \d, which do not behave in egrep.


| 566 |
INTEGRATION: |
|---|
| 567 |
|
|---|
| 568 |
Penultimately, Integration. If your mail is relayed to you, use |
|---|
| 569 |
procmail. If you are online 24/7 and serious.spammer.co.tw can reach |
|---|
| 570 |
your box directly, set up a reject configuration in your mail server. |
|---|
| 571 |
The amavisd-new package includes many configuration options for weird and |
|---|
| 572 |
wonderful mail servers with a better understanding of them than you |
|---|
| 573 |
will usually find in the documentation. |
|---|
| 574 |
|
|---|
| 575 |
Think this course through. Mailing lists will get spam, and will forward |
|---|
| 576 |
it. If you bounce repeatedly to a mailing list, you will be |
|---|
| 577 |
unsubscribed, sometimes automatically. |
|---|
| 578 |
|
|---|
| 579 |
Procmail's recipe looks like this (in ~/.procmailrc) |
|---|
| 580 |
|
|---|
| 581 |
:0fw |
|---|
| 582 |
| /usr/bin/spamc |
|---|
| 583 |
:0 |
|---|
| 584 |
* ^X-Spam-Level: \*\*\*\*\* |
|---|
| 585 |
$HOME/Mail/spam |
|---|
| 586 |
|
|---|
| 587 |
That pipes each mail through spamc to spamd (which calls razor & dcc) and files |
|---|
| 588 |
it in a spam mailbox at 5 stars or more. man procmail or man procmailex help here. |
|---|
| 589 |
Those exact procmail lines put spam in ~/Mail/spam. Make sure it exists. |
|---|
| 590 |
If you are content to reject on razor's say so, you can take the recipe |
|---|
| 591 |
from 'man razor-check', leave the spamassassin razor2 plugin unloaded, and put |
|---|
| 592 |
it in procmail. This imposes a memory load (The 'c' in ':0Wc' means 2 message |
|---|
| 593 |
copies, 2 procmail instances) but avoids spamassassin. I ran for some |
|---|
| 594 |
months with this setup; it plucked 70% of spam and had one false |
|---|
| 595 |
positive (From the LFS list :-/.) In this case, reduce your spamd |
|---|
| 596 |
instances. |
|---|
| 597 |
|
|---|
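
Before settling on 5 stars, it can help to see how many stars your saved
mails actually carry. A small sketch, assuming the message has already been
through spamc and so has an X-Spam-Level header; spam_stars is a made-up
name, not a procmail or spamassassin tool.

```shell
# spam_stars: read ONE mail message on stdin and print the number of
# stars in its X-Spam-Level header (0 if the header is missing).
spam_stars() {
    awk '
        /^$/ { exit }                                 # headers end at the first blank line
        /^X-Spam-Level:/ { n = gsub(/\*/, "*") }      # gsub returns the number of stars replaced
        END { print n + 0 }
    '
}
```

For instance  [ "$(spam_stars < onemail)" -ge 5 ] && echo "lands in spam"
mimics what the recipe above would do with the (hypothetical) file onemail.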
| 598 |
|
|---|
| 599 |
|
|---|
| 600 |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
|---|
| 601 |
|
|---|
| 602 |
SECTION 4: TUNING: |
|---|
| 603 |
|
|---|
| 604 |
The standard spamassassin config is very soft, and lets some spam |
|---|
| 605 |
through. Mine is short on negative rules, and hard on porn particularly. |
|---|
| 606 |
Even if you don't want to use mine, download it and lint with it once, |
|---|
| 607 |
as it will show you errors in other places. Your friends are |
|---|
| 608 |
|
|---|
| 609 |
man Mail::SpamAssassin::Conf |
|---|
| 610 |
man Mail::SpamAssassin::Plugin::Name (e.g. URIDNSBL) |
|---|
| 611 |
|
|---|
| 612 |
Beware of the latter manpages, as they drift between config options and |
|---|
| 613 |
rules pretty seamlessly without telling you. Next, tune up! As root, |
|---|
| 614 |
|
|---|
| 615 |
vim /etc/mail/spamassassin/local.cf |
|---|
| 616 |
|
|---|
| 617 |
Looking at my local.cf, the first things are basic setup. Leave the first |
|---|
| 618 |
line there unless you are using nfs, in which case it must come out. The |
|---|
| 619 |
host 216.171.238.83 is linuxfromscratch.org. |
|---|
| 620 |
|
|---|
| 621 |
PYZOR config options are there, but commented out. I tried it, and found |
|---|
| 622 |
it very little use. You can run a local server in a large outfit and |
|---|
| 623 |
allow your users to blacklist dynamically this way. It also runs in |
|---|
| 624 |
python, which is another interpreter and libs to load. They recommend |
|---|
| 625 |
readyexec, which takes care of that in some clever way. Suit yourself. |
|---|
| 626 |
The install is a doddle, but not worth it, imho. |
|---|
| 627 |
|
|---|
| 628 |
DCC options are clear enough - paths to everything, and much of the |
|---|
| 629 |
stuff on the dccifd command line. The very last option is for dccproc. |
|---|
| 630 |
Spamc/spamd use dccifd, the daemon, and if not found, dccproc. Dccproc |
|---|
| 631 |
is more resource hungry (starting a new process every time). If dccifd |
|---|
| 632 |
is there but not running, you barf. |
|---|
| 633 |
|
|---|
| 634 |
The -B option sets a check on spamhaus.org, which returns 127.0.0.2 as a |
|---|
| 635 |
positive result. Multiple -B options are allowed. It's there really as |
|---|
| 636 |
an example because the docs are _soo_ bad. |
|---|
| 637 |
|
|---|
| 638 |
RAZOR options are simple. It's neat code. |
|---|
| 639 |
|
|---|
| 640 |
BAYES options allow learning from ham/spam. Also there are uridnsbl |
|---|
| 641 |
(blocklist stuff). If you don't need the blocklist, comment these out |
|---|
| 642 |
and comment out URIDNSBL in /etc/mail/spamassassin/init.pre. |
|---|
| 643 |
|
|---|
| 644 |
SPF is Sender Policy Framework. ISPs should have a policy, and the mail |
|---|
| 645 |
is checked against that. Weak, but it catches the occasional thing. |
|---|
| 646 |
|
|---|
| 647 |
Next come the whitelist_from entries. Include family, friends, business contacts, |
|---|
| 648 |
paypal (If you're registered). The bayes_ignore entries should be all |
|---|
| 649 |
mailing lists, as some get spam, and their spam score will rise |
|---|
| 650 |
otherwise. |
|---|
| 651 |
|
|---|
| 652 |
Finally we get rules, listed under groups as one progresses through an |
|---|
| 653 |
email, and scored. The general policy is to weight each rule so that |
|---|
| 654 |
spam arrives at a total score of 5 or above, and other mail |
|---|
| 655 |
stays below 5. To check any rule (This is where the 'spam' |
|---|
| 656 |
symlink comes in handy) cd to /etc/mail/spamassassin and type |
|---|
| 657 |
|
|---|
| 658 |
grep -r RULE_NAME * |
|---|
| 659 |
|
|---|
| 660 |
Here's an example |
|---|
| 661 |
lfs:/etc/mail/spamassassin$ grep -r FORGED_RCVD_HELO * |
|---|
| 662 |
|
|---|
| 663 |
local.cf:score FORGED_RCVD_HELO 1.22 |
|---|
| 664 |
spam/20_head_tests.cf:header FORGED_RCVD_HELO eval:check_for_forged_received_helo() |
|---|
| 666 |
spam/20_head_tests.cf:describe FORGED_RCVD_HELO Received: contains a forged HELO |
|---|
| 667 |
spam/50_scores.cf:score FORGED_RCVD_HELO 0 0 0 0.135 |
|---|
| 668 |
|
|---|
| 669 |
20_head_tests.cf is an original spamassassin ruleset. spam/50_scores.cf holds |
|---|
| 670 |
the default scores, one per score set: 0 with bayes and network tests off, 0 |
|---|
| 671 |
with network tests only, 0 with bayes only, and 0.135 with both enabled. The |
|---|
| 672 |
values are not successive hits of a rule. By default it scores basically |
|---|
| 673 |
nothing, but I have lifted it to 1.22. It is an excellent indicator of |
|---|
| 674 |
spam or the linuxfromscratch lists where half cocked mail setups abound. |
|---|
| 675 |
If your mailer gives out a domain that a dns check can't resolve, you're |
|---|
| 676 |
in trouble here. If you have a legit A and MX record where people would |
|---|
| 677 |
expect to find them, you're ok. All broadband modems have addresses in the |
|---|
| 678 |
range of the isp, so if your private network address goes out, something smells. |
|---|
| 679 |
|
|---|
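
As Mail::SpamAssassin::Conf describes it, a score line with four values
gives one value per score set (0: no bayes, no network tests; 1: network
tests only; 2: bayes only; 3: both), and a line with a single value applies
to every set. The sketch below picks out the value that counts for a given
set from grep output like the example above. effective_score is a made-up
helper, and it assumes the simple "score NAME value(s)" layout with no
trailing comments.

```shell
# effective_score SET      (SET is 0, 1, 2 or 3)
# Read score lines (plain, or with a "file:" prefix as printed by grep -r)
# on stdin and print "RULE_NAME value" for the chosen score set.
effective_score() {
    awk -v n="$1" '
        { sub(/^[^: \t]*:/, "") }                          # drop the grep -r "file:" prefix
        $1 == "score" && NF == 3 { print $2, $3 }          # one value: used in every set
        $1 == "score" && NF == 6 { print $2, $(3 + n) }    # four values: pick set n
    '
}
```

Try  grep -r FORGED_RCVD_HELO * | effective_score 3  in
/etc/mail/spamassassin. Where a rule shows up twice, the local.cf line is
the one that should win, since the site config is read after the stock rules.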
| 680 |
Mime and html rules are very good. Mind you, I have trained most people |
|---|
| 681 |
to send text. If you use html a lot, back some of these off. Some are still |
|---|
| 682 |
excellent spam indicators, even if you want to allow for half-assed mail |
|---|
| 683 |
from m$ outlook etc. These ones are always good |
|---|
| 684 |
|
|---|
| 685 |
HTML_EMBEDS 3 HTML_FONT_BIG 3 |
|---|
| 686 |
HTML_FONT_LOW_CONTRAST HTML_FONT_INVISIBLE HTML_IMAGE_ONLY_04 |
|---|
| 687 |
HTML_IMAGE_ONLY_08 HTML_IMAGE_ONLY_12 HTML_IMAGE_RATIO_(all) |
|---|
| 688 |
|
|---|
| 689 |
The high ratios are also useful. Even outlook sends text as well. |
|---|
| 690 |
The MIME tests are excellent also. |
|---|
| 691 |
|
|---|
| 692 |
The default spamassassin is ambivalent to porn (Some want this stuff?) I |
|---|
| 693 |
don't, so porn words are heavily punished in my config. |
|---|
| 694 |
|
|---|
| 695 |
Tests that throw false positives are: |
|---|
| 696 |
|
|---|
| 697 |
FORGED_<SOMEWHERE>_RCVD |
|---|
| 698 |
|
|---|
| 699 |
anything. Example: when a (top-posted) reply from hotmail.com answers a |
|---|
| 700 |
question from yahoo.com, you get FORGED_YAHOO_RCVD. |
|---|
| 701 |
|
|---|
| 702 |
These clever tests like backhair trip over linux program versions. |
|---|
| 703 |
Posted kernel configs are CAPS. Spamsigns are detected in directory |
|---|
| 704 |
names. A subject line like VIA GRATIS (The way of thanks in latin) also |
|---|
| 705 |
has VIAGRA in there. You can't make a rule against 'love' because |
|---|
| 706 |
'glover' is a surname. Tune accordingly. Try this |
|---|
| 707 |
|
|---|
| 708 |
cat spam1 |formail -n 2 -ds spamc -R >> spam1_reports (presuming ~50 messages) |
|---|
| 709 |
|
|---|
| 710 |
and repeat for all the others. DO NOT try that on a big mailbox, as |
|---|
| 711 |
spamc processes detach from formail, and it starts another before you |
|---|
| 712 |
finish. In 400 emails, I had 200 spamc processes looking for 10 spamd |
|---|
| 713 |
processes in one test. Then the modem backed up, and I lost all dns |
|---|
| 714 |
tests. If you don't have spare memory, drop the '-n 2' option and wait. |
|---|
| 715 |
The '-ds' splits the mailbox and pipes to the following command. |
|---|
| 716 |
|
|---|
| 717 |
Then try it on your ham, your saved messages, to show up false positives. |
|---|
| 718 |
Also, |
|---|
| 719 |
|
|---|
| 720 |
cat <mailbox> |formail -n 2 -ds spamc -c, which simply outputs a line per |
|---|
| 721 |
message with its score and the threshold. |
|---|
| 722 |
|
|---|
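
Those lines are easy to total up. A sketch, assuming each line looks like
score/threshold (for example 7.3/5.0), which is what the spamc man page
describes for -c; tally_scores is a made-up name.

```shell
# tally_scores: read "score/threshold" lines on stdin and report how many
# messages were seen and how many reached the threshold.
tally_scores() {
    awk -F/ '
        NF == 2 { total++; if ($1 + 0 >= $2 + 0) spam++ }
        END { printf "%d messages, %d at or above threshold\n", total, spam }
    '
}
```

Feed it with  cat <mailbox> |formail -ds spamc -c | tally_scores  and
compare the figure for a spam box against the one for a ham box.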
| 723 |
Another option is ' cat yourmail | procmail -d $USER ' and then it pops |
|---|
| 724 |
into ham or spam boxes appropriately. If you want to retest mail that has |
|---|
| 725 |
spamassassin markup already, try this line |
|---|
| 726 |
|
|---|
| 727 |
cat <mailbox> |formail -ds spamassassin -d >> file |
|---|
| 728 |
|
|---|
| 729 |
That removes the markup. This is not 100% reliable, so this sed |
|---|
| 730 |
|
|---|
| 731 |
sed -e '/X-Spam/d' -e '/>From/d' < input_file > output_file |
|---|
| 732 |
|
|---|
| 733 |
clears the remains. A sure sign that something has tripped over an old |
|---|
| 734 |
markup is a NO_RELAYS hit in the retests. |
|---|
| 735 |
|
|---|
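
The sed above works on whole lines anywhere in the file, so it leaves the
continuation lines of a folded X-Spam header behind and will also eat
matching lines in a message body. A more careful sketch in awk follows,
assuming an mbox where every message starts with a "From " line;
strip_spam_headers is a made-up name, and it leaves any ">From" lines alone.

```shell
# strip_spam_headers: copy an mbox from stdin to stdout, dropping X-Spam-*
# headers (with their folded continuation lines) from the header block of
# each message and leaving message bodies untouched.
strip_spam_headers() {
    awk '
        /^From / { inhdr = 1; skip = 0; print; next }           # start of a message
        inhdr && /^$/ { inhdr = 0; skip = 0; print; next }      # end of its headers
        !inhdr { print; next }                                  # body: copy as is
        /^[ \t]/ { if (!skip) print; next }                     # folded line follows its header
        { skip = ($0 ~ /^X-Spam-/) }                            # a new header begins
        !skip { print }
    '
}
```

Use it as  strip_spam_headers < input_file > output_file  before a retest.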
| 736 |
Once you get spamd running and working, the above process is necessary |
|---|
| 737 |
before repeat checks. Killing dccifd before repeats is also clever. You |
|---|
| 738 |
can razor-check all you like. Remember to remove the socket if you kill |
|---|
| 739 |
dccifd. Or restart it with the "Query-only" option. |
|---|
| 740 |
|
|---|
| 741 |
cat ham2 |formail -ds spamc -R |less gives you the reports and an |
|---|
| 742 |
extract on successive lines. Open consoles as you need them. On another console, |
|---|
| 743 |
get any ham marked as spam onscreen and presuming gpm is working, you |
|---|
| 744 |
can find the problem this way. |
|---|
| 745 |
|
|---|
| 746 |
Get the rule onscreen:   grep -r SOME_RULE_NAME /etc/mail/spamassassin/* |
|---|
| 747 |
and locate the regex |
|---|
| 748 |
|
|---|
| 749 |
Set up the test:   pcregrep -i 'whatever_regex' yourmail |
|---|
| 750 |
|
|---|
| 751 |
In the general run of play, you can probably lower my html scores, and |
|---|
| 752 |
adjust for your own situation. If you are a doctor, you will obviously |
|---|
| 753 |
have to adjust or whitelist any mail sources that send mail about drugs. |
|---|
| 754 |
|
|---|
| 755 |
Try to find negative rules that apply to your situation. To add a rule, |
|---|
| 756 |
find a similar rule. Don't fiddle with the 'eval do something' type |
|---|
| 757 |
rules as they are spamassassin builtins. The various header lines are |
|---|
| 758 |
specified by this sort of thing: "Received =~ /regex/", which checks just those |
|---|
| 759 |
lines. Invent your own rules as appropriate. These headers (Received, |
|---|
| 760 |
From, Subject, etc.) are all in ram as variables when a message is |
|---|
| 761 |
checked. Invent your own regex, and don't forget to run |
|---|
| 762 |
|
|---|
| 763 |
spamassassin -D --lint afterwards to check it out. Never mind what the |
|---|
| 764 |
errors say (some mistakes show up somewhere else); undo what you did last and lint |
|---|
| 765 |
again. 'man perlre' helps. Unrecognized options are a sign of missing |
|---|
| 766 |
plugins. I, for instance, do not use HashCash or RelayCountry plugins. |
|---|
| 767 |
If you decide to use them, enter the options off the man page. |
|---|
| 768 |
"Score set for nonexistent rule" in the lint means you are not using the |
|---|
| 769 |
same rules as me. Just remove the relevant line from local.cf |
|---|
| 770 |
|
|---|
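
The "undo what you did last and lint again" routine can be scripted. This is
only a sketch: try_config is a made-up name, and it assumes that
spamassassin --lint exits non-zero when it finds errors (check man
spamassassin on your version). Make your change in a copy, then let the
function install it and put the old file back if the lint fails.

```shell
# try_config FILE CANDIDATE [LINT_COMMAND...]
# Install CANDIDATE over FILE, run the lint command (default:
# spamassassin --lint), and restore the old FILE if the lint fails.
try_config() {
    file=$1 candidate=$2
    shift 2
    [ $# -gt 0 ] || set -- spamassassin --lint
    cp -p "$file" "$file.bak" || return 1      # keep the known-good version
    cp "$candidate" "$file" || return 1
    if "$@"; then
        rm -f "$file.bak"                      # lint passed, keep the new file
    else
        mv "$file.bak" "$file"                 # lint failed, undo the change
        return 1
    fi
}
```

As root:  try_config /etc/mail/spamassassin/local.cf /tmp/local.cf.new
(the second path is just an example), then restart spamd if it succeeds.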
| 771 |
Keep your spam for a month at least after you set the system running. |
|---|
| 772 |
You ideally need reports back of false positives and false negatives. Never |
|---|
| 773 |
get cocky, as there will be both. Tune up periodically. Spam changes. |
|---|
| 774 |
|
|---|
| 775 |
My current ratio is |
|---|
| 776 |
~ 99% of all spam successfully caught. |
|---|
| 777 |
~ 3% of ham marked as spam (Entirely from the lfs lists) . This |
|---|
| 778 |
is a high figure, but I'm lazy. The real problem is that if the query |
|---|
| 779 |
goes to spam, the answers do also. I retuned recently, and removed the |
|---|
| 780 |
Tripwire ruleset so I expect things will be better. |
|---|
| 781 |
|
|---|
| 782 |
What gets through is mail that mimics your own mail, and genuinely sent |
|---|
| 783 |
spam from webmail, short stuff, that doesn't trigger enough to top the |
|---|
| 784 |
spam score. What gets wrongly caught usually is misinterpreted signs of spam. |
|---|
| 785 |
Regexes are a non-thinking tool. This sort of email |
|---|
| 786 |
|
|---|
| 787 |
"Do you require a timepiece? http://spamsite.com/" |
|---|
| 788 |
|
|---|
| 789 |
is brief enough to be difficult to hit. Save off false positives and false |
|---|
| 790 |
negatives individually, and get them to land correctly by readjusting scores, |
|---|
| 791 |
linting, and restarting your spamd daemons. |
|---|
| 792 |
|
|---|
| 793 |
To correct the bayes learning, you can use |
|---|
| 794 |
|
|---|
| 795 |
sa-learn --ham --mbox <filename> OR |
|---|
| 796 |
sa-learn --ham <filename> for a single email |
|---|
| 797 |
sa-learn --forget does just that, and the database can be rebuilt. |
|---|
| 798 |
Likewise sa-learn --spam learns the other way. Man sa-learn. |
|---|
| 799 |
|
|---|
| 800 |
|
|---|
| 801 |
ACKNOWLEDGEMENTS: |
|---|
| 802 |
|
|---|
| 803 |
Authors of all software, and the regex Maestros of the anti-spam |
|---|
| 804 |
community. |
|---|
| 805 |
|
|---|
| 806 |
|
|---|
| 807 |
CHANGELOG: |
|---|
| 808 |
Nov. 21st 2005 Major Edit of inaccuracies, spellings, self congratulation & |
|---|
| 809 |
waffle. Tweak config files. |
|---|
| 810 |
|
|---|
| 811 |
Nov. 15th 2005: Finished this 1st draft. |
|---|