BigAdmin System Administration Portal
Feature Article

A Guide to Centralized Spam and Virus Filtering

By Amy Rich


Integrating Sendmail, MIMEDefang, ClamAV, SpamAssassin, and Vipul's Razor

Spam and viruses constitute an enormous amount of mail on the Internet, wasting bandwidth and clogging the queues on defenseless mail servers. If delivered, the viruses also waste company resources as they spread. Mail administrators have taken a variety of approaches to defending their users from these two culprits, but it's difficult to keep up with new spamming techniques, viruses, and worms. This article explores the integration of Sendmail, MIMEDefang, ClamAV, SpamAssassin, and Vipul's Razor as a partial measure against spam and viruses.


How These Software Programs Work Together

On a machine running traditional Sendmail with no modifications, mail enters the machine via an SMTP transaction on port 25, is processed by the mail transfer agent or MTA (Sendmail) and is then handed off to the local delivery agent to be stored on disk for retrieval by the user. The process looks like Figure 1.

Figure 1: Routing of a Mail Message

Many sites opt to attack spam during the SMTP transaction, before it's accepted for delivery, using a combination of DNSBLs (DNS-based spam blocking lists), access lists, and the Milter (mail filter) API. The recommendations in this article focus on using the Milter API that ships with later versions of the Sendmail source code. The Milter API allows the use of third-party programs to scan and rewrite messages on the fly during the SMTP transaction. The Milter API is reasonably complex and would require its own article. For an in-depth look at how the Milter API interfaces with Sendmail and other programs, take a look at the reference material at www.milter.org.

Sendmail talks to the MIMEDefang software via a Milter provided with the MIMEDefang software. MIMEDefang provides a mechanism by which other programs may scan and modify the message, but it does not define any specific policy to do so. It's left to the system administrator to determine the policy defining what constitutes spam or a virus and what to do once one is found. If SpamAssassin and a supported virus scanner are already installed on the machine when MIMEDefang is built, MIMEDefang integrates with them seamlessly (see the MIMEDefang web site for a definitive list of supported scanners). For an extremely thorough explanation of how MIMEDefang works and how to write your own custom rules, read David F. Skoll's LISA 2003 presentation (PDF).

In this article, MIMEDefang is compiled to use the virus scanner ClamAV, the anti-spam software SpamAssassin, and various perl modules. I find MIMEDefang to be extremely flexible and extensible in this respect, and it can coordinate a number of modules and rules. Since MIMEDefang is written in perl, an administrator with perl experience can create a wide variety of rules to define a site policy. SpamAssassin, as well as using its own internal mechanisms to recognize UCE (unsolicited commercial email) and UBE (unsolicited bulk email), also has the ability to refer to external spam databases like Vipul's Razor. With the addition of these software packages, the new flow of the message during the SMTP transaction phase looks like Figure 2.

Figure 2: Routing of a Message with the Addition of Sendmail Milter Components

Armed with this high-level understanding of how each of the pieces fits together to target spam and viruses, I'll now describe the actual installation process.


Obtaining and Installing the Software

The reference test system for this installation runs the Solaris Operating System (Solaris 8 02/02) and has the following relevant, locally built packages already installed:

  • GNU gcc 3.1
  • GNU gzip
  • GNU m4
  • GNU make
  • GNU tar
  • perl 5.6.1 from Sunfreeware
  • procmail
  • Sleepycat db version 4

Installing Required and Optional perl Modules

Most of the work involved in compiling the main software packages occurs while installing all of the necessary perl modules beforehand. Here's the list of modules, in order, that I compiled and installed on the reference system. Each module's page lists a description and provides a download link for the tar file:

  1. http://search.cpan.org/~eryq/IO-stringy-2.109/
  2. http://search.cpan.org/~gaas/MIME-Base64-2.23/
  3. http://search.cpan.org/~gbarr/libnet-1.17/
  4. http://search.cpan.org/~markov/MailTools-1.60/
  5. http://search.cpan.org/~mharnisch/Unix-Syslog-0.100/
  6. http://www.mimedefang.org/static/MIME-tools-5.411a-RP-Patched-02.tar.gz
  7. http://search.cpan.org/~pmqs/DB_File-1.807/
  8. http://search.cpan.org/~sburke/HTML-Tagset-3.03/
  9. http://search.cpan.org/~gaas/HTML-Parser-3.35/
  10. http://search.cpan.org/~gaas/Digest-1.05/
  11. http://search.cpan.org/~gaas/Digest-SHA1-2.07/
  12. http://search.cpan.org/~gaas/Digest-MD5-2.33/
  13. http://search.cpan.org/~gaas/Digest-HMAC-1.01/
  14. http://search.cpan.org/~vipul/Digest-Nilsimsa-0.06/
  15. http://search.cpan.org/~petdance/Test-Harness-2.40/
  16. http://search.cpan.org/~mschwern/Test-Simple-0.47/
  17. http://search.cpan.org/~crein/Net-DNS-0.45/
  18. http://search.cpan.org/~jhi/Time-HiRes-1.54/
  19. http://search.cpan.org/~gaas/URI-1.29/
  20. http://search.cpan.org/~dougw/Convert-TNEF-0.17/
  21. http://search.cpan.org/~mlehmann/Convert-UUlib-1.03/
  22. http://search.cpan.org/~pmqs/Compress-Zlib-1.32/
  23. http://search.cpan.org/~nedkonz/Archive-Zip-1.09/
  24. http://search.cpan.org/~hdias/File-Scan-0.78/

For each of these modules, obtain the source code and perform the following. (Note: Testing is optional. Some tests will fail without affecting the functionality of the installation, but testing may help pinpoint any issues with the modules before installation.)

tar zxf <module>.tar.gz
cd <module>
perl Makefile.PL
make
make test
make install
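
Because the same commands repeat for every module, the steps above can be wrapped in a small loop. The following is only a sketch under stated assumptions: it expects a POSIX shell, tarballs in the current directory, and tarballs that unpack into a directory named after the file minus .tar.gz (which may not hold for every package). The names build_modules, run, and DRY_RUN are hypothetical helpers, not part of any of the packages.

```sh
#!/bin/sh
# Sketch: build perl module tarballs in the order given on the command line.
# Assumes each <name>.tar.gz unpacks into a directory called <name>.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_modules() {
    for tarball in "$@"; do
        dir=`basename "$tarball" .tar.gz`
        run tar zxf "$tarball" || return 1
        # The subshell keeps the cd from leaking into the next iteration.
        (
            run cd "$dir" &&
            run perl Makefile.PL &&
            run make &&
            { run make test || echo "warning: tests failed for $dir" >&2; } &&
            run make install
        ) || { echo "build failed: $dir" >&2; return 1; }
    done
}
```

Calling build_modules with the tarballs in list order (IO-stringy first, File-Scan last) preserves the dependency order. A failed test run only produces a warning, matching the note above that some tests fail harmlessly, while any other failure stops the loop.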

Installing Vipul's Razor

Vipul's Razor is another set of perl modules, but it is not distributed through CPAN and must be obtained separately from http://razor.sourceforge.net/. Razor is a spam detection database that's based on end-user contributions. The trustworthiness of a given user's input is based on that user's past spam reports and revocations. Mutating spam content is identified using statistical and randomized signatures.

Obtain the razor-agents-2.36.tar.gz package. (Note: The razor-agents-sdk-2.03 package isn't required because the prerequisite perl modules were covered in the preceding section.) Also obtain Quinlan's razor patch to fix razor's issue with perl taint checking. Unpack the source code, apply the patch and install the modules:

tar zxf razor-agents-2.36.tar.gz
mv Razor2.patch-quinlan razor-agents-2.36/
cd razor-agents-2.36
patch -p0 -d lib/Razor2 < Razor2.patch-quinlan
perl Makefile.PL
make
make install
/usr/local/bin/razor-client

Razor v2 requires that reporters be registered so that their associated submissions and revocations define their reputations over time. Create a default configuration file and registration with the following commands as a normal user, replacing user@your.domain and yourpasswd with values that you choose (if you do not specify the user or password, these values will be randomly chosen for you):

razor-admin -create
razor-admin -register -user=user@your.domain -pass=yourpasswd

The configuration file and registration information will be written to the directory ~/.razor/. See the following man pages for more information on using razor:

  • razor-report.1
  • razor-revoke.1
  • razor-check.1
  • razor-admin.1
  • razor-agents.5
  • razor-whitelist.5
  • razor-agent.conf.5

Installing SpamAssassin

SpamAssassin is a perl-based mail filter that identifies spam using checks such as:

  • Header tests
  • Body phrase tests
  • Bayesian filtering
  • Automatic address whitelist/blacklist
  • Manual address whitelist/blacklist
  • Collaborative spam identification databases (DCC [Distributed Checksum Clearinghouse], Pyzor, Razor2)
  • RBL (Realtime Blackhole List)
  • Character sets and locales

Obtain the source code: Mail-SpamAssassin-2.61.tar.gz. Install it as follows:

tar zxf Mail-SpamAssassin-2.61.tar.gz
cd Mail-SpamAssassin-2.61
perl Makefile.PL
What email address or URL should be used in the suspected-spam report
text for users who want more information on your filter installation?
(In particular, ISPs should change this to a local Postmaster contact)
default text: [the administrator of that system] postmaster@your.domain

Run Razor v2 tests (these may fail due to network problems)? (y/n) [n] y

make
make test
make install

Rule files for SpamAssassin are installed into /etc/mail/spamassassin/ and /usr/local/share/spamassassin. Additional system-wide rule files should be added to /etc/mail/spamassassin/ so that they are not overwritten when upgrading SpamAssassin. Read the spamassassin(1) man page and the USAGE file that comes with the SpamAssassin source code for various tips on customizing the installation. Also visit the SpamAssassin web site for FAQs, mailing lists, and wiki.
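
As an illustration of a site-wide rule file, the sketch below writes two settings into a rules directory. It is an example only: the file name site-local.cf, the function name write_site_rules, and the example.com address are placeholders, and the two directives shown (required_hits and whitelist_from) are the SpamAssassin 2.6x spellings, so check the Mail::SpamAssassin::Conf documentation for the installed version before relying on them.

```sh
#!/bin/sh
# Sketch: write a site-wide SpamAssassin rule file into a rules directory.
# The file name and the example.com address are placeholders.

write_site_rules() {
    rules_dir=$1
    cat > "$rules_dir/site-local.cf" <<'EOF'
# Score at which a message is treated as spam (5.0 is the stock value).
required_hits 5.0
# Never flag mail from this (placeholder) sender address.
whitelist_from postmaster@example.com
EOF
}

# Example (as root): write_site_rules /etc/mail/spamassassin
```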


Installing Sendmail with Milter Support

Sendmail, the MTA, must be compiled with Milter support in order to use MIMEDefang. If you are upgrading from an older version of Sendmail, you will also need to create the smmsp user and group:

/usr/sbin/groupadd -g 25 smmsp
/usr/sbin/useradd -c 'smmsp' -d /var/spool/clientmqueue -u 25 -g 25 \
  -s /bin/noshell smmsp
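
On a machine where an earlier upgrade may already have added the account, it can help to test for it first. The helper below is a sketch, assuming a POSIX shell and a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris); the name has_entry is hypothetical. It only inspects the local file it is given, so accounts served from NIS or another name service are not detected.

```sh
#!/bin/sh
# Sketch: report whether a name is already defined in a passwd- or
# group-format file (colon-separated, name in the first field).

has_entry() {
    name=$1
    file=$2
    awk -F: -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$file"
}

# Example: create the group and user only when they are missing.
# has_entry smmsp /etc/group  || /usr/sbin/groupadd -g 25 smmsp
# has_entry smmsp /etc/passwd || /usr/sbin/useradd -c 'smmsp' \
#     -d /var/spool/clientmqueue -u 25 -g 25 -s /bin/noshell smmsp
```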

Obtain sendmail.8.12.11.tar.gz from sendmail.org, then unpack the software:

tar zxf sendmail.8.12.11.tar.gz
cd sendmail-8.12.11

Add the following lines, and any additional configuration information the build requires, to devtools/Site/site.config.m4:

   APPENDDEF(`conf_sendmail_ENVDEF',`-DMILTER')dnl
   APPENDDEF(`conf_libmilter_ENVDEF', `-D_FFR_MILTER_ROOT_UNSAFE')dnl

The reference installation also includes the following lines to prevent the generation of catman pages and explicitly replace NIS support with Sleepycat DB support:

   define(`confMAPDEF', `-DNDBM  -DMAP_REGEX -DNEWDB')dnl
   define(`confLIBS', `-lsocket -lnsl -ldb')dnl
   define(`confDONT_INSTALL_CATMAN', `true')dnl

Build and install Sendmail and libmilter. Be sure to leave the Sendmail source code unpacked and built, since various include files and libraries will be required by the MIMEDefang build:

./Build install
cd libmilter
./Build install

Once Sendmail is installed, generate a new cf file from your customized mc file. These two lines should appear in the mc file to enable the MIMEDefang Milter and set the Milter log level to 1.

define(`confMILTER_LOG_LEVEL',`1')dnl
INPUT_MAIL_FILTER(`mimedefang',
  `S=unix:/var/spool/MIMEDefang/mimedefang.sock, 
   F=T, T=S:1m;R:1m')

The Sendmail documentation describes the flags to INPUT_MAIL_FILTER should the above values need fine tuning. Some users of MIMEDefang and SpamAssassin experience issues with filter timeouts, which may be eased by increasing the timeout values in the mc file:

INPUT_MAIL_FILTER(`mimedefang',
  `S=unix:/var/spool/MIMEDefang/mimedefang.sock, 
  F=T, T=S:5m;R:5m')

Both of the preceding configurations return a tempfail code if MIMEDefang fails while processing a message, asking the sending server to try again later. This is the recommended configuration if MIMEDefang is scanning for viruses, because no message is accepted unscanned when the filter fails. If MIMEDefang is only scanning for spam, and it's more important that the message be received even if there is an error in the filter, remove the F=T argument so that incoming messages are delivered when the filter fails:

INPUT_MAIL_FILTER(`mimedefang',
  `S=unix:/var/spool/MIMEDefang/mimedefang.sock, 
   T=S:5m;R:5m')
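
The three variants above differ only in the failure flag and the timeouts. The sketch below generates the mc line from those two choices, which makes the relationship explicit. The function name milter_line is hypothetical, and it applies one timeout value to both S (sending to the filter) and R (reading its reply). The F=R (reject) mode is not used in this article; it is included because Sendmail's cf/README documents it alongside F=T.

```sh
#!/bin/sh
# Sketch: print an INPUT_MAIL_FILTER line for the MIMEDefang socket.
#   $1 = what to do when the filter fails: tempfail | reject | accept
#   $2 = timeout for both S (send) and R (read), for example 1m or 5m

milter_line() {
    on_fail=$1
    timeout=$2
    sock=unix:/var/spool/MIMEDefang/mimedefang.sock
    case $on_fail in
        tempfail) flag="F=T, " ;;   # ask the sender to retry later
        reject)   flag="F=R, " ;;   # refuse the connection outright
        accept)   flag="" ;;        # no F flag: carry on as if no filter
        *) echo "unknown failure mode: $on_fail" >&2; return 1 ;;
    esac
    printf "INPUT_MAIL_FILTER(\`mimedefang', \`S=%s, %sT=S:%s;R:%s')\n" \
        "$sock" "$flag" "$timeout" "$timeout"
}

# Example: milter_line tempfail 5m
```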

An example mc file, /etc/mail/gateway.mc, might look like the following on a mail hub:

divert(0)dnl
VERSIONID(`@(#)gateway.mc 1.0 (Mydomain) 01/08/2004')
OSTYPE(solaris2)dnl
FEATURE(`nouucp',`reject')dnl
MASQUERADE_AS(`my.domain')dnl
FEATURE(`redirect')dnl
FEATURE(`masquerade_envelope')dnl
FEATURE(`use_cw_file')dnl
FEATURE(`local_procmail')dnl
FEATURE(`always_add_domain')dnl
FEATURE(`masquerade_entire_domain')dnl
FEATURE(`allmasquerade')dnl
FEATURE(`access_db')dnl
LOCAL_USER(`root')dnl
EXPOSED_USER(`root')dnl
#
define(`confPRIVACY_FLAGS', `authwarnings,noexpn,novrfy')dnl
define(`confTO_IDENT', `0s')dnl
define(`confSMTP_LOGIN_MSG', `$j (NO UCE)')dnl
define(`confMILTER_LOG_LEVEL',`1')dnl
INPUT_MAIL_FILTER(`mimedefang', 
  `S=unix:/var/spool/MIMEDefang/mimedefang.sock, 
  F=T, T=S:1m;R:1m')
MAILER(local)dnl
MAILER(smtp)dnl

Run the mc file through m4 to generate a new Sendmail configuration file:

m4 /usr/lib/mail/m4/cf.m4 /etc/mail/gateway.mc > /etc/mail/gateway.cf
mv /etc/mail/gateway.cf /etc/mail/sendmail.cf
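
Since the mv above replaces the live configuration, a cautious variant keeps a copy of the old file and refuses to install an empty one. This is a sketch assuming a POSIX shell; the function name install_cf and the .bak suffix are hypothetical choices.

```sh
#!/bin/sh
# Sketch: install a freshly generated cf file, keeping a backup of the old one.
#   $1 = newly generated file, $2 = live configuration file

install_cf() {
    new=$1
    live=$2
    # Refuse to install an empty file, as a failed m4 run can leave behind.
    [ -s "$new" ] || { echo "refusing to install empty $new" >&2; return 1; }
    if [ -f "$live" ]; then
        cp -p "$live" "$live.bak" || return 1
    fi
    mv "$new" "$live"
}

# Example:
# m4 /usr/lib/mail/m4/cf.m4 /etc/mail/gateway.mc > /etc/mail/gateway.cf &&
#     install_cf /etc/mail/gateway.cf /etc/mail/sendmail.cf
```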

Installing ClamAV

Clam AntiVirus is a GPL anti-virus toolkit for UNIX. The ClamAV software package provides a flexible and scalable multi-threaded daemon, a command-line scanner, and a tool for automatic updating over the Internet. The programs are based on a shared library distributed with the Clam AntiVirus package.

ClamAV will not use digital signatures unless GNU MP is installed. GNU MP is a library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating point numbers. Before building ClamAV, fetch, unpack, configure, and install GNU MP (replace the setenv and unsetenv commands if you're not running a csh derivative):

ncftpget ftp://ftp.gnu.org/gnu/gmp/gmp-4.1.2.tar.gz
tar zxf gmp-4.1.2.tar.gz
cd gmp-4.1.2
setenv ABI 32 # for 32 bit compiling, otherwise ClamAV won't find  __gmpz_init
./configure
make install
unsetenv ABI

ClamAV and MIMEDefang need to run as the same user, so next create the defang user and group. I arbitrarily picked 125 as the UID and GID. Also add an entry to /etc/mail/aliases so that mail to the defang user goes to a real person:

   groupadd -g 125 defang
   useradd -u 125 -g 125 -d /var/spool/MIMEDefang defang
   echo "defang: somerealuser" >> /etc/mail/aliases
   newaliases

Obtain the clamav-0.65.tar.gz source code, unpack it, configure it, and install it. By default, clamd logs to the LOCAL6 facility. To change this, modify clamd/clamd.c before compiling the software:

tar zxf clamav-0.65.tar.gz
cd clamav-0.65
./configure --with-user=defang --with-group=defang
make
make install

Next, edit /usr/local/etc/clamav.conf: comment out the Example line, turn on logging via syslog, specify the location of the PID and socket files, and indicate which user clamd should run as:

# Example
LogSyslog
PidFile /var/spool/MIMEDefang/clamd.pid
LocalSocket /var/spool/MIMEDefang/clamd.sock
User defang

Be sure to add a line to /etc/syslog.conf for the desired log facility and level, then HUP syslogd. Finally, initialize the virus database, create a log file, and set up a cron entry for the user defang that updates the virus database hourly:

touch /var/log/clam-update.log
chown defang:defang /var/log/clam-update.log
chmod 600 /var/log/clam-update.log
su - defang
/usr/local/bin/freshclam
echo "0 * * * * /usr/local/bin/freshclam --quiet -l /var/log/clam-update.log" \
  > /tmp/defang-cron
crontab /tmp/defang-cron
rm /tmp/defang-cron
exit

Installing MIMEDefang

As mentioned previously, MIMEDefang is a framework for filtering email that uses Sendmail's Milter API, along with some C and perl code, to block or tag spam and viruses. MIMEDefang is also capable of a wide range of other body and header modifications as described on the MIMEDefang web site.

MIMEDefang requires the defang user and group (already created), a spool directory, a quarantine directory, and the previously built Sendmail source tree. First make the necessary directories:

mkdir -p /var/spool/MIMEDefang /var/spool/MD-Quarantine
chmod 700 /var/spool/MIMEDefang /var/spool/MD-Quarantine
chown defang:defang /var/spool/MIMEDefang /var/spool/MD-Quarantine

Then obtain the source for mimedefang-2.39.tar.gz. Unpack the source code in the same directory where the Sendmail source code was unpacked (/usr/local/src, for example), configure it, and install it:

tar zxf mimedefang-2.39.tar.gz
cd mimedefang-2.39
./configure
make
make install

Finally, install the sample MIMEDefang filter:

cp /etc/mail/mimedefang-filter.example /etc/mail/mimedefang-filter

If you want to change the existing filter rules or add your own, modify /etc/mail/mimedefang-filter.


Creating a Startup Script

The Sendmail startup script needs to start ClamAV and MIMEDefang before it starts Sendmail itself. The Sendmail startup script also changed drastically with newer versions of Sendmail, which split the daemon into two parts for increased security. To prevent your combined startup file from being overwritten by a new Sendmail patch or package, remove /etc/init.d/sendmail and create a new script named something like /etc/init.d/mailserver. Be sure to create links from the appropriate /etc/rcN.d directories, replacing those for S88sendmail and K36sendmail.

Executing /etc/init.d/mailserver start should bring up ClamAV, MIMEDefang, and Sendmail in order.
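
A minimal sketch of such a script follows. It is not the script from the reference system, and several details are assumptions to verify locally: the binary locations (/usr/local/sbin/clamd, /usr/local/bin/mimedefang, /usr/lib/sendmail), the mimedefang.pid file name, the 15-minute queue interval, and the use of pkill to stop Sendmail. The multiplexor options are the ones used elsewhere in this article. DRY_RUN and the function names are hypothetical helpers.

```sh
#!/bin/sh
# /etc/init.d/mailserver (sketch): start ClamAV, MIMEDefang, then Sendmail.
# Paths, the mimedefang.pid name, and the queue interval are assumptions.
# Set DRY_RUN=1 to print the commands instead of running them.

SPOOL=/var/spool/MIMEDefang

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

start_all() {
    # 1. Virus scanner daemon (reads /usr/local/etc/clamav.conf).
    run /usr/local/sbin/clamd || return 1

    # 2. MIMEDefang: the multiplexor first, then the milter that talks to it.
    run /usr/local/bin/mimedefang-multiplexor \
        -p $SPOOL/mimedefang-multiplexor.pid -m 2 -x 10 -U defang -b 600 \
        -l -s $SPOOL/mimedefang-multiplexor.sock || return 1
    run /usr/local/bin/mimedefang -P $SPOOL/mimedefang.pid -U defang \
        -m $SPOOL/mimedefang-multiplexor.sock \
        -p $SPOOL/mimedefang.sock || return 1

    # 3. Sendmail 8.12: the listening MTA plus the client queue runner.
    run /usr/lib/sendmail -L sm-mta -bd -q15m || return 1
    run /usr/lib/sendmail -L sm-msp-queue -Ac -q15m
}

stop_all() {
    # Stop in reverse order so Sendmail is gone before its filter is.
    run pkill -x sendmail
    for pidfile in mimedefang.pid mimedefang-multiplexor.pid clamd.pid; do
        [ -f "$SPOOL/$pidfile" ] && run kill `cat "$SPOOL/$pidfile"`
    done
    return 0
}

case "${1:-}" in
    start) start_all ;;
    stop)  stop_all ;;
    "")    ;;   # no argument: define the functions only
    *)     echo "Usage: $0 {start|stop}" >&2 ;;
esac
```

Running the script with DRY_RUN=1 in the environment prints the five start commands in order without launching anything, which is a convenient way to review the options before linking the script into the /etc/rcN.d directories.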


Testing and Troubleshooting

It's a good idea to turn on full debug logging during some sort of burn-in period. Try passing the new setup some spam and some viruses to see if they're caught and tagged. SpamAssassin comes with two example text files, sample-spam.txt and sample-nonspam.txt. As the names suggest, one should be flagged as spam and one should not. SpamAssassin can be tested from the command line as follows:

spamassassin -x -t < sample-nonspam.txt > nonspam.out
spamassassin -x -t < sample-spam.txt > spam.out

The -x flag indicates that SpamAssassin should not create a preferences file for the user running the test, and the -t flag invokes SpamAssassin's testing mode. When running the test from the command line, the spam.out file should contain the header X-Spam-Flag: YES and the nonspam.out file should not, even though both files will indicate that the message may be spam. This note is tacked on to the bottom of both messages because spamassassin was invoked in test mode. The analysis details of each message should indicate the following for sample-spam.txt:

Content analysis details:   (1002.5 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
1000 GTUBE                  BODY: Generic Test for Unsolicited Bulk Email
 1.6 RAZOR2_CF_RANGE_51_100 BODY: Razor2 gives confidence between 51 and 100
                            [cf: 100]
 0.9 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/)

And the following for sample-nonspam.txt:

Content analysis details:   (0.0 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.0 LINES_OF_YELLING       BODY: A WHOLE LINE OF YELLING DETECTED
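
The X-Spam-Flag check on spam.out and nonspam.out can also be scripted instead of read by eye. The helper below is a sketch assuming a POSIX shell and awk; the name is_flagged is hypothetical. It looks only at the header block, so the test-mode report appended to the body of both messages cannot cause a false match.

```sh
#!/bin/sh
# Sketch: succeed if a message file carries "X-Spam-Flag: YES" in its headers.

is_flagged() {
    # The header block ends at the first blank line; stop reading there.
    awk '/^$/ { exit }
         /^X-Spam-Flag: YES/ { found = 1 }
         END { exit !found }' "$1"
}

# Example:
# is_flagged spam.out    && echo "spam sample flagged, as expected"
# is_flagged nonspam.out || echo "non-spam sample not flagged, as expected"
```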

Next try mailing these messages to a test account via an outside machine. MIMEDefang does not allow SpamAssassin to modify the headers directly, so the headers that get added are from MIMEDefang and not SpamAssassin. The spam message should contain the following headers and have a SpamAssassin attachment declaring the message to be spam:

X-Spam-Score: 1000 (****************************************) GTUBE
X-Scanned-By: MIMEDefang 2.39

The non-spam message should contain no X-Spam headers at all, but should contain the X-Scanned-By: MIMEDefang 2.39 header.

If the SpamAssassin command-line tests fail, there is an error with the SpamAssassin and/or Razor installation. Try running the command-line tests with the -D flag to turn on debugging. If the command line tests succeed, but the SMTP tests fail, there is an error with the way SpamAssassin interfaces with MIMEDefang, MIMEDefang itself, or with Sendmail and/or the MIMEDefang Milter. Be sure that syslog is recording debugging information for the MAIL facility. Also make sure that MIMEDefang has the Mail::SpamAssassin feature enabled by doing the following:

mimedefang.pl -features

Also try logging debugging information from mimedefang-multiplexor by starting it with the -d flag. Output will be sent to the file /var/log/mdefang-event-debug.log:

touch /var/log/mdefang-event-debug.log
chown defang:defang /var/log/mdefang-event-debug.log
/usr/local/bin/mimedefang-multiplexor -d -p \
  /var/spool/MIMEDefang/mimedefang-multiplexor.pid -m 2 -x 10 -U defang -b 600 \
  -l -s /var/spool/MIMEDefang/mimedefang-multiplexor.sock

The virus scanner ClamAV also needs testing. Though MIMEDefang is using the clamd daemon, running clamscan on the test directory in the ClamAV source directory can help determine if the compilation of ClamAV was successful:

cd clamav-0.65
clamscan -r -l scan.txt

clamscan should find five or six infected files in the test directory (six if you have unrar 3.0 or greater, five otherwise). For SMTP testing, I've found the site TESTVIRUS.org quite useful. This site will also test any other virus scanner and show what each does and does not block.

If ClamAV appears to be working incorrectly, try setting the log level to DEBUG in /etc/syslog.conf, then modify /usr/local/etc/clamav.conf to turn on verbose logging, enable debugging, and keep the daemon from forking into the background:

# Enable verbose logging.
LogVerbose

# Don't fork into background. Useful in debugging.
Foreground

# Enable debug messages in libclamav.
Debug

If the problem appears to be with Sendmail or the Milter, try using various Sendmail debugging flags to pinpoint the issue. To watch the Sendmail/Milter interaction, for example, invoke sendmail as:

sendmail -Am -bs -d64.5

Each software package also has a community of users to turn to for extra help. There are generic resources like the newsgroup comp.mail.misc or program-specific mailing lists and wikis. I mention a number of these in the following Resources section.


Resources

  • Sendmail web site
  • Milter web site
  • MIMEDefang web site
  • ClamAV web site
  • SpamAssassin web site
  • Vipul's Razor web site
  • Razor2 patch for perl taint checking
  • David F. Skoll's LISA 2003 presentation (PDF) on fighting spam with MIMEDefang
  • Free third-party packages for Solaris from Sunfreeware.com
  • search.cpan.org perl code search engine
  • comp.mail.misc newsgroup
  • comp.mail.sendmail newsgroup
  • libmilter API
  • libmilter documentation
  • MIMEDefang mailing list
  • ClamAV mailing lists
  • SpamAssassin mailing lists
  • SpamAssassin wiki
  • list of SpamAssassin tests
  • Razor Users and Razor Announce mailing lists
  • comp.lang.perl.misc newsgroup
