rblcheck 1.4 - Command-line interface to Paul Vixie's RBL filter.
Copyright (C) 1997, Edward S. Marshall <emarshal@logic.net>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

-- Compiling:

To compile rblcheck, simply edit the file "Makefile" to taste; the
defaults are for gcc, but this program should compile happily on other
compilers. When you are satisfied with your compiler settings in Makefile,
type "make" and the program will try to compile. On some systems, you'll
need to add "-lresolv" to LDLIBS (Solaris, for example).

When invoked, rblcheck returns either -1 (for serious failure), 0 (to
indicate that the IP address is not filtered by any of the RBL-alike
services), or a positive number (indicating the number of filters that the
IP address is filtered by).
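
If you want to act on that return code from a shell script, something like
the following sketch may help. The function name describe_rbl_status is
made up for this example, and it assumes that the -1 failure value shows up
in the shell as 255 (exit statuses are only 8 bits wide).

```shell
#!/bin/sh
# Hypothetical helper: turn an rblcheck exit status into a message.
# Assumption: the "-1" serious-failure value reaches the shell as 255.
describe_rbl_status() {
    case "$1" in
        0)   echo "not filtered" ;;
        255) echo "rblcheck failed" ;;
        *)   echo "filtered by $1 service(s)" ;;
    esac
}

# Typical use (needs rblcheck in your PATH):
#   rblcheck -q 127.0.0.2; describe_rbl_status $?
```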

The program has several command line options:

-q      Be quiet about whether the IP address matched the filter or not.
        Handy for automatic processing scripts which just check the return
        code of rblcheck.

-t      Display, if available, a textual description of why the site was
        originally placed in a particular filter.

-l      List the currently defined RBL filters. By default, rblcheck will
        check "rbl.maps.vix.com" and "rbl.dorkslayers.com". You can change
        these defaults easily by editing rblcheck.c and changing the lines
        in main() which call togglesite().

-c      Clear the list of defined RBL filters. This is handy if you don't
        know what default filters might be defined on a system, and wish
        to hand-pick your list of filters.

-s <service>
        Add a new service to the list of defined RBL filters.

-h, -?  Get help about rblcheck.

-v      Display the version of rblcheck.
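
For instance, to ignore whatever defaults were compiled in and check
against a hand-picked list only, you could build the command line with a
small helper like this one. It's only a sketch: rbl_args is a hypothetical
name, and it assumes rblcheck processes its options from left to right (so
-c has to come before any -s).

```shell
#!/bin/sh
# Hypothetical helper: print the rblcheck arguments needed to check one
# address against only the services named on the command line.
# Assumption: options are handled left to right, so -c goes first.
rbl_args() {
    ip=$1
    shift
    args="-q -c"
    for service in "$@"; do
        args="$args -s $service"
    done
    echo "$args $ip"
}

# Typical use (needs rblcheck in your PATH):
#   rblcheck $(rbl_args 127.0.0.2 rbl.maps.vix.com)
```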

To verify that the program is working after you've compiled it, try the
following test:

	% rblcheck 127.0.0.2
	RBL filtered by rbl.maps.vix.com
	RBL filtered by rbl.dorkslayers.com
	% rblcheck 127.0.0.1
	not RBL filtered by rbl.maps.vix.com
	not RBL filtered by rbl.dorkslayers.com

If you get any other result from those two checks, then something has
gone terribly wrong, and you should email me with what happens. If it
works, go ahead and use it in good health, knowing that you've made
your world a better place to be. Well, ok, how about knowing that you
won't be getting nearly as much spam as you currently are? :-)
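
If you'd rather script that check than eyeball it, here is a rough sketch
that relies on the return codes described earlier instead of the printed
text. The RBLCHECK variable and the rbl_selftest name are inventions for
this example, and the 255 case assumes that is how your shell reports the
-1 failure value.

```shell
#!/bin/sh
# RBLCHECK is a hypothetical variable naming the binary under test.
RBLCHECK=${RBLCHECK:-rblcheck}

rbl_selftest() {
    # 127.0.0.2 should be listed: expect a positive count, not 0 and
    # not 255 (assumed here to be how the shell sees the -1 failure).
    status=0
    "$RBLCHECK" -q 127.0.0.2 || status=$?
    if [ "$status" -eq 0 ] || [ "$status" -eq 255 ]; then
        echo "FAIL: 127.0.0.2 gave status $status"
        return 1
    fi
    # 127.0.0.1 should be clean: expect status 0.
    status=0
    "$RBLCHECK" -q 127.0.0.1 || status=$?
    if [ "$status" -ne 0 ]; then
        echo "FAIL: 127.0.0.1 gave status $status"
        return 1
    fi
    echo "OK"
}

# Typical use (needs rblcheck in your PATH):
#   rbl_selftest
```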


-- Using rblcheck with Procmail:

rblcheck was really designed to be used with procmail, as long as you have
access to the IP address of the system sending you email. Surprisingly,
most MTAs make obtaining this information more difficult than it needs to
be. The procmail rule I present here assumes you've found some way to put
the IP address of the sender in the variable TCPREMOTEIP. See the sections
below on Sendmail and QMail for ideas on how you can get ahold of this
value.

The following procmail rule will, once you have TCPREMOTEIP, use rblcheck
to look up the IP address in the MAPS RBL filter:

:0
* ! ? if [ -n "$TCPREMOTEIP" ]; then rblcheck -q "$TCPREMOTEIP"; fi
{
	EXITCODE=100
	LOGABSTRACT=all
	LOG="Filter: RBL-filtered address: \"$TCPREMOTEIP\"
"
	:0:
	$FILTER_FOLDER
}

"FILTER_FOLDER" is assumed to have been set up ahead of time as the place
to put email that you don't want to see (either another incoming folder,
/dev/null, or a 'formail' invocation that rewrites the message and tacks
on an extra header or munges the subject so you can easily identify it).

Note that the EXITCODE variable above is for QMail, and indicates a
permanent error. Under Sendmail, 77 is more appropriate. Under anything
else, it's hard to say; your best bet is to refer to the documentation
regarding execution of programs.

One more thing: procmail has a nasty habit of cleaning out its environment
on startup, which is redundant because most MTAs already do this for you.
Hence, you should add the '-p' flag to the invocation of procmail (either
from a local delivery rule in Sendmail, or from a .forward or .qmail
file). This will ensure that procmail preserves the TCPREMOTEIP variable.

To test the procmail recipe:

a) save any mail message, with full headers, to a file.

b) run procmail with the environment variable "TCPREMOTEIP" set to an
   offending address, and with the message you just saved as input:

    cat message | env - TCPREMOTEIP=127.0.0.2 procmail -p

c) check your procmail log and mailbox. If the message went through,
   you have a problem. If you have a message in your logfile stating
   that the message was bounced for being in the RBL, you're fine.

d) run procmail with the environment variable "TCPREMOTEIP" set to a
   non-filtered address, such as 127.0.0.1, and with the message as
   input:

    cat message | env - TCPREMOTEIP=127.0.0.1 procmail -p

e) check your procmail log and mailbox. If the message didn't go
   through, you have a problem. If you have a copy of the message in
   your mailbox, and no errors in your log file, you're fine.


-- Using rblcheck with Sendmail:

This solution for obtaining the IP address of the connecting host could be
considered to be a bit of a hack, but it works quite reliably. If you're
an ordinary user on your system, you won't be able to use this; talk to
your system administrator about the possibility of installing the
sendmail.cf patch below. Point them at this file as a source of
information.

Currently, in your sendmail.cf file, you'll probably see something like:

Mlocal,	P=/bin/mail, F=lsDFMAw5:/|@SnE, S=10/30, R=20/40,
	T=DNS/RFC822/X-Unix,
	A=mail -f $g -d $u

Or, if you're using procmail as the local delivery agent:

Mlocal,	P=/usr/bin/procmail, F=lsDFMAw5:/|@ShPfn, S=10/30, R=20/40,
	T=DNS/RFC822/X-Unix,
	A=procmail -a $h -d $u

This is the local delivery rule used to execute .forward scripts.  Your
system might use something like "rsh" or another restricted shell instead
of "sh" for running programs. Don't let that scare you; they all basically
work the same.

Change the above lines to look like this (there will also be Mprog lines
which look similar; you can modify them in exactly the same manner):

Mlocal,	P=/usr/bin/env, F=lsDFMAw5:/|@SnE, S=10/30, R=20/40,
	T=DNS/RFC822/X-Unix,
	A=env TCPREMOTEIP="${client_addr}" mail -f $g -d $u

(Replace "mail -f $g -d $u" with "procmail -a $h -d $u" if procmail is
your local delivery agent.)

ONLY change the P=... and A=... entries. Most certainly do not mess with
F=... unless you know what you are doing.

This will create an environment variable "TCPREMOTEIP", which you can now
use with rblcheck to determine if the address has been blocked. To test
this, set up an alias like:

    foo: |mailx -s "$TCPREMOTEIP" user@domain.com

and then send email to the alias "foo" (or whatever). You should
immediately get a piece of email with the IP address which sent the
message in the subject line. (Replace "mailx" with "mail" on some
systems.)

This is about the most efficient means of getting this information to
executed programs that I can see with sendmail. What would -really- be
nice here would be a way to program how Sendmail sets up the environment
before executing an external program, at the point of execution. Bug the
Sendmail developers if you agree with me. ;-)


-- Using rblcheck with QMail:

Getting this going under QMail turns out to be a real challenge, since
QMail doesn't have the same level of programmability that Sendmail has.
Hence, we need to employ an additional script to grab the IP address from
the headers. (Thanks to Russell Nelson for confirming QMail's behavior
here.)

QMail has a very specific means of adding Received: lines to messages,
making them relatively easy to parse. For example, the following headers
are typical:

Return-Path: <emarshal@xnet.com>
Delivered-To: emarshal@LOGIC.NET
Received: (qmail 26029 invoked from network); 13 Oct 1997 15:04:13 -0000
Received: from quake.xnet.com (HELO mail.xnet.com) (198.147.221.35)
  by labyrinth.logic.net with SMTP; 13 Oct 1997 15:04:13 -0000
...

We can disregard the Return-Path: and Delivered-To: lines; they're
unimportant to us. The Received: headers are the most interesting. The
first Received: line we'll see is the local delivery of the mail; hence,
the "qmail 26029 invoked from network". The second Received: line is the
most important to us; it's the one which contains the IP address of the
sender...in this case, 198.147.221.35. To complicate things, the "(HELO
mail.xnet.com)" section may not exist, and the IP address might have ident
information prepended to it (like "qmailr@198.147.221.35").

Two programs are provided to help you retrieve this information
automatically from the headers, both with the same semantics. "origip.c"
compiles into "origip", and for those who have trouble compiling it (if
you do, please email me with any errors), "origip.awk" is provided which
behaves the same way.

Essentially, you pass either of these programs an email message, and they
in turn extract the sending address and either print it back to you, or
exit with a non-zero return value.
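
To give a feel for the parsing involved, here is a minimal shell sketch of
the same idea. This is NOT the code in origip.c or origip.awk; the function
name extract_sender_ip is made up, and it assumes the address is the last
parenthesized item on the first "Received: from" header line, as in the
sample headers above.

```shell
#!/bin/sh
# Hypothetical sketch: read a message on stdin, print the sender's IP
# address taken from the first "Received: from" header line.
# Exits non-zero if no usable address is found.
extract_sender_ip() {
    sed -n '
        # A blank line ends the headers; stop looking.
        /^$/q
        /^Received: from /{
            # Keep only the contents of the last (...) on the line.
            s/.*(\([^()]*\))[[:space:]]*$/\1/
            # Drop ident information such as "qmailr@".
            s/^.*@//
            p
            q
        }
    ' | grep -E '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$'
}

# Typical use:
#   extract_sender_ip < message || echo 127.0.0.1
```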

To use this in procmail, just use:

TCPREMOTEIP=`origip || echo 127.0.0.1`

This will pipe the message through origip (replace origip with
"origip.awk" in the case of using the awk script), and will capture the
address. If there is an error, we'll default to 127.0.0.1, which will
allow the mail through.

(If you're undecided which program of the two you want to use, consider
that the C version is much faster, and will be maintained more than the
awk script. However, the C version is probably more prone to bugs. ;-)

Once you have that line in place, go ahead and use the procmail recipe
supplied above in good health.


-- Using rblcheck with inetd and smtpd:

If you use an SMTP server which runs from inetd (sendmail can operate this
way, as can QMail and a number of other MTAs), here's a good way to do
site-wide filtering using rblcheck; add the following to /etc/hosts.allow:

smtpd: ALL: twist /usr/local/bin/rblcheck -q %a && \
	exec /usr/local/bin/smtpd || /bin/echo \
	"469 Connection refused. See http://maps.vix.com/rbl/\r\n"

This gives you RBL support on a site-wide basis, even if native support
doesn't yet exist for your MTA of choice.


-- Other mailers/other filter packages:

Don't ask me. If you figure out a way to make this work under another
setup, let me know how you did it, and I'll add it here. If you find
better ways of doing this than the ones I'm using above, let me know too,
and you'll see your idea show up in here in the next release.


-- Notes on origip:

Not convinced that you should use the C version of origip? Here's some
data (using test_origip.sh as a testbed) which definitely speaks volumes:

With origip.awk:

18.60user 22.46system 0:43.59elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (102172major+51041minor)pagefaults 0swaps

With origip.c:

13.16user 19.41system 0:33.73elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (71381major+46044minor)pagefaults 0swaps

About 10 seconds faster overall, 3 seconds less system time, 5 seconds
less user time, and over 30,000 fewer page faults. In other words: a LOT
easier on
the box you'll be running this on. This test was run on a little under 600
messages, from various mailing lists and private messages, on a Linux
2.0.31 system.
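
For the curious, the arithmetic behind that summary can be reproduced with
awk from the shell. The numbers are copied straight from the two timing
reports above; origip_savings is just a name picked for this example.

```shell
#!/bin/sh
# Differences between the origip.awk and origip.c timing reports.
origip_savings() {
    awk 'BEGIN {
        printf "elapsed saved: %.2f s\n", 43.59 - 33.73
        printf "user saved:    %.2f s\n", 18.60 - 13.16
        printf "system saved:  %.2f s\n", 22.46 - 19.41
        printf "major page faults saved: %d\n", 102172 - 71381
        printf "total page faults saved: %d\n", (102172 + 51041) - (71381 + 46044)
    }'
}

origip_savings
```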
