SURBL Lists
SURBLs are lists of web sites that appear in unsolicited messages.
They can be used by any program that checks the web sites found in
message bodies against such lists, such as SpamAssassin 3 and others
mentioned on the links page.
Here's an overview of the lists and their data sources.
sc.surbl.org contains message-body web sites processed from
SpamCop
URI reports, also known as
"spamvertised"
web sites.
The reports are not used directly, but are subject to extensive processing.
Entries in sc.surbl.org expire automatically several days after
the SpamCop reports decrease.
Note that this list is not the same as
bl.spamcop.net,
which is a list of mail sender IP addresses found in message headers.
ws.surbl.org has records from Bill Stearns'
SpamAssassin ruleset
sa-blacklist,
plus some other manual lists. Bill's
policy
for inclusion and cleaning of the sa-blacklist
is quite sound, though it differs somewhat from some of the other
SURBLs.
ws and sc seem to detect somewhat
different types of sites, so when used together
they should complement each other well.
Advantages of turning SA rulesets into SURBLs
Using SURBLs derived from SpamAssassin rulesets
instead of the rulesets offers several advantages.
First, SpamAssassin uses much less memory,
since a large set of rules is not loaded;
the data are instead cached as DNS records in your local name server.
Second, since data in SURBLs are no longer tied to SpamAssassin,
they can be used in other programs that can check message body URIs against
a list, such as MTA plugins, other mail filters, etc.
Third, updates tend to be more timely since a DNSBL can be updated
automatically every few minutes with generally low overhead.
So applications using SURBLs gain efficiency,
modularity, portability, and automatically updated data.
SURBL and Bill Stearns strongly recommend using
SURBLs instead of sa-blacklist as a SpamAssassin ruleset.
Anyone using sa-blacklist should migrate to using SURBLs instead.
SURBLs are supported in SpamAssassin version 3 and later.
Outblaze
is kindly providing their internal URI blacklist which
is published as ob.surbl.org. The list is detecting about 70%
of unsolicited messages with about 0.03% false positives.
Outblaze describes the data as
coming from message body analysis and from user reports.
SURBL applies additional policies to its version of
the Outblaze URI data that are published as ob.surbl.org.
The user reports are also used, but not directly.
Note that Outblaze's sender IP blacklist, which is visible
on their web site, is not the same as their
URI blacklist.
The SURBL list is based on their separate URI blacklist
which is not visible on their web site.
AbuseButler
is kindly providing its list of
Spamvertised Sites
that have been most often reported over the past 7 days,
published as ab.surbl.org.
The philosophy and data processing methods are similar
to the sc.surbl.org data, and the results are similar, but not identical.
Data sources for AbuseButler include SpamCop
and native AbuseButler reporting.
The
Anti-Phishing Working Group
has a good definition of phishing on their web site.
Phishing and malware data from multiple sources are
included in the ph Phishing data source.
Phishing data
were first provided by
MailSecurity.
As of October 2006, we are also adding
PhishTank data to our phishing list.
As of September 2007, we are also listing
OITC phishing and malware data.
As of December 2007, we have added
the DNS blackhole
malware and phishing site data
from malwaredomains.com to our phishing list.
As of April 2008, the list also includes
Malware Block List
data from malware.com.br.
As of June 2009, the ph list includes
ZeuS Tracker malware host data.
As of October 2009, data from
Malware Domain List
have been added to the ph list.
Note that the above is only a sampling of some of the PH data sources.
We apply additional policy-based filtering to most of the data sources
before listing.
Joe Wein's
jwSpamSpy
program forms the basis of the JP data,
being used both by Joe's own systems and by
Raymond Dijkxhoorn and his colleagues at
Prolocation.
Prolocation is processing more than
300,000 likely unsolicited messages per day
using jwSpamSpy plus their own policies, and adding the results to Joe's data.
The resulting list has a very good detection rate of around
80% and a very low false positive rate of around 0.01%.
JP is included in the default configuration of SpamAssassin 3.1
and other SURBL applications.
All of the SURBL data sources are combined into
a single, bitmasked list: multi.surbl.org.
Bitmasking means that there is only one entry per domain name
or IP address, but that entry will resolve into an address
(DNS A record)
whose last octet indicates which lists it belongs to.
The bit values in that octet for the different lists are:
2 = comes from sc.surbl.org
4 = comes from ws.surbl.org
8 = comes from phishing data source (labelled as [ph] in multi)
16 = comes from ob.surbl.org
32 = comes from ab.surbl.org
64 = comes from jp data source (labelled as [jp] in multi)
If an entry belongs to just one list it will
have an address where the last octet has that value, for
example 127.0.0.8 means it comes from the phishing list
and 127.0.0.2 means it's in the data used in sc.surbl.org.
An entry on multiple lists gets the sum of those list numbers
as the last octet, so 127.0.0.6 means a record is on both
ws.surbl.org and sc.surbl.org (comes from: 2 + 4 = 6).
In this way, membership in multiple lists is encoded into a single response.
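The decoding described above can be sketched in a few lines of Python. This is only an illustration: the bit values come from the list above, while the function name decode_multi and the short labels are assumptions made for this example, not part of any SURBL software.

```python
import ipaddress

# Bit values as given in the list above. The short labels are simply the
# list prefixes used on this page (an assumption made for readability).
SURBL_BITS = {
    2: "sc",
    4: "ws",
    8: "ph",
    16: "ob",
    32: "ab",
    64: "jp",
}


def decode_multi(address):
    """Return the labels of the lists encoded in a multi A record.

    `address` is the dotted-quad string returned by the lookup,
    for example "127.0.0.6".
    """
    # The last octet carries the sum of the bit values of all matching lists.
    last_octet = ipaddress.IPv4Address(address).packed[-1]
    return [label for bit, label in SURBL_BITS.items() if last_octet & bit]


print(decode_multi("127.0.0.6"))  # ['sc', 'ws']  (2 + 4 = 6)
print(decode_multi("127.0.0.8"))  # ['ph']
```

Testing each bit with a bitwise AND, rather than comparing the whole octet against known sums, keeps the check correct for any combination of lists.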
Please use multi and not the individual lists,
since using multi combines several queries into a single one,
reducing DNS usage and overhead.
The individual lists may be withdrawn at some point in future.
Therefore every SURBL application should use multi only.
We recommend using multi with programs that can
decode the responses into specific lists according to bitmasks,
such as SpamAssassin 3's
urirhssub
or
SpamCopURI
version 0.22 or later for use with SpamAssassin 2.64.
Default TTL for the live data in the combined list is 3 minutes.
Each entry also has a TXT record mentioning which lists it is on,
and pointing to this page.
While we expect the TXT records to be relatively stable,
we recommend that automatic processing be based on the A record.
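As a rough sketch of automatic processing based on the A record only, the Python below looks up a domain name in multi and tests a single bit. Several things here are assumptions rather than statements from this page: the query name is formed by prepending the domain to multi.surbl.org (the usual DNSBL convention), a failed resolution is treated as "not listed", IP address entries are not handled, and the function names query_multi and listed_on are hypothetical. The resolve parameter exists only so the lookup can be replaced in tests.

```python
import socket

MULTI_ZONE = "multi.surbl.org"


def query_multi(domain, resolve=socket.gethostbyname):
    """Return the A record for `domain` in multi, or None if none is found."""
    # Assumed query form: <domain>.multi.surbl.org
    name = f"{domain.rstrip('.').lower()}.{MULTI_ZONE}"
    try:
        return resolve(name)
    except socket.gaierror:
        # No A record (or a resolver failure): treated here as "not listed".
        return None


def listed_on(domain, bit, resolve=socket.gethostbyname):
    """Return True if the multi A record for `domain` has `bit` set."""
    address = query_multi(domain, resolve)
    if address is None or not address.startswith("127."):
        # Ignore anything that is not a 127.x.x.x style answer.
        return False
    last_octet = int(address.split(".")[-1])
    return bool(last_octet & bit)
```

For example, listed_on("example.com", 8) would ask whether that domain is on the phishing list, using one DNS query against the combined list. The TXT record is deliberately not consulted.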
Other lists may become available as future SURBLs.
Please check back here occasionally or on the
Announce mailing list for updates.
To request removal from a SURBL list, please start with the
SURBL Lookup
page and follow the instructions on the removal form.
For the ph.surbl.org list (PH),
please be sure to remove and secure all phishing sites,
cracked accounts, viruses, malware loaders, trojan horses,
unpatched operating systems, insecure PHP boards, cracked SQL,
insecure ftp passwords, etc., from your server before contacting us.
lists.html version 2.39 on 5/29/10