CertCities.com | Column: Endmail Part I: The War on Spam

Saturday: April 5, 2014

James Ervin

Endmail Part I: The War on Spam

James offers this overview of the methods for fighting Spam before diving into tools for Linux administrators.

by James Ervin

4/2/2003 -- Legend has it that in the Internet's youth, e-mail providers maintained a "gentleman's agreement" not to meddle with the contents of messages. Today, users frustrated with a far less gentle Internet demand meddling to reduce spam. This places providers in the unenviable position of protecting their customers' privacy while invading it. Bruce Schneier has commented sagely on this phenomenon: people don't want privacy; they just want assurances that their private information won't be misused.

Social Engineering
Like computer security, Schneier's métier, spam is a social problem rather than a technical one. The anti-spam community's toughest hurdle is the lack of a legally defensible and enforceable federal definition of spam -- if one existed, it's certain that the flood of class-action lawsuits to follow would intimidate at least some unregenerate spammers. Unfortunately, thanks to a weak-kneed, uninformed Congress and its long history of protecting the interests of enterprise on the grounds that unrestricted commercial speech is included under the rubric of free speech, no federal anti-spam legislation has been enacted in the United States, although the European Parliament recently adopted measures going into effect in October 2003.

In the US, what exists is a hodgepodge of confusing and contradictory state laws: for instance, North Carolina law only prohibits unsolicited commercial e-mail. Elizabeth Dole's senatorial campaign used this loophole to send thousands of unsolicited non-commercial political advertisements. This tactic struck at least one constituent, Ken Pugh, as disingenuous, though his lawsuit for $80 was later thrown out of small-claims court. Undoubtedly, the definition of "solicited" will be up for grabs next time around. Without federal guidelines, pursuing such claims across state lines is laughable.

Public distaste for spam is so strong that even the Direct Marketing Association (DMA) has spoken out in favor of spam laws, on the grounds that "legitimate" marketing is damaged by the prevalence of spam when users delete legitimate advertisements indiscriminately. Lest you think that the DMA truly has your best interests at heart, note that they are still lobbying against the recently approved national telemarketer's "opt-out" list in a lawsuit against the Federal Trade Commission. As a preemptive strike on the potential precedent this list sets for spam, the DMA is now touting self-regulation as the answer to the spam problem, and has set up their own opt-out list known as the E-mail Preference Service.

To a different cast of mind, the DMA's position begs the question of whether "legitimate" marketing even exists. The most militant anti-spammers contend that the recipient is always the final arbiter of what constitutes spam -- hence, spam is spam, "legitimate" or not. Yet it's not difficult to conceive of situations where the recipient's opinion should be disregarded -- for instance, when a town council sends adverse weather advisories (perhaps "terror alerts" is more topical) to its constituency. Supreme Court Justice Potter Stewart's famous concurring opinion on the definition of obscenity in Jacobellis v. Ohio -- "I know it when I see it" -- applies equally well to the contentious definition of spam. Stewart's opinion is justly notorious, though: "common sense" can't be legislated effectively, because everyone's differs.

A frequently cited common-sense definition of spam is "unsolicited commercial bulk e-mail," and it's clear that the words "unsolicited," "commercial," and "bulk" go a long way towards expressing what many people feel is uniquely objectionable about spam. An imperfect definition is a better foundation for law than none, and this definition has been adopted by many anti-spam groups and local governments in an attempt to ameliorate the situation. Yet even if spam can be effectively defined, there are always people adept at circumventing definitions, or simply ignoring the law.

Technical Solutions
The latest spam-fighting techniques take advantage of the social spam-fighting finesse we all inherently possess.

Distributed Identification
Distributed identification applies peer-to-peer logic to the spam problem. A community of users identifies incoming mails as spam. Once enough users do so, the offending mails can be dealt with automatically for other subscribers. Razor, a Perl module used by SpamAssassin (the leading server-side spam identification utility), and Razor's commercial offshoot, SpamNet (which offers a plug-in for Microsoft Outlook users), are the most visible proponents of this method. Unfortunately, centralized peer-to-peer networks are subject to a host of problems, including nefarious users, denial-of-service attacks, and so on.
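The report-and-threshold idea can be sketched in a few lines of Python. This is only an illustration of the principle, not Razor's actual protocol or signature scheme: the SharedCatalogue class, the SHA-256 fingerprint, and the threshold of three reporters are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict

REPORT_THRESHOLD = 3  # hypothetical: distinct reporters needed before a mail counts as spam


def fingerprint(body: str) -> str:
    """Hash a lowercased, whitespace-normalized body so trivially reformatted copies match."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SharedCatalogue:
    """Toy stand-in for the central catalogue that subscribers report to and query."""

    def __init__(self, threshold: int = REPORT_THRESHOLD):
        self.threshold = threshold
        self._reporters = defaultdict(set)  # fingerprint -> set of reporting users

    def report(self, user: str, body: str) -> None:
        # A set is used so one user reporting repeatedly still counts once.
        self._reporters[fingerprint(body)].add(user)

    def is_spam(self, body: str) -> bool:
        reporters = self._reporters.get(fingerprint(body), ())
        return len(reporters) >= self.threshold
```

Counting distinct reporters rather than raw reports is one modest defense against the nefarious users mentioned above, though a determined attacker could simply register more accounts.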

Content Filtering
Content filtering reverses the gentleman's agreement: each incoming e-mail is presumed guilty, and searched for identifying marks. If the Mark of Cain is found, the mail is quarantined, discarded, or branded as spam so that an aware browser or delivery program can deal with it. The similarities to virus scanning are obvious and intentional. Despite the stench of McCarthy, content filtering is the most effective method of identifying spam. Seen in this light, distributed identification is simply a form of content filtering that uses humans as the sieve.
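A static, rule-based content filter might look something like the following Python sketch. The patterns, weights, and threshold are invented for illustration and are not taken from SpamAssassin or any other product; real rule sets are far larger and tuned against actual mail.

```python
import re

# Hypothetical rules: (pattern, score). A real filter would ship hundreds of these.
RULES = [
    (re.compile(r"\bfree money\b", re.IGNORECASE), 2.5),
    (re.compile(r"\bclick here\b", re.IGNORECASE), 1.5),
    (re.compile(r"!{3,}"), 1.0),  # three or more exclamation marks in a row
]
THRESHOLD = 3.0  # assumed cutoff above which a mail is branded as spam


def score(text: str) -> float:
    """Add up the weights of every rule whose pattern appears in the text."""
    return sum(weight for pattern, weight in RULES if pattern.search(text))


def tag(text: str):
    """Return a (verdict, score) pair that a delivery program could act on."""
    total = score(text)
    verdict = "spam" if total >= THRESHOLD else "ham"
    return verdict, total
```

Returning the score along with the verdict leaves the choice between quarantining, discarding, or merely branding the mail to whatever program receives it.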

The most famous new content filter is described in an article which has become a battle cry for spam warriors, Paul Graham's "A Plan for Spam." Graham describes a method called Bayesian filtering, wherein users "seed" their e-mail filters with known spams. While traditional content filters use a static set of rules to determine whether e-mail is spam, Bayesian filters change over time, based upon your additions to the spam database. If you don't want to receive mails containing the word "orange," simply identify enough mails with that word as spam, and the filter should eventually comply. Many products employing Bayesian filtering are available, including Mozilla and the open-source command-line Unix utilities Bogofilter, ifile, and bmf.
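To make the "seeding" idea concrete, here is a minimal naive Bayes sketch in Python. It is a textbook formulation with add-one smoothing, not the specific formula from Graham's article, and the BayesFilter class, its method names, and the tokenizer are assumptions made for this example.

```python
import math
import re
from collections import Counter

TOKEN = re.compile(r"[a-z0-9$'-]+")


def tokens(text: str) -> set:
    """Distinct lowercase tokens; each token counts at most once per message."""
    return set(TOKEN.findall(text.lower()))


class BayesFilter:
    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}  # messages containing each token
        self.totals = {"spam": 0, "ham": 0}                   # messages seen per label

    def train(self, text: str, label: str) -> None:
        """Seed the filter with a message already known to be 'spam' or 'ham'."""
        self.counts[label].update(tokens(text))
        self.totals[label] += 1

    def spam_probability(self, text: str) -> float:
        # Work in log-odds so many small probabilities do not underflow.
        log_odds = math.log((self.totals["spam"] + 1) / (self.totals["ham"] + 1))
        for tok in tokens(text):
            p_spam = (self.counts["spam"][tok] + 1) / (self.totals["spam"] + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (self.totals["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-700.0, min(700.0, log_odds))  # keep math.exp within float range
        return 1.0 / (1.0 + math.exp(-log_odds))
```

Each call to train shifts the per-word estimates, which is the sense in which such a filter changes over time: mark enough "orange" mails as spam and the word's contribution tips toward spam.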

The tragic flaw of Bayesian filtering is that if truly effective, it will sow the seeds of its own demise -- like smallpox vaccinations, if there's no spam left to train filters on, we'll all be vulnerable to a sudden outbreak. Granted, that's unlikely, and projects such as the Spam Archive are doing the hard work of collecting spam for us. A more serious concern is that, given the inevitable increases in storage and bandwidth in years to come, spammers will begin doping their e-mails with legitimate words (perhaps entire dictionaries), confounding the Bayesian filters. Pornographic websites often include common search words in non-displayable HTML code to boost their rankings on common search engines. Graham asserts that a similar approach is unlikely to work, because spammers would have to tailor their messages to individual senders' and recipients' writing styles -- however, it's conceivable that enterprising linguistics students might be able to offer statistical probabilities on what words spammers should include, if a reliable database of legitimate e-mail could be acquired. Imagine an Outlook virus that discreetly sent inboxes to the Direct Marketing Association, for instance.

Blacklisting
Blacklisting is the practice of simply not accepting mail from recidivist domains -- the Internet equivalent of sex offender registration databases. The Mail Abuse Prevention System (MAPS) publishes a list of abusive domains via the Domain Name System (DNS) and other methods. Unfortunately, blacklisting often results in the loss of legitimate e-mail, and should you happen to fall into one through no fault of your own, via identity theft or other means, it's very difficult to get out. It's not for nothing that MAPS calls their blacklist the Realtime Blackhole List.
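DNS-based blacklists are conventionally queried by reversing the octets of the connecting IP address, appending the list's zone, and checking whether the resulting name resolves. The Python sketch below follows that convention; the zone dnsbl.example.org is a placeholder rather than a real list, and the function names are made up for the example.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the lookup name: 192.0.2.1 in zone Z becomes 1.2.0.192.Z."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # raises ValueError on a bad address
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Treat any successful resolution as 'listed' and a failed lookup as 'not listed'."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


# Usage (performs a live DNS query; the zone here is a placeholder):
# is_listed("192.0.2.1", "dnsbl.example.org")
```

Note that this sketch cannot tell "not listed" apart from "the list is unreachable"; a mail server relying on it would need to decide how to treat DNS failures.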

The efficacy of blacklists is also limited by their high turnover: spammers can easily obtain a free e-mail account with an online service, send their spam, and slink back into the shadows like some sort of trap-door spider. Blacklists are legion; some blacklists, such as the Open Relay Database, aren't even devoted to spammers, but to sites which simply allow relaying of mail. To maintain control over mail delivery and be able to respond to legitimate complaints, some organizations maintain their own blacklists rather than subscribing to any of the free services. Sendmail 8.9, for instance, introduced the access database capability, which allows selective processing of incoming mail based upon the sender's name, domain, or IP; most mail delivery agents have a similar feature.
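A locally maintained blacklist of this kind boils down to a lookup table keyed on sender address, domain, or IP. The Python sketch below shows one way such a lookup could work; it does not reproduce Sendmail's access database file format or its exact lookup order, and the table entries, the lookup function, and the most-specific-first ordering are assumptions for illustration.

```python
# Hypothetical local access table: key -> action.
ACCESS = {
    "spammer@example.com": "REJECT",   # a single sender
    "example.net": "REJECT",           # a whole domain, including subdomains
    "partner.example.net": "OK",       # a more specific exception to the rule above
    "192.0.2": "REJECT",               # every address starting with 192.0.2.
}


def lookup(sender: str, client_ip: str, table=ACCESS, default: str = "OK") -> str:
    """Return the action for the most specific matching key, or the default."""
    sender = sender.lower()
    domain = sender.rpartition("@")[2]

    # Most specific first: the full address, then the domain and its parents.
    candidates = [sender]
    labels = domain.split(".")
    candidates += [".".join(labels[i:]) for i in range(len(labels))]

    # Then the full IP, followed by progressively shorter prefixes.
    octets = client_ip.split(".")
    candidates += [".".join(octets[:i]) for i in range(len(octets), 0, -1)]

    for key in candidates:
        if key in table:
            return table[key]
    return default
```

Checking specific keys before general ones is what lets an organization block a troublesome domain while still accepting mail from one trusted host inside it.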

Next Time...
Since there's lots of money to be made, the entire force of human ingenuity is being brought to bear on both sides of the spam war. This has led to an interesting, perhaps disturbing convergence between the techniques of law enforcement and anti-spam zealots, for whom some information is adamantly not meant to be free, but kept tightly in check. Although it's tempting to wax nostalgic for the days when the gentlemen were all in accord, it's now impossible to de-commercialize the Internet.

Spam is a social problem on a global scale, less serious than famine or disease but equally recalcitrant. To misquote Bruce Schneier once again, there's no "magic spam dust" that can be sprinkled on the problem. The anti-spam community is well aware of this; but it's equally true that technical wizardry can ameliorate the problem. Next time, we'll look at a few of the previously mentioned spam tools in-depth.

Comments? Questions? Post your thoughts below!

James Ervin is alone among his coworkers in enjoying Michelangelo Antonioni films, but in his more lucid moments suspects that they're not entirely wrong.

There are 45 user Comments for “Endmail Part I: The War on Spam”
4/2/03: Martha McDermit from New York says:	There aren't many things worse than spam, but James Ervin's columns are one of them. I suggest James right an article on what it is like to spend 2 years in a closet..at least while he does his research we can all enjoy the time.
4/3/03: Joey from NC says:	Martha McDummit is off-base. I enjoyed the article. It was a good summary of the state-of-the art. P.S. to editors: Can you fix her typo ("right" should be "write")?
4/13/03: Nina from Texas says:	Isn't "Martha McDermit" a common name that a spam-roboter generates as a reply-address when sending spam? There are days I get 40 spams a day and it makes me so upset, I'd even opt to an aggressive p2p-solution that lets all p2p-users attack the spammers sites (sites like "ahugeorgan.com" i.E.).
