[Population: One]
Nov. 19th, 2002 04:33 pm
Jeremy Bowers writes on the hidden dangers of Bayesian spam filters. The core of the argument: spammers can use any possible filter mechanism to fine-tune their spam, and since the Bayesian filter is the best we have, once it fails we're doomed.
However, if you're trying to sell me something, you have to either a) market it in the body of the message, or b) give me a URL to look at. Here's the simple algorithm for filtering spam with URLs in it: if the sender is in my address book, let it through. Otherwise, mark it as possible spam. Jeremy neglects to consider the possibility of personalized filters which by their nature can't be duplicated by spammers, since they rely on information that only I have.
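The address-book rule above is simple enough to sketch in a few lines of Python. This is only an illustration, not any mail client's actual filter: the `ADDRESS_BOOK` contents, the `classify` function, and the URL pattern are all hypothetical names and choices made up for the example.

```python
import re

# Hypothetical address book; a real one would come from the mail client.
ADDRESS_BOOK = {"alice@example.com", "bob@example.org"}

# Rough URL detector (an assumption): anything starting with http://, https://, or www.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def classify(sender: str, body: str) -> str:
    """Return 'ok' or 'possible spam' for a single message."""
    if not URL_PATTERN.search(body):
        # This rule only covers messages that contain a URL.
        return "ok"
    if sender.strip().lower() in ADDRESS_BOOK:
        # Known sender: let it through even though it carries a link.
        return "ok"
    # Unknown sender plus a URL: flag it, but do not delete it.
    return "possible spam"


if __name__ == "__main__":
    print(classify("alice@example.com", "Look at http://example.com/photos"))
    print(classify("stranger@example.net", "Buy now at http://example.com/deal"))
```

The point of the sketch is that the deciding input is the address book, which a spammer cannot see and so cannot tune against.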
Bayesian filters may in the end prove to be personal enough, in fact, since they use your own email as the basis for the filters. All in all, I'm not too worried.
(Link by way of Workbench.)
no subject
Date: 2002-11-19 03:31 pm (UTC)
Though right now I'm not using a Bayesian filter -- I'm relying mostly on SpamAssassin, and I've turned off Mac OS X Mail's adaptive filter. Right now SpamAssassin is doing an excellent job of categorizing spam according to a set of very specific rules. I like knowing exactly what those rules are; I didn't like having to double-check Mac OS X Mail periodically and 'teach' it if it ever miscategorized a message.
There was a recent article on Slashdot, I think, about people gradually switching from blacklists to whitelists and assuming that an email is spam unless proven otherwise. It'll be interesting to see what happens if this trend continues.