My recent article on spam filtering
led reader Lee Williams to send me this message.
A slight variant on your suggestion is based in AI,
and it's called Bayesian classification. Rather than having the user
explain to the machine every keyword it wants to flag, the Bayesian
classifier simply assigns probabilities to each (word, phrase) based on
what class you put the email in. Then, when it sees another email, it
is able to use these probabilities to estimate the class it belongs in.
It's actually a lot simpler than having the end user manually create
these filters.
Paul Graham discusses this in his "A Plan for Spam".
The key element in both solutions is that they are
completely customizable. The existence of individual filters would not
necessarily eliminate 100% of the spam sent to each person, but in the aggregate
it would mean that each email reaches considerably fewer people,
thereby hopefully making it at least somewhat less cost-effective.
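To make the idea in the message concrete, here is a minimal sketch of a word-based Bayesian classifier in Python. It is a generic naive Bayes filter with add-one smoothing, not the exact formula from Graham's article, and the class name, method names, and the "spam"/"keep" labels are all made up for illustration.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Hypothetical two-class filter: every message is 'spam' or 'keep'."""

    def __init__(self):
        # Word counts per class, learned from the folders the user sorts mail into.
        self.word_counts = {"spam": Counter(), "keep": Counter()}
        self.message_counts = Counter()

    def train(self, text, label):
        """Record one message the user has already classified."""
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        """Estimate the probability that a new message is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["keep"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label, counts in self.word_counts.items():
            # Class prior, smoothed so an empty class never gives log(0).
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(counts.values())
            for word in text.lower().split():
                # Add-one smoothing: unseen words get a small nonzero probability.
                score += math.log((counts[word] + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Turn the two log scores into a probability without underflow.
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["keep"])


mail_filter = NaiveBayesFilter()
mail_filter.train("cheap pills buy now", "spam")
mail_filter.train("buy cheap watches now", "spam")
mail_filter.train("lunch meeting tomorrow", "keep")
mail_filter.train("project notes for the meeting", "keep")
```

After training on those four toy messages, `mail_filter.spam_probability("buy cheap pills")` comes out well above one half and `mail_filter.spam_probability("meeting notes tomorrow")` well below it. A real filter would need far more training mail and a better tokenizer than `split()`.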
Graham's article is good reading. A Bayesian filter is something I would
definitely try. An email client with such filtering would be a great
hit, in my opinion. [Editor's note: Apple's Mail client in OS X
10.2 includes a Bayesian filter. I've been using it for over a month
and am still training it. dk]
I wonder if a combination of the traditional manual filtering and
this would be even better. The manual filtering would allow you to set
a rule for a specific phrase or person (say, your ex) which
gives a 100% probability of being filtered.
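Here is a rough sketch of how such a combination might look in Python. The manual rules are checked first and force a probability of 1.0; anything else falls through to whatever learned filter is in use. The function name, its parameters, the 0.9 threshold, and the example addresses are all assumptions for illustration, and `spam_probability` stands in for any function that returns a score between 0 and 1.

```python
def classify(sender, text, spam_probability,
             blocked_senders=(), blocked_phrases=(), threshold=0.9):
    """Return (probability, folder) for one message.

    Hypothetical helper: manual rules override the learned score.
    """
    lowered = text.lower()
    # Manual rule 1: a blocked sender is always filtered.
    if sender.lower() in {s.lower() for s in blocked_senders}:
        return 1.0, "spam"
    # Manual rule 2: a blocked phrase anywhere in the text is always filtered.
    if any(phrase.lower() in lowered for phrase in blocked_phrases):
        return 1.0, "spam"
    # Otherwise defer to the learned (for example, Bayesian) estimate.
    probability = spam_probability(text)
    folder = "spam" if probability >= threshold else "keep"
    return probability, folder


# Usage with a stand-in scorer that always returns 0.2.
blocked = classify("ex@example.com", "hi there", lambda t: 0.2,
                   blocked_senders=["ex@example.com"])
passed = classify("friend@example.com", "hi there", lambda t: 0.2)
```

The manual rules run first so that a message from a blocked sender is filtered no matter how innocent its wording looks to the learned filter.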
One interesting point related to your statement about the email
reaching fewer people - which Graham doesn't make either - is that if a
user actually wanted to receive these emails - after all, someone must
be responding to them - they'd like this filter, too, because they'd
sort the spam they like into the "keep" folder.
Contrary to Graham's hypothesis, spam need not end. It would become
highly targeted at those people most likely to respond to it. The rest
of us would ignore it. If everyone had such filters, the things you're
most interested in would be the only things that would get through.
Everyone would like this type of filtering: haters of spam,
responders to spam, and senders of spam.
Now for an even spookier thought - apply the same technology to a
TiVo with voice recognition and filter your TV commercials the same
way.
is a longtime Mac user. He was using digital sensors on Apple II computers in the 1980s and has networked computers in his classroom since before the internet existed. In 2006 he was selected as the California Computer Using Educators teacher of the year. His students have used NASA space probes and regularly participate in piloting new materials for NASA. He is the author of two books and numerous articles and scientific papers. He currently teaches astronomy and physics in California, where he lives with his twin sons, Jony and Ben. And there's still a Mac G3 in his classroom which finds occasional use.