Spam and Personal Privacy
Presented by: Ashley Embry

Outline
What is Spam?
A. Types of Spam
Where Did the Word “spam” Originate?
How Spam Begins: A General Explanation
Who Has the Potential to be a Spammer?
Statistics About Spam
Getting Rid of Spam
Breakdown of a Spam Filter
A. Types of Spam
Spam has been defined in many ways.
Put simply, spam is the flooding of the Internet with many copies of the same message, in an attempt to force the message on people who would not otherwise choose to receive it.
There are two main types of spam:
1. Usenet spam is aimed at people who read newsgroups but rarely or never post, and so rarely give their information away.
2. E-mail spam targets individual users with direct mail messages. E-mail spam lists are created by scanning Usenet postings, stealing Internet mailing lists, or searching for addresses.
The use of “spam” for floods of inappropriate postings comes from a Monty Python skit in which a couple goes into a restaurant and the wife tries to order something other than Spam. In the background, a group of Vikings sings the praises of Spam. Pretty soon the only thing that you can hear is…
Like the song, spam is the endless repetition of worthless text.
A dictionary attack uses software that opens a connection to the mail server and rapidly submits millions of random e-mail addresses. Many of these addresses are slight variations of one another, such as "firstname.lastname@example.org" and "email@example.com". The software then records which addresses the server accepts and adds those addresses to the spammer's list. These lists are typically resold to many other spammers.
Anyone can be a spammer.
Let’s say your grandmother bakes the best banana nut bread ever created, and you want to sell the recipe for $5.
You have 100 people in your personal e-mail address book. You send out an e-mail advertising,
“Big Momma’s Nana Nut Bread - only $5 !!!”
From your 100 e-mails you get 2 orders and make $10.
Imagine if you had sent out 1,000,000 e-mails…
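To make the scale concrete, here is a minimal sketch of the arithmetic. It assumes the response rate of 2 orders per 100 e-mails stays the same for a much larger mailing, which is an assumption made purely for illustration; the function name `projected_sales` is hypothetical.

```python
def projected_sales(emails_sent, response_rate, price):
    """Return (orders, revenue) if a fixed fraction of recipients buy."""
    orders = round(emails_sent * response_rate)
    return orders, orders * price

response_rate = 2 / 100  # 2 orders from 100 e-mails, as in the example above

print(projected_sales(100, response_rate, 5))        # (2, 10)
print(projected_sales(1_000_000, response_rate, 5))  # (20000, 100000)
```

Under that assumption, the same message sent to 1,000,000 addresses would bring in 20,000 orders and $100,000.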
In a single day in May, the No. 1 Internet service provider, AOL Time Warner (AOL), blocked 2 billion spam messages (88 per subscriber) from hitting its customers' e-mail accounts.
Microsoft (MSFT), which operates the No. 2 service provider MSN as well as Hotmail, says it blocks an average of 2.4 billion spam messages per day.
Most spam blockers use filters that search for the phrases and overly aggressive writing styles common in mass e-mail marketing. Spammers try to fool the filters by changing their writing styles and formats so that their messages can sneak past.
The best technology currently available to stop spam is spam filtering software.
The simplest filters use keywords such as “xxx,” “viagra,” etc., but they are also more likely to block the e-mails that you do want to receive.
The more advanced filters, Bayesian filters for example, take this approach further and identify spam statistically, based on how frequently each word appears in spam and in legitimate mail.
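The keyword approach, and its two weaknesses, can be sketched in a few lines. This is an illustrative sketch rather than any particular product's filter: the keyword list is taken from the examples above, and the name `is_blocked` is hypothetical.

```python
import re

# Illustrative keyword list, taken from the examples in the text.
BLOCKED_KEYWORDS = {"xxx", "viagra"}

def is_blocked(message, keywords=BLOCKED_KEYWORDS):
    """Return True if any blocked keyword appears as a whole word in the message."""
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    return not words.isdisjoint(keywords)

print(is_blocked("Cheap VIAGRA, order now!!!"))                  # True: spam caught
print(is_blocked("The doctor asked if Dad still takes Viagra"))  # True: wanted mail blocked
print(is_blocked("Cheap v1agra, order now!!!"))                  # False: respelled keyword slips past
```

The second message is the kind of e-mail you do want to receive, and the third shows how a small change in spelling defeats a fixed keyword list.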
When new mail arrives, it is scanned into tokens, and the fifteen tokens whose probabilities are farthest from the neutral probability of .5 are used to calculate the probability that the e-mail is spam.
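The selection step might look like the following sketch. Several details are assumptions made for illustration: the tokenizing pattern, the choice to skip tokens that have no stored probability, and the names `tokenize` and `most_interesting`. The sample probabilities are invented.

```python
import re

def tokenize(message):
    """Split a message into a set of lowercase tokens (assumed token pattern)."""
    return set(re.findall(r"[a-z0-9$'-]+", message.lower()))

def most_interesting(message, token_probs, n=15):
    """Return the n (token, probability) pairs farthest from the neutral 0.5."""
    known = [(tok, token_probs[tok]) for tok in tokenize(message) if tok in token_probs]
    # Sort by distance from 0.5, largest first; break ties alphabetically.
    known.sort(key=lambda pair: (-abs(pair[1] - 0.5), pair[0]))
    return known[:n]

# Invented per-token spam probabilities, for illustration only.
token_probs = {"viagra": 0.99, "free": 0.9, "lunch": 0.2, "friday": 0.3, "the": 0.5}

top = most_interesting("Free lunch on Friday, the Viagra is on us", token_probs, n=3)
print(top)  # [('viagra', 0.99), ('free', 0.9), ('lunch', 0.2)]
```

Tokens near .5, such as "the" here, say little either way, so they are the first to be left out.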
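To determine the probability that an e-mail containing the token is spam:
(let ((g (* 2 (or (gethash token good) 0)))   ; count in good mail, doubled
      (b (or (gethash token bad) 0)))         ; count in spam
  (unless (< (+ g b) 5)                       ; skip tokens seen fewer than 5 times
    (max .01                                  ; clamp the result to [.01, .99]
         (min .99 (float (/ (min 1 (/ b nbad))
                            (+ (min 1 (/ g ngood))
                               (min 1 (/ b nbad)))))))))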
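For readers more comfortable with Python, the same calculation can be sketched as below. It follows the formula in Graham's essay, including the clamp to the range .01 to .99 that the essay applies. The function name and the sample counts are hypothetical; `ngood` and `nbad` are taken to be the numbers of good and spam messages in the training mail.

```python
def token_spam_probability(token, good, bad, ngood, nbad):
    """Estimate the probability that a message containing `token` is spam.

    good / bad map tokens to occurrence counts in good mail and in spam;
    ngood / nbad are the numbers of good and spam messages (both > 0).
    Returns None for tokens seen too rarely to judge.
    """
    g = 2 * good.get(token, 0)  # good counts are doubled to reduce false positives
    b = bad.get(token, 0)
    if g + b < 5:
        return None
    bad_freq = min(1.0, b / nbad)
    good_freq = min(1.0, g / ngood)
    return max(0.01, min(0.99, bad_freq / (good_freq + bad_freq)))

# Invented counts, for illustration only.
good = {"lunch": 10}
bad = {"viagra": 20, "lunch": 1}

print(token_spam_probability("viagra", good, bad, ngood=100, nbad=100))  # 0.99
print(token_spam_probability("rare", good, bad, ngood=100, nbad=100))    # None
```

A token that appears only in spam would score 1.0 before clamping, so the clamp keeps any single token from deciding the outcome on its own.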
To determine whether the e-mail is spam, using the probabilities of the 15 chosen tokens:
(let ((prod (apply #'* probs)))               ; product of the token probabilities
  (/ prod (+ prod (apply #'* (mapcar #'(lambda (x)
                                         (- 1 x))   ; product of the complements
                                     probs)))))
*information taken from www.paulgraham.com
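A Python sketch of the combining step follows. It computes the product of the token probabilities divided by the sum of that product and the product of their complements. The function name `combined_spam_probability` and the sample values are hypothetical.

```python
from math import prod

def combined_spam_probability(probs):
    """Combine per-token spam probabilities into one probability for the message."""
    p = prod(probs)                 # all tokens point to spam
    q = prod(1 - x for x in probs)  # all tokens point to good mail
    return p / (p + q)

print(combined_spam_probability([0.5, 0.5]))               # 0.5
print(round(combined_spam_probability([0.99, 0.9]), 4))    # 0.9989
print(round(combined_spam_probability([0.99, 0.05]), 4))   # 0.839
```

Neutral tokens leave the result at .5, strong evidence in the same direction reinforces itself, and conflicting tokens pull the result back toward the middle.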
Whether a spammer is constructing a mailing list or a user is implementing a spam filtering program, spam depends on the concepts and techniques of computer science.
By the end of this presentation you should be able to answer the following question:
Name 2 techniques we learned in CIS class that are used by spammers or in spam filtering.
“Before Spam Brings the Web to Its Knees.” June 10, 2003.
Brain, Marshall. “How Spam Works.”
“Getting Rid of Spam.”
Graham, Paul. “A Plan for Spam.” Aug. 2002.
“Origins of Spam.”
“Spam.” July 20, 2004.