Spam and Personal Privacy

Presented by: Ashley Embry

Outline
  • What is Spam?
    • Types of Spam
  • Where Did the Word “spam” Originate?
  • How Spam Begins: A General Explanation
  • Who Has the Potential to be a Spammer?
  • Statistics About Spam
  • Getting Rid of Spam
  • Breakdown of a Spam Filter
  • Conclusions
  • Questions for the class
What is Spam?

Spam has been defined in several ways:

  • Electronic junk mail or junk newsgroup postings.
  • Any unsolicited automated e-mail.
  • Email advertising for some product sent to a mailing list or newsgroup.

Spam is simply flooding the internet with many copies of the same message in an attempt to force the message on people who would not otherwise choose to receive it.

Types of Spam

There are two main types of spam:

1. Usenet spam is aimed at people who read newsgroups but rarely or never post, and who therefore rarely give their information away.

2. E-mail spam targets individual users with direct mail messages. E-mail spam lists are created by scanning Usenet postings, stealing Internet mailing lists, or searching the Web for addresses.

Where Did the Word “spam” Originate?

The practice of calling large numbers of inappropriate postings “spam” comes from a Monty Python skit in which a couple goes into a restaurant and the wife tries to order something other than Spam. In the background, a group of Vikings sings the praises of Spam. Pretty soon the only thing that you can hear is…

Like the song, spam is the endless repetition of worthless text.

Another proposal is that “spam” was coined by a computer lab group at the University of Southern California, who chose the name because junk mail shares many characteristics with the lunch meat Spam:
  • Nobody wants it or ever asks for it.
  • No one ever eats it; it is the first item to be pushed to the side when eating the entrée.
  • Sometimes it is actually tasty, like the 1% of junk mail that is really useful to some people.
How Spam Begins: A General Explanation
  • Spammers only need access to your address. After that, it’s just a matter of sending the e-mails.
    • The primary sources that spammers use are newsgroups and chat rooms.
    • The second source is the Web itself. Spammers can create search engines that look for the @ sign, which indicates an e-mail address.
    • The third source is sites created specifically to attract e-mail recipients.
      • “Win $1 million!!! Just Click Here!”
      • “Would you like newsletters from our partners?”
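
The second source above, scanning text for the @ sign, is essentially pattern matching. A minimal sketch follows, assuming a simplified address pattern and a made-up page string (neither comes from the presentation). Real address syntax is broader than this pattern.

```python
import re

# Simplified, assumed pattern: something@domain.tld.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_addresses(text):
    """Return the distinct address-like strings in text, in order of first appearance."""
    seen = []
    for match in EMAIL_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


# Hypothetical page content used only for illustration.
page = "Contact jdoe@example.com or sales@example.org. Again: jdoe@example.com"
print(find_addresses(page))  # ['jdoe@example.com', 'sales@example.org']
```

The same pattern shows why an address that never appears in plain text on a public page is harder to collect this way.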
Finally, probably the most common source of e-mail addresses is a search of the e-mail servers of large e-mail hosting companies such as Hotmail.
  • The Hotmail article “A Spammer’s Paradise” reads:

A dictionary attack utilizes software that opens a connection to the mail server and rapidly submits millions of random e-mail addresses. Many of these addresses have slight variations, such as "jdoe1abc@hotmail.com" and "jdoe2def@hotmail.com". The software then records the address locations and adds those addresses to the spammer's list. These lists are typically resold to many other spammers.

Who Has the Potential to be a Spammer?

Anyone can be a spammer.

Scenario

Let’s say your grandmother bakes the best banana nut bread ever created, and you want to sell the recipe for $5.

You have 100 people in your personal e-mail address book. You send out an e-mail advertising,

“Big Momma’s Nana Nut Bread - only $5 !!!”

From your 100 e-mails you get 2 orders and make $10.

Imagine if you had sent out 1,000,000 e-mails…
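
The arithmetic behind the scenario can be sketched as follows. It assumes the 2% response rate from the 100-message mailing would hold for a much larger list, which the presentation does not claim; the function name is illustrative.

```python
def projected_revenue(emails_sent, response_rate, price):
    """Estimate orders and revenue for a mailing, assuming a constant response rate."""
    orders = round(emails_sent * response_rate)
    return orders, orders * price


# Figures from the scenario: 2 orders out of 100 e-mails at $5 each.
rate = 2 / 100
print(projected_revenue(100, rate, 5))        # (2, 10)
print(projected_revenue(1_000_000, rate, 5))  # (20000, 100000)
```

Because sending more e-mail costs the sender almost nothing, even a far lower response rate than this assumed one can still pay off at scale.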

Statistics About Spam

In a single day in May, the No. 1 Internet service provider, AOL Time Warner (AOL), blocked 2 billion spam messages (88 per subscriber) from reaching its customers’ e-mail accounts.

Microsoft (MSFT), which operates the No. 2 service provider, MSN, and Hotmail, says it blocks an average of 2.4 billion spam messages per day.

Getting Rid of Spam
  • Avoid giving out your e-mail address to unfamiliar or unknown recipients.
  • Use your e-mail application’s filtering features.
  • Report the spam e-mailer to the spammer’s ISP.
  • Use spam filtering software.
Breakdown of a Spam Filter

Most spam blockers use filters that search for commonly used phrases or writing styles that are overly aggressive and found in mass e-mail marketing. Spammers try to fool the filters by changing their writing styles and formats so that their messages can sneak past the filters.

The best technology currently available to stop spam is spam filtering software.

The simplest filters block messages containing keywords such as “xxx” or “viagra,” but they are also more likely to block e-mails that you do want to receive.
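
A minimal sketch of such a keyword filter, assuming a two-word blocklist and made-up messages, shows both weaknesses: it flags a legitimate message and misses a lightly disguised one.

```python
# Hypothetical keyword list; the two words are the examples named in the text.
BLOCKED_KEYWORDS = {"xxx", "viagra"}


def is_blocked(message):
    """Flag a message if any blocked keyword appears anywhere in it (case-insensitive)."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in BLOCKED_KEYWORDS)


print(is_blocked("Cheap VIAGRA now!!!"))                   # True
print(is_blocked("Doctor's note: stop Viagra on Friday"))  # True, a false positive
print(is_blocked("Cheap v1agra now!!!"))                   # False, the spammer changed the spelling
print(is_blocked("Lunch at noon?"))                        # False
```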

Example

The more advanced filters, Bayesian filters for example, take this approach further and statistically identify spam based on how often each word appears in spam compared with legitimate mail.

  • An example of how this statistical filtering works:
    • Start with one collection of spam and one of nonspam mail, each containing about 4,000 messages.
    • Scan the entire text of each message in each collection.
    • Treat alphanumeric characters, dashes, apostrophes, and dollar signs as parts of tokens (words) and everything else as a token separator (e.g., qt234abc, $75, u’tt).
    • Count the number of times each token occurs in each collection. You will end up with two large tables, one per collection, each listing the tokens and how many times each appeared.
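
The tokenizing and counting steps above can be sketched as follows. The sample messages, the function names, the lowercasing of tokens, and the acceptance of both straight and curly apostrophes are assumptions made for illustration.

```python
import re
from collections import Counter

# A token is a run of alphanumerics, dashes, apostrophes, and dollar signs;
# every other character acts as a separator.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9\-'’$]+")


def tokenize(text):
    """Split text into tokens, ignoring case (an assumption)."""
    return TOKEN_PATTERN.findall(text.lower())


def count_tokens(messages):
    """Build one table of token counts for a whole collection of messages."""
    counts = Counter()
    for message in messages:
        counts.update(tokenize(message))
    return counts


# Tiny made-up collections standing in for the two 4,000-message collections.
spam = ["Win $75 now!!!", "Promotion: win big, madam"]
good = ["Sorry I'm late", "The shortest route, sorry"]

bad_counts = count_tokens(spam)
good_counts = count_tokens(good)
print(bad_counts["win"], bad_counts["$75"], good_counts["sorry"])  # 2 1 2
```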
Finally, create a third table that maps each token to the probability (ranging from .01 to .99) that an e-mail containing it is spam.

When new mail arrives, it is scanned into tokens, and the fifteen tokens whose probabilities are farthest from the neutral probability of .5 are used to calculate the probability that the e-mail is spam.

Algorithms/Program language

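To determine the probability that a token indicates spam:

    ;; good and bad are hash tables of token counts; ngood and nbad are
    ;; the number of messages in each collection.
    (let ((g (* 2 (or (gethash token good) 0)))
          (b (or (gethash token bad) 0)))
      ;; Skip tokens seen fewer than 5 times in total.
      (unless (< (+ g b) 5)
        (max .01
             (min .99 (float (/ (min 1 (/ b nbad))
                                (+ (min 1 (/ g ngood))
                                   (min 1 (/ b nbad)))))))))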

To determine whether the e-mail is spam, using the probabilities of the 15 chosen tokens:

    ;; probs is the list of the 15 token probabilities.
    (let ((prod (apply #'* probs)))
      (/ prod (+ prod (apply #'* (mapcar #'(lambda (x)
                                             (- 1 x))
                                         probs)))))
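
For readers less familiar with Lisp, the two snippets above can be sketched in Python. This is an illustrative translation under stated assumptions, not the presenter's code: the function names are invented, the count tables are assumed to be plain dictionaries, and the selection of the tokens farthest from .5 is folded into the second function.

```python
from math import prod


def token_probability(token, good, bad, ngood, nbad):
    """Probability that a message containing token is spam, or None if the token is too rare."""
    g = 2 * good.get(token, 0)  # good counts are doubled, as in the Lisp version
    b = bad.get(token, 0)
    if g + b < 5:
        return None
    bad_ratio = min(1.0, b / nbad)
    good_ratio = min(1.0, g / ngood)
    # Clamp to the range .01 to .99.
    return max(0.01, min(0.99, bad_ratio / (good_ratio + bad_ratio)))


def spam_probability(probs, n=15):
    """Combine the n token probabilities farthest from the neutral value of .5."""
    chosen = sorted(probs, key=lambda p: abs(p - 0.5), reverse=True)[:n]
    p = prod(chosen)
    q = prod(1 - x for x in chosen)
    return p / (p + q)


# Hypothetical counts, with 100 messages assumed in each collection.
good = {"sorry": 10}
bad = {"madam": 20, "sorry": 1}
print(token_probability("madam", good, bad, 100, 100))  # 0.99
print(token_probability("rare", good, bad, 100, 100))   # None
print(spam_probability([0.5, 0.5]))                     # 0.5
```

A token that appears only in spam is clamped to .99 rather than 1, so no single token can decide the outcome alone.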

Example token list with probabilities:

madam 0.99

promotion 0.99

shortest 0.047225013

sorry 0.0499

valuable 0.82347

*information taken from www.paulgraham.com

Wrapping it Up

Whether the task is constructing a spam list or implementing a spam filtering program, spam relies on the concepts and application of computer science.

Questions for the Class

By the end of this presentation you should be able to answer the following question:

Name 2 techniques we learned in CIS class that are used by spammers or in spam filtering.

  • Pattern matching, when searching for e-mail addresses or when evaluating words for spam tendencies.
  • Writing algorithms that are eventually implemented as programs.
Bibliography

“Before Spam Brings the Web to Its Knees.” June 10, 2003.

http://www.businessweek.com/technology/content/jun2003/tc20030610_1670_tc104.htm

Brain, Marshall. “How Spam Works”

http://computer.howstuffworks.com/spam.htm

“Getting Rid of Spam”

http://www.webopedia.com/DidYouKnow/Internet/2002/GettingRidofSpam.asp

Graham, Paul. “A Plan for Spam.” Aug. 2002.

http://www.paulgraham.com/spam.html

Mueller, Scott H. “What is Spam?”

http://spam.abuse.net/overview/whatisspam.shtml

“Origins of Spam”

http://digital.net/~gandalf/spamfaq.html#item8c

“Spam” July 20, 2004.

http://www.webopedia.com/TERM/s/spam.html