A LVQ-based neural network anti-spam email approach

Prof. 楊婉秀 • 詹元順, first-year master's student, Information Management, 94722001 • 2005/12/07

Outline • 1. Introduction • 2. Email sample and data preprocessing • 2.1 Email representation • 2.2 Feature extraction • 3. Anti-spam email LVQ model • 3.1 Spam email category • 3.2 Learning vector quantization neural network model • 3.3 Anti-spam email LVQ algorithm • 3.4 Parameter setting • 4. Experiments and result • 5. Conclusion

1. Introduction(1/2) • Spam email wastes users' time, money, and network bandwidth, clutters their mailboxes, and can even be harmful, e.g. when it carries pornographic content. • In America, spam email costs enterprises up to 9 billion per year. • Without appropriate countermeasures, the situation will keep worsening and spam email will eventually undermine the usability of email.

1. Introduction(2/2) • Duhong Chen et al. compared four algorithms (Bayes, decision tree, neural network, and Boosting) and concluded that the neural network algorithm achieves higher performance. • Experiments show that the LVQ-based anti-spam email filter performs better than Bayes-based and BP neural network-based approaches.

2. Email sample and data preprocessing(1/2) 2.1 Email representation • TFIDF_i = TF_i × log(N / DF_i) (1) • TF_i: the number of times word t_i appears in document d • N: the total number of training documents • DF_i: the number of training documents that contain word t_i
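Equation (1) can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the whitespace tokenizer, the natural logarithm, and the names `tfidf_vectors` and `corpus` are assumptions introduced here, since the slide specifies neither a tokenizer nor a log base.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {word: weight} dict per document, using TF_i * log(N / DF_i)."""
    n_docs = len(documents)  # N: total number of training documents
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    tokenized = [doc.lower().split() for doc in documents]

    # DF_i: number of documents that contain word t_i at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)  # TF_i: raw count of t_i in this document
        vectors.append({
            word: count * math.log(n_docs / doc_freq[word])
            for word, count in term_freq.items()
        })
    return vectors


# Hypothetical three-document corpus, for illustration only.
corpus = ["cheap pills cheap offer", "meeting agenda offer", "meeting notes"]
weights = tfidf_vectors(corpus)
```

A word that appears in every training document gets weight 0 under this formula, because log(N / DF_i) is then log(1).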

2. Email sample and data preprocessing(2/2) 2.2 Feature extraction • A: the number of emails that contain word t and belong to class s • B: the number of emails that contain word t but do not belong to class s • C: the number of emails that belong to class s but do not contain word t • N: the total number of emails in the training corpus
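The scoring formula that uses these counts did not survive in this transcript. One widely used feature-selection score built from exactly A, B, C, and N is the mutual information estimate log(A × N / ((A + C) × (A + B))). The sketch below assumes that score; whether the slides use it is an assumption, and the function name and the sample counts are hypothetical.

```python
import math


def mutual_information(a, b, c, n):
    """Estimate MI(t, s) as log(A * N / ((A + C) * (A + B))).

    a: emails that contain word t and belong to class s
    b: emails that contain word t but do not belong to class s
    c: emails that belong to class s but do not contain word t
    n: total number of emails in the training corpus
    """
    if a == 0:
        # The word never co-occurs with the class, so the log argument is 0.
        return float("-inf")
    return math.log(a * n / ((a + c) * (a + b)))


# Hypothetical counts, for illustration only.
score = mutual_information(a=40, b=10, c=60, n=1000)
```

Under this assumed score, words would be ranked per class and the highest-scoring ones kept as features. A score of 0 means the word and the class look statistically independent in the training corpus.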

3. Anti-spam email LVQ model(1/5) 3.1 Spam email category

3. Anti-spam email LVQ model(2/5) 3.2 Learning vector quantization neural network model • The model is divided into two layers. The first is the competitive layer, in which each neuron represents a subclass. • The second is the output layer, in which each neuron represents a class.
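The two-layer structure can be illustrated with a small pure-Python sketch. Each competitive-layer neuron is stored as a prototype vector, and a lookup list plays the role of the output layer by mapping each subclass neuron to its class. The training step shown is the basic LVQ1 rule; the transcript does not preserve the algorithm slides, so using LVQ1, the learning rate, the toy vectors, and all names here are assumptions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(prototypes, x):
    """Index of the winning competitive-layer neuron (closest prototype)."""
    return min(range(len(prototypes)),
               key=lambda i: squared_distance(prototypes[i], x))


def classify(prototypes, subclass_to_class, x):
    """Output layer: map the winning subclass neuron to its class."""
    return subclass_to_class[nearest(prototypes, x)]


def lvq1_step(prototypes, subclass_to_class, x, label, lr):
    """One LVQ1 update: pull the winner toward x if its class matches
    the label, otherwise push it away. Returns the winner's index."""
    w = nearest(prototypes, x)
    sign = 1.0 if subclass_to_class[w] == label else -1.0
    prototypes[w] = [p + sign * lr * (xi - p)
                     for p, xi in zip(prototypes[w], x)]
    return w


# Hypothetical toy setup: two spam subclasses and one legitimate class.
prototypes = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
subclass_to_class = ["spam", "spam", "legit"]
samples = [([1.0, 0.0], "spam"), ([0.0, 1.0], "spam"), ([0.5, 0.4], "legit")]

for _ in range(20):
    for x, label in samples:
        lvq1_step(prototypes, subclass_to_class, x, label, lr=0.1)
```

Because several competitive neurons can map to the same output class, spam can be represented by multiple prototypes, one per subclass, rather than by a single average spam vector.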

3. Anti-spam email LVQ model(3/5) 3.3 Anti-spam email LVQ algorithm(1/2)

3. Anti-spam email LVQ model(4/5) 3.3 Anti-spam email LVQ algorithm(2/2)

3. Anti-spam email LVQ model(5/5) 3.4 Parameter setting

4. Experiments and result(1/4) • This project uses the email corpus from http://www.spamassassin.org/publiccorpus, which is publicly available. • 1000 emails were selected at random from the corpus: 580 spam emails and 420 legitimate emails.

4. Experiments and result(2/4) • Anti-spam email filter performance is often measured in terms of spam precision (SP) and spam recall (SR).
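The formulas for SP and SR are not preserved in this transcript. Under the usual definitions, SP is the fraction of emails flagged as spam that really are spam, and SR is the fraction of actual spam that the filter catches. The sketch below follows those definitions; the label strings and the sample data are hypothetical.

```python
def spam_precision_recall(true_labels, predicted_labels):
    """Return (SP, SR) for parallel lists of "spam" / "legit" labels."""
    pairs = list(zip(true_labels, predicted_labels))
    spam_as_spam = sum(1 for t, p in pairs if t == "spam" and p == "spam")
    legit_as_spam = sum(1 for t, p in pairs if t == "legit" and p == "spam")
    spam_as_legit = sum(1 for t, p in pairs if t == "spam" and p == "legit")

    flagged = spam_as_spam + legit_as_spam      # everything the filter called spam
    actual_spam = spam_as_spam + spam_as_legit  # everything that really is spam

    sp = spam_as_spam / flagged if flagged else 0.0
    sr = spam_as_spam / actual_spam if actual_spam else 0.0
    return sp, sr


# Hypothetical labels, for illustration only.
truth = ["spam", "spam", "spam", "legit", "legit"]
guess = ["spam", "spam", "legit", "spam", "legit"]
sp, sr = spam_precision_recall(truth, guess)
```

Low SP means legitimate mail is being blocked, while low SR means spam is getting through, so the two are normally reported together.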

4. Experiments and result(3/4) • The F1 criterion combines spam precision and spam recall.
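The slide's formula is not preserved in this transcript. F1 is conventionally the harmonic mean of precision and recall, F1 = 2 × SP × SR / (SP + SR), and the sketch below assumes that definition. The function name and the input values are hypothetical.

```python
def f1_score(sp, sr):
    """Harmonic mean of spam precision (SP) and spam recall (SR)."""
    if sp + sr == 0:
        return 0.0  # avoid division by zero when both measures are 0
    return 2 * sp * sr / (sp + sr)


# Hypothetical values, for illustration only.
f1 = f1_score(0.9, 0.6)
```

The harmonic mean stays close to the smaller of the two inputs, so a filter cannot reach a high F1 by maximizing precision alone or recall alone.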

4. Experiments and result(4/4)

5. Conclusion • Both neural network-based algorithms usually perform better than the Bayes-based one. • The LVQ-based method classifies spam emails into several subclasses by content, so the feature words within each subclass are more closely related and the characteristics of each subclass are easier to identify.
