
Filtering of Spam E-Mails Using Back-Propagation Neural Networks

Presentation Transcript


  1. Filtering of Spam E-Mails Using Back-Propagation Neural Networks • Class: 資四A (fourth-year, class A) • Professor: 楊維忠 • Reporter: 林文仁 • Team members: 江念庭, 林俊宇, 黃國峰

  2. Outline • Neural Network • Back-propagation algorithm • Flow chart of research • Input & output • System environment • Flow chart of filtering e-mail • Example • Conclusion

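  3. Neural Network • [Diagram] The input is fed to the neural network, which consists of connections (called weights) between neurons. • The output is compared with the target. • The weights are adjusted according to the comparison.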
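
The compare-and-adjust loop on this slide can be sketched for a single neuron. This is only an illustration under assumptions not found in the slides: a sigmoid transfer function, a squared-error comparison, a learning rate of 0.5, and the hypothetical names `sigmoid` and `train_step`.

```python
import math

def sigmoid(x):
    """Assumed transfer function; the slides do not name one."""
    return 1.0 / (1.0 + math.exp(-x))

def train_step(weights, bias, inputs, target, rate=0.5):
    """One compare-and-adjust step for a single sigmoid neuron."""
    output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
    error = target - output                  # compare output with target
    delta = error * output * (1.0 - output)  # error scaled by the sigmoid slope
    new_weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias + rate * delta           # the bias acts as a weight on a constant input of 1
    return new_weights, new_bias, output

# Repeat the step until the output approaches the target.
weights, bias = [0.0, 0.0], 0.0
first_output = None
for _ in range(1000):
    weights, bias, out = train_step(weights, bias, [1.0, 0.0], 1.0)
    if first_output is None:
        first_output = out
```

Each call returns the output computed before the adjustment, so `out` moves from 0.5 on the first step toward the target of 1.0 as the weights are updated.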

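  4. Back-propagation algorithm: the multilayer feedforward network • Forward pass • [Diagram] Input layer, hidden layer, output layer. Each of neuron 1 … neuron j sums (Σ) its weighted inputs together with a bias b (a constant input of 1) and passes the sum through a transfer function to produce the result. • wi: weight of i • b: bias • transfer function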
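
The forward pass through such a network can be written out in a few lines. This is a minimal sketch, not the authors' Matlab model: the sigmoid transfer function, the layer sizes, and all weight values are made up for illustration.

```python
import math

def sigmoid(x):
    """Assumed transfer function; the slides do not name one."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """For each neuron: transfer(sum_i(w_i * x_i) + b)."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer_forward(inputs, hidden_w, hidden_b)
    return layer_forward(hidden, output_w, output_b)

# Toy network (illustrative values): 3 inputs, 2 hidden neurons, 1 output.
hidden_w = [[0.5, -0.5, 0.0], [0.0, 1.0, 1.0]]
hidden_b = [0.0, -1.0]
output_w = [[1.0, -1.0]]
output_b = [0.0]

result = forward([1.0, 1.0, 0.0], hidden_w, hidden_b, output_w, output_b)
```

With a sigmoid on the output neuron the result always lies strictly between 0.0 and 1.0, which matches the output range stated later in the deck.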

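  5. Flow chart of research • Review the literature • Analyze mail and maillog, and define spam behavior • Train the neural network on samples • If testing shows the network is unsuitable, retrain it • If testing shows the network is suitable, end training • Integrate with the mail server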

  6. Table of rules

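  7. Input & output • Input • There are 28 rules in total; the most commonly encountered ones are listed below. • Rule 6: if header-To (the recipient) == header-Reply-To (the address that receives replies), input 6 is set to 1. • Rule 17: if header-From (the sender) != maillog-from (the sender recorded in the log file), input 17 is set to 1. • Rule 25: if header-Date (the sending time) differs too much from the system time, input 25 is set to 1. • Output • Output value between 0.0 and 1.0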
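
The three rules quoted on this slide could be turned into a 28-element input vector roughly as follows. This is a sketch under stated assumptions: the function name `build_input`, the 24-hour threshold for rule 25, and the handling of an unparseable date are not in the slides, and the other 25 rules are left at 0 because the deck does not list them.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parseaddr, parsedate_to_datetime

NUM_RULES = 28
MAX_DATE_SKEW = timedelta(hours=24)  # assumed threshold; the slides give none

def build_input(headers, maillog_from, now):
    """Return a 28-element 0/1 vector; only rules 6, 17 and 25 are filled in."""
    x = [0] * NUM_RULES
    to_addr = parseaddr(headers.get("To", ""))[1].lower()
    reply_to = parseaddr(headers.get("Reply-To", ""))[1].lower()
    from_addr = parseaddr(headers.get("From", ""))[1].lower()

    # Rule 6: header-To == header-Reply-To
    if to_addr and to_addr == reply_to:
        x[5] = 1
    # Rule 17: header-From != sender recorded in the maillog
    if from_addr != maillog_from.lower():
        x[16] = 1
    # Rule 25: header-Date differs too much from the system time
    try:
        sent = parsedate_to_datetime(headers.get("Date", ""))
        if sent.tzinfo is None:
            sent = sent.replace(tzinfo=timezone.utc)
        if abs(now - sent) > MAX_DATE_SKEW:
            x[24] = 1
    except (TypeError, ValueError, IndexError):
        x[24] = 1  # assumption: a missing or unparseable date counts as suspicious
    return x

# Hypothetical message: To and Reply-To match, sender and date look normal.
headers = {
    "From": "a@example.com",
    "To": "b@example.com",
    "Reply-To": "b@example.com",
    "Date": "Mon, 01 Jan 2007 12:00:00 +0800",
}
now = datetime(2007, 1, 1, 5, 0, tzinfo=timezone.utc)
x = build_input(headers, "a@example.com", now)
```

Rule numbers are 1-based on the slide, so rule 6 maps to index 5 of the vector, rule 17 to index 16, and rule 25 to index 24.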

  8. System environment • OS • Red Hat Enterprise Linux AS 4 • Mail server • Sendmail 8.13.1 • Client using browser • OpenWebMail 2.52 • Provide web GUI for checking mail • Software tools • Matlab 7

  9. Flow chart of filtering e-mail • [Diagram labels] Sendmail server • maillog and header (get_value) • Milter (Mail Filter) • Matlab BPN (Neural Network) • Add, Change headers • User's mailbox
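
The "Add, Change headers" step can be illustrated with Python's standard `email` package. This is a sketch only and does not use the Milter API: the header names `X-BPN-Score` and `X-BPN-Spam`, the `[SPAM]` subject prefix, and the 0.5 threshold are hypothetical choices, since the slides do not say which headers the filter writes.

```python
from email.message import EmailMessage

SPAM_THRESHOLD = 0.5  # assumed cut-off on the network's 0.0-1.0 output

def tag_message(msg, score, threshold=SPAM_THRESHOLD):
    """Record the network's score in the headers and mark spam in the subject."""
    is_spam = score >= threshold
    # Drop any existing copies so a sender cannot pre-set the verdict.
    del msg["X-BPN-Score"]
    del msg["X-BPN-Spam"]
    msg["X-BPN-Score"] = f"{score:.2f}"
    msg["X-BPN-Spam"] = "YES" if is_spam else "NO"
    if is_spam:
        subject = msg["Subject"] or ""
        del msg["Subject"]  # EmailMessage allows only one Subject header
        msg["Subject"] = "[SPAM] " + subject
    return msg

# Hypothetical message and score.
msg = EmailMessage()
msg["From"] = "a@example.com"
msg["To"] = "b@example.com"
msg["Subject"] = "hello"
msg.set_content("test body")
tag_message(msg, 0.93)
```

Keeping the raw score in a header lets the mail client or a later filter apply its own threshold.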

  10. Example-1 • Send a spam message via telnet:
      ehlo localhost
      Mail from: s13943013@mail.nuu.idv.tw
      RCPT TO: s13943013@mail.nuu.idv.tw
      Data
      From: "s" s13943013@mail.nuu.idv.tw
      To: s13943013@mail.nuu.idv.tw
      Reply-To: s13943013@mail.nuu.idv.tw
      Subject: 中文信
      Date: +0800
      ….
      Quit

  11. Example • The message has been received and detected as SPAM.

  12. Content of headers • The recipient address and the Reply-To address are the same; normally they should differ.

  13. Example-2 • Contents of the maillog on the server

  14. Conclusion • Identification rate ≈ 80%. • The rules were defined subjectively. • It would be better to combine this approach with content filtering, e.g. SpamAssassin.

  15. Please give us your comments. Thank you.
