
Mining Social Network for Personalized Email Prioritization

Language Technologies Institute, School of Computer Science, Carnegie Mellon University. Shinjae Yoo, Yiming Yang, Frank Lin, and Il-Chul Moon.


Presentation Transcript


  1. Mining Social Network for Personalized Email Prioritization • Language Technologies Institute, School of Computer Science, Carnegie Mellon University • Shinjae Yoo, Yiming Yang, Frank Lin, and Il-Chul Moon

  2. Outline • Problem Description • Approaches • Experiments • Contributions

  3. Problem Description • Email overload is a severe problem • Identifying the importance of each email will alleviate email overload • Challenges • No access to other people’s emails and labels • Personalized labeling is time-consuming • The same message may have different priority labels for different recipients • We want to make the most of the sparse training data by using each user's social network

  4. Outline • Problem Description • Approaches • Social Clustering • Social Importance • Semi-supervised Importance Propagation • Experiments • Conclusion and Future Work

  5. Social Clustering – Motivation • Personal email inbox • Lots of unlabeled emails • No privacy issue • Observations • The sender can be important • Some senders do not appear in the training set at all, or appear in only a few instances • Need to generalize over senders → Let’s find similar senders from the social network

  6. Social Clustering – Contact Network • Personal contact network • G = (V, E) • The entire network is constructed from the personal inbox • [Figure: contact network whose nodes are agents/persons, numbered 1 to 5]

  7. Social Clustering – Newman Clustering • Newman clustering algorithm [Newman, 04] • Finds social cliques or cohesive social groups • Based on edge betweenness • The number of shortest paths that go through the edge / the total number of shortest paths • Drop edges in order of highest edge betweenness • Hard clustering • [Figure: example network annotated with edge betweenness values, split into Group A and Group B]
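
The edge-dropping procedure on this slide can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names (`edge_betweenness`, `components`, `split_once`) and the toy graph are assumptions, the graph is treated as unweighted and undirected, and edge betweenness is computed with a Brandes-style accumulation over all source nodes.

```python
from collections import deque


def edge_betweenness(adj):
    """Edge betweenness for an unweighted, undirected graph (Brandes-style).

    adj maps each node to the set of its neighbours (no self-loops assumed).
    """
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for source in adj:
        # Breadth-first search that counts shortest paths from `source`.
        dist = {source: 0}
        n_paths = {v: 0.0 for v in adj}
        n_paths[source] = 1.0
        preds = {v: [] for v in adj}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, crediting each edge with the
        # fraction of shortest paths that cross it.
        carried = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = n_paths[v] / n_paths[w] * (1.0 + carried[w])
                score[frozenset((v, w))] += share
                carried[v] += share
    # Every node pair was visited from both ends, so halve the totals.
    return {edge: total / 2.0 for edge, total in score.items()}


def components(adj):
    """Connected components as a list of node sets."""
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in group:
                group.add(v)
                stack.extend(adj[v] - group)
        seen |= group
        groups.append(group)
    return groups


def split_once(edges):
    """Drop highest-betweenness edges until one more group appears."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    target = len(components(adj)) + 1
    while len(components(adj)) < target:
        scores = edge_betweenness(adj)
        if not scores:
            break  # no edges left to remove
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Hypothetical toy graph: two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
print(split_once(edges))
```

In the toy graph the bridge carries every shortest path between the two triangles, so it has the highest betweenness and is dropped first, leaving the two triangles as hard clusters. Calling `split_once` repeatedly on each resulting group would give a finer partition.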

  8. Social Clustering – Validations • Clusters are coherent!

  9. Social Clustering – Feature Incorporation • Extended Vector Space • text: • social network: • combined: • The combined vector space is used as an enriched feature set for the email prioritizer
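
The formulas for the three vector spaces did not survive in this transcript, so the following is only one plausible reading, stated as an assumption: the combined vector is the text feature vector concatenated with a one-hot indicator of the sender's cluster. The function name `combine_features`, its parameters, and the example address are hypothetical.

```python
def combine_features(text_vec, sender, sender_cluster, n_clusters):
    """Concatenate text features with a one-hot sender-cluster indicator.

    Assumption: the slide's "combined" space is a plain concatenation.
    sender_cluster maps a sender to its cluster index (hard clustering).
    """
    social_vec = [0.0] * n_clusters
    cluster = sender_cluster.get(sender)
    if cluster is not None:
        # Senders that were never clustered keep an all-zero social part.
        social_vec[cluster] = 1.0
    return list(text_vec) + social_vec


# Hypothetical example: three text features, three sender clusters.
clusters = {"alice@example.com": 1}
print(combine_features([0.5, 0.0, 1.2], "alice@example.com", clusters, 3))
```

Under this reading, a sender who never appears in the labeled training set still shares a cluster feature with similar senders who do, which is the generalization the motivation slide asks for.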

  10. Social Importance – Motivations • Social importance • A person at the center of a cluster might be more important than others • Betweenness • Edge betweenness is used for Newman clustering • Vertex betweenness • Measures how much of a communication bottleneck a person is in the social network • Contact points within the network • Might be an important person • We may try other kinds of social importance metrics too

  11. Social Importance – Metrics • Metrics • Degree (in, out, total) [Wasserman and Faust, 94] • Clique Counts (ClqCnt) [Wasserman and Faust, 94] • The number of clique sub-graphs that contain a node v • Betweenness (BetCent) [Freeman, 77] • HITS Authority (Authority) [Kleinberg, 99] • λ: the greatest eigenvalue • r: the eigenvector → similar to PageRank scores • Neighborhood Connectivity (“Clustering Coefficient”, ClustCoef) [Boykin and Roychowdhury, 05] • Measures the connectivity among the neighbors of a node v
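
Three of the metrics listed here can be illustrated with plain Python. This is a hedged sketch, not the authors' code: the function names are hypothetical, each directed edge is assumed to mean "sender wrote to recipient", the clustering coefficient is computed on an undirected neighbour map, and HITS authority is approximated by power iteration rather than an explicit eigendecomposition.

```python
def degrees(edges):
    """(in, out, total) degree per node for directed (sender, recipient) edges."""
    nodes = {n for edge in edges for n in edge}
    indeg = {n: 0 for n in nodes}
    outdeg = {n: 0 for n in nodes}
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    return {n: (indeg[n], outdeg[n], indeg[n] + outdeg[n]) for n in nodes}


def clustering_coefficient(adj, v):
    """Fraction of possible links among v's neighbours that actually exist."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pair to connect
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return links / (k * (k - 1) / 2)


def hits_authority(edges, iterations=50):
    """Authority scores by power iteration (hub update, then authority update)."""
    nodes = {n for edge in edges for n in edge}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            hub[src] += auth[dst]
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Normalise to unit length so the scores stay bounded.
        norm = sum(value * value for value in new_auth.values()) ** 0.5 or 1.0
        auth = {n: value / norm for n, value in new_auth.items()}
    return auth


# Hypothetical toy data.
mail_edges = [("a", "c"), ("b", "c"), ("c", "d")]
contacts = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(degrees(mail_edges)["c"])
print(clustering_coefficient(contacts, 3))
print(hits_authority(mail_edges))
```

The repeated hub and authority updates converge toward the eigenvector with the greatest eigenvalue that the slide refers to, which is why the resulting scores behave like PageRank scores.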

  12. Social Importance – Validations • Correlation coefficients with priority levels

  13. SIP – Motivations • Semi-supervised Importance Propagation (SIP) • Can we propagate importance labels? • Bipartite graph; labels only on emails • [Figure: bipartite graph of agents/persons (importance unknown, shown as ?) and emails (labeled 4, 3, 2, or unlabeled, shown as ?)]

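  14. SIP – Email Network • $A$: sender to emails ($N \times M$) • $B^{T}$: email to recipients ($M \times N$) • $x_k$: $k$th importance labels for emails ($M \times 1$) • $y_k = B x_k$ ($N \times 1$) • [Figure: bipartite graph of agents/persons (shown as ?) and emails (labeled 4, 3, 2, or ?)]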
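
The propagation step y_k = B x_k can be made concrete with a small example. This is a sketch under assumptions: x_k is read as a 0/1 indicator of the emails labeled with priority k (unlabeled emails contribute 0), B is a person-by-email matrix whose entry is 1 when that person is a recipient of that email, and the helper names and toy data are hypothetical.

```python
def label_indicator(labels, k):
    """x_k: 1.0 for emails labeled with priority k, 0.0 otherwise (M entries)."""
    return [1.0 if label == k else 0.0 for label in labels]


def matvec(matrix, vec):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


# Hypothetical data: N = 3 persons, M = 5 emails.
# B[i][j] = 1 if person i is a recipient of email j, so B is N x M.
B = [
    [1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 0],
]
# Priority labels per email; None marks an unlabeled email.
labels = [4, 3, 2, None, None]

for k in (4, 3, 2):
    x_k = label_indicator(labels, k)
    y_k = matvec(B, x_k)  # N entries: how much level-k mail reaches each person
    print(k, y_k)
```

Each y_k has one entry per person, so stacking the y_k for all priority levels gives every person a small profile of how important their incoming mail tends to be, even though only emails carry labels.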

  15. SIP – Algorithm • Problems of the above propagation • may not be irreducible • is insensitive to (not personalized) • Apply Personalized PageRank with • Normalize and column-wise normalize C: C’
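
The symbols on this slide were lost in the transcript, so the exact matrices and parameters the authors used cannot be recovered here. As a generic illustration of the technique the slide names, the sketch below runs Personalized PageRank by power iteration on a column-normalized matrix. Everything specific is an assumption: the matrix `C` is a made-up example, `preference` stands in for whatever personalization vector the slide intended, and the restart probability of 0.15 is a conventional default, not a value from the source.

```python
def column_normalize(matrix):
    """Scale each column to sum to 1 (all-zero columns are left as zeros)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    sums = [sum(matrix[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [
        [matrix[i][j] / sums[j] if sums[j] else 0.0 for j in range(n_cols)]
        for i in range(n_rows)
    ]


def personalized_pagerank(c_norm, preference, restart=0.15, iterations=100):
    """Iterate r = (1 - restart) * C' r + restart * p.

    c_norm is a column-normalized square matrix; preference must have a
    positive sum and is rescaled into the restart distribution p.
    """
    total = sum(preference)
    p = [value / total for value in preference]
    r = list(p)
    for _ in range(iterations):
        spread = [sum(row[j] * r[j] for j in range(len(r))) for row in c_norm]
        r = [(1 - restart) * s + restart * q for s, q in zip(spread, p)]
    return r


# Hypothetical 3-node example in which every node links to the other two.
C = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
scores = personalized_pagerank(column_normalize(C), preference=[1, 0, 0])
print(scores)
```

The restart term addresses both problems listed on the slide: jumping back to the preference distribution keeps the walk from getting trapped in a disconnected part of the graph, and the choice of preference vector is what makes the resulting scores personalized.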

  16. Outline • Problem Description • Approaches • Experiments • Contributions

  17. Experiments – Data Collection • Collected data • 25 subjects were recruited from Carnegie Mellon University • 7 users who submitted more than 200 emails • 1 faculty member, 2 staff members, 4 students • [Figure: time axis divided into a training period followed by a testing period]

  18. Experiments – Metrics • Mean Absolute Error (MAE) • An MAE of 1.0 means that, on average, the prediction deviates from the truth by one priority level • MAE takes the size of each error into account • e.g., 1 vs. 5 counts as a larger error than 4 vs. 5 • It ranges from 0 to 4 when we use five importance levels • Micro-MAE • Pool the test instances from all users into a joint test set • Macro-MAE • Compute each user's MAE first, then average the per-user MAEs
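
The difference between the two averages is easiest to see on a small example. The sketch below follows the definitions on this slide; the function names, user names, and (truth, prediction) pairs are made up for illustration.

```python
def mae(pairs):
    """Mean absolute error over (true_level, predicted_level) pairs."""
    return sum(abs(truth - pred) for truth, pred in pairs) / len(pairs)


def micro_mae(per_user):
    """Pool every user's test instances into one set, then compute MAE."""
    pooled = [pair for pairs in per_user.values() for pair in pairs]
    return mae(pooled)


def macro_mae(per_user):
    """Compute MAE per user, then average the per-user values."""
    return sum(mae(pairs) for pairs in per_user.values()) / len(per_user)


# Hypothetical predictions on a five-level priority scale.
per_user = {
    "user_a": [(5, 1), (3, 3), (2, 2), (4, 4)],  # one large miss, MAE 1.0
    "user_b": [(5, 4), (1, 1)],                  # one small miss, MAE 0.5
}
print(micro_mae(per_user))
print(macro_mae(per_user))
```

Micro-MAE weights every email equally, so users with many test emails dominate; Macro-MAE weights every user equally regardless of how many emails they contributed.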

  19. Experiments – Setups • Features: four subsets • Basic Features (BF): from, to, cc, title, body • Newman Clustering (NC) • Social Importance (SI) • Semi-supervised Importance Propagation (SIP) • Ten random shuffles of the training data • Linear SVM • 10-fold cross-validation for parameter tuning • Regularization parameter tuned over $[10^{-3}, 10^{3}]$

  20. Experiments – Results

  21. Contributions • The first study on personalized email prioritization • Using statistical classification and clustering • Based on fine-grained personal judgments with multiple users • Enriched representation through personal Social Network • Social Clustering • Social Importance Estimation • Semi-supervised Importance Propagation • Fully personalized methodology • Technical development and Evaluation
