
Real-Time Recommendation of Diverse Related Articles



Presentation Transcript


  1. Real-Time Recommendation of Diverse Related Articles • Yeung Fu Sing

  2. Introduction • News websites provide suggestions • Based on popularity, recency, or editor's picks • Sometimes based on the content

  3. Problem • E.g. news article: "Barack Obama elected" • Suggestions: "Obama wins election", "Obama has four more years" • The suggested articles are too similar to the original • Not what the user wants

  4. Aim • Readers want relevant articles • But not very similar articles • Diversity is required in suggestions • E.g. news article: "Obama wins election" • Suggestions: "Mitt Romney admits defeat", "What can Obama do in four years?"

  5. How to achieve diversity? • By reading the comments • Based on the characteristics of the discussion • Usage of words • People who participated • Views on the news • Location of users

  6. Methodology • For any article a: • Find relevant articles: compute the distance between a and every other article a_i, and keep as the candidate set those with distance < r • Compute the diversity between every pair of articles in the candidate set • Form the recommendation set by choosing the k articles with maximum diversity among all k-article sets
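The two steps on this slide can be sketched roughly as follows. This is a minimal brute-force illustration, not the presenter's implementation: the function names, the dictionary layout, and the sample data are all assumptions, and Jaccard distance stands in for both the relevance distance and the diversity measure.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as maximally distant (assumption)."""
    union = a | b
    return 1.0 - (len(a & b) / len(union) if union else 0.0)


def recommend(article, features, comment_features, r, k):
    """Hypothetical helper: candidate filtering followed by exhaustive k-set search.

    features:         article id -> set of entities found in the article text
    comment_features: article id -> set of features found in its comments
    """
    # Step 1: candidate set = articles whose relevance distance to `article` is below r.
    candidates = [
        other for other in features
        if other != article
        and jaccard_distance(features[article], features[other]) < r
    ]
    if len(candidates) <= k:
        return candidates

    # Step 2: among all k-article subsets, keep the one with the largest summed pairwise diversity.
    def total_diversity(group):
        return sum(
            jaccard_distance(comment_features[x], comment_features[y])
            for x, y in combinations(group, 2)
        )

    return list(max(combinations(candidates, k), key=total_diversity))


# Made-up sample data for illustration only.
features = {
    "a": {"obama", "election", "usa"},
    "b": {"obama", "election", "win"},
    "c": {"obama", "election", "romney"},
    "d": {"obama", "usa", "economy"},
    "e": {"football", "goal"},
}
comment_features = {
    "a": set(),
    "b": {"victory", "celebration"},
    "c": {"victory", "celebration", "party"},
    "d": {"jobs", "tax", "party"},
    "e": {"match"},
}
picked = recommend("a", features, comment_features, r=0.6, k=2)
print(picked)
```

The exhaustive search over all k-article subsets grows combinatorially with the size of the candidate set, which is why the later slides turn to hashing.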

  7. Relevance • Computed with the Jaccard coefficient J(A, B) • Distance between two articles: D_rel(a, a_i) = 1 − J(A, B), where A and B are the feature sets of a and a_i respectively
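As a small worked example of the distance on this slide, the sketch below computes the Jaccard coefficient of two feature sets and the resulting distance. The function names and the sample entities are illustrative assumptions; returning 0.0 for two empty sets is also a convention chosen here, not something the slides specify.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; returns 0.0 when both sets are empty (assumption)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def relevance_distance(features_a, features_b):
    """Distance between two articles: 1 - J(A, B)."""
    return 1.0 - jaccard(features_a, features_b)


# Made-up feature sets: 2 shared entities out of 4 distinct ones.
a = {"Barack Obama", "election", "United States"}
b = {"Barack Obama", "election", "Mitt Romney"}
print(relevance_distance(a, b))  # 1 - 2/4 = 0.5
```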

  8. Relevance • Features are extracted by OpenCalais • OpenCalais: text-mining software from Reuters • Analyses documents • Returns the entities of a document • Rel(a, a_i) = J(OC(a), OC(a_i))

  9. Diversity – Entities • The writer believes users reveal and amplify differences through their comments • Seek features of the user comments • Use OpenCalais again, this time on the comments • D_div(a_i, a_j) = 1 − J(OC(a_i), OC(a_j))

  10. Diversity – Sentiments • The positivity of comments helps identify diversity • Computed by counting positive and negative words • Can be calculated by Euclidean distance or simply by averaging
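One way to read the word-counting idea on this slide is sketched below: each article gets a two-component vector (share of positive words, share of negative words) from its comments, and the diversity of two articles is the Euclidean distance between their vectors. The tiny word lists, the tokenisation, and the function names are all assumptions made for illustration; the slides do not say which lexicon or normalisation was used.

```python
import math

# Hypothetical miniature lexicons; a real system would use a full sentiment word list.
POSITIVE = {"good", "great", "win", "hope"}
NEGATIVE = {"bad", "terrible", "lose", "fear"}


def sentiment_vector(comments):
    """Return (fraction of positive words, fraction of negative words) over all comments."""
    words = [w.strip(".,!?").lower() for c in comments for w in c.split()]
    if not words:
        return (0.0, 0.0)
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos / len(words), neg / len(words))


def sentiment_diversity(comments_a, comments_b):
    """Euclidean distance between the sentiment vectors of two articles' comments."""
    return math.dist(sentiment_vector(comments_a), sentiment_vector(comments_b))


comments_a = ["Great win!", "Good hope"]
comments_b = ["Terrible, bad result"]
print(sentiment_vector(comments_a))
print(sentiment_diversity(comments_a, comments_b))
```

The "simply by averaging" alternative could be read as comparing a single mean positivity score per article instead of a vector; that variant is not shown here.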

  11. Diversity – User ID • Users tend to read similar articles • D_div(a_i, a_j) = 1 − J(S_i, S_j), where S_i and S_j are the sets of users who commented on articles a_i and a_j respectively

  12. Diversity – User Location • Users from similar locations tend to read similar articles • Same as user ID, but using countries • D_div(a_i, a_j) = 1 − J(S_i, S_j), where S_i and S_j are the sets of countries of the users who commented on articles a_i and a_j respectively
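The user-ID and user-location measures differ only in which set is built from the comments. The sketch below assumes a hypothetical `Comment` record with `article_id`, `user_id`, and `country` fields (the slides do not describe the data layout) and derives both sets per article before applying 1 − J.

```python
from typing import NamedTuple


class Comment(NamedTuple):
    """Hypothetical comment record; field names are assumptions."""
    article_id: str
    user_id: str
    country: str


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as maximally distant (assumption)."""
    union = a | b
    return 1.0 - (len(a & b) / len(union) if union else 0.0)


def commenter_sets(comments, article_id):
    """Return (set of commenting user IDs, set of their countries) for one article."""
    own = [c for c in comments if c.article_id == article_id]
    return {c.user_id for c in own}, {c.country for c in own}


# Made-up comments for two articles.
comments = [
    Comment("a1", "u1", "US"),
    Comment("a1", "u2", "UK"),
    Comment("a2", "u2", "UK"),
    Comment("a2", "u3", "UK"),
]
users_1, countries_1 = commenter_sets(comments, "a1")
users_2, countries_2 = commenter_sets(comments, "a2")

user_diversity = jaccard_distance(users_1, users_2)              # {u1,u2} vs {u2,u3}
location_diversity = jaccard_distance(countries_1, countries_2)  # {US,UK} vs {UK}
print(user_diversity, location_diversity)
```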

  13. Diversity • The diversity of any two articles is the least of the diversity values given by the methods above • The total diversity of k articles is calculated by summing the diversity of all pairs • The k articles with the most total diversity are recommended
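The combination rule on this slide can be illustrated as below: take the minimum over the individual measures for each pair, then sum over all pairs in a set. The per-pair scores are invented numbers, and the function names and the `frozenset`-keyed dictionary are assumptions chosen for the sketch.

```python
from itertools import combinations


def pair_diversity(scores):
    """Diversity of one pair: the least value among the individual measures."""
    return min(scores)


def total_diversity(articles, pair_scores):
    """Sum of pair diversities over every pair in the given set of articles."""
    return sum(
        pair_diversity(pair_scores[frozenset(pair)])
        for pair in combinations(articles, 2)
    )


# Invented scores per pair, e.g. [entities, sentiment, user ID].
pair_scores = {
    frozenset({"b", "c"}): [0.3, 0.8, 0.5],
    frozenset({"b", "d"}): [0.9, 0.7, 1.0],
    frozenset({"c", "d"}): [0.6, 0.4, 0.9],
}

total = total_diversity(["b", "c", "d"], pair_scores)  # 0.3 + 0.7 + 0.4
best_pair = max(
    combinations(["b", "c", "d"], 2),
    key=lambda group: total_diversity(group, pair_scores),
)
print(total, best_pair)
```

Taking the minimum means a pair only counts as diverse if every measure agrees that it is.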

  14. Computation Problem • The Jaccard coefficient is costly to compute • Use Locality-Sensitive Hashing to get an approximate set of candidates • Min-hash functions are used • Computing the diversity of all pairs is time-consuming • Compute diversity within each hash bucket first • Choose only the k most diverse articles in each bucket
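A rough sketch of min-hash based Locality-Sensitive Hashing for the candidate step is given below. It is a generic textbook-style construction, not the presenter's code: the seeded `blake2b` hashes, the banding parameters (`num_hashes`, `bands`), and all function names are assumptions, and feature sets are assumed to be non-empty. Articles that share a bucket in any band become approximate candidates, so exact Jaccard distances need only be computed within buckets.

```python
import hashlib
from collections import defaultdict


def minhash_signature(features, num_hashes):
    """One min-hash value per seeded hash function (features must be non-empty)."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{f}".encode(), digest_size=8).digest(), "big"
            )
            for f in features
        ))
    return tuple(signature)


def lsh_buckets(feature_sets, num_hashes=12, bands=4):
    """Split each signature into bands; articles with an identical band share a bucket."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for article_id, features in feature_sets.items():
        signature = minhash_signature(features, num_hashes)
        for band in range(bands):
            key = (band, signature[band * rows:(band + 1) * rows])
            buckets[key].add(article_id)
    return buckets


def lsh_candidates(article_id, feature_sets, num_hashes=12, bands=4):
    """Approximate candidate set: every article sharing at least one bucket with article_id."""
    found = set()
    for members in lsh_buckets(feature_sets, num_hashes, bands).values():
        if article_id in members:
            found |= members
    found.discard(article_id)
    return found


# Made-up feature sets: "a" and "b" are identical, "e" shares nothing with them.
feature_sets = {
    "a": {"obama", "election", "usa"},
    "b": {"obama", "election", "usa"},
    "e": {"football", "goal"},
}
print(lsh_candidates("a", feature_sets))
```

More bands with fewer rows each make collisions more likely and so raise recall at the cost of larger buckets; the per-bucket selection of the k most diverse articles mentioned on the slide would then run on each bucket's members.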

  15. Results • The algorithm is 22 times faster than the brute-force algorithm • 30.38 times faster than MMR, an algorithm proposed in another paper • Recall is affected
