Utilizing sentiment analysis to summarize Turkish mobile phone product reviews, learning association rules for efficient feature-opinion matching.
Semantic Analysis of Product Reviews for Feature Summarization • ERDEM ÖZDEMİR, UTKU OZAN YILMAZ, BUĞRA MEHMET YILDIZ, ÖMER FARUK UZAR • Bilkent University, Computer Engineering Department • CS 533 Information Retrieval Systems
Outline • Introduction • Motivation • Sentiment Analysis of Product Reviews • Preparation of Dataset • Learning • Processing of Product Reviews • Learning Association Rules • Presentation of Results • Progress So Far • Summary • Conclusion
Introduction • User participation in Web sites increased with Web 2.0 • Product reviews are written by users on e-commerce sites • User opinions • Essential as they reflect the real experience of the people who actually use the products
Introduction • Use opinion mining (sentiment analysis) • Derive user opinions about product features • Determine their sentiment orientation • Analyzing if an opinion is positive or negative • Summarize that information to the user • Dataset • Use Turkish product reviews for mobile phones
Motivation • The experience of a product’s users influences people who consider buying it • Analysis of these reviews is useful for buyers, producers and e-commerce systems • As the number of reviews in e-commerce systems grows, users read only a small fraction of them • This usually leaves them unaware of some product features and of the opinions about them • Product reviews are generally repetitive • Reading all of them is generally inefficient • There is a need for summarization of product reviews • No such system exists for the Turkish language
Sentiment Analysis of Product Reviews • It consists of five steps • Preparation of Dataset • Learning • Processing of Product Reviews • Learning Association Rules • Presentation of Results
Preparation of Dataset • Use mobile phone reviews in Hepsiburada.com • Choice is based on the size of the dataset provided • Parse the website • To find links to cell phones • To extract user reviews • Strip off text from HTML tags • Put the parsed text into a database with some extra information • Reviewer’s grade of the product • People’s grade of the review etc
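The tag-stripping and storage steps on this slide could look roughly like the following. This is a minimal sketch using only the Python standard library, not the project's actual code: the table layout, the column names and the helper names `strip_tags` and `store_review` are assumptions made for illustration, and crawling Hepsiburada.com itself is left out.

```python
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and drops all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse the whitespace runs left behind by the removed tags.
    return " ".join("".join(parser.parts).split())


def store_review(conn, product, html, product_grade, review_grade):
    # Hypothetical schema: one row per review plus the two grades
    # mentioned on the slide.
    conn.execute(
        "INSERT INTO reviews (product, text, product_grade, review_grade) "
        "VALUES (?, ?, ?, ?)",
        (product, strip_tags(html), product_grade, review_grade),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews "
    "(product TEXT, text TEXT, product_grade INTEGER, review_grade INTEGER)"
)
store_review(conn, "Phone A", "<p>Kamerası <b>iyi</b> çekiyor.</p>", 5, 3)
row = conn.execute("SELECT text, product_grade FROM reviews").fetchone()
print(row)
```

An in-memory SQLite database keeps the sketch self-contained; any relational database would serve the same purpose.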
Learning • Calculate sentiment orientation of words • WordNet with seeded oriented words and Turney’s approach based on search engine queries are not suitable for Turkish • Best approach so far is using the reviewer’s grade of the product • For each opinion word $ow_j$ • $\mathrm{Orientation}(ow_j) = \dfrac{\sum_i tf_{i,j} \times idf_j \times g_i}{|\{r : ow_j \in r\}|}$
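A small sketch of the orientation score may make the formula concrete. The slide does not define its terms, so the following are assumptions: tf is the raw count of the word in review i, idf is log(N / df) over N reviews, and the grade g_i has already been mapped to a signed scale (for example -2 to +2) so that negative reviews pull the score below zero. The function name `orientations` is hypothetical.

```python
import math
from collections import Counter


def orientations(reviews):
    """reviews: list of (tokens, signed_grade) pairs.

    Returns {word: orientation}, following
    sum_i(tf_ij * idf_j * g_i) / (number of reviews containing the word).
    """
    n_reviews = len(reviews)

    # Document frequency: in how many reviews each word occurs.
    df = Counter()
    for tokens, _ in reviews:
        df.update(set(tokens))

    scores = Counter()
    for tokens, grade in reviews:
        tf = Counter(tokens)
        for word, count in tf.items():
            idf = math.log(n_reviews / df[word])  # assumed idf definition
            scores[word] += count * idf * grade

    return {word: scores[word] / df[word] for word in df}


# Toy data, invented for illustration only.
reviews = [
    (["iyi", "telefon"], 2),
    (["kötü", "telefon"], -2),
    (["iyi"], 1),
]
result = orientations(reviews)
for word, score in sorted(result.items()):
    print(word, round(score, 3))
```

In this toy data "iyi" ends up positive, "kötü" negative, and "telefon" cancels out to zero because it appears equally in positive and negative reviews. In practice only opinion words would be scored.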
Learning • Calculation of likelihood of feature-opinion match • For each sentence • Find features and opinions • Count number of times they appear together • Count their individual appearances • Calculate likelihood of feature-opinion match • $\dfrac{|\mathrm{Feature}_i \,\&\, \mathrm{Opinion}_j|^2}{|\mathrm{Feature}_i| \times |\mathrm{Opinion}_j|}$
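The counting procedure on this slide can be sketched as follows. It assumes that the features and opinions of each sentence have already been extracted, that an item is counted at most once per sentence, and that the score is the squared co-occurrence count divided by the product of the two individual counts. The function name `match_likelihoods` is hypothetical.

```python
from collections import Counter
from itertools import product


def match_likelihoods(sentences):
    """sentences: list of (features, opinions) pairs, one per sentence.

    Returns {(feature, opinion): cooccurrence**2 / (n_feature * n_opinion)}.
    """
    feature_count = Counter()
    opinion_count = Counter()
    pair_count = Counter()
    for features, opinions in sentences:
        fs, os_ = set(features), set(opinions)
        feature_count.update(fs)
        opinion_count.update(os_)
        # Every feature/opinion pair in the same sentence co-occurs once.
        pair_count.update(product(fs, os_))
    return {
        (f, o): n * n / (feature_count[f] * opinion_count[o])
        for (f, o), n in pair_count.items()
    }


# Toy data, invented for illustration only.
sentences = [
    (["kamera"], ["iyi"]),
    (["kamera"], ["iyi"]),
    (["fiyat"], ["iyi"]),
    (["kamera"], ["kötü"]),
]
likelihood = match_likelihoods(sentences)
for pair, value in sorted(likelihood.items()):
    print(pair, round(value, 3))
```

The score is 1 when a feature and an opinion always appear together and approaches 0 when they rarely share a sentence.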
Processing of Product Reviews • Aims to find <Product, Feature> and <Feature, Opinion> matches • Example • “Fiyatına göre iyi bir telefon kullanışlı tavsiye ederim.” (“A good phone for its price, easy to use, I recommend it.”) • Features: telefon (phone), fiyat (price) • Opinions: iyi (good), kullanışlı (easy to use), tavsiye ederim (I recommend) • Matches: <telefon, iyi>, <telefon, tavsiye ederim>, <telefon, kullanışlı>
Processing of Product Reviews • The first step is to apply a POS tagger to each sentence • “Konuşurken karşı tarafın sesi sanki biraz az geliyor gibi geldi bana.” (“While I was talking, the other party’s voice seemed to come through a bit quietly.”) → “Adverb Adj Noun+A3sg+Pnon+Gen Noun+A3sg+P3sg+NomFet Adj Adj Adj Verb+Pos+Prog1+A3sg Postp Verb+Pos+Past+A3sg Pron+A1sg+Pnon+Dat Punc” • To find opinions we use only adjectives, so we miss some opinion words such as “tavsiye ediyorum” (“I recommend”) • To find features, we look them up in a predefined feature list • “Kamerası iyi çekiyor.” (“Its camera shoots well.”; explicit feature: kamera) • “Telefon çekim kalitesi yüksek.” (“The phone’s shooting quality is high.”; implicit feature: kamera?)
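The extraction step described on this slide can be sketched as a simple filter over tagger output. The input format is an assumption: each token is taken to be a (surface form, lemma, tag string) triple, and the tags in the example are written by hand rather than produced by a real Turkish tagger. The function name `find_features_and_opinions` is hypothetical.

```python
def find_features_and_opinions(tagged, feature_list):
    """tagged: list of (surface, lemma, tag) triples for one sentence.

    Features are nouns whose lemma is in the known feature list;
    opinions are all tokens tagged as adjectives.
    """
    features = [
        lemma
        for _, lemma, tag in tagged
        if tag.startswith("Noun") and lemma in feature_list
    ]
    opinions = [lemma for _, lemma, tag in tagged if tag.startswith("Adj")]
    return features, opinions


# Hand-tagged example sentence: "Kamerası iyi çekiyor."
tagged = [
    ("Kamerası", "kamera", "Noun+A3sg+P3sg+Nom"),
    ("iyi", "iyi", "Adj"),
    ("çekiyor", "çek", "Verb+Pos+Prog1+A3sg"),
    (".", ".", "Punc"),
]
features, opinions = find_features_and_opinions(tagged, {"kamera", "fiyat"})
print(features, opinions)
```

As the slide notes, an adjective-only filter misses multi-word opinions such as "tavsiye ediyorum", and a plain list lookup cannot detect implicit features.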
Processing of Product Reviews • Assignment of opinions to features • Use rules • (Adv) Adj (Num) Noun, Noun (Adv|Adj) Adj Punc • Use likelihood values • Find the assignment of opinions to features that maximizes the sum of the likelihoods learned earlier in the learning step • Store features, feature-opinion pairs and the places where they are mentioned in the product reviews
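One way to read the likelihood-based assignment is sketched below. It assumes that each opinion is attached to exactly one feature while a feature may receive several opinions; under that assumption the opinions are independent, so picking the best feature for each opinion also maximizes the total sum. The slide does not state these constraints, the likelihood values are invented, and the function name `assign_opinions` is hypothetical. The POS-pattern rules are not covered here.

```python
def assign_opinions(features, opinions, likelihood):
    """Attach each opinion to the feature with the highest learned likelihood.

    likelihood: {(feature, opinion): score}; unseen pairs score 0.
    Opinions with no positive score for any feature stay unassigned.
    """
    assignment = {}
    if not features:
        return assignment
    for opinion in opinions:
        best = max(features, key=lambda f: likelihood.get((f, opinion), 0.0))
        if likelihood.get((best, opinion), 0.0) > 0.0:
            assignment[opinion] = best
    return assignment


# Invented likelihood values for the example sentence on the earlier slide.
likelihood = {
    ("telefon", "iyi"): 0.4,
    ("fiyat", "iyi"): 0.3,
    ("telefon", "kullanışlı"): 0.5,
}
assignment = assign_opinions(
    ["telefon", "fiyat"], ["iyi", "kullanışlı", "yüksek"], likelihood
)
print(assignment)
```

If a one-to-one matching between features and opinions were required instead, the same scores could feed a bipartite assignment solver.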
Learning Association Rules • Perform association rule analysis to obtain frequent feature item sets • Use transactions extracted in the previous step • Association rule • Implication in the form of X => Y • Existence of variable X implies existence of Y • Two kinds of association rules • Product => Feature • Feature => Opinion • After obtaining such association rules, prune the ones that are not repeated frequently and ones that are not interesting regarding their sentiment orientation
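The frequency-based pruning of X => Y rules can be sketched with the usual support and confidence measures. The slide does not name the measures or thresholds used, so both are assumptions here, as is the function name `mine_rules`. Pruning by sentiment orientation is not shown.

```python
from collections import Counter


def mine_rules(transactions, min_support, min_confidence):
    """transactions: list of (x, y) pairs, e.g. (feature, opinion).

    Keeps the rules x => y whose support and confidence reach the thresholds.
    Returns {(x, y): (support, confidence)}.
    """
    n = len(transactions)
    pair_count = Counter(transactions)
    x_count = Counter(x for x, _ in transactions)
    rules = {}
    for (x, y), count in pair_count.items():
        support = count / n  # share of all transactions
        confidence = count / x_count[x]  # share of transactions with x
        if support >= min_support and confidence >= min_confidence:
            rules[(x, y)] = (support, confidence)
    return rules


# Toy transactions, invented for illustration only.
transactions = (
    [("kamera", "iyi")] * 3 + [("kamera", "kötü")] + [("fiyat", "yüksek")]
)
rules = mine_rules(transactions, min_support=0.3, min_confidence=0.5)
print(rules)
```

With these thresholds only the rule kamera => iyi survives; the other two pairs occur too rarely to count as frequent.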
Presentation of Results • Provide a web user interface • Users can access the results by submitting the name of the product they want to fetch information about to the system • Example Interface
Progress So Far • Accomplished most of the essential steps of our project • Prepared our dataset • Fetched data from Hepsiburada.com • Processed it • Put it into a database • Performed sentiment analysis • Obtained promising results with our methods • Now, we are working on our web user interface and processing of product reviews
Summary • Project’s five steps • Preparation of Dataset • Learning • Processing of Product Reviews • Learning Association Rules • Presentation of Results
Conclusion • Problems • Review authors often do not use the language properly and correctly • There is no tool to perform syntax analysis of Turkish • Evaluation problem: How to calculate recall? • Simple solutions generally work better in diverse datasets and high-dimensional problems