Peer-review analysis

Peer-review analysis Comprehensive exam Presented by: Wenting Xiong Committee: Diane Litman, Rebecca Hwa, Jingtao Wang

Motivation • Goal Mine useful information in peers’ feedback and represent it in an intuitive and concise way • Tasks and related research topics • Identify review helpfulness NLP – Review analysis • Summarize reviewers’ comments NLP – Paraphrasing and Summarization • Sense-making of review comments interactive review exploration HCI – Visual text analytics

Part.1 NLP -- Review Analysis

Outline • Review helpfulness analysis • Sentiment analysis (opinion mining) Aspect detection Sentiment orientation Sentiment classification & extraction

1 Review helpfulness analysis • Automatic prediction • Learning techniques • Feature utilities • The ground-truth • Analysis of perceived review helpfulness • Users’ bias when voting for helpfulness • Influence of the other reviews of the same product

1.1 -- Learning techniques • Problem formalization • Input: textual reviews • Output: helpfulness score • Learning Algorithms • Supervised learning – Regression • Product reviews (e.g. electronics) <Kim 2006>, <Zhang 2006>, <Liu 2007>,<Ghose 2010>, <O'Mahony 2010> • Trip reviews <Zhang 2006> • Movie reviews <Zhang 2006> • Unsupervised learning – Clustering • Book reviews <Tsur 2009> • Focus • Predict absolute scores VS. rankings • Identify most helpful <Liu 2007> vs. unhelpful <Tsur 2009>

1.1 -- Feature utilities • Features used to model review helpfulness • Conflicting results about the effectiveness of subjectivity features • term-based counts not useful <Kim et al. 2006>; category-based counts show positive words correlate with greater helpfulness <Ghose et al. 2010> • Data sparsity issues?

1.1 -- The ground-truth • Various gold standards of review helpfulness • Aggregated helpfulness votes Perceived helpfulness e.g. <Kim 2006> • Manual annotations of helpfulness Real helpfulness <Liu 2007> • Problems Percentage of helpful votes is not consistent with annotators’ judgments based on helpfulness specifications Error rate of preference pair < 0.5 <Liu 2007>
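The mismatch between vote-based and annotated helpfulness can be made concrete with a pairwise comparison. The sketch below is one plausible reading of a preference-pair error rate, not necessarily the exact definition used in <Liu 2007>: it counts the fraction of review pairs that the helpful-vote ratio "x/y" orders differently from an annotator score. The function name, the tie handling, and the data are hypothetical.

```python
from itertools import combinations


def preference_pair_error_rate(vote_ratio, annotated):
    """Fraction of review pairs that the two scores order differently.

    Pairs tied under either score are skipped (an assumption of this sketch).
    """
    disagree = 0
    total = 0
    for i, j in combinations(range(len(vote_ratio)), 2):
        dv = vote_ratio[i] - vote_ratio[j]
        da = annotated[i] - annotated[j]
        if dv == 0 or da == 0:
            continue  # no preference expressed by one of the scores
        total += 1
        if (dv > 0) != (da > 0):
            disagree += 1
    return disagree / total if total else 0.0


# Hypothetical data: (helpful votes x, total votes y) and annotator scores.
votes = [(9, 10), (1, 2), (30, 60), (2, 8)]
vote_ratio = [x / y for x, y in votes]
annotated = [3, 4, 2, 1]
error = preference_pair_error_rate(vote_ratio, annotated)
```

In this toy data, reviews 2 and 3 have the same ratio despite very different vote counts, which is one reason "x/y" alone can be a weak proxy for real helpfulness.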

1 Review helpfulness analysis • Automatic prediction • Learning techniques • Feature utilities • The ground-truth • Analysis of perceived review helpfulness • Biased voting of review helpfulness on Amazon.com • The perceived helpfulness is not only determined by the textual content

1.2 Analysis of perceived review helpfulness • Biased voting of review helpfulness on Amazon.com • Imbalanced vote • Winner Circle bias • Early bird bias <Liu 2007> • “x/y” does not capture the true helpfulness of reviews • The perceived helpfulness is not only determined by the textual content • Influence of the other reviews of the same product • Individual bias <Danescu-Niculescu-Mizil 2009>

1 Review helpfulness analysis • Summary • Effective features for identifying review helpfulness • Perceived helpfulness VS. real helpfulness • Comments • New features • Introduce domain knowledge and information from other dimensions • Data sparsity problem • High-level features • Deep learning from low-level features • Other machine learning techniques • Theory-based generative models

Outline • Review helpfulness analysis • Sentiment analysis (opinion mining)

2 Sentiment analysis (opinion mining) How do people think about what? • Aspect detection • Sentiment orientation • Sentiment classification & extraction

2.1 Aspect detection • Frequency-based approach • Most frequent noun-phrase + sentiment-pivot expansion <Liu, 2004> • PMI (Pointwise Mutual Information) with meronymy discriminators + WordNet <Popescu 2005> • Generative approach • LDA, MG-LDA <Titov 2008>, sentence-level local-LDA <Brody 2010> • Multiple-aspect sentiment model <Titov 2008> • Content-attitude model <Sauper 2011>

2.2 Sentiment orientation • Aggregating from subjective terms • Manually constructed subjective lexicons • Bootstrapping with PMI • Adj & adv <Turney 2001> • opinion-bearing words <Liu 2004> • Graph-based approach • Relaxation labeling <Popescu 2005> • Scoring <Brody 2010> • Domain adaptation • SCL algorithm <Blitzer 2007> • Through topic models • MAS -- aspect-independent + aspect-dependent <Titov 2008> • Content-attitude models -- predicted posterior of sentiment distribution <Sauper, 2011>
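As an illustration of PMI-based bootstrapping, the sketch below scores a phrase by the difference between its PMI with a positive seed word and its PMI with a negative seed word, in the spirit of Turney's semantic orientation. The counts are hypothetical, the function and parameter names are invented for this example, and no smoothing is applied, so zero counts would need to be handled before use.

```python
import math


def pmi(joint, count_a, count_b, total):
    """Pointwise mutual information in bits: log2(P(a, b) / (P(a) * P(b)))."""
    return math.log2((joint * total) / (count_a * count_b))


def semantic_orientation(near_pos, near_neg, count_phrase,
                         count_pos, count_neg, total):
    """PMI(phrase, positive seed) minus PMI(phrase, negative seed).

    A positive result suggests positive orientation, a negative result
    suggests negative orientation.
    """
    return (pmi(near_pos, count_phrase, count_pos, total)
            - pmi(near_neg, count_phrase, count_neg, total))


# Hypothetical co-occurrence counts from a corpus of 1000 text windows.
so = semantic_orientation(
    near_pos=8,       # windows containing the phrase and the positive seed
    near_neg=2,       # windows containing the phrase and the negative seed
    count_phrase=10,  # windows containing the phrase
    count_pos=100,    # windows containing the positive seed
    count_neg=100,    # windows containing the negative seed
    total=1000,
)
```

Here the phrase co-occurs four times as often with the positive seed, so the orientation score comes out positive.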

2.3 Sentiment classification and extraction • Classification • Binary <Turney 2001> • Finer-grained e.g. metric labeling <Pang 2005> • Data sparsity • Bag-of-Words vs. Bag-of-Opinions <Qu 2010> • Opinion-oriented extraction • Topic of interest • Pre-defined • Automatically learned • User-specified

2 Summary Comparing reviews’ helpfulness and sentiment • In terms of automatic prediction, both are metric-inference problems that can be formalized as standard ML problems with the same input X but different outputs Y • The learned knowledge about opinion topics and the associated sentiments would help model the general utility of reviews

Part.2 NLP -- Paraphrasing & Summarization

Outline • Paraphrasing Paraphrases are semantically equivalent to each other • Paraphrase recognition • Paraphrase generation • Summarization Shorter representation of the same semantic information as the input text • Informativeness computation • Extractive summarization of evaluative text

1.1 Paraphrase recognition • Discriminative approach • Various string similarity metrics • Different levels of abstraction of textual strings <Malakasiotis 2009> Question: Any useful existing resources for identifying equivalent semantic information? • Word-level: dictionary, WordNet • Phrase-level: ? • Sentence-level: ?

1.2 Paraphrase generation • Corpora • Monolingual vs. bilingual • Methods • Distributional-similarity based • Corpora based • Evaluation • Intrinsic evaluation vs. extrinsic evaluation

1.2 -- Corpora • Monolingual corpora • Parallel corpora • Translation candidates • Definitions of the same term • Comparable corpora • Summary of the same event • Documents on the same topic • Bilingual parallel corpora

1.2 -- Methods.1 • Distributional-similarity based methods • DIRT, paths that frequently occur with the same words at their ends • Using a single monolingual corpus • MI to measure association strength between slot and its arguments <Lin 2001> • Sentence-lattices, argument similarity of multiple slots on sentence-lattices • Using a comparable monolingual corpus • Hierarchical clustering for grouping similar sentences • MSA to induce lattices <Barzilay 2003>

1.2 -- Methods.2 • Corpora-based methods • Monolingual parallel corpus • Monolingual MT <Quirk 2004> • Merging partial parse trees FSA <Pang 2003> • Paraphrasing from definitions <Hashimoto 2011> • Monolingual comparable corpus • MSR paraphrase corpus <Dolan 2005> • Edit distance, Journalism convention • Sentence-lattices <Barzilay 2003> • Bilingual parallel corpus • Pivot approach <Callison-Burch 2005> <Zhao 2008> • Random-walk based HTP <Kok 2009>
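The pivot approach over a bilingual parallel corpus can be sketched as marginalizing over shared foreign phrases: p(e2 | e1) = sum over f of p(f | e1) * p(e2 | f). The phrase tables below are toy values made up for illustration, and `pivot_paraphrases` is a hypothetical helper name, not an API from any cited system.

```python
from collections import defaultdict


def pivot_paraphrases(e_to_f, f_to_e, phrase):
    """Score paraphrase candidates of `phrase` by pivoting through
    foreign phrases: p(e2 | e1) = sum_f p(f | e1) * p(e2 | f)."""
    scores = defaultdict(float)
    for foreign, p_f_given_e in e_to_f.get(phrase, {}).items():
        for candidate, p_e_given_f in f_to_e.get(foreign, {}).items():
            if candidate != phrase:  # a phrase is not its own paraphrase
                scores[candidate] += p_f_given_e * p_e_given_f
    return dict(scores)


# Toy phrase-translation tables (hypothetical probabilities).
e_to_f = {"under control": {"unter kontrolle": 0.8, "im griff": 0.2}}
f_to_e = {
    "unter kontrolle": {"under control": 0.7, "in check": 0.3},
    "im griff": {"under control": 0.5, "in check": 0.25, "in hand": 0.25},
}
result = pivot_paraphrases(e_to_f, f_to_e, "under control")
best = max(result, key=result.get)
```

Candidates reachable through several pivots accumulate probability mass, which is why "in check" outranks "in hand" in this toy example.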

1.2 -- Evaluation • Intrinsic evaluation • Responsiveness • Can assess precision, but not recall • Standard test references <Callison-Burch 2008> • Manually aligned corpus • Lower bound precision & relative recall • Extrinsic evaluation • Alignment tasks in monolingual translation • Alignment error rate • Alignment precision, recall, F-measure <Dolan 2004> • Model-specific evaluation • FSA <Pang 2005>

2 Summarization Tasks in automatic summarization • Content selection • Information ordering • Automatic editing, information fusion Focus of this talk -- • Informativeness computation • Information selection (and generation) • Summarization evaluation

2.1 Computing informativeness • Semantic information (Topic identification) • Word-level • Frequency, TFIDF <Liu 2004>, Topic signature <Lin 2001>, PMI(w, topic) <Wang 2011>, external domain knowledge <Zhuang 2006> • Sentence-level • HMM content models <Barzilay 2004> • Category classification + sentence clustering <Abu-Jbara 2011> • Summary-level • Sentiment-aspect match model + KL divergence <Lerman 2009> • Opinion-based sentiment scores for evaluative texts • Sentiment polarity, intensity, mismatch, diversity <Lerman 2009> • Discriminative approach to predict informativeness • Combine statistical, semantic, and sentiment features in linear or log-linear models <Wang 2011>
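A minimal sketch of word-level informativeness, assuming the common tf * log(N / df) variant of TF-IDF and a simple average to lift word weights to a sentence score. Both choices are assumptions of this example rather than the formulation of any cited paper, and the review snippets are made up.

```python
import math
from collections import Counter


def tfidf(docs, index):
    """TF-IDF weights for the words of docs[index].

    docs is a list of token lists; idf is log(N / df), one common variant.
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each word once per document
    tf = Counter(docs[index])
    return {w: c * math.log(n_docs / df[w]) for w, c in tf.items()}


def sentence_score(tokens, weights):
    """Average word weight; words without a weight contribute 0."""
    if not tokens:
        return 0.0
    return sum(weights.get(w, 0.0) for w in tokens) / len(tokens)


# Hypothetical tokenized reviews.
docs = [
    ["battery", "battery", "screen"],
    ["screen", "price"],
    ["screen", "camera"],
]
weights = tfidf(docs, 0)
score = sentence_score(["battery", "screen"], weights)
```

A word that appears in every document ("screen" here) receives zero weight, so sentences are rewarded for terms that distinguish the target document.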

2.2 Information selection & generation • Extraction • Rank-based sentence selection • Aggregation of word informative weights (+ discourse features) <Carenini, 2006> <Wang, 2011> • Optimized by Maximal Marginal Relevance • Topic-based selection • HMM content model <Barzilay, 2004> • Language-model based clustering of informative phrases <Liu, 2010> • Summarize citations based on category-cluster-sentence <Abu-Jbara, 2011> • Structured evaluative summary • Aspect + overall rating <Hu, 2004> • Aspect + pros and cons <Zhuang, 2006> • Hierarchical aspects + sentiment phrasal expressions <Liu 2010> • Abstraction • Generate evaluative arguments based on aggregation of extracted information <Carenini, 2006> • Graph-based summarization using adjacency matrix to model dialogue structure <Wang, 2011>
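Rank-based selection optimized by Maximal Marginal Relevance can be sketched as a greedy loop that trades relevance against redundancy with the sentences already chosen. In this sketch the relevance scores are given as input, redundancy is measured with Jaccard word overlap (a stand-in chosen for simplicity, not the similarity used in the cited work), and `lam` is the usual trade-off weight.

```python
def jaccard(a, b):
    """Word-overlap similarity between two whitespace-tokenized sentences."""
    a, b = set(a.split()), set(b.split())
    return len(a & b) / len(a | b) if a | b else 0.0


def mmr_select(sentences, relevance, k, lam=0.7):
    """Greedily pick k sentence indices, maximizing
    lam * relevance - (1 - lam) * max similarity to already selected ones."""
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max(
                (jaccard(sentences[i], sentences[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical sentences with precomputed informativeness scores.
sentences = [
    "the battery life is great",
    "battery life is great",
    "the screen is too dim",
]
relevance = [0.9, 0.85, 0.6]
picked = mmr_select(sentences, relevance, k=2, lam=0.5)
```

With `lam=0.5` the near-duplicate second sentence is skipped in favor of the less relevant but novel third one; with `lam=1.0` the selection reduces to ranking by relevance alone.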

2.3 Summarization evaluation • Pyramid (empirical) • Multiple human-written gold standards • Summary Content Units (SCU) <Ani 2007> • ROUGE • Automatically compare with gold standard • Consider overlap based on unigram, bigram, longest common subsequence <Lin 2004> • Fully automatic • Good summary should be similar to the input • KL divergence, JS divergence <Ani 2009> • User preference of sentiment summarizer
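Two of the automatic measures above can be sketched in a few lines: n-gram recall against a gold-standard summary in the style of ROUGE-N, and the Jensen-Shannon divergence between the word distributions of a summary and its input. This is a simplified illustration that assumes whitespace tokenization, non-empty texts, and no smoothing; it is not the official ROUGE toolkit.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0


def js_divergence(text_a, text_b):
    """Jensen-Shannon divergence (base 2) between unigram distributions.

    0 means identical distributions, 1 means no shared vocabulary.
    """
    pa, pb = Counter(text_a.split()), Counter(text_b.split())
    na, nb = sum(pa.values()), sum(pb.values())
    jsd = 0.0
    for w in set(pa) | set(pb):
        p, q = pa[w] / na, pb[w] / nb
        m = 0.5 * (p + q)
        if p > 0:
            jsd += 0.5 * p * math.log2(p / m)
        if q > 0:
            jsd += 0.5 * q * math.log2(q / m)
    return jsd


r1 = rouge_n_recall("the cat sat", "the cat sat on the mat", n=1)
r2 = rouge_n_recall("the cat sat", "the cat sat on the mat", n=2)
same = js_divergence("good battery", "good battery")
disjoint = js_divergence("a b", "c d")
```

ROUGE needs a human reference, whereas the divergence-based score only compares the summary with the input, which is what makes it fully automatic.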

Paraphrasing and summarization -- Summary • Common theme • Semantic equivalence • Related to sentiment analysis in computing informativeness of reviews • Aspect-dependent sentiment orientation • Overall vs. distribution statistics • Aspect coverage • Computed through scoring or by measuring the divergence between probabilistic models' distributions

Part.3 HCI -- Visual text analytics

Outline • Text visualization • Inner-set visualization for abstraction • Intra-set visualization for comparison • Interactive exploration • Design principles and examples

1 Text visualization • Inner-set visualization for abstraction • Semantic information • Sentiment information (opinions) • Intra-set visualization for comparison

1.1 Inner-set visualization techniques • Semantic information • Original text with highlighted keywords • Most detailed information • Topic-based representation • List of target entities (Jigsaw, <Stasko 2010>) • Haystack (Themail, <Viegas 2006>) • Tagcloud (OpinionSeer <Wu 2010>, TIARA <Liu 2009>, reviewSpotlight <Yatani, 2011>) • Vector-based representation • Dot in space (ThemeScapes <Wise 1995>)

1.1 Inner-set visualization techniques • Sentiment information • Value-based visual representation • Bar -- Opinion polarity and intensity <Liu 2005> • Histogram -- Rating distribution <Carenini 2006> • Double-square -- Frequency, polarity, intensity <Oelke 2009> • Thumbnail table -- opinion report for people in groups <Oelke 2009> Comment: • Requires NLP techniques for opinion mining and sentiment analysis • e.g. Intelligent support for identifying salient information for exploration (Aspects on which opinions are most (in)consistent) <Carenini 2006>

1 Text visualization • Inner-set visualization for abstraction • Semantic information • Sentiment information (opinions) • Intra-set visualization for comparison • Dimensionality of comparison • Via layout or visualizing metadata as axis

1.2 Intra-set visualization techniques • Dimensionality of exploration • 1D: layout or metadata • 2D: layout or/and metadata • 3D & 3D+: layout or/and metadata

1.2 Intra-set visualization -- 1D Exploration • Side-by-side • Compare single product reviews feature-by-feature <Liu 2005> • Connect interesting events of different periods of time (Continuum, <Andre 2007>) • Explore the connection of entities across documents (Jigsaw, <Stasko 2010>) • Grid-layout of data in groups • Faceted metadata for image browsing <Yee 2003> • Facetbox for presenting filtering by facet-data <Lee 2009> • Exploring term-based language patterns across documents <Don 2007> • Timeline -- temporal features • Themail <Viegas 2006>, Continuum <Andre 2007>, TIARA <Liu 2009>, TwitInfo <Marcus 2011>, etc.

1.2 Intra-set visualization -- 2D Exploration • Aspect-based opinion analysis across multiple targets • Paired <Liu 2005> • Matrix <Oelke 2009> • Scatter plot of targets with metadata as axis • Discover the entity-coverage in documents (Jigsaw <Stasko 2010>) • Visual DL search result with categorical and hierarchical axes <Shneiderman 2000> • 2D graph (layout) • Exploring relationships between entities and documents (Jigsaw <Stasko 2010>) • *Diagram of social network (TIARA <Liu 2009>) • Spatial representation in 2D space • Triangle scatter-plot of opinions (OpinionSeer <Wu 2010>) • *Opinion space <Faridani 2010> • Circled correlation map of review aspects <Oelke 2009>

1.2 Intra-set visualization -- 3D Exploration • 3D spatial representation • ThemeScapes <Wise 1995> • Theme strength as elevation (terrain map) • Combine multiple visualizations of metadata variables • OpinionSeer <Wu 2010> • Radial visualization with concentric rings + stacked graph + triangle scatter plot • TIARA <Liu 2010> • Stacked topic models (Wordcloud) over timeline Pros • Discover unperceivable interactions among multiple factors Cons • Concise but hard to interpret • Interaction is more complex and hard to design

2 Interactive exploration Design principles and examples • Data on-demand and in-depth exploration From the data perspective • Overview then detailed view From the interaction perspective • Zoom-in and zoom-out for exploration • Hierarchical filtering for search and browse • Detailed information as tooltip in explanatory visualization • Support exploration of multiple interests • View switching for interest-specific visualization techniques • Query-based content browsing • Pivot action for navigating between related items • Context preserving • Overview + detailed view • Support local interactions (hierarchically structured data) • A view of selection history of browsing

Visual text analytics -- summary To conclude • Text visualization constructs the semantic mapping between the text and visual variables • Visualize metadata together with textual information for comparison and exploration • Interaction design should follow human intuition of data exploration • Data characteristics • Inherent connection between data and metadata

Visual text analytics -- Connection between NLP and HCI • NLP helps visual analytics extract the target information and organize it in a desired way • Visual analytics provides exploratory tools for text analysis and opinion mining • Poses challenges to NLP in terms of both new corpora and interesting problems

Conclusion In terms of my own research interest • Review analysis • How to model the real helpfulness of peer-reviews • Paraphrasing and summarization • How to identify common themes and aggregate comments from different reviewers • Visual text analytics • How to create informative representation of reviews • And design intuitive interactive exploration for students or teachers to mine useful information Challenges and contributions • Theory-based high level information of usefulness • Summary-style paraphrasing • Visualize connection between opinions with detailed semantic information in context
