
Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification






Presentation Transcript


  1. Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification. John Blitzer, Mark Dredze and Fernando Pereira, University of Pennsylvania, ACL 2007.

  2. Research Purposes • How can we adapt sentiment classifiers across domains (books, DVDs, electronics, and kitchen appliances)? • Structural correspondence learning (SCL) • How can we select domains to annotate that would be good proxies for many other domains? • The A-distance

  3. Domain adaptation: SCL • SCL: structural correspondence learning • Two types of words: • Pivot features: words such as excellent and awful that behave the same way in both domains. • Domain-specific features: new words that appear in only one domain, e.g., good-quality reception in a cell phone review or fast dual-core in a computer review.

  4. SCL & SCL-MI • Selecting pivot features: • SCL selects the m pivot features that occur most frequently in both domains. • Frequency works well for POS tagging, where frequent words are very often function words, but not as well for sentiment classification. • SCL-MI instead chooses the pivots with the highest mutual information with the source label (positive, negative); see the sketch below.
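A minimal sketch of the SCL-MI selection step (not the authors' code): it assumes binary document-feature matrices and uses scikit-learn's mutual_info_score; the function name, thresholds, and pivot count are illustrative.

```python
# Hypothetical sketch of SCL-MI pivot selection: rank candidate pivots by
# mutual information between feature occurrence and the source label.
import numpy as np
from sklearn.metrics import mutual_info_score

def select_pivots_mi(X_src, y_src, X_tgt, num_pivots=1000, min_count=5):
    """X_src, X_tgt: binary doc-feature matrices; y_src: 0/1 sentiment labels."""
    src_counts = (X_src > 0).sum(axis=0)
    tgt_counts = (X_tgt > 0).sum(axis=0)
    # Candidates must be frequent in BOTH domains (cf. slide 7's threshold).
    candidates = np.where((src_counts > min_count) & (tgt_counts > min_count))[0]
    # Score each candidate by MI(feature present; source label).
    scores = np.array([mutual_info_score(y_src, X_src[:, j] > 0)
                       for j in candidates])
    return candidates[np.argsort(scores)[::-1][:num_pivots]]
```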

  5. SCL & SCL-MI (exclusive pivots) • Top pivots selected by SCL but not SCL-MI (left), and vice versa (right). • The SCL pipeline: observe a feature vector x; learn a projection matrix θ (k pivots × d features); apply the projection θx; learn the sentiment predictor on the augmented features. A sketch of the projection step follows.
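A rough sketch of the projection step under my reading of SCL: train one linear predictor per pivot on unlabeled data pooled from both domains, stack the weight vectors, and take the top singular vectors as the shared projection θ. All names and the dimension h are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def scl_projection(X_unlabeled, pivot_idx, h=50):
    """X_unlabeled: binary doc-feature matrix pooled from BOTH domains (n x d).
    Returns theta (h x d), the projection onto the shared feature subspace."""
    n, d = X_unlabeled.shape
    non_pivot = X_unlabeled.copy()
    non_pivot[:, pivot_idx] = 0              # pivots must not predict themselves
    W = np.zeros((d, len(pivot_idx)))        # one weight column per pivot
    for col, j in enumerate(pivot_idx):
        y = (X_unlabeled[:, j] > 0).astype(int)  # "does this doc contain pivot j?"
        clf = SGDClassifier(loss="modified_huber").fit(non_pivot, y)
        W[:, col] = clf.coef_.ravel()
    U, _, _ = np.linalg.svd(W, full_matrices=False)  # top singular directions
    return U[:, :h].T                        # theta: (h x d)

# The sentiment classifier is then trained on augmented features [x, theta @ x]:
# X_aug = np.hstack([X, X @ theta.T])
```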

  6. Dataset • Amazon product reviews: books, DVDs, electronics, and kitchen appliances. • Star ratings (1–5): 1–2 labeled negative, 4–5 labeled positive, 3 dropped as ambiguous. • Balanced labeled sets: 1,000 positive and 1,000 negative examples per domain. • Unlabeled data: e.g., 3,685 DVD and 5,945 kitchen instances.

  7. Baseline & experiment settings • Linear predictors on unigram and bigram features for classification. • Trained with stochastic gradient descent to minimize a Huber loss. • For SCL & SCL-MI: pivots must occur in more than five documents in each domain. A minimal baseline sketch follows.
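A minimal baseline sketch matching this slide's description, with scikit-learn's modified_huber loss standing in for the paper's Huber-style loss; the variable names are placeholders.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Linear predictor over unigram + bigram features, fit by SGD.
baseline = make_pipeline(
    CountVectorizer(ngram_range=(1, 2), binary=True),
    SGDClassifier(loss="modified_huber"),
)
# baseline.fit(source_texts, source_labels)                # train on source
# accuracy = baseline.score(target_texts, target_labels)   # evaluate on target
```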

  8. Experiments • Labeled data per domain: 1,600 training instances and 400 test instances. • Baseline: a linear classifier trained on the source domain without adaptation. • Upper bound: an in-domain classifier trained and tested within the target domain. • Example: baseline 72.8%, SCL-MI adaptation 79.7%, in-domain 80.4%. Adaptation loss is 7.6% for the baseline and 0.7% for SCL-MI, a 90.8% relative reduction in the error due to adaptation (worked through below).
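The arithmetic behind the slide's example, spelled out:

```python
# Worked check of the slide's example numbers.
in_domain, baseline, scl_mi = 80.4, 72.8, 79.7

loss_base  = in_domain - baseline   # adaptation loss for the baseline: 7.6
loss_sclmi = in_domain - scl_mi     # adaptation loss for SCL-MI: 0.7
reduction  = (loss_base - loss_sclmi) / loss_base
print(f"{loss_base:.1f}% {loss_sclmi:.1f}% {reduction:.1%}")  # 7.6% 0.7% 90.8%
```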

  9. Experiment Results

  10. Results analysis

  11. Correcting Misalignments • Supervised training objective: fit the target model while penalizing its distance from the source model weight vector v_s. • Uses only 50 labeled target-domain instances (few enough for a single engineer to label with minimal effort). A sketch follows.
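A hedged sketch of the correction objective as described on this slide: minimize the modified Huber loss on the ~50 labeled target instances plus a penalty λ‖w − v_s‖² that keeps the new weights near the source model. λ, the learning rate, and the step count here are illustrative, not values from the paper.

```python
import numpy as np

def modified_huber_grad(w, X, y):
    """Gradient of the average modified Huber loss; labels y in {-1, +1}."""
    m = y * (X @ w)                          # per-instance margins
    g = np.zeros_like(w)
    mid = (m >= -1) & (m < 1)                # quadratic region: (1 - m)^2
    g -= (2 * (1 - m[mid]) * y[mid]) @ X[mid]
    low = m < -1                             # linear region: -4m
    g -= (4 * y[low]) @ X[low]
    return g / len(y)

def correct_weights(v_s, X_tgt, y_tgt, lam=1.0, lr=0.01, steps=500):
    """Fit target weights near the source vector v_s on ~50 labeled instances."""
    w = v_s.copy()                           # start from the source model
    for _ in range(steps):
        g = modified_huber_grad(w, X_tgt, y_tgt) + 2 * lam * (w - v_s)
        w -= lr * g
    return w
```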

  12. Experiment results: loss • Shows adaptation results for only the two domain pairs on which SCL-MI performed worst relative to the supervised baseline.

  13. Experimental Results: +50 labeled target instances

  14. Measuring Adaptability • The A-distance: two domains can differ in arbitrary ways, but we are only interested in the differences that affect classification accuracy. • Here A is the family of sets on which some linear classifier returns a positive value.
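For reference, the underlying definition from Ben-David et al. (2006), which this part of the paper builds on, with 𝒜 the family of positive sets of linear classifiers:

```latex
% A-distance between domain distributions D and D', where \mathcal{A} is
% the family of sets on which some linear classifier returns a positive value:
d_{\mathcal{A}}(D, D') = 2 \sup_{A \in \mathcal{A}}
  \bigl|\, \Pr_{D}[A] - \Pr_{D'}[A] \,\bigr|
```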

  15. Use the Huber loss as a proxy for the A-distance. • Given two domains, compute the SCL representation, label each instance with its domain, and train a linear classifier to discriminate between the two domains. • Compute that classifier's empirical average per-instance Huber loss and report 100 · (1 − loss); the slides refer to this value as the (proxy) A-distance. A sketch follows.
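A sketch of the proxy computation as I read this slide, with scikit-learn standing in for the paper's classifier and theta coming from the earlier SCL sketch. Intuition: the harder the two domains are to tell apart, the higher the loss and the smaller the proxy distance.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def modified_huber_loss(margins):
    """Per-instance modified Huber loss for margins y * f(x)."""
    return np.where(margins >= 1, 0.0,
                    np.where(margins >= -1, (1 - margins) ** 2, -4 * margins))

def proxy_a_distance(X_a, X_b, theta):
    # Project both domains into the SCL space and label each row by domain.
    X = np.vstack([X_a @ theta.T, X_b @ theta.T])
    y = np.concatenate([np.ones(len(X_a)), -np.ones(len(X_b))])
    clf = SGDClassifier(loss="modified_huber").fit(X, y)
    loss = modified_huber_loss(y * clf.decision_function(X)).mean()
    return 100 * (1 - loss)
```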

  16. Proxy A-distance & adaptation loss • Domain pairs with a low proxy A-distance (books and DVDs; kitchen and electronics) adapt well to each other, so when choosing domains to annotate, select books or DVDs, but not both.

  17. Conclusion and future work • Domain adaptation is useful for sentiment classification: SCL is improved by selecting pivots with mutual information (SCL-MI), and misalignments can be corrected with a small amount of labeled target-domain data. • The proxy A-distance can guide which domains to label. • Future work: addressing the ranking problem.
