Automatic Web Query Classification using Labeled and Unlabeled Training Data
Steven M. Beitzel, Eric C. Jensen, David D. Lewis, Abdur Chowdhury, Aleksander Kolcz, Ophir Frieder
Information Retrieval Laboratory, Department of Computer Science, http://ir.iit.edu


Presentation Transcript


  1. Automatic Web Query Classification using Labeled and Unlabeled Training Data Steven M. Beitzel, Eric C. Jensen, David D. Lewis, Abdur Chowdhury, Aleksander Kolcz, Ophir Frieder Information Retrieval Laboratory Department of Computer Science http://ir.iit.edu

  2. Overview • Introduction: Query Classification • Motivations & Prior Work • Our approach • Results & Analysis • Conclusions • Future Work

  3. Introduction • Goal: develop a system that can label a query with its relevant topical categories • Automatic classifiers help a search service decide when to use specialized databases • Specialized databases may provide tailored, topic-specific results

  4. Problem Statement • A query contains more information than just its terms • Search is not just about finding relevant documents – users have: • Target task • Target topic • General information need • Queries are simply an attempt to express all of the above in a couple of terms (average of 2.2 per query)

  5. Popular Web Queries

  6. Problem Statement (2) • Current search systems focus mainly on the terms in the queries • No focus on extracting topic information • Manual query classification is expensive • Does not take advantage of the large supply of unlabeled data available in query logs

  7. Prior Work • Much early text classification was document-based • Query Classification: • Manual (human assessors) • Automatic • Clustering techniques – don't help identify topics • Supervised learning via retrieved documents • Still expensive – the retrieved documents must themselves be classified

  8. Query frequency vs. % of Weekly Query Stream

  9. Automatic Query Classification Motivations • Web queries have very few features • Achieving and sustaining classification recall is difficult • Web query logs provide a rich source of unlabeled data; we must harness this data to aid classification

  10. Our Approach • Combine three methods of classification: • Labeled Data Approaches: • Manual (exact-match lookup using labeled queries) • Supervised Learning (Perceptron trained with labeled queries) • Unlabeled Data Approach: • Unsupervised Rule Learning with unlabeled data from a large query log • Disjunctive Combination of the above

  11. Approach #1 - Exact-Match to Manual Classifications • A team of editors manually classified approximately 1M popular queries into 18 topical categories • General topics (sports, health, entertainment) • Mostly popular queries • Pros • Expect high precision from exact-match lookup • Cons • Expensive to maintain • Very low classification recall • Not robust to changes in the query stream
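A minimal sketch of the exact-match lookup in Approach #1, assuming a simple in-memory dictionary; the example queries and category labels below are placeholders, not the actual editorial data.

```python
# Sketch of Approach #1: exact-match lookup against manually labeled queries.
# The table here is a toy stand-in for the ~1M editor-labeled queries.
from typing import Dict, Set

LABELED_QUERIES: Dict[str, Set[str]] = {
    "yellowstone national park": {"TRAVEL", "PLACES"},
    "britney spears": {"ENTERTAINMENT"},
}

def normalize(query: str) -> str:
    """Lowercase and collapse whitespace so lookups ignore case and spacing."""
    return " ".join(query.lower().split())

def exact_match_classify(query: str) -> Set[str]:
    """Return the manually assigned categories, or an empty set if the query is unseen."""
    return LABELED_QUERIES.get(normalize(query), set())
```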

  12. Approach #2 - Supervised Learning with a Perceptron • Goal: achieve higher levels of recall than human efforts • Supervised Learning • Used heavily in text classification • Bayes, Perceptron, SVM, etc… • Use manually classified queries to train a classifier

  13. Supervised Learning Experiments • Perceptron-based machine learning system • Separate collections for training and testing: • Training: • Nearly 1M web queries manually classified by a team of editors • Grouped non-exclusively into 18 topical categories, and trained each category independently • Testing: • 20,000 web queries classified by human assessors • ~30% agreement with classifications in training set
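A rough sketch of this setup using scikit-learn's Perceptron with bag-of-words query-term features and one independently trained binary classifier per category, as described above; the queries, labels, and category names are toy placeholders rather than the actual training collection.

```python
# Sketch: one binary Perceptron per category over bag-of-words query features,
# each category trained independently. Toy data only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import Perceptron

train_queries = ["yankees spring training", "cheap flights to rome", "lyrics hey jude"]
train_labels = {  # non-exclusive category membership, one label vector per category
    "SPORTS":        [1, 0, 0],
    "TRAVEL":        [0, 1, 0],
    "ENTERTAINMENT": [0, 0, 1],
}

vectorizer = CountVectorizer()               # query terms as features
X = vectorizer.fit_transform(train_queries)

classifiers = {}
for category, y in train_labels.items():
    clf = Perceptron(max_iter=1000)
    clf.fit(X, y)                            # independent binary classifier per category
    classifiers[category] = clf

def perceptron_classify(query: str):
    x = vectorizer.transform([query])
    return {c for c, clf in classifiers.items() if clf.predict(x)[0] == 1}
```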

  14. Supervised Learning Exp. (2) • Test queries were submitted to the trained learner for evaluation • Calculated true-positive and false-positive rates over all feature sets for each class • Plotted classifier performance using Detection-Error Tradeoff (DET) curves
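A small sketch of how such DET points can be computed for one category from classifier decision scores, assuming scikit-learn 0.24 or later (which provides det_curve); the scores and labels are illustrative.

```python
# Sketch: Detection-Error Tradeoff points (false-positive vs. false-negative rate)
# for one category, from illustrative classifier scores.
import numpy as np
from sklearn.metrics import det_curve

y_true   = np.array([1, 1, 0, 0, 1, 0])                 # human assessor labels for one category
y_scores = np.array([0.9, 0.4, 0.35, 0.8, 0.65, 0.1])   # classifier decision scores

fpr, fnr, thresholds = det_curve(y_true, y_scores)
for fp, fn, t in zip(fpr, fnr, thresholds):
    print(f"threshold={t:.2f}  false-positive rate={fp:.2f}  false-negative rate={fn:.2f}")
```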

  15. Supervised Learning DET Curves

  16. Supervised Learning Analysis • The DET curves for each class show a clear trend: • To lower the rate of false-negatives, substantial false-positives must be tolerated • This is a clear illustration of the query classification “recall problem” that has been identified in prior studies

  17. Approach #3 - Unsupervised Rule Learning Using Unlabeled Data • We have query logs with very large numbers of queries • Must take advantage of millions of users showing us how they look for things • Build on manual efforts • Manual efforts tell us some words from each category • Find words associated with each category • Learn how people look for topics, e.g. “what words do users use to find musicians or lawn-mowers”

  18. Unsupervised Rule Learning Using Unlabeled Data (2) • Find good predictors of a class based on how users look for queries related to certain categories • Use those words to predict new members of each category • Apply the notion of selectional preferences to find weighted rules for classifying queries automatically

  19. Selectional Preferences: Step 1 • Obtain a large log of unlabeled web queries • View each query as pairs of lexical units: • <head, tail> • Only applicable to queries of 2+ terms • Queries with n terms form n-1 pairs • Example: “directions to ICDM” forms two pairs: • <directions, to ICDM> and <directions to, ICDM> • Count and record the frequency of each pair
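A brief sketch of Step 1 as described above: each n-term query is split into its n-1 <head, tail> pairs and pair frequencies are counted; the miniature query list stands in for a real query log.

```python
# Sketch of Step 1: split each query into its n-1 <head, tail> pairs
# and count pair frequencies across the (unlabeled) query log.
from collections import Counter
from typing import List, Tuple

def head_tail_pairs(query: str) -> List[Tuple[str, str]]:
    """All <head, tail> splits of a query; single-term queries yield none."""
    terms = query.lower().split()
    return [(" ".join(terms[:i]), " ".join(terms[i:])) for i in range(1, len(terms))]

pair_counts: Counter = Counter()
for q in ["directions to icdm", "directions to chicago", "cheap flights"]:  # stand-in for a query log
    pair_counts.update(head_tail_pairs(q))

# e.g. ("directions", "to icdm") and ("directions to", "icdm") are each counted once
```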

  20. Selectional Preferences: Step 2 • Obtain a set of manually labeled queries • Check the heads and tails of each pair to see if they appear in the manually labeled set • Convert each <head, tail> pair into: • <head, CATEGORY> (forward preference) • <CATEGORY, tail> (backward preference) • Discard <head, tail> pairs for which there is no category information at all • Sum counts for all contributing pairs and normalize by the number of contributing pairs
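A sketch of Step 2 under the same toy setup: <head, tail> counts are converted into forward (<head, CATEGORY>) and backward (<CATEGORY, tail>) preference counts using the labeled queries, pairs with no category information are discarded, and sums are normalized by the number of contributing pairs. The labeled lookup, the pair counts, and the exact normalization are illustrative assumptions.

```python
# Sketch of Step 2 (toy data): convert <head, tail> counts into forward and
# backward preference counts via the manually labeled queries, then normalize
# by the number of contributing pairs. The normalization shown is one plausible
# reading of the slide, not necessarily the authors' exact procedure.
from collections import Counter, defaultdict

pair_counts = Counter({("directions to", "icdm"): 3,
                       ("cheap flights to", "chicago"): 5})    # from Step 1
labeled = {"icdm": {"TECH"}, "chicago": {"PLACES", "TRAVEL"}}  # lexical unit -> categories

forward  = Counter()             # (head, CATEGORY) counts
backward = Counter()             # (CATEGORY, tail) counts
contributing = defaultdict(int)  # how many pairs contributed to each lexical unit

for (head, tail), count in pair_counts.items():
    cats_tail = labeled.get(tail, set())
    cats_head = labeled.get(head, set())
    if not cats_tail and not cats_head:
        continue                                # no category information at all: discard
    for cat in cats_tail:                       # forward preference <head, CATEGORY>
        forward[(head, cat)] += count
    for cat in cats_head:                       # backward preference <CATEGORY, tail>
        backward[(cat, tail)] += count
    if cats_tail:
        contributing[head] += 1
    if cats_head:
        contributing[tail] += 1

forward_freq = {(h, c): n / contributing[h] for (h, c), n in forward.items()}
```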

  21. Selectional Preferences: Step 2

  22. Selectional Preferences: Step 3 • Score each preference using Resnik's Selectional Preference Strength formula: • S(x) = Σ_u P(u|x) · log( P(u|x) / P(u) ) • Where u represents a category, as found in Step 2 • S(x) is the sum of the weighted scores for every category u associated with a given lexical unit x
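A sketch of Step 3 with toy numbers: the per-category terms P(u|x) · log(P(u|x)/P(u)) act as the weighted scores attached to each rule, and their sum is the preference strength S(x). The counts and category names are made up for illustration.

```python
# Sketch of Step 3: Resnik-style selectional preference strength,
#   S(x) = sum_u P(u|x) * log( P(u|x) / P(u) ),
# computed from toy category counts for a lexical unit x.
import math
from typing import Dict

def weighted_scores(cat_counts_for_x: Dict[str, int],
                    overall_cat_counts: Dict[str, int]) -> Dict[str, float]:
    """Per-category term P(u|x) * log(P(u|x)/P(u)) for a lexical unit x."""
    total_x = sum(cat_counts_for_x.values())
    total = sum(overall_cat_counts.values())
    scores = {}
    for u, c in cat_counts_for_x.items():
        p_u_given_x = c / total_x
        p_u = overall_cat_counts[u] / total
        scores[u] = p_u_given_x * math.log(p_u_given_x / p_u)
    return scores

# Toy numbers for the unit x = "harley chicks with"
per_cat = weighted_scores({"PORN": 8, "AUTOS": 1},
                          {"PORN": 50, "AUTOS": 400, "TRAVEL": 550})
strength = sum(per_cat.values())   # S(x): summed over the categories seen with x
```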

  23. Selectional Preferences: Step 4 • Use the mined preferences and weighted scores from Steps 2 and 3 to assign classifications to unseen queries
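A sketch of Step 4: an unseen query is split into its <head, tail> pairs, each split is matched against the mined forward and backward rules, and rule weights are accumulated per category. The rules and the score threshold below are illustrative (compare the examples on the next slide).

```python
# Sketch of Step 4: classify an unseen query by matching its <head, tail> splits
# against mined rules and accumulating rule weights per category.
from collections import defaultdict

forward_rules  = {"harley all stainless": {"AUTOS": 3.448, "SHOPPING": 0.021}}  # head -> category weights
backward_rules = {"getaway bargain": {"PLACES": 0.877, "TRAVEL": 0.862}}        # tail -> category weights

def sp_rule_classify(query: str, threshold: float = 0.5):
    terms = query.lower().split()
    scores = defaultdict(float)
    for i in range(1, len(terms)):
        head, tail = " ".join(terms[:i]), " ".join(terms[i:])
        for cat, w in forward_rules.get(head, {}).items():   # forward rule: head predicts the tail's topic
            scores[cat] += w
        for cat, w in backward_rules.get(tail, {}).items():  # backward rule: tail predicts the head's topic
            scores[cat] += w
    return {cat for cat, s in scores.items() if s >= threshold}

print(sp_rule_classify("harley all stainless exhaust"))   # -> {'AUTOS'}
```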

  24. Selectional Preference Rule Examples • Forward Rules: • harlem club X: ENT->0.722, PLACES->0.378, TRAVEL->1.531 • harley all stainless X: AUTOS->3.448, SHOPPING->0.021 • harley chicks with X: PORN->5.681 • Backward Rules: • X gets hot wont start: AUTOS->2.049, PLACES->0.594 • X getaway bargain: PLACES->0.877, SHOPPING->0.047, TRAVEL->0.862 • X getaway bargain hotel and airfare: PLACES->0.594, TRAVEL->2.057

  25. Combined Approach • Each approach exploits different qualities of our query stream • A natural next step is to combine them • How similar are the approaches?
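A one-function sketch of the disjunctive combination: a query receives a category if any component classifier assigns it. The components are passed in as callables returning category sets (for example, the exact-match, perceptron, and rule-based sketches above).

```python
# Sketch of the disjunctive combination: union of the category sets
# produced by each component classifier.
from typing import Callable, Iterable, Set

def combined_classify(query: str,
                      classifiers: Iterable[Callable[[str], Set[str]]]) -> Set[str]:
    categories: Set[str] = set()
    for clf in classifiers:
        categories |= clf(query)   # a single positive vote is enough (disjunction)
    return categories
```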

  26. Evaluation Metrics • Classification Precision: • #true positives / (#true positives + #false positives) • Classification Recall: • #true positives / (#true positives + #false negatives) • F-Measure: • F_beta = (1 + beta^2) * Precision * Recall / (beta^2 * Precision + Recall) • Higher values of beta put more emphasis on recall
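A tiny sketch of these metrics computed from raw counts; the numbers are made up for illustration.

```python
# Precision, recall, and F-measure from raw counts (illustrative numbers).
def precision_recall_f(tp: int, fp: int, fn: int, beta: float = 1.0):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_beta = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
    return precision, recall, f_beta

print(precision_recall_f(tp=80, fp=20, fn=40, beta=1.0))  # balanced F1
print(precision_recall_f(tp=80, fp=20, fn=40, beta=2.0))  # beta > 1 emphasizes recall
```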

  27. Effectiveness of each approach

  28. Performance of Perceptron vs. SP Rules at varying levels of Beta

  29. Conclusions • Our system successfully makes use of large amounts of unlabeled data • The Selectional Preference rules allow us to classify a significantly larger portion of the query stream than manual efforts alone • Excellent potential for further improvements

  30. Future Work • Expand available classification features per query • Mine web query logs for related terms and patterns • More intelligent combination methods • Learned combination functions • Voting algorithms • Utilize external sources of information • Patterns and trends from query log analysis • Topical ontology lookups • Experiment using other datasets (KDD Cup) • Use automatic query classification to improve effectiveness and efficiency in a production search system
