Presentation Transcript


  1. Online Search and Advertising, Future and Present
Chris Burges, Microsoft Research
Saturday, Dec 13, 2008
Text Mining, Search and Navigation

  2. Contents
• Search and Advertising – some ideas
• Where are we headed?
• How to begin?
• Some new results on ranking: we can directly learn Information Retrieval measures
• Internet security and RSA: why worry?

  3. ~ Search and Advertising ~

  4. Why Search Works…
• Traditional: print, TV, radio, billboards, …
• Only very broadly targeted to demographics (some exceptions)
• Search is monetarily successful because advertising is more precisely targeted
• The Google model is giving, and will continue to give, traditional channels a run for their money

  5. Key Points
• The online experience will be more deeply engaging.

  6. What’s wrong with what we do now?
• Nothing, but… ten blue links + ads, ten years from now?
• Ads are ‘tacked on’ to the user experience.
• Paid Search / Contextual / Banner – all are still largely impersonal.
• But, Behavioral Targeting…

  7. How might ads be targeted better?
• I just bought a car – don’t show me more ads for cars
• I just bought a house – show me ads for furniture
• I like band X, but not Y
• In general, build a model of what I’m in the market for
• Per-user pricing, availability
• User-driven asks (show me all ads for Z)

  8. User Models
• User models can be used to enrich the online experience, not just advertising.
• Automated teaching
  • Need a model of the user’s understanding.
• Find other users with similar interests
• Tailor news presentation to the user’s interests

  9. Key Points
• The online experience will be more deeply engaging.
• We will need rich state models of users: likes, dislikes, ± interests, knowledge

  10. What About Search?

  11. Search: Somewhere in the Near Future
[Diagram: human–computer dialog over the indexed Web. Query intent distribution: 84% Info., 12% Nav., 4% Trans.; 78% Comm. Structured data supplies a distribution over intents; diversity; popular pages; aid for transactions; display.]

  12. [Figure]

  13. How do we get the information we need to build good models for users? Ask them!

  14. Key Points
• The online experience will be more deeply engaging.
• We will need rich state models of users: likes, dislikes, ± interests, knowledge, and more.
• Natural Language Processing will be key.

  15. Search Applications: And, Data Changes Everything
• Example: AskMSR (Brill, Dumais, Banko, ACL 2002)
• Commonly used resources for QA:
  • Part-of-speech tagger, parser, named-entity extractor, WordNet or other knowledge bases, passage or sentence retrieval, abduction, etc.
• AskMSR doesn’t use any of them
• Instead, AskMSR focuses on data:
  • There is a lot of data on the web – use it
  • Redundancy is a resource to be exploited
• Data-driven QA: simple techniques, lots of data
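The redundancy idea can be sketched in a few lines. This is a minimal illustration, not the AskMSR system: query rewriting and retrieval are omitted, and the snippets below are hard-coded stand-ins for real search results; only the n-gram voting step is shown.

```python
from collections import Counter
import re

def mine_answer(snippets, max_n=3):
    """Vote for the most redundant n-gram across snippets."""
    votes = Counter()
    for text in snippets:
        words = re.findall(r"[a-z']+", text.lower())
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                votes[" ".join(words[i:i + n])] += 1
    # Filter stop words so trivial grams don't win the vote.
    stop = {"the", "a", "of", "in", "is", "was", "on", "to"}
    candidates = {g: c for g, c in votes.items()
                  if not set(g.split()) & stop}
    return max(candidates, key=candidates.get)

# Toy snippets standing in for real search results (illustrative data):
snippets = [
    "Mount Everest is the tallest mountain on Earth.",
    "The tallest mountain is Mount Everest.",
    "Everest, the tallest mountain, lies in the Himalayas.",
]
print(mine_answer(snippets))
```

The point of the slide carries through even in this toy: no parser, no knowledge base, just counting over redundant text.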

  16. [Figure]

  17. Data Changes Everything
Banko and Brill, Mitigating…, HLT 2001

  18. Data Changes Everything
Banko and Brill, Scaling…, 2001

  19. Key Points
• The online experience will be more deeply engaging.
• We will need rich state models of users: likes, dislikes, ± interests, knowledge, and more.
• Natural Language Processing will be key.
• “Search” can be the engine under the hood for many different applications.
• It’s better to use tons of data with simple models than smaller datasets with complex models.

  21. How to proceed?
• Don’t know. But: Sam, a Search Chatbot.
• Provide an engaging chat experience
• Use Search to show images, URLs, videos, …
• Will build persistent user world models
• Will have its own world model
• Can show precisely targeted ads
• Will leverage social networks

  22. The Eliza Effect
• Eliza: J. Weizenbaum, 1966 (!)
• Demonstrated that extremely simple techniques can result in compelling dialog (sometimes, for some users)
• Users tend to anthropomorphize computer behavior
• This gives us an advantage

  23. Our Prime Directive in Building Sam: use as little supervision as possible.

  24. Let the Data do the Work
• anarchism – categories: anarchism; political ideologies; political philosophies; social philosophy
• autism – categories: autism; pervasive developmental disorders; childhood psychiatric disorders; communication disorders; neurological disorders
• albedo – categories: electromagnetic radiation; climatology; climate forcing; scattering, absorption and radiative transfer (optics); radiometry
• abudhabi – categories: abudhabi; capitals in asia; cities in the united arab emirates; coastal cities
• a – categories: latin letters; vowel letters
Robert Rounthwaite, TMSN

  25. Using Category Graphs to Drive Dialog
Use the ODP and Wikipedia hierarchies to construct a graph
• User: I like ferrets.
• Ferret: category: animals people keep as pets
• Animals people keep as pets: rabbits
• Sam: Do you like rabbits, too?
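The ferret → rabbits hop can be sketched directly. The toy graph and function below are hypothetical, standing in for real ODP/Wikipedia category data:

```python
# Toy category graph (hypothetical edges in the style of ODP/Wikipedia):
CATEGORY_OF = {
    "ferret": ["animals people keep as pets"],
    "rabbit": ["animals people keep as pets"],
    "hamster": ["animals people keep as pets"],
}
# Invert to map each category to its member items.
MEMBERS_OF = {}
for item, cats in CATEGORY_OF.items():
    for cat in cats:
        MEMBERS_OF.setdefault(cat, []).append(item)

def suggest_followup(liked_item):
    """Walk item -> shared category -> sibling item to drive a question."""
    for cat in CATEGORY_OF.get(liked_item, []):
        for sibling in MEMBERS_OF.get(cat, []):
            if sibling != liked_item:
                return f"Do you like {sibling}s, too?"
    return None

print(suggest_followup("ferret"))
```

The same one-hop walk generalizes to any node pair sharing a category, which is what makes the graph a cheap dialog driver.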

  26. Use Category Graphs to Build Models
“World model” for both the user and for Sam
• Attach a vector to each node, sparsely:
  • [like/dislike; interested/not; knows about; …]
  • Each component has a confidence level
• Leverage the graph structure to explore
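One minimal way to realize sparse per-node vectors with confidences; the class and field names below are invented for illustration, following the slide's [like/dislike; interested/not; knows about] sketch:

```python
from dataclasses import dataclass

@dataclass
class NodeState:
    # Each component is a (value, confidence) pair.
    like: tuple = (0.0, 0.0)      # sentiment in [-1, 1], confidence in [0, 1]
    interest: tuple = (0.0, 0.0)
    knows: tuple = (0.0, 0.0)

class WorldModel:
    """Sparse map: only nodes we have evidence about get a state."""
    def __init__(self):
        self.nodes = {}

    def observe_like(self, topic, sentiment, confidence):
        state = self.nodes.setdefault(topic, NodeState())
        state.like = (sentiment, confidence)

    def least_certain(self):
        # A known node with low confidence: worth exploring next.
        return min(self.nodes, key=lambda t: self.nodes[t].like[1])

user = WorldModel()
user.observe_like("ferrets", +1.0, 0.9)   # stated directly by the user
user.observe_like("rabbits", +0.5, 0.2)   # inferred via a shared category
print(user.least_certain())
```

The `least_certain` query is the hook for the "leverage graph structure to explore" bullet: the bot asks about low-confidence nodes near high-confidence ones.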

  27. Other Useful Sources of Data
[Figure: bipartite click graph linking queries q_i to URLs u_j]
Q. Mei, D. Zhou, K. Church, Query Suggestion Using Hitting Time, CIKM 2008
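The hitting-time idea from the cited CIKM paper can be sketched with a Monte Carlo random walk on a toy click graph. The graph and parameters below are made up, and the paper computes hitting times iteratively rather than by sampling; this is only meant to show the quantity being estimated:

```python
import random

# Hypothetical bipartite click graph: queries <-> clicked URLs.
EDGES = {
    "q:cheap flights": ["u:kayak.com", "u:expedia.com"],
    "q:airline tickets": ["u:expedia.com"],
    "u:kayak.com": ["q:cheap flights"],
    "u:expedia.com": ["q:cheap flights", "q:airline tickets"],
}

def hitting_time(start, target, trials=2000, max_steps=50, seed=0):
    """Monte Carlo estimate of the expected number of steps from start
    to target under a uniform random walk on the click graph."""
    rng = random.Random(seed)
    total, hits = 0, 0
    for _ in range(trials):
        node = start
        for step in range(1, max_steps + 1):
            node = rng.choice(EDGES[node])
            if node == target:
                total += step
                hits += 1
                break
    return total / hits if hits else float("inf")

# Queries with small hitting time to a seed query make good suggestions.
t = hitting_time("q:airline tickets", "q:cheap flights")
print(t)
```

Here the two queries share a clicked URL, so the walk reaches the target quickly; unrelated queries would have much larger (or unbounded) hitting times.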

  28. Temporal Querying Behavior
M. Richardson, ACM Transactions on the Web, 2008

  29. We Are Not Alone
• http://quizbot.trueknowledge.com/
• http://www.jabberwacky.com/
Jab: Do you like computers?
Me: Yes, but only when they are turned on.
Jab: I like ferraris, do you like them also?
Me: Not really. Why do you like them?
Jab: Yes.
Me: Why do you like them?
Jab: No, I like going to mars.

  30. One Possible Sentence Generator
• Inputs:
  • Sentiment
  • Distribution over topics under discussion
  • Features from recent sentences
  • Sentence or phrase database (with statistics)
  • Distributions over the user’s likes / interests, etc.
  • Close or popular nodes where the bot lacks knowledge of the user
  • Topic priors
• Output: ranked sentences
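A generator of this shape ultimately needs a ranker over candidate outputs. A linear-scoring sketch, where the feature names, weights, and feature values are all invented for illustration:

```python
def score_sentence(feats):
    """Linear score over hypothetical features; the weights are made up."""
    w = {"topic_overlap": 2.0, "sentiment_match": 1.0, "novelty": 0.5}
    return sum(w[k] * feats.get(k, 0.0) for k in w)

# Candidate sentences with hand-assigned feature values (illustrative only):
candidates = [
    ("Do you like rabbits, too?",
     {"topic_overlap": 0.9, "sentiment_match": 1.0, "novelty": 0.3}),
    ("Tell me about your day.",
     {"topic_overlap": 0.1, "sentiment_match": 0.5, "novelty": 0.8}),
]
ranked = sorted(candidates, key=lambda c: score_sentence(c[1]), reverse=True)
print(ranked[0][0])
```

In a real system the weights would be learned (e.g. by the learning-to-rank methods later in the talk) rather than hand-set.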

  31. New Challenges for Machine Learning
• How can we teach a chatbot to talk?
  • “Good / bad response” buttons: reinforcement learning?
  • ESP-like games to label data for learning to rank sentences?
  • Build natural sentences from phrases?
• How can we learn effective user models?
  • Combine models from multiple users to form good priors
  • Use active learning during chat to reduce uncertainty in the user’s model

  32. Demo
Joint work with Scott Imig and Silviu Cucerzan
S. Cucerzan, Large-Scale Named Entity Disambiguation Based on Wikipedia Data, Proc. 2007 Joint Conference on EMNLP and CoNLL

  33. ~ Some New Results on Ranking ~

  34. Empirical Optimality of λ-Rank
Joint work with:
• Pinar Donmez (CMU)
• Krysta Svore (MSR)
• Yisong Yue (Cornell)

  35. Some IR Measures
• Precision: P = (# relevant documents retrieved) / (# documents retrieved); Recall: R = (# relevant documents retrieved) / (# relevant documents)
• Average Precision: compute the precision at the position of each relevant document, and average over those positions
• Mean Average Precision: average AP over queries
• Mean Reciprocal Rank (TREC QA): 1 / (rank of the first relevant document), averaged over queries
• Mean NDCG: NDCG = (1/Z) Σ_i (2^{l_i} − 1) / log(1 + i), where l_i is the label of the document at rank i and Z normalizes so the ideal ordering scores 1, averaged over queries
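The listed measures are easy to pin down in code. A sketch using standard definitions; the 2^label − 1 gain and log2 position discount for NDCG follow common practice and are an assumption here:

```python
import math

def average_precision(labels):
    """labels: binary relevance values in ranked order."""
    hits, precisions = 0, []
    for i, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            precisions.append(hits / i)  # precision at each relevant position
    return sum(precisions) / hits if hits else 0.0

def reciprocal_rank(labels):
    for i, rel in enumerate(labels, start=1):
        if rel:
            return 1.0 / i
    return 0.0

def ndcg(gains, k=None):
    """gains: graded relevance labels in ranked order, normalized by ideal DCG."""
    k = k or len(gains)
    def dcg(g):
        return sum((2**g[i] - 1) / math.log2(i + 2)
                   for i in range(min(k, len(g))))
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

ranked = [1, 0, 1, 0]              # binary labels for AP / MRR
print(average_precision(ranked))   # (1/1 + 2/3) / 2
print(reciprocal_rank(ranked))
print(ndcg([3, 2, 0, 1]))
```

Averaging each function over queries gives MAP, MRR, and mean NDCG respectively.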

  36. IR Measures, cont.
These measures:
• Depend only on the labels and the sorted order of the documents
• Viewed as a function of the scores output by some model, are everywhere either flat or discontinuous
• SVM MAP: Yue et al., SIGIR ’07
• Tao Qin, Tie-Yan Liu, Hang Li, MSR Tech Report 164 (2008)

  37. LambdaRank: Background

  38. The RankNet Cost
Modeled posteriors: P_ij ≡ P(d_i ranked above d_j)
Target posteriors: P̄_ij
Define o_ij ≡ s_i − s_j, the difference of the model scores
Cross entropy cost: C_ij = −P̄_ij log P_ij − (1 − P̄_ij) log(1 − P_ij)
Model output probabilities using the logistic: P_ij = 1 / (1 + e^{−o_ij})
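A sketch of the RankNet pairwise cost and its gradient, using a logistic of the score difference s_i − s_j as in the published RankNet formulation (variable names here are illustrative):

```python
import math

def ranknet_cost(s_i, s_j, p_target):
    """Cross entropy between the target posterior and the logistic of
    the score difference o_ij = s_i - s_j."""
    o = s_i - s_j
    p = 1.0 / (1.0 + math.exp(-o))     # modeled P(i ranked above j)
    return -p_target * math.log(p) - (1 - p_target) * math.log(1 - p)

def ranknet_grad(s_i, s_j, p_target):
    """dC/ds_i (and -dC/ds_j); the logistic makes this a simple difference."""
    o = s_i - s_j
    return 1.0 / (1.0 + math.exp(-o)) - p_target

# If i should rank above j (target posterior 1), the cost falls as the
# score gap s_i - s_j grows, and rises when the pair is inverted:
print(ranknet_cost(2.0, 0.0, 1.0))  # small
print(ranknet_cost(0.0, 2.0, 1.0))  # large
```

Note the cost is smooth in the scores, which is exactly what the raw IR measures on the previous slide are not.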

  39. [Figure]

  40. RankNet Cost ~ Pairwise Cost

  41. Pairwise Cost Revisited
Pairwise cost is fine if there are no errors, but:
[Figure: two example rankings of the same documents, with 13 and 11 pairwise errors respectively]

  42. LambdaRank
Instead of using a smooth approximation to the cost and taking derivatives, write down the derivatives directly. Then use these derivatives to train a model using gradient descent, as usual.

  43. The Lambda Function
The NDCG gain from swapping the members of a pair of docs, multiplied by the RankNet cost gradient as a smoother: for a pair in which document i is labeled higher than document j,
λ_ij = −|ΔNDCG_ij| / (1 + e^{s_i − s_j})
Let H_i (L_i) be the set of documents labeled higher (lower) than document i:
λ_i = Σ_{j∈L_i} λ_ij − Σ_{j∈H_i} λ_ji
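A sketch of the lambda computation for a single pair, following the published LambdaRank gradient; the gains and scores below are toy values:

```python
import math

def delta_ndcg(gains, i, j):
    """|Change in NDCG| if the documents at ranks i and j swap (0-indexed),
    using the 2^gain - 1 / log2 discount convention."""
    def disc(r):
        return 1.0 / math.log2(r + 2)
    ideal = sum((2**g - 1) * disc(r)
                for r, g in enumerate(sorted(gains, reverse=True)))
    return abs((2**gains[i] - 2**gains[j]) * (disc(i) - disc(j))) / ideal

def lambda_ij(s_i, s_j, dZ):
    """RankNet gradient as the smoother, scaled by the NDCG swap delta.
    Convention: i is the document labeled more relevant in the pair."""
    return -dZ / (1.0 + math.exp(s_i - s_j))

# Doc at rank 0 has gain 0 but the higher score; doc at rank 1 has gain 3.
gains = [0, 3, 1]
dZ = delta_ndcg(gains, 0, 1)
lam = lambda_ij(s_i=1.0, s_j=2.0, dZ=dZ)  # s_i: score of the gain-3 doc
# lam < 0: gradient descent (s_i -= eta * lam) pushes the better doc up.
print(dZ, lam)
```

Summing these pairwise lambdas over all pairs involving a document gives that document's total force, as on the slide above.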

  44. Lambda Functions for MAP, MRR
[Equations: the same construction, with the NDCG swap delta replaced by the corresponding change in MAP or MRR]

  45. Local Optimality
• Check that the gradient vanishes at the solution.
• Get a bound on the probability that we are not at a local max, using a one-sided Monte Carlo test: sample k random directions and check that none is an ascent direction.
• If ascent directions occupy a fraction μ of the sphere of directions, then P(we miss an ascent direction despite k trials) = (1 − μ)^k.
• How large must k be for this probability to fall below δ? Answer: k ≥ log δ / log(1 − μ).
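The one-sided Monte Carlo test can be sketched on a toy objective. The objective and all parameters below are stand-ins; the real test probes the trained model's test-set IR measure rather than a closed-form function:

```python
import math, random

def is_probably_local_max(f, w, k=1000, eps=1e-3, seed=0):
    """Sample k random directions on the sphere and report whether none
    gives ascent. If ascent directions occupy a fraction mu of the
    sphere, the chance all k samples miss is (1 - mu)**k."""
    rng = random.Random(seed)
    base = f(w)
    for _ in range(k):
        d = [rng.gauss(0, 1) for _ in w]          # isotropic direction
        norm = math.sqrt(sum(x * x for x in d))
        probe = [wi + eps * di / norm for wi, di in zip(w, d)]
        if f(probe) > base:                       # found an ascent direction
            return False
    return True

# Toy objective with a maximum at the origin (stand-in for test NDCG):
f = lambda w: -sum(x * x for x in w)
print(is_probably_local_max(f, [0.0, 0.0]))   # at the max: no ascent found
print(is_probably_local_max(f, [0.5, 0.0]))   # off the max: ascent exists
```

The test is one-sided: a False answer is conclusive, while a True answer only bounds the miss probability as on the slide.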

  46. Data Sets
• Artificial: 300 features, 50 URLs/query, 10k/5k/10k train/valid/test split
• Web 1: 420 features, 26 URLs/query, 10k/5k/10k split
• Web 2: 30k/5k/10k split

  47. Which function to choose?
• LocalGradient: finite element estimate of the gradient, with margin
• LocalCost: estimate the local gradient using neighbors + weighted RankNet cost
• SpringSmooth: smoother version of RankNetWeightPairs
• DiscreteGradient: finite element estimate using the optimal position

  48. [Plots: results on the 10K Web, 30K Web, and Artificial datasets]

  49. Sample Size Matters
• The number of pairs drops by more than a factor of 2 for MRR and MAP
• For MRR, the number of samples drops much further

  50. IR Measure Optimality - Conclusions
• Typically, IR practitioners would train models with small numbers of ‘smart’ features (e.g. BM25) and perform a grid search
• However, adding many weak features improves performance
• We have shown that the LambdaRank gradients optimize three IR measures directly
