Task-aware query recommendation

Task-aware query recommendation. Henry Feild and James Allan, Center for Intelligent Information Retrieval, University of Massachusetts Amherst. July 29, 2013, Dublin, Ireland.

Query: when was the dsec created
Related searches: dsec info; dsec delaware solar energy coalition created; deaf smith electric cooperative created; dayton service engineers collaborative created
Results:
- Delaware Solar Energy Coalition: The Delaware Solar Energy Coalition (DSEC) was founded… http://delsec.org
- Deaf Smith Electric Cooperative – Home: The Deaf Smith Electric Cooperative (DSEC). http://dsec.org
Query history: what is the dupont science essay contest; when was the dsec created
Related searches: dupontchallenge information; dupont essay submission; dupont challenge history; deaf smith electric cooperative created; dayton service engineers collaborative created
Results:
- DuPont Challenge: The DuPont Challenge calls on students to research, think critically… http://thechallenge.dupont.com
- DuPont: For more than 200 years, DuPont has brought world-class science and engineering… http://www.dupont.com

Potential benefits of context
- Find key terms/phrases
- Disambiguation
- Term expansion
Example query sequences: crisis intervention professional services, crisis intervention services; object management group, andrewwatson, watsonomg; elliptical trainer, elliptical trainer benefits; michworks, michigan unemployed

Likelihood of x-length tasks
57% of AOL tasks consist of 2 or more queries. Based on a sample of 503 AOL users; tasks were manually labeled by UMass students.

A couple of issues
- Interleaved tasks
- Multi-tasking within a fixed window
Example history: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
You are more likely than not to encounter off-task queries even going back one query!
Data: aggregated over 503 AOL users (3 months of AOL data); 3 days of Yahoo! data, Jones & Klinkner (CIKM 2009)

Ideal system
- Identify on-task queries
- Ignore off-task queries
Example history: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
Grouped by task: (pampered chef, hosting pampered chef) and (elliptical trainer, elliptical trainer benefits)

Research questions
1. Does on-task context help?
2. How much does off-task context hurt?
3. How well does current technology deal with mixed contexts?
Example history: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits

Basic models: reference query only
Context: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits (reference query)
Recommender: Term-Query Graph Query Recommender (Bonchi et al. SIGIR’12)
Recommendations (score, suggestion):
- 0.6 the benefits of an elliptical trainer
- 0.5 image 8.0 elliptical trainer
- 0.4 total trainer pro
- 0.4 total trainer pro reviews

Basic models: decay model
Context: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
Reference query only: 0.6 the benefits of an elliptical trainer; 0.5 image 8.0 elliptical trainer; 0.4 total trainer pro; 0.4 total trainer pro reviews
Decay model, weights shown on the slide: λ: 0.4, λ: 0.6, λ: 0.8, λ: 1.0
Recommendation lists shown on the slide (score, suggestion), in slide order:
- 0.7 pampered chef merchandise; 0.4 pampered chef parties; 0.3 cooking supplies; 0.3 pampered chef stuff
- 0.6 orbitrek elliptical trainer; 0.4 image 8.0 elliptical trainer; 0.4 total trainer pro; 0.2 total trainer pro reviews
- 0.6 pampered chef parties; 0.3 pampered chef parties; 0.2 cooking supplies; 0.2 pampered chef stuff
- 0.6 the benefits of an elliptical trainer; 0.5 image 8.0 elliptical trainer; 0.4 total trainer pro; 0.4 elliptical trainer vs treadmill
- 0.2 elliptical trainer vs treadmill; 0.2 the benefits of an elliptical trainer; 0.1 total trainer pro; 0.1 orbitrek elliptical trainer
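The decay model can be sketched as a weighted merge of per-query recommendation lists. This is only an illustration: the slide does not give the decay formula or the rule for combining scores, so the exponential decay in `decay_weights`, the parameter `base`, and the score summation in `merge_recommendations` are all assumptions, and both function names are hypothetical.

```python
from collections import defaultdict


def decay_weights(n_queries, base=0.8):
    """Assumed exponential decay: the newest query (the reference query,
    last in the list) gets weight 1.0; older queries get smaller weights."""
    return [base ** (n_queries - 1 - i) for i in range(n_queries)]


def merge_recommendations(per_query_recs, weights):
    """Combine per-query recommendation scores by a weighted sum (assumed rule).

    per_query_recs: one dict {suggestion: score} per context query,
                    ordered oldest to newest.
    weights:        one weight per context query, same order.
    Returns (suggestion, combined_score) pairs, best first.
    """
    combined = defaultdict(float)
    for recs, weight in zip(per_query_recs, weights):
        for suggestion, score in recs.items():
            combined[suggestion] += weight * score
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
```

With a small `base`, older queries contribute little; with `base=1.0` every query in the context counts equally, so off-task queries are not discounted by the decay at all.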

Task-aware models
Context: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
Same task? (Lucchese et al. WSDM’11); same-task scores shown on the slide: 0.01, 0.90, 0.02, 1.0
Hard task threshold model: weight = task_decay (decay over on-task queries only)
Soft task threshold model: weight = decay × same_task_score
Slide annotations for the soft model: λ: 0.1, λ: 0.8, λ: 0.5, λ: 0.1, λ: 1.0, λ: 1.0
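The two weighting rules above can be sketched as follows. The slide names the formulas but not their parameters, so the exponential decay, its `base`, the `threshold` used to call a query on-task, and both function names are assumptions made for illustration. Scores are ordered oldest to newest, with the reference query last.

```python
def soft_task_weights(same_task_scores, base=0.8):
    """Soft task threshold: weight = decay * same_task_score.
    Decay is computed over all context queries (assumed exponential)."""
    n = len(same_task_scores)
    return [base ** (n - 1 - i) * s for i, s in enumerate(same_task_scores)]


def hard_task_weights(same_task_scores, threshold=0.5, base=0.8):
    """Hard task threshold: weight = task_decay, i.e. decay counted over
    on-task queries only. Queries scoring below the (assumed) threshold
    are treated as off-task and get weight 0."""
    weights = [0.0] * len(same_task_scores)
    on_task_distance = 0  # number of on-task queries newer than this one
    for i in range(len(same_task_scores) - 1, -1, -1):
        if same_task_scores[i] >= threshold:
            weights[i] = base ** on_task_distance
            on_task_distance += 1
    return weights
```

The difference shows up when off-task queries sit between on-task ones: the hard model skips them when counting distance, while the soft model still ages every query by its raw position and lets low-scoring queries keep a small nonzero weight.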

Task-aware models (cont’d)
Context: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
Firm task threshold model 1: weight = decay × same_task_score if on-task; 0 otherwise (soft task score for on-task queries; 0 for off-task queries)
Firm task threshold model 2: weight = task_decay × same_task_score
Values shown on the slide: 0.0 for off-task queries; same-task scores 0.01, 0.90, 0.02, 1.0; other values 0.7, 0.5
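A minimal sketch of the two firm models, under the same assumptions as any decay illustration here: exponential decay with a hypothetical `base`, a hypothetical `threshold` deciding on-task versus off-task, and hypothetical function names. It also assumes that firm model 2, like model 1, gives off-task queries weight 0, which the slide implies but does not state outright.

```python
def firm1_weights(same_task_scores, threshold=0.5, base=0.8):
    """Firm model 1: decay * same_task_score for on-task queries, else 0.
    Decay is computed over every position in the context."""
    n = len(same_task_scores)
    return [
        base ** (n - 1 - i) * s if s >= threshold else 0.0
        for i, s in enumerate(same_task_scores)
    ]


def firm2_weights(same_task_scores, threshold=0.5, base=0.8):
    """Firm model 2: task_decay * same_task_score.
    Decay is counted over on-task queries only; off-task queries get 0."""
    weights = [0.0] * len(same_task_scores)
    on_task_distance = 0  # number of on-task queries newer than this one
    for i in range(len(same_task_scores) - 1, -1, -1):
        score = same_task_scores[i]
        if score >= threshold:
            weights[i] = base ** on_task_distance * score
            on_task_distance += 1
    return weights
```

Both variants drop off-task queries entirely; they differ only in whether an intervening off-task query makes an older on-task query look further away.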

Data and evaluation
Query recommendations: 2006 AOL search log used to build a Term-Query Graph
Tasks: 2010-2011 TREC Session Track
On-task contexts
 - 212 judged sessions
 - 2+ queries/session
 - task = session
Off-task contexts
Mixed contexts
Evaluation
 - When’s the soonest a user can find a novel document?
 - MRR over documents retrieved with the top scoring recommendation (ClueWeb ‘09)
 - removed documents retrieved in top 10 of context queries
Example: recommendations 0.6 the benefits of an elliptical trainer; 0.5 image 8.0 elliptical trainer; 0.4 total trainer pro; 0.4 elliptical trainer vs treadmill. Documents retrieved from ClueWeb: elliptical-trainer.com, sportsequipment.com, totaltrainer.com, …
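The evaluation steps above can be sketched as a reciprocal-rank computation over novel documents. This is an illustration under assumptions: relevance judgments are taken as a plain set of document IDs, a session with no relevant novel document scores 0, and the function names are hypothetical.

```python
def reciprocal_rank_novel(ranked_docs, relevant_docs, seen_docs):
    """Reciprocal rank of the first relevant document among those retrieved
    with the top-scoring recommendation, after removing documents already
    retrieved in the top 10 of the context queries (seen_docs)."""
    novel = [doc for doc in ranked_docs if doc not in seen_docs]
    for rank, doc in enumerate(novel, start=1):
        if doc in relevant_docs:
            return 1.0 / rank
    return 0.0  # assumed convention when nothing relevant and novel is found


def mean_reciprocal_rank(reciprocal_ranks):
    """Average the per-session reciprocal ranks into MRR."""
    values = list(reciprocal_ranks)
    return sum(values) / len(values) if values else 0.0
```

Removing seen documents before ranking is what makes the measure reward novelty: a recommendation that only resurfaces pages the user has already been shown earns nothing.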

Effect of on-/off-task context
1. Does on-task context help? Yes, on average.
2. How much does off-task context hurt? Off-task context hurts, on average.
[Plot. Series: on-task context; reference-query only; off-task context. X-axis: how far back in the user’s history we look, including the reference query.]

Effect of noise (mixed contexts)
3. How well does current technology deal with mixed contexts? Pretty well.
- Firm 1 & 2 and Hard Task Threshold models are robust to noise
- Soft task threshold model is easily distracted by noise
- Decay model can’t handle even 1 noisy query
[Plot. Series: Firm1-, Firm2-, and hard-task; reference-query only; soft-task; decay.]
Lucchese et al. (WSDM 2011)

Variance across task set
[Plots. Y-axis: MRR difference. X-axis: tasks. Annotations: some really high highs; some really low lows; several moderate lows; one tiny little gain.]

Conclusions
• on-task context can be very helpful
• it can also hurt
• off-task context is bad
• current state-of-the-art task segmentation works well

Future work
• do these results hold for other query recommendation algorithms?
• are our findings consistent across additional tasks?
• can we predict when context will help?

Thanks!

Example
Context (same-task score), most recent first:
5. (1.00) satellite internet providers
4. (0.44) hughes internet
3. (0.02) ocdbeckham
2. (0.04) reviews xc90
1. (0.04) buy volvo semi trucks
Reference-query only recommendations (MRR, suggestion): 0.08 alabama satellite internet providers; 0.25 sattelite internet; 1.00 satellite internet providers; 0.02 satelite internet for 30.00; 0.06 satellite internet providers northern california
Decay recommendations (MRR, suggestion): 0.00 2006 volvo xc90 reviews; 0.00 volvo xc90 reviews; 1.00 hughesinternet.com; 1.00 hughessatelite internet; 0.08 alabama satellite internet providers
Hard task recommendations (MRR, suggestion): 1.00 hughesinternet.com; 1.00 hughessatelite internet; 0.08 alabama satellite internet providers; 0.17 satellite high speed internet; 0.25 sattelite internet

Data and evaluation
Query recommendations: 2006 AOL search log used to build a Term-Query Graph
Tasks: 2010-2011 TREC Session Track
On-task contexts
 - 212 judged sessions
 - 2+ queries/session
 - task = session
Off-task contexts
 - 50 × 212 sessions
 - all queries but the last are replaced by queries from other tasks
 - 50 samples per TREC session
Mixed contexts
 - 10 × 50 × 212 sessions
 - added up to 10 off-task queries per session
 - 50 samples per TREC session and noise level
Evaluation
 - MRR over documents retrieved with the top scoring recommendation (ClueWeb ‘09)
 - removed documents retrieved in top 10 of context queries
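The off-task and mixed context construction described above can be sketched as a sampling procedure. The slide does not say how replacement queries are drawn or where noise queries are placed, so uniform sampling with replacement, random insertion positions before the reference query, and both function names are assumptions.

```python
import random


def off_task_context(session, other_task_queries, rng):
    """Replace every query except the last (the reference query) with a
    query sampled from other tasks."""
    replaced = [rng.choice(other_task_queries) for _ in session[:-1]]
    return replaced + [session[-1]]


def mixed_context(session, other_task_queries, noise_level, rng):
    """Add `noise_level` off-task queries to the session's context.
    Positions are chosen at random before the reference query (assumed)."""
    context = list(session[:-1])
    for _ in range(noise_level):
        position = rng.randint(0, len(context))  # both bounds inclusive
        context.insert(position, rng.choice(other_task_queries))
    return context + [session[-1]]
```

Calling each function 50 times per session with a seeded `random.Random`, and repeating `mixed_context` for noise levels 1 through 10, would give sample counts matching the 50 × 212 and 10 × 50 × 212 figures on the slide.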
