Empirical Development of an Exponential Probabilistic Model

Empirical Development of an Exponential Probabilistic Model: Using Textual Analysis to Build a Better Model • Jaime Teevan & David R. Karger • CSAIL (LCS+AI), MIT

Goal: Better Generative Model • Generative v. discriminative model • Applies to many applications • Information retrieval (IR) • Relevance feedback • Using unlabeled data • Classification • Assumptions explicit

Using a Model for IR • Define model (hyper-learn) • Learn parameters from query • Rank documents • Better model improves applications • Trickle down to improve retrieval • Classification, relevance feedback, … • Corpus specific models

Overview • Related work • Probabilistic models • Example: Poisson Model • Compare model to text • Hyper-learning the model • Exponential framework • Investigate retrieval performance • Conclusion and future work

Related Work • Using text for retrieval algorithm • [Jones, 1972], [Greiff, 1998] • Using text to model text • [Church & Gale, 1995], [Katz, 1996] • Learning model parameters • [Zhai & Lafferty, 2002] • Hyper-learn the model from text!

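Probabilistic Models • Rank documents by $RV = \Pr(\text{rel} \mid d)$ • Naïve Bayesian models: $RV = \Pr(\text{rel} \mid d) \propto \Pr(d \mid \text{rel}) = \prod_{\text{features } t} \Pr(d_t \mid \text{rel})$, where the features are words and $d_t$ is the number of occurrences of $t$ in the document • Open assumptions • Feature definition • Feature distribution family • These define the model!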

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents • Poisson Model • θ: specifies term distribution • $\Pr(d_t \mid \text{rel}) = \dfrac{\theta^{d_t} e^{-\theta}}{d_t!}$
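A minimal sketch of the Poisson term model on this slide, written in Python. The function names are illustrative and not from the talk. The computation is done in log space, which is an implementation choice made here so that very small probabilities do not underflow.

```python
import math


def poisson_log_pmf(d_t: int, theta: float) -> float:
    """Log of Pr(d_t | rel) = theta**d_t * exp(-theta) / d_t!."""
    if d_t < 0:
        raise ValueError("d_t must be a non-negative count")
    if theta <= 0:
        raise ValueError("theta must be positive")
    # math.lgamma(d_t + 1) equals log(d_t!) and avoids huge factorials.
    return d_t * math.log(theta) - theta - math.lgamma(d_t + 1)


def poisson_pmf(d_t: int, theta: float) -> float:
    """Probability that a term occurs exactly d_t times."""
    return math.exp(poisson_log_pmf(d_t, theta))


if __name__ == "__main__":
    theta = 0.0006  # the rate that appears on the example slide
    for d_t in range(4):
        print(d_t, poisson_pmf(d_t, theta))
```

With a rate this small, almost all of the probability mass sits at zero occurrences, and each additional occurrence multiplies the probability by roughly θ divided by the count.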

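Example Poisson Distribution [Figure: Poisson distribution with θ = 0.0006; x-axis: term occurs exactly $d_t$ times; y-axis: $\Pr(d_t \mid \text{rel})$; annotation: $\Pr(d_t \mid \text{rel}) \approx$ 1E-15]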

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents • Learn a θ for each term • Maximum likelihood θ • Term’s average number of occurrences • Incorporate prior expectations
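The two learning options on this slide can be sketched as follows. The maximum likelihood θ for a Poisson model is the term's average count, as the slide states. The slide does not say which prior is used, so the Gamma prior below (shape `alpha`, rate `beta`) is an assumption, chosen because it is the conjugate prior for the Poisson rate. All function names are illustrative.

```python
from typing import Sequence


def theta_mle(counts: Sequence[int]) -> float:
    """Maximum likelihood theta: the term's average number of occurrences."""
    if not counts:
        raise ValueError("need at least one observed document")
    return sum(counts) / len(counts)


def theta_with_gamma_prior(counts: Sequence[int], alpha: float, beta: float) -> float:
    """Posterior mean of theta under an assumed Gamma(alpha, beta) prior.

    With a Gamma prior (shape alpha, rate beta) the posterior is
    Gamma(alpha + sum(counts), beta + len(counts)), whose mean is returned.
    """
    return (alpha + sum(counts)) / (beta + len(counts))


if __name__ == "__main__":
    observed = [0, 1, 2, 1]  # toy counts of one term in four documents
    print(theta_mle(observed))
    print(theta_with_gamma_prior(observed, alpha=0.5, beta=1.0))
```

With no observations the second function falls back to the prior mean `alpha / beta`, and with many observations it approaches the maximum likelihood estimate.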

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents • For each document, find $RV = \prod_{\text{words } t} \Pr(d_t \mid \text{rel})$ • Sort documents by RV
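The ranking step can be sketched as below, assuming the Poisson term model and a dictionary of already learned θ values. The data structures and names are hypothetical. The product over words is computed as a sum of logs, which gives the same ordering and avoids underflow. Terms without a learned θ are ignored, which is a simplification made for this sketch.

```python
import math
from typing import Dict, List, Tuple


def log_rv(doc_counts: Dict[str, int], thetas: Dict[str, float]) -> float:
    """log RV = sum over terms t of log Pr(d_t | rel) under the Poisson model."""
    total = 0.0
    for term, theta in thetas.items():
        d_t = doc_counts.get(term, 0)  # a term that is absent occurs zero times
        total += d_t * math.log(theta) - theta - math.lgamma(d_t + 1)
    return total


def rank(docs: Dict[str, Dict[str, int]],
         thetas: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (doc_id, log RV) pairs sorted from highest to lowest RV."""
    scored = [(doc_id, log_rv(counts, thetas)) for doc_id, counts in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    thetas = {"model": 2.0, "poisson": 1.0}  # toy parameters
    docs = {
        "a": {"model": 2, "poisson": 1},
        "b": {"model": 0},
    }
    for doc_id, score in rank(docs, thetas):
        print(doc_id, round(score, 3))
```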

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents • Which step goes wrong? • For each document, find $RV = \prod_{\text{words } t} \Pr(d_t \mid \text{rel})$ • Sort documents by RV

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents • $\Pr(d_t \mid \text{rel}) = \dfrac{\theta^{d_t} e^{-\theta}}{d_t!}$

How Good is the Model? [Figure: Poisson distribution with θ = 0.0006; x-axis: term occurs exactly $d_t$ times; y-axis: $\Pr(d_t \mid \text{rel})$; annotations: "15 times", "Misfit!"]
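One way to see this kind of misfit is to compare the observed fraction of documents at each count with the probability a fitted Poisson assigns to that count. The sketch below does this on invented toy counts, not on data from the talk: a term that is absent from almost every document but occurs 15 times in one of them.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def poisson_pmf(k: int, theta: float) -> float:
    """Pr(count = k) under a Poisson model with rate theta."""
    return math.exp(k * math.log(theta) - theta - math.lgamma(k + 1))


def compare_fit(counts: Sequence[int]) -> Tuple[float, List[Tuple[int, float, float]]]:
    """Fit theta by maximum likelihood and compare observed and predicted mass.

    Returns (theta, rows), where each row is
    (count, observed fraction of documents, Poisson probability).
    """
    n = len(counts)
    theta = sum(counts) / n
    observed = Counter(counts)
    rows = [(k, observed[k] / n, poisson_pmf(k, theta)) for k in sorted(observed)]
    return theta, rows


if __name__ == "__main__":
    # Invented toy data: 997 documents without the term, three with it.
    toy_counts = [0] * 997 + [1, 2, 15]
    theta, rows = compare_fit(toy_counts)
    print("theta =", theta)
    for k, obs, pred in rows:
        print(k, obs, pred)
```

On data shaped like this, the fitted Poisson assigns a vanishingly small probability to the bursty document, while the observed fraction is far larger. That gap is the heavy tail the Poisson cannot capture.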

Hyper-learning a Better Fit Through Textual Analysis: Using an Exponential Framework

Hyper-Learning Framework • Need framework for hyper-learning • Goal: Same benefits as Poisson Model • One parameter • Easy to work with (e.g., prior) [Diagram labels: Mixtures, Poisson, Bernoulli, Normal; One parameter exponential families]

Exponential Framework • Well understood, learning easy • [Bernardo & Smith, 1994], [Gous, 1998] • $\Pr(d_t \mid \text{rel}) = f(d_t)\, g(\theta)\, e^{\theta h(d_t)}$ • Functions $f(d_t)$ and $h(d_t)$ specify family • E.g., Poisson: $f(d_t) = (d_t!)^{-1}$, $h(d_t) = d_t$ • Parameter θ specifies the term’s specific distribution
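A small sketch of the one-parameter exponential form, assuming counts are truncated at `max_count` so that the normaliser $g(\theta)$ can be computed by direct summation (the truncation and all names are choices made for this sketch). Since the probabilities must sum to one, $g(\theta) = 1 / \sum_d f(d)\, e^{\theta h(d)}$. With the Poisson choices of $f$ and $h$ from the slide, θ in this form is the log of the Poisson mean.

```python
import math
from typing import Callable, List


def exp_family_pmf(f: Callable[[int], float],
                   h: Callable[[int], float],
                   theta: float,
                   max_count: int = 100) -> List[float]:
    """Return p with p[d] = f(d) * g(theta) * exp(theta * h(d)) for d in 0..max_count."""
    weights = [f(d) * math.exp(theta * h(d)) for d in range(max_count + 1)]
    g = 1.0 / sum(weights)  # g(theta) makes the probabilities sum to one
    return [g * w for w in weights]


def poisson_f(d: int) -> float:
    """f(d) = 1 / d!, as on the slide."""
    return math.exp(-math.lgamma(d + 1))


def poisson_h(d: int) -> float:
    """h(d) = d, as on the slide."""
    return float(d)


if __name__ == "__main__":
    mean = 3.0
    p = exp_family_pmf(poisson_f, poisson_h, theta=math.log(mean))
    # Compare with the familiar Poisson formula mean**d * exp(-mean) / d!.
    for d in range(4):
        print(d, p[d], mean ** d * math.exp(-mean) / math.factorial(d))
```

Swapping in different `f` and `h` functions gives a different family without changing any of the surrounding code, which is what makes the form convenient as a search space.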

Using a Hyper-learned Model • Define model • Learn parameters from query • Rank documents

Using a Hyper-learned Model • Hyper-learn model • Learn parameters from query • Rank documents

Using a Hyper-learned Model • Hyper-learn model • Learn parameters from query • Rank documents • Want “best” $f(d_t)$ and $h(d_t)$ • Iterative hill climbing • Local maximum • Poisson starting point
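The slide gives only the outline of the search: iterative hill climbing from a Poisson starting point to a local maximum. The sketch below is one possible reading, not the authors' procedure. It assumes that $\log f$ and $h$ are stored as tables over counts 0..`max_count`, that the objective is the total log-likelihood with θ refit per term, and that each move nudges one table entry by a fixed step. All of these details, and every name, are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def log_norm(log_f: List[float], h: List[float], theta: float) -> float:
    """Stable log of sum_k f(k) * exp(theta * h(k)), i.e. -log g(theta)."""
    vals = [lf + theta * hk for lf, hk in zip(log_f, h)]
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))


def term_loglik(counts: Sequence[int], log_f, h, theta: float) -> float:
    """Log-likelihood of one term's counts for a given theta."""
    lz = log_norm(log_f, h, theta)
    return sum(log_f[d] + theta * h[d] - lz for d in counts)


def best_term_loglik(counts, log_f, h, lo=-15.0, hi=15.0, iters=60) -> float:
    """Maximise over theta by ternary search (the log-likelihood is concave)."""
    for _ in range(iters):
        a, b = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if term_loglik(counts, log_f, h, a) < term_loglik(counts, log_f, h, b):
            lo = a
        else:
            hi = b
    return term_loglik(counts, log_f, h, (lo + hi) / 2)


def objective(terms: Sequence[Sequence[int]], log_f, h) -> float:
    """Total log-likelihood with the best theta chosen separately per term."""
    return sum(best_term_loglik(c, log_f, h) for c in terms)


def hill_climb(terms, max_count: int, step: float = 0.25,
               sweeps: int = 5) -> Tuple[List[float], List[float], float]:
    """Coordinate-wise hill climbing over the log f and h tables."""
    # Poisson starting point: f(d) = 1/d!, h(d) = d.
    log_f = [-math.lgamma(d + 1) for d in range(max_count + 1)]
    h = [float(d) for d in range(max_count + 1)]
    best = objective(terms, log_f, h)
    for _ in range(sweeps):
        improved = False
        for table in (log_f, h):
            for d in range(max_count + 1):
                old = table[d]
                for delta in (step, -step):
                    table[d] = old + delta
                    score = objective(terms, log_f, h)
                    if score > best + 1e-9:
                        best, improved = score, True
                        break  # keep the move
                    table[d] = old  # undo the move
        if not improved:
            break  # local maximum for this step size
    return log_f, h, best


if __name__ == "__main__":
    toy_terms = [[0, 0, 0, 1, 0, 5], [0, 1, 0, 0, 0, 0, 3]]  # invented counts
    _, _, score = hill_climb(toy_terms, max_count=5)
    print("final log-likelihood:", score)
```

Because every accepted move strictly improves the objective, the result is never worse than the Poisson starting point, but it is only a local maximum and depends on the step size and the order in which entries are visited.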

Using a Hyper-learned Model • Hyper-learn model • Learn parameters from query • Rank documents • Data: TREC query result sets • Past queries to learn about future queries • Hyper-learn and test with different sets

Recall the Poisson Distribution [Figure: x-axis: term occurs exactly $d_t$ times; y-axis: $\Pr(d_t \mid \text{rel})$; annotation: "15 times"]

Poisson Starting Point: $h(d_t)$ [Figure: $h(d_t)$ against $d_t$ for the model $\Pr(d_t \mid \text{rel}) = f(d_t)\, g(\theta)\, e^{\theta h(d_t)}$]

Hyper-learned Model: $h(d_t)$ [Figure: $h(d_t)$ against $d_t$ for the model $\Pr(d_t \mid \text{rel}) = f(d_t)\, g(\theta)\, e^{\theta h(d_t)}$]

Poisson Distribution [Figure: x-axis: term occurs exactly $d_t$ times; y-axis: $\Pr(d_t \mid \text{rel})$; annotation: "15 times"]

Hyper-learned Distribution [Figure: x-axis: term occurs exactly $d_t$ times; y-axis: $\Pr(d_t \mid \text{rel})$; annotation: "15 times"]

Performing Retrieval • Hyper-learn model • Learn parameters from query • Rank documents

Performing Retrieval (labeled docs) • Hyper-learn model • Learn parameters from query • Rank documents • $\Pr(d_t \mid \text{rel}) = f(d_t)\, g(\theta)\, e^{\theta h(d_t)}$ • Learn θ for each term

Learning θ • Sufficient statistics • Summarize all observed data • $\tau_1$: # of observations • $\tau_2 = \sum_{\text{observations } d} h(d_t)$ • Incorporating prior easy • Map $\tau_1$ and $\tau_2 \to \theta$ • 20 labeled documents
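The map from the two sufficient statistics to θ can be sketched as follows. For a one-parameter exponential family, the maximum likelihood θ is the value at which the model's expected $h$ equals the observed average $\tau_2 / \tau_1$, and that expectation never decreases as θ grows, so bisection is enough. Storing $\log f$ and $h$ as tables over a truncated count range, the search interval, and treating a prior as extra pseudo-observations (which presumes a conjugate prior) are assumptions of this sketch.

```python
import math
from typing import List, Tuple


def poisson_tables(max_count: int) -> Tuple[List[float], List[float]]:
    """log f and h tables for the Poisson case: f(d) = 1/d!, h(d) = d."""
    log_f = [-math.lgamma(d + 1) for d in range(max_count + 1)]
    h = [float(d) for d in range(max_count + 1)]
    return log_f, h


def expected_h(log_f: List[float], h: List[float], theta: float) -> float:
    """E[h(d)] under Pr(d) proportional to f(d) * exp(theta * h(d))."""
    vals = [lf + theta * hk for lf, hk in zip(log_f, h)]
    m = max(vals)  # subtract the maximum for numerical stability
    w = [math.exp(v - m) for v in vals]
    return sum(wi * hk for wi, hk in zip(w, h)) / sum(w)


def theta_from_stats(tau1: float, tau2: float, log_f, h,
                     lo: float = -30.0, hi: float = 30.0,
                     iters: int = 200) -> float:
    """Solve E_theta[h(d)] = tau2 / tau1 by bisection."""
    target = tau2 / tau1
    for _ in range(iters):
        mid = (lo + hi) / 2
        if expected_h(log_f, h, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    log_f, h = poisson_tables(60)
    # Toy statistics: 20 labeled documents containing 40 occurrences in total.
    print(theta_from_stats(20, 40.0, log_f, h))
    # Assumed prior worth 5 pseudo-documents with 5 occurrences in total.
    print(theta_from_stats(20 + 5, 40.0 + 5.0, log_f, h))
```

In the Poisson case the first call returns approximately log 2, the log of the average count, which matches the maximum likelihood estimate from the Poisson slides once θ is read as a natural parameter.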

Results: Labeled Documents [Figure: precision-recall plot; axes: Precision, Recall]

Performing Retrieval (short query) • Hyper-learn model • Learn parameters from query • Rank documents

Retrieval: Query • Query = single labeled document • Vector space-like equation: $RV = \sum_{t \in \text{doc}} a(t,d) + \sum_{q \in \text{query}} b(q,d)$ • Problem: Document dominates • Solution: Use only query portion • Another solution: Normalize
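The split between the document portion and the query portion of this score can be sketched as below. The slide does not define $a(t,d)$ or $b(q,d)$, so they are passed in as placeholder functions, and the toy scoring functions in the usage example are invented purely to make the sketch runnable. The normalization option is not sketched because the slide does not say what is normalized.

```python
from typing import Callable, Dict, Sequence

# A document is represented here as a mapping from term to occurrence count.
Doc = Dict[str, int]
TermScore = Callable[[str, Doc], float]


def rv_full(doc: Doc, query: Sequence[str], a: TermScore, b: TermScore) -> float:
    """RV = sum over t in doc of a(t, d) + sum over q in query of b(q, d)."""
    doc_part = sum(a(t, doc) for t in doc)
    query_part = sum(b(q, doc) for q in query)
    return doc_part + query_part


def rv_query_only(doc: Doc, query: Sequence[str], b: TermScore) -> float:
    """Keep only the query portion so a long document cannot dominate."""
    return sum(b(q, doc) for q in query)


if __name__ == "__main__":
    # Placeholder scoring functions, not the ones used in the talk.
    def toy_a(t: str, d: Doc) -> float:
        return 0.1 * d[t]

    def toy_b(q: str, d: Doc) -> float:
        return float(d.get(q, 0))

    doc = {"poisson": 2, "model": 3, "the": 40}
    query = ["poisson", "model"]
    print(rv_full(doc, query, toy_a, toy_b))
    print(rv_query_only(doc, query, toy_b))
```

In the toy example the frequent word "the" contributes most of the document portion even though it is not in the query, which illustrates how the document sum can swamp the query sum.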

Retrieval: Query [Figure: precision-recall plot; axes: Precision, Recall]

Conclusion • Probabilistic models • Example: Poisson Model • Hyper-learning the model • Exponential framework • Learned a better model • Investigate retrieval performance - Bad text model - Easy to work with - Heavy tailed! - Better …

Future Work • Use model better • Use for other applications • Other IR applications • Classification • Correct for document length • Hyper-learn on different corpora • Test if learned model generalizes • Different for genre? Language? People? • Hyper-learn model better

Questions? Contact us: Jaime Teevan (teevan@ai.mit.edu), David Karger (karger@theory.lcs.mit.edu)
