
A Statistical Analysis of the Precision-Recall Graph

Ralf Herbrich, Hugo Zaragoza, Simon Hill

Microsoft Research, Cambridge University, UK

Overview
  • 2-class ranking
  • Average-Precision
  • From points to curves
  • Generalisation bound
  • Discussion
“Search” cost-functions
  • Maximise the number of relevant documents found in the top 10.
  • Maximise the number of relevant documents at the top (e.g. weight inversely proportional to rank).
  • Minimise the number of documents the user must see until they are satisfied.
Motivation
  • Why should maximising the margin work for document categorisation?
  • Why should any algorithm obtain good generalisation average-precision?
  • How can we devise algorithms that optimise rank-dependent loss-functions?
2-class ranking problem
  • Input/output pairs: (x, y) ∈ X × {0, 1}
  • Scoring function (mapping): f : X → R
  • Relevancy: P(y=1|x) ≈ P(y=1|f(x))

Collection samples
  • A collection is a sample:

    z = ((x1, y1), ..., (xm, ym)) ∈ (X × {0,1})^m

  • where:
    • y = 1 if the document x is relevant to a particular topic,
    • z is drawn from the (unknown) distribution π_XY.
  • Let k denote the number of positive examples.
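As a minimal sketch of this data structure (the feature vectors and labels below are invented purely for illustration), a collection sample can be held as a list of (x, y) pairs:

```python
# A toy collection sample z = ((x1,y1), ..., (xm,ym)) in (X x {0,1})^m.
# Feature vectors and relevance labels are made up for illustration.
z = [
    ((0.2, 0.7), 1),  # relevant document (y = 1)
    ((0.9, 0.1), 0),
    ((0.4, 0.4), 1),
    ((0.8, 0.3), 0),
    ((0.1, 0.9), 0),
]

m = len(z)                # collection size
k = sum(y for _, y in z)  # number of positive (relevant) examples

print(m, k)  # 5 2
```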
Ranking the collection
  • We are given a scoring function f : X → R.
  • This function imposes an order on the collection:
    • (x(1), …, x(m)) such that f(x(1)) > … > f(x(m)),
    • the hits (i1, …, ik) are the indices of the positive y(j).

Example: y(i) = 1 1 0 1 0 0 1 0 0 0  →  hit indices ij = 1, 2, 4, 7
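The re-ordering step can be sketched as follows; the scores below are invented so that the sorted labels reproduce the example above.

```python
# Rank a collection by decreasing score and extract the hit indices
# (1-based positions of the positive labels). Scores are invented so
# that the ranked labels match the example y(i) = 1 1 0 1 0 0 1 0 0 0.
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [1.0 - 0.1 * i for i in range(10)]  # f(x(1)) > ... > f(x(m))

# Sort document indices by decreasing score (a no-op for these scores,
# shown for clarity).
order = sorted(range(len(scores)), key=lambda i: -scores[i])
y_sorted = [labels[i] for i in order]

hits = [i + 1 for i, y in enumerate(y_sorted) if y == 1]
print(hits)  # [1, 2, 4, 7]
```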

Classification setting
  • If we threshold the function f at t, we obtain a classification: documents with f(x) ≥ t are predicted relevant.
  • Recall: the fraction of the k relevant documents retrieved, r(t) = |{i : yi = 1, f(xi) ≥ t}| / k.
  • Precision: the fraction of retrieved documents that are relevant, p(t) = |{i : yi = 1, f(xi) ≥ t}| / |{i : f(xi) ≥ t}|.
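These two definitions can be computed directly; the scores, labels and threshold below are arbitrary illustrative values.

```python
def precision_recall_at(scores, labels, t):
    """Precision and recall of the classifier 'f(x) >= t'."""
    retrieved = [y for s, y in zip(scores, labels) if s >= t]
    k = sum(labels)    # total number of relevant documents
    tp = sum(retrieved)  # relevant documents above the threshold
    # Convention: precision is 1 when nothing is retrieved.
    precision = tp / len(retrieved) if retrieved else 1.0
    recall = tp / k if k else 0.0
    return precision, recall

# Illustrative values: three documents retrieved at t = 0.65, two relevant.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 1, 0, 1, 0]
print(precision_recall_at(scores, labels, 0.65))  # both equal 2/3
```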

Precision vs. PGC

[figure: precision compared with PGC]

The Precision-Recall Graph

After reordering the documents by decreasing f(x(i)), plot precision against recall at each hit.
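The average precision A(f,z) summarises this graph in a single number. A sketch, assuming the standard IR definition A(f,z) = (1/k) · Σ_j j / i_j over the hit indices i_1 < … < i_k:

```python
def average_precision(labels_in_rank_order):
    """A(f,z) = (1/k) * sum_j j / i_j over hit indices i_1 < ... < i_k
    (assumed formula; this is the usual IR definition)."""
    hits = [i + 1 for i, y in enumerate(labels_in_rank_order) if y == 1]
    k = len(hits)
    return sum(j / i for j, i in enumerate(hits, start=1)) / k

# The example ranking from earlier: hits at positions 1, 2, 4, 7,
# giving A = (1/4)(1/1 + 2/2 + 3/4 + 4/7).
print(average_precision([1, 1, 0, 1, 0, 0, 1, 0, 0, 0]))
```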

Graph Summarisations

[figure: the precision-recall curve, precision vs. recall on axes from 0 to 1]

  • Break-Even point
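A sketch of the break-even point, assuming the standard definition (the value at which precision equals recall): since tp/i = tp/k exactly when the cut-off i equals k, it reduces to the precision in the top k (also called R-precision).

```python
def break_even_point(labels_in_rank_order):
    """Break-even point of the precision-recall curve (assumed standard
    definition). Precision tp/i equals recall tp/k exactly when the
    cut-off i = k, so it is the precision among the top-k documents."""
    k = sum(labels_in_rank_order)
    return sum(labels_in_rank_order[:k]) / k

# Example ranking from earlier: k = 4, three positives in the top 4.
print(break_even_point([1, 1, 0, 1, 0, 0, 1, 0, 0, 0]))  # 0.75
```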

Overfitting?

[figure: average precision on the train set vs. the test set]

Overview
  • 2-class ranking
  • Average-Precision
  • From points to curves
  • Generalisation bound
  • Discussion
From point to curve bounds
  • There exist SVM margin-bounds [Joachims 2000] for precision and recall.
  • They only apply to a single (unknown a priori) point of the curve!

[figure: a single point on the precision-recall curve]

Features of Ranking Learning
  • We cannot take differences of ranks.
  • We cannot ignore the order of ranks.
  • Point-wise loss functions do not capture the ranking performance!
  • ROC or precision-recall curves do capture the ranking performance.
  • We need generalisation error bounds for ROC and precision-recall curves.
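The point about point-wise losses can be made concrete. In the sketch below (scores and labels invented for illustration), two scoring functions incur the identical 0/1 loss at a threshold of 0.5 — both predict every document non-relevant — yet their rankings, and hence their average precision, differ sharply:

```python
# Two scoring functions with identical point-wise 0/1 loss but very
# different ranking quality. Scores and labels are invented.
labels = [1, 0, 1, 0, 0, 0]
f_good = [0.40, 0.35, 0.30, 0.20, 0.10, 0.05]  # positives ranked 1st, 3rd
f_bad  = [0.10, 0.40, 0.05, 0.35, 0.30, 0.20]  # positives ranked 5th, 6th

def zero_one_loss(scores, labels, t=0.5):
    """Point-wise classification error at threshold t."""
    return sum(int(s >= t) != y for s, y in zip(scores, labels))

def average_precision(scores, labels):
    """Standard IR average precision of the ranking induced by scores."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = [r + 1 for r, i in enumerate(order) if labels[i] == 1]
    return sum(j / i for j, i in enumerate(hits, 1)) / len(hits)

# Both classifiers predict everything negative at t = 0.5, so the
# point-wise loss is identical...
print(zero_one_loss(f_good, labels), zero_one_loss(f_bad, labels))  # 2 2
# ...yet average precision separates them clearly.
print(average_precision(f_good, labels))  # hits at ranks 1 and 3 -> high
print(average_precision(f_bad, labels))   # hits at ranks 5 and 6 -> low
```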
Generalisation and Avg. Prec.
  • How far can the observed average precision A(f,z) be from the expected average precision A(f)?
  • How far apart can the train and test average precision be?
Approach
  • McDiarmid’s inequality: for any function g : Z^n → R with stability c (changing any single coordinate changes g by at most c), for all probability measures P, with probability at least 1 − δ over the IID draw of Z:

    g(Z) < E[g(Z)] + c · sqrt((n/2) · ln(1/δ))
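A quick Monte Carlo sanity check of the inequality, for the sample mean of n Bernoulli(1/2) variables (which has bounded differences c = 1/n); the choices of n, deviation and trial count are arbitrary:

```python
import math
import random

# McDiarmid for the sample mean g(Z) = (1/n) * sum(Z_i) with c = 1/n:
#   P(g(Z) - E[g(Z)] >= eps) <= exp(-2 * eps**2 / (n * c**2))
#                             = exp(-2 * n * eps**2).
# Illustrative parameters only.
random.seed(0)
n, eps, trials = 100, 0.1, 10_000

exceed = 0
for _ in range(trials):
    g = sum(random.random() < 0.5 for _ in range(n)) / n
    if g - 0.5 >= eps:
        exceed += 1

bound = math.exp(-2 * n * eps * eps)  # exp(-2) ~ 0.135
print(exceed / trials, "<=", round(bound, 4))
```

The empirical tail frequency comes out well under the bound, as McDiarmid (which is not tight for this case) predicts.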
Approach (cont.)
  • Set n = 2m and call the two m-halves Z1 and Z2. Define gi(Z) := A(f, Zi). Then, by the IID assumption, E[g1(Z)] = E[g2(Z)], so McDiarmid applied to g1 − g2 bounds the difference between train and test average precision.
Bounding A(f,z) − A(f,zi)
  • How much does A(f,z) change if we alter one sample (xi, yi)?
  • We need to fix the number of positive examples in order to answer this question!
    • E.g. if k = 1, the change can be from 0 to 1.
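The k = 1 extreme is easy to demonstrate: moving the single positive document from the top of the ranking to the bottom swings A(f,z) from 1 to 1/m (the ranking and m = 10 below are illustrative):

```python
def average_precision(labels_in_rank_order):
    """Standard IR average precision: (1/k) * sum_j j / i_j."""
    hits = [i + 1 for i, y in enumerate(labels_in_rank_order) if y == 1]
    return sum(j / i for j, i in enumerate(hits, 1)) / len(hits)

m = 10
top    = [1] + [0] * (m - 1)  # the single positive ranked first
bottom = [0] * (m - 1) + [1]  # the single positive ranked last

print(average_precision(top))     # 1.0
print(average_precision(bottom))  # 0.1
```

Without fixing k, altering one sample can therefore change the average precision by almost its entire range, and no useful stability constant exists.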
Stability Analysis
  • Case 1: yi=0
  • Case 2: yi=1
Main Result

Theorem: For all probability measures, for all f : X → R, with probability at least 1 − δ over the IID draw of a training and a test sample both of size m, if both the training sample z and the test sample z' contain at least αm positive examples, for all α ∈ (0,1), then:

Positive results
  • This is the first bound showing that training and test set performance (in terms of average precision) converge asymptotically!
  • The effective sample size is only the number of positive examples.
  • The proof can be generalised to arbitrary test sample sizes.
  • The constants can be improved.
Open questions
  • How can we let k change, so as to investigate the dependence of the bound on the number of positive examples?
  • What algorithms could be used to directly maximise A(f,z)?
Conclusions
  • Many problems require ranking objects by some degree of relevance.
  • Ranking learning requires considering non-point-wise loss functions.
  • In order to study the complexity of algorithms, we need large-deviation inequalities for ranking performance measures.