Clustering and exploring search results using timeline constructions

Presenter: Tsai Tzung Ruei. Authors: Omar Alonso, Michael Gertz, Ricardo Baeza-Yates. 國立雲林科技大學 National Yunlin University of Science and Technology. CIKM 2009.

Presentation Transcript
Clustering and Exploring Search Results using Timeline Constructions

Presenter: Tsai Tzung Ruei

Authors: Omar Alonso, Michael Gertz, Ricardo Baeza-Yates

國立雲林科技大學

National Yunlin University of Science and Technology

CIKM 2009


Outline

  • Motivation

  • Objective

  • Time annotated document model

  • Methodology

  • Experiments

  • Conclusion

  • Comments


Motivation

  • Current search engines do not exploit the temporal information embedded in documents.

  • Do you think current timelines for organizing or clustering search results (such as in Google’s timeline) are useful for some of your daily search activities?

  • Do you use (or would use) timelines to explore search results?

  • Please indicate some search scenarios where you use timelines or would like to use timelines to organize search results.

  • Please give some examples of search scenarios where current search engines do not sufficiently support the concept of timelines to organize and explore search results.

  • What other features would you like to see in the context of timelines?

Timeline


Objective

  • To present an add-on to traditional information retrieval applications in which we exploit various temporal information associated with documents to present and cluster documents along timelines.


TIME ANNOTATED DOCUMENT MODEL

  • Time and Timelines

    • Our base timeline, denoted Td, is an interval of consecutive day chronons. Example: “March 12, 2002; March 13, 2002; March 14, 2002”.

  • Temporal Expressions

    • Explicit temporal expressions. Example: “December 2004”.

    • Implicit temporal expressions. Example: “Valentine's Day 2006”.

    • Relative temporal expressions. Example: “today”.

  • Temporal Document Profiles

    • Figure labels: explicit, implicit, relative, timestamps.
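The three kinds of temporal expression can be illustrated with a minimal Python sketch that maps each of the slide's examples onto day chronons. This is not the tagger used in the paper: the function names, the holiday table, and the set of supported relative words are all assumptions made for illustration.

```python
import calendar
from datetime import date, timedelta


def day_chronons(start, end):
    """Return the consecutive day chronons from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def month_to_interval(year, month):
    """Map an explicit month expression to its first and last day chronon."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


# Hypothetical lookup table; a real system would need a much larger one.
HOLIDAYS = {"valentine's day": (2, 14)}


def resolve_implicit(name, year):
    """Map an implicit expression such as a named holiday to a day chronon."""
    month, day = HOLIDAYS[name.lower()]
    return date(year, month, day)


def resolve_relative(expression, timestamp):
    """Resolve a relative expression against the document timestamp."""
    offsets = {"yesterday": -1, "today": 0, "tomorrow": 1}  # assumed vocabulary
    return timestamp + timedelta(days=offsets[expression.lower()])


timeline = day_chronons(date(2002, 3, 12), date(2002, 3, 14))
december_2004 = month_to_interval(2004, 12)
valentines_2006 = resolve_implicit("Valentine's Day", 2006)
today = resolve_relative("today", date(2009, 11, 2))
```

The sketch shows why a relative expression cannot be placed on the timeline without the document timestamp, whereas explicit and implicit expressions can be resolved on their own.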


Methodology

  • PROTOTYPE

    • Process Overview

    • Figure components: Corpora; Alembic (POS tagger); GUTime temporal tagger; XML document (tdp); Oracle.


Methodology

  • TCluster

    • Constructing a time outline for the documents in the hit list Lq, where Lq = [d1, d2, ..., dk] is a hit list of k documents.

    • Document clustering

    • Ranking documents in a cluster
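The three steps can be pictured with a small Python sketch. It is a simplified stand-in, not the TCluster algorithm itself: the hit list is assumed to be a list of `(doc_id, dates)` pairs, the names `chronon_key` and `cluster_by_time` are hypothetical, and ranking by how often a document mentions a chronon is an assumption chosen for brevity.

```python
from collections import defaultdict
from datetime import date


def chronon_key(d, granularity):
    """Truncate a date to the requested time granularity."""
    if granularity == "year":
        return (d.year,)
    if granularity == "month":
        return (d.year, d.month)
    if granularity == "day":
        return (d.year, d.month, d.day)
    raise ValueError(f"unknown granularity: {granularity}")


def cluster_by_time(hit_list, granularity="year"):
    """Group documents along a timeline and rank them inside each cluster.

    hit_list: list of (doc_id, [date, ...]) pairs (assumed input format).
    Returns a dict from chronon key to ranked doc ids, in timeline order.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for doc_id, dates in hit_list:
        for d in dates:
            counts[chronon_key(d, granularity)][doc_id] += 1

    outline = {}
    for key in sorted(counts):  # the sorted keys form the time outline
        # Assumed ranking: more mentions first, ties broken by doc id.
        ranked = sorted(counts[key].items(), key=lambda item: (-item[1], item[0]))
        outline[key] = [doc_id for doc_id, _ in ranked]
    return outline


hits = [
    ("d1", [date(2006, 6, 9), date(2006, 7, 9), date(2002, 6, 30)]),
    ("d2", [date(2006, 7, 9)]),
    ("d3", [date(2002, 5, 31), date(2002, 6, 30)]),
]
by_year = cluster_by_time(hits, "year")
by_month = cluster_by_time(hits, "month")
```

A document that mentions several dates lands in several clusters, and switching `granularity` regroups the same hit list at a coarser or finer level.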


Experiments

  • DMOZ

    • Introduction: a multilingual open content directory

    • World Cup documents in 5 pre-defined categories: 2010, 2006, 2002, 1998 and 1994.

    • Each World Cup document has a single event as the main theme.

  • Result

    • Documents are well classified by users in terms of the actual event.

    • TCluster produced more document clusters (21) than there are pre-defined categories (5).


Experiments

  • The TimeBank 1.2 corpus

    • It contains news articles that have been annotated using TimeML with temporal expressions related to events, times and temporal links between events and times.

  • Result

    • A 50% increase in the number of clusters discovered by TCluster.
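As background on the annotation format, TimeML marks temporal expressions with `TIMEX3` elements. The Python sketch below reads such elements from a small, made-up XML fragment; the sample text, the `<doc>` wrapper, and the function name `extract_timex3` are illustrative assumptions and are not taken from the TimeBank corpus.

```python
import xml.etree.ElementTree as ET

# Invented fragment in TimeML style; not an excerpt from TimeBank.
SAMPLE = (
    "<doc>The final was played on "
    '<TIMEX3 tid="t1" type="DATE" value="2006-07-09">July 9, 2006</TIMEX3>, '
    "a month after the opening game in "
    '<TIMEX3 tid="t2" type="DATE" value="2006-06">June 2006</TIMEX3>.</doc>'
)


def extract_timex3(xml_text):
    """Return (tid, type, value, surface text) for every TIMEX3 element."""
    root = ET.fromstring(xml_text)
    return [
        (el.get("tid"), el.get("type"), el.get("value"), el.text)
        for el in root.iter("TIMEX3")
    ]


expressions = extract_timex3(SAMPLE)
```

The normalized `value` attribute carries the granularity of each expression (day versus month here), which is the kind of information a time-based clustering step can work from.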


Experiments

  • Relevance Evaluation using AMT

    • AMT (Amazon Mechanical Turk) is a crowdsourcing platform.

  • Result

    • The average response was 4.04 (with an 80% agreement level).


Conclusion

  • MAJOR CONTRIBUTION

    • The TCluster algorithm provides great flexibility and allows users to explore clusters of search result documents that are organized along well-defined timelines, supporting different levels of time granularity.

    • The work demonstrates the utility of time-based clustering over existing approaches that cluster documents based only on document timestamps.

  • FUTURE WORK

    • To study the weighting of relative temporal expressions, as well as different sentence distance functions for determining the rank of documents in a cluster.


Comment

  • Advantage

    • Provides a new method of time searching

  • Drawback

    • Some mistakes

  • Application

    • Information retrieval

    • Clustering