Clustering and exploring search results using timeline constructions

Clustering and Exploring Search Results using Timeline Constructions. Presenter: Tsai Tzung Ruei. Authors: Omar Alonso, Michael Gertz, Ricardo Baeza-Yates. 國立雲林科技大學 National Yunlin University of Science and Technology. CIKM 2009.




Clustering and Exploring Search Results using Timeline Constructions

Presenter: Tsai Tzung Ruei

Authors: Omar Alonso, Michael Gertz, Ricardo Baeza-Yates

國立雲林科技大學

National Yunlin University of Science and Technology

CIKM 2009


Outline


  • Motivation

  • Objective

  • Time annotated document model

  • Methodology

  • Experiments

  • Conclusion

  • Comments


Motivation


  • Current search engines do not exploit the temporal information embedded in documents.

  • Do you think current timelines for organizing or clustering search results (such as in Google’s timeline) are useful for some of your daily search activities?

  • Do you use (or would use) timelines to explore search results?

  • Please indicate some search scenarios where you use timelines or would like to use timelines to organize search results.

  • Please give some examples of search scenarios where current search engines do not sufficiently support the concept of timelines to organize and explore search results.

  • What other features would you like to see in the context of timelines?

[Figure label: timeline]


Objective


  • To present an add-on to traditional information retrieval applications in which we exploit various temporal information associated with documents to present and cluster documents along timelines.


Time annotated document model


  • Time and Timelines

    • Our base timeline, denoted Td, is an interval of consecutive day chronons. Example: “March 12, 2002; March 13, 2002; March 14, 2002”

  • Temporal Expressions

    • Explicit temporal expressions. Example: “December 2004”

    • Implicit temporal expressions. Example: “Valentine's Day 2006”

    • Relative temporal expressions. Example: “today”

  • Temporal Document Profiles

[Figure labels: explicit, implicit, relative, timestamps]
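The time model can be illustrated with a small Python sketch. It builds a base timeline of consecutive day chronons and maps one example of each kind of temporal expression onto it. The function names, the lookup tables, and the choice to anchor relative expressions at the document timestamp are assumptions made for illustration. They are not taken from the slides.

```python
import calendar
from datetime import date, timedelta


def day_chronons(start, end):
    """Return the consecutive day chronons from start to end, inclusive."""
    if end < start:
        raise ValueError("end must not precede start")
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


# Hypothetical lookup table for implicit expressions (assumption for illustration).
IMPLICIT_DAYS = {"Valentine's Day": (2, 14)}

# Hypothetical day offsets for relative expressions (assumption for illustration).
RELATIVE_OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def resolve_explicit_month(year, month):
    """Explicit expression such as "December 2004": every day of that month."""
    last_day = calendar.monthrange(year, month)[1]
    return day_chronons(date(year, month, 1), date(year, month, last_day))


def resolve_implicit(name, year):
    """Implicit expression such as "Valentine's Day 2006": look up the named day."""
    month, day = IMPLICIT_DAYS[name]
    return [date(year, month, day)]


def resolve_relative(expression, document_timestamp):
    """Relative expression such as "today": assumed to be anchored at the timestamp."""
    return [document_timestamp + timedelta(days=RELATIVE_OFFSETS[expression])]


# The three-day example timeline from the slide.
base_timeline = day_chronons(date(2002, 3, 12), date(2002, 3, 14))
```

In this sketch every expression, whatever its kind, ends up as a list of day chronons, so coarser granularities such as month or year can be derived later by truncating the dates.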


Methodology


  • PROTOTYPE

    • Process Overview

[Figure: process overview. Labels: Corpora; Alembic (POS tagger); GUTime temporal tagger; XML document (tdp); Oracle]


Methodology (continued)

  • TCluster

    • Constructing a Time Outline for the documents in the hit list Lq.

    • Document Clustering

    • Ranking Documents in a Cluster

A hit list Lq = [d1, d2, ..., dk] of k documents
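The three TCluster steps listed above can be pictured with the following minimal Python sketch. It is not the authors' algorithm. It assumes each document in the hit list already carries a list of resolved dates, groups documents by time label at a chosen granularity, and ranks documents in a cluster by how often they mention that label. The function names, the input format, and the frequency-based ranking are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def truncate(day, granularity):
    """Map a day chronon to a coarser time label (hypothetical helper)."""
    if granularity == "year":
        return (day.year,)
    if granularity == "month":
        return (day.year, day.month)
    if granularity == "day":
        return (day.year, day.month, day.day)
    raise ValueError(f"unknown granularity: {granularity}")


def cluster_by_time(hit_list, granularity="year"):
    """hit_list: [(doc_id, [date, ...]), ...] in search-rank order."""
    position = {doc_id: i for i, (doc_id, _) in enumerate(hit_list)}
    mentions = defaultdict(lambda: defaultdict(int))  # label -> doc_id -> count
    for doc_id, days in hit_list:
        for day in days:
            mentions[truncate(day, granularity)][doc_id] += 1

    # Step 1: the time outline is the sorted list of labels found in the hit list.
    outline = sorted(mentions)

    # Steps 2 and 3: one cluster per label; rank by mention count (assumed
    # ranking), breaking ties by the original hit-list position.
    clusters = {
        label: sorted(mentions[label],
                      key=lambda d: (-mentions[label][d], position[d]))
        for label in outline
    }
    return outline, clusters


hit_list = [
    ("d1", [date(2006, 6, 9), date(2006, 7, 9), date(2002, 6, 30)]),
    ("d2", [date(2006, 7, 9)]),
    ("d3", [date(2002, 5, 31), date(2002, 6, 30)]),
]
outline, clusters = cluster_by_time(hit_list, granularity="year")
```

A document that mentions several time labels appears in several clusters in this sketch, and calling the function again with `granularity="month"` gives a finer outline over the same hit list.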


Experiments


  • DMOZ

    • Introduction: a multilingual open content directory

    • World Cup documents: document clusters for 2010, 2006, 2002, 1998 and 1994

Result

  • Documents are well classified by users in terms of the actual event.

  • Each World Cup document has a single event as the main theme.

  • Number of clusters: pre-defined categories (5) < TCluster (21)


Experiments (continued)

  • The TimeBank 1.2 corpus

    • It contains news articles that have been annotated using TimeML with temporal expressions related to events, times and temporal links between events and times.

Result

A 50% increase in the number of clusters discovered by TCluster
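To make the TimeML annotation concrete, the following Python sketch extracts temporal expressions from a small TimeML-style fragment. The sample text is invented for illustration and is not taken from TimeBank 1.2. The function name is hypothetical, and only the `TIMEX3` tag with its `tid`, `type` and `value` attributes is assumed.

```python
import xml.etree.ElementTree as ET

# Invented sample in TimeML style (assumption; not a TimeBank document).
SAMPLE = (
    "<TimeML><TEXT>The final was played on "
    '<TIMEX3 tid="t1" type="DATE" value="2006-07-09">July 9, 2006</TIMEX3>'
    " and the report appeared "
    '<TIMEX3 tid="t2" type="DATE" value="2006-07-10">the next day</TIMEX3>'
    ".</TEXT></TimeML>"
)


def extract_timex3(xml_text):
    """Return (tid, type, value, surface text) for each TIMEX3 element."""
    root = ET.fromstring(xml_text)
    return [
        (el.get("tid"), el.get("type"), el.get("value"), "".join(el.itertext()))
        for el in root.iter("TIMEX3")
    ]


expressions = extract_timex3(SAMPLE)
```

Each tuple pairs the normalized `value` with the surface text, which is the kind of information a temporal document profile needs in order to place a document on a timeline.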


Experiments (continued)

  • Relevance Evaluation using Amazon Mechanical Turk (AMT)

    • AMT is a crowdsourcing platform

Result

The average response was 4.04 (with an 80% agreement level).


Conclusion


  • MAJOR CONTRIBUTION

    • TCluster algorithm provides great flexibility and allows users to explore clusters of search result documents that are organized along well-defined timelines, supporting different levels of time granularity.

    • Demonstrates the utility of time-based clustering over existing approaches that cluster documents based only on document timestamps.

  • FUTURE WORK

    • To study the weighting of relative temporal expressions, as well as different sentence distance functions for determining the rank of documents in a cluster.


Comment


  • Advantage

    • Provides a new method of time searching

  • Drawback

    • Some mistakes

  • Application

    • Information retrieval

    • Clustering

