
TWinner: Understanding News Queries with Geo-content using Twitter



TWinner: Understanding News Queries with Geo-content using Twitter. Satyen Abrol, Latifur Khan, University of Texas at Dallas, Department of Computer Science. GIR ’10. 29 April, 2011. Sengyu Rim.



TWinner: Understanding News Queries with Geo-content using Twitter

Satyen Abrol, Latifur Khan

University of Texas at Dallas, Department of Computer Science

GIR ’10

29 April, 2011

Sengyu Rim


Outline

  • Introduction

  • Related Work

  • Twitter as News-wire

  • Determining News Intent

  • Assigning Weights to Tweets

  • Experiments and Results

  • Conclusion

Introduction

  • Motivations

    • Users find news through search engines

    • The results returned by common search engines often differ from what the user expected

      • Non-critical information

      • Unorganized content

    • Search engines therefore need to understand the intent of the user query


Introduction

  • Motivation: an example

    • Which event in Korea attracted the most attention in 2002?

    • A naïve user searches the news with the keyword “korea” on 2002-06-18

    • The same query can be read in several ways

      • Food: Kimchi

      • Map: Korea

      • News: Korea vs. Italy, 2:1

      • Wiki: Korea


Introduction

  • Analyze the content of a popular social networking site, Twitter, to infer the intent of the user query

    • Twitter provides popular news topics

    • Twitter provides keywords that may enhance the user query

  • TWinner makes two novel contributions to the field of geographic information retrieval

    • Identifying the intent of the user query

    • Adding additional keywords to the query


Introduction

  • Figure: the architecture of the TWinner news intent system


Related Work

  • To identify and disambiguate the locations of users

    • Natural Language Processing

    • Data Mining

  • To establish the relationship between the location of the news and the news content

    • A model using NLP techniques


Twitter as News-wire

  • Twitter

    • Free social networking

    • Micro-blogging service

    • Medium for news updates


Determining News Intent

  • Identification of Location

    • Geo-tags the query to a location with a certain confidence

  • Frequency-Population Ratio (FPR)

    • In the absence of a news-making event, FPR remains constant irrespective of the location

    • Used to assign a news intent confidence to the query

    • FPR = (α + β) * Nt

      • α: the population density factor

      • β: the location type constant

      • Nt: the number of tweets per minute at that instant
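
The FPR formula above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the values of α, β and the tweet rates are made up, and `news_intent_signal` is a hypothetical helper that simply compares the current FPR with a quiet-period baseline, based on the slide's statement that FPR stays constant when no news-making event occurs.

```python
def frequency_population_ratio(alpha: float, beta: float, tweets_per_minute: float) -> float:
    """FPR = (alpha + beta) * Nt, as stated on the slide."""
    return (alpha + beta) * tweets_per_minute


def news_intent_signal(current_fpr: float, baseline_fpr: float) -> float:
    """Hypothetical helper (not from the slides): how far the current FPR
    departs from the quiet-period baseline, expressed as a ratio."""
    if baseline_fpr <= 0:
        raise ValueError("baseline_fpr must be positive")
    return current_fpr / baseline_fpr


# Illustrative, made-up numbers for one location.
ALPHA = 0.5   # population density factor (assumed value)
BETA = 0.25   # location type constant (assumed value)

baseline = frequency_population_ratio(ALPHA, BETA, tweets_per_minute=4.0)
current = frequency_population_ratio(ALPHA, BETA, tweets_per_minute=40.0)
print(news_intent_signal(current, baseline))
```

A ratio well above 1 would suggest unusual tweet activity for that location; how the ratio is mapped to a confidence value is not specified on the slide.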


Determining News Intent

  • Experiments on determining the effect of geo-type and population density


Determining News Intent

  • The drawback of FPR

    • Fails to take into account the geographical relatedness of features

  • Modified FPR

    • FPR = Σ δi (αi + βi) * Nt

      • δi: the factor by which geo-location i is related to the primary search query
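
A minimal Python sketch of the modified FPR, under stated assumptions: the `RelatedLocation` type and all numeric values are invented for illustration, and a single tweet rate Nt is applied to the whole sum, which is one reading of the formula on the slide.

```python
from typing import NamedTuple, Sequence


class RelatedLocation(NamedTuple):
    """Hypothetical container for one geo-location related to the query."""
    delta: float  # relatedness to the primary search query
    alpha: float  # population density factor
    beta: float   # location type constant


def modified_fpr(locations: Sequence[RelatedLocation], tweets_per_minute: float) -> float:
    """FPR = sum_i delta_i * (alpha_i + beta_i) * Nt, with one shared Nt (assumption)."""
    weighted = sum(loc.delta * (loc.alpha + loc.beta) for loc in locations)
    return weighted * tweets_per_minute


# Made-up example: the queried location plus one nearby, less related location.
locations = [
    RelatedLocation(delta=1.0, alpha=0.5, beta=0.25),
    RelatedLocation(delta=0.5, alpha=0.25, beta=0.25),
]
print(modified_fpr(locations, tweets_per_minute=8.0))
```

With a single location and δ = 1, this reduces to the original FPR = (α + β) * Nt.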


Assigning Weights to Tweets

  • Detecting Spam Messages

    • Spam messages carry little or no relevant information

    • Nature of spam messages

    • A formula tags, with a certain level of confidence, whether a message is spam, using:

      • Np: the number of followers

      • Nq: the number of people the user is following

      • μ: an arbitrary constant

      • Nr: the ratio of the number of tweets containing a reply to the total number of tweets


Assigning Weights to Tweets

  • On the basis of user location

    • An experiment was conducted to understand the relation between Twitter messages and the location of the user


Assigning Weights to Tweets

  • Using Hyperlinks Mentioned in Tweets

    • 30-50% of general Twitter messages contain a hyperlink to an external website

    • For news-related Twitter messages, this percentage increases to 70-80%

    • This indicator is also used to assign weights to tweets


Assigning Weights to Tweets

  • Semantic Similarity

    • Summarize the Twitter messages into a couple of keywords

    • The naïve approach picks k keywords, ignoring their semantic similarity

    • The definition of semantic similarity uses:

      • M: the total number of articles searched in the New York Times Corpus

      • f(x): the number of articles for term x

      • f(y): the number of articles for term y


Assigning Weights to Tweets

  • Reassigns the weight of every keyword on the basis of the following formula

    • Wi* = Wi + Σ Sij * Wj

      • Wi*: the new weight of keyword i

      • Wi: the weight without semantic similarity

      • Sij: the semantic similarity between keywords i and j, derived from the semantic similarity formula

      • Wj: the initial weight of each of the other words being considered

  • Identifies k keywords that are semantically dissimilar but together contribute the maximum weight

    • Spq < Sthreshold: the similarity between any two words p and q belonging to the set of k is less than a threshold

    • W1 + W2 + W3 + … + Wk is the maximum over all groups satisfying the above condition
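
The two steps on this slide can be sketched in Python as follows. This is an illustrative sketch only: the keywords, weights and similarity values are invented, unknown keyword pairs are assumed to have similarity 0, the selection step is assumed to use the reassigned weights, and the exhaustive search over all k-sized groups is a simple stand-in for whatever search strategy TWinner actually uses.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Similarity = Dict[Tuple[str, str], float]


def sim(s: Similarity, a: str, b: str) -> float:
    """Symmetric lookup; pairs not listed count as 0.0 (an assumption)."""
    return s.get((a, b), s.get((b, a), 0.0))


def reweight(weights: Dict[str, float], s: Similarity) -> Dict[str, float]:
    """Wi* = Wi + sum over the other words j of Sij * Wj."""
    return {
        i: w_i + sum(sim(s, i, j) * w_j for j, w_j in weights.items() if j != i)
        for i, w_i in weights.items()
    }


def select_keywords(weights: Dict[str, float], s: Similarity,
                    k: int, threshold: float) -> List[str]:
    """Pick the k-word group with maximum total weight in which every pair
    of words has similarity below the threshold (brute force)."""
    best: List[str] = []
    best_total = float("-inf")
    for group in combinations(sorted(weights), k):
        if all(sim(s, p, q) < threshold for p, q in combinations(group, 2)):
            total = sum(weights[w] for w in group)
            if total > best_total:
                best, best_total = list(group), total
    return best


# Made-up keywords, initial weights and similarities.
initial = {"earthquake": 4.0, "quake": 2.0, "relief": 1.0}
similarity = {("earthquake", "quake"): 0.5, ("earthquake", "relief"): 0.25}

new_weights = reweight(initial, similarity)
print(new_weights)
print(select_keywords(new_weights, similarity, k=2, threshold=0.4))
```

In this toy data, "earthquake" and "quake" are too similar to appear together, so the selected pair combines "earthquake" with "relief". The brute-force search grows combinatorially with the number of candidate keywords, so it is only practical for small candidate sets.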


Experiment and Results

  • Experiments to test the validity of the hypothesis

    • First: a naïve user is looking for the latest happenings in the context of the Fort Hood incident on 12th November 2009

    • Second: a naïve user is looking for the latest happenings in the context of ‘Russia’ on 5th December 2009

    • Third: a naïve user is looking for the latest happenings in the context of ‘Haiti’ on 18th January 2010


Experiment and Results

  • Results


Experiment and Results

  • The results show the contrast between the search results produced by the original query and those produced after adding the keywords obtained by TWinner


Conclusion

  • We present a system to predict a user’s news intent

    • Takes the location mentioned and the time of the query into consideration

    • Makes use of the social networking site Twitter to understand the relationship between geo-information and the news intent of the query

  • Future work

    • Understanding the content of the social media message

    • Sentiment conveyed by the messages

    • Enhancing the accuracy of the system


Thank you!

