TextNet – A Text-Based Intelligent System

Sanda Harabagiu

Dan Moldovan

as (mis-)interpreted by Peter Clark

Introduction
  • Overall goal:
    • Given a sentence/paragraph, create a representation of the unstated, extra knowledge (“context”) which it suggests.
    • Input: sentence graph; Output: bigger, richer graph
  • Purpose: Question-answering etc. (?)
  • Sources of this extra knowledge:
    • (Extended) WordNet
    • the Internet
WordNet
  • Organized around concepts (“synsets”), not words
  • Contains:
    • ~100k concepts (“synsets”)
    • ~350k connections (14 types)
    • English definitions (“glosses”) for most synsets

  • Example:
    • {“athletic game”} (132132): “Game involving athletic activity.”
    • {“tennis”, “lawn tennis”} (433243): “A game played with rackets by two or four players who hit the ball over a net that divides the court.”
    • {“tennis”, “lawn tennis”} –isa {“athletic game”}
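
The synset organization above can be sketched in a few lines of Python. This is only an illustrative toy, not WordNet's actual storage format: the `Synset` class, the `LEXICON` table, and the helper names are hypothetical, and the numbers shown on the slide are assumed here to be synset identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """One concept: a set of synonymous words, a gloss, and typed links."""
    synset_id: int
    words: frozenset
    gloss: str
    # relation name -> list of target synset ids (hypothetical layout)
    relations: dict = field(default_factory=dict)


# Toy lexicon holding only the two synsets from the slide.
# Treating 132132 and 433243 as synset ids is an assumption.
LEXICON = {
    132132: Synset(132132, frozenset({"athletic game"}),
                   "Game involving athletic activity."),
    433243: Synset(433243, frozenset({"tennis", "lawn tennis"}),
                   "A game played with rackets by two or four players who "
                   "hit the ball over a net that divides the court.",
                   {"isa": [132132]}),
}


def synsets_for(word):
    """Look up concepts by word: several words may share one synset."""
    return [s for s in LEXICON.values() if word in s.words]


def hypernyms(synset):
    """Follow the isa links of a synset to its parent concepts."""
    return [LEXICON[i] for i in synset.relations.get("isa", [])]


tennis = synsets_for("tennis")[0]
print(sorted(tennis.words), "isa", sorted(hypernyms(tennis)[0].words))
```

Both “tennis” and “lawn tennis” resolve to the same `Synset` object, which is the sense in which the lexicon is organized around concepts rather than words.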

Extended WordNet
  • Disambiguate and transform glosses into network representations.

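  • Example: “Tennis court: A court in which tennis is played.”
    • [Diagram: gloss graph in which tennis court is linked by def to court, court by location-of to play, and play by object to tennis, i.e. the synset {“tennis”, “lawn tennis”}]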

  • Example: “Serve: A stroke in tennis that puts the ball in play.”
    • [Diagram: gloss graph over the nodes serve, stroke, put, tennis, ball, play, with relations def, agent, object, manner, context]

  • Resulting structure is no longer just a big graph
    • Original WordNet: “raw” concepts (isa hierarchy, other relations)
    • Processed glossary definitions: concepts in context (particular subtypes/situations for concepts)
    • [Diagram: the two layers joined by def links, illustrated with the concept ball]

“Inference Extraction”
  • Example input: “The kid hit the ball very hard.”
    • hit –agent kid
    • hit –object ball
    • hit –manner hard
  • Goals:
    • provide supplementary information about a sentence
    • explain relation between sentences
  • Approach:
    • Deductive inference (e.g., “snore –entails sleep”)
    • Find and add information into the sentence representation
  • Challenge:
    • Many possible connections
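
A minimal sketch of the deductive step, assuming the sentence graph is stored as (head, relation, tail) triples. The `ENTAILS` table holds only the snore/sleep/unconscious chain quoted in the slides. The rule that an entailed event inherits the original event's edges is an assumption made for illustration, not something the slides state.

```python
# Entailment links between events (only the chain quoted in the slides).
ENTAILS = {"snore": "sleep", "sleep": "temporarily unconscious"}


def entailed_events(event):
    """Follow entailment links transitively, e.g. snore -> sleep -> ..."""
    chain, seen = [], {event}
    while event in ENTAILS and ENTAILS[event] not in seen:
        event = ENTAILS[event]
        seen.add(event)
        chain.append(event)
    return chain


def elaborate(triples):
    """Return the input triples plus copies of each edge for entailed events.

    Assumption: an entailed event keeps the same agent/object/etc.
    """
    added = set(triples)
    for head, rel, tail in triples:
        for implied in entailed_events(head):
            added.add((implied, rel, tail))
    return added


sentence = {("snore", "agent", "man")}
for triple in sorted(elaborate(sentence)):
    print(triple)
```

Even this toy shows the challenge noted above: every rule that fires adds edges, so the number of candidate connections grows quickly unless something restricts which rules apply.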
Path-finding

To find path(s) between A and B:

  • use spreading activation/marker passing:
    • place markers at A and B
    • propagate markers to neighboring nodes
    • at quiescence, look for marker collisions
  • “Propagation rules” determine when to propagate
    • “asymmetric and transitive relations are more useful”
    • “going up the isa hierarchy allows hierarchical deductions”
    • “the same is true for relations such as entail and causation. For example, if a man is snoring, then he is sleeping, and further he is temporarily unconscious.”
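
The marker-passing idea can be sketched as follows. This is a simplified illustration, not the TextNet algorithm: the edge list is a toy, the function names are hypothetical, and the only propagation rule modeled is a whitelist of relations that markers may follow in the forward direction (for example, up the isa hierarchy).

```python
from collections import deque

# Toy graph as (head, relation, tail) triples; purely illustrative.
EDGES = [
    ("kid", "isa", "person"),
    ("player", "isa", "person"),
    ("person", "isa", "organism"),
    ("snore", "entails", "sleep"),
]


def spread(edges, start, allowed):
    """Pass a marker from `start` along allowed relations until quiescence.

    Returns {node: path from start to node}.
    """
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for head, rel, tail in edges:
            if head == node and rel in allowed and tail not in paths:
                paths[tail] = paths[node] + [tail]
                queue.append(tail)
    return paths


def find_paths(edges, a, b, allowed):
    """Place markers at a and b, spread both, and join them at collisions."""
    from_a = spread(edges, a, allowed)
    from_b = spread(edges, b, allowed)
    collisions = sorted(set(from_a) & set(from_b),
                        key=lambda n: len(from_a[n]) + len(from_b[n]))
    # Join a -> collision with collision -> b (reverse of b's path).
    joined = [from_a[n] + from_b[n][-2::-1] for n in collisions]
    # Drop paths that pass through the same node twice.
    return [p for p in joined if len(set(p)) == len(p)]


print(find_paths(EDGES, "kid", "player", {"isa", "entails"}))
```

With markers restricted to isa and entails, the markers from kid and player both climb to person and collide there. With an empty rule set nothing propagates and no path is found, which is how the propagation rules limit the search.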
“The kid hit the ball very hard.”
  • Find connections which “explain” these relations
    • [Diagram, within context of tennis and within context of ball: nodes hit, game, play, player, person, kid; relations context, agent, isa, isa, object-of]
    • [Diagram, within context of tennis and within context of ball: nodes hit, game, play, player, hit, ball; relations agent, agent-of, object, context, object-of]
    • [Diagram, within context of return and within context of drive: nodes hard, return, stroke, tennis; relations manner-of, gloss (“isa”), gloss (“isa”), context]
    • [Diagram, within context of tennis: nodes game, play, player, hit; relations agent, agent-of, object-of]


Inter-sentential Global Context
  • Find connections between “local contexts”

S1: The kid hit the ball very hard.

S2: It landed almost always near the baseline.

  • [Diagram, within context of move: nodes hit, move, change, location; relations isa, gloss (“isa”), object, isa]
  • [Diagram, within context of destination and within context of arrive: nodes place, destination, reach, arrive, land; relations gloss (“isa”), object, gloss (“isa”), isa]


Is WordNet (or a dictionary) sufficient to fully build the context?

“GPS systems are used for hiking.”

  • QN: Can we relate “GPS” and “hiking” using a dictionary?
  • From Oxford Dictionary:
    • “GPS: a navigation system”
    • “Hiking: long walk in the countryside taken for pleasure”
    • “Walk: place or track or route for foot passengers”
    • “Route: course or way taken from starting point to destination”
  • But:
    • Missing knowledge that hiking involves following/navigating a particular trail, as opposed to just wandering aimlessly
Finding and Adding Extra, Contextual Knowledge from the Internet
  • WordNet doesn’t contain all the background knowledge
  • So can we add extra knowledge using other texts too?
    • run-time, extra elaboration of current graph
    • further expansion of WordNet?
  • Approach:
    • Start with some initial “seed” text
    • Retrieve paragraphs containing relevant words
    • Elaborate their “local and global contexts”
    • Determine relevance using a similarity measure
    • Select “the most appropriate new context”
    • Add its graph (or parts of it?) to the original graph
Finding Relevant Documents
  • Two problems:
    • Discovery: Which keywords to search with?
      • use words in the original seed text, or closely related words
      • e.g., “play AND (tennis OR ball OR baseline) AND hit”
    • Quality: How relevant are the results?
      • measure the degree of overlap of graphs for seed and new texts
  • Lexical ambiguity is a root problem
    • Disambiguation by assuming new words belong to same/close synsets as in the original query (dubious!)
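
Both steps can be sketched briefly. The query builder reproduces the Boolean form of the example above. The overlap score is a Jaccard ratio over graph triples, which is an assumption: the slides say only that the “degree of overlap” of the graphs is measured, not how. All function names and the candidate graphs are hypothetical.

```python
def build_query(terms):
    """Join terms with AND; a list of related words becomes an OR-group."""
    parts = []
    for term in terms:
        if isinstance(term, str):
            parts.append(term)
        else:
            parts.append("(" + " OR ".join(term) + ")")
    return " AND ".join(parts)


def graph_overlap(seed, candidate):
    """Jaccard overlap of two graphs given as sets of triples (assumed measure)."""
    seed, candidate = set(seed), set(candidate)
    union = seed | candidate
    return len(seed & candidate) / len(union) if union else 0.0


query = build_query(["play", ["tennis", "ball", "baseline"], "hit"])
print(query)

# Seed graph from "The kid hit the ball very hard."
seed = {("hit", "agent", "kid"), ("hit", "object", "ball"),
        ("hit", "manner", "hard")}
# Made-up graphs standing in for two retrieved paragraphs.
candidates = {
    "paragraph_a": {("hit", "object", "ball"), ("hit", "agent", "player")},
    "paragraph_b": {("buy", "object", "ball")},
}
best = max(candidates, key=lambda name: graph_overlap(seed, candidates[name]))
print(best, graph_overlap(seed, candidates[best]))
```

Comparing exact triples assumes every word has already been mapped to the right concept, so the measure is only as good as the disambiguation step flagged as dubious above.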
A Real Example…
  • Text: about a player who gets tendinitis from hitting the ball too hard
  • Build initial graph of sentences (but info missing)
  • Look for additional information on Internet
    • try multiple queries
    • select the best result (= graph most coherent with original text)
    • layer this graph on top of the original text graph
      • Original text + WordNet:
        • hit –isaaffect isa- injure –result injury
        • hit –purpose  land –location backline
      • Internet text:
        • backline –result ace
      • WordNet
        • ace –isa serve –attr unreachable –purpose win
  • Hence (!)
    • “Winning is the motivation for actions causing tennis injuries”
Summary
  • Interesting, ambitious
  • Right idea (used by others too)
  • Didn’t work (?); no further publications on TextNet
  • Critical details not clear from the paper
    • Problem  finding good connections, rather = avoiding finding bad connections