

Using Ontological Relationships to Provide Indexing of Plain Text Searches

Research by Fletcher Liverance

[email protected]

November 14th, 2011

How Does a Search Engine Work?

1. User submits a keyword-based query to the search engine

2. The indexer locates all relevant pages containing those keywords

3. The database returns all pages found in the index

4. Pages are ranked and returned to the user
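
The four steps above can be sketched with a toy inverted index. This is a minimal illustration, not the design of any particular search engine: the three-page corpus, the function names, and the ranking rule (count of matching keywords) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical three-page corpus; page ids and text are illustrative only.
PAGES = {
    "page1": "winnie the pooh is a yellow bear",
    "page2": "piglet is a small pig",
    "page3": "a bear in a red shirt",
}


def build_index(pages):
    """Map each keyword to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def search(index, query):
    """Return pages containing any query keyword, best match first.

    Ranking here is simply the number of distinct query keywords a page
    contains (an assumed stand-in for a real ranking function).
    """
    scores = defaultdict(int)
    for word in set(query.lower().split()):
        for page_id in index.get(word, ()):
            scores[page_id] += 1
    # Highest score first; ties broken by page id for a stable order.
    return sorted(scores, key=lambda p: (-scores[p], p))


INDEX = build_index(PAGES)
print(search(INDEX, "yellow bear"))
```

Because matching is purely on keywords, a query that uses none of the words on a page cannot find that page, however closely related it is.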

How Does a Search Engine Work?

Benefits

  • Fast
  • Machine learnable
  • Straightforward

Drawbacks

  • Pattern matching
  • Keyword based
  • Garbage in, garbage out
Garbage in, Garbage out

Scenario

You saw this television series and would like to find out more about it, but you don’t know the name of the series or of any of its characters.

What do you do?

http://www.dan-dare.org/FreeFun/Images/CartoonsMoviesTV/WinnieThePoohWallpaper1024.jpg

Semantic Relationships
  • Ontology

“An ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents.”

http://www-ksl.stanford.edu/kst/what-is-an-ontology.html

  • Resource Description Framework (RDF)

“RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link. Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications.”

http://www.w3.org/RDF/

[Diagram: example RDF graph. Nodes: Disney, Winnie the Pooh, Bear, Piglet, Shirt, Yellow, Pig, Red. Link labels: isMadeBy, isA, hasFriend, hasClothing, hasColor.]
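
A graph like this can be written down as subject/predicate/object triples, which is the model RDF uses. The sketch below uses plain Python tuples rather than an RDF library. The node and link names come from the slide, but which nodes each link connects is an assumed reading of the diagram, and the helper name `describe_match` is hypothetical.

```python
# Triples as (subject, predicate, object).
# ASSUMPTION: the endpoints of each link are inferred from the slide's
# node and label lists; the original diagram layout is not available.
TRIPLES = [
    ("Winnie the Pooh", "isMadeBy", "Disney"),
    ("Winnie the Pooh", "isA", "Bear"),
    ("Winnie the Pooh", "hasFriend", "Piglet"),
    ("Winnie the Pooh", "hasClothing", "Shirt"),
    ("Winnie the Pooh", "hasColor", "Yellow"),
    ("Shirt", "hasColor", "Red"),
    ("Piglet", "isA", "Pig"),
]


def describe_match(triples, constraints):
    """Return subjects that have every (predicate, object) pair given."""
    candidates = None
    for predicate, obj in constraints:
        matched = {s for s, p, o in triples if p == predicate and o == obj}
        candidates = matched if candidates is None else candidates & matched
    return sorted(candidates or [])


# "A yellow bear": find the character without knowing its name.
print(describe_match(TRIPLES, [("isA", "Bear"), ("hasColor", "Yellow")]))
```

This is the scenario from the previous slide in miniature: the description, not the name, identifies the character.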

Semantic Relationships

How can we locate useful semantic relationships?

  • Link Distance
  • Link Direction
  • Link Relationship

[Diagram: extended RDF graph. Nodes: Bear, Disney, Company, Brown, Winnie the Pooh, Mammal, Piglet, Shirt, Yellow, Pig, Red, 0xFFFF00. Link labels: hasColor, isA, isMadeBy, hasFriend, hasClothing, hasRGB.]
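
One way to read the three criteria above is as parameters of a graph traversal: link distance bounds how far the walk goes, link direction decides whether links are followed backwards, and link relationship restricts which predicates count. The breadth-first sketch below is an illustration under those assumptions, not the author's algorithm. The link endpoints are again an assumed reading of the diagram, and `related_terms` and its parameters are hypothetical names.

```python
from collections import deque

# ASSUMPTION: link endpoints are inferred from the slide's node and label lists.
TRIPLES = [
    ("Bear", "hasColor", "Brown"),
    ("Bear", "isA", "Mammal"),
    ("Disney", "isA", "Company"),
    ("Winnie the Pooh", "isMadeBy", "Disney"),
    ("Winnie the Pooh", "isA", "Bear"),
    ("Winnie the Pooh", "hasFriend", "Piglet"),
    ("Winnie the Pooh", "hasClothing", "Shirt"),
    ("Winnie the Pooh", "hasColor", "Yellow"),
    ("Shirt", "hasColor", "Red"),
    ("Piglet", "isA", "Pig"),
    ("Yellow", "hasRGB", "0xFFFF00"),
]


def neighbors(triples, node, follow_inverse=True):
    """Yield (next_node, predicate) for every link touching node."""
    for s, p, o in triples:
        if s == node:
            yield o, p  # link followed in its stated direction
        elif follow_inverse and o == node:
            yield s, p  # link followed backwards


def related_terms(triples, start, max_distance=2, allowed=None,
                  follow_inverse=True):
    """Breadth-first search returning {term: link distance from start}.

    max_distance   -> link distance
    follow_inverse -> link direction
    allowed        -> link relationship (set of predicate names, or None)
    """
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_distance:
            continue  # do not expand beyond the distance limit
        for nxt, predicate in neighbors(triples, node, follow_inverse):
            if allowed is not None and predicate not in allowed:
                continue
            if nxt not in distances:
                distances[nxt] = distances[node] + 1
                queue.append(nxt)
    del distances[start]
    return distances


# Only "isA" links, followed forwards: the type hierarchy above the character.
print(related_terms(TRIPLES, "Winnie the Pooh", allowed={"isA"},
                    follow_inverse=False))
```

Tightening any of the three parameters shrinks the set of related terms, which is one way to keep an expansion from drifting to loosely connected nodes such as 0xFFFF00.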

Modified Search Indexing

1. User submits a keyword-based query to the search engine

2. Search analyzer creates additional searches based on ontological information

3. Search engine performs parallel searches of top search terms

4. Searches are ranked and returned to the user as additional search suggestions
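
The modified pipeline above might look roughly like the following. This is a sketch under several assumptions: the triples and documents are invented for the example, query expansion simply collects terms directly linked to a query word, "parallel" is modelled with a thread pool, and suggestions are ranked by hit count. None of these choices is stated in the slides, and all function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical ontology fragment and plain-text corpus for illustration.
TRIPLES = [
    ("Winnie the Pooh", "isA", "Bear"),
    ("Winnie the Pooh", "hasColor", "Yellow"),
    ("Winnie the Pooh", "hasFriend", "Piglet"),
    ("Piglet", "isA", "Pig"),
]
DOCUMENTS = {
    "doc1": "winnie the pooh and piglet visit the forest",
    "doc2": "the yellow bear ate honey",
    "doc3": "piglet is a very small animal",
}


def expand_query(query, triples):
    """Step 2: propose extra search terms linked to the query's keywords."""
    words = set(query.lower().split())
    votes = Counter()
    for subject, _predicate, obj in triples:
        if obj.lower() in words:
            votes[subject] += 1
        if subject.lower() in words:
            votes[obj] += 1
    # Terms linked to more of the query's keywords come first.
    return [term for term, _count in votes.most_common()]


def keyword_search(term):
    """Step 3 worker: ids of documents containing the term as a whole phrase."""
    needle = f" {term.lower()} "
    return sorted(d for d, text in DOCUMENTS.items() if needle in f" {text} ")


def suggest_searches(query, top_n=3):
    """Steps 2 to 4: expand, search in parallel, rank by number of hits."""
    terms = expand_query(query, TRIPLES)[:top_n]
    if not terms:
        return []
    with ThreadPoolExecutor(max_workers=len(terms)) as pool:
        hits = list(pool.map(keyword_search, terms))
    ranked = sorted(zip(terms, hits), key=lambda pair: -len(pair[1]))
    return [(term, docs) for term, docs in ranked if docs]


print(suggest_searches("yellow bear"))
```

Here the query "yellow bear" never mentions the character, yet the suggestion leads to doc1, which a plain keyword search for those two words would not return.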

Current Work
  • NASA SWEET Ontologies
    • 6000 concepts
    • 200 ontologies
    • Scientific
    • Loose relationships
  • National Oceanic and Atmospheric Administration (NOAA)
    • 30+ years of scientific research
    • Text based
    • Unsorted
    • 2+ gigabytes
    • Domain specific terminology
Challenges & Future Work
  • How to rank plain text
    • No links or history
    • No ‘page views’
  • Limited ontology coverage
    • 6000 concepts in NASA SWEET ontologies
    • ~170,000 words in the English language
    • Many more unique names and scientific terms
    • How can ontologies be automatically generated?
  • Graph matching
    • Identifying related terms in a large graph is difficult
    • Multiple links per node, must identify appropriate links