
GoogleDictionary - PowerPoint PPT Presentation





GoogleDictionary

Paul Nepywoda

Alla Rozovskaya


Goal

  • Develop a tool for English that, given a word, will illustrate its usage


Who Will Benefit

  • Learners of English

  • Teachers of English

  • Native speakers who wish to find common usages of a word


Similar Tools?

  • Dictionaries, BUT our tool

    • focuses on the usage of words and not on defining their meanings

    • ranks expressions based on frequency

    • extracts examples straight from context


Similar Tools?

  • Google, BUT our tool

    • focuses on finding high-frequency neighboring words rather than simply retrieving the documents that contain the target word


Data Resources

  • Corpus of newspaper articles (3.5 million words) [used for demo]

    • Advantage: large amount of data

    • Disadvantage: limited domain

  • Use a search engine to build a corpus of documents containing the target word

    • Advantages: various domains, dynamic data source

    • Disadvantage: time to download documents


Implementation (1)

  • Search a corpus to determine the most typical words: extract the words within a certain window of the target word and rank them by frequency

    • Compute the rank of single words and of pairs of words within the window
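The window extraction described on this slide can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the tokenizer, the function names `tokenize` and `context_counts`, the default window size, and the choice to treat pairs as unordered are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def context_counts(tokens, target, window=3):
    """Count single words and word pairs found within `window` tokens of `target`.

    Assumption: other occurrences of the target itself are not counted as context,
    and a pair is counted at most once per occurrence of the target.
    """
    singles, pairs = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if tokens[j] != target]
        singles.update(context)
        # Unordered pairs of distinct context words seen in this window
        pairs.update(combinations(sorted(set(context)), 2))
    return singles, pairs


text = "Of course we will stay the course. In due course the course of events changed."
singles, pairs = context_counts(tokenize(text), "course", window=2)
print(singles.most_common(3))
print(pairs.most_common(3))
```

Ranking by raw frequency alone tends to favor function words such as "the" and "of", which is one reason a weighting scheme is needed on top of these counts.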


Implementation (2)

  • Computing the rank of an expression

    • TF: raw count

    • IDF of a word:

    • Position normalization: reward context words closer to the target
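One way the three ingredients could be combined is sketched below. The slide does not preserve the exact formulas, so everything specific here is an assumption: IDF is taken as the common log(N / df), the position weight as 1 / distance, and the function name `score_context` and its parameters are hypothetical.

```python
import math
from collections import Counter


def score_context(tokens, target, doc_freq, n_docs, window=3):
    """Score context words of `target` by summing position_weight * idf per occurrence.

    Assumptions (not from the slides): idf = log(n_docs / df), position weight =
    1 / distance to the target, and unseen words get a document frequency of 1.
    Summing over occurrences plays the role of the raw TF count.
    """
    scores = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            word = tokens[j]
            if word == target:
                continue  # also skips j == i, so the distance below is never zero
            position_weight = 1.0 / abs(j - i)
            idf = math.log(n_docs / doc_freq.get(word, 1))
            scores[word] += position_weight * idf
    return scores


tokens = "he is a notorious criminal and a notorious liar".split()
# Hypothetical document frequencies out of 100 documents
doc_freq = {"he": 80, "is": 100, "a": 100, "and": 100, "criminal": 5, "liar": 2}
scores = score_context(tokens, "notorious", doc_freq, n_docs=100, window=2)
for word, score in scores.most_common():
    print(f"{word}\t{score:.3f}")
```

Under these assumptions, a word that occurs in every document gets an IDF of zero and drops out of the ranking, while rarer neighbors that sit next to the target rise to the top.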


Interface

  • Output a ranked list of expressions with example sentences via the Web

  • Examples:

    • course

    • information

    • notorious

    • come

    • come (without IDF)
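Attaching example sentences to a ranked expression might look roughly like this. It is a minimal sketch, assuming a naive punctuation-based sentence splitter and a hypothetical helper name `example_sentences`; the actual Web interface is not described in the slides.

```python
import re


def example_sentences(text, target, context_word, limit=2):
    """Return up to `limit` sentences that contain both the target and a context word."""
    # Naive splitter: break after ., ! or ? followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = []
    for sentence in sentences:
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        if target in words and context_word in words:
            hits.append(sentence)
    return hits[:limit]


text = (
    "In due course the report arrived. "
    "The course was hard. "
    "Results will follow in due course!"
)
examples = example_sentences(text, "course", "due")
for sentence in examples:
    print(sentence)
```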


Further Improvements

  • Use a search engine to build a corpus

  • Allow phrase searching

  • Provide option to search for highly frequent phrases as opposed to idiomatic expressions


Conclusion

  • We have presented a tool that, given a word, finds typical usages of that word in natural language

  • The tool should be useful for

    • learners of English

    • native speakers

