Lecture 24 Distributional based Similarity II
Lecture 24 Distributional based Similarity II. CSCE 771 Natural Language Processing. Topics: Distributional based word similarity. Readings: NLTK book Chapter 2 (wordnet), Text Chapter 20. April 10, 2013. Overview. Last Time (Programming): Examples of thesaurus based word similarity

Lecture 24 Distributional based Similarity II

CSCE 771 Natural Language Processing

  • Topics
    • Distributional based word similarity
  • Readings:
      • NLTK book Chapter 2 (wordnet)
      • Text Chapter 20

April 10, 2013

Overview
  • Last Time (Programming)
    • Examples of thesaurus based word similarity
      • path-similarity – memory fault; sim-path(c1,c2) = -log pathlen(c1,c2); Resnik, Lin
    • extended Lesk – glosses of words need to include hypernyms
  • Today
    • Distributional methods
  • Readings:
    • Text 19,20
    • NLTK Book: Chapter 10
  • Next Time: Distributional based Similarity II
Figure 20.8 Summary of Thesaurus Similarity measures
  • Elderly moment IS-A memory fault IS-A mistake
  • sim-path correct in table
Example computing PPMI
  • Need counts, so let's make up some
    • we need to edit this table to have counts
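A minimal sketch of the PPMI computation, in Python. The counts below are made up purely for illustration (the slide's own table is not reproduced here), and the function name `ppmi_table` is hypothetical. PPMI is taken to be PMI with negative values replaced by zero, and cells with a zero count are set to zero.

```python
import math

# Hypothetical co-occurrence counts: rows are target words, columns are context words.
counts = {
    "apricot":     {"computer": 0, "data": 0, "pinch": 1, "result": 0, "sugar": 1},
    "pineapple":   {"computer": 0, "data": 0, "pinch": 1, "result": 0, "sugar": 1},
    "digital":     {"computer": 2, "data": 1, "pinch": 0, "result": 1, "sugar": 0},
    "information": {"computer": 1, "data": 6, "pinch": 0, "result": 4, "sugar": 0},
}

def ppmi_table(counts):
    """Return a dict of dicts holding PPMI(w, c) for every cell in counts."""
    total = sum(sum(row.values()) for row in counts.values())
    # Marginal counts for each target word (row sums).
    word_totals = {w: sum(row.values()) for w, row in counts.items()}
    # Marginal counts for each context word (column sums).
    context_totals = {}
    for row in counts.values():
        for c, n in row.items():
            context_totals[c] = context_totals.get(c, 0) + n

    table = {}
    for w, row in counts.items():
        table[w] = {}
        for c, n in row.items():
            if n == 0:
                # log2(0) is undefined; treat unseen pairs as having no association.
                table[w][c] = 0.0
                continue
            p_wc = n / total
            p_w = word_totals[w] / total
            p_c = context_totals[c] / total
            # Positive PMI: clip negative associations to zero.
            table[w][c] = max(0.0, math.log2(p_wc / (p_w * p_c)))
    return table

ppmi = ppmi_table(counts)
print(round(ppmi["information"]["data"], 2))
```

With these counts the total is 19, so P(information, data) = 6/19, P(information) = 11/19 and P(data) = 7/19.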
Associations
  • PMI-assoc
  • assocPMI(w, f) = log2 [ P(w,f) / (P(w) P(f)) ]
  • Lin-assoc: f is composed of r (relation) and w’
  • assocLIN(w, f) = log2 [ P(w,f) / (P(w) P(r|w) P(w’|w)) ]
  • t-test_assoc (20.41)
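The three association measures can be sketched as small Python functions that take probabilities as inputs. The function and parameter names are hypothetical. Two assumptions are made here: the Lin denominator is taken as P(w) P(r|w) P(w’|w), and the t-test measure is taken as (P(w,f) - P(w)P(f)) / sqrt(P(f)P(w)), which is my reading of the cited equation (20.41) rather than something spelled out on the slide.

```python
import math

def assoc_pmi(p_wf, p_w, p_f):
    """PMI association: log2 of the joint over the product of the marginals."""
    return math.log2(p_wf / (p_w * p_f))

def assoc_lin(p_wf, p_w, p_r_given_w, p_w2_given_w):
    """Lin association, where feature f = (r, w').

    Assumes the denominator P(w) * P(r|w) * P(w'|w).
    """
    return math.log2(p_wf / (p_w * p_r_given_w * p_w2_given_w))

def assoc_ttest(p_wf, p_w, p_f):
    """t-test association: observed minus expected, scaled by sqrt of expected.

    Assumed form of equation (20.41).
    """
    return (p_wf - p_w * p_f) / math.sqrt(p_f * p_w)

# Illustrative probabilities only, not estimated from any corpus.
print(assoc_pmi(0.25, 0.5, 0.25))
print(assoc_lin(0.25, 0.5, 0.5, 0.5))
print(assoc_ttest(0.25, 0.5, 0.25))
```

Unlike PMI, the t-test measure does not take a logarithm, so it stays defined when P(w,f) is zero.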
Figure 20.10 Co-occurrence vectors
  • Dependency based parser – special case of shallow parsing
  • Relations identified from “I discovered dried tangerines.” (20.32):
    • discover(subject I) I(subject-of discover)
    • tangerine(obj-of discover) tangerine(adj-mod dried)
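As a rough sketch of how such relations become co-occurrence vectors, the Python below turns dependency triples into (relation, word) feature counts for each word. No parser is run: the triples are written by hand for the example sentence, and the function name `features_from_triples` and the "-of" suffix for the inverse direction are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hand-written (head, relation, dependent) triples for
# "I discovered dried tangerines." (lemmatized; not produced by a parser).
triples = [
    ("discover", "subject", "I"),
    ("discover", "obj", "tangerine"),
    ("tangerine", "adj-mod", "dried"),
]

def features_from_triples(triples):
    """Map each word to a Counter over (relation, other_word) features."""
    vectors = defaultdict(Counter)
    for head, rel, dep in triples:
        # The head sees the dependent through the relation ...
        vectors[head][(rel, dep)] += 1
        # ... and the dependent sees the head through the inverse relation.
        vectors[dep][(rel + "-of", head)] += 1
    return vectors

vectors = features_from_triples(triples)
for word, feats in vectors.items():
    print(word, dict(feats))
```

Run over a whole parsed corpus, the per-word counters would play the role of the rows in a co-occurrence table.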
Vectors review
  • dot-product
  • length
  • sim-cosine
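The three operations can be sketched in plain Python as follows. The function names are hypothetical, vectors are assumed to be equal-length lists of numbers, and returning 0.0 for a zero-length vector is a choice made here to avoid division by zero.

```python
import math

def dot(v, w):
    """Dot product: sum of element-wise products."""
    return sum(a * b for a, b in zip(v, w))

def length(v):
    """Euclidean length: square root of the vector's dot product with itself."""
    return math.sqrt(dot(v, v))

def sim_cosine(v, w):
    """Cosine similarity: dot product normalized by both vector lengths."""
    denom = length(v) * length(w)
    return dot(v, w) / denom if denom else 0.0

print(dot([1, 2, 3], [4, 5, 6]))
print(length([3, 4]))
print(sim_cosine([1, 2], [2, 4]))
```

Because the cosine divides out the lengths, a frequent word and a rare word with the same distribution over contexts still come out as highly similar.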
How to do in NLTK: http://www.cs.ucf.edu/courses/cap5636/fall2011/nltk.pdf
  • NLTK 3.0a1 released : February 2013
  • This version adds support for NLTK’s graphical user interfaces. http://nltk.org/nltk3-alpha/
  • Which similarity function in nltk.corpus.wordnet is appropriate for finding the similarity of two words?
  • I want to use a function for word clustering and the Yarowsky algorithm to find similar collocations in a large text.
  • http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Linguistics
  • http://en.wikipedia.org/wiki/Portal:Linguistics
  • http://en.wikipedia.org/wiki/Yarowsky_algorithm
  • http://nltk.googlecode.com/svn/trunk/doc/howto/wordnet.html