Distance functions and IE – 4?
William W. Cohen
CALD

Announcements
Current statistics:
days with unscheduled student talks: 6
students with unscheduled student talks: 4
Projects are due: 4/28 (last day of class)
Additional requirement: draft (for comments) no later than 4/21.
SecondString (Cohen, Ravikumar, Fienberg):
Monge-Elkan is the best on average....
carefully-tuned heuristics (aka hacks)
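To make the Monge-Elkan idea concrete, here is a minimal sketch of the token-level ("level two") scheme usually attributed to Monge and Elkan: each token of the first string is paired with its best-matching token in the second, and those best scores are averaged. The inner similarity used here (normalized Levenshtein) is a stand-in chosen for brevity and is an assumption, not necessarily what SecondString uses; the function names are illustrative only.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def token_sim(a: str, b: str) -> float:
    """Assumed inner similarity in [0, 1]: 1 - distance / longer length."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def monge_elkan(s: str, t: str, inner=token_sim) -> float:
    """Average, over tokens of s, of the best inner score against tokens of t."""
    a_tokens = s.lower().split()
    b_tokens = t.lower().split()
    if not a_tokens or not b_tokens:
        return 0.0
    best_scores = [max(inner(a, b) for b in b_tokens) for a in a_tokens]
    return sum(best_scores) / len(a_tokens)
```

Note that the score is asymmetric: monge_elkan("William Cohen", "William W. Cohen") is 1.0 because every token on the left has an exact partner, while swapping the arguments gives a lower score because "W." has no good partner.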
similar (but not identical) process applied to word n-grams from text to do IE: extract if n-gram -> CD
NF => CD
NF-25 in OD
NF in CD?
“... NF-kappa B...”
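The n-gram extraction step can be sketched roughly as follows. This assumes that CD denotes a dictionary of known names and that "->" means "approximately matches an entry of"; both readings are assumptions, since the slides do not expand them. The similarity function is Python's difflib ratio, used only as a convenient stand-in, and the threshold and maximum n-gram length are hypothetical parameters.

```python
from difflib import SequenceMatcher


def word_ngrams(words, max_n):
    """Yield (start_index, n-gram text) for every word n-gram up to max_n words."""
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield i, " ".join(words[i:i + n])


def extract(text, dictionary, threshold=0.8, max_n=3):
    """Return (start, n-gram, dictionary entry, score) for each approximate match."""
    words = text.split()
    hits = []
    for start, gram in word_ngrams(words, max_n):
        for entry in dictionary:
            # Stand-in similarity; any string distance could be plugged in here.
            score = SequenceMatcher(None, gram.lower(), entry.lower()).ratio()
            if score >= threshold:
                hits.append((start, gram, entry, score))
    return hits
```

With a one-entry dictionary containing "NF-kappa B", the text "activation of NF-kappa B in cells" yields the exact n-gram as a hit, along with a few overlapping near-matches that a real system would need to prune.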
In general, “peaks” in the matrix scores indicate highly similar substrings.
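One way to see these peaks is with a Smith-Waterman style local-alignment matrix, sketched below. Treating the slide's matrix as a Smith-Waterman matrix is an assumption, and the match, mismatch, and gap scores are illustrative values, not tuned ones.

```python
def smith_waterman(s, t, match=2, mismatch=-1, gap=-1):
    """Fill a local-alignment score matrix; return (matrix, peak score, peak cell)."""
    rows, cols = len(s) + 1, len(t) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere, so high cells
            # mark the ends of locally similar substrings.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    return H, best, best_pos
```

The returned cell (i, j) is where the best-scoring pair of substrings ends: position i in s and position j in t. Tracing back from that cell until a zero is reached would recover the substrings themselves.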
words and expansions
Overall: precision 71.1%, recall 78.8% (opt)