This homework task requires systematically searching for gapping constructions in the WSJ corpus using tregex. Report findings in the next class.
LING 581: Advanced Computational Linguistics Lecture Notes February 2nd
tregex • Assuming the corpus file wsj-00-24-tregex.mrg and the Java runtime memory setting -mx1000m
TREEBANK_3/docs/prsguid1.pdf
Homework Task
• Systematically write tregex search patterns for selected constructions from the Bracketing Guidelines, e.g. gapping
• Report how many instances of each construction are found in the Wall Street Journal text
• Present your results in class next time
Example: looking for passives
• Pattern: use variable names and regex group numbering to match the coindexation found in passives, where the subject NP-SBJ-i and the object of the VP, [NP [-NONE- [*-i]]], share the same index i
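The coindexation check described above can be illustrated outside tregex. The following is a minimal Python sketch, not the tregex pattern from the lecture: it parses one bracketed tree and reports every index i for which an NP-SBJ-i is followed by a sister VP that contains (NP (-NONE- *-i)). The function names (parse, subtrees, has_trace, passive_subjects) and the two example trees are hypothetical, written for illustration only. The search looks at any depth inside the VP, so it is an approximation of "object of VP".

```python
import re

def parse(s):
    """Parse one bracketed tree into (label, children); leaves are plain strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", s)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # consume "("
        label = ""                    # the outer wrapper in .mrg files has no label
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return node()

def subtrees(t):
    """Yield t and every non-leaf node below it."""
    yield t
    for c in t[1]:
        if isinstance(c, tuple):
            yield from subtrees(c)

def has_trace(t, index):
    """True if t contains a bare NP whose child is (-NONE- *-index)."""
    for label, children in subtrees(t):
        if label == "NP":
            for c in children:
                if isinstance(c, tuple) and c[0] == "-NONE-" and c[1] == ["*-" + index]:
                    return True
    return False

# The capture group plays the role of the shared index i.
SBJ = re.compile(r"^NP-SBJ-(\d+)$")

def passive_subjects(tree):
    """Return the indices i where NP-SBJ-i precedes a sister VP holding a *-i trace."""
    hits = []
    for _, children in subtrees(tree):
        kids = [c for c in children if isinstance(c, tuple)]
        for i, kid in enumerate(kids):
            m = SBJ.match(kid[0])
            if not m:
                continue
            for sib in kids[i + 1:]:
                if sib[0].startswith("VP") and has_trace(sib, m.group(1)):
                    hits.append(m.group(1))
                    break
    return hits

# Hypothetical example trees in Penn Treebank bracketing.
passive = "(S (NP-SBJ-1 (DT The) (NN bill)) (VP (VBD was) (VP (VBN passed) (NP (-NONE- *-1)))) (. .))"
active = "(S (NP-SBJ (NNP Congress)) (VP (VBD passed) (NP (DT the) (NN bill))) (. .))"

print(passive_subjects(parse(passive)))  # ['1']
print(passive_subjects(parse(active)))   # []
```

Restricting the trace to a bare NP label is a deliberate choice in this sketch: it keeps the match on object-position traces and skips empty subjects such as (NP-SBJ (-NONE- *-1)), which share the same index notation. Summing the number of hits over all trees in a corpus file would give the kind of count the homework asks for.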