
Can Lexical Semantics Predict Grammaticality in English Support-Verb-Nominalization Constructions?



Presentation Transcript


  1. Can Lexical Semantics Predict Grammaticality in English Support-Verb-Nominalization Constructions? Anthony Davis Leslie Barrett CodeRyte, Inc. TheLadders.com

  2. English SVN constructions • support verb + nominalization (direct object) ≈ main verb (root of nominalization) • ‘take a walk’ ≈ ‘walk’, ‘hold a belief’ ≈ ‘believe’ • Attested/acceptable SVN combinations appear somewhat (though not completely) arbitrary • ?‘make a walk’, but ‘make a decision’ (cf. French ‘prendre une décision’) • ‘have/harbor a belief’

  3. English SVN constructions • How much is predictable and how much just idiomatic? Can semantic properties of support verbs and nominalizations account for the observed combinations?

  4. English SVN constructions • How much is predictable and how much just idiomatic? Can semantic properties of support verbs and nominalizations account for the observed combinations? • Semantic compatibility of shared argument(s): • ‘feel pity’, ‘perform/undergo an evaluation’ • But many cases are less clear: ‘take a bath’, ‘hold the belief/?knowledge’, ‘have the belief/knowledge’

  5. English SVN constructions • How much is predictable and how much just idiomatic? Can semantic properties of support verbs and nominalizations account for the observed combinations? • Aspectual or Aktionsart compatibility: • For instance, stative SV and N: ‘have/hold/harbor a belief/dislike’ • Less clear for other classes

  6. English SVN constructions • How much is predictable and how much just idiomatic? Can semantic properties of support verbs and nominalizations account for the observed combinations? • Levin class features (fine-grained semantic properties)…

  7. Strategy of this research • Find SVN combinations in a corpus • Cluster support verbs and nominalizations by Levin class features • Test for statistically significant effects • Evaluation and conclusion

  8. Finding SVN combinations • Use pointwise mutual information (PMI) between verbs and the heads of their direct objects to find verb-object pairs that are strongly associated • Select those with high PMI values that are SVN constructions (or at least plausibly are, in most cases)

  9. PMI between verbs and their objects • PMI measures how closely two events are associated: PMI(x, y) = log₂ [ p(x, y) / (p(x) p(y)) ] • Here, we calculate PMI values over ~150 million words of parsed New York Times articles (we used the XLE parser from PARC and discarded sentences longer than 25 words)
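The PMI computation itself fits in a few lines. The sketch below uses toy verb-object counts (hypothetical numbers, not the authors' New York Times data or XLE parses):

```python
import math
from collections import Counter

def pmi_scores(pair_counts):
    """PMI(v, o) = log2( p(v, o) / (p(v) * p(o)) ) for each verb-object pair."""
    total = sum(pair_counts.values())
    verb_counts, obj_counts = Counter(), Counter()
    for (v, o), c in pair_counts.items():
        verb_counts[v] += c
        obj_counts[o] += c
    return {(v, o): math.log2((c / total) /
                              ((verb_counts[v] / total) * (obj_counts[o] / total)))
            for (v, o), c in pair_counts.items()}

# Toy verb-object counts (illustrative only)
pairs = Counter({("take", "walk"): 30, ("take", "car"): 5,
                 ("eat", "hamburger"): 25, ("eat", "walk"): 1})
scores = pmi_scores(pairs)
# 'take walk' comes out strongly associated; 'eat walk' does not
assert scores[("take", "walk")] > scores[("eat", "walk")]
```

High-PMI pairs whose verb is commonly a support verb are then candidate SVN constructions.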

  10. Examples of PMI values • Some objects of ‘eat’: verb=eat — obj=hamburger 139.249, obj=pretzel 90.359; obj in class Food 18.156, obj in class Substance 7.89, obj in class Sound 0.324, obj in class Place 0.448 • We also calculated PMI for broader semantic classes of terms, e.g. food, substance. Semantic classes were taken from the Cambridge International Dictionary of English (CIDE); there are about 2000 of them, arranged in a shallow hierarchy

  11. Examples of PMI values • 20 highest PMI values for SVN combinations:

  12. Finding SVN combinations • We examined the 200 highest-PMI verb-object combinations (PMI > 5) in which the verb is commonly a support verb, and selected the 146 of them that actually appear to be SVN constructions for further analysis • These combinations contain 21 support verbs and 118 nominalizations

  13. Levin-class features • Levin (1993) is a well-known study of verb diathesis alternations and their underlying lexical semantics • Several thousand verbs are categorized by the alternations they exhibit, and grouped with other verbs displaying the same set of alternations • We use these categories as features of support verbs and the verb roots of nominalizations

  14. Levin-class examples • 2. Alternations involving arguments within VP • 2.1 Dative alternation • 2.2 Benefactive alternation • 2.3 Locative alternation … • 26. Verbs of creation and transformation • 26.1 build verbs • 26.2 grow verbs • 26.3 verbs of preparing

  15. Levin-class features • For each verb (support verb or root of nominalization), create a vector of binary features from its Levin-class memberships: • Example: give

  16. Levin-class features • For each verb (support verb or root of nominalization), create a vector of binary features from its Levin-class memberships: • Example: give • Levin-classes: 1.1.2.1, 2.1, 13.1

  17. Levin-class features • For each verb (support verb or root of nominalization), create a vector of binary features from its Levin-class memberships: • Example: give • Levin-classes: 1.1.2.1, 2.1, 13.1 • vector: one binary feature per Levin class, set to 1 for these three classes
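A minimal sketch of the vector construction. Only ‘give’’s three class memberships come from the slide; the small class inventory below is an illustrative subset, not Levin’s full list:

```python
# Illustrative subset of Levin class IDs (not the full inventory)
CLASSES = ["1.1.2.1", "2.1", "2.2", "13.1", "26.1", "29.5"]

def levin_vector(memberships, classes=CLASSES):
    """One binary feature per Levin class: 1 if the verb belongs to it."""
    return [1 if c in memberships else 0 for c in classes]

# 'give' belongs to classes 1.1.2.1, 2.1, and 13.1 (as on the slide)
give = levin_vector({"1.1.2.1", "2.1", "13.1"})
print(give)  # [1, 1, 0, 1, 0, 0]
```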

  18. Two ways to cluster the vectors • Concatenate the vectors of the support verb and the nominalization for each of the 146 SVN constructions

  19. Two ways to cluster the vectors • Concatenate the vectors of the support verb and the nominalization for each of the 146 SVN constructions • support verb + nominalization

  20. Two ways to cluster the vectors • Concatenate the vectors of the support verb and the nominalization for each of the 146 SVN constructions • support verb + nominalization → SV vector, nom vector

  21. Two ways to use feature vectors • Concatenate the vectors of the support verb and the nominalization for each of the 146 SVN constructions • support verb + nominalization → SV vector, nom vector → concatenated SVN vector

  22. Two ways to use feature vectors • Concatenate the vectors of the support verb and the nominalization for each of the 146 SVN constructions • support verb + nominalization → SV vector, nom vector → concatenated SVN vector • then cluster these concatenated SVN vectors
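The concatenation step is a plain vector append. A toy sketch (the three-bit vectors are hypothetical, not real Levin features):

```python
def svn_vector(sv_vec, nom_vec):
    """Joint feature vector for one SVN construction:
    support-verb features followed by nominalization-root features."""
    return sv_vec + nom_vec

take = [1, 0, 1]  # support-verb Levin features (toy)
walk = [0, 1, 1]  # nominalization-root Levin features (toy)
print(svn_vector(take, walk))  # [1, 0, 1, 0, 1, 1]
```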

  23. Two ways to use feature vectors • Alternatively, cluster the two sets of vectors (SV vectors and nom vectors) separately...

  24. Two ways to use feature vectors • … and look for correlations between SV clusters and nom clusters in the 146 SVN pairs: some cells of the SV-cluster × nom-cluster table contain more pairs than expected, others fewer pairs than expected

  25. Clustering concatenated vectors • 146 SVN pairs clustered into 4, 5, 6, or 7 clusters • CLUTO (Karypis et al.), using “direct” clustering method and cosine similarity metric • Resulting clusters (in the 7-way clustering) are a mixed bag… • All and only the 12 pairs with take as support verb • All 13 pairs with feel, plus 3 (of 8) with suffer • Nominalizations denoting emotion (e.g., ‘harbor disdain/resentment’, ‘extend appreciation’) • Nominalizations denoting creation, transformation, or destruction (‘undergo transformation/conversion’, ‘suffer alteration/devastation’, ‘perform extermination’)
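CLUTO is a standalone toolkit; as a rough stand-in for its “direct” k-way clustering with cosine similarity, here is a k-means-style sketch in pure Python over toy concatenated vectors (deterministic seeding, not CLUTO’s actual algorithm):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all-zero)."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def cluster(vectors, k, iters=20):
    """K-way clustering: assign each vector to its most cosine-similar
    centroid, then recompute centroids as member averages."""
    # simple deterministic init: evenly spaced vectors as seed centroids
    centroids = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [max(range(k), key=lambda j: cosine(vec, centroids[j]))
                  for vec in vectors]
        for j in range(k):
            members = [v for v, l in zip(vectors, labels) if l == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy concatenated SVN vectors: two clearly separable groups
vecs = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0],
        [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
labels = cluster(vecs, 2)
# the first three vectors end up together, separate from the last three
```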

  26. Significance of clusters • Does the average PMI of SVN pairs differ significantly across clusters? • Can’t make any assumptions about distributions of PMI scores, so we use score ranks • Test with Kruskal-Wallis analysis of variance (still assumes, perhaps wrongly, identical distributions of ranks; test is for equality of medians) • Test statistic is: H = [12 / (N(N+1))] · Σᵢ Rᵢ² / nᵢ − 3(N+1), where N is the total number of scores, nᵢ the number of scores in cluster i, and Rᵢ the sum of their ranks
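The Kruskal-Wallis H statistic can be computed directly from per-cluster rank sums. A self-contained sketch (the per-cluster PMI scores below are hypothetical):

```python
def kruskal_h(groups):
    """Kruskal-Wallis H = 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1),
    where R_i is the sum of ranks in group i (tied values get average ranks)."""
    pooled = sorted((x, gi) for gi, g in enumerate(groups) for x in g)
    n = len(pooled)
    ranks = {}
    i = 0
    while i < n:                       # assign average ranks to runs of ties
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg = (i + 1 + j) / 2          # average of ranks i+1 .. j
        for t in range(i, j):
            ranks[t] = avg
        i = j
    rank_sums = [0.0] * len(groups)
    for idx, (_, gi) in enumerate(pooled):
        rank_sums[gi] += ranks[idx]
    return (12 / (n * (n + 1))
            * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
            - 3 * (n + 1))

# Hypothetical PMI scores grouped by cluster; cluster 2 sits clearly higher
h = kruskal_h([[5.1, 6.3, 7.2], [5.4, 5.0, 4.8], [9.1, 8.7, 8.2]])
```

H is then compared against a chi-squared distribution with (number of clusters − 1) degrees of freedom.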

  27. Significance of clusters • Results fall short of significance (P ≈ 0.08) • No support for “better” clusters having significantly higher PMI values • Features of individual support verbs (like take) may overwhelm any semantic effects

  28. Clusters of support verbs

  29. Clusters of nominalizations • In all clusterings (4-7 clusters), cluster 0 has the same members: • ‘ache’, ‘admiration’, ‘appreciation’, ‘desire’, ‘dislike’, ‘enjoyment’, ‘evaluation’, ‘feeling’, ‘need’, ‘pity’, ‘regret’, ‘resentment’, ‘respect’, ‘reverence’, ‘taste’, ‘trust’, ‘veneration’, ‘want’ • Clearly, there’s some underlying semantic similarity here (emotion, sensation, judgment)

  30. Contingency tables for SVN pairs • We examined the distribution of SVN pairs by the cluster membership of their support verbs and nominalizations, for all clusterings • Example for 3 SV and 4 nom clusters:

  31. Chi-squared tests

  32. Chi-squared, “leaving one out” • Does significance vanish when one cluster is removed? (3 support verb & 4 nominalization clusters) • Yes
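The contingency-table test and the “leave one out” check can be sketched as follows; the 3×4 table of counts is hypothetical, not the authors’ data:

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table of counts."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / total  # expected count under independence
            stat += (obs - exp) ** 2 / exp
    return stat

# Hypothetical SV-cluster (rows) x nom-cluster (columns) counts of SVN pairs
table = [[10, 2, 3, 5],
         [1, 12, 2, 4],
         [6, 3, 15, 8]]
full = chi_squared(table)

# "leave one out": drop one SV cluster's row and re-test
# (dropping a nom cluster's column works analogously)
reduced = chi_squared(table[1:])
```

The statistic is compared against a chi-squared distribution with (rows − 1) × (columns − 1) degrees of freedom; if significance vanishes when one cluster is removed, that cluster is driving the effect.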

  33. What’s the source of the significance? • SV cluster 2: • feel, harbor, hold, maintain, suffer • The most distinguishing and discriminative Levin class features are “29: verbs with predicative complements” and its subclass “29.5: conjecture verbs” • SV cluster 3: • create, effect, form, have, make, perform, take • The most distinguishing and discriminative Levin class feature is “26: verbs of creation and transformation”

  34. What’s the source of the significance? • Nominalization cluster 0: • ache, admiration, appreciation, desire, dislike, etc. • The most distinguishing and discriminative Levin class features are “2: Alternations involving arguments within VP” and some of its subclass features • This effect is unsurprising • Support verbs denoting creation or transformation aren’t a good semantic match for nominalizations denoting emotion, sensation, or judgment • The number of SVN pairs with SV from cluster 2 and Nom from cluster 0 is low in our tables • However, the Levin-class features characterizing Nom cluster 0 are not directly related to this semantic mismatch

  35. Evaluation • Overall, the Levin-class features appear not to be the key to understanding semantic regularities (to the extent they exist) in SVN constructions; why?

  36. Evaluation • Overall, the Levin-class features appear not to be the key to understanding semantic regularities (to the extent they exist) in SVN constructions; why? • The data and analysis we have employed here fail to reveal the genuine relationship between the semantic factors underlying Levin classes and those underlying the acceptability of SVN constructions

  37. Evaluation • Overall, the Levin-class features appear not to be the key to understanding semantic regularities (to the extent they exist) in SVN constructions; why? • The semantic factors underlying the acceptability of SVN constructions are indeed different from those underlying Levin classes, so no strong correlation is to be expected

  38. Evaluation • Overall, the Levin-class features appear not to be the key to understanding semantic regularities (to the extent they exist) in SVN constructions; why? • The role of semantic factors in the acceptability of SVN constructions is overshadowed by other considerations that we have not tested for • SVN acceptability is probably somewhat arbitrary; therefore, no strong correlation is to be expected

  39. ¡Thanks! ¿Questions? Thanks to Oliver Jojic and Robert Rubinoff at StreamSage for the PMI calculations, to Shachi Dave at StreamSage for running the XLE parser, and to PARC for the use of the parser.
