CONCEPTUAL KNOWLEDGE: EVIDENCE FROM CORPORA, THE MIND, AND THE BRAIN

  1. CONCEPTUAL KNOWLEDGE: EVIDENCE FROM CORPORA, THE MIND, AND THE BRAIN Massimo Poesio, Uni Trento, Center for Mind / Brain Sciences; Uni Essex, Language & Computation (joint work with A. Almuhareb, E. Barbu, M. Baroni, B. Murphy)

  2. MOTIVATIONS • Research on conceptual knowledge is carried out in Computational Linguistics, Neural Science, and Psychology • But there is limited interchange between CL and the other disciplines studying concepts • Except indirectly through the use of WordNet • This work: use data from Psychology and Neural Science to evaluate (vector-space) models produced in CL

  3. OUTLINE • Vector space representations • A `semantic’ vector space model • How to evaluate such models • Attribute extraction and Feature norms • Category distinctions and Brain data

  4. CONCEPTUAL SEMANTICS IN VECTOR SPACE

  5. LEXICAL ACQUISITION IN CORPUS / COMP LING • Vectorial representations of lexical meaning derived from IR • WORD-BASED vector models: • vector dimensions are words • Schuetze 91, 98; HAL, LSA, Turney, Rapp • GRAMMATICAL RELATION models: • vector dimensions are pairs <Rel, W> • Grefenstette 93, Lin 98, Curran & Moens, Pantel, Widdows, Pado & Lapata, …
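
For concreteness, a minimal sketch of the two feature spaces on toy data; this is not any of the cited systems, and the tokenised sentences and dependency triples below are illustrative assumptions:

```python
from collections import Counter

def word_based_vector(target, tokenized_sentences, window=2):
    """WORD-BASED model: count words co-occurring with `target` within +/- `window` tokens."""
    vec = Counter()
    for sent in tokenized_sentences:
        for i, tok in enumerate(sent):
            if tok == target:
                context = sent[max(0, i - window):i] + sent[i + 1:i + 1 + window]
                vec.update(context)
    return vec

def relation_based_vector(target, dependency_triples):
    """GRAMMATICAL RELATION model: dimensions are <relation, word> pairs, e.g. ('obj', 'chased')."""
    vec = Counter()
    for head, rel, dep in dependency_triples:
        if dep == target:
            vec[(rel, head)] += 1
        elif head == target:
            vec[(rel + "-of", dep)] += 1
    return vec

# Toy data (illustrative only)
sents = [["the", "dog", "chased", "the", "cat"], ["the", "cat", "slept"]]
triples = [("chased", "subj", "dog"), ("chased", "obj", "cat"), ("slept", "subj", "cat")]
print(word_based_vector("cat", sents))
print(relation_based_vector("cat", triples))
```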

  6. FEATURES IN VECTOR SPACE MODELS [figure contrasting WORD-based dimensions with GRAMMATICAL RELATION-based dimensions]

  7. STRENGTHS OF THIS APPROACH: CATEGORIZATION

  8. LIMITATIONS OF THIS WORK • Very simplistic view of concepts • In fact, typically extract lexical representations for WORDS (non-disambiguated) • Limited evaluation • Typical evaluation: judges’ opinions about correctness of distances / comparing with WordNet • Most work not connected with work on concepts in Psychology / Neural Science

  9. OUR WORK • Acquire richer, more semantic-oriented concept descriptions by exploiting relation extraction techniques • Develop task-based methods for evaluating the results • Integrate results from corpora with results from psychology & neural science

  10. THIS TALK • Acquire richer, more semantic-oriented concept descriptions by exploiting relation extraction techniques • Develop task-based methods for evaluating the results • Integrate results from corpora with results from psychology & neural science

  11. OUTLINE • Vector space representations • A `semantic’ vector space model • How to evaluate such models • Attribute extraction and Feature norms • Category distinctions and Brain data

  12. MORE ADVANCED THEORIES OF CONCEPTS • In Linguistics: • Pustejovsky • In AI: • Description Logics • Formal ontologies • In Psychology: • Theory Theory (Murphy, 2002) • FUSS (Vigliocco Vinson et al)

  13. SEMANTIC CONCEPT DESCRIPTIONS: PUSTEJOVSKY (1991, 1995) • Lexical entries have a QUALIA STRUCTURE consisting of four ‘roles’ • FORMAL role: what type of object it is (shape, color, …) • CONSTITUTIVE role: what it consists of (parts, stuff, etc.) • E.g., for books: chapters, index, paper, … • TELIC role: what the purpose of the object is (e.g., for books, READING) • AGENTIVE role: how the object was created (e.g., for books, WRITING)
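
A minimal sketch of what a qualia-structure record might look like as a data structure; the QualiaStructure class and the 'book' entry are illustrative assumptions, not Pustejovsky's formalism:

```python
from dataclasses import dataclass, field

@dataclass
class QualiaStructure:
    formal: list = field(default_factory=list)        # what type of object it is
    constitutive: list = field(default_factory=list)  # what it consists of
    telic: list = field(default_factory=list)         # what it is for
    agentive: list = field(default_factory=list)      # how it comes into being

book = QualiaStructure(
    formal=["physical_object", "information"],
    constitutive=["chapter", "index", "paper"],
    telic=["read"],
    agentive=["write"],
)
print(book.telic)  # ['read']
```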

  14. BEYOND BUNDLES OF ATTRIBUTES: DESCRIPTION LOGICS, THEORY THEORY • We know much more about concepts than the fact that they have certain attributes: • We know that cars have 4 wheels whereas bicycles have 2 • We don’t just know that people have heads, bodies and legs, but that heads are attached in certain positions whereas legs are attached in other ones • Facts of this type can be expressed even in the simplest concept description languages, those of description logics

  15. BEYOND SIMPLE RELATIONS: DESCRIPTION LOGICS Bear ⊑ (and Animal (= 4 Paw) …) Strawberry ⊑ (and Fruit (fills Color red) …) Female ⊑ (and Human (¬ Male))

  16. WORD SENSE DISCRIMINATION • The senses of palm in WordNet • the inner surface of the hand from the wrist to the base of the fingers • a linear unit based on the length or width of the human hand • any plant of the family Palmae having an unbranched trunk crowned by large pinnate or palmate leaves • an award for winning a championship or commemorating some other event

  17. CONCEPT ACQUISITION MEETS RELATION EXTRACTION • We developed methods to identify SEMANTIC properties of concepts (`Deep lexical relations’) • ATTRIBUTES and their VALUES • Almuhareb & Poesio 2004, 2005 • Extracting QUALIA • Poesio & Almuhareb 2005 • Letting relations emerge from the data: STRUDEL • Baroni et al, Cognitive Science, to appear • Extracting Wu & Barsalou-style relations • Poesio, Barbu, Giuliano & Romano, 2008 • We showed that, for a variety of tasks, such conceptual descriptions are ‘better’ than word-based or grammatical-function-based descriptions

  18. ALMUHAREB & POESIO 2005: USING A PARSER • Looking only for (potential) attributes and their values works better than using all grammatical relations, even if the attributes are obtained using text patterns (“the X of the Y”)
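
A minimal sketch of attribute-candidate harvesting with the “the X of the Y” pattern; the regular expression and the candidate_attributes helper are illustrative assumptions, and the original work used richer pattern variants and Web-scale counts:

```python
import re
from collections import Counter

# One instantiation of the "the X of the Y" pattern family
PATTERN = re.compile(r"\bthe (\w+) of the (\w+)\b", re.IGNORECASE)

def candidate_attributes(text, concept):
    """Return counts of X in 'the X of the <concept>' matches found in `text`."""
    counts = Counter()
    for attr, noun in PATTERN.findall(text):
        if noun.lower() == concept:
            counts[attr.lower()] += 1
    return counts

text = "The price of the car rose. The colour of the car and the speed of the car matter."
print(candidate_attributes(text, "car"))  # Counter({'price': 1, 'colour': 1, 'speed': 1})
```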

  19. ATTRIBUTES AND VALUES VS. ALL CORPUS FEATURES

  20. SUPERVISED EXTRACTION OF CONCEPT DESCRIPTIONS • Using a theory of attributes merging ideas from Pustejovsky and Guarino (Poesio and Almuhareb, 2005) • Using Wu and Barsalou’s theory of attributes (Poesio Barbu Romano & Giuliano, 2008)

  21. SUPERVISED EXTRACTION OF CONCEPT DESCRIPTIONS • Using a theory of attributes merging ideas from Pustejovsky and Guarino (Poesio and Almuhareb, 2005) • Using Wu and Barsalou’s theory of attributes (Poesio Barbu Romano & Giuliano, 2008)

  22. THE CLASSIFICATION SCHEME FOR ATTRIBUTES OF POESIO & ALMUHAREB 2005 • PART • (cf. Guarino’s non-relational attributes, Pustejovsky’s constitutive roles) • RELATED OBJECT • Non-relational attributes other than parts, relational roles • QUALITY • Guarino’s qualities, Pustejovsky’s formal roles • ACTIVITY • Pustejovsky’s telic and agentive roles • RELATED AGENT • NOT AN ATTRIBUTE (= everything else)

  23. A SUPERVISED FEATURE CLASSIFIER • We developed a supervised feature classifier that relies on 4 types of information • Morphological info (Dixon, 1991) • Question patterns • Features of features • Feature use • Some nouns are used more commonly as features than as concepts: i.e., “the F of the C is” more frequent than “the * of the F is” • (The last three of these cues rely on info extracted from the Web)
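
A minimal sketch of the ‘feature use’ cue just described; pattern_count is a hypothetical stand-in for the Web counts used in the original work and here simply counts matches in a local corpus string:

```python
import re

def pattern_count(corpus, pattern):
    """Hypothetical stand-in for a Web hit count: matches in a local corpus."""
    return len(re.findall(pattern, corpus, flags=re.IGNORECASE))

def feature_use_ratio(corpus, noun):
    """How often the noun occurs as a feature ('the F of the C') vs. as a concept ('the * of the F')."""
    as_feature = pattern_count(corpus, rf"\bthe {noun} of the \w+")
    as_concept = pattern_count(corpus, rf"\bthe \w+ of the {noun}\b")
    return (as_feature + 1) / (as_concept + 1)  # add-one smoothing for the toy example

corpus = "the color of the car ... the color of the wall ... the shade of the color"
print(feature_use_ratio(corpus, "color"))  # > 1 suggests 'color' behaves like an attribute
```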

  24. THE EXPERIMENT • We created a BALANCED DATASET • ~ 400 concepts • representing all 21 WordNet classes, including both ABSTRACT and CONCRETE concepts • balanced as to ambiguity and frequency • We collected 20,000 candidate features of these concepts from the Web using patterns • We hand-classified 1,155 candidate features • We used these data to train • A binary classifier (feature / non-feature) • A 5-way classifier
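
A minimal sketch of how the binary (feature / non-feature) classifier could be trained on the hand-labelled candidates; the two-dimensional cue encoding and the use of logistic regression are illustrative assumptions, not the exact setup of the experiment:

```python
from sklearn.linear_model import LogisticRegression

# Each candidate is encoded by toy cue values, e.g. a morphological cue (0/1)
# and a feature-use ratio; labels: 1 = feature, 0 = non-feature.
X = [[1, 2.3], [0, 0.4], [1, 1.8], [0, 0.2]]
y = [1, 0, 1, 0]

clf = LogisticRegression().fit(X, y)
print(clf.predict([[1, 2.0]]))  # [1] -> classified as a feature
```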

  25. OUTLINE • Vector space representations • An example of `Semantic-based’ vector space model • Evaluating such models • Attribute extraction and Feature norms • Category distinctions and Brain data

  26. EVALUATION • Qualitative: • Visual inspection • Ask subjects to assess correctness of the classification of the attributes • Quantitative: • Use conceptual descriptions for CLUSTERING (CATEGORIZATION)
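
A minimal sketch of the quantitative, clustering-based evaluation: cluster the concept descriptions and score the result against gold categories. The choice of k-means and of purity as the measure, and the toy vectors, are illustrative assumptions rather than the original experimental setup:

```python
import numpy as np
from sklearn.cluster import KMeans

def purity(clusters, gold):
    """Fraction of concepts assigned to the majority gold class of their cluster."""
    total = 0
    for c in set(clusters):
        members = [g for cl, g in zip(clusters, gold) if cl == c]
        total += max(members.count(g) for g in set(members))
    return total / len(gold)

vectors = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]])  # toy concept vectors
gold = ["animal", "animal", "vehicle", "vehicle"]                     # gold categories
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
print(purity(clusters, gold))  # 1.0 on this toy data
```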

  27. VISUAL EVALUATION: TOP 400 FEATURES OF DEER ACCORDING TO OUR CLASSIFIER

  28. VISUAL EVALUATION: QUALITIES

  29. QUANTITATIVE EVALUATION • ATTRIBUTES • PROBLEM: can’t compare against WordNet • Precision / recall against hand-annotated datasets • Human judges (ourselves): • We used the classifiers to classify the top 20 features of 21 randomly chosen concepts • We separately evaluated the results • CATEGORIES: • Clustering of the balanced dataset • PROBLEM: The WordNet category structure is highly subjective

  30. ATTRIBUTE CLASSIFICATION

  31. CLUSTERING WITH 2-WAY CLASSIFIER

  32. CLUSTERING: ERROR ANALYSIS

  33. CLUSTERING: ERROR ANALYSIS

  34. CLUSTERING: ERROR ANALYSIS

  35. CLUSTERING: ERROR ANALYSIS IN WORDNET: PAIN

  36. LIMITS OF THIS TYPE OF EVALUATION • No way of telling how complete / accurate our concept descriptions are • Both in terms of relations and in terms of their relative importance • No way of telling whether the category distinctions we get from WordNet are empirically founded

  37. BEYOND JUDGES / EVALUATION AGAINST WORDNET • Task-based evaluation • Evidence from other areas of cognitive science • (ESSLLI 2008 Workshop - Baroni / Evert / Lenci )

  38. TASK-BASED (BLACK-BOX) EVALUATION • Tasks requiring lexical knowledge: • Lexical tests: • TOEFL test (Rapp 2001, Turney 2005) • NLP tasks: • E.g., anaphora resolution (Poesio et al 2004) • Actual applications • E.g., language models (Mitchell & Lapata ACL 2009, Lapata invited talk)

  39. EVIDENCE FROM OTHER AREAS OF COGNITIVE SCIENCE • Attributes: evidence from psychology • Association lists (priming) • E.g., use results of association tests to evaluate proximity (Lund et al, 1995; Pado and Lapata, 2008) • Comparison against feature norms (Schulte im Walde, 2008) • Feature norms • Category distinctions: evidence from neural science

  40. OUTLINE • Vector space representations • An example of `Semantic-based’ vector space model • How to evaluate such models • Attribute extraction and Feature norms • Category distinctions and Brain data

  41. FEATURE-BASED REPRESENTATIONS IN PSYCHOLOGY • Feature-based concept representations assumed by many cognitive psychology theories (Smith and Medin, 1981, McRae et al, 1997) • Underpin development of prototype theory (Rosch et al) • Used, e.g., to account for semantic priming (McRae et al, 1997; Plaut, 1995) • Underlie much work on category-specific defects (Warrington and Shallice, 1984; Caramazza and Shelton, 1998; Tyler et al, 2000; Vinson and Vigliocco, 2004)

  42. FEATURE NORMS • Subjects produce lists of features for a concept • Features are weighted by the number of subjects that produce them • Several norms exist (Rosch and Mervis, Garrard et al, McRae et al, Vinson and Vigliocco) • Substantial differences in collection methodology and results
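
A minimal sketch of how such a norm is assembled from subject responses, with each feature weighted by the number of subjects who produced it; the toy responses below are invented:

```python
from collections import Counter

def build_norm(subject_responses):
    """subject_responses: list of feature lists, one per subject, for a single concept."""
    weights = Counter()
    for features in subject_responses:
        for f in set(features):   # count each subject at most once per feature
            weights[f] += 1
    return weights

responses = [["wheel", "engine", "loud"], ["wheel", "fast"], ["wheel", "engine"]]
print(build_norm(responses))  # Counter({'wheel': 3, 'engine': 2, 'loud': 1, 'fast': 1})
```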

  43. SPEAKER-GENERATED FEATURES (VINSON AND VIGLIOCCO)

  44. COMPARING CORPUS FEATURES WITH FEATURE NORMS (Almuhareb et al 2005, Poesio et al 2007) • 35 concepts in common between the Almuhareb & Poesio dataset and the dataset produced by Vinson and Vigliocco (2002, 2003) • ANIMALS: bear, camel, cat, cow, dog, elephant, horse, lion, mouse, sheep, tiger, zebra • FRUIT: apple, banana, cherry, grape, lemon, orange, peach, pear, pineapple, strawberry, watermelon • VEHICLE: airplane, bicycle, boat, car, helicopter, motorcycle, ship, truck, van • We compared the features we obtained for these concepts with the speaker-generated features collected by Vinson and Vigliocco

  45. RESULTS • Best recall: ~ 52% (using all attributes and values) • Best precision: ~ 19% • But: high correlation (ρ = .777) between the distances between concept representations obtained from corpora and the distances between the representations for the same concepts obtained from subjects (using the cosine as a measure of similarity)
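
A minimal sketch of how such a correlation can be computed: pairwise cosine similarities are obtained in the corpus space and in the norm space, and the two lists of pairwise values are compared with Spearman's rho. The toy vectors are invented; this is not the original evaluation script:

```python
import numpy as np
from itertools import combinations
from scipy.stats import spearmanr

def pairwise_cosines(space):
    """Cosine similarity for every pair of concepts, in a fixed (sorted) order."""
    concepts = sorted(space)
    sims = []
    for a, b in combinations(concepts, 2):
        u, v = np.asarray(space[a], float), np.asarray(space[b], float)
        sims.append(float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v))))
    return sims

corpus_space = {"cat": [3, 1, 0], "dog": [2, 2, 0], "car": [0, 1, 4]}  # toy corpus vectors
norm_space   = {"cat": [5, 0, 1], "dog": [4, 1, 1], "car": [0, 0, 6]}  # toy norm vectors
rho, _ = spearmanr(pairwise_cosines(corpus_space), pairwise_cosines(norm_space))
print(rho)
```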

  46. DISCUSSION • Substantial differences in features and overlap, but correlation similar • Problems: • Each feature norm is slightly different • They have been normalized by hand: LOUD, NOISY, NOISE all mapped to LOUD

  47. AN EXAMPLE: STRAWBERRY

  48. Problem: differences between feature norms • motorcycle • Vinson & Vigliocco: • wheel, motor, loud, vehicle, wheel, fast, handle, ride, transport, bike, human, danger, noise, seat, brake, drive, fun, gas, machine, object, open, small, travel, wind • Garrard et al: • vehicle, wheel, fast, handlebar, light, seat, make a noise, tank, metal, unstable, tyre, coloured, sidecar, indicator, pannier, pedal, speedometer, manoeuvrable, race, brakes, stop, move, engine, petrol, economical, gears • McRae et al: • wheels, 2_wheels, dangerous, engine, fast, helmets, Harley_Davidson, loud, 1_or_2_people, vehicle, leather, transportation, 2_people, fun, Hell's_Angels, gasoline • Mutual correlation of ranks ranges from 0.4 to 0.7

  49. DISCUSSION • Preliminary conclusions: need to collect new feature norms for CL • E.g., use similar techniques to collect attributes for WordNet • See Kremer & Baroni 2008 • For more work on using feature norms for conceptual acquisition, see • Schulte im Walde 2008 • Baroni et al to appear • For the correlation between feature norms and information in WordNet (meronymy, isa, plus info from glosses): Barbu & Poesio GWC 2008

  50. OUTLINE • Vector space representations • An example of `Semantic-based’ vector space model • How to evaluate such models • Attribute extraction and Feature norms • Category distinctions and brain data
