
HUMANITIES COMPUTING D: LEXICOGRAPHY & COMPUTERS

Electronic dictionaries

WordNet


Electronic dictionaries

Computational tools are no longer used only to produce paper dictionaries, but also to develop new kinds of dictionaries that support new forms of search


ENGLISH DICTIONARIES IN ELECTRONIC FORM

  • Oxford English Dictionary, second edition

  • Oxford Talking Dictionary

  • Concise Oxford Dictionary

  • Learner dictionaries:

    • Longman Dictionary of Contemporary English (LDOCE)

    • Collins COBUILD English Dictionary


CONCISE OXFORD DICTIONARY

  • SEARCH:

    • Headword search (with the * wildcard)

    • Hypertext search

    • Full text search (also of phrases / groups)

  • FILTERS:

    • etymology, phrasal verbs, suffixes
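
A headword search with * can be pictured as shell-style pattern matching over the list of headwords. The sketch below is only an illustration: the headword list and the function name `headword_search` are invented here, and nothing is claimed about how the Concise Oxford Dictionary CD-ROM actually implements its search.

```python
from fnmatch import fnmatchcase

# Hypothetical headword list; a real dictionary holds tens of thousands.
HEADWORDS = ["stock", "stockade", "stockbroker", "stocking", "livestock", "store"]


def headword_search(pattern, headwords=HEADWORDS):
    """Return the headwords matching a pattern in which * stands for any string."""
    return [hw for hw in headwords if fnmatchcase(hw, pattern)]


print(headword_search("stock*"))  # headwords beginning with "stock"
print(headword_search("*stock"))  # headwords ending in "stock"
```

Note that `fnmatchcase` also gives `?` and `[...]` a special meaning, which may or may not match the wildcard syntax of a given dictionary.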


COLLINS: COBUILD

  • Available from:

    • http://www.biblio.unitn.it/BancheDati/BancheDati.asp


ELECTRONIC DICTIONARIES FOR ITALIAN

  • VELI

  • Zanichelli: CD-ROM Multilingue, Scaffale Elettronico

  • Devoto-Oli

  • Garzanti: IPA, `talks’


DEVOTO-OLI


EXAMPLE: DEVOTO-OLI

  • Basic search

    • Citation forms (incremental)

  • Hyperlinks

  • Definition / inflection

  • Synonyms / antonyms

  • Advanced search

  • Missing: pronunciation; citations?

  • Limited: historical information


DEVOTO-OLI: SYNONYMS AND ANTONYMS


EXAMPLE: ZINGARELLI INTERATTIVO


MRDS

  • An important distinction:

    • Dictionaries that can be consulted electronically

    • MACHINE READABLE dictionaries

    • MACHINE TRACTABLE dictionaries

  • Particularly useful: dictionaries created for EFL:

    • LDOCE

    • COBUILD

  • A more ambitious project: ODE in XML


EXAMPLE: ODE on CD-ROM (in XML)

An example of a lexicographic database in XML (= extremely machine tractable)


ODE IN XML: OVERVIEW


ODE IN XML: ENTRY FORMAT

<se>
  <cn>815750</cn>
  <hg> <hw>stock</hw> </hg>
  <s1>
    <ps>noun</ps>
    <s2 num="1">
      <df>the goods or merchandise kept on the premises of a shop or warehouse and available for sale or distribution:</df>
      <ex>the store has a very low turnover of stock</ex>
    </s2>
    <s2 num="2">
      ……
    </s2>
  </s1>
  <s1>
    <ps>adjective</ps>
    …..
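
Because the entry is plain XML, a few lines of code are enough to pull out the headword, parts of speech, definitions and examples. The sketch below uses the element names shown on the slide (`se`, `hg`, `hw`, `s1`, `ps`, `s2`, `df`, `ex`); the fragment is a cut-down, well-formed version of the example with the elided parts left out, and `read_entry` is a name made up for this illustration, so it should not be read as a description of the full ODE schema.

```python
import xml.etree.ElementTree as ET

# Well-formed fragment modelled on the slide. The elided second sense and the
# adjective section are omitted, so this is NOT a complete ODE entry.
ENTRY = """
<se>
  <cn>815750</cn>
  <hg><hw>stock</hw></hg>
  <s1>
    <ps>noun</ps>
    <s2 num="1">
      <df>the goods or merchandise kept on the premises of a shop or warehouse and available for sale or distribution:</df>
      <ex>the store has a very low turnover of stock</ex>
    </s2>
  </s1>
</se>
"""


def read_entry(xml_text):
    """Return (headword, [(part of speech, sense number, definition, examples)])."""
    root = ET.fromstring(xml_text.strip())
    headword = root.findtext("hg/hw")
    senses = []
    for section in root.findall("s1"):      # one <s1> per part of speech
        pos = section.findtext("ps")
        for sense in section.findall("s2"):  # one <s2> per numbered sense
            examples = [ex.text for ex in sense.findall("ex")]
            senses.append((pos, sense.get("num"), sense.findtext("df"), examples))
    return headword, senses


headword, senses = read_entry(ENTRY)
print(headword)
for pos, num, definition, examples in senses:
    print(pos, num, definition, examples)
```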


ODE IN XML: NLP INFORMATION

<nlp>
  <sup>merchandise</sup>
  <ss>Commerce</ss>
  <morph id="01">
    <mu sy="NN">
      <inf>stock</inf>
      <ph>stQk</ph>
    </mu>
    <mu sy="NNS">
      <ph>stQks</ph>
    </mu>
  </morph>
</nlp>
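
The morphological units in the `<nlp>` block can be listed in the same way. This is a minimal sketch under assumptions: the XML is copied from the slide, `morph_units` is a hypothetical helper, and the reading of `sy` as a part-of-speech tag, `inf` as the inflected form and `ph` as a phonetic transcription is a guess from the example rather than something the slide states.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the slide (the second <mu> has no <inf> element there).
NLP = """
<nlp>
  <sup>merchandise</sup>
  <ss>Commerce</ss>
  <morph id="01">
    <mu sy="NN"><inf>stock</inf><ph>stQk</ph></mu>
    <mu sy="NNS"><ph>stQks</ph></mu>
  </morph>
</nlp>
"""


def morph_units(xml_text):
    """Return one (sy attribute, <inf> text or None, <ph> text) triple per <mu>."""
    root = ET.fromstring(xml_text.strip())
    return [(mu.get("sy"), mu.findtext("inf"), mu.findtext("ph"))
            for mu in root.iter("mu")]


for unit in morph_units(NLP):
    print(unit)
```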


ELDIT

  • (Elektronisches Lern(er)wörterbuch Deutsch-Italienisch / Dizionario elettronico per apprendenti italiano-tedesco: electronic learner's dictionary for German and Italian)

  • An example of a dictionary

    • Designed for learners

    • Born in electronic form

  • Lecture on ELDIT: 14/5


WordNet


SEMANTICS & LEXICON: A SUMMARY

  • WORD-FORMS: “eat”, “eats”, “ate”, “eaten”

  • LEXEMES: EAT-LEX-1

  • SENSES: eat0600, eat0700


THE ORGANIZATION OF THE LEXICON

  • WORD-FORMS: “stock”

  • LEXEMES: STOCK-LEX-1, STOCK-LEX-2, STOCK-LEX-3

  • SENSES: stock0100, stock0200, stock0600, stock0700, stock0900, stock1000


SYNONYMY

  • WORD-FORMS: “cheap”, “inexpensive”

  • LEXEMES: CHEAP-LEX-1, CHEAP-LEX-2, INEXP-LEX-3

  • SENSES: cheap0100, …., cheapXXXX, inexp0900, ……, inexpYYYY
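
The three levels used in these diagrams (word-forms, lexemes, senses) map naturally onto a small data structure. The sketch below is an illustration only: the eat entry follows the first diagram, but the way the six stock senses are divided among the three stock lexemes is an assumption, and `lexemes_of` and `senses_of` are names invented here.

```python
# Each lexeme groups the word-forms that realise it and the senses it carries.
# ASSUMPTION: the split of the stock senses across lexemes is made up.
LEXEMES = {
    "EAT-LEX-1": {"forms": {"eat", "eats", "ate", "eaten"},
                  "senses": ["eat0600", "eat0700"]},
    "STOCK-LEX-1": {"forms": {"stock"}, "senses": ["stock0100", "stock0200"]},
    "STOCK-LEX-2": {"forms": {"stock"}, "senses": ["stock0600", "stock0700"]},
    "STOCK-LEX-3": {"forms": {"stock"}, "senses": ["stock0900", "stock1000"]},
}


def lexemes_of(form):
    """Lexemes that a word-form can belong to (several if it is ambiguous)."""
    return sorted(name for name, lex in LEXEMES.items() if form in lex["forms"])


def senses_of(form):
    """All senses reachable from a word-form through its lexemes."""
    return [sense for name in lexemes_of(form) for sense in LEXEMES[name]["senses"]]


print(lexemes_of("ate"), senses_of("ate"))
print(lexemes_of("stock"), senses_of("stock"))
```

Several inflected forms lead to a single lexeme for eat, while the single form “stock” leads to several lexemes, each with its own senses.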


WORDNET

  • A lexical database created at Princeton

    • Freely available for research from the Princeton site

    • http://www.cogsci.princeton.edu/~wn/

  • Information about a variety of SEMANTIC RELATIONS

  • Three sub-databases (a division supported by psychological research as early as Fillenbaum and Jones, 1965)

    • NOUNs

    • VERBS

    • ADJECTIVES and ADVERBS

  • Each database organized around SYNSETS


SYNSETS

  • Senses (or `lexicalized concepts’) are represented in WordNet by the set of words that can be used in AT LEAST ONE CONTEXT to express that sense / lexicalized concept: the SYNSET

  • E.g.,

    {chump, fish, fool, gull, mark, patsy, fall guy, sucker, shlemiel, soft touch, mug} (gloss: person who is gullible and easy to take advantage of)
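
A synset can be modelled as a set of words paired with a gloss, and a word then has as many senses as there are synsets containing it. The following is a minimal sketch, not WordNet's own storage format: the first synset and its gloss come from the slide, while the second synset is a made-up entry added only to show a word belonging to more than one synset.

```python
# (set of words, gloss) pairs. The second entry is ILLUSTRATIVE, not from WordNet.
SYNSETS = [
    (frozenset({"chump", "fish", "fool", "gull", "mark", "patsy", "fall guy",
                "sucker", "shlemiel", "soft touch", "mug"}),
     "person who is gullible and easy to take advantage of"),
    (frozenset({"fish"}),
     "aquatic animal (illustrative gloss, not quoted from WordNet)"),
]


def synsets_of(word):
    """All synsets (senses) in which the word appears."""
    return [(words, gloss) for words, gloss in SYNSETS if word in words]


def synonyms(word):
    """Words that share at least one synset with the given word."""
    found = set()
    for words, _gloss in synsets_of(word):
        found |= words
    found.discard(word)
    return found


print(len(synsets_of("fish")))
print(sorted(synonyms("gull")))
```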


THE NOUN DATABASE

  • About 90,000 forms, 116,000 senses

  • Relations:


HYPERNYMY

2 senses of robin

Sense 1
robin, redbreast, robin redbreast, Old World robin, Erithacus rubecola -- (small Old World songbird with a reddish breast)
    => thrush -- (songbirds characteristically having brownish upper plumage with a spotted breast)
        => oscine, oscine bird -- (passerine bird having specialized vocal apparatus)
            => passerine, passeriform bird -- (perching birds mostly small and living near the ground with feet having 4 toes arranged to allow for gripping the perch; most are songbirds; hatchlings are helpless)
                => bird -- (warm-blooded egg-laying vertebrates characterized by feathers and forelimbs modified as wings)
                    => vertebrate, craniate -- (animals having a bony or cartilaginous skeleton with a segmented spinal column and a large brain enclosed in a skull or cranium)
                        => chordate -- (any animal of the phylum Chordata having a notochord or spinal column)
                            => animal, animate being, beast, brute, creature, fauna -- (a living organism characterized by voluntary movement)
                                => organism, being -- (a living thing that has (or can develop) the ability to act or function independently)
                                    => living thing, animate thing -- (a living (or once living) entity)
                                        => object, physical object --
                                            => entity, physical thing --
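
The hypernym listing above is a chain of IS-A links, so questions such as "is a robin a bird?" reduce to walking that chain upwards. The sketch below hard-codes the chain for sense 1 of robin, labelling each synset by its first word; it assumes a single hypernym per synset, which holds for this example but not for WordNet in general, and the function names are invented for the illustration.

```python
# Direct hypernym of each synset in the robin example (first word as label).
HYPERNYM = {
    "robin": "thrush",
    "thrush": "oscine",
    "oscine": "passerine",
    "passerine": "bird",
    "bird": "vertebrate",
    "vertebrate": "chordate",
    "chordate": "animal",
    "animal": "organism",
    "organism": "living thing",
    "living thing": "object",
    "object": "entity",
}


def hypernym_chain(name):
    """Follow the IS-A links upwards and return every ancestor in order."""
    chain = []
    while name in HYPERNYM:
        name = HYPERNYM[name]
        chain.append(name)
    return chain


def is_a(name, ancestor):
    """True if `ancestor` appears somewhere above `name` in the hierarchy."""
    return ancestor in hypernym_chain(name)


print(hypernym_chain("robin"))
print(is_a("robin", "bird"), is_a("bird", "robin"))
```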


MERONYMY

wn beak -holon

Holonyms of noun beak

1 of 3 senses of beak

Sense 2

beak, bill, neb, nib

PART OF: bird


VERBS

  • About 10,000 forms, 20,000 senses

  • Relations between verb meanings:


RELATIONS BETWEEN VERB MEANINGS

V1 ENTAILS V2 when “Someone V1” (logically) entails “Someone V2”; e.g., snore entails sleep

TROPONYMY holds when “To do V1” is “To do V2 in some manner”; e.g., limp is a troponym of walk


ADJECTIVES & ADVERBS

  • About 20,000 adjective forms, 30,000 senses

  • 4,000 adverbs, 5,600 senses

  • Relations:


HOW TO USE IT

  • Online: http://cogsci.princeton.edu/cgi-bin/webwn

  • Download it, then from the command line:

    • Get synonyms:

      • wn bank -synsn

    • Get hypernyms:

      • wn robin -hypen

    • Get antonyms (also for adjectives and verbs):

      • wn right -antsa
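
The search options on this slide appear to follow a pattern: a relation prefix (`syns`, `hype`, `ants`, and `holo` on the meronymy slide) followed by a part-of-speech letter (`n`, `v`, `a`). The sketch below builds such a command line and, if a `wn` executable happens to be installed, runs it. The helper names are invented, the search word is placed before the option as in `wn beak -holon`, and the option scheme should be checked against the documentation of the installed WordNet version.

```python
import shutil
import subprocess

# Relation prefixes seen on the slides; a part-of-speech letter is appended.
RELATIONS = {"synonyms": "syns", "hypernyms": "hype",
             "antonyms": "ants", "holonyms": "holo"}


def wn_command(word, relation, pos="n"):
    """Build the argument list for one wn query, e.g. ['wn', 'bank', '-synsn']."""
    return ["wn", word, "-" + RELATIONS[relation] + pos]


def run_wn(word, relation, pos="n"):
    """Run the query if wn is on the PATH; return its output, or None otherwise."""
    if shutil.which("wn") is None:
        return None
    # No check=True: a non-zero exit status from wn is not treated as an error here.
    result = subprocess.run(wn_command(word, relation, pos),
                            capture_output=True, text=True)
    return result.stdout


print(wn_command("bank", "synonyms"))
print(wn_command("right", "antonyms", pos="a"))
```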


THE LIMITS OF WORDNET

  • Coverage

    • words not in WordNet

      • Crocidolite, spinoff (spin-off)

    • Missing information: MERONYMY

  • Context-dependent senses:

    • slump, crash, bust all synonyms in the WSJ corpus

  • The structure of WordNet

    • Some information is encoded in complex ways (room, wall, floor)

  • But: MOVING TARGET!!


MERONYMY IN WORDNET: AN EXPERIMENT

  • 100 bridging descriptions (BDs) in a mereological relation

  • Ran a script trying to find a direct link in WordNet (1.7) between one of the senses of the BD and one of the senses of any of the previous NPs

  • Results: WordNet contains a direct lexical relation between a BD and one of the CFs in only 6 cases


[Diagram: a fragment of WordNet connecting ARTIFACT, HOUSING, BUILDING, HOUSE, HOME, ROOM, WALL and FLOOR through IS-A and PART-OF links]

John looked at the HOUSE. The WALL was crumbling.
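
The difficulty in this example is that WALL and HOUSE are related only through a chain of links, not by a single PART-OF link. The sketch below shows the difference between a direct-link test and a search for an indirect path. The exact wiring of the nodes is an assumption reconstructed from the labels of the diagram, not a copy of WordNet 1.7, and both function names are invented.

```python
from collections import deque

# Links as (child, relation, parent). ASSUMPTION: this wiring is a guess
# based on the node and edge labels in the diagram.
LINKS = [
    ("HOUSING", "IS-A", "ARTIFACT"),
    ("BUILDING", "IS-A", "ARTIFACT"),
    ("HOME", "IS-A", "HOUSING"),
    ("HOUSE", "IS-A", "BUILDING"),
    ("ROOM", "PART-OF", "BUILDING"),
    ("WALL", "PART-OF", "ROOM"),
    ("FLOOR", "PART-OF", "ROOM"),
]


def direct_part_of(part, whole):
    """True only if a single PART-OF link connects the two nodes."""
    return (part, "PART-OF", whole) in LINKS


def shortest_path(start, goal):
    """Breadth-first search over the links, followed in either direction."""
    neighbours = {}
    for child, _relation, parent in LINKS:
        neighbours.setdefault(child, []).append(parent)
        neighbours.setdefault(parent, []).append(child)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbours.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(direct_part_of("WALL", "HOUSE"))
print(shortest_path("WALL", "HOUSE"))
```

Under these assumed links there is no direct relation between WALL and HOUSE, and the connection is found only by passing through ROOM and BUILDING.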


SOLUTION: LEXICAL ACQUISITION

  • Partial (add information to WordNet, especially for specialized domains)

  • Total (build a new lexicon from scratch)


READINGS

  • Jackson, ch. 6.7

  • Marello, ch. 5.5

  • C. Fellbaum. WordNet: An Electronic Lexical Database. MIT Press, 1998

    • ch. 1

