The Web in Theoretical Linguistics Research: Two Case Studies Using the Linguist’s Search Engine

The Web in Theoretical Linguistics Research: Two Case Studies Using the Linguist’s Search Engine . Philip Resnik, Aaron Elkiss, Heather Taylor, and Ellen Lau University of Maryland. Berkeley Linguistics Society. February 20, 2005. * Theje dberk eobbfid dbeonc kdoeb.

Presentation Transcript

The Web in Theoretical Linguistics Research: Two Case Studies Using the Linguist’s Search Engine

Philip Resnik, Aaron Elkiss,

Heather Taylor, and Ellen Lau

University of Maryland

Berkeley Linguistics Society

February 20, 2005

slide2

*Theje dberk eobbfid dbeonc kdoeb

Did that sound ok to you?

“a small, imperfect experiment…”

slide3

Nature of Grammar
  • Data-oriented
  • Probabilistic
  • Ordered constraints
  • Hard / Categorical

Nature of Elicitation
  • Conventional / Binary
  • {__,?,??,?*,*,**}
  • Contrasts
  • Magnitude estimation

Cited work:
  • Schütze (1996)
  • Cowart (1997)
  • Bard, Robertson, and Sorace (1996)
  • Crocker and Keller (2005)
  • Sorace and Keller (2005)

slide4

Nature of Grammar
  • Data-oriented
  • Probabilistic
  • Ordered constraints
  • Hard / Categorical

Source of Language Sample
  • Linguist
  • Naturally occurring

Language Technology
  • Corpora
  • Part-of-speech taggers
  • Treebanks
  • Statistical parsers
  • Semantic role labeling
  • …etc.

?

Nature of Elicitation

slide5

If you build it, they will come…

Manning (2003): “…it remains fair to say that these tools have not yet made the transition to the Ordinary Working Linguist without considerable computer skills.”

% export TGREP_CORPUS=wsj_mrg.crp

% tgrep -n __ | grep . | gzip > wsj_mrg.txt.gz

% tgrep2 -C -p wsj_mrg.txt wsj_mrg.t2c.g

NP !<< PP [> NP | >> VP]
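
For readers who do not use tgrep2: the pattern above asks for an NP that does not dominate a PP and that is either the child of an NP or somewhere inside a VP. The sketch below spells that out in Python over a toy tree. The tuple-based tree encoding, the helper names, and the example sentence are all assumptions made for illustration; this is not tgrep2 code or its API.

```python
# Toy constituency tree: (label, child, child, ...); leaves are plain strings.
# The encoding and the example sentence are illustrative assumptions.

def subtrees(tree, ancestors=()):
    """Yield (node, ancestors) for every labelled node; ancestors are root-first."""
    if isinstance(tree, str):
        return
    yield tree, ancestors
    for child in tree[1:]:
        yield from subtrees(child, ancestors + (tree,))

def dominates(tree, label):
    """True if some proper descendant of `tree` carries `label` (tgrep2 '<<')."""
    return any(node[0] == label
               for child in tree[1:]
               for node, _ in subtrees(child))

def matches(node, ancestors):
    """Hand translation of: NP !<< PP [> NP | >> VP]"""
    if node[0] != "NP" or dominates(node, "PP"):        # NP !<< PP
        return False
    parent_is_np = bool(ancestors) and ancestors[-1][0] == "NP"   # > NP
    inside_vp = any(a[0] == "VP" for a in ancestors)              # >> VP
    return parent_is_np or inside_vp

def words(tree):
    """Flatten a tree back to its words."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]

TREE = ("S",
        ("NP", ("NP", "the", "owner"), ("PP", "of", ("NP", "the", "dog"))),
        ("VP", "saw", ("NP", "a", "cat")))

hits = [" ".join(words(n)) for n, anc in subtrees(TREE) if matches(n, anc)]
print(hits)
```

Writing even this small matcher by hand shows why query-by-example is attractive for linguists who would rather not learn a pattern language.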

Roadmap
  • Motivations
  • The Linguist’s Search Engine
  • Case Study 1: Psycholinguistics
  • Case Study 2: Syntax
  • Conclusions
A Brief Illustration of the LSE
  • Pollard and Sag (1994); discussion in Manning (2003)
    • (a) We consider Kim to be an acceptable candidate
    • (b) We consider Kim an acceptable candidate
    • (c) We consider Kim quite acceptable
    • (d) We consider Kim among the most acceptable candidates
    • (e) *We consider Kim as an acceptable candidate
    • (f) *We consider Kim as quite acceptable
    • (g) *We consider Kim as among the most acceptable candidates
    • (h) *We consider Kim as being among the most acceptable candidates
slide8

Type an example of the structure you’re interested in.

LSE generates an automatic analysis

(You don’t have to agree with the analysis!)

Query By Example

slide11

A few mouseclicks later, you have a description of the structure you’re looking for.

The LSE creates the query for you.

slide13

Hit ‘search’ and the LSE retrieves sentences whose analysis matches the structure you specified.

Two Case Studies
  • Focus in this talk:
    • What was the study about?
    • How was the LSE useful?

In both cases, my co-authors were naïve users of the Linguist’s Search Engine. I didn’t discover the LSE had been useful to them until after the fact.

Case Study I: Psycholinguistics
  • Nina Kazanina, Ellen Lau, Moti Lieberman, Colin Phillips and Masaya Yoshida, “Active Dependency Formation in the Processing of Backwards Anaphora”. 17th Annual CUNY Sentence Processing Conference, University of Maryland, College Park. March 2004.

http://www.ling.umd.edu/ninaka/Papers/CUNY_2004_slides.pdf

Active Dependency Formation

While he was watching TV, John heard the phone ring.

  • Early pronoun signals upcoming dependency formation
  • Active processing of dependency observed?
  • Dependency formation constrained by grammar?
Active Dependency Formation

The teacher asked what the team was laughing about __.

  • Wh-word signals upcoming dependency formation
  • Active processing of dependency observed

→ filled gap effect

  • Dependency formation constrained by grammar

→ island constraints

slide19

Original data for testing prediction

Gender mismatch effect

While she was cooking dinner, John listened to the radio.

She was cooking dinner while John listened to the radio.

Principle C rules out coreference in c-commanded position, so no mismatch effect should be observed

Active Dependency Formation

Results looked good, but there was a confound!

She was cooking dinner while John listened to the radio.


Needed a construction where the target position is expected;

otherwise processor might simply have stopped looking for target.

slide20

Options:

  • Rely on experimenter intuition
  • Do a pilot study
  • Sift through a corpus

Active Dependency Formation

Possible solution: expletive constructions

It was clear to his mother that John should go. (No Principle C)

It was clear to him that John should go. (Principle C)

Question: does this construction really have the right properties?

  • Is the second clause consistently expected?
  • Is it consistently expletive rather than referential?
slide21

Query by example:

It was clear to him

Becomes

It AUX [clear to NP]

slide23

Active Dependency Formation

Result:

  • Verified that virtually all results of the search did involve expletive it with a following clause.
  • Obtained reassurance in designing the follow-up study
  • Later double-checked using an off-line completion study

The LSE made it easy to start with linguists’ intuitions and find relevant evidence in naturally occurring text.

The LSE also makes it easy to look for additional relevant data that may not have occurred to the experimenter.

slide24

Query by example:

It AUX Adj PP that…

Any adjective

PP with any preposition
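
As a rough illustration of what the generalized query is looking for, the sketch below checks part-of-speech-tagged sentences for the frame "It AUX Adj PP that…". It is a flat, tag-sequence approximation under stated assumptions: the LSE matches against parse structure, whereas this toy uses Penn Treebank-style tags, a hand-picked set of forms of "be", and a hypothetical function name.

```python
# Flat approximation of the query "It AUX Adj PP that ...".
# Tag names follow Penn Treebank conventions; BE_FORMS and the function
# name are assumptions for this sketch, not part of the LSE.

BE_FORMS = {"is", "was", "be", "been", "are", "were"}
PREP_TAGS = {"IN", "TO"}

def match_expletive_frame(tagged):
    """Return (adjective, preposition) if the sentence starts with
    'It AUX Adj P ... that', else None. `tagged` is a list of (word, tag)."""
    if len(tagged) < 6:
        return None
    (w0, _), (w1, _), (w2, t2), (w3, t3) = tagged[:4]
    if w0.lower() != "it" or w1.lower() not in BE_FORMS:
        return None
    # Any adjective, then any preposition (but not the complementizer 'that').
    if t2 != "JJ" or t3 not in PREP_TAGS or w3.lower() == "that":
        return None
    # Require 'that' after at least one word of prepositional object.
    rest = [w.lower() for w, _ in tagged[5:]]
    return (w2.lower(), w3.lower()) if "that" in rest else None

s1 = [("It", "PRP"), ("was", "VBD"), ("clear", "JJ"), ("to", "TO"),
      ("him", "PRP"), ("that", "IN"), ("John", "NNP"), ("should", "MD"),
      ("go", "VB")]
s2 = [("It", "PRP"), ("is", "VBZ"), ("important", "JJ"), ("for", "IN"),
      ("the", "DT"), ("team", "NN"), ("that", "IN"), ("everyone", "NN"),
      ("agrees", "VBZ")]
s3 = [("It", "PRP"), ("was", "VBD"), ("clear", "JJ"), ("that", "IN"),
      ("John", "NNP"), ("left", "VBD")]

for sent in (s1, s2, s3):
    print(match_expletive_frame(sent))
```

A structural query avoids the brittleness visible here: the flat version cannot tell where the PP ends and has to guess from the position of "that".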

slide25

clear

important

vital

manifest

interesting

necessary

obvious

Case Study II: Syntax
  • Heather Taylor, “Interclausal (co)dependency: the case of the comparative correlative”, Proc. Michigan Linguistics Society, October 2004.

http://www.ling.umd.edu/events/syntax/abstracts/heather1.PDF

Comparative Correlatives*

The Xer …, the Yer …

  • Highlighted in recent debates about the UG approach
  • Central question: are these constructions amenable to an analysis based on UG principles, or do they present a challenge to the UG view?

Central claim here: the LSE is useful regardless of which side of the debate you’re on.

*A.k.a. Conditional correlatives, correlative conditionals, “more-more” constructions

slide28

Comparative Correlatives

Culicover and Jackendoff (1999): sui generis construction; interclausal relationships accounted for outside the syntax

Taylor (2004): UG analysis relating CCs to conditionals

[Tree diagrams: nodes labeled IP/CP and CP; clauses of the form [the more XP]i (that) [IP … ti …] and [the more XP]j (that) [IP … tj …]]

Comparative Correlatives
  • McCawley’s generalization (1988, 1998):

Deletion of copular main verbs in CCs is sensitive to semantic properties of the subject (generic/specific)

  • The better an advisor [is], the more successful a student [is]
  • The more obnoxious Fred [is], the less attention you should pay

  • But analysis of LSE data exposes the role of:
    • Phonological weight of the subject
    • Parallelism (copula in both clauses, deletion in both clauses)

casting doubt on the generalization’s validity

Comparative Correlatives

*The more obnoxious Fred,

the less attention you should pay to him.

?The more obnoxious Fred’s younger brother,

the less attention you should pay to him.

?The longer the day’s activities are, the sleepier the campers.

?The longer the day’s activities, the sleepier the campers are.

√The longer the day’s activities, the sleepier the campers.

Informant judgments confirm the tendencies indicated by naturally occurring data.

Comparative Correlatives
  • Overt then?
    • The hungrier Romeo gets, then the more pizza he eats.
    • Cf. If Romeo gets hungrier, then he eats more pizza.
Comparative Correlatives
  • Overt then
    • The hungrier Romeo gets, then the more pizza he eats.
    • Cf. If Romeo gets hungrier, then he eats more pizza.
  • LSE searches suggest that overt then is not anomalous.
  • Might this support a UG account that provides a unified treatment of CCs and conditionals?

One more fact to add to the theoretical debate!
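
To make the kind of search concrete, here is a crude plain-text sketch that flags comparative correlative candidates and records whether an overt then introduces the second clause. It is a regular-expression approximation written for illustration only; the pattern and function name are assumptions, it is not how the LSE searches (the LSE matches syntactic structure), and the `\w+er` alternative will also catch non-comparatives such as "other".

```python
import re

# Rough surface pattern for "The Xer ..., (then) the Yer ...".
# Deliberately simple: any word ending in -er counts as a comparative.
CC_PATTERN = re.compile(
    r"\bthe\s+(?P<x>more|less|\w+er)\b[^,.;]*,\s*"
    r"(?P<then>then\s+)?the\s+(?P<y>more|less|\w+er)\b",
    re.IGNORECASE,
)

def find_cc(sentence):
    """Return the two comparative heads and an overt-'then' flag, or None."""
    m = CC_PATTERN.search(sentence)
    if m is None:
        return None
    return {
        "x": m.group("x").lower(),
        "y": m.group("y").lower(),
        "overt_then": m.group("then") is not None,
    }

examples = [
    "The hungrier Romeo gets, then the more pizza he eats.",
    "The longer the day's activities, the sleepier the campers.",
    "If Romeo gets hungrier, then he eats more pizza.",
]
results = [find_cc(s) for s in examples]
for sentence, result in zip(examples, results):
    print(sentence, "->", result)
```

Counting the `overt_then` hits in a large sample is the sort of tally that could bear on whether overt then is anomalous, though a surface pattern like this one would need manual checking of every hit.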


Traditional?!

Conclusions
  • The LSE is useful to traditional linguists
    • Confirming/disconfirming intuitions

(theory → data)

    • Exposing a wider range of data

(data → theory)

  • The LSE complements new methodological trends
    • Magnitude estimation, etc.
  • The LSE is available for anyone to use
    • http://lse.umiacs.umd.edu
Conclusions
  • Chomsky (1979): “You can also collect butterflies and make many observations. If you like butterflies, that’s fine; but such work must not be confounded with research, which is concerned to discover explanatory principles of some depth and fails if it does not do so.”
  • Einstein (1940): “Science is the attempt to make the chaotic diversity of our sense-experience correspond to a logically uniform system of thought [in which] experience must be correlated with the theoretical structure… What we call physics comprises that group of natural sciences which base their concepts on measurements…”
A Web Search Tool for the Ordinary Working Linguist
  • Must have linguist-friendly “look and feel”
  • Must minimize learning/ramp-up time
  • Must permit real-time interaction
  • Must permit large-scale searches
  • Must allow search on linguistic criteria
  • Must be reliable
  • Must evolve with real use
LSE Example: Text in Parallel Translation

Example: seeing how English “completive particle” usages (eat up versus simply eat, indicating a telic event) are rendered in different languages.

LSE Example: Implicit Objects
  • Resnik (1993, 1996):
    • Information-theoretic model of selectional constraints
    • Model makes predictions with respect to implicit objects
  • Implicit objects
    • John ate Ø (= John ate something edible)
    • *John found Ø (can’t mean John found something findable).
  • Question from audience:
    • “Doesn’t your model then predict that the verb titrate should permit implicit objects?”
    • Options
      • Find informants for whom titrate is in the working vocabulary
      • Slog through corpora looking for titrate used “intransitively”
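
The information-theoretic model mentioned above measures how strongly a verb constrains its object as the relative entropy between the distribution of object classes given the verb and the prior distribution of object classes. The sketch below computes that quantity. The class inventory and all counts are invented toy values chosen only to show the contrast between a strongly selecting verb and a weakly selecting one; they are not data from Resnik (1993, 1996), and the function name is an assumption.

```python
import math

def preference_strength(class_counts, prior):
    """Relative entropy D(P(c|v) || P(c)) in bits, for one verb.

    class_counts: object-class counts observed with the verb.
    prior: prior probability of each object class across all verbs.
    """
    total = sum(class_counts.values())
    strength = 0.0
    for cls, count in class_counts.items():
        if count == 0:
            continue  # zero-probability classes contribute nothing
        p_c_given_v = count / total
        strength += p_c_given_v * math.log2(p_c_given_v / prior[cls])
    return strength

# Invented toy values: a uniform prior over four made-up object classes.
PRIOR = {"food": 0.25, "artifact": 0.25, "person": 0.25, "abstraction": 0.25}
COUNTS = {
    "eat":  {"food": 97, "artifact": 1, "person": 1, "abstraction": 1},
    "find": {"food": 25, "artifact": 25, "person": 25, "abstraction": 25},
}

strengths = {verb: preference_strength(c, PRIOR) for verb, c in COUNTS.items()}
for verb, s in strengths.items():
    print(f"{verb}: {s:.3f} bits")
```

On these toy numbers eat selects strongly and find does not select at all, which mirrors the contrast between John ate Ø and *John found Ø. The audience question about titrate amounts to asking what happens for a verb whose objects are highly predictable but which few speakers use.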
slide52

Gender mismatch effect (van Gompel and Liversedge, 2003)

  • When she wasn’t busy, the girl visited the boy very often.
  • When she wasn’t busy, the boy visited the girl very often.

[Diagram: link from “she” to “the boy” marked *]

Active Dependency Formation

Gender mismatch effect

reveals active processing

Can grammatical information constrain the process?

  • Principle C: pronoun can’t co-refer with antecedent that it c-commands.

  • Prediction: no gender mismatch effect with c-commanded positions
More on Comparative Correlatives (see Taylor, 2004)
  • The two clauses behave like a subordinate and matrix clause, respectively
    • Tag questions form on clause2 and not clause1
    • Only clause2 can host subjunctive mood
    • In German, the word order is consistent with clause1 being subordinate to matrix clause2
    • In Dutch there is flexibility in the word order of clause2 characteristic of matrix clauses
  • NPI licensed in clause1 but not in clause2
  • Extraction is equally permissible from both
slide55
Conditionals
    • Presence of then
    • Tag questions form on clause2 and not clause1
    • NPI licensed in clause1 but not in clause2
    • Extraction from both clauses
    • Variable binding facts “shadow” each other
    • Lack of Condition C binding between clauses
    • Codependence
      • Each clause depends on the presence of the other
      • The licit values of X in the “comparative strings” are determined by each other
      • Parallelism in copula deletion