Correlates between Performance, Prosodic and Phrase Structures in Bangla and Hindi: Insights from a Psycholinguistic Experiment
Kalika Bali (1), Monojit Choudhury (1), Diptesh Chatterjee (2), Sankalan Prasad (2), Arpit Maheswari (3)
(1) Microsoft Research Lab India, (2) IIT Kharagpur, (3) IIT Bombay


Contact: monojitc@microsoft.com

Syntactic Processing Pipeline

शिमला से मनाली ना जाकर सीधे दिल्ली से विमान ले लो |

POS/

Morphological Analysis

शिमला\NP से\PP मनाली\NP ना\RP जाकर\PL सीधे\RB दिल्ली\NP से\PP विमान\NN ले\VM लो\VA|

Chunking

[शिमला\NP से\PP] मनाली\NP[ना\RP जाकर\PL]सीधे\RB[दिल्ली\NP से\PP] विमान\NN[ले\VM लो\VA]

Parsing

शिमला\NP से\PP मनाली\NP ना\RP जाकर\PL सीधे\RB दिल्ली\NP से\PP विमान\NN ले\VM लो\VA|
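The `word\TAG` and bracketed-chunk notation used on this slide can be read mechanically. The following is a minimal Python sketch, assuming that every token has the form `word\TAG` and that square brackets enclose multi-word chunks; the function name `parse_chunks` is hypothetical and not part of the original work.

```python
import re

# One token is "[", "]", or word\TAG (the tag is assumed to be upper-case letters).
TOKEN = re.compile(r"\[|\]|([^\s\[\]\\|]+)\\([A-Z]+)")

def parse_chunks(text):
    """Return a list of chunks; each chunk is a list of (word, tag) pairs.

    Words outside any bracket become single-word chunks.
    """
    chunks = []
    current = None  # the bracketed chunk being filled, if any
    for match in TOKEN.finditer(text):
        token = match.group(0)
        if token == "[":
            current = []
        elif token == "]":
            if current is not None:
                chunks.append(current)
            current = None
        else:
            pair = (match.group(1), match.group(2))
            if current is None:
                chunks.append([pair])
            else:
                current.append(pair)
    return chunks

# The chunked sentence from the slide (raw string, so backslashes are literal).
chunked = r"[शिमला\NP से\PP] मनाली\NP[ना\RP जाकर\PL]सीधे\RB[दिल्ली\NP से\PP] विमान\NN[ले\VM लो\VA]"
chunk_sizes = [len(chunk) for chunk in parse_chunks(chunked)]
```

For the slide's sentence this yields seven chunks, alternating between two-word and one-word groups.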

Chunking in Speech Technology
  • Chunks correspond to prosodic boundaries

Therefore, useful for speech synthesis

शिमला से मनाली ना जाकर सीधे दिल्ली से विमान ले लो|

शिमला से - मनाली - ना - जाकर - सीधे - दिल्ली से - विमान - ले लो|

शिमला - से मनाली - ना जाकर - सीधे दिल्ली - से विमान ले - लो|

शिमला से मनाली - ना जाकर - सीधे दिल्ली से - विमान ले लो|

What is a Chunk?
  • Theoretical Perspective:
    • Chunk ∈ {Phrase, Clause, …}
    • Chunk ∈ {“Modifier + Modified”, “Main verb + Aux”, …}
  • Cognitive Perspective:
    • Realized in speech through Prosodic boundaries
    • Perceived by the speaker as “more connected”
  • Computational Perspective:
    • Easy to identify (local context)
    • Helps in parsing/speech processing
Abney’s CHUNKS
  • “the non-recursive core of an intra-clausal constituent, extending from the beginning of the constituent to its head, but not including post-head dependents.” – Abney, 1995
    • Maximal: a chunk that is not contained inside another chunk
  • Philosophy: linguistic theories should explain human intuition and performance
  • Based on the “performance structure” of the native speakers – Gee and Grosjean, 1983; Abney, 1991
Objective of the present study
  • Empirical investigation of the nature of chunks in Indian languages from a cognitive perspective
    • Evidence from Prosody
    • Native speaker intuition
  • Compare with
    • Phrase structure
    • Other suggestions of chunks
  • Hindi and Bangla
  • Motivated by (Gee and Grosjean, 1983; Abney, 1991)
Chunks in Indian Languages
  • Relatively free word order
    • मैं ये काम खत्म कर लूँगा, नहीं तो बाद में समय न मिलेगा
    • ये काम मैं खत्म कर लूँगा, नहीं तो बाद में समय न मिलेगा
    • मैं ये काम खत्म कर लूँगा, बाद में नहीं तो समय न मिलेगा
  • Consequences
    • No concept of verb phrase (Bharati et al., 1995)
    • Clausal connectors need not indicate clause boundary
  • Chunk = Local Word Groups (Bharati et al., 1995)
    • मैं [ये काम] खत्म [कर लूँगा], [नहीं तो][बाद में] समय न मिलेगा
Chunks in Indian Languages (contd.)
  • LWG in agglutinative languages?
    • আমি [এই কাজটা] শেষ [করে ফেলবো], কারণ পরে সময় পাবনা৷

Alternative Suggestions

  • Maximal recognizable phrases (Ray et al., 2003)
    • मैं [ये काम][खत्म कर लूँगा], [नहीं तो][बाद में] समय [न मिलेगा]
  • Nested chunking based on non-intrusive fragments (Das et al., 2005)
    • [मैं [ये काम][खत्म [कर लूँगा]]], [[नहीं तो][बाद में] समय [न मिलेगा]]
Experimental Methodology
  • Subjects: 6 native speakers for each language
  • Each subject was given 10 sentences as text
  • Task 1: read the sentences aloud in a natural way
  • Task 2: divide every sentence into two parts, then recursively divide each part into two, such that the words within each part are more closely related to each other than to words outside it
    • ((खबर)(सुनते ही))((मैं)(तुंरत)((घर से )(भागा)))
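The recursive division described above produces a nested bracketing, from which a break strength can be read off for every word boundary. The following is a minimal Python sketch, assuming parentheses as the only grouping symbols and whitespace as the word separator; `boundary_depths` is a hypothetical helper, not the authors' tooling. For each pair of adjacent words it records the shallowest nesting depth reached between them, so a smaller value means the two words were separated earlier in the division.

```python
def boundary_depths(bracketed):
    """Return (words, depths) for a parenthesised sentence.

    depths[i] is the minimum nesting depth seen between words[i] and
    words[i + 1]; 0 means the boundary is a top-level split.
    """
    words, depths = [], []
    token = []          # characters of the word currently being read
    depth = 0           # current nesting depth
    min_depth = None    # shallowest depth seen since the previous word ended

    def flush():
        nonlocal min_depth
        if token:
            if words:
                depths.append(min_depth)
            words.append("".join(token))
            token.clear()
            min_depth = depth

    for ch in bracketed:
        if ch in "()" or ch.isspace():
            flush()
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if min_depth is not None:
                    min_depth = min(min_depth, depth)
        else:
            token.append(ch)
    flush()
    return words, depths

# The example bracketing from the slide.
words, depths = boundary_depths("((खबर)(सुनते ही))((मैं)(तुंरत)((घर से )(भागा)))")
```

In this example the only top-level split (depth 0) falls between ही and मैं, while घर and से, which the subject never separated, get the largest value.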
Sentence Selection
  • Near translations to facilitate comparison across languages
  • Coverage of various syntactic phenomena
    • Embedded/relative clauses
    • Sentence and phrase-level adverbs
    • Conjuncts
    • Noun Groups:
      • Compound Nouns, Named Entities and MWE
      • Qualifier + Adjectives* + Determiner + Noun
      • Noun + [complex] Postpositions
    • Verb Groups:
      • Polar + Vector + Auxiliaries + Particles
      • Noun + Verb
Prosodic Structure
  • Identify major (> 7 ms) and minor (> 2.5 ms) breaks
  • Count the number of subjects having breaks

शिमला से मनाली ना जाकर सीधे दिल्ली से विमान ले लो|

- - - 3| - - 6| - - - 6| - - -

সিমলা হয়ে মানালী না গিয়ে সোজা দিল্লী থেকেই ফ্লাইট নিয়ে যান৷

- 2*| - 6| - - - 6| - - - 6| - - -
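The two steps on this slide (classify each pause, then count how many subjects produced a break at each boundary) can be sketched as follows. This is an illustrative Python sketch only: the thresholds are taken as stated on the slide, while the function names and the pause durations are hypothetical and do not come from the experiment.

```python
MAJOR_MS = 7.0   # threshold for a major break, as stated on the slide
MINOR_MS = 2.5   # threshold for a minor break, as stated on the slide

def classify_pause(duration_ms):
    """Label a pause as 'major', 'minor', or None (no break)."""
    if duration_ms > MAJOR_MS:
        return "major"
    if duration_ms > MINOR_MS:
        return "minor"
    return None

def count_breaks(pauses_by_subject):
    """Count, per word boundary, how many subjects produced any break.

    pauses_by_subject holds one list of pause durations per subject,
    all of the same length (one entry per word boundary).
    """
    n_boundaries = len(pauses_by_subject[0])
    counts = [0] * n_boundaries
    for pauses in pauses_by_subject:
        for i, duration_ms in enumerate(pauses):
            if classify_pause(duration_ms) is not None:
                counts[i] += 1
    return counts

# Hypothetical durations for three subjects and three word boundaries.
example_pauses = [
    [1.0, 8.0, 3.0],
    [0.5, 9.5, 2.0],
    [3.0, 7.5, 0.0],
]
example_counts = count_breaks(example_pauses)
```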

Performance Structure

शिमला से मनाली ना जाकर सीधे दिल्ली से विमान ले लो|

(शिमला से मनाली ना जाकर)(सीधे दिल्ली से विमान ले लो|)

((शिमला से मनाली)(ना जाकर))((सीधे दिल्ली से)(विमान ले लो|))

(((शिमला से) मनाली)(ना जाकर))((सीधे (दिल्ली से))(विमान (ले लो|)))

Level at which each boundary is split (0 = first split):

शिमला 3 से 2 मनाली 1 ना 3 जाकर 0 सीधे 2 दिल्ली 3 से 1 विमान 2 ले लो

Corresponding boundary strength (higher = stronger break):

शिमला 0 से 1 मनाली 2 ना 0 जाकर 3 सीधे 1 दिल्ली 0 से 2 विमान 1 ले लो

- - - *| - - | - - - *| - - -

[Figure: boundary strengths for the sentence above, with values 3, 2, 2, 1, 1, 1, 0, 0, 0, 0]
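The two rows of numbers on this slide appear to be related by a simple inversion: a boundary split at level 0 (the first division) gets the highest strength, and a boundary split last gets strength 0. The following Python sketch works under that assumption, which is inferred from the slide's numbers rather than stated by the authors; `levels_to_strengths` is a hypothetical name.

```python
def levels_to_strengths(split_levels):
    """Convert split levels to boundary strengths (assumed: max level minus level)."""
    deepest = max(split_levels)
    return [deepest - level for level in split_levels]

# Split levels for the nine numbered boundaries of the example sentence.
split_levels = [3, 2, 1, 3, 0, 2, 3, 1, 2]
strengths = levels_to_strengths(split_levels)
```

Under this assumption, the strongest boundary (strength 3) falls between जाकर and सीधे, which is the clause boundary.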

Observations
  • Do the subjects agree on the boundaries?
    • Yes, always for clause boundaries, and often for phrase boundaries.
    • Considerable disagreement within phrases
  • Are the prosodic and performance structures similar?
    • Both show major breaks at clause boundaries
    • A major break in one structure corresponds to at least a minor break in the other
  • Are the structures in Hindi and Bangla similar?
    • Except for a few cases, they are indeed very similar
दृढ किन्तु बहुत मृदु स्वरों में

- 6| - 3| - 3| - 3| - -

शिमला से मनाली ना जाकर सीधे दिल्ली से विमान ले लो |

- - - 3| - - 6| - - - 6| - - -

- - *| - *| - - | - *| - - | - *| - -

সিমলা হয়ে মানালী না গিয়ে সোজা দিল্লী থেকেই ফ্লাইট নিয়ে যান ৷

- 2*| - 6| - - - 6| - - - 6| - - -

- - *| - *| - - | - *| - - *| - - -
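The observation that a major break in one structure is matched by at least a minor break in the other can be stated as a simple check over aligned word boundaries. The following Python sketch assumes each structure is encoded as one label per boundary ('major', 'minor', or None); this encoding, the function name, and the example data are hypothetical and are not derived from the slide's break notation.

```python
RANK = {None: 0, "minor": 1, "major": 2}

def structures_agree(prosodic, performance):
    """True if every major break in one structure has at least a minor break in the other."""
    if len(prosodic) != len(performance):
        raise ValueError("both structures must cover the same word boundaries")
    for a, b in zip(prosodic, performance):
        if RANK[a] == 2 and RANK[b] == 0:
            return False
        if RANK[b] == 2 and RANK[a] == 0:
            return False
    return True

# Hypothetical labels for four word boundaries.
prosodic_labels = [None, "major", "minor", None]
performance_labels = ["minor", "minor", "major", None]
agreement = structures_agree(prosodic_labels, performance_labels)
```

A minor break facing no break does not violate the condition; only a major break facing no break does.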

Chunks are often larger than LWG

[गमले के][टुकडों को] फैक मत देना |

- - 4| - - 6| - - -

- - *| - - | - | - -

खबर सुनते ही मैं तुंरत [घर से] भागा |

- - - 6| - - 6| - - -

- *| - - | - - *| - - -

खबर सुनते ही मैं [घर से] तुंरत भागा |

- - - 6| - - - 2| - -

- | - - | - *| - - *| - -

Extra-Syntactic Factors Governing Chunk Boundaries
  • Chunk Length

টবের 2*| ভাঙ্গা টুকরো গুলো 5| ফেলে দিয়ো না

এক 2*| বিশাল 2| পথ অবরোধের 3| আয়োজন করেছিলো

  • Familiarity with the text

তৃণমূল কংগ্রেসের 2| সদস্যরা

तृणमूल 3| कांग्रेस के सदस्यों 1*| ने

  • Focus of the utterance
  • Phonology
Conclusion
  • Cognitive reality of chunks
    • Agreement across speakers, structures, languages
  • Chunks are NOT
    • Local word groups
    • Phrases
  • Chunks are
    • Completely context dependent
    • Governed by syntactic plus extra-syntactic factors
  • A theory of chunks is necessary at least for speech applications

Thank You for your kind attention
