Þórunn Blöndal


Þórunn Blöndal. ÍSTAL: The Icelandic Corpus of Spoken Language. Nordtalk – NorFa: Using spoken language corpora, Göteborg, Aug 19-24, 2002. Research on Spoken Icelandic: regional differences in pronunciation, language acquisition, and the development of narrative skills.

Presentation Transcript
Þórunn Blöndal

ÍSTAL

The Icelandic Corpus of Spoken Language

Nordtalk – NorFa:

Using spoken language corpora

Göteborg Aug 19-24 2002

Research on Spoken Icelandic
  • Research on
    • regional differences in pronunciation
    • language acquisition
    • the development of narrative skills
The ÍSTAL Group
The Goal
  • From the outset, the ÍSTAL group’s primary objective was to establish a corpus of spoken language for use in two broadly defined fields:
      • linguistic research on the spoken language; i.e., in syntax, morphology, conversation analysis, etc.
      • computational linguistics and language technology
Slide 5 (diagram; layout lost in extraction). Labels shown, several accompanied by question marks:
  • interviews
  • shopping
  • formal meetings
  • informal conversation
  • phone conversation
  • task-oriented dialogue
  • formal conversation (doctor/patient consultation, etc.)
  • native / non-native speakers
  • non-native speakers of Icelandic
  • children / parents

The Orthography

Standard orthography is used in ÍSTAL, but deviations from the most common pronunciation are given in brackets:

  • dálítið (a little) > dáldið > doldið

Loan words embedded in Icelandic are spelled according to Icelandic phonetic rules:

  • OK > ókei
The Header ...
  • Heiti upptöku: 04-701-02 – Number
  • Dagsetning upptöku: 040400 – Date
  • Stutt lýsing á efni: Spjall á kennarastofu – Short description
  • Kaflar umritunar: kynlífsvæðing – Topics transcribed
  • Stuttnefni: kennkynlíf – Abbreviated title
  • Lengd upptöku: 00:08:58 – Duration
  • Upptökutæki: Sony digital, mini disc MZ-B3 – Recording device
  • Þátttakandi: A = Þ1; kk 34 kennari – Participant – male 34 – teacher
  • Þátttakandi: B = Þ2; kk 41 kennari – Participant – male 41 – teacher
  • Þátttakandi: C = Þ3; kvk 40 kennari – Participant – female 40 – teacher
  • Þátttakandi: D = Þ4; kvk 45 kennari – Participant – female 45 – teacher
  • Heiti umritunar: UM-04-701-02 – Second listening/transcription
  • Umritari: KE – Transcriber’s initials
  • Dagsetning umritunar: 0800 – Date of second listening/transcription
  • Hvað umritað: – Material transcribed
....
  • Umritunarkerfi: AUGLUMSTAFS – Standard orthography
  • Yfirlesari: – Proofreader
  • Dagsetning yfirlesturs: – Proofreading date
  • Skráður tími: ?
  • Athugasemdir: Í upphafi koma Sv skólastjóri=Sv, og Be=Be inn í samtalið sem er tekið upp í frímínútum á kennarastofu. Þ1, Þ2, Þ3 og Þ4 eru samkennarar. – Comments: In the beginning of the conversation, the headmaster (Sv) and Be participate; then they leave. Participants 1, 2, 3, and 4 are colleagues, teachers in the same school.

  • 2-yfirlestur: HBE 060102 – Second proofreading 060102
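
For readers who want to process such headers automatically, the following is a minimal sketch in Python. It assumes each header line has the shape "Field: value" and that participant lines follow the pattern "A = Þ1; kk 34 kennari" seen above; the names `Participant`, `parse_participant`, and `parse_header` are hypothetical and not part of ÍSTAL.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    label: str       # label used in the transcript, e.g. "A"
    code: str        # participant code, e.g. "Þ1"
    gender: str      # "kk" (male) or "kvk" (female)
    age: int
    occupation: str  # e.g. "kennari" (teacher)


def parse_participant(value: str) -> Participant:
    # Assumed shape: "A = Þ1; kk 34 kennari"
    ids, details = value.split(";", 1)
    label, code = (part.strip() for part in ids.split("=", 1))
    gender, age, occupation = details.split(maxsplit=2)
    return Participant(label, code, gender, int(age), occupation)


def parse_header(lines):
    """Split header lines into plain fields and a list of participants."""
    fields = {}
    participants = []
    for line in lines:
        if ":" not in line:
            continue  # skip anything that is not "Field: value"
        # Split on the first colon only, so durations like 00:08:58 survive.
        name, value = line.split(":", 1)
        name, value = name.strip(), value.strip()
        if name == "Þátttakandi":
            participants.append(parse_participant(value))
        else:
            fields[name] = value
    return fields, participants


header = [
    "Heiti upptöku: 04-701-02",
    "Dagsetning upptöku: 040400",
    "Lengd upptöku: 00:08:58",
    "Þátttakandi: A = Þ1; kk 34 kennari",
    "Þátttakandi: C = Þ3; kvk 40 kennari",
]
fields, participants = parse_header(header)
```

Repeated "Þátttakandi" lines are collected in a list rather than a dictionary, because the field name occurs once per participant.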

ÍSTAL as it is now
  • The data bank contains 31 conversations with 2 to 4 participants.
  • The participants are 30-60 years old.
  • The data are collected in various geographical regions of Iceland.
  • Each transcription is marked with a header showing information on the participants’ age, gender, and relationship to one another; the duration of the conversation; and other relevant information.
  • The total duration of transcribed material is approximately 20 hours.
  • Of 31 conversations, 6 take place among males, 5 are among females, and 20 are mixed.
  • The material is transcribed according to the standard orthography with only slight deviation.
ÍSTAL’s Role in Research

The following have been presented as works in progress:

  • Comparison of word frequencies in spoken and written Icelandic (Ásta)
  • Investigation of ‘það’ (‘it’/‘there’) in Icelandic (Eiríkur)
  • Collaborative completion of turn-constructional units (TCUs) in conversation (Þórunn)
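
As an illustration of what a spoken/written word-frequency comparison can look like, here is a minimal sketch in Python. It is not the method used in the work cited above; the function names are hypothetical, and the two token lists are invented toy data, not ÍSTAL material.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Map each lower-cased word to its share of all tokens."""
    counts = Counter(token.lower() for token in tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def compare(spoken_tokens, written_tokens):
    """Return spoken minus written relative frequency for every word."""
    spoken = relative_frequencies(spoken_tokens)
    written = relative_frequencies(written_tokens)
    words = set(spoken) | set(written)
    return {w: spoken.get(w, 0.0) - written.get(w, 0.0) for w in words}


# Invented toy samples, for illustration only.
spoken_sample = "já það er það já".split()
written_sample = "það er ljóst að það er rétt".split()

difference = compare(spoken_sample, written_sample)
# Positive values: relatively more frequent in the spoken sample.
# Negative values: relatively more frequent in the written sample.
```

Relative rather than raw counts are used so that samples of different sizes can be compared.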