
Þórunn Blöndal PowerPoint PPT Presentation





Presentation Transcript


Þórunn Blöndal

ÍSTAL

The Icelandic Corpus of Spoken Language

Nordtalk – NorFa:

Using spoken language corpora

Göteborg Aug 19-24 2002


Research on Spoken Icelandic

  • Research on

    • regional differences in pronunciation

    • language acquisition

    • the development of narrative skills


The ÍSTAL Group

  • Ásta Svavarsdóttir

    • The Institute of Lexicography

  • Eiríkur Rögnvaldsson

    • University of Iceland

  • Hrafnhildur Ragnarsdóttir

    • Iceland University of Education

  • Kristín Bjarnadóttir

    • The Institute of Lexicography

  • Sigurður Konráðsson

    • Iceland University of Education

  • Þóra Björk Hjartardóttir

    • University of Iceland

  • Þórunn Blöndal

    • Iceland University of Education


The Goal

  • From the outset, the ÍSTAL group’s primary objective was to establish a corpus of spoken language for use in two broadly defined fields:

    • linguistic research on the spoken language; i.e., in syntax, morphology, conversation analysis, etc.

    • computational linguistics and language technology


[Diagram of conversation types: interviews; shopping; formal meetings; informal conversation; phone conversation; task-oriented dialogue; formal conversation (doctor/patient consultation, etc.); native / non-native speakers; non-native speakers of Icelandic; children / parents. Several cells are marked "?".]


Sony MZ-B3


The Orthography

Standard orthography is used in ÍSTAL, but deviations from the most common pronunciation are given in brackets:

  • dálítið (a little) > dáldið > doldið

Loan words embedded in Icelandic are spelled according to Icelandic phonetic rules:

  • OK > ókei
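
The bracket convention described above could be processed along the following lines. This is only a sketch: the slides do not show the exact annotation syntax, so the format "standard form followed by the pronounced variant in square brackets" is an assumption, as are the function name `split_variants` and the sample sentence, which is invented for illustration.

```python
import re

# ASSUMPTION: a deviating pronunciation follows the standard spelling in
# square brackets, e.g. "dálítið [doldið]". The slides do not confirm this.
VARIANT = re.compile(r"(\w+)\s*\[([^\]]+)\]")


def split_variants(line):
    """Return the line in standard orthography plus (standard, variant) pairs."""
    pairs = VARIANT.findall(line)
    standard_text = VARIANT.sub(r"\1", line)
    return standard_text, pairs


# Invented sample line, not taken from the corpus.
text, pairs = split_variants("þetta er dálítið [doldið] skrítið ókei")
print(text)
print(pairs)
```

Keeping the variants as separate pairs leaves the standard-orthography text searchable while still recording how the word was actually pronounced.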


The Header ...

  • Heiti upptöku: 04-701-02 – Number

  • Dagsetning upptöku: 040400 – Date

  • Stutt lýsing á efni: Spjall á kennarastofu – Short description

  • Kaflar umritunar: kynlífsvæðing – Topics transcribed

  • Stuttnefni: kennkynlíf – Abbreviated title

  • Lengd upptöku: 00:08:58 – Duration

  • Upptökutæki: Sony digital, mini disc MZ-B3 – Recording device

  • Þátttakandi: A = Þ1; kk 34 kennari – Participant – male 34 - teacher

  • Þátttakandi: B = Þ2; kk 41 kennari – Participant – male 41 - teacher

  • Þátttakandi: C = Þ3; kvk 40 kennari – Participant – female 40 - teacher

  • Þátttakandi: D = Þ4; kvk 45 kennari – Participant – female 45 - teacher

  • Heiti umritunar: UM-04-701-02 – Second listening/transcription

  • Umritari: KE – Transcriber’s initials

  • Dagsetning umritunar: 0800 – Date of second listening/transcription

  • Hvað umritað: Material transcribed


  • Umritunarkerfi: AUGLUMSTAFS – Standard orthography

  • Yfirlesari: Proofreader

  • Dagsetning yfirlesturs: Proofreading date

  • Skráður tími: **?**

  • Athugasemdir: Í upphafi koma Sv skólastjóri=Sv, og Be=Be inn í samtalið sem er tekið upp í frímínútum á kennarastofu. Þ1, Þ2, Þ3 og Þ4 eru samkennarar. – Comments: In the beginning of the conversation, the headmaster (Sv) and Be participate; then they leave. Participants 1, 2, 3, and 4 are colleagues, teachers in the same school.

  • 2-yfirlestur: HBE 060102 – Second proofreading 060102
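
The header fields listed above lend themselves to machine reading. The following is a minimal sketch, assuming each field is stored as one plain `Key: value` line without the English glosses shown on the slide. The function names, the dictionary keys, and the storage format are assumptions for illustration, not part of ÍSTAL itself. The mapping of `kk` to male and `kvk` to female follows the slide's own translations.

```python
import re

# Field values copied from the header slide; the one-field-per-line
# storage format is an ASSUMPTION.
HEADER = """\
Heiti upptöku: 04-701-02
Dagsetning upptöku: 040400
Lengd upptöku: 00:08:58
Þátttakandi: A = Þ1; kk 34 kennari
Þátttakandi: B = Þ2; kk 41 kennari
Þátttakandi: C = Þ3; kvk 40 kennari
Þátttakandi: D = Þ4; kvk 45 kennari
"""

# Participant lines: label = code; gender age occupation
PARTICIPANT = re.compile(r"(\w+) = (\w+); (kk|kvk) (\d+) (.+)")
GENDER = {"kk": "male", "kvk": "female"}


def parse_header(text):
    """Split a header into single-valued fields and a participant list."""
    fields, participants = {}, []
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # skip lines that are not "Key: value"
        value = value.strip()
        if key == "Þátttakandi":
            match = PARTICIPANT.fullmatch(value)
            if match:
                label, code, gender, age, occupation = match.groups()
                participants.append({
                    "label": label,
                    "code": code,
                    "gender": GENDER[gender],
                    "age": int(age),
                    "occupation": occupation,
                })
        else:
            fields[key] = value
    return fields, participants


def duration_seconds(hhmmss):
    """Convert an HH:MM:SS duration string to seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


fields, participants = parse_header(HEADER)
print(fields["Heiti upptöku"], duration_seconds(fields["Lengd upptöku"]))
print(participants[2])
```

Structured participant records of this kind would make it straightforward to select conversations by the age or gender of the speakers.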


ÍSTAL as it is now

  • The data bank contains 31 conversations with 2 to 4 participants.

  • The participants are 30-60 years old.

  • The data are collected in various geographical regions of Iceland.

  • Each transcription is marked with a header showing information on the participants’ age, gender, and relationship to one another; the duration of the conversation; and other relevant information.

  • The total duration of transcribed material is approximately 20 hours.

  • Of 31 conversations, 6 take place among males, 5 are among females, and 20 are mixed.

  • The material is transcribed according to the standard orthography with only slight deviation.


ÍSTAL’s Role in Research

The following have been presented as works in progress:

  • Comparison of word frequencies in spoken and written Icelandic (Ásta)

  • Investigation of ‘það’ (‘it’/‘there’) in Icelandic (Eiríkur)

  • Collaborative completion of turn-constructional units (TCUs) in conversation (Þórunn)


Thank You!
