Birmingham Corpus Linguistics Conference 2007

Spoken multimedia corpora for pedagogical purposes

Sabine Braun (University of Surrey)
Pascual Pérez-Paredes (Universidad de Murcia)
Ylva Berglund (Oxford University)

Introduction


  • The usefulness of corpora in language pedagogy is widely recognised.
  • But there is a need for pedagogically relevant corpora, reflected e.g. in initiatives to create 'ad-hoc' corpora in pedagogical contexts.
  • The creation of pedagogically relevant corpora raises challenges for corpus design.
  • Past and current initiatives have largely focussed on written corpora; spoken discourse is becoming more important in pedagogical contexts.
  • The creation of pedagogically relevant spoken corpora raises additional challenges for corpus design.
The challenges (1)

Corpus design: traditional reference corpora (content, size, data format, transcription, annotation, query)

Corpus exploitation: Data-Driven Learning (focus on non-linear reading: concordances and co-texts)

  • Corpora contain textual records of discourse; their interpretation requires (re-)contextualisation.
  • Learners may have difficulties analysing corpus data; they require pedagogical mediation.
  • Pedagogical corpus uses differ from linguistic description; this requires e.g. pedagogically motivated query options.
  • Corpora need to be integrated with curricula; this requires e.g. complementarity of content and effective delivery.

These approaches do not fully support pedagogical requirements.

The challenges (2)

Corpus design: traditionally, representation in written format

Corpus exploitation: work with text-only data and e.g. conversational markup

  • Spoken discourse is more dependent on shared physical contexts.
  • It is adjusted to aural and online perception (e.g. chunking).
  • It is affected by limitations of processing capacity (false starts, repair).
  • It is marked by accents.
  • It is multimodal.

Again, this does not fully support pedagogical requirements.

  • Format: multimedia to retain multimodal character of spoken language
  • Content: complementary with curriculum topics, more coherence than in traditional corpora
  • Pedagogically motivated transcription, annotation and alignment (transcript-video)
  • Combination of query methods: text-based exploration and application of corpus techniques
  • Pedagogical enrichment of corpora with complementary resources (e.g. exercises, explanations)
  • Effective delivery of corpora and additional resources to learners/teachers
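The transcript-video alignment mentioned above can be pictured as a list of transcript sections, each carrying start and end times in the video plus its annotation categories. The following is only a minimal sketch of that idea: the `Section` class, the `section_at` helper and the sample sentences are hypothetical illustrations, not the data model or content of ELISA or SACODEYL.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Section:
    """One transcript section aligned with a stretch of video (hypothetical model)."""
    start: float                 # start time in seconds
    end: float                   # end time in seconds (exclusive)
    text: str                    # transcript text of the section
    topics: List[str] = field(default_factory=list)  # pedagogic annotation labels


def section_at(sections: List[Section], t: float) -> Optional[Section]:
    """Return the section being played at time t, or None if t is outside all sections."""
    for s in sections:
        if s.start <= t < s.end:
            return s
    return None


# Invented sample data, for illustration only.
sections = [
    Section(0.0, 12.5, "Well, I started working here about five years ago.", ["career"]),
    Section(12.5, 30.0, "A typical day begins with a team meeting.", ["daily routine"]),
]

current = section_at(sections, 15.0)
print(current.text if current else "no section")
```

With such a structure, a player could highlight the transcript section that matches the current video position, and an index could jump from an annotated section straight to the corresponding clip.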
Corpus creation (1)
  • Examples: ELISA and SACODEYL
    • ELISA: professional English; accounts of professional life; different varieties
    • SACODEYL: 7 European languages; youth language corpora; speakers aged 13-15 and 16-18
  • Interview format
  • Video clips with transcript
  • Communicatively relevant topics, e.g. in SACODEYL topics outlined in the Common European Framework
  • Elicitation process: briefing informants and prompting them during the interview, ensuring naturally flowing discourse
Corpus creation (1)

Example of topics in SACODEYL

Corpus creation (2)
  • TEI-compliant corpora
  • Pedagogic annotation
  • XML files

Corpus creation (2)

role: Entrevistado (interviewee)
sex: Hombre (male)
age: 16

person: E
name: Andrés Mercader Rodríguez
role: Entrevistador (interviewer)
sex: Hombre (male)
age: 32

Title: La Unión Europea une a los ciudadanos
Date Recording: 2006-11-05
Date Transcription: 2007-02-02
Locale: I.E.S. Floridablanca, Murcia, España
Principal Investigator: Pascual Perez-Paredes
Researcher: Pascual Perez-Paredes
Transcriber: Encarnación Tornero Valero
Authority: SACODEYL Project
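Metadata of this kind, stored in a TEI-style XML header, can be read with a standard XML parser. The sketch below is an assumption-laden illustration: the element layout (`teiHeader`, `particDesc`, `person`, `persName`) follows general TEI naming, but the actual structure of the SACODEYL files is not shown on the slides, so the sample document and the `read_header` function are hypothetical. Only the field values are taken from the metadata above.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-style header; the real SACODEYL file layout may differ.
SAMPLE = """<TEI>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>La Unión Europea une a los ciudadanos</title>
      </titleStmt>
    </fileDesc>
    <profileDesc>
      <particDesc>
        <person xml:id="E" role="Entrevistador" sex="Hombre" age="32">
          <persName>Andrés Mercader Rodríguez</persName>
        </person>
      </particDesc>
    </profileDesc>
  </teiHeader>
</TEI>"""

# ElementTree expands the reserved xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def read_header(xml_text):
    """Extract the interview title and participant details from a TEI-style header."""
    root = ET.fromstring(xml_text)
    participants = []
    for person in root.iter("person"):
        participants.append({
            "id": person.get(XML_ID),
            "name": person.findtext("persName"),
            "role": person.get("role"),
            "sex": person.get("sex"),
            "age": person.get("age"),
        })
    return {
        "title": root.findtext(".//titleStmt/title"),
        "participants": participants,
    }


header = read_header(SAMPLE)
print(header["title"])
for p in header["participants"]:
    print(p["id"], p["name"], p["role"], p["age"])
```

Keeping such details in a standard header rather than in free text is what allows indices and search filters (for example by speaker age or role) to be built automatically.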


Corpus query
  • Query options will support text- and corpus-based exploration and include e.g.
    • Easy access to entire interviews
    • A topic index supporting the analysis of similar sections across interviews ("topic concordances")
    • Other indices based on the annotation categories
    • Ready-made data (e.g. frequency lists of each interview; selective concordances)
    • A concordancer for extended/advanced search; adapted to pedagogical requirements
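Three of the query options listed above (frequency lists, concordances and a topic index) can be illustrated in a few lines. This is a rough sketch, not the project's query software: the function names, the simple tokenizer and the sample sentences are all invented for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into word tokens (simplistic tokenizer)."""
    return re.findall(r"\w+(?:'\w+)?", text.lower())


def frequency_list(text, n=5):
    """Return the n most frequent word forms with their counts."""
    return Counter(tokenize(text)).most_common(n)


def concordance(text, keyword, width=3):
    """Return keyword-in-context lines as (left context, keyword, right context)."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def topic_index(sections):
    """Group (interview id, topic, text) records by topic across interviews."""
    index = defaultdict(list)
    for interview_id, topic, text in sections:
        index[topic].append((interview_id, text))
    return dict(index)


# Invented sample data, for illustration only.
SAMPLE = "I like my job because I meet people and I like helping them"
SECTIONS = [
    ("int01", "work", "I like my job because I meet people"),
    ("int01", "free time", "At the weekend I play football"),
    ("int02", "work", "My work starts very early"),
]

print(frequency_list(SAMPLE, 3))
for left, kw, right in concordance(SAMPLE, "like", width=2):
    print(f"{left:>12} [{kw}] {right}")
print(topic_index(SECTIONS)["work"])
```

The topic index is what makes "topic concordances" possible: instead of lining up occurrences of a word, it lines up the sections in which different speakers talk about the same subject.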
Pedagogical enrichment
  • The corpora will be enriched with prototypical learning activities.
  • These will focus on one interview section or one interview as a whole or sections across interviews…
  • They will include e.g.
    • linguistic and cultural explanations and exercises (form-focussed as well as communication-oriented),
    • (listening) comprehension and production tasks,
    • explorative tasks (concordance-based as well as interview-based).
  • Use of authoring tool Telos Language Partner to create learning packages with ranges of activities.
Corpus delivery
  • Effective delivery as a further prerequisite for integration into curriculum
  • In SACODEYL, use of Moodle learning platform, giving access to:
    • Corpora (query interfaces)
    • Resources created in the project (different types of learning activities)
    • Resources created by future corpus users
  • Method outlined is transferable to other pedagogical contexts, topics, languages
  • Method helps to use corpora more efficiently in pedagogical contexts – from sporadically used resource to systematic exploitation
  • Corpus creation complies with standards to facilitate reuse of corpora for other contexts (research)


Sabine Braun: s.braun@surrey.ac.uk

Pascual Pérez-Paredes: sacodeyl@um.es

Ylva Berglund: ylva.berglund@oucs.ox.ac.uk

And visit our poster session…

As well as our websites: