Introduction to CHILDES and TalkBank

Introduction to CHILDES and TalkBank. Brian MacWhinney CMU - Psychology, Modern Languages, Language Technologies Institute. The goal of TalkBank. The core idea. Human communication is a single unified process. However, patterns in communication are analyzed by 20 different fields.

Presentation Transcript


  1. Introduction to CHILDES and TalkBank Brian MacWhinney CMU - Psychology, Modern Languages, Language Technologies Institute

  2. The goal of TalkBank

  3. The core idea • Human communication is a single unified process. • However, patterns in communication are analyzed by 20 different fields. • The time scales of the processes vary from milliseconds to centuries. • But all of these processes must have their ultimate effect in the Moment. • We can capture the Moment on video.

  4. Principles • Data-sharing, Informed Consent • Multimedia • Open Access, Web Access, Commentary • Specified Format • Interoperability • Community integration

  5. Availability • http://childes.psy.cmu.edu • http://talkbank.org • programs, manuals, fonts, morphologies, CA conventions, video production guides, XML Schema, links to other programs • data can be either downloaded or played back over the web

  6. Current target areas • CHILDES • PhonBank • BilingualBank • AphasiaBank • CABank • ClassBank

  7. CHILDES • Child Language Data Exchange System • Founded in 1984 in Concord MA • Director: Brian MacWhinney macw@cmu.edu • Programmers: Leonid Spektor, Franklin Chen • 3000 Members • 130 corpora • Over 3200 published articles

  8. CHILDES and TalkBank

  9. Practical Considerations • Learning CLAN takes about a week • Transcription is slow, perhaps a 15:1 ratio of transcription time to recording time. Blitzscribe, LENA, etc. probably will not work • Currently available data may not be perfect for a given issue • Corpora may need enhancement through MOR or the Coder's Editor

  10. Tools from the Web • Data: childes.psy.cmu.edu/data • CLAN: childes.psy.cmu.edu/clan • Manuals: childes.psy.cmu.edu/manuals • Morphosyntax: childes.psy.cmu.edu/morgrams • Phon: childes.psy.cmu.edu/phon • Tutorial videos: talkbank.org/training • Digital video: talkbank.org/dv • CA Methods: talkbank.org/CABank

  11. Why no handout? • The “Overviews” link has this PPT presentation • CHILDES is now fully electronic. No more paper.

  12. Available Methods • Microanalysis - CA, phonetics, ethology • Microgenetic analysis - CA, code-switching (NEXT) • Group and treatment comparisons - Genesee • Error analysis - YipMatthews • Diffusion analysis - in preschools • Longitudinal studies - growth curves • Modeling - neural nets, dynamic systems, evolutionary models

  13. CLAN Tools • Transcribing • Editing • Counts -- FREQ, KWAL • Analyses: MOR, GRASP, PHON • Interoperability -- ELAN, Praat, SFS, EXMARaLDA, CLAPI, PHON

  14. CA marks in Unicode

  15. Transcripts linked to media

  16. Ground Rules • Ethical use, informed consent • Levels of permission • Respect for dignity of participants • Respect for contributors • Requirement to cite sources • Requirement to contribute data

  17. Info-CHILDES and Membership • Info-childes@googlegroups.com • Archived at LinguistList • Info-CHIBolts for nuts and bolts • Membership list • IASCL Membership

  18. Getting Set Up • Download CLAN from Programs link

  19. Windows issues • You can work in c:\childes • But your administrator may have this locked, so you may need shortcuts. • Windows IPA is difficult. • Windows compression may produce .wmf

  20. Downloading Manuals • CHAT • CLAN

  21. Getting Started • Open CLAN Manual to Chapter 2 • Double-click application • Control-D to open Commands Window • Set Working Directory to c:\childes\clan\lib\samples

  22. Should look like this: on Windows, the working directory will be c:\childes\clan\lib\samples

  23. Run FREQ • freq sample.cha • Hit RUN or carriage return • In output, does “want” occur 3 times?
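
As a rough illustration of what a FREQ-style count does, here is a minimal Python sketch. It is not CLAN's implementation, and the transcript in `SAMPLE` is invented for the example (it is not the contents of sample.cha). It relies only on the CHAT conventions that main speaker tiers start with `*`, headers with `@`, and dependent tiers with `%`.

```python
from collections import Counter

# Invented sample transcript (assumption): NOT the real sample.cha.
SAMPLE = """@Begin
*CHI:\tI want cookie .
*MOT:\tyou want a cookie ?
%com:\tmother laughs
*CHI:\twant more .
@End"""


def word_freq(chat_text):
    """Count word tokens on main speaker tiers (lines starting with '*')."""
    counts = Counter()
    for line in chat_text.splitlines():
        if not line.startswith("*"):
            continue  # skip @headers and %dependent tiers
        _, _, utterance = line.partition(":")
        for token in utterance.split():
            # Keep only tokens with a letter or digit; drops "." and "?".
            if any(ch.isalnum() for ch in token):
                counts[token.lower()] += 1
    return counts


counts = word_freq(SAMPLE)
for word, n in sorted(counts.items()):
    print(n, word)
```

On this invented sample, "want" is counted three times, which is the kind of check the slide asks for on the real file.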

  24. Interface Features • Help • CLAN • Files In • Recall • Set MOR, Lib, Output directories

  25. Files In

  26. Building Commands • mlu +t*CHI +f sample.cha • mlu *.cha • Wildcards • File output • *.cha
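
To make the idea behind `mlu +t*CHI` concrete, here is a hedged Python sketch that selects one speaker's tier and averages utterance length. It is a simplification: MLU is a morpheme-based measure, while this sketch counts whitespace-separated words. The sample text and the function name `mean_length_in_words` are assumptions made for the example.

```python
# Invented sample transcript (assumption): NOT the real sample.cha.
SAMPLE = """@Begin
*CHI:\tI want cookie .
*MOT:\tyou want a cookie ?
*CHI:\twant more .
@End"""


def mean_length_in_words(chat_text, speaker="CHI"):
    """Average number of words per utterance on one speaker's main tier."""
    prefix = f"*{speaker}:"
    lengths = []
    for line in chat_text.splitlines():
        if not line.startswith(prefix):
            continue  # analogous to restricting the analysis with +t*CHI
        words = [t for t in line[len(prefix):].split()
                 if any(ch.isalnum() for ch in t)]
        if words:
            lengths.append(len(words))
    return sum(lengths) / len(lengths) if lengths else 0.0


print(mean_length_in_words(SAMPLE, "CHI"))
```

Passing a different speaker code, such as "MOT", plays the role of changing the `+t` option on the command line.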

  27. Changing Directories • Set Working to: ne32 • combo +t*MOT +s"is^*ing" *.cha • Set Working to: samples • kwal +sbunny +w2 -w2 0042.cha • Triple click on output line to go back to source file
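
The keyword-and-window search in the KWAL example can be sketched in Python as follows. This is only an illustration, assuming that `+w2` and `-w2` ask for two utterances of context after and before each match. The utterances are invented and the function name is hypothetical.

```python
# Invented utterances (assumption): not taken from 0042.cha.
UTTERANCES = [
    "*MOT:\tlook .",
    "*CHI:\twhat ?",
    "*MOT:\ta bunny !",
    "*CHI:\tbunny hop .",
    "*MOT:\tyes .",
    "*CHI:\tmore .",
]


def keyword_in_context(utterances, keyword, before=2, after=2):
    """Return, for each utterance containing keyword, that utterance
    together with up to `before` preceding and `after` following ones."""
    hits = []
    for i, utt in enumerate(utterances):
        if keyword in utt.lower().split():
            start = max(0, i - before)
            hits.append(utterances[start:i + after + 1])
    return hits


for window in keyword_in_context(UTTERANCES, "bunny"):
    print("\n".join(window))
    print("----")
```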

  28. GEM • Set Working to: Workshop • GEM +s* pau001.cha • Open output, play audio

  29. Exercises - Chapter 8 • MLU50 • mlu +t*CHI +z50u +f *.cha • MLU5 • maxwd +t*CHI +g1 +c5 +dl 68.cha | mlu > 68.ml5.cex • TTR • freq +t*CHI +s"*-%%" +f *.cha
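
For the TTR exercise, the underlying arithmetic is the number of distinct word types divided by the number of word tokens. The Python sketch below shows that calculation for one speaker. It is not CLAN's FREQ, it ignores the morphological stripping that the `+s` option on the slide requests, and the sample text is invented.

```python
# Invented sample transcript (assumption): not one of the exercise files.
SAMPLE = """@Begin
*CHI:\tI want cookie .
*MOT:\tyou want a cookie ?
*CHI:\twant more .
@End"""


def type_token_ratio(chat_text, speaker="CHI"):
    """Distinct word types divided by total word tokens for one speaker."""
    prefix = f"*{speaker}:"
    tokens = []
    for line in chat_text.splitlines():
        if line.startswith(prefix):
            tokens.extend(t.lower() for t in line[len(prefix):].split()
                          if any(ch.isalnum() for ch in t))
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


print(type_token_ratio(SAMPLE, "CHI"))
```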

  30. BatchFile • maxwd +t*CHI +g1 +c5 +dl 14.cha | mlu > 14.ml5.cex • maxwd +t*CHI +g1 +c5 +dl 55.cha | mlu > 55.ml5.cex • maxwd +t*CHI +g1 +c5 +dl 66.cha | mlu > 66.ml5.cex • maxwd +t*CHI +g1 +c5 +dl 68.cha | mlu > 68.ml5.cex • maxwd +t*CHI +g1 +c5 +dl 98.cha | mlu > 98.ml5.cex • batch batch.cex • Or just run by highlighting in Commands (Windows)
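
Because the batch lines differ only in the file number, they can be generated rather than typed. The Python sketch below rebuilds the five lines from the slide; the helper name `batch_lines` is hypothetical, and the command text is copied from the slide without modification.

```python
# File numbers taken from the slide.
FILE_IDS = [14, 55, 66, 68, 98]


def batch_lines(file_ids):
    """Build one 'maxwd ... | mlu' command line per transcript number."""
    return [
        f"maxwd +t*CHI +g1 +c5 +dl {n}.cha | mlu > {n}.ml5.cex"
        for n in file_ids
    ]


print("\n".join(batch_lines(FILE_IDS)))
```

Saving the printed lines as batch.cex would give a file that could then be run with the batch command shown on the slide.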

  31. Tables

  32. The Editor

  33. Playing a linked file • Esc-8 • Esc-A • Cont-Click • F5

  34. Linking a File - F5 • Cursor on *FAT • Find file • F5 • Press space for each utterance • Save

  35. F5 Tricks • Go back to last good link • Space quickly through contained overlap • If a bullet is missing, cut and paste an old one • For precision, try Sonic Mode

  36. Sonic Mode • Esc-0 to start • Highlight area • Shift-click to move edge • Have cursor on line in file • S to insert time marks • Triple click a linked sentence

  37. Transcribing • Open new window (Command-N) • Insert headers • @Begin • @Languages: en • @Participants: CHI Target_Child, MOT Mother, FAT Father, ROS Brother • @Date • F5 with space at each utterance • Go back and transcribe each bullet (c-click) • Adjust time marks using Esc-A
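
The header block listed above can also be produced programmatically. The Python sketch below assembles only the headers named on the slide, using the slide's own values. The function name is hypothetical, and the closing `@End` line is an addition not shown on the slide (CHAT files end with it). A file built this way would probably still need further headers before it passes CHECK.

```python
# Participants as listed on the slide: (speaker code, role).
PARTICIPANTS = [
    ("CHI", "Target_Child"),
    ("MOT", "Mother"),
    ("FAT", "Father"),
    ("ROS", "Brother"),
]


def chat_skeleton(language, participants, date=None):
    """Return a bare header skeleton; each header name is followed by a tab."""
    lines = ["@Begin", f"@Languages:\t{language}"]
    roles = ", ".join(f"{code} {role}" for code, role in participants)
    lines.append(f"@Participants:\t{roles}")
    if date is not None:
        lines.append(f"@Date:\t{date}")  # value supplied by the transcriber
    lines.append("@End")  # not on the slide; closes a CHAT file
    return "\n".join(lines)


print(chat_skeleton("en", PARTICIPANTS))
```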

  38. F5, locate sound, enter bullets, click on bullets, transcribe

  39. Or use SoundWalker

  40. Or use the Video Editor

  41. CHECK • CHECK is CRUCIAL • Internal: Esc-L • External: check *.cha • External CHECK provides fuller control

  42. Options • Backup • Wrapping • Line Numbers • CHECK

  43. More Options • Line numbers • F5 bullets • SoundAnalyzer

  44. Coder's Editor • Open barry.cha • Esc-0 • Cursor on first line • Open codeshar.cut • %spa • Insert $NIA:AC:IN

  45. Coder's Editor Commands • F1 finish current tier and go to the next • Esc-c finish coding current tier • Esc-t restrict coding to a particular speaker • Esc-Esc go on to the next speaker • Esc-s rotate subcodes • Control-g cancel illegal command

  46. Send to Praat • Open Praat • Click before link • Send to Praat • Run Analysis

  47. Learning to Digitize

  48. Searching, Replacing • Cont-R, Cont-F • Space, No, !, control-G

  49. Fixing Things • CHSTRING • INSERT (inserts @ID headers) • FIXIT • LONGTIER • FIXBULLETS • REN • COMBTIER

  50. Tour of English MOR Files • Download a copy • A-rules • C-rules • Sf.cut • Lexicon
