Learn about FluencyBank project's database, teaching resources, and CLASP for child language assessment. Explore the challenges and research funded by NSF/NIDCD. Discover how to code fluency and analyze fluency profiles using FluCalc software.
Workshop: FluencyBank, Child Language Assessment Project (CLASP), and Dialect Variation Nan Bernstein Ratner Julianne Garbarino Veronica Builes Courtney Overton Pamela Dominguez The University of Maryland, College Park
Acknowledgements Funding: NICHD 9R01HD082736-11 (Brian MacWhinney, PI): Corpus-based assessment of child language; Consultant, 2014-2019. NIDCD 1 R01 DC015494-01 (Brian MacWhinney, co-PI): A shared database for the study of the development of language fluency. NSF BCS-1626300/1626294: The development of language fluency across childhood. N. Bernstein Ratner (PI) & B. MacWhinney, Co-I (Collaborative Research), 2016-2019. NIDCD 1R01DC016076-01 (Nan Bernstein Ratner, PI), 2018-2023: Validation and norming of children's expressive language sample analysis measures. $1,722,800
My connection with Brian • Goes back to when he and Catherine Snow were scarfing up data for CHILDES right after its inception, and I had helpfully typed my transcripts • So my work goes back to the beginning… • Sharing data with TalkBank is like the expression “Love isn’t love until you give it away” – my original data has now been used in over 100 studies and publications • It’s one of my most cited “contributions to science” • So, junior [and senior] researchers, share your data • More on “kumbaya science” tomorrow…
Overview • I will introduce the FluencyBank project and its resources: • Database • Fluency codes • Teaching resources • FluCalc utility • I will introduce CLASP (the Child Language Assessment Project) • Uses CHILDES NA database • Kideval utility to derive 40+ language sample analysis (LSA) measures • Real-time reference norming for children 2-6 years of age • I will discuss the challenges of LSA transcription and measures when working with child (or adult) speakers of African-American English
FluencyBank Resources and Utilities • FluencyBank was funded in 2016 jointly by the National Science Foundation and the NIDCD, with awards to Brian MacWhinney and Nan Bernstein Ratner. • Its goals are to study development of typical and atypical fluency profiles across childhood. • Four current thrusts: • Database for research cooperation • Coding conventions for disfluency • Software programs to analyze fluency profiles for both clinical and research use • Teaching/instructional resources • Emphasis on the ABC’s of stuttering: affect, behavior and cognition
A sample of research hypotheses funded by the NSF/NIDCD grants: • How are fluency patterns seen in typical, language-delayed, stuttering and Spanish-English bilingual preschoolers similar or different? • What fluency profiles distinguish likelihood of recovery from childhood onset fluency disorder (stuttering) from persistence? • Do different fluency types (e.g., repetitions, “stalls”) seem to serve different functions? • Are they seen on different types of words? Or in different locations?
Status of the database: a fast tour • Fluency research is carried out by a minority of research settings, even in CSD • A huge obstacle has been failure to adopt either a standard transcription platform (although use of SALT is common), or computable, transparent codes • Reliability of fluency coding notoriously poor – need for sonic linkage to improve and verify • Some major, groundbreaking longitudinal datasets of stuttering/typical children: • Illinois Stuttering Research Project (Ehud Yairi, Nicoline Ambrose and colleagues) – already posted, but in need of organization, linkage and fluency recoding • Purdue Stuttering Project (Anne Smith, Christine Weber and colleagues): currently in fluency recoding • Iowa Stuttering Project (pending) • HARD WORK TO TRANSFORM SALT CODES (unsystematic, imprecise) to CHAT, lots of lab hours • Other major data sets, including Weismer (for fluency in LI), Hoff (fluency in bilingualism)
How coding for fluency works: • Needed to develop a system that • would work in any orthography, • CLAN’s mor parser can “see through” • includes severity information (e.g., numbers of iterated segments) • Resulting codes (live since late 2016); see SLP Guide and CHAT manuals:
Software to create fluency profiles: FluCalc • FluCalc relies on the new CHAT codes • If transcript is video/audio linked, can derive speech rate • Computes raw and proportioned values (over words or syllables) for: • Typical disfluencies (e.g., unfilled and filled pauses, phrase repetitions and revisions) • Atypical “stutterlike” disfluencies (SLDs): blocks, consonant prolongations, sound/syllable/monosyllabic whole word repetitions, vocal breaks • Differential diagnosis of “developmental disfluency”/stuttering (now called childhood onset fluency disorder in DSM-5, ICD-11), using weighted disfluency score (Ambrose & Yairi, 2005). • “Cross-tier” analysis of grammatical class and fluency (content/function word ratio of disfluent items) • Already referenced in major grad texts in stuttering!
FluCalc Demonstration: • I will run one of my original children who stutter on FluCalc
List of measures in FluCalc: Each value is reported in raw counts and proportions (over words OR syllables) • Total utterances in the sample • Total intended words, as identified by MOR • # Prolongation: raw count of sound prolongations • # Broken word • # Block • # PWR (Part-word repetition) • # PWR RU (Repetition units): iterations, or number of excess repetitions in a part-word repetition. • # Phonological fragment – abandoned word attempts, e.g. &+fr- tadpole, where the speaker appears to change word choices • # WWR (whole word repetition) • # WWR RU (repetition units; please see PWR above)
FluCalc output, continued • # Phrase repetitions • # Word revisions • # Phrase revisions • # Pauses (hesitations) • # Pause duration (if specified by coder) • # Filled pauses • # SLD (stutter-like disfluency): summing over categories in columns Prolongations through whole-word repetitions (WWR), with the exception of columns reporting repetition units (RUs) • # TD (typical disfluencies): summing categories in columns labeled Phrase Repetitions through Filled Pauses • # Total (SLD+TD): this sums all forms of disfluency, both stutter-like and typical, seen in the sample
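The summary columns described on the two measure slides can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not FluCalc's own code: the dictionary keys are hypothetical names (not FluCalc's actual column headers), proportions are assumed to be expressed per 100 words, and pause duration is assumed to be left out of the TD sum because it is a duration rather than a count.

```python
# Hypothetical column names; FluCalc's real spreadsheet headers may differ.
# SLD: columns Prolongations through WWR, skipping repetition-unit (RU) columns.
SLD_COLUMNS = ["prolongation", "broken_word", "block", "pwr",
               "phonological_fragment", "wwr"]
# TD: columns Phrase Repetitions through Filled Pauses.
# Assumption: pause duration is excluded because it is not a count.
TD_COLUMNS = ["phrase_repetition", "word_revision", "phrase_revision",
              "pause", "filled_pause"]


def summarize(counts, total_words):
    """Return raw SLD, TD and Total counts plus rates per 100 words."""
    sld = sum(counts.get(col, 0) for col in SLD_COLUMNS)
    td = sum(counts.get(col, 0) for col in TD_COLUMNS)
    total = sld + td

    def per_100(n):
        # Assumed convention: proportion expressed per 100 intended words.
        return 100.0 * n / total_words if total_words else 0.0

    return {
        "sld": sld,
        "td": td,
        "total": total,
        "sld_per_100": per_100(sld),
        "td_per_100": per_100(td),
        "total_per_100": per_100(total),
    }


# Made-up counts for a 200-word sample; RU columns are ignored by the sums.
sample = {"prolongation": 2, "block": 1, "pwr": 3, "pwr_ru": 5, "wwr": 2,
          "phrase_repetition": 1, "filled_pause": 4}
summary = summarize(sample, total_words=200)
print(summary)
```

The same function works over syllables if a syllable total is passed in place of the word total.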
Most critically, a reference value to distinguish normal disfluency from stuttering • Weighted SLD. This is an adapted version of the SLD formula for distinguishing between typical disfluency and stuttering profiles in young children. It was originated by Yairi & Ambrose (1999, 2005). • This formula multiplies the SUM of part-word and whole-word repetitions by the MEAN of the observed repetition units in the sample; it then adds this value to TWICE the sum of prolongations and blocks. • A weighted score above 4.0 exceeds the values obtained from typically fluent children and merits concern.
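The weighted SLD formula as worded on this slide can be written out as a short Python sketch. The function and parameter names are hypothetical, and the slide does not say whether the inputs are raw counts or rates; this sketch assumes the caller supplies rates per 100 words or syllables and a precomputed mean repetition-unit value.

```python
CONCERN_THRESHOLD = 4.0  # cutoff stated on the slide


def weighted_sld(pwr, wwr, mean_ru, prolongations, blocks):
    """(PWR + WWR) * mean RU + 2 * (prolongations + blocks).

    Assumption: pwr, wwr, prolongations and blocks are rates per 100
    words or syllables; mean_ru is the mean number of repetition units
    observed in the sample.
    """
    return (pwr + wwr) * mean_ru + 2 * (prolongations + blocks)


def merits_concern(score, threshold=CONCERN_THRESHOLD):
    """True when the weighted score is strictly above the threshold."""
    return score > threshold


# Made-up values: 2.0 PWR and 1.0 WWR per 100 syllables, mean RU of 1.5,
# 0.5 prolongations and 0.5 blocks per 100 syllables.
score = weighted_sld(pwr=2.0, wwr=1.0, mean_ru=1.5,
                     prolongations=0.5, blocks=0.5)
print(score, merits_concern(score))
```

With these made-up inputs the repetition term contributes 4.5 and the prolongation/block term contributes 2.0, so the score lands above the 4.0 cutoff.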
Options to analyze disfluency loci linguistically • Switch to identify part-of-speech for stuttered words • Current project: the supposed content-function “switch” between childhood and adulthood in features of stuttered words.
Switching gears: CLASP • AphasiaBank laid the groundwork for a bundled utility to profile speakers with aphasia (EVAL). • KidEval was developed shortly thereafter. • What KidEval tracks: DEMO
A longstanding set of problems in child language sample analysis (LSA) • Lots of proposed measures, but: • Few get used in clinical practice --- too time-consuming • Normative values are VERY poor --- small populations, little diversity • SALT has only 87 children under age 6!!!! • Almost all LSA measures are biased against speakers of varieties other than mainstream American English (MAE). • A set of potential solutions: • Automate the analyses (many were pre-existing, but some are novel, such as SUGAR [Pavelko & Owens, 2018]). • Use CHILDES archive to re-“norm” the measures
CLASP challenges: Variable success with conventional LSA measures • From Overton, Perry & Builes (this meeting): ability to distinguish Ellis Weismer children with LD from TD children
Your KidEval printout (Excel): We’re sorry it’s so small … it provides a LOT of data
Like Eval, KidEval now also produces dynamic reference scores (~1000 children and growing) • Need to curate and annotate individual files for grammaticality • Sample report:
A sticking point for all LSA proposals: dialect • Most LSA measures will “penalize” non-MAE speakers who do not realize final inflections, auxiliary and copular forms, etc. in speaking • Transcription challenges: • Pronunciation: non-MAE is signaled by grammar, lexical forms and phonology • Yet, CLAN is meant to ignore phonology on the main tier • “Deleted” forms – presumes standard MAE perspective • Norm challenges: covariance of non-MAE and lower SES – when does “deficiency” overlap with “disorder”?
CLASP work with AAVE is just starting • How to balance pronunciation on main tier with dialect variation? • We already encode some pronunciation variants (e.g., catenatives) • Possibility for automatic “dialect detector” for lexicalized dialect items in mor • “Omission” vs. “optional realization” of phonological and grammatical elements • Discussion of terminology – difference vs. disorder • At the same time, want to reduce clinical load in transcription – otherwise, system won’t be used.
Our clinical goal: • A free, open-access, easily used, quick, and easily interpreted set of clinical tools • One step, no complicated command lines! • That has sufficient diversity to create robust norms that do not over- or under-identify children from 2-6 years, when standardized tests perform least well. • That takes clinicians away from tedious assessment and back into the work they should be doing – intervention!!
Not to scare Brian, but… • In the process of developing the CLASP proposal, we found a large population of people not well represented by any language sample analysis utilities:
TEENBank! School-aged children from 6-18 have virtually no LSA data in TalkBank or elsewhere. And the work just keeps on rolling! Stay tuned…
Thanks to: • Brian, of course, and Catherine Snow, for roping me in • Mary, for putting up with me and putting me up • LEONID! • Julianne Garbarino • Past and present project managers: Mark Baer, Veronica Builes, Courtney Overton • All of my students and former students, who are really my colleagues in this effort • Our CLASP consultants: Jan Edwards, Barbara Pearson, Monique Mills