
Using Oral History to Learn About Searching Spontaneous Conversational Speech




  1. Using Oral History to Learn About Searching Spontaneous Conversational Speech Douglas W. Oard College of Information Studies and Institute for Advanced Computer Studies University of Maryland, College Park University of Kentucky

  2. Outline • Spoken word collections • The MALACH Project • Building an IR test collection • First experiments • Some things to think about

  3. A Web of Speech?

  4. Spoken Word Collections • Broadcast programming • News, interview, talk radio, sports, entertainment • Scripted stories • Books on tape, poetry reading, theater • Spontaneous storytelling • Oral history, folklore • Incidental recording • Speeches, oral arguments, meetings, phone calls

  5. Outline • Spoken word collections • The MALACH Project • Building an IR test collection • First experiments • Some things to think about

  6. Shoah Foundation Collection • Substantial scale • 116,000 hours; 52,000 interviews; 32 languages • Spontaneous conversational speech • Accents, elderly, emotional, … • Accessible • $100 million collection and digitization investment • Manually indexed (10,000 hours) • Segmented, thesaurus terms, people, summaries • Users • A department working full time on dissemination

  7. Interview Excerpt • Audio characteristics • Accented (this one is unusually clear) • Separate channels for interviewer / interviewee • Dialog structure • Interviewers have different styles • Content characteristics • Domain-specific terms • Named entity mentions and relationships

  8. The MALACH Project • Language technology: topic segmentation, categorization, extraction, translation • Languages: English, Czech, Russian, Slovak • Speech technology, search technology, test collection • Interactive search systems: interface development, user studies

  9. English ASR Accuracy Training: 200 hours from 800 speakers
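The slide above reports English ASR accuracy; accuracy on this task is conventionally summarized as word error rate (WER), the word-level edit distance between a reference transcript and the ASR hypothesis, divided by the reference length. A minimal sketch (not the project's scoring tool, which would also handle alignment details and text normalization):

```python
def wer(ref, hyp):
    """Word error rate: word-level edit distance / reference word count."""
    r, h = ref.split(), hyp.split()
    # One-row dynamic program for Levenshtein distance over words.
    d = list(range(len(h) + 1))
    for i in range(1, len(r) + 1):
        prev, d[0] = d[0], i
        for j in range(1, len(h) + 1):
            cur = d[j]
            if r[i - 1] == h[j - 1]:
                d[j] = prev  # match: no edit needed
            else:
                d[j] = 1 + min(prev, d[j], d[j - 1])  # substitution, deletion, insertion
            prev = cur
    return d[len(h)] / len(r)
```

For example, `wer("a b c d", "a b x d")` is 0.25: one substitution over four reference words.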

  10. Outline • Spoken word collections • The MALACH Project • Building an IR test collection • First experiments • Some things to think about

  11. Who Uses the Collection? • Disciplines: history, linguistics, journalism, material culture, education, psychology, political science, law enforcement • Products: book, documentary film, research paper, CD-ROM, study guide, obituary, evidence, personal use • Based on analysis of 280 written requests

  12. Observational Studies • 8 independent searchers: Holocaust studies (2), German studies, history/political science, ethnography, sociology, documentary producer, high school teacher • 8 teamed searchers: all high school teachers, thesaurus-based search • Rich data collection: intermediary interaction, semi-structured interviews, observational notes, think-aloud, screen capture • Qualitative analysis: theory-guided coding, abductive reasoning

  13. Powerful testimonies give teachers ideas on what to discuss in the classroom (topic) and how to introduce it (activity). “The brainstorming really guided my search today, and I felt like I finally had a big enough chunk of time on this to really find something, but I need about a couple hundred more hours … Yesterday in my search, I just felt like I was kind of going around in the dark. But that productive writing session really directed my search, even though I stayed with [the same testimony] the whole time.” • Group discussions clarify themes and define activities, which hone teachers’ criteria. 8 teachers, working in groups of 4

  14. Thesaurus-Based Search 8 teachers, working in groups of 4

  15. Relevance Criteria 6 Scholars, 1 teacher, 1 film producer, working individually

  16. Topicality • Total mentions • 6 scholars, 1 teacher, 1 film producer, working individually

  17. Search Architecture • Components: query formulation, speech recognition, automatic search, boundary detection, content tagging, interactive selection

  18. MALACH Test Collection • Interviews, topic statements, comparable collection • System components: query formulation, speech recognition, automatic search, boundary detection, content tagging • Outputs: ranked lists, relevance judgments • Evaluation: mean average precision

  19. 4,000 English interviews • 9,947 segments, ~400 words each (total: 625 hours) • Drawn from the 10,000 hours with full-description manual indexing

  20. <DOCNO>VHF00017-062567.005</DOCNO> <KEYWORD> Warsaw (Poland), Poland 1935 (May 13) - 1939 (August 31), awareness of political or military events, schools </KEYWORD> <PERSON> Sophie Perutz, Henry Hemar </PERSON> <SUMMARY> AH talks about the college she attended before the war. She mentions meeting her husband. She discusses young peoples' awareness of the political events that preceded the outbreak of war. </SUMMARY> <SCRATCHPAD>graduated HS, went to college 1 year, professional college hotel management; met future husband, knew that they'd end up together; sister also in college, nice social life, lots of company, not too serious; already got news from Czechoslovakia, Sudeten, knew that Poland would be next but what could they do about it, very passive; just heard info from radio and press </SCRATCHPAD> <ASRTEXT> no no no they did no not not uh i know there was no place to go we didn't have family in a in other countries so we were not financially at the at extremely went so that was never at plano of my family it is so and so that was the atmosphere in the in the country prior to the to the war i graduate take the high school i had one year of college which was a profession and that because that was already did the practical trends f so that was a study for whatever management that eh eh education and this i i had only one that here all that at that time i met my future husband and that to me about any we knew it that way we were in and out together so and i was quite county there was so whatever i did that and this so that was the person that lived my sister was it here is first year of of colleagues and and also she had a very strongly this antisemitic trend and our parents there was a nice social life young students that we had open house always pleasant we had a lot of that company here and and we were not too serious about that she we got there we were getting the they already did knew he knew so from czechoslovakia from they saw that from other part 
and we knew the in that that he is uhhuh the hitler spicy we go into this year this direction that eh poland will be the next country but there was nothing that we would do it at that time so he was a very very he says belong to any any organizations especially that the so we just take information from the radio and from the dress </ASRTEXT>

  21. Topic Construction • 280 topical requests, in folders at VHF • From scholars, teachers, broadcasters, … • 50 selected for use in the collection • Recast in TREC topic format • Some needed to be “broadened” • 30 assessed during Summer 2003 • 28 yielded at least 5 relevant segments

  22. An Example Topic Number: 1148 Title: Jewish resistance in Europe Description: Provide testimonies or describe actions of Jewish resistance in Europe before and during the war. Narrative: The relevant material should describe actions of only- or mostly-Jewish resistance in Europe. Both individual and group-based actions are relevant. Types of actions may include survival (fleeing, hiding, saving children), testifying (alerting the outside world, writing, hiding testimonies), and fighting (partisans, uprisings, political security). Information about undifferentiated resistance groups is not relevant.
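Topics like this one follow the classic TREC layout with number, title, description, and narrative fields. A minimal sketch of reading that layout, assuming the conventional SGML-style markup (`<top>`, `<num>`, `<title>`, `<desc>`, `<narr>`) used in TREC topic files; the tag handling here is illustrative, not the project's actual tooling:

```python
import re

def parse_topics(text):
    """Parse TREC-style topic statements into dicts, one per <top> block."""
    topics = []
    for block in re.findall(r"<top>(.*?)</top>", text, re.S):
        topic = {}
        for tag, key in [("num", "number"), ("title", "title"),
                         ("desc", "description"), ("narr", "narrative")]:
            m = re.search(rf"<{tag}>(.*?)(?=<|$)", block, re.S)
            if m:
                # Drop a leading field label such as "Number:" if present.
                topic[key] = m.group(1).split(":", 1)[-1].strip()
        topics.append(topic)
    return topics
```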

  23. Assessment Strategy • Exhaustive (Cranfield) is not scalable • Pooled (TREC) is not yet possible • Requires a diverse set of ASR and IR systems • Will be used at CLEF 2005 Speech Retrieval track • Search-guided (TDT) was viable • Iterate topic research/search/assessment • Augment with review, adjudication, reassessment • Requires an effective interactive search system • 28 topics: 821 hours/3 months/4 assessors

  24. Defining Topical “Relevance” • “Classic” relevance (to “food in Auschwitz”) • Direct Knew food was sometimes withheld • Indirect Saw undernourished people • Additional relevance types • Context Intensity of manual labor • Comparison Food situation in a different camp • Pointer Mention of a study on the subject

  25. Recording Judgments • 14 topics independently assessed • Assessors later met to resolve differences • 14 topics assessed and then reviewed • Decisions of the reviewer were final Average: 3.2 minutes per judgment

  26. Mapping to Binary Relevance • Number of judgments, by type and degree of relevance • 3,643 adjudicated judgment pairs

  27. Assessor Agreement • 44% topic-averaged overlap for Direct+Indirect 2/3/4 judgments • 14 topics, 4 assessors in 6 pairings, 1,806 judgments

  28. Outline • Spoken word collections • The MALACH Project • Building an IR test collection • First experiments • Some things to think about

  29. ASR-Based Search Mean Average Precision Title queries, adjudicated judgments
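The evaluation measure named on this and the following slides, mean average precision, can be computed directly from ranked lists and relevance judgments: average the precision at each rank where a relevant segment appears, then average those values across topics. A minimal sketch:

```python
def average_precision(ranked, relevant):
    """AP: mean of precision@k over the ranks k where a relevant item appears,
    divided by the total number of relevant items for the topic."""
    hits, total = 0, 0.0
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP over topics: runs is a list of (ranked_list, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

For example, ranking ["a", "b", "c", "d"] against relevant {"a", "c"} gives AP = (1/1 + 2/3) / 2 ≈ 0.83.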

  30. Comparing Index Terms +Persons Title queries, adjudicated judgments

  31. Comparing ASR and Metadata Title queries, adjudicated judgments

  32. Failure Analysis ASR % of Metadata Title queries, adjudicated judgments

  33. What Causes the Difference? • Hypothesis 1: Good human indexers • Maybe people don’t speak the query terms • Human indexers can still detect the topic • Hypothesis 2: Weak ASR language model • ASR does best on “newspaper terms” • {Bulgaria, partisans} >> {Auschwitz, sonderkommando} • Mixture: 200 hours in-domain + gigaword corpus

  34. Failure Analysis ASR % of Metadata Title queries, adjudicated judgments

  35. Searching Manual Transcripts • Example queries: “jewish kapo(s)”, “fort ontario refugee camp” • Title queries, adjudicated judgments

  36. Ideas Explored

  37. Results on 15 Interviews • Halved the named entity word error rate (NE WER) • Compared NE WER against overall word error rate (WER) • Key result: use smaller, metadata-adapted vocabularies

  38. Categorization Using Automatic Speech Recognition • kNN; test set: 332 segments, 216 categories with 10+ training samples • Training ASR word error rate ~47% • Equal performance for human and ASR transcripts • Improvement with additional training data
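The slide describes kNN categorization over ASR transcripts. A minimal sketch of the general technique: tf-idf vectors, cosine similarity, and category scores summed over the k nearest training segments. The feature weighting, smoothing, and data layout here are illustrative assumptions, not the project's implementation:

```python
import math
from collections import Counter

def build_idf(train_docs):
    """Inverse document frequency from training token lists (+1 smoothing
    so terms appearing in every document keep a nonzero weight)."""
    df = Counter(t for doc in train_docs for t in set(doc))
    n = len(train_docs)
    return {t: math.log(n / df[t]) + 1.0 for t in df}

def vectorize(doc, idf):
    """Sparse tf-idf vector; out-of-vocabulary terms are dropped."""
    tf = Counter(doc)
    return {t: tf[t] * idf[t] for t in tf if t in idf}

def cosine(u, v):
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_categories(test_doc, train, idf, k=3, top=2):
    """Score categories by summed similarity over the k nearest training
    segments; return the highest-scoring categories."""
    q = vectorize(test_doc, idf)
    neighbors = sorted(train, key=lambda ex: cosine(q, ex["vec"]), reverse=True)[:k]
    scores = Counter()
    for ex in neighbors:
        sim = cosine(q, ex["vec"])
        for cat in ex["cats"]:
            scores[cat] += sim
    return [c for c, _ in scores.most_common(top)]
```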

  39. Segment with Categories what can you tell me about the holidays in your house they were very very nice and my father was very religious and everything was kept like it's suppose to be you know like it's written in the Torah … and there was a extra room special if a very poor man came and he stayed overnight there was a room for him to sleep there … my mother mainly she thought that this is a very big mitzvah to do you know because there in in where I come from there was a lot of poor people not from our town from out of town but use to come and sometimes they couldn't make it back home so they slept over and sometime they stayed over Shabbats … my mother helped in the business and we had a nanna that took care of us and we also had a maid in the house … my mother only did the cooking … my father with his second wife didn't have children for twenty years he lived with her and they didn't have any children … then he married my mother and with my mother they had four children … Human indexing: Jewish customs and observance; family life; socioeconomic status; Czechoslovakia 11/11/1918 - 3/14/1939. kNN: Jewish customs and observance; family life; extended family members; family homes; Poland 11/11/1918 - 8/31/1939. Precision: 2/5, Recall: 2/4, F = 44%
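The precision, recall, and F figures on the slide follow the usual set-overlap definitions: two of the five kNN-assigned categories match the four human-assigned ones, giving P = 2/5, R = 2/4, and balanced F = 2PR/(P+R) ≈ 44%. A minimal sketch:

```python
def prf(predicted, gold):
    """Precision, recall, and balanced F-measure for a set of assigned categories."""
    overlap = len(predicted & gold)
    p = overlap / len(predicted) if predicted else 0.0
    r = overlap / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```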

  40. Category Expansion • 3,199 training segments (hand-transcribed spoken words); test segments from ASR transcripts • kNN categorization assigns thesaurus terms (F = 0.19, microaveraged) • Index combines thesaurus terms with ASR words • Title queries, linear score combination, adjudicated judgments
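The slide mentions a linear score combination of the ASR-word index and the expanded thesaurus-term index. One common way to realize that is sketched below, with an illustrative per-topic min-max normalization and an assumed interpolation weight; the slide specifies neither detail:

```python
def combine(asr_scores, category_scores, weight=0.5):
    """Linear interpolation of two retrieval score dicts (doc id -> score).
    Scores are min-max normalized per list before mixing; `weight` is a
    hypothetical tuning parameter."""
    def norm(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {d: (s - lo) / span for d, s in scores.items()}
    a, c = norm(asr_scores), norm(category_scores)
    docs = set(a) | set(c)
    return {d: weight * a.get(d, 0.0) + (1 - weight) * c.get(d, 0.0) for d in docs}
```

Documents missing from one list simply contribute zero from that side, so evidence from either index can still surface a segment.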

  41. ASR-Based Search • +27% mean average precision • Average of 3.4 relevant segments in top 20 • Title queries, adjudicated judgments

  42. Sensitivity Analysis

  43. Human-Assigned Segment Boundary ... because the roads were crowded with with army units going back and forth you know .. and you also were off you had to walk no on the main road because you were afraid you were going to be picked up for work .. that's what some did they came to Loetche and some people were picked up and held four weeks for work .. when they came home they told us on the way --- segment boundary --- we came we came home was was about the time of Succoth .. you know the city was deserted there was a they were already taking people to work .. when we came home we couldn't recognize the city .. my parents first of all they confiscated everything .. they told us to get out of the orchard .. they took whatever they wanted they took over the whole ranch ... arrival

  44. Probabilistic Models for Segmentation • Model features • Semantic: left-right window similarity • Lexical: “key” words and phrases (yes: “tell me”, “back to”; no: “did they”, “and there”) • Prosodic: silence duration, rate of speech • Structural: position in the file, clause length
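The four feature families above can be packaged into a per-candidate feature vector and scored. The sketch below uses an illustrative linear scorer standing in for the probabilistic model; the feature names, cue-phrase list, and scaling constants are assumptions, not the project's actual model:

```python
def boundary_features(left_words, right_words, cue_phrase, silence_sec, position):
    """Feature vector for a candidate topic boundary, following the feature
    families on the slide (semantic, lexical, prosodic, structural)."""
    left, right = set(left_words), set(right_words)
    # Semantic: Jaccard similarity between the windows on each side;
    # low similarity suggests a topic shift.
    overlap = len(left & right) / max(len(left | right), 1)
    return {
        "window_similarity": overlap,
        "cue_phrase": 1.0 if cue_phrase in {"tell me", "back to"} else 0.0,  # lexical
        "silence": min(silence_sec / 2.0, 1.0),  # prosodic: pause length, capped at 2 s
        "position": position,                    # structural: fraction into the file
    }

def boundary_score(features, weights):
    """A simple linear score over the feature vector."""
    return sum(weights[k] * v for k, v in features.items())
```

A candidate with disjoint windows, a boundary cue phrase, and a long pause would score highly under weights that penalize window similarity and reward cues and silence.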

  45. Segmentation Using Automatic Speech Recognition • Training ASR word error rate ~47% • Equal performance for human and ASR transcripts • Modest improvement with additional ASR training data

  46. CLEF CL-SR Evaluation • Test collection release • Available on research license from ELDA • Packaged as a standard IR collection • ASR transcripts / known topic boundaries • Contrastive metadata • Training topics (with relevance judgments) • 25 double-blind evaluation topics • Runs due June 1, 2005; results returned August 1 • Plan to add Czech in 2006

  47. What Have We Learned? • User studies help guide test collection design • Named entities are important to scholars • Age at time of experience is important to teachers • Test collections guide component development • Dynamic ASR lexicon cuts NE error rate in half • Text classification seems to be helping • Presently depends on lexical overlap w/thesaurus

  48. The MALACH Team • Shoah Foundation: Sam Gustman • Cambridge University: Bill Byrne • Johns Hopkins: Jim Mayfield (APL) • Charles University: Jan Hajic • Univ. of West Bohemia: Josef Psutka • IBM TJ Watson: Bhuvana Ramabhadran, Michael Picheny, Martin Franz, Nanda Kambhatla • University of Maryland: Doug Oard (IS), Dagobert Soergel (IS), David Doermann (CS), Bonnie Dorr (CS), Philip Resnik (Linguistics)

  49. Some Things to Think About • Privacy protection • Working with real data has real consequences • Are fixed segments the right retrieval unit? • Or is it good enough to know where to start? • What will it cost to tailor an ASR system? • $100K to $1 million per application? • Is ASR fast enough to really scale up? • 0.1 to 10 machine-hours per hour of speech

  50. For More Information • The MALACH project • http://www.clsp.jhu.edu/research/malach • CLEF-2005 evaluation • http://www.clef-campaign.org • NSF/DELOS Spoken Word Access Group • http://www.dcs.shef.ac.uk/spandh/projects/swag
