
Critical Issues in Information Systems


hazelhull

Presentation Transcript


  1. Critical Issues in Information Systems • BUSS 951 • Seminar 10: Transcription & Coding

  2. Transcription & Coding: An Introduction

  3. Transcribing & Coding • transcription and coding are a major requirement for language-based methods of analysis • transcription is the conversion of speech to writing • coding is the addition of relevant information to the transcription • both are needed because spoken and written language are very different

  4. Speech is not Writing: Differences in Spoken & Written Texts
      Spoken texts:
        + interactive: 2 or more participants
        + face-to-face: in the same place and time
        + language as action: using language to accomplish some task
        + spontaneous: without rehearsing what is going to be said
        + casual: informal and everyday
      Written texts:
        - interactive: one participant
        - face-to-face: on his or her own
        - language as action: using language to reflect
        - spontaneous: planning, drafting and rewriting
        - casual: formal and special occasions

  5. Transcribing & Coding • [Diagram: a cycle with the steps Seek to Lead-in Zone, Playback, Transcribe and Coding] • iterate until the text is transcribed and coded • seeking: cue the tape (rewind and fast forward) until you get to the part of the tape you are seeking

  6. CHAT Standard

  7. CHAT • one of the best standards is CHAT (Codes for the Human Analysis of Transcripts) • it is a well-defined standard • even in the research literature, transcriptions are often ad hoc and idiosyncratic • formal standards are difficult to obtain

  8. CHAT • developed with subsequent computer processing in mind • a suite of programs called CLAN is available to parse the text • excellent provision for creating transcripts even when the text is difficult to understand, for example when the speaker has an accent or a speech problem

  9. CHAT • the standard is extensible; it provides a consistent way of adding new headers if necessary • developed by Brian MacWhinney and Jane Walter at CHILDES (Child Language Data Exchange System), Department of Psychology, Carnegie Mellon University

  10. CHAT Structure • CHAT has a basic structure common to all transcripts • a block of so-called Constant Headers at the top of the transcript, starting with an @Begin • the body of the transcript, consisting of turns taken by speakers called Mainlines, each followed by zero or more Dependent Tiers • a single command which is used to signal the end of the transcript, @End
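The three-part structure on this slide can be checked mechanically. The following is a minimal sketch in Python, not the CLAN parser: the function names `classify_line` and `check_structure` are hypothetical, and the only rules it assumes are the ones stated on the slide (headers start with @, mainlines with *, dependent tiers with %, and the transcript opens with @Begin and closes with @End).

```python
def classify_line(line: str) -> str:
    """Return the kind of transcript line based on its first character."""
    if line.startswith("@"):
        return "header"
    if line.startswith("*"):
        return "mainline"
    if line.startswith("%"):
        return "dependent_tier"
    return "other"


def check_structure(lines: list[str]) -> list[str]:
    """Return a list of structural problems; an empty list means none found."""
    problems = []
    # Ignore blank lines and surrounding whitespace.
    content = [ln.strip() for ln in lines if ln.strip()]
    if not content or content[0] != "@Begin":
        problems.append("first line must be @Begin")
    if not content or content[-1] != "@End":
        problems.append("last line must be @End")
    seen_kinds = set()
    for ln in content:
        kind = classify_line(ln)
        # A dependent tier only makes sense below a mainline.
        if kind == "dependent_tier" and "mainline" not in seen_kinds:
            problems.append(f"dependent tier before any mainline: {ln}")
        seen_kinds.add(kind)
    return problems
```

This only checks the outer shape of a transcript; it does not validate header names or tier codes.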

  11. CHAT Structure: Top of Transcript

  12. CHAT Structure: Top of the Transcript (1) • the top of any transcript always has two compulsory commands:
        @Begin
        @Participants: MCL MicroLabs Assistant, STU Student
      • @Begin indicates the start of the transcript. It must always be the first line of any CHAT transcript. It does not include any other information...

  13. CHAT Structure: Top of the Transcript (2) • @Participants is a mandatory Constant Header (a command used only once per transcript) which lists the interactants in the transcript. As with all transcripts, the syntax is critical. • the three-letter codes after the header each indicate a person who speaks or is otherwise involved with the text • the string after the three-letter code explains the role of that participant in the text
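As an illustration of the @Participants syntax described on this slide, here is a small Python sketch. The function name `parse_participants` is hypothetical, and the sketch assumes the layout shown in this seminar: comma-separated entries, each a three-letter code followed by a free-text role.

```python
def parse_participants(header: str) -> dict[str, str]:
    """Map each three-letter participant code to its role description."""
    prefix = "@Participants:"
    if not header.startswith(prefix):
        raise ValueError("not an @Participants header")
    participants = {}
    for entry in header[len(prefix):].split(","):
        # Split "MCL MicroLabs Assistant" into the code and the role.
        parts = entry.split(maxsplit=1)
        if not parts:
            continue  # tolerate a stray comma
        code = parts[0]
        if len(code) != 3:
            raise ValueError(f"participant code must be three characters: {code!r}")
        role = parts[1].strip() if len(parts) > 1 else ""
        participants[code] = role
    return participants
```

The resulting codes are what mainlines and the optional "of XXX" headers refer back to.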

  14. CHAT Structure: Top of the Transcript (3) • below the @Begin and @Participants headers, other optional Constant Headers can be listed, including @Age of, @Sex of and @SES of:
        @Age of MCL: 35
        @SES of MCL: middle
        @Sex of MCL: male

  15. CHAT Structure: Top of the Transcript (4) • optional Constant Headers must follow the @Participants header because they need to refer to the three-letter participant identifier • whether you include them depends on whether they are significant: is the age of a participant important in the text? • a complete list follows...

  16. Table 1 : CHAT Constant Headers. Constant Headers that have proved to be useful in workplace language studies (Clarke 1996b, 1996c) are presented against a white background while less relevant Constant Headers are presented against a shaded background.
        @Begin                indicates the start of CHAT file
        @Participants:        list of actors in file
        @Age of XXX:          speaker's age in yymmdd format
        @Birth of XXX:        date of birth of speaker
        @SES of XXX:          socio-economic status of speaker
        @Education of XXX:    speaker's education in years
        @Sex of XXX:          indicates gender of the speaker
        @Filename:            name of transcription data file
        @Coding:              version of CHAT being used
        @Warning:             relative completeness of the transcript
        @End                  indicates the end of CHAT file

  17. CHAT Structure: Top of the Transcript (6) • the CHAT Constant Headers can also be represented using a syntax diagram, a notation also used for describing the syntax rules of computer languages like Pascal • a diagram follows...

  18. Figure 3 : CHAT Constant Headers Syntax Diagram

  19. CHAT Structure: Top of the Transcript (8) • Completed transcript so far...
        @Begin
        @Participants: MCL MicroLabs Assistant, STU Student
        @Age of MCL: 35
        @SES of MCL: middle
        @Sex of MCL: male
        @Age of STU: 18
        @SES of STU: middle
        @Sex of STU: male
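The optional headers in the transcript so far all follow one pattern: a name, the word "of", a participant code, a colon and a value. A hedged Python sketch of collecting them per participant follows. The name `collect_attributes` and the regular expression are assumptions made for illustration, and the pattern assumes upper-case three-letter codes as used on these slides.

```python
import re

# Matches headers of the form "@Age of MCL: 35" as written on the slides.
ATTRIBUTE_HEADER = re.compile(r"^@(\w+) of ([A-Z]{3}):\s*(.*)$")


def collect_attributes(lines: list[str]) -> dict[str, dict[str, str]]:
    """Group 'of XXX' header values by participant code."""
    attributes: dict[str, dict[str, str]] = {}
    for line in lines:
        match = ATTRIBUTE_HEADER.match(line.strip())
        if match:
            name, code, value = match.groups()
            attributes.setdefault(code, {})[name] = value
    return attributes
```

Lines that do not match the pattern, such as @Begin and @Participants, are simply skipped.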

  20. CHAT Structure: Transcript Body

  21. CHAT Structure: Transcript Body (1) • most of the transcript body consists of mainlines, which indicate that a participant is taking a turn in the conversation • other features found in the transcript body include: • Dependent Tiers, which are used to add special coding for a given turn • Changeable or Repeating Headers

  22. CHAT Structure: Mainlines (1) • a mainline is a turn taken by a participant, indicated by an * • who takes a turn is indicated by one of the participant identifiers, listed in the @Participants constant header...

  23. CHAT Structure: Mainlines (2) • the text comprising the speaker's turn is transcribed after the * and participant identifier • an example of a completed mainline:
        *MCL what software do you want

  24. CHAT Structure: Dependent Tiers (1) • Dependent Tiers are used to add extra detail • there are many different types of them • they always relate only to a specific turn and, when used, are only ever listed below the mainline to which they refer

  25. CHAT Structure: Dependent Tiers (2) • dependent tiers are identified in a transcript by the use of a % followed by the appropriate dependent tier code • the dependent tier code tells the reader what kind of information is being coded for the above mainline

  26. CHAT Structure: Dependent Tiers (3) • an example showing a mainline and its two dependent tiers (%sit, %com) is provided below:
        *MCL what software do you want
        %sit STU and MCL are at the service desk
        %com STU looks like he is lost
      • a list of valid dependent tiers follows...
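The relationship between a mainline and the dependent tiers listed below it can be sketched as a small data structure. This is an illustrative Python sketch rather than part of CHAT or CLAN: the `Turn` class and `parse_body` function are hypothetical names, and the code assumes the notation used on these slides, where whitespace separates the code from the text (a trailing colon on the code is tolerated).

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    """One mainline plus the dependent tiers that code it."""
    speaker: str
    text: str
    tiers: dict[str, str] = field(default_factory=dict)


def _split_code(line: str) -> tuple[str, str]:
    """Split '*MCL some text' or '%sit some text' into (code, text)."""
    parts = line[1:].split(maxsplit=1)
    code = parts[0].rstrip(":") if parts else ""
    text = parts[1] if len(parts) > 1 else ""
    return code, text


def parse_body(lines: list[str]) -> list[Turn]:
    """Attach each dependent tier to the mainline directly above it."""
    turns: list[Turn] = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("*"):
            speaker, text = _split_code(line)
            turns.append(Turn(speaker=speaker, text=text))
        elif line.startswith("%"):
            if not turns:
                raise ValueError(f"dependent tier without a mainline: {line}")
            code, text = _split_code(line)
            turns[-1].tiers[code] = text
    return turns
```

Storing tiers in a dictionary keyed by tier code assumes at most one tier of each type per turn; a list of pairs would be needed otherwise.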

  27. Table 4 : CHAT Dependent Tiers. Dependent Tiers that have proved to be useful in workplace language studies (Clarke 1996b, 1996c) are presented against a white background while less relevant Dependent Tiers are presented against a shaded background.
        %flo    simplified flowing original
        %pho    phonetic and phonemic transcription
        %par    paralinguistic features
        %int    intonation and prosody
        %lan    code shifting into secondary language
        %act    actions
        %fac    facial actions
        %gpx    gestures and proxemics
        %add    addressee
        %sit    situational coding
        %exp    explanation
        %com    comments by investigator/transcriber
        %alt    alternative utterance
        %tim    time stamp coding
        %spa    speech act coding
        %mor    morphemic semantics
        %phs    phrase structure notation
        %err    error coding
        %cod    general purpose coding

  28. CHAT Structure: Changeable/Repeating Headers (1) • Repeating Headers can be inserted repeatedly in a transcript, but they are only used when a significant condition has changed • once inserted in a transcript, a Repeating Header is valid for the remainder of the transcript, or until another Header of the same type overrides it
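The scoping rule on this slide (a Repeating Header stays in force until a header of the same type overrides it) can be sketched in Python as follows. The function name `annotate_with_headers` is hypothetical, and the header name @Situation and the sample turns in the usage note are used purely as illustrations, since the list of valid Repeating Headers is not reproduced in this transcript.

```python
def annotate_with_headers(lines: list[str]) -> list[tuple[str, dict[str, str]]]:
    """Pair each mainline with the Repeating Headers in effect at that point.

    Intended for body lines only; any '@Name: value' line is treated as a
    Repeating Header (an assumption made for this sketch).
    """
    in_effect: dict[str, str] = {}
    annotated = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("@") and ":" in line:
            name, _, value = line[1:].partition(":")
            # A later header of the same type overrides the earlier one.
            in_effect[name.strip()] = value.strip()
        elif line.startswith("*"):
            # Copy the dictionary so later overrides do not alter earlier turns.
            annotated.append((line, dict(in_effect)))
    return annotated
```

For example, given a hypothetical body with "@Situation: at the service desk" before the first turn and "@Situation: in the computer lab" before the second, each turn is paired with the situation that was current when it was spoken.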

  29. CHAT Structure: Changeable/Repeating Headers (2) • a list of valid Changeable or Repeating Headers is provided on the next slide • just like the Constant Headers, Changeable or Repeating Headers can be described using a syntax diagram, which is on the slide following the list

  30. CHAT Structure: Summary... so far! • so far we have described three separate types of structure that occur within the body of a CHAT transcript: • Mainlines (for transcribing turns) • Dependent Tiers (for coding turns) • Changeable or Repeating Headers

  31. CHAT Structure: Special Mainline Codes (1) • sometimes it is important to add additional information into the mainline itself • NOTE the following about the body of the CHAT transcript: • an actual turn is shown in lower case on a mainline, and • there is normally no punctuation on mainlines

  32. CHAT Structure: Special Mainline Codes (2) • this is because, when punctuation is used, it conforms to the CHAT Special Mainline Codes • Special Mainline Codes come in two types: • Utterance Junctures and Delimiters • Utterance Ambiguity Codes • we will describe both types in order...

  33. CHAT Structure: Special Mainline Codes (3) • Utterance Junctures and Delimiters: • indicate either junctures or breaks in the turn (pauses etc). These Special Mainline Codes are referred to as Utterance Internal Junctures • indicate how a turn was completed (as a question, the speaker was interrupted etc). These Special Mainline Codes are referred to as Post Utterance Delimiters

  34. CHAT Structure: Special Mainline Codes (4) • Utterance Junctures and Delimiters continued... • indicate how a turn was started, either by a participant taking up another's talk (called latching), or by completing another's talk (called completion). These Special Mainline Codes are referred to as Pre Utterance Delimiters • a list follows...

  35. Utterance Junctures and Delimiters
      (a) Utterance Internal Junctures
        [#]         Short Pause
        [#long]     Long Pause
        [#ss.mm]    Timed Pause
        ,           Comma
      (b) Post Utterance Delimiters
        .           Period
        ?           Question
        !           Exclamation
        [...]       Trailing off
        [\]         Interruption
      (c) Pre Utterance Delimiters
        [>]         Latching
        [+]         Completion
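Because each Post Utterance Delimiter appears at the end of a turn, how a turn was completed can be read off the last code on the mainline. A minimal Python sketch follows, using only the codes listed on this slide. The function name `post_utterance_delimiter` is hypothetical, and the sketch assumes the delimiter is the final token of the turn text.

```python
from typing import Optional

# Codes and meanings copied from the slide's list of Post Utterance Delimiters.
POST_UTTERANCE_DELIMITERS = {
    ".": "period",
    "?": "question",
    "!": "exclamation",
    "[...]": "trailing off",
    "[\\]": "interruption",
}


def post_utterance_delimiter(text: str) -> Optional[str]:
    """Return how the turn was completed, or None if no delimiter is present."""
    stripped = text.rstrip()
    for code, meaning in POST_UTTERANCE_DELIMITERS.items():
        if stripped.endswith(code):
            return meaning
    return None
```

A turn with no delimiter returns None, which matches the earlier remark that mainlines normally carry no punctuation.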

  36. CHAT Structure: Special Mainline Codes (6) • Utterance Ambiguity Codes can also be inserted into a mainline • used when there has been: • a problem with the transcription process, or • an unusual condition (such as a gesture substituting for a word) • in these cases special coding is required...

  37. CHAT Structure: Special Mainline Codes (7) • Utterance Ambiguity Codes may also be moved to their own dependent tiers if the mainline is getting cluttered up with coding • the table that follows shows the valid CHAT Utterance Ambiguity Codes ...

  38. CHAT Structure: Bottom of the Transcript

  39. CHAT Structure: Bottom of the Transcript (1) • the only unique syntax for the bottom of the transcript is the mandatory @End Constant Header • it is needed to indicate when a transcript is finished • a relatively complete transcript extract showing required features follows. NOTE that : is not part of the CHAT standard...

  40. Tool Support

  41. Tool Support (1) • the CHAT system has a number of tools available for it • one tool called CLAN consists of a parser for checking the syntax of CHAT transcripts • multimedia versions of CLAN are being developed; useful when meetings have been videotaped

  42. Tool Support (2): Needed for Transcription NOT Coding • these tools are great for building elaborately coded transcripts • they are not so helpful when dealing with workplace language • coding is not the major problem; it is transcription that takes the greatest effort in workplace language studies

  43. Tool Support (3): Transcription • there are of course a number of transcription systems which, when combined with CHAT and CLAN, could form a useful workplace language system • but the ‘State-of-the-Art’ is still not very good

  44. Tool Support (4): Speech Recognition? • some manufacturers claim to get 95% accuracy in transcription, but this is only possible under very constrained conditions: • these systems cannot handle speech which is continuous and flowing, because the software cannot find where words start and end • these systems cannot transcribe speech unless the system has been trained to understand each and every speaker

  45. Tool Support (5) • in some circumstances the inability of current systems to recognise Flowing Speech may not be a great problem because workplace transcripts can be sparse • some excellent systems are becoming available, e.g. Dragon DICTATE for Windows

  46. Tool Support (6) • but, it has taken the IS Discipline 20 years to come up with reasonable CASE tools to support traditional systems development activities • we may need another 20 years to provide the same level of support for semio-informatics!
