Turn-Taking and Diarization

Julia Hirschberg

CS 4706


Today

  • Turn-taking behaviors in human-human conversation

    • Task/circumstance dependencies

    • Conversational Analysis

    • Linguistic/cultural differences

    • How do we take and give up turns?

  • Diarization: Automatic Turn Segmentation


Turn-taking Behavior

  • Dialogue characterized by turn-taking

    • How do speakers know what to say and when to say it?

  • Conversational partners expect certain patterns of behavior in normal conversation

    Pat: You got an A? That’s great!

    Chris: Yeah, I’m really smart you know.

    Chris: Well, I was just lucky I happened to read the chapter on dialogue systems right before the test. Otherwise I never would have squeaked through.

    • General patterns in ordinary conversation

    • Deviation is significant



Expectations of What to Say May Depend on Task at Hand

  • Telephone

    • Openings

      Pat: Hello?

      Chris: Hi, Pat. It’s Chris.

      Pat: Hi!

    • Closings (6-turn)

      Chris: Well, I just wanted to see how you were doing

      Pat: Thanks for calling. We'll have to have lunch sometime

      Chris: I'd like to

      Pat: Okay

      Chris: Okay

      Pat: See you

      Chris: Yeah, see you


  • Email

    Pat: “Hi, can we switch lunch to 12:30? I’m running late.”

    Chris: “Sure. 12:30.”

    Pat: “Great. See you.”

  • Service encounters

    Clerk: Good morning. Is there something I can help you with?

    Pat: Hi. Yeah. I wonder if you could show me….

  • Meetings

    Boss: Today I want to focus on next year’s goal statements. Chris, could you report please….

    Chris: …

    Boss: Pat, now let’s hear from you…

    Pat: …

  • News broadcasts

    Anchor: …Chris Smith reports from Rome now on the upcoming conclave. Chris?

    Reporter: Thanks, Pat….. And now back to Pat Jones in New York.



Conversational Analysis (Sacks et al. ’74)

  • Can we characterize expectations of ‘what to say’ more generally?

  • ‘Rules’ of turn-taking

    • If, during this turn the current speaker has selected A as the next speaker, then A must speak next

    • If the current speaker does not select the next speaker, any other speaker may take the next turn

    • If no one else takes the next turn, the current speaker may take the next turn

  • Rules apply at Transition-Relevance Places (TRPs), points in the talk where a speaker change can occur
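The three rules above can be read as an ordered decision procedure applied at each TRP. The following is a minimal sketch of that reading, not a model from Sacks et al.: the function name and the `self_selectors` input are hypothetical, and it assumes that when several other speakers compete, the first one to start gets the turn, and that the current speaker does continue when nobody else starts.

```python
from typing import Optional, Sequence


def next_speaker(current: str,
                 selected: Optional[str],
                 self_selectors: Sequence[str]) -> str:
    """Pick the next speaker at one TRP by applying the rules in order.

    current        -- speaker holding the turn
    selected       -- speaker chosen by the current speaker, or None
    self_selectors -- speakers who try to take the turn, in the order
                      they start speaking (hypothetical input)
    """
    # Rule 1: a selected next speaker must speak next.
    if selected is not None:
        return selected
    # Rule 2: otherwise any other speaker may take the turn.
    # Assumption: the first other speaker to start gets it.
    for speaker in self_selectors:
        if speaker != current:
            return speaker
    # Rule 3: if no one else takes the turn, the current speaker may continue.
    return current


# Meeting example: the boss selects Chris, so Chris speaks even if Pat tries.
print(next_speaker("Boss", "Chris", ["Pat"]))
```

Because rule 1 is checked first, a self-selecting speaker never overrides an explicit selection in this sketch.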


Where Can Speaker Shifts Occur

  • Adjacency pairs

    • Question/answer

    • Greeting/greeting

    • Compliment/downplayer

  • Dispreferred responses

    • Silence

    • ‘No’ to a simple request without explanation

    • Changing the topic abruptly without transition

    • Important for Spoken Dialogue Systems



Cultural Differences in Turn-Taking

  • Chinese telephone conversations

    • Openings (Zhu ’04)

      • Mandarin vs. British

      • Identification differences

        • British self-report

        • Chinese callees ask the caller

    • Closings (Sun ’05)

      • 39 female-female telephone conversations

      • Closings initiated through matter-of-fact statement of intention to end conversation

      • Verbalized thanking occurs except in mother/daughter closings – not the standard English model

  • Finnish business calls (Halmari ’93) vs. American

    • Americans get right to the point

    • Finns chat



Individual Differences: British Politicians (Beattie ’82)

  • Data: 25m televised interviews before the 1979 British general election

    • Margaret Thatcher (Tory leader): the Iron Lady

    • Jim Callaghan (Prime Minister): Sunny Jim

  • Who interrupts?

    • Less intelligent, highly neurotic, extroverted

    • Men interrupt women

    • Interruptions may indicate

      • Desire for dominance

      • Desire for social approval

      • Conveyance of ‘joint enthusiasm’, heightened involvement


  • Method:

    • Identify spkr 2 attempts to take the turn

      • Smooth switches: no simultaneous speech, spkr 1’s utterance complete, turn to spkr 2

      • Simple interruptions: simultaneous speech, spkr 1 doesn’t complete utterance, turn to spkr 2

      • Overlap: simultaneous speech, spkr 1 completes utterance, turn to spkr 2

      • Butting-in: simultaneous speech but no change of turn, spkr 1 keeps the turn

      • Silent interruption: spkr 1’s utterance incomplete, no simultaneous speech, turn to spkr 2
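The five categories above are distinguished by three yes/no observations: whether there is simultaneous speech, whether spkr 1 completes the utterance, and whether the turn passes to spkr 2. A small sketch of that decision table follows. The function and argument names are hypothetical, and the "unclassified" label for an attempt with no overlap and no turn change is an assumption, since the list does not name that case.

```python
def classify_attempt(simultaneous: bool,
                     first_completes: bool,
                     turn_changes: bool) -> str:
    """Label one attempt by spkr 2 to take the turn.

    simultaneous    -- both speakers talk at the same time
    first_completes -- spkr 1's utterance is complete
    turn_changes    -- the turn passes to spkr 2
    """
    if not turn_changes:
        # Butting-in: simultaneous speech, but spkr 1 keeps the turn.
        # The no-overlap case is not in the list (assumed label).
        return "butting-in" if simultaneous else "unclassified"
    if simultaneous:
        return "overlap" if first_completes else "simple interruption"
    return "smooth switch" if first_completes else "silent interruption"


print(classify_attempt(simultaneous=True, first_completes=False, turn_changes=True))
```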


  • Analyze acoustic/prosodic and gestural information

    • Turn-yielding behavior

      • Pauses

      • Speaking rate slows

      • Drawl at end of clause

      • Drop in pitch or loudness

      • Completion of syntactic clause

      • Gesture of termination

    • Attempt suppression signals

      • Filled pauses

      • Gestures


Results

  • Mrs. Thatcher is interrupted almost twice as often as she interrupts the interviewer (19/10), unlike Callaghan (14/23)

    • Thatcher: Starts slow and gets faster, few FPs (4)

    • Callaghan: starts fast and gets slower, many FPs (22)

  • Public perception: Thatcher is domineering in interviews and Callaghan is a ‘nice guy’

    • But Thatcher does not dominate

    • Why is Thatcher interrupted?

      • Interruptions come at the end of a syntactic clause, when there is drawl on the stressed syllable in the clause and falling intonation


  • Why does she do this?

    • Speech training before election?

  • Why is she still perceived as domineering?

    • When interrupted she doesn’t cede the floor despite lengthy stretches of simultaneous speech



Diarization: Automatic Speaker Identification/Segmentation

    • Segment audio corpora (Broadcast News, meetings, telephone conversations) into speaker segments

      • Speaker segmentation

      • Speaker identification

      • Speech and music

    • Speaker segmentation (Diarization)

      • Initial segmentation

      • Segment clustering based on acoustic features

      • State-of-the-art: 8.47% error
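The two stages named above (initial segmentation, then clustering of segments by acoustic similarity) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: each frame is reduced to a single hypothetical acoustic value, boundaries are placed where that value jumps by more than a threshold, and segments are clustered greedily by their mean. Real systems use richer features and statistical distance measures; this only shows the shape of the pipeline.

```python
from typing import List, Tuple


def segment(frames: List[float], jump: float) -> List[Tuple[int, int]]:
    """Stage 1: split frames into [start, end) segments wherever two
    consecutive frame values differ by more than `jump`."""
    if not frames:
        return []
    bounds = [0]
    for i in range(1, len(frames)):
        if abs(frames[i] - frames[i - 1]) > jump:
            bounds.append(i)
    bounds.append(len(frames))
    return list(zip(bounds[:-1], bounds[1:]))


def cluster(frames: List[float],
            segments: List[Tuple[int, int]],
            radius: float) -> List[int]:
    """Stage 2: give each segment a speaker label. A segment joins the
    first cluster whose centroid is within `radius` of the segment mean;
    otherwise it starts a new cluster."""
    centroids: List[float] = []
    counts: List[int] = []
    labels: List[int] = []
    for start, end in segments:
        n = end - start
        mean = sum(frames[start:end]) / n
        for k, c in enumerate(centroids):
            if abs(mean - c) <= radius:
                # Update the centroid as a frame-weighted running mean.
                centroids[k] = (c * counts[k] + mean * n) / (counts[k] + n)
                counts[k] += n
                labels.append(k)
                break
        else:
            centroids.append(mean)
            counts.append(n)
            labels.append(len(centroids) - 1)
    return labels


# Toy input: speaker A, then speaker B, then speaker A again.
frames = [1.0, 1.1, 0.9, 5.0, 5.2, 1.0, 1.2]
segs = segment(frames, jump=2.0)
print(segs, cluster(frames, segs, radius=1.0))
```

In the toy input the first and third segments receive the same label, which is the point of the clustering stage: it links non-adjacent segments from the same speaker.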


    • Speaker identification

      • Linguistic information to identify speaker types and speaker names (LIMSI ’04)

        • Templates (“<name> has this report from <location>”)

        • Results: 10.9% error on test set

          • But only 10% of segments contain relevant patterns

          • Estimate 25% error on broadcast news if segmentation and clustering is done to id all of each speaker’s segments
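A template such as “<name> has this report from <location>” can be approximated with a regular expression over the transcript text. The sketch below is illustrative only and is not the LIMSI system: the pattern list, the function name, and the second template (modelled on anchor cue lines like "angela astore reports.") are assumptions.

```python
import re
from typing import Optional

# Hypothetical regex renderings of hand-off templates. Only the first
# corresponds to the template quoted above; the second is an assumption.
TEMPLATES = [
    re.compile(r"^(?P<name>.+?) has this report from (?P<location>.+?)\.?$"),
    re.compile(r"^(?P<name>.+?) reports\.?$"),
]


def announced_speaker(sentence: str) -> Optional[str]:
    """Return the name announced by a hand-off sentence, or None if the
    sentence matches no template."""
    text = sentence.strip()
    for pattern in TEMPLATES:
        match = pattern.match(text)
        if match:
            return match.group("name")
    return None


print(announced_speaker("angela astore reports."))
print(announced_speaker("officials tell cnn one person was injured."))
```

Most sentences return None, which mirrors the coverage limitation noted above: patterns like these identify a speaker only in the small share of segments that contain them.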


    <DOC>

    <DOCNO> CNN19980104.1130.0000 </DOCNO>

    <DOCTYPE> MISCELLANEOUS TEXT (automatic initial) </DOCTYPE>

    <DATE_TIME> 01/04/1998 11:30:00.00 </DATE_TIME>

    <BODY>

    <TEXT>

    </TEXT>

    </BODY>

    <END_TIME> 01/04/1998 11:30:34.71 </END_TIME>

    </DOC>

    <DOC>

    <DOCNO> CNN19980104.1130.0034 </DOCNO>

    <DOCTYPE> NEWS STORY </DOCTYPE>

    <DATE_TIME> 01/04/1998 11:30:34.71 </DATE_TIME>

    <BODY>

    <TEXT>

    in northern kentucky are forcing 3,000 people in two states to flee their

    homes.

    the fire started early this morning at the cargill company plant in

    maysville near the ohio river.

    authorities have been going door-to-door advising people in kentucky and

    ohio to take shelter in area high schools.

    the fire is in a building where several fertilizers and chemicals are

    stored.


    officials say all they can do is let the fire burn itself out, because

    spraying water on the flames would be too dangerous.

    <TURN>

    at the current time, our only way of getting it under control is to stay

    away from it.

    we've backed everyone off from the fire by about a mile and a quarter and

    evacuated homes in that radius and the chief threat at this point is a very

    small risk of a very large explosion caused by 400 tons of ammonia nitrate

    stored in the building.

    <TURN>

    foir people have been taken to hospitals.

    one firefighter was injured and treated on the scene.

    </TEXT>

    </BODY>

    <END_TIME> 01/04/1998 11:31:31.00 </END_TIME>

    </DOC>

    <DOC>

    <DOCNO> CNN19980104.1130.0091 </DOCNO>

    <DOCTYPE> NEWS STORY </DOCTYPE>

    <DATE_TIME> 01/04/1998 11:31:31.00 </DATE_TIME>

    <BODY>

    <TEXT>

    authorities in brooklyn, new york, say an explosion at a tire company has


    caused at least three buildings to collapse.

    it set off a four-alarm fire, which has been contained.

    officials tell cnn one person was injured.

    investigators have not determined the cause of the incident.

    </TEXT>

    </BODY>

    <END_TIME> 01/04/1998 11:31:48.11 </END_TIME>

    </DOC>

    <DOC>

    <DOCNO> CNN19980104.1130.0108 </DOCNO>

    <DOCTYPE> NEWS STORY </DOCTYPE>

    <DATE_TIME> 01/04/1998 11:31:48.11 </DATE_TIME>

    <BODY>

    <TEXT>

    unexpected weather conditions are the rule across much of the united states

    this weekend.

    angela astore reports.

    <TURN>

    <ANNOTATION> Reporter: </ANNOTATION>

    it was a nice day to play along the beach -- spend a few hours fishing --

    or get in a game of golf -- not uncommon -- unless it's january in chicago.

    record high temperatures were set yesterday from minnesota to

    massachusetts.

    warm air drawn northward from the gulf of mexico was behind the rise in the

    mercury.


    it was a different scene in the northwest, where snow is the story.

    but the winter weather didn't stop this man from getting in some warmer

    pursuits.

    and he wasn't bothered by the fact that he couldn't see where his golf

    balls landed.

    <TURN>

    it's not really where it's going to land that's important at this point

    while you are learning.

    once you've learned, then it is.

    we'll worry about that when the snow clears.

    right now, it's probably better that i don't see where they land.
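For reference, transcripts in this markup can be split into turns with a few regular expressions. This is a minimal sketch that assumes each <TURN> tag marks a speaker change and that <ANNOTATION> spans are labels rather than speech. The function name and the `SAMPLE` string (an abridged copy of one story above) are hypothetical. A <DOC> with no closing tag, like the last one shown, is skipped.

```python
import re
from typing import Dict, List

# Abridged copy of one story, used only to exercise the function.
SAMPLE = """
<DOC>
<DOCNO> CNN19980104.1130.0034 </DOCNO>
<DOCTYPE> NEWS STORY </DOCTYPE>
<BODY>
<TEXT>
the fire started early this morning at the cargill company plant in
maysville near the ohio river.
<TURN>
at the current time, our only way of getting it under control is to stay
away from it.
<TURN>
one firefighter was injured and treated on the scene.
</TEXT>
</BODY>
</DOC>
"""


def turns_by_doc(sgml: str) -> Dict[str, List[str]]:
    """Map each DOCNO to its list of turns, with whitespace normalized."""
    result: Dict[str, List[str]] = {}
    # Only complete <DOC>...</DOC> pairs are matched.
    for doc in re.findall(r"<DOC>(.*?)</DOC>", sgml, flags=re.DOTALL):
        docno = re.search(r"<DOCNO>\s*(\S+)\s*</DOCNO>", doc)
        text = re.search(r"<TEXT>(.*?)</TEXT>", doc, flags=re.DOTALL)
        if not docno or not text:
            continue
        # Drop annotation spans such as "Reporter:" (assumed to be labels).
        body = re.sub(r"<ANNOTATION>.*?</ANNOTATION>", "",
                      text.group(1), flags=re.DOTALL)
        turns = [" ".join(part.split()) for part in body.split("<TURN>")]
        result[docno.group(1)] = [t for t in turns if t]
    return result


for docno, turns in turns_by_doc(SAMPLE).items():
    print(docno, len(turns))
```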


Next Class

    • Errors and Corrections in SDS

