
Presenting in virtual worlds: An architecture for a 3D presenter

Presenting in virtual worlds: An architecture for a 3D presenter. Herwin van Welbergen. Supervised by: Dr. Job Zwiers, Prof. Dr. Ir. Anton Nijholt, Ir. Dennis Reidsma. What we want to do: during a presentation, several modalities are used to convey information (speech, gesture, sheets).


Presentation Transcript


  1. Presenting in virtual worlds: An architecture for a 3D presenter • Herwin van Welbergen • Supervised by: Dr. Job Zwiers, Prof. Dr. Ir. Anton Nijholt, Ir. Dennis Reidsma

  2. What we want to do • During a presentation, several modalities are used to convey information • Speech • Gesture • Sheets • Our goal is to create a realistic virtual human presenter that presents in a 3D meeting room, using several of those modalities Human Media Interaction

  3. Process of presenting

  4. Focus and approach • Focus on the presenter's behavior, not on what causes this behavior • Presentations are generated from a script, based on annotated ‘real’ presentations • Build an architecture that enables the presenter to express itself on multiple modalities • Selected modalities are implemented on the architecture in a theoretically sound way • Speech • Pointing gesture • Sheet changes • Posture and posture shifts • Evaluation/extensions

  5. Architecture concerns • Timing/synchronization between modalities • Consistency • Extensibility • Interruptible => real-time behaviour generation • Make use of the existing HMI meeting room

  6. Architecture: meeting room

  7. Architecture: sheets

  8. Architecture

  9. Script language: goal • Describe actions on different modalities • Describe the synchronization between the actions

  10. MultiModalSync • Every modality has its own channel • Synchronization is achieved by defining a leading modality • The leading modality can change over time

  11. MultiModalSync: example

  12. Synchronization of gesture and speech • Phases of a gesture • Preparation (optional) • Stroke • Retraction (optional) • Phonological synchrony rule (McNeill): • The stroke precedes or ends at the phonological peak syllable of the speech • This means that the stroke has to be synchronized with the peak of the accompanying speech
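
The synchrony rule above can be read as a scheduling constraint: once the time of the peak syllable is known, the gesture phases are placed backwards from it. The following is a minimal sketch under that reading, not the presenter's actual planner. The function name, the argument names and the choice to let the stroke end exactly at the peak (the latest moment the rule allows) are assumptions made for illustration.

```python
def schedule_pointing_gesture(peak_time, preparation_duration, stroke_duration):
    """Place gesture phases so that the stroke ends at the speech peak.

    All arguments are in seconds, measured from the start of the utterance.
    Returns (preparation_start, stroke_start, stroke_end).
    """
    stroke_end = peak_time  # assumption: use the latest moment the rule allows
    stroke_start = stroke_end - stroke_duration
    preparation_start = stroke_start - preparation_duration
    if preparation_start < 0:
        raise ValueError("the peak comes too early to fit preparation and stroke")
    return preparation_start, stroke_start, stroke_end


# Hypothetical timings: peak at 2.0 s, 0.5 s preparation, 0.25 s stroke.
print(schedule_pointing_gesture(2.0, 0.5, 0.25))  # prints (1.25, 1.75, 2.0)
```

If the preparation does not fit before the peak, a planner could shorten it or skip the gesture; this sketch simply reports the conflict.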

  13. Speech • Loquendo TTS • Synchronization on a word-level • Lip synchronization • Amplitude of the speech => opening of the jaw
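
The amplitude-to-jaw mapping can be illustrated with a short sketch. The slide does not say how the amplitude is measured or scaled, so the RMS measure, the linear mapping, the 15 degree maximum and all names below are assumptions, and nothing here uses the Loquendo API.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of one audio frame (samples in -1.0..1.0)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def jaw_opening(samples, max_amplitude=1.0, max_angle_deg=15.0):
    """Map the loudness of an audio frame linearly to a jaw rotation in degrees."""
    level = min(rms(samples) / max_amplitude, 1.0)  # clamp loud frames
    return level * max_angle_deg


print(jaw_opening([0.5, -0.5, 0.5, -0.5]))  # prints 7.5
```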

  14. Posture • Defines the start and end position for limbs moved in gesture units • Posture shifts are implemented by interpolating between the start and the end pose
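
A posture shift by interpolation could look like the sketch below. The slide does not specify the interpolation scheme or the pose representation; linear blending of per-joint angles stored in a dictionary is an assumption chosen for brevity (a real skeleton would more likely blend joint rotations as quaternions).

```python
def interpolate_pose(start_pose, end_pose, t):
    """Blend two poses; t = 0 gives start_pose, t = 1 gives end_pose.

    Poses are dicts mapping a joint name to an angle in degrees.
    Both poses are assumed to define the same joints.
    """
    t = max(0.0, min(1.0, t))  # keep the blend factor inside [0, 1]
    return {
        joint: (1.0 - t) * start_pose[joint] + t * end_pose[joint]
        for joint in start_pose
    }


# Hypothetical poses with two joints.
rest = {"left_shoulder": 0.0, "left_elbow": 10.0}
lean = {"left_shoulder": 90.0, "left_elbow": 20.0}
print(interpolate_pose(rest, lean, 0.5))
# prints {'left_shoulder': 45.0, 'left_elbow': 15.0}
```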

  15. Pointing gesture • Which modality (left hand, right hand, head)? • How long does the preparation take? • What are the end positions of hand and head? • How do the hand and the head move?

  16. Pointing gesture: modality • Point to the left with the left hand, point to the right with the right hand • During pointing, the eyes fixate on the target (gaze anchoring) • If the hands are busy doing something else, point with just the head
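
The selection rules above are simple enough to write down directly. This is a sketch only: the function and argument names are hypothetical, and it assumes that "the hands are busy" means the hand on the target's side is busy.

```python
def choose_pointing_modality(target_side, left_hand_busy=False, right_hand_busy=False):
    """Pick the limb used for pointing at a target on the given side.

    target_side is "left" or "right", seen from the presenter.
    Gaze is anchored on the target in every case, so it is not returned.
    """
    if target_side not in ("left", "right"):
        raise ValueError("target_side must be 'left' or 'right'")
    preferred_busy = left_hand_busy if target_side == "left" else right_hand_busy
    if preferred_busy:
        return "head"  # fall back to pointing with just the head
    return target_side + "_hand"


print(choose_pointing_modality("left"))                       # prints left_hand
print(choose_pointing_modality("right", right_hand_busy=True))  # prints head
```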

  17. Pointing gesture: Fitts’ law • Predicts how much time is needed to move from a start position to the target area • Depends on the distance to travel and the size of the target • Models quick, aimed pointing actions • Can be used to determine the minimum preparation time
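
As an illustration, the sketch below computes a movement time with the Shannon formulation of Fitts' law, MT = a + b * log2(D / W + 1). The slides do not say which formulation was used, and the constants a and b must be fitted to measured data; the default values here are placeholders, not results from this work.

```python
import math


def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time in seconds (Shannon formulation of Fitts' law).

    distance: distance from the start position to the target centre
    width:    size of the target along the movement direction (same unit)
    a, b:     empirical constants; the defaults are placeholder values
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    index_of_difficulty = math.log2(distance / width + 1.0)
    return a + b * index_of_difficulty


# A far, small target needs more preparation time than a near, large one.
print(fitts_movement_time(distance=0.9, width=0.1))
print(fitts_movement_time(distance=0.3, width=0.3))
```

The predicted time serves as a lower bound: a planner can start the preparation earlier than this, but not later.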

  18. Pointing gesture: Hand end position • The presenter only uses his shoulder, elbow and hand to point • The position of the wrist is known • To calculate: the rotation of elbow and shoulder • Can be found analytically • The solution has one degree of freedom • The elbow always points down
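
One common way to solve this analytically is a two-link construction: the law of cosines gives the elbow angle, and the remaining degree of freedom (the swivel of the elbow around the shoulder-wrist axis) is fixed by picking the lowest point on the circle of possible elbow positions. The sketch below follows that construction. It is not taken from the presenter's code; the function names and the convention that "down" is the negative y axis are assumptions.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def _scale(v, s):
    return tuple(x * s for x in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve_arm(shoulder, wrist, upper_len, fore_len, down=(0.0, -1.0, 0.0)):
    """Return (elbow_position, interior_elbow_angle_in_radians)."""
    to_wrist = _sub(wrist, shoulder)
    d = math.sqrt(_dot(to_wrist, to_wrist))
    if d == 0.0 or d > upper_len + fore_len or d < abs(upper_len - fore_len):
        raise ValueError("wrist target is out of reach")
    axis = _scale(to_wrist, 1.0 / d)

    # Law of cosines on the triangle shoulder-elbow-wrist.
    cos_elbow = (upper_len ** 2 + fore_len ** 2 - d ** 2) / (2 * upper_len * fore_len)
    elbow_angle = math.acos(max(-1.0, min(1.0, cos_elbow)))

    # Possible elbow positions form a circle around the shoulder-wrist axis.
    along = (upper_len ** 2 - fore_len ** 2 + d ** 2) / (2 * d)
    radius = math.sqrt(max(0.0, upper_len ** 2 - along ** 2))

    # Resolve the free swivel: take the part of "down" perpendicular to the axis.
    perp = _sub(down, _scale(axis, _dot(down, axis)))
    perp_len = math.sqrt(_dot(perp, perp))
    if perp_len < 1e-9:
        raise ValueError("arm is vertical, so 'down' does not fix the swivel")
    perp = _scale(perp, 1.0 / perp_len)

    elbow = _add(shoulder, _add(_scale(axis, along), _scale(perp, radius)))
    return elbow, elbow_angle
```

The shoulder rotation then follows from the direction shoulder-to-elbow, and the elbow rotation from the returned angle.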

  19. Pointing gesture: head end position • The neck has 3 rotational degrees of freedom • Pointing the nose at a target is a 2-dimensional task • Donders’ law for the head: each gaze direction corresponds to one unique set of values for the 3 degrees of freedom of neck rotation

  20. Pointing gesture: How does the hand move? • The velocity profile is bell-shaped • This bell is not necessarily symmetrical • Adjustable: • Length of the acceleration phase • Maximum speed • Assumption • The elbow and hand travel along the shortest path toward the end position
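
A velocity profile with these two adjustable parameters can be sketched with two raised-cosine halves of different length. The slides do not give the actual curve, so the raised-cosine shape and the parameter names are assumptions; any smooth single-peaked curve would fit the description.

```python
import math


def hand_speed(t, duration, accel_fraction=0.4, max_speed=1.0):
    """Speed at time t for a single-peaked, possibly asymmetric profile.

    accel_fraction: share of the duration spent accelerating (0 < value < 1)
    max_speed:      speed reached at the end of the acceleration phase
    """
    if duration <= 0.0:
        raise ValueError("duration must be positive")
    if not 0.0 < accel_fraction < 1.0:
        raise ValueError("accel_fraction must lie strictly between 0 and 1")
    if t <= 0.0 or t >= duration:
        return 0.0
    t_peak = accel_fraction * duration
    if t <= t_peak:
        phase = t / t_peak  # 0 -> 1 while accelerating
        return max_speed * 0.5 * (1.0 - math.cos(math.pi * phase))
    phase = (t - t_peak) / (duration - t_peak)  # 0 -> 1 while decelerating
    return max_speed * 0.5 * (1.0 + math.cos(math.pi * phase))


def distance_travelled(duration, max_speed=1.0):
    """Area under the profile; it does not depend on accel_fraction."""
    return 0.5 * max_speed * duration
```

Because the travelled distance equals half of max_speed times duration, fixing the path length and the duration also fixes the maximum speed, and vice versa.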

  21. Pointing gesture: How does the head move? • The rotation axis of the head is constant during gaze movement • The velocity profile is (again) bell-shaped
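
Rotation about a constant axis can be illustrated for the gaze direction alone with Rodrigues' rotation formula. This sketch only moves a direction vector from a start to an end direction; it ignores the torsion component that Donders' law constrains, and the function names are hypothetical. The fraction s would be driven by the velocity profile described above.

```python
import math


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    n = math.sqrt(_dot(v, v))
    if n < 1e-12:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(x / n for x in v)


def gaze_direction(start, end, s):
    """Gaze direction after a fraction s (0..1) of the head movement.

    The direction rotates about one fixed axis, perpendicular to both
    start and end. Raises ValueError if start and end are parallel.
    """
    start, end = _normalize(start), _normalize(end)
    axis = _normalize(_cross(start, end))
    total_angle = math.acos(max(-1.0, min(1.0, _dot(start, end))))
    angle = s * total_angle
    axis_cross_start = _cross(axis, start)
    # Rodrigues' formula; the axis-parallel term is zero because
    # the axis is perpendicular to the start direction.
    return tuple(math.cos(angle) * v + math.sin(angle) * c
                 for v, c in zip(start, axis_cross_start))
```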

  22. Pointing gesture: retraction (1) • Kendon: If a retraction phase occurs, then the movement in that retraction phase is symmetrical to the movement in the preparation phase • Tested using videos

  23. Pointing gesture: retraction (2)

  24. Involuntary movement • Even while standing still, our body moves in subtle ways • Eye blinking • Chest and shoulders move when we breathe in and out • Balancing • A virtual human that does not make this kind of movement will look stiff and unnatural • Simulated by putting small random movements on the joints of the presenter
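
Small random joint movements can be sketched as low-pass filtered noise added on top of the current pose. The slide only says that the movements are small and random; the smoothing filter, the amplitude and the class name below are assumptions.

```python
import random


class JointJitter:
    """Generates small, smoothly varying angle offsets per joint."""

    def __init__(self, joints, amplitude_deg=1.0, smoothing=0.9, seed=None):
        self._rng = random.Random(seed)
        self.amplitude_deg = amplitude_deg
        self.smoothing = smoothing  # closer to 1.0 gives slower drift
        self.offsets = {joint: 0.0 for joint in joints}

    def step(self):
        """Advance one animation frame and return the new offsets in degrees."""
        for joint in self.offsets:
            target = self._rng.uniform(-self.amplitude_deg, self.amplitude_deg)
            # Move only a little toward a fresh random target each frame,
            # so the motion is subtle rather than jittery.
            self.offsets[joint] = (self.smoothing * self.offsets[joint]
                                   + (1.0 - self.smoothing) * target)
        return dict(self.offsets)


jitter = JointJitter(["neck", "spine", "left_shoulder"], seed=1)
for _ in range(3):
    print(jitter.step())
```

The offsets never exceed the configured amplitude, because each new value is a weighted average of the previous offset and a target inside that range.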

  25. Demo

  26. Evaluation: architecture • Timing • Timing on a word level is sufficient to satisfy the phonological synchrony rule • More variation in timing and tighter planning can be achieved by identifying the phonological peak in words • The model of changing leading modalities is more flexible than always using speech as the leading modality

  27. Evaluation: architecture (2) • Consistency • As predicted: consistency conflicts between implemented and unimplemented modalities • Extensibility • The architecture is used in other projects at HMI • Interruptible presenter (Jaak Vlasveld) • Virtual guide (Marco van Kessel)

  28. Possible extensions (1) • Improve current features • Improve posture shift motions • Use more joints in pointing gesture to reduce stiffness • Stroke animation for pointing gesture • Synchronization at peak syllable level • Etc…

  29. Possible extensions (2) • Broaden the presenter's ability to express itself • More gesture types • Beats • Iconics • Metaphorics

  30. Possible extensions (3) • Raise the presenting process to a higher level • Now: the script determines what to express in speech and what in gesture • Next abstraction step: Implement a process that determines what to say and what gestures to make • Based on the content of the presenter's story • Can be guided by style and emotional state

  31. Questions?

  32. Easter eggs

  33. Digital entertainment in a virtual museum • Presenter as virtual museum guide • Corpus of annotated paintings • General aspects • Properties of specific sub-areas • Automatically generated presentations

  34. Pointing gesture: retraction (3) • Rules • If a pointing gesture is directly followed by another gesture: skip the retraction phase and start the new gesture • Otherwise, move back to the resting position along the same movement as in the preparation phase, but in reverse
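
The two retraction rules can be sketched as a small planning function. Representing the preparation movement as a list of sampled poses or positions is an assumption made for this example, as is the function name.

```python
def plan_retraction(preparation_path, next_gesture_follows):
    """Return the path for the retraction phase.

    preparation_path:     samples of the preparation movement, in order
    next_gesture_follows: True if another gesture starts right away
    """
    if next_gesture_follows:
        return []  # skip the retraction and start the new gesture
    # Otherwise retrace the preparation movement backward to the rest pose.
    return list(reversed(preparation_path))


prep = ["rest", "halfway", "pointing"]
print(plan_retraction(prep, next_gesture_follows=False))
# prints ['pointing', 'halfway', 'rest']
```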

  35. Evaluation: Separate modalities • Sheets • By identifying rectangular sheet areas, the pointing gesture can be adjusted to the shape of the target area • Speech • Posture • Poses are useful to define the start and end position of the body during gestures • Posture changes could be done in better ways

  36. Evaluation: Involuntary movement • Goal: reduce the stiffness of the presenter • Evaluated with a user test • 17 subjects thought the involuntary moving presenter moved in a more natural way • 1 of the subjects did not see a difference • 2 subjects thought the involuntary moving presenter moved in a more natural way • All subjects agreed that the involuntary moving presenter was less stiff

  37. Evaluation: Pointing gesture • Fitts’ law • 3 out of 4 pointing gestures in an example presentation could be modelled using Fitts’ law • Minimum preparation time is useful for gesture planning • Symmetry • Donders’ law • IK technique • Real-time • Looks somewhat stiff because only the shoulder and the elbow are used to move the hand

  38. Drinks • Torenkamer Bastille

  39. Possible extensions • Speech • Synchronization at syllable level / phonological peak • Pointing gesture • Stroke animation • Decrease stiffness by moving more joints in the body • Posture change • Predefined animation • Posture change animation model

  40. Why virtual humans? • Demonstrating and validating theories about human behavior or human movement • People respond to media in the same way they respond to people • Theory: making interaction with media more human-like makes it more pleasant and more efficient

  41. Existing script languages (1) • Use of timestamps with fixed times (NITE-XML, CoGest, etc.) • Mainly used for annotation • Limits flexibility: the timing of all actions has to be determined in advance • SMIL-like approach (CML, STEP, VHML) • Uses par, seq and wait • Every possible form of synchronization can be expressed this way • The different modalities are not clearly separated • The whole script has to be read before execution can start

  42. Existing script languages (2) • Define a leading modality that determines the timing of the other modalities • There is no single modality that determines the timing of all other modalities • If such a modality existed, it would have to be able to change

  43. Possible extensions: gesture/speech selection • Which gestures the presenter uses and what it says currently come from a script • Next logical abstraction step: create the process that determines which gestures and which speech are selected • Existing work: • For pointing gestures (Krahmer) • For iconic gestures (Cassel)

  44. Other possible extensions • Interruption • More advanced presenter • Use of fingers for gestures • Realistic models for e.g. breathing and eye blinking • Style and emotion

  45. Possible extensions: more gesture types • More gesture types • Beat • Iconic • Metaphoric • Conflict resolution • Choose another modality • Combine gestures • Do not perform one of the gestures

  46. Gesture: What is a gesture? • A movement of the body or the limbs that expresses or emphasizes an idea • What is the difference from other body movement? • Gestures are symmetrical • Peak structure • Clear start and end

  47. Gesture: structure • Gesture unit: several gestures that are performed directly one after another

  48. Requirements for the presentation script • The synchronization must not depend on constant time values • It must be possible to change the synchronizing modality • The modalities must be clearly separated, so that the script is easy to read • It must be possible to start executing the script before it has been fully read and planned

  49. MultiModalSync (2) • Channels are executed in parallel • Within a channel, expressions are executed sequentially • Synchronization points can be defined within channels or within expressions • A channel can be synchronized with other channels by waiting for a synchronization point • Expressions can use synchronization points for their timing
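
The channel model of MultiModalSync (parallel channels, sequential expressions inside a channel, and named synchronization points that other channels can wait for) can be illustrated with threads and events. This is not the MultiModalSync syntax or implementation, which the slides do not show; the step format, the class and function names and the timeout are assumptions.

```python
import threading


class SyncPoints:
    """Named synchronization points shared by all channels."""

    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def _event(self, name):
        with self._lock:
            return self._events.setdefault(name, threading.Event())

    def signal(self, name):
        self._event(name).set()

    def wait(self, name, timeout=5.0):
        if not self._event(name).wait(timeout):
            raise TimeoutError("synchronization point never reached: " + name)


def run_channels(channels):
    """Run every channel in its own thread and return the execution log.

    channels maps a channel name to a list of steps. Each step is one of
    ("do", label), ("signal", point_name) or ("wait", point_name).
    """
    sync = SyncPoints()
    log = []
    log_lock = threading.Lock()

    def run(name, steps):
        for kind, arg in steps:  # steps inside a channel run sequentially
            if kind == "wait":
                sync.wait(arg)
            elif kind == "signal":
                sync.signal(arg)
            else:
                with log_lock:
                    log.append((name, arg))

    threads = [threading.Thread(target=run, args=(name, steps))
               for name, steps in channels.items()]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log


# Speech leads here: the gesture channel waits for the sync point "peak".
script = {
    "speech": [("do", "say: look at"), ("signal", "peak"), ("do", "say: this chart")],
    "gesture": [("wait", "peak"), ("do", "pointing stroke")],
}
print(run_channels(script))
```

Making another channel emit the synchronization points is all it takes to change the leading modality, which is why no fixed time values appear in the script.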

  50. Content • Goal • Focus and approach • Architecture • Script language • Presenting modalities • Demo • Evaluation • Possible extensions
