Communications, Collaboration, and Community

Communications, Collaboration, and Community Anoop Gupta Microsoft Research Collaborators: Michael Cohen, Ross Cutler, Zicheng Liu, Yong Rui, Kentaro Toyama, Zhengyou Zhang, and others

Deployment-Driven Multidisciplinary Research: Challenges and Opportunities Anoop Gupta Microsoft Research Collaborators: Michael Cohen, Ross Cutler, Zicheng Liu, Yong Rui, Kentaro Toyama, Zhengyou Zhang, and others

Collaboration and Multimedia Group • 16 people • 9 Researchers, 5 R-SDEs, 1 Designer, 1 Usability • Diverse: Systems, Cog Psych, Sociologist, Vision, Graphics • Focus: • Peripheral awareness and people-centric interfaces • Tele-presentation and tele-meeting technologies • Make audio-video information a first-class citizen • Enhanced online communities => Technologies, Applications, and Social Factors

Peripheral awareness and people-centric interfaces • How do we stay aware of relevant information without annoying notifications • How do we stay aware of people, communicate with them, and bring them to the front of the user interface • How can we leverage technology to provide a better idea of people/environment state

Tele-presentations and tele-meetings • Leverage the combination of • cheap sensors (cameras, microphones, …), • cheap computing power, bandwidth, and storage, • Advances in vision-graphics-SP technologies • Convincing remote presence and interactivity • Whiteboard, note-taking, local interaction tools • High quality recording and archiving • Rich indices and browsing support

Make audio-video information a first-class citizen • Low-cost and high-quality capture • Automatic index creation and highlights • Rich support for annotation and collaboration • Browsing tools and interfaces

Enhanced online communities • Tracking Interaction / Social History • Incentive Structures • Encourage high quality content creation • Encourage interaction • Discourage inappropriate behavior • Filtering and Synopsis • Community Portals

Outline • Our group • Research approach • Project samplings • Office activity modeling • Distributed meetings • Tele-presentations • Face modeling • Concluding Remarks / Challenges

Research Approach • Deployment-driven research • End-users vs. other researchers as main customer • Robustness vs. Functionality • Multiple sensor technologies with graceful degradation • Value existing infrastructure • Simplicity of set-up and operation • Design with end-user in the loop • Field evaluations • Multi-disciplinary tool-set • (Diagram labels: Build Prototype; Evaluation / Publication; Refine Prototype; Product Impact)

1. Office Activity Modeling (joint with ASI group at MSR) • Uses of Office Awareness • Intelligent messaging • Send messages on appropriate channel • instant message, office phone, e-mail, mobile, etc. • Intelligent instant messaging • Stopped typing = not there • Peripheral awareness for “buddies” • Is now a good time to drop by Jack’s office?
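
The channel-selection idea on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the `OfficeState` fields, the five-minute idle cutoff, and the routing policy in `choose_channel` are all hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class OfficeState:
    seconds_since_input: float  # time since the last keyboard/mouse event
    on_phone: bool              # e.g. reported by a TAPI-enabled phone
    in_meeting: bool            # e.g. read from the calendar


# Hypothetical cutoff: after this much idle time we treat the person as away
# ("stopped typing = not there").
IDLE_THRESHOLD_S = 300.0


def is_present(state: OfficeState) -> bool:
    """Crude presence estimate from keyboard/mouse idle time alone."""
    return state.seconds_since_input < IDLE_THRESHOLD_S


def choose_channel(state: OfficeState) -> str:
    """Pick a delivery channel; the policy below is an assumed example."""
    if state.in_meeting:
        return "e-mail"            # do not interrupt a meeting
    if not is_present(state):
        return "mobile"            # not at the desk, try to reach them elsewhere
    if state.on_phone:
        return "instant message"   # at the desk but already talking
    return "office phone"          # at the desk and free


if __name__ == "__main__":
    print(choose_channel(OfficeState(10.0, on_phone=False, in_meeting=False)))
```

A deployed system would replace the hard idle cutoff with an inferred presence probability, but the routing step would keep this shape.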

So how does the deployment-driven approach impact our decisions?

Environment and Outputs • Environment • Office with door (w/ window); Cubicle; Open plan; … • Number of people • (0 / 1+) | (0 / 1 / 1+) | (0/1/2/3/…) • Gross activity • At desk; On PC ; On phone; In meeting; … • Fine activity • Who are the people present • Reading; Answering mail; … • Activity Trends • Usually comes in at 7am, leaves at 5pm • Never comes in on weekends • …

Sensors • Keyboard / Mouse • Calendar (appointment schedule) • Desktop microphone • TAPI-enabled phone (VoIP) • Desktop camera • Other: • Motion detector; high-quality microphone / headset; bird’s-eye camera; laser/IR gates; thermal cameras; etc.

Making the Inferences… (listed in roughly increasing order of expected research interest) • Use reliable sensors as much as possible • Use reliable sensors to label data for other sensors • For vision, stick to reliably extractable, robust cues (e.g., presence of motion, optic flow) • “Quasi-supervised” learning, using data labeled as above
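
One way to read "use reliable sensors to label data for other sensors" is sketched below. Keyboard activity and the calendar label some time windows automatically, and those labels are used to fit a threshold on a vision motion score. The labeling rule, the single-threshold classifier, and all numbers are assumptions made for illustration, not the method used in the project.

```python
def auto_label(keys_typed, calendar_away):
    """Label a time window using only the reliable sensors.

    Returns True (present), False (absent), or None (no reliable evidence).
    """
    if keys_typed > 0:
        return True        # someone is typing, so someone is there
    if calendar_away:
        return False       # no typing and the calendar says the person is out
    return None            # silence alone proves nothing


def fit_threshold(samples):
    """Pick the motion-score threshold with the fewest errors on labeled data.

    samples: list of (motion_score, label) pairs; the rule is
    "present if motion_score >= threshold".
    """
    best_t, best_err = None, None
    for t in sorted({score for score, _ in samples}):
        err = sum((score >= t) != label for score, label in samples)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


# Made-up windows: (keys typed, calendar says away, vision motion score).
windows = [
    (42, False, 0.80),
    (15, False, 0.65),
    (0, True, 0.05),
    (0, True, 0.10),
    (0, False, 0.70),   # unlabeled: reading at the desk?
    (0, False, 0.02),   # unlabeled: probably an empty office
]

labeled = [(m, auto_label(k, away)) for k, away, m in windows
           if auto_label(k, away) is not None]
threshold = fit_threshold(labeled)

# The vision cue now covers the windows the reliable sensors could not label.
unlabeled_scores = [m for k, away, m in windows if auto_label(k, away) is None]
predictions = [m >= threshold for m in unlabeled_scores]
```

The point of the sketch is that no human annotates anything: the labels come from sensors that are already trusted.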

Results • Eve/Priorities project at MSR (ASI) • Integrates capture of features (keyboard/mouse use, app use, vision, audio events,…) • Language for combining low-level features • Bayesian fusion • Vision component can determine whether person is facing front or not, but still not as robust as desired • Current work in quasi-supervised learning of low-level features… Hope to deploy base versions in summer
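
As a rough illustration of Bayesian fusion of low-level features, the sketch below combines binary cues under a naive (conditionally independent) Bayes assumption. The cue names and likelihood values are invented for the example, and the independence assumption is a simplification chosen here, not a description of the Eve/Priorities system.

```python
# (P(cue fires | person present), P(cue fires | person absent)).
# All values are made up for illustration.
LIKELIHOODS = {
    "keyboard": (0.60, 0.01),
    "motion": (0.80, 0.10),
    "speech": (0.30, 0.05),
}


def fuse(observations, prior_present=0.5):
    """Return P(present | observations) assuming independent cues.

    observations maps a cue name to True/False. Sensors that are missing
    or offline are simply left out, so the estimate degrades gracefully
    toward the prior instead of failing.
    """
    p_present = prior_present
    p_absent = 1.0 - prior_present
    for name, fired in observations.items():
        p_fire_present, p_fire_absent = LIKELIHOODS[name]
        if fired:
            p_present *= p_fire_present
            p_absent *= p_fire_absent
        else:
            p_present *= 1.0 - p_fire_present
            p_absent *= 1.0 - p_fire_absent
    return p_present / (p_present + p_absent)


if __name__ == "__main__":
    print(fuse({"keyboard": True}))
    print(fuse({"keyboard": False, "motion": True}))
```

Leaving a sensor out of `observations` is how this sketch mirrors the "multiple sensor technologies with graceful degradation" goal stated earlier in the talk.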

Results (preliminary) • Concatenation of 3 sections of low-level vision data only, sampled from an 8-hour log • Unsupervised clustering segments the sections cleanly.

Results (preliminary) • Ground truth: 1 person at monitor • Correlates with high keyboard/mouse activity, no speech
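
The slides do not say which clustering algorithm was used. As a stand-in, the sketch below runs a plain one-dimensional k-means over a made-up motion-energy signal built from three concatenated sections, to show how unsupervised clustering can recover the section boundaries without labels. The feature, the data, and the choice of k-means are all assumptions.

```python
def kmeans_1d(values, k, iterations=20):
    """Tiny 1-D k-means with a deterministic start (evenly spaced order statistics)."""
    ordered = sorted(values)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
    return labels, centers


# Made-up motion energy per frame for three concatenated sections.
signal = ([0.02, 0.03, 0.01, 0.02]      # e.g. empty office
          + [0.50, 0.55, 0.48, 0.52]    # e.g. one person at the monitor
          + [0.90, 0.95, 0.92, 0.88])   # e.g. several people moving

labels, centers = kmeans_1d(signal, k=3)
```

With real vision features the clusters would be higher-dimensional, but the check is the same: cluster labels should change where the sections change.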

Benefits and Challenges • Benefits • Prioritizing problems and context • How far we need to push the solution • Earlier benefits for end-users; enables social science research • Drawbacks • Need substantial engineering (plus algorithmic) skills • Need multidisciplinary team

2. Distributed Small Group Meetings • Scenario: • Imagine 8-10 people • In conference room, from desktops, mobile • Rich back and forth interaction • Archival and browsing support

Contextualized Research Challenges • Novel camera, microphone, display systems • Speaker tracking; multi-person tracking • Gaze and pose correction • Activity tracking and gesture recognition • Graphical avatars and virtual environments • Real and virtual camera management • Automated indexing and browsing support • Integration of handheld devices • User interface / User experience

First Prototype • Omni-directional camera • Meeting environment • 360-degree panorama view • An example omni image

Second Prototype • Cost $300 vs. $10K • Much better quality ~3000 x 500 pixels • All processing done on the PC

Remote Interfaces • All-up • Computer controlled • User controlled • User + Computer + Overview

Short/Medium Term Plan • Cameras, Calibration, Stitching • Camera design to minimize parallax • Automatic camera calibration • Real-time on today’s processors • Speaker detection and multiple-person detection • Microphone array sound source localization • Computer vision tracking of multiple people • Fusing A/V for better speaker detection • Simple remote participation interface • Automatic camera management • Video compression, storage, and transmission • Automatic index creation and meeting browsing Expect to deploy in a few conference rooms during summer
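
Microphone-array sound source localization is commonly based on the time difference of arrival between microphones. The sketch below is a minimal two-microphone version under assumptions of its own: a far-field source, a brute-force cross-correlation, and made-up sample rate and spacing. It is not the algorithm from the project.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, roughly room temperature


def best_lag(a, b, max_lag):
    """Lag d (in samples) maximizing sum(a[i] * b[i + d]).

    A positive result means the sound reached microphone b later than
    microphone a, so the source is on a's side of the array.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += x * b[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def bearing_degrees(lag, sample_rate, mic_spacing):
    """Far-field angle from broadside: sin(theta) = c * tau / d."""
    tau = lag / sample_rate
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * tau / mic_spacing))
    return math.degrees(math.asin(s))


# Made-up signals: the same short pulse, arriving 3 samples later at mic b.
mic_a = [0.0] * 10 + [1.0, 2.0, 1.0] + [0.0] * 20
mic_b = [0.0] * 13 + [1.0, 2.0, 1.0] + [0.0] * 17

lag = best_lag(mic_a, mic_b, max_lag=8)
angle = bearing_degrees(lag, sample_rate=16000, mic_spacing=0.2)
```

Background noise and room reverberation, listed later among the technical challenges, are exactly what breaks this simple peak-picking, which is one reason to fuse audio with vision tracking.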

3. Tele-Presentations • Enable people to • Easily broadcast/capture lectures (speaker and audience) • Esthetically pleasing • Participate from remote locations • Solution components • Tracking cameras, microphone arrays, … • Video production rules from professionals • Mapping of rules to cameras and software video director • Remote presence and interactivity system (TELEP) • First prototype being used in the small lecture room at MSR

Key Modules • Speaker tracking and audience tracking • Computer-vision-based tracking • Microphone-array-based tracking

Key modules (cont) • Virtual video director (FSM) • Maintain min shot duration • Dynamic max shot duration • Function of shot quality • Triggers TIME_EXPIRE event • Monitoring status change • Triggers STATUS event • Encode editing knowledge into transition probabilities
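
The director described on this slide can be sketched as a small finite state machine. Only the ingredients come from the slide (minimum shot duration, a maximum duration that depends on shot quality, TIME_EXPIRE and STATUS events, and transition probabilities). The shot names, the numeric constants, the linear quality-to-duration mapping, and the rule that the minimum duration is honored even on a status change are assumptions.

```python
import random

MIN_SHOT_S = 3.0  # hypothetical minimum shot duration, in seconds

# Hypothetical transition probabilities encoding editing preferences.
TRANSITIONS = {
    "speaker": {"audience": 0.3, "overview": 0.7},
    "audience": {"speaker": 0.8, "overview": 0.2},
    "overview": {"speaker": 0.9, "audience": 0.1},
}


def max_shot_duration(quality):
    """Better shots may be held longer: 5 s at quality 0 up to 20 s at quality 1."""
    q = max(0.0, min(1.0, quality))
    return 5.0 + 15.0 * q


class VirtualDirector:
    def __init__(self, start="overview", rng=None):
        self.shot = start
        self.elapsed = 0.0
        self.rng = rng or random.Random()

    def _cut(self, usable):
        """Sample the next shot among usable ones; return False if none exists."""
        options = {s: p for s, p in TRANSITIONS[self.shot].items() if s in usable}
        if not options:
            return False
        shots = list(options)
        weights = [options[s] for s in shots]
        self.shot = self.rng.choices(shots, weights=weights)[0]
        self.elapsed = 0.0
        return True

    def tick(self, dt, quality, usable):
        """Advance time by dt seconds; return the event that caused a cut, or None."""
        self.elapsed += dt
        if self.elapsed < MIN_SHOT_S:
            return None                      # never cut before the minimum
        if self.shot not in usable:
            event = "STATUS"                 # e.g. the tracker lost the speaker
        elif self.elapsed >= max_shot_duration(quality):
            event = "TIME_EXPIRE"            # the shot has been held long enough
        else:
            return None
        return event if self._cut(usable) else None


if __name__ == "__main__":
    director = VirtualDirector(rng=random.Random(0))
    for second in range(30):
        event = director.tick(1.0, quality=0.5,
                              usable={"speaker", "audience", "overview"})
        if event:
            print(second, event, director.shot)
```

Because cuts are sampled rather than fixed, the same rule table produces varied but plausible editing, which is one reading of "encode editing knowledge into transition probabilities".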

Initial Deployment Results • Tested a human operator and our system concurrently • Field study • Lab study • Results: • Human operator better, but difference is not statistically significant • People could not distinguish which operator was human and which was computer

Technical Challenges • Design and configuration of camera/microphone systems • More robust lecturer tracking • Smooth tracking in close-up shots • Multiple lecturers • Lecturers move into the audience area • More robust audience tracking • Background noise and room reverberation • More sophisticated rules and knowledge • Human operators have much better ability to deal with exceptions • A flexible/learning automated camera management system

4. Face Modeling • Technical goals: • Build a realistic-looking face model from video images • The face model can be animated right away • Painless in data acquisition & Efficient in model building • Commodity equipment (computer+camera) • No special requirement on the acquisition condition (background, lighting, …) • Uses: • Enhanced chat / gaming environments • Conferencing over low-bandwidth links

System Overview

Examples

Example Application: Virtual Poker • Designed as a social interface • Each player controls an avatar • Some behaviors automatically generated

Virtual Poker • Players automatically turn to follow action/voice • (Speech bubble in screenshot: “I guess it’s my turn”)

Research Challenges • Teeth, tongue, eyes and hair • Personalized facial expressions • Real-time animation driven from video • Yet more robust and easy to use

Outline • Our group • Research approach • Project samplings • Office activity modeling • Distributed meetings • Tele-presentations • Face modeling • Concluding Remarks / Challenges

Concluding Remarks • Focus on deployment-driven research • Tremendous leverage in: • Prioritizing problems we explore • Context we assume while solving • How far we push the solution • Earlier benefits for end-users • Enabling social science research • Keeping management support • (Chart axes: % Complete vs. Effort Spent)

Challenges: • Need more resources (or pursue fewer things) • Need substantial engineering (plus algorithmic) skills • Premier conferences do not appreciate engineering aspects • Not all important research yields to above constraints • Some solution options: • Community shared infrastructure (environments) into which things can be plugged (e.g., SUIF for compilers) • Premier conferences / Senior researchers attitudes • Funding agency attitudes

Focus on multidisciplinary research • Tremendous leverage in providing: • More robust solutions (or solutions at all) • More cost effective solutions • Getting deployment of research ideas out to end-user and the knowledge from resulting feedback • Challenges: • Vision, Video, Graphics, Hardware, Speech, SP, … • Need diversity within the group plus close ties externally • Need supportive management and funding structure • Academic departments, lab research groups, conferences, tenure organized around traditional disciplinary boundaries • Discourages pushing one discipline as hard as possible when another provides an easier answer

Some solution components: • Strong leaders (e.g., Hennessy – Brought Arch, Compilers, Prog. Lang, OS folks together) • Premier conferences / Senior researchers attitudes • Funding agency attitudes

Questions / Discussion • Graphics: What is the killer application in the workplace? • Vision: How can we identify the state of the art for a non-expert? • Are you satisfied with the degree of connection with the end-user/reality in your sub-field? • What do you think of the role of multi-disciplinary research? Who should do it? • Do we have balance?

Graphics: What is the killer application in the workplace? • We have tried: • 3D Shell • 3D Avatars in tele-meetings • 3D in visualizations, … • … • Killer application still eludes us

Vision: Identifying the state of the art • E.g., Speech • Speaker dependent or independent • Size of vocabulary • Language model / Grammar / Domain • Microphone quality • What’s the equivalent for vision? • How can we characterize / partition / … the space so that the non-expert knows when/where vision technology can be relied upon?

Questions / Discussion
