An Oxygenated Presentation Manager

Larry Rudolph

Oxygen Workshop, January, 2002

Larry Rudolph & Shalini Agarwal

Goals & Overview
  • Integrate Many Oxygen Technologies
  • Application Driven
    • Use an application that we understand
    • Personally use often
    • Would help if were more human-centric
    • Portable (as opposed to E-21)
  • Develop Architectural Infrastructure
    • Exposes new requirements
  • Critique of Presentation Manager
    • What is wrong with it
    • What needs improvement

Application Scenario

An Oxygen Application

Components

  • Input
    • Vision
    • Speech
    • Touch
  • Output
    • Projector
    • Handheld
    • Archive
  • Processing
    • Changing configuration
  • Equipment
    • Today, it is too hard
      • Linux laptop; Windows laptop; camera; microphone; network; projector; power blocks
    • Tomorrow, much easier
      • a couple of H21s

Camera watching laser point on screen
  • Camera Challenges
      • Inexpensive ones have wrong focal length
      • Alignment issues
        • Use edge of screen, display pattern, figure out from what is known to be visible
        • We ended up displaying a pattern of concentric circles
      • Relative size of laser point depends on distance
        • Beyond ten feet, had to use only certain types of lasers
        • Could slow down the camera and let pixels saturate (too complicated)

Camera watching laser point on screen (cont)
  • Camera Interface
      • Click at point (x,y)
        • Hold laser at same location for 5 seconds
      • Select horizontal line ( (x1,y1) , (x2,y1) )
        • Sweep laser back and forth, line is diameter of ellipse
      • Select object centered at point (x,y)
        • Sweep laser in circle, point is center of circle
      • Previous or Next
        • Click in left (right) 1/8 of screen
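
The click and Previous/Next gestures above can be sketched as follows. This is a minimal illustration, not the authors' vision code: the `Sample` type, the normalized 0..1 screen coordinates, and the jitter tolerance `DWELL_RADIUS` are assumptions; only the 5-second hold and the 1/8-of-screen edge regions come from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    t: float  # time in seconds
    x: float  # laser position, normalized to 0..1 across the screen (assumed)
    y: float


DWELL_SECONDS = 5.0    # from the slide: hold the laser for 5 seconds
DWELL_RADIUS = 0.02    # assumed tolerance for hand jitter
EDGE_FRACTION = 1 / 8  # from the slide: left/right 1/8 of the screen


def detect_click(samples: List[Sample]) -> Optional[Tuple[float, float]]:
    """Return the (x, y) of the first dwell lasting DWELL_SECONDS, else None."""
    start = 0
    for i, s in enumerate(samples):
        anchor = samples[start]
        moved = (abs(s.x - anchor.x) > DWELL_RADIUS
                 or abs(s.y - anchor.y) > DWELL_RADIUS)
        if moved:
            start = i  # the laser left the spot; restart the dwell here
            continue
        if s.t - anchor.t >= DWELL_SECONDS:
            return (anchor.x, anchor.y)
    return None


def click_to_command(x: float) -> str:
    """Map a click's horizontal position to a presentation command."""
    if x < EDGE_FRACTION:
        return "previous"
    if x > 1 - EDGE_FRACTION:
        return "next"
    return "click"
```

A click held near the right edge of the screen would then be reported as `"next"`, and one near the left edge as `"previous"`.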

Microphone listening to speaker
  • Microphone
    • Many technologies: lapel mic; mic array; room microphone
    • Current approach: iPAQ
      • Continuous recognition
      • Push to speak
  • Audio server on iPAQ
    • Detects start and stop
      • Best results when human pushes to start and releases to stop
    • Audio wave file sent to Galaxy speech system
  • Galaxy output actions via CGI-script
    • A nice unifying mechanism
    • One more complicated component

Speaker controlling presentation via iPAQ
  • iPAQ output to CGI-script server
    • Same actions as from speech server
  • Actions are
      • Next slide, Previous slide, Goto slide #n, Goto slide named <xxx>
      • Next item, Previous item, Goto item #n, Goto item named <xxx>
      • Next animation, Previous animation, Goto animation #n
      • Start presentation <name>, End presentation, Pause presentation
      • Initialize camera, Test microphone
  • Handheld (iPAQ) display
    • GUI generated from SpeechBuilder grammar
    • List of slides, items per slide
      • Currently uses an ad-hoc solution where PowerPoint sends lists to the iPAQ. Needs a more automatic solution
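
Because the speech server and the handheld emit the same actions, a single parser can serve both. The sketch below covers only the navigation actions (next / previous / goto on slides, items, and animations). The text format of the action strings, the `Action` type, and `parse_action` are assumptions made for illustration; the slides do not specify how actions are encoded for the CGI script.

```python
import re
from typing import NamedTuple, Optional


class Action(NamedTuple):
    verb: str            # "next", "previous", or "goto"
    target: str          # "slide", "item", or "animation"
    arg: Optional[str]   # slide/item number or name; only used with "goto"


# Hypothetical wire format: "<verb> <target> [#n | named <name>]"
_PATTERN = re.compile(
    r"^(next|previous|goto)\s+(slide|item|animation)"
    r"(?:\s+(?:named\s+)?#?(\S.*))?$",
    re.IGNORECASE,
)


def parse_action(text: str) -> Optional[Action]:
    """Parse one action string; return None if it is not a valid action."""
    m = _PATTERN.match(text.strip())
    if m is None:
        return None
    verb, target, arg = m.group(1).lower(), m.group(2).lower(), m.group(3)
    # "goto" requires an argument; "next" and "previous" must not have one.
    if (verb == "goto") != (arg is not None):
        return None
    return Action(verb, target, arg)
```

Rejecting malformed strings at this point means the presentation software only ever sees actions it understands, whichever input device produced them.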

Output to projector, handheld, archive
  • Unlimited number of video / audio output producers
    • E.g. PowerPoint is just one producer of output
    • At any time, each output device has an associated producer
      • This producer can receive input from several producers
  • Handheld has proxy
    • To reduce bandwidth to iPAQ
    • Current slide, list of slides, list of commands
  • Archive
    • Each slide shown, audio (from a different microphone) sent to archive
      • Currently just gif of current slide

Processing – controlling session
  • Do not let PowerPoint control the world
    • Slide viewer; movie player; program execution; browser; etc.
    • Want to mix all types of applications
    • Presenter has control of the output
      • E.g. switch output producer from PowerPoint to media player
  • Remove interrupting technologies
    • Dynamically disconnect any input / output source
  • All done via core language
    • Or some other glue language, e.g. meta-glue
    • Which does all the other infrastructure issues
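
The idea that each output device has exactly one associated producer at a time, and that the presenter can switch or disconnect it, might be modeled as below. `OutputRouter` and its method names are hypothetical; in the talk this job belongs to the core (or another glue) language, not to a Python class.

```python
class OutputRouter:
    """Tracks which producer currently feeds each output device."""

    def __init__(self):
        self._producer_for = {}  # device name -> producer name

    def connect(self, device, producer):
        # A device has one producer at a time, so this replaces any old one.
        self._producer_for[device] = producer

    def disconnect(self, device):
        # Dynamically remove an output; unknown devices are ignored.
        self._producer_for.pop(device, None)

    def devices_for(self, producer):
        """Return the devices that would currently show this producer's output."""
        return sorted(d for d, p in self._producer_for.items() if p == producer)
```

Switching the projector from PowerPoint to a media player is then a single `connect("projector", "media_player")` call, while the handheld can stay attached to PowerPoint.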


Multi-Modal Input

Shalini Agarwal

Oxygen Conference

January 8th, 2002

Initial Experience With Presentation Manager

One Single Monolithic Context

Command within slide, between slides, between applications

Problem

Too many false positives

Preliminary Solution

Slide tracking

  • e.g. recognize “Next Slide” command only after at least 60% of words on slide have been said
  • e.g. recognize “Show Demo” only after slide 17

Still lots of problems

  • Many slide styles hard to track (e.g. figures not words on slide)
  • Tracking for within slide different than for between slides
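
The 60% rule for slide tracking can be sketched as a simple gate on the "Next Slide" command. The function names and the word-splitting regular expression are assumptions; only the 60% threshold comes from the slide.

```python
import re


def spoken_fraction(slide_text: str, transcript: str) -> float:
    """Fraction of the slide's distinct words that appear in the transcript."""
    slide_words = set(re.findall(r"[a-z0-9']+", slide_text.lower()))
    if not slide_words:
        return 0.0  # a slide with no words (e.g. only a figure) cannot be tracked
    heard = set(re.findall(r"[a-z0-9']+", transcript.lower()))
    return len(slide_words & heard) / len(slide_words)


def accept_next_slide(slide_text: str, transcript: str,
                      threshold: float = 0.6) -> bool:
    """Accept a recognized "Next Slide" only once enough of the slide was said."""
    return spoken_fraction(slide_text, transcript) >= threshold
```

The empty-slide case shows the weakness noted above: a slide made of figures never reaches the threshold, so this gate alone would block the command indefinitely.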

A Better Solution: Multiple Contexts

Very Active Research Area

Intelligent-room project; Galaxy; Others

Three layers, each having its own context

  • Slide (Next Item, Next Animation)
  • Presentation (Next Slide, Goto Conclusion, Goto Example)
  • Session (Start Presentation, Switch to Browser, Show Questions)

Challenges

  • Each context requires its own speech recognition system
  • Multicasting sound wave to each system
  • Selecting the best result
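
A minimal sketch of the multicast-and-select step follows. The keyword-overlap recognizer is a toy stand-in for a real speech recognizer, and the scoring scheme, function names, and lowercase command strings are all assumptions; the three layers and their example commands come from the slide.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A recognizer maps an utterance to (command, score), or None if nothing matches.
Recognizer = Callable[[str], Optional[Tuple[str, float]]]


def make_keyword_recognizer(commands: List[str]) -> Recognizer:
    """Toy recognizer: score is the fraction of a command's words that were heard."""
    def recognize(utterance: str) -> Optional[Tuple[str, float]]:
        heard = set(utterance.lower().split())
        best = None
        for command in commands:
            words = command.lower().split()
            score = sum(w in heard for w in words) / len(words)
            if score > 0 and (best is None or score > best[1]):
                best = (command, score)
        return best
    return recognize


def select_best(utterance: str, layers: Dict[str, Recognizer]):
    """Send the same utterance to every layer and keep the top-scoring result."""
    results = []
    for layer, recognize in layers.items():
        hit = recognize(utterance)
        if hit is not None:
            results.append((hit[1], layer, hit[0]))
    if not results:
        return None
    score, layer, command = max(results, key=lambda r: r[0])
    return layer, command, score


layers = {
    "slide": make_keyword_recognizer(["next item", "next animation"]),
    "presentation": make_keyword_recognizer(["next slide", "goto conclusion"]),
    "session": make_keyword_recognizer(["start presentation", "switch to browser"]),
}
```

With these layers, `select_best("next slide please", layers)` picks the presentation layer because its grammar matches the utterance most completely. This sketch makes no attempt to resolve ties between layers.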

Extending the Galaxy System
  • Start with context for speech and then extend
  • Note, our goals are similar but not identical to those of the Spoken Language Group
  • We are not dialog-based
  • Exploit their work
  • Follow Galaxy
    • Recognizer scores different guesses at words
    • Language Processing Unit uses input grammar to select best input sentence
    • Scott Cyphers gave us the n-best interface

[Figure: Galaxy architecture. A central Hub connects Language Generation, Speech Synthesis, Dialogue Management, Audio Server, Database, Speech Recog., Discourse Resolution, and Language Processing.]


Recognizer chooses 10 best guesses at word matches (for this context)

Language Processor picks best sentence from recognizer based on input grammar
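
The two stages described here might look like the following sketch: the recognizer hands over a scored n-best list, and the language processor returns the highest-scoring hypothesis that the layer's input grammar accepts. The regular-expression grammar and the scores are invented for illustration; Galaxy's actual n-best interface is not shown in the slides.

```python
import re
from typing import List, Optional, Tuple

NUMBER_WORDS = "one|two|three|four|five|six|seven|eight|nine|ten"

# Hypothetical input grammar for the presentation layer.
GRAMMAR = re.compile(
    rf"^(next slide|previous slide|go to slide ({NUMBER_WORDS}))$"
)


def pick_sentence(nbest: List[Tuple[str, float]]) -> Optional[str]:
    """Return the best-scoring hypothesis that the grammar accepts, else None."""
    for sentence, _score in sorted(nbest, key=lambda h: h[1], reverse=True):
        if GRAMMAR.match(sentence):
            return sentence
    return None
```

Given the hypotheses "go to twenty nine", "go to slide nine", and "go to nine", only the second is grammatical, so it is selected even if the recognizer scored another guess higher.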

System Structure

[Figure: Sound Input is sent to three Recognizer / Language Processor pairs, one per layer, and a Selector chooses among their outputs.]

  • Slide Layer: candidates “next item”, “next movie”, “previous item”; selected “next item”
  • Presentation Layer: candidates “go to slide nine”, “go to twenty nine”, “go to nine”; selected “go to slide nine”
  • Session Layer: candidates “start presentation”, “end presentation”, “start explorer”; selected “start presentation”

Add Recognizer for T9

[Figure: System structure with T9 Input added as a second input source alongside Sound Input. Each layer (Slide, Presentation, Session) has its own Recognizer and Language Processor feeding the Selector. Example outputs: “next item”, “go to slide nine”, “start presentation”.]

Add Recognizer for Graffiti

[Figure: System structure with Graffiti Input added alongside T9 Input and Sound Input. Each layer (Slide, Presentation, Session) has Recognizers and a Language Processor feeding the Selector. Example outputs: “next item”, “go to slide nine”, “start presentation”.]

Other Input Modes
  • T9 (telephone keypad)
    • To input a, b, or c press “2”
    • Current cell phones have a dictionary to select the correct word

    • Lots of false positives (very annoying)
      • Remember my introduction?
    • Using an application-dependent grammar would reduce errors
  • Pen-based character input
    • Use strokes to input characters
    • Current palm pilot only recognizes “Graffiti” alphabet
    • Lots of false positives (very annoying)
    • Using an application-dependent grammar would reduce errors
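
To illustrate why an application-dependent grammar helps, the sketch below decodes a T9 key sequence against two vocabularies. The keypad layout is the standard telephone one; the two word lists are made up for the example.

```python
# Standard telephone keypad letters.
T9_KEYS = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
           "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: d for d, letters in T9_KEYS.items() for ch in letters}


def to_digits(word: str) -> str:
    """Key sequence a user would press to type this word."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def t9_candidates(digits: str, vocabulary) -> list:
    """All vocabulary words that share the given key sequence."""
    return sorted(w for w in vocabulary if to_digits(w) == digits)


# Illustrative vocabularies (assumptions, not from the talk).
GENERAL_DICTIONARY = {"into", "goto", "good", "home"}
PRESENTATION_WORDS = {"next", "previous", "goto", "slide", "item"}
```

The keys 4-6-8-6 are ambiguous between "goto" and "into" in the general dictionary, but only "goto" is a word the presentation grammar can use, so the ambiguity disappears.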

Replacing the Recognizers
  • Build recognizers for T9 and Graffiti
  • Use Galaxy system to process results from new recognizers

[Figure: Galaxy architecture with T9 Recog. and Graffiti Recog. added alongside Speech Recog. The Hub also connects Language Generation, Speech Synthesis, Dialogue Management, Audio Server, Database, Discourse Resolution, and Language Processing.]

Conclusion
  • Each application defines an input grammar
  • This grammar can be used to
    • Ensure that each application gets valid input
        • It might not be what the user wanted, but the application will understand it
    • Reduce false-positives
    • Identify the input suitable for associated application
        • Choose the application with the highest score
        • If tie, must do something else (future research)
    • Enable T9, Graffiti, Speech, other input modes

Vision / Gesture Recognition

Laser Pointer

  • Great for drawing attention to content
    • Audience is primary consumer
    • Secondary use to control presentation
  • But it is not a mouse
    • Semantics are tied to slide context
    • Differs from Intelligent-room use
  • Small number of identified gestures
    • Gestures easily punctuated
  • Low computational overhead
    • Soon will be handled with an H21

Critique of Vision / Gesture Recognition

Laser Pointer

  • Great for drawing attention to content
    • Cheap technology but mostly distracting
    • Too shaky, imprecise
  • But it is not a mouse
    • More awkward to use than mouse
    • Another gadget to hold in the hand, button to identify, batteries to maintain
  • Small number of identified gestures
    • There are better ways of drawing attention to slide content
    • I rarely use it and don’t like it when others do
  • Low computational overhead
    • Dumb vs Intelligent Device Discussion

Speech Recognition
  • Initially seems like great idea
    • Speaker is already speaking, so can use it to control presentation
  • Want passive, intelligent listener
    • Not a dialog
    • No “prompt”: it is an alienating distraction
  • Want no mistakes
    • For dialog, better to guess than ignore
    • For us, high cost for incorrect guess
    • Most words are not relevant to speech system
  • More trouble than it is worth
    • But may be good for real-time search of content

More useful aspect – Output modalities
  • Presenter has put the time and effort into the production
    • Simpler is better
  • Audience has harder task
    • Understand material being presented
    • Record thoughts, impressions, connections
    • Filter for later review
    • Process in real-time
    • Keep up with presentation
    • Do all this with minimal distractions
  • Output modalities
    • Content for live audience
    • Content for speaker (superset of audience)
    • Content for retrieval
      • Correlate notes with content

Record and correlate notes with presentation

Assumptions
  • Actuators / Sensors (I/O) in the environment
  • Many are shared by apps & users
  • Many are flaky / faulty
  • “User” does not know much about them
  • Environment, application, and users’ desires change over time

An Oxygen Application
  • Interconnected Collection of Stuff
  • Who specifies the stuff?
    • I don’t know, but it’s mostly virtual stuff
    • Many layers of abstraction
      • “Don’t ask, it’s turtles all the way down”
  • Two main layers of programming
    • Professionals
    • Users, e.g. grandmother

Communications-Oriented Programs
  • Connecting the (virtual) stuff done by user
    • Home stereo / theater analogy
      • Plug stuff together; unplug it if it doesn’t work
      • Don’t like it, unplug it
  • Device drivers, services, clients, don’t know to whom or to what they connect
    • In client/server model,
      • server knows a lot about the client,
      • the client knows even more about the server
  • Extend Unix Pipes

[Figure: Physical Devices and Programs (Processes), including Apps, connected to COREs. Labels: “Larry Bear’s CORE”, “Other COREs”, “Larry Bear”.]

Message Flow
  • Messages flow between nodes & core
    • Core is both language and router
  • Within Core Router, some messages
    • are interpreted and may trigger actions
    • other messages get routed to other nodes
  • Request-Reply message strategy
    • Even number of messages
    • No reply within time period, means error
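
The request-reply strategy, where a missing reply within the time period counts as an error, can be sketched with standard-library queues. The `request` and `echo_node` functions and the use of Python threads are assumptions made for illustration; the slides do not describe how CORE implements its messaging.

```python
import queue
import threading


def request(to_node: queue.Queue, message: str, timeout: float) -> str:
    """Send a message and wait for its reply; no reply in time is an error."""
    reply_box = queue.Queue(maxsize=1)
    to_node.put((message, reply_box))
    try:
        return reply_box.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError(f"no reply to {message!r} within {timeout}s")


def echo_node(inbox: queue.Queue) -> None:
    """Toy node: acknowledges every message until it is told to stop."""
    while True:
        message, reply_box = inbox.get()
        if message == "stop":
            reply_box.put("ok")
            return
        reply_box.put(f"ack:{message}")


# Usage: run the toy node in a background thread and send it a request.
inbox = queue.Queue()
threading.Thread(target=echo_node, args=(inbox,), daemon=True).start()
reply = request(inbox, "next slide", timeout=1.0)
```

Every request is paired with exactly one reply, which is where the even number of messages comes from.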

CORE Language Elements
  • Four elements
    • Nodes,
    • Links,
    • Messages,
    • Rules
  • Features
    • Interpreted Language
    • Statement is a message & reply
    • Each element has an inverse

Nodehandler = (nickname, specifier)

Nodes – Specify via INS

Cam = [device=web-cam; location=518;…]

PTRvision = [device=process; OS=Linux; File=Laser Vision, ..]

[Figure: CORE with nodes Presentation Speech, Slide Speech, Command Speech, and Laser Vision.]

Larry Rudolph & Shalini Agarwal

node statement handler
Node Statement Handler
  • When ‘node’ message arrives
    • Verified for correctness (statements allowed)
    • Routed to Node Manager (just another node)
  • Node Manager
    • INS lookup, verifies if allowed, creates if needed
    • Creates core thread to manage communication with node
    • Bookkeeping & reply message with handle/error

Links

L_{camera,vision} = (Cam, PTRvision)

[Figure: CORE with nodes Presentation Speech, Slide Speech, Command Speech, and Laser Vision.]

Link Statement Handler
  • Message routed to ‘link’ manager
  • Two queries to node manager for thread controllers
  • Message to thread controller of source node
    • Specifying destination thread controller
  • Message to thread controller of dest node
    • Specifying source thread controller
  • Bookkeeping & reply message with handle/error

Messages

Messages flow over the links

[Figure: a “Next Slide!” message on a link; nodes shown around CORE: Presentation Speech, Slide Speech, Command Speech, and Laser Vision.]

Message Handling
  • Messages can be encrypted
  • Core statement messages have fixed format
  • Everything else is data message
  • Each node thread has two unbounded buffers
    • Core to node & Node to core
  • Logging, rollback, fault-tolerance

Rules

RULES: (trigger, action)

( MESS_Question , L_{slide,lcd}-- & L_{slide,qlcd} )

[Figure: CORE with nodes Presentation Speech, Slide Speech, Command Speech, Questions, and Laser Vision.]

Rule Statement Handler
  • ( trigger , consequence )
  • Both are “event sets”
  • Eight basic events:
    • +Node, -Node, +Link, -Link
    • +Message, -Message, +Rule, -Rule

  • Event set is a set of events
  • Trigger is true when events are true
  • Consequence makes events true
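
One way to read the trigger / consequence semantics is sketched below. The configuration is modeled as a set of elements such as `("Link", "slide,lcd")`; a `+` event is true when the element is present and a `-` event is true when it is absent. This encoding, and the reading of the Question rule as "remove the slide-to-lcd link and add the slide-to-qlcd link", are assumptions rather than the CORE implementation.

```python
def holds(event, state) -> bool:
    """A '+' event is true if the element is present; a '-' event if absent."""
    sign, element = event
    return (element in state) if sign == "+" else (element not in state)


def apply_rule(rule, state):
    """If every trigger event is true, return a state where the consequence is true."""
    trigger, consequence = rule
    if not all(holds(e, state) for e in trigger):
        return state  # trigger not satisfied; nothing changes
    new_state = set(state)
    for sign, element in consequence:
        if sign == "+":
            new_state.add(element)
        else:
            new_state.discard(element)
    return new_state


# Assumed encoding of the Question rule from the Rules slide.
question = ("Message", "Question")
lcd = ("Link", "slide,lcd")
qlcd = ("Link", "slide,qlcd")
question_rule = ([("+", question)], [("-", lcd), ("+", qlcd)])
```

Before a Question message arrives the rule leaves the configuration alone; once it arrives, the slide output is re-linked to the questions display.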

Rules – A link is a rule
  • A message event is of form
    • (node, message specifier) or (message specifier, node)
    • Message came from or going to node
  • A link (x,y) is just shorthand for the rule:
    • +(x, m) → ( -(x, m) , +(m, y) )
    • If a message m arrives at node x, then make that event false (remove the message) and make the event of m arriving at y from core true.
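
The expansion of a link into a rule can be written out directly. In this sketch a message event `(x, m)` is encoded as the tuple `("from", x, m)` and `(m, y)` as `("to", m, y)`; that encoding and the `link_rule` helper are assumptions for illustration.

```python
def link_rule(x, y):
    """Expand link (x, y) into a function that applies its rule to an event set."""
    def rule(events):
        out = set(events)
        for ev in events:
            if ev[0] == "from" and ev[1] == x:
                m = ev[2]
                out.discard(ev)          # -(x, m): the arrival at x is consumed
                out.add(("to", m, y))    # +(m, y): m is now headed to y
        return out
    return rule


camera_to_vision = link_rule("Cam", "PTRvision")
```

Applying `camera_to_vision` to a set containing `("from", "Cam", "frame-1")` replaces that event with `("to", "frame-1", "PTRvision")`, and leaves messages from other nodes untouched.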

Rules – Access Control Lists
  • An access control list is just a rule
  • When messages arrive at node, if they arrive from valid node, then allowed to continue to flow.
  • Modifying access control lists is just adding or removing rules.

Rules
  • Rule statement gets sent to rule manager
  • Event set is just another shorthand for rules
  • Rule manager sends command to trigger node thread that tells it about the consequence
  • Rules are reversible

Reversibility
  • Each statement is invertible (reversible)
  • If there is an error in the application specification, then can undo it all.
  • General debugging is possible with reversible rules and message flow
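
Because every statement has an inverse, an application specification can be applied all-or-nothing: on the first error, the statements already applied are undone in reverse order. The sketch below assumes a configuration modeled as a set and statements written as `(sign, element)` pairs; both are illustrative choices, not CORE syntax.

```python
INVERSE = {"+": "-", "-": "+"}


def apply_statement(statement, config) -> None:
    """Apply one statement to the configuration, or raise if it is invalid."""
    sign, element = statement
    if sign == "+":
        if element in config:
            raise ValueError(f"{element} already exists")
        config.add(element)
    else:
        if element not in config:
            raise ValueError(f"{element} does not exist")
        config.remove(element)


def run_all_or_nothing(statements, config) -> bool:
    """Apply all statements; on error, undo the applied ones in reverse order."""
    done = []
    try:
        for st in statements:
            apply_statement(st, config)
            done.append(st)
        return True
    except ValueError:
        for sign, element in reversed(done):
            apply_statement((INVERSE[sign], element), config)
        return False
```

If the third statement of a specification fails, the first two are rolled back and the configuration ends up exactly as it started.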
