The General Temporal Workbench (formerly General Multimedia Workbench): A Universal System for Exploring Time-based Phenomena. Donald Byrd, School of Informatics & Jacobs School of Music, Indiana University/Bloomington. 11 Sept. 2009
Introduction: Time Is of the Essence • Gandini Juggling’s Mozart (Symphony no. 25, 1st mvmt) performance • “When you hear music, after it’s over, it’s gone, in the air. You can never capture it again.” (Eric Dolphy, 1964) • Likewise for all complex temporal phenomena • …and timescale can be microseconds or millions of years • What if you really want to think about what happened (or, for creative artists, what you want to happen)? • Need a way to “freeze” it • Playing a recording over & over isn’t enough! • Obvious answer: visualization, but what’s the best way? • …and is visualization the only answer? 10 Sep. 2009
Motivation: We Have Big Problems • Long-standing, difficult problems in all fields • …plus deluge of data in many fields • Even arts & humanities are getting lots of hard data • Promises to help, but not much help so far! • What we need is insight, not data; how to get there? • Widely recognized as an important goal • The cross-discipline argument • Problems in all fields have much in common => a general system could be very valuable, if it’s possible • A general system is possible • The cross-creativity/analysis argument • Problems of creators & analysts in a field have much in common => system for creation & analysis also very valuable, if possible • A system for both is also possible 3 July 2009
Examples • Create/teach performers/rehearse/study a multimedia show • Gandini Juggling’s Mozart Symphony • …or marching band, or dance w/ music & lighting effects, etc. • Study Hendrix’s Star-Spangled Banner (VAT vs. published transcription) • Look for patterns in patient’s medical history (Lifelines) • Research on embodied language acquisition (Chen Yu) • Learn role in opera or musical (GTW simulation) • Organize ethnomusicology field research (EVIA AWB) • Study world events (JFK assassination, Salem Witchcraft Trials) 10 Sep. 2009
Jimi Hendrix’s Woodstock Star-Spangled Banner (a) Timeline overview w/ labeled segments (Variations Timeliner) (b) In music notation, guitar tablature, & words (published transcription) 28 Feb. 2009
Timelines (1) Applications of “timelines” in a broad sense • audio editing (a few hundred millisec. to a few min.) • juggling (a few seconds; vertically oriented) • video & motion data of two participants interacting in lab (seconds) • “bubble” diagrams of structure of pieces of music (minutes) • movie/video of animal behavior, interview, show, etc. (minutes to hours) • video annotation (hours) • weather (hours to days) • appointment calendar (week or month; 2-D) • the assassination of President Kennedy (a few days) • Salem witchcraft accusations (a month) • personal history: medical, criminal justice, etc. (years) • dinosaurs (tens of millions of years) 28 Mar. 2009
Timelines (2) Assassination of President Kennedy (SIMILE Timeline) 12 Mar. 2009
Concrete & Abstract Forms in Different Fields • Symbolic forms in music vs. text (& other areas)… • Real-time = low level (concrete) • Symbolic = high level (abstract) • Very high-level: segmentation w/ labels • Ex: Hendrix Star-Spangled Banner • Concrete & abstract forms are useful for all temporal phenomena 14 Apr. 2009
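The "very high-level" form on this slide, segmentation with labels, can be sketched as a small data structure. This is only an illustration: the class `Segment`, the helper `label_at`, and the sample times and labels are hypothetical, and are not taken from the Hendrix recording or from Variations Timeliner.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A labeled span of time in seconds (hypothetical structure)."""
    start: float
    end: float
    label: str


def label_at(segments, t):
    """Return the label of the segment containing time t, or None.

    Assumes segments are sorted by start time and do not overlap.
    Each segment covers the half-open interval [start, end).
    """
    starts = [s.start for s in segments]
    i = bisect_right(starts, t) - 1
    if i >= 0 and segments[i].start <= t < segments[i].end:
        return segments[i].label
    return None


# Made-up segmentation, for illustration only.
timeline = [
    Segment(0.0, 30.0, "phrase 1"),
    Segment(30.0, 75.0, "phrase 2"),
    Segment(75.0, 90.0, "interpolation"),
]
```

The same structure would apply to any temporal phenomenon: only the labels change, not the representation.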
Different Fields Have Much in Common • From music to remote disciplines in small steps, for three aspects of music • Steps shown aren’t unique—many paths are possible • All are complex enough that no one way of “looking” at it can capture everything • For all, people often want to compare two or more instances of the phenomenon 15 Apr. 2009
Solution 1: Better Human/Machine Partnerships Above figure is from Yu et al (2008), slightly modified • Integrate info visualization & analysis/data mining (Shneiderman 2002; Yu et al 2008) => closed loop: use visual perception to generate hypotheses for analysis; present results of analysis visually • Cf. browsing vs. searching dichotomy; HCIR, visual analytics • …or substitute synthesis for analysis, e.g., for composers 7 Apr. 2009
Solution 2: Allow All Sensory Modes • Visual: visualization is most generally useful, but not the only answer • Auditory: sound is central for many applications • sonification is surely valuable for some non-audio phenomena • Tactile: important for the blind • Other (olfactory): important for ?? • Don’t rule out support for all sensory modalities 14 Mar. 2009
Solution 3: Don’t Reinvent (or do without) the Wheel! • Problems of all temporal phenomena have much in common • …but people rarely share ideas or software across disciplines • Issue of “disjoint technical vocabulary/literature” (cf. Swanson 1988) • Value of exploratory search (cf. Jeremy Pickens, etc.) • Idea: a “General Temporal Workbench” (GTW) • Formerly General Music/General Multimedia Workbench (GMW) • Supports multiple: coordinated, editable, interactive visualizations & sonifications (eventually “tactilizations”, “olfactizations”?) • …of multiple instances • …of any combination of temporal phenomena • …plus data mining & analysis (Solution 1)! • For creative applications or analytical applications? • Both—the design is neutral • NB: a misleading question: insightful analysis involves creativity 15 May 2009
Use Existing Timeline Software • There’s an endless variety of timelines • Orientation: horizontal, vertical, 2D, other • Spacing: linear, logarithmic, piecewise linear, etc. • Multiscale coordinated • Playability: audio, video, Flash, etc. • Useful generic “timeliners” can go far beyond the basics • Ex: SIMILE Timeline & Timeplot • Even doing “the basics” well isn’t that easy • Ex: axis tick marks & numbers for them • Can consider all time-domain displays as variations on timelines • But what is there besides time-domain displays? 11 Sep. 2009
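As an illustration of why axis tick marks are not trivial, here is a minimal sketch of one common approach: choose a step of 1, 2, or 5 times a power of ten so that labels are round numbers. The function names and the `target_ticks` parameter are assumptions made for this example; they are not taken from SIMILE Timeline or any other tool named on the slide.

```python
import math


def nice_step(span, target_ticks=8):
    """Pick a step of the form 1, 2, or 5 times a power of ten.

    The step is the smallest such value that is at least span / target_ticks.
    """
    if span <= 0 or target_ticks < 1:
        raise ValueError("span must be positive and target_ticks at least 1")
    raw = span / target_ticks
    magnitude = 10 ** math.floor(math.log10(raw))
    for factor in (1, 2, 5, 10):
        step = factor * magnitude
        if step >= raw:
            return step
    return 10 * magnitude  # fallback for floating-point edge cases


def linear_ticks(t_min, t_max, target_ticks=8):
    """Tick positions at multiples of a nice step within [t_min, t_max]."""
    step = nice_step(t_max - t_min, target_ticks)
    first = math.ceil(t_min / step)
    last = math.floor(t_max / step)
    return [k * step for k in range(first, last + 1)]


def log_ticks(t_min, t_max):
    """Tick positions at powers of ten within [t_min, t_max] (log spacing)."""
    if t_min <= 0 or t_max < t_min:
        raise ValueError("need 0 < t_min <= t_max")
    lo = math.ceil(math.log10(t_min))
    hi = math.floor(math.log10(t_max))
    return [10.0 ** e for e in range(lo, hi + 1)]
```

Even this sketch leaves open the questions that make real timelines hard: calendar units (months and years are not uniform), label collisions, and piecewise-linear spacing.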
Use Existing Frequency Domain Viewers • Frequency domain = patterns in time domain • Example: economic cycles; Kondratiev’s theory of periodic collapses of capitalist economy :-) • Time/frequency domain (hybrid) more useful than pure • Visualization example: spectrogram (via Fourier analysis) • Very well-known in hard sciences, less in soft sciences • …almost unheard of in arts & humanities, & by public • Exception: computer music • Are periodic changes in cultural areas plausible? • Politics: U.S. House of Reps. elections correlate w/ franked mail • Direct experience important => periods of 1-2 generations(?) • Higher education population turnover => periods of 4 years(?) • …or periodic changes of blood glucose for diabetics? • Type-1 diabetics do constant self-medicating => need user-friendly tool 10 Jul 2009
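The idea that the frequency domain exposes patterns in the time domain can be shown with a small sketch: a naive discrete Fourier transform that reports the strongest period in an evenly sampled series. The function names and the synthetic data are assumptions for illustration. A spectrogram would apply the same transform over successive short windows; this sketch analyzes a single window and uses plain Python rather than an FFT library.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of the discrete Fourier transform for bins 0..n//2.

    Naive O(n^2) evaluation; adequate for short series only.
    """
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        mags.append(abs(total))
    return mags


def dominant_period(samples, sample_interval=1.0):
    """Period, in time units, of the strongest non-constant component."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # remove the constant offset
    mags = dft_magnitudes(centered)
    k = max(range(1, len(mags)), key=mags.__getitem__)
    return n * sample_interval / k


# Synthetic series: a constant level plus a cycle repeating every 12 samples.
series = [5.0 + math.sin(2 * math.pi * i / 12) for i in range(48)]
```

With monthly samples, for example, a result of 12 would indicate a yearly cycle.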
What Fields are Candidates for a GTW? (1) • What fields can really benefit from synergy of “not reinventing the wheel”? • Relevant features • Complex enough that no one way of “looking” at it can capture everything, i.e., needs multimodal access • people often want to compare two or more instances of the phenomenon • (less important) specialized graphical notation(s) are widely used for symbolic form 19 Feb. 2009
What Fields are Candidates for a GTW? (2) • How many fields have at least Features 1 & 2? • An unusual example: juggling (Juggling Lab) • No single way of looking at it can capture all the information • Standard: video and/or animated stick figures • Optional: notation (“siteswap”), timeline showing paths of balls • People often want to compare versions of a trick • General human & animal movement is really complex • Conclusion: all non-trivial fields have 1 & 2; very many have all three. • Speculation: area w/ over (say) 100K person-years of serious interest probably has enough complexity to have Features 1 & 2 • Speculation: area w/ over (say) 500K person-years of serious interest probably has enough complexity to have all three 19 Feb. 2009
HCI: Multiple “Visualizations” Can be Great or Worthless • Parable of blind men & elephant • Point of multiple visualizations: let the user put the pieces together & “see” big picture • The more different the visualizations involved, the better… • But the more different the visualizations, the greater the danger of user getting confused! • Ease of navigating between is critical => need coordination • Often helpful to have a (small) overview on screen • Ex: viewing modes in PowerPoint • Ex: “Scrollbar with confetti” (Byrd 1998) gives overview with (v. often) no additional screen space • Similar principles apply to sonifications, etc. 1 Mar. 2009
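One simple way to coordinate several views is a shared time cursor that notifies every view when the position changes. The sketch below is an assumption about how such coordination could work, not a description of the GTW design; `TimeCursor` and `TextView` are hypothetical names.

```python
class TimeCursor:
    """Shared current-time position that notifies every subscribed view."""

    def __init__(self, t=0.0):
        self._t = t
        self._listeners = []

    def subscribe(self, callback):
        """Register a callback and immediately send it the current time."""
        self._listeners.append(callback)
        callback(self._t)

    def seek(self, t):
        """Move the cursor and notify all subscribers."""
        self._t = t
        for callback in self._listeners:
            callback(t)


class TextView:
    """Stand-in for a real visualization: it just records what it would show."""

    def __init__(self, name, cursor):
        self.name = name
        self.shown = None
        cursor.subscribe(self.on_time)

    def on_time(self, t):
        self.shown = f"{self.name} @ {t:.2f}s"


cursor = TimeCursor()
waveform = TextView("waveform", cursor)
score = TextView("score", cursor)
cursor.seek(12.5)  # both views now report the same position
```

Because the views never talk to each other directly, very different visualizations (or sonifications) can be added without changing the existing ones.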
The Ultimate Music-and-More System (1) • If system could do “everything” with music, should be useful for lots besides music! • Not just useful for many domains, allows synergy/leverage • Related to “abstraction”, “modularizing”, “factoring” • = breaking problem down into separate parts • Cf. high-school algebra • Result: no need to reinvent the wheel • Example: timelines • Example: apply frequency-domain approaches & software in many fields • But could a program do that? Is this practical? 26 Mar. 2009
The Ultimate Music-and-more System (2) • Practical iff it can be broken down to independent chunks • Modular design (in layers) is vital • Architecture plan for GTW I. Completely general framework: no knowledge of domain II. Generic domain-specific modules for file I/O, support for low-level modules, maybe “automatic” alignment III. API for user-written analyzers & “visualizers” 18 Feb. 2009
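The three layers in this plan can be sketched in a few lines. Everything here is hypothetical: the class names, the `register` and `run` methods, and the "seconds label" text format are invented for illustration and say nothing about the real GTW API. The point is only that layer I (`Framework`) contains no domain knowledge, layer II (`parse_events`) handles file I/O, and layer III is the pair of interfaces that user-written plugins implement.

```python
from abc import ABC, abstractmethod
from collections import Counter


class Analyzer(ABC):
    """Layer III interface for user-written analyzers."""

    @abstractmethod
    def analyze(self, events):
        """Take a list of (time, label) pairs and return any result object."""


class Visualizer(ABC):
    """Layer III interface for user-written visualizers."""

    @abstractmethod
    def render(self, result):
        """Turn an analyzer result into a displayable string."""


class Framework:
    """Layer I: routes data between plugins and knows nothing about the domain."""

    def __init__(self):
        self._analyzers = {}
        self._visualizers = {}

    def register(self, name, plugin):
        if isinstance(plugin, Analyzer):
            self._analyzers[name] = plugin
        elif isinstance(plugin, Visualizer):
            self._visualizers[name] = plugin
        else:
            raise TypeError("plugin must be an Analyzer or a Visualizer")

    def run(self, analyzer, visualizer, events):
        result = self._analyzers[analyzer].analyze(events)
        return self._visualizers[visualizer].render(result)


def parse_events(text):
    """Layer II stand-in: parse 'seconds label' lines into (time, label) pairs."""
    events = []
    for line in text.splitlines():
        if line.strip():
            t, label = line.split(maxsplit=1)
            events.append((float(t), label.strip()))
    return events


class LabelCounter(Analyzer):
    """Example plugin: count how often each label occurs."""

    def analyze(self, events):
        return Counter(label for _, label in events)


class TextBars(Visualizer):
    """Example plugin: draw one text bar per label, sorted by label."""

    def render(self, result):
        return "\n".join(f"{label}: {'#' * n}"
                         for label, n in sorted(result.items()))


framework = Framework()
framework.register("count", LabelCounter())
framework.register("bars", TextBars())
events = parse_events("0.0 throw\n0.4 catch\n0.8 throw\n")
output = framework.run("count", "bars", events)
```

Swapping in a different parser, analyzer, or visualizer requires no change to `Framework`, which is the property the layered plan depends on.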
Architecture for a General Visualizer/Analyzer (1) • Configuration = software + UI (windows, etc.) • Software for common audio & video uses… 15 Mar. 2009
Architecture for a General Visualizer/Analyzer (2) For sequential art & movies based on sequential art (John Walsh)… 15 Mar. 2009
GTW “Screenshot” 1 • Scenario: music-informatics researcher (or ethnomusicologist) comparing two audio-segmentation algorithms • …or composer comparing input & output of synthesis programs 20 Apr. 2009
GTW “Screenshot” 2 (& “Demo”) • Scenario: singer (or conductor) comparing videos of performances to learn role in opera, musical, etc. • …or stage director, choreographer, or lighting designer comparing previous versions to own ideas • …or scholar studying performances (perhaps juggling w/ music!) 4 Apr. 2009
What’s Special About the GTW? • Design (& planned implementation) for our “solutions” • Better human/machine partnerships: tight coupling of visualization & analysis • Don’t reinvent wheel: any presentations of anything temporal in any combination • Factors, roughly from most to least fundamental: • Architecture separates framework from assumptions about use (domain knowledge) • Support for rapidly changing focus between very different visualizations at vastly different scales • Can automatically adjust (in own windows) which visualizations take screen space & sizes/layouts of interfacing programs’ windows • Support for showing relationships between features in different views • Configuration files set up internals and UI; experts can create for each use case • Designed to support comparing “similar” documents • Doesn’t assume consistency between coordinated representations • Can act as “slave” (client for, e.g., SEASR/Meandre, Max/MSP, Pd) or master • Fully multimodal: presentation in non-visual form (sonification, Braille) on same basis as visualization • Analysis modules can communicate w/ presentation modules 16 Mar. 2009
The Truth: The GTW Can’t Do Everything • ... but it can enable YOU to! • But you probably aren’t a technology expert • Really, it can enable experts in each field to, with much more synergy (=> less new work) than now • Something like an operating system 19 Feb. 2009
Similar Tools for Non-Temporal Phenomena • Existing, very general tools for other situations • Network Workbench (Katy Börner/IU SLIS): visualize/explore networks • Google Map API: visualize/explore “space” (surface of the earth) • Both have proven very useful • But many phenomena have temporal and non-temporal aspects • T. & network: artificial life, computer games, studying software (debugging, etc.): traversal => temporal form • T. & spatial: folksong style vs. region of origin, art or general history, etc. • Cf. Timemap (Google), Salem Witchcraft Accusations webpage • All three: public health (as in epidemiology) • GTW could be used with other tools 2 June 2009
Getting Off the Ground • Working on prototype, based on EVIA Annotators Workbench; also maybe CIShell, Chen Yu’s system for “visual mining of multimedia data” (all from IUB) • Other possible open-source starting points • Sonic Visualiser, SyncPlayer, SIMILE Timeline, etc. • Connections to general tools for nontemporal visualization • Network Workbench (Katy Börner/IU SLIS): networks • Google Map API: “space” (surface of the earth) • Connections to other general tools • Meandre for SEASR (UIUC): humanities/social science research • Max/MSP, Pd: musical audio 15 Apr. 2009
Conclusions • How do I know applications are realistic? • Many probably aren’t, but many, many possibilities exist! • Have ca. 30 usage scenarios, ca. half written/endorsed by experts • Some examples • Ruth Stone: ethnomusicology field work • Larry Yaeger: artificial life • Elaine Chew: annotating video of computer-aided musical improvisation • John Walsh: comic books/movies • Tim Crawford: musicology • Personal knowledge/experience for a few 15 Apr. 2009
End • Thanks to Geoff Chirgwin, Will Cowan, Allen Winold, Paul Sturm, &… • THE END 20 Feb. 2009
Extra Slides • Following slides are just in case… rev. 18 Feb. 2009
Good Design for Music Can Be Good for Many Things • Cf. “Why Studying Music is Both Difficult and Important” (Byrd 2009) • Music is an art => people use elements in unusual ways • Music is a performing art => performances & symbolic representations • Much music has complex synchronization requirements • Music involves many different instruments, often in groups. Leads to: • Arrangements/transcriptions for other instruments • Versions for players with different levels of skill • Notation may represent sounds or actions • Music is often combined with text via singing, narration, etc. • Music is extremely popular, so: • Some works exist in many versions, arrangements for different ensembles, etc. • Handling challenges is important, even on purely economic grounds rev. 18 Feb. 2009
HCI: Searching, Browsing, & Visualization • Visualization is essential for browsing, merely helpful for searching • In browsing, user finds everything; the computer just helps • Browsing is obviously good because it gives user more control, but few systems emphasize it. Why? • “Users are not likely to be pleasantly surprised to find that the library has something but that it has to be obtained in a slow or inconvenient way. Nearly all items will come from a search, and we do not know well how to browse in a remote library.” (Lesk, p. 163) • For “and”, read “as long as” • Searching is more natural on computer, browsing in real world • Effective browsing takes very fast computers—widely available now • Effective browsing has subtle UI demands • Cf. HCIR, visual analytics, visual searching, etc. 7 Mar. 2009
Why juggling? Who cares? • A surprising domain, but realistic • Features 1 & 2 apply • Feature 3 applies in part: has established (though not graphic) notation, “siteswap” • Many juggling programs available • GTW framework has support for: • Control of tempo, including pausing or going backwards • UI for (temporal, not spatial) zooming in on details • Synchronization of multiple videos and/or animations • Framework for auto. synchronization • Framework for combining independent visualizations • Animal motion in general is much more complex => more need for GTW! • Ex: dancing (Labanotation, etc.)
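Tempo control, pausing, going backwards, and keeping several videos in step can all be expressed with one shared clock. The sketch below is an assumption about one way to do this, not the GTW implementation: `Transport`, `stream_positions`, and the stream names are hypothetical, and wall-clock time is passed in explicitly so the behavior is easy to check.

```python
class Transport:
    """Maps wall-clock time to media time at a variable playback rate.

    A rate of 1.0 is normal speed, 0.0 is paused, negative values play
    backwards. Media time is not clamped, so it can go below zero.
    """

    def __init__(self, wall_start=0.0):
        self._media_time = 0.0
        self._rate = 1.0
        self._last_wall = wall_start

    def _advance(self, wall_now):
        self._media_time += (wall_now - self._last_wall) * self._rate
        self._last_wall = wall_now

    def set_rate(self, rate, wall_now):
        """Change the rate, first accounting for time elapsed at the old rate."""
        self._advance(wall_now)
        self._rate = rate

    def media_time(self, wall_now):
        """Media time, in seconds, at the given wall-clock time."""
        self._advance(wall_now)
        return self._media_time


def stream_positions(media_time, offsets):
    """Position within each stream, given where each starts on the shared timeline."""
    return {name: media_time - start for name, start in offsets.items()}


transport = Transport()
at_2 = transport.media_time(2.0)       # normal speed for 2 s
transport.set_rate(0.5, 2.0)           # half tempo
at_4 = transport.media_time(4.0)
transport.set_rate(0.0, 4.0)           # pause
at_10 = transport.media_time(10.0)
transport.set_rate(-1.0, 10.0)         # play backwards
at_11 = transport.media_time(11.0)
```

Every synchronized video or animation reads its position from the same `Transport`, so a tempo change or pause affects all of them together.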
Structure in Basic Representations of Music & Audio • Western Music Notation: very complex, irregular structure; some parts well-defined, some not (and what’s well-defined isn’t well-defined) • Audio: no explicit structure • MIDI: simple, regular, well-defined structure 10 Feb. 09
Basic Representations of Music & Audio 1. Audio (e.g., CD, MP3): like speech 2. Time-stamped Events (e.g., MIDI file): like unformatted text 3. Music Notation: like text with complex formatting • Time scales of graphs: #1, milliseconds; #2 & 3, seconds • Essential difference among forms: “knowledge representation” = explicit structure 10 Feb. 09
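The difference in explicit structure between the forms can be made concrete with a small sketch. A stream of time-stamped on/off events (form 2) carries no explicit notion of a "note"; pairing the events adds one step of structure. The tuple layout below is a simplification loosely modeled on MIDI, and `pair_notes` is a hypothetical helper, not a parser for real MIDI files.

```python
def pair_notes(events):
    """Pair note-on/note-off events into (start, duration, pitch) tuples.

    Each event is (time_in_seconds, kind, pitch) with kind "on" or "off".
    Sorting the tuples puts "off" before "on" at equal times, so a note
    that is released and restruck at the same instant is handled correctly.
    Unmatched events are ignored.
    """
    active = {}  # pitch -> start time of the currently sounding note
    notes = []
    for time, kind, pitch in sorted(events):
        if kind == "on":
            active[pitch] = time
        elif kind == "off" and pitch in active:
            start = active.pop(pitch)
            notes.append((start, time - start, pitch))
    return sorted(notes)


# Made-up event stream: two consecutive notes, given out of order.
events = [
    (0.5, "on", 62),
    (0.0, "on", 60),
    (1.0, "off", 62),
    (0.5, "off", 60),
]
notes = pair_notes(events)
```

Going further, from notes to music notation, requires far more inferred structure (meter, voices, spelling), which is why form 3 is compared to text with complex formatting.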
“Isn’t it a mistake to use music notation this way?” • Chris Raphael’s question about Hendrix transcription • It’s obviously useful: easy to find phrases, “Taps”, etc. • …but seriously misleading in places • But CMWN is “always” misleading! • Is it useful enough to justify danger of misleading? • Knowledge representation has inevitable bias (Davis et al 1993); notation has more bias (Wiggins et al 1993) • Fundamental issue of transcription in ethnomusicology • Conclusion: use it, but be careful • Cf. my “Logician General’s Warning” on classification • …in fact, transcribing requires classifying constantly 12 Feb. 09
Sequential-Art/Movie: The Hard Goodbye (1) • From Frank Miller’s “Sin City” series • John Walsh (SLIS): want to compare comics, movies of them, etc. 18 Feb. 2009
Sequential-Art/Movie: The Hard Goodbye (2) • From Frank Miller’s “Sin City” series • John Walsh (SLIS): want to compare comics, movies of them, etc. 18 Feb. 2009
Types of Visualizations of Music (and more) • Is visualization static or dynamic? • Dynamic = time represented by time • Static = time represented by space • What features are visualized? • What basic representation? Audio, symbolic, both? • Easy to generalize to plays (score = script) & other text phenomena, dance, etc. 18 Feb. 2009
Types of Visualizations of Music (and more) • Hendrix example uses coordinated visualizations • Generalization of parallel, aligned, synchronized, etc. • How are multiple visualizations coordinated? • Parallel panes of a single window • Superimposed in a single window • Separate coordinated windows • Forms 1 & 2 apply directly to audio (incl. sonification) • Easy to interpolate between forms 1 & 2 • Categories in the real world are rarely discrete 26 Feb. 2009
The Ultimate Music System • Original goal: visualizer that can do anything with music • Handle any no. & combination of visualizations • Static visualizations: audio, any kind of notation, structural diagrams, etc. • Dynamic visualizations: video, etc. • Automatic (or near-automatic) synchronization • Support OS-level technologies (QuickTime, etc.) • Easy-to-learn UI allowing high degree of control • User may want frequent extreme zoom changes => help with • If it could do all that, should be useful for lots (domains with >=2 Features) besides music! 20 May 08