
Multimedia Michael Christel Alex Hauptmann Rong Jin (TA) http://www.cs.cmu.edu/~alex/mmCourse



Presentation Transcript


  1. Multimedia Michael Christel Alex Hauptmann Rong Jin (TA) http://www.cs.cmu.edu/~alex/mmCourse

  2. How to get in touch with us • Mike Christel • christel@cs.cmu.edu • http://www.cs.cmu.edu/~christel • (412)268-7799 or x8-7799 • WeH5212 • Alex Hauptmann • alex@cs.cmu.edu • http://www.cs.cmu.edu/~alex • (412)268-1448 or x8-1448 • WeH5124 • Office Hours by Appointment

  3. Teaching Assistant • Rong Jin • jin+@andrew.cmu.edu • Office WeH5316 • Office hours by appointment • (412)268-4050 or x8-4050

  4. Course Outline, Part 1 of 3 (more details at www.cs.cmu.edu/~alex/mmCourse) • October 22: Intro to Multimedia • October 25: Multimedia Enabling Technologies, Macromedia Flash Intro and Demo • October 29: Sound Processing, Speech Recognition • November 1: Digital Video Creation and Transmission • November 5: Speech Synthesis

  5. Course Outline, Part 2 of 3 (more details at www.cs.cmu.edu/~alex/mmCourse) • November 8: Image Processing • November 12: Digital Music and Music Processing • November 15: Multimedia Internet Protocols, SMIL • November 19: Synthetic Interviews: A Multimedia Company (Experiences from the Field) • November 22: Programming for Interactive Multimedia (CGI Scripts/ASP)

  6. Course Outline, Part 3 of 3 (more details at www.cs.cmu.edu/~alex/mmCourse) • November 29: Content Analysis and Coding of Digital Audio and Video, Multimedia Storage and Retrieval Management • December 3: Video Retrieval Evaluation and Testing, Multimedia Interface Design, Digital Libraries • December 6: Visual Design, Multimedia Interface Design Guidelines, Multimedia use in the future (Experience on Demand) • December 10: Multimedia as Entertainment Technology, Virtual Reality

  7. Homeworks • See http://www.cs.cmu.edu/~alex/mmCourse • 9 Homeworks planned, 10 points each • One hard homework will be worth 20 points • No final, no midterm • Publish homeworks on your web page - email us URL • Space?

  8. Today: Intro to Multimedia • Apple Knowledge Navigator Vision, 1988

  9. Multimedia • Audio • Networking • Psychology • Natural Language Processing • Video • Storage Systems • Information Retrieval • Data Compression • Images • HCI • CPU Power

  10. Definition of Multimedia • Multi (Latin multus: numerous) • Media, medium (Latin medius, medium: middle, center, intermediary; Latin mediat: intermediary, means) • Multiple types of information captured, stored, manipulated, transmitted, and presented • Specifically: Images, Video, Audio (+Speech) and Text

  11. Definition of Multimodal • Multi (Latin multus: numerous) • Modal (Latin modus: manner) • Traditionally refers to input/output formats: • Input: • sounds, speech (mike) • gestures (camera, tablet) • eye-gaze (camera) • mouse • keyboard • Output: • sounds, speech • video • pictures • animations • text

  12. Perceived Information • Physical Variables • Sound is a waveform • An image is a waveform • light is electromagnetic radiation with different intensity in spatial coordinates • color corresponds to wavelength

  13. History of Multimedia I • Analog signals to sensors • E.g. vinyl records • Fidelity is faithfulness to the original • Digital representation (‘60s) • Sampling • Quantizing • Coding • codec, modem, (A/D and D/A)

  14. Hardware Advances • CPU • Bus • Network I/O • Keyboard, Mouse • Disk • Mike + A/D Board • Camera + A/D Board • Speakers (+ D/A Board) • Display

  15. History of Multimedia II • Analog controls only • Special hardware (Displays, Scanners, FFTs) • Integrated hardware components • Further Integration • Other devices

  16. History of Multimedia III Limiting Factors: • Storage Limits • CPU Speeds • I/O Speeds • Network Bandwidth

  17. Why Digital? • Universal storage, transmission format • CD, internet • Precision (range of values, number of bits, floating point) • Lossless transmission/storage • BUT: • sampling at a finite rate can distort information (signal content above half the sampling rate is not captured faithfully) • size requirements may be ‘large’ compared to analog
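
To make the ‘large’ size requirement concrete, here is a minimal sketch of the arithmetic for uncompressed PCM audio. The CD-audio parameters (44.1 kHz, 16 bits, stereo) and the function name are assumptions chosen for illustration; they do not appear on the slide.

```python
def uncompressed_audio_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Return the size in bytes of raw (uncompressed) PCM audio."""
    total_bits = sample_rate_hz * bits_per_sample * channels * seconds
    return total_bits // 8


# Assumed example: CD-quality audio (44.1 kHz, 16-bit, stereo).
cd_second = uncompressed_audio_bytes(44_100, 16, 2, 1)
cd_minute = uncompressed_audio_bytes(44_100, 16, 2, 60)

print(f"one second: {cd_second} bytes")   # 176400 bytes
print(f"one minute: {cd_minute} bytes")   # 10584000 bytes, roughly 10 MB
```

Size grows linearly in every parameter, so doubling the sampling rate or the bit depth doubles the storage needed.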

  18. Digitization Process • Sampling from an analog signal • Sampling Errors relate to signal frequencies • Quantization Errors
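
The two error sources on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the course's own code: the function names, the 8 kHz sampling rate, the 1 kHz and 7 kHz test tones, and the 4-bit uniform quantizer are all assumptions. It shows that quantization error is bounded by half a quantization step, and that a tone above half the sampling rate produces the same samples (up to sign) as a lower-frequency tone, which is the sampling error known as aliasing.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Sample a unit-amplitude sine wave at discrete instants k / sample_rate."""
    return [math.sin(2 * math.pi * freq_hz * k / sample_rate_hz)
            for k in range(n_samples)]


def quantize(x, bits):
    """Uniform quantizer: map x in [-1, 1] to the nearest of 2**bits levels."""
    step = 2.0 / (2 ** bits - 1)
    index = round((x + 1.0) / step)
    return index * step - 1.0


SAMPLE_RATE = 8000  # assumed sampling rate in Hz
BITS = 4            # assumed bit depth

# Quantization error: never more than half a step.
samples = sample_sine(1000, SAMPLE_RATE, 16)
quantized = [quantize(s, BITS) for s in samples]
step = 2.0 / (2 ** BITS - 1)
max_error = max(abs(s - q) for s, q in zip(samples, quantized))
print(f"step = {step:.4f}, max quantization error = {max_error:.4f}")

# Sampling error (aliasing): 7 kHz is above half of 8 kHz, so its samples
# coincide with those of a sign-flipped 1 kHz tone.
aliased = sample_sine(7000, SAMPLE_RATE, 16)
indistinguishable = all(abs(a + s) < 1e-9 for a, s in zip(aliased, samples))
print("7 kHz tone aliases to 1 kHz:", indistinguishable)
```

Adding bits shrinks the quantization step, while raising the sampling rate widens the range of frequencies that can be represented without aliasing.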

  19. Text • ASCII, Unicode • Formatted Text, Rich Text • Document Formats: • Structured: TeX, HTML • Page Descriptions: PostScript, PDF

  20. Graphics • Objects • circles, splines, rectangles, lines • Editable • resize, reshape, move, colorize • Synthetic

  21. Images (Pictures) • Fixed digitized representation • bitmap, colors per pixel • Editable in limited ways • retouch, cut and paste, remap colors, filter [Photoshop tools] • no ‘model’ of the thing • Captured • not just from real life, clip art, screen dump

  22. Audio • Sounds • hear 15 Hz to 20 kHz • Speech is 50 Hz to 10 kHz • Speech Recognition • “It is hard to wreck a nice beach” (sounds like “It is hard to recognize speech”) • “Ice cream” vs. “I scream” • Synthesis • Speech • Music • MIDI for 128 instruments, 47 percussion sounds • Notes, timing
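
MIDI stores notes and timing rather than waveforms, so a synthesizer has to turn each note number into a pitch. The sketch below uses the common convention of twelve-tone equal temperament with note 69 tuned to 440 Hz; that tuning and the function name are assumptions for illustration, not something the slide specifies.

```python
def midi_note_to_hz(note, a4_hz=440.0):
    """Convert a MIDI note number (0-127) to a frequency in Hz.

    Assumes equal temperament with note 69 (A4) at a4_hz.
    """
    if not 0 <= note <= 127:
        raise ValueError("MIDI note numbers range from 0 to 127")
    return a4_hz * 2 ** ((note - 69) / 12)


print(midi_note_to_hz(69))            # 440.0 (A4)
print(round(midi_note_to_hz(60), 2))  # 261.63 (middle C)
```

Each step of 12 note numbers is one octave, so the frequency doubles.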

  23. Speech Recognition Issues • Continuous vs Discrete • Vocabulary Size • Channel (Microphone) • Environment (Location of mike and Speaker) • Speaker Dependent/Speaker Independent • Context (Language Model) • Interactivity (Dialog Model)

  24. Speech Recognition Knowledge Sources • Acoustic Modeling: describes the sounds that make up speech • Lexicon: describes which sequences of speech sounds make up valid words • Language Model: describes the likelihood of various sequences of words being spoken
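
As a rough illustration of what a language model contributes, the toy sketch below scores two acoustically similar word sequences with a bigram model. Every probability in the table is invented for this example, and the names `BIGRAM_P`, `FLOOR`, and `log_prob` are hypothetical; a real recognizer would estimate such values from a large text corpus.

```python
import math

# Hypothetical bigram probabilities P(next word | previous word).
# These numbers are made up purely for illustration.
BIGRAM_P = {
    ("<s>", "recognize"): 0.01,
    ("recognize", "speech"): 0.2,
    ("<s>", "wreck"): 0.001,
    ("wreck", "a"): 0.3,
    ("a", "nice"): 0.02,
    ("nice", "beach"): 0.01,
}
FLOOR = 1e-6  # assumed fallback probability for unseen word pairs


def log_prob(words):
    """Sum of log bigram probabilities for a word sequence."""
    tokens = ["<s>"] + list(words)
    return sum(math.log(BIGRAM_P.get(pair, FLOOR))
               for pair in zip(tokens, tokens[1:]))


candidates = ["recognize speech", "wreck a nice beach"]
best = max(candidates, key=lambda text: log_prob(text.split()))
print(best)
```

With these toy numbers the model prefers "recognize speech", which is how a language model can break a tie that the acoustic model alone cannot resolve.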

  25. Speech Variations • Style Variations: careful, clear, articulated, formal, casual, spontaneous, normal, read, dictated, intimate • Voice Quality: breathy, creaky, whispery, tense, lax, modal • Speaking Rate: normal, slow, fast, very fast • Context: sport, professional, interview, free conversation, man-machine dialogue • Stress: in noise, with increased vocal effort (Lombard reflex), emotional factors (e.g. angry), under cognitive load

  26. Video • Frames comprise the video • Frame rate = number of frames shown per second (the delay between successive frames is its reciprocal) • minimal change between frames • Sequencing creates the illusion of movement • > 16 fps is “smooth” • Standards: 29.97 fps is NTSC, 25 fps is PAL, 60 fps is HDTV • Interlacing • Display scan rate is different • monitor refresh rate • 60 - 70 Hz (= 1/s)
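
Frame rate and the delay between successive frames are reciprocals of each other. The short sketch below converts the frame rates listed on this slide into frame intervals; the function name is an assumption for illustration.

```python
def frame_interval_ms(fps):
    """Delay between successive frames, in milliseconds, for a given frame rate."""
    return 1000.0 / fps


for name, fps in [("smoothness threshold", 16), ("PAL", 25), ("NTSC", 29.97)]:
    print(f"{name}: {fps} fps -> {frame_interval_ms(fps):.2f} ms per frame")
```

For example, PAL at 25 fps shows a new frame every 40 ms, while NTSC at 29.97 fps shows one roughly every 33.37 ms.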

  27. Captured vs. Synthetic • Animation vs Video • Graphics vs Pictures • Synthesizer vs Recording • Storage? Manipulation? Processor Requirements? • Fidelity to real world • Hybrids are possible

  28. Why is Multimedia Important? • Our society • captures its experience, • records its accomplishments, • portrays its past, • informs its masses • ... in pictures, audio and video • For many, CNN has become the “publication of record” • Multimedia learning leverages “multiple intelligences” (Gardner, 1993) • Multimedia Digital libraries are an essential component of • formal, informal, and professional learning • distance education, telemedicine

  29. Technology Push vs Market Pull • Home Entertainment • Catalog Ordering • Multimedia Training, Education • Videoconferencing • Professional Video Services • Videomail • Speech Recognition

  30. Hype vs. Reality • What is feasible, under what circumstances? • What is possible? • What is impossible? • What is unlikely?

  31. Multimedia Visions • DARPA: Dominate the Battle Space • HP “1995” • LSI “Flash Point” • HP “Synergies”

  32. Intro to Multimedia • That’s all for today
