
Essence Encoding Principles

Professional Content Management Systems, 3rd Lecture: Essence Formats and Standards. Dr. Andreas Mauthe, SCC – Lancaster University.


Presentation Transcript


  1. Professional Content Management Systems • 3rd Lecture: Essence Formats and Standards • Dr. Andreas Mauthe • SCC – Lancaster University

  2. Essence Encoding Principles • Digitisation • Transformation from the continuous to the digital domain • Information loss • Quality depending on number of bits used for representation (i.e. sampling rate and quantisation interval) • Compression • Reduction of bit rate • Exploiting redundancies & properties of human senses • Lossless vs. lossy compression • Basic compression techniques • Entropy coding, source coding, hybrid coding
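
The link between the number of bits and the information lost in digitisation can be sketched in a few lines. This is only an illustration under assumptions that are not in the slides: a uniform quantiser over the range [-1, 1], a 440 Hz test tone and an 8 kHz sampling rate. The function names are hypothetical.

```python
import math


def quantise(sample: float, bits: int) -> int:
    """Map a sample in [-1.0, 1.0] to one of 2**bits integer levels."""
    levels = 2 ** bits
    clipped = max(-1.0, min(1.0, sample))
    # Scale to [0, levels - 1] and round to the nearest level.
    return round((clipped + 1.0) / 2.0 * (levels - 1))


def reconstruct(level: int, bits: int) -> float:
    """Map a quantisation level back to the range [-1.0, 1.0]."""
    levels = 2 ** bits
    return level / (levels - 1) * 2.0 - 1.0


def max_error(bits: int, rate_hz: int = 8000, tone_hz: float = 440.0) -> float:
    """Largest reconstruction error over one second of a sampled sine tone."""
    worst = 0.0
    for n in range(rate_hz):
        x = math.sin(2 * math.pi * tone_hz * n / rate_hz)  # sampling
        y = reconstruct(quantise(x, bits), bits)           # quantisation
        worst = max(worst, abs(x - y))
    return worst
```

Calling `max_error(4)`, `max_error(8)` and `max_error(16)` shows the error shrinking as more bits are spent per sample; it never reaches zero, which is the information loss named on the slide.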

  3. Video Encoding • Principles • Basic Image Elements • Pixels, aspect ratio (4:3), colour representation (RGB vs. YUV) • Motion • Presentation of images faster than 15 frames/sec • PAL 25 frames/sec, NTSC 29.97 frames/sec • Digitisation • Basic steps • Sampling: into an array of MxN points • Quantisation: commonly into 256 values • Coding: composite vs. component • Component coding standards • Different component representation, sampling frequencies & samples/line • The MPEG-1 Standard • Video & audio standard • Does not standardise encoder but syntax and semantics of MPEG-1 bit stream • JPEG-like picture coding and compression • Different picture types • I, P, B & D frames • Group of Pictures (GoP)
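
As a rough illustration of the RGB vs. YUV colour representations, the sketch below converts one pixel. The slides do not give coefficients; the weights used here are the commonly quoted BT.601 luma weights and analogue U/V scale factors, and the function name is hypothetical.

```python
def rgb_to_yuv(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Convert one RGB pixel to a luminance value and two colour differences."""
    # Luminance: weighted sum reflecting the eye's sensitivity to each primary.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Chrominance: scaled differences between blue/red and the luminance.
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v
```

For any grey pixel (equal R, G and B) both colour differences come out as zero, so all of the information sits in Y. That separation is what lets component coding sample chrominance more coarsely than luminance.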

  4. MPEG-2 • Target Application Area • Video & media production • Television, HDTV • Interactive Television • Interactive multimedia services • Participating Standardisation Bodies • ISO/IEC • ITU-TS (telecommunication), ITU-RS (radio/broadcast) • EBU • SMPTE • Encoding Principles • Similar to MPEG-1 • Specifies the video bit stream syntax • Similar image coding • 8x8 blocks • DCT transformation • Variable run-length coding • I, P, B & D frames • Compression of interlaced video for standard TV • Field pictures • Encoding of fields as independent entities • Frame pictures • Each interlaced field pair is interleaved into a frame • Divided into macro blocks and encoded
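
The 8x8 DCT step named above can be sketched directly from the textbook DCT-II definition. This is a slow, unoptimised illustration, not how an encoder would implement it; the orthonormal scaling is an assumption, since the slides do not specify a normalisation.

```python
import math

N = 8  # block size used by JPEG-style and MPEG-style image coding


def dct_2d(block: list[list[float]]) -> list[list[float]]:
    """Forward 2-D DCT-II of an 8x8 block (orthonormal scaling)."""
    def scale(k: int) -> float:
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs
```

A completely flat block puts all of its energy into the single DC coefficient at position (0, 0) and leaves every other coefficient at zero. Long runs of zeros like that are what the subsequent run-length coding exploits.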

  5. MPEG-2 Layered Coding • Objectives • Different layers for better scalability • Build on top of each other starting with the base layer • Layered Coding Modes • Spatial scalability • Different resolutions that build up to the full resolution • For transmitting video at a range of different quality levels, e.g. standard TV & HDTV • Temporal scalability • Contains sequences at a lower frame rate • Different layers build up to the full frame rate • Distribution of I, P & B frames between layers is important • Data partitioning • Data encoded and layered according to priorities • High priority streams contain DC coefficients and motion vector headers • For progressive image build up • Signal-to-noise ratio (SNR) scalability • Basic quality version • Enhancement layers carry information to build up to full quality
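
The idea behind SNR scalability, a coarse base layer plus an enhancement layer that refines it, can be sketched as follows. This is a deliberately simplified illustration and not the MPEG-2 bit stream syntax: the step sizes and function names are assumptions, and real encoders operate on DCT coefficients inside the full coding loop.

```python
def encode_layers(coeffs, base_step=16, enh_step=2):
    """Split coefficients into a coarse base layer and a refining enhancement layer."""
    # Base layer: coarse quantisation gives the basic quality version.
    base = [round(c / base_step) for c in coeffs]
    # Enhancement layer: quantise what the base layer failed to represent.
    residual = [c - b * base_step for c, b in zip(coeffs, base)]
    enhancement = [round(r / enh_step) for r in residual]
    return base, enhancement


def decode_layers(base, enhancement=None, base_step=16, enh_step=2):
    """Reconstruct from the base layer alone, or from both layers if available."""
    out = [b * base_step for b in base]
    if enhancement is not None:
        out = [o + e * enh_step for o, e in zip(out, enhancement)]
    return out
```

A receiver that only gets the base layer still decodes a usable approximation; one that also gets the enhancement layer reduces the reconstruction error further.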

  6. MPEG-2 Profiles & Levels • Objectives • Profiles • To support different applications • Professional vs. consumer formats • Different chrominance sampling modes • 4:4:4, 4:2:2 & 4:2:0 • Successive profiles include preceding profiles • Levels • Specifying a sub-set of spatial and temporal resolutions per profile • Supporting a large set of image formats
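
The chrominance sampling modes differ in how many chroma samples survive per group of pixels. The sketch below derives 4:2:2 and 4:2:0 planes from a full-resolution (4:4:4) chroma plane. Simple averaging is an assumption made for illustration; the slides do not describe the filter or sample positions that a real encoder would use.

```python
def subsample_422(plane):
    """Halve horizontal chroma resolution by averaging neighbouring pairs."""
    return [[(row[x] + row[x + 1]) / 2 for x in range(0, len(row), 2)]
            for row in plane]


def subsample_420(plane):
    """Halve chroma resolution in both directions by averaging 2x2 groups.

    Assumes the plane has an even number of rows and columns.
    """
    height, width = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, width, 2)]
            for y in range(0, height, 2)]
```

Relative to 4:4:4, the 4:2:2 plane holds half as many chroma samples and the 4:2:0 plane a quarter, while the luminance plane is left untouched in every mode.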

  7. MPEG-2 Profiles & Levels

  8. MPEG-2 Delivery • MPEG-2 Stream Types • MPEG-2 Program Streams • Elementary stream • Specified for multimedia applications • MPEG-1 compatible • MPEG-2 Transport Stream • For the transmission of multiple programmes and program streams • Fixed-size packets identified by packet ID • Multiple Program Streams can be mapped into one Transport Stream • E.g. different video and audio streams • Applications • Video telephony to digital TV • Networks • Lossy and high bandwidth, e.g. fibre, satellite, cable, etc.
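
To make "fixed-size packets identified by packet ID" concrete, the sketch below reads the first four bytes of a Transport Stream packet. The slide gives no field layout; the 188-byte packet size, the 0x47 sync byte and the 13-bit PID position are taken from the MPEG-2 Systems specification (ISO/IEC 13818-1). The function name and returned dictionary are hypothetical.

```python
TS_PACKET_SIZE = 188  # bytes per Transport Stream packet
SYNC_BYTE = 0x47      # first byte of every packet


def parse_ts_header(packet: bytes) -> dict:
    """Extract a few header fields from one Transport Stream packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a transport stream packet must be 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    return {
        # Set when a new PES packet or section starts in this packet's payload.
        "payload_unit_start": bool(packet[1] & 0x40),
        # 13-bit packet ID: low 5 bits of byte 1 followed by all of byte 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        # 4-bit counter used to detect lost packets within one PID.
        "continuity_counter": packet[3] & 0x0F,
    }
```

A demultiplexer groups packets by the `pid` value to recover the individual video and audio streams that were multiplexed into the Transport Stream.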

  9. MPEG-4 Motivation: To support the predicted convergence of media and technology areas, viz. communications, computing, television and entertainment • Target Application Area • Mobile telephones & devices • Interactive multimedia applications • Media production • Broadcasting • Functionality Areas: • Content based interactivity • Multimedia access tools, content based manipulation, hybrid, natural and synthetic data coding, etc. • Optimised compression • Through improved coding efficiency • Universal access • Supporting different networks ranging from high-speed to wireless • F. Pereira, T. Ebrahimi (Editors): “The MPEG-4 Book”, IMSC Press Multimedia Series, Prentice Hall, 2002.

  10. MPEG-4 Standard Parts

  11. MPEG-4 (Object Oriented) Coding • Basics • Specifies bit stream as MPEG-1 & MPEG-2 • Motion-compensated hybrid DCT • Motion Decoding and Compensation Tool → block-based video coding • Shape Decoding Tool → object-based video coding • Basic Components • Audio Visual Objects (AVO) • Arbitrary shape • Spatial & temporal extent • Described by object descriptors (OD) • AVO carried in separate elementary streams • Scene Composition • Scene Descriptor Information • Composition information using pointers to AVO • Specifies relationships between AVO • Spatial position • Temporal position • Also specifies • Dynamic behaviour • Allowed interactivity pattern • Scene description language • Binary Format for Scenes (BIFS)

  12. Object Based Scene Composition: Example

  13. MPEG-4 Profiles, Levels & Object Types • Profiles & Levels • Profiles • Specify the tool set to be supported • Levels • Set complexity bounds • Required memory, number of objects, bit rate • Conformance • Bitstream • Contains syntactic elements of profile • Stays within boundaries given by level • Decoder • Ability to interpret values of all allowed syntactic elements • Provides all required resources • Complexity bounds on all objects within a scene • Object Types • Specify syntax and semantics of an object • Tools required to code an object • Specifies restrictions on the combination of object types

  14. MPEG-4 Video Object Types

  15. MPEG-4 Video Profile Level

  16. Digital Video (DV) • Background • Consumer DV • International Electrotechnical Commission: “Helical-scan digital video cassette recording system using 6.35 mm magnetic tape for consumer use (525-60, 625-50, 1125-60 and 1250-50 systems)”, IEC 61834 • Professional DV • Society of Motion Picture and Television Engineers: “6.35 mm type D-7 component format – video compression at 25 Mb/s – 525/60 and 625/50”, SMPTE Standard for Television Digital Recording SMPTE 306M, 1998 • Encoding Basics • Coding • Consumer: 4:2:0 • Professional: 4:1:1, 4:2:2 • 13.5 MHz sampling rate, 8-bit coding • Intra-frame compression only • DCT • Adaptive intra-frame spatial compression • Increased motion results in increased compression → Variable bit-rate to constant bit-rate • Compression ratio 5:1 • Error correction • Rank (inner) • File (outer)

  17. Professional DV • Supported TV Formats • NTSC • 525 lines (480 active lines) • 29.97 frames/sec • PAL • 625 lines (576 active lines) • 25 frames/sec • 6.35 mm tapes • Helical track • DV 25 • Coding • 4:1:1 • 1 video & 2 independent audio channels • Bit rate: 25 Mb/s • DV 50 • Coding • 4:2:2 • 1 video & 4 audio channels • Bit rate: 50 Mb/s
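
The 25 Mb/s figure for DV 25 can be cross-checked against the 5:1 compression ratio and 8-bit coding given on the previous slide. The sketch assumes 720 active luminance samples per line (the usual figure for 13.5 MHz sampling, not stated in the slides) and an average of 12 bits per pixel for 4:1:1 or 4:2:0 sampling at 8 bits per sample. The function name is hypothetical.

```python
def dv_video_bit_rate(active_lines: int, frames_per_sec: float,
                      samples_per_line: int = 720,
                      bits_per_pixel: int = 12,
                      compression_ratio: int = 5) -> float:
    """Approximate compressed video bit rate in bit/s for the active picture."""
    uncompressed = samples_per_line * active_lines * frames_per_sec * bits_per_pixel
    return uncompressed / compression_ratio


pal_rate = dv_video_bit_rate(active_lines=576, frames_per_sec=25)
ntsc_rate = dv_video_bit_rate(active_lines=480, frames_per_sec=29.97)
```

Both results land just under 25 Mb/s (roughly 24.9 Mb/s), which is consistent with the nominal DV 25 video rate under these assumptions.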

  18. CMS and Video Formats • Non-Standard Formats (tape based) • Analog component formats • Betacam, Betacam SP, M-2 • Digital formats • Composite • D-2, D-3 • Component • DigiBeta (proprietary compression) • D-5 (no compression) • Standard Digital Formats • DV based • DVCPro 25, DVCPro 50 • D-9 (50 Mb/s) • DVCAM (4:2:0, 25 Mb/s) • MPEG based • Beta SX (4:2:2, IB frames, 21 Mb/s, tape based) • D-10 (4:2:2, MPEG-2 4:2:2P@ML, I frame only, 50 Mb/s) • Requirements • Handle multiple formats • Multiple data rates • Browse <= 1.5 Mb/s • Broadcast 2 Mb/s – 8 Mb/s • Production 18 Mb/s – 50 Mb/s

  19. Typical Video Essence Related Issues • Heterogeneous Formats • Generation loss • Lossy encoding leads to quality deterioration at every: • Decoding • Editing • Re-encoding • Playout and subsequent archival • Format change results in quality loss • Long term archiving on obsolete formats • Different formats in the production and transmission chain • Not yet a standardised file format

  20. Infrastructure Related Essence Issues • Heterogeneous Communication Infrastructure • Broadcast communication networks • Analog point-to-point • Digital point-to-point via Serial Digital Interface (SDI, ITU-R BT.601-5, 270 Mb/s streaming) • Digital point-to-point via Serial Data Transport Interface (SDTI, DVCPRO/MPEG) • Computer networks • Ethernet based LANs • ATM based LANs & WANs • Internet • Machine connections • Fibre Channel, SCSI • Machine Control Networks (e.g. RS 422)
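
The 270 Mb/s SDI rate follows from the component sampling frequencies. The sketch assumes 4:2:2 sampling (luminance at 13.5 MHz, each colour difference signal at half that rate) and 10-bit words on the serial link; neither detail is stated on the slide.

```python
LUMA_RATE_HZ = 13_500_000   # luminance sampling frequency
CHROMA_RATE_HZ = 6_750_000  # each of the two colour difference signals (4:2:2)
BITS_PER_WORD = 10          # assumed word length on the serial interface


def sdi_bit_rate() -> int:
    """Serial bit rate in bit/s for multiplexed 4:2:2 component video."""
    words_per_second = LUMA_RATE_HZ + 2 * CHROMA_RATE_HZ
    return words_per_second * BITS_PER_WORD
```

`sdi_bit_rate()` evaluates to 270,000,000 bit/s, matching the figure quoted for SDI streaming.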

  21. Audio • Encoding Basics • Digitisation • Analog-to-Digital Conversion (ADC) • Sampling of waveform • CD sampling rate 44.1 kHz (i.e. 44100 samples/sec) • Telephony 8 kHz • Quantisation • Pulse Code Modulation (PCM) • 16 bit for CD • Differential Pulse Code Modulation • Example: CD encoding • 2 * 44100 1/s * 16 bit = 1,411,200 bit/s • Uncompressed Digital Audio Formats • WAVE • Reference format • File format • 44.1 kHz, PCM dual stereo audio • Digital Audio Tape (DAT) • 48 kHz, PCM encoded
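
The CD bit rate calculation and the idea behind differential PCM can be sketched as below. The DPCM functions show only the basic principle of storing sample-to-sample differences without any further quantisation, so this version is lossless; practical DPCM coders usually quantise the differences. All function names are hypothetical.

```python
def pcm_bit_rate(channels: int, sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit rate in bit/s of uncompressed PCM audio."""
    return channels * sample_rate_hz * bits_per_sample


def dpcm_encode(samples: list[int]) -> list[int]:
    """Replace each sample by its difference from the previous sample."""
    previous = 0
    differences = []
    for sample in samples:
        differences.append(sample - previous)
        previous = sample
    return differences


def dpcm_decode(differences: list[int]) -> list[int]:
    """Rebuild the samples by accumulating the differences."""
    current = 0
    samples = []
    for difference in differences:
        current += difference
        samples.append(current)
    return samples
```

`pcm_bit_rate(2, 44100, 16)` reproduces the 1,411,200 bit/s from the slide. For slowly varying signals the DPCM differences are much smaller than the samples themselves, so they can be represented with fewer bits.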

  22. MPEG Based Audio Formats • MPEG-1 Audio • Coding formats • Sampling rate • 32 kHz, 44.1 kHz & 48 kHz • Quantisation: 16 bits per sample value • Compression • Split into 32 non-interleaved sub-bands • Frequency transformation • Fast Fourier Transformation (FFT) • Quantisation • Psycho-acoustic model to determine noise level per sub-band • Higher noise level equals bigger quantisation steps • Entropy encoding • Channels • Two independent, two channel stereo, joint stereo • Layers • Downward compatible • Maximum bit rates • Layer I 448 Kb/s, Layer II 384 Kb/s, Layer III 320 Kb/s • MPEG-2 Audio • Based on MPEG-1 • Additionally supports halved sampling rates • Multiple channels (e.g. 5 channels surround, 7 different language channels)

  23. MPEG-4 Audio • Target Application Area • Speech coding • General audio coding • Synthetic audio • Audio composition • Basics • Object oriented coding • Natural audio objects based on MPEG-2 • Improved coding efficiency and error resilience • Very low bit rates and very low delays • Bit rate scalability • Streams composed into audio scene • Through improved coding efficiency • Audio object types • Profiles • Conformance criteria for streams and decoder • Levels • Complexity units / processor and RAM complexity • F. Pereira, T. Ebrahimi (Editors): “The MPEG-4 Book”, IMSC Press Multimedia Series, Prentice Hall, 2002.

  24. MPEG-4 Audio Object Types

  25. MPEG-4 Audio Object Profiles

  26. CMS and Audio Formats • Main Formats • Uncompressed formats • Based on • 44.1 kHz & 48 kHz PCM encoded audio • MPEG formats • MPEG-1 • MPEG-2 • Other (proprietary) formats • RealAudio • LiquidAudio • Requirements • Handle multiple formats • Multiple data rates • From a few Kb/s to around 1.5 Mb/s • New formats of up to 96 kHz • Multiple tools

  27. Image Formats - JPEG • Basics • ISO/IEC JTC1/SC2/WG10 Joint Photographic Experts Group: “Information Technology – Digital Compression and Coding of Continuous-tone Still Images”, International Standard ISO/IEC IS 10918, 1993 • Colour & monochrome images • Exchange format including • Image data • Coding tables & parameters • Modes • Lossy Sequential DCT Base Mode • Expanded Lossy DCT Base Mode • Lossless Mode • Hierarchical Mode • Images in different resolutions

  28. Further Image Formats • GIF (Graphics Interchange Format) • Basics • Platform independent exchange • Lossless compression scheme • Multiple interleaved images • GIF Sectors • Header (GIF ID, algorithm ID) • Application (creation information) • Trailer • Control (controls presentation of a subsequent image block) • Image (image header, optional colour table and pixel data) • Comment • Plain text (to appear in an image) • TIFF (Tagged Image File Format) • Basics • Baseline part • To be supported by every decoding and presentation application • Extensions part • Binary, monochrome, colour (different colour palettes), RGB, etc. • Includes multiple coding schemes (e.g. JPEG) • Fields • Header Directory (byte order, version number), Structure (coding techniques), Fields (defines the image coding blocks), Data Fields (graphical objects not specified in advance)

  29. Structured Documents • SGML (Standard Generalised Markup Language) • Basics • Framework that defines the syntax of tags • Document Type Definition (DTD) required to define semantics • Tags to mark text elements • <start-tag> document element </end-tag> • Facilitates automatic processing • Processing instructions can be specified • SGML tag categories • Descriptive Markup (determines structure of document) • Entity References (placeholder) • Markup Declarations (determine Entity Reference types) • Processing Instructions (including audio & video types) • Web Page & HTML • Hypertext Markup Language (HTML) • Structure • DTD • Document header • Document body • Style sheets • Specification of appearance of recurring elements • Links to other Web documents

  30. Essence Processing • Essence Processing Tools • Content segmentation • Temporal • Shots, cuts • Reconstructing an Edit Decision List • Spatial • Regions or objects • Metadata generation • Relying on analysable properties • E.g. motion detection • Automatic content description • Speech recognition • Transcripts • Keywords • Indexing • Face recognition • Programme classification • Content based retrieval • Image similarity • Fast browsing • Keyframes, skims, storyboards
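
Temporal segmentation into shots can be approximated by flagging large changes between consecutive frames. The sketch below assumes each frame has already been reduced to a normalised histogram; the L1 distance measure and the threshold value are illustrative choices, not something the slides prescribe.

```python
def l1_distance(hist_a: list[float], hist_b: list[float]) -> float:
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


def detect_cuts(frame_histograms: list[list[float]],
                threshold: float = 1.0) -> list[int]:
    """Return the indices of frames that start a new shot."""
    cuts = []
    for i in range(1, len(frame_histograms)):
        # A large jump between neighbouring frames suggests a hard cut.
        if l1_distance(frame_histograms[i - 1], frame_histograms[i]) > threshold:
            cuts.append(i)
    return cuts
```

The list of cut positions is a first step towards reconstructing an Edit Decision List, and the first frame of each detected shot is a natural keyframe candidate for fast browsing.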

  31. Essence Processing: Basic Principles • Feature Extraction • Low level features • Colour histograms, dominant motion vectors, spectrum • Feature interpretation • Matching of low level features with logical concepts • Similarity retrieval • Based on low-level concepts • Example: Speech and sound analysis • Acoustic & phonetic analysis • Syntactical analysis • Semantic analysis
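
Colour histograms as a low-level feature, and similarity retrieval built on them, can be sketched as follows. The bin count and the use of histogram intersection as the similarity measure are assumptions made for illustration; the slides name the feature but not a specific measure.

```python
def colour_histogram(pixels: list[tuple[int, int, int]],
                     bins_per_channel: int = 4) -> list[float]:
    """Normalised RGB histogram of an image given as a list of 8-bit pixels."""
    if not pixels:
        raise ValueError("image contains no pixels")
    step = 256 // bins_per_channel
    histogram = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Combine the three per-channel bin numbers into one index.
        index = ((r // step) * bins_per_channel ** 2
                 + (g // step) * bins_per_channel
                 + (b // step))
        histogram[index] += 1
    return [count / len(pixels) for count in histogram]


def histogram_intersection(hist_a: list[float], hist_b: list[float]) -> float:
    """Similarity in [0, 1]: 1 for identical colour distributions, 0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))
```

Ranking archive images by `histogram_intersection` against a query image gives a simple form of image similarity retrieval. It compares colour distributions only, so mapping the result to logical concepts still needs the feature interpretation step.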
