Audio Definition Model for Flexible File Formats Dave Marston BBC R&D


Presentation Transcript


  1. Audio Definition Model for Flexible File Formats Dave Marston BBC R&D

  2. Involvement
     • EBU Groups:
        • FAR-BWF (BWF file, audio expertise)
        • MIM-MM (EBU Core, metadata expertise)

  3. What is the Audio Definition Model?
     • Formalised way of describing audio for file formats.
     • Initial file format will be Broadcast WAV (BWAV).
     • Specified by EBUCore XML schema.
     • Model can be used more generally.
     • Aim to make it the primary description model for as many formats as possible.

  4. Future Multichannel Audio
     • Channel based
        • e.g. stereo, 5.1, 22.2
     • Scene based
        • e.g. Ambisonics
     • Object based
        • Audio objects with stationary or moving spatial properties.
     • Combinations of all three

  5. Cooking with Audio!
     • Audio Definition Model is like a shopping list of ingredients.
     • Each ingredient has a formal description.
     • BWAV file is like a shopping bag containing the actual ingredients.
     • BWAV 'chna' chunk is like the bar-codes on each item.
     • The ADM is NOT the recipe though!

  6. Terminology

  7. Audio Definition Model Diagram
     [Diagram with two regions labelled "Content" and "Format". Elements shown: audioProgramme, audioContent, audioObject, audioPackFormat, audioChannelFormat, audioBlockFormat, audioStreamFormat, audioTrackFormat, and the 'chna' chunk.]

  8. Simple Channel Based Example
     [Diagram, one row per channel]
     • Channel FrontLeft (00010001); Block start N/A (00000001); Stream PCM_FrontLeft (00010001); Track PCM_FrontLeft (00010001_01)
     • Channel FrontRight (00010002); Block start N/A (00000001); Stream PCM_FrontRight (00010002); Track PCM_FrontRight (00010002_01)
     • Channel Centre (00010003); Block start N/A (00000001); Stream PCM_Centre (00010003); Track PCM_Centre (00010003_01)
     • Pack 3.0 (00010005)
     • Object 3.0 (00011005): 00000001, 00000002, 00000003

  9. Coded Audio Example
     [Diagram]
     • Channel FrontLeft (00010001); Block start N/A (00000001)
     • Channel FrontRight (00010002); Block start N/A (00000001)
     • Channel Centre (00010003); Block start N/A (00000001)
     • Stream DolbyE_3.0 (00040001)
     • Track data1 (00040001_01)
     • Track data2 (00040001_02)
     • Pack 3.0 (00010005)
     • Object 3.0 (00011006): 00000001, 00000002

  10. Object Based Example
      [Diagram]
      • Channel Object1 (00031001)
         • Block start 00:00, dur 00:05 (00000001)
         • Block start 00:05, dur 00:08 (00000002)
         • Block start 00:13, dur 00:07 (00000003)
      • Stream PCM_Object1 (00031001)
      • Track PCM_Object1 (00031001_01)
      • Pack Objects (00031001)
      • Object Objects, start 00:30, dur 00:20 (00031001): 00000001

  11. XML Representation
      • Use new version of the EBUCore schema

      <audioChannelFormat audioChannelFormatID="AC_00031001" audioChannelFormatName="Object1" typeDefinition="Objects">
        <audioBlockFormat audioBlockFormatID="AB_00031001_00000001" rtime="00:00" duration="00:05">
          <position type="azimuth">-20.0</position>
          <position type="elevation">5.0</position>
          <position type="distance">1.0</position>
        </audioBlockFormat>
        <audioBlockFormat audioBlockFormatID="AB_00031001_00000002" rtime="00:05" duration="00:08">
          …
        </audioBlockFormat>
        <audioBlockFormat audioBlockFormatID="AB_00031001_00000003" rtime="00:13" duration="00:07">
          …
        </audioBlockFormat>
      </audioChannelFormat>
      <audioStreamFormat audioStreamFormatID="AS_00031001" audioStreamFormatName="Object1" typeDefinition="PCM">
        <audioChannelFormatIDRef>AC_00031001</audioChannelFormatIDRef>
        <audioTrackFormatIDRef>AT_00031001_01</audioTrackFormatIDRef>
      </audioStreamFormat>
      <audioTrackFormat audioTrackFormatID="AT_00031001_01" audioTrackFormatName="Object1" typeDefinition="PCM"/>
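The cross-references in this XML (a stream format pointing at a channel format and a track format) can be followed with a few lines of Python. This is only a minimal sketch, not an official ADM parser: the `<adm>` wrapper element and the `resolve_stream` helper are hypothetical, XML namespaces are omitted, and only the first block from the slide is included because the other two are elided there.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the slide. <adm> is a hypothetical wrapper so the
# snippet has a single root element; real EBUCore files use XML namespaces.
ADM_XML = """
<adm>
  <audioChannelFormat audioChannelFormatID="AC_00031001"
                      audioChannelFormatName="Object1" typeDefinition="Objects">
    <audioBlockFormat audioBlockFormatID="AB_00031001_00000001"
                      rtime="00:00" duration="00:05">
      <position type="azimuth">-20.0</position>
      <position type="elevation">5.0</position>
      <position type="distance">1.0</position>
    </audioBlockFormat>
  </audioChannelFormat>
  <audioStreamFormat audioStreamFormatID="AS_00031001"
                     audioStreamFormatName="Object1" typeDefinition="PCM">
    <audioChannelFormatIDRef>AC_00031001</audioChannelFormatIDRef>
    <audioTrackFormatIDRef>AT_00031001_01</audioTrackFormatIDRef>
  </audioStreamFormat>
  <audioTrackFormat audioTrackFormatID="AT_00031001_01"
                    audioTrackFormatName="Object1" typeDefinition="PCM"/>
</adm>
"""


def resolve_stream(root, stream_id):
    """Return the (channel, track) elements referenced by a stream format."""
    stream = root.find(f"audioStreamFormat[@audioStreamFormatID='{stream_id}']")
    channel_ref = stream.findtext("audioChannelFormatIDRef")
    track_ref = stream.findtext("audioTrackFormatIDRef")
    channel = root.find(f"audioChannelFormat[@audioChannelFormatID='{channel_ref}']")
    track = root.find(f"audioTrackFormat[@audioTrackFormatID='{track_ref}']")
    return channel, track


root = ET.fromstring(ADM_XML)
channel, track = resolve_stream(root, "AS_00031001")

# Collect the position of the first block as a {type: value} mapping.
first_block = channel.find("audioBlockFormat")
position = {p.get("type"): float(p.text) for p in first_block.findall("position")}

print(channel.get("audioChannelFormatName"), track.get("audioTrackFormatID"), position)
```

Looking elements up by ID attribute mirrors how the model links its parts: nothing is nested across element types, so every relationship is an ID reference.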

  12. Standard Configuration File
      • Many configurations will use common channel types (e.g. stereo, 5.1, 22.2, Ambisonics). Therefore use an external standard reference XML file.

      <audioChannelFormat audioChannelFormatID="AC_00010001" audioChannelFormatName="FrontLeft" typeDefinition="DirectSpeakers">
        <audioBlockFormat audioBlockFormatID="AB_00010001_00000001">
          <speakerLabel>M-30</speakerLabel>
          <position type="azimuth">-25.0</position>
          <position type="elevation">5.0</position>
          <position type="distance">1.0</position>
        </audioBlockFormat>
      </audioChannelFormat>

  13. Custom Configuration
      • For non-standard channel definitions, particularly audio objects, a custom configuration file must be generated.
      • This is what is carried in the 'axml' chunk.

      <audioChannelFormat audioChannelFormatID="AC_00031001" audioChannelFormatName="Object1" typeDefinition="Objects">
        <audioBlockFormat audioBlockFormatID="AB_00031001_00000001" rtime="00:00" duration="00:05">
          <position type="azimuth">-20.0</position>
          <position type="elevation">5.0</position>
          <position type="distance">1.0</position>
        </audioBlockFormat>
        <audioBlockFormat audioBlockFormatID="AB_00031001_00000002" rtime="00:05" duration="00:08">
          <position type="azimuth">-22.0</position>
          <position type="elevation">6.0</position>
          <position type="distance">1.1</position>
        </audioBlockFormat>
        <audioBlockFormat audioBlockFormatID="AB_00031001_00000003" rtime="00:13" duration="00:07">
          <position type="azimuth">-24.0</position>
          <position type="elevation">7.0</position>
          <position type="distance">1.2</position>
        </audioBlockFormat>
      </audioChannelFormat>
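Because each audioBlockFormat covers one time span, finding where the object sits at a given moment amounts to picking the block whose span contains that moment. The sketch below illustrates this under stated assumptions: the `rtime` and `duration` strings are read as `mm:ss` (the slide does not define the format), and the helper names `to_seconds`, `block_at` and `position_of` are hypothetical.

```python
import xml.etree.ElementTree as ET

# The three blocks from the custom configuration slide.
CHANNEL_XML = """
<audioChannelFormat audioChannelFormatID="AC_00031001"
                    audioChannelFormatName="Object1" typeDefinition="Objects">
  <audioBlockFormat audioBlockFormatID="AB_00031001_00000001" rtime="00:00" duration="00:05">
    <position type="azimuth">-20.0</position>
    <position type="elevation">5.0</position>
    <position type="distance">1.0</position>
  </audioBlockFormat>
  <audioBlockFormat audioBlockFormatID="AB_00031001_00000002" rtime="00:05" duration="00:08">
    <position type="azimuth">-22.0</position>
    <position type="elevation">6.0</position>
    <position type="distance">1.1</position>
  </audioBlockFormat>
  <audioBlockFormat audioBlockFormatID="AB_00031001_00000003" rtime="00:13" duration="00:07">
    <position type="azimuth">-24.0</position>
    <position type="elevation">7.0</position>
    <position type="distance">1.2</position>
  </audioBlockFormat>
</audioChannelFormat>
"""


def to_seconds(stamp):
    """Convert 'mm:ss' to seconds (an assumed reading of the slide's times)."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def block_at(channel, t):
    """Return the audioBlockFormat whose span contains t (seconds), or None."""
    for block in channel.findall("audioBlockFormat"):
        start = to_seconds(block.get("rtime"))
        end = start + to_seconds(block.get("duration"))
        if start <= t < end:  # half-open span, so adjacent blocks do not overlap
            return block
    return None


def position_of(block):
    """Return the block's position as a {type: value} mapping."""
    return {p.get("type"): float(p.text) for p in block.findall("position")}


channel = ET.fromstring(CHANNEL_XML)
block = block_at(channel, 7.0)
print(block.get("audioBlockFormatID"), position_of(block))
```

The sketch treats each block's position as constant over its span; whether a renderer should interpolate between blocks is not covered by the slides.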

  14. What are BWAV and RF64 Files?
      • WAV is a RIFF file for audio
      • BWAV = Broadcast WAV
      • BWF = Broadcast WAV File
      • RF64 = WAV file format for files larger than 4 GB
      • BWAV files have a 'bext' chunk
      • MBWF is an RF64 file with a 'bext' chunk

  15. Chunks
      • Resource Interchange File Format (RIFF)
      • Data stored in chunks: header, length & data.
      • WAV chunks:
         • 'RIFF' : tells you it's a WAVE file
         • 'fmt ' : contains sample-rate, number of channels, etc.
         • 'data' : contains audio samples.
      • BWAV chunks:
         • 'bext', 'axml', 'link', 'levl', 'mext', 'qlty', 'dbmd'
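The chunk layout on the Chunks slide (a four-character ID, a 32-bit little-endian length, then the data, padded to an even byte count) is simple enough to walk by hand. The following is a minimal sketch that lists the top-level chunks of a plain RIFF/WAVE stream; `iter_chunks` and `make_chunk` are hypothetical helper names, the sample file is built in memory purely for illustration, and RF64 files, which carry their 64-bit sizes in a separate 'ds64' chunk, are deliberately not handled.

```python
import io
import struct


def iter_chunks(stream):
    """Yield (chunk_id, size, data_offset) for each top-level chunk.

    The caller must not move the stream position between iterations.
    """
    riff_id, _riff_size, form = struct.unpack("<4sI4s", stream.read(12))
    if riff_id != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break  # end of file
        chunk_id, size = struct.unpack("<4sI", header)
        yield chunk_id.decode("ascii"), size, stream.tell()
        # Skip the data; odd-sized chunks are followed by one pad byte.
        stream.seek(size + (size & 1), io.SEEK_CUR)


def make_chunk(chunk_id, payload):
    """Build one chunk: ID, little-endian length, data, optional pad byte."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


# A tiny illustrative file: mono, 48 kHz, 16-bit PCM, four silent samples.
fmt = make_chunk(b"fmt ", struct.pack("<HHIIHH", 1, 1, 48000, 96000, 2, 16))
axml = make_chunk(b"axml", b"<x/>\n")  # 5 bytes, so a pad byte is added
data = make_chunk(b"data", b"\x00\x00" * 4)
body = b"WAVE" + fmt + axml + data
wav = b"RIFF" + struct.pack("<I", len(body)) + body

chunks = [(cid, size) for cid, size, _ in iter_chunks(io.BytesIO(wav))]
print(chunks)
```

A reader that only needs the ADM metadata can use the same loop to locate 'chna' and 'axml' and skip over 'data' without reading the audio samples.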

  16. Where does the XML go?
      [Diagram: the file's chunks (fmt, bext, chna, data, axml) shown alongside "Standard XML Definitions" and "Custom XML Definitions", linked by two arrows labelled "Refers to" and one labelled "is stored in".]
      • If no custom XML definitions are used, then no axml chunk is required.
      • Standard XML definitions do not need to be included in the file.

  17. 'chna' chunk: Simple 3.0 Channel Example

      TrackNo   audioTrackUID   audioTrackFormatID   audioPackFormatID
      1         00000001        00010001_01          00010005
      2         00000002        00010002_01          00010005
      3         00000003        00010003_01          00010005

      • Rows correspond to Track 1, Track 2 and Track 3.
      • First 4 digits specify type of stream. 0001 = PCM
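The note that the first four digits give the stream type suggests a simple way to take an audioTrackFormatID apart. The sketch below is an illustration only: the function and field names are hypothetical, the only type code it knows is the one stated on the slide (0001 = PCM), the meaning of the remaining four digits is not stated on the slide, and accepting hexadecimal digits is an assumption.

```python
import re
from typing import NamedTuple

# Only one code is given on the slide; others are left unmapped on purpose.
STREAM_TYPES = {"0001": "PCM"}

# Hex digits are accepted as an assumption; the slide's examples use 0-9 only.
_ID_PATTERN = re.compile(r"^([0-9A-Fa-f]{4})([0-9A-Fa-f]{4})_([0-9A-Fa-f]{2})$")


class TrackFormatID(NamedTuple):
    type_code: str  # first 4 digits: type of stream
    number: str     # remaining 4 digits (meaning not stated on the slide)
    track: str      # suffix after the underscore


def parse_track_format_id(value: str) -> TrackFormatID:
    """Split an ID such as '00010002_01' into its three parts."""
    match = _ID_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unexpected audioTrackFormatID: {value!r}")
    return TrackFormatID(*match.groups())


def stream_type(value: str) -> str:
    """Name the stream type for an ID, or 'unknown' if the code is unmapped."""
    return STREAM_TYPES.get(parse_track_format_id(value).type_code, "unknown")


# Rows of the 3.0 example:
# (TrackNo, audioTrackUID, audioTrackFormatID, audioPackFormatID)
CHNA_ROWS = [
    (1, "00000001", "00010001_01", "00010005"),
    (2, "00000002", "00010002_01", "00010005"),
    (3, "00000003", "00010003_01", "00010005"),
]

for track_no, uid, track_format, pack in CHNA_ROWS:
    print(track_no, uid, track_format, stream_type(track_format), pack)
```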

  18. Current Status
      • EBU Tech 3364 “Audio Definition Model” now published.
      • EBU Core v1.5 (EBU Tech 3293) schema containing ADM soon to be released.
      • ITU Contributions being made.

  19. Future Work
      • A list of standard configurations will be drawn together.
         • Database
         • Reference XML file
      • Audio Object parameters need continual refinement.
      • Libraries/APIs for parsing and generating ADM metadata to be developed.
      • Look at streaming methods.
