
Introduction to

Archiving Movies in a Digital World. Dave Cavena, Sun Microsystems, January 2007.


Presentation Transcript


  1. Archiving Movies in a Digital World • Dave Cavena, Sun Microsystems • January 2007

  2. Agenda • Overview • Archiving • Archived content integrity • Proposed model • Costs • Alternatives? • Summary • Conclusion

  3. Overview • Has the time come to begin archiving movies digitally? • Only archiving remains reliant on film • Digital image archive technology is mature • A viable, scalable, cost-effective COTS model • What are the alternatives?

  4. Archiving • The stories of an Age • Fiduciary responsibility • A Digital Content Archive can store these assets • without degradation • forever

  5. Archiving • Any movie archived in 1907 is playable in 2007 • Will a celluloid movie archived in 2007 be playable in 2107? • Is it time to start digital archiving of this irreplaceable content?

  6. Archiving • Will the Archive be the only time the story exists on film? • What are celluloid archive and repurposing costs? • A Digital Content Archive provides image and cost advantages over celluloid • Can be accomplished with COTS Technology

  7. Archived Content Integrity • Irreplaceable content • Multiple copies • Multiple libraries • Automated audit, copy • Algorithmic assurance of bit integrity • Error Correction Codes (ECC) • Bit Error Detection • Bit Error Correction

  8. Archived Content Integrity • ECC • Standard on tape drives • COTS technology • Bit Error Rates* • Bit Error Rates (BER) differ by manufacturer • ECC undetected BER = $10^{-33}$ • Four copies = $10^{-128}$ • ECC uncorrected BER = $10^{-19}$ • Four copies = $10^{-76}$ • 10TB Digital Intermediate = $10^{14}$ bits • One uncorrectable bit error in $10^{62}$ movies ($10^{-76} \times 10^{14}$) * Sun T10000 drive
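The multi-copy arithmetic on this slide can be sketched as below. This is only an illustration that takes the quoted per-copy rates at face value and assumes errors on different copies are independent; the function names are hypothetical. Working in base-10 exponents keeps the numbers exact. Under these assumptions the uncorrected case reproduces the slide's $10^{-76}$ and $10^{62}$ figures, while straight multiplication of the undetected rate gives $10^{-132}$, slightly smaller than the $10^{-128}$ quoted.

```python
import math

# Per-copy figures quoted on the slide (Sun T10000 drive), as log10 exponents.
UNDETECTED_BER_EXP = -33    # ECC undetected bit error rate
UNCORRECTED_BER_EXP = -19   # ECC uncorrected bit error rate
COPIES = 4                  # two copies on each of two libraries
BITS_PER_MOVIE_EXP = 14     # 10 TB Digital Intermediate, about 10^14 bits


def all_copies_fail_exp(per_copy_exp: int, copies: int) -> int:
    """log10 of the chance the same bit is bad on every copy.

    Assumes failures on different copies are independent, so the
    probabilities multiply and the exponents add.
    """
    return per_copy_exp * copies


def movies_per_error_exp(ber_exp: int, bits_per_movie_exp: int) -> int:
    """log10 of the expected number of movies stored per bit error."""
    return -(ber_exp + bits_per_movie_exp)


def bits_exp_for_terabytes(terabytes: float) -> float:
    """log10 of the bit count for a size in decimal terabytes."""
    return math.log10(terabytes * 1e12 * 8)


if __name__ == "__main__":
    four_copy = all_copies_fail_exp(UNCORRECTED_BER_EXP, COPIES)
    print("four-copy uncorrected BER: 10^%d" % four_copy)
    print("movies per uncorrectable error: 10^%d"
          % movies_per_error_exp(four_copy, BITS_PER_MOVIE_EXP))
    print("four-copy undetected BER: 10^%d"
          % all_copies_fail_exp(UNDETECTED_BER_EXP, COPIES))
    print("10 TB in bits: about 10^%.1f" % bits_exp_for_terabytes(10))
```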

  9. Archived Content Integrity • Generational data integrity • 20 generations of compute/disk front-end • 5 generations of libraries • Unknown generations of application file formats • At least 12 rewrites of the content onto new media • What is the generational impact on the algorithmic BER?

  10. Archived Content Integrity • For this application it doesn’t matter how many times the data is accessed or how many generations of rewrite it goes through • The probability that the ECC will fail to correct damage during any given access is $10^{-19}$ • The probability that it will fail one or more times during $N$ accesses is 1 minus the probability that it will succeed $N$ times in a row: $1-(1-10^{-19})^N$ • For $N$ less than $10^{19}$, this is well approximated by $N \times 10^{-19}$
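The approximation on this slide follows from the binomial expansion. Writing $p$ for the per-access failure probability (a symbol introduced here for brevity, not used on the slide):

```latex
\[
1-(1-p)^N \;=\; Np \;-\; \binom{N}{2}p^2 \;+\; \cdots \;\approx\; Np
\qquad \text{when } Np \ll 1, \quad p = 10^{-19}.
\]
```

The first neglected term is at most $(Np)^2/2$, so the relative error of the approximation is roughly $Np/2$.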

  11. Archived Content Integrity • Example • Assume a movie accessed one million times • The chance of an uncorrectable bit error per read is $10^{-19}$ • The chance of an uncorrectable bit error on any one of $10^{6}$ reads is $10^{6} \times 10^{-19} = 10^{-13}$ • For a single copy • It reasonably can be assumed for the purposes of this application that the ability to detect and correct errors in transcription is perfect.
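The one-million-read example can be checked numerically. The sketch below is illustrative only and the function names are hypothetical. It evaluates the exact expression through `math.log1p` and `math.expm1`, because the direct floating-point form of $1-(1-p)^N$ rounds $1 - 10^{-19}$ to exactly 1.0 and loses the answer.

```python
import math


def p_any_failure(p: float, n: float) -> float:
    """Exact 1 - (1 - p)**n, evaluated stably for very small p."""
    # (1 - p)**n == exp(n * log(1 - p)); log1p and expm1 keep precision
    # when p is far below the floating-point epsilon.
    return -math.expm1(n * math.log1p(-p))


def p_any_failure_approx(p: float, n: float) -> float:
    """Linear approximation N * p, valid while N * p is much less than 1."""
    return n * p


if __name__ == "__main__":
    p = 1e-19          # per-read uncorrectable error probability (from the slide)
    reads = 1e6        # one million accesses (from the slide)
    print("exact :", p_any_failure(p, reads))
    print("approx:", p_any_failure_approx(p, reads))
    # The naive formula underflows to zero in double precision.
    print("naive :", 1 - (1 - p) ** 10**6)
```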

  12. Archived Content Integrity • Other Strategies • Secure Hash Algorithm, SHA-256* • Checksum failure probability of $2^{-256}$, or approximately $10^{-77}$ • Four-copy BER = $10^{-308}$ • One undetected bit loss in $10^{294}$ movies • Birthday collisions don’t apply; not defending against traffic analysis, just using it as a good checksum • Voting bit-by-bit • Can make a 10TB DCDM into 40 1TB files, 31 of which would have to be damaged to preclude rebuilding the original * Developed by the NSA, publicly available, peer-reviewed, easy to implement
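A minimal sketch of the two strategies named on this slide, using SHA-256 as a checksum and bit-by-bit voting across copies, might look like the following. It is not the archive's actual tooling: the function names are hypothetical, the copies are held in memory as byte strings, and three copies are used in the example so that a majority always exists (with the four copies of the proposed model, a tie rule would be needed).

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """SHA-256 digest of a byte string, as hex."""
    return hashlib.sha256(data).hexdigest()


def audit(copies: list[bytes], expected_digest: str) -> list[bool]:
    """Return, per copy, whether its digest matches the recorded one."""
    return [sha256_hex(c) == expected_digest for c in copies]


def majority_vote(copies: list[bytes]) -> bytes:
    """Rebuild content by bit-by-bit majority over an odd number of copies."""
    if len(copies) % 2 == 0:
        raise ValueError("need an odd number of copies to avoid ties")
    if len({len(c) for c in copies}) != 1:
        raise ValueError("copies must all have the same length")
    threshold = len(copies) // 2
    rebuilt = bytearray(len(copies[0]))
    for i, column in enumerate(zip(*copies)):  # same byte position in each copy
        byte = 0
        for bit in range(8):
            ones = sum((value >> bit) & 1 for value in column)
            if ones > threshold:
                byte |= 1 << bit
        rebuilt[i] = byte
    return bytes(rebuilt)


if __name__ == "__main__":
    original = b"archive object"
    digest = sha256_hex(original)      # recorded at ingest
    damaged = bytearray(original)
    damaged[0] ^= 0x01                 # flip one bit in one copy
    copies = [original, bytes(damaged), original]
    print(audit(copies, digest))
    print(sha256_hex(majority_vote(copies)) == digest)
```

In practice the digest check would run first during an audit, and voting or a rewrite from a pristine copy would only be needed for copies that fail it.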

  13. Archive Model • Enterprise class tape library • Front-end server and disk • Ingest and prepare Archive Object for writing to tape library • Hierarchical Storage Manager, HSM • Two complete and identical systems, geographically separate • Two copies of each movie on each library

  14. Archive Model • Computers and disk front-ends reach EOSL • 5-yr replacement • Tape drives reach EOSL • 10-yr replacement • Libraries reach EOSL • 20-yr replacement • Tape media has a finite lifetime* • Replace tapes every 10 years • Audit every tape every six months • Re-write from pristine copies as necessary *National Media Lab, IBM, Sun, others, publish 30 years as viable tape media lifetime
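The replacement cadence above can be turned into generation counts over the life of the archive. The sketch below assumes a 100-year horizon, taken from the cost slides rather than from this slide, and uses hypothetical names; it simply divides the horizon by each interval.

```python
HORIZON_YEARS = 100  # assumption: the 100-year span used on the cost slides

# Replacement intervals in years, as listed on the slide.
REPLACEMENT_INTERVAL_YEARS = {
    "compute and disk front-end": 5,
    "tape drives": 10,
    "libraries": 20,
    "tape media": 10,
}

AUDIT_INTERVAL_YEARS = 0.5  # every tape audited every six months


def generations(horizon_years: float, interval_years: float) -> int:
    """Number of whole replacement (or audit) cycles within the horizon."""
    return int(horizon_years // interval_years)


if __name__ == "__main__":
    for item, interval in REPLACEMENT_INTERVAL_YEARS.items():
        print(f"{item}: {generations(HORIZON_YEARS, interval)} generations")
    print("audits per tape slot:",
          generations(HORIZON_YEARS, AUDIT_INTERVAL_YEARS))
```

With these inputs the compute/disk and library counts come out to the 20 and 5 generations cited on slide 9.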

  15. Archive Model • Application software and file formats • Proposed archive model HSM uses an open tarball format, readable even without the application • When a tape is audited, rewritten or copied, the new copy can be created in the new file format • This is feasible because the underlying data format remains digitally fixed; only the file format and/or storage medium changes
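To illustrate why an open tar container stays readable without the archiving application, the sketch below writes and reads a small archive object using only Python's standard `tarfile` module. It does not reproduce the HSM's actual on-tape layout; the member names and helper functions are hypothetical.

```python
import io
import tarfile


def write_archive_object(members: dict[str, bytes]) -> bytes:
    """Pack named payloads into an uncompressed tar container in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_archive_object(blob: bytes) -> dict[str, bytes]:
    """Read every regular file back out of a tar container."""
    recovered = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for info in tar.getmembers():
            if info.isfile():
                recovered[info.name] = tar.extractfile(info).read()
    return recovered


if __name__ == "__main__":
    # Hypothetical member names standing in for image frames.
    frames = {"reel1/frame0001.dpx": b"\x00\x01", "reel1/frame0002.dpx": b"\x02\x03"}
    blob = write_archive_object(frames)
    print(read_archive_object(blob) == frames)
```

Because the payload bytes come back unchanged, a later rewrite can move them into a new container format or onto new media without altering the content itself.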

  16. Archive Model • Institutional memory must be created • Two or more sites are required, geographically separate • No network connectivity • Archive content in the clear • Same as current model • Lost key or algorithm will render archive useless • Can be encrypted for transport (tape drive HW encryption becoming the norm) • When copying tapes, send old ones to another location

  17. Archive Model • Oil & Gas has been archiving digital images for decades • Medical is doing this with far higher transaction rates • Library of Congress doing it now • "Storing National Treasures" http://www.enterprisestorageforum.com/sans/features/article.php/3586066 • "Sun Rises at the Library of Congress" http://www.enterprisestorageforum.com/sans/features/article.php/3619646

  18. Costs • Can digital compete with celluloid? • Film archiving cost • $100K /100 years / feature • 2,000 movies = $200M • 10TB archive object, 20 objects/year, 100 years • $45,000/movie (list) • $16,000/movie (Archive pricing) • 2,000 movies = $32M • 100TB archive object • $67,000/movie (Archive pricing) • 2,000 movies = $79M

  19. Costs: 10TB Archive Object – List price • Chart: dollars vs. movies in archive • 10 movies: $2,601,160 • 100: $408,631 • 500: $114,218 • 1000: $73,094 • 1500: $57,330 • 2000: $45,493

  20. Costs: 10TB Archive Object – List price • Cost breakdown (both libraries, two copies/movie/library) • Compute/Disk + mtce: $3,200,000 (4%) • Library and Drives + mtce: $16,112,000 (18%) • Media: $2,699,000 (3%) • SAM License + mtce: $68,974,400 (75%) • Total: $90,985,400
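The breakdown on this slide can be cross-checked with a few lines of arithmetic. The sketch below sums the four line items and divides by 2,000 movies; that movie count is an assumption, inferred because the result matches the $45,493 shown for 2,000 movies on slide 19. The names are hypothetical.

```python
# Line items from the slide, in dollars (10TB archive object, list price).
COSTS = {
    "Compute/Disk + mtce": 3_200_000,
    "Library and Drives + mtce": 16_112_000,
    "Media": 2_699_000,
    "SAM License + mtce": 68_974_400,
}

MOVIES = 2_000  # assumption: the total covers the 2,000-movie archive


def total_cost(costs: dict[str, int]) -> int:
    """Sum of all line items."""
    return sum(costs.values())


def cost_shares(costs: dict[str, int]) -> dict[str, float]:
    """Each line item as a percentage of the total."""
    total = total_cost(costs)
    return {name: 100 * value / total for name, value in costs.items()}


def cost_per_movie(costs: dict[str, int], movies: int) -> float:
    """Total cost spread evenly over the archived movies."""
    return total_cost(costs) / movies


if __name__ == "__main__":
    print(f"total: ${total_cost(COSTS):,}")
    print(f"per movie: ${cost_per_movie(COSTS, MOVIES):,.0f}")
    for name, share in cost_shares(COSTS).items():
        print(f"{name}: {share:.1f}%")
```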

  21. Costs: 10TB Archive Object – Archive price

  22. Costs: 10TB Archive Object – Archive price • Chart: dollars vs. movies in archive • 10 movies: $2,005,850 • 100: $236,852 • 500: $54,072 • 1000: $29,706 • 1500: $21,259 • 2000: $16,265

  23. Costs: 100TB Archive Object – List price • Chart: dollars vs. movies in archive • 20 movies: $6,488,535 • 100: $5,006,303 • 500: $3,791,715 • 1000: $3,180,640 • 1500: $2,641,742 • 2000: $2,121,252

  24. Costs: 100TB Archive Object – List price

  25. Costs: 100TB Archive Object – Archive price

  26. Costs: 100TB Archive Object – Archive price • Chart: dollars vs. movies in archive • 20 movies: $1,462,466 • 100: $505,009 • 500: $170,432 • 1000: $112,159 • 1500: $85,527 • 2000: $67,171

  27. Alternatives • An unaddressed question… Does celluloid have a future – at all? • Replaced by commercial photographers globally • Precipitous drop in market share and manufacturer jobs • Environmentally unfriendly to manufacture and process • Celluloid may not be an option • Film may not even exist in 100 years • Film infrastructure – labs, chemicals, workers, etc. - may not exist

  28. Summary • The technology required to store and maintain irreplaceable digital image content for archive durations is mature, proven and in use today • A Digital Content Archive will extend the quick responsiveness of a studio’s Library to the Archive • The return on these increasingly expensive assets easily can be extended – forever • … all using COTS technology

  29. Conclusion The pivotal and immutable point is that this can be done beginning today. The experience Sun brings to the project already has been recognized, and is being broadened by, the Library of Congress and other locations around the world undertaking the digitization of their media assets using solutions from Sun Microsystems. The time is now to begin serious efforts to test and implement studio Digital Content Archives

  30. Thank you • Dave Cavena • david.cavena@sun.com
