
Bits about Bits: Bitzi and the Business of Metadata

Bits about Bits: Bitzi and Open, Cooperative Metadata. Gordon Mohr, Bitzi Corporation, Founder


Presentation Transcript


    2. Bits about Bits: Bitzi and the Business of Metadata Gordon Mohr Bitzi Corporation Founder & Chief Technology Officer September 17, 2001

    3. Bits about Bits: Bitzi and Open, Cooperative Metadata Gordon Mohr Bitzi Corporation Founder & Chief Technology Officer November 7, 2001

    4. Overview P2P File Sharing: “a cornucopia without confidence” Four Missing Ingredients The Bitzi Approach Demos Future Directions Could metadata be a big business?

    5. Everything is now “Bits”… Anything can be encoded, stored, shifted, shared The “cloud” is coming to include everything Tech and social trends are against strict control

    6. No Confidence or Context You can get anything imaginable, BUT… Is it complete? Where did it originate? Has it been damaged or altered? Is this the best or current instance? What’s related? Is it legitimate? What should I seek next? Current ad hoc & P2P sharing/distribution nets inherently blur these issues Filename-centric Mr. Short-Term Memory

    7. What’s Missing? We’re craving four things: Reliable Names Nothing can masquerade as something else Easy to ask for exactly the right thing Rich Metadata Beyond just “filename” and “length” Easy Access Everywhere the files are, and then some A Consensus View Eliminate frivolous skew of understanding

    8. We Want: Reliable Names Does a file have a “True Name”? Yes, via Cryptographic Hashes Essentially, these are “digital fingerprints” Any-sized input (any digital file) to fixed-sized output (hash value) Deterministic but “unpredictable” Infeasible to create specific desired hash value Infeasible to find two inputs with same hash value Examples: MD5 (but maybe not as reliably as once thought) SHA1 (and now SHA256, SHA512) Tiger RIPEMD160
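The fingerprint properties listed on this slide can be illustrated with a minimal sketch using Python's standard `hashlib`. The helper name `fingerprint` and the sample inputs are assumptions made for illustration; they are not part of the presentation.

```python
import hashlib


def fingerprint(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hex digest of `data` under the named hash algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


# Hypothetical sample inputs that differ in a single character.
original = b"bits about bits"
altered = b"bits about bitz"

# Deterministic: hashing the same input twice gives the same value.
same_twice = fingerprint(original) == fingerprint(original)

# "Unpredictable": a one-character change yields a different digest.
differs = fingerprint(original) != fingerprint(altered)

# Fixed-sized output: digest length depends only on the algorithm,
# not on the size of the input.
sha1_len = len(fingerprint(b"x" * 1_000_000, "sha1"))      # 40 hex chars
sha256_len = len(fingerprint(b"", "sha256"))               # 64 hex chars

print(same_twice, differs, sha1_len, sha256_len)
```

Because the digest depends only on the content, two files with different names but identical bytes get the same "true name", and a renamed or altered file cannot pass for the original.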

    9. We Want: Rich Metadata Metadata is “Data about other Data” “Filename” and “Length” are a trivial start Intrinsic or extrinsic to file itself Examples Generic: Origin, Free-form description, Comments, Community Ratings Format-specific: Encoding parameters, Resolution, Playback length Growing body of useful standards and conventions XML, RDF, Dublin Core, domain-specific proposals
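As a rough illustration of metadata that goes beyond filename and length, the sketch below builds a small XML record using Dublin Core element names and then flattens it into a row-like dictionary. The `record` wrapper element, the `sha1:` identifier prefix, the function names, and all field values are hypothetical; only the Dublin Core namespace and element names come from that standard.

```python
import hashlib
import xml.etree.ElementTree as ET

# Dublin Core elements namespace (from the standard itself).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def describe_file(identifier: str, fields: dict) -> ET.Element:
    """Build an XML metadata record keyed by a content identifier.

    The <record> wrapper is a made-up container for this sketch.
    """
    record = ET.Element("record")
    ET.SubElement(record, f"{{{DC_NS}}}identifier").text = identifier
    for name, value in fields.items():
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


def to_row(record: ET.Element) -> dict:
    """Flatten the record into a dict, as one might before a database insert."""
    return {child.tag.split("}", 1)[1]: child.text for child in record}


# Hypothetical file content and descriptive fields.
content = b"example file content"
file_id = "sha1:" + hashlib.sha1(content).hexdigest()
record = describe_file(file_id, {
    "title": "Example recording",       # extrinsic, generic
    "description": "Free-form notes",   # extrinsic, generic
    "format": "audio/mpeg",             # intrinsic, format-specific
})

xml_text = ET.tostring(record, encoding="unicode")
row = to_row(record)
print(xml_text)
print(row)
```

Keying the record by a content hash rather than a filename means the description stays attached to the bits wherever the file travels.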

    10. We Want: Easy Access Ubiquity Anywhere the files are – and where they’re not Simplicity Familiar interfaces Reliability Canonical location Redundant Mirrors Multiple paths – same paths as files

    11. We Want: A Consensus View Avoid redundant efforts Achieve convergence on simple issues Trivial disagreements and mistakes should be quickly and permanently resolved Robustness against casual mischief Capture and highlight enduring disagreements Even arbitrary commonality is valuable Naming systems A central “reference point” is the easy solution

    12. The File Trust Utility

    13. The Bitzi Approach A metadata aggregator, consisting of… Website Community of contributors Editorial/rating policies Canonical datastore Web service Free access and reuse Just give us attribution Other restrictions only get in the way Our long-term role: stewardship We live or die by the usefulness of the dataset

    14. Sources of Inspiration Open Directory Project AKA NewHoo, GnuHoo, DMoz(illa) Volunteer-built Yahoo-like categorical web index CD/Music projects CDDB (before dataset lockdown) FreeDB & MusicBrainz (since) Oxford English Dictionary “The Professor and the Madman” Napster et al De facto quality filtering Usenet (esp. FAQs), Epinions, Amazon reviews, eBay, Zagat’s

    15. How Bitzi Works: Bitprints & Tickets

    16. How Bitzi Works: Tickets Out

    17. How Bitzi Works: Tech Details Our “Bitprint” Master key into our catalog Concatenation of two nonproprietary hashes SHA1: safe, standard TigerTree: different basis, range benefits Robustness against research breakthroughs Our data model & terminology Bitprints may be “tagged” Tags are arbitrary XML blobs Growing set of types Usually coercible into a database row or RDF Tags “compete” with each other as necessary “Tickets” are created from the best tags
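The idea of a key built by concatenating a flat hash with a tree hash can be sketched as follows. This is not Bitzi's implementation: Python's standard library has no Tiger hash, so SHA-256 is swapped in for the tree portion, and the 1024-byte block size, the leaf and node prefixes, the Base32 encoding, and the `.` separator are all assumptions chosen for illustration.

```python
import base64
import hashlib

BLOCK_SIZE = 1024  # assumed leaf size, for illustration only


def tree_root(data: bytes, hash_name: str = "sha256") -> bytes:
    """Root of a Merkle hash tree over fixed-size blocks of `data`.

    SHA-256 stands in for Tiger, which hashlib does not provide.
    Leaves and internal nodes get distinct prefixes so that a leaf
    can never be confused with an internal node.
    """
    def h(prefix: bytes, payload: bytes) -> bytes:
        return hashlib.new(hash_name, prefix + payload).digest()

    blocks = [data[i:i + BLOCK_SIZE]
              for i in range(0, len(data), BLOCK_SIZE)] or [b""]
    level = [h(b"\x00", block) for block in blocks]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                next_level.append(h(b"\x01", level[i] + level[i + 1]))
            else:
                next_level.append(level[i])  # odd node is promoted unchanged
        level = next_level
    return level[0]


def b32(raw: bytes) -> str:
    """Base32 text form without padding characters."""
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def bitprint_sketch(data: bytes) -> str:
    """Concatenate a flat SHA-1 digest with the tree-hash root."""
    return b32(hashlib.sha1(data).digest()) + "." + b32(tree_root(data))


print(bitprint_sketch(b"hello"))
```

Two design points carry over from the slide. Using hashes with different constructions means a break in one does not immediately compromise the combined key. The tree form also lets a recipient check an individual block range against the root using only the sibling hashes along its path, without having the whole file.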

    18. How Bitzi Works: Current tools Data collection Downloadable “Bitcollider” utility Windows & Linux Free source code Calculates bitprint, extracts some intrinsic tags Web forms Viewing/rating/searching All at our website

    19. How Bitzi Works: Open Code & Data Bitcollider & bitprinting code available Public Domain C & Java Free dataset access: “OpenBits” Draft OpenBits License based on Open Directory Project license Preliminary RDF dump available http://preview.openbits.org Eventually, at the Ticket granularity

    20. Using Bitzi On your desktop: Identify anything you’ve got – including possible problems, newer versions, etc. At our website: Find interesting potential new things to get – in context, presented alongside other options In other applications, devices, websites: Identify what’s playing Choose between offered options Organize/correct your collection Much more… ?

    21. Demos Bitzi Bitcollider Desktop utility LimeWire Evaluate search results before downloading WinAmp See more about “what’s playing” Bitzi Website Search for new items of interest

    22. Future: Greater Integration Standard, generic “get” facility We expect: single-click from Ticket asks multiple applications to locate matching file Ticket info inside applications Get Ticket direct from Bitzi, or elsewhere Verify Ticket validity (cryptographically signed) Display as locally appropriate

    23. Future: Website and Community Enhanced search Improved rating and peer-review processes Browsing/Categorization Automatic and manual Dataset mining Variety of rankings

    24. Is this a Business? Not all Tickets are (or should be) equal Fuzzy vs. guaranteed trust Community vs. promotional info Attention is always scarce Some special inserts will cost Someone always needs to be found & trusted Users benefit: Fees subsidize verification procedures Prices self-select for appropriateness Has anyone succeeded with “free lookups, paid inserts?” (Yes; examples should be obvious)

    25. The End Gordon Mohr Founder & Chief Technology Officer Bitzi Corporation Email gojomo@bitzi.com Bitizen Page http://bitzi.com/bitizen/gojomo O’Reilly Weblog http://www.oreillynet.com/weblogs/gojomo
