Naming and Metadata

Jamie Shiers, Application Software and Databases Group, Information Technology Division, CERN. http://wwwinfo.cern.ch/asd/cernlib/rd45/index.html. Introduction: Naming and Metadata in FATMEN; Naming on the Web; Naming for LHC data.


Presentation Transcript


  1. Naming and Metadata
     Jamie Shiers, Application Software and Databases Group, Information Technology Division, CERN
     http://wwwinfo.cern.ch/asd/cernlib/rd45/index.html

  2. Introduction
     • Naming and Metadata in FATMEN
     • Naming on the Web
     • Naming for LHC data
     Jamie.Shiers@cern.ch

  3. FATMEN Naming
     • Think of it as a file catalog
     • no special characters, case insensitive
     • Each “file” is identified by a Unix-style path name
         • //CERN/L3/PROD/DATA/PDRE/CC045QX2
         • //CERN/DELPHI/P01_ALLD/MDST/PHYS/Y95V02/SUMT/C0515
     • “Nicknames” corresponding to sets of files
         • :NICK.RAWD91
         • :GNAME.P01_ALLD/RAWD/NONE/Y91V00/*/R
         • :DESC.RAW data of ALL events; 1991 data
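
The naming rules on this slide (Unix-style path names, compared case insensitively, with wildcard patterns standing for sets of files) can be illustrated with a minimal Python sketch. This is not FATMEN code: the function names are hypothetical, and the wildcard semantics are borrowed from Python's `fnmatch` module as an assumption, since the slide does not define how `*` is interpreted.

```python
from fnmatch import fnmatchcase


def normalise(generic_name):
    """Upper-case a name so that comparisons are case insensitive."""
    return generic_name.strip().upper()


def split_name(generic_name):
    """Split '//CERN/L3/PROD/...' into its path components."""
    name = normalise(generic_name)
    if not name.startswith("//"):
        raise ValueError(f"expected a name starting with '//': {generic_name!r}")
    return name[2:].split("/")


def matches(generic_name, pattern):
    """Assumed semantics: shell-style wildcards, where '*' may also span '/'."""
    return fnmatchcase(normalise(generic_name), normalise(pattern))


# The two path names are taken from the slide; the patterns are made up.
l3_file = "//CERN/L3/PROD/DATA/PDRE/CC045QX2"
delphi_file = "//CERN/DELPHI/P01_ALLD/MDST/PHYS/Y95V02/SUMT/C0515"

components = split_name("//cern/l3/prod/data/pdre/cc045qx2")
delphi_hit = matches(delphi_file, "//cern/delphi/p01_alld/mdst/*")
l3_hit = matches(l3_file, "//CERN/DELPHI/*")
```

Normalising both the name and the pattern before comparing keeps the case-insensitivity rule in one place, so a lookup behaves the same however the user typed the name.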

  4. FATMEN Metadata
     • 0.5 KB metadata per filename
         • DSN, hostname, data representation, media type, location code
         • host type & OS details
         • VSN/VID/fseq, density, volseq
         • start/end record/block
         • record format, length, blocksize (DCB), filesize
         • creation / catalog / use date & time
         • creator username, account, job, node
         • protection mask
         • user words (10) & comment (80 bytes)
     • Largely irrelevant for ODBMS, except creation info (++)
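
To make the shape of such a catalog entry concrete, here is a sketch of a record holding a subset of the fields listed above. The field names, types, and units are assumptions chosen for illustration; the slide only names the fields and gives the limits of 10 user words and an 80-byte comment. The sample values are invented, apart from the path name, which comes from the previous slide.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class CatalogEntry:
    """Hypothetical in-memory view of one entry (subset of the slide's fields)."""
    generic_name: str
    dsn: str
    hostname: str
    media_type: str
    vsn: str
    vid: str
    fseq: int
    filesize: int          # units are not stated on the slide
    created: datetime
    creator: str
    user_words: list = field(default_factory=lambda: [0] * 10)
    comment: str = ""

    def __post_init__(self):
        # Limits taken from the slide: 10 user words, 80-byte comment.
        if len(self.user_words) != 10:
            raise ValueError("exactly 10 user words are expected")
        if len(self.comment.encode("utf-8")) > 80:
            raise ValueError("comment is limited to 80 bytes")


# Invented sample values, for illustration only.
entry = CatalogEntry(
    generic_name="//CERN/L3/PROD/DATA/PDRE/CC045QX2",
    dsn="example.dsn",
    hostname="examplehost",
    media_type="tape",
    vsn="X00001",
    vid="X00001",
    fseq=1,
    filesize=200,
    created=datetime(1995, 1, 1),
    creator="exampleuser",
    comment="sample entry",
)
```

Of these fields, only the creation information (`created`, `creator`) is flagged on the slide as still relevant once the data lives in an ODBMS.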

  5. Web
     • By convention, most sites located by http://www.name.com [.org .gov .int .ch]
         • OK for (very) high-level entry points
         • but is it www.british-airways.com, www.britishairways.com, www.britishair.com, www.ba.co.uk ?
         • www.altavista.com is not the altavista search engine
     • More complicated addresses best found by navigation
         • www.cern.ch → R&D → RD45
         • or via search engines, book-marks etc.

  6. Naming for LHC Event Data
     • How many entities will need to be named?
     • Can one avoid naming e.g. the collection of rawdata corresponding to run 123 year 2007?
         • A simple naming scheme for such data may be sufficient
     • How many analysis collections will there be?
     • Naming is clearly insufficient to describe the data
         • and, in general, a poor way of finding it!
     • Job information (creator) should probably be an association to a persistent “job object” in the “production database”
     • Can other metadata be “standardised” or simply “tag+attribute”
         • re-use of “generic tag” concept (implementation?)
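
One way to picture the "tag+attribute" idea, together with creator information held as an association to a job object, is the following sketch. It is an illustration only, not the RD45 design: the class names, tag keys, and the `find` helper are all hypothetical, and plain Python objects stand in for persistent objects in an ODBMS.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Job:
    """Stand-in for a persistent "job object" in a production database."""
    job_id: int
    creator: str


@dataclass
class Collection:
    """A set of events described by free-form tag/attribute pairs."""
    job: Job                                   # association, not copied fields
    tags: dict = field(default_factory=dict)


def find(collections, **wanted):
    """Return the collections whose tags match every requested attribute."""
    return [
        c for c in collections
        if all(c.tags.get(key) == value for key, value in wanted.items())
    ]


# Invented sample data; run 123 in year 2007 is the example from the slide.
reco_job = Job(job_id=1, creator="exampleuser")
collections = [
    Collection(job=reco_job, tags={"kind": "rawdata", "run": 123, "year": 2007}),
    Collection(job=reco_job, tags={"kind": "rawdata", "run": 124, "year": 2007}),
]

hits = find(collections, kind="rawdata", run=123, year=2007)
```

Here the run 123 collection is found through its attributes and never needs a name of its own, which is the question the slide raises.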
