OAI Implementation Notes for LTRS, NACA and Open Video

Michael L. Nelson
NASA Langley Research Center & University of North Carolina
[email protected]
http://www.ils.unc.edu/~mln/
OAI Open Meeting, Washington DC, January 23, 2001

Collections Represented
  • NASA
    • LTRS (Langley Technical Report Server)
      • ~2300 reports, begun in 1992
      • http://techreports.larc.nasa.gov/ltrs/
        • OAI: http://techreports.larc.nasa.gov/ltrs/oai/
    • NACA (National Advisory Committee for Aeronautics)
      • NACA was the predecessor organization to NASA, operating from 1915 to 1958
      • ~6300 reports, begun in 1996
      • http://naca.larc.nasa.gov/
        • OAI: http://naca.larc.nasa.gov/oai/
Collections Represented (continued)
  • University of North Carolina
    • The Open Video Project
      • ~ 200 public domain video segments, project begun in 1998
      • http://www.open-video.org/
        • OAI: http://buckets.dsi.internet2.edu/openvideo/oai/
      • Open Video contents and OAI services still strictly experimental
NASA: Why is OAI Important?
  • NASA builds DLs out of necessity, but ultimately NASA is a publisher
  • Interested in maximum exposure of and accessibility to its “unrestricted, unlimited” contents
  • In the NASA DLs, we left our “dark matter” partially exposed
    • individual reports were spidered by robots anyway…
    • OAI provides a more formal interface & protocol for exposing contents
UNC: Why is OAI Important?
  • Goal is to grow Open Video into a TREC-like corpus of video segments to share with the research community
    • a standard collection of short (10 seconds – 1 hour) video segments on which to perform content-based video retrieval
    • variability in video types: color/b&w, sound/no sound, high/low motion, etc.
    • currently in MPEG-1
      • other formats in the future
OAI Implementation
  • Protocol only specifies CGI stub
    • many implementations possible
  • I used a “bucket” for each: LTRS, NACA & Open Video
    • buckets are aggregative, computational entities normally used for data storage
      • generally, 1 bucket per “report”
    • buckets = metadata + data + methods
OAI Bucket Structure

Bucket
  • index.cgi
  • default bucket packages
    • _method.pkg: source files for methods
    • _http.pkg: http dependency files
    • _log.pkg: logs
    • _tc.pkg: terms and conditions
    • _md.pkg: metadata
    • _state.pkg: bucket state
  • oai
    • bucket payload is a DL-specific support library
    • the oai.pl element is a support library that defines access for the specific DL
  • in addition to the ~30 bucket methods, each OAI verb is implemented as a separate method
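The "one method per OAI verb" idea can be sketched as a small dispatch table. This is only an illustration in Python, not the bucket code itself: the handler names, the placeholder return strings, and the choice of HTTP 400 for an unknown verb are all assumptions made for the example.

```python
from urllib.parse import parse_qs

# Hypothetical handlers: each returns a placeholder string, not real OAI XML.
def identify(args):
    return "<Identify/>"

def list_identifiers(args):
    return "<ListIdentifiers/>"

def get_record(args):
    return "<GetRecord/>"

# One entry per verb, mirroring "each OAI verb is implemented as a separate method".
VERB_METHODS = {
    "Identify": identify,
    "ListIdentifiers": list_identifiers,
    "GetRecord": get_record,
}

def dispatch(query_string):
    """Map the verb= argument of a request to its method; return (status, body)."""
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    verb = params.pop("verb", None)
    method = VERB_METHODS.get(verb)
    if method is None:
        # Assumption: signal a bad request with an HTTP-style 400 status.
        return 400, "unknown or missing verb"
    return 200, method(params)
```

A request such as `dispatch("verb=ListIdentifiers")` is routed to `list_identifiers`; adding a verb means adding one function and one table entry.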

NACA OAI Implementation

  • diagram: normal WWW use and OAI requests are both served from the NACA file system
    • OAI server (index.cgi): OAI responses built from examining structure of NACA filesystem
    • NACA file system: year directories (1917, 1918, . . . 1958) containing report directories (e.g. naca-tn-1)
    • each report: refer metadata, thumbnail GIFs, full size GIFs
  • LTRS, NACA, Open Video have different file structures, metadata formats, etc.
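Building a ListIdentifiers answer "from examining structure of NACA filesystem" might look roughly like the sketch below. It assumes a layout of year directories that each contain one directory per report; the `oai:naca:` identifier prefix, the helper `build_demo_tree`, and the report name `naca-tn-2` are invented for the demonstration and are not taken from the real server.

```python
import os
import tempfile

def list_identifiers(root, prefix="oai:naca:"):
    """Collect one identifier per report directory found under the year directories."""
    identifiers = []
    for year in sorted(os.listdir(root)):
        year_dir = os.path.join(root, year)
        # Skip anything that is not a year-named directory (assumed layout).
        if not (year.isdigit() and os.path.isdir(year_dir)):
            continue
        for report in sorted(os.listdir(year_dir)):
            if os.path.isdir(os.path.join(year_dir, report)):
                identifiers.append(prefix + report)
    return identifiers

def build_demo_tree():
    """Create a tiny throwaway tree shaped like the assumed layout."""
    root = tempfile.mkdtemp()
    for year, report in [("1917", "naca-tn-1"), ("1918", "naca-tn-2")]:
        os.makedirs(os.path.join(root, year, report))
    return root
```

Because the file system is the only source of truth here, no separate index has to be kept in sync; the cost is a directory walk on every request.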

Implementation
  • Did not implement sets
    • possible set candidates:
      • NACA: years, report type
      • LTRS: NASA STI subject classification
  • Only supporting Dublin Core
    • DC not sufficient for targeted applications
  • Did not implement resumptionToken
302 Load Balancing
  • diagram: harvester sends http://blah/oai/?verb=ListIdentifiers to the OAI server at naca.larc.nasa.gov/oai/
    • if load > 0.05, redirect request: HTTP Status Code 302
    • harvester repeats http://blah/oai/?verb=ListIdentifiers against the OAI server at buckets.dsi.internet2.edu/naca/oai/
    • response: <?xml version="1.0" encoding="UTF-8"?> <ListIdentifiers> </ListIdentifiers>
  • Interactive users on main DL machine should not be impacted by metadata harvesting
    • don’t take deliveries through the front door
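The redirect rule can be sketched as a single decision function. The 0.05 threshold and the mirror address come from the slide; reading the 1-minute Unix load average via `os.getloadavg()` and the `(status, headers)` return shape are assumptions of this sketch, since the slides do not say how load was measured.

```python
import os

LOAD_THRESHOLD = 0.05  # threshold shown on the slide
MIRROR = "http://buckets.dsi.internet2.edu/naca/oai/"  # second OAI server from the slide

def route_request(query_string, load=None):
    """Return (status, headers): 302 to the mirror when the main machine is busy."""
    if load is None:
        # Assumption: use the 1-minute load average (available on Unix only).
        load = os.getloadavg()[0]
    if load > LOAD_THRESHOLD:
        return 302, {"Location": MIRROR + "?" + query_string}
    # Otherwise the main server answers the request itself.
    return 200, {}
```

Passing `load` explicitly, as in `route_request("verb=ListIdentifiers", load=0.5)`, makes the rule easy to exercise without a loaded machine. A harvester that follows HTTP redirects needs no special knowledge of the mirror.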
Metadata Quality
  • XML is very brittle: 1 bad character in the metadata and an entire ListIdentifiers message can be damaged
    • yes, my DLs should be more diligent about scrubbing their metadata, but…
    • author-contributed metadata is particularly a problem (e.g. control characters from copy-n-paste)
    • one advantage of resumptionToken is that it compartmentalizes bad data
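One way to scrub metadata before it goes into a response is sketched below. It is a minimal example rather than what the NASA servers do: it drops the C0 control characters that XML 1.0 forbids, then escapes markup characters. The function names and the `title` wrapper element are hypothetical.

```python
import re
from xml.dom import minidom
from xml.sax.saxutils import escape

# C0 control characters that XML 1.0 forbids (tab, LF and CR are allowed).
_ILLEGAL_XML = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

def scrub(text):
    """Drop forbidden control characters, then escape &, < and >."""
    return escape(_ILLEGAL_XML.sub("", text))

def wrap_title(text):
    """Hypothetical helper: place scrubbed text inside a single element."""
    return "<title>%s</title>" % scrub(text)

def is_well_formed(xml_text):
    """Return True if the string parses as XML."""
    try:
        minidom.parseString(xml_text)
        return True
    except Exception:
        return False
```

For instance, a title pasted with a stray form feed, such as `"Wing\x0c loads & drag"`, passes `is_well_formed(wrap_title(...))`, whereas the unscrubbed string wrapped in the same element does not parse.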
OAI Impact
  • Can use OAI to build our own generalized services
    • updates, alerts
  • Finally have a clean method to export metadata, both to:
    • the general community for unrestricted data
    • closed communities with restricted data
      • Los Alamos, Air Force Research Laboratory, NASA