OAI Implementation Notes for LTRS, NACA and Open Video

Michael L. Nelson

NASA Langley Research Center &

University of North Carolina

mln@ils.unc.edu

http://www.ils.unc.edu/~mln/

OAI Open Meeting, Washington DC, January 23, 2001


Collections Represented

  • NASA

    • LTRS (Langley Technical Report Server)

      • ~2300 reports, begun in 1992

      • http://techreports.larc.nasa.gov/ltrs/

        • OAI: http://techreports.larc.nasa.gov/ltrs/oai/

    • NACA (National Advisory Committee for Aeronautics)

      • NACA was the predecessor organization to NASA, operating from 1915 to 1958

      • ~6300 reports, begun in 1996

      • http://naca.larc.nasa.gov/

        • OAI: http://naca.larc.nasa.gov/oai/


Collections Represented

  • University of North Carolina

    • The Open Video Project

      • ~200 public domain video segments, project begun in 1998

      • http://www.open-video.org/

        • OAI: http://buckets.dsi.internet2.edu/openvideo/oai/

      • Open Video contents and OAI services still strictly experimental


NASA: Why is OAI Important?

  • NASA builds DLs out of necessity, but ultimately NASA is a publisher

  • Interested in maximum exposure of and accessibility to its “unrestricted, unlimited” contents

  • In the NASA DLs, we left our “dark matter” partially exposed

    • individual reports were spidered by robots anyway…

    • OAI provides a more formal interface & protocol for exposing contents


UNC: Why is OAI Important?

  • goal is to grow Open Video into a TREC-like corpus for video segments to share with the research community

    • a standard collection of short (10 seconds – 1 hour) video segments on which to perform content-based video retrieval

    • variability in video types: color/b&w, sound/no sound, high/low motion, etc.

    • currently in MPEG-1

      • other formats in the future


OAI Implementation

  • Protocol only specifies CGI stub

    • many implementations possible

  • I used a “bucket” for each: LTRS, NACA & Open Video

    • buckets are aggregative, computational entities normally used for data storage

      • generally, 1 bucket per “report”

    • buckets = metadata + data + methods


OAI Bucket Structure

  • Bucket

    • index.cgi

    • default bucket packages

      • _method.pkg: source files for methods

      • _http.pkg: http dependency files

      • _log.pkg: logs

      • _tc.pkg: terms and conditions

      • _md.pkg: metadata

      • _state.pkg: bucket state

    • oai: the bucket payload is a DL-specific support library

      • the oai.pl element is a support library that defines access for the specific DL

  • In addition to the ~30 bucket methods, each OAI verb is implemented as a separate method
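
The one-method-per-verb design on this slide might be sketched as follows. This is only an illustration, written in Python rather than as bucket methods; the handler names, their placeholder return strings, and the `dispatch` helper are hypothetical, and only the verb names come from the OAI protocol.

```python
from urllib.parse import parse_qs

# Hypothetical handlers: one function per OAI verb. The bodies are
# placeholders, not real OAI responses.
def identify(args):
    return "<Identify/>"

def list_identifiers(args):
    return "<ListIdentifiers/>"

def get_record(args):
    return "<GetRecord>%s</GetRecord>" % args.get("identifier", "")

# Table mapping each supported verb to its own handler.
VERBS = {
    "Identify": identify,
    "ListIdentifiers": list_identifiers,
    "GetRecord": get_record,
}

def dispatch(query_string):
    """Route a CGI query string to the handler for its verb."""
    args = {key: values[0] for key, values in parse_qs(query_string).items()}
    handler = VERBS.get(args.pop("verb", None))
    if handler is None:
        return 400, "unknown or missing verb"
    return 200, handler(args)

print(dispatch("verb=GetRecord&identifier=naca-tn-1"))
```

Keeping the verb table separate from the handlers means a new verb is one new function plus one table entry, which is roughly the benefit of making each verb its own method.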


NACA OAI Implementation

  • Normal WWW use goes directly to the NACA file system

  • OAI requests go to the OAI server (index.cgi)

    • OAI responses are built from examining the structure of the NACA filesystem

  • NACA file system layout

    • one directory per year: 1917, 1918, . . . 1958

      • one directory per report, e.g. naca-tn-1

        • refer metadata

        • thumbnail GIFs

        • full size GIFs

  • LTRS, NACA and Open Video have different file structures, metadata formats, etc.
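
Building a response by examining the filesystem could look roughly like the sketch below. It assumes a year/report directory layout like the one in the diagram; the function name `identifiers_from_tree`, the demo tree, and the placement of reports under particular years are all invented for illustration.

```python
import os
import tempfile

def identifiers_from_tree(root):
    """Collect report directory names found under numeric year directories."""
    identifiers = []
    for year in sorted(os.listdir(root)):
        year_dir = os.path.join(root, year)
        # Skip anything that is not a year-named directory.
        if not (year.isdigit() and os.path.isdir(year_dir)):
            continue
        for report in sorted(os.listdir(year_dir)):
            if os.path.isdir(os.path.join(year_dir, report)):
                identifiers.append(report)
    return identifiers

# Throwaway demo tree; which report sits under which year is made up.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "1917", "example-report"))
    os.makedirs(os.path.join(demo_root, "1958", "naca-tn-1"))
    print(identifiers_from_tree(demo_root))
```

Because the list is derived from the directories on each request, there is no separate index to keep in sync with the collection, at the cost of a directory walk per harvest.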


Implementation

  • Did not implement sets

    • possible set candidates:

      • NACA: years, report type

      • LTRS: NASA STI subject classification

  • Only supporting Dublin Core

    • DC not sufficient for targeted applications

  • Did not implement resumptionToken
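
Since only Dublin Core is exposed while the NACA reports carry refer metadata, some mapping step is implied. The sketch below is one possible shape for it, not the mapping actually used: the three-field table, the `record` wrapper element, and the sample input are assumptions, while `%A`, `%T` and `%D` are the standard refer tags for author, title and date.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Assumed mapping from refer tags to Dublin Core element names.
REFER_TO_DC = {"%A": "creator", "%T": "title", "%D": "date"}

def refer_to_dc(refer_text):
    """Convert mapped refer lines into Dublin Core elements; drop the rest."""
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    for line in refer_text.splitlines():
        tag, _, value = line.partition(" ")
        name = REFER_TO_DC.get(tag)
        if name and value.strip():
            ET.SubElement(record, "{%s}%s" % (DC_NS, name)).text = value.strip()
    return ET.tostring(record, encoding="unicode")

# Invented sample record, for illustration only.
print(refer_to_dc("%A Doe, J.\n%T An Example Report\n%D 1930"))
```

Fields with no Dublin Core counterpart in the table are silently dropped here, which is one concrete way a single metadata format can fall short for targeted applications.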


302 Load Balancing

  • harvester requests http://blah/oai/?verb=ListIdentifiers from the OAI server at naca.larc.nasa.gov/oai/

  • if load > 0.05, the server redirects the request with HTTP Status Code 302

  • harvester reissues http://blah/oai/?verb=ListIdentifiers to the OAI server at buckets.dsi.internet2.edu/naca/oai/

  • that server returns the response:

    <?xml version="1.0" encoding="UTF-8"?>
    <ListIdentifiers>
    </ListIdentifiers>

  • Interactive users on main DL machine should not be impacted by metadata harvesting

    • don’t take deliveries through the front door
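
The redirect rule on this slide (if load > 0.05, answer with a 302 pointing at the mirror) can be sketched as below. The threshold and mirror host are taken from the slide; the function names, the returned header dictionaries, and the use of the 1-minute load average are assumptions, since the slides do not say how load was measured.

```python
import os

THRESHOLD = 0.05  # load threshold shown on the slide
MIRROR = "http://buckets.dsi.internet2.edu/naca/oai/"  # mirror shown on the slide

def route(query_string, load):
    """Return (status, headers) for an OAI request given the current load."""
    if load > THRESHOLD:
        # Busy: send the harvester to the mirror with the same query.
        return 302, {"Location": MIRROR + "?" + query_string}
    # Idle enough: answer locally (response body omitted in this sketch).
    return 200, {"Content-Type": "text/xml"}

def current_load():
    """1-minute load average; os.getloadavg is available on Unix only."""
    return os.getloadavg()[0]

print(route("verb=ListIdentifiers", 0.50))
print(route("verb=ListIdentifiers", 0.01))
```

Passing the load in as an argument keeps the decision testable on its own; a CGI script would call `route(query, current_load())`.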


Metadata Quality

  • XML is very brittle: 1 bad character in the metadata and an entire ListIdentifiers message can be damaged

    • yes, my DLs should be more diligent about scrubbing their metadata, but…

    • author-contributed metadata is particularly a problem (e.g. control characters from copy-and-paste)

    • one advantage of resumptionToken is that it compartmentalizes bad data
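
One way to scrub metadata before it goes into a response is sketched below. It strips characters that XML 1.0 does not allow (most control characters) and then escapes markup characters; the function name `scrub` and the choice to drop offending characters rather than replace them are assumptions, not the approach used in these DLs.

```python
import re
from xml.sax.saxutils import escape

# Matches anything outside the XML 1.0 "Char" ranges:
# tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_ILLEGAL_XML = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def scrub(text):
    """Drop XML-illegal characters, then escape &, < and >."""
    return escape(_ILLEGAL_XML.sub("", text))

# A vertical tab and a NUL, as might arrive via copy-and-paste.
print(scrub("Wind\x0btunnel & tests\x00"))
```

Applying this per field at response time protects the whole message even when the stored metadata has never been cleaned.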


OAI Impact

  • Can use OAI to build our own generalized services

    • updates, alerts

  • Finally have a clean method to export metadata, both to:

    • the general community for unrestricted data

    • closed communities with restricted data

      • Los Alamos, Air Force Research Laboratory, NASA
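
An update or alert service could be driven by date-limited harvests. The sketch below only builds such a request URL; the base URL is the NACA one listed earlier, while the helper name, the `oai_dc` metadata prefix, and the date value are assumptions for illustration.

```python
from urllib.parse import urlencode

def harvest_url(base_url, since, metadata_prefix="oai_dc"):
    """Build a ListRecords request for records changed since a given date."""
    params = {
        "verb": "ListRecords",
        "from": since,  # date in YYYY-MM-DD form
        "metadataPrefix": metadata_prefix,
    }
    return base_url + "?" + urlencode(params)

print(harvest_url("http://naca.larc.nasa.gov/oai/", "2001-01-01"))
```

A service would store the date of its last successful harvest and pass it as `since` on the next run, alerting users to whatever comes back.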