
OAI Implementation Notes for LTRS, NACA and Open Video

Michael L. Nelson

NASA Langley Research Center &

University of North Carolina

[email protected]

http://www.ils.unc.edu/~mln/

OAI Open Meeting, Washington DC, January 23, 2001


Collections Represented

  • NASA

    • LTRS (Langley Technical Report Server)

      • ~2300 reports, begun in 1992

      • http://techreports.larc.nasa.gov/ltrs/

        • OAI: http://techreports.larc.nasa.gov/ltrs/oai/

    • NACA (National Advisory Committee for Aeronautics)

      • NACA was the predecessor organization to NASA, operating from 1917-1958

      • ~6300 reports, begun in 1996

      • http://naca.larc.nasa.gov/

        • OAI: http://naca.larc.nasa.gov/oai/


  • University of North Carolina

    • The Open Video Project

      • ~ 200 public domain video segments, project begun in 1998

      • http://www.open-video.org/

        • OAI: http://buckets.dsi.internet2.edu/openvideo/oai/

      • Open Video contents and OAI services still strictly experimental


NASA: Why is OAI Important?

  • NASA builds DLs out of necessity, but ultimately NASA is a publisher

  • Interested in maximum exposure of and accessibility to its “unrestricted, unlimited” contents

  • In the NASA DLs, we left our “dark matter” partially exposed

    • individual reports were spidered by robots anyway…

    • OAI provides a more formal interface & protocol for exposing contents


UNC: Why is OAI Important?

  • goal is to grow Open Video into a TREC-like corpus for video segments to share with the research community

    • a standard collection of short (10 seconds – 1 hour) video segments on which to perform content-based video retrieval

    • variability in video types: color/b&w, sound/no sound, high/low motion, etc.

    • currently in MPEG-1

      • other formats in the future


OAI Implementation

  • Protocol only specifies CGI stub

    • many implementations possible

  • I used a “bucket” for each: LTRS, NACA & Open Video

    • buckets are aggregative, computational entities normally used for data storage

      • generally, 1 bucket per “report”

    • buckets = metadata + data + methods


OAI Bucket Structure

  • Bucket

    • index.cgi

    • default bucket packages

      • _method.pkg: source files for methods

      • _http.pkg: http dependency files

      • _log.pkg: logs

      • _tc.pkg: terms and conditions

      • _md.pkg: metadata

      • _state.pkg: bucket state

    • oai: bucket payload is a DL-specific support library

      • oai.pl element is a support library that defines access for the specific DL

  • in addition to the ~30 bucket methods, each OAI verb is implemented as a separate method
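
The "one method per OAI verb" idea can be sketched as a small dispatch table. This is only an illustration in Python, not the bucket code itself: the handler names, their placeholder bodies, the second verb, and the 400 status for an unknown verb are all assumptions made for the example.

```python
from urllib.parse import parse_qs

# Hypothetical handlers, one per OAI verb. The bodies are placeholders
# and do not reflect what the real bucket methods return.
def list_identifiers(args):
    return "<ListIdentifiers></ListIdentifiers>"

def identify(args):
    return "<Identify></Identify>"

# Verb name -> method, so adding a verb means adding one entry.
VERB_METHODS = {
    "ListIdentifiers": list_identifiers,
    "Identify": identify,
}

def dispatch(query_string):
    """Route an OAI request to the method registered for its verb."""
    params = parse_qs(query_string)
    verb = params.get("verb", [""])[0]
    method = VERB_METHODS.get(verb)
    if method is None:
        # Assumed error handling for this sketch only.
        return 400, "unknown or missing verb"
    # Pass the remaining single-valued parameters to the handler.
    args = {k: v[0] for k, v in params.items() if k != "verb"}
    return 200, method(args)
```

For example, `dispatch("verb=ListIdentifiers")` calls `list_identifiers` and returns its placeholder body with status 200.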


NACA OAI Implementation

  • normal WWW use and OAI requests (through the OAI Server) both reach the NACA file system

  • OAI responses built from examining structure of NACA filesystem

  • NACA file system: 1917, 1918, . . ., 1958

    • report entries such as naca-tn-1

      • refer metadata

      • thumbnail GIFs

      • full size GIFs

      • index.cgi

  • LTRS, NACA, Open Video have different file structures, metadata formats, etc.
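
Building a response by examining the file system could look roughly like the walk below. It is a sketch under assumptions, not the NACA server's code: it assumes a layout of four-digit year directories that each contain one directory per report, and the function name `list_report_ids` is hypothetical.

```python
import os

def list_report_ids(root):
    """Collect report identifiers from an assumed root/<year>/<report>/ layout."""
    ids = []
    for year in sorted(os.listdir(root)):
        year_dir = os.path.join(root, year)
        # Treat only four-digit directory names as year directories.
        if not (len(year) == 4 and year.isdigit() and os.path.isdir(year_dir)):
            continue
        for report in sorted(os.listdir(year_dir)):
            # Each report is assumed to be a directory named after its identifier.
            if os.path.isdir(os.path.join(year_dir, report)):
                ids.append(report)
    return ids
```

Because the identifiers come straight from directory names, a separate index is not needed for this sketch; the trade-off is that each call rereads the directory tree.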


Implementation

  • Did not implement sets

    • possible set candidates:

      • NACA: years, report type

      • LTRS: NASA STI subject classification

  • Only supporting Dublin Core

    • DC not sufficient for targeted applications

  • Did not implement resumptionToken


302 Load Balancing

  • harvester sends http://blah/oai/?verb=ListIdentifiers to the OAI Server at naca.larc.nasa.gov/oai/

  • if load > 0.05, the server redirects the request with HTTP Status Code 302

  • harvester repeats http://blah/oai/?verb=ListIdentifiers against the OAI Server at buckets.dsi.internet2.edu/naca/oai/

    • response: <?xml version="1.0" encoding="UTF-8"?> <ListIdentifiers> </ListIdentifiers>

  • Interactive users on main DL machine should not be impacted by metadata harvesting

    • don’t take deliveries through the front door
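
The redirect decision can be sketched in a few lines. The threshold and the mirror address come from the slide; the function name `route_request`, the way load is passed in, and the headers returned are assumptions for illustration only.

```python
LOAD_THRESHOLD = 0.05  # threshold shown on the slide
MIRROR_BASE = "http://buckets.dsi.internet2.edu/naca/oai/"  # mirror shown on the slide

def route_request(query_string, load):
    """Return (status, headers): redirect to the mirror when load is high."""
    if load > LOAD_THRESHOLD:
        # Keep the harvester's query so the mirror answers the same request.
        location = MIRROR_BASE + ("?" + query_string if query_string else "")
        return 302, {"Location": location}
    # Otherwise the main server would answer the request itself.
    return 200, {"Content-Type": "text/xml"}
```

On a Unix host, the one-minute load average from `os.getloadavg()[0]` is one possible value to pass as `load`. A harvester that follows HTTP redirects needs no changes to benefit from this.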


Metadata Quality

  • XML is very brittle – one bad character in the metadata can damage an entire ListIdentifiers message

    • yes, my DLs should be more diligent about scrubbing their metadata, but…

    • author-contributed metadata is particularly a problem (e.g. control characters from copy-and-paste)

    • one advantage of resumptionToken is that it compartmentalizes bad data
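
A minimal sketch of the kind of scrubbing mentioned above, assuming the metadata is already decoded text: it drops characters that XML 1.0 does not allow (which covers most control characters) and escapes markup characters. The function name `scrub` is hypothetical, and this is not the cleanup the DLs actually used.

```python
import re
from xml.sax.saxutils import escape

# Anything outside the character ranges permitted by XML 1.0.
_INVALID_XML = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def scrub(text):
    """Remove characters illegal in XML 1.0, then escape &, < and >."""
    return escape(_INVALID_XML.sub("", text))
```

For instance, `scrub("Wind\x07 tunnel")` drops the stray bell character, while tabs and newlines pass through unchanged.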


OAI Impact

  • Can use OAI to build our own generalized services

    • updates, alerts

  • Finally have a clean method to export metadata, both to:

    • the general community for unrestricted data

    • closed communities with restricted data

      • Los Alamos, Air Force Research Laboratory, NASA

