Overview of LOCKSS


Session Learning Objectives

  • Provide an overview of the LOCKSS architecture.

  • Describe the LOCKSS polling process.

  • Describe how LOCKSS private networks differ.

  • Provide a vocabulary of technical terms used frequently with LOCKSS networks.


Architectural Components

  • Provider Sites (digital collections)

  • LOCKSS nodes (aka “peers”)

  • Plugins / Plugin Repository

  • Cache Manager

  • Title Database / Conspectus Database


Provider Sites

  • Prepare a digital collection so that it is web accessible to the preservation nodes

  • Expose a “manifest” web page for each collection, according to LOCKSS specifications.

    • Grants permission for LOCKSS to crawl

    • Gives starting point for crawl

  • Provide enough information to create a LOCKSS plugin for the collection (or else create the plugin themselves and deposit it with the LOCKSS network)


LOCKSS Peer Nodes

  • Data caches for harvested content

  • Caches organized into archival units (AUs)

  • Nodes can select which AUs to crawl and preserve

  • At least 6 copies of an AU must exist in the network for the polling process to work properly


Plugins / Plugin Repository

  • Tell LOCKSS where, how and how often to crawl a provider site for AUs

  • Plugins are Java-based

  • Distinct from core LOCKSS software


Cache Manager

  • Distributed separately from LOCKSS

  • Can remotely inspect and manage the caches on the various peer nodes


Title / Conspectus Databases

  • Title database on each node describes and manages which AUs to preserve on that node

  • The Conspectus Database, designed for the MetaArchive Project, provides more extensive metadata about the preserved digital collections and feeds the Title database with entries


[Diagram: a Plugin Repository; two provider web sites, Digital Collection 1 (DC1) and Digital Collection 2 (DC2), each exposing a manifest page and divided into archival units (AU 1, AU 2, AU 3; content labels include "Source Code" and "SQL Dump"); and Private LOCKSS Network Nodes 1 through 9, each holding copies of DC1, DC2, or both.]

The Polling Process


Polling Process Resulting in “Landslide Loss” and AU Repair

  • Node 5 calls a poll on AU 1 of Digital Collection 2 (DC2-AU1)

  • Node 5 sends an Invitation to some recently encountered peers, asking them to vote (each node maintains a reference list of recently encountered peers). Those invited are the “inner circle” for this opinion poll

  • An invited inner circle node that responds affirmatively with a PollChallenge message is allowed to participate in the poll

  • Node 5 cryptographically derives a poll effort proof and sends it (PollProof) in answer to each affirmative voter’s challenge

  • Node 9 nominates Nodes 7 and 8, so Node 5 discovers new peers through the nomination process. Nominated Nodes 7 and 8 belong to the “outer circle” and can be invited to subsequent voting rounds by Node 5

  • Invited nodes create a fresh SHA1 digest of the AU and vote with it

  • There is a “landslide” of valid, disagreeing votes against Node 5’s SHA1 digest of DC2-AU1 (the diagram shows three valid votes that disagree and one that agrees)

  • Since agreeing votes are below the threshold, Node 5 picks a random disagreeing voter from the inner circle and sends it an encrypted RepairRequest message; the repair is then made

  • Once the repair is completed, Node 5 immediately calls a new poll, which effectively verifies, or invalidates and corrects, the repair
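The digest comparison at the heart of the poll can be sketched in a few lines of Python. This is a simplified illustration, not the LOCKSS implementation: the function names `au_digest` and `tally`, the example URLs, and the assignment of votes to particular nodes are all assumptions made for the sketch, and the real protocol adds challenges, effort proofs, and encrypted messages that are omitted here.

```python
import hashlib
from typing import Dict, List, Tuple


def au_digest(files: Dict[str, bytes]) -> str:
    """Return a SHA-1 hex digest over an AU's files, visited in sorted URL order.

    Hypothetical layout: the AU is modelled as a mapping of URL -> content.
    """
    h = hashlib.sha1()
    for url in sorted(files):
        h.update(url.encode("utf-8"))
        h.update(b"\0")  # separator so URL and content cannot run together
        h.update(files[url])
    return h.hexdigest()


def tally(poller_digest: str, votes: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split voters into those whose digest agrees with the poller's and those whose digest does not."""
    agree = [peer for peer, digest in votes.items() if digest == poller_digest]
    disagree = [peer for peer, digest in votes.items() if digest != poller_digest]
    return agree, disagree


# Illustrative data only: a good copy and a damaged copy of DC2-AU1.
good_copy = {
    "http://example.org/dc2/au1/a.html": b"alpha",
    "http://example.org/dc2/au1/b.html": b"beta",
}
damaged_copy = dict(good_copy)
damaged_copy["http://example.org/dc2/au1/b.html"] = b"b3ta"

# Assumed vote assignment: three voters hold the good copy, one holds the damaged one.
votes = {
    "node1": au_digest(good_copy),
    "node2": au_digest(good_copy),
    "node4": au_digest(good_copy),
    "node9": au_digest(damaged_copy),
}
agree, disagree = tally(au_digest(damaged_copy), votes)
```

In this sketch the poller (Node 5) holds the damaged copy, so most voters land in `disagree`, which is the situation that leads to a repair request.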


Polling Refresh Timer

  • A peer sets a refresh timer for a given AU to determine the interval between successive polls

  • System parameter R is the mean for the possible random values generated for the refresh timer
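A minimal sketch of drawing a refresh interval whose mean is R. The slides do not state which distribution is used, so the uniform range around R, the `spread` parameter, the 90-day value of R, and the function name `next_poll_delay` are assumptions chosen only to illustrate the idea of a randomized interval with a fixed mean.

```python
import random


def next_poll_delay(mean_r: float, rng: random.Random, spread: float = 0.5) -> float:
    """Draw a delay uniformly from [R*(1-spread), R*(1+spread)], so its mean is R.

    The uniform distribution and the spread value are assumptions of this sketch.
    """
    if not 0 <= spread < 1:
        raise ValueError("spread must be in [0, 1)")
    return rng.uniform(mean_r * (1 - spread), mean_r * (1 + spread))


# Hypothetical value: R = 90 days between successive polls on an AU.
rng = random.Random(42)
delay_days = next_poll_delay(90.0, rng)
```

Randomizing the interval means a peer's polls on a given AU do not fall on a fixed, predictable schedule.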


System Parameter – ‘Quorum’

  • Q = # of valid inner circle votes required to conclude a poll successfully

  • Q = 6 is the thoroughly tested value in use

  • If votes < Q, poller invites additional peers, or else aborts the opinion poll
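The quorum rule above can be sketched as a small decision function. The function name, the returned labels, and the `peers_left_to_invite` parameter are hypothetical; only Q = 6 and the invite-or-abort behaviour come from the slide.

```python
def quorum_step(valid_votes: int, quorum: int = 6, peers_left_to_invite: int = 0) -> str:
    """Decide what the poller does next, given the valid inner circle votes so far."""
    if valid_votes >= quorum:
        return "tally"        # enough valid votes to conclude the poll
    if peers_left_to_invite > 0:
        return "invite_more"  # below quorum: invite additional peers
    return "abort"            # below quorum and nobody left to invite
```

For example, `quorum_step(4, peers_left_to_invite=3)` returns `"invite_more"`, while `quorum_step(4)` returns `"abort"`.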


Polling Outcome – ‘Landslide Win’

  • The poller considers its current copy to have integrity

  • This is the only scenario in which an opinion poll concludes successfully

  • The poller updates its reference list and then waits until the next polling period (determined by the refresh timer)


Reference List Update

  • Happens only after a successful poll

  • Poller removes the inner circle peers who had valid votes in the last opinion poll

  • Culls peers it has not been able to contact for some time

  • Adds outer circle peers whose votes were valid and eventually agreeing
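The three update rules can be sketched with plain set operations. This is an illustration under assumptions: peers are modelled as strings, and the function and parameter names are hypothetical rather than taken from the LOCKSS code.

```python
from typing import Set


def update_reference_list(
    reference_list: Set[str],
    inner_circle_valid_voters: Set[str],
    unreachable: Set[str],
    outer_circle_agreeing: Set[str],
) -> Set[str]:
    """Apply the post-poll update: remove, cull, then add."""
    updated = set(reference_list)
    updated -= inner_circle_valid_voters  # remove inner circle peers with valid votes
    updated -= unreachable                # cull peers not contacted for some time
    updated |= outer_circle_agreeing      # add outer circle peers that eventually agreed
    return updated


# Illustrative data only.
updated = update_reference_list(
    reference_list={"node1", "node2", "node3", "node4", "node6", "node9"},
    inner_circle_valid_voters={"node1", "node2", "node4", "node9"},
    unreachable={"node3"},
    outer_circle_agreeing={"node7", "node8"},
)
```

Removing the peers that just voted and adding newly nominated ones keeps the list rotating, so successive polls tend to draw on different voters.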


Polling Outcome – ‘Inconclusive’

  • D = max allowed “minority” votes

  • If agreeing votes > D and agreeing votes < (total valid votes - D), then the poll is inconclusive and raises an alarm

  • Human intervention needed to determine if nodes have been compromised

  • A peer that votes in agreement with a known bad copy is blacklisted if that peer cannot be identified or will not cooperate
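The three polling outcomes can be sketched as one classification function. The inconclusive condition is taken from the slide; treating the remaining cases as a landslide win (at most D disagreeing votes) or a landslide loss (at most D agreeing votes) is this sketch's reading of the outcomes described earlier. The function name, the value D = 1 in the usage note, and the guard requiring more than 2 × D valid votes are assumptions added so the cases cannot overlap.

```python
def classify_poll(agreeing: int, total_valid: int, d: int) -> str:
    """Classify a poll from the count of agreeing votes and the minority bound D."""
    if not 0 <= agreeing <= total_valid:
        raise ValueError("agreeing must be between 0 and total_valid")
    if total_valid <= 2 * d:
        # Assumption of this sketch: without this guard, win and loss could overlap.
        raise ValueError("need more than 2 * D valid votes")
    if d < agreeing < total_valid - d:
        return "inconclusive"    # raises an alarm; human intervention needed
    if agreeing >= total_valid - d:
        return "landslide_win"   # at most D votes disagree
    return "landslide_loss"      # at most D votes agree; repair follows
```

With 6 valid votes and a hypothetical D = 1, five agreeing votes give `"landslide_win"`, one agreeing vote gives `"landslide_loss"`, and three agreeing votes give `"inconclusive"`.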


Further Details on Polling Process

  • Petros Maniatis, Mema Roussopoulos, TJ Giuli, David S. H. Rosenthal, Mary Baker, and Yanto Muliadi, "LOCKSS: A Peer-to-Peer Digital Preservation System", ACM Transactions on Computer Systems (TOCS). http://www.eecs.harvard.edu/~mema/publications/TOCS2005.pdf

  • See also LOCKSS related publications at http://www.lockss.org/lockss/Publications


The LOCKSS Private Network Difference

  • More flexible (not appliance based)

    • Can run on any operating system that supports Java

      • The LOCKSS Team maintains RPM packages for Linux installations

    • Peer Node administrators have greater discretion configuring access, customizing functionality, e.g. altering system parameters


The LOCKSS Private Network Difference (cont.)

  • Can extend LOCKSS core functionality with supplemental tools and methods to fit new use cases

  • E.g. the MetaArchive Conspectus database


Vocabulary

  • (Please refer to the workshop binder for terminology and definitions)


Overview of LCAP version 3

