1. Let’s talk! – Organization schemes and metrics for Preservation planning and management negotiation in Web archiving institutions
Gildas Illien
Head of Digital legal deposit
Bibliothèque nationale de France, Paris.
Gildas.illien@bnf.fr
2. Changes in role distribution between library and IT staff
BEFORE (prints)
Selection
e.g. acquisitions, legal deposit, dealing with publishers
Description
e.g. cataloguing
Conservation
e.g. stacks management
Communication
in the reading rooms…
TODAY (digital)
Selection
Domain harvesting, dealing with hosts and webmasters
Description
Large scale indexing
Conservation
Backups, checksums, ingests, conversions…
Communication
Performance, designing new research tools, using online services…
3. Confusion.
4. Trouble.
The librarian: « I want my collections. I want quality. I want security. I don’t know how it works, but this is what I want, and IT doesn’t do it right. »
The engineer: « I really don’t know why they want all that stuff, but it’s really costing me a lot of storage, computing and stress. »
The manager: « Those guys can’t talk to each other. How can I make them work together? How can I explain what they’re actually doing to my boss so that we get funding? Uhh… but what are they actually doing? »
5. Case study from BnF: backing up the web archive in the repository
6. Web archiving and preservation communities: a new distribution of roles?
7. Enablers
Organization
describe processes
clarify roles and tasks
design appropriate tools
Concepts, terminology
e.g. « a seed »
e.g. « a job »
Metrics
Addressing the needs of all stakeholders
designed for business relationships
designed to assess value and costs
8. Metrics aren’t easy: challenges of Web archiving measurement
Scalability: the Web is big.
Internationalization: the Web is global.
9. The challenges of Web archiving measurement
Virtuality and multiplicity of document types: the Web is intangible and diverse.
Twilight zones: the Web is for everybody and everything.
Web document structure: the Web is a puzzle.
10. Sites and seeds: Metrics for collection development
Seeds & orders
Collection targets = orders for IT
Seeds = a URL + settings
scope and priority of orders
Jobs & collections
Orders & seeds grouped into bigger packages
A job is a list or a "queue" of seeds to be harvested together as a consistent work package
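The seed and job vocabulary above can be sketched as a small data structure. This is a minimal illustration only, not BnF's actual data model: the field names `scope` and `priority`, their values, and the job name are assumptions chosen for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Seed:
    """A seed as defined on the slide: a URL plus harvest settings."""
    url: str
    # Hypothetical settings; the slide only mentions "settings",
    # "scope" and "priority" without defining their values.
    scope: str = "domain"
    priority: int = 0


@dataclass
class Job:
    """A job: a queue of seeds harvested together as one work package."""
    name: str
    seeds: deque = field(default_factory=deque)

    def add(self, seed: Seed) -> None:
        """Append a seed to the end of the harvest queue."""
        self.seeds.append(seed)

    def seed_count(self) -> int:
        """Number of seeds in the job, a basic collection-development metric."""
        return len(self.seeds)


job = Job("example-job")  # hypothetical job name
job.add(Seed("http://www.bnf.fr/", scope="domain", priority=1))
job.add(Seed("http://www.desordre.net/", scope="host"))
print(job.seed_count())
```

Counting seeds per job gives librarians and IT a shared unit for sizing an order before the harvest starts.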
11. Files and bytes: Metrics for collection processing
Number of files or URIs
Size (TB/GB)
Counting and sorting by MIME types
(W)ARC files
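As a rough illustration of these processing metrics, the sketch below counts files and bytes per MIME type. The `records` list is invented sample data, not BnF figures; in practice the tuples would come from a crawl log or from (W)ARC record headers.

```python
from collections import Counter

# Hypothetical harvest records: (URI, MIME type, size in bytes).
records = [
    ("http://example.org/", "text/html", 12_000),
    ("http://example.org/logo.png", "image/png", 48_000),
    ("http://example.org/about", "text/html", 8_000),
]


def summarize(records):
    """Return (file count per MIME type, byte count per MIME type)."""
    files_by_mime = Counter()
    bytes_by_mime = Counter()
    for _uri, mime, size in records:
        files_by_mime[mime] += 1
        bytes_by_mime[mime] += size
    return files_by_mime, bytes_by_mime


files_by_mime, bytes_by_mime = summarize(records)
total_files = sum(files_by_mime.values())
total_bytes = sum(bytes_by_mime.values())

# Sort MIME types by number of files, most frequent first.
for mime, count in files_by_mime.most_common():
    print(mime, count, bytes_by_mime[mime])
print("total", total_files, total_bytes)
```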
12. Metrics for collection management
How can we demonstrate and measure the cost of web archiving?
Hardware & software
+ direct computing costs
+ labor costs
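The cost sum above can be turned into a unit cost, for example per terabyte archived. This is a sketch under stated assumptions: the function name and every figure below are placeholders for illustration, not BnF data.

```python
def archiving_cost(hardware_software, computing, labor, terabytes):
    """Sum the three cost components and derive a cost per terabyte."""
    total = hardware_software + computing + labor
    return total, total / terabytes


# Placeholder yearly figures in an arbitrary currency (assumed, not real data).
total, per_tb = archiving_cost(
    hardware_software=50_000,
    computing=20_000,
    labor=130_000,
    terabytes=100,
)
print(total, per_tb)
```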
How can we demonstrate and measure the value of web archives?
Scientific or heritage value
Scarcity and risk analysis
Usage
Demonstrate that harvesting is a valuable way to manage collections because it is an economical one
13. Core metrics
15. Conclusion: coming next, a new ISO work item
ISO TC46 SC8 has proposed that a new working group be formed to examine the collection of internet resources and quality issues. The group intends to write a technical report with the working title « Statistics and quality issues for web archiving ».
First meeting in December, Berlin.
16. Thank you – Q&A
gildas.illien@bnf.fr
Image credits:
http://www.flickr.com/groups/librariancards/
http://www.flickr.com/photos/library_of_congress/2179849046/
http://switchzoo.com
http://www.flickr.com/photos/generated/501445202/
http://www.flickr.com/photos/wordridden/284901102/
http://www.flickr.com/photos/serenejournal/2056094466/
http://www.desordre.net/