Institutional Repositories & Discipline Based Repositories. Pauline Simpson National Oceanography Centre, Southampton GRADE Kick Off Meeting 28 Sep 2005. Outline. Geospatial data = Institutional and Subject Repositories Repository choices Data Centres Possible solutions.


Institutional Repositories & Discipline Based Repositories

Pauline Simpson

National Oceanography Centre, Southampton

GRADE Kick Off Meeting

28 Sep 2005

Outline

  • Geospatial data =

  • Institutional and Subject Repositories

  • Repository choices

  • Data Centres

  • Possible solutions

Geospatial Data

  • Scope within GRADE

    • Numerical data, raw and analyzed

    • Information Products

      • Publications

      • CD Roms, DVD

      • Learning Objects

Repositories are spreading because …

  • Supplementary to traditional publication

  • Do not affect current research publication processes

  • Give easy access

  • Give rapid access

  • Give long-term access

  • Increase readership and use of material

  • They offer advantages to institutions

  • They offer advantages to research funders

  • They offer new ways for information to be linked and used

Subject/Discipline Based Repositories

Subject repositories often managed by an individual for a group

  • Relies on peer interaction – no mandate

  • Individual agreements have to be struck

  • No definitive boundaries

  • Quality control issues

  • Sustainability issues

  • Transitory – collection at risk

  • Responsibility for preservation

  • Issues over the return on the money and effort invested

    ? A trusted repository? Supported by ….


Subject repositories are archives which collect and manage material relating to one or more related subject areas. A number currently exist mainly within science subjects.

Significant subject repositories include many using e-Prints or DSpace software:

  • arXiv - (physics, mathematics, non-linear science and computer science)

  • Cogprints - (Cognitive sciences including psychology, neuroscience, linguistics and other related areas)

  • CiteSeer - (computer science)

  • HTP Prints - (History and theory of psychology)

  • PubMedCentral - (US National Library of Medicine's digital archive of life sciences journal literature)

  • PhilSci Archive - (philosophy of science)

  • E-LIS - (library and information science)

  • RePEc (Research Papers in Economics)

Institutional Repositories

Freely accessible web-based databases providing access to the full text of scholarly material produced by members of an institution.

Digital collections that capture and preserve the intellectual output of the communities.

What are the essential elements?

  • Institutionally defined: Content - generated by the community

  • Scholarly content: published articles, books, book sections, preprints and working papers, conference papers, enduring teaching materials, student theses, data-sets, etc.

  • Cumulative & perpetual: preserve ongoing access to material

  • Interoperable & open access: free, online, global, utilising standards: OAI, Dublin Core etc.
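
As an illustration of the interoperability point, the sketch below builds an OAI-PMH harvesting request and a small Dublin Core description in Python. The `ListRecords` verb and the `oai_dc` metadata prefix are defined by OAI-PMH. The base URL, the `oai_request` helper and the sample record values are assumptions made up for this example and do not come from the presentation.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint; a real repository publishes its own base URL.
BASE_URL = "https://repository.example.org/oai"


def oai_request(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Ask for all records, described in unqualified Dublin Core.
list_records_url = oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# A minimal Dublin Core description; every value here is invented.
dc_record = {
    "title": "Example cruise report",
    "creator": "A. Researcher",
    "date": "2005-09-28",
    "type": "Text",
    "identifier": "oai:repository.example.org:1234",
}

print(list_records_url)
```

Because every compliant repository answers the same verbs, a harvester only needs the base URL to collect records from a new source.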

Institutional Repositories

Institutions are logical implementers of repositories because they can take responsibility for:

  • Centralising a distributed activity

  • Framework and infrastructure

  • Permanence that can sustain changes

  • Stewardship of digital assets

  • Preservation policy for long-term access

  • Providing a central digital showcase for the research, teaching and scholarship of the institution

“a trusted repository” supported by the Information Community

Institutional Repository Software for geo data

  • OSI Directory of Institutional Repository Software V.3

  • E-Prints (GNU). Open-source OAI-compliant software developed at University of Southampton to enable anyone to set up their own Open Archives-compliant institutional archive. Originally programmed for subject repositories but now re-engineered for IR. Does not identify treatment of datasets, though can cover bibliographic description.

  • DSpace: Durable Digital Depository. Open-source software developed at MIT for their own repository; released as open source software in Nov. 2002.

    Overtly identifies datasets. Offers opportunity to explore the issues surrounding the incorporation of different metadata standards within one system…. Different disciplines have adopted different sets of metadata standards to accommodate their particular data needs. Two examples are the CSDGM standard for geospatial data and the DICOM standard for digital imaging in medicine. … develop more general standards, such as Dublin Core, which proposes a basic set of common elements that can be used across many different disciplines and document types.

    (DC and MARC are norms)
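
One way to picture the problem of holding several metadata standards in one system is a crosswalk from a discipline schema to Dublin Core. The Python sketch below is illustrative only: the flat record layout and the mapping table are simplifying assumptions (real CSDGM records are nested), and `to_dublin_core` is a hypothetical helper, not part of DSpace or E-Prints.

```python
# Simplified, assumed mapping from a few CSDGM-style element names
# to Dublin Core elements. Not a complete or official crosswalk.
CSDGM_TO_DC = {
    "origin": "creator",
    "pubdate": "date",
    "title": "title",
    "abstract": "description",
    "themekey": "subject",
}


def to_dublin_core(record):
    """Split a flat record into mapped DC fields and unmapped leftovers."""
    dublin_core = {}
    unmapped = {}
    for field, value in record.items():
        target = CSDGM_TO_DC.get(field)
        if target is None:
            unmapped[field] = value  # discipline detail with no DC home
        else:
            dublin_core[target] = value
    return dublin_core, unmapped


# Invented sample record, including a bounding coordinate
# that the mapping table above does not cover.
geo_record = {
    "origin": "Example Survey Team",
    "pubdate": "2005",
    "title": "Example bathymetry grid",
    "westbc": "-10.5",
}

dc_fields, left_over = to_dublin_core(geo_record)
print(dc_fields)
print(left_over)
```

The leftover fields show the trade-off: a general standard supports discovery across disciplines, while the discipline-specific detail has to be kept somewhere else.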

Need to register to search

Information products

Repository Choices

  • Subject - arXiv, Cogprints, RePEc

  • Institutional – Southampton, Glasgow, Nottingham (SHERPA), MBA UK

  • National - DARE (all universities in the Netherlands), Scotland, British Library (proposal)

  • National / Subject - ODINPubAfrica

  • International - Internet Archive ‘Universal’, OAIster

  • Regional - White Rose UK

  • Consortia - SHERPA-LEAP (London E-prints Access Project)

  • Funding Agency – NIH (PubMed), Wellcome Trust (UK PubMed), NERC

  • Project - Public Knowledge Project EPrint Archive

  • Conference - 11th Joint Symposium on Neural Computation, May 15 2004

  • Personal – peer to peer, web pages etc

  • Media Type - VCILT Learning Objects Repository, NDLTD (Theses)

  • Publisher – journal archives

  • Data Repositories/Archives - NODC, BODC, DOD, JODC, BADC etc

    • Science, particularly Environmental Science, is well served

    • Logical host for numeric datasets

Data Centres / Archives / Repositories

  • Within organisational infrastructures but not defined by them

  • National responsibilities

  • Subject and Technical Specialists, quality control of content

  • Secure storage and migration policies

  • Well developed Metadata schema & Standards

    • DIF – Directory Interchange Format, FGDC etc

    • ISO 19115

      • the minimum set of metadata required to serve the full range of metadata applications (data discovery, determining data fitness for use, data access, data transfer, and use of digital data);

      • optional metadata elements - to allow for a more extensive standard description of geographic data, if required;

      • a method for extending metadata to fit specialized needs.

      • Though ISO 19115:2003 is applicable to digital data, its principles can be extended to many other forms of geographic data such as maps, charts, and textual documents as well as non-geographic data.

        “a trusted repository” supported by the Data Management Community
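
The three-part structure described for ISO 19115 (a required minimum, optional elements, and a way to extend) can be sketched as a simple record check. The element names below are illustrative placeholders chosen for this example and are not the standard's actual element list. `check_record` is a hypothetical helper.

```python
# Placeholder element names, assumed for illustration only.
REQUIRED = {"title", "reference_date", "abstract"}
OPTIONAL = {"spatial_resolution", "lineage"}


def check_record(record):
    """Return (missing required elements, extension elements) for a record."""
    present = set(record)
    missing = sorted(REQUIRED - present)
    extensions = sorted(present - REQUIRED - OPTIONAL)
    return missing, extensions


# Invented sample: one required element is absent and one
# community-specific element has been added as an extension.
sample = {
    "title": "Example tide gauge series",
    "abstract": "Hourly sea level readings.",
    "lineage": "Digitised from paper charts.",
    "instrument_type": "float gauge",
}

missing, extensions = check_record(sample)
print("missing:", missing)
print("extensions:", extensions)
```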


ARCHIMEDE: A Canadian software solution for institutional repositories. OAI-compliant software developed by Laval University Library. Archimede has been developed from a multilingual perspective, with internationalization as a focus. The text (or content) of the interface is independent of the code rather than embedded in it, making it relatively easy to develop an interface in a specific language without having to work on the code itself. English, French and Spanish interfaces are already offered in Archimede. This feature also allows users to switch easily from language to language at any point during search and retrieval.

Berkeley Electronic Press. Commercial OAI-compliant software used by the University of California’s eScholarship Repository.

CERN Document Server Software (CDSware). OAI-compliant software developed by, maintained by, and used at, the CERN Document Server.

Project Tapir: Tapir provides additional functionality to the digital asset management software DSpace, primarily designed for Electronic Theses and Dissertations supervision, submission and dissemination. See Queen's University Project.

Fedora™ Project: An Open-Source Digital Repository Management System. Jointly developed by the University of Virginia and Cornell University, Fedora is a general-purpose digital object repository system that can be used in whole or part to support a variety of use cases including: institutional repositories, digital libraries, content management, digital asset management, scholarly publishing, and digital preservation.

Greenstone. Suite of open-source multilingual software for building and distributing digital library collections. Produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed (since 2000) in cooperation with UNESCO and the Human Info NGO. Presently in limited use at New Zealand Digital Library Project and some other sites.

OCLC Research Software. A list of open source software developed by the Online Computer Library Center (OCLC) to build a repository and harvest data according to OAI-PMH standards.

FIGARO, i-TOR, etc

Dilemma for Researcher

  • Mandates from major funding agencies now require grantees to deposit research output in a ‘designated repository’ or in ‘any’ repository

    • Wellcome Trust (UK PubMed) - £400 million producing 3500 papers per year

    • RCUK

  • Where should the full text of their research be deposited?

  • Researcher wants to enter metadata and deposit only once and perhaps deposit all related material in one place?

  • Situation at present

    • Harvesting, but harvester is not the choice of the depositor

    • Duplicate keying metadata into repositories of choice

    • Cannot target multiple repositories with one exercise

  • Does it matter where it is deposited, since Google Scholar, Yahoo and Scopus will pick it up wherever it is?

Repositories taking over the world?

  • Turf War

    • Not between Institutional and Subject Repositories – complementary and should coexist

    • Possibly between Text based and Numeric based repositories

      • Repositories of whatever flavour v. Data Centres

        • Are both spilling over into each other's territory?

  • The Cavalry : JISC Digital Repositories Programme

    • Strand: Linking Text and Data

Diagram (from: Lyon, CNI - JISC - SURF Conference, May 2005). Labels:

  • Repositories: institutional, e-prints, subject, data, learning objects

  • Deposit / self-archiving

  • Research & e-Science workflows

  • Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media

  • Data analysis, transformation, mining, modelling

  • Learning & Teaching workflows

  • Learning object creation, re-use

  • Peer-reviewed publications: journals, conference proceedings

  • Quality assurance bodies

  • Aggregator services

  • Searching, harvesting, embedding

  • Resource discovery, linking, embedding

  • Presentation services: subject, media-specific, data, commercial portals

  • Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules

CLADDIER Project ** (Citation, Location And Deposition in Discipline and Institutional Repositories)

  • The CLADDIER system will be a step on the road to a situation where (in this case, environmental) scientists will be able to move seamlessly from information discovery (location), through acquisition, to deposition of new material, with all the digital objects correctly identified and cited. The lessons learned will be applicable to the relationships between other discipline based repositories and institutional repositories.

    **JISC Digital Repositories Programme 2005 -

  • Persistent identifiers

  • semantically transparent

  • Versioning

  • Dataset Citations

  • Publishing practice

  • Automated Linking both ways

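
The combination of persistent identifiers, versioning and automated linking both ways can be sketched as a small two-way index. This is an assumption-laden illustration, not the CLADDIER design: the `CitationIndex` class, its method names and the identifier strings (including the `/v2` version suffix) are all hypothetical.

```python
from collections import defaultdict


class CitationIndex:
    """Keep publication-to-dataset links navigable in both directions."""

    def __init__(self):
        self._cites = defaultdict(set)      # publication id -> dataset ids
        self._cited_by = defaultdict(set)   # dataset id -> publication ids

    def add_citation(self, publication_id, dataset_id):
        # Record the link once; both lookups are updated together.
        self._cites[publication_id].add(dataset_id)
        self._cited_by[dataset_id].add(publication_id)

    def datasets_cited_by(self, publication_id):
        return sorted(self._cites.get(publication_id, set()))

    def publications_citing(self, dataset_id):
        return sorted(self._cited_by.get(dataset_id, set()))


index = CitationIndex()
# Invented identifiers; the version is part of the dataset identifier,
# so a citation points at one specific version of the data.
index.add_citation("pub:example-paper-1", "data:example-set/v1")
index.add_citation("pub:example-paper-2", "data:example-set/v2")
index.add_citation("pub:example-paper-2", "data:other-set/v1")

print(index.datasets_cited_by("pub:example-paper-2"))
print(index.publications_citing("data:example-set/v1"))
```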

Where to Deposit

  • One outcome of CLADDIER Project

  • ‘pull’ = Harvesting

  • ‘push’ = CLADDIER outcome

    • Enable researcher to deposit in one repository and choose to upload (push) the metadata to another repository of choice.

    • Logical to ‘push’ from IR to Subject?

    • Redundancy of records?
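
The difference between ‘pull’ and ‘push’ can be shown with a toy in-memory model. Everything here is a hypothetical sketch: the `Repository` class and its methods are invented for illustration and do not describe any CLADDIER software or a real repository API.

```python
class Repository:
    """Toy repository holding metadata records keyed by identifier."""

    def __init__(self, name):
        self.name = name
        self.records = {}

    def deposit(self, identifier, metadata):
        self.records[identifier] = dict(metadata)

    def harvest_from(self, source):
        """Pull: the harvester copies every record; the depositor has no say."""
        for identifier, metadata in source.records.items():
            self.records.setdefault(identifier, dict(metadata))

    def push_to(self, target, identifier):
        """Push: the depositor sends one chosen record to a chosen target."""
        target.deposit(identifier, self.records[identifier])


institutional = Repository("institutional")
subject = Repository("subject")
harvester = Repository("harvester")

# The researcher keys the metadata once, in the institutional repository.
institutional.deposit("id-1", {"title": "Example paper"})
institutional.deposit("id-2", {"title": "Example dataset description"})

# Push: only the record the researcher selects goes to the subject repository.
institutional.push_to(subject, "id-1")

# Pull: the harvester takes everything it finds.
harvester.harvest_from(institutional)

print(sorted(subject.records))
print(sorted(harvester.records))
```

Record `id-1` now exists in three places, which is the redundancy question raised above.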

Thank You

Pauline Simpson

Data Centres

  • Discovery metadata - What data sets hold the sort of data I am interested in? This enables organisations to know and publicise what data holdings they have.

  • Exploration metadata - Do the identified data sets contain sufficient information to enable a sensible analysis to be made for my purposes? This is documentation to be provided with the data to ensure that others use the data correctly and wisely.

  • Exploitation metadata - What is the process of obtaining and using the data that are required? This helps end users and provider organisations to effectively store, reuse, maintain and archive their data holdings.