
Institutional Repositories & Discipline Based Repositories

Institutional Repositories & Discipline Based Repositories. Pauline Simpson National Oceanography Centre, Southampton GRADE Kick Off Meeting 28 Sep 2005. Outline. Geospatial data = Institutional and Subject Repositories Repository choices Data Centres Possible solutions.


Presentation Transcript


  1. Institutional Repositories & Discipline Based Repositories Pauline Simpson National Oceanography Centre, Southampton GRADE Kick Off Meeting 28 Sep 2005

  2. Outline • Geospatial data = • Institutional and Subject Repositories • Repository choices • Data Centres • Possible solutions

  3. Geospatial Data • Scope within GRADE • Numerical data, raw and analyzed • Information Products • Publications • CD-ROMs, DVDs • Learning Objects

  4. Repositories are spreading because … • Supplementary to traditional publication • Do not affect current research publication processes • Give easy access • Give rapid access • Give long-term access • Increase readership and use of material • They offer advantages to institutions • They offer advantages to research funders • They offer new ways for information to be linked and used

  5. Subject/Discipline Based Repositories Subject repositories often managed by an individual for a group • Relies on peer interaction – no mandate • Individual agreements have to be struck • No definitive boundaries • Quality control issues • Sustainability issues • Transitory – collection at risk • Responsibility for preservation • Issues over the return on the money and effort invested ? A trusted repository? Supported by ….

  6. Subject repositories are archives which collect and manage material relating to one or more related subject areas. A number currently exist, mainly within science subjects. Significant subject repositories include many using e-Prints or DSpace software: • arXiv - http://xxx.arxiv.cornell.edu/ (physics, mathematics, non-linear science and computer science) • Cogprints - http://cogprints.ecs.soton.ac.uk/ (cognitive sciences including psychology, neuroscience, linguistics and other related areas) • CiteSeer - http://citeseer.nj.nec.com/cs (computer science) • HTP Prints - http://htpprints.yorku.ca/ (history and theory of psychology) • PubMedCentral - http://www.pubmedcentral.nih.gov/ (US National Library of Medicine's digital archive of life sciences journal literature) • PhilSci Archive - http://philsci-archive.pitt.edu/ (philosophy of science) • E-LIS - http://eprints.rclis.org/ (library and information science) • RePEc (Research Papers in Economics)

  7. Institutional Repositories Freely accessible web-based databases providing access to the full text of scholarly material produced by members of an institution. Digital collections that capture and preserve the intellectual output of the communities. What are the essential elements? • Institutionally defined: content generated by the community • Scholarly content: published articles, books, book sections, preprints and working papers, conference papers, enduring teaching materials, student theses, data-sets, etc. • Cumulative & perpetual: preserve ongoing access to material • Interoperable & open access: free, online, global, utilising standards: OAI, Dublin Core, etc.

  8. Institutional Repositories Institutions are logical implementers of repositories because they can take responsibility for: – Centralising a distributed activity – Framework and infrastructure – Permanence that can sustain changes – Stewardship of digital assets – Preservation policy for long-term access – Providing a central digital showcase for the research, teaching and scholarship of the institution “a trusted repository” supported by the Information Community

  9. Institutional Repository Software for geo data • OSI Directory of Institutional Repository Software V.3 http://www.soros.org/openaccess/software/ • E-Prints (GNU) [http://software.eprints.org/]. Open-source OAI-compliant software developed at University of Southampton to enable anyone to set up their own Open Archives-compliant institutional archive. Originally programmed for subject repositories but now re-engineered for IR. Does not identify treatment of datasets, though it can cover bibliographic description • DSpace: Durable Digital Depository [http://dspace.org/]. Open-source software developed at MIT for their own repository; released as open source software in Nov. 2002. Overtly identifies datasets. Offers opportunity to explore the issues surrounding the incorporation of different metadata standards within one system…. Different disciplines have adopted different sets of metadata standards to accommodate their particular data needs. Two examples are the CSDGM standard for geospatial data and the DICOM standard for digital imaging in medicine. … develop more general standards, such as Dublin Core, which proposes a basic set of common elements that can be used across many different disciplines and document types. (DC and MARC are norms)
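
The idea of a common element set shared across disciplines can be illustrated with a small crosswalk. This is only a sketch: the target names are real Dublin Core elements, but the source field names are simplified placeholders (not the actual CSDGM or DICOM element names), and the sample record is invented for illustration.

```python
# Illustrative crosswalk: discipline-specific field names -> Dublin Core elements.
# The source field names are simplified placeholders, NOT the real CSDGM or
# DICOM element names; only the Dublin Core targets are real element names.
CROSSWALKS = {
    "geospatial": {
        "dataset_title": "title",
        "originator": "creator",
        "publication_date": "date",
        "bounding_box": "coverage",
    },
    "medical_imaging": {
        "study_description": "title",
        "operator": "creator",
        "study_date": "date",
        "modality": "type",
    },
}


def to_dublin_core(record: dict, discipline: str) -> dict:
    """Map a discipline-specific record onto Dublin Core element names.

    Fields with no mapping are dropped: detail is lost in exchange for a
    description that can be searched across disciplines.
    """
    mapping = CROSSWALKS[discipline]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


# Invented sample record, for illustration only.
geo = {
    "dataset_title": "Multibeam bathymetry survey",
    "originator": "Example Survey Team",
    "publication_date": "2005-09-28",
    "bounding_box": "-10.0,48.0,-5.0,52.0",
    "datum": "WGS84",  # no Dublin Core equivalent in this sketch
}
print(to_dublin_core(geo, "geospatial"))
```

The dropped `datum` field shows the trade-off: a general standard supports discovery, while the discipline standard is still needed to judge fitness for use.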

  10. https://dspace.ucalgary.ca/handle/1880/33 need to register to search

  11. http://careo.ucalgary.ca/cgi-bin/WebObjects/CAREO.woa - information products

  12. Repository Choices • Subject - arXiv, Cogprints, RePEc • Institutional – Southampton, Glasgow, Nottingham (SHERPA), MBA UK • National - DARE (all universities in the Netherlands), Scotland, British Library (proposal) • National / Subject - ODINPubAfrica • International - Internet Archive ‘Universal’, OAIster • Regional - White Rose UK • Consortia - SHERPA-LEAP (London E-prints Access Project) • Funding Agency – NIH (PubMed), Wellcome Trust (UK PubMed), NERC • Project - Public Knowledge Project EPrint Archive • Conference - 11th Joint Symposium on Neural Computation, May 15 2004 • Personal – peer to peer, web pages etc. • Media Type - VCILT Learning Objects Repository, NTDL (Theses) • Publisher – journal archives • Data Repositories/Archives - NODC, BODC, DOD, JODC, BADC etc. • Science, particularly Environmental Science, is well served • Logical host for numeric datasets

  13. Data Centres / Archives / Repositories • Within organisational infrastructures but not defined by them • National responsibilities • Subject and technical specialists, quality control of content • Secure storage and migration policies • Well developed metadata schema & standards • DIF – Directory Interchange Format, FGDC etc. • ISO 19115 defines: • the minimum set of metadata required to serve the full range of metadata applications (data discovery, determining data fitness for use, data access, data transfer, and use of digital data); • optional metadata elements, to allow for a more extensive standard description of geographic data, if required; • a method for extending metadata to fit specialized needs. • Though ISO 19115:2003 is applicable to digital data, its principles can be extended to many other forms of geographic data such as maps, charts, and textual documents as well as non-geographic data. “a trusted repository” supported by the Data Management Community
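
The three-part structure described for ISO 19115 (a minimum core, optional elements, and an extension mechanism) can be sketched as a simple record type. The field names below are illustrative assumptions, not the element names defined in the standard, and the sample values are invented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GeoMetadata:
    # Mandatory core: the minimum needed for discovery and access.
    # Field names are illustrative, NOT the ISO 19115 element names.
    title: str
    reference_date: str
    abstract: str
    # Optional elements for a more extensive description.
    spatial_extent: Optional[str] = None
    lineage: Optional[str] = None
    # Extension point for specialized, community-specific needs.
    extensions: dict = field(default_factory=dict)

    def missing_core(self) -> list:
        """Return the names of mandatory fields that are empty."""
        core = ("title", "reference_date", "abstract")
        return [name for name in core if not getattr(self, name).strip()]


# Invented sample record; "cruise_id" is a hypothetical extension field.
record = GeoMetadata(
    title="Example cruise CTD profiles",
    reference_date="2005-09-28",
    abstract="",
    extensions={"cruise_id": "EX001"},
)
print(record.missing_core())
```

A data centre could reject or flag a deposit whenever `missing_core()` is non-empty, while leaving the optional and extension parts to the depositor's discretion.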

  14. • ARCHIMEDE: A Canadian software solution for institutional repositories [http://archimede.bibl.ulaval.ca/di/Welcome.do]. OAI-compliant software developed by Laval University Library. Archimede has been developed from a multilingual perspective, with internationalization as a focus. The interface text is independent of the code rather than embedded in it, making it relatively easy to develop an interface in a specific language without having to work on the code itself. English, French and Spanish interfaces are already offered in Archimede. That feature also allows the user to switch easily from language to language anywhere and anytime during the search and retrieval process. • Berkeley Electronic Press [http://www.bepress.com/repositories.html]. Commercial OAI-compliant software used by the University of California’s eScholarship Repository. • CERN Document Server Software (CDSware) [http://cdsware.cern.ch/]. OAI-compliant software developed by, maintained by, and used at the CERN Document Server. • Project Tapir [http://sourceforge.net/projects/tapir-eul]: Tapir provides additional functionality to the digital asset management software DSpace, primarily designed for Electronic Theses and Dissertations supervision, submission and dissemination. See Queen's University Project. • Fedora™ Project: An Open-Source Digital Repository Management System [http://www.fedora.info/]. Jointly developed by the University of Virginia and Cornell University, Fedora is a general-purpose digital object repository system that can be used in whole or part to support a variety of use cases including: institutional repositories, digital libraries, content management, digital asset management, scholarly publishing, and digital preservation. • Greenstone [http://www.greenstone.org/cgi-bin/library?a=p&p=home]. Suite of open-source multilingual software for building and distributing digital library collections. Produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed (since 2000) in cooperation with UNESCO and the Human Info NGO. Presently in limited use at New Zealand Digital Library Project and some other sites. • OCLC Research Software [http://www.oclc.org/research/software/default.htm]. A list of open source software developed by the Online Computer Library Center (OCLC) to build a repository and harvest data according to OAI-PMH standards. • FIGARO, i-TOR, etc.
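
Several of the packages above are described as OAI-compliant, which in practice means they answer OAI-PMH requests from harvesters. As a minimal sketch, a `ListRecords` request is an HTTP query with a `verb` and a `metadataPrefix` (`oai_dc` is the Dublin Core format every OAI-PMH repository must support), optionally narrowed by `set` and `from`. The endpoint URL below is a hypothetical placeholder, and the helper name `list_records_url` is an assumption for illustration.

```python
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc",
                     set_spec: str = "", from_date: str = "") -> str:
    """Build an OAI-PMH ListRecords request URL (no network access here)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # restrict to one set of the repository
    if from_date:
        params["from"] = from_date    # only records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, used only to show the shape of the request.
url = list_records_url("https://repository.example.org/oai", from_date="2005-01-01")
print(url)
```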

  15. Dilemma for Researcher • Mandates from major funding agencies now require grantees to deposit research output in a ‘designated repository’ or ‘any’ • Wellcome Trust (UK PubMed) - £400 million producing 3500 papers per year • RCUK • Where should the full text of their research be deposited? • Researcher wants to enter metadata and deposit only once and perhaps deposit all related material in one place? • Situation at present • Harvesting, but harvester is not the choice of the depositor • Duplicate keying of metadata into repositories of choice • Cannot target multiple repositories with one exercise • Does it matter where it is deposited, since Google Scholar, Yahoo, Scopus will pick it up wherever it is?

  16. Repositories taking over the world? • Turf War • Not between Institutional and Subject Repositories – complementary and should coexist • Possibly between text-based and numeric-based repositories • Repositories of whatever flavour v. Data Centres • Are both spilling over into each other's territory? • The Cavalry: JISC Digital Repositories Programme • Strand: Linking Text and Data

  17. [Diagram: repositories within research and learning workflows. From: Lyon, CNI - JISC - SURF Conference, May 2005. Labels: Repositories (institutional, e-prints, subject, data, learning objects); Research & e-Science workflows (data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media; data analysis, transformation, mining, modelling); Learning & Teaching workflows (learning object creation, re-use); Deposit / self-archiving; Validation; Publication; Peer-reviewed publications (journals, conference proceedings); Quality assurance bodies; Aggregator services (harvesting metadata); Presentation services (subject, media-specific, data, commercial portals); Institutional presentation services (portals, Learning Management Systems, u/g, p/g courses, modules); Searching, harvesting, embedding; Resource discovery, linking, embedding]

  18. CLADDIER Project ** (Citation, Location And Deposition in Discipline and Institutional Repositories) • The CLADDIER system will be a step on the road to a situation where (in this case, environmental) scientists will be able to move seamlessly from information discovery (location), through acquisition, to deposition of new material, with all the digital objects correctly identified and cited. The lessons learned will be applicable to the relationships between other discipline-based repositories and institutional repositories. ** JISC Digital Repositories Programme 2005 -

  19. Persistent identifiers • semantically transparent • Versioning • Dataset Citations • Publishing practice • Automated Linking both ways • citation png
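
"Automated linking both ways" between publications and versioned datasets can be pictured as a registry that records each citation in both directions. This is a hypothetical sketch, not the CLADDIER design: the class name, the identifier strings, and the `/v<version>` suffix convention are all assumptions made for illustration.

```python
class LinkRegistry:
    """Keeps publication <-> dataset citations navigable in both directions."""

    def __init__(self):
        self._datasets_for = {}       # publication id -> set of versioned dataset ids
        self._publications_for = {}   # versioned dataset id -> set of publication ids

    @staticmethod
    def _versioned(dataset_id: str, version: str) -> str:
        # Hypothetical convention: cite a specific version so the citation
        # keeps pointing at the same data after the dataset is revised.
        return f"{dataset_id}/v{version}"

    def link(self, publication_id: str, dataset_id: str, version: str) -> None:
        versioned = self._versioned(dataset_id, version)
        self._datasets_for.setdefault(publication_id, set()).add(versioned)
        self._publications_for.setdefault(versioned, set()).add(publication_id)

    def datasets_cited_by(self, publication_id: str) -> list:
        return sorted(self._datasets_for.get(publication_id, set()))

    def publications_citing(self, dataset_id: str, version: str) -> list:
        key = self._versioned(dataset_id, version)
        return sorted(self._publications_for.get(key, set()))


# Invented identifiers, for illustration only.
registry = LinkRegistry()
registry.link("pub:example-1", "data:example-ctd", "1")
registry.link("pub:example-2", "data:example-ctd", "1")
registry.link("pub:example-2", "data:example-ctd", "2")
print(registry.publications_citing("data:example-ctd", "1"))
print(registry.datasets_cited_by("pub:example-2"))
```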

  20. Where to Deposit • One outcome of CLADDIER Project • ‘pull’ = Harvesting • ‘push’ = CLADDIER outcome • Enable researcher to deposit in one repository and choose to upload (push) the metadata to another repository of choice. • Logical to ‘push’ from IR to Subject? • Redundancy of records?
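
The difference between ‘pull’ and ‘push’ can be sketched with two in-memory repositories. This is a toy model under stated assumptions, not the CLADDIER implementation: the `Repository` class and its methods are hypothetical, and a real push would travel over a deposit protocol rather than a method call.

```python
class Repository:
    """In-memory stand-in for a repository holding metadata records."""

    def __init__(self, name: str):
        self.name = name
        self.records = {}

    def deposit(self, record_id: str, metadata: dict) -> None:
        self.records[record_id] = dict(metadata)

    def harvest_from(self, source: "Repository") -> int:
        """Pull: the harvester decides what to copy; the depositor has no say."""
        new = {k: v for k, v in source.records.items() if k not in self.records}
        self.records.update(new)
        return len(new)

    def push_to(self, target: "Repository", record_id: str) -> bool:
        """Push: the depositor chooses the target; skip records already held."""
        if record_id in target.records:
            return False  # avoids creating a redundant record
        target.deposit(record_id, self.records[record_id])
        return True


# Deposit once in the institutional repository, then push to a subject one.
institutional = Repository("institutional")
subject = Repository("subject")
institutional.deposit("rec-1", {"title": "Example dataset description"})
first_push = institutional.push_to(subject, "rec-1")
second_push = institutional.push_to(subject, "rec-1")
print(first_push, second_push)
```

The second push is refused because the record is already held, which is one simple answer to the redundancy question; a real system would also need to decide which copy is authoritative when the metadata later changes.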

  21. Thank You Pauline Simpson ( ps@noc.soton.ac.uk )

  22. Data Centres • Discovery metadata - What data sets hold the sort of data I am interested in? This enables organisations to know and publicise what data holdings they have. • Exploration metadata - Do the identified data sets contain sufficient information to enable a sensible analysis to be made for my purposes? This is documentation to be provided with the data to ensure that others use the data correctly and wisely. • Exploitation metadata - What is the process of obtaining and using the data that are required? This helps end users and provider organisations to effectively store, reuse, maintain and archive their data holdings.
