
The Role of Libraries in the Context of e-Science

Dr Anne E Trefethen

Oxford e-Research Centre

Anne.trefethen@ierc.ox.ac.uk

A Definition of e-Science

‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’

John Taylor

Director General of Research Councils

Office of Science and Technology, 2001

UK e-Science Programme

Director’s Awareness and Co-ordination Role

Director’s Management Role

Generic Challenges: EPSRC (£15m) £16.2m, DTI (£15m)

Pilot Application Programme
  • PPARC (£26m) £31.6m
  • BBSRC (£8m) £10.0m
  • MRC (£8m) £13.1m
  • NERC (£7m) £8.0m
  • ESRC (£3m) £10.6m
  • EPSRC (£17m) £18.0m
  • CLRC (£5m) £5.0m

Research Councils (£74m), £96.3m

DTI (£5m)

Collaborative projects

Industrial Collaboration
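
As a quick cross-check of the slide's arithmetic, the per-council figures listed above can be summed and compared with the two Research Councils totals (£74m and £96.3m). This is a minimal sketch: the dictionary and variable names are illustrative, and the slide does not say what the parenthesised and unparenthesised amounts stand for, so they are simply labelled "first" and "second" here.

```python
# Per-council figures as listed on the slide, in £m:
# (amount in parentheses, amount outside parentheses).
# The names below are illustrative; the slide does not label the two columns.
councils = {
    "PPARC": (26.0, 31.6),
    "BBSRC": (8.0, 10.0),
    "MRC": (8.0, 13.1),
    "NERC": (7.0, 8.0),
    "ESRC": (3.0, 10.6),
    "EPSRC": (17.0, 18.0),
    "CLRC": (5.0, 5.0),
}

# Sum each column; round the second to one decimal place to avoid
# floating-point noise in the comparison.
first_total = sum(first for first, _ in councils.values())
second_total = round(sum(second for _, second in councils.values()), 1)

print(first_total)   # 74.0
print(second_total)  # 96.3
```

Both sums agree with the Research Councils line on the slide.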

e-Science Goals
  • to enable new forms of science that are
    • distributed
    • collaborative
    • multi-disciplinary
    • information-intensive
    • data-intensive
  • to use information technology to
    • leverage data as a form of science capital
    • to manage the “data deluge”
    • improve access to scientific information
Powering the Virtual Universe
http://www.astrogrid.ac.uk
(Edinburgh, Belfast, Cambridge, Leicester, London, Manchester, RAL)

AstroGrid slides courtesy of Nick Walton, Cambridge

Multi-wavelength showing the jet in M87: from top to bottom – Chandra X-ray, HST optical, Gemini mid-IR, VLA radio. AstroGrid will provide advanced, Grid based, federation and data mining tools to facilitate better and faster scientific output.

Picture credits: “NASA / Chandra X-ray Observatory / Herman Marshall (MIT)”, “NASA/HST/Eric Perlman (UMBC)”, “Gemini Observatory/OSCIR”, “VLA/NSF/Eric Perlman (UMBC)/Fang Zhou, Biretta (STScI)/F Owen (NRAO)”

National Centre for Text Mining

Gamma Ray Bursts

SWIFT satellite observes gamma ray burst

Image from ESO

D. Ducros, ESA

Image + IRIS data

  • Interaction with observatory pipelines
  • Localise GRB alert in minutes, as bursts fade rapidly
  • Collate data from multiple telescopes over months: metadata issues
  • Large computational photometric redshift calculations on multi-λ data > gives distance
  • Cross-reference multi-λ data to identify the precursor and/or environment
  • Compare against SN light curves: a bump shows evidence for a SN in the GRB (Price et al, 2002)
  • Reprocessing of ionospheric STP data: change coordinates from Earth to celestial


Dark Matter + Large Scale Structure

X-ray cluster: Chandra X-ray (Mullis) overlaid on a deep BRI image (Clowe & Luppino).

Image from ESO

  • Multi-TB ΛCDM models, e.g. Millennium Sim
  • Automatic cluster finding techniques
  • Multiple large image sources: registration & association
  • Generate shear maps, cf. CDM models > DM distribution with redshift
  • Remove stars, correlate galaxies with z
  • Source ID from multiplexed spectral data
  • Colour-colour relationships: classification in multi-phase space


Some facts on Astronomy data
  • Virtual observatories
    • Many national virtual observatories containing data at different wavelengths. Estimates:
      • The US NVO project alone will store 500 Terabytes/year
      • The Laser Interferometer Gravitational-Wave Observatory (LIGO) generates 250 Terabytes/year
      • VISTA (Visible and Infrared Survey Telescope for Astronomy) is estimated to generate 250 Gigabytes of raw data/night; 10 Terabytes of stored data/year
  • Data analysis needs to be combined with previously published knowledge about the same astronomical time/space events
myGrid: Directly Supporting the e-Scientist

myGrid slides courtesy of Carole Goble

Partners
  • Manchester, EBI, Southampton, Nottingham, Newcastle, Sheffield
  • AstraZeneca
  • GlaxoSmithKline
  • Merck KGaA
  • Epistemics Ltd
  • GeneticXchange
  • Network Inference
  • IBM
  • SUN Microsystems

http://mygrid.man.ac.uk


(courtesy of Carole Goble, Manchester)

myGrid Project
  • Imminent ‘deluge’ of genomics data
  • Highly heterogeneous
  • Highly complex and inter-related
  • Convergence of data and literature archives

An in silico experiment = a web of interconnected information and components
  • People
  • Provenance record of workflow runs
  • Literature
  • Notes
  • Data in and out
  • Services used
  • Provenance of the workflow template. Related workflows.
  • Ontologies describing workflows

(courtesy of Carole Goble, Manchester)
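
One way to picture the "web of interconnected information and components" listed above is as a small linked record. The sketch below is purely illustrative and is not myGrid's data model: every class, field and value name is an assumption chosen to mirror the labels on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowRun:
    """One execution of a workflow (hypothetical structure)."""
    run_id: str
    services_used: list[str]
    data_in: list[str]
    data_out: list[str]


@dataclass
class InSilicoExperiment:
    """Links the components named on the slide (hypothetical structure)."""
    title: str
    people: list[str] = field(default_factory=list)
    literature: list[str] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)
    related_workflows: list[str] = field(default_factory=list)
    runs: list[WorkflowRun] = field(default_factory=list)

    def provenance_of(self, item: str) -> list[str]:
        """Return the ids of the runs that produced the given data item."""
        return [run.run_id for run in self.runs if item in run.data_out]


# Made-up example values, for illustration only.
experiment = InSilicoExperiment(
    title="Example sequence analysis",
    people=["A. Researcher"],
    literature=["example-article-id"],
    notes=["First attempt with default parameters"],
    runs=[
        WorkflowRun(
            run_id="run-1",
            services_used=["alignment-service"],
            data_in=["sequences.fasta"],
            data_out=["alignment.txt"],
        )
    ],
)

print(experiment.provenance_of("alignment.txt"))  # ['run-1']
```

Keeping the run records next to the people, notes and literature is what allows a result to be traced back to the services and inputs that produced it.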

The eBank Project
  • Building links between e-research data, from the CombeChem project, with scholarly communication and other on-line sources
  • Investigating the role of aggregator services in linking data-sets from Grid enabled projects to open data archives contained in digital repositories through to peer-reviewed articles as resources in portals
  • JISC-funded project led by UKOLN in partnership with the Universities of Southampton and Manchester

(eBank slides courtesy of Liz Lyon and Jeremy Frey)

Comb-e-Chem Project

[Diagram labels: Video; Simulation; Properties; Analysis; Structures Database; Diffractometer; X-Ray e-Lab; Properties e-Lab; Grid Middleware]

(eBank slides courtesy of Liz Lyon and Jeremy Frey)

Goals of e-Bank Project
  • Provide a self-archive of results plus the raw and analysed data
  • Links from traditionally published work provide the provenance of the work
  • Disseminate for “Public Review”: raw data is provided so that users can check it themselves
  • Avoid the “publication bottleneck” but still provide the quality check

(eBank slides courtesy of Liz Lyon and Jeremy Frey)


Crystallographic e-Prints

  • Direct Access to Raw Data from scientific papers

Raw data sets can be very large; they are stored at the National Datastore using an SRB server

(eBank slides courtesy of Liz Lyon and Jeremy Frey)

e-Bank: Some Comments
  • Data as well as traditional bibliographic information is made available
  • Can construct high level search on data
    • aggregate data from many e-print systems
  • Build new data services
  • Will extend to provision of real spectra - rather than very reduced summaries - for chemistry publications

(eBank slides courtesy of Liz Lyon and Jeremy Frey)
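
The idea of aggregating data from many e-print systems and then searching across it can be sketched in a few lines. This is a toy, in-memory illustration under stated assumptions, not the eBank implementation: the `Record` fields, the repository names and the example.org URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A metadata record pointing at a data set (hypothetical fields)."""
    repository: str
    title: str
    dataset_url: str
    keywords: frozenset


def aggregate(*repositories):
    """Merge the records of several repositories into one list."""
    return [record for repo in repositories for record in repo]


def search(records, keyword):
    """Return the records tagged with the keyword (case-insensitive)."""
    wanted = keyword.lower()
    return [record for record in records if wanted in record.keywords]


# Two made-up e-print systems, each exposing a list of records.
repo_a = [
    Record("repo-a", "Crystal structure report 1",
           "https://example.org/a/1", frozenset({"crystal", "x-ray"})),
]
repo_b = [
    Record("repo-b", "Spectra for compound X",
           "https://example.org/b/7", frozenset({"spectra"})),
    Record("repo-b", "Crystal structure report 2",
           "https://example.org/b/9", frozenset({"crystal"})),
]

all_records = aggregate(repo_a, repo_b)
hits = search(all_records, "Crystal")
print([record.title for record in hits])
# ['Crystal structure report 1', 'Crystal structure report 2']
```

A real aggregator would harvest records over the network and index them; the point here is only that a single search runs over data drawn from several sources.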

Current E-Science Focus: Experimentation

Virtual collaborations for large-scale experimentation & analysis

[Diagram labels: E-Scientists; Grid (collaboration; storage & processing; data & metadata); E-Experimentation]

(eBank slides courtesy of Liz Lyon)

1. Experimentation & Analysis Cycle

[Diagram labels: E-Scientists; Grid; E-Experimentation]

(eBank slides courtesy of Liz Lyon)

2. Publication & Preservation Cycle

[Diagram labels: E-Scientists; E-Experimentation; Grid; Data, Metadata & Ontologies; Certified Experimental Results & Analyses; Institutional Archive; Local Web; Preprints & Metadata; Technical Reports; Publisher Holdings; Peer-Reviewed Journal & Conference Papers; Reprints]

(eBank slides courtesy of Liz Lyon)

3. Research Cycle: access & impact

[Diagram labels: E-Scientists; E-Experimentation; Grid; Data, Metadata & Ontologies; Certified Experimental Results & Analyses; Institutional Archive; Local Web; Preprints & Metadata; Technical Reports; Publisher Holdings; Peer-Reviewed Journal & Conference Papers; Reprints; Digital Library]

(eBank slides courtesy of Liz Lyon)

4. Learning Cycle: training and developing tomorrow’s e-scientists

[Diagram labels: E-Scientists; E-Experimentation; Grid; Data, Metadata & Ontologies; Certified Experimental Results & Analyses; Institutional Archive; Local Web; Preprints & Metadata; Technical Reports; Publisher Holdings; Peer-Reviewed Journal & Conference Papers; Reprints; Digital Library; Virtual Learning Environment; Undergraduate Students; Graduate Students]

(eBank slides courtesy of Liz Lyon)

5. Entire E-Science Cycle: encompassing experimentation, analysis, publication, research, learning

[Diagram labels: E-Scientists; E-Experimentation; Grid; Data, Metadata & Ontologies; Certified Experimental Results & Analyses; Institutional Archive; Local Web; Preprints & Metadata; Technical Reports; Publisher Holdings; Peer-Reviewed Journal & Conference Papers; Reprints; Digital Library; Virtual Learning Environment; Undergraduate Students; Graduate Students]

(eBank slides courtesy of Liz Lyon)

Role of publications in science
  • Product of research
  • Cumulative, historical record of science
  • Input to research
  • Value chain: Network of documents linked via citations

(courtesy of Christine Borgman)

Publication changes
  • Changes much broader than just the libraries
  • Nature of publishing
  • Cycle of authoring, publication, access

Drivers

  • Technology
  • Economics
  • Social and Legal
Data Publishing

Databases, notably in biology, are replacing (paper) publications as a medium of communication

    • Built and maintained with a great deal of human effort
    • Often do not contain source experimental data, sometimes just annotation/metadata
    • Borrow extensively from, and refer to, other databases
    • Researchers are now judged by databases as well as (paper) publications
    • Upwards of 1000 public databases in genetics
  • Integration of literature and data analysis is of increasing importance: linking bio-databases to literature, using publishers to check, complete or complement the contents of such databases
Digital Curation?
  • ‘In next 5 years e-Science projects will produce more scientific data than has been collected in the whole of human history’ (Tony Hey)
  • In 20 years we can guarantee that the operating system, the spreadsheet program and the hardware used to store the data will no longer exist
    • Research curation technologies and best practice
    • Need to liaise closely with individual research communities, data archives and libraries
Generic Issues
  • Data Deluge from e-Science projects requires technologies to facilitate discovery, analysis, curation of data
  • The sheer volume of published text and newly appearing results is impossible for researchers to read and correlate: text mining
  • Effective automated processing is required to research, locate, gather and make use of knowledge encoded electronically in the available literature
What data deserve to be permanently accessible?
  • What are the scientific criteria for preservation?
  • What is the equivalent of peer review for data?
  • Whose data do you trust?
  • What data will be re-used?
  • How much to invest?
  • Who will add the value?
Digital Curation Centre
  • Actions needed to maintain and utilise digital data and research results over entire life-cycle
    • For current and future generations of users
  • Digital Preservation
    • Long-run technological/legal accessibility and usability
  • Data curation in science
    • Maintenance of body of trusted data to represent current state of knowledge
  • Research in tools and technologies
    • Integration, annotation, provenance, metadata, security…..

(www.dcc.ac.uk)

The hybrid library

‘The dominant user view of a library is of a physical space. But libraries are services which provide organised access, to the intellectual record, wherever it resides, whether in physical places or scattered digital information spaces. The ‘hybrid’ library of the future will be a managed combination of physical and virtual collections and information resources.’

Reg Carr, Oxford University

Conclusions
  • Publication of data and “paper” are becoming integrated in the digital scholarly research cycle
  • Libraries will move further towards the “hybrid” model: institutional repositories
  • e-Science brings with it the data deluge, and with it a need for data management and curation skills
  • e-Scientists also need library training in discovery and access
  • Open Access has been touched on only implicitly, but as policies begin to apply to data as well as to published research outputs, the points above will apply even more strongly.
Acknowledgements

With special thanks to Tony Hey, Carole Goble, Reg Carr, Jeremy Frey, Liz Lyon, Chris Borgman and Nick Walton