Teaching Geoinformatics: A Geoscience Perspective
Randy Keller
Professor and Edward Lamb McCollough Chair in Geophysics
School of Geology and Geophysics
University of Oklahoma
Geoinformatics - the vision
It is too hard to find and work with data that already exist.
It is too hard to acquire software and make it work.
We have too little access to modern IT tools that would accelerate progress.
The result is too little time for science!
The EarthScope Scientific Vision
To understand the structure (evolution) and deformation of the North American continent in four dimensions (x, y, z, t)
Cyberinfrastructure for the Geosciences: Why do we need it?
Future research opportunities in the geosciences will be significantly affected by both the availability and the utilization of information technology. Understanding the rock record that preserves ~4.5 billion years of history, Earth structure, and the processes at work is the key to answering scientific questions associated with studies of biodiversity, climate change, planetary processes, natural resources and hazards, and the 4-D architecture and evolution of continents. It has become evident that we can answer these complex questions only by integrating all the data we have at hand, and that this will require the application of modern IT tools.
What is Geoinformatics?
Geoinformatics is a science that develops and uses information science infrastructure to address the problems of the geosciences and related branches of engineering.
The three main tasks of geoinformatics are:
• development and management of databases of geodata
• analysis and modeling of geodata
• development and integration of computer tools and software for the first two tasks
Geoinformatics is related to geocomputation and to the development and use of geographic information systems or Spatial Decision Support Systems.
Applications
• An object-relational database (ORD) or object-relational database management system (ORDBMS)
• Object-relational mapping (or O/RM)
• Geostatistics
Geoinformatics Research & Education
Geoinformatics Research Group, School of Civil Engineering & Geosciences, Newcastle University, UK
Geoinformatics - Some key elements
• A strong partnership between domain experts (geoscientists) and computer scientists
• A shared goal of doing better (and more) science
• A desire to create products that the scientific community actually needs and will use (not what you think they need or should want)
• Always give credit to original sources of data, software, etc.
• A desire to preserve data, make it easily used and discovered, and create living databases
• A desire to create user-friendly and platform-independent software
• A desire to facilitate data integration
• A desire to create cyberinfrastructure breakthroughs (e.g., visualization, 3-D model building and editing, etc.)
• A desire to democratize the use of cutting-edge technology in geoscience research and education
A Scientific Effort Vector
[Figure: two effort vectors, each divided into Background Research, Data Collection and Compilation, Software Issues, and Science]
Science - Analysis, Modeling, Interpretation, Discovery
Some Definitions about Data
Data Set: A relatively raw compilation of data (standards, formats, completeness may be questionable)
Data Base: A mature data compilation that has been “cleaned”, standardized with input from the scientific community, and formatted for use by others (independent of proprietary software, e.g., ORACLE)
Data System: A linked and organized set of data bases including public domain software (not platform dependent), tutorials, workflows, and procedures to analyze the data
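The step from a raw data set to a cleaned, standardized data base can be illustrated with a small sketch. Everything here is an assumption made for illustration: the station records, the column names, and the cleaning rules (uppercase station IDs, elevations in meters, rows with missing values dropped, duplicates removed) are hypothetical, and a real community standard would be agreed on by the scientists who use the data. The output is plain CSV, which does not depend on proprietary software.

```python
import csv
import io

# Hypothetical raw "data set": mixed units, inconsistent IDs, a gap, a duplicate.
RAW = """station,lat,lon,elev,elev_unit
A1, 35.21,-97.44,1170,ft
a2,35.30,-97.50,360,m
A3,,-97.60,350,m
A1,35.21,-97.44,1170,ft
"""

FT_TO_M = 0.3048  # international foot to meter
FIELDS = ["station", "lat", "lon", "elev_m"]


def clean_records(raw_text):
    """Return standardized records: uppercase IDs, meters, no gaps, no duplicates."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        station = row["station"].strip().upper()
        try:
            lat = float(row["lat"])
            lon = float(row["lon"])
            elev = float(row["elev"])
        except ValueError:
            continue  # skip rows with missing or non-numeric values
        if row["elev_unit"].strip().lower() == "ft":
            elev *= FT_TO_M  # convert feet to meters
        if station in seen:
            continue  # keep only the first record for each station
        seen.add(station)
        cleaned.append(
            {"station": station, "lat": lat, "lon": lon, "elev_m": round(elev, 1)}
        )
    return cleaned


def to_csv(records):
    """Write the cleaned records as plain CSV text, an open format."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(clean_records(RAW)))
```

A data system, in the sense defined above, would go further and link several such data bases with documented workflows and tutorials.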
Data systems needed • A GEOINFORMATICS DATA SCHEME
Data systems needed (continued) • A GEOINFORMATICS DATA SCHEME
Data is only the beginning
[Figure: pyramid with levels Data, Information, Knowledge, and Decision Support; axes labeled Volume and Value]
Some considerations in setting up a class
The audience (obviously): what do they know coming in? (Geospatial skills, computer programming skills, general computer skills, mathematical background, geological background)
How formal will the structure be? (mix of lecture, lab, seminar style)
How mathematical do you want to be?
What is the mix of computer science and geoscience?
What is the relation to the “Computer Applications in the Geosciences” class?
I strongly recommend that a computer science colleague be involved to some degree and that there be some computer science students in the class.
Learning Environments: Collaboratory
Same time, same place: Face to face
Different time, same place: Library, drop-in lab
Same time, different place: Tele/video conference
Different time, different place: Email
(Center labels in the figure: Cyberinfrastructure, DATA)
The independent scientist is not a thing of the past, but more and more big advances are made through collaboration.
A class schedule (cont.)
Uncertainty, reliability, provenance, etc.
Class assignments
Read papers from the recent literature (<2004 is old)
Set up a modest personal website
Laboratory exercise on EXCEL
Laboratory exercise on GIS
Laboratory exercise on MATLAB
Laboratory exercise on using Google Earth quantitatively
Find an interesting piece of software on-line and demo it to the class
Create a modest web service
Term project to create a modest web portal
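As one possible illustration of using Google Earth quantitatively, students could read latitude and longitude pairs off the globe and compute the distance between them. The sketch below is an assumption about what such an exercise might include, not the actual lab: it uses the haversine formula on a spherical Earth with a mean radius of 6371.0088 km, and the function name `great_circle_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    a = min(1.0, a)  # guard against rounding slightly above 1
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # A quarter of the way around the equator
    print(great_circle_km(0.0, 0.0, 0.0, 90.0))
```

Comparing this result with the ruler tool in Google Earth is a natural way to open a discussion of the spherical approximation and of uncertainty in picked coordinates.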
Geoinformatics - Cambridge University Press
Geoinformatics: Cyberinfrastructure for the Solid Earth Sciences
Co-editors: G. Randy Keller, University of Oklahoma, USA
Chaitanya Baru, San Diego Supercomputer Center, University of California
I. INTRODUCTION
1. Introduction to Science Needs and Challenges
   G. Randy Keller, University of Oklahoma
2. Introduction to IT Concepts and Challenges
   Chaitanya Baru, University of California, San Diego
II. DATA COLLECTION AND MANAGEMENT
3. Framework for Managing LiDAR/Remote Sensing Data
   Ramon Arrowsmith and Christopher Crosby, Arizona State University
4. Algorithms for Gridding and Analysis of Remote Sensing Data
   S. B. Baden, Christopher Crosby, Ramon Arrowsmith, Arizona State University
5. Digital Field Data Collection
   John Oldow and Douglas Walker, University of Idaho and University of Kansas
6. Sensor Networks and Embedded Cyberinfrastructure for Sensor Networks
   Tony Fountain, Frank Vernon, Scripps Institution of Oceanography
III. MODELING SOFTWARE AND COMMUNITY CODES
7. Community Codes for Geodynamics
   Mike Gurnis and Walter Landry, Caltech
8. Community Codes for Earthquake Wave Propagation Research: The TeraShake Platform
   Philip Maechling, Yifeng Cui, Kim Olsen, David Okaya, Ewa Deelman, Amit Chourasia, Gaurang Mehta, Reagan Moore, and Thomas H. Jordan, Southern California Earthquake Center, University of Southern California
9. Parallelizing Finite Element Codes for Geodynamics
   Mian Liu, University of Missouri
10. Designing and Building a Grid-enabled Synthetic Seismogram Computational Resource
   Dogan Seber, Choonhan Youn, Tim Kaiser, Cindy Santini, University of California at San Diego
11. The PaleoAtlas for ArcGIS
   Chris Scotese, University of Texas at Arlington
IV. VISUALIZATION AND DATA REPRESENTATION
12. Visualization of Seismic Model Data
   Steve Cutchin and Amit Chourasia, UCSD
13. Integrated Visualization of 4D Data
   Charles Meertens, UNAVCO
14. Visualization and Fusion of Remote Sensing Data
   Eric Frost, San Diego State University
15. Database Development and Visualization for the Yellowstone National Park Region
   Robert B. Smith, Jaime Farrell, and Charles Meertens, University of Utah, UNAVCO
V. KNOWLEDGE MANAGEMENT AND DATA INTEGRATION
16. Data Integration for Paleo Studies: Why and How?
   Allister Rees, Chris Scotese, Ashraf Memon, John Alroy, University of Arizona, UCSD, University of California at Santa Barbara, University of Texas at Arlington
17. Creating a dynamic, calibrated geologic time-line using databases, Web applications, and services
   Cinzia Cervato and Peter Sadler, Iowa State University
18. Data Models and Tools for Geochemistry Databases
   Kerstin Lehnert, Doug Walker, Richard Carlson, Columbia University, University of Kansas, Carnegie Institution of Washington
19. Spatial and Process Ontologies of Subduction Zones
   Hassan Babaie, Georgia State University
20. GeoSciML - A GML application for geoscience information interchange
   Stephen M. Richard and CGI Interoperability working group, Arizona Geological Survey
21. Bottom-Up Ontologies and Recommendation Systems for Geoscience Applications
   Mark Gahegan, Pennsylvania State University
22. Knowledge Representation in Geology
   Krishna Sinha and Kai Lin, Virginia Tech University, University of California at San Diego
V. KNOWLEDGE MANAGEMENT AND DATA INTEGRATION
23. Web Services and Observation Data Catalogs for Uniform Hydrologic Data Access and Analysis
   I. Zaslavsky, D. Valentine, T. Whitenack, D. Maidment, University of California at San Diego, University of Texas at Austin
24. Web Services for Seismic Data Archives
   Tim Ahern and Linus Kamb, IRIS
25. Creating CI resources for gravity and magnetic data: Algorithms, Tools, and Web Services
   Leo Salayandia, Raed Aldouri, Ann Gates, Vladik Kreinovich, and G. Randy Keller, University of Texas at El Paso and University of Oklahoma
26. Use of Scientific Workflows in Geoscience
   Ilkay Altintas, Efrat Jaeger-Frank, Bertram Ludaescher, University of California at Davis, University of California at San Diego
27. Workflow-Driven Ontologies: A methodology to create scientific workflows from domain knowledge
   Leonardo Salayandia, Paulo Pinheiro da Silva, and Ann Q. Gates, UTEP
28. Science Portal for Research and Education in Geosciences
   Ashraf Memon, Sandeep Chandra, Choonhan Youn, UCSD
VII. Emerging International Efforts
29. The evolution of Earth Science data integration in the Federal Government of the US: Policy, Practice, and Informatics
   Linda Gunderson, U.S. Geological Survey
30. Geosciences Data in India
   K. V. Subbarao, Department of Earth Sciences, Indian Institute of Technology
31. Global Earth Observations Grid
   Satoshi Sekiguchi, Satoshi Tsuchida, and Ryosuke Nakamura, National Institute of Advanced Industrial Science and Technology (AIST), Japan
32. GEO-GRID – eScience for the Earth- and Environmental Science
   Jens Klump, GeoForschungsZentrum, Potsdam, Germany
Some thoughts about a Geoinformatics curriculum (B.S. in Geoscience with Computer Science Minor)
Mathematics background (Calculus, statistics, numerical analysis)
Computer Programming [which language(s)?]
GIS
Geophysics/Remote Sensing (Introductory classes)
Geology (at least a minor)
Database - Data Structures
Software Engineering (informal participation)
Computer Applications in the Geosciences
Skills needed: Data manipulation, web presence, uncertainty analysis, visualization/graphics, basic hardware handling
Some Thoughts About the Need for Cyberinfrastructure
• The geosciences are a strongly data-driven discipline, and large data sets are often developed by researchers and government agencies.
• The complexity of the fundamental scientific questions being addressed requires a variety of data and highly integrative and innovative approaches if we are to find solutions.
• Geoscientists have a tradition of sharing data, but being willing to share data if asked, or even maintaining an obscure website, accomplishes little. Also, as a community, we have no mechanisms to share the work that has been done when a third party cleans up, reorganizes, or embellishes an existing database.
• We waste a large amount of human capital in duplicative efforts and fall further behind by having no mechanism for existing databases to grow and evolve via community input.