1 / 1

The new Computational and Data Sciences (CDS) Undergraduate Program at George Mason University

The new Computational and Data Sciences (CDS) Undergraduate Program at George Mason University. Kirk D. Borne, John Wallin , Robert Weigel (GMU).

blake
Download Presentation

The new Computational and Data Sciences (CDS) Undergraduate Program at George Mason University

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. The new Computational and Data Sciences (CDS) Undergraduate Program at George Mason University Kirk D. Borne, John Wallin, Robert Weigel (GMU) We present the new undergraduate science program in Computational and Data Sciences (CDS) at George Mason University (GMU), which began offering courses for both major (B.S.) and minor degrees in Spring 2008. The overarching theme and goal of the program are to train the next-generation scientists in the tools and techniques of cyber-enabled science (e-Science).  New courses include Introduction to Computational & Data Sciences, Scientific Data & Databases, Scientific Data & Information Visualization, Scientific Data Mining, and Scientific Modeling & Simulation.  This is an interdisciplinary science program, drawing examples, classroom materials, and student activities from a broad range of physical and biological sciences, including Space Physics (Space Weather), Solar Physics, Astrophysics, Geosciences, Computational Fluid Dynamics, Materials Science, Physics, Chemistry, and Bioinformatics.  We will describe some of the motivations and early results from the program. More information is available athttp://cds.gmu.edu/ . The development of the GMU CDS program is sponsored by the NSF CCLI (Course, Curriculum, and Laboratory Improvement) program, through award # 0737091. • Computational and Data Sciences: • Computational and Data Sciences are emerging fields involving applications of sophisticated simulation and data-oriented methods to build models and solve problems in science & engineering. • Recently emerged interdisciplinary areas in the chemical, physical and biological sciences (such as biotechnology, nanotechnology, molecular electronics, photonics in nanoscale systems, and energetics of DNA/protein binding) require highly-qualified professionals with strong computational skills to work closely with experimentalists in solving complex scientific and engineering problems. • Emerging data-intensive science fields (such as Geoinformatics, Bioinformatics, Astroinformatics, Materials Informatics, and Medical Informatics) require specially trained professionals with strong data skills to address ubiquitous data-intensive applications in science, industry, and government. • Computational and data sciences complement existing theoretical and experimental science approaches and may be thought of as a new mode of scientific inquiry. • Existing – Graduate Program in Computational Science & Informatics (CSI) at GMU: • The CSI graduate program has existed since 1992, having graduated 175 PhDs to-date. • There are over 90 graduate courses in the CSI program, covering many science disciplines. • New! – Undergraduate Program in Computational & Data Sciences (CDS) at GMU: • Why? Students graduating with a traditional discipline-based bachelor’s degree in biology, chemistry, mathematics, or physics generally do not have the required computational background necessary to participate as productive members of modern interdisciplinary scientific research teams, which are becoming increasingly data- and computationally intensive.  • What? The BS program in CDS provides students with a variety of opportunities to become research professionals possessing interdisciplinary knowledge, including sciences and applied mathematics, augmented with strong computational and data-oriented skills.  • How? Students in this program are exposed to a wide range of computational and data science applications, and will learn computational science tools, high-performance computing, applied and theoretical computational techniques, modeling and simulation, statistical analysis, optimization, data & information visualization, scientific database applications, scientific data mining & knowledge discovery in databases (KDD), and data-intensive science research methods. • Who? Students with a broad interest in computers and sciences will benefit from the program. • The Results? Graduates from this program will acquire interdisciplinary knowledge and will be able to apply scientific principles in solving complex real-world problems. • When & Where? The program started offering courses in Spring 2008 at GMU. • National study reports identifying workforce demands for computational & data sciences: • NSF report on "Knowledge Lost in Information: Report of the NSF Workshop on Research Directions for Digital Libraries" (2003) • NSB (National Science Board) report on "Long-lived Digital Data Collections: Enabling Research and Education in the 21st Century" (2005) • NSF Atkins Report on "Revolutionizing Science and Engineering Through Cyberinfrastructure" (2005) • NSF-sponsored Computing Research Association report on “Cyberinfrastructure for Education and Learning for the Future: A Vision and Research Agenda” (2005) • NSF + ARL report on "Long-term Stewardship of Digital Data Sets in Science and Engineering" (2006) • NSF report on "Cyberinfrastructure Vision for 21st Century Discovery" (2007) Space Weather Colliding Galaxies Virtual Observatories & Large Scientific DBs CFD Computational & Data Sciences Program of Study: Students must complete a total of 18 credits in computational and data sciences core courses, 15 credits in computer science, 23 credits in mathematics, 6 credits in statistics, 21-25 credits in a science concentration, and 3-9 credits in computational and data sciences electives. Three concentration areas are currently offered: Physics, Chemistry, and Biology. Additional concentrations may be added in the future. Students are encouraged to undertake an optional research project that allows them to gain useful experience in the development of simulations and other aspects of computational science. Catalog Descriptions: Six required core computational & data sciences courses (18 credits): CDS 101 Introduction to Computational and Data Sciences – Introduction to the use of computers in scientific discovery through simulations and data analysis. Covers historical development and current trends in the field. CDS 301 Scientific Information and Data Visualization – The techniques and software used to visualize scientific simulations, complex information, and data visualization for knowledge discovery. CDS 302 Scientific Data & Databases – Data & databases used by scientists. Includes basics about database organization, queries, and distributed data systems. Student exercises will include queries of existing systems, along with basic design of simple database systems. CDS 401 Scientific Data Mining – Data mining techniques from statistics, machine learning, and visualization to scientific knowledge discovery. Students will be given a set of case studies and projects to test their understanding of this field and provide a foundation for future applications in their careers. CDS 410 Modeling and Simulations I – Numerical differentiation and integration, initial-value and boundary-value problems for ordinary differential equations, methods of solution of partial differential equations, iterative methods of solution of nonlinear systems, and approximation theory. CDS 411 Modeling and Simulation II – This course covers the application of modeling and simulation methods to various scientific applications, including fluid dynamics, solid mechanics, materials science, molecular mechanics, and astrophysics. http://cds.gmu.edu/programs/bs_cds/program6.php • Topics covered in CDS 101 (Introduction to Computational & Data Sciences): • The Scientific Method - Experiments, Observations, and Models • Computer Internals – binary numbers and logic circuits • Computer Algorithms and Tool – introduction to programming in Matlab • Data acquisition – linking sensors to computers • Signal Processing – understanding sources of noise and errors • Scientific Databases – storing and organizing scientific data • Data Reduction and Analysis – moving from data to information • Data Mining – moving from information to knowledge • Computer Models – using mathematics and algorithms to represent reality • Computer Simulations – solving linear systems and applications • Computer Visualization – seeing experiments as images • High Performance Computing – simulation at the cutting edge of technology • Future directions in computational science – languages, quantum computing and beyond • Topics covered in CDS 401 (Scientific Data Mining): • Scientific Motivation for data mining • Quantitative Concepts for use in data mining • Software packages for data mining • Data Preparation (e.g., data types, previewing, cleaning dirty data, normalization, transformation) • Distance and Similarity Metrics for Clustering and Classification • Supervised Learning Methods (e.g., decision trees, neural networks, Bayes, Markov models) • Unsupervised Learning Methods (e.g., clustering, nearest neighbor, networks, link/association analysis) • Scientific Data Mining Case Studies & Special Topics (e.g., time series, images, spatial data, outlier detection) • Early Results: • The B.S. degree (CDS major) was approved in 2007. The minor was approved in 2008. • The first courses (CDS 101 and 302) were offered in Spring 2008. • Additional courses (CDS 301 and 401) were offered in Fall 2008. • Currently, 10 active students are majoring in the program. • New course for Spring 2009: CDS 151 – Data Ethics in an Information Society "Data-driven science is becoming a new scientific paradigm – ranking with theory, experimentation, and computational science.“ (JISC/NSF Workshop on Data Driven Science & Repositories, 2007) “There is a growing gap between the generation of data and our understanding of it.” (I. Witten & E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2005) "The importance of data in science and engineering continues on a path of exponential growth; some even assert that the leading science driver of high-end computing will soon be data rather than processing cycles. Thus it is crucial to provide major new resources for handling and understanding data.“ (NSF Atkins Report on Cyberinfrastructure, 2005) “While data doubles every year, useful information seems to be decreasing.“ (M. Dunham, Data Mining Introductory and Advanced Topics, 2002) “Data science has developed to include the study of the capture of data, their analysis, metadata, fast retrieval, archiving, exchange, mining to find unexpected knowledge and data relationships, visualization in two and three dimensions including movement, and management.” (F. Smith, Data Science as an Academic Discipline, Data Science Journal, Vol. 5. p. 163, 2006) “The Information Revolution has provided unprecedented opportunity to ensure that scientific data are fully integrated in the fundamental workings and decision making of our society.” (S. Iwata, Scientific Agenda of Data Science, Data Science Journal, Vol. 7. p. 54, 2008)

More Related