Scientific Databases Lecture: Virtual Observatories for Space Science

Dr. Kirk Borne, GMU SCS

November 18, 2003

GMU CSI 710


Outline

  • Quick Review of Astronomy Data

  • The National Virtual Observatory (NVO)

  • Other Virtual Observatories for Space Science

  • Why Virtual Observatories?

  • NVO – It’s all about the Science:

    • IT-enabled, Science-enabling

  • The Enabling Computational Science Technologies for the NVO – where you can help!

  • Distributed Data Mining in the NVO

Virtual Observatories for Space Science



The Nature of Astronomical Data

  • Imaging

    • 2D map of the sky at multiple wavelengths

  • Derived catalogs

    • subsequent processing of images

    • extracting object parameters (400+ per object)

  • Spectroscopic follow-up

    • spectra: more detailed object properties

    • clues to physical state and formation history

    • lead to distances: 3D maps

  • Numerical simulations

  • All inter-related!

NOAO Deep Wide-Field Survey: http://www.noao.edu/noao/noaodeep/

NASA Astronomy Mission Data: the tip of the data mountain

NSSDC’s astrophysics data holdings: one of many science data collections for astronomy across the US and the world!

NSSDC = National Space Science Data Center @ NASA/GSFC

http://nssdc.gsfc.nasa.gov/astro/astrolist.html


“Quote of the day”

  • “It's just as unpleasant to get more than you expected as it is to get less.”

    • George Bernard Shaw

Why so many Telescopes? …

Because …

  • Many great astronomical discoveries have come from inter-comparisons of various wavelengths:

    • Quasars

    • Gamma-ray bursts

    • Ultraluminous IR galaxies

    • X-ray black-hole binaries

    • Radio galaxies

    • . . .


Therefore, our science data archive systems should enable multi-wavelength interdisciplinary distributed database access, discovery, mining, and analysis.

How does one integrate and use these distributed data archives? …

Emerging Computational Environment

  • Standardizing distributed data

    • Web Services, supported on all platforms

    • Custom configure remote data dynamically

    • XML: Extensible Markup Language

    • SOAP: Simple Object Access Protocol

    • WSDL: Web Services Description Language

    • UDDI: Universal Description, Discovery and Integration

  • Standardizing distributed computing

    • Grid Services

    • Custom configure remote computing dynamically

    • Build your own remote computer, use it, then discard it

    • Virtual Data: new data sets on demand

…The National Virtual Observatory (NVO)

  • National Academy of Sciences “Decadal Survey” recommended the NVO as the highest-priority small (<$100M) project:

    “Several small initiatives recommended by the committee span both ground and space. The first among them—the National Virtual Observatory (NVO)—is the committee’s top priority among the small initiatives. The NVO will provide a “virtual sky” based on the enormous data sets being created now and the even larger ones proposed for the future. It will enable a new mode of research for professional astronomers and will provide to the public an unparalleled opportunity for education and discovery.” (p.14)

Why is it Virtual?

  • A Virtual Data System :

    • has multiple components

    • is (geographically) distributed

    • is interoperable

    • provides seamless user access to distributed data system components

    • provides “one-stop shopping” for data end-user

Why is it Necessary?

  • To maximize cross-enterprise multi-institutional resources

  • To minimize duplication of effort

  • To streamline operations through shared development

  • To serve multiple user communities

  • To facilitate simultaneous data mining, knowledge discovery, and information retrieval from multiple distributed data collections

  • Because data volumes are huge & growing rapidly ...

    For example, in Astronomy :

    • a few terabytes "yesterday" (10,000 CDROMs)

    • tens of terabytes "today" (100,000 CDROMs)

    • petabytes "tomorrow" (within 10 years) (100,000,000 CDROMs)
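
The CD-ROM equivalents above can be checked with a few lines of arithmetic. This is only a sketch: the slide does not state a disc capacity, so the 650 MB figure and the decimal units below are assumptions.

```python
# Assumption: one CD-ROM holds about 650 MB (decimal units).
# The slide does not give a capacity, so this value is illustrative only.
CDROM_BYTES = 650 * 10**6
TERABYTE = 10**12
PETABYTE = 10**15


def cdroms_to_bytes(n_discs: int) -> int:
    """Total capacity in bytes of n_discs CD-ROMs."""
    return n_discs * CDROM_BYTES


# Disc counts quoted on the slide for each era.
for label, discs in [("yesterday", 10_000),
                     ("today", 100_000),
                     ("tomorrow", 100_000_000)]:
    total = cdroms_to_bytes(discs)
    print(f"{label:>9}: {discs:>11,} CD-ROMs = {total / TERABYTE:,.1f} TB")
```

Under that assumption the three disc counts correspond to roughly 6.5 TB, 65 TB, and 65 PB, consistent with "a few terabytes", "tens of terabytes", and "petabytes".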

National Virtual Observatory http://www.us-vo.org/

  • NVO is a concept. It was recommended by the Astronomy Decadal Survey Committee to the National Academy of Sciences. Currently funded by NSF ($10M Information Technology Research grant); and NASA next year(?).

  • NVO is not just “National”. It is actually “Global”: http://www.ivoa.net/

  • Will link geographically distributed astronomical data archives and information resources = provides “one-stop shopping” for data end-user

  • Will be heterogeneous, interoperable, and federated (autonomy maintained at local sites) … therefore, we are using XML and Web Services.

  • Requires middleware standards for : metadata, resource descriptions (including the Dublin Core), queries, query results, the data (including the Data Model – see next slide), and semantics (… we are using Unified Content Descriptors = UCDs).

  • Requires innovative computational science technologies for :

    • data discovery, data mining, data fusion, distributed querying, and code-shipping (“Ship the code, not the data”)
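
The idea behind “Ship the code, not the data” can be sketched in a few lines. This is a toy illustration, not the NVO middleware: the site names, the rows, and the helper functions are all hypothetical, and the "remote" archives are simulated by an in-memory dictionary.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for remote archives: each "site" keeps its rows locally.
SITES: Dict[str, List[dict]] = {
    "archive_a": [{"name": "obj1", "mag": 14.2}, {"name": "obj2", "mag": 21.7}],
    "archive_b": [{"name": "obj3", "mag": 22.9}, {"name": "obj4", "mag": 12.0}],
}


def run_at_site(site: str, task: Callable[[List[dict]], list]) -> list:
    """Simulate executing `task` where the data lives; only the result comes back."""
    return task(SITES[site])


def faint_objects(rows: List[dict]) -> list:
    """The 'shipped code': a small filter returning names of faint objects."""
    return [row["name"] for row in rows if row["mag"] > 20.0]


def federated_query(task: Callable[[List[dict]], list]) -> Dict[str, list]:
    """Send the same task to every site and collect the (small) answers."""
    return {site: run_at_site(site, task) for site in SITES}


print(federated_query(faint_objects))
```

The point of the design is that the function travels to each archive and only a short list of names travels back, instead of every row being copied to the user first.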

Virtual Observatory Data Model

A data model is the structure in which a computer program stores persistent information.

VxO: becoming an operational system (high TRL)

  • What is a VxO?

    • Virtual “anything” Observatory – where “anything” currently includes Astrophysical, Solar, Magnetospheric, Heliospheric, Ionospheric, …

  • Summary statement for any VxO …

    Researchers should be able to find and access seamlessly all existing data relevant to the research they are considering, that data should be independently and correctly useable, and that data should be available in useful ways and in useful contexts.

  • Without exception, full VxO efforts aim in this direction by providing multi-mission data access and easy browse functionality.

Capabilities of Space Physics Science Databases.

The VxO Challenge: to Integrate Data, Tools, Services

http://spdf.gsfc.nasa.gov/

[Diagram labels: ModelsWeb, CDAWLib, HelioWeb, (Trajectories), Science Data Facility, Science User Support, Acquisition & Ingest, Tools & Services]

How do Space Science Databases Change in a Future that has an Increasingly Rich/Robust VxO Framework?

  • One definition for this VxO framework could be …

    "The distributed implementation of an integrated space sciences data environment"

  • The broad goals of the data centers don't fundamentally change with this definition.

    • They still must enable new science by adding unique value to the Space Science research community through strong multi-discipline and cross-discipline data resources, with unique services tied to unique databases.

  • These services (data, functions, software) should (and will) be increasingly supplied as a key element of that new, broader VxO environment.

    • Logically, the data center’s services eventually become consumers as well as providers.

    • Visible early user impact of VxO is critical.

  • VxOs should develop a good long-term hybrid solution = PIs + missions/projects + Science Data Centers + (other) specialized services

Science Data Formats – part of the glue

  • Several key data formats are standard in space science: FITS (Astrophysics & Solar Physics), CDF and netCDF (Space Physics & Earth Science), HDF (Space Physics, Earth Science, & Computational Science).

  • Why?

    • These provide a baseline data format for all data sets in that discipline and in joint international projects.

    • They provide the base for many data center services, data analysis tools, data integration tools, visualization packages.

    • They are a key enabling technology for many different space missions and space science projects.

  • Plans:

    • Translation tools: from FITS <–> CDF <–> HDF <–> netCDF

    • Substantial work on format translators via XML and XSLT.

Interfaces to a VxO Environment

  • "Web Services" interface to existing data services

    • Web Services interfaces and software libraries complement existing FTP and interactive user web interfaces.

    • Web Services provide an application-to-application interface, without human intervention.

    • Web Services provide a distributed data registry (WSDL), data/resource discovery (UDDI), and data services (SOAP).

    • Scientific database services have unique scope and functionality that must be accessible by the VxO environment for it to gain user acceptance.

      • e.g., SOAP/XML interface for Space Physics data now enables 3-D interactive graphics of distributed multi-mission data.

    • Plans for data format translators and converters
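
To make the application-to-application idea concrete, here is a minimal sketch of building a SOAP 1.1 request envelope with the Python standard library. The service namespace, the operation name, and the parameters are hypothetical; they do not describe any actual Space Physics service interface.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "http://example.org/vxo/dataservice"          # hypothetical service namespace


def build_soap_request(operation: str, params: dict) -> bytes:
    """Build a SOAP envelope whose body holds one operation element with parameters."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for key, value in params.items():
        ET.SubElement(op, f"{{{SVC_NS}}}{key}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical operation and parameter names, for illustration only.
message = build_soap_request(
    "getData",
    {"dataset": "example_dataset", "start": "2003-11-01", "stop": "2003-11-02"},
)
print(message.decode("utf-8"))
```

A client program would send such a message over HTTP and parse the XML reply, with no person involved in the exchange.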

Why Virtual Observatories?

  • Because:

    • The data are highly distributed.

    • Multi-mission data lead to new discoveries.

    • The data volumes are HUGE and growing.

    • And maybe because of Augustine’s Law …

“Software is like entropy; it always increases.” - Norman Augustine

Szalay’s Law: The utility of N comparable datasets increases as N^2

  • Metcalfe’s Law: The value of a network scales as n^2, where n is the number of nodes connected.

  • Hagel & Armstrong’s Axiom: The aggregation of resources is more important than the amount of resources owned.

  • Metcalfe’s law applies to telephones, the Internet …

  • Szalay argues as follows:

    • Each new dataset gives new information.

    • 2-way combinations give even more new information.
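
The N^2 scaling follows from counting 2-way combinations: N datasets give N(N-1)/2 distinct pairs, which grows roughly as N^2/2. A short sketch, with hypothetical dataset names:

```python
from itertools import combinations


def pairwise_count(n: int) -> int:
    """Number of distinct 2-way combinations among n datasets: n(n-1)/2."""
    return n * (n - 1) // 2


# Hypothetical dataset labels, one per wavelength regime.
datasets = ["optical", "infrared", "radio", "x-ray"]
pairs = list(combinations(datasets, 2))

print(len(pairs), "pairs from", len(datasets), "datasets")
for n in (2, 4, 10, 100):
    print(n, "datasets ->", pairwise_count(n), "pairs")
```

Going from 10 to 100 datasets multiplies the number of possible cross-comparisons by about 110, not by 10.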

Size of a Typical Archived Astronomical Data Repository

  • Size of the archived data for an all-sky survey -- 40,000 square degrees is two trillion pixels --

    • One band: 4 Terabytes

    • Multi-wavelength: 10-100 Terabytes

    • Time dimension: 10 Petabytes

    • LSST project (10 yrs): ~100 Petabytes @ http://www.lsst.org/
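
The figures above imply a pixel scale and a storage cost per pixel, which can be worked out directly. The sketch below uses only the slide's numbers (40,000 square degrees, two trillion pixels, 4 Terabytes for one band) and assumes decimal terabytes.

```python
import math

SKY_AREA_DEG2 = 40_000   # survey area quoted on the slide
N_PIXELS = 2e12          # "two trillion pixels"
ONE_BAND_BYTES = 4e12    # 4 Terabytes for one band (decimal units assumed)

ARCSEC2_PER_DEG2 = 3600 ** 2  # 1 degree = 3600 arcseconds

# Area on the sky covered by one pixel, and the side of a square pixel.
pixel_area_arcsec2 = SKY_AREA_DEG2 * ARCSEC2_PER_DEG2 / N_PIXELS
pixel_side_arcsec = math.sqrt(pixel_area_arcsec2)

# Storage per pixel implied by the one-band archive size.
bytes_per_pixel = ONE_BAND_BYTES / N_PIXELS

print(f"pixel side: {pixel_side_arcsec:.2f} arcsec")
print(f"storage:    {bytes_per_pixel:.0f} bytes per pixel")
```

So the quoted sizes correspond to pixels about half an arcsecond on a side stored at 2 bytes each.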

All-sky distribution of 526,280,881 stars from the MACHO survey.

Ongoing Surveys of the Sky

MACHO, 2MASS, DENIS, SDSS, GALEX, FIRST, DPOSS, GSC-II, COBE MAP, NVSS, ROSAT, OGLE, ...

  • Large number of new surveys

    • multi-TB in size, 100 million objects or more

    • individual archives planned, or under way

  • Multi-wavelength view of the sky

    • coverage in more than 13 wavelength bands within 5 years

  • Impressive early discoveries

    • finding exotic objects by unusual colors

      • L,T dwarfs, high-z quasars

    • finding objects by time variability

      • gravitational microlensing

Sloan Digital Sky Survey Data Products http://www.sdss.org/

  • Full Data Collection: ~20 TB

  • Object catalog: 400 GB (parameters of >10^8 objects)

  • Redshift Catalog: 1 GB (parameters of 10^6 objects)

  • Atlas Images: 1.5 TB (5 color cutouts of >10^8 objects)

  • Spectra: 60 GB (in a one-dimensional form)

  • Derived Catalogs: 20 GB (clusters, QSO absorption lines)

  • 4x4 Pixel All-Sky Map: 60 GB (heavily compressed)

Large Synoptic Survey Telescope

  • Highly ranked in Decadal Review

  • Optimized for surveys

  • scan mode

  • deep mode

  • 7 square degree field

  • 6.9m effective aperture

  • 24th mag in 20 sec

  • > 20 Tbytes/night

  • Real-time analysis

  • “Celestial Cinematography”

  • Simultaneous multiple science goals

Large Mirror Fabrication (for large telescopes, such as LSST)

That’s big!

(Univ. of Arizona Mirror Laboratory)

NVO – It’s all about the Science

Science Discovery - the Old Way

Science Discovery - The New Way - Different!

The discovery process will rely heavily on distributed data access and multi-archive data mining.

Systematic data exploration

  • will have a central role

  • statistical analysis of the “typical” objects

  • automated search for the “rare” events
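
An automated search for “rare” events against a background of “typical” objects can be sketched with a robust outlier test. This is one possible technique, not the method used by any NVO tool: it flags values that lie many median absolute deviations (MAD) from the median. The color values and the threshold are made up for illustration.

```python
from statistics import median


def rare_events(values, threshold=5.0):
    """Return indices of values more than `threshold` MADs away from the median."""
    med = median(values)                           # the "typical" value
    mad = median(abs(v - med) for v in values)     # robust measure of spread
    if mad == 0:
        return []                                  # no spread: nothing to flag
    return [i for i, v in enumerate(values) if abs(v - med) / mad > threshold]


# Hypothetical color indices for eight objects; one is far from the rest.
colors = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 2.90, 0.47]
print(rare_events(colors))
```

The median and MAD describe the typical population without being dragged by the outlier itself, which is why they are preferred here over the mean and standard deviation.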

Conceptual Architecture for a Distributed Data Mining System

[Diagram labels: User, Analysis tools, Discovery tools, Gateway, Data Archives]

The Discovery Process

Past: observations of small, carefully selected samples of objects in a narrow wavelength band

Future: high quality, homogeneous multi-wavelength data on millions of objects, allowing us to

  • discover significant patterns from the analysis of statistically rich and unbiased image/catalog databases

  • understand complex astrophysical systems via confrontation between data and large numerical simulations

The discovery process will rely heavily on advanced visualization, data mining, and statistical analysis tools.

The NVO in 5 words or less:

“The archive is the sky!”

NVO: It is all about the Science

  • There is huge scientific interest in the new data collections -- large sky surveys, multiple telescopes, multiple-wavelength coverage of the sky, time domain coverage ... And it is all available on-line from your desktop …

    • “The archive is the sky!”

  • Something is needed to help scientists access, mine, and explore these huge data collections.

    • 1 Terabyte at 10 Mbyte/s takes 1 day to transmit

    • Hundreds of intensive queries and thousands of casual queries per day

    • Data will reside at multiple locations, in many different formats

    • Existing analysis tools do not scale to Terabyte data sets

  • Acute need in a few years; solution will not just happen.
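
The transfer-time figure in the list above is easy to verify. The sketch assumes decimal units (1 Terabyte = 10^12 bytes, 10 Mbyte/s = 10^7 bytes/s) and a link that sustains its full rate.

```python
def transfer_time_hours(n_bytes: float, rate_bytes_per_s: float) -> float:
    """Hours needed to move n_bytes at a sustained rate (no protocol overhead)."""
    return n_bytes / rate_bytes_per_s / 3600.0


TERABYTE = 10**12        # decimal terabyte (assumption)
RATE = 10 * 10**6        # 10 Mbyte/s

hours = transfer_time_hours(TERABYTE, RATE)
print(f"1 TB at 10 MB/s: {hours:.1f} hours ({hours / 24:.2f} days)")
```

The result is just under 28 hours, so "1 day" is a fair round number, and a petabyte at the same rate would take roughly three years.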

NVO Enables New Science

http://www.us-vo.org/

  • Rare and exotic objects

    • Very high redshift quasars

    • Dark matter in the galactic halo

    • Time-variable objects, transient events: distant supernovae and microlensing

    • Brown dwarfs

    • Variable stars

    • Asteroids...

      • ...incoming!!

    • Serendipity!

NVO Science Cases & Drivers (from Aspen 2001 NVO Workshop)

  • Solar System : NEOs, Long-Period Comets, TNOs, Killer Asteroids!!!

  • The Digital Galaxy : Find star streams and populations -- relics of past/present assembly phase. Identify components of disk, thick disk, bulge, halo, arms, ??

  • The Low-Surface Brightness Universe : spatial filtering, multi-wavelength searches, intersection of the image and catalog domains

  • Panchromatic Census of AGN (Active Galactic Nuclei) : Complete sample of the AGN zoo, their emission mechanisms, and their environments

  • Precision Cosmology & Large-Scale Structure : Hierarchical Assembly History of Galaxies and Structure, Cosmological Parameters, Dark Matter and Galaxy Biasing as f(z)

  • Precision science of any kind that depends on very large sample sizes

  • "Survey Science Deluxe"

  • Search for rare and exotic objects (e.g., high-z QSOs, high-z SNe, L/T dwarfs)

  • Serendipity : Explore new domains of parameter space (e.g., time domain, or "color-color space" of all kinds)

Enabling Computational Science Technologies for the NVO

Major Functions of the NVO and the related Enabling Computational Science Technologies

  • To facilitate data mining and knowledge discovery within the very large astronomical databases -- Requires:

    • indexing for fast queries, filtering of large queries, data subsetting, visualization, parallelization (queries, access), ...

  • To facilitate linkages and cross-archive investigations -- Requires:

    • distributed computing, scalable architectures, load balancing, thin middleware layer, interoperability, code libraries, code-shipping, data-finding services, data standards & interchange formats, query/results protocols, data fusion, quality assessment, archive/metadata profiles, user profiles, intelligent agents, ...

  • To serve a broad community of users (professionals, amateur astronomers, schools, general public) --

    • must support thousands of queries per day

Some General Challenges for NVO (and all Virtual Data Systems)

  • Data Discovery: Finding data within distributed data systems

  • Transparent User Access to Data: across heterogeneous environments

  • (Distributed) Data Mining and Analysis: of terabytes!

  • Interoperability: of systems, data, metadata, tools

  • New Technology Infusion: across multiple distributed systems

  • Sociology: "We don't need it" or "We already have it"

How do you get all of these distributed science databases working together?

Virtual Observatory team motto:

“It’s the middleware, stupid.”

Tools for the NVO & other Virtual Data Systems

  • XML (eXtensible Markup Language) = "the language of interoperability" - ADC/XML Project was the most comprehensive and advanced application of XML to NASA astrophysics data archives - including the XDF (eXtensible Data Format) and FITSML data standards [ http://xml.gsfc.nasa.gov/ ]

  • Comprehensive Data Mining Resource Guide for Large Scientific Databases - [follow the link at http://nvo.gsfc.nasa.gov/ ]

    • "The trouble with facts is that there are so many of them." - Samuel McChord Crothers, in "The Gentle Reader"

  • ISAIA (Interoperable Systems for Archival Information Access) : resource description profiles to enable access to distributed data providers

  • MOCHA (Middleware based On a Code-sHipping Architecture): middleware tools for search, retrieval, & data fusion from heterogeneous databases using heterogeneous interfaces - transparently federates distributed data access -

    • "Ship the code, not the data"

  • The GRID! …

What is The Grid?

  • The GRID is “a distributed computing infrastructure that facilitates resource-sharing and coordinated problem-solving in dynamic, multi-institutional virtual organizations.”

    http://www.globus.org/datagrid/

    http://www.gridforum.org/

    http://www.nas.nasa.gov/About/IPG/

    (NASA’s Information Power Grid)

The Grid: by Foster & Kesselman (Argonne National Laboratory)

Internet computing and GRID technologies promise to change the way we tackle complex problems. They will enable large-scale aggregation and sharing of computational, data and other resources across institutional boundaries …. Transform scientific disciplines ranging from high energy physics to the life sciences

Data Grids vs. Computational Grids

Slide shown earlier: Conceptual Architecture for a Distributed Data Mining System

[Diagram labels: User, Analysis tools, Discovery tools, Gateway, Data Archives]

A Concept for a Data Grid Node for Distributed Data Mining**

[Diagram labels: Compute layer 200 CPUs (compute nodes); Interconnect layer 1 Gbits/sec/node; Database layer 2 GBytes/sec (Objectivity, RAID); Other nodes; 10 Gbits/s]

Hardware requirements

  • Large distributed database engines

    • with few Gbyte/s aggregate I/O speed

  • High speed (>10 Gbit/s) backbones

    • cross-connecting the major archives

  • Scalable computing environment

    • with hundreds of CPUs for analysis

HPC comes to the rescue!

** Slide provided by Alex Szalay (JHU)

An HPC Application: Parallel Data Mining

Figure: How Parallel Processing Speeds Up Data Mining

The application of parallel computing resources and parallel data access (e.g., RAID) enables concurrent drill-downs into large data collections
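
The concurrent drill-down idea can be sketched as: split the collection into chunks, mine each chunk independently, then combine the partial results. This is a small illustration only. The magnitudes and the brightness limit are hypothetical, and a thread pool is used for simplicity; a real CPU-bound workload would more likely use separate processes or cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def count_bright(chunk, limit=15.0):
    """Mine one chunk: count objects brighter (smaller magnitude) than `limit`."""
    return sum(1 for mag in chunk if mag < limit)


def parallel_count(magnitudes, n_chunks=4):
    """Split the data into about n_chunks pieces and mine them concurrently."""
    size = max(1, -(-len(magnitudes) // n_chunks))  # ceiling division
    chunks = [magnitudes[i:i + size] for i in range(0, len(magnitudes), size)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        # Each worker handles one chunk; partial counts are summed at the end.
        return sum(pool.map(count_bright, chunks))


# Hypothetical magnitudes cycling through the values 10..29.
mags = [10 + (i % 20) for i in range(1000)]
print(parallel_count(mags))
```

Because each chunk is mined independently, the same pattern extends to chunks stored on different disks or at different sites.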

Distributed Data Mining in the NVO

Data Mining: connecting the dots?

Reference: http://homepage.interaccess.com/~purcellm/lcas/Cartoons/cartoons.htm

Scaling the VO Mountain: Role of Data Mining

[Diagram labels: Discoveries; Data Mining; Visualization; You are here; Data Services; Existing Data Centers and Archives]

NVO = Data Mining in Action

Exploration of new domains of the observable parameter space : The Time Domain -- Part 1

Moving object appears as little rainbow in multiple-color image overlays 

In-coming Killer Asteroid?

Data Mining through Data Processing: Simple Multiple-Frame Subtraction

SUPERNOVA discovered !!
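
Frame subtraction can be sketched in a few lines: subtract an earlier reference image from a new image of the same field and flag pixels that brightened by more than a threshold. The tiny 3x3 "images", the pixel values, and the threshold are made up; real pipelines must also align the frames and match their seeing and background, which this sketch ignores.

```python
def subtract_frames(new, ref):
    """Pixel-by-pixel difference of two equally sized frames (lists of rows)."""
    return [[n - r for n, r in zip(new_row, ref_row)]
            for new_row, ref_row in zip(new, ref)]


def find_transients(new, ref, threshold):
    """Return (row, column) positions that brightened by more than `threshold`."""
    diff = subtract_frames(new, ref)
    return [(y, x)
            for y, row in enumerate(diff)
            for x, value in enumerate(row)
            if value > threshold]


# Hypothetical pixel counts: a steady source at (1, 1), a new source at (2, 1).
reference = [[10, 11, 10],
             [12, 50, 11],
             [10, 10, 12]]
new_frame = [[11, 10, 10],
             [12, 51, 11],
             [10, 95, 12]]

print(find_transients(new_frame, reference, threshold=20))
```

The steady source cancels in the difference, so only the newly appeared object survives the threshold.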

NVO = Data Mining in Action

The Time Domain -- Part 2

Mega-Flares on normal Sun-like stars = a star like our Sun increased in brightness 300X one night!

… say what??

NVO = Data Mining in Action

The Time Domain -- Part 3

SETI@home searches for E.T. -- An equivalent data mining tool VO@home on anyone’s desktop can find new comets, asteroids, exploding stars, quasars -- Chunks of data are sent to the user’s screensaver, which begins to mine data for special or one-of-a-kind astronomical events.

VO@home brings science discovery to the desktop of everyone! … a great tool for space science and computational science education.

Requires: access to distributed science databases and data mining & analysis tools.

1. Potential tool for Distributed Data Mining: http://skyserver.pha.jhu.edu/VOconeprofile/

ConeSearch

  • Find all astronomical objects within a radius of a point on the sky (= cone).

  • Find cross-identifications (e.g., a radio galaxy in one catalog = an Infrared galaxy in another catalog)

  • >70 services are now queried.

  • Results are returned in XML format (VOTable).
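
The geometry of a cone search can be sketched locally: compute the angular separation between the search position and each catalog entry, and keep the entries within the radius. This illustrates the idea only and does not call any ConeSearch service or parse VOTable; the catalog rows and field names are hypothetical.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula), inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(catalog, ra, dec, radius_deg):
    """Return catalog rows lying within radius_deg of the position (ra, dec)."""
    return [row for row in catalog
            if angular_separation_deg(ra, dec, row["ra"], row["dec"]) <= radius_deg]


# Hypothetical catalog entries (positions in degrees).
catalog = [
    {"id": "A", "ra": 10.00, "dec": 20.00},
    {"id": "B", "ra": 10.05, "dec": 20.02},
    {"id": "C", "ra": 190.00, "dec": -20.00},
]

matches = cone_search(catalog, ra=10.0, dec=20.0, radius_deg=0.1)
print([row["id"] for row in matches])
```

Running the same search against two catalogs and pairing up the matches is the basis of the cross-identification mentioned above.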

2. Potential Tool for Distributed Data Mining: Data Inventory Service http://us-vo.org/news/dis.html/

Response from the Data Inventory Service, showing links to relevant images and catalogs:

Uses ConeSearch Profile Service.

3. Potential tool for Distributed Data Mining: http://www.skyquery.net/main.htm

Submits queries to large distributed databases!

2nd place Winner in Microsoft Contest

Summary - Applications of Data Mining to the NVO

  • Data Mining Resource Guide for Space Science:

  • http://nvo.gsfc.nasa.gov/nvo_datamining.html

  • Purpose and Content -- to assist NASA scientists in data mining activities by providing comprehensive summaries of: NASA-funded data mining projects, data mining tutorials, algorithms, techniques, software, organizations, conference links, conference summaries, publications, lessons learned, related I.T. technologies, science applications, expert interviews, and applications of data mining to the new National Virtual Observatory (NVO).

Sample Data Mining Applications within the NVO:

  • Discover data stored in geographically distributed heterogeneous systems.

  • Search huge databases for trends and correlations in high-dimensional parameter spaces: identify new properties or new classes of objects.

  • Search for rare, one-of-a-kind, and exotic objects in huge databases.

  • Identify temporal variations in objects from millions or billions of observations.

  • Identify moving objects in huge survey catalogs and image databases.

  • Identify parameter glitches / anomalies / deviations either in static databases (e.g., archives) or in dynamic data (e.g., science / telemetry / engineering data streams from remote satellites).

  • Find clusters, nearest neighbors, outliers, and/or zones of avoidance in the distribution of astrophysical objects or other observables in arbitrary parameter spaces.

  • Serendipitously explore the huge databases that will be part of the NVO, through access to distributed, autonomous, federated, heterogeneous, multi-wavelength, multi-mission astrophysics data archives.

http://www.us-vo.org/

Web References

  • General:

    • http://xml.gsfc.nasa.gov/

    • http://nvo.gsfc.nasa.gov/

    • http://www.us-vo.org/

  • Specific:

    • VOTable - XML language for queries and tabular query results:

      • http://www.us-vo.org/VOTable/

  • Data Mining Resource Guide:

    • http://nvo.gsfc.nasa.gov/nvo_datamining.html

  • Scientific Data Mining Workshop and Reports:

    • http://www.anc.ed.ac.uk/sdmiv/

VO: Creating the Future of Astrophysics Data Analysis

Summary

  • Quick Review of Astronomy Data

  • The National Virtual Observatory (NVO)

  • Other Virtual Observatories for Space Science

  • Why Virtual Observatories?

  • NVO – It’s all about the Science:

    • IT-enabled, Science-enabling

  • The Enabling Computational Science Technologies for the NVO – where you can help!

  • Distributed Data Mining in the NVO

Next Lecture

  • November 25 – Intelligent Archives of the Future