
Semantic Grid + Data Federation

Semantic Grid + Data Federation. US National Virtual Observatory. Roy Williams, California Institute of Technology, NVO co-director. What is NVO? Standard protocols, standard data types; XML transfer protocol (VOTable); resource description (VOResource etc.)


Presentation Transcript


  1. Semantic Grid + Data Federation • US National Virtual Observatory • Roy Williams, California Institute of Technology, NVO co-director

  2. What is NVO? • Standard protocols, standard data types • XML transfer protocol (VOTable) • Resource description (VOResource etc) • Publish/discover to federated registry (OAI) • Semantic Types (UCD) • Services: Cone search, Simple Image Access • Computing with big data on the Grid • Database Crossmatch • Image Federation: Atlases

  3. First NVO Discovery

  4. Database Fuzzy Join • Billion Source Cross-Identification: A Computational Challenge • 2MASS versus SDSS cross-identification, with j_m as the 2MASS magnitude and I_mtotn as the SDSS magnitude • 2MASS: j_m <= 15 • SDSS: I_mtotn <= 18 • Figure labels: SDSS unmatched, 2MASS matched, SDSS matched, 2MASS unmatched
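
The fuzzy join on this slide can be illustrated with a minimal sketch, assuming a brute-force nearest-neighbour match within an angular tolerance. The record layout, the function names, the sample sources, and the 2-arcsecond tolerance are all hypothetical, and the magnitude cuts are read here as upper limits. The slide does not say how the NVO crossmatch is implemented, and a billion-source join would need spatial indexing rather than this O(n·m) loop.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def fuzzy_join(left, right, tolerance_arcsec):
    """Pair each left source with its nearest right source within the tolerance."""
    tolerance_deg = tolerance_arcsec / 3600.0
    matched, unmatched = [], []
    for src in left:
        best, best_sep = None, tolerance_deg
        for cand in right:
            sep = angular_separation_deg(src["ra"], src["dec"], cand["ra"], cand["dec"])
            if sep <= best_sep:
                best, best_sep = cand, sep
        if best is None:
            unmatched.append(src["id"])
        else:
            matched.append((src["id"], best["id"]))
    return matched, unmatched

# Hypothetical sources; magnitude cuts applied before the join.
twomass = [
    {"id": "A", "ra": 300.0, "dec": 25.0, "j_m": 14.2},
    {"id": "B", "ra": 300.5, "dec": 25.2, "j_m": 13.0},
    {"id": "C", "ra": 300.0, "dec": 25.0, "j_m": 16.5},  # too faint, removed by the cut
]
sdss = [{"id": "S1", "ra": 300.0002, "dec": 25.0001, "i_mag": 17.5}]

bright_2mass = [s for s in twomass if s["j_m"] <= 15]
bright_sdss = [s for s in sdss if s["i_mag"] <= 18]
matched, unmatched = fuzzy_join(bright_2mass, bright_sdss, tolerance_arcsec=2.0)
```

The join is "fuzzy" because positions from two surveys never agree exactly, so equality is replaced by a distance threshold.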

  5. Crossmatch Services • NVO protocols • Diagram labels: query, Crossmatch service, SDSS database query, 2MASS database query, scientific knowledge!

  6. First NVO Discovery • Database crossmatch of two massive databases creates new science • “The sum is greater than the parts”

  7. Semantic Grid

  8. Cone Search • First VO standard service • Input: RA, DEC, SR must be present • decimal degrees, J2000 • Output: VOTable of sky-located data records • must have columns with UCDs: POS_EQ_RA_MAIN, POS_EQ_DEC_MAIN, ID_MAIN • Request: RA=300 DEC=25 SR=0.1 • Response: table with columns ID, RA, DEC, x, y, z
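
A cone search request of this shape could be assembled as in the following sketch. The base URL and the function names are hypothetical; only the three parameter names, the decimal-degree convention, and the three required UCDs come from the slide. Nothing is sent over the network here.

```python
from urllib.parse import urlencode

# UCDs the slide says every cone search response must carry.
REQUIRED_UCDS = ("POS_EQ_RA_MAIN", "POS_EQ_DEC_MAIN", "ID_MAIN")

def cone_search_url(base_url, ra, dec, sr):
    """Build a cone search GET URL; RA, DEC, SR are decimal degrees (J2000)."""
    if not 0.0 <= ra <= 360.0:
        raise ValueError("RA must be between 0 and 360 degrees")
    if not -90.0 <= dec <= 90.0:
        raise ValueError("DEC must be between -90 and 90 degrees")
    if sr < 0.0:
        raise ValueError("SR (search radius) must not be negative")
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode({"RA": ra, "DEC": dec, "SR": sr})

def missing_ucds(column_ucds):
    """Return the required UCDs that a response's columns do not provide."""
    return [ucd for ucd in REQUIRED_UCDS if ucd not in column_ucds]

url = cone_search_url("http://example.org/cone", 300, 25, 0.1)
print(url)  # http://example.org/cone?RA=300&DEC=25&SR=0.1
```

Checking the UCDs rather than the column names is what lets a client treat every compliant service the same way, whatever the publisher called its columns.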

  9. Cone Search Registry • A collection of services that have the same shape • Request: HTTP GET of shape: URLbase RA=200&DEC=20&SR=2 • Response: VOTable of shape: ID, POS_EQ_RA_MAIN, POS_EQ, POS_EQ_DEC_MAIN

  10. Cone Search + Density Probe • Federation of Multiple Services • Density Probe inputs: baseURL, Spacing, Search radius • Cone Search and Density Probe: interoperating NVO-compliant services!

  11. NVO Image Protocol: SIAP • Specify box by position and size • SIAP server returns relevant images • Footprint • Logical Name • URL • Can choose: standard URL: http://....... or SRB URL: srb://nvo.npaci.edu/…..

  12. Simple Image Access Service • Query is sky region • May query on image type, image geometry • Response is VOTable of images • Each has WCS (geometry) parameters • Plus a URL to fetch the image • Designed for • Set of pointed observations (e.g. Hubble) • Wide-area survey (e.g. Sloan) • Image service • Mosaicking • Reprojection
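
Reading the image URLs out of such a response might look like the sketch below. The sample table, the column names, and the helper function are hypothetical. The UCD `VOX:Image_AccessReference` is taken from the SIA 1.0 convention as best recalled, not from the slides, and a real VOTable normally carries an XML namespace that this namespace-free sample omits.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written stand-in for a SIAP response (no XML namespace).
SAMPLE_RESPONSE = """<VOTABLE>
  <RESOURCE>
    <TABLE>
      <FIELD name="title" ucd="VOX:Image_Title"/>
      <FIELD name="url" ucd="VOX:Image_AccessReference"/>
      <DATA><TABLEDATA>
        <TR><TD>tile 1</TD><TD>http://example.org/img/1.fits</TD></TR>
        <TR><TD>tile 2</TD><TD>http://example.org/img/2.fits</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

def column_by_ucd(votable_xml, ucd):
    """Return the values of the first column whose FIELD has the given UCD."""
    root = ET.fromstring(votable_xml)
    table = root.find("./RESOURCE/TABLE")
    fields = table.findall("FIELD")
    for index, field in enumerate(fields):
        if field.get("ucd") == ucd:
            return [row.findall("TD")[index].text for row in table.iter("TR")]
    raise KeyError("no column with UCD " + ucd)

image_urls = column_by_ucd(SAMPLE_RESPONSE, "VOX:Image_AccessReference")
```

A client would then fetch each URL in `image_urls`; selecting the column by UCD instead of by name keeps the client independent of how each archive labels its table.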

  13. Data Inventory Service • What data covers a position in the sky? • Diagram labels: Registry, OAI, Query, Publish; sites JHU/STScI, NCSA, Caltech, Goddard; DIS; steps 1 to 4

  14. Data Inventory Service • Request is a cone on the sky

  15. Data Inventory Service • Relevant Images and Catalogs • NVSS Image • ROSAT catalog

  16. Image Federation

  17. VO Registry • OAI md server for ivo:// • VOResourceID: ivo://me.com/file123 • Diagram labels: Portals, Tools & Services, Query service, Schemas & Service Types, Aladin, OASIS, DIS, Databases, Grid, Virtual Data, VOView, Fill-in forms, Visualization, Reports, Publish service, Publishing

  18. What is in the Registry? • Answer: “Entities” • It has a global identifier ivo://……. • Must be resolved by authority • It has “VOViews” • Queries return these • …..and that’s all!

  19. 3 Views of an Entity • “entity” • Transportation metadata: <weight>4000 kg</weight> <poisonous>no</poisonous> <claws>no</claws> <food>carrots</food> <waste-mgmt>heavy</waste-mgmt> • Zoo-keeper metadata: <diet>carrots</diet> <excrement>yes</excrement> <fencing>strong</fencing> • Zoo-manager metadata: <popularity>9</popularity> <visitors>2500 per day</visitors> <feeding>carrots</feeding>

  20. VOResource • A mandatory form plus other supporting forms

  21. Schemas and Service Types • VOResource • Entity description form • Organization, project, data collection, service • Has ivo:// identifier • VORegion • sky coverage form (α/δ/λ) • VOTable • star catalog, image list, other tables • OAI • Registry harvesting • Distributed virtual registry • CONE • Request-response for catalog • SIAP • Request-response for images • When can I publish my own schema to VO?

  22. Dublin Core Metadata • Curation data for “any human creation” • Title: A name given to the resource. • Creator: An entity primarily responsible for making the content of the resource. • Subject: A topic of the content of the resource. • Description: An account of the content of the resource. • Publisher: An entity responsible for making the resource available. • Contributor: An entity responsible for making contributions to the content of the resource. • Date: A date of an event in the lifecycle of the resource. • Type: The nature or genre of the content of the resource. • Format: The physical or digital manifestation of the resource. • Identifier: An unambiguous reference to the resource within a given context. • Source: A reference to a resource from which the present resource is derived. • Language: A language of the intellectual content of the resource. • Relation: A reference to a related resource. • Coverage: The extent or scope of the content of the resource. • Rights: Information about rights held in and over the resource.
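
The fifteen elements above could be serialized as in the following sketch, which assumes the standard Dublin Core element namespace `http://purl.org/dc/elements/1.1/`. The wrapper element `record`, the function name, and the sample values are hypothetical and are not taken from the NVO registry.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

def dublin_core_record(fields):
    """Serialize a dict of Dublin Core fields as XML, rejecting unknown names."""
    unknown = sorted(set(fields) - set(DC_ELEMENTS))
    if unknown:
        raise ValueError("not Dublin Core elements: " + ", ".join(unknown))
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for name in DC_ELEMENTS:     # emit in the canonical element order
        if name in fields:
            child = ET.SubElement(root, "{%s}%s" % (DC_NS, name))
            child.text = fields[name]
    return ET.tostring(root, encoding="unicode")

xml_text = dublin_core_record({
    "title": "Example Sky Survey",          # hypothetical values
    "identifier": "ivo://example.org/survey",
    "publisher": "Example Observatory",
})
```

Every element is optional and repeatable in Dublin Core itself; this sketch keeps one value per element for brevity.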

  23. Dublin Core • Dublin Core is how the VO will interoperate with libraries of the world • A global metadata standard

  24. Prototype Registry • Organization • Data Collection • Project • Service • SIA service

  25. VOViews • VOResource view • Dublin Core view

  26. OAI: Open Archives Initiative Harvesting Protocol OAI is popular • Ask your University librarian Distributed Comprehensive Registry • Harvesting Different views for different purposes • Six blind men and the elephant

  27. OAI Harvesting Protocol 6 magic verbs of OAI
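
The slide does not list the verbs; in OAI-PMH 2.0 they are Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, and GetRecord. A harvesting request is an HTTP GET with a `verb` argument, which the sketch below assembles. The base URL and the function name are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# The six request types defined by OAI-PMH 2.0.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)

def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH GET URL for one of the six verbs."""
    if verb not in OAI_VERBS:
        raise ValueError("unknown OAI verb: " + verb)
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)

# Harvest every record in the plain Dublin Core format.
harvest_url = oai_request_url(
    "http://example.org/oai", "ListRecords", metadataPrefix="oai_dc"
)
print(harvest_url)  # http://example.org/oai?verb=ListRecords&metadataPrefix=oai_dc
```

The `metadataPrefix` argument is how a harvester asks for one view of a record rather than another, which fits the "different views for different purposes" point on the previous slide.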

  28. VO Identifiers • URI form • Still in flux • Example: ivo://mydomain.com/mySkySurvey#file00037.fits, with “/” and “#” as delimiters • Authority ID (mydomain.com) • Registered with IVOA • Must correspond to a registry • Resource ID (mySkySurvey) • Created by Authority • Resolved by registry • Record ID (file00037.fits) • Not known to registry
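
Splitting an identifier of this form into its three parts could be sketched as follows. The slide itself calls the format "still in flux", so this assumes exactly the layout of the slide's example (`/` before the resource ID, `#` before the record ID); the type and function names are hypothetical.

```python
from typing import NamedTuple, Optional

class VOIdentifier(NamedTuple):
    authority_id: str            # registered with IVOA
    resource_id: Optional[str]   # created by the authority, resolved by a registry
    record_id: Optional[str]     # not known to the registry

def parse_ivo_identifier(identifier):
    """Split 'ivo://authority/resource#record' into its parts."""
    prefix = "ivo://"
    if not identifier.startswith(prefix):
        raise ValueError("identifier must start with ivo://")
    rest, _, record_id = identifier[len(prefix):].partition("#")
    authority_id, _, resource_id = rest.partition("/")
    if not authority_id:
        raise ValueError("identifier has no authority ID")
    return VOIdentifier(authority_id, resource_id or None, record_id or None)

parsed = parse_ivo_identifier("ivo://mydomain.com/mySkySurvey#file00037.fits")
print(parsed.authority_id, parsed.resource_id, parsed.record_id)
# mydomain.com mySkySurvey file00037.fits
```

Only the first two parts would be sent to a registry for resolution; the record ID is meaningful only to the service that owns the resource.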

  29. Image Federation

  30. Multispectral Imagery • Crab Nebula: 3 channels: X-ray in blue, optical in green, and radio in red. • Moffett Field, California: 224 channels from 400 nm to 2500 nm

  31. Image Federation • Images of the same galaxy taken several days apart are automatically subtracted from one another, and remaining bright spots may be supernova candidates. (NEAT project) • Image subtraction allows detection of narrow-line features that are not also wide-band (e.g. Hα but not R-band) • Stacking allows detection of faint sources. A 1-sigma detection in each of many bands becomes a 3-sigma detection. • It's A New Window!
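
The stacking claim can be made quantitative under a simple assumption that the slide does not state: if the noise is independent and equal in every band and the bands are averaged with equal weight, significance grows as the square root of the number of bands. The function names below are hypothetical.

```python
import math

def stacked_significance(per_band_sigma, n_bands):
    """Significance after equal-weight stacking of n_bands independent images."""
    return per_band_sigma * math.sqrt(n_bands)

def bands_needed(per_band_sigma, target_sigma):
    """Smallest number of bands that reaches the target significance."""
    return math.ceil((target_sigma / per_band_sigma) ** 2)

print(stacked_significance(1.0, 9))  # 3.0
print(bands_needed(1.0, 3.0))        # 9
```

Under these assumptions, "many bands" means nine for the slide's 1-sigma to 3-sigma example; correlated noise or unequal depths would require more.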

  32. Principal Components • SDSS (5 channel) • SDSS+2MASS (8 channel)

  33. Mosaicking and Federation • Every astronomical image has a different projection • different pointing of the telescope • We want to mosaic different images • We want to federate different information • Compute intensive: • flux in each pixel is carefully distributed into a new pixel grid • Figure labels: Infrared map, X-ray map today, X-ray map last year, Mosaicking, Federation

  34. Atlasmaker • Uses Montage, Yoursky • Diagram labels: Project, Estimate & correct Background, Co-Add, Project, Data, Chart • Image: David Hockney, Pearblossom Highway, 1986

  35. Images and Charts • Image • Big data • Chart • Map: sphere → plane • FITS-WCS header • small data • An atlas is a collection of charts • Hyperatlas is an attempt to standardize atlases

  36. Hyperatlas • Standard naming for atlases and vcharts: atlas TM-5-SIN-20, vchart TM-5-SIN-20-1589 • Standard Scales: scale s means 2^(20-s) arcseconds per pixel • Standard Layout: TM-5 layout, HV-4 layout • Standard Projections: TAN projection, SIN projection
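
The naming scheme can be unpacked with a short sketch. It reads the scale rule as 2^(20-s) arcseconds per pixel, so that scale 20 is 1 arcsecond per pixel, and it treats the last field of a vchart name as a page number; both readings are assumptions, as are the function names.

```python
import math

def pixel_scale_arcsec(scale):
    """Arcseconds per pixel for Hyperatlas scale s, read as 2**(20 - s)."""
    return 2.0 ** (20 - scale)

def parse_hyperatlas_name(name):
    """Split an atlas name (TM-5-SIN-20) or vchart name (TM-5-SIN-20-1589)."""
    parts = name.split("-")
    if len(parts) not in (4, 5):
        raise ValueError("expected LAYOUT-N-PROJECTION-SCALE[-PAGE]")
    return {
        "layout": "-".join(parts[:2]),   # e.g. TM-5
        "projection": parts[2],          # e.g. SIN or TAN
        "scale": int(parts[3]),
        "page": int(parts[4]) if len(parts) == 5 else None,
    }

vchart = parse_hyperatlas_name("TM-5-SIN-20-1589")
arcsec_per_pixel = pixel_scale_arcsec(vchart["scale"])

# A SIN (orthographic) projection maps a hemisphere onto a disk of radius
# 1 radian at the tangent point, so the disk is 2 radians across.
disk_width_pixels = math.degrees(2.0) * 3600.0 / arcsec_per_pixel
```

At scale 20 the disk comes out a little over 400,000 pixels wide, which agrees with the figure quoted on the later "Charts and Pages" slide and supports this reading of the scale rule.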

  37. Parallel Atlasmaker • Making a single Image • Making an Atlas of 1736 Images • Teragrid Distributed • Federated Scheduling wanted • SRB as Virtual Data Catalog • MPI Parallelism • ~2% serial work (Amdahl) • Projection is parallel • All nodes share filespace
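
The "~2% serial work (Amdahl)" bullet refers to Amdahl's law, which bounds the speedup of a job whose serial fraction cannot be parallelized. The sketch below evaluates it; the function name and the processor counts are illustrative, not taken from the talk.

```python
def amdahl_speedup(serial_fraction, processors):
    """Amdahl's law: speedup = 1 / (s + (1 - s) / p)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("need at least one processor")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

serial = 0.02  # the slide's ~2% serial work
for processors in (1, 16, 64, 256, 1024):
    print(processors, round(amdahl_speedup(serial, processors), 1))

# No number of processors can beat 1 / serial_fraction.
speedup_limit = 1.0 / serial
```

With 2% serial work the speedup can never exceed 50, so adding nodes beyond a few hundred brings little further gain for a single job.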

  38. Atlasmaker Architecture • Diagram labels: NVO Protocol, VIEW Bus, Hyperatlas service making atlas pages, NVO/IVO, SIAP services (NED, Sloan, DPOSS, FIRST, [2MASS]), scale, reproject, compress, sky index, YourSky, VirtualSky, Oasis, federation, Virtual Data System, data mining

  39. Atlasmaker Virtual Data System • User request goes to Request manager • Mosaicked data is on file • 2a. Mosaicked data is not on file • 2b. Get raw data from NVO resources • 2c. Compute on TG/IPG • 2d. Store result & return result • Metadata repositories: Federated by OAI • Data repositories: Federated by SRB • Compute resources: Federated by TG/IPG
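
The request flow on this slide is a compute-on-miss cache: return the mosaic if it is on file, otherwise fetch, compute, store, and return. A minimal sketch follows, with an in-memory dictionary standing in for the file store and stub functions standing in for the NVO and grid resources. All names here are hypothetical and nothing reflects the actual Atlasmaker code.

```python
class MosaicRequestManager:
    """Serve mosaics from storage, computing them only when missing."""

    def __init__(self, fetch_raw, compute):
        self._fetch_raw = fetch_raw   # stands in for "get raw data from NVO resources"
        self._compute = compute       # stands in for "compute on TG/IPG"
        self._on_file = {}            # stands in for the stored mosaics

    def request(self, region):
        if region in self._on_file:          # mosaicked data is on file
            return self._on_file[region]
        raw = self._fetch_raw(region)        # 2b
        mosaic = self._compute(raw)          # 2c
        self._on_file[region] = mosaic       # 2d: store result
        return mosaic                        # 2d: return result

calls = []

def fake_fetch(region):
    calls.append(("fetch", region))
    return [1.0, 2.0, 3.0]

def fake_compute(raw):
    calls.append(("compute", len(raw)))
    return sum(raw)

manager = MosaicRequestManager(fake_fetch, fake_compute)
first = manager.request("M31")    # miss: fetches and computes
second = manager.request("M31")   # hit: served from storage
```

The second request triggers no fetch and no computation, which is the point of treating derived mosaics as virtual data.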

  40. Atlasmaker stack • Hyperatlas (service) • NVO Image Access (service) • Virtual Data System -- Chimera? • Atlasmaker (script) • Mosaicking (executables): Montage, YourSky • SRB (service) • web

  41. Charts and Pages • Page – an organization for data • Chart – a frame for specific data • SIN projection • The virtual disk is 400,000 pixels wide

  42. Background Correction • Figure panels: Uncorrected, Corrected

  43. Montage Background Correction • Project pixels to output chart • Fit ramps on overlap regions • Fit ramps on projected images • Subtract from Pixel values
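
The "fit ramps, then subtract" idea can be shown on a single image with a least-squares plane fit. This is a deliberate simplification and not Montage's code: Montage fits differences between overlapping image pairs and reconciles them across the whole mosaic, whereas this sketch fits one ramp a·x + b·y + c to the difference between one image and a hypothetical reference. All names and the sample data are made up for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_ramp(samples):
    """Least-squares fit of value = a*x + b*y + c to (x, y, value) samples."""
    n = sx = sy = sxx = sxy = syy = sz = sxz = syz = 0.0
    for x, y, z in samples:
        n += 1; sx += x; sy += y; sz += z
        sxx += x * x; sxy += x * y; syy += y * y
        sxz += x * z; syz += y * z
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(normal)
    if d == 0:
        raise ValueError("samples do not determine a plane")
    coefficients = []
    for col in range(3):                      # Cramer's rule, one column at a time
        m = [row[:] for row in normal]
        for r in range(3):
            m[r][col] = rhs[r]
        coefficients.append(det3(m) / d)
    return tuple(coefficients)

def subtract_ramp(image, ramp):
    """Remove the fitted ramp from every pixel of a row-major image."""
    a, b, c = ramp
    return [[v - (a * x + b * y + c) for x, v in enumerate(row)]
            for y, row in enumerate(image)]

# Flat sky of 10.0 plus a sloping background, on a 4x4 grid.
image = [[10.0 + 0.5 * x + 0.25 * y + 2.0 for x in range(4)] for y in range(4)]
reference = 10.0
overlap = [(x, y, image[y][x] - reference) for y in range(4) for x in range(4)]
ramp = fit_ramp(overlap)
corrected = subtract_ramp(image, ramp)
```

After subtraction every pixel of `corrected` returns to the flat sky level, which is the single-image analogue of the "Uncorrected" and "Corrected" panels on the previous slide.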
