
VO Standards and Protocols XML VOTable UCD ConeSearch


Presentation Transcript


  1. VO Standards and Protocols: XML, VOTable, UCD, ConeSearch. Roy Williams, California Institute of Technology, NVO co-director

  2. XML: Structured Information
     <From>Antonio Stradivarius</From>
     <To>Domenico Scarlatti</To>
     <Date>
       <Day>13</Day>
       <Month>4</Month>
       <Year>1723</Year>
     </Date>
     <Body>
       Io bisogno una appartamento acoglienti a Cremona …
     </Body>
     Separation of structure from presentation: the same date can be shown as 4/13/23, April 13, 1723, or 13.iv.1723
     The computer can read the document and answer queries like this: “Find all memos from April 1723”
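A minimal sketch of the "Find all memos from April 1723" query, using Python's standard `xml.etree.ElementTree`. The `<Memos>` and `<Memo>` wrapper elements, the second memo, and the `memos_from` helper are assumptions made for illustration; the slide shows only the child elements of a single memo.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper elements and a made-up second memo, for illustration only.
MEMOS = """
<Memos>
  <Memo>
    <From>Antonio Stradivarius</From>
    <To>Domenico Scarlatti</To>
    <Date><Day>13</Day><Month>4</Month><Year>1723</Year></Date>
    <Body>Io bisogno una appartamento acoglienti a Cremona</Body>
  </Memo>
  <Memo>
    <From>Domenico Scarlatti</From>
    <To>Antonio Stradivarius</To>
    <Date><Day>2</Day><Month>5</Month><Year>1723</Year></Date>
    <Body>(made-up reply)</Body>
  </Memo>
</Memos>
"""

def memos_from(root, year, month):
    """Return every Memo element whose Date falls in the given year and month."""
    hits = []
    for memo in root.iter("Memo"):
        date = memo.find("Date")
        if (int(date.findtext("Year")) == year
                and int(date.findtext("Month")) == month):
            hits.append(memo)
    return hits

root = ET.fromstring(MEMOS)
april = memos_from(root, 1723, 4)
print([m.findtext("From") for m in april])
```

Because the date is stored as separate Day, Month, and Year elements, the query never has to guess at a display format.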

  3. XML
     • Documents and data
     • Human readable, editable, mailable
     • Schema constrains structure
       -- can encode data models
     • Can be transformed (XSLT)
       -- other XML
       -- HTML/PDF/Excel etc.
     • Tools
     • Parsers in Java, C, C++, Perl, Python, ...
     • Browsers and editors
     • XML databases
     • Binding to make API
     • For serialization, mediation, brokers

  4. XML for science
     XML is a comfortable vehicle for our metadata and data models
     But the real challenge is:
       To define NVO-specific data objects
       And how they are used
     We need consensus more than either software or hardware
     • VOTable
     • VOResource
     • services -- WSDL

  5. XML example (no schema)
     <?xml version="1.0"?>
     <BookCatalogue>
       <Book>
         <Title>The Cambridge Star Atlas</Title>
         <Author>Wil Tirion</Author>
         <ISBN>0-52156-098-5</ISBN>
         <Publisher>Cambridge UP</Publisher>
       </Book>
       <Book>
         <Title>Parallel Computing Works!</Title>
         <Author>Geoffrey C. Fox</Author>
         <Author>Roy D. Williams</Author>
         <Author>Paul C. Messina</Author>
         <ISBN>1-55860-253-4</ISBN>
         <Publisher>Morgan Kaufmann</Publisher>
       </Book>
     </BookCatalogue>

  6. XML Parsing SAX: Event-Based
     Handler functions for StartElement, Text, EndElement, etc.
     Found element BookCatalogue
     Found element Book
     Found element Title
     Found text The Cambridge Star Atlas
     Found end element Title
     ….
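The event trace above can be reproduced with Python's standard `xml.sax` module. This is a sketch only: the `EventLogger` class name and the shortened catalogue are assumptions, and the handler skips whitespace-only text so the log resembles the slide.

```python
import xml.sax

CATALOGUE = b"""<?xml version="1.0"?>
<BookCatalogue>
  <Book>
    <Title>The Cambridge Star Atlas</Title>
    <Author>Wil Tirion</Author>
  </Book>
</BookCatalogue>"""

class EventLogger(xml.sax.ContentHandler):
    """Record each SAX event as one line of text."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append("Found element " + name)

    def characters(self, content):
        # The parser may deliver long text in several chunks;
        # whitespace between elements is ignored here.
        if content.strip():
            self.events.append("Found text " + content.strip())

    def endElement(self, name):
        self.events.append("Found end element " + name)

logger = EventLogger()
xml.sax.parseString(CATALOGUE, logger)
print("\n".join(logger.events))
```

Nothing is kept in memory except what the handler chooses to store, which is why the event style suits large documents.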

  7. Parsing DOM: Document Object Model
     Returns a tree-like Document object with data attached
     BookCatalogue
       Book
         Title: Cambridge Star Atlas
         Author: Wil Tirion
         ISBN
       Book
         Title: Parallel Computing Works!
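For comparison with the event style, here is a sketch of the tree style using Python's standard `xml.dom.minidom`. The `text_of` helper and the shortened catalogue are assumptions for illustration.

```python
from xml.dom import minidom

CATALOGUE = """<BookCatalogue>
  <Book>
    <Title>The Cambridge Star Atlas</Title>
    <Author>Wil Tirion</Author>
    <ISBN>0-52156-098-5</ISBN>
  </Book>
  <Book>
    <Title>Parallel Computing Works!</Title>
    <Author>Geoffrey C. Fox</Author>
    <Author>Roy D. Williams</Author>
    <Author>Paul C. Messina</Author>
  </Book>
</BookCatalogue>"""

def text_of(element):
    """Concatenate the text nodes directly under an element."""
    return "".join(node.data for node in element.childNodes
                   if node.nodeType == node.TEXT_NODE).strip()

# The whole document is parsed into a tree before any query runs.
doc = minidom.parseString(CATALOGUE)
books = doc.documentElement.getElementsByTagName("Book")

# Walk the tree: map each title to the list of its authors.
authors_by_title = {
    text_of(book.getElementsByTagName("Title")[0]):
        [text_of(a) for a in book.getElementsByTagName("Author")]
    for book in books
}
print(authors_by_title)
```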

  8. XML Schema
     <?xml version="1.0"?>
     <schema xmlns="http://www.w3.org/2000/10/XMLSchema"
             xmlns:cat="uri://BookCatalogue">
       <element name="BookCatalogue">
         <complexType>
           <sequence>
             <element ref="cat:Book" minOccurs="0" maxOccurs="unbounded"/>
           </sequence>
         </complexType>
       </element>
       <element name="Book">
         <complexType>
           <sequence>
             <element ref="cat:Title" minOccurs="1" maxOccurs="1"/>
             <!-- maxOccurs added: the example catalogue has a Book with three Authors -->
             <element ref="cat:Author" minOccurs="1" maxOccurs="unbounded"/>
             <element ref="cat:Date" minOccurs="0" maxOccurs="1"/>
             <element ref="cat:ISBN" minOccurs="1" maxOccurs="1"/>
             <element ref="cat:Publisher" minOccurs="1" maxOccurs="1"/>
           </sequence>
         </complexType>
       </element>
       <element name="Title" type="string"/>
       <element name="Author" type="string"/>
       <element name="Date" type="string"/>
       <element name="ISBN" type="string"/>
       <element name="Publisher" type="string"/>
     </schema>
     Book.xsd (xsd = XML Schema Definition)

  9. XSchema
     (Same Book.xsd schema as slide 8.)
     Callout on <schema>: All XML schemas have “schema” as the root element

  10. XSchema
      (Same schema as slide 8, with an annotation added inside the BookCatalogue complexType:)
      <annotation>Catalog is a sequence of books</annotation>
      Callout on xmlns="http://www.w3.org/2000/10/XMLSchema": Default namespace declaration: all these come from this standard namespace

  11. XSchema
      (Same Book.xsd schema as slide 8.)
      Callout on xmlns:cat="uri://BookCatalogue": This namespace is defined here and abbreviated as "cat"
      Callout on ref="cat:Book": This element comes from the namespace called “cat”
      Callout on <element name="Book">: Book element defined here

  12. Namespace Content
      Here: uri://BookCatalogue can be abbreviated as "cat"
      The “cat” namespace contains: BookCatalogue, Book, Title, Author, ISBN, Date, Publisher

  13. XML example (with schema)
      <?xml version="1.0"?>
      <BookCatalogue xmlns="uri://BookCatalogue"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xsi:schemaLocation="uri://BookCatalogue http://www.mydomain.com/schemas/bookcatalog.xsd">
        <Book>
          <Title>The Cambridge Star Atlas</Title>
          <Author>Wil Tirion</Author>
          <ISBN>0-52156-098-5</ISBN>
          <Publisher>Cambridge UP</Publisher>
        </Book>
        <Book>
          <Title>Parallel Computing Works!</Title>
          <Author>Geoffrey C. Fox</Author>
          <Author>Roy D. Williams</Author>
          <Author>Paul C. Messina</Author>
          <ISBN>1-55860-253-4</ISBN>
          <Publisher>Morgan Kaufmann</Publisher>
        </Book>
      </BookCatalogue>
      Callout on xmlns: Here is the namespace that we are using in this document
      Callout on xmlns:xsi: Document is instance of a W3C schema
      Callout on xsi:schemaLocation: Here is the URL of its schema
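Once a document declares a default namespace, every element name in it is qualified, and code that reads it has to say so. A small sketch with Python's standard `xml.etree.ElementTree`; the `NS` prefix map is a local choice (the prefix "cat" just mirrors the slides) and the catalogue is shortened.

```python
import xml.etree.ElementTree as ET

DOC = """<BookCatalogue xmlns="uri://BookCatalogue">
  <Book>
    <Title>The Cambridge Star Atlas</Title>
    <Author>Wil Tirion</Author>
  </Book>
</BookCatalogue>"""

# Prefix-to-URI map used only by this script; the document itself
# uses the namespace as its default, with no prefix.
NS = {"cat": "uri://BookCatalogue"}

root = ET.fromstring(DOC)

# ElementTree stores qualified names as {namespace-uri}local-name.
print(root.tag)

# A namespace-aware path finds the titles ...
titles = [t.text for t in root.findall("cat:Book/cat:Title", NS)]

# ... while an unqualified name matches nothing.
unqualified = root.findall("Book")
print(titles, unqualified)
```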

  14. VOTable
      • Full metadata representation
      • Hierarchy of RESOURCEs
        • containing PARAMs and TABLEs
      • UCD (unified content descriptor)
        • a has unit meter
        • a has UCD ORBIT_SIZE_SMAJ (Semi-major axis of the orbit)
      • Can reference remote and/or binary streams
      • Table can be
        • Pure XML
        • "Simple Binary"
        • FITS Binary Table

  15. Sample VOTable
      <?xml version="1.0"?>
      <!DOCTYPE VOTABLE SYSTEM "http://us-vo.org/xml/VOTable.dtd">
      <VOTABLE version="1.0">
        <DEFINITIONS>
          <COOSYS ID="myJ2000" equinox="2000." epoch="2000." system="eq_FK5"/>
        </DEFINITIONS>
        <RESOURCE>
          <PARAM name="Observer" datatype="char" arraysize="*" value="William Herschel">
            <DESCRIPTION>This parameter is designed to store the observer's name</DESCRIPTION>
          </PARAM>
          <TABLE name="Stars">
            <DESCRIPTION>Some bright stars</DESCRIPTION>
            <FIELD name="Star-Name" ucd="ID_MAIN" datatype="char" arraysize="10"/>
            <FIELD name="RA" ucd="POS_EQ_RA" ref="myJ2000" unit="deg" datatype="float" precision="F3" width="7"/>
            <FIELD name="Dec" ucd="POS_EQ_DEC" ref="myJ2000" unit="deg" datatype="float" precision="F3" width="7"/>
            <FIELD name="Counts" ucd="NUMBER" datatype="int" arraysize="2x3x*"/>
            <DATA>
              <TABLEDATA>
                <TR>
                  <TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
                  <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD>
                </TR>
                <TR>
                  <TD>Vega</TD><TD>279.234</TD><TD>38.782</TD>
                  <TD>8 7 8 6 8 6</TD>
                </TR>
              </TABLEDATA>
            </DATA>
          </TABLE>
        </RESOURCE>
      </VOTABLE>
      Callout (the DATA element can instead point to a remote FITS stream):
      <DATA>
        <FITS>
          <STREAM href="ftp://server.com/mydata.fits" expires="2002-02-22" actuate="onRequest"/>
        </FITS>
      </DATA>
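A sketch of how a client might read the TABLEDATA form of this sample, using only Python's standard library. The `convert` and `read_table` helpers and the `CONVERTERS` map are assumptions for illustration; they cover only the datatypes that appear in the sample, and the DOCTYPE, DEFINITIONS, and PARAM parts are left out to keep the example short.

```python
import xml.etree.ElementTree as ET

VOTABLE = """<VOTABLE version="1.0">
  <RESOURCE>
    <TABLE name="Stars">
      <FIELD name="Star-Name" ucd="ID_MAIN" datatype="char" arraysize="10"/>
      <FIELD name="RA" ucd="POS_EQ_RA" unit="deg" datatype="float"/>
      <FIELD name="Dec" ucd="POS_EQ_DEC" unit="deg" datatype="float"/>
      <FIELD name="Counts" ucd="NUMBER" datatype="int" arraysize="2x3x*"/>
      <DATA><TABLEDATA>
        <TR><TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
            <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD></TR>
        <TR><TD>Vega</TD><TD>279.234</TD><TD>38.782</TD>
            <TD>8 7 8 6 8 6</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

# Only the numeric datatypes needed here; a full reader would cover them all.
CONVERTERS = {"float": float, "double": float, "int": int}

def convert(text, field):
    """Convert one TD string according to its FIELD's datatype and arraysize."""
    datatype = field.get("datatype")
    if datatype == "char":
        return text                      # character arrays stay as strings
    conv = CONVERTERS[datatype]
    if field.get("arraysize"):
        return [conv(tok) for tok in text.split()]   # flat list of primitives
    return conv(text)

def read_table(table):
    """Return the rows of a TABLE as dicts keyed by FIELD name."""
    fields = table.findall("FIELD")      # metadata comes first ...
    rows = []
    for tr in table.iter("TR"):          # ... then the data rows
        cells = [td.text or "" for td in tr.findall("TD")]
        rows.append({f.get("name"): convert(c, f)
                     for f, c in zip(fields, cells)})
    return rows

rows = read_table(ET.fromstring(VOTABLE).find("RESOURCE/TABLE"))
print(rows[1])
```

The FIELD elements act as the class definition for a row: the reader needs them before it can interpret a single TD.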

  16. Table Cell
      • follows FITS binary table
      • does NOT follow XML schema
      Primitives: boolean, bit, unsignedByte, short, int, long, char, unicodeChar, float, double, floatComplex, doubleComplex
      A cell holds: scalar, arrays, variable length arrays, etc.

  17. VOTable is Flexy
      • e.g. Table of images
        • UCD="meta.code.mime; image.jpeg" datatype="unsignedByte" arraysize="*"
      • e.g. Table of URL links
        • UCD="meta.ref.url" datatype="char" arraysize="*"

  18. VOTable Schema (xsd)

  19. Table Data Model
      • Metadata
        • Class definition for Row
        • FIELD
          • data type
          • semantic type
      • Data
        • Each Row is a list of Cells
        • Each Cell is an array of Primitives
          • may be variable length

  20. Table Data Layout
      • All metadata first
        • small, complex, XML
        • Class definition for table record
        • + params, description, etc.
      • Then data
        • (may be) large, remote
        • XML | binary | FITS
        • Instantiations of table record
      • All records MUST have same format
        • binary data allows streaming, parallelism

  21. Param Data Model
      • Param is “Table with one cell”
      • Like a FIELD value
      • But with a “value” attribute

  22. Primitives
      • All have fixed binary length
      • Same as FITS primitives
      • Except Unicode
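Fixed-length primitives mean that a record made only of scalar cells has a known byte size, so a binary stream can be cut into records without parsing anything. The sketch below uses Python's `struct` module and assumes big-endian encoding with the usual sizes (1-byte unsignedByte, 2-byte short, 4-byte int and float, 8-byte long and double); the `STRUCT_CODES` map and `record_format` helper are illustrative names, and the byte stream is built locally rather than read from a real service.

```python
import struct

# Assumed mapping from VOTable primitive names to struct format codes.
STRUCT_CODES = {
    "unsignedByte": "B",
    "short": "h",
    "int": "i",
    "long": "q",
    "float": "f",
    "double": "d",
}

def record_format(datatypes):
    """Build a big-endian struct format for a record of scalar cells."""
    return ">" + "".join(STRUCT_CODES[t] for t in datatypes)

# A record with three scalar fields: RA, Dec, and a count.
fmt = record_format(["double", "double", "int"])
record_size = struct.calcsize(fmt)

# Two records back to back, standing in for a binary data stream.
stream = (struct.pack(fmt, 114.827, 5.227, 4)
          + struct.pack(fmt, 279.234, 38.782, 8))

# Every record has the same length, so any record can be located by offset.
records = [struct.unpack_from(fmt, stream, offset)
           for offset in range(0, len(stream), record_size)]
print(record_size, records)
```

Variable-length array cells break this fixed stride, which is why they need extra length information in a binary encoding.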

  23. Multidimensional Array Cell
      • A table cell can have lots of Primitives
      • Example: WCS parameters are arrays
        • <FIELD name="CRVAL" datatype="double" arraysize="2"/>
      • Example: up to 10 images, each 64x64
        • <FIELD name="thumbs" datatype="unsignedByte" arraysize="64x64x10*"/>
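The arraysize strings above follow a small grammar: dimensions separated by "x", with an optional "*" on the last one meaning variable length (a number before the "*" is an upper limit, as in "up to 10 images"). A sketch of a parser under that reading; `parse_arraysize` and `cell_shape` are hypothetical helper names.

```python
def parse_arraysize(spec):
    """Split an arraysize string into (fixed_dims, last_dim, variable).

    last_dim is the size of the last dimension, or its upper limit when
    variable is True, or None when no limit is given.
    """
    parts = spec.split("x")
    last = parts[-1]
    variable = last.endswith("*")
    if variable:
        last = last[:-1]
    fixed = tuple(int(p) for p in parts[:-1])
    last_dim = int(last) if last else None
    return fixed, last_dim, variable

def cell_shape(spec, n_values):
    """Infer the full shape of a cell that holds n_values primitives."""
    fixed, last_dim, variable = parse_arraysize(spec)
    if not variable:
        return fixed + (last_dim,)
    block = 1
    for d in fixed:
        block *= d
    if n_values % block:
        raise ValueError("value count does not fill whole blocks")
    count = n_values // block
    if last_dim is not None and count > last_dim:
        raise ValueError("more blocks than arraysize allows")
    return fixed + (count,)

print(parse_arraysize("64x64x10*"))
print(cell_shape("2x3x*", 12))
```

With arraysize="2x3x*", the 12 Counts values listed for Procyon in the sample VOTable give shape (2, 3, 2) and the 6 values for Vega give (2, 3, 1).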

  24. Hierarchy
      • A VOTable contains RESOURCEs
      • RESOURCE can contain:
        • TABLE
        • RESOURCE
        • etc.
      • Usage example
        • Many observations in the file, each is a RESOURCE
        • Each observation is
          • Parameters
          • Calibration table
          • Raw data table

  25. Hierarchy
      • New feature: GROUP
      <TABLE name="Nutation and Aberration">
        <GROUP name="Nutation">
          <FIELD name="Longitude"/>
          <FIELD name="Obliquity"/>
        </GROUP>
        <GROUP name="Aberration">
          <GROUP name="Equinox 1950.0">
            <FIELD name="C"/>
            <FIELD name="D"/>
          </GROUP>
          <GROUP name="Equinox 1955.0">
            <FIELD name="C"/>
            <FIELD name="D"/>
          </GROUP>
        </GROUP>
      </TABLE>

  26. Astronomical Data
      • Image
        • Standard file format: FITS
        • Standardized c.1980
        • Keyword-value dictionary + binary block
      • Catalog
        • Derived from image
        • Connected set of bright pixels
        • “Table of stars”
        • Standard format: VOTable
        • Standardized 2002
        • XML with remote binary
      • Spectrum

  27. XSLT Example
      <VOTABLE version="1.0">
        <DESCRIPTION>Output from the messier catalog at VirtualSky.org</DESCRIPTION>
        <RESOURCE type="results">
          <PARAM ID="RA" datatype="E" value="200.0" />
          <PARAM ID="DE" datatype="E" value="40.0" />
          <PARAM ID="SR" datatype="E" value="30.0" />
          <PARAM ID="PositionalError" datatype="E" value="0.1" />
          <PARAM ID="Credit" datatype="A" arraysize="*" value="Charles Messier, Richard Gelderman" />
          <TABLE>
            <DESCRIPTION>Output from messier Catalog Server</DESCRIPTION>
            <FIELD ID="I" name="Messier Number" datatype="char" arraysize="*" ucd="ID_MAIN">
              <DESCRIPTION>Messier Number</DESCRIPTION>
            </FIELD>
            <FIELD ID="RA" name="Right Ascension" datatype="float" unit="degrees" ucd="POS_EQ_RA_MAIN">
              <DESCRIPTION>Right Ascension J2000</DESCRIPTION>
            </FIELD>
            ....
            <DATA>
              <TABLEDATA>
                <TR>
                  <TD>3</TD>
                  <TD>205.5</TD>
                  <TD>28.402</TD>
                  <TD />
                  <TD>16.2'</TD>
                  <TD>6.4004</TD>
                  <TD>Globular Cluster</TD>
                  <TD>Canes Venatici</TD>
                  <TD>M3 is one of more heavily studied globular clusters due to its position in the galaxy, putting it far from interstellar absorption. More than 200 variable stars have been observed out of a total of near 50,000. Being one of the brightest clusters, M3 is</TD>
                </TR>

  28. XSLT Result: this table is the result of a cone search

  29. XSLT Program
      <h2>Data</h2>
      <table border="1">
        <tr>
          <xsl:for-each select="FIELD">
            <td><b><xsl:value-of select="@name" /></b></td>
          </xsl:for-each>
        </tr>
        <xsl:for-each select="DATA">
          <xsl:for-each select="TABLEDATA">
            <xsl:for-each select="TR">
              <tr>
                <xsl:for-each select="TD">
                  <td width="100"><xsl:value-of select="." /></td>
                </xsl:for-each>
              </tr>
            </xsl:for-each>
          </xsl:for-each>
        </xsl:for-each>
      </table>
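Python's standard library has no XSLT processor, so the following is not the stylesheet itself but a sketch of the same transformation written directly against `xml.etree.ElementTree`: one header row from the FIELD names, then one HTML row per TR. The `table_to_html` helper and the cut-down Messier table are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

VOTABLE = """<VOTABLE version="1.0">
  <RESOURCE type="results">
    <TABLE>
      <FIELD ID="I" name="Messier Number" datatype="char" arraysize="*" ucd="ID_MAIN"/>
      <FIELD ID="RA" name="Right Ascension" datatype="float" unit="degrees" ucd="POS_EQ_RA_MAIN"/>
      <DATA><TABLEDATA>
        <TR><TD>3</TD><TD>205.5</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

def table_to_html(table):
    """Build an HTML <table> element from a VOTable TABLE element."""
    html = ET.Element("table", border="1")
    # Header row: one bold cell per FIELD name.
    header = ET.SubElement(html, "tr")
    for field in table.findall("FIELD"):
        cell = ET.SubElement(header, "td")
        ET.SubElement(cell, "b").text = field.get("name")
    # Body rows: one <tr> per TR, one <td> per TD.
    for tr in table.findall("DATA/TABLEDATA/TR"):
        row = ET.SubElement(html, "tr")
        for td in tr.findall("TD"):
            ET.SubElement(row, "td", width="100").text = td.text or ""
    return html

table = ET.fromstring(VOTABLE).find("RESOURCE/TABLE")
html = ET.tostring(table_to_html(table), encoding="unicode")
print(html)
```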

  30. Binding to make a Parser
      From the Schema an API and library are generated
      • JAXB
      • Breeze
      • Castor
      This is JAVOT (Caltech)
      // Report which table column holds the right ascension
      for (int i = 0; i < table.getFieldCount(); i++) {
          Field field = (Field) table.getFieldAt(i);
          String u = field.getUcd();
          if (u != null && u.equals("POS_EQ_RA_MAIN"))
              System.out.println("Field " + i + " is for RA");
      }

  31. Unified Content Descriptor
      • UCD is a “semantic type”
        • phot.mag;em.opt.B          Integrated total blue magnitude
        • src.orbital.eccentricity   Orbital eccentricity
        • stat.median                Statistics: median value
      • Base + Specifiers
        • e.g. error in default right ascension
        • stat.error; pos.eq.ra; meta.main
      • First word is "type"
        • "what kind of thing is this?"
      • How do we add a stat.error to another?

  32. Unified Content Descriptor
      • UCD has services
        • Natural Language Description
        • Find best UCD
          • Search in NLD
      • Matching functions
        • if I want pos.eq.ra, is stat.error;pos.eq.ra correct?
      • What about Ontology???
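One possible reading of the matching question, as a sketch: split the UCD into its semicolon-separated words, treat the first word as the "type", and use the dotted hierarchy for "is a kind of" tests. All function names here are hypothetical, and the matching rules are an illustration of the slide's question, not a published algorithm.

```python
def ucd_words(ucd):
    """Split a UCD into its semicolon-separated words."""
    return [w.strip() for w in ucd.split(";") if w.strip()]

def primary(ucd):
    """The first word says what kind of thing the column is."""
    return ucd_words(ucd)[0]

def is_a(word, ancestor):
    """True if word equals ancestor or lies below it in the dotted tree."""
    return word == ancestor or word.startswith(ancestor + ".")

def is_quantity(ucd, wanted):
    """Strict match: the column itself is a `wanted` quantity."""
    return is_a(primary(ucd), wanted)

def mentions(ucd, wanted):
    """Loose match: `wanted` appears anywhere, e.g. as a specifier."""
    return any(is_a(w, wanted) for w in ucd_words(ucd))

# A column of errors on right ascension mentions pos.eq.ra,
# but it is not itself a right ascension.
print(is_quantity("stat.error;pos.eq.ra", "pos.eq.ra"))
print(mentions("stat.error;pos.eq.ra", "pos.eq.ra"))
```

Under this reading the answer to the slide's question is "no" for a strict match: a client that wants positions should test the first word, and use the looser test only when looking for related columns such as errors.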

  33. Some UCD
      S  stat                     Statistical parameters
      Q  stat.Fourier             Fourier coefficient
      Q  stat.Fourier.amplitude   Amplitude Fourier coefficient
      P  stat.covariance          Covariance between two parameters
      P  stat.error               Statistical error
      P  stat.error.sys           Systematic error
      Q  stat.fit                 Fit
      Q  stat.fit.chi2            Chi2
      Q  stat.fit.dof             Degrees of freedom
      Q  stat.fit.goodness        Goodness or significance of fit
      Q  stat.fit.omc             Observed minus computed
      Q  stat.fit.param           Parameter of fit
      Q  stat.fit.residual        Residual fit
      Q  stat.likelihood          Likelihood
      S  stat.max                 Maximum or upper limit
      S  stat.mean                Mean, average value
      S  stat.median              Median value
      S  stat.min                 Minimum or lowest limit

  34. Some UCD
      S  phot                 Photometry
      Q  phot.calib           Photometric calibration
      Q  phot.color           Color index or magnitude difference
      Q  phot.color.Cous      Color index in Cousins system
      Q  phot.color.Gen       Color index in Geneva system
      Q  phot.color.Gunn      Color index in Gunn system
      Q  phot.color.JHN       Color index in Johnson 65+ system
      S  meta                 Metadata
      P  meta.bib             Bibliographic reference
      P  meta.bib.author      Author name
      P  meta.bib.bibcode     Bibcode
      P  meta.bib.ivo         IVOA identifier ivo://
      P  meta.bib.fig         Figure in a paper
      P  meta.bib.journal     Journal name
      P  meta.bib.page        Page number
      P  meta.bib.volume      Volume number
      P  meta.code            Code or flag
      P  meta.code.class      Classification code

  35. Cone Search
      • First VO standard service
      • Input: RA, DEC, SR must be present
        • decimal degrees J2000
      • Output: VOTable of sky-located data records
        • must have columns with UCDs: POS_EQ_RA_MAIN, POS_EQ_DEC_MAIN, ID_MAIN
      Request: RA=300 DEC=25 SR=0.1
      Response: table with columns ID, RA, DEC, x, y, z
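A sketch of the client side of a cone search: build the request URL from RA, DEC, and SR, check that a response declares the three required UCDs, and test whether a record lies inside the cone. The base URL `http://example.org/cone` is a placeholder, not a real service, and the helper names are assumptions; the separation uses the standard haversine formula.

```python
import math
from urllib.parse import urlencode

REQUIRED_UCDS = {"POS_EQ_RA_MAIN", "POS_EQ_DEC_MAIN", "ID_MAIN"}

def cone_search_url(base_url, ra, dec, sr):
    """Append RA, DEC, SR (decimal degrees, J2000) to a service base URL."""
    if base_url.endswith(("?", "&")):
        sep = ""
    elif "?" in base_url:
        sep = "&"
    else:
        sep = "?"
    return base_url + sep + urlencode({"RA": ra, "DEC": dec, "SR": sr})

def missing_ucds(field_ucds):
    """Return the required UCDs that a response's columns do not declare."""
    return REQUIRED_UCDS - set(field_ucds)

def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle distance in degrees (haversine), inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h)))

def in_cone(record_ra, record_dec, ra, dec, sr):
    """True if the record lies within sr degrees of the cone centre."""
    return angular_separation(record_ra, record_dec, ra, dec) <= sr

# Placeholder base URL; the request values are the ones shown on the slide.
url = cone_search_url("http://example.org/cone", 300, 25, 0.1)
print(url)
print(in_cone(300.0, 25.05, 300, 25, 0.1))
```

Note that a difference of 0.2 degrees in RA at DEC=25 is a separation of about 0.18 degrees on the sky, so comparing coordinates directly would select the wrong records.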

  36. Cone Searches in a VO Registry

  37. Result of Cone Search RA Dec ID

  38. Cone Search + Density Probe: Federation of Multiple Services
      [Figure: a Density Probe, with inputs baseURL, spacing, and search radius, calling a Cone Search service]
      Interoperating NVO-compliant services!
