Learn about defining custom tags, making data human-readable, separating content from presentation, and using XML for input data in the field of Electricity and Magnetism. Understand how to create schemas, use parsers, navigate XML trees, and overcome practical drawbacks in XML processing. Explore JavaBeans with Castor and discover standard XML dialects like MathML, ChemistryML, and SVG for scientific visualization.
XML for Scientific Applications Marlon Pierce ERDC Tutorial August 16 2001
What is XML? • Standard rule set for defining custom tags. • Make your (meta)data human-readable. • Separate data content from presentation (XSL). • Rules for a particular dialect defined in either DTD or Schema. • W3C: Standards Making Body • Same people that produced HTML. • See http://www.w3c.org
Ex: XML for Electricity and Magnetism <?xml version="1.0"?> <!DOCTYPE ProjectDesc SYSTEM "GridViewer.dtd"> <ProjectDesc> <GridData> <NumberOfMaterials>2</NumberOfMaterials> <GridDimensions> Tags omitted for brevity </GridDimensions> <DataFile> <FileName>balloon.dat</FileName> <FileType>ASCII</FileType> <FileFormat>P3D</FileFormat> <Compression>none</Compression> </DataFile> </GridData> …Tags omitted for brevity… </ProjectDesc>
EX: E&M DTD Fragment <!ELEMENT ProjectDesc (GridData,MaterialList)> <!ELEMENT GridData (GridDimensions,DataFile)> <!ELEMENT GridDimensions (X,Y,Z)> Cut for brevity. <!ELEMENT DataFile (FileName,FileType,FileFormat,Compression)> <!ELEMENT FileName (#PCDATA)> <!ELEMENT FileType (#PCDATA)> <!ELEMENT FileFormat (#PCDATA)> <!ELEMENT Compression (#PCDATA)> <!ELEMENT MaterialList (Material+)> <!ELEMENT Material (Name,Color,Epsilon*,Mu*,Sigma*,Mag*)>
What the DTD Tells You • Which tags can be included • Parent/child relationships • How many times each child tag may appear • Exactly 1 (no suffix), 0 or 1 (?), 0 or more (*), 1 or more (+). • Names of attributes • Whether a tag takes parsed character data (#PCDATA)
Ex: E&M Schema Fragment <schema> <element name="ProjectDesc" type="ProjectDescType"/> <complexType name="ProjectDescType"> <element name="GridData" type="GridDataType"/> <element name="MaterialList" type="MatListType"/> </complexType> <complexType name="GridDataType"> <element name="NumberOfMaterials" type="int"/> <element name="GridDimensions" type="GridDimType"/> <element name="DataFile" type="DataFileType"/> </complexType> ….</schema>
Schema v. DTD (a partial list) • Schemas are in XML; DTDs are not. • Schemas have several simple types (integers, strings, floats, …); DTDs treat everything as character data. • Schema complex types support inheritance • Bee complex type can be extended by drone, queen, worker subtypes. • But DTDs have been around longer.
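The simple-type point above can be sketched in Java. This is a minimal illustration, not part of the original tutorial: it assumes a current JDK (the `javax.xml.validation` package), and the one-element schema and the class name `SchemaTypeSketch` are hypothetical. It shows a schema rejecting a non-integer value that a DTD would accept as character data.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SchemaTypeSketch {
    // Hypothetical one-element schema: NumberOfMaterials must hold an integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='NumberOfMaterials' type='xs:int'/>"
      + "</xs:schema>";

    // Returns true if the document satisfies the schema, false if validation fails.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            try {
                validator.validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException invalid) {
                return false; // the document violates the schema (or is malformed)
            }
        } catch (SAXException | IOException e) {
            // Reached only if the schema itself is broken or reading fails.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<NumberOfMaterials>2</NumberOfMaterials>"));
        System.out.println(isValid("<NumberOfMaterials>two</NumberOfMaterials>"));
    }
}
```

The first call should report a valid document and the second an invalid one, since "two" is not an `xs:int`.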
Now What? • Get a parser for your favorite language • Apache XML Project’s Xerces parser supports Java, C++, Perl • http://xml.apache.org • Write code using the parser: • Validates XML files. • Returns the DOM. • You can now navigate the XML document tree
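A validating parse might look like the sketch below. It is an illustration under assumptions: it uses the standard JAXP interfaces bundled with the JDK rather than calling Xerces classes directly, it inlines the DataFile part of the DTD as an internal subset so no GridViewer.dtd file is needed, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {
    // XML declaration plus an internal DTD subset copied from the E&M DTD fragment.
    static final String DOCTYPE =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE DataFile [\n"
      + "<!ELEMENT DataFile (FileName,FileType,FileFormat,Compression)>\n"
      + "<!ELEMENT FileName (#PCDATA)>\n"
      + "<!ELEMENT FileType (#PCDATA)>\n"
      + "<!ELEMENT FileFormat (#PCDATA)>\n"
      + "<!ELEMENT Compression (#PCDATA)>\n"
      + "]>\n";

    // Parses DOCTYPE + body with validation on and collects validity errors.
    static List<String> validationErrors(String body) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity violation: record it
                }
                @Override public void fatalError(SAXParseException e)
                        throws SAXParseException {
                    throw e; // not well-formed: give up
                }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
        } catch (Exception e) {
            throw new IllegalStateException("document could not be parsed", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String good = "<DataFile><FileName>balloon.dat</FileName><FileType>ASCII</FileType>"
            + "<FileFormat>P3D</FileFormat><Compression>none</Compression></DataFile>";
        String bad = "<DataFile><FileName>balloon.dat</FileName><FileType>ASCII</FileType>"
            + "<FileFormat>P3D</FileFormat></DataFile>"; // Compression is missing
        System.out.println(validationErrors(good));
        System.out.println(validationErrors(bad));
    }
}
```

Validity errors are collected rather than thrown here so a caller can report all of them at once; a stricter program could rethrow from `error` instead.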
Document Object Model • Defines general entities that make up the document. • Forms a tree • Objects include • Document • Node • Element • Attribute ProjectDesc GridData MaterialList
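Walking the tree with the generic DOM interfaces can be sketched as follows. The document is a cut-down version of the ProjectDesc example, and the class and helper names (`DomWalkSketch`, `outline`, `firstText`) are hypothetical; only standard `org.w3c.dom` calls are used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomWalkSketch {
    // Cut-down version of the E&M example document.
    static final String XML =
        "<ProjectDesc><GridData><NumberOfMaterials>2</NumberOfMaterials>"
      + "<DataFile><FileName>balloon.dat</FileName></DataFile></GridData>"
      + "<MaterialList/></ProjectDesc>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Appends one indented line per element, depth-first; text nodes are skipped.
    private static void walk(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.getNodeName()).append('\n');
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            walk(child, depth + 1, out);
        }
    }

    // Returns the element tree of the document as an indented outline.
    static String outline(String xml) {
        StringBuilder out = new StringBuilder();
        walk(parse(xml).getDocumentElement(), 0, out);
        return out.toString();
    }

    // Returns the text inside the first element with the given tag name.
    static String firstText(String xml, String tag) {
        return parse(xml).getElementsByTagName(tag).item(0).getTextContent();
    }

    public static void main(String[] args) {
        System.out.print(outline(XML));
        System.out.println(firstText(XML, "FileName"));
    }
}
```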
Practical Drawbacks • The DOM classes are very general. They only provide you with the most general way of navigating the tree. • Typically for every XML dialect you create, you will have to write new code to extract the information. • It would be nice if there were a better way to do this…
Automatic JavaBeans with Castor • XML trees map nicely into Java Bean components. Get/Set methods return the information. • Castor: automatically generates JavaBeans from XML and vice versa. • You just write the Bean classes (simple) and Castor handles the mapping to XML. • http://castor.exolabs.org
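To make the bean mapping concrete, here is a hand-written sketch of the kind of XML-to-bean code that a data-binding tool is meant to spare you. It does not use Castor's API: the `DataFile` bean, `fromXml`, and `toXml` are hypothetical names, and only JDK classes are involved.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DataFileBeanSketch {
    static final String SAMPLE =
        "<DataFile><FileName>balloon.dat</FileName><FileType>ASCII</FileType>"
      + "<FileFormat>P3D</FileFormat><Compression>none</Compression></DataFile>";

    // A plain JavaBean mirroring the <DataFile> element of the E&M example.
    static class DataFile {
        private String fileName, fileType, fileFormat, compression;
        public String getFileName() { return fileName; }
        public void setFileName(String v) { fileName = v; }
        public String getFileType() { return fileType; }
        public void setFileType(String v) { fileType = v; }
        public String getFileFormat() { return fileFormat; }
        public void setFileFormat(String v) { fileFormat = v; }
        public String getCompression() { return compression; }
        public void setCompression(String v) { compression = v; }
    }

    private static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent();
    }

    // XML -> bean: the per-dialect extraction code, written by hand.
    static DataFile fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            DataFile bean = new DataFile();
            bean.setFileName(childText(root, "FileName"));
            bean.setFileType(childText(root, "FileType"));
            bean.setFileFormat(childText(root, "FileFormat"));
            bean.setCompression(childText(root, "Compression"));
            return bean;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Bean -> XML: naive string building, with no escaping of special characters.
    static String toXml(DataFile bean) {
        return "<DataFile><FileName>" + bean.getFileName() + "</FileName>"
            + "<FileType>" + bean.getFileType() + "</FileType>"
            + "<FileFormat>" + bean.getFileFormat() + "</FileFormat>"
            + "<Compression>" + bean.getCompression() + "</Compression></DataFile>";
    }

    public static void main(String[] args) {
        DataFile bean = fromXml(SAMPLE);
        bean.setCompression("gzip");
        System.out.println(toXml(bean));
    }
}
```

Every new element or dialect would need more code of this shape, which is the repetition that automatic mapping removes.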
Some Standard XML Dialects • Don’t reinvent what already exists. See http://www.w3c.org/TR • MathML • CML: Chemical Markup Language • SVG: Scalable Vector Graphics • SOAP: Simple Object Access Protocol • RDF: Resource Description Framework
XML Namespaces • Namespaces allow you to mix different types of XML. • You can combine custom and standard tags • Ex: combine GEMML plus MathML
Namespace Example <gem xmlns:gem="http://www.gem.org/gem" xmlns:m="http://www.w3c.org/TR/REC-MathML/"> <gem:analysis> <m:math> <!-- MathML expressions --> </m:math> <!-- GEM analysis content --> </gem:analysis> </gem>
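A namespace-aware parse of a document like the one above could be sketched as follows. The namespace URIs are the ones from the slide; the `<m:mi>` child, the class name, and the helper names are added here for illustration. Elements are looked up by namespace URI, not by prefix.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceSketch {
    static final String GEM_NS = "http://www.gem.org/gem";
    static final String MATHML_NS = "http://www.w3c.org/TR/REC-MathML/";

    // Same shape as the slide's example, with one MathML identifier added.
    static final String XML =
        "<gem xmlns:gem='" + GEM_NS + "' xmlns:m='" + MATHML_NS + "'>"
      + "<gem:analysis><m:math><m:mi>x</m:mi></m:math></gem:analysis>"
      + "</gem>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts all elements that belong to the given namespace URI.
    static int countInNamespace(String xml, String namespaceUri) {
        return parse(xml).getElementsByTagNameNS(namespaceUri, "*").getLength();
    }

    // Namespace URI of the root element, or null if it is in no namespace.
    static String rootNamespace(String xml) {
        return parse(xml).getDocumentElement().getNamespaceURI();
    }

    public static void main(String[] args) {
        System.out.println("GEM elements:    " + countInNamespace(XML, GEM_NS));
        System.out.println("MathML elements: " + countInNamespace(XML, MATHML_NS));
        // The root <gem> carries no prefix and no default namespace is declared,
        // so it belongs to no namespace at all.
        System.out.println("root namespace:  " + rootNamespace(XML));
    }
}
```

Because the lookup uses the URI, the code keeps working if a document binds the same namespaces to different prefixes.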
Additional References and Resources • Inside XML by Steven Holzner. New Riders (2001). • The W3C has a nice schema tutorial at www.w3.org/TR/xmlschema-0/ • The ARL ICE project mixes XML and HDF5: www.arl.hpc.mil/ice/XdmfUser.html • XSIL is a markup language for scientific data: www.cacr.caltech.edu/SDA/xsil