DAR Metadata Catalog

Presentation Transcript

  1. DAR Metadata Catalog
     • Markus Heene, DWD
     • markus.heene@dwd.de

  2. Agenda
     • Welcome
     • Notes
     • Performance Test - Infrastructure
     • High level architecture
        • Geonetwork
        • terraCatalog
     • Performance Tests
        • Requirements
        • Preconditions
        • Results
        • Remarks
     • Resources

  3. Notes
     • The presented results are from May 2009
     • Both software solutions have released newer versions
        • Geonetwork 2.6
        • terraCatalog 3.0
     • The findings of the Performance Study were made available to both

  4. Performance Test - Infrastructure
     • Test client
     • Tomcat 5.5
     • Oracle 10g
     • Application Server
        • CPU: 4 × AMD Opteron, 1800 MHz
        • RAM: 9186716 kB

  5. Geonetwork: High level architecture
     • Geonetwork (versions 2.2 and 2.4)
     • Servlet container
        • Main development targets Jetty (migration to other servlet containers such as Tomcat or OC4J is possible)
        • Geonetwork consists of 3 different web applications which can interact
        • Different frameworks are used for the development: Jeeves, Struts, Spring, …
        • A system architecture redesign is announced for the next generation of Geonetwork: removal of the Jeeves framework (“Bringing data and metadata closer together”, FOSS4G 2008, Cape Town, by Jeroen Ticheler)
     • Metadata handling
        • The metadata XML file is stored as a “large object” in the database (different vendors are supported)
        • Search is mainly based on a Lucene index outside of the database
        • <gmd:fileIdentifier> is limited to varchar2(250) in the basic installation
        • Building the Lucene index takes a very long time
     • Additional remarks
        • Open source software
        • Stable solution so far (migration to another servlet container needs time)
        • Version 2.2 implements only some of the CSW queries
        • Some Z39.50 support is available; currently there is only limited experience with it inside DWD
        • Production installations with up to 25,000 records are running (as far as we found)

  6. terraCatalog 2.3: High level architecture
     • terraCatalog 2.3
     • Servlet container
        • Main development targets Tomcat (migration is possible but was not tried)
        • terraCatalog consists of different web applications which can interact
        • Consistent usage of frameworks across all web applications
     • Metadata handling
        • The metadata XML file is stored in the database and “mapped” into a relational model (database support for PostgreSQL and Oracle)
        • Search is a function of the database (Oracle Spatial and Text)
        • The mapping into the relational model causes conflicts with XML documents (e.g. title is limited to varchar2(255), same for abstract and keywords) → valid ISO-conformant XML documents could not be imported into terraCatalog
        • The Oracle Spatial datatype could store only half of the world → special treatment is necessary for the whole globe → we found Oracle errors in certain situations
     • Additional remarks
        • Commercial software with support
        • Much more complete implementation of CSW compared to Geonetwork 2.2
        • No Z39.50 search functionality → additional investment necessary
        • Production installations with up to 25,000 records are running
        • We found some bugs: SQL injection, Oracle errors, import of valid XML documents not possible, error when exporting metadata as an XML document
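The column limits mentioned for the two catalogs (file identifier at 250 characters in Geonetwork; title, abstract and keywords at 255 in terraCatalog) suggest checking records before an import. The following is only a minimal sketch under stated assumptions: the field names, the `LIMITS` table and `find_oversized_fields` are hypothetical helpers for illustration, not part of either product, and records are assumed to be already flattened into a dictionary.

```python
# Column limits quoted on the slides. The dictionary keys are illustrative
# names chosen for this sketch, not identifiers from either product.
LIMITS = {
    "geonetwork": {"fileIdentifier": 250},
    "terracatalog": {"title": 255, "abstract": 255, "keywords": 255},
}


def find_oversized_fields(record, system):
    """Return the names of fields whose text exceeds the limit for `system`.

    Lengths are counted in characters here. Depending on the database
    settings a varchar2 limit may be counted in bytes instead, in which
    case the encoded length would have to be compared.
    """
    limits = LIMITS[system]
    return [
        name
        for name, max_len in limits.items()
        if len(record.get(name, "")) > max_len
    ]


if __name__ == "__main__":
    record = {"fileIdentifier": "urn:x-example:1", "title": "T" * 300}
    print(find_oversized_fields(record, "terracatalog"))
```

Running such a check over an archive before import would flag valid ISO documents that the relational mapping cannot hold, instead of discovering them as import errors.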

  7. Performance Tests - Requirements
     • Requirements based on WMO and INSPIRE
     • WMO (see WIS-TechSpec-8, DAR Catalogue Search and Retrieval, Technical Specification 1.1)
        • Response time < 2 sec
        • 40 combined searches (keyword and bounding box) per second
        • Minimum of 20 active sessions
     • INSPIRE
        • Response time < 3 sec
        • Minimum of 30 active sessions
     • DWD
        • Minimum of 100,000 metadata records
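The thresholds above can be written down as data and checked against measurements. This is a minimal sketch, not the test client used in the study: the `Requirement` class, the `meets` function and the way measurements are passed in are assumptions made for illustration. It treats the response time limit as applying to the slowest observed request, which is one possible reading of the requirement.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Requirement:
    name: str
    max_response_s: float        # response time must be strictly below this
    min_sessions: int            # minimum number of active sessions
    min_searches_per_s: Optional[float] = None  # only WMO states a throughput


# Values taken from the slide.
WMO = Requirement("WMO", max_response_s=2.0, min_sessions=20, min_searches_per_s=40.0)
INSPIRE = Requirement("INSPIRE", max_response_s=3.0, min_sessions=30)


def meets(req: Requirement, response_times_s: Sequence[float],
          sessions: int, searches_per_s: float) -> bool:
    """Check one measurement run against a requirement set."""
    if not response_times_s:
        return False  # nothing measured, nothing proven
    if max(response_times_s) >= req.max_response_s:
        return False
    if sessions < req.min_sessions:
        return False
    if req.min_searches_per_s is not None and searches_per_s < req.min_searches_per_s:
        return False
    return True


if __name__ == "__main__":
    run = [0.4, 1.1, 2.6]  # made-up response times in seconds
    print(meets(WMO, run, sessions=30, searches_per_s=45.0))
    print(meets(INSPIRE, run, sessions=30, searches_per_s=45.0))
```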

  8. Performance Tests - Preconditions
     • Importing metadata
        • Practical package size was 5,000 metadata records per archive
        • Import takes a lot of time (5,000 records take about 45 to 60 minutes)
        • Importing metadata into terraCatalog generates GBs of redo logs (200 MB per minute)
     • Formulating queries in CSW 2.0.2
        • The challenge was to describe a query that both systems understood (limited CSW implementation in Geonetwork 2.2)
        • Parameterize the query for different result sets (e.g. search title for “zyx” → 0 hits, search title for “gts” → 136,511 hits)
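A parameterized combined search (title keyword plus bounding box) in CSW 2.0.2 could look roughly like the sketch below. It is not the query used in the study: the choice of `dc:title` and `ows:BoundingBox` as queryables, the `brief` element set, and the helper name `build_query` are assumptions, and the corner axis order that a server expects depends on its configuration and coordinate reference system.

```python
from xml.sax.saxutils import escape

# CSW 2.0.2 GetRecords request with an OGC Filter 1.1.0 constraint.
# Placeholders in braces are filled in by build_query().
GETRECORDS_TEMPLATE = """<csw:GetRecords
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:ogc="http://www.opengis.net/ogc"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:ows="http://www.opengis.net/ows"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    service="CSW" version="2.0.2" resultType="results"
    startPosition="1" maxRecords="{max_records}">
  <csw:Query typeNames="csw:Record">
    <csw:ElementSetName>brief</csw:ElementSetName>
    <csw:Constraint version="1.1.0">
      <ogc:Filter>
        <ogc:And>
          <ogc:PropertyIsLike wildCard="%" singleChar="_" escapeChar="\\">
            <ogc:PropertyName>dc:title</ogc:PropertyName>
            <ogc:Literal>%{title}%</ogc:Literal>
          </ogc:PropertyIsLike>
          <ogc:BBOX>
            <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>
            <gml:Envelope>
              <gml:lowerCorner>{west} {south}</gml:lowerCorner>
              <gml:upperCorner>{east} {north}</gml:upperCorner>
            </gml:Envelope>
          </ogc:BBOX>
        </ogc:And>
      </ogc:Filter>
    </csw:Constraint>
  </csw:Query>
</csw:GetRecords>
"""


def build_query(title, west, south, east, north, max_records=10):
    """Fill the template; the title term is XML-escaped before insertion."""
    return GETRECORDS_TEMPLATE.format(
        title=escape(title),
        west=west, south=south, east=east, north=north,
        max_records=max_records,
    )


if __name__ == "__main__":
    # "zyx" and "gts" are the two search terms mentioned on the slide.
    for term in ("zyx", "gts"):
        print(build_query(term, -180, -90, 180, 90)[:80], "...")
```

Varying only the title term while keeping the rest of the request fixed is what makes runs with very different result set sizes comparable.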

  9. Performance Tests - Results
     • [Results table not preserved in the transcript]
     • Legend: + (fulfilled), - (failed), o (partially fulfilled)

  10. Performance Tests - Results (INSPIRE, WMO)
     • [Results table not preserved in the transcript]
     • Legend: + (fulfilled), - (failed), o (partially fulfilled)

  11. Performance Tests - Remarks
     • Currently it looks as if neither system is capable of handling 140,000 metadata records according to the requirements of INSPIRE and WMO
     • Geonetwork fails to meet the requirement if the result set contains more than 10,000 hits (response time scales with the size of the result set)
     • Geonetwork installation with 140,000 metadata records
        • First access of the GUI takes minutes!
     • Geonetwork 2.2: deployment of the web app with around 3,000 metadata records takes hours
     • terraCatalog fails to meet the requirement for combined searches
     • terraCatalog could not meet the response time requirement for geographical searches
     • terraCatalog returns errors if the search touches the equator
     • Fuzzy search for title, abstract, keywords … is a nice feature
     • terraCatalog is up to 60 times faster than Geonetwork in simple queries
     • Other solutions such as geowaySDI.NODE have also been tested with only 25,000 records

  12. Resources
     • WMO Wiki: http://www.wmo.int/pages/prog/www/WIS/wiswiki/tiki-index.php?page=geonetworkdoc
     • Geonetwork: http://geonetwork-opensource.org/
     • BlueNet: http://anzlicmet.bluenet.utas.edu.au/
     • con terra: http://www.conterra.de/

  13. Q&A