
The EarthServer initiative: towards Agile Big Data Services

2nd GEOSS Science and Technology Stakeholder Workshop, Bonn, Germany, 2012-aug-29. Peter Baumann, Jacobs University | rasdaman GmbH, Bremen, Germany, p.baumann@jacobs-university.de.



Presentation Transcript


  1. The EarthServer initiative: towards Agile Big Data Services
     2nd GEOSS Science and Technology Stakeholder Workshop, Bonn, Germany, 2012-aug-29
     Peter Baumann, Jacobs University | rasdaman GmbH, Bremen, Germany
     p.baumann@jacobs-university.de

  2. About the Presenter
     • Professor of CS, Jacobs University
     • Head, Large-Scale Scientific Information Systems research group
     Main outcome so far:
     • rasdaman
       • first “Big Raster Data Analytics” server
     • Standardization
       • OGC: chair of raster-relevant working groups, editor of 12+ standards & candidate standards
       • ISO: working on Raster (“Array”) SQL
       • INSPIRE: invited expert for coverages
     • www.jacobs-university.de/lsis, www.rasdaman.org

  3. Roadmap
     • OGC standards
     • rasdaman
     • EarthServer
     • EarthServer & GEOSS
     • Conclusions

  4. Feature and Coverage Data Standards
     • Core element in OGC: geographic feature
       • = abstraction of a real world phenomenon
       • associated with a location relative to Earth
     • Special kind of feature: coverage
       • = space-time varying multi-dimensional phenomenon
       • Typical representative: raster image
       • ...but there is more!
     • Typically, coverages are Big Data

  5. Coverage Types as per GML 3.2.1
     [Diagram: «FeatureType» AbstractCoverage with the coverage types DiscreteCoverage, ContinuousCoverage, MultiPointCoverage, MultiCurveCoverage, MultiSurfaceCoverage, MultiSolidCoverage, GridCoverage, RectifiedGridCoverage, ReferenceableGridCoverage]
     • all n-D
     • New subtypes possible

  6. Coverage Encoding
     [Diagram labels: GML Coverage (domain set, range type, range set, app metadata); NetCDF; GeoTIFF; xlink; NetCDF file]
     • Pure GML: complete coverage represented by GML
     • Special Format: other suitable file format (ex: MIME type “image/tiff”)
     • Multipart-Mixed: multipart MIME, type “multipart/mixed”
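The four coverage components named on this slide (domain set, range type, range set, application metadata) can be pictured as a small data structure. The following is only a rough sketch under stated assumptions: the class name `Coverage`, its field layout, the `check` helper, and the sample values are all hypothetical and are not taken from GML or from any OGC schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Coverage:
    """Illustrative (hypothetical) model of the components named on the slide."""
    domain_set: Dict[str, Tuple[float, float]]  # axis name -> (low, high) extent
    range_type: List[str]                       # names of the values stored per cell
    range_set: List[Tuple[float, ...]]          # one tuple of values per cell
    metadata: Dict[str, str] = field(default_factory=dict)  # application metadata

    def check(self) -> bool:
        """Every cell must carry exactly one value per range_type field."""
        return all(len(cell) == len(self.range_type) for cell in self.range_set)


# Made-up sample: two cells, each with a red and a nir value.
scene = Coverage(
    domain_set={"Lat": (50.0, 51.0), "Long": (7.0, 8.0)},
    range_type=["red", "nir"],
    range_set=[(0.2, 0.6), (0.3, 0.5)],
    metadata={"source": "hypothetical example"},
)
```

Whichever encoding is chosen, the same four parts have to be recoverable; in this sketch `scene.check()` only verifies that the range set agrees with the range type.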

  7. Core OGC Service Standards
     [Diagram labels: data, images; feature, coverage, meta; WMS, WFS, WFS-T, WCS, WCS-T, CS-W, CS-T; FE, WCPS, CQL]
     • WMS “portrays spatial data”: pictures
     • WCS “provides data + descriptions; data with original semantics, may be interpreted, extrapolated, etc.” [09-110r4]

  8. Web Coverage Service (WCS)
     • Core: Simple & efficient access to multi-dimensional coverages
       • subset = trim | slice
     • WCS Extensions for additional functionality facets
       • “band extraction”, scaling, reprojection, interpolation, query language, ...
     • Application Profiles define domain-oriented bundling
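To make "subset = trim | slice" concrete, here is a minimal sketch that builds WCS 2.0 GetCoverage requests in key-value-pair form. A trim gives an interval per axis and keeps the axis; a slice gives a single point and removes the axis from the result. The endpoint URL, the coverage name `ExampleCube`, the axis names `Lat`, `Long`, `time`, and the helper `get_coverage_url` are assumptions made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and coverage name, used only for illustration.
ENDPOINT = "https://example.org/ows"
COVERAGE_ID = "ExampleCube"


def get_coverage_url(subsets):
    """Build a WCS 2.0 GetCoverage URL in key-value-pair form.

    `subsets` is a list of (axis, low, high) tuples; high=None means a slice.
    """
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", COVERAGE_ID),
    ]
    for axis, low, high in subsets:
        if high is None:
            # Slice: a single point, so this axis is dropped from the result.
            params.append(("subset", f"{axis}({low})"))
        else:
            # Trim: an interval, so this axis is kept in the result.
            params.append(("subset", f"{axis}({low},{high})"))
    return ENDPOINT + "?" + urlencode(params)


# Trim in space only: the result keeps all of its axes.
trim_url = get_coverage_url([("Lat", 40, 50), ("Long", 0, 10)])

# Same spatial trim plus a slice in time: the result loses the time axis.
slice_url = get_coverage_url(
    [("Lat", 40, 50), ("Long", 0, 10), ("time", '"2012-08-29"', None)]
)
```

`urlencode` percent-encodes the parentheses, commas, and quotes, so the generated URLs are safe to send as they are.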

  9. Web Coverage Processing Service (WCPS)
     • Raster Query Language: ad-hoc navigation, extraction, aggregation, analytics
     • Time series
     • Image processing
     • Summary data
     • Sensor fusion & pattern mining

  10. EarthServer: Big Earth Data Analytics
     • Scalable On-Demand Processing for the Earth Sciences
     • EU funded, 3 years, EUR 5.85 million
     • Platform: rasdaman (Array Analytics server)
     • Distributed query processing, integrated data/metadata search, 3D clients
     • Strictly open standards: OGC WMS+WCS+WCPS; W3C XQuery; X3D
     • 6 × 100+ TB databases for all Earth sciences + planetary science

  11. The rasdaman Raster Analytics Server
     www.rasdaman.org
     • Array DBMS for massive n-D raster data
       • new database attribute type: array<celltype,extent>
       • Data integration: rasters stored in standard database
     • Extending ISO SQL with array operators:
         select img.green[x0:x1,y0:y1] > 130
         from LandsatArchive as img
     • “tile streaming” architecture
       • n-D array → set of n-D tiles
       • extensive optimization, hw/sw parallelization
     • In operational use
       • dozen-Terabyte objects
       • Analytics queries in 50 ms on laptop
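The idea of storing an n-D array as a set of n-D tiles can be sketched in a few lines. This is an illustration of the general technique only, not rasdaman's storage layout or query engine; the tile size, the dictionary `store`, and the functions `tiles_for` and `threshold` are hypothetical. The point it shows is that a subset query such as `img.green[x0:x1,y0:y1] > 130` only needs to read the tiles that intersect the requested box.

```python
from itertools import product

TILE = 4  # tile edge length in cells; an arbitrary value for illustration


def tiles_for(box):
    """Return the index of every tile that intersects an n-D query box.

    `box` is a list of (low, high) cell intervals, inclusive on both ends.
    """
    ranges = [range(lo // TILE, hi // TILE + 1) for lo, hi in box]
    return list(product(*ranges))


def threshold(tile_store, box, limit):
    """Evaluate `cell > limit` over a 2-D box, reading only the needed tiles."""
    (x0, x1), (y0, y1) = box
    result = {}
    for tx, ty in tiles_for(box):
        tile = tile_store[(tx, ty)]  # one tile = TILE x TILE cells
        for x in range(max(x0, tx * TILE), min(x1, tx * TILE + TILE - 1) + 1):
            for y in range(max(y0, ty * TILE), min(y1, ty * TILE + TILE - 1) + 1):
                result[(x, y)] = tile[x - tx * TILE][y - ty * TILE] > limit
    return result


# An 8 x 8 array whose cell value is x * 20 + y, stored as four 4 x 4 tiles.
store = {
    (tx, ty): [
        [(tx * TILE + i) * 20 + (ty * TILE + j) for j in range(TILE)]
        for i in range(TILE)
    ]
    for tx in range(2)
    for ty in range(2)
}

# The box x in [5, 7], y in [1, 2] lies entirely inside tile (1, 0).
mask = threshold(store, [(5, 7), (1, 2)], 130)
```

In this toy setup the query touches one of four tiles; a box straddling tile borders, such as `[(3, 4), (3, 4)]`, would touch all four.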

  12. Value-Added Satellite Image Archive [Diedrich et al 2001]

  13. rasdaman: Distributed Query Processing
     • WCPS peer-to-peer cloud
       • each node accepts all requests
     • Incoming node distributes query, semantics based
     • Manifold optimization criteria
     coverageA:
         for $a in ( A )
         return encode( ($a.nir - $a.red) / ($a.nir + $a.red), "array-compressed" )
     Query over both coverages:
         for $a in ( A ), $b in ( B )
         return encode( ( ($a.nir - $a.red) / ($a.nir + $a.red) - ($b.nir - $b.red) / ($b.nir + $b.red) ), "HDF5" )
     coverageB:
         for $b in ( B )
         return encode( ($b.nir - $b.red) / ($b.nir + $b.red), "array-compressed" )
     [Owonibi 2012]
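The three WCPS queries on this slide suggest how a query over two coverages can be split: each coverage's normalized-difference term is evaluated separately, and the partial results are then subtracted. The sketch below mimics that split in plain Python. It is not rasdaman's distribution algorithm; the functions `ndvi`, `local_subquery`, `coordinator`, and the sample cell values are made up for illustration.

```python
def ndvi(cell):
    """Normalized difference of the nir and red values of one cell."""
    return (cell["nir"] - cell["red"]) / (cell["nir"] + cell["red"])


def local_subquery(coverage):
    """What a node holding one coverage evaluates: the ratio for every cell."""
    return [ndvi(cell) for cell in coverage]


def coordinator(partial_a, partial_b):
    """What the receiving node does: combine the partial results cell by cell."""
    return [a - b for a, b in zip(partial_a, partial_b)]


# Two tiny made-up coverages, each a flat list of cells.
A = [{"nir": 0.6, "red": 0.2}, {"nir": 0.5, "red": 0.5}]
B = [{"nir": 0.3, "red": 0.1}, {"nir": 0.9, "red": 0.1}]

# Split evaluation: one subquery per coverage, then one combination step.
distributed = coordinator(local_subquery(A), local_subquery(B))

# Reference: the unsplit query evaluated in one place.
centralized = [ndvi(a) - ndvi(b) for a, b in zip(A, B)]
```

Both evaluation paths give the same values, and in the split version only the per-cell ratios, not the raw `nir` and `red` bands, have to travel between nodes.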

  14. EarthServer Contribution to GEOSS
     • Integrated n-D coverage data / metadata search
       • Smooth integration with Broker [Nativi, Mazzetti 2012]

  15. EarthServer Contribution to GEOSS
     • Integrated n-D coverage data / metadata search
       • Smooth integration with Broker
       • Including “reverse lookup” queries: “give me metadata for data with specific properties”
       • Also integration with MapServer, GDAL, ...
     • Scalable n-D interfaces, based on OGC standards
       • Working “in situ” on existing archives; no copying!
     • Flexible ad-hoc processing & filtering
       • Through OGC standardized query language
     • n-D visual Web clients
       • 1D diagrams, 2D maps, 3D data cubes, 3D timeseries sets, ...
       • Dynamically composed from query results

  16. Conclusion
     • Sensor, image, & statistics data = a main source of Big Data in Earth Sciences
       • Petrol industry has “more bytes than barrels”
     • OGC standards offer common platform
       • spatio-temporal coverages: a unified, cross-domain data model
       • Web Coverage Service suite: from simple download to flexible analytics
       • www.ogcnetwork.net/wcs
     • EarthServer can contribute Agile Analytics to GEOSS
       • OGC coverage standards
       • rasdaman technology
     www.earthserver.eu
