The importance of locality in the visualization of large datasets

John Brooke¹, James Marsh¹, Steve Pettipher¹, Lakshmi Sastry² • ¹ The University of Manchester, http://www.man.ac.uk • ² CLRC Rutherford Appleton Laboratory, http://www.clrc.ac.uk

GODIVA: Grid for Ocean Diagnostics, Interactive Visualization and Analysis • (Table with columns: Department, PI, Expertise)

GODIVA: Background • At Reading and SOC we hold copies of various datasets (~2TB) • Mainly from models of oceans and atmosphere • Also some observational data (e.g. satellite data) • From the Met Office, SOC, ECMWF and others • We serve these datasets to many end users • Scientists (1000s of hits per year) • Industry (e.g. British Maritime Technology) • Datasets are in a variety of formats • netCDF, GRIB, HDF, HDF5 … • Data do not conform to naming conventions • E.g. “temp” instead of “sea_water_potential_temperature”

GODIVA: Background (2) • There is a clear need to make access to these datasets easier • Users shouldn’t have to know details of how data are stored • Hence development of GADS (Grid Access Data Service) • Developed as part of GODIVA project • Grid for Ocean Diagnostics, Interactive Visualisation and Analysis • NERC e-Science pilot project • Originally developed by Woolf et al. (2003) • Allows richer queries and more flexibility than the DODS standard • Although we plan to implement a DODS translation layer

GODIVA Web Portal • Allows users to interactively select data for download using a GUI • Users can create movies on the fly • cf. Live Access Server

Advantages of GADS • Users don’t need to know anything about storage details • Can expose data with conventional names without changing data files • Users can choose their preferred data format, irrespective of how data are stored • Behaves as an aggregation server • Delivers a single file, even if the original data spanned several files • Deployed as a Web Service • Can be called from any platform/language • Can be called programmatically (easily incorporated into larger systems and workflows) • Java / Apache Axis / Tomcat
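
The name mapping and aggregation behaviour listed on this slide can be illustrated with a minimal Python sketch. It is not the GADS implementation (which the slide describes as Java / Apache Axis / Tomcat) and not its API: `NAME_MAP`, `FILES` and `query` are hypothetical names, the values are made up, and in-memory dictionaries stand in for the stored data files.

```python
# Hypothetical sketch, not the GADS API. Files are modelled as in-memory dicts.

# Conventional name -> name actually used inside the stored files (assumed mapping).
NAME_MAP = {"sea_water_potential_temperature": "temp"}

# Each "file" holds one variable for a range of time steps (made-up values).
FILES = [
    {"times": [0, 1], "vars": {"temp": [10.0, 10.5]}},
    {"times": [2, 3], "vars": {"temp": [11.0, 11.5]}},
]


def query(standard_name, t_start, t_end):
    """Return one aggregated result, even when the data span several files."""
    stored_name = NAME_MAP[standard_name]  # the user never sees the stored name
    times, values = [], []
    for f in FILES:
        for t, v in zip(f["times"], f["vars"][stored_name]):
            if t_start <= t <= t_end:
                times.append(t)
                values.append(v)
    return {"name": standard_name, "times": times, "values": values}


# One request crossing the boundary between the two files.
result = query("sea_water_potential_temperature", 1, 2)
```

The caller asks by conventional name and time range only; which file each value came from, and what the variable is called on disk, stays hidden behind `query`.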

GODIVA: Science Drivers • The Thermohaline Circulation • Convection and Downwelling Sites (SOC) • Thermodynamic Water Transformation Diagnostics (ESSC) • Unstructured Mesh Convection Modelling (IC)

Science: Convection and Downwelling (figure: isopycnal 1027.28 kg/m³) • How are Convection and Downwelling related? • Visualize Isopycnal/Isothermal surfaces in time • Superpose Currents/Vertical velocities • Diagnostics, e.g. Q vectors • Commodity Processors for Remote access

Technology: Unstructured Mesh Modelling • How to resolve convection, mixing and downwelling? • Idealised Experiments • Transformation and visualization of 3D unstructured mesh data • Community Visualization development

Technology: Visualization and modelling • Software • Condor Implementation • XML Metadata Libraries • Inferno: new approach to Grid middleware • Commodity Graphics Card viewers • Web services for visualization tasks • GRID • Large 3D arrays, 1-10Gb/file • Diverse computing resources (T3E, SGI O3000 and Onyx, graphics workstations, Linux PCs) • Diverse data resources (e.g. winds, surface fluxes, assimilation data) • Intelligent use of locality

Locality-aware architecture • A. Data Storage • B. Data Service • C. Visualization Service • D. Client viewer, rendering either locally or by using the visualization service • Diagram arrows between components are labelled “Requests data from” and “Sends data to”

Visualization: Local rendering • Investigating fast volumetric rendering • Generating geometric isosurfaces: • Computationally expensive processing • Changing the isovalue requires re-processing • Direct rendering using 3D textures: • Fast on modern PC graphics cards • Can exploit programmable GPUs • GODIVA architecture allows data to be pre-fetched • Can hide network effects by fetching while the user is viewing locally • Local rendering can bypass the visualization service
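
The idea of hiding network effects by fetching while the user views locally can be sketched as below. This is an illustration under stated assumptions, not the GODIVA client: `fetch_timestep` is a hypothetical stand-in for a request to the data service, and the viewer is assumed to step through time steps in order.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_timestep(t):
    """Stand-in for a network request to the data service (hypothetical)."""
    return [t] * 4  # pretend this is the volume for time step t


class PrefetchingViewer:
    """Fetches time step t + 1 in the background while t is being viewed."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = {}  # time step -> Future

    def view(self, t):
        # Use the prefetched result if one was requested, otherwise fetch now.
        future = self._pending.pop(t, None)
        data = future.result() if future else self._fetch(t)
        # Start fetching the next time step before returning to the user.
        self._pending[t + 1] = self._pool.submit(self._fetch, t + 1)
        return data

    def close(self):
        self._pool.shutdown(wait=True)


viewer = PrefetchingViewer(fetch_timestep)
frames = [viewer.view(t) for t in range(3)]
viewer.close()
```

Only the first call to `view` waits on a direct fetch; later calls collect a result that was requested while the previous time step was on screen. A real client would also bound the number of pending requests and discard prefetches the user never reaches.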

Direct Volume Rendering • Volume data stored in a 3D texture • Multiple planes are drawn parallel to the screen, sampling the texture • Enough samples will reconstruct the scalar data according to the Nyquist frequency • Surfaces look flat • No shading • (Image credit: Klaus Engel)
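
The slice-based approach on this slide can be sketched on the CPU as follows. This is a simplified illustration, not the GPU renderer: it assumes the view direction is aligned with the z axis, takes exactly one slice per voxel layer with no interpolation (a real renderer would choose the slice spacing with the Nyquist criterion in mind), and uses a hypothetical linear transfer function controlled by `opacity`.

```python
def composite_slices(volume, opacity):
    """Back-to-front alpha compositing of axis-aligned slices.

    volume[z][y][x] holds scalar values in [0, 1]; the viewer looks along +z,
    so slice z = 0 is nearest the screen. Returns a 2D image of intensities.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    image = [[0.0] * width for _ in range(height)]
    for z in reversed(range(depth)):  # farthest slice first
        for y in range(height):
            for x in range(width):
                value = volume[z][y][x]
                alpha = opacity * value  # simple linear transfer function
                # "Over" operator: this slice is drawn on top of what is behind it.
                image[y][x] = alpha * value + (1.0 - alpha) * image[y][x]
    return image
```

No lighting term appears anywhere in the loop, which is why surfaces rendered this way look flat.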

Fragment Programs • What is a fragment program? • A short, simple program that is executed whenever a fragment is drawn • A fragment is similar to a pixel, but may later be covered by another fragment • Determines actual pixel colour • Usually by performing vector operations • Often uses one or more textures as input values
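
A CPU-side analogy may help make the fragment/pixel distinction concrete. The sketch below is an assumption-laden illustration, not GPU code: `fragment_program` and `draw` are hypothetical names, each fragment carries a single texture sample and a tint, and a simple depth test decides which fragment ends up as the pixel.

```python
def fragment_program(texture_value, tint):
    """Per-fragment colour: a vector operation on a texture sample."""
    return tuple(texture_value * c for c in tint)


def draw(fragments, width, height):
    """Run the fragment program per fragment; the nearest fragment wins each pixel."""
    colour = [[(0.0, 0.0, 0.0)] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for x, y, z, texture_value, tint in fragments:
        rgb = fragment_program(texture_value, tint)  # executed for every fragment
        if z < depth[y][x]:  # depth test: a nearer fragment covers a farther one
            depth[y][x] = z
            colour[y][x] = rgb
    return colour
```

In this sketch the program runs even for fragments that are later covered, which is the cost the later "Simple?" and "Multiple passes" slides set out to reduce.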

Shaded isosurfaces • Stack of partially transparent slices • 3D texture also contains isovalue gradient • Each pixel is shaded according to the dot-product of the gradient and light vectors • Requires small fragment program to describe how to perform the shading • Looks “realistic” • Small features more visible • Can interactively change the isovalue
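
The shading step, a dot product of the gradient and light vectors, can be written out as a short sketch. The assumptions here are not from the slides: the gradient is estimated by central differences, shading is a clamped diffuse term, and whether the gradient or its negation points out of the surface depends on the data.

```python
import math


def gradient(volume, x, y, z):
    """Central-difference gradient of volume[z][y][x] at an interior voxel."""
    gx = (volume[z][y][x + 1] - volume[z][y][x - 1]) / 2.0
    gy = (volume[z][y + 1][x] - volume[z][y - 1][x]) / 2.0
    gz = (volume[z + 1][y][x] - volume[z - 1][y][x]) / 2.0
    return (gx, gy, gz)


def shade(grad, light):
    """Diffuse intensity: clamped dot product of unit gradient and unit light."""
    g_len = math.sqrt(sum(c * c for c in grad))
    l_len = math.sqrt(sum(c * c for c in light))
    if g_len == 0.0 or l_len == 0.0:
        return 0.0  # flat region or no light direction: nothing to shade
    dot = sum(g * l for g, l in zip(grad, light)) / (g_len * l_len)
    return max(0.0, dot)
```

On the GPU the gradient would be precomputed and stored in the 3D texture alongside the scalar value, so the fragment program only needs the equivalent of `shade`.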

Shaded isosurface example

Isopycnal Surfaces (John Stark/SOC)

Isopycnals • Density is generally more interesting than depth • Slow to render using existing packages • Can it be made interactive? • Second 3D texture containing attribute data • Colour fragments based on this data as well as the isosurface gradient

Simple? • No! • Need even more texture memory for extra data • Lighting calculation and texture look-ups are performed for every fragment/voxel, even if invisible • Performance depends on screen size (number of fragments) • When textures won’t fit on the graphics card, performance suffers terribly • Current state • Can render at approximately 10 frames per second • At present we need information on which features are scientifically important; this is not yet part of a general modular package

Multiple passes • First pass: render just the silhouette of the density isosurface • Similar to earlier toy example • No lighting • Use the colour of each pixel to store the voxel co-ordinates that the pixel represents, (r,g,b) standing for (x,y,z) • Second pass: colour and shade pixels • Use output of first pass as a 2D texture • Use a fragment shader to combine the first-pass texture with the 3D gradient and attribute textures • Lighting calculation only performed for visible pixels
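
The two passes can be sketched on the CPU as follows. This is an illustration of the idea rather than the shader code used in the project: the view direction is assumed to be along z, `gradient_tex` is assumed to hold unit-length gradients, and all function and parameter names are hypothetical.

```python
def first_pass(density, isovalue):
    """Silhouette pass: per pixel, the co-ordinates of the first voxel along z
    whose density reaches the isovalue, or None where the surface is absent."""
    depth, height, width = len(density), len(density[0]), len(density[0][0])
    coords = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for z in range(depth):  # z = 0 is nearest the viewer
                if density[z][y][x] >= isovalue:
                    coords[y][x] = (x, y, z)  # plays the role of (r, g, b)
                    break
    return coords


def second_pass(coords, gradient_tex, attribute_tex, light):
    """Shading pass: one lighting calculation per visible pixel only."""
    image = []
    for row in coords:
        out_row = []
        for hit in row:
            if hit is None:
                out_row.append(0.0)  # background
                continue
            x, y, z = hit
            g = gradient_tex[z][y][x]  # assumed to hold unit-length gradients
            diffuse = max(0.0, sum(a * b for a, b in zip(g, light)))
            out_row.append(diffuse * attribute_tex[z][y][x])
        image.append(out_row)
    return image
```

`first_pass` touches only the density data and `second_pass` only the gradient and attribute data, so the two sets of textures are never needed at the same moment.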

Result • The density and attribute textures don’t both need to be in memory at the same time • Opportunity to replace the textures in between passes • Fewer fragment program instructions • Speed much less dependent on screen size • Can re-use first pass in order to render second isopycnal for the same density

Use of texture mapping • We can use colour in the texture mapping to visualize a second quantity on the isopycnal surface • Resolution is very high and demands a high-quality dataset, e.g. missing values can be seen • Next stage is interaction with the visualization service: the local renderer allows the user to alert the service to ranges of interest, which can then be rendered by more conventional means.

Visualization and data services • Our local viewer can interact with either the data services (GADS) or the visualization services from RAL. • Local viewing allows rapid exploration of features; key scenes can then be rendered to higher quality by the visualization service. • In an unstable Grid, if we lose contact with the visualization service we can continue to work with cached data. • Next GODIVA challenges involve exploiting this locality intelligently.
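
Continuing to work with cached data when the service is unreachable can be sketched as a small wrapper. This is a minimal illustration, not the GODIVA client: `CachingClient` and `remote_fetch` are hypothetical names, and loss of contact is assumed to surface as a `ConnectionError`.

```python
class CachingClient:
    """Serves requests from a remote service, falling back to cached results."""

    def __init__(self, remote_fetch):
        self._remote_fetch = remote_fetch  # callable: key -> data (hypothetical)
        self._cache = {}

    def get(self, key):
        try:
            data = self._remote_fetch(key)
        except ConnectionError:
            if key in self._cache:
                return self._cache[key]  # keep working with cached data
            raise  # nothing cached for this request
        self._cache[key] = data  # remember the latest successful result
        return data
```

The sketch always tries the service first and falls back only on failure; a client tuned for locality might instead serve from the cache first and refresh in the background.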

Conclusions • Beyond client-server visualization, the local renderer can be as powerful as a centralized server. • The GODIVA architecture fits a web services world. • With improved visualization techniques, quality control of datasets becomes important. • Next area of research is to explore possibilities of caching and prefetching data for both the central visualization service and the local renderer. • Demos on NERC/NIeES and Reading e-Science Centre stands.
