Enhancing Scientific Data Management with Model-based User Views

MauveDB: Model-based User Views

Problem • Databases are unusable for scientific data • Data are incomplete, imprecise, and erroneous • Need to be filtered/synthesized using models • Scientists use them in the most rudimentary ways • As a backing store for raw data • Run few or no queries • User-defined functions are inadequate • Static models, insufficient for many applications • Let’s discuss this later?

Approach • Define user-views based on a model syntax • Extend traditional SQL-view model • User views provide access to synthesized data • Data independence • Present stable view of system • When sites don’t report data (missing values) • When network changes • Report data at different locations than sampled • View maintenance • Issues of whether to materialize or not

Processing Scientific Data • Without model-based views • Export to MATLAB, then apply models • Use custom, programmatic querying tools • Can’t use SQL • Getting data back into the database is awkward and inefficient • With model-based views • Models update themselves as data change • Standard SQL queries against synthesized data

Example • Benefits • Network changes are transparent • Spatial or temporal biases removed (e.g., for aggregates) • What about model errors?

Architecture

View Creation: Regression • Select a virtual grid on which data are reported • Using MATLAB-style syntax • Create a unique model at each time T
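
The regression view described on this slide can be sketched in plain Python. This is a minimal illustration, not MauveDB's implementation: it assumes readings arrive as (time, x, temp) tuples, fits a straight line per time step (the slides do not say which basis functions are used), and reports predictions on a fixed virtual grid. The names `fit_line` and `regression_view` are hypothetical.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # All samples share one position: fall back to a constant model.
        return mean_y, 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def regression_view(readings, grid):
    """Fit one model per time step and evaluate it on the virtual grid.

    readings: iterable of (time, x, temp) tuples (assumed layout).
    Returns {time: [(grid_x, predicted_temp), ...]}.
    """
    by_time = defaultdict(list)
    for t, x, temp in readings:
        by_time[t].append((x, temp))

    view = {}
    for t, points in sorted(by_time.items()):
        a, b = fit_line([p[0] for p in points], [p[1] for p in points])
        view[t] = [(g, a + b * g) for g in grid]
    return view


# Sensors sit at irregular positions; the view always reports at 0, 5, 10.
readings = [(0, 1.0, 21.0), (0, 3.0, 23.0), (0, 7.0, 27.0),
            (1, 2.0, 20.0), (1, 6.0, 22.0)]
view = regression_view(readings, grid=[0.0, 5.0, 10.0])
```

Because queries see only the grid, a sensor that moves or stops reporting changes the fitted coefficients but not the shape of the view.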

View Creation: Interpolation • Interpolate missing values from nearby sites
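
One way to picture interpolation from nearby sites is the sketch below. It assumes sites lie along a single axis and that a missing reading is marked `None`; it uses linear interpolation between the two nearest reporting sites and falls back to the nearest reporting site at the edges. The slides do not specify the interpolation method, so `interpolate_at` and `fill_missing` are hypothetical helpers chosen for illustration.

```python
def interpolate_at(x, samples):
    """Linearly interpolate a value at position x.

    samples: list of (position, value) pairs sorted by position,
    with distinct positions (an assumption of this sketch).
    """
    if not samples:
        raise ValueError("no reporting sites to interpolate from")
    if x <= samples[0][0]:
        return samples[0][1]   # left of all sites: use the nearest one
    if x >= samples[-1][0]:
        return samples[-1][1]  # right of all sites: use the nearest one
    for (x0, v0), (x1, v1) in zip(samples, samples[1:]):
        if x0 <= x <= x1:
            weight = (x - x0) / (x1 - x0)
            return v0 + weight * (v1 - v0)


def fill_missing(site_values):
    """Return a copy of {position: value} with None entries interpolated."""
    known = sorted((pos, v) for pos, v in site_values.items() if v is not None)
    return {
        pos: v if v is not None else interpolate_at(pos, known)
        for pos, v in site_values.items()
    }


# Sites at 10.0 and 30.0 did not report in this round.
sites = {0.0: 20.0, 10.0: None, 20.0: 24.0, 30.0: None}
filled = fill_missing(sites)
```

Here the site at 10.0 receives the midpoint of its two neighbors, while the site at 30.0 has no neighbor on its right and copies the value from 20.0.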

Case Study 1: Temp Regression

Case Study 2: Temp Interpolation

The AS Clause • AS clause specifies each model • AS FIT • AS INTERPOLATE • Probably needs extended syntax for model methods • INTERPOLATE with splines, nearest neighbor, regression • User views are only as flexible as the models pre-programmed into the syntax • How does this compare with UDFs, table-valued functions? • Is this the appropriate level for this kind of customization?

View Maintenance • Options • Logical: build results for each query • Materialized: pre-compute all results for each model • Partial/Cached: store results generated by queries • Model-based: often models have fixed costs • Building basis functions, matrix inversions, linear solutions • Tradeoff between query latency and overhead • Is implementing model logic at such a low level reasonable?
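
The partial/cached option above can be illustrated with a small sketch. It assumes the model is an ordinary function of a query key and that the cache is simply cleared whenever new raw data arrive; a real system could update incrementally instead. `CachedView` and its methods are hypothetical names, not part of MauveDB.

```python
class CachedView:
    """Partial/cached maintenance: keep only results that queries asked for."""

    def __init__(self, model):
        self.model = model      # callable mapping a query key to a value
        self.cache = {}         # results computed so far
        self.evaluations = 0    # how many times the model actually ran

    def query(self, key):
        # A logical view would call the model on every query; a fully
        # materialized view would precompute every key up front.
        if key not in self.cache:
            self.evaluations += 1
            self.cache[key] = self.model(key)
        return self.cache[key]

    def invalidate(self):
        """Drop cached results, e.g. after new raw data change the model."""
        self.cache.clear()


# Stand-in model: a fixed line in place of a fitted regression.
view = CachedView(lambda x: 20.0 + 0.5 * x)
view.query(4.0)
view.query(4.0)   # served from the cache, no second evaluation
view.query(8.0)
```

The evaluation counter makes the tradeoff visible: repeated queries cost nothing extra, but every invalidation pushes the cost back onto the next query.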

Outcomes/Opinions • Is MauveDB the technology that will make scientists use databases?
