Techniques for Visualizing Massive Data Sets. Leilani Battle, Mike Stonebraker. Context. Visualization System. query. result. Database. Problem. Performance: Vis systems don’t scale well for big data, or are turning into databases. Over-plotting: Makes visualizations unreadable.
Techniques for Visualizing Massive Data Sets Leilani Battle, Mike Stonebraker
Context Visualization System query result Database
Problem • Performance • Vis systems don’t scale well for big data • Or are turning into databases • Over-plotting • Makes visualizations unreadable • Waste of time/resources
Solution: Resolution Reduction • Diagram labels: Visualization System, query, Resolution Reduction Layer, query plan, modified query, result, reduced result, Database
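The idea of returning a reduced result can be illustrated with a minimal sketch. It assumes aggregation (averaging into bins) as the reduction; the slide does not name a specific reduction, and the function name `reduce_resolution` and the `max_points` threshold are hypothetical.

```python
def reduce_resolution(values, max_points):
    """Average consecutive values into at most max_points bins.

    Hypothetical helper: aggregation is one possible reduction,
    assumed here for illustration only.
    """
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    n = len(values)
    if n <= max_points:
        return list(values)  # already small enough, nothing to reduce
    # ceil(n / max_points) values per bin guarantees at most max_points bins
    bin_size = -(-n // max_points)
    reduced = []
    for start in range(0, n, bin_size):
        chunk = values[start:start + bin_size]
        reduced.append(sum(chunk) / len(chunk))
    return reduced
```

In this sketch the visualization system would plot the reduced list instead of the full result, which bounds both transfer size and over-plotting.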
ScalaR • Scalable vis system for data exploration • Web front-end • Uses SciDB (www.scidb.org) • Visualizes query results • Performs Resolution Reduction
Array Browser • Collaboration with: • Brown: Justin DeBrabant, Stan Zdonik, Ugur Cetintemel • Stanford: Zhicheng Liu, Jeff Heer • Google Maps-style exploration experience • Fetches subsets of the data (aka data tiles)
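Fetching data tiles for a map-style view can be sketched as follows. This assumes square tiles on an integer grid and a half-open viewport; the function name `tiles_for_viewport` and its parameters are hypothetical, not taken from the Array Browser.

```python
def tiles_for_viewport(x0, y0, x1, y1, tile_size):
    """Return (col, row) keys of every tile overlapping the
    half-open viewport [x0, x1) x [y0, y1), in row-major order.

    Assumes integer coordinates and square tiles (illustrative only).
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    if x1 <= x0 or y1 <= y0:
        return []  # empty viewport overlaps no tiles
    first_col, first_row = x0 // tile_size, y0 // tile_size
    last_col, last_row = (x1 - 1) // tile_size, (y1 - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

Panning then only requires fetching the keys that were not already in the previous viewport's list.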
Future Work: Prefetching • Goal: Reduce user-wait time by prefetching tiles • Cache tiles in the tile buffer • Need algorithms to decide what to pre-fetch
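A tile buffer with prefetching might look like the sketch below. It assumes a least-recently-used eviction policy, which is only one option (the deck leaves the eviction policy as an open question); `TileBuffer` and the `fetch` callable are hypothetical names.

```python
from collections import OrderedDict


class TileBuffer:
    """Fixed-capacity tile cache with LRU eviction (an assumed policy)."""

    def __init__(self, capacity, fetch):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.fetch = fetch  # hypothetical callable: tile key -> tile data
        self.tiles = OrderedDict()

    def _store(self, key):
        self.tiles[key] = self.fetch(key)
        self.tiles.move_to_end(key)
        while len(self.tiles) > self.capacity:
            self.tiles.popitem(last=False)  # evict least recently used tile

    def prefetch(self, keys):
        """Load predicted tiles before the user asks for them."""
        for key in keys:
            if key not in self.tiles:
                self._store(key)

    def get(self, key):
        """Return (tile, hit); a hit means the user did not have to wait."""
        hit = key in self.tiles
        if hit:
            self.tiles.move_to_end(key)  # mark as recently used
        else:
            self._store(key)
        return self.tiles[key], hit
```

The hit flag gives a simple way to measure how much user-wait time a given prefetching algorithm saves.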
User Behavior Predictor (Seer) • Learn common query sequences from user traces
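Learning common sequences from traces can be sketched with a first-order Markov model over user moves. This is an assumed technique chosen for illustration; the slide does not say how Seer models traces, and the move labels and function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_move_model(traces):
    """Count how often each move follows another across all user traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        for prev, nxt in zip(trace, trace[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(model, last_move):
    """Return the most frequent follower of last_move, or None if unseen."""
    followers = model.get(last_move)
    if not followers:
        return None
    return followers.most_common(1)[0][0]
```

For example, after training on traces where users mostly keep panning right, `predict_next(model, "right")` would suggest prefetching the tile to the right.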
Statistical Analysis Predictor • Look for statistical similarities in tiles • Try to guess what’s important based on patterns
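One way to look for statistical similarity between tiles is to compare value histograms. The sketch below assumes histogram intersection as the similarity measure; the slide names no specific statistic, and all function names and the default value range are hypothetical.

```python
def histogram(values, bins, lo, hi):
    """Normalized histogram of values over [lo, hi]; out-of-range values are clamped."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        counts[max(0, min(idx, bins - 1))] += 1
    total = len(values)
    return [c / total for c in counts] if total else counts


def similarity(h1, h2):
    """Histogram intersection: near 1.0 for identical shapes, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def most_similar_tiles(target, candidates, k, bins=4, lo=0.0, hi=1.0):
    """Rank candidate tiles (key -> values) by similarity to the target tile."""
    t = histogram(target, bins, lo, hi)
    ranked = sorted(
        candidates.items(),
        key=lambda kv: similarity(t, histogram(kv[1], bins, lo, hi)),
        reverse=True,
    )
    return [key for key, _ in ranked[:k]]
```

Tiles that resemble what the user is currently viewing would then be candidates for prefetching.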
Using Multiple Predictors • Run multiple predictors (or experts) in parallel • Compare predictions to user’s actual behavior • Use predictions from best performing expert • May change over time based on user’s goals
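Choosing among experts can be sketched as below. The sketch assumes a decayed hit count as the performance score, so that the preferred expert can change as the user's goals change; the scoring rule, the `ExpertSelector` class, and the `decay` parameter are hypothetical.

```python
class ExpertSelector:
    """Run all experts, score them against actual behavior, follow the best."""

    def __init__(self, experts, decay=0.9):
        if not experts:
            raise ValueError("at least one expert is required")
        self.experts = experts  # name -> callable(history) -> predicted move
        self.decay = decay      # assumed: older results count for less
        self.scores = {name: 0.0 for name in experts}
        self.pending = {}

    def predict(self, history):
        """Collect every expert's prediction; return the best performer's."""
        self.pending = {name: fn(history) for name, fn in self.experts.items()}
        best = max(self.scores, key=self.scores.get)
        return self.pending[best]

    def observe(self, actual):
        """Compare the pending predictions to what the user actually did."""
        for name, guess in self.pending.items():
            hit = 1.0 if guess == actual else 0.0
            self.scores[name] = self.decay * self.scores[name] + hit
        self.pending = {}
```

With ties broken by insertion order, the selector starts with the first expert and switches as soon as another one scores higher.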
Other Challenges • Lots of interesting problems left to address • Best eviction policy for the tile buffer? • How to share data between multiple users? • More predictors?
Sagittarius Gemini Dogs Cats
Prefetching Experts • User behavior predictor (Seer) • Learn common query sequences from user traces • Stats analysis predictor • Look for statistical similarities in tiles • Try to guess what’s important based on patterns