
DataScope: A Database Content Visualization Tool based on Ranking Queries

CS511 Course Project. Tianyi Wu. Dec 08, 2006.


Presentation Transcript


  1. DataScope: A Database Content Visualization Tool based on Ranking Queries CS511 Course Project Tianyi Wu Dec 08, 2006

  2. DataScope • Motivation • Contributions • Demonstration • Architecture • Design & Implementation • Future Work

  3. DataScope • Motivation • Contributions • Demonstration • Architecture • Design & Implementation • Future Work

  4. Motivation • Existing database systems • SQL query-based • Form-based • Limited user interface • Inconvenient to browse data • Existing database visualization • Polaris – Stanford • Stolte et al. Query, analysis, and visualization of hierarchically structured data using Polaris. KDD 2002. • DIVE-ON – U of Alberta • Ammoura et al. Towards a novel OLAP interface for distributed data warehouses. DaWaK 2001. • Maryland • Various projects and tools • http://www.cs.umd.edu/hcil/research/visualization.shtml

  5. Some screenshots (Polaris)

  6. Some screenshots

  7. Some screenshots (Maryland)

  8. Limitations of Existing Work • Particular domains • Spatial-temporal • Time-series • Predefined schemas • Fixed visual representation • Statistical charts (e.g. scatterplots)

  9. DataScope • Motivation • Contributions • Demonstration • Architecture • Design & Implementation • Future Work

  10. Goal • Visualize databases like Google Maps! • Content-based • Explorative, easy-to-use • Dragging (pan), drilling (zoom)… • Domain independent • Web-based interface • Fast query processing

  11. Challenges • Layout • Not well-defined and well-understood • Maps: longitude, latitude • How to position objects on screen? • User preferences on what to see • Different users have different preferences • Even the same user may have different preferences depending on the context of the query • Data is often associated with multiple hierarchies or semantic links • Powerful query engine

  12. Contribution • Interface design • Principles which can address the above challenges • A system prototype • Efficient implementation • Ranking-Cube [Xin et al. VLDB’06] • Ranking Aggregation

  13. DataScope • Motivation • Contributions • Demonstration • Architecture • Design & Implementation • Future Work

  14. About the Demo • Not fully functional yet • Ajax vs. PHP • Demonstrate important concepts • Ranking • Efficiency • Customization • Datasets • Real: DBLP (20,387 extracted database-related entries) • Synthetic: store database

  15. DataScope • Motivation • Contributions • Demonstration • Architecture • Design & Implementation • Future Work

  16. System Architecture

  17. DataScope • Motivation • Contributions • Demonstration • Architecture • Design & Implementation • Future Work

  18. DataScope Overview

  19. DataScope Overview

  20. DataScope Overview • Design principles • Structured dimensions • Ordering of attribute values • Selection of comprehensive layout • Quick selection • Display rich information • Easy customization • Implementation • Linear ranking functions with arbitrary selections • Ranking on aggregation

  21. Design principles • Structured dimensions • View data in multi-resolution • Roll-up and drill-down • Can be automatically generated • Numeric attributes • Age, salary, price • Categorical attributes • Milk - dairy products - food
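The roll-up and drill-down idea on this slide can be illustrated with a minimal Python sketch. It assumes a categorical hierarchy stored as a child-to-parent map (using the slide's milk example) and a numeric hierarchy generated automatically by equal-width bucketing. The names PARENT, roll_up, numeric_bucket and count_by_bucket, and the sample ages, are hypothetical and are not taken from the DataScope prototype.

```python
from collections import Counter

# Hypothetical categorical hierarchy (child -> parent), following the slide's example.
PARENT = {"milk": "dairy products", "dairy products": "food"}


def roll_up(value, levels=1):
    """Climb `levels` steps up the hierarchy, stopping early at the root."""
    for _ in range(levels):
        if value not in PARENT:
            break
        value = PARENT[value]
    return value


def numeric_bucket(x, width):
    """Return the half-open bucket [lo, lo + width) that contains x."""
    lo = (x // width) * width
    return (lo, lo + width)


def count_by_bucket(values, width):
    """Count values per bucket; a larger width gives a coarser resolution."""
    return Counter(numeric_bucket(v, width) for v in values)


# Illustrative data only.
ages = [23, 27, 31, 38, 45, 52]
fine = count_by_bucket(ages, 10)    # drill-down view: decades
coarse = count_by_bucket(ages, 20)  # roll-up view: 20-year ranges
```

Changing the bucket width is one simple way to generate several resolutions for attributes such as age, salary or price; a real system might choose bucket boundaries from the data distribution instead.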

  22. Design principles • Ordering of attribute values • How to order values along X/Y axis? • Ascending/Descending • Alphabetic order (e.g. AAAI, CIKM…) • Numeric order (e.g. 2001, 2002…) • Independent of any ranking function • The order of a value is not determined by its score

  23. Design principles • Selection of comprehensive layout • Initial layout • High-level, familiar to most users • Map - US map • DBLP – (AI, Theory, System)*(80s, 90s, 00s) • Subsequent layout • Can be changed according to different data • Customizable

  24. Design principles • Quick selection • Dragging • Scrolling the mouse wheel to roll-up/drill-down or zoom in/out • Push constraint easily • Context menu

  25. Design principles • Display rich information • Top-k (as in Google Maps) • K-representative items • Outliers • Display primitives • Color, size (e.g. big cities have big font), etc. • Searching

  26. Design principles • Easy customization • Users can freely define their own layout • X=location, Y=year • Adjust the resolution • More/less objects on screen • Customize ranking function • e.g. rank houses by “0.7*Price+0.3*size” • Selection “database conferences” and “2003-2006”
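As a rough illustration of a user-defined ranking function combined with a selection, the Python sketch below ranks rows by a weighted sum such as 0.7*price + 0.3*size after applying a filter. It assumes that a lower score is better, which the slide does not state. The function top_k, the field names and the sample houses are made up for illustration.

```python
import heapq


def top_k(rows, weights, k, selection=None):
    """Return the k rows with the smallest weighted score among rows passing `selection`.

    Assumption: lower scores rank first.
    """
    def score(row):
        return sum(w * row[attr] for attr, w in weights.items())

    candidates = (r for r in rows if selection is None or selection(r))
    return heapq.nsmallest(k, candidates, key=score)


# Illustrative data only.
houses = [
    {"id": 1, "city": "Urbana", "price": 200, "size": 150},
    {"id": 2, "city": "Urbana", "price": 150, "size": 300},
    {"id": 3, "city": "Chicago", "price": 100, "size": 100},
    {"id": 4, "city": "Urbana", "price": 300, "size": 100},
]
weights = {"price": 0.7, "size": 0.3}
best = top_k(houses, weights, k=2, selection=lambda r: r["city"] == "Urbana")
```

This version scans every row, so it only shows the query semantics; the efficiency goal stated in the slides is what the Ranking-Cube implementation addresses.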

  27. Implementation • Ranking-Cube • Xin et al. Answering top-k queries with multi-dimensional selections (VLDB’06) • Linear ranking functions • Arbitrary selections • Methods • Partition the data space and store blocks • Progressively retrieve the most promising blocks for each query • Data fragments • Partial materialization to deal with high dimensionality
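The "partition into blocks, then retrieve the most promising blocks first" idea can be sketched in Python as follows. This is a simplified illustration, not the Ranking-Cube algorithm of Xin et al.: it assumes two-dimensional points, a square grid, a linear score with non-negative weights that is minimised, and no selection conditions. The names build_blocks and progressive_top_k and the sample points are hypothetical.

```python
import heapq
from collections import defaultdict


def build_blocks(points, width):
    """Partition 2-D points into square grid blocks keyed by integer cell coordinates."""
    blocks = defaultdict(list)
    for x, y in points:
        blocks[(int(x // width), int(y // width))].append((x, y))
    return blocks


def progressive_top_k(blocks, width, w, k):
    """Find the k points minimising w[0]*x + w[1]*y, reading blocks in order of promise.

    Assumes w[0] >= 0 and w[1] >= 0, so the score at a block's lower-left
    corner is a lower bound for every point inside that block.
    Returns (top-k points in rank order, number of blocks read).
    """
    frontier = []
    for (i, j) in blocks:
        bound = w[0] * i * width + w[1] * j * width
        frontier.append((bound, (i, j)))
    heapq.heapify(frontier)

    best = []  # max-heap of the current k best, stored as (-score, point)
    blocks_read = 0
    while frontier:
        bound, cell = heapq.heappop(frontier)
        if len(best) == k and bound >= -best[0][0]:
            break  # no unread block can beat the current k-th best score
        blocks_read += 1
        for p in blocks[cell]:
            s = w[0] * p[0] + w[1] * p[1]
            if len(best) < k:
                heapq.heappush(best, (-s, p))
            elif s < -best[0][0]:
                heapq.heapreplace(best, (-s, p))
    ranked = sorted((-neg, p) for neg, p in best)
    return [p for _, p in ranked], blocks_read


# Illustrative data only.
points = [(1, 1), (2, 8), (9, 9), (12, 3), (15, 15), (3, 2), (18, 1)]
blocks = build_blocks(points, width=10)
result, blocks_read = progressive_top_k(blocks, width=10, w=(0.5, 0.5), k=2)
```

In this example only one of the three blocks has to be read before the stopping test succeeds, which is the kind of saving the block-based approach aims for.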

  28. Ranking on Aggregation • Example • Given a relation (conference, year, author, paper) • Query • SELECT top k COUNT(author) • FROM R • GROUP BY conference, year
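The query on this slide is written in shorthand. One way to express it in runnable form is with Python's built-in sqlite3 module, where "top k" becomes ORDER BY ... LIMIT. The table contents below are invented for illustration, and the helper name top_k_groups is hypothetical. Note that COUNT(author) is kept as on the slide, so it counts rows rather than distinct authors.

```python
import sqlite3


def top_k_groups(rows, k):
    """Return the k (conference, year) groups with the largest COUNT(author)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE R (conference TEXT, year INTEGER, author TEXT, paper TEXT)"
    )
    conn.executemany("INSERT INTO R VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT conference, year, COUNT(author) AS n
        FROM R
        GROUP BY conference, year
        ORDER BY n DESC, conference, year
        LIMIT ?
        """,
        (k,),
    ).fetchall()
    conn.close()
    return result


# Illustrative data only.
rows = [
    ("VLDB", 2006, "a1", "p1"),
    ("VLDB", 2006, "a2", "p1"),
    ("VLDB", 2006, "a3", "p2"),
    ("SIGMOD", 2006, "a1", "p3"),
    ("SIGMOD", 2006, "a4", "p3"),
    ("KDD", 2002, "a5", "p4"),
]
top2 = top_k_groups(rows, 2)
```

The extra ORDER BY columns only make ties deterministic; they are not part of the slide's query.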

  29. Ranking on Aggregation • Method • Materialization for all possible cuboids • Algorithm • Input: aggregation dimension D, ranking dimensions R, concept hierarchies H. • Output: a set of ranking fragments S. • 1) For each possible group-by of R and H: • 2) Compute the aggregation for each value in D; • 3) Compute ranking fragments for D; • 4) Add these ranking fragments to S.
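The materialization loop can be sketched in Python under some simplifying assumptions. The slide does not define the structure of a ranking fragment, so this sketch substitutes a plain list of groups sorted by COUNT in descending order for each group-by. It also ignores the concept hierarchies H and skips the empty group-by. The names materialize and top_k and the sample rows are hypothetical.

```python
from collections import Counter
from itertools import combinations


def materialize(rows, ranking_dims):
    """Precompute, for every non-empty subset of ranking_dims, its groups ranked by COUNT.

    The sorted list per group-by is a stand-in for the slide's "ranking fragments".
    """
    cuboids = {}
    for r in range(1, len(ranking_dims) + 1):
        for dims in combinations(ranking_dims, r):
            counts = Counter(tuple(row[d] for d in dims) for row in rows)
            # Sort by count descending, then by group key to break ties.
            cuboids[dims] = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return cuboids


def top_k(cuboids, dims, k):
    """Answer a top-k aggregate query by reading the precomputed ranked list."""
    return cuboids[tuple(dims)][:k]


# Illustrative data only.
rows = [
    {"conference": "VLDB", "year": 2006, "author": "a1"},
    {"conference": "VLDB", "year": 2006, "author": "a2"},
    {"conference": "VLDB", "year": 2006, "author": "a3"},
    {"conference": "SIGMOD", "year": 2006, "author": "a1"},
    {"conference": "SIGMOD", "year": 2006, "author": "a4"},
    {"conference": "KDD", "year": 2002, "author": "a5"},
]
cuboids = materialize(rows, ["conference", "year"])
by_conf = top_k(cuboids, ["conference"], 2)
```

Materializing every group-by trades storage for query time: the number of cuboids grows exponentially with the number of ranking dimensions, which is presumably why the previous slide mentions partial materialization for high dimensionality.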

  30. DataScope • Motivation • Contributions • Demonstration • Architecture • Design & Implementation • Future Work

  31. Conclusion • DataScope • Extend the current prototype to support mapping operations and multiple sessions • Improve design principles which can lead to a more effective interface • Support various ranking queries efficiently

  32. Future work • Interface • Improve the initial system prototype • Support the full set of operations • Support easy customization • Implementation • Rich research issues • Ranking objects based on user feedback • Retrieving the most relevant objects in keyword search • Multiple types of ranking queries

  33. Thank you! Any questions?
