Explore a dynamic query mediator for interactive data exploration with millions of records and advanced statistical measures. Utilize a graphical user interface to interact with a unified results system and traditional DBMS.
Interactive Dynamic Aggregate Queries Kenneth A. Ross Junyan Ding Columbia University
Scenario Outline User Web Data Request Dynamic Query Mediator ... Graphical User Interface Unified Results Traditional DBMS Dynamic Query Engine Data Files e.g., PUMS
Engine Decoupled from Interface • Can use a variety of interfaces • Multiple connections to one server • Can “do one thing well” • Client/Server parallelism • Abstract interaction via API
Engine Performance Goals • Interactive data exploration • Millions of records • Thousands of columns (but look at ten or so at a time) • Aggregates and statistical measures • Fine adjustments at 30 answers/second.
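The 30 answers/second goal leaves about 1/30 s, roughly 33 ms, to recompute each answer after a fine adjustment. The following is a minimal sketch of the kind of range-filtered aggregate such an engine would recompute, using plain in-memory columns. The function name, the column roles, and the choice of count/sum/mean are illustrative assumptions, not the engine described in the slides.

```python
from array import array

# Time available per answer at 30 answers/second (about 33 ms).
FRAME_BUDGET_S = 1.0 / 30.0


def range_aggregate(filter_col, value_col, lo, hi):
    """Return (count, sum, mean) of value_col over rows where lo <= filter_col <= hi.

    Hypothetical helper: a single linear scan over two in-memory columns.
    The mean is None when no row qualifies.
    """
    count = 0
    total = 0.0
    for key, value in zip(filter_col, value_col):
        if lo <= key <= hi:
            count += 1
            total += value
    mean = total / count if count else None
    return count, total, mean


# Tiny illustrative columns (made-up values, not PUMS data).
ages = array("i", [1, 5, 10])
incomes = array("d", [10.0, 20.0, 30.0])
result = range_aggregate(ages, incomes, 2, 10)
```

A plain scan like this is only a baseline. With millions of records it would likely miss the 33 ms budget in an interpreted language, which is presumably why the slides point to main-memory index structures and low-level tuning.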
Technical Details • Main Memory Implementation • Multidimensional tree structures • Cache consciousness • Branch Misprediction • SIMD • Asynchronous work
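The slide lists branch misprediction without saying how the engine deals with it. One well-known way to remove the data-dependent branch from a selection scan is predication: always write the candidate row index, and advance the output cursor by the result of the comparison. The sketch below shows only the logic, under that assumption. The function names are hypothetical, and Python will not show the hardware effect, which appears in compiled code.

```python
def select_branching(values, lo, hi):
    """Conventional scan: one data-dependent branch per row."""
    out = []
    for i, v in enumerate(values):
        if lo <= v <= hi:
            out.append(i)
    return out


def select_predicated(values, lo, hi):
    """Predicated scan: no 'if' on the data inside the loop.

    Every iteration writes the row index at the cursor; the cursor only
    advances when the predicate is true, so rejected rows are overwritten.
    """
    out = [0] * len(values)
    j = 0
    for i, v in enumerate(values):
        out[j] = i                      # unconditional write
        j += (lo <= v) & (v <= hi)      # True counts as 1, False as 0
    return out[:j]


# Illustrative data only.
sample = [5, 1, 7, 3, 9]
matches = select_predicated(sample, 3, 7)
```

Both functions return the same row indexes. The predicated form trades a few extra writes for a loop body whose control flow does not depend on the data.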
GlossIT GetGloss Internet Xml glossary info .gov url Sensus ParseGloss
GlossIT Automatic Ontologies from Web Pages Judith L. Klavans Peter K. Davis Samuel Popper Columbia University
Where are Glossaries? Internet
ParseGLOSS Building an Ontology ParseGloss
Output for SENSUS Ontology SENSUS
Data Users Social Science Research Data Component Electronic Data Service – Columbia Univ Librarians and Data Specialists Steady stream of different user groups Collect user logs and interview users Coordinated by Walter Bourne
DGRC User Interface Testbed Menu presented as grid of alternating rows and columns • Top level items in left column Ontology entry shown in beam for selected item • Located as near as possible
DGRC User Interface Testbed Color coding shows parental and semantic relationships
DGRC User Interface Testbed Fisheye magnification of region of interest Magnified group laid out to avoid internal overlap
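The slides do not say which fisheye transform the testbed uses. As an illustration, here is a one-dimensional sketch of the widely used Sarkar-Brown graphical fisheye function, G(x) = (d + 1)x / (dx + 1), where x is the normalized distance from the focus and d is the distortion factor. The function name and parameters are assumptions for this example, and the sketch does not cover the overlap-avoiding layout mentioned on the slide.

```python
def fisheye_1d(pos, focus, distortion, lo=0.0, hi=1.0):
    """Map a position in [lo, hi] to its fisheye-distorted position.

    Points near the focus are pushed apart (magnified); the focus and the
    two boundaries stay fixed. distortion = 0 leaves positions unchanged.
    """
    if pos == focus:
        return pos
    # Distance from the focus to the boundary on the same side as pos.
    boundary = hi if pos > focus else lo
    d_max = boundary - focus
    if d_max == 0:
        return pos
    x = (pos - focus) / d_max                          # normalized, in [0, 1]
    g = (distortion + 1) * x / (distortion * x + 1)    # Sarkar-Brown G(x)
    return focus + g * d_max


# Hypothetical example: focus at the middle of a unit-length menu axis.
moved = fisheye_1d(0.6, focus=0.5, distortion=3)
```

With the focus at 0.5 and a distortion of 3, an item at 0.6 moves to about 0.75, so the region around the focus takes up more of the display while items near the edges are compressed.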
Goals of Evaluation • Optimize the effectiveness of the interface • Identify usability problems • Provide feedback on the overall functionality • Anticipate changes in user need that might drive future development • Validate the design • Indicate the extent to which the interface improves on previous interfaces
Methods of Evaluation Interviews with Experts Analysis of DataGate Interface Design and Testing with Heuristic for Database Interface User and Task Analysis
Interview Findings User Type Identification • Novice and Power/Expert Users User Goals Kinds of Questions Types of Searches Related Terms for Searches • Difficulty of Use of Alternative Terms Selecting the database Learning to Use the Interface • Innovative Interface • Need Orientation and Time to Familiarize with the Interface
Interview Findings Searching Styles Flexibility to Searching Styles Helping the User Define the Search • Help users to Visualize the Context and Structure of Information • Definition and Redefinition of Search Standardization Problems Suggestions for the Design
Census Characteristics and Interface Possibilities Variables Hierarchical Structure Massive Amount Terminology Definitions Change Obscure Terminology Census Question Change Geographical References Boundaries Change Unique Boundaries Codes for Areas Various Meanings for Same Names Content Visualization Display Information Organization Dynamic Menu Magnification on Selected Items with Full Content Zoom In, Zoom Out Manipulate the Level of Magnification Searchlight Multiple Layers of Display Alternative Terms Definition of Terms Alternative Pathways Create Dynamic Maps