1 / 40

Design data retrieval and manipulation for subset of ‘Gombe’ database using QBE Durga Gumaste

Design data retrieval and manipulation for subset of ‘Gombe’ database using QBE Durga Gumaste Advisor: Dr. Shashi Shekhar June 10, 2003. Agenda. Objective Background and Motivation Related work and my contribution Porting of tables Query description Query optimization Summary Demo.

fionn
Download Presentation

Design data retrieval and manipulation for subset of ‘Gombe’ database using QBE Durga Gumaste

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Design data retrieval and manipulation for subset of ‘Gombe’ database using QBE Durga Gumaste Advisor: Dr. Shashi Shekhar June 10, 2003

  2. Agenda • Objective • Background and Motivation • Related work and my contribution • Porting of tables • Query description • Query optimization • Summary • Demo

  3. Objective Design and implement queries to access and manipulated ‘Gombe’ chimpanzees data subset, such that queries can be modified by the user having no background of any Data Manipulation Language(DML)

  4. Agenda • Objective • Background and Motivation • Related work and my contribution • Porting of tables • Query description • Query optimization • Summary • Demo

  5. Background and motivation • Data about ‘Gombe’ chimpanzees • Collected since 1953 • Behavioral and location data • 15 tables • Average size: 10-12 MB • Dr. Jane Goodall has done active research of ‘Gombe’ chimpanzee for last 35 years • Jane Goodall Institute's Center for Primate Studies at the University of Minnesota • Data retrieval for analysis on the data • Frequent query modification • Ease of modification

  6. Sample in ‘Gombe’ database

  7. Relationship between tables

  8. Agenda • Objective • Background and Motivation • Related work and my contribution • Porting of tables • Query description • Query optimization • Summary • Demo

  9. Related work • Earlier implementations • Oracle • Paradox • Limitations of earlier implementations • Ecologists not comfortable in modifying PL/SQL queries in Oracle • Paradox is not licensed at University of Minnesota

  10. Present Implementation • Database: Microsoft Access 2000 • Ecologists familiar with MS Access environment • Desktop database • Microsoft office suite • University of Minnesota has a license for Microsoft Office • Provides QBE

  11. My contribution • Port ‘Gombe’ database to MS Access • Implement new queries using MS Access • Helped behavioral ecologists to modify queries using MS Access QBE • Optimize queries

  12. Agenda • Objective • Background and Motivation • Related work and my contribution • Porting of tables • Query description • Query optimization • Summary • Demo

  13. Port tables to MS Access • Create tables in design view • Apply primary key constraints • Apply referential integrity constraints • Import tables using import utility

  14. Verification of porting • Number of records present in .txt files • Follow: 8459 records • Count of records by count query

  15. Agenda • Objective • Background and Motivation • Related work and my contribution • Porting of tables • Query description • Query optimization • Summary • Demo

  16. Queries Nested, join, range Q1: Find all chimps arriving alone Q2: Include mothers arriving with off springs in Q1 Q3: Include siblings in Q1 Q4: include mothers and siblings in Q1 Q5: Find chimps arriving together with other chimp Single table, aggregate, point Q6: Find food count of food items in a particular month of a year (Find % food counts) Q7: Find duration for which food items are eaten in a particular month of a year(Find % food duration)

  17. follow_arrival (A) follow_arrival (B) 1. Inner join (self join) on follow_arrival 2. Select chimps having fa_time_start difference more than 5 minutes for a particular follow on a particular date Inner join on A and B (self join) A.date=B.date A.follow=B.follow A.chimp<>B.chimp 3. Take location coordinates for such chimps from follow_map_position table by joining follow_arrival table with follow_map_position table Inner join with F A.date=F.date A.follow=F.follow A.chimp<>F.focal A.seq = F.seq Result Set Follow_map_position (F) Implementation (Q1) Chimps are said to be alone when arrival time between 2 chimps is more than 5 minutes Arrival time difference between A.chimp and B.chimp > 5 minutes

  18. MS Access Implementation for Q1 • Few Inner joins conditions cannot be displayed in QBE • MS Accesses uses Dyna sets • Sub-query over base query • Base query in SQL (views) • Sub-query in QBE • Sub-query easy to change by ecologists

  19. MS Access Implementation for Q1

  20. Comparison - PL/SQL & QBE for Q1

  21. Comparison - PL/SQL & QBE for Q1

  22. Q1 extension Chimps arriving alone • Mothers arriving with off springs are counted as arriving alone (Q2) • Chimps which arrive with their siblings are counted as arriving alone(Q3) • Both Q2 and Q3 (Q4)

  23. chimp_id AL AO AP AR AT Sum of certainties Follow_arrival Certainty Value 1 1 0 0 not observed blank Follow_map_time Time interval adjustment 10:03  10:00 10:11  10:15 Follow_Arrival Derived table • Written in VB Group_composition_table AL AO AP AR 1 1 0 1 3 AL740101 1/1/1974 10:00 AM 2

  24. Agenda • Objective • Background and Motivation • Related work and my contribution • Porting of tables • Query description • Query optimization • Summary • Demo

  25. Query optimization in MS Access • Cost bases query optimization • MS Jet 4.0 • Table statistics • Rushmore optimization • Efficient use of indexes • Index intersection,union,minus

  26. Performance evaluation

  27. Compacting database Compact database using Compact utility provided by MS Access • De-fragmentation • Reordering database pages • Reclaim unused space • Original size: 1.1 GB • After compaction: 284 MB • Flags queries needing recompilation

  28. Summary • Porting of data to MS Access • Query modification using QBE • Ease of writing and modifying queries • GUI • Queries over views • Base queries in SQL and sub-queries in QBE • Access uses dyna-sets • Derived tables created in VB • Multiple queries • Onetime queries • Indexes on join attributes improve the performance by 90-95%

  29. Future work • Updating group composition table if new chimp gets added to chimp table

  30. Demo • Query Description • Find out chimpanzees arriving alone (include relations) • Base query (view) in SQL • Sub-query in QBE • Modifications in QBE • Results of modifications

  31. Base query (view) in SQL

  32. Sub-query in QBE

  33. Modifications in query (1)

  34. Results(1)

  35. Modifications in query (2)

  36. Results(2)

  37. Modifications in query (3)

  38. Results(3)

  39. Acknowledgements • I would like to thank Prof. Shekhar for giving me this wonderful opportunity to work on this project and his precious guidance from time to time • I would also like to thank Prof. Pusey, Carson Murray, and Ian Gilby of Ecology Department for their help in understanding Gombe database. • I would like to thank Prof. Pusey and Prof. Srivastava for their time today

  40. Thank You!!

More Related