Getting Started Writing a Thesis/Dissertation

Dr. Karen C. Davis, Electrical & Computer Engineering Dept. Includes an announcement for ECES 877 Advanced Data Models and Query Optimization, Spring 2007 (query optimization: logical, physical; advanced data models: object-relational, data warehouse, XML).

Presentation Transcript


  1. Getting Started Writing a Thesis/Dissertation. Dr. Karen C. Davis, Electrical & Computer Engineering Dept.

  2. Graduation

  3. ECES 877 Advanced Data Models and Query Optimization • query optimization: logical, physical • advanced data models: object-relational, data warehouse, XML • Spring 2007: coming to a classroom near you!

  4. Relational Algebra Query Trees. From Sujan Turlapaty’s thesis defense: Performance Analysis of Self-Maintainable Data Warehousing Algorithms, 11/99

  5. Multiple View Processing Plan (MVPP). From the thesis defense of Sirisha Machiraju: Space Allocation for Materialized Views and Indexes Using Genetic Algorithms, June 2002 • view chromosome: 101100010100001 • index chromosome: 1100110 • Fitness: sum of the query processing costs of the individual queries using the views and indexes selected
     [Figure: MVPP query tree with nodes v1 to v15, listed from top to bottom]
     • queries (as drawn): Q2, Q3, Q1
     • result projections: π O.orderkey, O.shippriority (v9); π C.custkey, C.name, C.acctbal, N.name, C.address, C.phone (v12); π P.type, L.extendedprice (v15)
     • selections: σ C.mktsegment = “building” and L.shipdate = “1995-03-15” (v8); σ O.orderdate = “1994-10-01” (v11); σ L.shipdate = “1995-09-01” (v14)
     • joins: ⋈ nationkey (v10); ⋈ orderkey (v7); ⋈ custkey (v6); ⋈ partkey (v13)
     • base projections: π name, address, phone, acctbal, nationkey, custkey, mktsegment (v1); π orderkey, orderdate, custkey, shippriority (v2); π partkey, orderkey, shipdate, extendedprice (v3); π nationkey, name (v4); π partkey, type (v5)
     • base relations: Customer (C), Orders (O), Lineitem (L), Nation (N), Part (P)
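
     The chromosome encoding and fitness definition on this slide can be illustrated with a minimal sketch. It assumes (the slide does not say so) that bit i of the view chromosome marks whether MVPP node v_i is materialized and that bit j of the index chromosome marks whether the j-th candidate index is built. The names `decode`, `fitness`, and `toy_cost` are hypothetical, and `toy_cost` is a made-up placeholder, not the cost model from the thesis.

```python
from typing import Callable, Iterable, List, Set


def decode(chromosome: str) -> List[int]:
    """Return the 1-based positions of the bits set to '1'."""
    return [i for i, bit in enumerate(chromosome, start=1) if bit == "1"]


def fitness(
    view_chromosome: str,
    index_chromosome: str,
    queries: Iterable[str],
    query_cost: Callable[[str, Set[int], Set[int]], float],
) -> float:
    """Sum the cost of each query under the selected views and indexes."""
    views = set(decode(view_chromosome))
    indexes = set(decode(index_chromosome))
    return sum(query_cost(q, views, indexes) for q in queries)


def toy_cost(query: str, views: Set[int], indexes: Set[int]) -> float:
    # Hypothetical placeholder: every selected view or index lowers the cost
    # of every query by a fixed amount. A real cost model would depend on
    # which nodes the query actually uses.
    return 100.0 - 5.0 * len(views) - 2.0 * len(indexes)


# Chromosomes copied from the slide.
selected_views = decode("101100010100001")
selected_indexes = decode("1100110")
total = fitness("101100010100001", "1100110", ["Q1", "Q2", "Q3"], toy_cost)
```

     Under the assumed mapping, the view chromosome selects v1, v3, v4, v8, v10, and v15. A genetic algorithm would evaluate this fitness for each candidate pair of chromosomes and prefer those with lower total cost.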

  6. BH System Architecture. From Michael Brant, Binding Hash Technique for XML Query Optimization, 2006

  7. My Students
     Ph.D. (2)
     • Satish Venkatesan, Database Modeling for Electronic Design Automation Environments, 1996 (awarded the ECECS Outstanding Dissertation Award, 1996).
     • Yunsong Zhan, XML-based Data Integration for Application Interoperability, 2002.
     M.S. (24)
     • Lun Ye, A Compiler Cooperative Dynamic Memory Management System for C++, 1993.
     • Ron Meade, EasyOpt: A Design Optimization Interface Package, 1994.
     • Rao Seshagiri Kasinadhuni, Design and Performance Issues of Client-Server DBMS Architectures, 1994.
     • Samir Nigam, Transformation-based Semantic Query Optimization for Object-Oriented Databases, 1994.
     • Baskaran Dharmarajan, The Property Map: A Theoretical Foundation and Query Optimization Algorithms, 1997.
     • Mala Rajamani, Reduction and Maintenance of Self-maintainable Views for Data Warehousing, 1997.
     • Veena Pandiri, A Global Framework for Distributed Agent-based Systems, 1997.
     • Radha Ganapathy, Selection of Self-Maintainable Views to Materialize in a Data Warehouse, 1998.
     • Vishal Sheth, Extended Property Maps: An Efficient Access Mechanism for Retrieval from Large Data Sets, 1998.
     • Gayathri Krishnan, Physical Schema Design for Object Databases, 1998.
     • Shobha Ravishankar, Object-Oriented Index Selection and Integration, 1998.
     • Ji Qin, Access Plan Generation for Property Maps and Multidimensional Indexes, 1999.
     • Sujan Turlapaty, Performance Analysis of Self-Maintainable Data Warehousing Algorithms, 1999.
     • Unmi Tina Kang, Path Inherited Dictionary Index (PIDI): An Integrated Object-Oriented Database Index, 2000.
     • Jennifer Grommon-Litton, Heuristic Design Algorithms and Evaluation Methods for Property Maps, 2000.
     • Rajeswari Malladi, Applying Multiple Query Optimization in Mobile Databases, 2001.
     • Xioaming Du, Dynamic Channel and Broadcast Disk Organization in Mobile Databases, 2001.
     • Krishnamoorthy Janakiraman, Entity Identification Using Data Mining Techniques, 2001.
     • Casie Phipps, Migrating an Operational Database Schema to Data Warehouse Schemas, 2002.
     • Ashima Gupta, Performance Comparison of Property Map Indexing and Bitmap Indexing for Data Warehousing, 2002.
     • Sirisha Machiraju, Space Allocation for Materialized Views and Indexes Using Genetic Algorithms, 2002.
     • Ravi Darira, A Design Framework for Property Maps, 2006.
     • Micheale Brant, Binding Hash Technique for XML Query Optimization, 2006.
     • Janet Rajan, A Framework for Medical Acronym Disambiguation, 2007.

  8. Thesis/Dissertation Organization • title page, abstract, dedication, table of contents, list of figures, list of tables • Introduction • Related Research • Foundations • Results (may be several chapters) • Conclusions and Future Work • Appendices

  9. Sample Table of Contents

  10. Introduction • introduce the general topic area • narrow the focus to specific topic • motivate the research • why is it needed? • who will benefit from the research? • conclude with a clear statement of the problem • give a statement of the work • provide an overview of the thesis (one sentence per chapter)

  11. Sample Introduction • conventional database systems are increasingly leveraged for organizational decision-making • analysis systems are different from conventional operational systems because … • … because of these differences, designing a data warehouse has challenges … • this thesis addresses specific phases of design

  12. Research Objectives • general research objective: one sentence describing what you hope to accomplish (not how!)

  13. Parallel Sections: Statement of the Work • specific research objectives: partition the general objective into sub-goals • research plan/methodology/tasks/approach: revisit the objectives • your approach to solving the problem • each objective has an associated task or approach to satisfy the objective • expected contributions: revisit the methodology • what will you know or have when you’ve done the task? • potential impact of your work

  14. Sample Parallel Sections • specific research objectives accomplish the general research objective • methodology defines the approach to accomplishing the objectives • expected contributions describe what will be accomplished by executing the methodology

  15. Related Research • focused around your topic; not a tutorial! • compare/contrast to your approach • tables with features/research efforts are a concise, readable way to summarize

  16. Examples of Summary Tables

  17. Foundations • work you build on (your own or someone else’s) • definitions, theorems, models, system

  18. Research • Discuss: conventions, setup, hypotheses of experiments, proofs • why did you do it? • what did you learn from it? • Presenting: figures, algorithms, tables, graphs • Sample! Don’t do a dump of everything … put everything in appendices and discuss representative results in the body of the thesis or dissertation

  19. Example Experiment Setup
     Goals
     • What are the comparative storage and retrieval costs of REBSI and PMaps in different scenarios?
     • How is individual and relative performance affected by parameters such as blocksize, database size, selectivity of queries, cardinality of attributes, kind of queries, and property ordering?
     • Can PMap design and performance be improved using this knowledge?
     • Under what conditions is it better to use either index?
     [Figure: experiment setup]
     • components: PMap Creator; PMap storage and performance measurement simulator; REBSI storage and performance measurement simulator
     • inputs: Query Set; PMap, [1…6] properties (pu, pstring); Word Size (ws) {16, 32}; Tuple size (t) {1,000,000, 50,000}; Blocksize (SB) {2048, 4096, 8192}; Scaling Factor (sf) {min, …, 10}
     • outputs: PMap Performance and Storage Cost; REBSI Performance and Storage Cost
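
     As a rough illustration of how the parameter values on this slide combine into experiment configurations, the following sketch enumerates every combination of word size, tuple size, and blocksize. The function name `experiment_grid` and the dictionary keys are hypothetical. The scaling factor is left out because the slide gives its range only as {min, …, 10}.

```python
from itertools import product

# Parameter values as listed on the slide.
WORD_SIZES = [16, 32]               # ws
TUPLE_SIZES = [1_000_000, 50_000]   # t
BLOCK_SIZES = [2048, 4096, 8192]    # SB


def experiment_grid():
    """Enumerate every (ws, t, SB) combination as a dict."""
    return [
        {"ws": ws, "t": t, "SB": sb}
        for ws, t, sb in product(WORD_SIZES, TUPLE_SIZES, BLOCK_SIZES)
    ]


configs = experiment_grid()
for config in configs:
    print(config)
```

     Writing the grid down this way makes the size of the experiment explicit (2 × 2 × 3 = 12 configurations before queries and scaling factors are varied), which helps when deciding which representative results belong in the body and which belong in an appendix.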

  20. Example Presentation of Results
     • number figures (e.g., Figure 3.2)
     • refer to the figures in text: “In Figure 3.2, results for the HCAQS are shown.” “Figure 3.2 shows HCAQS results.”
     • explain the conventions: “The x-axis shows individual queries and the y-axis shows index pages retrieved. The queries are ordered by decreasing cardinality.”
     • offer observations to help the reader see what is important or interesting: “REBSI performance improves as the cardinality decreases.”
     • discuss possible reasons for the observed results
     • give general conclusions
     [Figure: example results graph] Queries are ordered by the difference between REBSI with min_sf and PAvg.
     Observations:
     • REBSI performance improves as the sf becomes larger.
     • REBSI performance improves as cardinality becomes smaller.
     • PAvg performance deteriorates as cardinality becomes smaller.
     • PAvg is better than REBSI min_sf (4) for all queries.
     • PMin << pages retrieved by any REBSI.
     • PMap retrieves fewer pages for multi-attribute queries than single attribute queries.

  21. Conclusions and Future Work • revisit objectives • what was accomplished? • what was learned? • topics for future work • extensions • open questions

  22. Conclusions: • BH method works well for deeply nested queries with few branches (non-bushy) • BH indexing technique requires further optimization • BindingCollection is a flexible data structure • Can be used to generate witness trees for processing embedded XPath expressions • Can be used to process XPath expressions directly • Can use different indexing schemes Future Work: • Modify indexing technique to increase performance and perform inequality matching • Expand Post-order Traversal to support more TAX pattern tree features, e.g., value-based joins • Conduct a more extensive performance study

  23. Citations • allow the reader to follow up on the topic • fill in background information • judge what you’ve said by reading original sources • relieve you of the burden of going over all territory on a subject • strengthen/justify your point • respect your peers by acknowledging their contributions [vL78]

  24. Citations • not a part of speech! • #11: never, ever, use a bracketed number as if it were the name of an author or a work [vL78]: • BAD: “In [23], algorithms are presented …” • GOOD: “Jones presents algorithms … [23].”

  25. Writing Style • avoid vague words (e.g., “deals with,” “handles”) • avoid contractions • be consistent in spelling, punctuation, capitalization style • use the same grammatical style for items in a list • develop flow/transitions between paragraphs, sections, chapters • avoid empty sections • merge/eliminate single-item sublists or subsections • place punctuation inside quotes • avoid second person (“you …”) • try to write in only one verb tense, preferably the present tense • use “including …” instead of “etc.” • use “such as” instead of “like” • put math in definitions, theorems, proofs; explain in English to build the reader’s intuition • use “:” instead of “-” in technical writing • put a space after “)” and “:” (not before!) • use “that” instead of “which” when not counting things

  26. References • [s99] Strunk, The Elements of Style, New York: bartleby.com, 1999, http://www.bartleby.com/141. • [vL78] van Leunen, M.-C., A Handbook for Scholars, Alfred A. Knopf, 1978.

  27. Current Research Work • Sandipto Banerjee, Ph.D. • Bartley Richardson, Ph.D. • Lydia Fitzgerald, M.S. • Bill Nicholson, M.S./Ph.D.
