1 / 28

Impact Analysis of Database Schema Changes

Impact Analysis of Database Schema Changes. Andy Maule, Wolfgang Emmerich and David S. Rosenblum London Software Systems Dept. of Computer Science, University College London {a.maule|w.emmerich|d.rosenblum}@cs.ucl.ac.uk. 2008. Database Management Systems (DBMS). Provide concurrent access

lalo
Download Presentation

Impact Analysis of Database Schema Changes

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Impact Analysis of Database Schema Changes Andy Maule, Wolfgang Emmerich and David S. Rosenblum London Software Systems Dept. of Computer Science, University College London {a.maule|w.emmerich|d.rosenblum}@cs.ucl.ac.uk 2008

  2. Database Management Systems (DBMS) Provide • concurrent access • efficient execution of complex queries over large datasets Often used with OO languages (C++, Java, and C#) • Results in impedance mismatch problem • Complicates impact analysis

  3. Motivation The effects of DB schema change estimated manually fragile and difficult expensive Proposition of assessing the effects of DB schema change in an automatic more reliable and cost-effective way Not the reconciliation of the impacts themselves, but rather the difficulty of discovering and predicting them

  4. Example No default value Null values not allowed

  5. The Impact of Schema Change Change3 (delete Name) • any location in the application which will behave differently • required to behave differently Change2 Change1

  6. Data Access Practices types DBConnection, DBRecord, DBRecordSet, DBParam belonging to the persistence API Entry point Need to know: • State of these types • Exact query • Where/how results are used

  7. Approach • Program Analysis (PA) is compile-time techniques for approximating the run-time properties of program • Previously has been used to extract queries from OO languages • String Analysis (SA) is a form of PA where the possible run-time values of string variables are predicted for selected locations in the program. • Gould et al used SA to predict the values of strings passed to the Java JDBC library methods, in order to check that the queries are type safe with respect to the DB schema • Christensen et al created the JSA application using SA

  8. Approach Overview(Requirements for Context-Sensitivity) • Specifies how precisely the calling context of procedures are represented in dataflow analyses k-CFA • where all or some of the propagated data in the dataflow analysis include a call string that represents the last k calling call-sites

  9. Approach Overview(Required Precision of Context-Sensitivity) distinguishes between different values of the variables belonging to separate calling contexts

  10. Approach Overview

  11. Program Slicing • Program Slicing • Extraction of a subset of the source application that can affect, or be affected by the DB calls • Why? • As k increases, k-CFA analysis has exponential complexity with respect to program size

  12. Program Slicing Example • Cite the code used in the paper as an example and show the how the slicing will reduce the code.

  13. Dataflow Analysis • Computing set of runtime properties that can occur at a given point in time of a program • The analysis is based on string analysis by Choi et al. • Two modifications in this existing string analysis • Increase Context Sensitivity • Addition of query types to the string analysis

  14. Dataflow Analysis – Increasing Context Sensitivity • Modify Choi et al’s algorithm from 1-CFA to k-CFA • Modify the property space of dataflow analysis • Abstract variables and abstract heap locations are distinguished from context locations • Extending identifiers to include a string of k call sites

  15. Dataflow Analysis – Add query types to String Analysis • query types – denotes all query representing types and those that are involved in execution and use of database queries • Generate a dataflow graph by performing a standard fixed point iteration of the graph from the slicing stage • Result: • All query representation types have an estimated set of possible runtime values • Other query type objects (returned result sets) have unique identifiers associated based on their instantiation information

  16. Dataflow Analysis – Extracting Dataflow Information

  17. Impact Calculation • Involved in the prediction of possible effects of database schema change • Use CrocoPat [tool that efficiently executes relational programs against arbitrary data]

  18. Impact Calculation (contd…) +

  19. Implementation • Currently used only for C# applications that use SQL Server databases • Total size of SUITE = 19 KLOC (written in C#)

  20. Evaluation - Basis • What is this evaluation for ? • Evaluate the feasibility of the technique presented in this paper • To generalize the evaluation, the subject application had to represent the real world practice for database driven applications

  21. Evaluation - Setup • Subject Application: irPublish (content management system) • Consists of 127 KLOC of C# code • Uses database schema of up to 101 tables and 615 columns and 568 stored procedures • Three (interesting changes) out of 62 previous schema changes were chosen for evaluation • System Configuration used: • 2.13GHz Intel Pentium Processor • 1.5GB RAM

  22. Evaluation – Setup (contd…) • Schema changes used in this evaluation are as shown: • Value of k used for dataflow analysis is 2 (there were places where the value for k need to be set up to 7)

  23. Evaluation Summary – Comparison of Predicted changes vs. Observed changes • Program Slicing reduced the size to 37% (from 191173 instructions to 70050 instructions)

  24. Evaluation - Execution times of the analysis vs. increase in context sensitivity

  25. Summary of Results • Highlights the importance of context sensitive program analysis • High level of context sensitivity is required in many real world architectures where similar architecture patterns are used • The types of schema change that occurred agree with the predictions of the study

  26. Related Work • Impact analysis of database schema • A. Karahasanovic - Supporting Application consistency in Evolving Object-Oriented Systems by Impact Analysis and Visualization [2002] • String analysis • A. S. Christensen, A. Moller, and M. I. Schwartzbach. - Precise analysis of string expressions [2003] • C. Gould, Z. Su, and P. Devanbu. Static Checking of Dynamically Generated Queries in Database Applications. [2004] • T.-H. Choi, O. Lee, H. Kim, and K.-G. Doh. - A practical string analyzer by the widening approach. [2006] • Dataflow analysis • S. Horwitz, T. Reps, and M. Sagiv. - Demand interprocedural dataflow analysis [1995]

  27. Future Work • Investigate alternatives to program slicing technique to reduce cost of dataflow analysis • To analyze impact of the available impact analysis techniques on the development of database applications

  28. Conclusion SUITE demonstrates to be more precise than the related work currently available in the area of impact analysis of database changes

More Related