Mega Software Engineering and EASE Project

International Workshop on Community-Driven Evolution of Knowledge Artifacts, UC Irvine, Dec. 16-18, 2003. Katsuro Inoue, Osaka University.


Presentation Transcript


  1. International Workshop on Community-Driven Evolution of Knowledge Artifacts, UC Irvine, Dec. 16-18, 2003 • Mega Software Engineering and EASE Project • Katsuro Inoue, Osaka University

  2. Overview • Proposed a concept of Mega Software Engineering (MSE), which shares experiences and knowledge in a community • Introduced the EASE project, based on the concept of MSE • Presented an overview of the Empirical Environment and showed the current implementation of the Empirical Project Monitor (EPM), a partial realization of the Empirical Environment • Predicted ongoing directions toward deeper analyses of empirical data

  3. Empirical Software Engineering • Various technologies in Software Engineering based on empirical data • Essential for scientific improvement of project processes and products

  4. 3 Major Phases in Empirical SE • Collection • Analysis • Improvement

  5. Classification of ESE Technologies by Target Scale Mega Software Engineering

  6. Mega Software Engineering (MSE) • Targets many projects • A new concept but not a new technology itself • Collection of key technologies already existing and emerging • Distributed environment and data sharing • Analysis and data mining • Project monitoring and controlling • Scalable computing • ... • Uses advances in hardware performance, e.g., network bandwidth, CPU clock, memory space, disk capacity, ... • Software engineering technology should share in the advances of hardware, which are mainly used for multimedia, grid, simulators, ...

  7. Characteristics of MSE • Experience and knowledge of individual developers or projects are collected, refined as assets, and reused in the community • Single-level, flat, static community for information sharing • Automatic process: little burden is required of each developer or manager • View from the organization: benefits may be directly obtained (no individual developer’s view or project view) • Open source development is a simple case of MSE (MSE focuses on analysis and feedback)

  8. EASE Project • Empirical Approach to Software Engineering • Uses the concept of MSE as its basis • Funded by MEXT (Japanese government, Ministry of Education, Culture, Sports, Science and Technology) • 5-year project starting in 2003 • Senri Lab.

  9. Project Target Empirical software development environment from 1 to thousands of projects Empirical Environment

  10. Project Objectives • Development of empirical environment • Application of empirical environment to real projects • Collection of data and expertise of empirical SE • Organizational benefits by applying empirical environment

  11. Concept of Empirical Environment (diagram labels: Software Development Organization, Related Organization, Internet, Public Domain Software, Open Source Project; cycle of Collection, Analysis, Improvement)

  12. Implementing Empirical Environment

  13. (1) Policy for Collection • Goal first (ideal cases) → Data collection first (realistic approach) • Collect mainly product data (obtain process data from product data) • Minimize developers' overhead for collection • Raw data without human tampering • Real-time collection • Applicable to various projects • Small scale • Non-waterfall process such as XP • Distributed development including sub-contracting

  14. (2) Policy for Analysis • Step-wise implementation, from simple to difficult • 1. Process / product metrics inside single project • 2. Inter-project metrics • 3. Classification and evolution • 4. Reuse comp./ expertise • 5. …

  15. (3) Policy for Improvement • Feedback method for each objective • Various mechanisms for various cases • Currently constructing a browser for visualizing collected data and measured metrics

  16. Empirical Project Monitor (EPM) • A partial implementation of the Empirical Environment • Collects, measures, and shows various data for project control • Data sources • Versioning system: CVS • Mailing list manager: Mailman • Issue tracking tool: GNATS

  17. Architecture of EPM (diagram labels: CVS, Mailman, GNATS, (WinCVS, CorporateSource); versioning history, mail history, problem history, other tool data; standardized empirical SE data (in XML); PostgreSQL (repository); analysis tools; measurement of intra and inter projects; metrics value, prediction/schedule, etc.; developer, manager)

  18. Characteristics of EPM • Uses open source development tools → easy to introduce • Small overhead of data collection • Most data from versioning history • Communication through e-mail, and recording of issues by tracking tool • Easy to transform other data formats to the standardized empirical SE data format

  19. Application Area of EPM • Large project • Share project status immediately • Reduce project management load • Reduce risk of data tampering • Small project • Apply with small cost • Apply to various projects, including XP and distributed development

  20. Features of Empirical Project Monitor

  21. EPM Analysis Tool • Single activity view • Source code size • Issue resolution time • Cumulative number of issues, number of unsolved problems, ... • Multiple activity view • Check-in and check-out • #Issue and #mail • Check-in and #issue
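
The single-activity issue metrics listed above (cumulative issues, unsolved issues, mean resolution time) can be sketched as follows. This is a minimal illustration, not EPM's implementation: the record layout, the `issue_metrics` function, and the sample dates are all hypothetical.

```python
from datetime import date

# Hypothetical issue records: (opened, resolved or None). Not data from EPM.
ISSUES = [
    (date(2003, 9, 1), date(2003, 9, 4)),
    (date(2003, 9, 10), date(2003, 9, 20)),
    (date(2003, 10, 2), None),
    (date(2003, 10, 15), date(2003, 10, 16)),
]


def issue_metrics(issues, as_of):
    """Return (cumulative, unsolved, mean_resolution_days) as of a given day."""
    # Issues raised on or before the reporting day.
    opened = [(o, r) for o, r in issues if o <= as_of]
    # Of those, the ones already resolved by the reporting day.
    resolved = [(o, r) for o, r in opened if r is not None and r <= as_of]
    cumulative = len(opened)
    unsolved = cumulative - len(resolved)
    days = [(r - o).days for o, r in resolved]
    mean_days = sum(days) / len(days) if days else None
    return cumulative, unsolved, mean_days


if __name__ == "__main__":
    print(issue_metrics(ISSUES, date(2003, 10, 31)))
```

Evaluating the function once per month gives the three series plotted in a chart of cumulative issues, unsolved issues, and mean resolution days.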

  22. Growth of LOC • Progress monitoring • Schedule vs. actual (chart: cumulative LOC by month, project EmpiriPrj)

  23. Growth of LOC (3 months) (chart: LOC by month with check-in events, project EmpiriPrj)

  24. Growth of LOC (chart: LOC by month with check-in events, open source project nkf, a character-code converter)

  25. Cumulative Issues / Unsolved Issues / Mean Resolution Time (chart: cumulative issues, unsolved issues, and mean resolution days by month, project EmpiriPrj)

  26. Check-in and Check-out (chart: number of check-outs by month with check-in events, project EmpiriPrj)

  27. CVS Log View

  28. Growth of Mail and Issues (chart: cumulative number of mails by month, with check-in, issue-raised, and issue-resolved events, project EmpiriPrj)

  29. Mail Log View

  30. Cumulative Issues and Check-in (chart: cumulative issues by month with check-in events, project EmpiriPrj)

  31. Future of Empirical Environment

  32. Extending Analysis Features • Make deeper analysis and extract organizational expertise • Find and reuse expertise easily

  33. EPM (developing) (diagram labels: Managers and Developers of Project x, Project y, Project z, ...; Versioning (CVS), Mailing (Mailman), Issue tracking (GNATS), Other tool data; Format Translators; Product data archive (CVS format), Process data archive (XML format); Code clone detection, Component search, Metrics measurement, Project categorization, Cooperative filtering; Corporate Source; GUI)

  34. Example Scenario (1) • 1. Scheduled progress of project X vs. actual progress of project X • 2. Find projects similar to X - Project categorization - Collaborative filtering (diagram: projects E, C, A, W, X, Y, V, Q, T, P)

  35. Example Scenario (2) • 3. Average reuse rate in similar projects vs. project X’s reuse rate - Code-clone detection • 4. Promote use of the software asset search engine in project X - Software asset search engine

  36. Expected Effect • Productivity can be drastically improved by reusing organizational assets • Management of assets can be easily performed • Cost control can be precisely made relative to previous similar projects • Reliability can be improved using issue history

  37. Analysis Technology (1) Fast Code Clone Detection • Code clones = similar portions of a program
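
To make "similar portions of a program" concrete, here is a minimal line-based sketch. It is not the detector used in the EASE project: the normalization rules, the window size, and the names `normalize`, `find_clones`, and `SAMPLE` are assumptions chosen for illustration. Identifiers and numbers are replaced with placeholders so that renamed copies still match.

```python
import re
from collections import defaultdict


def normalize(line):
    """Replace identifiers and numbers with placeholders and drop whitespace."""
    line = re.sub(r"[A-Za-z_]\w*", "ID", line)   # names and keywords alike
    line = re.sub(r"\d+", "NUM", line)           # numeric literals
    return re.sub(r"\s+", "", line)


def find_clones(source, window=3):
    """Group 0-based start lines whose next `window` non-blank lines match
    after normalization. Each group with two or more entries is a clone set."""
    lines = [normalize(text) for text in source.splitlines()]
    kept = [(i, text) for i, text in enumerate(lines) if text]
    buckets = defaultdict(list)
    for k in range(len(kept) - window + 1):
        chunk = tuple(text for _, text in kept[k:k + window])
        buckets[chunk].append(kept[k][0])
    return [starts for starts in buckets.values() if len(starts) > 1]


# Hypothetical input: the second block is the first with variables renamed.
SAMPLE = """\
total = 0
for x in items:
    total += x
print(total)

count = 0
for y in values:
    count += y
print(count)
"""

if __name__ == "__main__":
    print(find_clones(SAMPLE))
```

For `SAMPLE`, `find_clones` returns `[[0, 5], [1, 6]]`: the three-line windows starting at lines 0 and 5 match, as do those starting at lines 1 and 6. Hashing fixed-size windows keeps the comparison roughly linear in program size, which matters once many projects are analyzed together.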

  38. Analysis Technology (2) System Similarity Using Code-Clone Detection

  39. Analysis Technology (3) Collaborative Filtering • Representative Collaborative Outcome Adopted Focused Q & M Resources • 9 9 9 7 7.5 (target) App. A • 8 7 8 ? (missing) 8 App. B • ? (missing) 8 8 8 7 App. C • 7 6 ? (missing) 9 6 App. D
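
One common way to fill a missing value by collaborative filtering is a similarity-weighted average over the other rows. The sketch below assumes cosine similarity over the entries both rows share. The slide does not say which similarity measure is used, so this choice, the function names, and the sample ratings are hypothetical.

```python
import math


def similarity(a, b):
    """Cosine similarity over the positions where both vectors have a value."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_a = math.sqrt(sum(x * x for x, _ in pairs))
    norm_b = math.sqrt(sum(y * y for _, y in pairs))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(target, others, index):
    """Estimate target[index] as a similarity-weighted mean of others' values."""
    num = den = 0.0
    for other in others:
        if other[index] is None:
            continue  # this row cannot vote on the missing entry
        weight = similarity(target, other)
        num += weight * other[index]
        den += weight
    return num / den if den else None


# Hypothetical ratings; None marks a missing value. Not the slide's data.
PROJECT_X = [9, 8, 7, None]
OTHERS = [
    [9, 8, 7, 6],
    [2, 9, 3, 8],
    [8, 7, None, 5],
]

if __name__ == "__main__":
    print(predict(PROJECT_X, OTHERS, 3))
```

Rows that resemble the target on the known entries contribute more to the estimate, which is the sense in which similar projects inform the prediction for project X.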

  40. Analysis Technology (4) Java Class Search Engine SPARS-J

  41. Markov Model (figure values: 0.02, 0.01, 0.01, 0.05, 0.03, 0.001, 0.1) • The component rank model can be considered as a Markov chain of the user's focus • The user's focus moves from one component to another along a use relation at fixed time intervals • A node's weight represents the probability that the user's focus is on that node in the infinite future
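
The long-run probability described above is the stationary distribution of the Markov chain, which can be approximated by power iteration. The sketch below is a plain random walk over a use-relation graph, not the SPARS-J ranking implementation. The graph `USES`, the function name, and the iteration count are hypothetical, and the graph is assumed to be strongly connected and aperiodic so that the iteration converges.

```python
def stationary_weights(uses, iterations=200):
    """Approximate the stationary distribution of a random walk.

    `uses` maps each component to the components it uses. Every component
    must appear as a key and use at least one component.
    """
    nodes = sorted(uses)
    weight = {n: 1.0 / len(nodes) for n in nodes}  # start from uniform focus
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for src in nodes:
            # The focus leaves src along each use relation with equal chance.
            share = weight[src] / len(uses[src])
            for dst in uses[src]:
                nxt[dst] += share
        weight = nxt
    return weight


# Hypothetical use relations: A uses B and C, B uses C, C uses A.
USES = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

if __name__ == "__main__":
    for name, value in stationary_weights(USES).items():
        print(name, round(value, 3))
```

For this graph the weights converge to about 0.4 for A, 0.2 for B, and 0.4 for C. Components that are used by highly weighted components accumulate weight themselves, so heavily reused components rank higher.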

  42. Demo of SPARS-J http://demo.spars.info

  43. Current Status and Schedule • Current: demo version of EPM • First quarter of 2004: release of EPM • First quarter of 2005: application of EPM in industry • End of 2005: inclusion of analysis tools • User group, consortium, interest group, ...

  44. Summary • Proposed a concept of Mega Software Engineering (MSE), which shares experiences and knowledge in a community • Introduced the EASE project, based on the concept of MSE • Presented an overview of the Empirical Environment and showed the current implementation of the Empirical Project Monitor (EPM), a partial realization of the Empirical Environment • Predicted ongoing directions toward deeper analyses of empirical data
