
Using Machine Learning to Predict Project Effort: Empirical Case Studies in Data-starved Domains

Gary D. Boetticher, Department of Software Engineering, University of Houston - Clear Lake.

Presentation Transcript


  1. Using Machine Learning to Predict Project Effort: Empirical Case Studies in Data-starved Domains. Gary D. Boetticher, Department of Software Engineering, University of Houston - Clear Lake

  2. What Customers Want

  3. What Requirements Tell Us

  4. Standish Group [Standish94] • Exceeded planned budget by 90% • Exceeded schedule by 222% • More than 50% of the projects had less than 50% of the requirements

  5. Underlying Problems • 85% are at CMM level 1 or 2 [CMU CMM95, Curtis93] • Scarcity of data

  6. Consequences • Early life-cycle estimates use a factor of 4 [Boehm81, Heemstra92]

  7. Related Research: Economic Models

  8. Why are Machine Learning algorithms not used more often for estimating early in the life cycle?

  9. Related Research - 2

  10. Goal • Apply Machine Learning (Neural Network) early in the software life cycle against empirical data

  11. Neural Network

  12. Data • B2B Electronic Commerce Data (Delphi-based, 104 vectors) • Fleet Management Software (Delphi-based, 433 vectors)

  13. Experiment 1: Product-Based Fleet to B2B

  14. Experiment 1: Product Results

  15. Experiment 2: Project-Based Results Fleet to B2B

  16. Experiment 3: Product-Based B2B to Fleet

  17. Extrapolation issue • Ratio of the largest SLOC values in the two data sets: 4398 / 2796 = 1.57

  18. Experiment 3: Product Results

  19. Experiment 4: Project-Based Results B2B to Fleet

  20. Results

  21. Conclusions • Bottom-up approach produced very good results on a project basis • Results comparable between the neural network and statistical approaches • Scaling helped • Estimation approach is suitable for prototype/iterative development

  22. Future Directions • Explore an extrapolation function • Apply other ML algorithms • Collect additional metrics • Integrate with COCOMO II • Conduct more experiments (additional data)
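
The extrapolation issue on slide 17 and the "Scaling helped" conclusion can be illustrated with a small sketch. The transcript does not say which data set holds which maximum or how the scaling was actually performed, so the assignment of 2796 to the training set and 4398 to the test set, and both function names, are assumptions made only for illustration.

```python
def extrapolation_ratio(train_max_sloc: float, test_max_sloc: float) -> float:
    """Ratio of the largest test-set SLOC to the largest training-set SLOC.

    A ratio above 1 means the model is asked to predict for inputs
    larger than anything it saw during training.
    """
    return test_max_sloc / train_max_sloc


def scale_to_training_range(sloc: float, ratio: float) -> float:
    """Hypothetical scaling step: shrink a test-set SLOC value by the ratio
    so that it falls inside the range covered by the training data."""
    return sloc / ratio


# Assumed assignment of the two maxima quoted on slide 17.
TRAIN_MAX_SLOC = 2796
TEST_MAX_SLOC = 4398

ratio = extrapolation_ratio(TRAIN_MAX_SLOC, TEST_MAX_SLOC)
print(f"extrapolation ratio: {ratio:.2f}")

# After scaling, the largest test value coincides with the largest training value.
print(f"scaled maximum: {scale_to_training_range(TEST_MAX_SLOC, ratio):.0f}")
```

Any effort predicted for a scaled input would have to be mapped back to the original scale before being compared with actual effort; how the presentation handled that step is not stated in the transcript.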
