
Using Machine Learning to Predict Project Effort: Empirical Case Studies in Data-starved Domains





Gary D. Boetticher

Department of Software Engineering

University of Houston - Clear Lake


What Customers Want


What Requirements Tell Us


Standish Group [Standish94]

  • Exceeded planned budget by 90%

  • Exceeded planned schedule by 222%

  • More than 50% of the projects had less than 50% requirements


Underlying Problems

85% of organizations are at CMM Level 1 or 2 [CMU CMM95, Curtis93]

Scarcity of data


Consequences

Early life-cycle estimates use a factor of 4 [Boehm81, Heemstra92]


Related Research: Economic Models



Related Research - 2


Goal

Apply Machine Learning (Neural Network) early in the software lifecycle against Empirical Data


Neural Network
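
As a rough illustration of the kind of model named on this slide, the sketch below trains a small feed-forward network (one hidden layer of tanh units, linear output) by stochastic gradient descent. The architecture, learning rate, the two input metrics, and the toy data are all assumptions made for illustration; the presentation does not specify the network or the inputs that were actually used.

```python
import math
import random


def forward(x, w1, b1, w2, b2):
    """Return (hidden activations, network output) for one input vector."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2


def train_tiny_network(samples, hidden_units=3, epochs=2000, lr=0.05, seed=0):
    """Fit a one-hidden-layer network on (inputs, target) pairs.

    Inputs and targets are assumed to be pre-scaled to roughly [0, 1].
    All hyperparameters here are illustrative defaults, not values
    taken from the presentation.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
          for _ in range(hidden_units)]
    b1 = [0.0] * hidden_units
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden_units)]
    b2 = 0.0

    for _ in range(epochs):
        for x, y in samples:
            hidden, out = forward(x, w1, b1, w2, b2)
            err = out - y  # derivative of 0.5 * (out - y)^2 w.r.t. out
            for j in range(hidden_units):
                # Back-propagate through tanh before w2[j] is updated.
                grad_h = err * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * err * hidden[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err

    def predict(x):
        return forward(x, w1, b1, w2, b2)[1]

    return predict


# Hypothetical, made-up vectors: two scaled size metrics -> scaled effort.
samples = [
    ([0.0, 0.0], 0.00), ([0.2, 0.1], 0.13), ([0.4, 0.3], 0.29),
    ([0.5, 0.9], 0.52), ([0.7, 0.2], 0.41), ([0.9, 0.6], 0.63),
    ([1.0, 1.0], 0.80), ([0.3, 0.8], 0.39),
]
predict = train_tiny_network(samples)
```

Calling `predict([0.6, 0.4])` then returns the network's effort estimate for a new, equally hypothetical vector.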


Data

  • B2B Electronic Commerce Data

    • Delphi-based

    • 104 Vectors

  • Fleet Management Software

    • Delphi-based

    • 433 Vectors


Experiment 1: Product-Based Fleet to B2B


Experiment 1: Product Results


Experiment 2: Project-Based Results Fleet to B2B


Experiment 3: Product-Based B2B to Fleet


Extrapolation Issue

Ratio of the largest SLOC values in the two data sets:

4398 / 2796 = 1.57
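
The ratio on this slide can be reproduced directly. The sketch below also shows one possible way such a ratio might be used to rescale an estimate when the data set being predicted contains larger programs than the training data. That use, along with the function names, is an assumption for illustration and not necessarily the procedure followed in the presentation.

```python
def extrapolation_ratio(larger_max_sloc, smaller_max_sloc):
    """Ratio between the largest SLOC values of two data sets."""
    return larger_max_sloc / smaller_max_sloc


def scale_estimate(raw_estimate, ratio):
    """Hypothetical rescaling step: stretch an estimate by the ratio."""
    return raw_estimate * ratio


# The two values come from the slide; which data set each belongs to
# is not stated in the transcript.
ratio = extrapolation_ratio(4398, 2796)
print(round(ratio, 2))  # 1.57
```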


Experiment 3: Product Results


Experiment 4: Project-Based Results B2B to Fleet


Results


Conclusions

  • Bottom-up approach produced very good results on a project-basis

  • Results were comparable between the neural network and statistical approaches

  • Scaling helped

  • Estimation Approach is suitable for Prototype/Iterative Development
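
One way to read the bottom-up conclusion is that estimates made for individual products (components) are summed to give a project-level estimate, where over- and under-estimates can partly cancel. The sketch below illustrates that reading. The aggregation scheme, the relative-error measure, and every number in it are made up for illustration and are not results from the presentation.

```python
from collections import defaultdict


def project_totals(component_rows):
    """Sum (estimated, actual) effort per project from component rows.

    component_rows: iterable of (project_id, estimated, actual) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for project, estimated, actual in component_rows:
        totals[project][0] += estimated
        totals[project][1] += actual
    return {project: tuple(pair) for project, pair in totals.items()}


def relative_error(estimated, actual):
    """Absolute error as a fraction of the actual value."""
    return abs(estimated - actual) / actual


# Hypothetical component-level rows: (project, estimated, actual).
rows = [
    ("P1", 12.0, 10.0), ("P1", 7.0, 9.0), ("P1", 21.0, 20.0),
    ("P2", 5.0, 8.0), ("P2", 14.0, 12.0),
]
totals = project_totals(rows)
for project, (estimated, actual) in sorted(totals.items()):
    print(project, estimated, actual, relative_error(estimated, actual))
```

In this toy data, the individual component errors are larger than the error of each project total, which is the effect a bottom-up estimate relies on.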


Future Directions

  • Explore an extrapolation function

  • Apply other ML algorithms

  • Collect additional metrics

  • Integrate with COCOMO II

  • Conduct more experiments (additional data)

