
Using Correlation and Accuracy for Identifying Good Estimators

Gary D. Boetticher, Nazim Lokhandwala. Univ. of Houston - Clear Lake, Houston, TX, USA. boetticher@uhcl.edu, Lokhandwala@uhcl.edu. http://nas.cl.uh.edu/boetticher/publications.html





Presentation Transcript


  1. Using Correlation and Accuracy for Identifying Good Estimators
     Gary D. Boetticher, Nazim Lokhandwala
     Univ. of Houston - Clear Lake, Houston, TX, USA
     boetticher@uhcl.edu, Lokhandwala@uhcl.edu
     http://nas.cl.uh.edu/boetticher/publications.html
     The 4th International Predictor Models in Software Engineering (PROMISE) Workshop

  2. Research vs. Reality, according to Jørgensen
     Reality (JSS ’04: compendium of expert estimation studies): 82% Human, 18% Formal
     Research (TSE ’07: 300+ software estimation papers, 76 journals, 15+ years): 12% Human, 20% ML, 68% Algorithm

  3. ((Log (TechGradCourses + (TechGradCourses ^ ((Log TotWShops)/(Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (Log (Log (Log SWProjEstExp))))))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (((ProcIndExp + (Log (Sin MgmtGradCourses)))/(Sin SWPMExp)) + (Sin ((Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Sin SWPMExp)))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Cos (TechGradCourses ^ ((Log SWProjEstExp) / (((Log (ProcIndExp + (Log (TechGradCourses ^ ((Log SWProjEstExp) / (Log SWProjEstExp)))))) - 3) / (ProcIndExp + (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos ((((Log SWProjEstExp) / ((ProcIndExp + (Log (TechGradCourses ^ (TechGradCourses ^ (Log SWProjEstExp))))) / (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (Log (Log (Log SWProjEstExp)))))))))))))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / (TechGradCourses ^ (Log SWProjEstExp))))))))))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))) + ((Log SWProjEstExp) / (Log SWProjEstExp)))))) / (Log (Log (Log (TechGradCourses + (Cos (Log (Log (TechGradCourses ^ (Cos (((((Log SWProjEstExp) / (TechGradCourses ^ (Log SWProjEstExp))) / ((ProcIndExp + (Log (Sin MgmtGradCourses))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp))))))))))))))))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Log ((((Log TotLangExp) / (Log SWProjEstExp)) / (Log SWProjEstExp)) / (Sin SWPMExp))) - 3) / (TechGradCourses ^ 
(Log SWProjEstExp)))))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))))))) + (((((ProcIndExp + (Log (TechGradCourses ^ (Log (TechGradCourses + ((TechGradCourses ^ (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos ((((Log SWProjEstExp) / ((ProcIndExp + (Log (TechGradCourses ^ (Log (TechGradCourses + (Cos (Log (Log (TechGradCourses ^ (Cos (((((Log SWProjEstExp) / (TechGradCourses ^ (Log SWProjEstExp))) / ((ProcIndExp + (Log (Sin MgmtGradCourses))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / (TechGradCourses ^ (Log SWProjEstExp))))))) / (Sin SWPMExp))))))) / (TechGradCourses ^ (Log SWProjEstExp))) / (TechGradCourses ^ (Log SWProjEstExp))) / (TechGradCourses ^ (Log SWProjEstExp))) / (Sin SWPMExp)))
     Statement of Problem. Some Background: 2006. http://www.starwarscrawl.com/?id=232

  4. Statement of Problem. Some Background: 2007
     TechUGCourses < 45.5
     | Hardware Proj Mgmt Exp < 6
     | | No Of Hardware Proj Estimated < 4.5
     | | | No Of Hardware Proj Estimated < 3
     | | | | TechUGCourses < 23
     | | | | | Hardware Proj Mgmt Exp < 0.75
     | | | | | | TechUGCourses < 18
     | | | | | | | Hardware Proj Mgmt Exp < 0.13
     | | | | | | | | TechUGCourses < 0.5
     | | | | | | | | | TechUGCourses < -1 : F (1/0)
     | | | | | | | | | TechUGCourses >= -1
     | | | | | | | | | | Degree < 3.5 : A (4/0)
     | | | | | | | | | | Degree >= 3.5 : A (5/2)
     | | | | | | | | TechUGCourses >= 0.5
     | | | | | | | | | TechUGCourses < 5.5
     | | | | | | | | | | Degree < 3.5 : F (5/0)
     | | | | | | | | | | Degree >= 3.5
     | | | | | | | | | | | TechUGCrses < 2 : A (1/0)
     | | | | | | | | | | | TechUGCrses >= 2 : F (1/0)
     | | | | | | | | | TechUGCrses >= 5.5
     | | | | | | | | | | Degree < 3.5
     | | | | | | | | | | | TechUGCrs < 10.5 : A (3/0)
     | | | | | | | | | | | TechUGCrses >= 10.5
     | | | | | | | | | | | | TechUGCrs < 12.5 : F (3/0)
     | | | | | | | | | | | | TechUGCrses >= 12.5
     | | | | | | | | | | | | | TechUGCrs < 16 : A (2/0)
     | | | | | | | | | | | | | TechUGCrs > 15 : A (2/1)
     | | | | | | | | | | Degree >= 3.5 : F (1/0)
     | | | | | | | HardProjMgmt Exp >= 0.13 : A (2/0)
     | | | | | | TechUGCourses >= 18 : A (2/0)
     | | | | | Hard Proj Mgmt Exp >= 0.75 : F (1/0)
     | | | | TechUGCourses >= 23 : F (5/0)
     | | | No Of Hardware Proj Est >= 3 : F (1/0)
     | | No Of Hardware Proj Est >= 4.5 : A (5/0)
     | Hardware Proj Mgmt Exp >= 6 : F (4/0)
     TechUGCrses >= 45.5 : A (2/0)
     Statement of Problem: How to build human-based estimation models that are accurate, intuitive, and easy to understand?

  5. PROMISE 2008 versus 2007
     • Sample set: 178 samples
     • One learner: accuracy and intuitive results
     • Attribute reduction analysis
     • Relatively simple models

  6. The Approach
     • Personal Demographics: Age, Gender, Nationality, etc.
     • Academic
       • Courses Undergrad/Grad: CS, HW, SE, Proj. Mgmt, MIS
       • Workshops/Conferences: CS, HW, SE, Proj. Mgmt, MIS
     • Work
       • Programming: Ada, ASP, Assembly, C, C++, COBOL, DBMS, FORTRAN, Java, PASCAL, Perl, PHP, SAP, TCL, VB, Other
       • Work Experience (HW/SW)
       • Project Management Exp. (HW/SW)
       • # Projects Estimated (HW/SW)
       • Average Project Size
       • Domain Experience
       • Procurement Industry Experience
     Steps: Estimate 28 Components; Scale Factor and Correlation; Apply Machine Learners
     [Diagram labels: Supplier Software, Buyer Software, Distribution Server, Buyer Admin, Supplier1, Supplier2, ..., Suppliern, Buyer1, ..., Buyern]
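The "Scale Factor and Correlation" step can be illustrated with a minimal sketch. The slides do not define either measure, so everything here is an assumption: correlation is taken to be the Pearson coefficient between a respondent's component estimates and the actual values, and `scale_factor` is a hypothetical definition (total estimated effort divided by total actual effort). The function names and the sample numbers are illustrative only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def scale_factor(estimates, actuals):
    """ASSUMPTION (not from the slides): total estimate / total actual."""
    total_actual = sum(actuals)
    if total_actual == 0:
        raise ValueError("actual effort must not sum to zero")
    return sum(estimates) / total_actual


# Hypothetical respondent who consistently doubles the true effort.
actuals = [10, 20, 30, 40]
estimates = [20, 40, 60, 80]
print(pearson(estimates, actuals), scale_factor(estimates, actuals))
```

Under these assumed definitions, a respondent like the one above tracks the relative size of components perfectly (high correlation) while being off by a constant multiple (scale factor of 2), which is one reason to look at the two measures together.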

  7. Feedback to Users
     [Figure labels: User’s Estimates; Actual Estimates; How user compares to other respondents]

  8. Experiments: Data
     [Figure: four panels (Original Data set, Experiment 1, Experiment 2, Experiment 3), each with axes Scale and Correlation; values shown: 82.8, -29.4, 29X, 0.008]

  9. Experiments: Tools, Configuration
     • Outliers removed
     • WEKA toolset
     • C4.5 (J48)
     • 1000 trials
     • 10-fold cross validation
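The evaluation setup listed above (many trials of 10-fold cross validation) can be sketched in plain Python. This is not the WEKA/J48 pipeline used in the study: the learner below is a hypothetical nearest-class-mean stand-in on a single attribute, stratified folds are an assumption of this sketch, and "1000 trials" is read here as 1000 reshuffled repetitions of 10-fold cross validation.

```python
import random


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds, keeping class proportions similar."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for idx in indices:
            folds[position % k].append(idx)
            position += 1
    return folds


def train_nearest_mean(rows, labels):
    """Stand-in learner (NOT C4.5): predict the class whose mean is closest."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        sums[label] = sums.get(label, 0.0) + row[0]
        counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}

    def predict(row):
        return min(means, key=lambda label: abs(row[0] - means[label]))

    return predict


def repeated_cv_accuracy(rows, labels, train, k=10, trials=1000, seed=0):
    """Mean accuracy over `trials` repetitions of k-fold cross validation."""
    rng = random.Random(seed)
    correct = 0
    total = 0
    for _ in range(trials):
        for fold in stratified_folds(labels, k, rng):
            if not fold:
                continue
            held_out = set(fold)
            train_rows = [r for i, r in enumerate(rows) if i not in held_out]
            train_labels = [l for i, l in enumerate(labels) if i not in held_out]
            predict = train(train_rows, train_labels)
            for i in fold:
                correct += int(predict(rows[i]) == labels[i])
                total += 1
    return correct / total
```

Any callable with the same `train(rows, labels) -> predict` shape could be passed in place of `train_nearest_mean`; keeping the learner pluggable separates the evaluation protocol from the choice of model.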

  10. Results: Correlation Only
      2-Class Problem: 10 Best (A), 10 Worst (F)
      1000 Trials, Accuracy=41.6%
      Attribute Reduction using WRAPPER: 1000 Trials, Accuracy=78.6%
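The 2-class setup (10 best labeled A, 10 worst labeled F) can be sketched as a simple ranking step. The slides do not say how respondents were ranked, so this sketch assumes a single numeric score per respondent where higher is better (for the correlation-only experiment, that could be each respondent's correlation with the actual values). The function name and respondent identifiers are hypothetical.

```python
def label_best_and_worst(scores, n=10):
    """Label the n highest-scoring respondents 'A' and the n lowest 'F'.

    scores: dict mapping respondent id -> score (ASSUMPTION: higher is better).
    Respondents in the middle of the ranking are left out of the result.
    """
    if len(scores) < 2 * n:
        raise ValueError("need at least 2 * n respondents")
    ranked = sorted(scores, key=scores.get, reverse=True)
    labels = {rid: "A" for rid in ranked[:n]}
    labels.update({rid: "F" for rid in ranked[-n:]})
    return labels


# Hypothetical scores for 30 respondents.
scores = {f"r{i}": i / 30 for i in range(30)}
labels = label_best_and_worst(scores)
print(sum(v == "A" for v in labels.values()), sum(v == "F" for v in labels.values()))
```

Dropping the middle of the ranking leaves 20 labeled samples, which matches the "10 Best (A), 10 Worst (F)" description on the slide.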

  11. Results: Scale Factor Only
      2-Class Problem: 10 Best (A), 10 Worst (F)
      1000 Trials, Accuracy=65.0%
      Attribute Reduction using WRAPPER: 1000 Trials, Accuracy=78.2%

  12. Results: Correlation & Scale Factor
      2-Class Problem: 10 Best (A), 10 Worst (F)
      1000 Trials, Accuracy=82.2%
      Attribute Reduction using WRAPPER: 1000 Trials, Accuracy=93.3%
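Wrapper-style attribute reduction scores candidate attribute subsets by the accuracy of the learner trained on them. The sketch below shows the general idea with a greedy forward search; the search strategy is an assumption (the slides only say "WRAPPER"), and `toy_evaluate` is a made-up scoring function standing in for a cross-validated accuracy estimate. Its numbers are not results from the study.

```python
def forward_select(attributes, evaluate):
    """Greedy forward selection: add the attribute that most improves the score.

    evaluate(subset) returns a score for a list of attribute names; in a real
    wrapper this would be the learner's cross-validated accuracy on that subset.
    """
    selected = []
    best_score = evaluate(selected)
    remaining = list(attributes)
    while remaining:
        scored = [(evaluate(selected + [attr]), attr) for attr in remaining]
        score, attr = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break  # no remaining attribute improves the score
        selected.append(attr)
        remaining.remove(attr)
        best_score = score
    return selected, best_score


def toy_evaluate(subset):
    """Hypothetical accuracy (in percent) for illustration only."""
    score = 50
    if "TechUGCourses" in subset:
        score += 30
    if "Degree" in subset:
        score += 10
    if "Age" in subset:
        score -= 5
    return score


print(forward_select(["Age", "Degree", "TechUGCourses"], toy_evaluate))
```

Because attributes that lower the score are never added, the selected subset is usually much smaller than the full attribute list, which is consistent with the simpler models the slides report after attribute reduction.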

  13. Discussion - 1
      How well does the decision tree from the third experiment apply to all the respondents minus outliers?

  14. Discussion - 2
      Challenges in component effort estimation
      • Scope of effort
      • Amortization of effort
      • Reuse can skew estimates (esp. Design for Reuse)
      • Respondent’s estimates = Boetticher’s estimates

  15. Conclusions
      • Good accuracy rates, especially after attribute reduction
      • Correlation + Scale Factor: intuitive model
      • Bridges expert and model groups

  16. Thank You!

  17. References
      • Jørgensen, M., “A Review of Studies on Expert Estimation of Software Development Effort,” Journal of Systems and Software, 2004.
      • Jørgensen, M., Shepperd, M., “A Systematic Review of Software Development Cost Estimation Studies,” IEEE Transactions on Software Engineering, 33(1), January 2007, pp. 33-53.
