
Transfer learning in effort estimation



  1. Transfer learning in effort estimation • Ekrem Kocaguneli, Tim Menzies, Emilia Mendes • Empirical Software Engineering, 2015, 20(3) • Presented by Tong Shensi, 2016.02.29

  2. Introduction • Related Work • Methodology • Results • Conclusions

  3. Introduction • Background • Data miners can find interesting and useful patterns from within-company data • Transferring results across different companies is challenging • Many organizations expend much effort to create repositories of software project data (PROMISE, BUGZILLA, ISBSG)

  4. Introduction • Background (cont.) • Are repositories of software project data valuable to industrial software companies? • Such repositories may not predict properties of future projects • They may be very expensive to build • Findings in the defect prediction area show promising results • Earlier findings show that transferring data comes with the cost of reduced performance • Filtering the transferred data may address the problem

  5. Introduction • Background (cont.) • Previous results show that transferring effort estimation results is a challenging task • Kitchenham et al. reviewed 7 published transfer studies; in most cases, transferred data performed worse • Ye et al. reported that COCOMO models changed radically for new data collected in 2000-2009

  6. Introduction • Research Questions • Is transfer learning effective for effort estimation? • How useful are manual divisions of the data? • Does transfer learning for effort estimation work across time as well as space? • This paper uses TEAK as a laboratory for studying transfer learning in effort estimation

  7. Introduction • Related Work • Methodology • Results • Conclusions

  8. Related Work • Transfer learning (TL) • Source domain DS, source task TS, target domain DT, target task TT • TL tries to improve an estimation method in DT using the knowledge of DS and TS • DS ≠ DT, TS ≠ TT

  9. Related Work • Transfer learning and SE • Prior results on the performance of TL are unstable • Of 10 studies reviewed by Kitchenham et al., 4 favored within data, another 4 found that transferred data is not statistically significantly worse than within data, and 2 had inconclusive results • Zimmermann et al. found within data performed better in 618 of 622 cases in defect prediction • Turhan et al. compared defect predictors learned from transferred or within data and found transferred predictors have poor performance, but after instance selection they are nearly the same

  10. Introduction • Related Work • Methodology • Results • Conclusions

  11. Methodology • Dataset • Tukutuku database • 195 projects from 51 companies • After eliminating all companies with fewer than 5 projects: 125 projects from 8 companies

  12. Methodology • Dataset (cont.) • Time-based subsets • Cocomo81: Coc-60-75, Coc-76-rest • Nasa93: Nasa-70-79, Nasa-80-rest

  13. Methodology • Performance Measures • Mean Absolute Error (MAE) • Mean Magnitude of Relative Error (MMRE): MMRE = mean(all MREi), where MREi = |actuali - predictedi| / actuali • Median Magnitude of Relative Error (MdMRE): MdMRE = median(all MREi) • Pred(25): fraction of estimates with MRE ≤ 0.25 (see the sketch after this slide)
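
A minimal sketch of the measures above, assuming `actual` and `predicted` hold the observed and estimated effort of the same test projects (the function name and example values are illustrative, not from the paper):

```python
import numpy as np

def mre_measures(actual, predicted):
    """Compute MAE, MMRE, MdMRE and Pred(25) for paired effort values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    abs_err = np.abs(actual - predicted)
    mre = abs_err / actual                     # magnitude of relative error per project
    return {
        "MAE": abs_err.mean(),                 # mean absolute error
        "MMRE": mre.mean(),                    # mean MRE
        "MdMRE": np.median(mre),               # median MRE
        "Pred(25)": np.mean(mre <= 0.25),      # fraction of estimates within 25% of actual
    }

# Example: three projects with actual vs. estimated effort
print(mre_measures([100, 250, 40], [120, 230, 80]))
```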

  14. Methodology • Performance Measures (cont.) • Mean Magnitude of Error Relative (MMER) • Mean Balanced Relative Error (MBRE) • Mean Inverted Balanced Relative Error (MIBRE) • Standardized Accuracy (SA) • (definitions sketched below)
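
A sketch of the remaining measures, using the definitions that are standard in the effort estimation literature; SA is computed here against the exact expectation of random guessing over the training efforts, which may differ slightly from the baseline used in the paper:

```python
import numpy as np

def more_measures(actual, predicted, train_efforts):
    """MMER, MBRE, MIBRE and Standardized Accuracy (SA) for paired effort values."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = np.abs(a - p)
    mmer = np.mean(err / p)                      # error relative to the estimate
    mbre = np.mean(err / np.minimum(a, p))       # balanced relative error
    mibre = np.mean(err / np.maximum(a, p))      # inverted balanced relative error
    # SA = 1 - MAE / MAE_p0, where MAE_p0 is the expected MAE of random guessing,
    # i.e. predicting a randomly drawn training effort for every test case.
    train = np.asarray(train_efforts, dtype=float)
    mae = err.mean()
    mae_p0 = np.mean([np.abs(a - t).mean() for t in train])
    return {"MMER": mmer, "MBRE": mbre, "MIBRE": mibre, "SA": 1.0 - mae / mae_p0}

print(more_measures([100, 250, 40], [120, 230, 80], train_efforts=[90, 300, 60, 150]))
```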

  15. Methodology • Instance Selection and Retrieval • Analogy-based estimation (ABE) • Input: a database of past projects • For each test instance, retrieve k similar projects • For choosing the k analogies, use a similarity measure • Before calculating similarity, scale independent features to the 0-1 interval so that higher numbers do not dominate the similarity measure • Use a feature weighting scheme to reduce the effect of less informative features • Adapt the effort values of the k nearest analogies to come up with the effort estimate (see the sketch after this slide)
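
A minimal sketch of the ABE loop above, assuming numeric project features and using uniform feature weights, Euclidean distance, and the mean of the k nearest efforts as the adaptation step; the weighting and adaptation choices in the paper may differ:

```python
import numpy as np

def abe_estimate(train_X, train_effort, test_x, k=3):
    """Analogy-based estimation: scale features to 0-1, retrieve the k most
    similar past projects, and return the mean of their effort values."""
    X = np.asarray(train_X, dtype=float)
    x = np.asarray(test_x, dtype=float)
    # Min-max scale each independent feature so large-valued features
    # do not dominate the similarity measure.
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    Xs, xs = (X - lo) / span, (x - lo) / span
    # Euclidean distance as the (unweighted) similarity measure.
    dist = np.sqrt(((Xs - xs) ** 2).sum(axis=1))
    nearest = np.argsort(dist)[:k]
    # Adaptation step: mean effort of the k nearest analogies.
    return float(np.mean(np.asarray(train_effort, dtype=float)[nearest]))

# Example: 4 past projects with 2 features each, estimating a new project
past = [[10, 2], [12, 3], [40, 8], [45, 9]]
efforts = [100, 120, 400, 450]
print(abe_estimate(past, efforts, test_x=[11, 2], k=2))   # -> 110.0
```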

  16. Methodology • Instance Selection and Retrieval (cont.) • TEAK • TEAK is a variance-based instance selector that discards training data associated with regions of high dependent-variable (effort) variance (a simplified sketch follows)
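
TEAK itself grows trees of clusters over the training data and prunes subtrees with high effort variance; the sketch below is only a flat, simplified illustration of variance-based instance selection, not the paper's algorithm:

```python
import numpy as np

def variance_prune(train_X, train_effort, n_groups=4, keep_fraction=0.5):
    """Illustrative variance-based instance selection (not the actual TEAK
    algorithm): group neighbouring projects and keep only the groups with
    the lowest effort (dependent variable) variance."""
    X = np.asarray(train_X, dtype=float)
    effort = np.asarray(train_effort, dtype=float)
    # Crude grouping: sort projects by feature-vector magnitude and split
    # them into equally sized neighbouring groups.
    order = np.argsort(np.linalg.norm(X, axis=1))
    groups = np.array_split(order, n_groups)
    # Keep the groups whose effort variance is lowest.
    variances = [effort[g].var() for g in groups]
    n_keep = max(1, int(keep_fraction * n_groups))
    kept = sorted(np.argsort(variances)[:n_keep].tolist())
    keep_idx = np.concatenate([groups[i] for i in kept])
    return X[keep_idx], effort[keep_idx]

# Example: the project with effort 500 makes its group's effort variance
# high, so that group is pruned before analogy-based estimation.
X = [[10, 2], [12, 3], [40, 8], [45, 9], [41, 8], [11, 2]]
eff = [100, 500, 400, 420, 410, 110]
print(variance_prune(X, eff, n_groups=3, keep_fraction=0.7))
```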

  17. Methodology • Experimentation • Goal • Measure the percentage of every subset that is retrieved into the k analogies used for estimation • Answer whether TL can enable the use of data from other organizations as well as from other time intervals

  18. Introduction • Related Work • Methodology • Results • Conclusions

  19. Results • Transfer in Space • For Tuku1 and Tuku4-7, ties are very high • For Tuku8, performance depends on the error measure

  20. Results • Transfer in Time

  21. Results • Inspecting Selection Tendencies

  22. Results • Inspecting Selection Tendencies • Finding 1: only a very small portion of all the available data is transferred as useful analogies • Finding 2: when we compare the diagonal and off-diagonal percentages, we see that the values are very close

  23. Introduction • Related Work • Methodology • Results • Conclusions

  24. Conclusions • When projects lack sufficient local data to make predictions, they can try to transfer information from other projects • Research questions • RQ1: Is transfer learning effective for effort estimation? • In the majority of cases, transferred results perform as well as within results

  25. Conclusions • Research questions (cont.) • RQ2: How useful are manual divisions of the data? • Test instances select equal amounts of instances from within and transferred data sources • TEAK found no added value in restricting reasoning to just within a delphi localization • RQ3: Does transfer learning for effort estimation work across time as well as space? • Yes • It may be misguided to think that • the data of another organization cannot be used • old data of an organization is irrelevant

  26. Conclusions • Thoughts • Handling heterogeneous data for transfer learning in effort estimation is challenging and meaningful • Sometimes we could use other methods to validate our conclusions

  27. Q&A

  28. Thank you
