
A Taxonomy of Evaluation Approaches in Software Engineering

A. Chatzigeorgiou, T. Chaikalis, G. Paschalidou, N. Vesyropoulos, C. K. Georgiadis, E. Stiakakis. University of Macedonia, Greece. BCI 2015, Craiova, Romania, September 2015.


Presentation Transcript


  1. A Taxonomy of Evaluation Approaches in Software Engineering. A. Chatzigeorgiou, T. Chaikalis, G. Paschalidou, N. Vesyropoulos, C. K. Georgiadis, E. Stiakakis. University of Macedonia, Greece. BCI 2015, Craiova, Romania, September 2015.

  2. …we regret to inform you… …the evaluation of your approach is rather weak… …unfortunately we had to reject a number of good papers… …the proposed approach lacks a thorough evaluation… …we would like to thank you for your submission, BUT… …further evaluation is required… …Congratulations, your paper has been accepted… …evaluation is backed up by systematic statistical results…

  3. I need some proof = EVALUATION!!

  4. Taxonomies. Taxonomy: Τάξις (arrangement) + Νόμος (law, method). A taxonomy “aims at organizing a collection of objects in a hierarchical manner to provide a conceptual framework for discussion and analysis”.

  5. Goal of Study: To build a taxonomy of evaluation approaches in Software Engineering.

  6. Context of Study: The study was carried out by 3 PhD students and 3 faculty members. It covers the articles that appeared in the 2012 volume of three journals: TSE (IEEE Transactions on Software Engineering), TOSEM (ACM Transactions on Software Engineering and Methodology) and JSS (Elsevier's Journal of Systems and Software).

  7. Context of Study (2): Data recorded per article: title, authors, journal, issue; free keywords and ACM classification; employed evaluation approach; pages devoted to the evaluation; total number of pages. Initial set: TSE 81 articles, TOSEM 24 articles, JSS 207 articles. Filtered out: articles that clearly did not belong to the SE domain, and empirical studies (systematic literature reviews, surveys, mapping studies). Remaining: 133 articles (TSE 58, TOSEM 22, JSS 53).
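The per-article data and the filtering counts above can be captured in a small sketch. The class name, field names and the percentage helper are illustrative assumptions, not the authors' actual tooling; only the article counts come from the slide.

```python
from dataclasses import dataclass


@dataclass
class ArticleRecord:
    """Hypothetical record holding a subset of the fields listed on the slide."""
    title: str
    journal: str            # "TSE", "TOSEM" or "JSS"
    evaluation_pages: int   # pages devoted to the evaluation
    total_pages: int        # total number of pages

    def evaluation_extent(self) -> float:
        """Share of the article devoted to evaluation, as a percentage (assumed measure)."""
        return 100.0 * self.evaluation_pages / self.total_pages


# Article counts per journal, as reported on the slide.
initial = {"TSE": 81, "TOSEM": 24, "JSS": 207}
retained = {"TSE": 58, "TOSEM": 22, "JSS": 53}


def retention_summary(initial, retained):
    """Map each journal to (retained, initial, percentage retained)."""
    return {
        journal: (retained[journal], initial[journal],
                  round(100.0 * retained[journal] / initial[journal], 1))
        for journal in initial
    }


summary = retention_summary(initial, retained)
```

Summing the retained counts reproduces the 133 articles stated on the slide, out of 312 initially collected.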

  8. Key Terms. Performance: The most typical definition of performance originates from computer architecture: performance refers to the amount of work that a system/computer/program can perform in a given time or for given resources. Effectiveness: By effectiveness we refer to the extent to which a proposed technique/methodology accomplishes the desired goal. For example, a testing approach is effective if it reveals a large number of bugs. Benchmark: A benchmark is a standard, acknowledged data set (consisting of tasks, collections of items, software etc.) designed with the purpose of being representative of problems that occur frequently in real domains.

  9. Proposed Taxonomy

  10. The goal is to make clear the advantages and disadvantages over previous work, and usually to highlight the added value of the proposed technique.

  11. Proposed Taxonomy

  12. By formal treatment we mean the use of a mathematically based approach for proving theorems, properties, invariants or the correctness of a system. Not all software engineering research can benefit from the application of formal methods. The criterion is related to the completeness of the proof: (1) the mathematical reasoning validates the entire approach, or (2) it ensures the fulfillment of certain properties.

  13. Proposed Taxonomy

  14. Application of the proposed tool, algorithm or technique to artificially constructed or selected case studies. Results are obtained and discussed to demonstrate the feasibility, performance or effectiveness of the approach. Related terms: Empirical Evaluation, Case Studies, Case Study Evaluation, Empirical Results, Experiments, Experimental Results, …

  15. Extent of Evaluation: Papers devoting just one page to the evaluation, and papers devoting as many as 24 pages, have been encountered.

  16. Availability of Data

  17. Validation of the Taxonomy • By definition, it is difficult to assess whether taxonomies are valid, since their construction relies on the subjective interpretation of categories. • We applied the taxonomy to articles that had not been considered during its development. • We classified the papers from the Main Track of the 34th International Conference on Software Engineering (ICSE'2012). • 87 articles were considered. • We recorded: • whether the paper actually introduces any technique; • whether the paper could be mapped to any of the derived classification categories; • the corresponding category code.

  18. Validation of the Taxonomy (2)

  19. Correlation between evaluation and area. RQ1: Is the evaluation approach correlated to the area of research? H0: Variables "Area of Research" and "Evaluation Type" are independent. H1: Variables "Area of Research" and "Evaluation Type" are dependent. Areas of research correspond to a second-level classification based on the 2012 ACM Computing Classification System. A chi-square test revealed that there is no statistically significant correlation between "Evaluation Type" and "Area of Research".
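As an illustration of the kind of test named on this slide, the following is a minimal sketch of Pearson's chi-square statistic for a contingency table of two categorical variables. The counts are hypothetical and are NOT the study's data; the function name is also an assumption. The p-value lookup is omitted to keep the sketch free of third-party dependencies.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under H0 (the two variables are independent).
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows = two research areas, columns = two evaluation types.
observed = [
    [10, 20],
    [20, 10],
]
statistic, dof = chi_square_independence(observed)

# Critical value of the chi-square distribution for 1 degree of freedom
# at the 0.05 level.
CRITICAL_DF1_005 = 3.841
reject_h0 = statistic > CRITICAL_DF1_005
```

H0 is rejected when the statistic exceeds the critical value for the given degrees of freedom. In the study the opposite outcome was reported: independence could not be rejected.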

  20. In Software Testing there is a tendency to employ case studies and analysis of effectiveness (i.e. how well a testing strategy achieves its goals)

  21. Correlation between evaluation and area. RQ2: Is the extent of the evaluation correlated to the evaluation approach? H0: The distribution of "Extent of Evaluation" is the same across categories of "Evaluation Type". H1: The distribution of "Extent of Evaluation" is not the same across categories of "Evaluation Type". We applied the non-parametric Independent-Samples Kruskal-Wallis test to compare the distributions across groups formed by the evaluation type variable. The result is significant at the 0.05 level. In other words, the extent of evaluation is affected by the employed evaluation strategy.

  22. Papers that evaluate efficiency on case studies, relying on explicitly stated research questions (E3.3.1.1), devote a large percentage of the paper to the evaluation.
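The Kruskal-Wallis test mentioned above compares groups through the ranks of the pooled observations. A minimal sketch of the H statistic follows, assuming each group is a list of evaluation-extent values for one evaluation type. The sample values are hypothetical, NOT the study's data. The tie correction and the p-value are omitted for brevity.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic (without tie correction) for a list of samples."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)

    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Hypothetical evaluation extents (pages) for three evaluation types.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)
```

Under H0, H approximately follows a chi-square distribution with (number of groups - 1) degrees of freedom, so for three groups it would be compared against 5.991 at the 0.05 level.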
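Category codes such as E3.3.1.1 look like paths through the hierarchical taxonomy. The sketch below rests on that reading, which is an assumption: the slides do not define the code format, and the function names are hypothetical.

```python
def parse_code(code):
    """Split a code such as 'E3.3.1.1' into ('E', (3, 3, 1, 1)).

    Assumes one leading letter followed by dot-separated integers.
    """
    prefix, digits = code[0], code[1:]
    return prefix, tuple(int(part) for part in digits.split("."))


def ancestors(code):
    """List the codes of all ancestor categories, from the top level down."""
    prefix, path = parse_code(code)
    return [
        prefix + ".".join(str(step) for step in path[:depth])
        for depth in range(1, len(path))
    ]


parents = ancestors("E3.3.1.1")
```

With such a helper, papers classified under a leaf category could also be counted under each of its broader parent categories.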

  23. Conclusion. In software engineering there is a vast number of different evaluation techniques, designed and executed to serve the needs of each particular piece of research. We have attempted to introduce a taxonomy of evaluation approaches. We identified 17 evaluation types that any approach can adopt, either individually or in combination with other types, and 8 axes according to which evaluation approaches can be classified.

  24. So, the next time you receive a review pointing to the strengths or weaknesses of the evaluation approach… "We are glad to inform you that your paper… has been ACCEPTED by the BCI 2015 Program Committee. Review 1: …the authors have done a good job in supporting their methodology with a convincing evaluation approach…" You might be able to classify your approach based on the proposed taxonomy!

  25. Thank you for your attention!! BCI 2015, Craiova Romania, September 2015
