Thematic Analysis of Group Software Project Change Logs

Presentation Transcript


  1. Thematic Analysis of Group Software Project Change Logs Andy Burn

  2. Overview • Software Engineering Group Project • Change Logs • Thematic Analysis • Results

  3. Software Engineering Group Project SEG • Durham and Newcastle collaboration • Team-based software development project • Requirements, design, implementation, delivery • Focus here is on implementation 2006/07 • 12 Groups • 62 students contributed code to the implementation • 36 students from Durham, 26 from Newcastle

  4. Change Logs (1) • Students use Subversion for code management • One message per revision • (Revision: a version of the code being stored) • Examples • “Added a splash screen” • “Fixed a bug” • “Bah” • “asdasd” • “Who reads these?” • And the most popular: • “ ”
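
As a rough illustration of how per-revision messages like these could be collected, the sketch below parses text in the shape produced by `svn log --xml` and separates blank messages from non-blank ones. The sample log, author names, and the helper `read_messages` are hypothetical; the talk does not say how the messages were actually extracted.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape of `svn log --xml` output.
# Revision numbers, authors and dates are invented for illustration.
SAMPLE_LOG = """<log>
  <logentry revision="3">
    <author>student_a</author>
    <date>2007-01-15T10:00:00.000000Z</date>
    <msg>Added a splash screen</msg>
  </logentry>
  <logentry revision="2">
    <author>student_b</author>
    <date>2007-01-14T09:30:00.000000Z</date>
    <msg></msg>
  </logentry>
  <logentry revision="1">
    <author>student_a</author>
    <date>2007-01-13T16:45:00.000000Z</date>
    <msg>Fixed a bug</msg>
  </logentry>
</log>"""


def read_messages(xml_text):
    """Return a list of (revision, message) pairs from svn log XML text."""
    root = ET.fromstring(xml_text)
    pairs = []
    for entry in root.findall("logentry"):
        revision = int(entry.get("revision"))
        # A missing or empty <msg> element is treated as a blank message.
        message = (entry.findtext("msg") or "").strip()
        pairs.append((revision, message))
    return pairs


messages = read_messages(SAMPLE_LOG)
blank = [rev for rev, msg in messages if not msg]
print(f"{len(messages)} revisions, {len(blank)} with a blank message")
```

Stripping whitespace before testing for emptiness means a message consisting only of spaces, like the "most popular" example above, counts as blank.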

  5. Change Logs (2) • 2006/07 Projects • ~4,100 revisions (2,600 Durham, 1,500 Newcastle) • ~1,000 revisions with messages • 800 Durham, 200 Newcastle • Why use comments? • In theory, provides a complete and informative history of a project • In practice, it works • In SEG, it doesn’t • Can they tell us anything else?
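
The approximate counts on this slide imply quite different message rates on the two campuses. The short calculation below works this out; the figures are the rounded ones quoted above, so the percentages are only indicative, and the names `REVISIONS` and `coverage` are illustrative.

```python
# Rounded counts quoted on the slide: (total revisions, revisions with messages).
REVISIONS = {
    "Durham": (2600, 800),
    "Newcastle": (1500, 200),
}


def coverage(total, with_message):
    """Percentage of revisions that carry a message, to one decimal place."""
    return round(100 * with_message / total, 1)


for campus, (total, with_message) in REVISIONS.items():
    print(f"{campus}: {coverage(total, with_message)}%")

overall_total = sum(total for total, _ in REVISIONS.values())
overall_with_message = sum(msgs for _, msgs in REVISIONS.values())
print(f"Overall: {coverage(overall_total, overall_with_message)}%")
```

With these rounded inputs the rates come out at about 30.8% for Durham, 13.3% for Newcastle, and 24.4% overall.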

  6. Thematic Analysis (1) • Unlike purely numerical analysis, thematic analysis aims to uncover patterns or stories in data • In CS terms, each data item is ‘tagged’ (coded), and patterns are found in the groups formed from the tags. • Carried out on the comments (all 1,035 of them…) • The codes used were loosely based on maintenance activities

  7. Thematic Analysis (2) • After a long process of experimenting with and verifying coding schemes (thanks Stephen) the following codes were used: • Developmental: Creation or modification of features or functionality • Perfective: Refactoring, testing, commenting, cleaning of code • Corrective: Bug fixing • Ambiguous: Fits, or may fit, more than one activity type • Misc: Irrelevant, or is out of scope (e.g. documentation)
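
Once each message has been assigned one of these codes, the distribution of activity types is a simple tally. The sketch below assumes the codes were assigned by hand, as in the talk; the sample messages and their codes are invented for illustration and are not taken from the study data.

```python
from collections import Counter

CODES = ("Developmental", "Perfective", "Corrective", "Ambiguous", "Misc")

# Hypothetical hand-coded messages: (message, code). Not real study data.
CODED_MESSAGES = [
    ("Added a splash screen", "Developmental"),
    ("Added login form", "Developmental"),
    ("Fixed a bug", "Corrective"),
    ("Tidied up comments", "Perfective"),
    ("Changes", "Ambiguous"),
    ("Who reads these?", "Misc"),
]


def code_distribution(coded_messages):
    """Return {code: percentage of messages} for every code in CODES."""
    counts = Counter(code for _, code in coded_messages)
    unknown = set(counts) - set(CODES)
    if unknown:
        raise ValueError(f"unknown codes: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return {code: 0.0 for code in CODES}
    return {code: 100 * counts[code] / total for code in CODES}


distribution = code_distribution(CODED_MESSAGES)
for code, share in distribution.items():
    print(f"{code:<14}{share:5.1f}%")
```

Rejecting unknown codes up front guards against typos in the hand-assigned labels silently creating extra categories.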

  8. Thematic Analysis (3) • Research Questions • How are activity types distributed? • Does this change over time? • Is this affected by gender or campus?
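
One way to approach the second and third questions is to break the coded messages down by a grouping key, such as month or campus, and compare the share of each code within each group. The sketch below is only an assumption about how such a breakdown might be computed; the records and the helpers `breakdown` and `shares` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (campus, month, code). All values are invented.
RECORDS = [
    ("Durham", "2006-11", "Developmental"),
    ("Durham", "2006-11", "Developmental"),
    ("Durham", "2007-02", "Corrective"),
    ("Newcastle", "2006-11", "Developmental"),
    ("Newcastle", "2007-02", "Perfective"),
    ("Newcastle", "2007-02", "Corrective"),
]

CAMPUS, MONTH, CODE = 0, 1, 2  # field positions within a record


def breakdown(records, key_index):
    """Count codes separately for each value of the chosen grouping field."""
    table = defaultdict(Counter)
    for record in records:
        table[record[key_index]][record[CODE]] += 1
    return table


def shares(counter):
    """Convert raw counts into fractions of the group total."""
    total = sum(counter.values())
    return {code: count / total for code, count in counter.items()}


by_campus = breakdown(RECORDS, CAMPUS)
by_month = breakdown(RECORDS, MONTH)

for month in sorted(by_month):
    print(month, shares(by_month[month]))
for campus in sorted(by_campus):
    print(campus, shares(by_campus[campus]))
```

Comparing shares rather than raw counts matters here because the groups differ greatly in size, as the revision counts for the two campuses show.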

  9. Limitations • Only one year’s data • Data may not be representative • Limited to the roughly 25% of revisions that have messages • Students don’t always submit work under their own names • Visiting other campuses • Pair programming • Metrics used are effective on average • Too much ambiguity

  10. How Are Activity Types Distributed?

  11. How Are Activity Types Distributed? • Too much emphasis on development • Too little testing, fixing and improving • “Misc” and “Ambiguous” should be minimized

  12. How Did This Change Over Time?

  13. Does Gender Affect Activity Types? • Could not address this question • Too few commented revisions from women • More data is needed, even with 2007/08 data included

  14. Does Campus Affect Activity Types?

  15. Conclusions • Students do not distribute their work well across activity types • Too much emphasis on developing new features (the fun part) • This is expected: SEG is designed to teach students these lessons before they reach industry • There are no significant effects from the different universities • Too early to tell if gender is a factor

  16. Future Work • Analysis of 2007/08 project data • Analysis of open source projects • Analysis of other university projects • Attempt to overcome limitations of the data • Better codes • Better metrics • Improved use of Subversion by the students
