

    1. Using a Web-Based Peer Review System to Support the Teaching of Software Development: Preliminary Findings. Bernard Chalk & Kemi Adeboye, London South Bank University. LTSN-ICS 2004, University of Ulster, Belfast, UK.

    2. Contents: Software Development at LSBU; Peer Review Systems; Web-Based Peer Review System; Findings; Conclusions.

    3. Peer Review in HE. Peer review assessment is applied in a wide variety of subject areas across higher education, and there is considerable variation in why and how it is applied. For example: Is the system primarily designed for time saving or for cognitive gains? Is the focus qualitative and/or quantitative feedback, and will it be used for formative or summative purposes? What privacy does the system offer? Do the peers have the same or different abilities? How will the assessors and the assessed be organised? Where will the assessment take place? What requirements are imposed on the assessors and the assessees? What rewards are provided?

    4. Web-Based Peer Review Systems. Web-based peer review systems are relatively new and offer some particularly useful features. For example: they support both in-class and out-of-class assessment, which is especially useful for distance learners; they allow students to submit their work in different formats, such as diagrams or audio clips; they are easy to link to other learning resources; and they can be integrated with course/unit management and other systems.

    5. System Architecture. The web-based peer review system that we developed at London South Bank was an integral part of the coursework management system and used a three-tier client-server architecture. Students accessed the server with a browser; the server dynamically generated web pages through PHP scripts that queried a MySQL database.
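
The actual PHP code is not shown in the presentation; purely as an illustration of the three-tier pattern described above, here is a minimal sketch in Python, with SQLite standing in for MySQL. All table, file and handler names are invented, not taken from the system.

    # Illustrative three-tier sketch: a browser (tier 1) requests a page from a
    # small HTTP server (tier 2), which builds the page from a database (tier 3).
    # The real system used PHP scripts and MySQL; Python's stdlib stands in here.
    import sqlite3
    from http.server import BaseHTTPRequestHandler, HTTPServer

    DB = sqlite3.connect("peer_review.db", check_same_thread=False)
    DB.execute("CREATE TABLE IF NOT EXISTS submissions "
               "(student TEXT, exercise INTEGER, answer TEXT)")

    class SubmissionsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Query the database and generate the page dynamically.
            rows = DB.execute("SELECT student, exercise FROM submissions").fetchall()
            body = "<html><body><h1>Submissions</h1><ul>"
            body += "".join(f"<li>{s}: exercise {e}</li>" for s, e in rows)
            body += "</ul></body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(body.encode())

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), SubmissionsHandler).serve_forever()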

    6. General Procedure. Each week the students were asked to submit answers to a set of questions, one of which was chosen for peer review. On completion of the exercise the students were asked to review 3 randomly selected answers. All reviews were anonymous, and the reviewees were referred to as Student A, B and C. If insufficient submissions were available to allocate, the students were asked to try again later.
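
The allocation algorithm itself is not given in the slides; the sketch below is one plausible reading of the procedure, assuming answers are drawn at random from the pool excluding the reviewer's own submission and labelled Student A, B and C. Function and variable names are assumptions.

    import random

    def allocate_reviews(reviewer, submissions, per_reviewer=3):
        """Pick answers for `reviewer` at random, excluding their own submission.

        `submissions` maps author -> answer text. Returns an anonymised mapping
        such as {"Student A": answer, ...}, or None if there are not yet enough
        submissions, in which case the reviewer is asked to try again later.
        """
        pool = [author for author in submissions if author != reviewer]
        if len(pool) < per_reviewer:
            return None  # insufficient submissions available to allocate
        chosen = random.sample(pool, per_reviewer)
        labels = ["Student A", "Student B", "Student C"]
        return {label: submissions[author] for label, author in zip(labels, chosen)}

A real allocation would probably also balance how many reviews each submission receives; the sketch above ignores that detail.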

    7. Logon Page. After logging on, a student was presented with a number of options allowing them to do exercises, review answers submitted by peers, and so on. Attempts to carry out a review or participate in a discussion before attempting the corresponding exercise were automatically blocked by the system. The students could also see the marks awarded for the exercise by the tutor, which were always available within 10 days of submission.
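
The blocking rule is straightforward to express; as a hedged illustration only (the data structure is an assumption, not the system's), it amounts to a check like this:

    def can_review_or_discuss(student, exercise, attempts):
        """Gate reviews and discussion on the exercise having been attempted first.

        `attempts` is an illustrative set of (student, exercise) pairs recorded
        when an answer is submitted; the real system kept this state in MySQL.
        """
        return (student, exercise) in attempts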

    8. Review Form. On selecting the link to do a review, the reviewer was shown three randomly selected answers together with 3 review forms like the one shown here for Student A. The reviewer was asked to score each answer on code style, code correctness and code quality. They were also asked to explain the reasoning behind the scores and to suggest ways to improve the answer. The fact that they could scroll between the answers encouraged consistency in their mark allocation. A scale of 1-10 rather than 0-10 was used, as this allowed us to reserve a score of 0 for a non-submission.
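
A small sketch of how such a review record might be validated and totalled, assuming three 1-10 criteria and 0 reserved for a non-submission (criterion names and the validation rule are assumptions, not taken from the system):

    CRITERIA = ("style", "correctness", "quality")

    def total_score(review):
        """Sum a review's three criterion scores (maximum 30).

        Each score must be 1-10; 0 is reserved for a non-submission, which is
        recorded as 0 on every criterion.
        """
        scores = [review[c] for c in CRITERIA]
        if all(s == 0 for s in scores):
            return 0  # non-submission
        if any(not 1 <= s <= 10 for s in scores):
            raise ValueError("criterion scores must be between 1 and 10")
        return sum(scores)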

    9. Code Style Scores. Over the 9 exercises the distribution of code style scores was skewed towards the higher scores. The same was true for the correctness and quality scores.

    10. Code Correctness Scores

    11. Code Quality Scores

    12. Comments. Analysis of the comments revealed that, on average, each student wrote 4-5 sentences, or about 90 words, per review. This remained reasonably constant across the unit, with no 'drop off' towards the end as one might have expected.

    13. Comments. Here is a comment of average length, providing some useful information about correctness and a very good comment on quality in terms of the "inappropriate use of a Java Class". Unfortunately it also includes vague remarks such as "this solution is more complicated than required" and "formatting is OK".

    14. Comments. Here is a much longer comment, which tends to focus on code style rather than correctness and, in this case, gives only a partially explained comment on quality. This comment is nearly 25% longer than average, showing that comment length can be misleading.

    15. Reviewer versus Tutor Score. Plotting the average total score awarded by each group of reviewers for a given submission against the score awarded by the tutor, over ALL exercises, shows a correlation between the two (correlation coefficient r = 0.62), but mainly for the higher scores. This indicates that there tends to be good agreement about what makes a good answer and poor agreement about what makes a weak answer. There is also a tendency for students to award higher scores than the tutor. However, inspection of individual exercises shows wild fluctuations, so in general the students' scores seem unreliable.
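
The reported r = 0.62 is a Pearson correlation between each submission's average reviewer total and the tutor's total. As a sketch of how such a coefficient is computed, the sample data below are purely illustrative and are not the study's scores.

    from math import sqrt

    def pearson_r(xs, ys):
        """Pearson correlation coefficient between two equal-length score lists."""
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
        sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
        return cov / (sd_x * sd_y)

    # Illustrative use: average reviewer totals vs tutor totals (made-up numbers,
    # each out of a maximum of 30 across the three criteria).
    reviewer_avg = [24.3, 18.0, 27.7, 15.3, 22.0]
    tutor_score = [22, 14, 26, 18, 20]
    print(round(pearson_r(reviewer_avg, tutor_score), 2))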

    16. Reviews Received. Of the 459 possible reviews (17 students x 3 reviews x 9 exercises), only 277 (about 60%) were submitted, distributed amongst the students as shown here. This was almost certainly due to the demanding weekly schedule, an experience reported by other educators using a similar system. The shape of the tail of this distribution shows that 3 or 4 of the 17 students did not engage with the system for some reason.

    17. Exam Scores v Engagement. There was also a correlation between exam score and the number of reviews received (correlation coefficient r = 0.64), significant at the 1% level. This could be due to the information received, although it is probably more likely that conscientious students both engage with the system and do well in exams.
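
Whether r = 0.64 clears the 1% level can be checked with the usual t-test for a correlation coefficient; assuming the correlation was taken over the 17 students (so 15 degrees of freedom), the computation looks like this.

    from math import sqrt

    def correlation_t(r, n):
        """t statistic for testing whether a Pearson correlation differs from zero."""
        return r * sqrt(n - 2) / sqrt(1 - r * r)

    t = correlation_t(0.64, 17)  # about 3.23
    # The two-tailed 1% critical value of t with 15 degrees of freedom is roughly
    # 2.95, so the correlation is significant at the 1% level, consistent with
    # the figure quoted on the slide.
    print(round(t, 2))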

    18. Feedback Questionnaire Analysis. Students were asked to complete a feedback questionnaire after each exercise. The questionnaire used a 5-point Likert scale and asked the students to rate the difficulty of the exercise, the usefulness of the peer review process and the feedback they had received. It then asked whether they felt they had learnt anything from the peer review process and to give details if they had. Analysis of these responses revealed that the students

    19. Discussion

    20. Discussion. The scores awarded by students for code style, quality and correctness are unevenly distributed and do not generally reflect the true quality of the answer; using more specific marking criteria might help to improve this. Another conclusion is that weekly reviews are too demanding and more time needs to be allowed; with a little over half of the possible reviews received, a fortnightly programme might work better. Finally, little use was made of the discussion boards. This was unfortunate, as other researchers, such as Phil Davies from the University of Glamorgan, have found such discussion to be highly productive and to encourage critical thinking.
