Who Copied Who?
Presentation Transcript

  1. Who Copied Who? Gordon Lingard, School of Software, University of Technology, Sydney (glingard@it.uts.edu.au)

  2. The Problem • Students copying computer code from other students within a subject is a significant problem. • It differs from the problem of students copying from an external source. • Programs exist for determining whether code is a copy. • They don’t answer the question of who created the code and who copied it. • This presentation outlines a solution to this problem.

  3. Presentation Outline • What Is Computer Programming? • Detection System • Assignment Submission System • Combining the Systems • Results and Conclusions • Questions

  4. What Is Computer Programming? Computer Code • Computer programs are written in a formal programming language that looks like a cross between mathematics and natural language. • They have a very strict syntax structure. • The language is used to construct a large set of carefully orchestrated instructions that become the program. • Student programs are typically less than a thousand instructions. Commercial programs can be tens of thousands to millions of instructions. • Larger programs are of staggering complexity.

  5. What Is Computer Programming? C++ Code Example

  6. What Is Computer Programming? Why Learning to Program is Hard • Learning issues students face • Learning the language. • Learning how to use the language to create a program to do a specified task. • Managing the complexity as programs grow in size. • In the face of these issues, many students are overwhelmed and resort to copying.

  7. Detection System: Problems of Detection • Disguise • Simple transformations that change the look of the code without changing what it does. • Combinatorics • n assignments create p = n(n-1)/2 pairs. • 100 assignments = 4950 pairs. • Code Overlap • Two pieces of code designed to do the same thing will have about 50% of their code in common. • Boilerplate code creates many false positives.
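
The pair count is the number of ways to choose 2 assignments out of n, which is n(n-1)/2. A minimal Python sketch of that arithmetic follows; the function name `pair_count` and the submission labels are illustrative and not part of the detection system described in the talk.

```python
from itertools import combinations
from math import comb


def pair_count(n: int) -> int:
    """Number of unordered pairs among n assignments: n * (n - 1) / 2."""
    return comb(n, 2)


# Hypothetical submission labels, used only to enumerate the pairs explicitly.
submissions = [f"student_{i:03d}" for i in range(100)]
pairs = list(combinations(submissions, 2))

print(pair_count(100))  # 4950
print(len(pairs))       # 4950
```

Because the count grows quadratically, doubling the class size roughly quadruples the number of comparisons.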

  8. Detection System: Complexity Numbers • Tokenise Code. • Generate Complexity Numbers.

     Program Instructions | Tokenised Instructions | Complexity Numbers
     if (x > y) {         | if(>) {                | 98592
     a[x] = b[1][y];      | [] = [][];             | 112142
     foo(&x, *y);         | (&, *);                | 147716
     :                    | :                      | :
     instr n-1            | tokenised n-1          | complex n-1
     instr n              | tokenised n            | complex n
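
The tokenised examples on this slide suggest that identifiers and literals are dropped while keywords, operators and punctuation are kept. The sketch below works under that assumption only: the keyword list, the regular expression and the function name `tokenise` are hypothetical, and whitespace is removed entirely, so the output spacing differs slightly from the slide.

```python
import re

# Assumed subset of keywords to keep; the real system's list is not given.
KEYWORDS = {"if", "else", "for", "while", "do", "return", "switch", "case"}

# Identifier/keyword, numeric literal, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")


def tokenise(instruction: str) -> str:
    """Drop identifiers and numeric literals; keep keywords and symbols."""
    kept = []
    for tok in TOKEN_RE.findall(instruction):
        if tok[0].isalpha() or tok[0] == "_":
            if tok in KEYWORDS:
                kept.append(tok)   # keywords carry structure, so keep them
        elif tok[0].isdigit():
            continue               # numeric literals are dropped
        else:
            kept.append(tok)       # operators and punctuation are kept
    return "".join(kept)


print(tokenise("if (x > y) {"))     # if(>){
print(tokenise("a[x] = b[1][y];"))  # []=[][];
print(tokenise("foo(&x, *y);"))     # (&,*);
```

Renaming variables leaves the tokenised form unchanged, which is why this kind of disguise does not defeat the comparison. The slides do not say how a complexity number is computed from a tokenised instruction, so that step is left out here.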

  9. Detection System: Comparing Complexity Numbers • Determine the percentage of numbers common between two programs.
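
One way to sketch this comparison in Python is as a set intersection. The slide does not say what the percentage is taken relative to, so the sketch assumes the smaller of the two sets of numbers. The function name `percent_common` and most of the sample numbers are made up for illustration.

```python
def percent_common(numbers_a, numbers_b) -> float:
    """Percentage of complexity numbers shared by two programs.

    Assumption: measured against the smaller program's set of unique numbers.
    """
    set_a, set_b = set(numbers_a), set(numbers_b)
    if not set_a or not set_b:
        return 0.0
    shared = set_a & set_b
    return 100.0 * len(shared) / min(len(set_a), len(set_b))


# Hypothetical inputs: the two programs share two of their four numbers.
program_a = [98592, 112142, 147716, 20011]
program_b = [98592, 112142, 30500, 41777]

print(percent_common(program_a, program_b))  # 50.0
```

Measuring against the smaller set means a short program copied wholesale into a longer one still scores highly; a different denominator would give a different trade-off.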

  10. Submission System • Used for a number of years in parallel with the detection system. • A formative assessment tool. • Runs students’ programs against a suite of tests. • Analyses their code for poor programming practices. • The students can use the results from the tests to refine their assignments and re-submit as often as they like. • The submission system becomes a development environment.

  11. Combining the Systems: Overview • Extract information from the detection system to create a digital fingerprint of an assignment. • The fingerprint helps to uniquely identify a piece of code while being unaffected by minor changes to the code. • Append the fingerprint, along with the date and time, to a log of submissions for each student. • Analyse the logs to see whether the same fingerprints appear across students, and use the date/time to determine the order of development.

  12. Combining the Systems: Digital Fingerprints • A fingerprint is created by extracting the 6 largest unique complexity numbers from all the numbers a piece of code generates. • These represent the 6 most complicated pieces of the code.

      Assignment Code | Complexity Numbers
      if (x > y)      | 62145
      x = x * 6;      | 87219
      else            | 14067
      y = x + y;      | 57063
      :               | :
      *z = a->b[x];   | 112103

      Digital Fingerprint = 6 largest unique complexity numbers in sorted order: 68018 68682 72172 87219 97843 112103
      Append fingerprint and date/time to log.
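
The fingerprint rule on this slide (the 6 largest unique complexity numbers, in sorted order) can be sketched in a few lines of Python. The function names, the log structure, the student label and the timestamp are all hypothetical. The input list combines the numbers visible on the slide with the fingerprint values themselves, in an arbitrary order, purely for illustration.

```python
from datetime import datetime

FINGERPRINT_SIZE = 6  # the slide specifies 6 numbers


def fingerprint(complexity_numbers, size=FINGERPRINT_SIZE):
    """The `size` largest unique complexity numbers, in ascending order."""
    return tuple(sorted(set(complexity_numbers))[-size:])


def log_submission(logs, student, complexity_numbers, when):
    """Append (date/time, fingerprint) to the student's submission log."""
    logs.setdefault(student, []).append((when, fingerprint(complexity_numbers)))


# Illustrative input; note the duplicate 87219, which is counted once.
numbers = [62145, 87219, 14067, 57063, 68018, 68682, 72172, 97843, 87219, 112103]

logs = {}
log_submission(logs, "student_a", numbers, datetime(2024, 3, 1, 14, 30))

print(logs["student_a"][0][1])
# (68018, 68682, 72172, 87219, 97843, 112103)
```

Because only the largest numbers are kept, small edits to simpler instructions leave the fingerprint unchanged, which matches the stated goal of being unaffected by minor changes.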

  13. Combining the Systems: Submission Logs [figure: example submission logs]

  14. Combining the Systems: Comparing Logs • Comparing a summary of the logs. • The time frames in the comparison make it clear who originated the code, who copied it, and when.
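
A rough Python sketch of this comparison follows. It assumes each log is a list of (date/time, fingerprint) entries per student and that two fingerprints match only when they are identical; the talk does not specify either detail. All names, fingerprints and timestamps below are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime


def first_seen(logs):
    """Map each fingerprint to {student: earliest time they submitted it}."""
    seen = defaultdict(dict)
    for student, entries in logs.items():
        for when, fp in entries:
            if student not in seen[fp] or when < seen[fp][student]:
                seen[fp][student] = when
    return seen


def shared_fingerprints(logs):
    """Fingerprints found in two or more logs, students ordered by first use."""
    report = []
    for fp, by_student in first_seen(logs).items():
        if len(by_student) > 1:
            ordered = sorted(by_student.items(), key=lambda item: item[1])
            report.append((fp, ordered))
    return report


# Hypothetical logs: student_b's only submission reuses an earlier fingerprint.
fp_shared = (68018, 68682, 72172, 87219, 97843, 112103)
fp_later = (70001, 70002, 72172, 87219, 97843, 112103)
logs = {
    "student_a": [
        (datetime(2024, 3, 1, 14, 30), fp_shared),
        (datetime(2024, 3, 2, 9, 0), fp_later),
    ],
    "student_b": [(datetime(2024, 3, 3, 22, 15), fp_shared)],
}

report = shared_fingerprints(logs)
for fp, ordered in report:
    print(fp, [student for student, _ in ordered])
# (68018, 68682, 72172, 87219, 97843, 112103) ['student_a', 'student_b']
```

The student listed first for a shared fingerprint submitted it earliest, which is the ordering a reviewer would then check against the stored submissions.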

  15. Who Copied Who? Results • Rarely is there collaboration. It is students copying other students. • In cases of copying, the logs almost always make a very clear statement of what has happened and when. • The copying usually involves one student copying from another, sometimes two, but rarely more. • Frequently, it is not the final submission that gives away the copying, but earlier submissions. This shows up in the logs and can be confirmed by examining the earlier submissions.

  16. Who Copied Who? Conclusions • The system has proved extremely successful in presenting misconduct cases to the Faculty. • The sheer weight of evidence the logs produce often saves time, as students don’t try to bluff their way through the allegation. • This allows the Faculty to shift the focus away from penalty and towards remedial action.

  17. Questions?