
Catching Plagiarists

Catching Plagiarists. Baker Franke, The University of Chicago Laboratory Schools. Problem motivation: e.g. Physics 101 at a large university, where one student helps another by giving him a copy of his assignment, only to have the other turn it in (or large portions of it) as his own.


Presentation Transcript


  1. Catching Plagiarists • Baker Franke • The University of Chicago Laboratory Schools

  2. Problem Motivation • e.g. Physics 101 at a large university • One student helping another by giving him a copy of his assignment only to have the other turn it in (or large portions of it) as his own.

  3. Problem Statement • Given a set of text documents, find if any have been plagiarized in full, or in part, from some other document in the set.

  4. What’s Nifty? • Virtually zero prep...or a lot. • Covers many areas of CS1/CS2 • Real Big-Oh implications • Accessible by/challenging to all levels of students • No one minds it being a console application! • Relevance and motivation

  5. Nift[yi]est Thing of All • This problem REQUIRES a computational solution - no other way to do it.

  6. Suggest a Strategy? • Compare n-word chunks between documents. [Figure: Document A, Document B, and Document C, with the counts 356, 887, and 378]
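
The chunk-comparison strategy on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the assignment's reference solution: the function names `word_chunks` and `shared_chunks`, the default chunk size of 3, and the choice to lowercase and split on whitespace are all hypothetical.

```python
def word_chunks(text, n):
    """Return the set of all n-word chunks (as tuples) found in text."""
    words = text.lower().split()
    # A text with fewer than n words yields an empty set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_chunks(text_a, text_b, n=3):
    """Count the distinct n-word chunks that appear in both texts."""
    return len(word_chunks(text_a, n) & word_chunks(text_b, n))


doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "a quick brown fox jumps over a sleeping cat"
print(shared_chunks(doc_a, doc_b, n=3))  # prints 3
```

A larger `n` makes accidental matches rarer but also makes the comparison easier to evade by changing a single word, so the chunk size is a tuning choice rather than a fixed value.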

  7. Nifty Parts • Harder text processing • Real data structure choices/consequences • “Real World” issues

  8. Part I: Text Processing • Processing many documents in a non-trivial way. • Option: just compare two documents to find a degree of similarity. • Interesting Question: what do I need to compare? • Did someone say regular expressions?
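
One possible answer to "what do I need to compare?" is a normalized word list, which a regular expression can produce. The sketch below is an assumption about how a student might do it: the pattern, the name `tokenize`, and the decision to keep apostrophes inside words are illustrative choices, not part of the assignment.

```python
import re

# Runs of lowercase letters, digits, or straight apostrophes count as a word.
WORD = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and return its words, dropping punctuation."""
    return WORD.findall(text.lower())


print(tokenize("Heathcliff's life -- he encounters MANY events!"))
# prints ["heathcliff's", 'life', 'he', 'encounters', 'many', 'events']
```

Normalizing case and punctuation before chunking means that trivial edits, such as changing a comma to a semicolon, do not hide a copied passage.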

  9. Part II: Data Structures • Comparing even a small set of documents requires some structures • Many possibilities • Real Big-Oh concerns / analysis
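
As one example of the "many possibilities", the sketch below uses a dictionary that maps each chunk to the set of documents containing it, then tallies shared chunks per document pair. It is a hedged illustration only: the names `chunks` and `pair_counts`, the sample file names, and the choice of this particular structure are assumptions. Comparing every pair of documents chunk by chunk grows with the square of the number of documents, whereas this index is built in one pass over all chunks and only touches pairs that actually share something.

```python
from collections import Counter, defaultdict
from itertools import combinations


def chunks(text, n):
    """Return the set of n-word chunks (as tuples) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def pair_counts(docs, n=3):
    """docs maps name -> text. Return a Counter of (name_a, name_b) -> shared chunks."""
    index = defaultdict(set)  # chunk -> names of the documents containing it
    for name, text in docs.items():
        for chunk in chunks(text, n):
            index[chunk].add(name)

    counts = Counter()
    for names in index.values():
        # Every pair of documents sharing this chunk gets one hit.
        for pair in combinations(sorted(names), 2):
            counts[pair] += 1
    return counts


docs = {
    "a.txt": "one two three four five",
    "b.txt": "zero one two three four",
    "c.txt": "completely different words appear here",
}
print(pair_counts(docs))  # prints Counter({('a.txt', 'b.txt'): 2})
```

The trade-off is memory: the index holds every distinct chunk in the whole document set at once.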

  10. Part III: Real World Issues • A large document set quickly becomes unmanageable if a poor algorithm is chosen. • Memory limits • Output representation • Legal issues?
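
For the output-representation issue, one simple option is to print only the pairs above a threshold, with the most suspicious first. The sketch below assumes the pair counts have already been computed; the function name `report`, the threshold value, and the sample data are made up for illustration.

```python
def report(counts, threshold=1):
    """Return 'a, b: hits' lines for pairs with at least threshold hits, highest first."""
    rows = [(hits, a, b) for (a, b), hits in counts.items() if hits >= threshold]
    # Sort by descending hit count, then by name so ties print in a stable order.
    rows.sort(key=lambda row: (-row[0], row[1], row[2]))
    return [f"{a}, {b}: {hits}" for hits, a, b in rows]


# Hypothetical counts of shared chunks per document pair.
counts = {
    ("a.txt", "b.txt"): 12,
    ("a.txt", "c.txt"): 3,
    ("c.txt", "d.txt"): 200,
}
for line in report(counts, threshold=10):
    print(line)
# prints:
# c.txt, d.txt: 200
# a.txt, b.txt: 12
```

With a large document set the number of possible pairs grows quadratically, so a threshold keeps the report short enough for a person to read.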

  11. Web Site • Assignment Sheet • Resources

  12. “Through the duration of Heathcliff’s life, he encounters many tumultuous events that affects him as a person and transforms his rage deeper into his soul, for which he is unable to escape his nature.” --bwa93.txt
