
Presentation 6

Presentation 6. Cross Language Clone Analysis Team 2 November 10, 2010. Agenda. Current Tasks Parsing & CodeDOM Clone Analysis Unit Testing Team Collaboration Path Forward. Our Team. Allen Tucker Patricia Bradford Greg Rodgers Brian Bentley Ashley Chafin. Current Tasks.


Presentation Transcript


  1. Presentation 6 Cross Language Clone Analysis Team 2 November 10, 2010

  2. Agenda • Current Tasks • Parsing & CodeDOM • Clone Analysis • Unit Testing • Team Collaboration • Path Forward

  3. Our Team • Allen Tucker • Patricia Bradford • Greg Rodgers • Brian Bentley • Ashley Chafin

  4. Current Tasks What we are tackling…

  5. Current Tasks (Review) • Current tasks created for the first user story “Source Code Load & Translate”: • Load & parse C#, Java, C++ source code. • Translate the parsed C#, Java, C++ source code to CodeDOM. • Associate the CodeDOM with the original source code. • Current tasks created for the second user story “Source Code Analyze”: • Analyze CodeDOM for clones. • Display clone analysis results.

  6. Parsing & CodeDOM GOLD Parsing Populating CodeDOM

  7. Task Understanding • Three Step Process • Step 1 Code Translation • Step 2 Clone Detection • Step 3 Visualization [Diagram labels: Source Files, Translator, Common Model, Inspector, Detected Clones, Clone Visualization UI]
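The three steps on this slide can be pictured as a small pipeline. The sketch below is only an illustration under stated assumptions: the names `CommonModel`, `translate`, `detect`, and `visualize` are hypothetical, whitespace splitting stands in for real parsing into CodeDOM, and exact token equality stands in for real clone detection.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CommonModel:
    """Hypothetical stand-in for the language-neutral model (CodeDOM in the slides)."""
    path: str
    tokens: List[str]


def translate(sources: Dict[str, str]) -> List[CommonModel]:
    """Step 1: turn each source file into the common model.

    Assumption: splitting on whitespace replaces the real parser here.
    """
    return [CommonModel(path, text.split()) for path, text in sources.items()]


def detect(models: List[CommonModel]) -> List[Tuple[str, str]]:
    """Step 2: compare every pair of models and report matching pairs."""
    clones = []
    for i, left in enumerate(models):
        for right in models[i + 1:]:
            if left.tokens == right.tokens:
                clones.append((left.path, right.path))
    return clones


def visualize(clones: List[Tuple[str, str]]) -> str:
    """Step 3: render detected clones (plain text instead of a UI)."""
    return "\n".join(f"{a} <-> {b}" for a, b in clones)


if __name__ == "__main__":
    files = {"A.java": "x = 1 ;", "B.cs": "x  =  1 ;", "C.cpp": "y = 2 ;"}
    print(visualize(detect(translate(files))))
```

Keeping the common model as the only interface between the steps is what would let the detection step stay unaware of the source language.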

  8. Grammar Updates • The grammars we currently have for the GOLD parser are outdated. • Current GOLD Grammars • C# version 2.0 • Java version 1.4 • Currently available language versions • C# version 4.0 • Java version 6

  9. Grammar Update Issues • Grammars for C# and Java are very complex and require a lot of work to build. • ANTLR and GOLD Parser grammars use completely different syntax. • Positive note: Other development is not halted by the use of older grammars.

  10. What’s in CodeDOM (Java Code)

  11. Our Grammar Bookkeeping Since there are so many production rules, we came up with the following bookkeeping: • A spreadsheet of the compiled grammar table (for each language) with each production rule indexed. • This spreadsheet covers: • various aspects of language • what we have/have not handled from the parser • what we have/have not implemented into CodeDOM • percentage complete

  12. Our Grammar Bookkeeping

  13. Parsing & CodeDOM Status • Parsing Handlers’ Status: • C# = 52% complete • Java = 100% complete • CodeDOM Translation Status: • We currently do not believe we can give an accurate measure of our completeness on this. It is a learning process, and we are working hard to complete it. • We are working on the CodeDOM translation for Java currently. (We will demo)

  14. SLOC For Our Project • As of Nov 8, 2010 • SLOC: • CS666_Client = 553 lines • CS666_Core = 114 lines • CS666_CppParser = 117 lines • CS666_CsParser = 1678 lines • CS666_JavaParser = 3350 lines • CS666_LanguageSupport = 48 lines • CS666_UnitTests = 3384 lines • Total = 9244 lines (including unit tests)
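As a quick check of the arithmetic on this slide, the per-project counts can be summed directly. This is only a sketch: the numbers are copied from the slide, and the dictionary name `sloc` is chosen here for illustration.

```python
# SLOC per project as listed on the slide (as of Nov 8, 2010).
sloc = {
    "CS666_Client": 553,
    "CS666_Core": 114,
    "CS666_CppParser": 117,
    "CS666_CsParser": 1678,
    "CS666_JavaParser": 3350,
    "CS666_LanguageSupport": 48,
    "CS666_UnitTests": 3384,
}

total = sum(sloc.values())                       # all projects, tests included
without_tests = total - sloc["CS666_UnitTests"]  # production code only

if __name__ == "__main__":
    print(total, without_tests)
```

The sum agrees with the 9244 lines reported on the slide; excluding the unit tests leaves 5860 lines.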

  15. Clone Analysis Dr. Kraft’s Students’ Tool

  16. Clone Research • Detecting clones across multiple programming languages is on the cutting edge of research. • A preliminary version of this was done by Dr. Kraft and his students for C# and VB. • They compared the Mono C# parser (written in C#) to the Mono VB parser (written in VB). • Publication: • Nicholas A. Kraft, Brandon W. Bonds, Randy K. Smith: Cross-language Clone Detection. SEKE 2008: 54-59

  17. Dr. Kraft Approach • Token sequence of CodeDOM graphs with Levenshtein distance • The Levenshtein distance between two sequences is defined as the minimum number of edits needed to transform one sequence into the other • Performs Comparisons of code files • CodeDOM tree is tokenized • Based on Distances • Percentage of matching tokens in a sequence
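The distance described on this slide can be sketched over token sequences as follows. This is a minimal illustration, not Dr. Kraft's code: the token names in the example are made up, and normalizing the distance by the longer sequence length is one assumed way to turn it into a match percentage.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    # previous[j] = distance between the tokens of a seen so far (minus one) and b[:j]
    previous = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty sequence
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            current.append(min(previous[j] + 1,          # delete tok_a
                               current[j - 1] + 1,       # insert tok_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed normalization: 1.0 means identical, 0.0 means nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    # Hypothetical token names standing in for tokenized CodeDOM nodes.
    left = ["method", "param", "return", "binop"]
    right = ["method", "param", "param", "return", "binop"]
    print(levenshtein(left, right), similarity(left, right))
```

Working on token sequences rather than characters means one inserted statement or parameter counts as a single edit, regardless of how long its source text is.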

  18. Dr. Kraft Approach (cont)

  19. Porting Clone Analysis • Porting of the analysis code is about 50% complete • Dr. Kraft's code • The AST, CodeDOM, and tokenization are tightly woven together

  20. Limitations • Only does file-to-file comparisons • Does not detect clones in same source file • Can only detect Type 1 and some Type 2 clones • Not very efficient (brute force)
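For context on the clone types named here: Type 1 clones are identical apart from whitespace and comments, while Type 2 clones also differ in identifiers and literal values. The sketch below shows one common way to make Type 2 clones comparable by normalizing identifiers and numbers before comparison. It is an assumption-laden illustration, not the ported tool: the regex tokenizer, the tiny `KEYWORDS` set, and the `ID`/`NUM` placeholders are all hypothetical, and comments are not handled.

```python
import re
from typing import List

# Small illustrative keyword subset; a real tool would use the full language grammar.
KEYWORDS = {"int", "return", "if", "else", "for", "while"}

# Identifiers/keywords, integer literals, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(source: str) -> List[str]:
    """Tokenize source and replace identifiers and numbers with placeholders."""
    tokens = []
    for tok in TOKEN.findall(source):
        if tok in KEYWORDS:
            tokens.append(tok)       # keywords are kept as-is
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append("ID")      # any identifier
        elif tok.isdigit():
            tokens.append("NUM")     # any integer literal
        else:
            tokens.append(tok)       # operators and punctuation
    return tokens


if __name__ == "__main__":
    a = "int total = a + 10;"
    b = "int sum=b+ 42 ;"
    print(normalize(a) == normalize(b))
```

After normalization the two statements yield the same token sequence, so a comparison that only recognizes identical sequences would still flag them as clones.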

  21. Our User Interface

  22. Unit Testing NUnit

  23. Two Types of Testing • White Box Testing: • Unit Testing • Black Box Testing: • Production Rule Testing • Allows us to test the robustness of our engine because we can force rule production errors. • Regression Testing • Automated

  24. NUnit • What Is NUnit? • NUnit is a unit-testing framework for all .Net languages. Initially ported from JUnit, the current production release, version 2.5, is the sixth major release of this xUnit based unit testing tool for Microsoft .NET. It is written entirely in C# and has been completely redesigned to take advantage of many .NET language features, for example custom attributes and other reflection related capabilities. • http://www.nunit.org/index.php

  25. NUnit Example Code

  26. NUnit Example Code

  27. Unit Test Progress • Approximately 180 NUnit unit test stubs have been created. • We are currently filling in the stubs for which functional code already exists to support the test. • Unit testing demo

  28. Team Collaboration Team 2 & Team 3

  29. Team Collaboration Team 2 & Team 3 • Both teams met Monday (11-8-10) after class and performed the required Pair Programming. • Team 2 • All project source code has been made available. • We are researching and working to update the Java and C# grammars. • Team 3 • Team 3 is working on C++ parsing. • Looking into another parser, ELSA.

  30. Patricia’s Status Patricia Bradford

  31. Patricia Had A…. • Beautiful baby boy, Joshua Aydan.

  32. Path Forward Next Iteration & Schedule

  33. Path Forward • Finalize Iteration 1 Task (C++ to CodeDOM) • Continue Iteration 2 Task (Code Analysis) • Continue Iteration 3 Task (Begin GUI)

  34. Schedule
