
Source Code Exploration with Google



Presentation Transcript


  1. Source Code Exploration with Google • Denys Poshyvanyk, Maksym Petrenko, Andrian Marcus, Xinrong Xie, Dapeng Liu (Wayne State University) • Presented by: Roli Shrivastava

  2. HISTORY • Global Regular Expression Print (grep) • File searches in existing Integrated Development Environments (IDEs) • Both are based on regular expression matching • Limitations of grep and IDE file searches: • Support only specific development or maintenance tasks • Not in the mainstream of software development practice • Case sensitive • Limited interaction with potential users

  3. MOTIVATION • Developers need to understand large and unfamiliar parts of software systems • People search code for: • Concept location in source code • Impact analysis • Change propagation • Debugging • Comprehension of software in general • Supporting these tasks requires fast and accurate tools and techniques

  4. PROPOSAL OF PAPER • A new approach to source code exploration • Integrates Google Desktop Search (GDS) with IBM’s Eclipse development environment • Known as Google Eclipse Search (GES)

  5. EXISTING APPROACH • Searching based on Information Retrieval (IR) indexing techniques • IR allows formulation of queries with multiple words • More popular than regular expression matching • Problems: • Computational efficiency • Online re-indexing of the software
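
To make the idea of IR-style indexing with multi-word queries concrete, here is a minimal sketch of an inverted index. It is not how GDS or GES is implemented; the corpus, file names, and function names are invented for illustration, and the query semantics (every word must occur) is an assumption.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index

def search(index, query):
    """Return the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical miniature corpus; names and contents are made up.
documents = {
    "Edge.java": "draws an edge between two nodes of a class diagram",
    "Arrow.java": "arrow head styles for an edge",
    "Note.java": "a note node in a diagram",
}
index = build_index(documents)
print(sorted(search(index, "edge class diagram")))
```

The index is built once and each query is then a handful of set lookups, which is why lookups are cheap. The cost moves to keeping the index current: whenever a file changes, its entries have to be rebuilt, which is the re-indexing problem listed on the slide.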

  6. GES • Lets developers search software projects the way they search the internet or their own desktops • Searches within projects or a working set of files • Uses natural language queries • Combines the advantages of GDS with Eclipse’s extensibility • Based on the IR indexing technique • The idea is also integrated with MS Visual Studio • Uses GDS to index and search source code files and project files • Is as efficient as GDS • Re-indexes as the search space changes

  7. Problems with GDS as a standalone tool?

  8. LIMITATIONS OF GDS!! • GDS search is not project-specific • Searches files across the entire system • Needs an internet browser • Awkward: the user has to switch between the IDE and the browser • The solution is definitely GES

  9. GDS + ECLIPSE !!! • On-the-fly preprocessing and indexing of the context • Continual indexing • Maintains and updates current location changes • Accurate results • Immediate response to queries • History of searches • Advanced search options • Project-specific search • Sorting of the results by relevance or by date

  10. ADVANTAGES • Features specific to IR-based searching: • multiple-term queries • natural language queries • Boolean operators • ranking of search results • Scalability and high reliability of a proven search engine (i.e., GDS), which is important for massive file repositories such as large-scale software systems • Display of and access to the search results within the Eclipse IDE, through its native interfaces that provide direct links between the search results and the actual source code in the editor

  11. SYSTEM REQUIREMENTS • To run GES, you will need: • Eclipse SDK 3.2 or higher • Google Desktop Search (GDS) 2.0 or higher • Java Runtime Environment (JRE) 1.5 or higher

  12. GES DESIGN & IMPLEMENTATION • GES works similarly to File Search in Eclipse • Type a query into the GES dialog box • Specify the scope of the search: • workspace • selected resources • enclosing projects • working sets • After the query runs, the results are displayed in the GES Search Results tab • Results can be explored by browsing them in the editor
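
The scope setting above is what separates a project-specific search from a system-wide one. One way to picture it is as a filter applied to system-wide hits. The sketch below is an assumption about the general idea, not the GES implementation; the paths and function names are hypothetical.

```python
from pathlib import PurePosixPath

def within_scope(hit_path, scope_roots):
    """True if hit_path is one of the scope roots or lies inside one of them."""
    hit = PurePosixPath(hit_path)
    for root in scope_roots:
        root = PurePosixPath(root)
        if root == hit or root in hit.parents:
            return True
    return False

def filter_hits(hits, scope_roots):
    """Keep only the hits that fall inside the selected scope, preserving order."""
    return [hit for hit in hits if within_scope(hit, scope_roots)]

# Hypothetical system-wide hits, as a desktop search engine might return them.
hits = [
    "/ws/violet/src/ClassDiagramGraph.java",
    "/home/user/notes/diagram.txt",
    "/ws/violet2/src/Edge.java",
]
print(filter_hits(hits, ["/ws/violet"]))
```

Comparing path components rather than raw string prefixes keeps a sibling directory such as `/ws/violet2` out of a search scoped to `/ws/violet`.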

  13. GES SCREEN SHOT

  14. SCREEN SHOT

  15. PILOT CASE STUDY • Performed on Violet (http://www.horstmann.com/violet/) • Violet is a cross-platform UML editor written in Java • Has 65 classes, 448 methods, and 9000 LOC • Approach: a change request for a new feature • GOAL: “introduce a user-defined arrow type for the class diagram”

  16. QUERIES FOR PCS-I • Q2: “arrow class diagram”: oops, did not return any matches • Q3: “edge class diagrams”: worked

  17. RESULTS • 11 files as search results • UseCaseDiagramGraph • StateDiagramGraph • SequenceDiagramGraph • StateTransitionEdge • ObjectDiagramGraph • NoteNode • ObjectNode • FieldNode • ImplicitParameterNode • ClassDiagramGraph • CallNode.

  18. ANALYSIS OF RESULTS • ClassDiagramGraph had the relevant result • To verify this finding: • The ‘draw’ and ‘getPath’ methods in ‘ArrowHead’ were modified • Related methods in the ArrowHeadEditor file were also modified successfully

  19. GES vs. FILE SEARCH • Problem: a “concept location task” in Violet • Goal: “to locate the place in the source code which specifies the width of the class diagrams” • File: “value saved in DEFAULT_WIDTH variable”

  20. GES BEHAVIOR • Q1: “default width”: “Bingo” on the very first query!

  21. FILE SEARCH BEHAVIOR • Q1: “default width”: “Oops! No results” • Q2: “default”: “Yes… hmm, closer” • Q3: “width”: “Yes… much closer”

  22. Can FILE SEARCH be made BETTER? • In this particular case, “Default *Width” would have worked fine • It gives the same result as GES on the first attempt • Drawbacks: • To construct such expressions, the programmer needs additional information about the identifiers • It is impractical to construct such expressions all the time (this one was relatively simple) • What happens when the expression is more complex?
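
The difference between the two search styles in this example can be sketched on a single line of code. The source line, the wildcard pattern `default*width` (a variant of the slide's expression without the space), and the identifier-splitting step are assumptions made for illustration; the slides do not describe how GES or GDS actually preprocesses identifiers.

```python
import re

# Hypothetical source line containing the identifier from the case study.
line = "private static int DEFAULT_WIDTH = 100;"

def literal_match(query, text):
    """Case-insensitive substring search, like typing the words into a file search."""
    return query.lower() in text.lower()

def wildcard_match(pattern, text):
    """Case-insensitive search where '*' stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.search(regex, text, re.IGNORECASE) is not None

def identifier_words(text):
    """Split text into lowercase words at punctuation, underscores, and camelCase boundaries."""
    words = []
    for chunk in re.findall(r"[A-Za-z0-9]+", text):
        parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", chunk)
        words.extend(part.lower() for part in parts)
    return words

def token_match(query, text):
    """True if every query word occurs among the words extracted from the text."""
    available = set(identifier_words(text))
    return all(word in available for word in query.lower().split())

print(literal_match("default width", line))    # literal phrase search
print(wildcard_match("default*width", line))   # hand-written wildcard expression
print(token_match("default width", line))      # word-based query
```

The literal phrase fails because the identifier joins the two words with an underscore. The wildcard succeeds only because its author already guessed that something sits between them. The word-based query needs no such knowledge of how the identifier is written.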

  23. FILE SEARCH vs. GES RESULTS • The File Search query had to be modified to reach the result • Its results had to be narrowed down by refining the query • GES gave the result on the first query • GES is faster than File Search • With GES, fewer LOC need to be investigated • GES returns a ranked list of results • Developers learn relevant information faster with GES than with File Search

  24. STILL NOT SURE !! • The authors say: “This study has a proof-of-concept role, we do not generalize these conclusions”. • A more detailed case study is needed to extend the results.

  25. OTHER CASE STUDIES • Needed a bigger project than Violet • Queries were run on a P4 2.8 GHz machine with 1 GB of RAM, using: • the GES plug-in • File Search in Eclipse 3 • Subject systems: • Art of Illusion, a 3D modeling studio written in Java (442 classes, 20 interfaces, 100838 LOC) • Eclipse version 3.1 with complete sources (20000 files, 2 million LOC)

  26. METHODOLOGY • 10 queries were run on each system • The average response time was measured for GES and for File Search
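
The measurement described here, averaging the response time over a fixed set of queries, can be sketched as a small timing harness. This is only an illustration of the procedure, not the authors' setup: the search function, corpus, and queries below are invented stand-ins.

```python
import time

def average_response_time(search_fn, queries):
    """Run each query once and return the mean wall-clock time in seconds."""
    total = 0.0
    for query in queries:
        start = time.perf_counter()
        search_fn(query)
        total += time.perf_counter() - start
    return total / len(queries)

# Hypothetical stand-in for a search back end: a linear scan over in-memory lines.
corpus = ["line %d of some source file" % i for i in range(10000)]

def linear_scan(query):
    """Return every line that contains the query as a substring."""
    return [text for text in corpus if query in text]

# Ten made-up queries, mirroring the ten queries per system on the slide.
queries = ["line %d" % i for i in range(10)]
average = average_response_time(linear_scan, queries)
print("average response time: %.6f s" % average)
```

Passing the search function as a parameter lets the same harness time two different back ends on identical queries, which is what a comparison such as GES against File Search requires.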

  27. COMPARING THE RESULTS

  28. DERIVED RESULTS !!! • GES is more effective in terms of response time • GES scales up very well with the size of the search space

  29. LIMITATIONS • GES relies on GDS’s background indexing, which runs only when the user’s computer is idle • The user has to wait for the (re-)indexing of the files • None of the GDS APIs handles this issue

  30. Q: Is this really an issue? A: Since this is a one-time step, it only affects the first search on a software system

  31. CONCLUSION • Integrating GDS into Eclipse: • Improves source code searching • Produces an easier-to-adopt approach • GES allows searches over: • all the source code • associated documentation • Faster than File Search • Queries do not take into account the format of the identifiers in the source code

  32. RELATED WORK • JIRiSS (Information Retrieval based Software Search for Java), an Eclipse plug-in for source code exploration: http://mercury.cs.wayne.edu/~vip/publications/Poshyvanyk.ICPC.2006.JIRiSS.pdf • JIRiSS includes other advanced features: • automatically generated software vocabulary • advanced query formulation options, including spell-checking as well as fragment-based search • Information Retrieval, a book by C. J. van Rijsbergen: http://www.dcs.gla.ac.uk/Keith/Preface.html

  33. DISCUSSIONS ‘n’ QUESTIONS??
