
The Visual Code Navigator: An Interactive Toolset for Source Code Investigation

F. Boerboom, A. Janssen, G. Lommerse, F. Nossin, L. Voinea, A. Telea

Eindhoven University of Technology, the Netherlands

slide2

Outline

The Visual Code Navigator (VCN):

  • an environment for interactive visualization of industry-size source code projects
  • tuned for C/C++ code bases stored in CVS
  • targets understanding code evolution and code structure
  • based on three views with complementary purposes

How can we extract facts from source code?

What can the VCN source code views show?

slide3

Fact extraction

  • Notoriously difficult problem. Requirements (roughly):
  • completeness:
  • - extracts all elements & cross-refs from source code
  • - extracts correct information
  • - complies with latest C/C++ standard
  • - includes preprocessor facilities
  • tolerance:
  • - handles incomplete/incorrect/ambiguous code
  • efficiency:
  • - memory/speed efficient on industry-size code bases
  • availability:
  • - can be built from source code, preferably cross-platform
slide4

Existing fact extractors

Testing:

- get the tool as binary/source; try to build it
- analyze very large systems (>0.5 MLOC)
- select extremely messy C/C++ code
- try with/without includes (incomplete)
- check output for size, correctness, completeness, throughput
- investigate the causes of limitations

Rating scale:

++ very good
+  good
o  could be better
-  limited
-- unacceptable/missing
?  insufficiently tested

slide5

Conclusions

  • Many surprises:
  • most tools extract interface data quite well
  • … but fail badly at parsing implementation (function bodies)
  • tolerance and completeness are mutually exclusive
  • completeness and performance also trade off against each other
  • GLR grammar based tools are by far the best
  • Overall, we found just one reasonably good tool: Columbus
  • However, it is:
  • closed-source
  • limited in some technical respects (template handling)
  • quite slow (1 hr 20 min for ~150000 LOC)

How can we do better than the above tools?

slide6

EFES: Our own C/C++ fact extractor

  • We chose to build our own extractor:
  • based on the Elkhound C/C++ GLR parser
  • uses a modified preprocessor, for tolerance
  • extends the parser, for tolerance vs incomplete/incorrect code & handling templated code
  • uses compression techniques to compact/speed up output
  • So far:
  • tests on very large projects (>200 MLOC) look good
  • we are 3..7 times faster than Columbus
  • we produce the ‘bare’ info, no metrics yet

Hard, but unavoidable endeavour

slide7

EFES Architecture

source: any C/C++ project, possibly incomplete/incorrect code

preprocessor: libcpp, also used by GNU CPP

parser: Elsa – uses the Elkhound GLR parser generator

type checker: disambiguates code with type information

filter: limits output to a set of interest (e.g. files, scopes, …)

output generator: efficiently writes the output information to a file
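
The filter stage above can be pictured with a small sketch. This is not the EFES code: the `Fact` record, its field names and the `filter_facts` helper are hypothetical, chosen only to show how a "set of interest" given as file names could limit what reaches the output generator.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Set


@dataclass(frozen=True)
class Fact:
    """Hypothetical extracted fact: a construct plus its source location."""
    kind: str    # e.g. "function", "variable", "type"
    name: str
    file: str
    line: int
    column: int


def filter_facts(facts: Iterable[Fact], files_of_interest: Set[str]) -> Iterator[Fact]:
    """Pass through only the facts located in the files of interest."""
    for fact in facts:
        if fact.file in files_of_interest:
            yield fact


facts = [
    Fact("function", "printf", "/usr/include/stdio.h", 332, 12),
    Fact("function", "main", "src/main.cpp", 10, 5),
    Fact("variable", "counter", "src/main.cpp", 4, 12),
]

# Standard-header facts are dropped; project facts are kept.
kept = list(filter_facts(facts, {"src/main.cpp"}))
```

A scope-based filter would follow the same shape, testing an enclosing-scope field instead of the file name.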

slide8

EFES Enhancements

Several enhancements to ‘standard’ fact extraction:

preprocessor: enhanced CPP to produce exact location information (needed later for construct visualization & comparison)

parser & type checker: enhanced Elsa to:

- parse incorrect code with extra grammar rules; errors are caught at scope level

- extended Elsa’s template support

- added checkpoints at top-form level to trap internal errors

filter: novel element; reduces output size dramatically, e.g. by skipping standard header information

output generator: added compact binary output; reduces output size 10 times, increases output speed 5 times

project concept: lets users customize extraction (C++ dialect, filtering, parser strictness, what to output, etc)
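
Why a binary output can be much smaller than text is easy to see in a sketch. The record layout below is an assumption made for illustration (it is not the EFES file format): every string is stored once in a table, and each fact becomes a fixed 12-byte record of table indices plus line and column.

```python
import struct

RECORD = "<HHHIH"  # kind index, name index, file index, line, column (12 bytes)


def encode_facts(facts):
    """Pack (kind, name, file, line, column) tuples into bytes.

    Assumed layout: header, string table, fixed-size records.
    16-bit indices limit this sketch to 65535 distinct strings.
    """
    table, index = [], {}

    def intern(text):
        if text not in index:
            index[text] = len(table)
            table.append(text)
        return index[text]

    records = b"".join(
        struct.pack(RECORD, intern(kind), intern(name), intern(path), line, col)
        for kind, name, path, line, col in facts
    )
    strings = b"".join(
        struct.pack("<H", len(raw)) + raw
        for raw in (text.encode("utf-8") for text in table)
    )
    return struct.pack("<II", len(table), len(facts)) + strings + records


def decode_facts(blob):
    """Inverse of encode_facts."""
    n_strings, n_facts = struct.unpack_from("<II", blob, 0)
    offset = 8
    table = []
    for _ in range(n_strings):
        (length,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        table.append(blob[offset:offset + length].decode("utf-8"))
        offset += length
    facts = []
    for _ in range(n_facts):
        k, n, p, line, col = struct.unpack_from(RECORD, blob, offset)
        offset += struct.calcsize(RECORD)
        facts.append((table[k], table[n], table[p], line, col))
    return facts


facts = [("variable", f"v{i}", "src/very/long/path/module.cpp", i + 1, 1)
         for i in range(100)]
blob = encode_facts(facts)
text = "\n".join(" ".join(map(str, fact)) for fact in facts)
```

The saving in this toy case comes from the repeated file path and kind names; the factor of 10 quoted on the slide is the authors' measurement and is not reproduced here.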

slide9

Performance & Results

[Chart: Columbus vs. EFES] We are 3..7 times faster

slide10

Conclusions

  • We’ve built a powerful C/C++ fact extractor:
  • works on large projects (>200 MLOC)
  • handles incorrect/incomplete code well
  • extracts virtually all raw information there is
  • is 3..7 times faster than a known commercial solution
  • Desired additions
  • distil raw information into more interesting facts (metrics, patterns, etc)
  • add query layer atop basic extractor
  • add interactive visualization layer atop query layer

An evolving project

slide11

Visualization

  • We now have our extracted facts:
  • variables, types, functions, classes…
  • cross-references between all these
  • location information (file, line, column) of each construct
  • We would like to show them to the user & answer questions:
  • how is the code structured?
  • how are programming constructs distributed?
  • how has the code changed in time?
  • how are the typical function signatures used in a project?
  • …and so on

Several visualization tools

slide12

1) Syntactic view: 1 version, N files – code view

  • Basic idea:
  • combine a classical text editor with a pixel-based text display (e.g. SeeSoft) in a single view
  • let users smoothly navigate between the two
  • blend syntactic structures over code text using cushions
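
A pixel-based text display in the SeeSoft style draws each line of code as a thin row whose indentation and length follow the text. A minimal sketch, assuming one character per pixel and using `#` in place of a colored pixel (the function names are hypothetical, not part of the VCN):

```python
def line_to_span(line, tab_width=4):
    """Return (indent, end) columns covered by the visible text of a line."""
    expanded = line.expandtabs(tab_width).rstrip()
    indent = len(expanded) - len(expanded.lstrip())
    return indent, len(expanded)


def render_pixel_view(source, width=40):
    """Map every source line to one row of 'pixels', clipped to `width`."""
    rows = []
    for line in source.splitlines():
        start, end = line_to_span(line)
        end = min(end, width)
        start = min(start, end)
        rows.append(" " * start + "#" * (end - start))
    return rows


source = "int main() {\n    return 0;\n}\n"
rows = render_pixel_view(source)
```

Zooming toward the text-editor end of the navigation would amount to spending more pixels per character until glyphs become legible.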

[Figure labels: syntax tree + source code, cushion texture, result; cushion profile f(x) over border size x]
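
The figure labels mention a cushion profile f(x) defined over a border of size x. The exact profile is in the paper, not in this transcript, so the sketch below assumes a parabolic ramp: brightness rises from the rectangle edge over the border width and stays flat in the interior.

```python
def cushion_profile(distance, border):
    """Brightness in [0, 1] at a given distance from the nearest edge.

    Assumed shape: parabolic ramp inside the border, plateau beyond it.
    """
    if border <= 0 or distance >= border:
        return 1.0
    t = max(distance, 0.0) / border
    return 1.0 - (1.0 - t) ** 2


def cushion_texture(width, height, border):
    """Luminance texture for one rectangular syntactic construct."""
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            # Distance from the pixel center to the nearest rectangle edge.
            d = min(x + 0.5, width - x - 0.5, y + 0.5, height - y - 0.5)
            row.append(cushion_profile(d, border))
        rows.append(row)
    return rows


texture = cushion_texture(9, 9, 3)
```

Multiplying such a texture with the construct's color gives a shaded block whose dark edges separate it from neighboring and enclosing constructs.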

slide18

Cushion vs ‘syntax highlighting’

  • classical syntax highlighting is actually lexical highlighting
  • we generalize and enhance syntax highlighting

[Figure: syntax highlighting vs. structure cushions]

slide19

Syntactic view: Navigation

user points the mouse at some code location…

slide20

Syntactic view: Spot cursor

…and brings the text in focus above the structure

slide21

Syntactic view: Structure cursor

…over a whole syntactic construct, if desired.

slide22

Syntactic view - Conclusions

  • Two main uses:
  • Overview:
  • good for showing up to 10,000-15,000 LOC on one screen
  • colors code by construct type
  • easy to spot presence/distribution of constructs in code
  • Detail:
  • good for quickly browsing a single source file
  • gives structure context information
  • typical question:
  • “where was that function with that doubly-nested for?”
slide23

2) Symbol view: N files, 1 version – interface view

  • Displays public symbols in source files
  • Nested by scope rules (global, namespace, method, argument)
  • Visualized using a cushion treemap, colored by symbol type

[Figure labels: ‘public’ symbols in files; arguments, functions, fields, typedefs, global vars; files]

slide24

Symbol view - Details

  • Treemap node size computation:
  • - leaves: for function bodies, number of LOC in declaration; else number of LOC or sizeof()
  • - non-leaves: sum of children
  • Shading:
  • - hue: construct type (typedef, function, argument, …)
  • - saturation: construct nesting (global/class scope)
  • Targeted questions:
  • - “what kind of symbols are in a library’s headers?”
  • - “how are namespaces used in interface headers?”
  • - “does a header have a simple / uniform structure or not?”
  • - “are there heavy functions from a parameter-passing view?”
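
The size and shading rules can be sketched in a few lines. The `Symbol` class, the hue table and the direction of the saturation mapping (deeper nesting gives paler colors) are assumptions for illustration; the slide only states that hue encodes construct type and saturation encodes nesting.

```python
import colorsys
from dataclasses import dataclass, field
from typing import List

# Hypothetical hue per construct type, as a fraction of the color wheel.
HUES = {"function": 0.0, "typedef": 0.33, "argument": 0.6, "field": 0.8}


@dataclass
class Symbol:
    kind: str
    loc: int = 0      # lines of code, used for leaves only
    depth: int = 0    # nesting level: 0 = global scope
    children: List["Symbol"] = field(default_factory=list)


def node_size(sym):
    """Leaf: its LOC count. Non-leaf: sum of its children's sizes."""
    if not sym.children:
        return sym.loc
    return sum(node_size(child) for child in sym.children)


def node_color(sym, max_depth=3):
    """RGB color: hue from construct type, saturation from nesting depth."""
    hue = HUES.get(sym.kind, 0.5)
    saturation = 1.0 - 0.5 * min(sym.depth, max_depth) / max_depth
    return colorsys.hsv_to_rgb(hue, saturation, 1.0)


header = Symbol("file", children=[
    Symbol("function", loc=12),
    Symbol("typedef", loc=1),
    Symbol("class", depth=0, children=[
        Symbol("field", loc=1, depth=1),
        Symbol("field", loc=1, depth=1),
    ]),
])
total = node_size(header)
```

With these sizes a treemap layout gives the 12-line function most of the file's rectangle, which is what makes "heavy" symbols stand out in the view.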

slide25

Symbol view: Example

[Figure labels: C global namespace; C++ std namespace; symbols in file; brushed file]

slide26

3) Evolution view: M files, N versions

Basic idea: CVSscan tool [Voinea & Telea, ACM SoftVis’05]

[Figure labels: time (version) axis; file axis; source code details]

slide27

Evolution view: M files, N versions

  • extends the CVSscan tool [Voinea & Telea, ACM SoftVis’05]
  • stacks several stripped-out file evolution views above each other
  • line color = construct type
  • helps spot cross-file correlations (e.g. large changes)

[Figure labels: file axis; time (version) axis; comments, function bodies, strings, function headers]

slide28

Evolution view - Results

  • We look for:
  • Large size jumps = large code changes
  • Size jumps correlating across several files at the same version = cross-system changes
  • Less ‘wavy’ patterns = stable(r) files
  • Horizontal patterns = unchanged code
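
In the VCN these patterns are read off the picture by eye. The same two signals, a large size jump and jumps that coincide across files at one version, can be stated as a small computation. The sketch assumes per-file LOC counts per version are available; the file names, numbers and threshold are made up.

```python
def size_jumps(sizes, threshold):
    """Version indices where the file size changes by at least `threshold` LOC."""
    return [
        version
        for version in range(1, len(sizes))
        if abs(sizes[version] - sizes[version - 1]) >= threshold
    ]


def cross_file_jumps(history, threshold, min_files=2):
    """Versions at which at least `min_files` files jump together."""
    files_by_version = {}
    for path, sizes in history.items():
        for version in size_jumps(sizes, threshold):
            files_by_version.setdefault(version, []).append(path)
    return {
        version: sorted(paths)
        for version, paths in files_by_version.items()
        if len(paths) >= min_files
    }


# Hypothetical LOC per version for three files.
history = {
    "a.cpp": [100, 102, 180, 181],
    "b.cpp": [50, 50, 120, 121],
    "c.h": [30, 30, 30, 31],
}
correlated = cross_file_jumps(history, threshold=50)
```

Here `a.cpp` and `b.cpp` both jump at version index 2, the kind of cross-system change the stacked view makes visible, while `c.h` stays flat and would show as a stable, horizontal pattern.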
slide29

Evaluation

  • Method & materials:
  • - VTK C++ library (1 MLOC, 100 versions)
  • - 3 users with C++ but no VTK knowledge
  • - 1 user with C++ and VTK knowledge (evaluator)
  • - quantitative and qualitative questions to be answered on VTK with and without VCN

Questions (rated per view: Evo, Stx, Sym):

  • are files fine/coarse grained?
  • what is the typical class interface structure?
  • what is the typical class implementation structure?
  • find & describe a few large evolution changes
  • what is the typical macro usage/frequency?
  • what is the typical comment usage/frequency?

Legend: preferred/first tool; optional/second tool

slide30

Evaluation

  • Results:
  • VCN allowed getting answers (much) faster than by pure classical source code browsing
  • views are complementary, serve different tasks in different ways
  • a single view is usually not enough
  • a fine-tuned, fast, integrated system is essential!
  • users are reluctant to work with lame/suboptimal tools

[Figure labels: start; symbol view (interface?); syntax view (implementation?); evolution view; text editor; fine insight]

slide31

Implementation

  • Syntactic view:
  • cushions: OpenGL textures - superimposed, not blended
  • careful cushion border design (see paper)
  • Symbol view:
  • cushion treemap: OpenGL fragment programs
  • essential for interactive, fast navigation!
  • Evolution view:
  • column cushions: OpenGL textures
  • several LOC per pixel: solved by software antialiasing
  • efficient tool design is essential for smooth navigation in large code bases, which is important for user acceptance
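
When a file has more lines than the column has pixel rows, several lines share one pixel. One simple form of software antialiasing is to average the colors of the lines that fall on the same row. The sketch below assumes that box-filter averaging; the filter actually used in the VCN is not given in the slides.

```python
def antialias_column(line_colors, pixel_rows):
    """Average per-line RGB colors down to `pixel_rows` pixels.

    line_colors: one (r, g, b) tuple per line of code, top to bottom.
    """
    count = len(line_colors)
    if count == 0:
        return [(0.0, 0.0, 0.0)] * pixel_rows
    column = []
    for row in range(pixel_rows):
        first = row * count // pixel_rows
        # Always cover at least one line, even when lines are scarcer than rows.
        last = max((row + 1) * count // pixel_rows, first + 1)
        bucket = line_colors[first:last]
        column.append(tuple(
            sum(color[channel] for color in bucket) / len(bucket)
            for channel in range(3)
        ))
    return column


# Four lines (red, blue, green, green) squeezed into two pixel rows.
colors = [(1, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 0)]
column = antialias_column(colors, 2)
```

Without the averaging, each pixel would show only one of its lines, so small constructs could appear or vanish as the user zooms.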
slide32

Conclusions

  • VCN: multi-view visual environment for understanding source code and its evolution
  • Syntax view: 1 version, N files (compiler)
  • Symbol view: 1 version, N files (linker)
  • Evolution view: M versions, N files
  • Dense pixel displays essential for viewing large datasets
  • Cushion techniques effective for visualizing various kinds of visual nesting (syntax, symbol, file, …)
  • Working to extend & generalize the VCN
  • What to do when M,N exceed a few hundred?

Check it out: www.win.tue.nl/~lvoinea/VCN.html