Grokking Software Architecture - PowerPoint PPT Presentation


2008 Working Conference on Reverse Engineering Grokking Software Architecture Richard C. Holt Software Architecture Group (SWAG) School of Computer Science, University of Waterloo, Canada Retrospective 1998 2008


Presentation Transcript


2008 Working Conference on Reverse Engineering

Grokking Software Architecture

Richard C. Holt

Software Architecture Group (SWAG)

School of Computer Science, University of Waterloo, Canada


Retrospective

1998

2008

Ten years ago. WCRE most influential paper: “Structural Manipulations of Software Architecture using Tarski Relational Algebra”

Today. Retrospective: “Grokking Software Architecture”

17 papers in WCRE


Grokking Software Architecture

Grokking

Software architecture


Overview of Talk: 4 Parts

  • Part 1. 1998 paper: Hopes & claims

  • Part 2. Software Architecture

  • Part 3. Formalizing Boxology

  • Part 4. ROP: Relation-Oriented Programming & Grok-Like Languages


Part 1. 1998 Paper: Hopes & Claims

  • Represent software architecture as a typed graph

    • Graphs with “colors” of edges & nodes

  • Manipulate & visualize these architectural graphs

  • Manipulations can be specified algebraically --- and automatically executed

In brief: Formalize architectural diagrams and reap the benefits arising from the corresponding mathematics.


Top View of As-Built Software Architecture (250KLOC System)


View of One Subsystem of the 250 KLOC System


CS 746G Topics in Software Architecture, University of Waterloo

  • CS746 in Winter 1998 Linux (Operating System)

  • CS746 in Winter 1999 Apache (Web Server)

  • CS746 in Winter 2000 Mozilla (Web Browser)

  • CS746 in Winter 2001 Eazel Nautilus (File Manager)

  • CS798 in Winter 2002 Postgres et al (Data Base)

  • CS746 in Winter 2003 EMACS et al (Editor)

  • CS746 in Winter 2004 Gnumeric (Spreadsheet)

  • CS746 in Fall 2004 Mozilla (Web Browser -- again)

  • CS746 in Fall 2005 Open Office (Open Source Office Suite)

  • CS746 in Fall 2006 Asterisk (Open Phone Switch)

  • CS746 in Fall 2008 MySQL


Process of View Creation

[Figure: pipeline diagram from source code to a browsable architectural diagram. Labels: Source code; Parser; Facts extracted from code; Clustering; Hierarchic decomposition; Grok (fact manipulator); Layouter; Architectural diagram; Browser.]


Transformations to do Hiding

[Figure: Graph G, containing boxes S, T and V and nodes a through h, shown next to the two graphs derived from it.]

Graph H = hide(hide(G,T),V)

Graph I = hideExt(G, S)


Lifting Calls Up to File Level

call is a procedure call

fileCall is a file level call

[Figure: file main.c holds the procedure body main (funcDef edge); file start.h holds the procedure header startup (funcDcl edge); main has a call edge to startup; the lifted fileCall edge runs from main.c to start.h.]

fileCall := funcDef o call o inv funcDcl
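
As a rough illustration of the lifting formula, the sketch below models each relation as a Python set of (source, target) pairs. The helper names `compose` and `inverse` are made up for this example, and the edge directions (file to function for funcDef and funcDcl) are an assumption read off the diagram labels, not something the slide states.

```python
def compose(r, s):
    """Relational composition r o s: pairs (x, z) with x r y and y s z for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

def inverse(r):
    """Inverse relation: swap source and target of every pair."""
    return {(y, x) for (x, y) in r}

# Facts for the two files on the slide (edge directions are assumed).
funcDef = {("main.c", "main")}      # main.c defines the body of main
funcDcl = {("start.h", "startup")}  # start.h declares the header of startup
call = {("main", "startup")}        # main calls startup

# fileCall := funcDef o call o inv funcDcl
fileCall = compose(compose(funcDef, call), inverse(funcDcl))
print(sorted(fileCall))  # [('main.c', 'start.h')]
```

The procedure-level call edge is replaced by a single file-level edge, with no explicit loop over files in the lifting expression itself.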


Part 2. Software Architecture: Boxology Approach

  • Software architecture:

    • What is it?

    • State of practice

    • How is it represented

    • Keep it simple

    • Models & tools

    • Views of architecture

  • Extracting As-Built architecture


Software Architecture: What is it?

  • Confusion. I have a sneaking suspicion that ‘architecture’ is one of the most overused and least understood terms in professional software development circles. Gorton

  • Consensus. Architecture captures system structure in terms of components [parts] and how they interact. Gorton


Software Architecture: State of the Practice

  • “It’s common for there to be little or no documentation covering the architecture in many projects.” Gorton

  • “I'm hopeless when it comes to documentation.” Torvalds

  • “The architecture that actually predominates in practice is the ‘big ball of mud’ ” Foote et al


Software as Spaghetti

Foote et al


Software Architecture: How is it Represented in Practice?

  • … predominant tools used for architecture documentation are Microsoft Word, Visio and PowerPoint. Gorton

  • What’s needed: Concepts, notations and tools that are

    • easy to use and

    • help us produce useful, understandable documentation


KISS: Keep it Simple Stupid

“Any fool can make things bigger, more complex, and more violent. It takes a touch of genius - and a lot of courage - to move in the opposite direction.” Einstein

“Make everything as simple as possible, but not simpler.” Einstein


Models and Tools for Software Architecture

  • “UML has, for better or (many would say) worse, become the industry standard ADL [Architecture Design Language]” Shaw

  • UML “lacks, however, a robust suite of tools for analysis, consistency checking” Shaw


UML Component Diagram: Box and Arrow Diagram

Gorton


Views of Software Architecture (Kruchten)

[Figure: Kruchten's architectural views arranged around Scenarios. View labels: Users’ View; As-Built View; Concurrency View; Deployment View. Stakeholder labels: End user; Programmers & software managers; System Engineer; Integrator.]


Extracting the As-Built Architecture from the Code

  • “Reverse engineering is the process of analyzing a subject system to create representations of the system at a higher level of abstraction.” Chikofsky

  • Relational approach.

    • Parse the code to produce relations, e.g.,

      • (call, P, Q) means proc P calls Q

    • Manipulate edges into as-built architecture
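
To make the relational approach concrete, here is a minimal sketch that groups extracted facts into named binary relations. It assumes a hypothetical fact format of one whitespace-separated triple per line (relation, source, target); the function name `load_triples` and the sample facts other than main calling startup are invented for illustration.

```python
from collections import defaultdict

def load_triples(lines):
    """Group (relation, source, target) triples into named binary relations."""
    relations = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, src, dst = parts
        relations[name].add((src, dst))
    return relations

# Hypothetical output of a fact extractor.
facts = [
    "call main startup",
    "call startup init",
    "ref main counter",
]
rels = load_triples(facts)
print(sorted(rels["call"]))  # [('main', 'startup'), ('startup', 'init')]
```

Once the facts are held as sets of pairs, the later manipulation steps reduce to set operations on `rels["call"]`, `rels["ref"]` and so on.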


Boxology as a Central ADL (Architectural Design Language)

  • “The most widely used design notation [for software architecture] is informal ‘block and arrow’ diagrams.” Gorton


Cross Fertilization!! Rev Eng, S/W Arch, Relational Approach

  • Reverse engineering

    • Architecture extraction

    • As-Built view: Code is king

    • Traceability

  • Software architecture

    • Need for representation & tools

    • Simplicity & utility

  • Relational approach

    • Boxology

    • Formalization --- Tarski algebra


Part 3. Formalizing Boxology

  • Boxology is the “Representation of an organized structure as a graph of labeled nodes (‘boxes’) and connections between them (as lines or arrows).” Wikipedia

  • “Toward boxology: preliminary classification of architectural styles” Shaw


Example Typed Graph

[Figure: the same typed graph drawn two ways over the nodes r, a, b, v, w, x, y and z, with edges labeled by the types C, I, E and U.]

  • C = { (r,a), (r,b), (a,v), (a,w), (a,x), (b,y), (b,z) }

  • I = { (a,b) }

  • E = { (b,y) }

  • U = { (v,w), (x,y) }
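
One simple way to hold such a typed graph in a program is a mapping from edge type to a set of pairs. The sketch below is only an illustration: the dictionary layout and the helper names `nodes` and `edge_types` are assumptions, not the representation used by any particular tool.

```python
# Typed graph from the slide: one set of (source, target) pairs per edge type.
graph = {
    "C": {("r", "a"), ("r", "b"), ("a", "v"), ("a", "w"),
          ("a", "x"), ("b", "y"), ("b", "z")},
    "I": {("a", "b")},
    "E": {("b", "y")},
    "U": {("v", "w"), ("x", "y")},
}

def nodes(g):
    """All nodes that appear as the source or target of any edge."""
    return {n for edges in g.values() for pair in edges for n in pair}

def edge_types(g, src, dst):
    """Edge types ("colors") that connect src to dst, in sorted order."""
    return sorted(t for t, edges in g.items() if (src, dst) in edges)

print(sorted(nodes(graph)))         # ['a', 'b', 'r', 'v', 'w', 'x', 'y', 'z']
print(edge_types(graph, "b", "y"))  # ['C', 'E']
```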


Boxology is Just Scribbling?

  • Box & arrow diagrams

    • Are just scribbles? No

    • Formalized by typed graphs

    • Visualized as (nested) boxes & arrows

    • Manipulated by Tarski algebra etc.

    • Exchanged as

      • Triples (RSF), extended to TA, or GXL or …


Boxology has Semantics? Yes

  • Compare to BNF

    • Semantics by informal attachment to productions

  • Compare to Codd’s relational approach

    • Semantics by interpretation of tables.

  • Semantics by attributes & descriptions

    • Separation of concerns

    • Structure then semantics

  • Use box/arrow diagrams as underlying formalism for software architecture (Mini-MOF?)


Adding Algebra to Boxology

  • Tables then Codd relational algebra

    • N-ary relations

  • Boxes/arrows then Tarski relational algebra

    • Binary relations



Tarski Algebraic Operators

Union: I + E = {(a,b), (b,y)}

Intersection: E ^ C = {(b,y)}

Difference: C - E = {(r,a), (r,b), (a,v), (a,w), (a,x), (b,z)}

Inverse: inv E = {(y,b)}

Composition: I o E = {(a,y)}

Identity: ID = {(r,r), (a,a), (b,b), (w,w) … }

Transitive closure: C+ = {(r,a), (r,b), (r,v), (r,w), (r,x), (r,y), (r,z), (a,v), (a,w), (a,x), (b,y), (b,z)}

Reflexive transitive closure: C* = ID + C+
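
The operators above map directly onto set operations. The following sketch re-creates the relations C, I and E of the example typed graph as Python sets and evaluates each operator; the function names are chosen for this example, and the naive closure loop is just one simple way to compute C+, not a statement about how Grok implements it.

```python
def compose(r, s):
    """r o s: pairs (x, z) with x r y and y s z for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

def inverse(r):
    """inv r: swap source and target of every pair."""
    return {(y, x) for (x, y) in r}

def transitive_closure(r):
    """r+: keep composing with r until no new pairs appear."""
    closure = set(r)
    while True:
        new_pairs = compose(closure, r) - closure
        if not new_pairs:
            return closure
        closure |= new_pairs

C = {("r", "a"), ("r", "b"), ("a", "v"), ("a", "w"),
     ("a", "x"), ("b", "y"), ("b", "z")}
I = {("a", "b")}
E = {("b", "y")}

print(sorted(I | E))                  # union: [('a', 'b'), ('b', 'y')]
print(sorted(E & C))                  # intersection: [('b', 'y')]
print(sorted(inverse(E)))             # inverse: [('y', 'b')]
print(sorted(compose(I, E)))          # composition: [('a', 'y')]
print(len(C - E))                     # difference has 6 pairs
print(len(transitive_closure(C)))     # closure has 12 pairs
```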


TA Schemas for Box and Arrow Diagrams

  • A Schema in TA

    • Determines

      • Types of boxes

      • Types of edges

      • Allowed connectivity between edges

    • Supports inheritance in schemas

    • Also attributes (strings) on boxes & on edges

[Figure: schema-level boxes proc and var with edges call and ref, above instance-level boxes p, q, x and y; each instance box is tied to the schema by an instance edge, and the instance level also has a call edge and a ref edge.]

Malton WCRE 2005


Why Formalize Boxology? ’Cause it Makes Our Life Better

  • Clear understanding & clear specification

    • What does RSF mean?

    • Meaning is independent of implementation

    • Clarifies deeper concepts, e.g., expressiveness

  • Generality

  • Progress in reverse engineering

  • Progress in software architecture

  • Not just scribbling


Part 4. ROP: Relation-Oriented Programming & Grok-Like Languages

  • A paradigm shift


Example: Mickey Eats Swiss Cheese

The “eat” relation

  • Mickey . eat

    • Swiss

    • Roquefort

  • eat . Mickey

    • Garfield

    • Fluffy

  • eat o eat

    • (Garfield Swiss)

    • (Garfield Roquefort)

    • (Fluffy Swiss)

    • (Fluffy Roquefort)

  • eat+

    • ...

[Figure: graph of the “eat” relation over the nodes Garfield, Fluffy, Mickey, Nancy, Swiss and Roquefort.]
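
The three queries on this slide can be imitated with sets of pairs. The sketch below includes only the “eat” edges that the listed results imply; Nancy's edges are not recoverable from the transcript, so she is left out. The helper names `eaten_by` and `eaters_of` are invented stand-ins for Grok's `Mickey . eat` and `eat . Mickey`.

```python
# Edges implied by the query results on the slide (Nancy omitted: her edges are unknown).
eat = {
    ("Garfield", "Mickey"), ("Fluffy", "Mickey"),
    ("Mickey", "Swiss"), ("Mickey", "Roquefort"),
}

def eaten_by(node, rel):
    """Counterpart of `node . rel`: targets reachable from node in one step."""
    return {t for (s, t) in rel if s == node}

def eaters_of(rel, node):
    """Counterpart of `rel . node`: sources that reach node in one step."""
    return {s for (s, t) in rel if t == node}

def compose(r, s):
    """r o s: pairs (x, z) with x r y and y s z for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

print(sorted(eaten_by("Mickey", eat)))   # ['Roquefort', 'Swiss']
print(sorted(eaters_of(eat, "Mickey")))  # ['Fluffy', 'Garfield']
print(sorted(compose(eat, eat)))
```

The last line prints the four indirect pairs (each cat with each cheese), matching the `eat o eat` result listed on the slide.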


Example ROP/Grok Program: Is Relation R a Tree?

How you would program this test …


Grok Program: Is R a Tree?

Pseudo code

if R has no loops &

R has one root &

R has only single parents then

put “R is a tree”

Assume each node is a source or target of the contain C relation


Grok Program: Is R a Tree?

Pseudo code:

if R has no loops

Grok code:

if # ( R+ ^ ID ) = 0

Does transitive closure of R have any self-loops? Yes

[Figure: nodes a, b, c and d connected by four R edges.]


Grok Program: Is R a Tree?

Pseudo code:

if R has no loops &

R has one root

Grok code:

if # ( R+ ^ ID ) = 0 &

# (dom R - rng R) = 1

Does R have exactly one source? Yes

[Figure: nodes a through g, with the dom and rng sets of R marked.]


Grok Program: Is R a Tree?

Pseudo code:

if R has no loops &

R has one root &

R has only single parents

Grok code:

if # ( R+ ^ ID ) = 0 &

# (dom R - rng R) = 1 &

# ((R o inv R) - ID) = 0

Does my child have another parent? Yes

[Figure: nodes a, b, c and d with edges labeled R, inv R and R o inv R.]


Grok Program: Is R a Tree?

Pseudo code:

if R has no loops &

R has one root &

R has only single parents then

put “R is a tree”

Grok code:

if # ( R+ ^ ID ) = 0 &

# (dom R - rng R) = 1 &

# ((R o inv R) - ID) = 0

then

put “R is a tree”

Moral: Relational programming is not like low level (Java level) programming. Loops typically disappear.
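
For comparison, the same three-part test can be written with Python sets standing in for Grok relations. This is a sketch, not Grok itself: the helper names are invented, `#` becomes `len`, and the single-parent condition is taken to hold when (R o inv R) - ID is empty, that is, when no two distinct nodes share a child.

```python
def compose(r, s):
    """r o s: pairs (x, z) with x r y and y s z for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

def inverse(r):
    """inv r: swap source and target of every pair."""
    return {(y, x) for (x, y) in r}

def transitive_closure(r):
    """r+: keep composing with r until no new pairs appear."""
    closure = set(r)
    while True:
        new_pairs = compose(closure, r) - closure
        if not new_pairs:
            return closure
        closure |= new_pairs

def is_tree(r):
    """True when r has no loops, exactly one root, and only single parents."""
    all_nodes = {n for pair in r for n in pair}
    ident = {(n, n) for n in all_nodes}           # ID
    dom = {x for (x, _) in r}                     # dom R
    rng = {y for (_, y) in r}                     # rng R
    no_loops = len(transitive_closure(r) & ident) == 0
    one_root = len(dom - rng) == 1
    single_parents = len(compose(r, inverse(r)) - ident) == 0
    return no_loops and one_root and single_parents

contain = {("r", "a"), ("r", "b"), ("a", "v"), ("b", "y")}
diamond = {("r", "a"), ("r", "b"), ("a", "c"), ("b", "c")}
print(is_tree(contain))  # True
print(is_tree(diamond))  # False: c has two parents
```

The only explicit loop sits inside the closure helper; the test itself is three set expressions, which is the point the moral makes.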


Notation: Does it Matter?

By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and, in effect, increases the mental power of the race.

Alfred North Whitehead


Wins & Losses Using Tarski Algebra

  • Wins

    • Good for computing new edges and for finding properties of edges, e.g., nodes in loops, leaves, etc.

  • Losses

    • Not good for locating patterns involving several nodes, e.g., find complete connected sub-graphs
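
To see why multi-node patterns are awkward in a pure relation algebra, consider what a complete-subgraph search looks like when written directly. The sketch below names every node of the candidate pattern explicitly, which is exactly what edge-at-a-time operators such as composition do not let you do. The function name `cliques_of_size` and the sample edges are hypothetical.

```python
from itertools import combinations

def cliques_of_size(edges, k):
    """Node groups of size k in which every pair is connected (direction ignored)."""
    symmetric = edges | {(b, a) for (a, b) in edges}
    all_nodes = sorted({n for pair in symmetric for n in pair})
    return [
        group
        for group in combinations(all_nodes, k)
        if all((a, b) in symmetric for a, b in combinations(group, 2))
    ]

# Hypothetical dependency edges: a, b and c are mutually connected; d hangs off c.
deps = {("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")}
print(cliques_of_size(deps, 3))  # [('a', 'b', 'c')]
```

The test quantifies over whole groups of nodes at once, so it reads more naturally in a language with explicit node variables than in one built only from binary relation operators.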


Notation: Grok (Tarski) vs. Crocopat

My parent’s (P) children (C) are my (reflexive) siblings (S)

[Figure: nodes x, y and z joined by P, C and S edges.]

Grok: S := P o C

Crocopat: S(x,z) := EX(y, P(x,y) & C(y,z))

Should Crocopat add Tarski operators??
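
The two notations compute the same relation, which a small sketch can make visible. Below, the Grok form is mimicked by a `compose` helper and the Crocopat form by an explicit existential test over a middle node y. The sample data (x and z both having parent y) is an assumption based on the node labels in the figure, and the Python functions are illustrative stand-ins, not either tool's syntax.

```python
def compose(r, s):
    """r o s: pairs (x, z) with x r y and y s z for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

# Assumed sample data: y is the parent of both x and z.
P = {("x", "y"), ("z", "y")}          # P(child, parent)
C = {(p, c) for (c, p) in P}          # C(parent, child) is the inverse of P
all_nodes = {"x", "y", "z"}

# Grok style: S := P o C
S_grok = compose(P, C)

# Crocopat style: S(x,z) := EX(y, P(x,y) & C(y,z))
S_crocopat = {
    (x, z)
    for x in all_nodes
    for z in all_nodes
    if any((x, y) in P and (y, z) in C for y in all_nodes)
}

print(sorted(S_grok))        # [('x', 'x'), ('x', 'z'), ('z', 'x'), ('z', 'z')]
print(S_grok == S_crocopat)  # True
```

The algebraic form hides the middle node y entirely, while the logic form names it; that is the trade-off between brevity and the ability to talk about several nodes at once.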


Characterizing Grok-Like Languages

  • Relational

  • Useful for software analysis

  • Expressiveness

    • How powerful can a query be?

      • Codd algebra and Crocopat are more powerful.

    • How well can a query meet our needs? How writeable? How readable?

  • Performance of implementation

    • Can hold large graphs?

    • Fast enough to manipulate large graphs?


Performance of Grok-Like Languages

  • Size & speed: OK for Grok & Crocopat

    • All memory resident, no disk access

    • Hundreds of thousands of edges

    • Modeling million-line systems

    • Most operations not more than a few seconds

    • Crocopat scales up a bit more for transitive closure

    • House keeping, e.g., time to read files, is critical

    • Need to test on 64-bit implementations


Data Structures for Binary Relations

  • Tables, one for each type of relation: DBMS

  • Single table of triples: Grok

  • Linked lists (pointers and nodes): Lsedit, JGrok (caches sorted lists)

  • BDD (Binary Decision Diagram): RelView, Crocopat

    • Memory efficient storage of binary relations

    • Works well with dense graphs

    • Proven useful: RelView, Crocopat

    • Surprising (to me): BDD efficient for transitive closure


Grok-Like Languages

Discussion of Grok-Like Languages

PS: Paul Klint’s relational language ...


Progress: Using Grok-Like Languages

  • Enforce architecture rules. Holt 96, Feijs 98, Knodel 08

  • Lift dependency edges. Holt 98, Feijs 98

  • Find design pattern instances. Consens 98, Beyer 02

  • Find violations of patterns. Guo 99

  • Find anti-patterns. vanEmden 02, Feijs 98

  • Change impact analysis. Feijs 98

  • Specify extraction from syntax. Lin 08

  • Find source of dependency. Fahmy 01, Feijs 98

  • Locate uses of protocols. Wu 01

  • Type inference using transitive closure. vanDeursen 99


Conclusions

Grokking Software Architecture


Conclusions

  • Typed graphs nicely formalize various software structures

  • Software architecture can benefit from a ROP approach

  • Tarski algebra, added to boxology, is elegant

    • Does not handle multi-node patterns

  • Grok-like (ROP) languages are elegant and sufficiently efficient

    • ROP is high level, is faster, more reliable, more flexible

  • Lots of

    • Work done so far

    • Room for more work

