Analyzing UML Descriptions of Architectures Using Information Theory

Lionel Briand, Systems and Computer Engineering

Goals
• From proposal: assess error propagation, change propagation, and requirement propagation in the architecture(s) being analyzed
• Draw conclusions regarding the relative impact of one or several architectures on product external quality attributes

Working Assumptions
• A full OO development process is followed, i.e., Analysis, Architectural Design, etc.
• Analysis and Design use only standard UML representations, though their use may be tailored to our purposes
• Information theory is the main underlying theory used to develop analysis instruments
• We are mostly interested in what can be done at the end of architectural design

UML Development - Overview
[Figure: UML artifacts ordered over time by development phase, with a data dictionary spanning all phases. Requirements elicitation: scenarios, actors, use cases. Analysis: sequence diagrams, analysis class diagram(s), class state diagrams, operation contracts. System design: subsystem decomposition and interfaces, design collaboration diagrams, deployment diagram. Object design: design class diagrams. Implementation: implementation choices, implementation class diagram, program.]

Information Available at Architectural Design
• Use cases, use case diagram
• Analysis and architectural design class diagrams
• Analysis and architectural design sequence/collaboration diagrams
• Statecharts for multi-modal classes
• Subsystems and their public interface (API)
• Subsystem dependencies
• Mapping of subsystem instances to hardware (deployment)
• Data management, security control

MyTrip Analysis Class Diagram
[Figure: class diagram with the classes RouteAssistant, PlanningService, Trip, Location, Direction, Destination, Crossing, Segment]

MyTrip Subsystems
[Figure: the classes RouteAssistant, PlanningService, Trip, Location, Direction, Destination, Crossing, Segment grouped into PlanningSubsystem and RoutingSubsystem]

Subsystem Interface
• The set of public operations forms the subsystem interface, or Application Programming Interface (API)
• It includes the operations as well as their parameters, types, and return values
• Operation contracts (pre- and post-conditions) are also defined and accounted for by client subsystems; they can be considered part of the API

MyTrip Deployment Diagram
[Figure: deployment diagram with the nodes :OnBoardComputer and :WebServer and the components :PlanningSubsystem and :RoutingSubsystem]

New Classes and Subsystems
[Figure: PlanningSubsystem, RoutingSubsystem, and CommunicationSubsystem, with the classes RouteAssistant, PlanningService, Trip, Location, Destination, TripProxy, Direction, Crossing, SegmentProxy, Segment, Message, Connection]

MyTrip Data Storage
[Figure: RoutingSubsystem, PlanningSubsystem, CommunicationSubsystem, TripFileStoreSubsystem, MapDBStoreSubsystem]

Static versus Dynamic Architecture Information
• Subsystem interfaces and dependencies provide static information
• Interaction models provide dynamic information
• How do static and dynamic information relate to the quantitative factors of interest?
  • Change propagation (dependencies)
  • Requirements propagation (dependencies)
  • Error propagation (at run time)

Impact on Change and Requirements Propagation
• Assuming a subsystem SSi undergoes a change (a correction or a requirement change)
• Possibly relevant information:
  • Does the change affect the interface of SSi?
  • How many other subsystems depend on SSi?
  • How much (static) coupling is there between SSi and the other subsystems?
  • What is the size and complexity of the *used* interface between SSi and the other subsystems?
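
The questions above can be answered mechanically once the static dependencies are recorded. The following is a minimal sketch, not the author's instrument: the subsystem names echo the MyTrip example, but the dependency edges, the operation names, and the helper functions `dependents`, `used_interface`, and `impacted_by_change` are all hypothetical.

```python
# Hypothetical static dependencies: client -> {supplier: operations used}.
# Edges and operation names are assumptions made for illustration only.
USES = {
    "RoutingSubsystem": {
        "PlanningSubsystem": {"getTrip", "getSegment"},
        "CommunicationSubsystem": {"send"},
    },
    "PlanningSubsystem": {
        "CommunicationSubsystem": {"send", "receive"},
        "MapDBStoreSubsystem": {"lookupSegment"},
    },
    "CommunicationSubsystem": {},
    "MapDBStoreSubsystem": {},
}


def dependents(uses, supplier):
    """Client subsystems that use at least one operation of `supplier`."""
    return sorted(c for c, deps in uses.items() if supplier in deps)


def used_interface(uses, supplier):
    """Union of the supplier's operations actually used by its clients."""
    ops = set()
    for deps in uses.values():
        ops |= deps.get(supplier, set())
    return ops


def impacted_by_change(uses, supplier, changed_ops):
    """Clients that use at least one of the changed operations."""
    changed = set(changed_ops)
    return sorted(c for c, deps in uses.items()
                  if deps.get(supplier, set()) & changed)


print(dependents(USES, "CommunicationSubsystem"))
print(sorted(used_interface(USES, "CommunicationSubsystem")))
print(impacted_by_change(USES, "CommunicationSubsystem", ["receive"]))
```

In this toy data, a change to `receive` touches only one of the two clients of CommunicationSubsystem, which is why the size of the *used* interface can matter more than the raw number of dependents.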

Impact on Error Propagation
• Assuming a subsystem SSi contains a fault, how likely is it to cause an incorrect execution in another, client subsystem (i.e., to propagate the error)?
• Relevant information:
  • Message frequency between subsystems
  • Indirectly dependent on the system operational profile
  • An error in a rarely invoked subsystem/class is not likely to propagate
  • A fault in an operation where the ratio OutputRange / InputRange is small has low testability, but its errors are also less likely to propagate
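
The last point can be illustrated with a small experiment. This is a sketch under stated assumptions: the error is modeled as an off-by-one corruption of the value an operation works on, and `propagation_rate`, `identity`, and `bucket` are hypothetical names chosen for the illustration, not part of the presented approach.

```python
def propagation_rate(op, inputs, corrupt):
    """Fraction of inputs for which corrupting the value changes op's output."""
    changed = sum(1 for x in inputs if op(corrupt(x)) != op(x))
    return changed / len(inputs)


def off_by_one(x):
    # Assumed error model: the value handled by the operation is off by one.
    return x + 1


def identity(x):
    # Output range as large as the input range.
    return x


def bucket(x):
    # Many inputs collapse onto very few outputs (small OutputRange / InputRange).
    return x // 50


inputs = list(range(100))
print(propagation_rate(identity, inputs, off_by_one))  # 1.0
print(propagation_rate(bucket, inputs, off_by_one))    # 0.02
```

The operation with the small output range masks the corruption for all but two inputs, so the error rarely reaches a client; for the same reason, a test suite would rarely reveal it.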

Role of Information Theory
• Quantify the amount of information transmitted between two subsystems
  • Between subsystems at run time (dynamic information)
  • Between the people/teams designing the subsystems (static information)
• Characterize the distribution of information flow among subsystems, i.e., the topology of relationships among subsystems

Information Theory Main Concepts
• Channel
• Noise
• Equivocation
• Information transmitted
• Information loss
• How do they map to our problems? Can they serve any purpose in our context?

Subsystem Usage
[Figure: usage matrix with sender subsystems (A) SS1 to SS5 on one axis and receiver subsystems (B) SS1 to SS5 on the other]

Subsystem Usage: Cell Information
• n_ab = 0 / 1: a does not / does depend on b
• n_ab = number of operations in b used by a
• p_ab = probability that a uses any operation in b
• The first two capture static information, whereas the third is more complex to measure, as it captures dynamic information and assumes an operational profile
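
The three cell definitions can be derived from one record of who calls what. The sketch below is illustrative only: the call log is invented, the function name `usage_cells` is hypothetical, and p_ab is estimated as the share of all observed invocations that go from a to b, which is one possible reading of the definition above.

```python
from collections import Counter

# Hypothetical call log: (sender subsystem, receiver subsystem, operation).
CALLS = [
    ("SS1", "SS2", "op1"), ("SS1", "SS2", "op1"), ("SS1", "SS2", "op2"),
    ("SS1", "SS3", "op1"),
    ("SS2", "SS3", "op1"), ("SS2", "SS3", "op1"), ("SS2", "SS3", "op1"),
    ("SS2", "SS3", "op2"),
]


def usage_cells(calls):
    """Return the three cell variants as dicts keyed by (sender, receiver).

    Pairs that never appear in the log are absent and count as 0.
    """
    ops_used = {}     # (a, b) -> set of b's operations used by a
    freq = Counter()  # (a, b) -> number of observed invocations
    for a, b, op in calls:
        ops_used.setdefault((a, b), set()).add(op)
        freq[(a, b)] += 1
    total = sum(freq.values())
    depends = {pair: 1 for pair in ops_used}                 # n_ab = 0 / 1
    n_ops = {pair: len(ops) for pair, ops in ops_used.items()}  # operations used
    p = {pair: count / total for pair, count in freq.items()}   # assumed p_ab
    return depends, n_ops, p


depends, n_ops, p = usage_cells(CALLS)
print(depends.get(("SS1", "SS2"), 0))  # 1
print(n_ops.get(("SS1", "SS2"), 0))    # 2
print(p.get(("SS2", "SS3"), 0.0))      # 0.5
```

The first two dictionaries need only the design (which operations a client references), while the third needs invocation counts, which in practice would have to come from an operational profile rather than from a fixed log.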

Operation Usage
[Figure: usage matrix with sender operations (A) Op1 to Op5 on one axis and receiver operations (B) Op1 to Op5 on the other]

Operation Usage: Cell Information
• n_ab = 0 / 1: a does not / does use b
• n_ab = usage weighted by information flow
• p_ab = probability that a uses operation b

Example
[Figure: three example architectures, labeled (1), (2), and (3)]

Example II
• Subsystem usage case, n_ab = 0 / 1
• Total entropy: H(AB)
  • H1(AB) = log2 5
  • H2(AB) = log2 5
  • H3(AB) = log2 9
• Noise: H_A(B) = H(AB) - H(A)
  • H1_A(B) = 4/5 = 0.8
  • H2_A(B) = 3/5 log2 3 = 0.95
  • H3_A(B) = log2 3 = 1.58

Example III
• Equivocation: H_B(A) = H(AB) - H(B)
  • H1_B(A) = 3/5 log2 3 = 0.95
  • H2_B(A) = 4/5 = 0.8
  • H3_B(A) = log2 3 = 1.58
• Information transmitted: T(A:B) = H(A) - H_B(A)
  • T1(A:B) = 0.57
  • T2(A:B) = 0.57
  • T3(A:B) = 0
• Note: to compare architectures, we probably need to normalize by log2(|A| * |B|)
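
The definitions used in Examples II and III can be checked numerically. The sketch below is a minimal illustration: the two 0/1 matrices are hypothetical (they are not taken from the example figure) and were chosen only because their row and column sums reproduce the values listed for architectures (1) and (3). The function name `measures` and the normalization of H(AB) by log2(|A| * |B|) are likewise assumptions.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits; zero-probability cells contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


def measures(matrix):
    """matrix[a][b] holds non-negative usage weights (here 0/1).

    Cells are normalized into a joint distribution over (sender, receiver).
    """
    total = sum(sum(row) for row in matrix)
    joint = [[v / total for v in row] for row in matrix]
    p_a = [sum(row) for row in joint]        # sender marginal
    p_b = [sum(col) for col in zip(*joint)]  # receiver marginal
    h_ab = entropy(v for row in joint for v in row)
    h_a, h_b = entropy(p_a), entropy(p_b)
    equivocation = h_ab - h_b                # H_B(A)
    return {
        "H(AB)": h_ab,
        "noise": h_ab - h_a,                 # H_A(B)
        "equivocation": equivocation,
        "T(A:B)": h_a - equivocation,
        # Assumed normalization: divide by the maximum possible H(AB).
        "H(AB) normalized": h_ab / log2(len(matrix) * len(matrix[0])),
    }


# Hypothetical matrices, rows = senders, columns = receivers.
M1 = [[1, 1, 0],
      [1, 0, 1],
      [1, 0, 0]]   # 5 usage cells; row sums 2, 2, 1; column sums 3, 1, 1
M3 = [[1, 1, 1],
      [1, 1, 1],
      [1, 1, 1]]   # 9 usage cells, every sender uses every receiver

m1, m3 = measures(M1), measures(M3)
print(round(m1["noise"], 2), round(m1["equivocation"], 2), round(m1["T(A:B)"], 2))
print(round(m3["noise"], 2), round(m3["equivocation"], 2), round(m3["T(A:B)"], 2))
```

For `M1` this gives noise 0.8, equivocation 0.95, and T(A:B) 0.57; for `M3` it gives 1.58, 1.58, and 0, since a fully connected matrix carries no information about which sender pairs with which receiver.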

Interpretation
• (1) High total entropy -> high (static) level of dependency between subsystems / operations
• (2) High equivocation -> on average, subsystems / operations accommodate many uses from different subsystems / operations
• (3) High noise -> on average, subsystems / operations use many other subsystems / operations
• (1) has an impact on communications between teams / developers
• (2) and (3) provide additional insight explaining the source of entropy

Open Questions
• How to quantify and characterize usage?
  • Static information only (# services used)
  • Information flow weight (parameters, return value)
  • Frequency of invocation
  • Communication mode: asynchronous/synchronous
  • Distribution: remote/local invocation

Plan
• Differences between information theory (IT) concepts and simple count measures
• Characterize more formally how IT concepts react to changes in architectures (properties)
• Explore different ways of characterizing usage
• Relationship between IT concepts and architecture styles
