On the Design of a Testbed for AOSD

On the Design of a Testbed for AOSD Alessandro Garcia May 2007

Key Researchers • Lancaster – UK • Phil Greenwood, Alessandro Garcia • Eduardo Figueiredo, Nelio Cacho, Claudio Sant’Anna, Americo Sampaio, Awais Rashid • Recife – Brazil • Sergio Soares, Marcos Dosea, Paulo Borba • Kiel – Germany & Waterloo – Canada • Thiago Bartolomei • Lisbon – Portugal • Joao Araujo, Ana Moreira, Isabel Brito, Ricardo Argenton • Malaga – Spain • Monica Pinto, Lidia Fuentes • Salvador & Natal – Brazil • Thais Batista, Christina Chavez, Lyrene Silva • Other Contributors: Milan/Italy, Fraunhofer/Germany, Colorado/USA, Rio/Brazil, INRIA/France, Siemens/Germany…

AOSD: from embryonic techniques… • … to integration and testing in real-world settings • Growing need to assess AO methodologies • AOSD is becoming a sufficiently established research community • Need to compare AO approaches with other contemporary modularization approaches • Creation of an experimental environment for end-to-end evaluation of AOSD techniques • requirements • architecture • design • implementation • maintenance

Countless barriers • Available systems lack proper documentation • Difficult to find multiple AO and non-AO implementations of the same system • even worse: guaranteeing that the non-AO and AO decompositions are good ones is tricky • PhD research studies: difficult to find, or develop from scratch, a plausible “benchmark” • many risks: time-consuming task, inherent bias, etc. • collaboration is the only alternative left • Quantitative or qualitative indicators are often NOT ready for use • Replication of studies becomes a pain

A Testbed for AOSD • Towards more scientific and cohesive research • serve as a communication and collaboration vehicle • achieve widely-accepted exemplars, indicators, and data that can be reused and refined • facilitate the identification of “unknown” problems and benefits inherent to AOSD • effects throughout the lifecycle • bottlenecks specific to certain SE phases and their transitions • accelerate progress in the area by offering context to pinpoint technique-specific problems

Testbeds vs. Software Engineering • Recent recognition of the pivotal role of benchmarking in community cohesion and rapid progress [1] • Some fields have made some progress on benchmarking • e.g. reverse engineering, software refactoring, and program comprehension • However… • there is not much work on benchmarking modularization techniques • reports about the process of designing, instantiating, and evolving benchmarks in software engineering are rare • [1] S. Sim, S. Easterbrook, R. Holt. Using Benchmarking to Advance Research: A Challenge to Software Engineering. Proc. 25th Intl. Conf. on Software Engineering, Portland, Oregon, pp. 74-83, 3-10 May, 2003.

Timeline (figure; month markers: June, July, August, September, October, December 2006) • Testbed design: proposal accepted; choice of the benchmark goal; circulation of the questionnaire; contributions of artefacts start; 1st benchmark definition starts; indicators definition; choice of the change scenarios • Benchmark instantiations: preparation of the 1st pilot stability study starts; conclusion of the 1st study; preparation of the pilot AO requirements study starts • New needs identified, e.g.: concern interaction metrics; redefinition of metrics for CaesarJ; measurement reliability: tool support

Outline • Testbed design: the first benchmark • Testbed elements • Testbed instantiation • Testbed evolution • EA & the Testbed

Testbed design: the first benchmark • a number of decisions… such as: • application selection • it should be a system likely to be usable for many different assessment purposes • ten candidate applications were examined • Tourist Guide System, Pet Store, J2ME Games, CVS Eclipse Plug-In, OpenORB middleware system, etc. • each application was ranked according to weighted criteria

Selection Criteria • Examples • availability of AO and non-AO implementations (important) • availability of documentation (least important) • system generality (important) • heterogeneous types of concern interactions (most important) • aspects emerging in different phases (least important) • previous acceptance by the research community (most important) • paradigm neutral (most important) • a variety of crosscutting and non-crosscutting concerns (important) • e.g. widely-scoped vs. more localized ones • e.g. those requiring different uses of AO mechanisms • elegance of the AO and non-AO decompositions (important)
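
The weighted ranking of candidate applications could be sketched as follows. This is only an illustration: the slides give ordinal importance labels, not numbers, so the numeric weights (1, 2, 3), the shortened criterion keys, the rating scale, and the candidate names `system_a` and `system_b` are all assumptions.

```python
# Hypothetical numeric weights: the slides only say "least important",
# "important" and "most important".
WEIGHTS = {"least important": 1, "important": 2, "most important": 3}

# Importance labels as listed on the slide; the keys are shortened names
# invented for this sketch.
CRITERIA = {
    "ao_and_non_ao_implementations": "important",
    "documentation": "least important",
    "generality": "important",
    "heterogeneous_concern_interactions": "most important",
    "aspects_in_different_phases": "least important",
    "community_acceptance": "most important",
    "paradigm_neutral": "most important",
    "variety_of_concerns": "important",
    "elegant_decompositions": "important",
}


def weighted_score(ratings):
    """Sum rating * weight over all criteria; unrated criteria count as 0."""
    return sum(
        WEIGHTS[label] * ratings.get(name, 0)
        for name, label in CRITERIA.items()
    )


def rank(candidates):
    """Return candidate names ordered from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name]),
        reverse=True,
    )


# Made-up ratings for two placeholder candidates (not data from the study).
example = {
    "system_a": {"documentation": 5, "generality": 1},
    "system_b": {"paradigm_neutral": 3, "community_acceptance": 2},
}
print(rank(example))
```

A plain weighted sum is the simplest choice; the actual ranking procedure used for the ten candidates is not described in the slides.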

Health Watcher (HW) System [1] • Java version was developed by a company in Brazil • Several desirable properties • real-life system • non-trivial • Java and AspectJ implementations available • elegant OO and AO designs • some requirements, architecture and design documentation available • designed with modularity, reusability, maintainability and stability in mind • used in a reasonable number of studies that report well-accepted non-AO and AO design decompositions: • OOPSLA.02, FSE.06, S:P&E 2006, ICSM.06, EWSA.06, EA.06, ESEM.07, etc… • Important that multiple applications are used in the testbed to allow broad conclusions to be made • [1] Soares et al. Implementing Distribution and Persistence Aspects with AspectJ. OOPSLA 2002

Health Watcher Architecture

Artefacts Repository • Initially a limited number of approaches have been applied • Requirements (e.g. Use-Cases, V-Graph, AOV-Graph, AORE, AORA) • Architecture (e.g. UML, ACME, AO ADL, AspectualACME, AOGA) • Design (UML, Theme/UML, aSideML) • Implementation (Java, AspectJ, CaesarJ, AWED, JBoss) • Contributors reported: • strengths and weaknesses of the HW system • issues to be benchmarked

What issues to benchmark? • Questionnaires sent to a representative set of SE institutions • understand which areas the existing AO techniques… • … were mature enough • phases: requirements engineering, detailed design and implementation • e.g. “pointcut languages” • … in evolution stage (e.g. aspect interaction) • … target quality attributes (e.g. enhanced maintainability and reusability) • Investigation of typical “ilities” in previous empirical studies involving modularization techniques (e.g. OO, AO, etc…): • modularity, maintainability and reusability • e.g. software stability • reliability • e.g. error proneness • specification effort and outcome quality • e.g. time spent, recall, and precision

What issues to benchmark? • Impact of AO mechanisms on particular SE activities or phases • phases are often assessed in isolation • desirable to determine the effects of one phase on subsequent phases • E.g. how do changes in my AO program impact the stability of the architecture decomposition (compared with OO program changes)? • Which motivating comparison? • OO vs. AO? or • Multiple AO techniques

Enhancing HW System… • … to include changes and produce releases • both widely-scoped and localized changes • changes to both CCCs and non-CCCs • different categories: perfective changes and refactorings, corrective changes, evolutionary changes, etc… • … to address the identified weaknesses w.r.t. • our original criteria • e.g. include localized CCCs, such as design patterns • feedback received from the contributors • e.g. need for improving the categories of aspect interactions • … based on the history of HW changes in the deployed Java system

Stability Indicators • Generality • indicators not tied to one specific artefact/technique type • Traceability in the assessment process • support assessment of effects of one phase on subsequent phases • SE-wide properties • modularity: cohesion, coupling, SoC, interface simplicity, etc… • change impact and stability • concern interaction

Testbed Elements Consequence: more mature elements Design Stability Study

Outline • Testbed design: the first benchmark • Testbed elements • Testbed instantiation • study on architecture and implementation stability [1] • Java vs. AspectJ vs. CaesarJ • study on AO requirements engineering [2] • [1] P. Greenwood et al. On the Impact of Aspectual Decompositions on Design Stability: An Empirical Study. Proceedings of the 21st European Conference on Object-Oriented Programming (ECOOP.07), July 2007, Germany. (to appear) • [2] A. Sampaio et al. A Comparative Study of Aspect-Oriented Requirements Engineering Approaches. Proc. of the 1st International Symposium on Empirical Software Engineering and Measurement (ESEM.07), September 2007. (to appear)

Instantiation of the Benchmark (Design Stability Study) • Application of the selected metric suites to each of the artefacts generated • Java, AspectJ, and CaesarJ programs • Non-AO architecture (N-Tier architecture) vs. AO architecture • Multi-dimensional analysis, including: • modularity sustenance • observance of architectural and design ripple effects • which categories of aspects (and respective interfaces) have or have not exhibited stability • satisfaction of basic design principles through the releases

Instantiation of the Benchmark (Design Stability Study) [1] • Outcomes overview + Concerns aspectized upfront tend to show superior modularity stability + AO solutions required less intrusive modification in modules + Aspectual decompositions have demonstrated superior satisfaction of the Open-Closed principle • Highlighted the “fragile pointcut” problem: ripple effects observed in interacting aspect interfaces • AO modifications tended to propagate to seemingly unrelated modules + Architectural ripple effects observed only in the OO solution: undesirable changes relative to exception handling in multiple layers • [1] P. Greenwood et al. On the Impact of Aspectual Decompositions on Design Stability: An Empirical Study. Proceedings of the 21st European Conference on Object-Oriented Programming (ECOOP.07), July 2007, Germany.

Instantiation of the Benchmark (AO Requirements Study) [2] • [2] A. Sampaio et al. A Comparative Study of Aspect-Oriented Requirements Engineering Approaches. Proc. of the 1st International Symposium on Empirical Software Engineering and Measurement (ESEM.07), September 2007. (to appear)

Instantiation of the Benchmark (AO Requirements Study) • comparison of four eminent AORE approaches • time effectiveness (person-minutes) • accuracy of the outcomes they produce • precision and recall of the models produced • example of research question: • which activities are the main bottlenecks in terms of effort for each AORE approach? • target: 1st author interested in learning which tasks should be automated in the EA-Miner tool • main outcome: composition specification and conflict analysis
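
Precision and recall of a produced requirements model can be computed with the standard set-based definitions, as in the minimal sketch below. It assumes the model is compared against a reference list of concerns; how the study built its reference set is not stated in the slides, and the concern names are placeholders.

```python
def precision_recall(identified, reference):
    """Return (precision, recall) of identified items against a reference set.

    precision = |identified AND reference| / |identified|
    recall    = |identified AND reference| / |reference|
    An empty denominator yields 0.0 rather than raising an error.
    """
    identified, reference = set(identified), set(reference)
    true_positives = len(identified & reference)
    precision = true_positives / len(identified) if identified else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Placeholder concern names, not results from the study.
found = {"persistence", "distribution", "logging"}
expected = {"persistence", "distribution", "security", "concurrency"}
print(precision_recall(found, expected))
```

Reporting both values alongside person-minutes lets effort be weighed against the quality of the outcome for each approach.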

Timeline - Evolution (figure; month markers: June, July, August, September, October, December 2006) • Same events as the earlier timeline: proposal accepted; testbed design; choice of the benchmark goal; circulation of the questionnaire; contributions of artefacts start; 1st benchmark definition starts; indicators definition; benchmark instantiations; 1st pilot stability study starts; conclusion of the 1st study; requirements study • Feedback annotations: lack of architectural changes: added EH; fix bugs encountered; improvement of “alignments”; metrics redefinitions thanks to CaesarJ mechanisms; common naming scheme; common activities; more details in the architecture documentation; refine architecture metrics; improved definition of concern interaction metrics

Evolution: feedback from the studies • new categories of crosscutting concerns • implementation level • checked exceptions: EH aspectization is more challenging • use of exception-softening mechanism • complex, context-sensitive exception handlers • use of around advice • detailed design level: • use of design patterns • plenty of different uses of AO mechanisms (role-based composition, multiple inheritance, etc…) • Particular aspect interactions still not investigated • more than two aspects sharing the same join point • no presence of pointcuts picking out advice executions

EA and the Testbed • Status: • repository of AO and non-AO artifacts • no changes have been applied • Improvements are necessary, e.g.: • there is no detailed problem description • only use cases; requirements information is missing • most of the requirements-level aspects are directly mapped to architecture and implementation aspects • alignment of existing AO and non-AO artefacts needs to be improved • some architecture models are abstract, and some architectural views are missing

EA and the Testbed • Elements of the testbed repository have proven useful even for unanticipated assessment contexts, e.g. • AO measurement (U. Waterloo – Thiago Bartolomei) • dynamic AO metrics (U. Milan – Walter Cazzola) • AO design heuristics (U. Lancaster – Figueiredo, Sant’Anna, Garcia) • architectural styles and aspects (U. Bologna, U. Lancaster, UFBA, UFRN) • Used and extended in several ways • Investigate the interplay of AO requirements composition mechanisms and several attributes • requirements description stability • traceability • change impact analysis • understandability • etc…

EA and the Testbed • Other lessons learned • it is very difficult to design a proper testbed without the effective participation of the technique experts • e.g. J. Araujo and A. Moreira (AORE technique) • e.g. T. Bartolomei from CaesarJ team • testbed is an effective collaboration/communication tool • enables developers/researchers of emerging EA techniques to communicate • a common set of artefacts • improved problem understanding • not targeted to one specific phase • developers gain an improved awareness of all development phases • enables focused discussions at EA workshops • we need more funding $$$ 

Future Expansions • Other benchmarks • … for assessing stability in early aspects techniques • … for error proneness • Expand testbed elements • New applications • Apply more approaches • Develop new metrics • Testbed repository is a semi-open resource for now • The elements used and generated in the stability study are available at: www.comp.lancs.ac.uk/~greenwop/ecoop07/

Contributing to the Testbed • Aim is to become an extensive open resource. • Only a limited number of approaches initially applied to the testbed. • Requires further contributions from the SE community. • Applications • New approaches • Metric suites

Summary • Provided an overview of the various elements that contribute to the testbed. • Illustrated how traceability can be achieved across development phases when assessing approaches. • Gave a concrete example of how the testbed can be instantiated; the same can be done in other development phases. • Highlighted the benefits of using a common testbed for the community.

Other issues • Important that the testbed is an open resource. • Necessary for users of the testbed to contribute results gathered. • Repository of data • Guidelines on how to select the benchmarks and indicators (and previous data) • Validation of the benchmark (which issues should we consider?) • Plethora of new composition mechanisms in AOSD • How much should they affect the benchmark design? • E.g. CaesarJ has feature-oriented programming mechanisms that are most suited to PLs

Outline • Provide an overview of the testbed. • Aims • Elements • Design Decisions • Detail the targeted development phases. • Approaches • Metrics • Example instantiation of the testbed. • Stability case-study at the implementation phase. • Subset of results. • Comparison of AORE approaches. • Results of the stability case-study • Benefits and future work

Testbed design: the first benchmark • Answer key questions regarding the effectiveness of AOSD throughout the development life-cycle. • Provide a valuable resource to the software engineering community. • A common testbed used to assess and compare AO and non-AO approaches. • A communication vehicle for AO proponents.

Possible focus of upcoming benchmarks • design stability • error proneness • impact of aspects in adjacent phases • e.g. requirements -> architecture (traceability, quality of decisions made, etc...)

Achieving Traceability • Phases are often assessed in isolation • Desirable to determine the effects of one phase on subsequent phases • A number of attributes are common across development phases • Concern Interaction • Modularity • Stability • Change Impact

Requirements Phase • Number of approaches applied • Viewpoint-based AORE • AO Requirement Analysis (AORA) • MDSOC • AOV-Graph • Difficult to compare varied approaches • Testbed project initiated related work for comparing AORE approaches [2] • Provides common schemes for comparison • Some commonalities exist for comparison • Effort – time to produce documentation • Modularity • [2] A. Sampaio et al, “A Comparative Study of Aspect-Oriented Requirements Engineering Approaches”, Proc. of the 1st International Symposium on Empirical Software Engineering and Measurement (ESEM), September 2007. (to appear)

Architecture Design Phase • A variety of architecture approaches applied. • ACME, AspectualACME, AO-ADL, Aspectual Template, AOSD-Europe Notation. • A specific metric suite has been developed for assessing architecture design approaches. • Coupling • Cohesion • Interface Complexity • SoC • Interactions • Other general attributes to be measured. • Effort • Stability • Change impact • These metrics allow correlation to the requirements phase.
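
As an illustration of a coupling measure at the architecture level, the sketch below counts fan-in and fan-out per component from a dependency map. This is a generic formulation chosen for the example, not the definition used in the testbed's metric suite (which the slides do not give); the component names are hypothetical.

```python
def coupling(dependencies):
    """Compute (fan_in, fan_out) per component.

    dependencies maps each component to the components it depends on.
    fan_out[c] = number of distinct other components c depends on.
    fan_in[c]  = number of distinct other components that depend on c.
    Self-dependencies are ignored.
    """
    components = set(dependencies)
    for targets in dependencies.values():
        components |= set(targets)

    fan_in = {c: 0 for c in components}
    fan_out = {c: 0 for c in components}
    for source, targets in dependencies.items():
        for target in set(targets) - {source}:
            fan_out[source] += 1
            fan_in[target] += 1
    return fan_in, fan_out


# Hypothetical layered architecture with one crosscutting component.
architecture = {
    "gui": {"business"},
    "business": {"data"},
    "distribution": {"gui", "business"},
}
fan_in, fan_out = coupling(architecture)
print(fan_in, fan_out)
```

Computing the same counts on the non-AO and AO versions of a model, release by release, gives one simple way to compare how coupling evolves.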

Instantiation of the Benchmark (Implementation Phase) (1) • Aim was to compare/assess stability of AO and non-AO approaches. • Involved selecting various elements provided by the testbed. • Application, metric suites, etc. • Apply new approaches to base artefacts (Java/AspectJ implementation) to create new artefacts. • CaesarJ

Use the timeline to give examples • How the studies fed back into the definition of the benchmarks • Change scenarios (different HW releases) • Can be reused for studies involving traceability, reuse, effectiveness of change impact analysis techniques, etc. • indicators (concern interaction analysis) • common naming scheme

Results gathered can influence future development of the testbed • Metrics collected in the stability study highlighted deficiencies in some changes. • Added additional changes to improve coverage. • Development of new metrics. • Modularity metrics unable to capture all variations in the code due to their level of granularity. • Developed and applied change propagation metrics to be able to analyse all phenomena • to explicitly investigate the differences between AspectJ and CaesarJ.
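
A change propagation measure between two releases can be sketched by counting which modules were added, removed, or changed. This is a simplified illustration under stated assumptions, not the metric definitions used in the stability study: each release is represented as a map from module name to a content fingerprint (for example a hash of the source file), and the module names are hypothetical.

```python
def change_impact(previous, current):
    """Compare two releases given as {module_name: content_fingerprint}.

    Returns sorted lists of modules that were added, removed, or changed
    (present in both releases with a different fingerprint).
    """
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    changed = sorted(
        module
        for module in set(previous) & set(current)
        if previous[module] != current[module]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical releases; fingerprints are placeholder strings.
release_1 = {"Facade": "v1", "ComplaintRecord": "v1"}
release_2 = {"Facade": "v2", "ComplaintRecord": "v1", "ErrorHandling": "v1"}
print(change_impact(release_1, release_2))
```

Running the same comparison over the Java, AspectJ, and CaesarJ versions of each release would show how far a given change scenario spreads in each decomposition.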

The Testbed as a Communication Tool • Enables developers/researchers across phases to communicate. • A common set of artefacts. • Improved problem understanding. • Not targeted to one specific phase. • Developers gain an improved awareness of all development phases. • Enables focused discussions at workshops etc.

Need to establish commonalities between approaches in order for comparisons to be made • tasks • e.g. concerns, concern interaction, change propagation, modularity

Instantiation of the Benchmark (Design Stability Study) • both architecture and implementation measures

Instantiation of the Benchmark (AO Requirements Study) • Outcomes Overview • composition is the cornerstone of AORE • Composition specification is a time-consuming activity • improves change management and conflict analysis • this trade-off requires further analysis • Conflict analysis is also a significant task • composition specification and conflict analysis • future: comparison with non-AO RE approaches
