
Debugging Components

Debugging Components. Koen De Bosschere, RUG-ELIS. Problem description: components are loosely coupled and do not have a common notion of time; components have contracts (e.g. timing contracts); components are activated asynchronously by the scheduler.


Presentation Transcript


  1. Debugging Components Koen De Bosschere RUG-ELIS

  2. Problem description • Components are loosely coupled and do not have a common notion of time • Components have contracts (e.g. timing contracts) • Components are activated asynchronously by the scheduler • Components can be replaced at run-time ⇒ Traditional debugging techniques are not adequate

  3. Traditional debugging inadequate? • Execution is non-deterministic: no two runs can be guaranteed to be identical (scheduling, timing differences, replacing components, …), so cyclic debugging is not applicable • Timing is part of correctness: the intrusion caused by the debugger might violate the contracts • Input might not be repeatable if generated by an external device (e.g. camera or microphone). Debugging is often a matter of trial and error, and a good portion of luck and experience is needed; the use of multithreading only adds to that.

  4. Two approaches • On-chip debugging techniques • Software debugging techniques

  5. On-chip debugging techniques • Logic Analyser • ROM monitor • ROM emulator • In-Circuit Emulator • Background Debug Mode • JTAG. These add-ons take up valuable chip area (up to 10%). Hardware manufacturers believe in design for debuggability.

  6. Software debugging techniques • Execution must be repeatable to allow for cyclic debugging • Program flow must be identical • Input must be identical • Execution must be observable to allow for debugging • We must be able to use breakpoints, watchpoints, etc. without altering the program flow ⇒ Re-execution must be deterministic

  7. Example code

    class G {
        public static int global = 5;
    }

    class Thread1 extends Thread {
        public void run() { G.global += 2; }
    }

    class Thread2 extends Thread {
        public void run() { G.global *= 3; }
    }

    class Main {
        public static void main(String[] args) throws InterruptedException {
            Thread1 t1 = new Thread1();
            Thread2 t2 = new Thread2();
            G.global = 5;
            t1.start(); t2.start();
            t1.join(); t2.join();
            System.out.println("global" + G.global);
        }
    }

  8. Possible executions [Diagram: four interleavings of the two threads' load (L), add/multiply and store (S) operations on G.global, all starting from G.global = 5 and ending in G.global = 15, 7, 21 or 17 depending on the order chosen by the scheduler.]
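The non-determinism above is easy to observe in practice. Below is a minimal harness, not part of the original slides, that runs the unsynchronized example repeatedly and tallies the final values; the class name RaceDemo and the iteration count are arbitrary.

    import java.util.Map;
    import java.util.TreeMap;

    class RaceDemo {
        static int global;

        public static void main(String[] args) throws InterruptedException {
            Map<Integer, Integer> outcomes = new TreeMap<>();
            for (int i = 0; i < 100_000; i++) {
                global = 5;
                Thread t1 = new Thread(() -> global += 2);   // load, add 2, store
                Thread t2 = new Thread(() -> global *= 3);   // load, multiply by 3, store
                t1.start(); t2.start();
                t1.join(); t2.join();
                outcomes.merge(global, 1, Integer::sum);
            }
            // The possible final values are 7, 15, 17 and 21; which ones show up,
            // and how often, differs from run to run, which is exactly why
            // cyclic debugging fails here.
            System.out.println(outcomes);
        }
    }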

  9. Causes of non-determinism • Sequential programs: • Input • Certain system calls (time) • … • Parallel programs: • Race conditions on shared variables • Load balancing • …

  10. Execution Replay • Goal: make repeated equivalent re-executions possible • Method: two phases • Record phase: record all non-deterministic events during an execution in a trace file • Replay phase: use trace file to produce the same execution • Question: what & where to trace? • Synchronization Replay • Input Replay • Data race detection
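As a rough illustration of the two phases, here is a minimal sketch, not taken from the tools mentioned later, of a trace file that records events during one run and hands them back during replay; the class and method names are hypothetical.

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.Iterator;

    final class TraceFile implements AutoCloseable {
        enum Mode { RECORD, REPLAY }

        private final PrintWriter writer;       // record mode only
        private final Iterator<String> events;  // replay mode only

        TraceFile(Mode mode, Path path) throws IOException {
            writer = (mode == Mode.RECORD) ? new PrintWriter(Files.newBufferedWriter(path)) : null;
            events = (mode == Mode.REPLAY) ? Files.readAllLines(path).iterator() : null;
        }

        /** Record phase: append one non-deterministic event to the trace. */
        void record(String event) {
            writer.println(event);
        }

        /** Replay phase: return the next recorded event so the run can be steered. */
        String next() {
            return events.next();
        }

        @Override
        public void close() {
            if (writer != null) writer.close();
        }
    }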

  11. Requirements for execution replay • Recording must have low intrusion • Replay must be accurate • The record phase must be space efficient • The replay phase must be time efficient

  12. Synchronization Replay [Diagram: Execution 1 is recorded into a trace file capturing the happens-before relation between synchronization operations; during Execution 2 the trace file is used to replay the same ordering.]
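On the record side, one way to capture the happens-before relation is to stamp every synchronization operation with a logical (Lamport-style) timestamp. The sketch below uses a single global counter and hypothetical names, which is a simplification of what a real tool would record.

    import java.util.concurrent.atomic.AtomicInteger;

    final class SyncRecorder {
        private static final AtomicInteger logicalClock = new AtomicInteger();

        /** To be called at every monitor entry / lock acquire during the record phase. */
        static void onSynchronization(String threadName, String lockId) {
            int timestamp = logicalClock.incrementAndGet();
            // A real tool would write this to a trace file; printing keeps the sketch short.
            System.out.println(timestamp + " " + threadName + " acquires " + lockId);
        }
    }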

  13. Input replay [Diagram: the application interacts with the kernel through I/O instructions and system calls; these are the points where input can be captured and later regenerated.]

  14. Example code

    class G {
        public static int global = 5;
        public static Object s = new Object();
    }

    class Thread1 extends Thread {
        public void run() { synchronized (G.s) { G.global += 2; } }
    }

    class Thread2 extends Thread {
        public void run() { synchronized (G.s) { G.global *= 3; } }
    }

    class Main {
        public static void main(String[] args) throws InterruptedException {
            Thread1 t1 = new Thread1();
            Thread2 t2 = new Thread2();
            G.global = 5;
            t1.start(); t2.start();
            t1.join(); t2.join();
        }
    }

  15. Possible executions [Diagram: the interleavings of the example again; because the +2 and *3 updates now execute atomically under the lock G.s, only the two serialized executions are possible, yielding G.global = 21 (t1's update first) or G.global = 17 (t2's update first).]

  16. Record phase [Diagram: two recorded executions of the synchronized example (ending in G.global = 17 and G.global = 21); every operation is labeled with a logical timestamp and the trace file stores the timestamps of the synchronization operations, capturing the order in which the threads acquired the lock.]

  17. Replay phase [Diagram: the same two executions replayed from their traces; each thread waits until its recorded logical time before proceeding, so the original interleavings, and hence the final values G.global = 21 and G.global = 17, are reproduced exactly.]

  18. Execution Replay in Java • Requires recording the choices made by synchronization constructs like synchronized, wait, signal, etc. • During replay, the synchronization operations are replaced by waitforlogicaltime(t) operations (see the sketch below).
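The slides only name the operation waitforlogicaltime(t); the sketch below shows one plausible shape for it, assuming for simplicity that the recorded timestamps form a single total order over all synchronization operations. The class name ReplayClock and the method operationDone are illustrative.

    final class ReplayClock {
        private int logicalTime = 0;   // timestamp of the last replayed operation

        /** Block until every operation with a smaller recorded timestamp has been replayed. */
        synchronized void waitForLogicalTime(int t) throws InterruptedException {
            while (logicalTime != t - 1) {
                wait();
            }
        }

        /** Mark the operation with timestamp t as replayed and wake up waiting threads. */
        synchronized void operationDone(int t) {
            logicalTime = t;
            notifyAll();
        }
    }

A thread that recorded timestamp t for a lock acquisition would call waitForLogicalTime(t) before entering the synchronized block and operationDone(t) afterwards, which forces the recorded acquisition order during replay.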

  19. Input Replay • Execution will only yield the same results if the input is repeatable too • Solution: record the input by capturing all I/O events and regenerate them during replay • Input replay generates a huge amount of data…
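As an illustration at the Java level (the actual tool intercepts system calls with ptrace, as slide 21 notes), the hypothetical stream wrapper below logs every byte read from a real device in record mode and serves the logged bytes instead of the device in replay mode. Logging every byte is what makes input replay so data hungry.

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    final class ReplayInputStream extends InputStream {
        private final InputStream realInput;   // e.g. a camera or microphone stream (record mode)
        private final InputStream recorded;    // previously captured log (replay mode)
        private final OutputStream log;        // where captured bytes are written (record mode)

        ReplayInputStream(InputStream realInput, OutputStream log) {   // record mode
            this.realInput = realInput;
            this.log = log;
            this.recorded = null;
        }

        ReplayInputStream(InputStream recorded) {                      // replay mode
            this.realInput = null;
            this.log = null;
            this.recorded = recorded;
        }

        @Override
        public int read() throws IOException {
            if (recorded != null) {
                return recorded.read();          // replay: regenerate the captured input
            }
            int b = realInput.read();
            if (b != -1) {
                log.write(b);                    // record: capture the byte for later replay
            }
            return b;
        }
    }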

  20. Data race detection [Diagram: two unsynchronized interleavings in which the loads and stores of both threads overlap, yielding G.global = 15 and G.global = 7.] • A data race occurs if a store/store, load/store or store/load pair from two different threads executes in parallel on the same location. • Automatic data race detection: check the data race condition on all load/store pairs that are not ordered.
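The race condition above can be checked mechanically once every access carries enough ordering information. The sketch below assumes each access is tagged with a vector clock so that "not ordered" can be tested; the names are illustrative, not the RecPlay implementation.

    final class RaceCheck {
        record Access(long address, boolean isStore, int[] vectorClock) {}

        /** a happens before b iff a's vector clock is <= b's in every component. */
        static boolean happensBefore(int[] a, int[] b) {
            for (int i = 0; i < a.length; i++) {
                if (a[i] > b[i]) return false;
            }
            return true;
        }

        /** Two accesses race if they touch the same location, at least one is a store,
            and neither happens before the other (store/store, load/store or store/load). */
        static boolean isDataRace(Access x, Access y) {
            return x.address() == y.address()
                    && (x.isStore() || y.isStore())
                    && !happensBefore(x.vectorClock(), y.vectorClock())
                    && !happensBefore(y.vectorClock(), x.vectorClock());
        }
    }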

  21. Implementation • RecPlay for Solaris (SPARC) and Linux (x86) • Uses JiTI for dynamic instrumentation • Record overhead: 1.6% • JaReC for Java (on top of the JVM) • Uses JVMPI for dynamic instrumentation • Record overhead: 25% on average • Input-Replay for Linux (Tornado) • Uses ptrace

  22. Performance modeling of the JVM • A Java workload is separable into different components • Virtual Machine (SUN, IBM, JikesRVM, JRockit, …) • Java application (SPECjvm98, SPECjbb2000, …) • Input to the application • Measure execution characteristics (AMD Duron) • IPC, branch & cache behavior, … • Statistical analysis • Principal Components Analysis • Cluster Analysis • Quantify the difference between SPECcpu2000 and Java workloads
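For the Principal Components Analysis step, a rough idea of the computation is sketched below; this is not the actual analysis code from the project. Each workload is treated as a vector of measured characteristics, the data is mean-centered, and the dominant principal component is obtained by power iteration on the covariance matrix.

    final class Pca {
        /** Returns the first principal component of the rows of 'data'
            (one row per workload, one column per measured characteristic). */
        static double[] firstComponent(double[][] data) {
            int n = data.length, d = data[0].length;

            // Mean-center each characteristic (scaling to unit variance omitted for brevity).
            double[][] x = new double[n][d];
            for (int j = 0; j < d; j++) {
                double mean = 0;
                for (double[] row : data) mean += row[j] / n;
                for (int i = 0; i < n; i++) x[i][j] = data[i][j] - mean;
            }

            // Covariance matrix of the characteristics.
            double[][] cov = new double[d][d];
            for (int j = 0; j < d; j++)
                for (int k = 0; k < d; k++) {
                    for (int i = 0; i < n; i++) cov[j][k] += x[i][j] * x[i][k];
                    cov[j][k] /= (n - 1);
                }

            // Power iteration for the dominant eigenvector.
            double[] v = new double[d];
            java.util.Arrays.fill(v, 1.0 / Math.sqrt(d));
            for (int iter = 0; iter < 1000; iter++) {
                double[] w = new double[d];
                for (int j = 0; j < d; j++)
                    for (int k = 0; k < d; k++) w[j] += cov[j][k] * v[k];
                double norm = 0;
                for (double wj : w) norm += wj * wj;
                norm = Math.sqrt(norm);
                for (int j = 0; j < d; j++) v[j] = w[j] / norm;
            }
            return v;
        }
    }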

  23. JVM Results • Java workloads mostly cluster by • benchmark for large workloads • VM for small workloads • SPECjvm98: the small input set is not representative of the execution behavior of the large input set • Comparing Java vs. C: • No significant difference in IPC, number of branches, or data TLB behavior • Significant differences in data cache behavior, instruction TLB, and return stack usage

  24. PCA for SPECjvm98 – s1 input set

  25. PCA for SPECjvm98 – s100 input set

  26. PCA for SPECcpu vs. Java

  27. Conclusions • Debugging multithreaded/distributed systems is not an easy task • Faithful record/replay requires extra resources (time + space) • Record/replay enables the developer to effectively debug a complex multithreaded program • The choice of Java VM has an impact on the low-level behavior of the processor. Java benchmarks should be large enough to be realistic.

  28. Output • 14 refereed conference papers (OOPSLA, ParCo, WBT, …) • 12 workshop papers • 5 journal publications (FGCS, CACM, Parallel Computing, …) • 1 PhD • 12 master's theses • Java and Embedded Systems Symposium, Nov 2002 [150 people] • AADEBUG 2003 workshop, Sept 2003 [60 people]
