Explore the intricacies of causal inference and its impact on analyzing the performance of programs across various computer systems. Delve into practical deployment of learning algorithms, the importance of qualitative properties, and philosophical insights.
Learning Causal Models of Multivariate Systems and the Value of it for the Performance Modeling of Computer Programs. Jan Lemeire, December 19th 2007. Supervisor: Prof. dr. ir. Erik Dirkx
Learning causal models for the performance analysis of programs executed on various computer systems. • Intermezzo I: Causal inference. • Practical deployment of the causal learning algorithms. • Philosophical and theoretical study of causal inference. • Intermezzo II: Kolmogorov Minimal Sufficient Statistics. • The importance of qualitative properties. Causal Inference & Performance Analysis
What is Parallel Processing? [Figure: computational work divided over a parallel system] Ideally: Speedup = number of processors
Parallel Overhead • Speedup = 2.55 • Overhead = time the processors are not spending on useful work = lost processor cycles
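The link between overhead and speedup can be made concrete with a small calculation. The sketch below assumes the common accounting in which the total processor time (number of processors times parallel runtime) equals the useful sequential work plus the overhead. The function names and the timings (100 s sequential, 40 s on 4 processors) are hypothetical and are not the measurements behind the speedup of 2.55 quoted above.

```python
def speedup(t_seq, t_par):
    """Speedup of a parallel run relative to the sequential run."""
    return t_seq / t_par

def overhead_time(t_seq, t_par, processors):
    """Processor time not spent on useful work (lost processor cycles)."""
    return processors * t_par - t_seq

def speedup_from_overhead(t_seq, t_ovh, processors):
    """Same speedup, rewritten as p / (1 + overhead / useful work)."""
    return processors / (1.0 + t_ovh / t_seq)

# Hypothetical timings, in seconds.
t_seq, t_par, p = 100.0, 40.0, 4
lost = overhead_time(t_seq, t_par, p)
print(speedup(t_seq, t_par), lost, speedup_from_overhead(t_seq, lost, p))
```

With zero overhead the second form reduces to the ideal case, speedup = number of processors.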
Overhead Analysis: impact of overhead on speedup
Experimental Parallel Performance Analysis: Data Acquisition
EPDA: Multivariate Analysis
[Presenter note: possibly show the experiments as an animation, without (a) and (b)] Intermezzo I: Causal Inference
Causal Inference for Performance Analysis. Its utility is based on the following properties: • Dependency analysis: how variables relate. • Markov property. • A causal model corresponds to a decomposition.
Execution of the program gives cache misses. [Figure: variables datatype (integer, float, double,…), data size in bytes, and cache misses]
Markov Property [Figure: two variables are correlated, but become independent given information about the data size] • Provides explanations • Differentiates direct from indirect relations
Can we Observe Causal Relations? [Figure: a correlation (~) is observed, but more than one causal structure can produce it]
What is Causality? A causal relation denotes a mechanism: a variable is 'produced' by its causes. However, causal relations are not directly observable. Bertrand Russell: "Causality is a relic of a bygone age." Judea Pearl: "Mmmh". But: we want to learn something about the underlying system (the goal of statistics).
Second Cause
V-structure Property: the angle is independent of the gunpowder, but the two become dependent when the distance is known.
Conditional Independencies Make Causal Inference Possible • Conditional independencies follow from a causal structure, irrespective of the mechanisms. • Markov property • V-structure property
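Both properties can be checked on simulated data. The sketch below is only an illustration under strong assumptions: the relations are linear with Gaussian noise, all coefficients are invented, and partial correlation stands in for a general conditional independence test. The variable names follow the cache-miss and cannon examples from the slides, with `element_size` as a hypothetical numeric stand-in for the datatype.

```python
import random
from math import sqrt

def corr(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y once the linear effect of z is removed."""
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(0)
n = 20000

# Markov property on a chain: element_size -> data_size -> cache_misses.
element_size = [rng.gauss(0, 1) for _ in range(n)]
data_size = [2.0 * e + rng.gauss(0, 1) for e in element_size]
cache_misses = [1.5 * d + rng.gauss(0, 1) for d in data_size]
chain_marginal = corr(element_size, cache_misses)          # clearly non-zero
chain_given_size = partial_corr(element_size, cache_misses, data_size)  # near 0

# V-structure: angle -> distance <- gunpowder.
angle = [rng.gauss(0, 1) for _ in range(n)]
gunpowder = [rng.gauss(0, 1) for _ in range(n)]
distance = [a + g + rng.gauss(0, 0.5) for a, g in zip(angle, gunpowder)]
v_marginal = corr(angle, gunpowder)                        # near 0
v_given_distance = partial_corr(angle, gunpowder, distance)  # clearly non-zero

print(chain_marginal, chain_given_size, v_marginal, v_given_distance)
```

The two patterns are mirror images: conditioning on the middle variable removes a dependency in the chain and creates one in the v-structure, which is what allows the two structures to be told apart from data.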
Graph is a Description of Independencies • Graphical criterion: d-separation • Intuitive • Faithfulness property: independencies in graph <=> independencies in reality
Causal Structure Learning. In two steps: • Undirected graph • Orientation
[Presenter note: this could also come later, when discussing unique] Result • Partially directed acyclic graph: “We know what parts are unknown.” • Faithfulness assumption: all independencies follow from the causal structure
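The two learning steps and the partially directed result can be sketched in a few lines. This is a simplified illustration in the style of constraint-based learners such as PC, not the implementation used in this work. The `independent(x, y, cond)` callback is a hypothetical stand-in for a conditional independence test, here replaced by hand-written oracles. Only v-structures are oriented, and the further orientation rules are left out.

```python
from itertools import combinations

def learn_structure(variables, independent):
    """Return (undirected_edges, directed_edges) from independence answers."""
    # Step 1: start from the complete undirected graph and remove every
    # edge whose endpoints are independent given some set of neighbours.
    adj = {v: set(variables) - {v} for v in variables}
    sepset = {}
    size = 0
    while any(len(adj[x]) - 1 >= size for x in variables):
        for x in variables:
            for y in list(adj[x]):
                others = adj[x] - {y}
                for cond in combinations(sorted(others), size):
                    if independent(x, y, set(cond)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        size += 1
    # Step 2: orient x -> z <- y when x and y are not adjacent and z was
    # not needed to separate them (a v-structure).
    directed = set()
    for z in variables:
        for x, y in combinations(sorted(adj[z]), 2):
            if y not in adj[x] and z not in sepset[frozenset((x, y))]:
                directed.add((x, z))
                directed.add((y, z))
    undirected = {frozenset((x, y)) for x in variables for y in adj[x]}
    undirected -= {frozenset(edge) for edge in directed}
    return undirected, directed

def collider_oracle(x, y, cond):
    """angle -> distance <- gunpowder: independent unless distance is given."""
    return {x, y} == {"angle", "gunpowder"} and "distance" not in cond

def chain_oracle(x, y, cond):
    """data_type -> data_size -> cache_misses: ends independent given middle."""
    return {x, y} == {"data_type", "cache_misses"} and "data_size" in cond

collider = learn_structure(["angle", "gunpowder", "distance"], collider_oracle)
chain = learn_structure(["data_type", "data_size", "cache_misses"], chain_oracle)
print(collider)
print(chain)
```

For the cannon example both edges are oriented. For the chain they stay undirected, because the independencies alone do not reveal the direction: those are the parts that are known to be unknown.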
[Presenter note: redo the figure as PNG, without lossless compression] Experimental Results (Contribution 1): (1) Automatic learning of accurate performance models (2) Model validation (3) Identification of unexpected dependencies (4) Explanations for outliers
Learning causal models for the performance analysis of programs executed on various computer systems. • Intermezzo I: Causal Inference. • Practical deployment of the causal learning algorithms. • Philosophical and theoretical study of causal inference. • Intermezzo II: Kolmogorov Minimal Sufficient Statistics. • The importance of qualitative properties.
Practical Causal Inference. The following limitations had to be overcome: • Non-linear relations: form-free independence test • Mixture of continuous, discrete and categorical data: general independence test • Deterministic relations: augmented causal model and extended learning algorithms
Form-Free and General Dependency Test • Example: Pearson Rxy = 0.083 => X and Y appear linearly independent • Mutual information, with P(X, Y) obtained by kernel density estimation: I(X;Y) = 0.90 bits => dependent [Figures: scatter plot of Y against X; estimated density P(X, Y)]
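The contrast between the two measures is easy to reproduce on synthetic data. The sketch below is an assumption-laden illustration: the data (Y equal to X squared plus small noise) are invented, and the mutual information is estimated with a simple equal-width histogram rather than the kernel density estimation named on the slide, so the numbers will not match the 0.083 and 0.90 bits above.

```python
import random
from collections import Counter
from math import log2, sqrt

def pearson(x, y):
    """Pearson correlation coefficient: only detects linear dependence."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mutual_information_bits(x, y, bins=10):
    """Plug-in estimate of I(X;Y) in bits from a 2-D histogram."""
    def binned(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0
        return [min(int((v - lo) / width), bins - 1) for v in values]
    bx, by = binned(x), binned(y)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(c / n * log2(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())

rng = random.Random(0)
n = 5000
x = [rng.uniform(-1, 1) for _ in range(n)]
y = [v * v + rng.gauss(0, 0.05) for v in x]   # non-linear dependence on x
z = [rng.uniform(-1, 1) for _ in range(n)]    # independent of x

r_xy = pearson(x, y)
mi_xy = mutual_information_bits(x, y)
mi_xz = mutual_information_bits(x, z)
print(r_xy, mi_xy, mi_xz)
```

The correlation coefficient stays close to zero because the relation is symmetric around X = 0, while the mutual information is far above the value obtained for the independent pair.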
Deterministic Relations • Data size and data type are information equivalent with respect to cache misses • During learning, connect the variables with the least complex relation
Complexity Criterion (Contribution 2a): correct models are learned under the Complexity Increase Assumption.
[Presenter note: this must be included! Perhaps without the details?] Reestablishment of Faithfulness (Contribution 2b) • Consequences are considered • Information equivalences • Independence and simplicity • D-separation extension • Faithful model: represents all independencies • Information is added to the model • Basic information equivalences
Extension of PC Learning Algorithm (Contribution 2c) • Detection of information equivalences • Among information equivalent relations, the simplest one is chosen • Orientation rules remain the same. Correct models are learned from data containing deterministic relations.
Learning causal models for the performance analysis of programs executed on various computer systems. • Intermezzo I: Causal Inference. • Practical deployment of the causal learning algorithms. • Philosophical and theoretical study of causal inference. • Intermezzo II: Kolmogorov Minimal Sufficient Statistics. • The importance of qualitative properties.
[Presenter note: add the scientists' years] Inductive Inference • Occam’s Razor: “Among equivalent models choose the simplest one.” (William of Ockham) BUT: is there an objective measure of complexity?
Kolmogorov Complexity. The Kolmogorov complexity of a binary string is the length of the shortest program that computes the string and halts (Andrey Kolmogorov). Applied to Occam’s Razor: “Select the model that describes the observations minimally.”
Shortest Programs • 001001001001001001001001001001001: regularity of repetition allows compression • 011000110101101010111001001101000: random information = incompressible
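Kolmogorov complexity itself is not computable, but a general-purpose compressor gives a rough upper bound that shows the same contrast. The sketch below uses zlib as such a stand-in, which is an assumption of this illustration and not a technique from the slides. Longer strings than the two above are used, since on 33 characters the compressor's fixed overhead would dominate.

```python
import random
import zlib

def compressed_size(bits):
    """Bytes zlib needs for a string of '0'/'1' characters."""
    return len(zlib.compress(bits.encode("ascii"), 9))

rng = random.Random(0)
regular = "001" * 1000                                   # pure repetition
noisy = "".join(rng.choice("01") for _ in range(3000))   # pseudo-random bits

regular_size = compressed_size(regular)
noisy_size = compressed_size(noisy)
print(len(regular), regular_size, noisy_size)
```

The repetitive string shrinks to a small fraction of the size needed for the pseudo-random one, whose compressed size stays close to the 3000 bits (375 bytes) of information it carries.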
Randomness versus Regularity. Kolmogorov Minimal Sufficient Statistics (KMSS): formal separation of meaningful information (regularities) from accidental information (randomness) • 001001001001001001001001001001001: repetition, 11 times 001 • 011000110101101010111001001101000: only random information (incompressible)
Learning = finding regularities = maximal compression [Figure: a diamond; its structure is a regularity, its exact size is random]
Meaningful Information of Probability Distributions (Contribution 3a) • meaningful information (Theorem 1) • Kolmogorov Minimal Sufficient Statistic if graph and CPDs are incompressible (Theorem 2) • a graph with random CPDs is faithful (Theorem 4)
Causal Aspect of Causal Models = Decomposition • Canonical decomposition: quasi-unique and minimal decomposition into atomic and independent components (the CPDs) • Corresponds to reality (mechanisms)
[Presenter note: add a small figure on holism and reductionism] Causal Component Relies on Reductionism • The world can be studied in parts. Or, even more: • The world is made up of indivisible parts. • When the DAG of a Bayesian network is a complete graph: no meaningful information, holism
Validity of Causal Inference (Contribution 3b). How good is the learned causal model? • Do CPD components correspond to physical mechanisms? • Minimal model? • Faithful? • Other regularities?
Well-known Example of Unfaithfulness. 'Normally', A and D correlate. A and D become independent if the influences along paths 1 and 2 cancel each other out. • Mechanisms are related • Regularity among them
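The cancellation can be reproduced with a small simulation. The sketch assumes a hypothetical linear model in which path 1 is A -> B -> D and path 2 is A -> C -> D. The intermediate variables and all coefficients are invented for illustration, since the slide's figure is not available in this transcript.

```python
import random
from math import sqrt

def corr(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def simulate(weight_path2, n=20000, seed=0):
    """Correlation of A and D when D = B + weight_path2 * C + noise."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [v + rng.gauss(0, 1) for v in a]    # path 1: A -> B
    c = [v + rng.gauss(0, 1) for v in a]    # path 2: A -> C
    d = [p + weight_path2 * q + rng.gauss(0, 1) for p, q in zip(b, c)]
    return corr(a, d)

corr_generic = simulate(weight_path2=-0.5)    # paths do not cancel
corr_cancelled = simulate(weight_path2=-1.0)  # paths cancel exactly
print(corr_generic, corr_cancelled)
```

Only the exactly tuned weight removes the correlation. Any other value leaves A and D dependent, which is why such an independence points to a regularity among the mechanisms rather than to the graph structure.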
Learning causal models for the performance analysis of programs executed on various computer systems. • Intermezzo I: Causal Inference. • Practical deployment of the causal learning algorithms. • Philosophical and theoretical study of causal inference. • Intermezzo II: Kolmogorov Minimal Sufficient Statistics. • The importance of qualitative properties.
Regularities are Qualitative Properties • Different from quantitative information. • Allow for qualitative reasoning. • Qualitative properties determine behavior.
Communication Schemes on Network Topologies: communication time?
Generic Performance Model (Contribution 4a) • Good predictions for combinations of random schemes and random topologies
[Presenter note: show with less obvious figures; broadcast not on a star, shift on a line, add a torus] Combinations of Patterns (Contribution 4b): performance depends on the match!
Qualitative Properties • Faithfulness: “graph should describe all independencies” • KMSS: “model should describe all regularities” • Qualitative information: explicitly describes regularities • Quantitative information: contains no more regularities
Explicitly Mention Qualitative Properties!
Conclusions • Contribution to performance analysis. • Automatic causal analysis. • Useful add-on in combination with other techniques. • The value of causal inference is underlined. • The importance of regularities or qualitative properties.
Future Work • Application of the learned performance models for optimization. • Is the failure of generic performance models only due to regularities? • Augment models with qualitative properties. • But: how to define, recognize and reason with regularities?