1
Causal Models, Learning Algorithms and their Application to Performance Modeling
Jan Lemeire, Parallel Systems lab
November 15th, 2006
2
Causal Performance Models
Overview
I. Causal Models
II. Learning Algorithms
III. Performance Modeling
IV. Extensions
3
I. Multivariate Analysis
- Variables
- Probabilistic model of the joint distribution?
- Relational information?
- A priori unknown relations
- Experimental data
4
A. Representation of distributions
- Factorization
- Reduction of factorization complexity
- Bayesian network
(figure panels: Ordering 1, Ordering 2)
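To make the factorization idea concrete, here is a minimal sketch, assuming a hypothetical chain A -> B -> C of binary variables with invented probabilities (none of these numbers come from the slides). It builds the joint distribution from per-variable conditional tables and compares the parameter count with that of a full joint table.

```python
from itertools import product

# Hypothetical conditional probability tables for a chain A -> B -> C.
# All numbers are invented for illustration.
p_a = {1: 0.3, 0: 0.7}
p_b_given_a = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.2, 0: 0.8}}  # p_b_given_a[a][b]
p_c_given_b = {1: {1: 0.6, 0: 0.4}, 0: {1: 0.1, 0: 0.9}}  # p_c_given_b[b][c]


def joint(a, b, c):
    """P(a, b, c) = P(a) * P(b | a) * P(c | b) for the chain."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]


# The factorized joint still sums to one over all eight states.
total = sum(joint(a, b, c) for a, b, c in product((0, 1), repeat=3))

# Free parameters: full joint table versus the factorized form.
full_table_params = 2 ** 3 - 1   # 7
factorized_params = 1 + 2 + 2    # 5: P(A), P(B|A), P(C|B)
```

The saving is small for three variables but grows quickly: a sparse graph keeps each table small, while the full table doubles with every added binary variable.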
5
B. Representation of Independencies
- Conditional independence
- Qualitative property: P(rain | quality of speech) = P(rain)?
- Markov condition in graph: a variable becomes independent of all its non-descendants by conditioning on its direct parents.
- Graphical d-separation criterion
6
Faithfulness
- Faithfulness: joint distribution <-> directed acyclic graph; conditional independencies <-> d-separation
- Theorem: if a faithful graph exists, it is the minimal factorization.
- Independence-map: all independencies in the Bayesian network appear in the distribution.
7
C. Representation of Causal Mechanisms
- Model of the underlying physical mechanisms
- Definition through interventions
- Causal model + conditional probability distributions + Causal Markov Condition = Bayesian network
8
Reductionism
- Causal modeling = reductionism
- Canonical representation: unique, minimal, independent
- Building block = P(X_i | parents(X_i))
- Whole theory is based on this modularity
- Intervention = change of block
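The phrase "intervention = change of block" can be illustrated with a small sketch. It assumes a hypothetical two-variable model A -> B with invented probabilities. Observing B = 1 updates the belief about A, whereas intervening on B replaces only B's block and leaves the block for A untouched.

```python
# Hypothetical model A -> B; all probabilities are invented.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.1, 1: 0.9}}  # block[a][b]


def joint(a, b, block_b):
    """P(a, b) assembled from the block for A and a given block for B."""
    return p_a[a] * block_b[a][b]


# Observation: seeing B = 1 changes the belief about its cause A.
p_b1 = sum(joint(a, 1, p_b_given_a) for a in (0, 1))
p_a1_given_b1 = joint(1, 1, p_b_given_a) / p_b1

# Intervention do(B = 1): swap B's block for one that ignores A.
do_b1 = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 1.0}}
p_b1_do = sum(joint(a, 1, do_b1) for a in (0, 1))
p_a1_after_do = joint(1, 1, do_b1) / p_b1_do
```

Here `p_a1_given_b1` is about 0.66, while `p_a1_after_do` stays at the prior 0.3: forcing the effect says nothing about the cause.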
9
Ultimate motivation for causality
- Model = canonical representation, able to explain all qualitative properties (independencies) and close to reality
- If the causal mechanisms are unrelated, the model is faithful
10
II. Learning Algorithms
Two types:
- Constraint-based: based on the independencies
- Scoring-based: searches the set of all models and gives each a score for how well it represents the distribution
11
Step 1: Adjacency search
- Property: adjacent nodes do not become independent under any conditioning set
- Algorithm:
  - start with a fully connected graph
  - check for marginal independencies
  - check for conditional independencies
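The adjacency search can be sketched as follows. This is a simplified illustration in the spirit of constraint-based learners, not the exact procedure of any particular tool. The `independent(x, y, cond)` callback is a hypothetical stand-in for a statistical test on data, and `chain_oracle` is an invented oracle for the chain A -> B -> C.

```python
from itertools import combinations


def adjacency_search(variables, independent):
    """Remove the edge X-Y as soon as some conditioning set separates X and Y.

    `independent(x, y, cond)` is supplied by the caller (hypothetical here).
    Returns the remaining undirected edges and the separating sets found.
    """
    adj = {v: set(variables) - {v} for v in variables}  # fully connected start
    sepset = {}
    size = 0  # size 0 corresponds to the marginal independence tests
    while any(len(adj[x]) - 1 >= size for x in variables):
        for x in variables:
            for y in sorted(adj[x]):
                others = sorted(adj[x] - {y})
                for cond in combinations(others, size):
                    if independent(x, y, set(cond)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        size += 1
    edges = {frozenset((x, y)) for x in variables for y in adj[x]}
    return edges, sepset


def chain_oracle(x, y, cond):
    """Oracle for A -> B -> C: the only independence is A _||_ C given B."""
    return {x, y} == {"A", "C"} and "B" in cond


edges, sepset = adjacency_search(["A", "B", "C"], chain_oracle)
```

With this oracle the search keeps the edges A-B and B-C and records {B} as the separating set of A and C.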
12
Step 2: Orientation
- Property: a v-structure can be recognized
- Algorithm:
  - look for v-structures
  - apply derived rules
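A minimal sketch of the v-structure step, assuming the skeleton and separating sets come from a preceding adjacency search (the inputs below are written out by hand). An unshielded triple X - Z - Y is oriented as X -> Z <- Y when Z is not in the set that separated X and Y. The derived orientation rules are not shown.

```python
from itertools import combinations


def find_v_structures(edges, sepset):
    """Return triples (x, z, y) that should be oriented as x -> z <- y.

    `edges` is a set of frozenset pairs; `sepset` maps a frozenset pair of
    non-adjacent nodes to the conditioning set that separated them.
    """
    nodes = sorted({n for e in edges for n in e})
    neighbours = {
        n: {m for e in edges if n in e for m in e if m != n} for n in nodes
    }
    found = set()
    for z in nodes:
        for x, y in combinations(sorted(neighbours[z]), 2):
            pair = frozenset((x, y))
            unshielded = pair not in edges
            if unshielded and z not in sepset.get(pair, set()):
                found.add((x, z, y))
    return found


# Collider A -> C <- B: A and B are marginally independent (empty sepset).
collider = find_v_structures(
    {frozenset(("A", "C")), frozenset(("B", "C"))},
    {frozenset(("A", "B")): set()},
)

# Chain A -> B -> C: A and C are separated by B, so nothing is oriented.
chain = find_v_structures(
    {frozenset(("A", "B")), frozenset(("B", "C"))},
    {frozenset(("A", "C")): {"B"}},
)
```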
13
Assumptions
- General statistical assumptions:
  - no selection bias
  - random sample
  - sufficient data for correctness of statistical tests
- Underlying network is faithful
- Causal sufficiency: no unknown common causes
14
Criticism
- Definition of causality? About predicting the effect of changes to the system
- Faithfulness assumption, e.g. accidental cancellation
- Causal Markov Condition: "All relations are causal"
- Learning algorithms are not robust: statistical tests make mistakes
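Accidental cancellation can be shown with a small worked example. The sketch assumes a hypothetical linear model with graph X -> Y -> Z plus a direct edge X -> Z; the coefficients b, c, d are invented. When the direct path exactly offsets the indirect one, X and Z are uncorrelated even though X is a direct cause of Z, so an independence-based learner would wrongly drop that edge.

```python
def cov_x_z(b, c, d, var_x=1.0):
    """Cov(X, Z) when Y = b*X + e1 and Z = c*Y + d*X + e2.

    e1 and e2 are zero-mean noise terms independent of X and of each other,
    so Cov(X, Z) = c*Cov(X, Y) + d*Var(X) = (c*b + d) * Var(X).
    """
    return (c * b + d) * var_x


generic = cov_x_z(b=2.0, c=3.0, d=1.0)     # 7.0: paths add up
cancelled = cov_x_z(b=2.0, c=3.0, d=-6.0)  # 0.0: direct path cancels indirect
```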
15
Part III: Performance Analysis
- High-performance computing: 1 processor, parallel system
- Performance questions:
  - Performance prediction
  - System-dependency?
  - Parameter-dependency?
  - Reasons for bad performance?
  - Effect of optimizations?
16
Causal modeling (cf. COMO lab, VUB)
- Representation form: close to reality
- Learning algorithms: TETRAD tool (open-source, Java)
- PhD??
17
Performance Models
- Aim of performance analysis:
  - support the software developer
  - high-performance applications
- Expected properties:
  - offer insight into the causes of performance degradation
  - prediction
  - estimate the effect of optimizations
  - reusable submodels
  - separate application- and system-dependency
  - reason under uncertainty
- Causal models
18
Integrated in statistical analysis
- Statistical characteristics:
  - regression analysis
  - probability table compression
  - outlier detection
- Iterative process:
  1. Perform additional experiments
  2. Extract additional characteristics
  3. Indicate exceptions
  4. Analyze the divergences of the data points from the current hypotheses
19
A. Model construction
- Model of the computation time of the LU decomposition algorithm
- elementsize (redundant variable) is sufficient for the influence datatype -> cache misses
- Regression analysis on submodels: X = f(parents)
- Analysis of parameters
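Regression on a submodel X = f(parents) can be sketched for the simplest case of one parent and a linear relation. The data below are invented (generated as 10 + 25 * size) and the variable names are only borrowed from the slide for readability; they are not measurements from the talk.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Invented data: the child variable regressed on its single parent.
element_size = [4, 8, 16, 32]        # parent (bytes, hypothetical)
cache_misses = [110, 210, 410, 810]  # child, generated as 10 + 25 * size

intercept, slope = fit_line(element_size, cache_misses)
```

Because each node is fitted only against its parents, a submodel estimated this way can be inspected or replaced without refitting the rest of the graph.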
20
B. Detection of unexpected dependencies
- Point-to-point communication performance
- Background communication
21
C. Finding explanations for outliers
- Exceptional data in communication performance measurements
- Probability table compression => derived variable
- Interesting features
22
IV. Complexity of Performance Data
- Mixture of discrete and continuous variables: Mutual Information & Kernel Density Estimation
- Non-linear relations: Mutual Information & Kernel Density Estimation
- Deterministic relations: augmented models & complexity criterion
- Context variables: work in progress
- Context-specific independencies: work in progress
23
A. Information-theoretic Dependency
- Entropy of random variable X
- Discretized entropy for continuous variable
- Mutual Information
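The slide's formulas did not survive extraction, so the sketch below uses the standard textbook definitions of Shannon entropy and mutual information for discrete variables, in bits. The two joint tables are invented examples: one with independent variables and one where Y is a copy of X.

```python
from math import log2


def entropy(p):
    """H(X) = -sum_x p(x) log2 p(x) for a dict {value: probability}."""
    return -sum(q * log2(q) for q in p.values() if q > 0)


def mutual_information(joint):
    """I(X;Y) = sum p(x,y) log2(p(x,y) / (p(x) p(y))) for joint[(x, y)]."""
    px, py = {}, {}
    for (x, y), q in joint.items():
        px[x] = px.get(x, 0.0) + q
        py[y] = py.get(y, 0.0) + q
    return sum(
        q * log2(q / (px[x] * py[y])) for (x, y), q in joint.items() if q > 0
    )


fair_coin = {0: 0.5, 1: 0.5}
independent_pair = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
copied_pair = {(0, 0): 0.5, (1, 1): 0.5}

h_coin = entropy(fair_coin)                          # 1 bit
mi_independent = mutual_information(independent_pair)  # 0 bits
mi_copy = mutual_information(copied_pair)            # 1 bit
```

Mutual information is zero exactly when the variables are independent and makes no assumption about the form of the relation.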
24
B. Kernel Density Estimation
- See applets
- Trade-off: maximal entropy <> typicalness
- Conclusions:
  - limited number of data points needed
  - discretization of continuous data justified
  - form-free dependency measure
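A minimal sketch of kernel density estimation, assuming a Gaussian kernel and a hand-picked bandwidth; the slides do not state which kernel or bandwidth rule was used, and the samples are invented. Each data point contributes one smooth bump, and the estimate is the average of the bumps.

```python
from math import exp, pi, sqrt


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels on the samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * sqrt(2.0 * pi))

    def density(x):
        return norm * sum(
            exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Invented samples with two clusters, and an assumed bandwidth of 0.3.
samples = [1.0, 1.2, 0.9, 3.1, 3.0]
density = gaussian_kde(samples, bandwidth=0.3)

# Riemann sum over [-2, 6] as a sanity check that the estimate integrates to ~1.
step = 0.01
area = sum(density(-2.0 + step * i) * step for i in range(801))
```

A smaller bandwidth follows the data more closely and a larger one gives a smoother, flatter estimate, which is one way to read the trade-off named on the slide.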
25
C. Deterministic relations
- Y = f(X): Y becomes independent from Z conditioned on X
- ~ violation of the intersection condition (Pearl '88)
- Not faithfully describable
- Solution: augmented causal model
  - add regularity to model
  - adapt inference algorithms
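The effect of a deterministic relation can be checked numerically. The sketch below uses an invented distribution: X is uniform on {0, 1, 2, 3}, Y = X mod 2 is a function of X, and Z is a noisy copy of X. Y and Z are dependent marginally, yet the conditional mutual information I(Y;Z|X) is zero because X already fixes Y.

```python
from collections import defaultdict
from math import log2


def cond_mutual_information(joint):
    """I(Y;Z|X) in bits for joint[(x, y, z)] = probability."""
    px = defaultdict(float)
    pxy = defaultdict(float)
    pxz = defaultdict(float)
    for (x, y, z), q in joint.items():
        px[x] += q
        pxy[(x, y)] += q
        pxz[(x, z)] += q
    return sum(
        q * log2(q * px[x] / (pxy[(x, y)] * pxz[(x, z)]))
        for (x, y, z), q in joint.items()
        if q > 0
    )


# Invented joint: X uniform, Y = X mod 2 (deterministic), Z = noisy copy of X.
joint = {}
for x in range(4):
    y = x % 2
    joint[(x, y, x)] = 0.25 * 0.8
    joint[(x, y, (x + 1) % 4)] = 0.25 * 0.2

# Marginal I(Y;Z): reuse the same function with X collapsed to a constant.
marginal = defaultdict(float)
for (x, y, z), q in joint.items():
    marginal[(0, y, z)] += q

i_yz = cond_mutual_information(marginal)      # clearly positive
i_yz_given_x = cond_mutual_information(joint)  # zero up to rounding
```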
26
The Complexity Criterion
- Select simplest relation
- X & Y contain equivalent information about Z
27
Augmented causal model
- Consistent models under the Complexity Increase assumption
- Restrict conditional independencies
- Generalize d-separation
- Reestablish faithfulness
28
Theory works!
(figure labels: Deterministic, A, B, Probabilistic)
29
Conclusions
- Benefit of the integration of statistical techniques
- Causal modeling is a challenge: wants to know the inner from the outer
- More information:
  - http://parallel.vub.ac.be
  - http://parallel.vub.ac.be/~jan