
1 Working with “loose” Theories of Change
Alternative approaches to exploring and testing complex causal models of development interventions
Rick Davies @ DFID EvD, 21/10/2015

2 Some proposals, informed by…
ITAD’s macro-evaluation of empowerment and accountability projects
Analysis of data on CSCF projects for Triple Line
Portfolio analysis exercises with Comic Relief
Literature on QCA applied to evaluation (Befani & others)
Literature on predictive analytics, a form of data mining
Current experiences with development of an Excel application to operationalise related tools

3 Common features of the approaches
1. Data on what has already happened
2. A view of causation that has a reasonable fit with the complexity of the world as we see it
3. Methods to analyse this data in the light of this view

4 Why bother with alternatives?
It would be useful to be able to triangulate findings by applying different methods, which share a consistent view of causality, to the same data
The different approaches have different strengths and weaknesses, providing a wider range of possible uses overall

5 1. What sort of data?
Rows = cases, e.g. projects
Columns = attributes of projects and their context, and outcomes of these projects
Cells =
  Presence/absence of these attributes, and/or
  Scales, e.g. achievement ratings, and/or
  Numeric values, e.g. costs
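
A minimal sketch of such a data matrix in Python. The project names, column names and values are invented for illustration and are not taken from any of the data sets discussed in this presentation.

```python
# Hypothetical data matrix: one dict per case (row), one key per column.
rows = [
    {"case": "P1", "local_partner": 1, "advocacy": 0, "rating": 4, "cost_gbp": 250000, "outcome": 1},
    {"case": "P2", "local_partner": 0, "advocacy": 1, "rating": 2, "cost_gbp": 90000, "outcome": 0},
    {"case": "P3", "local_partner": 1, "advocacy": 1, "rating": 5, "cost_gbp": 400000, "outcome": 1},
]


def binary_columns(rows):
    """Return the columns that hold only presence/absence (1/0) values."""
    names = [name for name in rows[0] if name != "case"]
    return [name for name in names if all(row[name] in (0, 1) for row in rows)]


# Scale ("rating") and numeric ("cost_gbp") columns are left out of the result.
print(binary_columns(rows))
```

Separating the presence/absence columns from the scale and numeric ones matters because several of the methods discussed later work on binary attributes.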

6 Examples
Data from the Civil Society Challenge Fund, managed by Triple Line & Crown Agents
One sub-set chosen for analysis

7 2. What sort of view of causality?
Conjunctural causes
  Many events are caused by combinations of factors, rather than single factors.
Multiple conjunctural causes (equifinality)
  Events can arise as a result of different conjunctions of conditions*.
Asymmetric causes
  The causes of an event not occurring may not be simply the absence of the conditions that cause it, but the occurrence of other additional conditions which complicate, block or deflect change

8 Multiple types of causal conditions
Causes are not simply present or absent, strong or weak, as in a statistically significant correlation.
Individual causal conditions can be:
  Necessary but insufficient
  Sufficient but unnecessary
  Necessary and sufficient
  Neither necessary nor sufficient
There are other kinds as well, describing their role within configurations of conditions, e.g.
  INUS (an insufficient but necessary part of a condition that is itself unnecessary but sufficient)
  SUIN (a sufficient but unnecessary part of a factor that is insufficient but necessary)
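
The four basic types can be checked mechanically on a binary data set. The sketch below assumes crisp (0/1) data and uses four invented cases with conditions A to D. A condition is treated as necessary if it is present in every case with the outcome, and as sufficient if the outcome is present in every case with the condition.

```python
# Hypothetical cases: conditions A-D and an outcome, all coded 1 = present, 0 = absent.
cases = [
    {"A": 1, "B": 1, "C": 1, "D": 1, "outcome": 1},
    {"A": 1, "B": 0, "C": 1, "D": 0, "outcome": 1},
    {"A": 1, "B": 0, "C": 0, "D": 1, "outcome": 0},
    {"A": 0, "B": 0, "C": 0, "D": 0, "outcome": 0},
]


def is_necessary(cases, condition, outcome="outcome"):
    """True if the condition is present in every case where the outcome is present."""
    # Note: all() is vacuously True when no case has the outcome.
    return all(c[condition] == 1 for c in cases if c[outcome] == 1)


def is_sufficient(cases, condition, outcome="outcome"):
    """True if the outcome is present in every case where the condition is present."""
    return all(c[outcome] == 1 for c in cases if c[condition] == 1)


def classify(cases, condition, outcome="outcome"):
    necessary = is_necessary(cases, condition, outcome)
    sufficient = is_sufficient(cases, condition, outcome)
    if necessary and sufficient:
        return "necessary and sufficient"
    if necessary:
        return "necessary but insufficient"
    if sufficient:
        return "sufficient but unnecessary"
    return "neither necessary nor sufficient"


for name in "ABCD":
    print(name, classify(cases, name))
```

In real data sets these relationships are rarely perfect, so a tolerance for a few contradicting cases would normally be added.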

9 3. Methods combining…
Cross-case analysis (“Quant”)
  B. Pattern finding: attributes associated with outcomes
Within-case knowledge and analysis (“Qual”)
  A. Defining the boundaries of the cross-case analysis
    Based on “loose” theories of what is happening
  C. Investigating the results of cross-case analysis
    Looking for causal mechanisms within individual cases

10 A. Defining the boundaries
The problem: too many attributes relative to cases
  Predictive analytics: “Curse of dimensionality”
  QCA: The problem of “limited diversity”
Solutions (both can be used)
  Feature selection algorithms, used in data mining
    E.g. minimise correlations between attributes
    Genetic algorithm built into an Excel app
  Choices of sub-sets of attributes informed by “loose” theory
    E.g. two-step or nested analyses in QCA

11 A range of loose Theories of Change
Types of project outcomes
Categories of project attributes (52>7)

12 B. The pattern finding challenge
A huge number of possibilities remain even after using loose theories to focus our attention
A simple data set with 15 project attributes has 2^15 = 32,768 possible combinations
Which of these will provide the best predictors of the outcomes we are interested in?
How do we search this space to find out?
How do we evaluate the results?

13 6+ types of search strategies
1. Theory-led hypothesis testing
  Prior research and theory increase the chances of useful findings
  But narrow in focus, because testability requires specifics
  High likelihood of missing unexpected solutions
  But can be speeded up with a simple Excel app*
2. Exhaustive search
  Every possibility is tested, but time consuming
  Feasible with many small data sets / binomial (binary) data
  Can be speeded up with R or an Excel app
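
A minimal sketch of an exhaustive search, assuming binary data and one simple family of prediction rules: each rule is a sub-set of attributes that must all be present for the outcome to be predicted. With n attributes this gives 2^n candidate rules (32,768 for n = 15). The three attributes and six cases are invented, and accuracy is used as the only performance measure.

```python
from itertools import combinations

# Hypothetical data: (attribute values, observed outcome) for six cases.
ATTRIBUTES = ("A", "B", "C")
CASES = [
    ({"A": 1, "B": 1, "C": 1}, 1),
    ({"A": 1, "B": 0, "C": 1}, 1),
    ({"A": 1, "B": 1, "C": 0}, 0),
    ({"A": 0, "B": 1, "C": 1}, 0),
    ({"A": 0, "B": 0, "C": 0}, 0),
    ({"A": 1, "B": 0, "C": 0}, 0),
]


def accuracy(rule, cases):
    """Share of cases where 'all attributes in rule are present' matches the outcome."""
    hits = 0
    for attrs, outcome in cases:
        predicted = int(all(attrs[name] == 1 for name in rule))
        hits += predicted == outcome
    return hits / len(cases)


def exhaustive_search(attributes, cases):
    """Test every sub-set of attributes; smaller rules win ties because they come first."""
    best_rule, best_score, tested = (), -1.0, 0
    for size in range(len(attributes) + 1):
        for rule in combinations(attributes, size):
            tested += 1
            score = accuracy(rule, cases)
            if score > best_score:
                best_rule, best_score = rule, score
    return best_rule, best_score, tested


best_rule, best_score, tested = exhaustive_search(ATTRIBUTES, CASES)
print(best_rule, best_score, tested)
```

Allowing each attribute to be required present, required absent, or ignored would raise the number of candidate rules to 3^n, which is why exhaustive search quickly becomes time consuming.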

14 Algorithmic searches
3. QCA (Quine-McCluskey)*
  Origins in political science, now also used for evaluation
  Technically demanding method to master and to communicate
  Specialist software available (fs/QCA, etc.)
4. Decision Tree algorithms***
  Origins in data mining, more widely used, but not yet in evaluations
  Easy to read visual display of results
  Open source software available (RapidMiner)

15 Example of QCA results display
Outcome 1: Users use ICTs to report rural water supply functionality to the local government
(A*B*C*D*E)+(A*B*c*D*E)+(a*B*c*d*e) = Outcome present
(a*b*c*d*E)+(A*B*c*d*E) = Outcome absent
Model uses 5 of 9 attributes
Source: Testing the Waters: A Qualitative Comparative Analysis of the Factors Affecting Success in Rendering Water Services Sustainable Based on ICT Reporting, 2015. Itad.
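
The solution formulas can be read as Boolean expressions. The sketch below assumes the usual crisp-set QCA notation, which the slide does not spell out: an upper-case letter means the condition is present, a lower-case letter means it is absent, * means AND and + means OR. The two example cases are invented.

```python
def parse_solution(expr):
    """Turn '(A*B*c)+(a*B)' into a list of configurations (condition -> required value)."""
    configurations = []
    for term in expr.replace("(", "").replace(")", "").split("+"):
        config = {}
        for literal in term.strip().split("*"):
            literal = literal.strip()
            # Upper case = condition must be present, lower case = must be absent.
            config[literal.upper()] = literal.isupper()
        configurations.append(config)
    return configurations


def matches(configurations, case):
    """True if the case fits at least one configuration (logical OR of ANDs)."""
    return any(
        all(case[name] == required for name, required in config.items())
        for config in configurations
    )


present = parse_solution("(A*B*C*D*E)+(A*B*c*D*E)+(a*B*c*d*e)")
absent = parse_solution("(a*b*c*d*E)+(A*B*c*d*E)")

# Hypothetical cases, coded True = condition present.
case_1 = {"A": True, "B": True, "C": False, "D": True, "E": True}
case_2 = {"A": False, "B": False, "C": False, "D": False, "E": True}

print(matches(present, case_1), matches(absent, case_1))
print(matches(present, case_2), matches(absent, case_2))
```

A case that matches neither formula is simply not covered by the model.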

16 Decision Tree results display
Nodes = project attributes
Branch values = attribute present or absent
Leaves = associated outcomes (higher effectiveness)
  1.0 = predicted present
  2.0 = predicted absent
  Blue = observed present
  Red = observed absent
Data from the Civil Society Challenge Fund, managed by Triple Line

17 Algorithmic searches
5. Genetic algorithms**
  Widely used to solve business and engineering optimisation problems, but not at all in evaluations
  Available as a free add-in to Excel. Easy to use.
  Can triangulate results from QCA and/or Decision Trees
  Can discover new/alternate solutions
    E.g. 3 rather than 5 configurations in Water Aid data, using 2 vs 5 conditions

18 Solver Excel add-in
OBJECTIVE: Value that needs to be maximised, e.g. accuracy of prediction
VARIABLES: Values that can be varied, e.g. attributes of project to be present or absent
CONSTRAINTS: Limits on how project attributes can be varied
SOLUTIONS: For 60 x 20 data sets, usually found within 1 minute
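
The objective / variables / constraints framing can be illustrated with a generic genetic algorithm. This is a rough sketch only and is not how the Solver add-in is implemented. It assumes binary data, uses accuracy as the objective, treats each attribute's status in the rule as a variable, and constrains each variable to one of three values (must be present, must be absent, ignored). The data, function names and parameter values are all invented.

```python
import random

# Hypothetical cases: (attribute values for A, B, C, D; observed outcome).
cases = [
    ((1, 1, 0, 1), 1), ((1, 1, 1, 0), 1), ((1, 0, 1, 1), 0), ((0, 1, 0, 0), 0),
    ((0, 0, 1, 1), 0), ((1, 1, 0, 0), 1), ((0, 1, 1, 1), 0), ((1, 0, 0, 0), 0),
]


def accuracy(rule, cases):
    """OBJECTIVE. rule has one entry per attribute: 1 = present, 0 = absent, None = ignored."""
    correct = 0
    for attrs, outcome in cases:
        predicted = all(r is None or r == a for r, a in zip(rule, attrs))
        correct += int(predicted == bool(outcome))
    return correct / len(cases)


def genetic_search(cases, n_attrs, generations=50, pop_size=30, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    genes = (None, 0, 1)  # CONSTRAINT: the only values a variable may take
    population = [tuple(rng.choice(genes) for _ in range(n_attrs)) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the better half survives unchanged, so the best rule is never lost.
        population.sort(key=lambda rule: accuracy(rule, cases), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_attrs)        # single-point crossover
            child = list(mum[:cut] + dad[cut:])
            for i in range(n_attrs):               # random mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.choice(genes)
            children.append(tuple(child))
        population = parents + children
    return max(population, key=lambda rule: accuracy(rule, cases))


best = genetic_search(cases, n_attrs=4)
print(best, accuracy(best, cases))
```

Because the search is random, different seeds can return different rules with the same accuracy, which is one way of discovering alternate solutions.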

19 6. Ethnographic and participatory methods
Ethnographic Decision Tree Modelling (Chr. Gladwin 1989)
  Time consuming development process
  Performance is measurable and independently verifiable
Hierarchical card sorting (Davies, 1993)**
  Much quicker to use
  Results in the form of a readable tree structure
  Performance is measurable and independently verifiable
Both tap into existing and often informal and semi-tacit knowledge and make it more explicit and testable

20 Ethnographic Decision Tree example
Model of factors affecting farmers’ decisions on whether or not to harvest thatching materials

21 Hierarchical Card Sort example
Comic Relief CYPAR project portfolio 2015
Classification and prediction of their relative achievements

22 Assessing model performance
Results of models produced by all methods can be collated and compared using a Confusion Matrix (i.e. a simple truth table)
Insert the number of cases fitting into each result category
Numbers within the matrix form the basis of a menu of different performance measures. See Wikipedia

23 Performance measure examples
Support: The % of cases that a prediction rule applies to
Prevalence: The % of cases which have the outcome present
Accuracy: The % of cases that are True Positives and True Negatives
Lift: The incidence of True Positives compared to Prevalence of the outcome
And many others…
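
A minimal sketch of these four measures computed from the cells of a Confusion Matrix. It rests on two assumptions about how to read the definitions: Support is taken as the share of all cases that the rule predicts as positive, (TP+FP)/N, and Lift as the share of predicted positives that are True Positives, divided by Prevalence. The counts are invented.

```python
def performance(tp, fp, fn, tn):
    """Performance measures from confusion matrix counts (all as proportions, not %)."""
    n = tp + fp + fn + tn
    prevalence = (tp + fn) / n   # cases with the outcome present
    support = (tp + fp) / n      # cases the prediction rule applies to (assumed reading)
    accuracy = (tp + tn) / n     # cases predicted correctly
    precision = tp / (tp + fp)   # predicted positives that are true positives
    lift = precision / prevalence
    return {
        "support": support,
        "prevalence": prevalence,
        "accuracy": accuracy,
        "lift": lift,
    }


# Hypothetical counts for 100 cases.
measures = performance(tp=30, fp=10, fn=10, tn=50)
print(measures)
```

A Lift above 1 means the rule identifies cases with the outcome more often than would be expected from picking cases at random.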

24 Simplicity also matters
Defined as:
  Number of prediction rules needed to account for all outcomes, and
  Number of attributes within all of those rules
Why bother:
  Simple rules would be easier to put to use in project selection or design
  Simpler rules tend to have wider applicability

25 Algorithms augment but don’t replace
Algorithms don’t always produce a unique best answer, because of:
  Limitations of the algorithm, e.g. QCA, GA
  Nature of the data (e.g. many attributes vs cases): more than one attribute can be an accurate predictor of the outcome
Manual tweaking of predictive models helps to:
  Find simpler but equally good performing models
  Explore “adjacent possible” models that also do well
  Identify the relative importance of parts of the model

26 C. Investigating the results of cross-case analysis
Two steps
Case selection: subject of increased interest
  Gerring, J., Cojocaru, L., 2015. Case-Selection: A Diversity of Methods and Criteria.
  Need to be transparent and replicable
  Otherwise risk of confirmation bias
Within-case analysis
  Process tracing is the most cited method
  (No further discussion here)

27 One case selection strategy
Current experiment via ITAD’s Macro-evaluation of Empowerment & Accountability
Use Hamming distance to measure case similarity
  Attributes of project A: 000110100
  Attributes of project B: 011101100
  = 5 commonalities (a Hamming distance of 4)
Two kinds of similarity measures
  Similarity of any two cases
  Average similarity of one case with all others
  Case with highest average similarity = “modal” case
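
A minimal sketch of both similarity measures, assuming each case is coded as a string of 0s and 1s of equal length. Projects A and B are the ones shown on the slide. Project C is invented so that a modal case can be picked from more than two cases.

```python
def similarity(a, b):
    """Number of attribute positions on which two cases agree (commonalities)."""
    if len(a) != len(b):
        raise ValueError("cases must have the same number of attributes")
    return sum(x == y for x, y in zip(a, b))


def hamming_distance(a, b):
    """Number of attribute positions on which two cases differ."""
    return len(a) - similarity(a, b)


def average_similarity(name, cases):
    """Average similarity of one case with all the others."""
    others = [attrs for other, attrs in cases.items() if other != name]
    return sum(similarity(cases[name], attrs) for attrs in others) / len(others)


def modal_case(cases):
    """Name of the case with the highest average similarity to all other cases."""
    return max(cases, key=lambda name: average_similarity(name, cases))


cases = {
    "A": "000110100",
    "B": "011101100",
    "C": "001100100",  # hypothetical third project
}

print(similarity(cases["A"], cases["B"]), hamming_distance(cases["A"], cases["B"]))
print(modal_case(cases))
```

To find a modal True Positive, False Positive or False Negative case, the same function can be applied to just the cases falling in that cell of the Confusion Matrix.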

28 Use Confusion Matrix to find and compare 3 types of modal cases
Select a modal True Positive case
  To find any likely causal mechanisms connecting the conditions that make up the configuration
Select a modal False Positive case
  Given the presence of the same configuration of conditions, one would expect the SAME mechanism to be present BUT some other factors blocking it from working, i.e. delivering the outcome
Select a modal False Negative case
  Given the absence of the same configuration of conditions, one would NOT expect the same causal mechanism to be present

29 In Summary: When to use what?
Ethnographic / participatory methods
  When you want to understand/explore/test the theories of specific stakeholders
  When there is no data set readily at hand
QCA
  When there is a reasonably well developed Theory of Change
  When the data set has sufficient diversity

30 In Summary: When to use what?
Decision Tree algorithms
  When easily communicable results are needed
Genetic algorithms
  When a quick exploration for alternate solutions is needed
Exhaustive search
  When certainty is needed that a solution is the best available
Excel app (EvalC3)
  When you want to tweak results of all of the above

31 Where to use any of these?
Where numerical data is hard to find
Where experimental approaches are impractical
Where causal complexity is likely to be high
  Large, decentralised projects, with diverse interventions and contexts, e.g.
    Grant making programs
    Participatory development programs

32 What to look out for
How good is the underlying data?
  Relevant attributes included?
  Sufficient range of attribute values?
  Diversity of cases?
    Minimise redundant configurations
    Maximise proportion of all possible configurations
Did stakeholders’ views inform selection of attributes and outcomes for analysis?

33 What to look out for
Transparency of process
  Who participates and how
  What performance measures have been used
  What search parameters have been set, e.g. with GA, Decision Tree
Have the results been triangulated?
Was the approach to within-case analysis systematic and transparent?

34 Some lessons so far?
1. “Successful models” are not distinct entities
  “Adjacent possible” always worth investigating
2. It’s not only success (TPs) that matters
  Investigating False Positives and False Negatives will help improve existing models
3. Unambiguous success (no FPs or FNs) is rare
  Find an acceptable level of accuracy and support

35 Surplus material hereafter

36 Software options
QCA
  See 8 packages
Decision Trees
  RapidMiner Studio
Genetic algorithms
  Solver add-in to Excel
EvalC3 application under development
  (Early adopters are welcome)

37 Initial QCA result found 5 configurations, using 5 project attributes
But these could be simplified down to 3 configurations using 2 project attributes
Source: Testing the Waters: A Qualitative Comparative Analysis of the Factors Affecting Success in Rendering Water Services Sustainable Based on ICT Reporting, Itad, 2015

38 Comic Relief results
                              Observed in project assessments
Predicted in HCS+ model       More successful    Less successful
  More successful             14                 4
  Less successful             5                  4
Prevalence = (14+5)/(14+4+5+4) = 70%
Support = 14/(14+5) = 74%
Accuracy = (14+4)/(14+4+5+4) = 67%
Lift = (TP/(TP+FP)) / ((TP+FN)/(TP+FP+FN+TN)) = (14/(14+4)) / ((14+5)/(14+4+5+4)) = 1.11

39 “Prediction is very difficult, especially if it’s about the future.” Niels Bohr
“Ninety per cent of problems have already been solved in some other field. You just have to find them.” Tony McCaffrey

40 More evidence, less poverty
Dean Karlan, Scientific American, October 2015
“We must figure out what works and what does not”
But others are asking: …what works for whom in what circumstances
And others should be asking: …and how often, and how often is good enough?

