Working with “loose” Theories of Change: Alternative approaches to exploring and testing complex causal models of development interventions
Rick Davies, DFID EvD, 21/10/2015

Some proposals, informed by…
- ITAD’s macro-evaluation of empowerment and accountability projects
- Analysis of data on CSCF projects for Triple Line
- Portfolio analysis exercises with Comic Relief
- Literature on QCA applied to evaluation (Befani & others)
- Literature on predictive analytics, a form of data mining
- Current experiences with the development of an Excel application to operationalise related tools

Common features of the approaches
1. Data, on what has already happened
2. A view of causation that has a reasonable fit with the complexity of the world as we see it
3. Methods, to analyse this data in the light of this view

Why bother with alternatives?
- It would be useful to be able to triangulate findings from the same data, using different methods that share a consistent view of causality
- The different approaches have different strengths and weaknesses, providing a wider range of possible uses overall

1. What sort of data?
- Rows = cases, e.g. projects
- Columns = attributes of projects and their context, and outcomes of these projects
- Cells = presence/absence of these attributes, and/or scales (e.g. achievement ratings), and/or numeric values (e.g. costs)
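As a minimal sketch, a dataset of this kind can be held as one record per case, with binary attribute columns and an outcome column (all project names and attribute values below are hypothetical):

```python
# Hypothetical cases-by-attributes data: one row per project (case),
# binary columns for attributes of the project and its context
# (1 = present, 0 = absent), plus an outcome column.
cases = [
    {"case": "Project_01", "strong_partner": 1, "prior_experience": 1, "govt_support": 0, "outcome": 1},
    {"case": "Project_02", "strong_partner": 0, "prior_experience": 1, "govt_support": 1, "outcome": 0},
    {"case": "Project_03", "strong_partner": 1, "prior_experience": 0, "govt_support": 1, "outcome": 1},
    {"case": "Project_04", "strong_partner": 0, "prior_experience": 0, "govt_support": 0, "outcome": 0},
]
attributes = ["strong_partner", "prior_experience", "govt_support"]
```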

Examples
- Data from the Civil Society Challenge Fund, managed by Triple Line & Crown Agents
- One sub-set chosen for analysis

2. What sort of view of causality?
- Conjunctural causes: many events are caused by combinations of factors, rather than single factors.
- Multiple conjunctural causes (equifinality): events can arise as a result of different conjunctions of conditions.
- Asymmetric causes: what causes an event to be absent may not simply be the absence of the conditions that cause its presence, but the occurrence of other, additional conditions which complicate, block or deflect change.

Multiple types of causal conditions
Causes are not simply present or absent, strong or weak, as in a statistically significant correlation. Individual causal conditions can be:
- Necessary but insufficient
- Sufficient but unnecessary
- Necessary and sufficient
- Neither necessary nor sufficient
There are other kinds as well, describing their role within configurations of conditions, e.g. INUS and SUIN conditions.
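A minimal sketch of how necessity and sufficiency can be checked against a binary cases-by-attributes dataset (crisp sets only, no fuzzy-set calibration; the condition names and case data are hypothetical):

```python
def classify_condition(cases, condition, outcome="outcome"):
    """Classify one condition as necessary and/or sufficient for the outcome,
    using simple set relations across the observed cases (crisp sets only)."""
    # Necessary: whenever the outcome is present, the condition is also present.
    necessary = all(c[condition] == 1 for c in cases if c[outcome] == 1)
    # Sufficient: whenever the condition is present, the outcome is also present.
    sufficient = all(c[outcome] == 1 for c in cases if c[condition] == 1)
    return necessary, sufficient

# Hypothetical data: A is sufficient but not necessary, B is necessary but not sufficient.
cases = [
    dict(A=1, B=1, outcome=1),
    dict(A=0, B=1, outcome=1),
    dict(A=0, B=1, outcome=0),
    dict(A=0, B=0, outcome=0),
]
for cond in ("A", "B"):
    print(cond, classify_condition(cases, cond))
```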

3. Methods combining…
Cross-case analysis (“Quant”):
- B. Pattern finding: attributes associated with outcomes
Within-case knowledge and analysis (“Qual”):
- A. Defining the boundaries of the cross-case analysis, based on “loose” theories of what is happening
- C. Investigating the results of cross-case analysis, looking for causal mechanisms within individual cases

A. Defining the boundaries
The problem – too many attributes vs cases:
- Predictive analytics: the “curse of dimensionality”
- QCA: the problem of “limited diversity”
Solutions (both can be used):
- Feature selection algorithms, used in data mining, e.g. minimise correlations between attributes; a genetic algorithm built into an Excel app
- Choices of sub-sets of attributes informed by “loose” theory, e.g. two-step or nested analyses in QCA
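A sketch of one simple feature-selection idea mentioned above: drop one attribute from every highly correlated pair, so the retained attributes carry less redundant information (the threshold, attribute names and data are hypothetical):

```python
import numpy as np

def drop_correlated(X, names, threshold=0.8):
    """Greedy feature-selection sketch: drop one attribute from every pair whose
    absolute correlation exceeds the threshold, keeping the first-listed one."""
    corr = np.abs(np.corrcoef(X, rowvar=False))  # attribute-by-attribute correlations
    keep = []
    for j, name in enumerate(names):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return [names[j] for j in keep]

# Hypothetical binary case-by-attribute matrix (rows = projects, columns = attributes).
X = np.array([[1, 1, 0, 1],
              [1, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 0]])
print(drop_correlated(X, ["A", "B", "C", "D"]))
```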

A range of loose Theories of Change
- Types of project outcomes
- Categories of project attributes (52 > 7)

B. The pattern finding challenge
- A huge number of possibilities remain even after using loose theories to focus our attention
- A simple data set with 15 project attributes has 2^15 = 32,768 possible combinations
- Which of these will provide the best predictors of the outcomes we are interested in?
- How do we search this space to find out?
- How do we evaluate the results?

6+ types of search strategies
1. Theory-led hypothesis testing
   - Prior research and theory increase the chances of useful findings
   - But narrow in focus, because testability requires specifics
   - High likelihood of missing unexpected solutions
   - But can be speeded up with a simple Excel app*
2. Exhaustive search (see the sketch below)
   - Every possibility is tested, but time consuming
   - Feasible with many small data sets / binomial data
   - Can be speeded up with R or an Excel app
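A minimal sketch of an exhaustive search over simple conjunctive prediction rules: every combination of attributes that might need to be jointly present is tested and scored by accuracy. The data and attribute names are hypothetical, and disjunctions of configurations are not searched here.

```python
from itertools import combinations

def accuracy(cases, rule, outcome="outcome"):
    """Accuracy of the rule 'predict the outcome present when all attributes
    in `rule` are present, otherwise absent' over all cases."""
    hits = sum(all(c[a] == 1 for a in rule) == bool(c[outcome]) for c in cases)
    return hits / len(cases)

def exhaustive_search(cases, attributes):
    """Test every possible conjunction of attributes (2^n - 1 candidate rules)
    and return the best-scoring one. Feasible only for small attribute sets."""
    candidates = (tuple(rule)
                  for n in range(1, len(attributes) + 1)
                  for rule in combinations(attributes, n))
    best = max(candidates, key=lambda rule: accuracy(cases, rule))
    return best, accuracy(cases, best)

# Hypothetical binary data: one dict per project.
cases = [
    dict(A=1, B=1, C=0, outcome=1),
    dict(A=1, B=0, C=1, outcome=1),
    dict(A=0, B=1, C=1, outcome=0),
    dict(A=1, B=1, C=1, outcome=1),
    dict(A=0, B=0, C=0, outcome=0),
]
print(exhaustive_search(cases, ["A", "B", "C"]))
```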

Algorithmic searches
3. QCA (Quine-McCluskey) *
   - Origins in political science, now also used for evaluation
   - Technically demanding method to master and to communicate
   - Specialist software available (fs/QCA, etc.)
4. Decision Tree algorithms ***
   - Origins in data mining; more widely used, but not yet in evaluations
   - Easy-to-read visual display of results
   - Open source software available (RapidMiner)
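As an illustration, a classification tree can be fitted in a few lines with scikit-learn (used here instead of the RapidMiner tool named above; the attribute names and data are hypothetical):

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical binary project data: attribute values per case, and outcomes.
X = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1], [0, 0, 0]]
y = [1, 1, 0, 1, 0]
names = ["strong_partner", "prior_experience", "govt_support"]

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(export_text(tree, feature_names=names))  # readable text rendering of the tree
```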

Example of QCA results display
Outcome 1: Users use ICTs to report rural water supply functionality to the local government
- (A*B*C*D*E) + (A*B*c*D*E) + (a*B*c*d*e) = Outcome present
- (a*b*c*d*E) + (A*B*c*d*E) = Outcome absent
Model uses 5 of 9 attributes.
Source: Testing the Waters: A Qualitative Comparative Analysis of the Factors Affecting Success in Rendering Water Services Sustainable Based on ICT Reporting, Itad.
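In this notation an upper-case letter means the condition is present, a lower-case letter means it is absent, “*” is AND and “+” is OR. A small sketch of how such a solution can be read mechanically against case data (the configurations and cases below are hypothetical, not those in the Itad study):

```python
def covers(case, configuration):
    """True if the case matches a configuration such as A*B*c, encoded as {"A": 1, "B": 1, "C": 0}."""
    return all(case[cond] == value for cond, value in configuration.items())

# Hypothetical solution: (A*B*c) + (a*B*C)
solution = [
    {"A": 1, "B": 1, "C": 0},
    {"A": 0, "B": 1, "C": 1},
]

# Hypothetical cases with observed outcomes.
cases = [
    {"case": "P1", "A": 1, "B": 1, "C": 0, "outcome": 1},
    {"case": "P2", "A": 0, "B": 1, "C": 1, "outcome": 1},
    {"case": "P3", "A": 1, "B": 0, "C": 1, "outcome": 0},
]
for c in cases:
    predicted = any(covers(c, conf) for conf in solution)
    print(c["case"], "predicted:", predicted, "observed:", bool(c["outcome"]))
```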

Decision Tree results display
- Nodes = project attributes
- Branch values = attribute present or absent
- Leaves = associated outcomes – higher effectiveness
  - 1.0 = predicted present, 2.0 = predicted absent
  - Blue = observed present, Red = observed absent
Data from the Civil Society Challenge Fund, managed by TripleLine.

Algorithmic searches
5. Genetic algorithms **
   - Widely used to solve business and engineering optimisation problems, but not yet in evaluations
   - Available as a free add-in to Excel; easy to use
   - Can triangulate results from QCA and/or Decision Trees
   - Can discover new/alternate solutions, e.g. 3 rather than 5 configurations in the Water Aid data, using 2 vs 5 conditions
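A very small genetic-algorithm sketch of the same search idea (a generic GA in Python, not the Excel Solver add-in itself): each individual is a candidate rule saying which attributes must be present, fitness is prediction accuracy over the cases, and all data are hypothetical.

```python
import random

cases = [
    dict(A=1, B=1, C=0, D=1, outcome=1),
    dict(A=1, B=0, C=1, D=0, outcome=1),
    dict(A=0, B=1, C=1, D=1, outcome=0),
    dict(A=1, B=1, C=1, D=0, outcome=1),
    dict(A=0, B=0, C=0, D=1, outcome=0),
]
attributes = ["A", "B", "C", "D"]

def fitness(rule):
    """Accuracy of: predict the outcome present when every attribute flagged in the rule is present."""
    hits = sum(
        all(c[a] == 1 for a, flag in zip(attributes, rule) if flag) == bool(c["outcome"])
        for c in cases
    )
    return hits / len(cases)

def evolve(generations=50, pop_size=20, mutation_rate=0.1):
    population = [[random.randint(0, 1) for _ in attributes] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]            # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(attributes))   # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if random.random() < mutation_rate else g for g in child]  # mutation
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return [a for a, flag in zip(attributes, best) if flag], fitness(best)

random.seed(1)
print(evolve())  # prints the best rule found and its accuracy
```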

Solver Excel add-in
- OBJECTIVE: value that needs to be maximised, e.g. accuracy of prediction
- VARIABLES: values that can be varied, e.g. attributes of a project to be present or absent
- CONSTRAINTS: on how project attributes can be varied
- SOLUTIONS: for 60 x 20 data sets, usually found within 1 minute

6. Ethnographic and participatory methods
Ethnographic Decision Tree Modelling (Gladwin, 1989):
- Time-consuming development process
- Performance is measurable and independently verifiable
Hierarchical card sorting (Davies, 1993) **:
- Much quicker to use
- Results in the form of a readable tree structure
- Performance is measurable and independently verifiable
Both tap into existing, often informal and semi-tacit knowledge and make it more explicit and testable.

Ethnographic Decision Tree example
Model of factors affecting farmers’ decisions about whether or not to harvest thatching materials.

Hierarchical Card Sort example
Comic Relief CYPAR project portfolio 2015: classification and prediction of their relative achievements.

Assessing model performance
- Results of models produced by all methods can be collated and compared using a Confusion Matrix (i.e. a simple truth table), with the number of cases fitting into each result category entered in its cells
- Numbers within the matrix form the basis of a menu of different performance measures (see Wikipedia)
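A minimal sketch of collating predicted vs observed outcomes into the four cells of a confusion matrix (the predicted and observed values below are hypothetical):

```python
# Hypothetical predicted and observed outcome values (1 = present, 0 = absent), one per case.
predicted = [1, 1, 1, 0, 0, 1, 0, 0]
observed  = [1, 1, 0, 0, 1, 1, 0, 0]

tp = sum(p == 1 and o == 1 for p, o in zip(predicted, observed))  # true positives
fp = sum(p == 1 and o == 0 for p, o in zip(predicted, observed))  # false positives
fn = sum(p == 0 and o == 1 for p, o in zip(predicted, observed))  # false negatives
tn = sum(p == 0 and o == 0 for p, o in zip(predicted, observed))  # true negatives

print("             observed=1  observed=0")
print(f"predicted=1  {tp:10d}  {fp:10d}")
print(f"predicted=0  {fn:10d}  {tn:10d}")
```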

Performance measure examples
- Support: the % of cases that a prediction rule applies to
- Prevalence: the % of cases which have the outcome present
- Accuracy: the % of cases that are True Positives and True Negatives
- Lift: the incidence of True Positives compared to the Prevalence of the outcome
- And many others…
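These measures can all be derived from the four confusion-matrix cells. A sketch using one common formulation of each (the cell counts passed in at the end are hypothetical):

```python
def measures(tp, fp, fn, tn):
    """Performance measures derived from confusion-matrix cell counts."""
    total = tp + fp + fn + tn
    support    = (tp + fp) / total              # share of cases the prediction rule applies to
    prevalence = (tp + fn) / total              # share of cases with the outcome present
    accuracy   = (tp + tn) / total              # share of cases predicted correctly (TP + TN)
    lift       = (tp / (tp + fp)) / prevalence  # true positives among predictions vs prevalence
    return {"support": support, "prevalence": prevalence, "accuracy": accuracy, "lift": lift}

print(measures(tp=8, fp=2, fn=3, tn=7))
```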

Simplicity also matters
Defined as:
- The number of prediction rules needed to account for all outcomes, and
- The number of attributes within all of those rules
Why bother:
- Simple rules would be easier to put to use in project selection or design
- Simpler rules tend to have wider applicability

Algorithms augment but don’t replace
Algorithms don’t always produce a unique best answer, because of:
- Limitations of the algorithm, e.g. QCA, GA
- The nature of the data (e.g. many attributes vs cases): more than one attribute can be an accurate predictor of the outcome
Manual tweaking of predictive models helps to:
- Find simpler but equally well performing models
- Explore “adjacent possible” models that also do well
- Identify the relative importance of parts of the model

3. Investigating the results of cross-case analysis
Two steps:
- Case selection – the subject of increased interest (Gerring, J. & Cojocaru, L., “Case-Selection: A Diversity of Methods and Criteria”). Needs to be transparent and replicable, otherwise there is a risk of confirmation bias
- Within-case analysis – process tracing is the most cited method (no further discussion here)

One case selection strategy
Current experiment via ITAD’s macro-evaluation of Empowerment & Accountability:
- Use Hamming distance to measure case similarity, e.g. comparing the attribute profiles of project A and project B = 5 commonalities
- Two kinds of similarity measures: similarity of any two cases, and average similarity of one case with all others
- The case with the highest average similarity = the “modal” case
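A sketch of these similarity measures: counting shared attribute values (the complement of the Hamming distance) between pairs of cases, then finding the case with the highest average similarity to all others. The attribute profiles below are hypothetical.

```python
# Hypothetical binary attribute profiles, one list per case.
cases = {
    "P1": [1, 0, 1, 1, 0],
    "P2": [1, 1, 1, 0, 0],
    "P3": [1, 0, 1, 1, 1],
    "P4": [0, 1, 0, 0, 1],
}

def similarity(a, b):
    """Number of attribute values two cases share (their 'commonalities')."""
    return sum(x == y for x, y in zip(a, b))

def modal_case(cases):
    """Name of the case with the highest average similarity to all other cases."""
    def avg_sim(name):
        others = [v for k, v in cases.items() if k != name]
        return sum(similarity(cases[name], o) for o in others) / len(others)
    return max(cases, key=avg_sim)

print(modal_case(cases))
```

The same measure can be applied separately within the True Positive, False Positive and False Negative groups to pick the modal case of each type, as described on the next slide.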

Use the Confusion Matrix to find and compare 3 types of modal cases
- Select a modal True Positive case, to find any likely causal mechanisms connecting the conditions that make up the configuration
- Select a modal False Positive case: given the presence of the same configuration of conditions, one would expect the SAME mechanism to be present, BUT with some other factor blocking it from working, i.e. from delivering the outcome
- Select a modal False Negative case: given the absence of the same configuration of conditions, one would NOT expect the same causal mechanism to be present

In Summary: When to use what?
Ethnographic / participatory methods:
- When you want to understand/explore/test the theories of specific stakeholders
- When there is no data set readily at hand
QCA:
- When there is a reasonably well developed Theory of Change
- When the data set has sufficient diversity

In Summary: When to use what?
Decision Tree algorithms:
- When easily communicable results are needed
Genetic algorithms:
- When a quick exploration for alternate solutions is needed
Exhaustive search:
- When certainty is needed that a solution is the best available
Excel app (EvalC3):
- When you want to tweak the results of all of the above

Where to use any of these?
- Where numerical data is hard to find
- Where experimental approaches are impractical
- Where causal complexity is likely to be high: large, decentralised projects with diverse interventions and contexts, e.g. grant-making programs, participatory development programs

What to look out for
How good is the underlying data?
- Relevant attributes included?
- Sufficient range of attribute values?
- Diversity of cases? Minimise redundant configurations; maximise the proportion of all possible configurations
- Did stakeholders’ views inform the selection of attributes and outcomes for analysis?

What to look out for
Transparency of process:
- Who participates and how
- What performance measures have been used
- What search parameters have been set, e.g. with GA, Decision Tree
- Have the results been triangulated?
- Was the approach to within-case analysis systematic and transparent?

Some lessons so far?
1. “Successful models” are not distinct entities – the “adjacent possible” is always worth investigating
2. It’s not only success (True Positives) that matters – investigating False Positives and False Negatives will help improve existing models
3. Unambiguous success (no FPs or FNs) is rare – find an acceptable level of accuracy and support

Surplus material hereafter

Software options
- QCA: see the 8 available software packages
- Decision Trees: RapidMiner Studio
- Genetic algorithms: Solver add-in to Excel
- EvalC3: application under development (early adopters are welcome)

Initial QCA result
- Found 5 configurations, using 5 project attributes
- But these could be simplified down to 3 configurations using 2 project attributes
Source: Testing the Waters: A Qualitative Comparative Analysis of the Factors Affecting Success in Rendering Water Services Sustainable Based on ICT Reporting, Itad, 2015.

Comic Relief results
Predictions of the HCS+ model vs observations in project assessments:

                            Observed more successful   Observed less successful
Predicted more successful   14                         4
Predicted less successful   5                          4

Prevalence = (14+5)/(14+4+5+4) = 70%
Support = 14/(14+5) = 74%
Accuracy = (14+4)/(14+4+5+4) = 67%
Lift = (TP/(TP+FP)) / ((TP+FN)/(TP+FP+FN+TN)) = (14/(14+4)) / ((14+5)/(14+4+5+4)) = 1.04

“Prediction is very difficult, especially if it’s about the future.” – Niels Bohr
“Ninety per cent of problems have already been solved in some other field. You just have to find them.” – Tony McCaffrey

More evidence, less poverty
Dean Karlan, Scientific American, October 2015: “We must figure out what works and what does not.”
But others are asking: …what works for whom, in what circumstances?
And others should be asking: …and how often, and how often is good enough?