Revealing the weakness of SNA and possibly fixing it, using MAS
Bruce Edmonds, Centre for Policy Modelling, Manchester Metropolitan University


Introduction: Modelling and Social Network Analysis
Revealing the weakness of SNA and possibly fixing it, using MAS, Bruce Edmonds, SNAMAS invited Talk, AISB Leicester, March, slide-2

Modelling parts and relations
[Diagram labels: Object System (known, unknown); Formal Model; input (parameters, initial conditions etc.); output (results); encoding (measurement); decoding (interpretation)]
All the stages are necessary for the model to be useful

Modelling ideas rather than observed systems
[Diagram labels: Object System; conceptual model; Model]

Some ‘scientific’ uses of modelling
– Prediction: provide information about a current unknown by inference from known information
– Explanation: provide an explanation of how an outcome resulted from some conditions
– Analogy: provide a framework for (or a way of) thinking about a complex system
But there are many other uses: illustration, personal exploration, persuasion, counter-example, etc.

Social Network Analysis
– Abstracts a target system to a system of (possibly rich and dynamic) nodes and arcs
– It is necessary to decide what a node and an arc are (in terms of what nodes and arcs represent in the target system)
– Key idea: the structure of the abstracted network tells us something useful about the properties of the system

Modelling parts and relations with a Social Network Model
[Diagram labels: Object System (known, unknown); SNA Model; representation in terms of arcs and nodes; output in terms of visualisation, measures etc.; interpretation of output]
It is not a model until there is an analysis of a network that can be interpreted in terms of the object system

About SNA Models
– Representing anything as a network involves many decisions as to how to do this
– The resulting representation is only a model if one can infer anything from it
– Often this inference is implicit or informal
– If the inference is specified it can be called a Social Network Model (SNM)
– Any model of an observed system is a contingent (mini-)theory; that is, it could be wrong, however plausible it seems

Descriptive Network Analysis
[Diagram labels: A; B; a measure on the network, M(x)]
– Based on an already existing good understanding of what is happening in the target system
– Choose and use measures etc. to illustrate that understanding
Measures upon networks – the very idea!, Bruce Edmonds, Social Network Conference, London, 14th July, slide-9

Network Analysis as a Generative Model
[Diagram labels: A; B; e.g. using a measure on the network, M(x)]
– Given an observed system, model it with a SNM
– Use the model to infer something about the model that is meaningful in terms of the observed system

Validating a Social Network Model
– Since a SNA model is a (complex) contingent hypothesis about the target system, to be trusted it needs to be independently validated (strong validation)
– This is very expensive to do with SNMs, since not only does the data need to be collected and the model built, but the model also has to be tested against what is measured in the target system
– So instead it is usual to validate weakly, using the intuitions of the researcher who did the analysis, which is clearly insufficient if we are to rely on it for any purpose (e.g. understanding)

Using ABS to Probe SNA Assumptions
– However, we can explore the robustness of SNA against plausible social simulations: an artificial test-bed for SNA
– This can indicate the conditions under which a particular SNM can be relied upon (or not), given which assumptions
– If a SNM of an ABS cannot be made to work, how could we rely on this when considering real social phenomena?

Example 1: Analysing a Simulation of a P2P System

Example I: A Peer-to-Peer (P2P) File-sharing system
Collection of ‘servers’, each of which:
– Is controlled by a user to some extent
– ‘Knows’ a limited number of servers, with which it can communicate (the network)
– Makes some (or no) files available for download by other servers
– Search for files is by flood-fill (i.e. send query to n others, who send it to n others…)
– If a query matches an available file, it is sent back to the originator
E.g. BitTorrent
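The flood-fill search described on this slide can be sketched roughly as follows. This is only an illustration, not the talk's simulation code: the function name, the `fanout` and `max_hops` parameters, and the dictionary-based data layout are all assumptions introduced here.

```python
from collections import deque


def flood_fill_search(neighbours, files, origin, wanted, fanout=3, max_hops=4):
    """Return the set of servers holding `wanted`, as found by flooding from `origin`.

    neighbours: dict mapping a server to the list of servers it 'knows' (hypothetical layout)
    files:      dict mapping a server to the set of files it makes available
    fanout:     how many known servers each server forwards the query to (the slide's n)
    max_hops:   assumed cut-off so that the flood terminates
    """
    seen = {origin}
    frontier = deque([(origin, 0)])
    hits = set()
    while frontier:
        server, hops = frontier.popleft()
        # A match is reported back to the originator.
        if server != origin and wanted in files.get(server, set()):
            hits.add(server)
        if hops == max_hops:
            continue  # do not forward the query any further
        for nxt in neighbours.get(server, [])[:fanout]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, hops + 1))
    return hits
```

A server that shares nothing simply has an empty (or missing) entry in `files`, which is how the co-operate/defect choice of the following slide would show up in this sketch.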

A Simulation of a P2P System
– 50 servers, each of which can decide to share files (coop) or not (def) at any time
– Servers try to collect ‘sets’ of related files, stored (initially) randomly, by sending queries
– Satisfaction is measured by success at collecting files, minus the (small) cost of dealing with others’ queries (but decays over time)
– A server may look at and copy what a more satisfied server does, or may drop out and be replaced (especially if satisfaction is low)

Number of co-operators in a run of the simulation (out of 50)
Key issue is the number (and manner) of cooperation
– Why does anyone cooperate?
– How does network structure impact upon this?

Typical Emergent Network Structure

Suggests four types of node
– In-coop: those who share their files, in the core partition
– In-def: those who don’t share their files, in the core partition
– Out-coop: those who share their files but are outside the core partition
– Out-def: those who don’t share their files and are outside the core partition
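As a rough illustration of this categorisation, the sketch below labels each node with one of the four types. It assumes that the "core partition" is the largest connected component of an undirected network, which the slides suggest but do not state outright; the function names and data layout are hypothetical.

```python
def components(links):
    """Connected components of an undirected graph.

    links: dict mapping every node to the set of its neighbours
           (assumed symmetric, with every node present as a key).
    """
    seen, parts = set(), []
    for start in links:
        if start in seen:
            continue
        part, stack = set(), [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            part.add(node)
            for nb in links[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        parts.append(part)
    return parts


def classify(links, shares):
    """Label each node in-coop, in-def, out-coop or out-def.

    shares: dict mapping node -> True if it currently shares its files.
    The 'core partition' is taken to be the largest component (an assumption).
    """
    core = max(components(links), key=len)
    return {
        node: ("in-" if node in core else "out-") + ("coop" if shares[node] else "def")
        for node in links
    }
```

Because both the network and the sharing decisions change over time in the simulation, a labelling like this would have to be recomputed for each snapshot.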

Some General Statistics (over all the runs, for all nodes and later times)
Table columns: Type; Average utility; Average number of links; Average centrality
Table rows: in-coop; out-coop; in-def; out-def

Stop!! Time for a Thought Experiment
If these were observations of a real P2P system and not a simulation, what would you conclude from this analysis?
1. That the kind of node (using the above categorisation) was a significant factor in the utility of nodes?
2. That either a node’s number of links or its centrality was a significant factor in achieving its utility?
Wouldn’t a paper that came to positive conclusions on these questions be publishable?

Over all kinds of nodes and later times and runs

Regression coefficients with satisfaction levels of nodes
Table columns: Type; Number of links; Number of links lagged 6 periods; Centrality; Centrality lagged 6 periods
Table rows: in-coop; out-coop; in-def; out-def
Other measures and lags had lower correlations, including those computed only in aggregate
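To make the idea of a lagged relationship concrete, here is a minimal sketch that correlates a node's network measure with its satisfaction some periods later. It uses a plain Pearson correlation rather than whatever regression the talk actually ran, and it assumes that "lagged 6 periods" pairs the measure at time t-6 with satisfaction at time t; both function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (undefined if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def lagged_correlation(measure, satisfaction, lag=6):
    """Correlate measure[t - lag] with satisfaction[t] for one node's time series."""
    if len(measure) != len(satisfaction) or not 0 <= lag < len(measure):
        raise ValueError("series must have equal length and lag must fit inside them")
    return pearson(measure[: len(measure) - lag], satisfaction[lag:])
```

Applying this separately to each node type, as in the table, would mean filtering the time series by the node's category before computing the correlation.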

Size of partitions during a run
[Figure legend: blue is the size of the largest partition; green is the 2nd largest (if there is one); red, orange, etc. are even smaller ones]

Conclusion of P2P Case-study
– The global measures were not very useful in providing leverage for understanding
– It can be unsafe to assume that such measures derived from empirical studies give a helpful picture of the role of networks
– The structural analysis based on the detailed understanding of the dynamics created a more useful categorisation of node types (but this is precisely the kind of understanding that is difficult to obtain when the system is real rather than simulated)
– Given this understanding it might be possible to choose better measures etc.
– It is important to distinguish demonstrating an existing understanding of a network from fishing for understanding using SNA measures

Example 2: A “simple” abstract system

The Target System
So here I will consider this system, looking at the question of whether any measure can be relied upon to indicate eventual node importance. It is:
– Relatively simple
– Deterministic
– One about which we have almost complete information (behaviour, links, etc.) to help us choose our measure

The Abstract System

Basic System Outline: Giving Agent System with Plans (GASP)
– Fixed number of agents: A_1, A_2, …, A_n
– Each agent, A_x, has:
  – a store, S_x
  – a fixed number of plans: P_x1, P_x2, …
– Each plan, P_xy, is composed of instructions:
  – a fixed number of “give one to” instructions
  – and one test: if S_i is zero then do plan j next, otherwise plan k next
– Each time click, all agents do: get 1 unit; use current plan to: [do giving (while they have); test others; note next plan]
Using the Experimental Method to Produce Reliable Self-Organised Systems, B. Edmonds, ESOA 2004, New York, July 2004, slide-28

Thus all that happens in GASP systems is:
– Agents have a fixed set of very simple plans/programs
– Their state is the amount in their store and the index of the current plan
– All they do is give fixed amounts to other agents according to their current plan
– All they can perceive is whether another agent’s store is zero or not…
– …which determines the index of the next plan in a fixed way

An illustration of a GASP system
Plan 1: G 3; G 2; JZ 2,1,3
Plan 2: JZ 1,2,3
Plan 3: G 2; JZ 2,3,3
[Diagram labels: agents (Agent 3 etc.); “Check if zero”; Store: 4, 27]
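The GASP rules above are simple enough to sketch in a few lines. The following is an illustrative reading of the slides, not the author's implementation. It assumes that `G r` means "give one unit to agent r" and that `JZ i,j,k` means "if agent i's store is zero, use plan j next, otherwise plan k". It also assumes an ordering within a time click (all agents receive their unit, then all give in index order, then all tests are evaluated), which the slides do not specify. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    gives: list  # recipients of one unit each, in order (the "G r" instructions)
    test: tuple  # (i, j, k): if agent i's store is zero use plan j next, else plan k


@dataclass
class Agent:
    plans: dict       # plan index -> Plan
    store: int = 0
    current: int = 1  # index of the plan currently in force


def step(agents):
    """Advance all agents by one time click (phase ordering is an assumption)."""
    # Phase 1: every agent gets one unit.
    for agent in agents.values():
        agent.store += 1
    # Phase 2: every agent gives according to its current plan, while it has units.
    for agent in agents.values():
        for recipient in agent.plans[agent.current].gives:
            if agent.store == 0:
                break
            agent.store -= 1
            agents[recipient].store += 1
    # Phase 3: every agent tests another agent's store and notes its next plan.
    next_plan = {}
    for name, agent in agents.items():
        i, j, k = agent.plans[agent.current].test
        next_plan[name] = j if agents[i].store == 0 else k
    for name, agent in agents.items():
        agent.current = next_plan[name]


def run(agents, clicks):
    """Run the system for a number of time clicks and return the final stores."""
    for _ in range(clicks):
        step(agents)
    return {name: agent.store for name, agent in agents.items()}
```

Under this reading the whole state of the system is the stores plus the current plan indices, and the dynamics are fully deterministic, which is what the undecidability argument later in the talk relies on.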

Thus the reformulated question is...
Given almost complete knowledge of a particular GASP system (except for the initial store of Agent-1), can you effectively find any measure, M, such that:
  M(A) ≥ M(B) if and only if, eventually, S(t,A) ≥ S(t,B)
  [where S(t,x) is the value of the store in agent x at time t]
That is, given this system, is there an M such that:
  M(A) ≥ M(B) ↔ ∃T: for all t > T, S(t,A) ≥ S(t,B)

And the answer is... No!
In other words, there are GASP systems where, even though we know:
– their complete behaviour (comparable to detailed interviews of all participants);
– everything possible about their social network (who they can make transfers to);
– and almost all of the initial conditions (except one value)...
...there is no measure that will tell us from the structure which nodes will be more influential than others once running.

Proof Sketch
– The class of GASP systems is Turing complete; in other words, they can compute anything a Turing Machine (TM) can (shown by a mapping into an Unlimited Register Machine, a known TM equivalent).
– If there were such a measure, then we could use it to check (without computation) that the results of two GASP systems (the end value in the store of Agent-1) were equal, by joining the two systems into one, finding the measure M, and then using it to see if the two output nodes would be equal.
– This is a known uncomputable problem.

(The pessimistic) Moral!
– Even with very simple, deterministic systems, where we know everything about the behaviour and structure of the system, there are no measures that a priori inform us about node importance.
– This backs up simulation studies where a set of apparently sensible measures fail to do the same.
– Therefore the burden of proof is on those that claim, with a largely unknown complex system, that a measure will tell us such information!
Effective measurement follows understanding

Part 4: The Cure?

Why SNM might be inadequate
– SNM are simply too abstract to adequately represent social complexity
– The jump from rich social phenomena to a simple network model is too great
– This is usually masked by:
  – The prima facie plausibility of SNM
  – The fact that SNA work is traditionally divided between:
    – Theorists, who study what can be inferred from SN
    – Social scientists, who represent using SN and then trust that the theoretical techniques work

Staging Abstraction using ABS

Example II: A Simulation of Ga-Selala (in the Limpopo Valley of South Africa)
– A complex, evidence-led simulation of a particular village
– Represents many aspects of life there, including: sexual network and HIV/AIDS spread, friendship network, kinship network, employment, savings clubs, household structure, birth and death, government grants and health
– Purpose was to assess impacts of factors, in particular how fragile the social structure might be to these factors, given the complex interplay of the various social structures and behaviours

Basic Methodology
– Repeated iterations of model development in response to stakeholder criticism, expert opinion, statistics, interviews etc.
– So that most aspects of the model had some (but varying) levels of justification from available evidence
– Result is a context-specific but dynamic “description” using a computer simulation
– Simulation is difficult to understand and slow to run, but open to experiment and inspection
– Changes in network structure can be studied in the simulation even though it is highly dynamic

Observations from running simulation experiments
– That (given the introduction of a new mining enterprise near the village) the social structure(s) collapsed
– To try to show this, snapshots of the social network were taken and their degree distributions compared using non-parametric statistics (Kolmogorov-Smirnov) to see if there is evidence of significant change
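The snapshot comparison described here can be sketched with a two-sample Kolmogorov-Smirnov (K-S) test on two degree sequences. The code below is a self-contained illustration rather than the analysis used in the talk: the function names are hypothetical, and the p-value uses a standard asymptotic approximation with a small-sample correction, which is an assumption made here. Because node degrees are integers with many ties, the resulting p-values should be treated as approximate.

```python
from math import exp, sqrt


def degree_sequence(links):
    """Degrees of all nodes, given links as {node: set_of_neighbours} (hypothetical layout)."""
    return [len(neighbours) for neighbours in links.values()]


def kolmogorov_sf(lam, terms=100):
    """Asymptotic tail probability of the Kolmogorov distribution at lam."""
    total, term = 0.0, 0.0
    for k in range(1, terms + 1):
        term = 2.0 * (-1) ** (k - 1) * exp(-2.0 * (k * lam) ** 2)
        total += term
    if abs(term) > 1e-12:
        return 1.0  # series has not converged: lam is tiny, so the p-value is 1
    return min(1.0, max(0.0, total))


def ks_two_sample(a, b):
    """Return (D, p): the two-sample K-S statistic and an approximate p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value equal to x (handles ties).
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    effective = n * m / (n + m)
    lam = (sqrt(effective) + 0.12 + 0.11 / sqrt(effective)) * d
    return d, kolmogorov_sf(lam)
```

Comparing each snapshot with the one at time 0, or with the immediately preceding one, then amounts to calling `ks_two_sample` on the corresponding pair of degree sequences and plotting the p-values over time.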

Comparing the social network over time with that at time 0
[Figure: p-values of the K-S test on the degree distributions of the social networks. Panels: initialised with a Watts-Strogatz small-world network; initialised with an Erdős random network]

Comparing the social network over time with the previous time
[Figure: p-values of the K-S test on the degree distributions of the social networks. Panels: initialised with a Watts-Strogatz small-world network; initialised with an Erdős random network]

Conclusion of talk
– SNM are weak in the sense that they are contingent and yet almost always without any independent validation
– Their apparent power comes from their simplicity and plausibility
– ABS can be used to test the assumptions behind SNA analyses in vitro
– ABS can be used to stage the abstraction from evidence to SNA, allowing chains of reference to be maintained and understanding gained to inform SNA

The End
Bruce Edmonds, Centre for Policy Modelling, Manchester Metropolitan University Business School