Predictability and Prediction of Social Processes Rich Colbaugh*† Kristin Glass* *New Mexico Institute of Mining and Technology †Sandia National Laboratories.

Presentation transcript:

Predictability and Prediction of Social Processes Rich Colbaugh*† Kristin Glass* *New Mexico Institute of Mining and Technology †Sandia National Laboratories April 2007

Introduction. Objective: develop a formal analytics capability for important social processes, with emphasis on predictive analysis. Sample tasks of interest include: assessing predictability of a given social process; recommending observables upon which to base prediction; forming useful predictions even when using noisy/incomplete data. Challenge: social decision-making is a critical aspect of many social processes and is notoriously difficult to predict.

Foundations: Low cognition agent models. Multi-scale representation of social processes via S-HDS framework: micro-scale – simple models for individual agent dynamics; meso-scale – collective dynamics within social context; macro-scale – aggregation of collective dynamics across contexts.

Foundations: Predictability assessment. Basic questions, given a social process and prediction question of interest: Is the process/question pair predictable? (Does any combination of available knowledge/observations regarding the social process enable the desired prediction?) Which observables are most useful for making predictions? Can predictions be formed using noisy, incomplete versions of these data? If the problem is unpredictable, can it be refined to one which is predictable?

Foundations: Predictability assessment (cont’d). We have developed rigorous, computationally tractable methods for evaluating predictability of a given social process/question pair. Approach: basic idea – assess reachability properties of process of interest and determine if properties are in conflict with prediction goals; example – if A, B are both reachable from indistinguishable configurations then process is unpredictable; method – one-dimensional abstraction of social process models enables computationally tractable, provably correct reachability assessments without simulations. [Figure: diagram with labels A, B, and IC.]

Foundations: One-dimensional abstraction. Example One: nondeterministic systems. System $\Sigma_{nd}$: $dx/dt = f(x,d)$, where $x \in X$ and $d(t) \in D$ is the “disturbance”. Theorem: No trajectory from $X_0 \subseteq X$ reaches $X_u \subseteq X$ if $\exists\, B(x)$ s.t. ▫ $B(x) \le 0 \;\; \forall x \in X_0$; ▫ $B(x) > 0 \;\; \forall x \in X_u$; ▫ $(\partial B/\partial x)\, f(x,d) \le 0 \;\; \forall x \in X, \; \forall d \in D$. Computation: convex relaxation of theorem criteria via SOS decomposition [Parrilo 2000] and semidefinite programming (SDP); for example, relax $\{B(x) \le 0 \;\; \forall x \in X_0\}$ to $\{-B(x) - \lambda_0^T(x)\, h_0(x) = \text{SOS}$ with $\lambda_0(x) = \text{SOS}\}$.
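As a rough illustration of the three theorem conditions for the nondeterministic case, the sketch below checks a candidate function B on a grid. Everything specific here is an assumption made for illustration and is not taken from the slides: the scalar system dx/dt = -x(1 + d), the sets X = [-3, 3], X_0 = [-1, 1], X_u = [2, 3], D = [-0.5, 0.5], and the candidate B(x) = x^2 - 1. A grid check is only a sanity test; it does not replace the SOS/SDP computation, which yields a proof.

```python
# Hypothetical 1-D example: system, sets, and B(x) are illustrative assumptions.

def f(x, d):
    """Dynamics dx/dt = -x * (1 + d) with disturbance d in D."""
    return -x * (1.0 + d)

def B(x):
    """Candidate barrier function B(x) = x^2 - 1."""
    return x * x - 1.0

def dB_dx(x):
    """Derivative of the candidate barrier function."""
    return 2.0 * x

def linspace(lo, hi, n):
    """Evenly spaced grid of n points on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

X = linspace(-3.0, 3.0, 601)            # state space X = [-3, 3]
X0 = [x for x in X if abs(x) <= 1.0]    # initial set X_0 = [-1, 1]
Xu = [x for x in X if x >= 2.0]         # set to be avoided X_u = [2, 3]
D = linspace(-0.5, 0.5, 11)             # disturbance set D = [-0.5, 0.5]

def check_barrier():
    """Return whether each of the three conditions holds on the grid."""
    cond_init = all(B(x) <= 0.0 for x in X0)
    cond_unsafe = all(B(x) > 0.0 for x in Xu)
    cond_flow = all(dB_dx(x) * f(x, d) <= 0.0 for x in X for d in D)
    return cond_init, cond_unsafe, cond_flow

print(check_barrier())
```

If all three checks hold, B is consistent with the claim that no trajectory starting in X_0 reaches X_u for this toy system, whatever the disturbance does within D.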

Foundations: One-dimensional abstraction (cont’d). Example Two: stochastic systems. System $\Sigma_s$: $dx = f(x)\,dt + g(x)\,dw$, where $w(t)$ is a Wiener process. Theorem: $\gamma$ is an upper bound on the probability of reaching $X_u \subseteq X$ from $X_0 \subseteq X$ while remaining in $X$ if $\exists\, B(x)$ s.t. ▫ $B(x) \le \gamma \;\; \forall x \in X_0$; ▫ $B(x) \ge 1 \;\; \forall x \in X_u$; ▫ $B(x) \ge 0 \;\; \forall x \in X$; ▫ $(\partial B/\partial x)\, f + \tfrac{1}{2}\, \mathrm{tr}\!\left[ g^T (\partial^2 B/\partial x^2)\, g \right] \le 0 \;\; \forall x \in X$. Computation: convex relaxation of theorem criteria via SOS and SDP.
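The stochastic bound can be sanity-checked by simulation on a toy problem. The sketch below is an assumption-laden illustration, not the authors' computation: it takes the scalar system dx = -a x dt + g dw on X = [0, 1], with X_0 = {0.3} and X_u the boundary point x = 1, and the candidate B(x) = x. That B is nonnegative on X, equals 1 on X_u, and gives (dB/dx) f + (1/2) g^2 (d^2B/dx^2) = -a x <= 0 on X, so the theorem would give the bound gamma = B(0.3) = 0.3. The Euler-Maruyama loop estimates the actual probability of reaching x >= 1 before leaving X.

```python
import math
import random

def simulate_hit_probability(x0=0.3, a=1.0, g=0.5, dt=1e-3, n_paths=1000, seed=0):
    """Euler-Maruyama estimate of P(reach x >= 1 before leaving X = [0, 1]).

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    hits = 0
    for _ in range(n_paths):
        x = x0
        # Step the SDE until the path leaves the open interval (0, 1).
        while 0.0 < x < 1.0:
            x += -a * x * dt + g * sqrt_dt * rng.gauss(0.0, 1.0)
        if x >= 1.0:
            hits += 1
    return hits / n_paths

gamma = 0.3  # bound implied by B(x) = x evaluated at x0 = 0.3
estimate = simulate_hit_probability()
print(f"estimated probability = {estimate:.3f}, barrier bound = {gamma}")
```

The estimate should fall below the bound; how far below indicates how conservative this particular choice of B is. An SOS/SDP search over polynomial B would typically look for the tightest such gamma.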

Case studies: On-line markets. Objective: study social process “paradox” – outcomes of social processes are often both unequal and unpredictable – and explore the possibility of prediction in on-line markets. Empirical data: music and software markets.

Case studies: On-line markets (cont’d). Summary of software market study (for description of music market see [Watts et al. 2006]): data source: CNET software library, consisting of >30,000 programs with associated news/reviews/prices/technical information, one month data collection period; main findings: ▫ daily download market share of an item is (statistically significantly) positively correlated with cumulative item downloads, negatively correlated with item age, and not (statistically) affected by other information (e.g., expert reviews, user reviews, technical data); ▫ average quality of “most popular” software is not distinguishable from average quality of all software available on site.

Case studies: On-line markets (cont’d). Model: “low cognition” agent model in which each agent selects an option based upon evaluation of option quality and consideration of choices of other agents. Predictability assessment: formal analysis of predictability of market share winners/losers shows prediction is feasible in both low social influence (SI) and high SI cases; identifies observable for which prediction is feasible – limited, very early market share time series. Predictability assessment results: with high SI, unpredictability is 0.63 at t = 0 and ~$10^{-3}$ at t = 10; at t = 0, unpredictability is 0.63 with high SI and ~$10^{-3}$ with low SI.
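To make the “low cognition” idea concrete, here is a minimal simulation sketch. The choice rule is an assumption for illustration only and is not the model from the slides: each arriving agent picks option i with probability proportional to quality_i * (1 + si * count_i), where count_i is the number of earlier agents who chose i and `si` is a hypothetical social influence strength. The names `simulate_market`, `qualities`, and `si` are likewise hypothetical.

```python
import random

def simulate_market(qualities=(0.2, 0.4, 0.6, 0.8), si=0.0, n_agents=2000, seed=0):
    """Simulate sequential choices and return the final market shares.

    Assumed (illustrative) rule: weight_i = quality_i * (1 + si * count_i).
    With si = 0 choices depend on quality alone; larger si means earlier
    choices by other agents carry more weight.
    """
    rng = random.Random(seed)
    counts = [0] * len(qualities)
    options = list(range(len(qualities)))
    for _ in range(n_agents):
        weights = [q * (1.0 + si * c) for q, c in zip(qualities, counts)]
        choice = rng.choices(options, weights=weights)[0]
        counts[choice] += 1
    return [c / n_agents for c in counts]

# Compare which option wins across independent runs for low and high SI.
for si in (0.0, 1.0):
    winners = []
    for seed in range(10):
        shares = simulate_market(si=si, seed=seed)
        winners.append(shares.index(max(shares)))
    print(f"si={si}: winners across runs = {winners}")
```

Running several seeds per SI level, as in the loop above, gives a quick feel for how much the identity of the market share winner varies from run to run, and for how informative a short initial segment of the share time series might be.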

Case studies: On-line markets (cont’d). Prediction: sample results. Simulated music market: 1. Estimate SI level (low or high) within “multiple models” framework using convergence rate of market share variance. 2. Predict market share using algorithm appropriate for SI level. Experimental music market: scheme appropriately classifies market share trajectories as corresponding to low or high SI and enables useful market share predictions in each case.

Case studies: Movie revenue predictability/prediction. Objective: explore possibility of predicting total box office revenue for a given movie. Empirical data: movie industry – weekly box office receipts, production budget and marketing expenses; personnel data; on-line ratings – critic reviews, daily user reviews; for 40 movies. Studio view: “Nobody knows anything” – screenwriter William Goldman.

Case studies: Movie revenue predictability/prediction (cont’d). Sample predictability results. Classical “input-output” prediction: movie revenues are power law distributed with infinite variance, so classical revenue forecasts have zero precision. Dynamics-based prediction: formal analysis of predictability of total box office revenue suggests ▫ prediction is feasible only if some form of time series data is available; ▫ useful predictions may be obtainable using only on-line user reviews (as proxy for “buzz”). Predictability assessment results (both for high SI): unpredictability is 0.83 at t = 0 and ~$10^{-2}$ at t = 10.
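The infinite-variance remark can be made explicit with a short calculation. The slides do not state the distribution's parameters, so the Pareto form and the symbols $\alpha$ (tail exponent) and $x_{\min}$ (lower cutoff) below are assumptions used only to show when the second moment of a power law diverges.

```latex
\begin{align*}
p(x) &= \frac{\alpha\, x_{\min}^{\alpha}}{x^{\alpha+1}},
  \qquad x \ge x_{\min} > 0,\ \alpha > 0, \\
\mathbb{E}[X^2] &= \int_{x_{\min}}^{\infty} x^2\, p(x)\, dx
  = \alpha\, x_{\min}^{\alpha} \int_{x_{\min}}^{\infty} x^{1-\alpha}\, dx
  = \begin{cases}
      \dfrac{\alpha\, x_{\min}^{2}}{\alpha - 2}, & \alpha > 2, \\[2ex]
      \infty, & \alpha \le 2.
    \end{cases}
\end{align*}
```

Under this assumed form, a tail exponent of at most 2 gives an infinite variance, so the usual standard-error reasoning behind a point forecast of revenue no longer applies.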

Case studies: Movie revenue predictability/prediction (cont’d). Sample prediction results. Model estimation: use “training” data (for 20 movies) to develop formula that estimates movie appeal and buzz from early on-line reviewer data. Algorithm: ▫ estimate {appeal, buzz} for movie from very early (e.g., first week) time series using formula obtained above; ▫ predict total box office revenue for movie by evolving low cognition movie attendance model to equilibrium.