1
Predictability and Prediction of Social Processes
Rich Colbaugh*† Kristin Glass*
*New Mexico Institute of Mining and Technology
†Sandia National Laboratories
April 2007
2
Introduction
Objective: Develop formal analytics capability for important social processes, with emphasis on predictive analysis. Sample tasks of interest include:
▫ assessing predictability of given social process;
▫ recommending observables upon which to base prediction;
▫ forming useful predictions even when using noisy/incomplete data.
Challenge: Social decision-making is a critical aspect of many social processes and is notoriously difficult to predict.
3
Foundations
Low cognition agent models
Multi-scale representation of social processes via S-HDS framework:
▫ micro-scale – simple models for individual agent dynamics;
▫ meso-scale – collective dynamics within social context;
▫ macro-scale – aggregation of collective dynamics across contexts.
4
Foundations
Predictability assessment
Basic questions. Given a social process and prediction question of interest:
▫ Is the process/question pair predictable? (Does any combination of available knowledge/observations regarding the social process enable the desired prediction?)
▫ Which observables are most useful for making predictions?
▫ Can predictions be formed using noisy, incomplete versions of these data?
▫ If the problem is unpredictable, can it be refined to one which is predictable?
5
Foundations
Predictability assessment (cont’d)
We have developed rigorous, computationally tractable methods for evaluating predictability of a given social process/question pair.
Approach:
▫ basic idea – assess reachability properties of process of interest and determine if properties are in conflict with prediction goals;
▫ example – if A, B are both reachable from indistinguishable configurations then process is unpredictable;
▫ method – one-dimensional abstraction of social process models enables computationally tractable, provably correct reachability assessments without simulations.
[Figure: diagram labeled IC, A, B]
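The "both reachable from indistinguishable configurations" criterion can be illustrated on a finite toy process. This is only a sketch under stated assumptions: the slides work with continuous models and a one-dimensional abstraction rather than graph search, and the transition table `toy`, the state names, and the helper functions below are hypothetical.

```python
from collections import deque


def reachable(transitions, start_states):
    """Return every state reachable from any state in start_states (breadth-first search)."""
    seen = set(start_states)
    queue = deque(start_states)
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_predictable(transitions, indistinguishable_starts, outcome_a, outcome_b):
    """The A-versus-B question is unpredictable if both outcomes are
    reachable from start states the observer cannot tell apart."""
    reach = reachable(transitions, indistinguishable_starts)
    return not (outcome_a in reach and outcome_b in reach)


# Hypothetical toy process: IC1 leads to A, IC2 leads to B.
toy = {"IC1": ["m1"], "IC2": ["m2"], "m1": ["A"], "m2": ["B"], "A": [], "B": []}

# If IC1 and IC2 look identical to the observer, A vs. B cannot be predicted.
merged_starts_predictable = is_predictable(toy, ["IC1", "IC2"], "A", "B")
# If the observer can single out IC1, only A is reachable.
single_start_predictable = is_predictable(toy, ["IC1"], "A", "B")
```

In this toy setting, refining the observation so that IC1 and IC2 become distinguishable turns an unpredictable question into a predictable one.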
6
Foundations
One-dimensional abstraction
Example One: nondeterministic systems.
System $\Sigma_{nd}$: $dx/dt = f(x,d)$, where $x \in X$ and $d(t) \in D$ is “disturbance”.
Theorem: No trajectory from $X_0 \subseteq X$ reaches $X_u \subseteq X$ if $\exists B(x)$ s.t.
▫ $B(x) \le 0 \;\; \forall x \in X_0$;
▫ $B(x) > 0 \;\; \forall x \in X_u$;
▫ $(\partial B/\partial x)\, f(x,d) \le 0 \;\; \forall x \in X,\ d \in D$.
Computation: convex relaxation of theorem criteria via SOS decomposition [Parrilo 2000] and semidefinite programming (SDP); for example relax $\{B(x) \le 0 \;\; \forall x \in X_0\}$ to $\{-B(x) - \lambda_0^T(x)\, h_0(x) = \text{SOS with } \lambda_0(x) = \text{SOS}\}$.
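To make the three barrier conditions concrete, the sketch below checks them numerically for a hand-picked scalar example. Everything in it is an illustrative assumption rather than material from the slides: the dynamics $dx/dt = -x(1+d)$ with $d \in [-0.5, 0.5]$, the sets $X = [-3, 3]$, $X_0 = \{|x| \le 1\}$, $X_u = \{|x| \ge 2\}$, and the candidate $B(x) = x^2 - 2$. Sampling on a grid is only a sanity check; it is not the SOS/SDP computation named above and proves nothing by itself.

```python
def f(x, d):
    """Hypothetical scalar dynamics dx/dt = -x (1 + d)."""
    return -x * (1.0 + d)


def barrier(x):
    """Candidate barrier function B(x) = x^2 - 2 (an assumption for this example)."""
    return x * x - 2.0


def barrier_grad(x):
    """Derivative dB/dx = 2x."""
    return 2.0 * x


def grid(lo, hi, n):
    """n evenly spaced sample points in [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def check_barrier(xs, ds):
    """Check the three barrier conditions on the sampled points only."""
    init_ok = all(barrier(x) <= 0 for x in xs if abs(x) <= 1)    # B <= 0 on X_0
    unsafe_ok = all(barrier(x) > 0 for x in xs if abs(x) >= 2)   # B > 0 on X_u
    flow_ok = all(barrier_grad(x) * f(x, d) <= 0                 # dB/dx * f <= 0 on X x D
                  for x in xs for d in ds)
    return init_ok and unsafe_ok and flow_ok


xs = grid(-3.0, 3.0, 601)   # samples of X = [-3, 3]
ds = grid(-0.5, 0.5, 11)    # samples of D = [-0.5, 0.5]
certified_on_grid = check_barrier(xs, ds)
```

For this example the flow condition can also be verified by hand: $(\partial B/\partial x) f = -2x^2(1+d) \le 0$ because $1 + d \ge 0.5$.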
7
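Foundations
One-dimensional abstraction (cont’d)
Example Two: stochastic systems.
System $\Sigma_s$: $dx = f(x)\,dt + g(x)\,dw$, where $w(t)$ is a Wiener process.
Theorem: $\gamma$ is an upper bound on the probability of reaching $X_u \subseteq X$ from $X_0 \subseteq X$ while remaining in $X$ if $\exists B(x)$ s.t.
▫ $B(x) \le \gamma \;\; \forall x \in X_0$;
▫ $B(x) \ge 1 \;\; \forall x \in X_u$;
▫ $B(x) \ge 0 \;\; \forall x \in X$;
▫ $(\partial B/\partial x)\, f + (1/2)\, \mathrm{tr}\!\left[ g^T (\partial^2 B/\partial x^2)\, g \right] \le 0 \;\; \forall x \in X$.
Computation: convex relaxation of theorem criteria via SOS and SDP.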
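A small worked instance may help fix ideas. The system and sets are illustrative assumptions, not taken from the slides: pure diffusion $dx = \sigma\,dw$ (so $f = 0$, $g = \sigma$) on $X = (0, 1]$, with $X_0 = \{x_0\}$, $X_u = \{1\}$, and the candidate $B(x) = x$.

```latex
\begin{aligned}
& B(x_0) = x_0 \le \gamma, \qquad B(1) = 1 \ge 1, \qquad B(x) = x \ge 0 \quad \forall x \in X, \\
& \frac{\partial B}{\partial x}\, f
  + \tfrac{1}{2}\, \mathrm{tr}\!\left[ g^{T} \frac{\partial^{2} B}{\partial x^{2}}\, g \right]
  = 1 \cdot 0 + \tfrac{1}{2}\, \sigma \cdot 0 \cdot \sigma = 0 \le 0 .
\end{aligned}
```

All four conditions hold with $\gamma = x_0$, so the probability of reaching $1$ before leaving $X$ is at most $x_0$. This agrees with the classical result that Brownian motion started at $x_0 \in (0,1)$ hits $1$ before $0$ with probability exactly $x_0$, so in this simple case the bound is tight.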
8
Case studies
On-line markets
Objective: study social process “paradox” – outcomes of social processes are often both unequal and unpredictable – and explore the possibility of prediction in on-line markets.
Empirical data: music and software markets.
9
Case studies
On-line markets (cont’d)
Summary of software market study (for description of music market see [Watts et al. 2006]):
data source: CNET software library (www.download.com) consisting of >30,000 programs with associated news/reviews/prices/technical information, one month data collection period;
main findings:
▫ daily download market share of item is (statistically significantly) positively correlated with cumulative item downloads, negatively correlated with item age, and not (statistically) affected by other information (e.g., expert reviews, user reviews, technical data);
▫ average quality of “most popular” software is not distinguishable from average quality of all software available on site.
10
Case studies
On-line markets (cont’d)
Model: “low cognition” agent model in which each agent selects option based upon evaluation of option quality and consideration of choices of other agents.
Predictability assessment: formal analysis of predictability of market share winners/losers
▫ shows prediction is feasible in both low social influence (SI) and high SI cases;
▫ identifies observable for which prediction is feasible – limited, very early market share time series.
Predictability assessment (both for high SI):
▫ unpredictability at t = 0: 0.63
▫ unpredictability at t = 10: $\sim 10^{-3}$
Predictability assessment (both for t = 0):
▫ unpredictability w/ high SI: 0.63
▫ unpredictability w/ low SI: $\sim 10^{-3}$
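The following is a minimal sketch of how a low cognition choice model of this general kind could be simulated. The slides do not give the model equations, so the choice rule, the weighting formula, and all names (`simulate_market`, `qualities`, `social_influence`, `n_agents`) are assumptions made for illustration, not the authors' model.

```python
import random


def simulate_market(qualities, social_influence, n_agents, seed=0):
    """Agents arrive one at a time and pick an option with probability
    proportional to quality, blended with current market share.

    social_influence = 0 means agents ignore others; values near 1 mean
    choices are driven mostly by what earlier agents picked (assumed rule).
    Returns the market share vector after each agent's choice.
    """
    rng = random.Random(seed)
    n_options = len(qualities)
    downloads = [0] * n_options
    share_history = []
    for _ in range(n_agents):
        total = sum(downloads)
        weights = []
        for quality, count in zip(qualities, downloads):
            # Before anyone has chosen, treat all shares as equal.
            share = count / total if total else 1.0 / n_options
            popularity = share * n_options  # equals 1.0 when shares are uniform
            weights.append(
                quality * ((1.0 - social_influence) + social_influence * popularity)
            )
        choice = rng.choices(range(n_options), weights=weights)[0]
        downloads[choice] += 1
        share_history.append([count / (total + 1) for count in downloads])
    return share_history


# Same option qualities, two hypothetical social influence levels.
low_si = simulate_market([1.0, 1.0, 2.0], social_influence=0.1, n_agents=200, seed=1)
high_si = simulate_market([1.0, 1.0, 2.0], social_influence=0.9, n_agents=200, seed=1)
```

Running the simulation with several seeds at each influence level and comparing the early rows of `share_history` against the final row is one way to explore how informative a short, early market share time series is.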
11
Case studies
On-line markets (cont’d)
Prediction: sample results
Simulated music market:
1. Estimate SI level (low or high) within “multiple models” framework using convergence rate of market share variance.
2. Predict market share using algorithm appropriate for SI level.
Experimental music market: scheme appropriately classifies market share trajectories as corresponding to low or high SI and enables useful market share predictions in each case.
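Step 1 can be sketched as follows, with heavy caveats. The slides do not define how the convergence rate is computed or how it maps to an SI level, so this example assumes that the variance is taken across options at each time step, that the variance settles roughly exponentially, and that faster convergence indicates low SI. The function names and `rate_threshold` are hypothetical.

```python
import math


def share_variance(shares):
    """Population variance of the market shares across options at one time step."""
    mean = sum(shares) / len(shares)
    return sum((s - mean) ** 2 for s in shares) / len(shares)


def convergence_rate(variances):
    """Estimate an exponential convergence rate from a variance time series.

    Fits a least-squares line to log|v[t+1] - v[t]| against t and returns
    the negated slope, so a larger value means faster settling.
    """
    points = []
    for t in range(len(variances) - 1):
        diff = abs(variances[t + 1] - variances[t])
        if diff > 0.0:  # skip steps with no change (log undefined)
            points.append((t, math.log(diff)))
    if len(points) < 2:
        raise ValueError("need at least two non-zero variance changes")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    return -sxy / sxx


def classify_si(variances, rate_threshold):
    """Assumed rule: fast variance convergence -> 'low' SI, otherwise 'high'."""
    return "low" if convergence_rate(variances) >= rate_threshold else "high"
```

Given a list of per-step share vectors, the input series would be `[share_variance(row) for row in share_history]`; the threshold itself would have to be calibrated on data with known SI level.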
12
Case studies
Movie revenue predictability/prediction
Objective: explore possibility of predicting total box office revenue for a given movie.
Empirical data: movie industry – weekly box office receipts, production budget and marketing expenses; personnel data; on-line ratings – critic reviews, daily user reviews; for 40 movies.
Studio view: “Nobody knows anything” – screenwriter William Goldman.
13
Case studies
Movie revenue predictability/prediction (cont’d)
Sample predictability results
Classical “input-output” prediction: movie revenues are power law distributed with infinite variance, so classical revenue forecasts have zero precision.
Dynamics-based prediction: formal analysis of predictability of total box office revenue suggests
▫ prediction is feasible only if some form of time series data is available;
▫ useful predictions may be obtainable using only on-line user reviews (as proxy for “buzz”).
Predictability assessment (both for high SI):
▫ unpredictability at t = 0: 0.83
▫ unpredictability at t = 10: $\sim 10^{-2}$
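The infinite-variance claim can be made explicit for one common power law form. The Pareto density, its scale $x_m$, and its tail exponent $\alpha$ are assumptions introduced here for illustration; the slides do not state which power law was fitted.

```latex
p(x) = \frac{\alpha\, x_m^{\alpha}}{x^{\alpha+1}}, \quad x \ge x_m > 0,
\qquad
\mathbb{E}[x^2] = \int_{x_m}^{\infty} \alpha\, x_m^{\alpha}\, x^{1-\alpha}\, dx
= \begin{cases}
\dfrac{\alpha\, x_m^{2}}{\alpha - 2}, & \alpha > 2, \\[2ex]
\infty, & \alpha \le 2 .
\end{cases}
```

Under this form, infinite variance corresponds to a tail exponent $\alpha \le 2$. In that regime the spread of revenues around any point forecast is unbounded, which is the sense in which a classical forecast has zero precision (precision being the reciprocal of variance).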
14
Case studies
Movie revenue predictability/prediction (cont’d)
Sample prediction results
Model estimation: use “training” data (for 20 movies) to develop formula that estimates movie appeal and buzz from early on-line reviewer data.
Algorithm:
▫ estimate {appeal, buzz} for movie from very early (e.g., first week) time series using formula obtained above;
▫ predict total box office revenue for movie by evolving low cognition movie attendance model to equilibrium.