Blockbusters, Bombs & Sleepers: The Income Distribution of Movies. Sitabhra Sinha, The Institute of Mathematical Sciences, Chennai (Madras), India.



A Pareto Law for Movies
Why look at movie income?
• Movie income is a well-defined quantity; the income distribution can be empirically determined
• Pareto exponent for movie income: α ≈ 2
• But asset exchange models for explaining the Pareto law in wealth/income distribution cannot be applied! Movies don’t exchange anything between themselves!!
“There’s no business like show business”

Popularity of Products/Ideas
• Movies: S Sinha & S Raghavendra (2004) Eur Phys J B, 42, 293
• Scientific Papers: S Redner (1998) Eur Phys J B, 4, 131
• Books: D Sornette et al (2004) Phys Rev Lett, 93,
Movie popularity distribution → a prominent member of the class of popularity distributions

The Popularity of Scientific Papers
Measure of popularity: citation distribution
Relation between exponents:
• α: Cumulative probability (Pareto law)
• 1+α: Probability distribution (power law)
• 1/α: Rank distribution (Zipf's law)
Figure panels (ISI; Phys Rev D): 1+α = 3, 1/α ≈ 0.48 ⇒ Pareto exponent α ≈ 2
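The three exponents on this slide follow from one another. A short derivation is sketched here under the assumption of a pure power-law tail; the symbols N (number of items), r (rank, with r = 1 the most popular) and x_r (popularity of the item at rank r) are introduced for illustration and do not appear on the slide.

```latex
\begin{align*}
P(X > x) &\sim x^{-\alpha}, \\
p(x) &= -\frac{d}{dx} P(X > x) \sim x^{-(1+\alpha)}, \\
r &\approx N \, P(X > x_r) \sim N \, x_r^{-\alpha}
  \quad \Longrightarrow \quad x_r \sim \left( \frac{r}{N} \right)^{-1/\alpha}.
\end{align*}
```

With α = 2 this gives a density exponent 1+α = 3 and a rank exponent 1/α = 0.5, close to the values 3 and 0.48 quoted for the citation data.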

The Popularity of Books
Measure of popularity: book sales at amazon.com ⇒ Pareto exponent α ≈ 2

A ‘Hit’ is Born: The Dynamics of Popularity
Conjecture: Universality. Pareto exponent for popularity distributions α ≈ 2

Outline of the Talk
• Empirical: Distributions. SS & S Raghavendra (2004) Eur Phys J B, 42: 293
• Empirical: Time evolution. SS & R K Pan, in preparation
• Model. SS & S Raghavendra (2004) SFI Working Paper; SS & S Raghavendra (2005) to appear in Practical Fruits of Econophysics, Proc 3rd Nikkei Econophysics Symposium, Springer-Tokyo


Measuring Popularity
Popularity of a movie can be estimated in various ways, e.g.:
• Number of votes received from registered users in the IMDB database
• Or, DVD/video rentals from Blockbuster stores
However, these are for movies released long ago: a lot of information is available for people to decide. What about newly released movies still running in theatres?
What’s the income, dude?

Income Distribution Snapshot Too few data points, too much scatter Each week, about movies running in theatres across USA Hard to make a call on the nature of the distribution !

The Movie Year: Seasonal Fluctuations in Movie Income over a Year
It makes sense to look at the income distribution over a whole year: we can then ignore seasonal variations.

Popularity Distribution of movies released in USA, according to weeks in Top 60
• Gaussian distribution, with a long tail: the most popular movies do not fit a Gaussian!
• Rank distribution of movies: explores the tail of the distribution containing the most popular movies
• Data for all years fall on the same curve after normalizing!!

Gross Income Distribution of movies released in USA: Opening Gross
• Kink in the distribution, indicating bimodality
• Bimodal distribution of opening gross: movies either do very badly or very well on opening!
• Distribution scaled by average gross to correct for inflation

Gross Income Distribution of movies released in USA: Opening Gross and Total Gross
• Total gross is unimodal: the only contribution of movies which perform well long after opening (sleepers)
• 1/α ≈ 0.5 ⇒ Pareto exponent α ≈ 2 at opening week, and it remains so through the entire theatre lifespan
• Distribution scaled by average gross to correct for inflation

Relation between longevity at Top 60 & Total Gross
Slope ~ 2.14, i.e. G_Total ~ T^2, where T is the time spent in the Top 60
Figure annotation: IMAX movies
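A slope such as the one quoted on this slide is typically read off a log-log plot. The following is a minimal sketch of that step, assuming an ordinary least-squares fit of log(gross) against log(weeks); the function name `loglog_slope` and the data are hypothetical, and the data are constructed to obey G = c·T² exactly rather than taken from the talk.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den

# Hypothetical, noise-free data: total gross grows as the square of weeks in the chart.
weeks = [2, 4, 8, 16, 32]
gross = [0.5 * t ** 2 for t in weeks]

slope = loglog_slope(weeks, gross)
print(round(slope, 6))
```

On real box-office data the points scatter around the line, so the fitted slope (2.14 on the slide) need not equal the rounded exponent 2.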


A Movie Bestiary
Classifying movies according to the time evolution of their income:
• Blockbusters: High opening gross, high total gross. Intermediate to long theatre lifespan
• Bombs: Low opening gross, low total gross. Short theatre lifespan
• Sleepers: Low opening gross, high total gross. Long theatre lifespan

Spiderman (2002): A classic blockbuster
Figure: daily earnings and weekend earnings. Earnings peak on weekends and decay exponentially.

Spiderman 2 (2004) A blockbuster … but like most sequels, earned less & ran fewer weeks than the original !

The Blockbuster Strategy
“If it doesn’t open, you are dead!” - Robert Evans, Hollywood producer
The opening is the most critical event in a film’s commercial life.
FACT: > 80 % of all movies earn maximum box-office revenue in the first week after release.
Jaws (1975): the first movie to be released using the (now classic) blockbuster strategy:
• Heavy pre-release advertising
• Presence of star/stars with name recognition
• Wide release
Underlying assumption: ‘Herding’ effect among the movie audience. A large opening will induce others to see the movie!

BLOCKBUSTERS: Examples
• Very high opening gross
• Exponential decay in subsequent earnings

Lord of the Rings 3: Return of the King (2003) Top grosser of the year !

Harry Potter and the Sorcerer’s Stone (2001)

The Sixth Sense (1999) Blockbuster… but behaved like a sleeper very late in its theatre lifespan! (longest time at Top 60 for a non-IMAX movie: 40 weeks)

BOMBS: Examples
• Very low opening gross
• Exponential decay in subsequent earnings
• Earns significantly less than budget

Bulletproof Monk (2003) Spectacular flop ! Production budget: $ 50 Million Advertising budget: $ 25 Million

American Psycho (2000)

SLEEPERS: Examples
• Very low opening gross
• Sudden rise in subsequent earnings before eventual exponential decay

My Big Fat Greek Wedding (2002): A classic sleeper!
• Produced outside Hollywood
• Extremely long theatre lifespan
• Gradual rise in income
• Subsequent exponential decay

The Blair Witch Project (1999) Another Hollywood outsider sleeper

Mystic River (2003): A Hollywood insider sleeper!
• Publicity buildup to the Oscar awards
• Unusual: multiple rises in income during theatre lifespan

To compare, with each movie's class:
2004: Spiderman 2 (Blockbuster)
2003: Lord of the Rings 3: Return of the King (Blockbuster); Mystic River (Sleeper); Bulletproof Monk (Bomb)
2002: Spiderman (Blockbuster); My Big Fat Greek Wedding (Sleeper)
2001: Harry Potter and the Sorcerer’s Stone (Blockbuster)
2000: American Psycho (Bomb)
1999: The Sixth Sense (Blockbuster); The Blair Witch Project (Sleeper)

Comparing the Income Growth / Decay of Movies
Income scaled by opening gross. The income of most movies decays exponentially with the same characteristic decay time (< 5 weeks).
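The point of scaling by opening gross can be illustrated with a small sketch. It assumes a pure exponential decay G(t) = G(0)·exp(−t/τ) with t in weeks; the two income series, the value τ = 3 weeks and the helper names are hypothetical and serve only to show that curves with different openings collapse onto one another after scaling.

```python
import math

def scale_by_opening(weekly_income):
    """Divide every week's income by the opening-week income."""
    opening = weekly_income[0]
    return [g / opening for g in weekly_income]

def decay_time(scaled):
    """Average estimate of tau, assuming scaled[t] = exp(-t / tau) for t >= 1."""
    estimates = [-t / math.log(s) for t, s in enumerate(scaled) if t >= 1]
    return sum(estimates) / len(estimates)

tau = 3.0  # hypothetical decay time in weeks
big_opening = [100.0 * math.exp(-t / tau) for t in range(8)]
small_opening = [4.0 * math.exp(-t / tau) for t in range(8)]

scaled_big = scale_by_opening(big_opening)
scaled_small = scale_by_opening(small_opening)

# After scaling, both series follow the same curve and give the same decay time.
tau_big = decay_time(scaled_big)
tau_small = decay_time(scaled_small)
print(round(tau_big, 6), round(tau_small, 6))
```

A sleeper would not fit this picture: its scaled income rises above 1 before decaying, so the logarithm changes sign and a single decay time is no longer meaningful.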


Puzzle
• The Pareto tail appears at the opening week itself
• Asset exchange models don’t apply
• Can’t be explained by information exchange about a movie through interaction between people
• Need a different approach

Popularity = Collective Choice
Process of emergence of collective decision
• in a society of agents free to choose
• constrained by limited information
• having heterogeneous beliefs.
Example: Movie popularity.

Collective Choice: A Naive Approach
• Each agent chooses randomly, independent of all other agents.
• Collective decision: sum of all individual choices.
• Example: YES/NO voting on an issue
• For binary choice: individual agent S = 0 or 1; collective choice M = Σ S
• Result: Normal distribution.
Figure: distribution of the collective decision M, from 0 % (NO) to 100 % (YES)
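The naive picture above can be checked numerically. This is a minimal sketch, assuming each agent picks S = 1 with probability 1/2 (the slide only says "randomly"); the agent count, number of repetitions and seed are arbitrary illustrative values. For n such agents the sum M has mean n/2 and standard deviation √n/2, and its histogram approaches a normal curve.

```python
import random
import statistics

def collective_choice(n_agents, rng):
    """Sum of independent binary choices S in {0, 1}, each equal to 1 with probability 1/2."""
    return sum(rng.randint(0, 1) for _ in range(n_agents))

rng = random.Random(0)  # fixed seed so the run is reproducible
n_agents = 100
samples = [collective_choice(n_agents, rng) for _ in range(2000)]

mean_m = statistics.mean(samples)
std_m = statistics.pstdev(samples)

# Fraction of outcomes within one theoretical standard deviation (5) of the mean (50).
within_one_std = sum(1 for m in samples if abs(m - 50) <= 5) / len(samples)
print(mean_m, std_m, within_one_std)
```

Because the agents never interact, no heavy tail or bimodality can arise here, which is why the talk moves on to a model with herding and learning.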

Modeling emergence of collective choice
Agent’s choice depends on
• Personal belief (expectation from a particular choice)
• Herding (through interaction with neighbors)
2 factors affect the evolution of an agent’s belief
• Adaptation (to previous choice): Belief changes with time to make subsequent choice of the same alternative less likely
• Learning (by global feedback through media): The agent will be affected by how her previous choice accorded with the collective choice (M).

The Model: ‘Adaptive Field’ Ising Model
• Binary choice: 2 possible choice states (S = ± 1).
• Belief dynamics of the i-th agent at time t: [equation], where M is the collective decision
  μ: Adaptation timescale
  λ: Learning timescale
• Choice dynamics of the i-th agent at time t: [equation], for square lattice
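The transcript does not preserve the equations of this slide, so the following is only a sketch of one update rule that matches the verbal description, not the authors' confirmed formulation. It assumes: a choice S = sign(sum of the 4 neighbours' choices − θ), where θ is the agent's belief; adaptation θ ← θ + μ·S(new); learning θ ← θ + λ·S(old), applied only when the old choice disagreed with the sign of M; M taken as the mean choice; periodic boundaries; ties resolved towards +1; and a small 20 × 20 lattice instead of the 1000 × 1000 used in the talk. All function names are hypothetical.

```python
import random

def sign(x):
    # Assumption: ties (x == 0) are resolved towards +1.
    return 1 if x >= 0 else -1

def magnetization(S):
    """Collective decision M: mean of all choices, between -1 and +1."""
    n = len(S)
    return sum(sum(row) for row in S) / (n * n)

def step(S, theta, mu, lam):
    """One synchronous update; returns the new choices and updates theta in place."""
    n = len(S)
    m_sign = sign(magnetization(S))
    new_S = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Herding: sum over the 4 nearest neighbours, periodic boundaries.
            field = (S[(i - 1) % n][j] + S[(i + 1) % n][j]
                     + S[i][(j - 1) % n] + S[i][(j + 1) % n])
            new_S[i][j] = sign(field - theta[i][j])
    for i in range(n):
        for j in range(n):
            # Adaptation: repeating the same choice becomes less likely.
            theta[i][j] += mu * new_S[i][j]
            # Learning: extra shift only if the old choice disagreed with sign(M).
            if S[i][j] != m_sign:
                theta[i][j] += lam * S[i][j]
    return new_S

def run(n=20, steps=50, mu=0.1, lam=0.05, seed=0):
    rng = random.Random(seed)
    S = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(n)]
    theta = [[0.0] * n for _ in range(n)]
    history = []
    for _ in range(steps):
        S = step(S, theta, mu, lam)
        history.append(magnetization(S))
    return S, history

final_S, m_history = run()
print(m_history[-1])
```

Setting `lam=0` switches off the learning term, which is the comparison the Results slides draw between λ = 0 and λ > 0; a lattice this small is meant only to show the mechanics, not to reproduce the reported patterns.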

Results
• Long-range order for λ > 0

Initial state of the S field: 1000 × 1000 agents

λ = 0: No long-range order μ =0.1 N = 1000, T = itrns Square Lattice (4 neighbors)

λ > 0: clustering
μ = 0.1, λ = 0.05, N = 1000, T = 200 iterations, square lattice (4 neighbors)

Results
• Self-organized pattern formation

Ordered patterns emerge asymptotically (μ = 0.1, λ = 0.05)

Results
• Self-organized pattern formation
  - Multiple ordered domains
  - Behavior of agents belonging to each such domain is highly correlated
  - Distinct ‘cultural groups’ (Axelrod).

Results
• Phase transition
  - Unimodal to bimodal distribution as λ increases.

Bimodality with increasing λ

Results
• Similar results for agents on scale-free network

OK… but does it explain reality?
Rank distribution: compare real data (US movie opening gross) with the model
Model: randomly distributed λ

Rank Distribution according to Ratings
A DeVany & W D Walls (2002) J Business 75:425
• Rank distribution of G-rated movies similar to that for λ = 0
• Rank distribution of PG, PG-13 and esp. R-rated movies similar to that for λ > 0

Conclusion
• Movie income distribution is Gaussian but with a power law tail having Pareto exponent α ~ 2
• Possibly universal for popularity distributions!
• True for opening gross income as well as total gross income distribution
• Pareto tail cannot be explained by information exchange through interaction among agents
• Bimodality in opening gross distribution can be explained by a collective choice model