Section 6 – Ec1818 Jeremy Barofsky March 10th and 11th, 2010.

Section 6 Outline
Power Laws
– General definition / scale invariance
– Zipf’s Law
– Pareto’s Law
NetLogo Tutorial
Office hours: Thursday, March 11, 10-11, CGIS N, outside room 320.

Power Law Definition
In general: $y = S^{-a}$, where $y$ is the frequency or rank of an event, $S$ is the event size, and $a \geq 1$ is a constant exponent.
Distribution: a power law can also be written as a complementary CDF. If $X$ is a random variable, then $1 - F(x) = \Pr(X > x) = b x^{-\alpha}$, where $\alpha$ is now the constant exponent, $x$ is the event size, and $\Pr(X > x)$ plays the role of $y$.
The tails of this distribution (the probability of rare events) get fatter as $\alpha$ (or $a$) falls. Conversely, as the slope of the distribution becomes steeper, rare events become less likely and the tails get thinner.
When $0 < \alpha \leq 2$, $\mathrm{var}(X) = \infty$; when $\alpha \leq 1$, $E(X) = \infty$ as well (these moments of the distribution do not exist).
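The complementary CDF on this slide can be checked numerically. The following is a minimal Python sketch, assuming b = 1 and a minimum event size of 1 (both illustrative choices that are not in the slides); `ccdf`, `sample_power_law`, and `empirical_ccdf` are hypothetical helper names.

```python
import random

def ccdf(x, b=1.0, alpha=1.5):
    """Theoretical tail probability Pr(X > x) = b * x**(-alpha)."""
    return b * x ** (-alpha)

def sample_power_law(n, alpha, x_min=1.0, seed=0):
    """Draw n values with Pr(X > x) = (x / x_min)**(-alpha) by inverse transform."""
    rng = random.Random(seed)
    # 1 - rng.random() is uniform on (0, 1], which avoids raising 0 to a negative power.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

def empirical_ccdf(samples, x):
    """Fraction of samples strictly larger than x."""
    return sum(1 for s in samples if s > x) / len(samples)

samples = sample_power_law(100_000, alpha=1.5)
for x in (2.0, 10.0, 50.0):
    # Empirical and theoretical tail probabilities should be close.
    print(x, empirical_ccdf(samples, x), ccdf(x, b=1.0, alpha=1.5))
```

Rerunning with a smaller `alpha` should put visibly more mass in the tail, which is the "fatter tails as α falls" point made on the slide.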

Power Laws in the Real World
Before Axtell (2001), firm sizes had been described as log-normally distributed. Axtell shows that $\Pr(\text{firm size } S > s) = b s^{-a}$, and he finds $a = 1.25$ for the U.S. and close to 1 for other nations.
Biological power laws: Kleiber’s law states that a mammal’s metabolic needs scale as mass$^{3/4}$. Geoffrey West (Santa Fe Institute) explains this through the transport system that delivers energy to the body, an efficient branching circulatory network.
Strogatz (NY Times blog, 2009) finds roughly 3/4-power laws in city infrastructure too: “For instance, if one city is 10 times as populous as another one, does it need 10 times as many gas stations? No. Bigger cities have more gas stations than smaller ones (of course), but not nearly in direct proportion to their size. The number of gas stations grows only in proportion to the 0.77 power of population. The crucial thing is that 0.77 is less than 1. This implies that the bigger a city is, the fewer gas stations it has per person. Put simply, bigger cities enjoy economies of scale. In this sense, bigger is greener.”
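The arithmetic behind the gas-station quote can be worked through directly. This is a small Python sketch; `scaling_ratio` is a hypothetical helper, and the factor of 10 is the population ratio used in the quote.

```python
def scaling_ratio(size_ratio, exponent):
    """Factor by which y grows when size grows by size_ratio, given y proportional to size**exponent."""
    return size_ratio ** exponent

# A city 10 times as populous, with gas stations growing as population**0.77.
stations = scaling_ratio(10, 0.77)   # roughly 5.9 times as many stations
per_capita = stations / 10           # roughly 0.59 times as many stations per person

# Kleiber's law with the same size ratio: a mammal 10 times as massive.
metabolic = scaling_ratio(10, 0.75)  # roughly 5.6 times the metabolic needs

print(round(stations, 2), round(per_capita, 2), round(metabolic, 2))
```

Any exponent below 1 gives a per-capita ratio below 1, which is the economies-of-scale point in the quote.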

Scale Invariance of Power Laws
Random variables (R.V.) that are power-law distributed exhibit scale invariance: we can multiply city size (our R.V.) by any constant, i.e. change its units, and still retain the distribution’s shape. If $\Pr(S > s) = b s^{-\alpha}$, then $\Pr(cS > s) = b (s/c)^{-\alpha} = (b c^{\alpha}) s^{-\alpha}$, so only the constant $b$ changes and the exponent does not. Therefore, $S$ is unit-free. Only rank matters.
Take a rank-preserving transformation of the distribution (take logs) and we have a simple test for Zipf’s law: $\ln y = \ln b - \alpha \ln S$, a straight line in log-log space with slope $-\alpha$.
Shown to hold for the U.S. in 1890, 1940, and 1990, for most other nations in modern times, and for India in 1911 and China in the 1850s (references from Gabaix).
Open question: is this a deep intuition into the processes of city growth / income distribution, or just a statistical regularity like the CLT that provides no insight?
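The log-log test described on this slide can be sketched as an ordinary least-squares fit of ln(rank) on ln(size). The Python below is an illustration under stated assumptions: the city sizes are synthetic (an exact rank-size rule with a made-up largest city of 8,000,000), and `ols_slope_intercept` and `rank_size_fit` are hypothetical helpers, not the method used in section.

```python
import math

def ols_slope_intercept(xs, ys):
    """Least-squares slope and intercept for the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rank_size_fit(sizes):
    """Regress ln(rank) on ln(size), with rank 1 assigned to the largest size."""
    ordered = sorted(sizes, reverse=True)
    log_size = [math.log(s) for s in ordered]
    log_rank = [math.log(r) for r in range(1, len(ordered) + 1)]
    return ols_slope_intercept(log_size, log_rank)

# Synthetic data: the city of rank r has size 8_000_000 / r (assumed, not real data).
sizes = [8_000_000 / r for r in range(1, 101)]
slope, intercept = rank_size_fit(sizes)

# Change units (people -> thousands of people): slope is unchanged, intercept shifts.
slope_k, intercept_k = rank_size_fit([s / 1000 for s in sizes])

print(slope, slope_k, intercept - intercept_k)
```

The estimated slope is the same in both units, while the intercept moves by ln(1000); that is the scale-invariance property in regression form. With real data the fit is usually restricted to the upper tail, a choice this sketch does not address.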

Zipf’s Law
Discovered by George Kingsley Zipf (1949) in an analysis of word rank and word usage (Human Behavior and the Principle of Least Effort).
A specific type of power law where $\Pr(X > x) = b x^{-\alpha}$ with $\alpha = 1$. So, the probability of seeing extreme events (large cities) falls off with exponent $\alpha = 1$.
For city sizes, with $S$ = city size: $\Pr(S > s) = b s^{-\alpha}$, and the rank of a city of size $s$ is proportional to this probability. With $\alpha = 1$, the probability that a city is larger than size $s$ is proportional to $1/s$. The larger the city size, the less likely we are to see a city of that size.
Not a vacuous result: city sizes could instead have been normally or log-normally distributed.
Implications? (Use board and Stata.)
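One implication of the rank-size relation above is that rank times size is roughly constant, so the city of rank r is about 1/r the size of the largest. The Python sketch below illustrates this; `zipf_predicted_sizes` is a hypothetical helper and the largest-city population is a made-up number.

```python
def zipf_predicted_sizes(largest, n):
    """Sizes implied by an exact rank-size rule: size of rank r is largest / r."""
    return [largest / rank for rank in range(1, n + 1)]

largest_city = 8_000_000  # assumed value, for illustration only
predicted = zipf_predicted_sizes(largest_city, 5)

for rank, size in enumerate(predicted, start=1):
    # Under exact Zipf, rank * size equals the largest city's size.
    print(rank, round(size), round(rank * size))
```

Comparing these predictions with an actual ranked list of city populations is a quick informal check before running the log-log regression.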

Pareto’s Law
Another power-law form, discovered around 1900 by the Italian economist Vilfredo Pareto to describe the upper tail of the income distribution: $\Pr(S > s) = s^{-\alpha}$.
If income is Pareto distributed, a small share of the population holds most of the wealth; the rule that 80% of the wealth is held by 20% of the population (the Pareto principle) corresponds to $\alpha \approx 1.16$.
Income has been verified to be distributed this way around the world.
Continuous version of Zipf where $b = 1$.
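The link between the Pareto exponent and the 80/20 rule can be made concrete. For a Pareto distribution with α > 1, the share of the total held by the top fraction p of the population is p^(1 - 1/α), a standard result that is not derived in the slides. The Python sketch below uses it; `top_share` and `alpha_for_rule` are hypothetical helper names.

```python
import math

def top_share(p, alpha):
    """Share of total income held by the top fraction p under a Pareto tail."""
    if alpha <= 1:
        raise ValueError("the mean is infinite for alpha <= 1, so shares are undefined")
    return p ** (1.0 - 1.0 / alpha)

def alpha_for_rule(top_fraction, share):
    """Solve top_fraction ** (1 - 1/alpha) == share for alpha."""
    k = math.log(share) / math.log(top_fraction)
    return 1.0 / (1.0 - k)

alpha_80_20 = alpha_for_rule(0.2, 0.8)  # about 1.16
print(round(alpha_80_20, 3), round(top_share(0.2, alpha_80_20), 3))

# A thinner tail (larger alpha) concentrates less income at the top.
print(round(top_share(0.2, 2.0), 3))
```

So the 80/20 rule is one point on a family of curves: smaller α means more concentration at the top, and larger α means less.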