Power Laws: otherwise known as any semi-straight line on a log-log plot

Self Similar
The distribution maintains its shape when x is rescaled: only the overall normalization changes.
The power law is the only distribution with this property.
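A minimal sketch of what "maintains its shape" means, assuming a continuous power law with exponent 2.5 and x_min = 1 (both values, and the function names, are illustrative choices and not from the slides). Rescaling x by a factor b multiplies the density by the same constant b^(-alpha) at every x, whereas for an exponential the ratio depends on x.

```python
import math


def power_law_pdf(x, alpha=2.5, x_min=1.0):
    """Normalized continuous power-law density for x >= x_min (needs alpha > 1)."""
    return (alpha - 1.0) / x_min * (x / x_min) ** (-alpha)


def exponential_pdf(x, rate=1.0):
    """Exponential density, used here only as a contrast."""
    return rate * math.exp(-rate * x)


b = 10.0                    # rescaling factor (arbitrary choice)
xs = [1.0, 2.0, 5.0, 20.0]  # points at which to compare p(b*x) with p(x)

# For the power law this ratio is b**(-alpha) at every x.
power_ratios = [power_law_pdf(b * x) / power_law_pdf(x) for x in xs]
# For the exponential the ratio changes with x, so the shape is not preserved.
exp_ratios = [exponential_pdf(b * x) / exponential_pdf(x) for x in xs]

print(power_ratios)
print(exp_ratios)
```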

Fitting a line
The assumptions of linear regression do not hold: the noise is not Gaussian.
Many distributions approximate power laws over a limited range, leading to a high R² independent of the quality of the fit.
A regression line does not yield a properly normalized distribution.
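To illustrate the R² point, here is a small sketch that fits a straight line to the log-log plot of a lognormal density, which is not a power law. The lognormal parameters (mu = 0, sigma = 3), the range 1 to 1000, and the helper names are assumptions made for this example only.

```python
import math


def r_squared_loglog(xs, ys):
    """R^2 of an ordinary least-squares line fitted to the points (ln x, ln y)."""
    u = [math.log(x) for x in xs]
    v = [math.log(y) for y in ys]
    n = len(u)
    u_mean = sum(u) / n
    v_mean = sum(v) / n
    sxx = sum((a - u_mean) ** 2 for a in u)
    sxy = sum((a - u_mean) * (c - v_mean) for a, c in zip(u, v))
    slope = sxy / sxx
    intercept = v_mean - slope * u_mean
    ss_res = sum((c - (intercept + slope * a)) ** 2 for a, c in zip(u, v))
    ss_tot = sum((c - v_mean) ** 2 for c in v)
    return 1.0 - ss_res / ss_tot


def lognormal_pdf(x, mu=0.0, sigma=3.0):
    """Lognormal density: curved on a log-log plot, so not a power law."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


# 50 log-spaced points covering three decades, from 1 to 1000.
xs = [10.0 ** (3.0 * i / 49.0) for i in range(50)]
ys = [lognormal_pdf(x) for x in xs]
r2 = r_squared_loglog(xs, ys)
print(r2)
```

The straight-line fit scores a very high R² even though the underlying density is lognormal, which is why R² alone says little about whether a power law is appropriate.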

Maximum Likelihood Estimator for the continuous case
α must be greater than 1, which is necessary for convergence.
There is some x_min below which power-law behavior does not occur, which is also necessary for convergence.
The estimate converges to the true value as n→∞.
This gives the best-fitting power law, but it does not test whether a power law is a good description of the data!
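The estimator itself does not survive in the transcript. The sketch below assumes the standard continuous-case form given by Clauset, Shalizi and Newman, alpha_hat = 1 + n / sum(ln(x_i / x_min)), with leading-order standard error (alpha_hat - 1) / sqrt(n). The function names, the seed and the sample size are illustrative.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n values from a continuous power law by inverse-transform sampling."""
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(data, x_min):
    """Continuous MLE for the exponent, using only the points with x >= x_min."""
    tail = [x for x in data if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    alpha_hat = 1.0 + len(tail) / log_sum
    std_err = (alpha_hat - 1.0) / math.sqrt(len(tail))  # leading-order estimate
    return alpha_hat, std_err


rng = random.Random(0)
data = sample_power_law(100_000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat, std_err = fit_alpha(data, x_min=1.0)
print(alpha_hat, std_err)
```

With a true exponent of 2.5 the estimate should land close to 2.5. As the slide stresses, a good-looking estimate says nothing about whether the data follow a power law in the first place.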

How does it do?
[Figure: estimated exponents for continuous and discrete data; actual value: 2.5]

Error as a function of x_min and n
[Figure panels: for discrete data; for continuous data]

Setting x_min
Too low: we include non-power-law data.
Too high: we lose a lot of data.
Clauset suggests “the value x_min that makes the probability distributions between the measured data and the best-fit power-law model as similar as possible above x_min”.
Use the KS (Kolmogorov-Smirnov) statistic to measure that similarity.
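One way to read this recipe, as a rough sketch: try each observed value as a candidate x_min, fit the exponent to the data above it, and keep the candidate with the smallest KS distance. The `min_tail` cutoff, the function names and the synthetic test data (a uniform "body" below 1 plus a power-law tail) are assumptions for illustration. The data must be positive.

```python
import math
import random


def ks_distance(sorted_tail, x_min, alpha):
    """Largest gap between the empirical CDF and the power-law CDF above x_min."""
    m = len(sorted_tail)
    d = 0.0
    for j, x in enumerate(sorted_tail):
        model_cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
        # The empirical CDF jumps from j/m to (j+1)/m at x; check both sides.
        d = max(d, abs(model_cdf - j / m), abs(model_cdf - (j + 1) / m))
    return d


def choose_x_min(data, min_tail=10):
    """Return (ks, x_min, alpha) for the candidate x_min with the smallest KS."""
    xs = sorted(data)
    best = None
    for i in range(len(xs) - min_tail + 1):
        x_min = xs[i]
        tail = xs[i:]
        log_sum = sum(math.log(x / x_min) for x in tail)
        if log_sum <= 0.0:  # degenerate tail (all values equal); skip it
            continue
        alpha = 1.0 + len(tail) / log_sum  # continuous MLE above this x_min
        d = ks_distance(tail, x_min, alpha)
        if best is None or d < best[0]:
            best = (d, x_min, alpha)
    return best


rng = random.Random(1)
body = [rng.uniform(0.1, 1.0) for _ in range(300)]                  # non-power-law part
tail = [(1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(700)]   # alpha = 2.5, x_min = 1
data = body + tail
d, x_min_hat, alpha_hat = choose_x_min(data)
print(d, x_min_hat, alpha_hat)
```

This brute-force scan is quadratic in the number of points, which is fine for a few thousand values but would need a smarter search for large data sets.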

How does it perform?

But How Do We Know It’s a Power Law?
Calculate the KS statistic between the data and the best-fitting power law.
Find the p-value: theoretically, there exists a function p = f(KS value).
But the best-fit distribution is not the “true” distribution, due to statistical fluctuations.
So take a numerical approach: create synthetic data sets from the best-fit power law, fit each one, and find its KS value.
Compare each synthetic D value to the D value of the best fit to the real data; the p-value is the fraction of synthetic data sets whose D is at least as large.
We can now rule out a power law, but can we conclude that it is a power law?
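The numerical approach can be sketched as below. This is a simplified version that holds x_min fixed; the full procedure described by Clauset and co-authors also re-estimates x_min for every synthetic data set and resamples the data below x_min. The function names, the number of simulations and the shifted-exponential test data are illustrative assumptions.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Inverse-transform samples from a continuous power law."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(tail, x_min):
    """Continuous MLE for the exponent (all points assumed >= x_min)."""
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


def ks_distance(sorted_tail, x_min, alpha):
    """KS distance between the empirical CDF and the fitted power-law CDF."""
    m = len(sorted_tail)
    d = 0.0
    for j, x in enumerate(sorted_tail):
        model_cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
        d = max(d, abs(model_cdf - j / m), abs(model_cdf - (j + 1) / m))
    return d


def goodness_of_fit_p(data, x_min, n_sims=200, seed=0):
    """Fraction of synthetic power-law data sets that fit worse than the real data."""
    rng = random.Random(seed)
    tail = sorted(x for x in data if x >= x_min)
    alpha = fit_alpha(tail, x_min)
    d_obs = ks_distance(tail, x_min, alpha)
    exceed = 0
    for _ in range(n_sims):
        synth = sorted(sample_power_law(len(tail), alpha, x_min, rng))
        # Refit each synthetic set, so it gets the same treatment as the data.
        d_synth = ks_distance(synth, x_min, fit_alpha(synth, x_min))
        if d_synth >= d_obs:
            exceed += 1
    return exceed / n_sims


# Data that are NOT a power law: a shifted exponential starting at x = 1.
rng = random.Random(2)
exp_data = [1.0 + rng.expovariate(1.0) for _ in range(1000)]
p_exp = goodness_of_fit_p(exp_data, x_min=1.0)
print(p_exp)
```

A small p-value means the observed KS distance is larger than almost anything a true power law would produce, so the power law can be ruled out. A large p-value only means the power law has not been ruled out, which is the slide's closing question.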

Comparison of Models
Which of two fits is least bad?
Compute the log-likelihood ratio (R) of the two distributions: higher likelihood = better fit.
But we need to know how large the statistical fluctuations will be.
By the central limit theorem, R will be normally distributed, so we can calculate p-values from its standard deviation.
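A sketch of this comparison for one pair of candidates, a power law against an exponential, both fitted above the same x_min. It assumes R is the sum of per-point log-likelihood differences and that the p-value is erfc(|R| / (sqrt(2n) * sigma)), where sigma is the standard deviation of those differences, as in Clauset, Shalizi and Newman. The choice of the exponential as the alternative and all names are illustrative.

```python
import math
import random


def likelihood_ratio_test(data, x_min):
    """Return (R, p): R > 0 favors the power law, R < 0 favors the exponential."""
    tail = [x for x in data if x >= x_min]
    n = len(tail)
    # Maximum-likelihood fits of both models on x >= x_min.
    alpha = 1.0 + n / sum(math.log(x / x_min) for x in tail)
    rate = 1.0 / (sum(tail) / n - x_min)
    # Per-point log-likelihood difference: ln p_powerlaw(x) - ln p_exponential(x).
    diffs = [
        (math.log((alpha - 1.0) / x_min) - alpha * math.log(x / x_min))
        - (math.log(rate) - rate * (x - x_min))
        for x in tail
    ]
    r = sum(diffs)
    mean = r / n
    sigma = math.sqrt(sum((d - mean) ** 2 for d in diffs) / n)
    # Probability of seeing |R| this large if the two models were equally good.
    p = math.erfc(abs(r) / (math.sqrt(2.0 * n) * sigma))
    return r, p


rng = random.Random(3)
data = [(1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(2000)]  # alpha = 2.5
r, p = likelihood_ratio_test(data, x_min=1.0)
print(r, p)
```

A small p means the sign of R is unlikely to be a fluctuation. A large p means the data cannot distinguish between the two models.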

How does real world data stack up?

Mechanisms
Summation of exponentials.
Random walks: often the distribution of first-return times.
The Yule process, whereby probabilities are related to the number that are already present.
Self-organized criticality: the burning forest.
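As an illustration of the "rich get richer" idea behind the Yule process, here is a simplified simulation. It is a sketch in the spirit of the Yule process rather than its exact textbook formulation: at each step a new species appears, and with probability `p_new` it founds a new genus; otherwise it joins an existing genus chosen in proportion to that genus's current size. The parameter values and names are assumptions.

```python
import random
from collections import Counter


def simulate_yule(n_steps, p_new=0.1, seed=0):
    """Return genus sizes, largest first, after n_steps species have been added."""
    rng = random.Random(seed)
    owner = [0]   # owner[k] is the genus that species k belongs to
    n_genera = 1
    for _ in range(n_steps):
        if rng.random() < p_new:
            owner.append(n_genera)  # the new species founds a new genus
            n_genera += 1
        else:
            # Picking a random existing species selects its genus with
            # probability proportional to the genus size.
            owner.append(rng.choice(owner))
    return sorted(Counter(owner).values(), reverse=True)


sizes = simulate_yule(20_000)
print(len(sizes), sizes[:5])
```

The resulting genus sizes are heavy-tailed: a few genera hold a large share of the species while most stay small. Whether that tail is actually a power law should be checked with the fitting and testing steps above, not by eye.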

Conclusions
It’s really hard to show something is a power law.
With high noise or few points, it’s hard to show something isn’t a power law.