Negative Examples for Sequential Importance Sampling of Binary Contingency Tables Ivona Bezáková (RIT) Daniel Štefankovič (Rochester) Alistair Sinclair.

Presentation transcript:

Negative Examples for Sequential Importance Sampling of Binary Contingency Tables. Ivona Bezáková (RIT), Daniel Štefankovič (Rochester), Alistair Sinclair (Berkeley), Eric Vigoda (Gatech).

Darwin’s Finches. The Voyage of the Beagle, Galápagos archipelago (1835).

Darwin’s Finches. [Photo © Robert H. Rothman]

Chance or competitive pressures?

Binary Contingency Tables. Given: marginals (row sums, column sums). Goal: sample tables uniformly at random; count tables.
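To make the counting goal concrete, here is a minimal brute-force sketch, assuming small marginals. It is not part of the talk: the function name `count_binary_tables` is hypothetical, and the approach (trying every placement of ones column by column) takes exponential time, which is why sampling-based estimators are of interest.

```python
from itertools import combinations


def count_binary_tables(row_sums, col_sums):
    """Count 0/1 tables with the given row and column sums by exhaustive search."""
    n_rows = len(row_sums)
    if sum(row_sums) != sum(col_sums):
        return 0  # the total number of ones must agree

    def fill(col, remaining):
        # remaining[i] = number of ones row i still needs
        if col == len(col_sums):
            return 1 if all(r == 0 for r in remaining) else 0
        candidates = [i for i in range(n_rows) if remaining[i] > 0]
        total = 0
        # Try every way of placing col_sums[col] ones in this column.
        for rows in combinations(candidates, col_sums[col]):
            nxt = list(remaining)
            for i in rows:
                nxt[i] -= 1
            total += fill(col + 1, tuple(nxt))
        return total

    return fill(0, tuple(row_sums))
```

For example, with all marginals equal to 1 on a 3-by-3 table, the tables are exactly the permutation matrices, so the count is 3! = 6.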

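Importance Sampling for counting problems. Probability distribution $\mu$ on the points of the set plus a failure point $\bot$, where every $x$ in the set has positive probability $\mu(x) > 0$. Random variable: $\eta(s) = 1/\mu(s)$ if $s$ is in the set, and $\eta(s) = 0$ if $s$ is $\bot$. Unbiased estimator: $\mathbb{E}[\eta] = \sum_{x \in \text{set}} \mu(x) \cdot \frac{1}{\mu(x)} = \text{size of the set}$.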
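The estimator on this slide can be checked on a toy instance. The sketch below is illustrative only: the four-element set, the proposal probabilities in `MU`, and the names `eta` and `estimate_size` are all assumptions made here, with `None` standing in for the failure point. It averages the reciprocal probabilities of the sampled points, and the average should approach the size of the set.

```python
import random

# Hypothetical toy instance: the set {"a", "b", "c", "d"} and a proposal
# distribution over the set plus a failure point (represented by None).
MU = {"a": 0.4, "b": 0.3, "c": 0.1, "d": 0.1, None: 0.1}


def eta(s):
    """Importance weight: 1 / mu(s) for a point of the set, 0 on failure."""
    return 0.0 if s is None else 1.0 / MU[s]


def estimate_size(trials, seed=0):
    """Average eta over independent draws from MU."""
    rng = random.Random(seed)
    points = list(MU)
    weights = [MU[p] for p in points]
    draws = rng.choices(points, weights=weights, k=trials)
    return sum(eta(s) for s in draws) / trials


# Exact expectation: each point of the set contributes mu(x) * 1/mu(x) = 1.
exact_expectation = sum(MU[s] * eta(s) for s in MU)
```

The exact expectation equals 4, the size of the toy set, whatever positive probabilities are chosen. How quickly the empirical average converges does depend on the proposal, which is the issue the rest of the talk addresses.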

Sequential Importance Sampling for BCT [Chen-Diaconis-Holmes-Liu ’05]. A specific $\mu$: fill the table column by column; assign each column ignoring the other column sums; assign the column with probability proportional to $\prod_i r_i/(n - r_i)$, where the product ranges over $i$: rows with assignment 1.
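The column-by-column procedure can be sketched for small tables as follows. This is a simplified illustration under stated assumptions, not the implementation of Chen, Diaconis, Holmes, and Liu. It reads r_i as the ones still needed in row i and n as the number of columns still to fill. It enumerates all candidate column assignments explicitly (exponential time) and treats a dead end as a failure with weight 0. The names `sis_sample` and `sis_estimate` are hypothetical.

```python
import random
from itertools import combinations


def sis_sample(row_sums, col_sums, rng):
    """One SIS run: return (table, probability), or None on a dead end."""
    m, n = len(row_sums), len(col_sums)
    remaining = list(row_sums)  # ones each row still needs
    table = [[0] * n for _ in range(m)]
    prob = 1.0
    for j, c in enumerate(col_sums):
        cols_left = n - j
        if any(r > cols_left for r in remaining):
            return None  # some row can no longer be completed
        # Rows needing a one in every remaining column must receive one now.
        forced = [i for i in range(m) if remaining[i] == cols_left]
        free = [i for i in range(m) if 0 < remaining[i] < cols_left]
        k = c - len(forced)
        if k < 0 or k > len(free):
            return None
        subsets = list(combinations(free, k))
        weights = []
        for subset in subsets:
            w = 1.0
            for i in subset:
                w *= remaining[i] / (cols_left - remaining[i])
            weights.append(w)
        pick = rng.choices(range(len(subsets)), weights=weights)[0]
        prob *= weights[pick] / sum(weights)
        for i in forced + list(subsets[pick]):
            table[i][j] = 1
            remaining[i] -= 1
    if any(remaining):
        return None
    return table, prob


def sis_estimate(row_sums, col_sums, trials, seed=0):
    """Average of 1/probability over SIS runs (failed runs contribute 0)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        result = sis_sample(row_sums, col_sums, rng)
        if result is not None:
            total += 1.0 / result[1]
    return total / trials
```

With all marginals equal to 1 on a 3-by-3 table, every run produces a permutation matrix with probability 1/6, so the estimate equals 6 up to rounding. For less regular marginals the weights vary from run to run, and the estimate only converges on average.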

A Counterexample for SIS. [Figure: table with marginals labeled 1, $\beta m$, and $\gamma m$.] Thm [Bezáková-Sinclair-Štefankovič-Vigoda ’06]: For any $\beta \neq \gamma$, the SIS output after any subexponential number of trials is off by an exponential factor (with high probability).

Intuition: in a random table the ones are chosen at random, so we expect a number of ones proportional to $m$; SIS produces asymptotically fewer. [Figure: within the set of all tables, the tables with about the expected number of ones and the tables seen by SIS whp.]

SIS – Experimental Results. Bad example, $m = 300$, $\beta = 0.6$, $\gamma = 0.7$. [Plot: SIS estimate (log scale) against the number of SIS steps, with the correct value marked.]

SIS – Experimental Results. Regular marginals: $m = 50$, all marginals equal to 5. [Plot: SIS estimate against the number of SIS steps, with the correct value marked.]