Cancer Clustering Phenomenon

Presentation transcript:

Cancer Clustering Phenomenon Whitney Lamm November 30, 2004

Outline
What is a Cancer Cluster?
  Scientific and public definitions
The One-dimensional and D-dimensional Cases
  Expected logarithmic growth rates
Purpose
Collecting the simulated data
  MatLab program
  Matrices of 1’s and 0’s
Results
Conclusions
Further Research Questions

What is a Cancer Cluster? The public's definition: the public usually thinks of a cancer cluster as a large group of people in a certain area of the country who are diagnosed with cancer because of some type of industrial pollution, usually chemical contamination. The epidemiologists' definition: “a geographic area, time period, or group of people with a greater than expected number of cases of cancer.”

One and D-dimensional Cases In one dimension, the expected length of the longest run of heads grows logarithmically with the number of tosses. In D dimensions (specifically two dimensions), the expected growth is also logarithmic, but more closely resembles the following: [expression from the slide not preserved in the transcript]
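The one-dimensional claim can be checked with a small simulation. The sketch below is illustrative only and is not the presenter's program; the function names, trial count, and seed are assumptions. It compares the average longest run of heads in n fair coin tosses with log2(n), the standard leading-order growth for a fair coin.

```python
import math
import random


def longest_run(tosses):
    """Length of the longest run of consecutive 1's (heads) in a 0/1 sequence."""
    best = current = 0
    for t in tosses:
        current = current + 1 if t == 1 else 0
        best = max(best, current)
    return best


def mean_longest_run(n, p=0.5, trials=200, seed=0):
    """Average longest head run over `trials` simulated sequences of n tosses.

    p is the probability of heads; trials and seed are illustrative defaults.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        tosses = [1 if rng.random() < p else 0 for _ in range(n)]
        total += longest_run(tosses)
    return total / trials


if __name__ == "__main__":
    # Print n, the simulated mean, and log2(n) side by side.
    for n in (100, 1_000, 10_000):
        print(n, round(mean_longest_run(n), 2), round(math.log2(n), 2))
```

Each tenfold increase in n should add roughly log2(10), about 3.3, to the average longest run, which is what logarithmic growth means here.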

One and D-dimensional Cases In both the one-dimensional and D-dimensional cases, the limiting distribution is a Gumbel (Type I extreme value) distribution.
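For reference, the standard form of the Gumbel (Type I extreme value) distribution is given below. The location and scale symbols \mu and \beta are notation assumed here; the slides do not name them.

```latex
F(x;\mu,\beta) = \exp\!\left(-e^{-(x-\mu)/\beta}\right),
\qquad
f(x;\mu,\beta) = \frac{1}{\beta}\, e^{-(x-\mu)/\beta}
                 \exp\!\left(-e^{-(x-\mu)/\beta}\right),
\qquad
\mathbb{E}[X] = \mu + \gamma\beta,
\quad
\operatorname{Var}(X) = \frac{\pi^{2}\beta^{2}}{6}
```

Here \gamma \approx 0.5772 is the Euler-Mascheroni constant.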

Purpose Find a way to simulate cancer clusters in two dimensions. Does the simulated data grow logarithmically? How does changing the probability of susceptibility affect the simulated cancer clusters? Estimate the expected cancer cluster size for the population of the United States.

Collecting Simulated Data What is the best way to simulate data for the two-dimensional case of cancer clusters? A MatLab program creates a random n-by-n matrix of 1’s and 0’s, then finds the largest blocks of 1’s within that matrix. These blocks represent simulated cancer clusters.
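The procedure described above can be sketched as follows. This is not the presenter's MatLab program: it is a Python illustration that assumes a "block" means a square sub-block consisting entirely of 1's, and all function names and defaults are hypothetical.

```python
import random


def random_binary_matrix(n, p=0.5, seed=None):
    """n-by-n matrix whose entries are 1 with probability p, else 0."""
    rng = random.Random(seed)
    return [[1 if rng.random() < p else 0 for _ in range(n)] for _ in range(n)]


def largest_square_of_ones(matrix):
    """Side length of the largest all-ones square sub-block.

    Dynamic programming: the largest square whose bottom-right corner is a
    given 1-cell has side 1 + min(square above, square to the left,
    square on the upper-left diagonal).
    """
    if not matrix:
        return 0
    cols = len(matrix[0])
    prev = [0] * (cols + 1)  # results for the previous row (padded by one)
    best = 0
    for row in matrix:
        curr = [0] * (cols + 1)
        for j, cell in enumerate(row, start=1):
            if cell == 1:
                curr[j] = 1 + min(prev[j], prev[j - 1], curr[j - 1])
                best = max(best, curr[j])
        prev = curr
    return best


def simulate(n, p=0.5, trials=100, seed=0):
    """Largest block side for each of `trials` independent random matrices."""
    rng = random.Random(seed)
    return [
        largest_square_of_ones(random_binary_matrix(n, p, rng.random()))
        for _ in range(trials)
    ]


if __name__ == "__main__":
    sizes = simulate(n=200, p=0.5, trials=20)
    print(sum(sizes) / len(sizes))
```

The dynamic program visits each cell once, so a 1,000-by-1,000 matrix needs about a million cell updates per trial.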

Collecting Simulated Data

Results For 100 trials with probability 0.5. For 100 trials with matrix size 1,000.

Results (continued)

Results (continued)

Results (continued)

Conclusions Does the maximum cluster size grow logarithmically as n increases? Yes, but much more slowly than the common (base-10) or base-2 logarithm would predict; the square-root logarithmic expression is a more accurate fit. How does decreasing the probability affect the maximum block size? The block size decreases as the probability decreases.
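One way to see why a square-root logarithmic expression is plausible is a first-moment heuristic. The sketch below is not necessarily the expression used in the presentation. It assumes independent cells, square blocks, and an n-by-n matrix. There are roughly n^2 positions for a k-by-k block, each all 1's with probability p^(k^2), and setting n^2 * p^(k^2) = 1 gives k = sqrt(2 ln n / ln(1/p)). The function name is hypothetical.

```python
import math


def heuristic_block_side(n, p):
    """First-moment estimate of the largest all-ones square side.

    Solves n**2 * p**(k*k) = 1 for k, assuming independent cells that are 1
    with probability p in an n-by-n matrix.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.sqrt(2 * math.log(n) / math.log(1 / p))


if __name__ == "__main__":
    # The estimate grows like sqrt(log n) and shrinks as p decreases.
    for p in (0.5, 0.3, 0.1):
        print(p, [round(heuristic_block_side(n, p), 2) for n in (100, 1_000, 10_000)])
```

Under this heuristic the estimate grows only with the square root of log n and falls as p falls, consistent with both conclusions above.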

Conclusions (continued) Can we predict the expected cluster size for the population of the US? The second method of approximation was found to be more accurate for the simulated data, so we will assume that the expected block size for the population of the US would be between a 3-by-3 block and a 4-by-4 block.

Further Research Questions How close is the distribution of the two-dimensional simulations to a Gumbel distribution (Type I extreme value distribution)? What would occur if we increased the number of trials and/or the sample size, or changed the probabilities for different regions of the simulations?

Thank you for your time! Are there any questions?