PollutantsEventsDiseases


PollutantsEventsDiseases Analysis Scope: A set of pollutants pol1,…,polk, diseases d1,…,dm and disaster events e1,…,er Given Data (or need to be created during the course of the project): A spatio-temporal grid of pollution observations P pol1,x,y,t ,…, polk,x,y,t where poli,x,y,t denotes the concentration of the i-th pollutant at grid-cell (x,y,t) for an observation period A set of disease outbreak instances D described in the form: <disease, longitude, latitude, time> A set of disaster events E in the form: <event-type, longitude, latitude, start-time, end-time> Project Outcomes: A spatial pollution base line for pollutants pol1,…,polm for each spatial grid cell (x,y): Pol(j,x,y)= Raw average pollutant polj concentration in cells (x,y,1),…,(x,y,tmax) Pol’(j,x,y):= Baseline average of pollutant polj; same as Pol(j,x,y) but does not use cells (i,j,t’) in which disaster events occurred in average/… computations; similarly, we can compute base line standard deviations. Event-based (es for s=1,…,r) pollutant spike summaries that are obtained by comparing concentrations of particular pollutants associated with a particular event type (e.g. fire) with the pollutant’s baseline. Conceptual/geospatial models that associate pollutant concentrations with the occurrence of particular diseases; a “naïve” approach to get those could be: compute disease densities in each pollution grid-cell; then learn models mi using each grid-cell as training examples that map pollutant concentrations into densities/probabilities of the k-th disease (for i=1,…,m).

Assumptions and Discussion
- Conceptual models are functions that map pollutant concentrations into disease probabilities/densities; e.g. a model might map benzene and chlorine concentrations into lung disease probabilities.
- Disease outbreak delays represent a challenge for creating conceptual models.
- Because conceptual models are learnt from spatial pollution grids, they can be used to compute regional disease probability maps from pollutant concentration grids.
- The previous slide suggests computing baselines by excluding observations that are associated with disaster events (e.g. an oil spill); an alternative approach is to apply an outlier removal technique to the concentrations of a particular pollutant and then compute the baseline from the “surviving” observations.
- The current project description also mentions “identifying pollution sources”, which is not considered in the previous slide; that slide centers on obtaining pollutant concentrations and on computing baselines, event spike summaries and conceptual models from them. In general, I believe identifying pollution sources is a very challenging task.
- Remark: Project outcomes can be recomputed at different grid granularities.
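The outlier-removal alternative to event-based exclusion could look like the following sketch. The slides do not name a specific technique; Tukey's interquartile-range fences are used here purely as an assumed example, and the function name and the fence multiplier `k` are hypothetical.

```python
import numpy as np


def outlier_trimmed_baseline(series, k=1.5):
    """Baseline for one pollutant in one spatial cell (x,y).

    `series` holds the concentrations over time. Observations outside
    [Q1 - k*IQR, Q3 + k*IQR] are treated as outliers (an assumed rule,
    not one given in the slides) and dropped before averaging.
    Returns (mean, std, keep), where `keep` marks surviving observations.
    """
    series = np.asarray(series, dtype=float)
    q1, q3 = np.percentile(series, [25, 75])
    iqr = q3 - q1
    keep = (series >= q1 - k * iqr) & (series <= q3 + k * iqr)
    survivors = series[keep]
    return survivors.mean(), survivors.std(), keep
```

Unlike the event-based baseline, this variant needs no event data, but it may also discard genuine high readings that are unrelated to any disaster event.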