Lecture 8: Probabilities and distributions


Probability is the ratio of the number of desired events k to the total number of events n: p = k/n. If it is impossible to count k and n, we can apply the stochastic definition of probability: the probability of an event j is approximately the relative frequency of j during n observations.
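The two definitions can be compared numerically. The following is a minimal sketch, assuming a fair six-sided die as the example (the die, the function name `empirical_probability` and the fixed seed are illustrative choices, not part of the lecture).

```python
import random


def empirical_probability(event, trial, n_observations, seed=0):
    """Estimate P(event) as the relative frequency over n_observations trials."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = sum(1 for _ in range(n_observations) if event(trial(rng)))
    return hits / n_observations


# Classical definition: k = 3 even faces out of n = 6 faces, p = k / n.
classical = 3 / 6

# Stochastic definition: relative frequency of "even face" over many throws.
estimate = empirical_probability(
    event=lambda face: face % 2 == 0,
    trial=lambda rng: rng.randint(1, 6),
    n_observations=100_000,
)

print(classical, estimate)
```

With more observations the relative frequency should settle closer to the classical value.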

What is the probability to win in Duży Lotek? The number of desired events is 1. The number of possible events is the number of combinations of 6 numbers out of 49. In general, we need the number of combinations of k events out of a total of N events. Bernoulli distribution.
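The counting argument can be checked directly. This sketch uses the standard binomial coefficient for the number of combinations of 6 numbers out of 49; the variable names are illustrative.

```python
import math

# Number of ways to choose 6 numbers out of 49, order ignored.
combinations = math.comb(49, 6)

# Exactly one of these combinations matches all six drawn numbers.
p_all_six = 1 / combinations

print(combinations)  # 13983816
print(p_all_six)
```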

What is the probability to win in Duży Lotek? Wrong! Hypergeometric distribution. We need the probability that, of a sample of K = n + k elements drawn from a sample universe of N elements, exactly n have a desired property and k do not.
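A sketch of the hypergeometric probability in its textbook form follows. It needs one quantity the slide text does not name: the number of elements in the universe that have the desired property, called `M` here as an assumption. For the lottery, N = 49, M = 6 winning numbers and K = 6 numbers on the ticket.

```python
import math


def hypergeom_pmf(n, N, M, K):
    """P(exactly n of the K sampled elements have the desired property).

    N: size of the universe, M: elements with the property (assumed symbol),
    K: sample size, so that k = K - n sampled elements lack the property.
    """
    if n < 0 or n > K:
        return 0.0
    return math.comb(M, n) * math.comb(N - M, K - n) / math.comb(N, K)


# Probability of matching 0, 1, ..., 6 numbers with one ticket.
p_by_matches = {n: hypergeom_pmf(n, N=49, M=6, K=6) for n in range(7)}

for n, p in p_by_matches.items():
    print(n, p)
```

Matching all six numbers is only one of the possible outcomes; the same function gives the probability of every smaller number of matches.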

Assessing the number of infected persons. Assessing total population size. Capture-recapture methods. We take a sample of animals/plants and mark them. We take a second sample and count the number of marked individuals. The frequency of marked animals in the second sample should equal the frequency within the total population. Assumptions: closed population; random catches; random dispersal; marked animals do not differ in behaviour. N real = 38

The two sample case. How many persons have a certain infectious disease? You take two samples and count the number of infected persons in the first sample, m1, in the second sample, m2, and the number of infected persons noted in both samples, k.
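If the share of already-noted persons in the second sample equals their share in the whole population, then k / m2 = m1 / N, which gives N = m1 * m2 / k. This is commonly known as the Lincoln-Petersen estimator (the name does not appear in the lecture). A minimal sketch with hypothetical counts:

```python
def lincoln_petersen(m1, m2, k):
    """Estimate total size N from two samples.

    m1: individuals noted (marked) in the first sample
    m2: individuals in the second sample
    k:  individuals noted in both samples
    """
    if k <= 0:
        raise ValueError("at least one individual must occur in both samples")
    return m1 * m2 / k


# Hypothetical counts, chosen only for illustration.
estimated_total = lincoln_petersen(m1=50, m2=40, k=10)
print(estimated_total)  # 200.0
```

The estimate is only as good as the assumptions of a closed population and random sampling.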

In ecology we often have the problem of comparing the species composition of two habitats. Habitat A contains m species, habitat B contains l species, and k species occur in both. The species overlap is measured by the Soerensen distance metric S. We do not know whether S is large or small. To assess the expectation we construct a null model: both habitats contain species of a common species pool of n species. If the pool size n is known, we can estimate how many joint species k two random samples of size m and l out of n contain. The mathematical expectation gives the expected number of joint species. The probability distribution gives the probability to get exactly k joint species.
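The null model can be sketched as follows. The formulas are not preserved in the slide text, so this assumes the usual forms: the Sørensen index as S = 2k / (m + l), the expected number of joint species as m * l / n, and a hypergeometric distribution for the number of joint species when both samples are drawn at random without replacement from the pool. The numbers in the example are hypothetical.

```python
import math


def sorensen(m, l, k):
    """Sørensen index (assumed form): shared species relative to mean richness."""
    return 2 * k / (m + l)


def expected_joint(m, l, n):
    """Expected number of joint species for two random samples from a pool of n."""
    return m * l / n


def prob_joint(k, m, l, n):
    """Probability of exactly k joint species (hypergeometric null model)."""
    if k < 0 or k > min(m, l):
        return 0.0
    return math.comb(m, k) * math.comb(n - m, l - k) / math.comb(n, l)


# Hypothetical example: pool of 100 species, habitats with 30 and 20 species.
n, m, l = 100, 30, 20
distribution = [prob_joint(k, m, l, n) for k in range(min(m, l) + 1)]
mean_from_distribution = sum(k * p for k, p in enumerate(distribution))

print(expected_joint(m, l, n), mean_from_distribution)
print(sorensen(m, l, k=12))
```

An observed k far in the upper tail of `distribution` would indicate more joint species than expected by chance.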

Ground beetle species of two poplar plantations and two adjacent wheat fields near Toruń (Ulrich et al. 2004, Annales Zool. Fenn.). Pool size: 90 to 110 species. There are many more species in common than expected just by chance. The ecological interpretation is that ground beetles colonize fields and adjacent seminatural habitats in a similar manner. Ground beetles do not colonize according to ecological requirements (niches) but according to spatial neighborhood.

First steps in statistics

How to perform a biological study
Theory
Literature: defining the problem, identifying the state of art, formulating specific hypotheses to be tested
Planning: study design, power analysis, choosing the analytical methods, design of the data base
Data: observations, experiments, meta analysis
Analysis: statistical analysis, modelling
Interpretation: comparing with current theory
Publication: scientific writing, expertise

Preparing the experimental or data collecting phase
Let's look a bit closer at data collecting. Before you start collecting any data you have to have a clear vision of what you want to do with these data. Hence you have to answer some important questions:
For what purpose do I collect data?
Did I read the relevant literature?
Have similar data already been collected by others?
Is the experimental or observational design appropriate for the statistical tests I want to apply?
Are the data representative?
How many data do I need for the statistical tests I want to apply?
Does the data structure fit the hypothesis I want to test?
Can I compare my data and results with other work?
How large are the errors in measuring?
Do these errors prevent clear final results?
How large might the errors be for the data to be still meaningful?

How to lie with statistics

Representative sampling

Scientific publications of any type are classically divided into 6 major parts
Title, affiliations and abstract: In this part you give a short and meaningful title that may already contain an essential result. The abstract is a short text containing the major hypothesis and results. The abstract should make clear why a study has been undertaken.
The introduction: The introduction should briefly discuss the state of art and the theories the study is based on, describe the motivation for the present study, and explain the hypotheses to be tested. Do not review the literature extensively, but discuss all of the relevant literature necessary to put the present paper in a broader context. Explain who might be interested in the study and why this study is worth reading!
Materials and methods: A short description of the study area (if necessary), the experimental or observational techniques used for data collection, and the techniques of data analysis used. Indicate the limits of the techniques used.
Results: This section should contain a description of the results of your study. Here the majority of tables and figures should be placed. Do not duplicate data in tables and figures.
Discussion: This part should be the longest part of the paper. Discuss your results in the light of current theories and scientific belief. Compare the results with the results of other comparable studies. Again discuss why your study has been undertaken and what is new. Discuss also possible problems with your data and misconceptions. Give hints for further work.
Acknowledgments: Short acknowledgments, mentioning people who contributed material but did not figure as co-authors. Mention funding institutions.
Literature

The source data base. Each row holds a single data record. Columns contain variables. Variables can be of text or metric type. Never use the original data base for calculations; use only a copy. Take care of empty cells. In calculated cells, take care of impossible values.
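These rules can be sketched in a few lines. The records, the column name `body_length_mm` and the rule that negative lengths are impossible are all hypothetical and serve only as an illustration.

```python
import copy

# Original data base: one record per row, one variable per column.
original = [
    {"site": "A", "body_length_mm": 4.2},
    {"site": "B", "body_length_mm": None},  # empty cell
    {"site": "C", "body_length_mm": -1.0},  # impossible value
]

# All calculations use a copy; the original stays untouched.
working = copy.deepcopy(original)


def find_problems(records, column, lower=0.0):
    """Return (row number, problem) pairs for empty or impossible values."""
    problems = []
    for row_number, record in enumerate(records, start=1):
        value = record.get(column)
        if value is None:
            problems.append((row_number, "empty"))
        elif value < lower:
            problems.append((row_number, "impossible"))
    return problems


problems = find_problems(working, "body_length_mm")
print(problems)
```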

Frequency distribution and cumulative frequency distribution
Spreadsheet columns: No | Raw data | Classes | Class means | Counter | Number of occasions | Frequencies | Cumulative frequencies
Cell formulas:
Counter: =LICZ.JEŻELI(B$2:B$201;"<1")
Number of occasions: =E11-E12
Frequencies: =F11/E$11
Cumulative frequencies: =G11+H…
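The same steps can be sketched outside a spreadsheet: count the values below each class limit, take differences of adjacent counters to get the number of occasions per class, divide by the total to get frequencies, and accumulate. The raw data and class limits below are hypothetical, and the sketch assumes every value lies below the last class limit.

```python
def frequency_table(values, class_limits):
    """Build (upper limit, occasions, frequency, cumulative frequency) rows.

    class_limits must be ascending upper class limits.
    """
    total = len(values)
    # Counter column: number of values below each class limit.
    counters = [sum(1 for v in values if v < limit) for limit in class_limits]
    rows = []
    previous_counter = 0
    cumulative = 0.0
    for limit, counter in zip(class_limits, counters):
        occasions = counter - previous_counter  # difference of adjacent counters
        frequency = occasions / total
        cumulative += frequency
        rows.append((limit, occasions, frequency, cumulative))
        previous_counter = counter
    return rows


raw = [0.2, 0.7, 1.1, 1.5, 1.9, 2.4, 2.5, 3.3]
table = frequency_table(raw, class_limits=[1, 2, 3, 4])
for row in table:
    print(row)
```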

Discrete and continuous distributions
Statistical or probability distributions add up to one.
Discrete distribution: probability generating function (pgf)
Continuous distribution: probability density function (pdf)
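That distributions add up to one can be checked numerically. The sketch below assumes two standard examples that the slide does not specify: a binomial distribution for the discrete case (probabilities are summed) and a standard normal density for the continuous case (the density is integrated with the trapezoid rule).

```python
import math


def binomial_pmf(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Discrete case: the probabilities of all outcomes sum to one.
discrete_total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))

# Continuous case: the area under the density is one (trapezoid rule on [-8, 8]).
steps = 16_000
width = 16 / steps
xs = [-8 + i * width for i in range(steps + 1)]
continuous_total = sum(
    0.5 * (normal_pdf(a) + normal_pdf(b)) * width for a, b in zip(xs, xs[1:])
)

print(discrete_total, continuous_total)
```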

Shapes of frequency distributions

Many statistical methods rely on a comparison of observed frequency distributions with theoretical distributions. Deviations from theory (from expectation), the so-called residuals Δf(x), are measures of statistical significance. If the Δf(x) are too large we accept the hypothesis that our observations differ from the theoretical expectation. The problem in statistical inference is to find the appropriate theoretical distribution that can be applied to our data.
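A minimal sketch of computing residuals between observed and expected frequencies follows. The die counts are hypothetical, and the chi-square sum at the end is one common way to condense the residuals into a single number; it is added here as an illustration and is not named in the lecture text.

```python
def residuals(observed, expected):
    """Deviations of observed from expected frequencies, class by class."""
    return [o - e for o, e in zip(observed, expected)]


# Hypothetical counts from 120 throws of a die; a fair die expects 20 per face.
observed = [18, 22, 20, 25, 15, 20]
expected = [20] * 6

delta = residuals(observed, expected)

# One common summary (an assumption here): squared residuals scaled by expectation.
chi_square = sum(d**2 / e for d, e in zip(delta, expected))

print(delta)
print(chi_square)
```

Whether such a value is "too large" depends on the theoretical distribution it is compared with.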

Homework and literature
Refresh:
Arithmetic, geometric, harmonic mean
Variance, standard deviation, standard error
Central moments
Third and fourth central moment
Mean and variance of power and exponential function statistical distributions
Pseudocorrelation
Sample bias
Coefficient of variation
Representative sample
Prepare for the next lecture:
Bernoulli distribution
Pascal distribution
Hypergeometric distribution
Linear random number
Literature:
Mathe-online
Łomnicki: Statystyka dla biologów.