Overview of probability and statistics


Data scientists constantly deal with collections of facts, or data. The discipline of statistics provides methods for organizing and summarizing data and for drawing conclusions based on the information in the data.

Population and samples An investigation will typically focus on a well-defined collection of objects that make up the population of interest. When the desired information is available for all objects in the population, we have a census. Constraints on time, money, availability, etc. usually make a census impractical. Instead, a subset of the population, a sample, is selected.

Branches of statistics In descriptive statistics, the investigator simply summarizes and describes important features of the data. Graphs (e.g. histograms, boxplots) and numerical summaries (e.g. sample means, standard deviations, and correlation) are used. In inferential statistics, the investigator uses information from the sample to draw conclusions (make inferences) about the population.

Probability versus statistics In probability, properties of the population are modeled and parameters of the model are assumed known. Questions regarding the sample are answered. The model is typically an approximation (hopefully a very good one) of the true process generating the data.

Probability versus statistics (cont.) In statistics, characteristics of a sample are used to draw conclusions about the population. Before we can understand what a particular sample can tell us about the population, we should first understand the uncertainty associated with taking a sample from a given population. That is one reason why we study probability before statistics.
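The two directions of reasoning can be sketched in a few lines of Python. This is only an illustration under assumed specifics that are not in the slides: a binomial model, a known success probability of 0.3, and a made-up sample of ten 0/1 observations.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability: the population parameter p is assumed known (hypothetical value),
# and we ask a question about a sample of size 10.
p_known = 0.3
prob_three_successes = binomial_pmf(3, 10, p_known)

# Statistics: p is unknown; we observe a sample (made-up data) and
# use it to draw a conclusion about the population.
observed_sample = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
p_hat = sum(observed_sample) / len(observed_sample)

print(f"P(3 successes in 10 | p = {p_known}) = {prob_three_successes:.4f}")
print(f"Estimated p from the sample = {p_hat}")
```

The first half goes from population to sample; the second half goes from sample to population, which is why the uncertainty described by the first is needed to judge the second.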

Collecting data Statistics also deals with methods to properly collect data so that the investigator will be able to answer relevant questions. One problem is that the target population may be different from the population that is actually sampled.

Ways to collect data: simple random sampling, stratified sampling, convenience sampling, and designed experiments.

Simple random sampling This method requires a frame (a list of the population units). Every unit has exactly the same chance of being in the sample; more precisely, every possible sample of a given size is equally likely. One can pick numbers out of a hat or use a random number generator to pick the sample.
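A minimal sketch of drawing a simple random sample with a random number generator, using Python's standard `random` module. The frame of unit IDs 1 to 100, the sample size, and the function name `simple_random_sample` are hypothetical choices made for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame without replacement.

    Every subset of size n is equally likely to be chosen.
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(list(frame), n)


# Hypothetical frame: a list of unit IDs 1..100.
frame = range(1, 101)
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```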

Stratified sampling The population is separated into non-overlapping groups (strata) and a sample is taken from each group. This helps to ensure that no one group is over- or under-represented in the sample.
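One way to sketch this in Python is to group the units by stratum and then draw a simple random sample within each group. The population of `(id, region)` pairs, the equal per-stratum sample size, and the function name are assumptions for illustration; in practice the sample size per stratum is often chosen in proportion to the stratum size.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Split units into non-overlapping strata and sample within each one."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    # Draw without replacement inside every stratum (capped at the stratum size).
    return {
        label: rng.sample(members, min(n_per_stratum, len(members)))
        for label, members in strata.items()
    }


# Hypothetical population: 10 "north" units and 20 "south" units.
population = [(i, "north" if i % 3 == 0 else "south") for i in range(30)]
chosen = stratified_sample(population, stratum_of=lambda u: u[1], n_per_stratum=4, seed=0)
for label, members in chosen.items():
    print(label, members)
```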

Convenience sampling The individuals are selected without systematic randomization. An example is a phone-in poll or an internet poll. There is always the question of whether this type of sample is representative of the population (e.g., only people with strong opinions take the time to phone in to a telephone poll).

Designed experiment Different treatments (such as fertilizers or coatings for corrosion protection) are allocated to various experimental units (plots of land or pieces of pipe). The levels of the factors making up the treatments are varied to study their effects. We have means of dealing with variables that we don't want to affect the outcomes (e.g., we can keep them fixed across the experiment). This design is better than sampling if we want to establish causation.
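The slide does not say how treatments are allocated to units. One common choice, assumed here, is to allocate them at random so that unmeasured differences between units are not systematically tied to a treatment. The plot names, treatment labels, and function name below are hypothetical.

```python
import random


def allocate_treatments(units, treatments, seed=None):
    """Randomly allocate treatments to experimental units in near-equal group sizes."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # random order of the units
    # Deal treatments out round-robin, so group sizes differ by at most one.
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}


# Hypothetical experiment: 8 plots of land, 2 fertilizers.
plots = [f"plot_{i}" for i in range(1, 9)]
allocation = allocate_treatments(plots, ["fertilizer_A", "fertilizer_B"], seed=1)
for plot in plots:
    print(plot, "->", allocation[plot])
```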

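Numerical measures of location The sample mean is given by $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_1, \dots, x_n$ are the $n$ observations. The sample median is the $\left(\frac{n+1}{2}\right)$th ordered value if $n$ is odd, and the average of the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th ordered values if $n$ is even.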
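The two measures of location can be sketched directly from their definitions. The data values are made up, and the hand-written functions are compared with Python's standard `statistics` module only as a sanity check.

```python
import statistics


def sample_mean(xs):
    """Sum of the observations divided by the number of observations."""
    return sum(xs) / len(xs)


def sample_median(xs):
    """Middle ordered value if n is odd, average of the two middle values if n is even."""
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [7, 1, 3, 9, 5]  # made-up data, n = 5
even_data = [7, 1, 3, 9]    # made-up data, n = 4

print(sample_mean(odd_data), sample_median(odd_data))
print(sample_mean(even_data), sample_median(even_data))
print(statistics.median(odd_data), statistics.median(even_data))
```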

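Measures of variability The simplest measure of variability is the range, the difference between the largest and smallest observations. Another measure is the sample variance, defined as $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$, where $\bar{x}$ is the sample mean of the $n$ observations $x_1, \dots, x_n$. The sample standard deviation is $s = \sqrt{s^2}$.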
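A minimal sketch of these measures of variability. The formula on the original slide did not survive extraction, so the usual convention of dividing by n - 1 in the sample variance is assumed here; the data values are made up.

```python
import math
import statistics


def sample_range(xs):
    """Difference between the largest and smallest observation."""
    return max(xs) - min(xs)


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed convention)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_std(xs):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

print("range:", sample_range(data))
print("variance:", sample_variance(data))
print("standard deviation:", sample_std(data))
```

Python's `statistics.variance` and `statistics.stdev` use the same n - 1 divisor, so they can serve as a cross-check.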