Lecture 1: Sections 1.1 – 1.2


Objectives: Populations, samples, and processes. Visual displays for univariate data: numerical variables (stem-and-leaf displays, dotplots, histograms) and categorical variables (bar charts, pie charts). Introduction to R.

Branches of Statistics. Descriptive Statistics / Exploratory Data Analysis (EDA), Chapters 1 – 3: used to summarize and describe important features of the data, either graphically or numerically. Inferential Statistics, Chapters 7 – 8: involves techniques for generalizing from a sample to a population.

Population vs. Sample. Population: the entire group of individuals in which we are interested but which we usually cannot assess directly, e.g. all individuals who received a B.S. in engineering in 2011. Sample: the part of the population that we actually examine and for which we do have data. The sample is selected in some prescribed manner.
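One "prescribed manner" of selecting a sample is simple random sampling. The slide does not say which method is meant, so the R sketch below is only an illustration: the population is a made-up list of 5000 ID numbers, and the sample size of 50 is likewise an assumption.

```r
# Hypothetical population: ID numbers standing in for 5000 graduates.
# Both the population size and the sample size are made up for illustration.
set.seed(1)                            # makes the random draw reproducible
population <- 1:5000
srs <- sample(population, size = 50)   # 50 distinct IDs, drawn without replacement
length(srs)                            # number of individuals actually examined
```

Only the 50 selected individuals would be measured; everything said about the other 4950 is inference.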

Variables. We are usually interested only in certain characteristics of the objects in a population, e.g. the age or the gender of an engineering graduate. A characteristic may be categorical (e.g. gender) or numerical (e.g. age). A variable is any characteristic whose value may change from one object to another in the population. We will use lowercase letters to denote variables. Example: x = age of a graduating engineer; y = braking distance of an automobile under specified conditions.

Discrete or Continuous. A variable is discrete if its set of possible values is either finite or can be listed in an infinite sequence, e.g. x takes the values 0, 1, 2, 3, …. A variable is continuous if its possible values consist of an entire interval on the number line, e.g. x is the pH of a chemical substance, so x can take values like 7.0, 7.03, 7.032, etc.

Univariate vs. Multivariate. A univariate data set consists of observations on a single variable, e.g. type of transmission. A bivariate data set consists of observations on two variables, e.g. a (height, weight) pair for each basketball player. A multivariate data set consists of observations on more than two variables.

Example 1.1 (pg. 4). The tragedy that befell the space shuttle Challenger and its astronauts in 1986 led to a number of studies to investigate the reasons for mission failure. Attention quickly focused on the behavior of the rocket engine's O-rings. Here are data consisting of observations on x = O-ring temperature (°F) for each test firing or actual launch of the shuttle rocket engine (Presidential Commission on the Space Shuttle Challenger Accident, Vol. 1, 1986: 129-131).

84 49 61 40 83 67 45 66 70 69 80 58 68 60 67 72 73 70 57 63 70 78 52 67 53 67 75 61 70 81 76 79 75 76 58 31

Without any organization, it is very difficult to get a sense of what a typical or representative temperature might be, whether the values are highly concentrated about a typical value or quite spread out, whether there are any gaps in the data, what percentage of the values are in the 60s, and so on.
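Even before any plot is drawn, a little organization answers some of these questions. The following R sketch sorts the temperatures and computes the percentage of values in the 60s; the variable name in_60s is introduced here for illustration only.

```r
# O-ring temperatures from Example 1.1
x <- c(84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58, 68, 60, 67, 72, 73, 70,
       57, 63, 70, 78, 52, 67, 53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31)

sort(x)                       # ordered values are much easier to scan

in_60s <- x >= 60 & x < 70    # TRUE for observations from 60 up to (not including) 70
sum(in_60s)                   # number of observations in the 60s
100 * mean(in_60s)            # percentage of observations in the 60s
```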

How to examine a distribution? The distribution of a variable tells us what values the variable takes and how often it takes these values. 1. Almost always plot the data as a preliminary analysis. 2. Look for the overall pattern: shape, location, and spread. 3. Look for striking deviations from the overall pattern: outliers.
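A rough numerical companion to these steps, applied to the temperatures of Example 1.1, might look as follows. Using the median for location, the range for spread, and gaps between consecutive sorted values to flag possible outliers is one reasonable choice made here for illustration, not the lecture's prescribed method.

```r
# O-ring temperatures from Example 1.1
x <- c(84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58, 68, 60, 67, 72, 73, 70,
       57, 63, 70, 78, 52, 67, 53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31)

median(x)               # location: a typical value
range(x)                # spread: smallest and largest observation

gaps <- diff(sort(x))   # distances between consecutive ordered values
max(gaps)               # an unusually large gap hints at a value set apart from the rest
which.max(gaps)         # position of that gap in the ordered data
```

Here the largest gap sits at the very bottom of the ordered data: the lowest temperature, 31, lies 9 degrees below the next value, 40.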

Stem-and-Leaf Plot. How to make a stemplot: 1. Separate each observation into a stem, consisting of all but the final (rightmost) digit, and a leaf, which is that remaining final digit. Stems may have as many digits as needed, but each leaf contains only a single digit. 2. Write the stems in a vertical column with the smallest value at the top, and draw a vertical line at the right of this column. 3. Write each leaf in the row to the right of its stem, in increasing order out from the stem.

For the O-ring temperatures of Example 1.1:

Stem | Leaf
   3 | 1
   4 | 059
   5 | 23788
   6 | 01136777789
   7 | 000023556689
   8 | 0134
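The three steps can be carried out by hand in R, which makes the construction explicit. This is a minimal sketch for two-digit data such as the temperatures of Example 1.1; the names stems, leaves, and leaves_by_stem are chosen here for illustration.

```r
# O-ring temperatures from Example 1.1
x <- c(84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58, 68, 60, 67, 72, 73, 70,
       57, 63, 70, 78, 52, 67, 53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31)

sorted <- sort(x)        # sorting first puts the leaves in increasing order
stems  <- sorted %/% 10  # step 1: all but the final digit
leaves <- sorted %% 10   # step 1: the final digit

# Steps 2 and 3: one row per stem, leaves written out from the stem
leaves_by_stem <- sapply(split(leaves, stems), paste, collapse = "")
for (s in names(leaves_by_stem)) {
  cat(s, "|", leaves_by_stem[[s]], "\n")
}
```

The built-in stem() function automates this, although it picks its own stem spacing and may lay the display out differently.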

Histogram. The range of values that a variable can take is divided into equal-size intervals. The histogram shows the number of individual data points that fall in each interval. Histogram shapes: unimodal, bimodal, or multimodal; symmetric, positively skewed, or negatively skewed.
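The counting behind a histogram can be made explicit in R. The sketch below bins the temperatures of Example 1.1 into intervals of width 10; the break points 30, 40, ..., 90 are an assumption chosen to line up with the stems of the stem-and-leaf display.

```r
# O-ring temperatures from Example 1.1
x <- c(84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58, 68, 60, 67, 72, 73, 70,
       57, 63, 70, 78, 52, 67, 53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31)

breaks  <- seq(30, 90, by = 10)                    # equal-size intervals
classes <- cut(x, breaks = breaks, right = FALSE)  # [30,40), [40,50), ..., [80,90)
counts  <- table(classes)                          # number of data points per interval
counts
```

Calling hist(x, breaks = breaks, right = FALSE) draws bars with these same heights.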

Categorical Data. Because the variable is categorical, the categories in the graph can be ordered any way we want (alphabetically, by increasing value, by year, by personal preference, etc.). Bar graphs: each category is represented by a bar. A Pareto diagram is a bar chart from a quality control study in which the bars are ordered by decreasing frequency. Pie charts: the slices must represent the parts of one whole.

Some examples in R

Example 1. Space Shuttle Challenger accident data (the O-ring temperatures of Example 1.1).

Stem-and-leaf graph:

    x <- c(84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58, 68, 60, 67, 72, 73, 70,
           57, 63, 70, 78, 52, 67, 53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31)  # temperature data
    length(x)  # the sample size
    stem(x)    # stem-and-leaf plot

Histogram:

    hist(x, main = "Histogram of Temperature", xlab = "Temperature")  # histogram
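The objectives also list dotplots, for which the slides give no R example. One way to draw a dotplot in base R is stripchart() with stacked points; the plotting options below (pch, the axis label) are illustrative choices rather than part of the lecture.

```r
# O-ring temperatures from Example 1.1
x <- c(84, 49, 61, 40, 83, 67, 45, 66, 70, 69, 80, 58, 68, 60, 67, 72, 73, 70,
       57, 63, 70, 78, 52, 67, 53, 67, 75, 61, 70, 81, 76, 79, 75, 76, 58, 31)

# Each observation is one dot; repeated values are stacked vertically
stripchart(x, method = "stack", pch = 16, xlab = "Temperature")

value_counts <- table(x)   # height of each stack of dots
max(value_counts)          # height of the tallest stack
```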

Example 2. In the manufacture of printed circuit boards, finished boards are subjected to a final inspection before they are shipped to customers. Here are data on the type of defect for each board rejected at final inspection during a particular time period.

Type of defect               Frequency
Low copper plating           112
Poor electroless coverage    35
Lamination problems          10
Plating separation           8
Etching problems             5
Miscellaneous                12

    defect <- c("Low copper plating", "Poor electroless coverage", "Lamination problems",
                "Plating separation", "Etching problems", "Miscellaneous")  # type of defect
    frequency <- c(112, 35, 10, 8, 5, 12)  # frequency

Bar graph:

    barplot(frequency, names.arg = defect)  # barplot

Pie chart:

    pie(frequency, labels = defect)  # pie chart
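Since these data come from a quality control study, the same frequencies can also be shown as a Pareto diagram by sorting the bars in decreasing order. The sketch below is one way to do this in base R; the name pareto and the plotting options (las, cex.names) are illustrative choices.

```r
defect <- c("Low copper plating", "Poor electroless coverage", "Lamination problems",
            "Plating separation", "Etching problems", "Miscellaneous")
frequency <- c(112, 35, 10, 8, 5, 12)
names(frequency) <- defect                      # attach each label to its count

pareto <- sort(frequency, decreasing = TRUE)    # largest frequency first
round(100 * pareto / sum(pareto), 1)            # each defect's share of all rejections, in percent

# las = 2 turns the labels so the long category names stay readable
barplot(pareto, las = 2, cex.names = 0.7, ylab = "Frequency")
```

This ordering is purely by count, so "Miscellaneous" lands in third place; some presentations instead put a catch-all category last.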