Lecture 1 Describing Data.


Histogram and frequency table. Example: Visualizing your clients’ age range using a histogram.

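Histogram Example. [Figure: histogram of client ages with its frequency table. Columns: Age range, Frequency. Age-range bins: ~15, ~20, ~25, ~30, ~35, ~40, ~45, ~50, ~55, ~60, More. Frequencies that survive in the transcript: ~25: 4; ~30: 5; ~35: 11; ~45: 6; ~55: 2.]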

From the histogram, we can learn that clients aged 31~35 and 36~40 are the primary clients. It is important to maintain the satisfaction of these clients. Providing new services for other age ranges could increase the client base.

Making a Histogram and Frequency Table: Open the data file “Clients list”, which is stored in our Applied Stat Folder. This is the data for the histogram shown in the previous slides.
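As a rough sketch of how a frequency table like the one in the histogram example could be built, the Python snippet below counts ages in bins labelled by their upper edge. The ages are made up for illustration and are not the “Clients list” data; the function name `frequency_table` and the reading of “~35” as “ages 31 to 35” are assumptions.

```python
from collections import Counter


def frequency_table(ages, bin_width=5, first_upper=15, last_upper=60):
    """Count ages in bins labelled by their upper edge (~15, ~20, ..., ~60, More)."""
    uppers = list(range(first_upper, last_upper + 1, bin_width))
    counts = Counter()
    for age in ages:
        # Use the first bin whose upper edge is >= age; anything larger goes to "More".
        label = next((f"~{u}" for u in uppers if age <= u), "More")
        counts[label] += 1
    labels = [f"~{u}" for u in uppers] + ["More"]
    return [(label, counts[label]) for label in labels]


# Hypothetical ages, NOT the course's "Clients list" data.
ages = [23, 28, 31, 33, 34, 36, 38, 40, 44, 52, 61]

# Print the frequency table with a simple text histogram next to it.
for label, count in frequency_table(ages):
    print(f"{label:>5} {count:>3} {'#' * count}")
```

Each row shows the bin label, its frequency, and a bar of `#` characters whose length equals the frequency, which is a text version of the histogram.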

Numerical measures of data summary (I): Difference between Population and Sample; Mean (Average); Median.

Difference between Population and Sample: A population is the complete set of all items in which an investigator is interested.

Examples of Populations: Names of all registered voters in the United States. Incomes of all families living in Daytona Beach. Grade point averages of all the students in your university.

A major objective of statistics is to make an inference about the population. For example: “What is the average income of all families living in Daytona Beach?” Often, collecting the data for the whole population is costly or impossible. Therefore, we often collect data for only a part of the population. Such data is called a “sample”.

Sample: A sample is an observed subset of population values.

Numerical measures for summarizing data. 1-1 Mean (Average): How to compute the mean (average); understanding the mathematical notation of the mean (average); cautionary notes for the use of the mean.

1-2 How to compute the mean: Sum all the data values, then divide the sum by the number of observations. We use the term “sample size” to mean the number of observations.

1-3 Computing the mean: an example

Client ID   Age
1           49
2           37
3           48
4           46
5           (age not shown in the transcript)

This is sample data on the ages of your business clients. Compute the mean age of your clients in this sample. Note that this is a typical data format that we will encounter in this course. It has the observation id (Client ID) and the value of the variable of interest (age) for each observation.
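The computation can be sketched in a few lines of Python. This is only an illustration: it assumes the four ages that are legible in the example table (49, 37, 48, 46) and leaves out the rows whose ages are not available here, so the result is the mean of those four values and not necessarily of the full sample.

```python
# Ages of clients 1 to 4 as shown in the example table.
# Assumption: rows with no legible age are left out.
ages = [49, 37, 48, 46]

sample_size = len(ages)        # number of observations
total = sum(ages)              # sum of all the data values
mean_age = total / sample_size # mean = sum / sample size

print(f"sample size = {sample_size}, sum = {total}, mean = {mean_age}")
```

For these four values the sum is 180, so the mean age is 180 / 4 = 45.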

2-1 Understanding the mathematical notation of the mean

Observation id   Variable X
1                x1
2                x2
3                x3
...              ...
n                xn

This is one of the most common data formats that we deal with. The first column holds the observation id, and the second column holds the value for each observation. (Often the observation id is omitted.) In the previous example, variable X is the age of the clients. Then observation id = 1 means that this is the first customer in your customer list, and x1 is the age of that customer.

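2-2 Understanding the mathematical notation of the mean: When a data set is given in this format (observation ids 1, 2, 3, ..., n with values $x_1, x_2, x_3, \ldots, x_n$), the sample mean of the variable X, denoted by $\bar{x}$, is given by $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}$. The notation $\sum_{i=1}^{n} x_i$ is the summation notation. It is simply the sum from $x_1$ to $x_n$.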

2-3 Sample Mean and Population Mean: Most often we use sample data. For example, if we want to know the popularity rating of the current government, we may use data from 10,000 interviews. This is just a part of the whole voting population. Occasionally, though not often, we may have data from the whole population.

2-4 Sample Mean and Population Mean: Later, it will become convenient to distinguish the sample mean from the population mean. Thus we will use different notations for the sample mean and the population mean.

2-5 Notations for the sample mean and the population mean: For the sample mean, we use the notation $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where lower case n is the sample size. For the population mean, we use $\mu$. We also use upper case N to denote the population size.
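To make the distinction concrete, here is a small Python sketch that computes a population mean and the mean of one sample drawn from that population. The population is randomly generated, made-up income data (not real Daytona Beach figures), and the variable names `mu`, `x_bar`, `N`, and `n` are chosen only to mirror the notation above.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: N = 1000 made-up family incomes.
population = [random.randint(20_000, 120_000) for _ in range(1000)]
N = len(population)           # population size (upper case N)
mu = sum(population) / N      # population mean

# Observe only a subset of the population: a sample of size n.
n = 50                        # sample size (lower case n)
sample = random.sample(population, n)
x_bar = sum(sample) / n       # sample mean

print(f"population mean (N={N}): {mu:.1f}")
print(f"sample mean     (n={n}): {x_bar:.1f}")
```

The two means are usually close but not equal, which is why it helps to keep separate symbols for them.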

3-1 Cautionary note: The mean (average) is not necessarily the “center of the data”.

3-2 Example: “The average Japanese household savings in year 2005 is ¥17,280,000.” This figure may make you feel, “Well, if I do not have this much in savings, I am not normal.” Now, take a look at the histogram of household savings in the next slide.

The mean may not be “the center of the data”: An example. [Figure: histogram of household savings, annotated “About 50% of people are here”.]

One may think that the average is the “normal household”. However, you can see that many households have savings much lower than the average. The average savings is very high because a few households have huge savings. In such a case, the “median” can give you a better sense of a “normal household”. The definition of the median is given in the next slide.
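The effect described here can be reproduced with a small, made-up data set. The Python sketch below uses hypothetical savings values (in units of ¥10,000; these are not the actual Japanese household data) in which one household has very large savings, and compares the mean with the median using the standard library's `statistics` module.

```python
import statistics

# Hypothetical household savings in units of 10,000 yen (illustrative only).
# Nine ordinary households and one household with huge savings.
savings = [100, 200, 300, 400, 500, 600, 700, 800, 900, 20000]

mean_savings = statistics.mean(savings)      # pulled upward by the one huge value
median_savings = statistics.median(savings)  # middle of the sorted data

# Count how many households fall below the mean.
below_mean = sum(1 for s in savings if s < mean_savings)

print(f"mean   = {mean_savings}")
print(f"median = {median_savings}")
print(f"{below_mean} of {len(savings)} households are below the mean")
```

With these numbers the mean is 2450 while the median is 550, and nine of the ten households lie below the mean, so the mean says little about a typical household.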

4-1 Median: Sort the data in ascending order. Then the median is the value in the middle (the middle observation). When the number of observations is even, there is no “middle observation”. In such a case, take the average of the two middle numbers.

4-2 Median Exercise: Open the file “Computation of median A”. This data contains the ages of a company’s clients. Find the median age of this sample. Open the file “Computation of median B”. This data contains the revenue of bag sales. Find the median of this sample.
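The rule above translates directly into code. The following Python function is a minimal sketch of it; the function name `median` and the example values are illustrative and are not taken from the exercise files.

```python
def median(values):
    """Return the median: the middle value of the sorted data,
    or the average of the two middle values when the count is even."""
    ordered = sorted(values)          # sort in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the middle observation
    # even count: average of the two middle numbers
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([49, 37, 48]))      # odd number of observations
print(median([49, 37, 48, 46]))  # even number of observations
```

For the three values the sorted data is 37, 48, 49, so the median is 48. For the four values the sorted data is 37, 46, 48, 49, so the median is (46 + 48) / 2 = 47.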

Japanese household savings revisited

Corresponding chapters: This lecture note covers the following sections of the textbook: 1.2 Sampling; 3.1 Arithmetic Mean, Median.