Initial Data Analysis Frequency

IDA
• Often overlooked or dismissed as not all that important, but…
• Much trouble can be avoided at the beginning stages. If the data are glossed over, the result can be missed findings, or results that cannot be replicated because they rest on bad data.
• Bad data?

• IDA includes:
  - A healthy inspection of each individual variable's behavior
  - Outlier analysis
  - Descriptive and graphical output
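A first pass over a single variable can be sketched as follows. This is only an illustration: the slides do not name an outlier rule, so the 1.5 × IQR fence used here is an assumption, and the `ages` values and the `inspect_variable` helper are hypothetical.

```python
from statistics import mean, median, quantiles

def inspect_variable(values):
    """Return basic descriptives plus values flagged by the 1.5 * IQR rule.

    The IQR rule is an assumed choice; the slides only say "outlier analysis".
    """
    q1, _, q3 = quantiles(values, n=4)      # quartiles (default 'exclusive' method)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "outliers": [v for v in values if v < low or v > high],
    }

# Hypothetical ages; 91 is a plausible data-entry error for 19.
ages = [18, 19, 19, 20, 20, 20, 21, 22, 23, 91]
report = inspect_variable(ages)
for name, value in report.items():
    print(f"{name:>8}: {value}")
```

A flagged value is a prompt to go back to the raw record, not an automatic deletion.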

Describing and Exploring Data
• Once a set of data has been collected, the raw numbers must be manipulated in some fashion to make them more informative.
• Several options are available, including plotting the data or calculating descriptive statistics.

Plotting Data
• Often, the first thing one does with a set of raw data is to plot frequency distributions.
• Usually this is done by first creating a table of the frequencies broken down by values of the relevant variable, and then plotting the frequencies from that table in a histogram.

Frequency Data
• Example: Age as estimated by a questionnaire in a statistics class.
• Note: The frequencies in the adjacent table were calculated by simply counting the number of subjects having each value of the age variable.
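The counting step described in the note can be sketched in a few lines. The `ages` list below is hypothetical (the original table did not survive), and `frequency_table` is an illustrative helper name.

```python
from collections import Counter

def frequency_table(values):
    """Count how many observations take each distinct value, sorted by value."""
    return sorted(Counter(values).items())

# Hypothetical questionnaire responses.
ages = [19, 20, 20, 21, 19, 22, 20, 21, 20, 23]
table = frequency_table(ages)

print(f"{'Age':>5} {'Freq':>5}")
for value, freq in table:
    print(f"{value:>5} {freq:>5}")
```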

Grouping data
• Plotting is easy when the variable of interest has a relatively small number of values (as our age variable did).
• However, the values of a variable are sometimes more continuous, and plotting one bar per value then yields an uninformative frequency plot.

Grouped Frequency Distribution
Example: Binning our weight variable.
• For example, with a variable like weight we might obtain a range from 100 lb to 200 lb. If we used the previously described technique, we would end up with about 100 bars, most of them with a frequency below 2 or 3 (and many with a frequency of zero).
• We can get around this problem by grouping our values into bins. Try for around 10 classes (or bins) with natural splits.
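A minimal sketch of the binning idea, assuming ten bins of width 10 lb starting at 100 lb. The `weights` values and the `grouped_frequencies` helper are hypothetical, and placing the maximum (200) in the last bin is one possible convention, not something the slides specify.

```python
def grouped_frequencies(values, start, width, n_bins):
    """Count values per bin [start + i*width, start + (i+1)*width).

    The last bin is closed on the right so the maximum value is not dropped.
    """
    end = start + width * n_bins
    counts = [0] * n_bins
    for v in values:
        if not start <= v <= end:
            raise ValueError(f"{v} lies outside [{start}, {end}]")
        idx = min(int((v - start) // width), n_bins - 1)
        counts[idx] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Hypothetical weights in pounds.
weights = [102, 118, 125, 131, 134, 140, 147, 152, 155, 158, 163, 171, 186, 200]
bins = grouped_frequencies(weights, start=100, width=10, n_bins=10)

# Crude text histogram: one asterisk per observation.
for lo, hi, count in bins:
    print(f"{lo:>3}-{hi:<3} {'*' * count}")
```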

Graphic Depiction of Frequency
• Histogram
  - Similar to a bar chart; the only difference is that histograms represent non-nominal data.
  - Age example

Weight example
• Check out this demo, which shows how the bin width you select can affect the “look” of the data.
• Here is another demonstration of the effects of bin width.

Number of Classes and Class Width
• The number of classes should be between 5 and 15.
  - Fewer than 5 classes cause excessive summarization.
  - More than 15 classes tend not to add much.
• Class Width
  - Divide the range by the number of classes for an approximate class width.
  - Round up to a convenient number.
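The class-width rule can be sketched as below. Treating "a convenient number" as the next multiple of 5 is an assumption made for illustration, as are the `class_width` helper and the example minimum and maximum of 23 and 78.

```python
import math

def class_width(minimum, maximum, n_classes, round_to=5):
    """Return (approximate width, width rounded up to a multiple of round_to)."""
    approx = (maximum - minimum) / n_classes
    convenient = math.ceil(approx / round_to) * round_to
    return approx, convenient

# Weight example from the slides: range 100 lb to 200 lb, 10 classes.
print(class_width(100, 200, 10))

# Hypothetical scores running from 23 to 78, split into 6 classes.
approx, convenient = class_width(23, 78, 6)
print(f"approximate width {approx:.2f}, rounded up to {convenient}")
```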

Example of Ungrouped Data
Scores on a social introversion inventory

Relative Frequency
[Table: Class Interval, Frequency, Relative Frequency. Six class intervals running from 20 to under 80. Totals: 50 and 1.00.]

Cumulative Frequency
[Table: Class Interval, Frequency, Cumulative Frequency. Same six class intervals. Total: 50.]

Class Midpoints, Relative Frequencies, and Cumulative Frequencies
[Table: Class Interval, Frequency, Midpoint, Relative Frequency, Cumulative Frequency. Same six class intervals. Totals: 50 and 1.00.]

Cumulative Relative Frequencies
[Table: Class Interval, Frequency, Relative Frequency, Cumulative Frequency, Cumulative Relative Frequency. Same six class intervals. Totals: 50 and 1.00.]
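The derived columns in these tables all follow from the class frequencies, which the sketch below makes explicit. The frequencies used here are hypothetical (chosen only so that they total 50), and equal class widths of 10 from 20 to 80 are an assumption inferred from the surviving table headers; `summarize_classes` is an illustrative helper name.

```python
from itertools import accumulate

def summarize_classes(bounds, freqs):
    """Build midpoint, relative, cumulative and cumulative relative columns."""
    total = sum(freqs)
    rows = []
    for (lo, hi), f, cf in zip(bounds, freqs, accumulate(freqs)):
        rows.append({
            "interval": f"{lo}-under {hi}",
            "frequency": f,
            "midpoint": (lo + hi) / 2,        # centre of the class
            "relative": f / total,            # share of all observations
            "cumulative": cf,                 # running total of frequencies
            "cumulative_relative": cf / total,
        })
    return rows

bounds = [(lo, lo + 10) for lo in range(20, 80, 10)]   # assumed class limits
freqs = [5, 15, 14, 10, 5, 1]                          # hypothetical, sums to 50
rows = summarize_classes(bounds, freqs)

for r in rows:
    print(f"{r['interval']:<12} {r['frequency']:>3} {r['midpoint']:>5.1f} "
          f"{r['relative']:>5.2f} {r['cumulative']:>3} {r['cumulative_relative']:>5.2f}")
```

The last cumulative frequency always equals the total, and the last cumulative relative frequency always equals 1.00, which gives a quick check on a hand-built table.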

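Histogram Construction
[Table: Class Interval, Frequency. Six class intervals running from 20 to under 80; the last class has a frequency of 1.]

Frequency Polygon
[Table: Class Interval, Frequency. Six class intervals running from 20 to under 80; the last class has a frequency of 1.]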

Advantages/Disadvantages
• With the grouped frequency distribution we can take large data sets and make them much more manageable and easier to understand.
• However, we also lose information about individual data points.

Stem and Leaf Plots
• If values of a variable must be grouped prior to creating a frequency plot, then the information about the specific values is lost in the process (i.e., the resulting graph depicts only the frequencies of the grouped values).
• However, it is possible to obtain the graphical advantage of grouping and still keep all of the information if stem & leaf plots are used.

Stem and Leaf Plots
• These plots are created by splitting each data point into the part associated with the ‘group’ (the stem) and the part associated with the individual point (the leaf).
• For example, the numbers 180, 180, 181, 182, 185, 186, 187, 187, 189 could be represented as:
  Stem | Leaf
  18   | 0 0 1 2 5 6 7 7 9
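The split into stem and leaf can be sketched as follows, using the nine numbers from the example. Taking the last digit as the leaf and everything before it as the stem is an assumption that suits whole-number data like this; `stem_and_leaf` is an illustrative helper name.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by stem (all but the last digit) and list their leaves."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 187 -> stem 18, leaf 7
        plot[stem].append(leaf)
    return dict(plot)

data = [180, 180, 181, 182, 185, 186, 187, 187, 189]
plot = stem_and_leaf(data)

for stem in sorted(plot):
    print(f"{stem} | {' '.join(str(leaf) for leaf in plot[stem])}")
```

Every original value can be read back from the plot (stem × 10 + leaf), which is exactly the information a grouped frequency table discards.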

Construction of Stem and Leaf Plot
[Figure: raw data shown alongside the stem and leaf columns built from it.]

• Thus, we could represent our weight data in the following stem & leaf plot:

• Stem & leaf plots are especially nice for comparing distributions.

Advantages
• Using a stem and leaf plot offers several advantages:
  - It retains individual data points
  - It displays large amounts of data well (compared to a normal frequency distribution)
  - It provides a ‘graphical’ display of the data
• Disadvantage
  - Kind of ugly

Terminology Related to Distributions
• Often, frequency histograms have a roughly symmetrical bell shape; such distributions are called normal or Gaussian.

• Sometimes, the bell shape is not symmetrical.
• The term positive skew refers to the situation where the “tail” of the distribution is to the right; negative skew is when the “tail” is to the left.
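The direction of the tail can also be checked numerically. The sketch below uses the moment coefficient of skewness (third central moment divided by the 1.5 power of the second), which is one common definition and not something the slides specify; the three small data sets are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]      # long tail to the right
left_tailed = [-v for v in right_tailed]      # mirror image, tail to the left

for name, data in [("symmetric", symmetric),
                   ("right-tailed", right_tailed),
                   ("left-tailed", left_tailed)]:
    print(f"{name:<13} skewness = {skewness(data):+.3f}")
```

A positive value corresponds to positive skew (tail to the right) and a negative value to negative skew (tail to the left); a value near zero is consistent with a symmetrical shape.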

Example: Pizza Data

Distribution Shapes
[Figure panels: Normal, Positively Skewed, Negatively Skewed, Bimodal]