hss2381A – stats... or whatever


hss2381A – stats... or whatever Univariate Analysis, part 1 Descriptive Statistics

Evidence-based practice (EBP): Use of best clinical evidence in making patient care decisions Best source of evidence: Systematic research

Evidence-Based Medicine (EBM) or Evidence-Based Practice (EBP) Questions: How reliable is the evidence? What is the magnitude of effects? How precise is the estimate of effects? Answering these questions requires an understanding of statistics

Data and Data Analysis In the context of a study, the information gathered to address research questions is data In quantitative research, data are usually quantitative (numbers) Quantitative data are subjected to statistical analysis

Examples of Independent and Dependent Variables Independent variable (IV): Smoking Dependent variable (DV): Lung cancer IV → DV?

Research Question Research questions are the queries researchers seek to answer through the collection and analysis of data Research questions communicate the research variables and the population (the entire group of interest) Example: In hospitalized children (population) does music (IV) reduce stress (DV)?

Defining a Variable Two phases: Conceptual and operational

Defining a Variable In studies, variables need to be defined Conceptual definition: The theoretical meaning of the underlying concept Operational definition: The precise set of operations and procedures used to measure the variable Example: Concept = how long have you been on this planet? Operation = Which age group, in years, are you in?

Descriptive Statistics Researchers collect their data from a sample of study participants—a subset of the population of interest Descriptive statistics describe and summarize data about the sample Examples: Percent female in the sample, average weight of participants

Inferential Statistics Researchers obtain data from a sample but often want to draw conclusions about a population Parameter: A descriptive index for a population Example: Average daily caloric intake of all 10-year-old children in New York Statistic: A descriptive index for a sample Example: Average daily caloric intake of 300 10-year-old children from three particular NY schools

SPSS and Statistical Analysis SPSS (Statistical Package for the Social Sciences) is among the most popular statistical software packages for analyzing research data It is user friendly and menu driven The datasets offered with this textbook are set up as SPSS files

The Data Editor in SPSS The data editor in SPSS offers a convenient spreadsheet-like method of creating, editing, and viewing data There are two “views” within the data editor: Data View: Shows the actual data values Variable View: Shows variable information for all variables

Data View in the Data Editor The columns represent one variable each; unique variable names (no more than eight characters long) are shown at the top of each column Each row is a case, representing an individual participant The data view tab is at the bottom

Variable View in the Data Editor Variable View shows a wealth of information about how variables are coded, how they will be labeled in output, level of measurement, and so on The Variable View tab is at the bottom

Versions of SPSS New versions of SPSS are created regularly, to offer improved options for analysis and presentation Examples in this book were created in SPSS Version 16.0 The student version of SPSS is available for analyzing relatively small datasets (no more than 50 variables and no more than 1,500 cases)

FREQUENCY DISTRIBUTION What is this? Same as Histogram?

Frequency Distributions A frequency distribution is a systematic arrangement of data values, with a count of how many times each value occurred in a dataset You can portray this as a table or as a graph

Constructing a Frequency Distribution List each data value in a sequence (usually, ascending order) 1, 2, 3, 4, 5… Tally each occurrence of the value Total the frequencies for each value (f) The sum of fs for all data values must equal the sample size: Σf = N
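
The steps above can be sketched in a few lines of Python. This is only an illustration: the function name and the scores are made up, not taken from the course materials.

```python
from collections import Counter

def frequency_distribution(values):
    """Return (data value, f) pairs, listed in ascending order of value."""
    counts = Counter(values)       # tally each occurrence of each value
    return sorted(counts.items())  # list the values in ascending order

# Hypothetical scores, invented for illustration only.
scores = [3, 1, 2, 3, 5, 3, 2, 4, 1, 3]
dist = frequency_distribution(scores)
for value, f in dist:
    print(value, f)

# The sum of the fs must equal the sample size: sum(f) = N
total_f = sum(f for _, f in dist)
print("sum of f =", total_f, "N =", len(scores))
```

The final check is the Σf = N rule: if the two numbers differ, a value was missed or counted twice.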

Elements of a Typical Frequency Distribution Data values Absolute frequencies (counts) Relative frequencies (percentages) Cumulative relative frequencies (the percentage for a given score value, combined with percentages for all preceding values)

Example... Let’s say we have 10 people of varying ages. Let’s construct the frequency distribution of the age GROUPS: 0-25 yrs, 26-45 yrs, >45 yrs
Age group         0-25          26-45               >45
Frequency         4             4                   2
Relative Freq.    4/10 = 40%    4/10 = 40%          2/10 = 20%
Cumulative Freq.  40%           40% + 40% = 80%     80% + 20% = 100%

Cumulative Percentage Summary of Our Example
Data Value   Frequency (f)   Percentage (%)   Cumulative Percentage
0-25         4               40.0             40.0
26-45        4               40.0             80.0
>45          2               20.0             100.0
TOTAL        10
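
As a rough check on the age-group example, the percentages and cumulative percentages can be computed from the counts. The sketch below assumes frequencies of 4, 4, and 2 for the three groups (the values implied by the 40%, 80%, 100% cumulative figures); the function name is made up for illustration.

```python
def summarize(groups):
    """groups: ordered list of (label, f).
    Returns rows of (label, f, percentage, cumulative percentage)."""
    n = sum(f for _, f in groups)  # sample size N
    rows = []
    running_f = 0
    for label, f in groups:
        running_f += f  # this group's count plus all preceding groups
        rows.append((label, f, 100 * f / n, 100 * running_f / n))
    return rows

# Counts assumed from the example: 10 people in three age groups.
age_groups = [("0-25", 4), ("26-45", 4), (">45", 2)]
rows = summarize(age_groups)
for label, f, pct, cum_pct in rows:
    print(label, f, pct, cum_pct)
```

The last cumulative percentage should always come out to 100, which is a quick way to spot a counting error.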

Frequency Distributions and Measurement Levels Remember “measurement levels”? Nominal, ordinal, interval, ratio... Frequency distributions can be constructed for variables measured at any level of measurement BUT…for categorical (nominal-level) variables, cumulative frequencies do not make sense Also...

Frequency Distributions for Variables with Many Values When a variable has many possible values, a regular frequency distribution may be unwieldy For example, weight values (here, in pounds), one row per value with its frequency f: 98 (f = 1), 99, 100, 101, 102 (f = 2), 103, 104, 105, 106, etc. to 285 lb …

Which is Why We Used “Age Group” instead of “Age” This is sometimes called a “grouped frequency distribution” In a grouped frequency distribution contiguous values are grouped into sets (class intervals) Typically, we use groupings that are psychologically appealing (e.g., 10-25 years etc, not 7-13 years, etc)

This grouping communicates information more conveniently than individual weights
Weight Interval   f
75 - 100          6
101 - 125         15
126 - 150         33
151 - 175         26
176 - 200         24
201 - 225         14
226 - 250         9
251 - 275
276 - 300         2
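
Grouping individual values into class intervals can be sketched as follows. The interval boundaries are the ones from the weight example; the individual weights and the function name are hypothetical, since the full dataset is not shown here.

```python
# Class intervals from the weight example (both ends inclusive, in pounds).
intervals = [(75, 100), (101, 125), (126, 150), (151, 175), (176, 200),
             (201, 225), (226, 250), (251, 275), (276, 300)]

def grouped_frequencies(values, intervals):
    """Count how many values fall into each class interval."""
    counts = {interval: 0 for interval in intervals}
    for v in values:
        for low, high in intervals:
            if low <= v <= high:
                counts[(low, high)] += 1
                break  # intervals do not overlap, so stop at the first match
    return counts

# Hypothetical weights, invented for illustration only.
weights = [98, 99, 102, 102, 130, 148, 151, 199, 285]
counts = grouped_frequencies(weights, intervals)
for (low, high), f in counts.items():
    print(f"{low} - {high}: {f}")
```

Nine rows replace what would otherwise be a table with a row for every distinct weight.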

Reporting Frequency Information Can be reported narratively in text (e.g., “83% of study participants were male”) In a frequency distribution table (multiple variables often presented in a single table) In a graph: Different graphs used for different types of data

Bar Graphs Bar graphs: Used for nominal (and many ordinal) level variables Bar graphs have a horizontal dimension (X axis) that specifies categories (i.e., data values) The vertical dimension (Y axis) specifies either frequencies or percentages Bars for each category drawn to the height that indicates the frequency or %

Bar Graphs Example of a bar graph Note the bars do not touch each other

Pie Chart Pie Charts: Also used for nominal (and many ordinal) level variables Circle is divided into pie-shaped wedges corresponding to percentages for a given category or data value All pieces add up to 100% Place wedges in order, with biggest wedge starting at “12 o’clock”

Pie Chart Example of a pie chart, for same marital status data

Histograms Histograms: Used for interval- and ratio-level data Similar to a bar graph, with an X and Y axis—but adjacent values are on a continuum so bars touch one another Data values on X axis are arranged from lowest to highest Bars are drawn to height to show frequency or percentage (Y axis)

Histograms Example of a histogram: Heart rate data (Y axis: f; X axis: heart rate in bpm)

Frequency Polygons Frequency polygons: Also used for interval- and ratio-level data Similar to histograms, but instead of bars, a dot is used above score values to designate frequency/percentage Better than histograms for showing shape of distribution of scores, and is usually preferred if variable is continuous

Example of a frequency polygon (created in SPSS) Note that the line is brought down to zero for the score below lowest data point (54) and above highest data point (75)

Shapes of Distributions Distributions of data values can be described in terms of: Modality Symmetry Kurtosis

Modality Modality concerns how many peaks (values with high frequencies) there are Unimodal = 1 peak Bimodal = 2 peaks Multimodal = multiple peaks

Unimodal: Bimodal:

How is this useful?

Example: Tuberculosis What is it? We apply the tuberculin skin test (also called the PPD, or purified protein derivative, test) A positive response is an “induration”: a hard, raised area with clearly defined margins at and around the injection site

What type of curve is this?

Distribution of systolic blood pressure for men (unimodal distribution)

Symmetry Symmetric Distribution: the two halves of the distribution, folded over in the middle, are identical

Symmetry Asymmetric (Skewed) Distribution: Peaks are “off center” and there is a tail trailing off for data values with low frequency Positive skew: Longer tail trails off to right (fewer people with high values, like for income) Negative skew: Longer tail trails off to left (fewer people with low values, like age at death)

Direction of Skew Examples of distributions with different skews:

Skewness Index Indexes have been developed to quantify degree of skewness One skewness index (e.g., in SPSS) has: Negative values, for a negative skew 0, for no skew Positive values, for a positive skew If the absolute value of the skewness index is less than twice the value of its standard error (to be explained later), distribution can be treated as not skewed

Skewness Index Examples Positive skew: Standard error = 0.33 Negative skew: Skewness index = -0.72, Standard error = 0.34
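
The "twice the standard error" rule of thumb can be tried out in Python. The sketch below assumes the adjusted Fisher-Pearson skewness coefficient and its usual standard error formula, which is the kind of index statistical packages commonly report. Treat the exact formulas as an assumption, not as a statement of what SPSS computes; the function names are made up.

```python
import math

def skewness_index(values):
    """Adjusted Fisher-Pearson skewness coefficient (assumed formula)."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)

def skewness_standard_error(n):
    """Standard error of the skewness index; depends only on sample size."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def can_treat_as_not_skewed(index, standard_error):
    """Rule of thumb: size of the index is under twice its standard error."""
    return abs(index) < 2 * standard_error

# The negative-skew example: index = -0.72, standard error = 0.34.
print(can_treat_as_not_skewed(-0.72, 0.34))
```

For the negative-skew example, 0.72 is larger than 2 × 0.34 = 0.68, so that distribution would be treated as skewed.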

Kurtosis Kurtosis: Degree of pointedness or flatness of the distribution’s peak Leptokurtic: Very thin, sharp peak Platykurtic: Flat peak Mesokurtic: Neither pointy nor flat Like skewness, there is an index of kurtosis Positive values: Greater peakedness Negative values: Greater flatness

Kurtosis Examples Leptokurtic (+ index) Platykurtic (– index)
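
To see the sign of a kurtosis index in action, here is a minimal Python sketch. It assumes the simple moment-based "excess kurtosis" (fourth central moment divided by the squared variance, minus 3). Statistical packages typically add a small-sample correction, so their values will differ somewhat. The data and function name are invented for illustration.

```python
def excess_kurtosis(values):
    """Simple moment-based excess kurtosis: m4 / m2**2 - 3 (assumed formula)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical data: evenly spread values give a flat (platykurtic) shape.
flat = [1, 2, 3, 4, 5]
# Hypothetical data: most values piled at the center, with two far-out
# values, give a sharp (leptokurtic) peak.
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -5, 5]

print(excess_kurtosis(flat))    # negative index: flatter
print(excess_kurtosis(peaked))  # positive index: more peaked
```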

Normal Distribution What is this curve called?

Normal Distribution A normal distribution (aka normal curve, bell curve, Gaussian distribution, etc) is: Unimodal Symmetric Neither peaked nor flat Plays an important role in inferential statistics We will re-visit the Normal Distribution in more depth in the future

Some human characteristics are normally distributed (approximately), like height: for example, 1 short person, 3 medium persons, 1 tall person

Uses of Frequency Distributions in Data Analysis First step in understanding your data! Begin by looking at the frequency distributions for all or most variables, to “get a feel” for the data Through inspection of frequency distributions, you can begin to assess how “clean” the data are (will discuss next time)

Central Tendency “Central Tendency” is a characteristic of a distribution Describes how data are clustered around some value In other words, it’s a way of summarizing your data by identifying one value that best represents the set There are several indices of central tendency, but 3 are the most important: Mode Median Mean Next class, we’ll get into these in more depth
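
As a preview, the three indices can be computed with Python's standard statistics module. The scores below are made up for illustration only.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 4, 8]

mode = statistics.mode(scores)      # most frequently occurring value
median = statistics.median(scores)  # middle value when scores are sorted
mean = statistics.mean(scores)      # sum of scores divided by N

print("mode:", mode, "median:", median, "mean:", mean)
```

Here the single high score (8) pulls the mean above the mode and the median, which is what happens in a positively skewed distribution.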

Homework! P.17: A1-A4 P.36: A1-A5