Program This course will be dived into 3 parts: Part 1 Descriptive statistics and introduction to continuous outcome variables Part 2 Continuous outcome.

Slides:

Advertisements

Similar presentations

Histograms Bins are the bars Counts are the heights Relative Frequency Histograms have percents on vertical axis.

Advertisements

Lecture 2 Summarizing the Sample. WARNING: Today’s lecture may bore some of you… It’s (sort of) not my fault…I’m required to teach you about what we’re.

CHAPTER 4 Displaying and Summarizing Quantitative Data Slice up the entire span of values in piles called bins (or classes) Then count the number of values.

1 Chapter 1: Sampling and Descriptive Statistics.

Chapter 1 Introduction Individual: objects described by a set of data (people, animals, or things) Variable: Characteristic of an individual. It can take.

Examining Univariate Distributions Chapter 2 SHARON LAWNER WEINBERG SARAH KNAPP ABRAMOWITZ StatisticsSPSS An Integrative Approach SECOND EDITION Using.

How to Analyze Data? Aravinda Guntupalli. SPSS windows process Data window Variable view window Output window Chart editor window.

© 2005 The McGraw-Hill Companies, Inc., All Rights Reserved. Chapter 12 Describing Data.

Exploratory Data Analysis. Height and Weight 1.Data checking, identifying problems and characteristics Data exploration and Statistical analysis.

Descriptive Statistics  Summarizing, Simplifying  Useful for comprehending data, and thus making meaningful interpretations, particularly in medium to.

Lecture 8 Distributions Percentiles and Boxplots Practical Psychology 1.

Let’s Review for… AP Statistics!!! Chapter 1 Review Frank Cerros Xinlei Du Claire Dubois Ryan Hoshi.

STAT 250 Dr. Kari Lock Morgan

ITEC6310 Research Methods in Information Technology Instructor: Prof. Z. Yang Course Website: c6310.htm Office:

Descriptive Statistics F. Farrokhyar, MPhil, PhD, PDoc Department of Surgery Department of Clinical Epidemiology and Biostatistics March 18, 2009.

1 Laugh, and the world laughs with you. Weep and you weep alone.~Shakespeare~

Exploratory Data Analysis

VCE Further Maths Chapter Two-Bivariate Data \\Servernas\Year 12\Staff Year 12\LI Further Maths.

Statistics Chapter 1: Exploring Data. 1.1 Displaying Distributions with Graphs Individuals Objects that are described by a set of data Variables Any characteristic.

Descriptive statistics Petter Mostad Goal: Reduce data amount, keep ”information” Two uses: Data exploration: What you do for yourself when.

Displaying Distributions with Graphs. the science of collecting, analyzing, and drawing conclusions from data.

SPSS Workshop Day 2 – Data Analysis. Outline Descriptive Statistics Types of data Graphical Summaries –For Categorical Variables –For Quantitative Variables.

AP Statistics Semester One Review Part 1 Chapters 1-3 Semester One Review Part 1 Chapters 1-3.

Mr. Magdi Morsi Statistician Department of Research and Studies, MOH

1 Never let time idle away aimlessly.. 2 Chapters 1, 2: Turning Data into Information Types of data Displaying distributions Describing distributions.

1 Take a challenge with time; never let time idles away aimlessly.

ALL ABOUT THAT DATA UNIT 6 DATA. LAST PAGE OF BOOK: MEAN MEDIAN MODE RANGE FOLDABLE Mean.

Chapter 5: Organizing and Displaying Data. Learning Objectives Demonstrate techniques for showing data in graphical presentation formats Choose the best.

Graphs with SPSS Aravinda Guntupalli. Bar charts  Bar Charts are used for graphical representation of Nominal and Ordinal data  Height of the bar is.

AP Statistics Review Day 1 Chapters 1-4. AP Exam Exploring Data accounts for 20%-30% of the material covered on the AP Exam. “Exploratory analysis of.

1 By maintaining a good heart at every moment, every day is a good day. If we always have good thoughts, then any time, any thing or any location is auspicious.

IENG-385 Statistical Methods for Engineers SPSS (Statistical package for social science) LAB # 1 (An Introduction to SPSS)

Unit 1 - Graphs and Distributions. Statistics 4 the science of collecting, analyzing, and drawing conclusions from data.

ALL ABOUT THAT DATA UNIT 6 DATA. LAST PAGE OF BOOK: MEAN MEDIAN MODE RANGE FOLDABLE Mean.

DISPLAYING DATA DIAGRAMMATICALLY. The Aim By the end of this lecture, the students will be aware of graphical representation of data and by using SPSS.

All About that Data Unit 6 Data.

Prof. Eric A. Suess Chapter 3

Thursday, May 12, 2016 Report at 11:30 to Prairieview

Elementary Statistics

Elementary Statistics

Describing Data: Two Variables

Statistics 200 Lecture #4 Thursday, September 1, 2016

Introduction to SPSS July 28, :00-4:00 pm 112A Stright Hall

EXPLORATORY DATA ANALYSIS and DESCRIPTIVE STATISTICS

Chapter 1: Exploring Data

Looking at data Visualization tools.

All About that Data Unit 6 Data.

Statistical Reasoning

Description of Data (Summary and Variability measures)

Laugh, and the world laughs with you. Weep and you weep alone

Distributions and Graphical Representations

Unit 1 - Graphs and Distributions

Basics of Statistics.

Displaying Distributions with Graphs

Topic 5: Exploring Quantitative data

Treat everyone with sincerity,

Displaying and Summarizing Quantitative Data

Part 2 - Compare average in different groups

Honors Statistics Review Chapters 4 - 5

Descriptive Statistics

Chapter 1: Exploring Data

Ten things about Descriptive Statistics

Chapter 1: Exploring Data

Learning outcomes By the end of this session you should know about:

Association between 2 variables

Exercise 1: Entering data into SPSS

Lesson Plan Day 1 Lesson Plan Day 2 Lesson Plan Day 3

Exercise 1 (a): producing individual tables, using the cross-tabs menu

Chapter 1: Exploring Data

Presentation transcript:

Introduction to SPSS (biostatistics) Oslo Center for Biostatistics and Epidemiology (OCBE)

Program This course will be dived into 3 parts: Part 1 Descriptive statistics and introduction to continuous outcome variables Part 2 Continuous outcome variables (t-test, non-parametric tests, linear regression). Part 3 Binary outcome variables (RR, OR, χ2 test, logistic regression)

The course dataset We will use data from the Caerphilly study. Prospective heart disease study that was conducted 1979-1983 in Wales. It recorded many different lifestyle markers and outcomes: BMI, blood pressure, cholesterol, smoking, diabetes and heart disease in 1786 men.

How to open an existing dataset Click «File->Open->Data», and select the dataset. Open the dataset «caerphilly_start.sav» (you should have received it). Download: https://wiki.uio.no/med/imb/ocbe/index.php/Introduction_to_SPSS

Types of data Continuous data Categorical data Data that can be quantified or measured on a scale that can take an «infinite» number of values. Examples: Height, BMI, Blood pressure, Age, etc. Data that cannot necessarily be quantified. Nominal (cannot be ordered). Examples: Gender, Nationality, etc. Ordinal (can be ”naturally” ordered). Examples: Grades, education, pain scale, age groups, etc. Binary data is also categorical (but with only two levels). Examples: healty/sick, gender, smoker/non-smoker, etc. Categorical data must be quantified to be used in a statistical analysis (SPSS can do it for us). NOTE! In SPSS the type of data is called «measure».

Descriptive Statistics In this section we will use SPSS to explore and describe data Stastistics: - Average Median - Standard deviation Visual plots: - Boxplots Histogram - Scatterplot

Descriptive Statistics summary and presentation of continuous variables Number Analyze > Descriptive Statistics > Descriptives Simple description (mean, sd, min, max) Analyze > Descriptive Statistics > Frequencies Supplementary description (addition: percentiles) Analyze > Descriptive Statistics > Explore Supplementary description (addition: stratification + plot) Grafic Graph> Legacy Dialogs > Boxplot Scatter/Dot Histogram

Descriptive Statistics Go to «Analyze > Descriptive statistics > Explore»

Descriptive Statistics Go to «Analyze > Descriptive statistics > Explore» Drag the variables you want to analyze from the list on the right to «Dependent List»

Descriptive Statistics Go to «Analyze > Descriptive statistics > Explore» Drag the variables you want to analyze from the list on the right to «Dependent List» Under «Statistics»

Descriptive Statistics Go to «Analyze > Descriptive statistics > Explore» Drag the variables you want to analyze from the list on the right to «Dependent List» Under «Statistics» In «Statistics» select only «Descriptives»

Descriptive Statistics Go to «Analyze > Descriptive statistics > Explore» Drag the variables you want to analyze from the list on the right to «Dependent List» Under «Plots»

Descriptive Statistics Go to «Analyze > Descriptive statistics > Explore» Drag the variables you want to analyze from the list on the right to «Dependent List» Under «Plots» flag: Boxplots : «Factor…» «Stem-and-leaf» «Histogram»

Descriptive Statistics Go to «Analyze > Descriptive statistics > Explore» Drag the variables you want to analyze from the list on the right to «Dependent List» We just want statistics so select «Statistics» instead of «Both»

Output in the following table with statistics: average, median and standard deviation

Split File: separate analyses Sometimes it is convenient to carry on analyses separately for different sub-groups. You can use: «Data > Split File»

Split File: separate analyses The standard setting is «Analyze all cases, do not create groups» Then, you can make a choice.

Split File: separate analyses With «Organize output by groups» you can select a categorical variable so that all analyzes are done separately .

Split File: separate analyses Under «Current status» you can see the current settings

Split File: separate analyses Under «Current status» you can see the current settings NB! For a t-test, you can not have Split File on the group variable.

Boxplot The Boxplot is the best way to represent the distribution of data graphically, by visualizing centering and variability. The box plot shows Median (50% of data on each side) Interquartile range IQR (25%, 75% on each side) Extreme values are represented as ”o” or ”*”

= IQR

How to make a Boxplot Go to «Graphs > Legacy dialogs > Boxplot»

How to make a Boxplot Go to «Graphs > Legacy dialogs > Boxplot» Select«Simple» and «Define»

How to make a Boxplot Go to «Graphs > Legacy dialogs > Boxplot» Move the variable of interest in «Variable»

How to make a Boxplot Go to «Graphs > Legacy dialogs > Boxplot» Move the variable of interest in «Variable» Can devide boxplot with a group variable in «Category Axis»

Exercise 1a We want to explore the relationship between triglycerides in blood («trig») and BMI («bmicat») in the Caerphilly study. Make a boxplot of the distribution of triglyceride conditioned to the four BMI categories.

Scatterplot To visualize the relationship between two continuous variables, we can use the scatterplot. Go to «Graphs > Legacy dialogs > Scatter/Dot»

Scatterplot To visualize the relationship between two continuous variables, we can use the scatterplot. Go to «Graphs > Legacy dialogs > Scatter/Dot» «Simple scatter» and «Define»

Choose the variables from the list on the left: Move Triglyserid to «Y Axis» as the vertical axis

Choose the variables from the list on the left: Move Triglyserid to «Y Axis» as the vertical axis Move HDL to «X axis» as the horizontal axis Click «OK»

Choose the variables from the list on the left: Move Triglyserid to «Y Axis» and HDL to «X axis» Move a categorical variable to «Set markers by». This gives different color for the categories.

NB – if the observation for the category is MISSING, the circle disappears (with the color "invisible")

Exercise 1b Create a scatter plot to investigate the relationship between BMI as continuous variable («bmi») and HDL cholesterol («hdlchol») Does there appear to be any connection? Create the same scatter plot, but with different colors for smokers and non-smokers.

Normality plots In many statistical analyses, it is convenient to assume normal distributed data. To investigate whether this assumption holds, use three plots: Histogram

Normality plots In many statistical analyses, it is convenient to assume normal distributed data. To investigate whether this assumption holds, use three plots: Histogram Boxplot

Normality plots In many statistical analyses, it is convenient to assume normal distributed data. To investigate whether this assumption holds, use three plots: Histogram Boxplot Quantile-Quantile (QQ) plot

Normality plots In many statistical analyses, it is convenient to assume normal distributed data. To investigate whether this assumption holds, use three plots: Histogram Boxplot Quantile-Quantile (QQ) plot The first two check if the variable is symmetrical and not skewed. The QQ plot compares the data to a normal distribution on a straight line (deviations indicate outliers and heavy tails).

All these plot are under Explore: Go to «Analyze > Descriptive statistics > Explore»

All these plot are under Explore: Go to «Analyze > Descriptive statistics > Explore» Click «Plots»

All these plot are under Explore: Go to «Analyze > Descriptive statistics > Explore» Click «Plots» Select: «Factor levels together»

All these plot are under Explore: Go to «Analyze > Descriptive statistics > Explore» Click «Plots» Select: «Factor levels together» Select «Histogram»

All these plot are under Explore: Go to «Analyze > Descriptive statistics > Explore» Click «Plots» Select: «Factor levels together» Select «Histogram» For QQ-plot: select «Normality plots with tests»

All these plot are under Explore: Go to «Analyze > Descriptive statistics > Explore» Click «Plots» Select: «Factor levels together», «Histogram» and «Normality plots with tests» Click «Continue» and «OK»

Exercise 1c Investigate if the variables HDL cholesterol («hdlchol») and triglyceride («trig») from the Caerphilly study can be assumed as normal with an histogram, boxplot and QQ plot.

Exercise 1c Investigate if the variables HDL cholesterol («hdlchol») and triglyceride («trig») from the Caerphilly study can be assumed as normal with an histogram, boxplot and QQ plot. Conclusion: We can assume that HDL cholesterol is normally distributed (at least approximately), but triglyceride does not seem to be normal.

Summary With «Graphs > Legacy dialogs» you can create many different plots. For example scatterplot and boxplot With «Analyze > Descriptive statistics > Explore» you can find summary statistics and create histograms, boxplot and normality plot