Problem Set I: Review Intro, Measures of Central Tendency & Variability, Z-scores and the Normal Distribution, Correlation, and Regression.


QUESTION 1: Short answer: What is a statistic? Give a definition and an example. Explain how the example illustrates the definition you have provided.

A statistic is a number that organizes, summarizes, and makes understandable a collection of data. An example of a statistic is the mean. The mean is a single number, calculated from a set of data, that gives an idea of the typical value in the collection without having to report every value individually.

QUESTION 2: A psychologist interested in the dating habits of undergraduates in the Psychology major samples 10 students and determines the number of dates they have had in the last six months. He knows that the mean number of dates is 7.8, and the sum of squares (SS) is [value missing]. Assume a normal distribution.

A. What percentage of all undergraduate students went on fewer than 4 dates in the last six months?
B. If the psychologist had 10 students total, approximately how many of these students went on between 8 and 13 dates in the last six months?

In order to make ANY conclusions about proportions, we need to use the z-table. To use z-scores, we need the mean (x̄) and standard deviation (s):
x̄ = 7.8
s = 4.98

Part A: Turn 4 into a z-score: (4 - 7.8)/4.98 = -.76. After shading in the distribution, it's clear that "fewer than 4 dates" refers to an AREA C (the tail beyond the z-score). The AREA C for z = .76 is .2236, or 22.36%.

Part B: After shading in the distribution, it's clear that "between 8 and 13" refers to a portion of the distribution which we can only find by combining areas from the table.
Turn 13 into a z-score: (13 - 7.8)/4.98 = 1.04
Turn 8 into a z-score: (8 - 7.8)/4.98 = .04
The AREA B (between the mean and z) for z = 1.04 is .3508
The AREA B for z = .04 is .0160
.3508 - .0160 = .3348, and so 10(.3348) = 3.35 students (approximately 3)
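The table lookups in parts A and B can be checked numerically. The sketch below is only an illustration, not part of the original slides: the helper name normal_cdf is made up for this example, and it computes areas from the error function instead of reading them from a printed z-table. The z-scores are rounded to two decimals to mimic the table lookup.

```python
import math


def normal_cdf(z: float) -> float:
    """Proportion of a standard normal distribution that falls below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean, sd, n = 7.8, 4.98, 10

# Part A: area below 4 dates (z rounded to two decimals, as in a z-table).
z_4 = round((4 - mean) / sd, 2)
p_below_4 = normal_cdf(z_4)

# Part B: area between 8 and 13 dates, then scaled to 10 students.
z_8 = round((8 - mean) / sd, 2)
z_13 = round((13 - mean) / sd, 2)
p_between = normal_cdf(z_13) - normal_cdf(z_8)
expected_students = n * p_between

print(f"z for 4 dates: {z_4}, area below: {p_below_4:.4f}")
print(f"z for 8 and 13 dates: {z_8}, {z_13}, area between: {p_between:.4f}")
print(f"expected students out of {n}: {expected_students:.2f}")
```

Small differences in the fourth decimal place relative to a printed table are expected, since table entries are themselves rounded.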

QUESTION 2 (continued):

C. What is the number of dates one must have gone on in the last six months in order to be in the top 2.5%?

As before, x̄ = 7.8 and s = 4.98. After shading in the distribution, it's clear that the top 2.5% is in the right tail of the distribution, and extends from some z-score and beyond (this means it's an AREA C). We need the z-score which has an AREA C closest to .0250 without going over. We find an AREA C which is EXACTLY .0250 for a z-score of 1.96. We need to turn this into a raw score, in other words, a number of dates.

Raw Score = x̄ + z(s)
Raw Score = 7.8 + 1.96(4.98)
Raw Score = 7.8 + 9.76
Raw Score = 17.56 dates
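Going from a proportion back to a raw score can also be sketched in code. This is a minimal illustration, assuming Python 3.8 or later for statistics.NormalDist; it uses the inverse normal CDF in place of the reverse table lookup, so the cutoff z is not rounded to two decimals.

```python
from statistics import NormalDist

mean, sd = 7.8, 4.98

# The top 2.5% begins where 97.5% of the distribution lies below.
z_cut = NormalDist().inv_cdf(0.975)

# Convert the z-score back to a raw score: raw = mean + z * sd.
raw_cut = mean + z_cut * sd

print(f"cutoff z-score: {z_cut:.2f}")
print(f"cutoff number of dates: {raw_cut:.2f}")
```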

QUESTION 3: A researcher in a learning laboratory believes that the amount of water a rat drinks before entering a maze will affect how well the rat performs in the maze. He records the amount of water consumed by each of his 4 rats (in ounces) and then puts them each into a maze and records how long it takes each rat to complete the maze (in seconds). He then calculates the correlation coefficient between these two variables, which is .48. His data can be found below:

Water consumed (oz)    Maze Completion Time (sec)
[data values missing]

A. As practice, find the correlation coefficient of this data by hand. Confirm that it does indeed come out to be .48.
B. Write out the equation of the regression line for predicting maze performance from amount of water consumed.

Raw Score Method
Columns: x (water), y (time), x^2, y^2, xy, with Sums in the last row [table values missing]

QUESTION 3 (continued), part A:

Z-score Method
MEAN: water (x) = 3.50, maze time (y) = 10.50
S: water (x) = 2.65, maze time (y) = 3.70
Columns: x (water), y (maze time), Zx, Zy, ZxZy [table values missing]
Sum of ZxZy = 1.43

With N = 4 rats, r = (sum of ZxZy)/(N - 1) = 1.43/3 = .48, which confirms the stated correlation.

QUESTION 3 (continued), part B:

MEAN: water (x) = 3.50, maze time (y) = 10.50
S: water (x) = 2.65, maze time (y) = 3.70

b = r(s_y/s_x)
b = .48(3.70/2.65)
b = .67

a = ȳ - b(x̄)
a = 10.50 - .67(3.50)
a = 8.16

y = .67x + 8.16

QUESTION 3 (continued):

C. Make a prediction of how long it would take a rat that drank 10 oz of water to complete this maze.
D. Write out the equation of the regression line to predict amount of water consumed from time spent to complete the maze.

Part C:
y = .67x + 8.16
y = .67(10) + 8.16
y = 14.86 seconds

Part D: Water is now the predicted variable (y) and maze time is the predictor (x), so the means and standard deviations swap roles.
b = r(s_y/s_x)
b = .48(2.65/3.70)
b = .34

a = ȳ - b(x̄)
a = 3.50 - .34(10.50)
a = -.07

y = .34x - .07
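Both regression lines come from the same two formulas, b = r(s_y/s_x) and a = ȳ - b(x̄), with the roles of the variables swapped. The sketch below illustrates this; the function name regression_line is hypothetical, and the means (3.50 and 10.50) and standard deviations (2.65 and 3.70) are the ones used in the slide arithmetic. It keeps full precision instead of rounding b to two decimals, so the water intercept comes out near -.11 instead of the hand-calculated -.07.

```python
def regression_line(r, mean_x, sd_x, mean_y, sd_y):
    """Return (slope, intercept) of the line predicting y from x."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


r = 0.48
water_mean, water_sd = 3.50, 2.65
time_mean, time_sd = 10.50, 3.70

# Part B: predict maze time (y) from water consumed (x).
b_time, a_time = regression_line(r, water_mean, water_sd, time_mean, time_sd)

# Part C: predicted maze time for a rat that drank 10 oz.
predicted_time = b_time * 10 + a_time

# Part D: predict water consumed from maze time (roles swapped).
b_water, a_water = regression_line(r, time_mean, time_sd, water_mean, water_sd)

print(f"time  = {b_time:.2f} * water + {a_time:.2f}")
print(f"predicted time at 10 oz: {predicted_time:.2f} seconds")
print(f"water = {b_water:.2f} * time + {a_water:.2f}")
```

Note that the second line is not the algebraic inverse of the first; each line minimizes prediction error for its own predicted variable.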

QUESTION 3 (continued):

E. What kind of relationship exists between water consumption and maze completion speed? Is it better for the rats to have consumed a lot of water prior to entering the maze, or does it hinder their performance?
F. Calculate the coefficient of determination. What does this value tell you about how well you are or are not able to make an accurate prediction using this regression line?

Part E: Since there is a moderate positive correlation between water consumption and maze completion time, the more water a rat drinks, the longer it tends to take to complete the maze. It seems better for the rats not to consume a lot of water, so that their completion time is quicker.

Part F: r-squared is (.48)(.48) = .2304, which means only about 23% of the variation in completion time is accounted for by water consumed. This is a small amount of variation, telling us that our predictions are probably not very accurate.

QUESTION 4: Below is a sample of scores on a new version of an IQ test. The range of possible points on this test is [value missing].

Name     Score
Maria    78
John     90
David    50
Julia    65
Marta    100

A. Calculate the mean, standard deviation, and variance of these scores. (Do this by hand, show your work.)
B. What is the z-score obtained by Julia, and what does this z-score tell us about her grade?

Raw score method:

Name     Score (x)    x^2
Maria    78           6084
John     90           8100
David    50           2500
Julia    65           4225
Marta    100          10000
Sums     383          30909

Mean = Σx/N = 383/5 = 76.6
SS = Σx^2 - (Σx)^2/N = 30909 - 29337.8 = 1571.2
Variance = SS/(N - 1) = 1571.2/4 = 392.8
Standard deviation (s) = 19.82

QUESTION 4 (continued):

Deviation Method:

Mean = Σx/N = 383/5 = 76.6

Name     Score (x)    x̄       x - x̄     (x - x̄)^2
Maria    78           76.6    1.4       1.96
John     90           76.6    13.4      179.56
David    50           76.6    -26.6     707.56
Julia    65           76.6    -11.6     134.56
Marta    100          76.6    23.4      547.56
Sum                                     1571.20

The sum of (x - x̄)^2 is also known as SS.

Part B:
z = (x - x̄)/s
z = (65 - 76.6)/19.82
z = -.59

Julia's z-score is negative, indicating she performed worse than average: specifically, .59 standard deviations below the mean.
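The deviation method can be sketched in a few lines of code. This is an illustration only; the variable names are made up for the example, and it divides SS by N - 1 because that is the choice that reproduces the s = 19.82 used in the z-score calculation.

```python
import math

scores = {"Maria": 78, "John": 90, "David": 50, "Julia": 65, "Marta": 100}
n = len(scores)

# Mean: sum of scores divided by the number of scores.
mean = sum(scores.values()) / n

# SS: sum of squared deviations from the mean.
ss = sum((x - mean) ** 2 for x in scores.values())

# Sample variance and standard deviation (dividing by N - 1).
variance = ss / (n - 1)
sd = math.sqrt(variance)

# Julia's position relative to the mean, in standard deviation units.
z_julia = (scores["Julia"] - mean) / sd

print(f"mean = {mean:.1f}, SS = {ss:.1f}")
print(f"variance = {variance:.1f}, sd = {sd:.2f}")
print(f"Julia's z-score = {z_julia:.2f}")
```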

QUESTION 4 (continued): Suppose you want to know if this IQ test is in any way related to the old IQ test, so you administer a version of the old test to each of these individuals. The following are their scores on the old IQ test:

Name     Score
Maria    110
John     130
David    70
Julia    90
Marta    160

C. Is there a relationship between the scores on the old test and the scores on the new test? In other words, does the new test seem to be measuring IQ in the same way? Describe the relationship.

Raw Score Method

Name     x (NEW)    y (OLD)    x^2      y^2      xy
Maria    78         110        6084     12100    8580
John     90         130        8100     16900    11700
David    50         70         2500     4900     3500
Julia    65         90         4225     8100     5850
Marta    100        160        10000    25600    16000
Sums     383        560        30909    67600    45630
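The correlation for part C can be computed from the column sums. The sketch below uses one common computational form of Pearson's r, r = SP / sqrt(SS_x * SS_y); the exact formula shown on the original slide did not survive extraction, so treat this as an assumption about the method, not a reproduction of it.

```python
import math

new = [78, 90, 50, 65, 100]    # x: scores on the new test
old = [110, 130, 70, 90, 160]  # y: scores on the old test
n = len(new)

# Column sums used by the raw score method.
sum_x, sum_y = sum(new), sum(old)
sum_x2 = sum(x * x for x in new)
sum_y2 = sum(y * y for y in old)
sum_xy = sum(x * y for x, y in zip(new, old))

# Sum of products and sums of squares, computed from the raw sums.
sp = sum_xy - sum_x * sum_y / n
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n

r = sp / math.sqrt(ss_x * ss_y)

print(f"SP = {sp:.1f}, SS_x = {ss_x:.1f}, SS_y = {ss_y:.1f}")
print(f"r = {r:.2f}")
```

A value this close to +1 indicates a very strong positive linear relationship between the two sets of scores.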

QUESTION 4 (continued), part C:

Yes, there is a strong positive correlation between the two versions of the test. The higher the score on the old version, the higher the score on the new version; thus it seems that the two tests are measuring IQ in the same way.

QUESTION 5: Over the years, my students have informed me that they feel as though I seem to grade paper assignments according to their length. To assess this relationship, I decide to perform a correlational analysis on the number of pages of 12 papers and the grades I assigned to them. I find that the correlation coefficient (r) is -.90. The following is also known:

Σx and Σx^2 for Page length and for Grade [summary values missing]

Suppose a student had access to this information and wanted to predict their grade for an upcoming paper. Their paper is 3 pages long.

A. Write out the equation of the regression line to predict grade from paper length.
B. Predict the grade for this student whose paper is 3 pages long.
C. If someone received a grade of 100 on their paper, predict the number of pages of their paper (this will involve multiple steps; i.e., find the equation of the regression line first, then plug in to make a prediction).

The Mean

                   Mean
Page length (x)    7.42
Grade (y)          80.92

QUESTION 5 (continued):

Standard Deviation (s)

                   Mean     s
Page length (x)    7.42     3.63
Grade (y)          80.92    10.21

QUESTION 5 (continued), parts A and B:

                   Mean     s
Page length (x)    7.42     3.63
Grade (y)          80.92    10.21

b = r(s_y/s_x)
b = -.90(10.21/3.63)
b = -2.53

a = ȳ - b(x̄)
a = 80.92 - (-2.53(7.42))
a = 99.69

y = -2.53x + 99.69
y = -2.53(3) + 99.69 = 92.1

QUESTION 5 (continued), part C:

C. If someone received a grade of 100 on their paper, predict the number of pages of their paper (this will involve multiple steps; i.e., find the equation of the regression line first, then plug in to make a prediction).

Page length is now the predicted variable (y) and grade is the predictor (x):

                   Mean     s
Grade (x)          80.92    10.21
Page length (y)    7.42     3.63

b = r(s_y/s_x)
b = -.90(3.63/10.21)
b = -.32

a = ȳ - b(x̄)
a = 7.42 - (-.32(80.92))
a = 33.31

y = -.32x + 33.31
y = -.32(100) + 33.31 = 1.31 pages
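The hand calculations for Question 5 round the slope and intercept to two decimals before predicting. The sketch below follows that same rounding so the printed results can be compared with the worked answers; the function name rounded_line is hypothetical, and r = -.90 together with the means and standard deviations are the values used in the slide arithmetic.

```python
def rounded_line(r, mean_x, sd_x, mean_y, sd_y):
    """Slope and intercept for predicting y from x, rounded to 2 decimals."""
    slope = round(r * sd_y / sd_x, 2)
    intercept = round(mean_y - slope * mean_x, 2)
    return slope, intercept


r = -0.90
pages_mean, pages_sd = 7.42, 3.63
grade_mean, grade_sd = 80.92, 10.21

# Parts A and B: predict grade from page length, then plug in 3 pages.
b_grade, a_grade = rounded_line(r, pages_mean, pages_sd, grade_mean, grade_sd)
grade_for_3_pages = round(b_grade * 3 + a_grade, 2)

# Part C: a separate line predicting page length from grade, then plug in 100.
b_pages, a_pages = rounded_line(r, grade_mean, grade_sd, pages_mean, pages_sd)
pages_for_100 = round(b_pages * 100 + a_pages, 2)

print(f"grade = {b_grade} * pages + {a_grade}")
print(f"predicted grade for 3 pages: {grade_for_3_pages}")
print(f"pages = {b_pages} * grade + {a_pages}")
print(f"predicted pages for a grade of 100: {pages_for_100}")
```

Rounding intermediate values is a convenience for hand work; carrying full precision would shift the answers slightly in the second decimal place.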

QUESTION 6: You are collecting IQ data from a sample of 20 of your classmates. You record the following IQ scores:

IQ = {120, 110, 120, 100, 120, 130, 100, 110, 130, 120, 80, 140, 110, 90, 70, 120, 120, 110, 130, 140}

A. Describe the shape of the distribution of IQ scores.
B. Find the Mean, Median, and Mode. Use these values to support your judgment of the distribution's shape in part A.

The distribution is negatively skewed and unimodal. The mean of this distribution is 113.5; the median and mode are both 120. The fact that the mean is smaller than the median supports the conclusion that the distribution is negatively skewed.
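The three measures of central tendency can be verified with Python's standard statistics module. This is a small sketch for checking the answer, with a frequency count added (via collections.Counter) to make the shape of the distribution visible.

```python
from collections import Counter
from statistics import mean, median, mode

iq = [120, 110, 120, 100, 120, 130, 100, 110, 130, 120,
      80, 140, 110, 90, 70, 120, 120, 110, 130, 140]

iq_mean = mean(iq)
iq_median = median(iq)
iq_mode = mode(iq)

# Frequency of each score, in ascending order, as a rough text histogram.
counts = Counter(iq)
for score in sorted(counts):
    print(f"{score:>3}: {'*' * counts[score]}")

print(f"mean = {iq_mean}, median = {iq_median}, mode = {iq_mode}")

# A mean below the median is consistent with a longer tail on the low side.
print("mean < median:", iq_mean < iq_median)
```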