Aim: 1. To debrief correlation studies- What is a correlation study

Presentation on theme: "Aim: 1. To debrief correlation studies- What is a correlation study"— Presentation transcript:

1 Aim: 1. To debrief correlation studies- What is a correlation study
Aim: 1. To debrief correlation studies: What is a correlation study? How is it used in research? 2. Introduction to how stats are used in reporting data in research.
DO NOW: Take out your paper and staple the rubric to the FRONT of your paper.
Homework: Complete Stats Vocabulary Sheet. Define ALL terms AND look up the terms you are unsure about. Bring in a calculator!!

2 The Dangers of Bread Evaluate the reasoning. Is it scientifically sound or flawed? Why? What lessons can we take from this?

3

4 About 1 in 8 U.S. women (about 12%) will develop invasive breast cancer over the course of her lifetime. In 2017, an estimated 252,710 new cases of invasive breast cancer are expected to be diagnosed in women in the U.S., along with 63,410 new cases of non-invasive (in situ) breast cancer. About 2,470 new cases of invasive breast cancer are expected to be diagnosed in men. A man’s lifetime risk of breast cancer is about 1 in 1,000. Breast cancer incidence rates in the U.S. began decreasing in the year 2000, after increasing for the previous two decades. They dropped by 7% from 2002 to 2003 alone. Facts from Breastcancer.org

5 Susan G. Komen Foundation
Rates of breast cancer are low in women under 40. Fewer than 5 percent of women diagnosed with breast cancer in the U.S. are younger than 40 [4]. Rates begin to increase after age 40 and are highest in women over age 70 (see Figure 2.1 below). The median age of diagnosis of breast cancer for women in the U.S. is 62 [6]. However, the median age of diagnosis varies by race and ethnicity. For example, African-American women tend to be diagnosed at a younger age than white women [6]. The median age at diagnosis for African-American women is 59, compared to 63 for white women [6]. Susan G. Komen Foundation

6 Age is also a risk factor for breast cancer in men.
The older a man is, the more likely he is to get breast cancer. However, breast cancer is much less common in men than in women (see Figure 2.1 below). The median age of diagnosis of breast cancer for men in the U.S. is 68 [6]. However, the median age of diagnosis varies by race and ethnicity. For example, African-American men tend to be diagnosed at a younger age than white men [6]. The median age at diagnosis for African-American men is 65, compared to 68 for white men [6].

7 Statistics: concerned with the collection, analysis, interpretation, and presentation of data. We must use a common language so we all know what we are talking about.

8 What do we already know about statistics?
Define ALL of the terms you know. Put a question mark next to the terms you are unsure about. What questions do you have? What do you wonder about?

9 Aim: How stats are used in reporting data in research.
DO NOW: Go over TEST and put in test folder. Take out HW (vocabulary sheet). What is the difference between the 4 scales of measurement? Which one is truly quantifiable? Qualitative?
Homework: Quiz on Stats (60 Pt Quiz) Wednesday

10 Measurement Scales DEFINE/ EXAMPLE
Nominal Ordinal Interval Ratio

11 Nominal Ordinal Interval Ratio
Nominal (categorical): Non-numerical; cannot compute a mean. Show with a bar graph. These numbers are purely for categorizing data into groups. They have no quantitative properties. Examples: channels on TV; a survey where 1 = Democrat, 2 = Republican, 3 = Independent; favorite ice cream.
Ordinal (ordered): These numbers contain some quantitative information, namely ranking (better, faster, smarter). However, the distance between scale points is not equal, so you can't compute a mean. Examples: NCAA seed rankings, positions in a race.
Interval: There are equal intervals between points, so these numbers contain quantitative information. The spaces between the numbers mean something, and the numbers can be added, subtracted, and averaged. Examples: temperature (Fahrenheit or Celsius), intelligence test score.
Ratio: These numbers contain the most quantitative information. They can also be added, subtracted, multiplied, and divided, and they have a true zero point. Examples: money, speed, height, GPA.

12 Continuous/ Discontinuous
Continuous = the variable can take any value within a range (no limit on the number of possible values)
Discontinuous = there are a finite number of possible values

13 Check for Understanding
Complete 1 – 6 p. 2 Complete 1 – 15 p.

14 What is the difference between bar graphs and histograms?
Bar graph: used to display "categorical data", that is, data that fits into categories (nominal, ordinal).
Histogram: presents "continuous data", that is, data that represents a measured quantity where, at least in theory, the numbers can take on any value in a certain range (interval, ratio).
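The difference can be sketched in a few lines of Python: categorical data gets one bar per category, while continuous data first has to be grouped into equal-width bins. The flavors, heights, and bin width below are made-up illustration values, not data from the slides, and `histogram_counts` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical categorical (nominal) data: favorite ice cream flavors.
flavors = ["vanilla", "chocolate", "vanilla", "mint", "chocolate", "vanilla"]

# Bar graph: one bar per category; the bar height is the category count.
bar_heights = Counter(flavors)

# Hypothetical continuous (ratio) data: heights in centimeters.
heights_cm = [151.2, 158.7, 160.1, 163.4, 167.9, 168.0, 171.5, 179.3]


def histogram_counts(values, start, width):
    """Count values per bin, where each bin covers [edge, edge + width)."""
    counts = Counter()
    for v in values:
        bin_index = int((v - start) // width)
        counts[start + bin_index * width] += 1
    return dict(sorted(counts.items()))


# Histogram: bars are adjacent numeric ranges, keyed here by their lower edge.
hist = histogram_counts(heights_cm, start=150, width=10)

print(dict(bar_heights))
print(hist)
```

The bar graph's categories have no inherent order or spacing, whereas the histogram's bins sit on a number line, which is why histogram bars touch.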

15

16

17 Aim: How stats are used in reporting data in research.
DO NOW: Homework: Complete p. 10 – 11 Stats Practice in Stats Packet. Quiz on Stats (60 Pt Quiz) Thursday – 15 Multiple Choice and computations.

18 Descriptive Statistics
PAGE 4
Descriptive Statistics: Just describe sets of data. Summarize in a meaningful way. Notice patterns. Can’t really draw conclusions.
Inferential Statistics: Used to reach conclusions beyond describing the data. Estimate the likelihood you would get the same results (probability). Estimate whether you can generalize the results to a larger population.

19 Check for Understanding
Identify as descriptive or inferential TOP OF PAGE 4

20 Measure of Central Tendency
Mean: the “average” of a distribution. Calculate: sum of scores / number of entries.
Mode: the most common score in a distribution.
Median: the “middle” score in a distribution. Calculate: list all scores lowest to highest and take the middle number.
Which of the above would be most useful in describing a skewed curve?

21 Central Tendency Mean, Median and Mode. Watch out for extreme scores or outliers. Let’s look at the salaries of the employees at Dunder Mifflin Paper in Scranton:
$25,000 - Pam
$25,000 - Kevin
$25,000 - Angela
$100,000 - Andy
$100,000 - Dwight
$200,000 - Jim
$300,000 - Michael
The mean salary looks good at about $110,000, but it is affected by outliers. The median salary looks good at $100,000. But the mode salary is only $25,000.
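As a quick check of the three measures, here is a minimal Python sketch using the salaries listed on this slide. It relies only on the standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median, mode

# Salaries from the slide, in dollars.
salaries = {
    "Pam": 25_000, "Kevin": 25_000, "Angela": 25_000,
    "Andy": 100_000, "Dwight": 100_000,
    "Jim": 200_000, "Michael": 300_000,
}
values = list(salaries.values())

mean_salary = mean(values)      # sum of scores / number of entries
median_salary = median(values)  # middle score once the list is sorted
mode_salary = mode(values)      # most common score

print(round(mean_salary), median_salary, mode_salary)
```

The mean lands near $110,000 because the two largest salaries pull it upward, while the median and mode are unaffected by how large those top salaries are.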

22 The Median is a much better measure of the center
But the mean doesn’t work in a skewed distribution (what type is this?). The median is a much better measure of the center.

23 Skewed Curves PAGE 7– Positive v. Negative
Skewed Curves PAGE 7: Positive v. Negative? Where are the mean, median and mode?
Outliers skew distributions (they are not representative of the majority).
Positively skewed: if a group has one high outlier, the curve has a positive skew (contains more low scores).
Negatively skewed: if a group has a low outlier, the curve has a negative skew (contains more high scores).

24 Measures of Variation Page 5
Dispersion = how spread out the data points are (high v. low variability).
Two key ways of measuring variability: range and standard deviation.

25 Range: the difference between the highest and lowest values of a data set. It gives no other info about the variability of the other scores.

26 Standard Deviation: a measure of dispersion, i.e. how much each score varies/deviates from the mean. It measures roughly the average distance between the scores and the mean. The higher the variance or SD, the more spread out the distribution is. Do scientists want a big or small SD? Shaq and Kobe may both score 30 ppg (same mean), but their SDs are very different.

27 In research what would we prefer in our sample – a high or low SD? Why?

28 Three populations with the same mean, median, and mode
Three populations with the same mean, median, and mode. Which has the smallest standard deviation?

29 Formulas for Standard Deviation
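The formulas on this slide did not survive in the transcript. The two standard textbook forms (population SD, dividing by N, and sample SD, dividing by N - 1) can be sketched in Python as follows. The function names and the two players' scores are hypothetical illustration values, chosen so both players share a mean of 30 points per game.

```python
from math import sqrt


def population_sd(scores):
    """SD when the scores are the whole population: divide by N."""
    m = sum(scores) / len(scores)
    return sqrt(sum((x - m) ** 2 for x in scores) / len(scores))


def sample_sd(scores):
    """SD estimated from a sample: divide by N - 1."""
    m = sum(scores) / len(scores)
    return sqrt(sum((x - m) ** 2 for x in scores) / (len(scores) - 1))


# Two hypothetical players with the same mean (30 points per game).
steady = [29, 30, 31, 30, 30]
streaky = [10, 50, 20, 40, 30]

print(population_sd(steady), population_sd(streaky))
print(sample_sd(steady), sample_sd(streaky))
```

Both lists average 30, yet the streaky player's SD is more than twenty times larger, which is the point of comparing SDs rather than means alone.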

30 Standard Deviation

31 Check for Understanding
Page 5 Calculate the Range and SD for the data set Page 5 Estimating SD 1 – 2 at bottom

32 Standard Deviation in Action
A couple needs to be within one standard deviation of each other in intelligence (10 points in either direction). - Neil Clark Warren, founder of eHarmony.com

33 Interpret this graph. Figure 6. The distribution of IQ scores in male and female populations. Adjusted parameter values yielded a male-female gap of SD in g equivalent to 2.43 IQ points in favor of men.

34 Normal Distributions Page 5
The distribution of data also gives us key info. We know that many human attributes (e.g. height, weight, task skill, reaction time, anxiousness, personality characteristics, attitudes) follow a normal distribution. In a normal distribution, the mean, median and mode are all the same. It is a symmetrical, bell-shaped curve in which the % of scores between the mean and any point on the horizontal axis is always the same: 68% fall within 1 SD, 95% within 2 SD.

35 Normal Distribution

36 IQ follows a Normal Distribution
Mean = 100 SD = 15

37 What percentage score below 100?
Mean = 100 SD = 15

38 What percentage score below 100?
Mean = 100 SD = 15

39 What percentage score above 100?
Mean = 100 SD = 15

40 What percentage score between 85 and 100?
Mean = 100 SD = 15 34.1%

41 What percentage score between 85 and 115?
Mean = 100 SD = 15 34.1% + 34.1% = 68.2%

42 What percentage score between 70 and 130?
Mean = 100 SD = 15 13.6% + 34.1% + 34.1% + 13.6% = 95.4%

43 What percentage score below 70 and above 130?
Mean = 100 SD = 15
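The questions on the preceding slides can be checked with Python's standard `statistics.NormalDist`, using the mean of 100 and SD of 15 given on the slides. This is a sketch for checking answers, not part of the original lesson; `pct_between` is a hypothetical helper name. The exact values differ slightly from sums of rounded band percentages.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # mean and SD from the slides


def pct_between(dist, low, high):
    """Percentage of scores expected between low and high."""
    return 100 * (dist.cdf(high) - dist.cdf(low))


below_100 = 100 * iq.cdf(100)
above_100 = 100 - below_100
between_85_100 = pct_between(iq, 85, 100)
between_85_115 = pct_between(iq, 85, 115)
between_70_130 = pct_between(iq, 70, 130)
outside_70_130 = 100 - between_70_130  # below 70 or above 130

for label, pct in [
    ("below 100", below_100),
    ("above 100", above_100),
    ("85 to 100", between_85_100),
    ("85 to 115", between_85_115),
    ("70 to 130", between_70_130),
    ("below 70 or above 130", outside_70_130),
]:
    print(f"{label}: {pct:.1f}%")
```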

44 The shelf life of a particular dairy product is normally distributed with a mean of 12 days and a standard deviation of 3 days. About what percent of the products last between 9 and 15 days? About what percent of the products last between 12 and 15 days? About what percent of the products last 6 days or less? About what percent of the products last 15 or more days?

45 Aim: How stats are used in reporting data in research.
DO NOW: Take out your HW Complete NORMAL CURVE practice. Homework: Quiz on Stats (60 Pt Quiz) Tomorrow – 15 Multiple Choice and computations.

46

47 Inferential Statistics Page 8
Calculations that estimate the likelihood that you’d get similar results if you repeated the study; i.e., that your results are NOT a fluke or a random, atypical event.
Calculate the probability: t-test, chi-squared, ANOVA.
p value = probability of getting results at least this extreme by chance alone, if there were no real effect.
Which should you trust more, results with a low or high p value? How low?
If p < 0.05, then the results are “statistically significant”.
Statistically significant = not likely due to random chance.

48 Testing for Differences
If we have results (means) from two groups, before we infer causation we must ask the question: Is there a real difference between the means of the two groups, or did it just happen by chance? To answer the question, we must run a t-test.

49 Example of when to do a t-test
Does caffeine improve our reaction time? We recruit 40 people and randomly assign 20 to take a caffeine pill (experimental group) and 20 to take a sugar pill (control group). We give them a brief reaction time test and record the results.

50 Example of when to do a t-test
Experimental Group results (caffeine): Mean = … ms, SD = … ms
Control Group results (placebo): Mean = … ms, SD = …

51 Example of when to do a t-test
Caffeine vs. No Caffeine

52 Why can’t I be done! Yes, they are different. . .
But you don’t know if that difference was due to your IV (caffeine) or just dumb luck. You have to be sure that the results are statistically significant

53 T-Test formula
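The formula itself did not survive in this transcript. One common form for two independent groups is Welch's unequal-variance t statistic, sketched below in Python as an assumption about what such a slide typically shows, not as the slide's own formula. The reaction times are hypothetical values invented for illustration (they are NOT the study results), and `welch_t` is an illustrative name.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t, degrees of freedom) for two independent samples."""
    # Squared standard error of each group's mean (sample variance / n).
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    # t = difference between the means, in units of its standard error.
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical reaction times in ms, for illustration only.
caffeine = [240, 250, 260, 270, 280]
placebo = [250, 260, 270, 280, 290]

t, df = welch_t(caffeine, placebo)
print(t, df)
```

Turning t and the degrees of freedom into a p value requires the t distribution, which is not in Python's standard library; a spreadsheet function or a statistics package handles that step.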

54 T-test excel formula =TTEST(array1,array2,tails,type)
Array1 is the first data set.
Array2 is the second data set.
Tails specifies the number of distribution tails. If tails = 1, TTEST uses the one-tailed distribution. If tails = 2, TTEST uses the two-tailed distribution.
Type is the kind of t-test to perform:
If type equals 1, the test performed is: Paired
If type equals 2, the test performed is: Two-sample equal variance (homoscedastic)
If type equals 3, the test performed is: Two-sample unequal variance (heteroscedastic)

55 T-test yields a p-value
Generally, the t-test gives a p value that gives us a measure of confidence in the observed difference. It allows us to say that the difference is unlikely to be due to chance alone. A p value of less than 0.05 is a common criterion for significance. We call this statistically significant. Note: We also need to be careful about false negatives (Type II errors). Look up ‘statistical power’ if you want to know more about this.

56 T-test results Does caffeine improve our reaction time?
Caffeine condition has a lower mean RT. We run a t-test on our samples and get: p = 0.039. Can we be confident that the difference in the data is not due to chance?

57

58 Kahoot

59 Z Scores: a unit that measures the distance of one score from the mean.
A positive z score means a number above the mean. A negative z score means a number below the mean.

60 PERCENTILES If scores from an English exam range from 0 to 100 and your raw score is 75, does that mean you are in the 75th percentile? Not necessarily. A percentile is the % of people who “fall” at or below a given RAW SCORE; you need the mean and SD to figure it out!
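To see why the raw score alone is not enough, here is a minimal Python sketch that converts a raw score of 75 into a percentile for two exams. It assumes the exam scores are normally distributed, and the means and SDs are hypothetical values chosen for illustration; `percentile_rank` is an illustrative name.

```python
from statistics import NormalDist


def percentile_rank(raw_score, mean, sd):
    """Percent of scores at or below raw_score, assuming a normal distribution."""
    return 100 * NormalDist(mu=mean, sigma=sd).cdf(raw_score)


# Same raw score of 75 on two hypothetical exams.
easy_exam = percentile_rank(75, mean=85, sd=10)  # 75 is below the class mean
hard_exam = percentile_rank(75, mean=60, sd=10)  # 75 is well above the class mean

print(f"easy exam: {easy_exam:.1f}th percentile")
print(f"hard exam: {hard_exam:.1f}th percentile")
```

The same raw score falls near the 16th percentile on one exam and near the 93rd on the other, depending entirely on the mean and SD of the class.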

61 Meta Analysis Combines and analyzes data from many different studies
Determines how much variance in scores across all studies could be explained by a particular variable.

62 Z - Scores Raw scores can be converted to Z scores
Tells you how far a given score is above or below the mean, using the SD as the unit of measurement. Z score = (Raw Score – Mean) / SD

63 Z score - the number of standard deviations from the mean
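As a small worked example, the conversion from a raw score to a z score can be written in Python. The IQ mean of 100 and SD of 15 come from the earlier slides; the function name and the two raw scores are illustrative.

```python
def z_score(raw_score, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw_score - mean) / sd


# IQ scale from the earlier slides: mean = 100, SD = 15.
high = z_score(130, mean=100, sd=15)  # 2.0, two SDs above the mean
low = z_score(85, mean=100, sd=15)    # -1.0, one SD below the mean

print(high, low)
```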

