Research Methods Week 5 Descriptive Statistics
Review: Primary Research Designs Descriptive Study - specific predictions, with narration of facts and characteristics concerning individual, group or situation Diagnostic Study - the frequency with which something occurs or its association with something else. The studies concerning whether certain variables are associated are examples of diagnostic research studies. Data collection - observation, questionnaires, interviewing, examination of records
Primary Research Designs Diagnostic Study - the frequency with which something occurs or is associated with something else – i.e. if we see X, how often do we also see Y? Data collection – Observation – count X and Y Questionnaires / interviews – ask about X & Y (but be careful of bias) Examination of records – count X & Y
Primary Research Designs Diagnostic Study - if we see X, how often do we also see Y? Practice – Do people in white cars leave work earlier than people in non-white cars? Observation – count cars and car color every five minutes – 3:45 – 4:15
Observation Counts
Observation – count cars and car color every five minutes – 3:45 – 4:15

Time    White cars    Non-white cars
3:45
3:50
3:55
4:00
4:05
4:10
Statistical Descriptions Why? We want to be able to describe a data set. It is helpful if we can summarize the data and explain its most important attributes. A common starting point is these four descriptions: Mean Median Mode Standard Deviation
Statistical Descriptions Mean – Also called “average” The most common descriptor; we use it every day to sum up a set of data. Mean = sum of values / number of items Example – Mean (average) temperature in January: (24 + 23 + 25 + 22 + …) / 31 days Mean height of students in this class: (198 cm + …) / number of students
Statistical Descriptions Mean – Also called “average” Sum of counts / number of items Mean number of white cars per five-minute count = total white cars / 6
Statistical Descriptions Graphic Display Display shows how “regular” the counts are
Statistical Descriptions Graphic Display What if the counts are not regular, but skewed to one end? The mean in this case is misleading
Problems with the Mean
The mean misleads whenever one item in the count is exceptional:

Person            Annual income
Joe                       12000
Sharma                    14000
Ed                         9000
Ismail                     8000
Sue                        9000
Donald Trump            1111000
Total income            1163000
Average income           193833

Would it be fair to say the “average” person in this company earns $193,833?
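One way to see how strongly a single exceptional value pulls the mean is to compute it with and without that value. The Python sketch below uses the incomes from this slide; Sue's income is taken as 9000, the value implied by the stated total of 1163000, and the variable names are only illustrative.

```python
from statistics import mean

# Annual incomes from the slide.
# Sue's 9000 is an assumption: the value implied by the slide's total of 1163000.
incomes = {
    "Joe": 12000,
    "Sharma": 14000,
    "Ed": 9000,
    "Ismail": 8000,
    "Sue": 9000,
    "Donald Trump": 1111000,
}

# Total and mean across all six people.
total = sum(incomes.values())
mean_all = mean(list(incomes.values()))

# Mean again, leaving out the one exceptional income.
typical = [v for name, v in incomes.items() if name != "Donald Trump"]
mean_without = mean(typical)

print(f"total income:             {total}")
print(f"mean, all six people:     {mean_all:.0f}")
print(f"mean, without the outlier: {mean_without:.0f}")
```

With the outlier the mean is about 193,833; without it the mean of the other five is 10,400, which is much closer to what any of them actually earns.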
Normal distribution For many populations we expect a standard curve – a few at each end, and most in the middle. We are used to seeing that curve in such things as intelligence measurements. In a “normal” distribution, the mean is a good reflection of the sample.
What if things aren’t “normal”? We just saw a set of incomes that did not follow that curve. Many other data sets are equally irregular. Think of things like population statistics (there are often clusters of age groups), or behaviors (people who have traveled probably cluster around “lots” and “little.”)
Using the median
The mean works well for normal distributions but misleads whenever one item in the count is exceptional. The median marks the point below which half of the population lies. We arrange the items in order and find the middle; with an even number of items, we average the two middle values.

Person            Annual income
Ismail                     8000
Ed                         9000
Sue                        9000
Joe                       12000
Sharma                    14000
Donald Trump            1111000
Median income             10500
Mean income              193833

Which more accurately reflects the typical employee earnings at this company, $193,833 or $10,500?
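The comparison can be checked with a short Python sketch. It assumes the six incomes from this slide, with Sue's income taken as 9000, the value implied by the stated median of 10500 (the average of 9000 and 12000). It also counts how many people earn less than each summary figure.

```python
from statistics import mean, median

# Incomes from the slide, in ascending order.
# The second 9000 (Sue) is an assumption implied by the slide's median of 10500.
incomes = [8000, 9000, 9000, 12000, 14000, 1111000]

mean_income = mean(incomes)
median_income = median(incomes)  # even count: average of the two middle values

# How many people earn less than each summary figure?
below_mean = sum(1 for x in incomes if x < mean_income)
below_median = sum(1 for x in incomes if x < median_income)

print(f"mean:   {mean_income:.0f}  ({below_mean} of {len(incomes)} earn less)")
print(f"median: {median_income:.0f}  ({below_median} of {len(incomes)} earn less)")
```

Five of the six employees earn less than the mean, while exactly half earn less than the median, which is why the median is the better description of a typical employee here.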
Practice

Person    Annual speeding tickets
Ismail     8
Ed         3
Sue        2
Joe       12
Sharma    14
Bill
Amal
Fatma

Find the mean and median of the number of tickets issued to this set of drivers. Which number more accurately reflects the quality of Omani drivers?
Using the mode The mode helps us describe situations where the data seem to cluster. For example, in the case of traffic violations, you may hear people talk about a “bimodal distribution.” They mean people seem to fall into two groups: safe drivers and unsafe drivers. There is no real “average” driver.
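A bimodal pattern can be illustrated with a small Python sketch. The ticket counts below are invented purely for illustration (they are not data from this course); `multimode` from the standard `statistics` module (Python 3.8+) returns every value that ties for most frequent.

```python
from statistics import mean, multimode

# Hypothetical yearly ticket counts for eight drivers (made-up values):
# one cluster near zero, another cluster near nine.
tickets = [0, 0, 1, 0, 9, 10, 9, 9]

modes = multimode(tickets)   # all values tied for most frequent
average = mean(tickets)

print(f"modes: {modes}")
print(f"mean:  {average}")
```

In this made-up sample the modes are 0 and 9, while the mean is 4.75, a number of tickets that no driver in the sample comes close to having.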
Examples of the mode Where else might there be no average? Mall sizes in Nizwa Tourist season Daily rainfall Others?
Our car sample What are the mean, median, and mode of our white cars? Excel functions – Graphing – Insert, Column chart Statistics – Formulas, More Functions, Statistical: AVERAGE (mean), MEDIAN, MODE.SNGL
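For anyone working outside Excel, the same summary can be sketched in Python with the standard `statistics` module. The helper name `describe` and the counts below are placeholders invented for illustration; substitute the six counts recorded in class. The sketch also reports the sample standard deviation, the fourth description listed earlier in this deck.

```python
from statistics import mean, median, mode, stdev

def describe(counts):
    """Return the basic descriptive statistics for a list of counts."""
    return {
        "mean": mean(counts),
        "median": median(counts),
        "mode": mode(counts),      # most frequent value
        "stdev": stdev(counts),    # sample standard deviation
    }

# Hypothetical placeholder counts for the six five-minute observations;
# replace these with the white-car counts actually recorded.
white_counts = [2, 4, 4, 5, 3, 1]

summary = describe(white_counts)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Running the same function on the non-white counts gives a second summary to compare against the first.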
Our car sample What are the mean, median, and mode of the non-white cars? Is their pattern normal? - graph Do the two groups differ? – compare graphs
Our car study Compare the graphs and the mean, median, and mode of the two groups of cars. Is their pattern normal? - graph Do the two groups differ? – compare graphs
Our car sample If you were doing a descriptive study of car patterns, how would you describe what you saw?
Our car sample If you were doing a diagnostic study of car patterns, how would you describe what you saw?
Our car sample If you were going to do a hypothesis testing study, what initial hypotheses could you form based on our little sample?
Next week Test in the first hour Statistics – could we show whether white cars leave earlier than non-white cars?