Statistics and Storytelling with Data
Dr. Lam
TECM 4180
Let’s Review
What is each of these good for?
  Line charts
  Bar charts
  Pie charts
  Pictographs
  Tables
Types of Variables
Quantitative variables (interval, ordinal)
  Interval: numerical, with a meaningful distance between the numbers (e.g., temperature)
  Ordinal: numerical, without a meaningful distance between the numbers (e.g., happiness rated from 1 to 5)
Qualitative variables (categorical)
  Non-numerical variables with no sense of ordering (e.g., hair color, eye color, job title)
Sample dataset: 5 variables collected from 100 undergraduate students
  Academic status (freshman, sophomore, junior, or senior)
  Major
  SAT score
  GPA
  Number of detentions in high school
Descriptive vs. Inferential Statistics
Descriptive statistics give us a broad overview of our variables
  Central tendencies: mean, median, and mode
    E.g., the average SAT score
  Frequencies and frequency distributions
    E.g., how many students are freshmen vs. seniors
Inferential statistics tell us how variables relate
  Do variables correlate?
  Does one group have a significantly different average SAT score than another group?
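A minimal sketch in Python of the three central tendencies named above, using the standard library's statistics module. The five SAT scores are hypothetical values invented for illustration; they are not the class dataset.

```python
import statistics

# Hypothetical SAT scores, invented for illustration only.
sat_scores = [1010, 1200, 1200, 1350, 1490]

mean_score = statistics.mean(sat_scores)      # arithmetic average
median_score = statistics.median(sat_scores)  # middle value once sorted
mode_score = statistics.mode(sat_scores)      # most frequently occurring value

print("Mean:  ", mean_score)
print("Median:", median_score)
print("Mode:  ", mode_score)
```

With real data, the mean and median can differ noticeably when a few scores are unusually high or low, which is one reason to report more than one measure.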
Descriptive Statistics and Visualizations
Central tendencies (1 variable)
  Mean: the average
    E.g., the average SAT score in the dataset
  Visualize with tables, bar charts, and line charts

  Year        SAT Avg.
  Freshman    1369
  Sophomore   1198
  Junior      1245
  Senior      1009
Frequency Distribution
Frequency distribution (1 variable): how many times (how frequently) each value of a variable appears
  E.g., you are examining 100 undergraduates; 22 are freshmen, 33 are sophomores, 21 are juniors, and 24 are seniors
  Can be visualized using tables or histograms (which are essentially bar charts)

  Year        Total Participants
  Freshman    22
  Sophomore   33
  Junior      21
  Senior      24
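One way to tally a frequency distribution in Python is collections.Counter. The sketch below rebuilds the 100 academic-status values from the counts given in the example above (the individual student records are an assumption; only the totals come from the slide) and prints a rough text histogram.

```python
from collections import Counter

# Rebuild 100 academic-status values from the counts in the example above.
years = (
    ["Freshman"] * 22
    + ["Sophomore"] * 33
    + ["Junior"] * 21
    + ["Senior"] * 24
)

# Counter tallies how many times each distinct value appears.
frequency = Counter(years)

# Print each category with its count and a bar of '#' characters,
# which works as a simple text histogram.
for year, count in frequency.items():
    print(f"{year:<10} {count:>3} {'#' * count}")
```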
Inferential Statistics – Correlations
Correlation (2 quantitative variables): how closely two variables move together
  Positive: when one variable increases, the other also increases
    E.g., GPA and SAT scores
  Negative: when one variable increases, the other decreases
    E.g., GPA and detentions
R, the correlation coefficient, tells you how strong the relationship is
  R always falls between -1 and 1
  Values closer to 1 show a stronger positive correlation
  Values closer to -1 show a stronger negative correlation
  Values near 0 show little or no linear relationship
In Excel, use the following formula, with one cell range for each variable:
  =CORREL(FirstCell:LastCell, FirstCell:LastCell)
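Excel's CORREL returns the Pearson correlation coefficient. The sketch below computes the same quantity by hand in Python so the arithmetic is visible. The function name pearson_r and all GPA, SAT, and detention values are hypothetical, invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(ss_x * ss_y)

# Hypothetical data for five students, invented for illustration only.
gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
sat = [1000, 1100, 1250, 1300, 1450]
detentions = [8, 6, 5, 2, 0]

r_gpa_sat = pearson_r(gpa, sat)                # expected to be positive
r_gpa_detentions = pearson_r(gpa, detentions)  # expected to be negative

print(f"GPA vs. SAT:        r = {r_gpa_sat:.2f}")
print(f"GPA vs. detentions: r = {r_gpa_detentions:.2f}")
```

A strong R only says the two variables move together; it does not show that one causes the other.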
Inferential Statistics – Comparing Means
Which groups can we compare means across?
Process:
  Calculate the mean for each group separately
    =AVERAGE(FirstCell:LastCell)
  Visualize the results using a bar chart or column chart
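The same process can be sketched in Python: compute each group's mean separately, then draw one bar per group. The majors and SAT scores below are hypothetical, invented for illustration, and the text bars stand in for a real bar or column chart.

```python
from statistics import mean

# Hypothetical SAT scores grouped by major, invented for illustration only.
sat_by_major = {
    "Biology": [1100, 1200, 1300],
    "English": [1000, 1100],
    "Physics": [1300, 1400, 1500],
}

# Step 1: calculate the mean for each group separately.
group_means = {major: mean(scores) for major, scores in sat_by_major.items()}

# Step 2: visualize with one bar per group (one '#' per 50 SAT points).
for major, avg in group_means.items():
    bar = "#" * int(avg // 50)
    print(f"{major:<8} {bar} {avg:.0f}")
```

Seeing that two bars differ in height is only a first look; deciding whether the difference is statistically significant requires a further test, which this sketch does not attempt.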
Three Elements of a Successful Infographic
Understands its audience
  Anderson page 62: communication goals worksheet, due with Project 2
Has a clear framework
  Essentially, what is the data? Is it clear to everyone? What parts of the data will need explaining?
Tells a story
From Harvard Business Review (https://hbr.org/2013/04/the-three-elements-of-successf)
What makes a good story? Poll
Parts of a Story
  Introduction to the situation
  A series of events involving tension or conflict
  Resolution
Storytelling with data https://www.youtube.com/watch?v=6xsvGYIxJok
Four Characteristics of Storytelling with Data
  Connect with people
  Try to convey one (or a centralized) idea
  Keep it simple
  Explore what you know
Practical Strategies for Infographics
  Repeat yourself in a variety of visual ways
  Draw your reader in with a compelling question or problem
  Use transitional statements if necessary
With a Partner
Complete the storytelling analysis activity on the resources page of the website.