AP CSP: Data Assumptions & Good and Bad Data Visualizations


1 AP CSP: Data Assumptions & Good and Bad Data Visualizations

2 Survey Reminder: Go and answer the questions from the Data Tracker project. Make sure to do this every day, including the days I don’t see you.

3 Introduction: Last class we learned about the benefits of accumulating a lot of data from people so that we can learn from it and possibly make predictions. We also learned that visualization tools like Google Trends make it easier for humans to interpret all of that data. We can explore the data more easily, and we can change parameters within these tools to better suit our needs. It is important to distinguish between what the data is telling us and why it is that way. Be careful when making predictions or assumptions.

4 Google Flu Trends: Now we are going to watch a video about Google Flu Trends. Google used this tool to try to predict outbreaks of the flu. Think about the potential benefits of using a tool like Google Flu Trends. Share what you think.

5 Google Flu Trends Failure:
Now that you have thought about the potential benefits of using a tool like Google Flu Trends, we are going to learn why it wasn’t a complete success. Choose one of the articles below and learn more about why the project failed: Why did Google Flu Trends eventually fail? What assumptions did Google make about its data or its model that ultimately proved not to be true?

6 Article Discussion: Google Flu Trends worked well in some instances but often over-estimated, under-estimated, or entirely missed flu outbreaks, as with the H1N1 flu virus. Searching for or reading about the flu doesn’t mean a person actually has it. In general, many terms may have been good predictors of the flu for a while only because, like high school basketball, they are searched more in the winter, when more people get the flu. Google also began recommending searches to users, which skewed what terms people searched for. As a result, the tool was also measuring Google-generated suggested searches, which skewed the results.
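
The seasonal-term problem can be illustrated with a small sketch. The monthly numbers below are invented for illustration (they are not Google or flu-surveillance data), and the names `flu_cases` and `basketball_searches` are hypothetical. Both series simply peak in winter, and that alone is enough to produce a high correlation even though neither one causes the other.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented monthly values for January through December (NOT real data).
# Both series follow a yearly cycle that is highest in the winter months.
months = range(12)
flu_cases = [100 + 80 * math.cos(2 * math.pi * m / 12) for m in months]
# Basketball searches peak one month earlier than flu cases in this sketch.
basketball_searches = [50 + 30 * math.cos(2 * math.pi * (m + 1) / 12) for m in months]

r = pearson(flu_cases, basketball_searches)
print(f"correlation between flu cases and basketball searches: {r:.2f}")
```

A model that only looks at this correlation would treat basketball searches as a flu signal. That assumption breaks as soon as the two seasons stop lining up.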

7 Assumptions & Misinterpretation:
The amount of data now available makes it very tempting to draw conclusions from it. There are certainly many beneficial results of analyzing this data, but we need to be very careful. To interpret data usually means making key assumptions. If those assumptions are wrong, our entire analysis may be wrong as well. Even when you’re not conducting the analysis yourself, it’s important to start thinking about what assumptions other people are making when they analyze data, too.

8 The Digital Divide: Now that we know how making bad assumptions can lead to bad predictions, let’s learn more about why our assumptions are wrong. How well does activity on the Internet represent all of reality? Complete the Activity Guide – Digital Divide and Checking Assumptions. Use the site below to complete the first part of the guide.

9 Digital Divide Conclusions:
Access to and use of the Internet differ by income, race, education, age, disability, and geography. As a result, some groups are over- or under-represented when looking at activity online. When we see behavior on the Internet, like search trends, we may be tempted to assume that access to the Internet is universal and so we are taking a representative sample of everyone. In reality, a “digital divide” leads to some groups being over- or under-represented. Some people may not be on the Internet at all.
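
A minimal sketch of how this over- and under-representation distorts an estimate. The two groups, their sizes, their access rates, and the "trait" being measured are all hypothetical numbers chosen for illustration. The sketch also assumes that, within a group, having Internet access is unrelated to having the trait.

```python
# Hypothetical population (NOT real data): two groups with different
# Internet access rates and different rates of some trait of interest.
groups = {
    "group_a": {"population": 600, "online_rate": 0.9, "trait_rate": 0.2},
    "group_b": {"population": 400, "online_rate": 0.4, "trait_rate": 0.7},
}

def true_rate(groups):
    """Share of the whole population that has the trait."""
    total = sum(g["population"] for g in groups.values())
    with_trait = sum(g["population"] * g["trait_rate"] for g in groups.values())
    return with_trait / total

def online_estimate(groups):
    """Share with the trait when only people who are online are observed."""
    online = sum(g["population"] * g["online_rate"] for g in groups.values())
    online_with_trait = sum(
        g["population"] * g["online_rate"] * g["trait_rate"]
        for g in groups.values()
    )
    return online_with_trait / online

print(f"true rate:       {true_rate(groups):.2f}")
print(f"online estimate: {online_estimate(groups):.2f}")
```

Because group_b is less likely to be online, its members count for less in the online sample, and the online estimate comes out lower than the true rate.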

10 Checking your Assumptions:
Now that you have completed the first part, work on the second part of the guide. Choose one of the listed scenarios, then examine and critique the assumptions used to make its decisions. Then suggest additional data you would like to collect or other ways the decision could be made more reliably.

11 Share your scenario: What were some of the bad assumptions made in your particular scenario? Why were these bad assumptions? Explain some of the reasons, drawing on what you read about the digital divide or elsewhere. It is always a good idea to get into the habit of checking assumptions before jumping to conclusions about trends in data.

12 Be Aware: Now that you have learned about making predictions based on bad assumptions, would you revise the explanation you gave in the Google Trends activity we completed in class the other day? Get into the habit of recognizing what assumptions are being made when we interpret data. It is a good idea to call out your assumptions explicitly and to think critically about what assumptions other people are making when they interpret data. Keep an eye out for the assumptions other people are making when they try to tell us “what the data is saying.”

13 Why use Visualizations:
Why do people like to make a bunch of charts and graphs rather than just showing the raw data itself? List a few advantages and disadvantages (at least 2 of each) of using visualizations to communicate data. Write in your journal. Think about communication purposes.

14 Comparing Data Visuals:
To better understand some of the skills we just read about, we are going to evaluate a collection of data visualizations to determine how well they communicate their message. You and a partner will work on this assignment together. You will log into code.org Unit 2 Stage 10 and choose where it says Data Visualization. One partner will choose Collection A and the other will choose Collection B. Once you are finished, share the best and worst image from your set with your partner and another group. Focus on how you would fix the worst visualization you chose.

15 What makes a good/bad data Visualization?
Which were the best and worst data visualizations? Justify your reasoning. Now let’s look at graphic number 5 in both collections. How did different groups rate this graphic? What data is presented? What is the difference between the two visualizations?

16 Good v. Bad Visualization Characteristics
Which one did you prefer? What makes the good one good and the bad one bad? Let’s create a chart comparing the characteristics of good data visualizations with those of bad ones. List some characteristics for each side.

17 Creating Visualizations:
We’re going to be making some of our own visualizations of data very soon. Recognize that some types of charts are more appropriate than others, depending on the nature of the data or the message the author is trying to convey. Read pages 1-4 only of “Data Visualization 101: How to design charts and graphs.”

18 Discussion: Choosing the right way to visualize data is essential to communicating your ideas. There are stories in data; visualization helps you tell them. Before understanding visualizations, you must understand the types of data that can be visualized and their relationships to each other. Certain chart types are right for certain situations, depending on the data.
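
As a rough illustration of matching chart types to data, here is a small lookup sketch. The categories and suggestions are common rules of thumb, not taken from the “Data Visualization 101” guide, and the names `CHART_SUGGESTIONS` and `suggest_chart` are hypothetical.

```python
# Rule-of-thumb mapping from what you want to show to a chart type.
# These pairings are general conventions, assumed here for illustration.
CHART_SUGGESTIONS = {
    "comparison": "bar chart",                          # values across categories
    "change_over_time": "line chart",                   # a trend across dates
    "part_of_whole": "pie chart or stacked bar chart",  # shares of a total
    "relationship": "scatter plot",                     # two numeric variables
    "distribution": "histogram",                        # spread of one variable
}

def suggest_chart(purpose):
    """Return a suggested chart type for the given purpose."""
    if purpose not in CHART_SUGGESTIONS:
        known = ", ".join(sorted(CHART_SUGGESTIONS))
        raise ValueError(f"unknown purpose {purpose!r}; expected one of: {known}")
    return CHART_SUGGESTIONS[purpose]

print(suggest_chart("change_over_time"))
```

The lookup is only a starting point. The message you want to convey and the nature of the data still decide which chart works best.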

19 Wrap-Up: Use that guide as a reference when you need to create a visual representation of data. After today we should know: What are the benefits of visualizing data? Can we characterize common mistakes in visualizations to which we gave low ratings? Can we characterize common strengths in effective visualizations? Not all visualizations were charts; what other types are there? What mistakes should we avoid in creating our own visualizations?

