1
Data Visualization Jeopardy
MIST 4610 Summer 2017
2
Game Format
STEP 1
Get with your group project team. You will work together when answering the Jeopardy questions.
GAME FORMAT
The data visualization exercises come in two levels of difficulty: easy and hard.
There are 5 exercises (no warm-ups).
Each easy question is worth 1 point.
Each hard question is worth 2 points.
The bonus question is worth 3 points.
3
Game Format
SCORING
2 easy and 2 hard questions (total of 6 points)
1 bonus question (total of 3 points)
WINNER
Each member of the winning team will get … BRAGGING RIGHTS!
RUNNER UP
Gets “nada,” sorry!
4
Game Rules
RULES
When you are ready to answer, raise your hand. Whoever raises a hand first gets to answer the question first.
If the answer is incorrect, the group loses a point (-1 if you miss the first easy query).
Every group has a maximum of 15 minutes per question.
If nobody gets the right answer, nobody receives a point.
5
Data Visualization Jeopardy
STARTS NOW…
6
DV 1 – Category “Easy”
Create a bar chart counting the credit limit of all customers.
library(DBI)
require(RMySQL)
require(ggvis)
# Set a driver
m <- dbDriver("MySQL")
# If the error "in .local(drv, ...): cannot allocate a new connection: 16 connections already opened"
# appears, loop through the open connections and close them before connecting.
# If there is no problem, skip these two lines.
cons <- dbListConnections(MySQL())
for (con in cons) dbDisconnect(con)
# Connect to the database
conn <- dbConnect(m, user='student', password='student', host='wallaby.terry.uga.edu', dbname='ClassicModels')
# Query the database
d <- dbGetQuery(conn, "SELECT creditLimit FROM Customers;")
head(d)
# Plot the graph (the upper bound of seq() is missing in the source slide)
d %>%
  ggvis(~creditLimit) %>%
  layer_histograms(width = 10000, fill := 'cornflowerblue') %>%
  add_axis('x', title = 'Credit Limit', values = seq(0, , by = 50000)) %>%
  add_axis('y', title = 'Frequency')
7
DV 1 – Category “Easy” Create a bar chart counting the credit limit of all customers.
8
DV 2 – Category “Hard”
Graph a scatter plot of the number of flights from the state of New York only if those flights were not delayed. Group by day of the month.
require(sqldf)
require(ggplot2)
# The URL of the delta flight data file is missing in the source slide
url <- ''
delta <- read.table(url, header=TRUE, sep=',')
# Filter on the delay in WHERE so that delayed flights are excluded before counting;
# a HAVING clause on DepDelayMinutes would test only one arbitrary row per day
OrigFlightNoDelayNY <- sqldf("SELECT DayOfMonth AS 'Day', COUNT(*) AS 'Count' FROM delta WHERE OriginState = 'NY' AND DepDelayMinutes <= 0 GROUP BY DayOfMonth;")
ggplot(OrigFlightNoDelayNY, aes(x=Day, y=Count)) +
  geom_point(color='blue') +
  geom_line(color='red') +
  xlab('Day') +
  ylab('Number of Flights Out of NY With No Delay') +
  scale_x_continuous(breaks=1:28)
9
DV 2 – Category “Hard” Graph a scatter plot of the number of flights from the state of New York only if those flights were not delayed. Group by day of the month.
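The exercise counts only flights that were not delayed, so the delay condition has to remove rows before they are grouped by day. Below is a minimal sketch of that pattern on made-up data: the `flights` data frame and its values are hypothetical, and only the column names are borrowed from the delta file. It assumes the sqldf package is installed.

```r
library(sqldf)

# Hypothetical toy data, not taken from the delta file
flights <- data.frame(
  DayOfMonth      = c(1, 1, 1, 2, 2),
  OriginState     = c("NY", "NY", "GA", "NY", "NY"),
  DepDelayMinutes = c(0, 12, 0, 0, 0),
  stringsAsFactors = FALSE
)

# WHERE drops delayed and non-NY rows first; COUNT(*) then counts what is left per day
on_time <- sqldf("
  SELECT DayOfMonth AS Day, COUNT(*) AS NumFlights
  FROM flights
  WHERE OriginState = 'NY' AND DepDelayMinutes <= 0
  GROUP BY DayOfMonth
  ORDER BY DayOfMonth
")

# Day 1 keeps one on-time NY flight, day 2 keeps two
print(on_time)
```

Putting the delay test in HAVING instead would be evaluated once per group, after the counting, so delayed flights would still be included in each day's count.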
10
DV 3 – Category “Easy”
Graph a bar chart to count the number of each order status.
require(ggplot2)
library(DBI)
require(RMySQL)
m <- dbDriver("MySQL")
conn <- dbConnect(m, user='student', password='student', host='wallaby.terry.uga.edu', dbname='ClassicModels')
orders <- dbGetQuery(conn, "SELECT status FROM Orders;")
# status is categorical, so count it with geom_bar (geom_histogram expects a continuous x)
# Internal fill color is red
ggplot(orders, aes(x=status)) +
  geom_bar(fill='red') +
  xlab('Status') +
  ylab('Count') +
  coord_flip()
11
DV 3 – Category “Easy” Graph a bar chart to count the number of each order status.
12
DV 4 – Category “Hard”
Use a scatterplot to graph the total payment amount of orders by year and month. Format the Y-axis label to show dollars and the x-axis to show months.
require(ggvis)
require(dplyr) # group_by for data frames
# Uses the database connection conn opened in the earlier exercises
# Get total payment amount of orders by month and year
f <- dbGetQuery(conn, "SELECT YEAR(orderDate) AS orderYear, MONTH(orderDate) AS orderMonth, SUM(amount) AS totalPayment FROM Payments, Orders, Customers WHERE Orders.customerNumber = Customers.customerNumber AND Payments.customerNumber = Customers.customerNumber GROUP BY orderYear, orderMonth;")
head(f)
# ggvis expects grouping variables to be a factor, so convert
f$orderYear <- as.factor(f$orderYear)
# Plot total order revenue by month and display by year
f %>%
  group_by(orderYear) %>%
  ggvis(~orderMonth, ~totalPayment, stroke = ~orderYear) %>%
  layer_lines() %>%
  add_axis('x', title = 'Month') %>%
  add_axis('y', title = 'Total Payment (Dollars)', title_offset = 100)
13
DV 4 – Category “Hard” Use a scatterplot to graph the total payment amount of orders by year and month. Format the Y-axis to show dollars and the x-axis to show months.
14
DV 5 – Category “Bonus”
Use the Twitter bot file to graph a distribution of scores that are above the average score.
require(sqldf)
require(ggplot2)
# The URL of the Twitter bot file is missing in the source slide
url <- ''
twitter <- read.table(url, header=TRUE, sep=',')
head(twitter)
# Keep only the scores above the overall average (computed in the subquery)
scoreLargerAvg <- sqldf("SELECT score FROM twitter WHERE score > (SELECT avg(score) FROM twitter);")
ggplot(scoreLargerAvg, aes(x=score)) +
  geom_histogram(fill='red') +
  xlab('Score') +
  ylab('Count')
15
DV 5 – Category “Bonus” Use the Twitter bot file (same as A7) to graph a distribution of scores that are above the average score.
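Selecting the scores above the average relies on a scalar subquery that is computed once and then compared against every row. Below is a minimal sketch on made-up numbers: the `toy` data frame is hypothetical and does not come from the Twitter bot file. It assumes the sqldf package is installed, and it cross-checks the SQL result against the equivalent base R filter.

```r
library(sqldf)

# Hypothetical scores; their mean is 6
toy <- data.frame(score = c(2, 4, 6, 8, 10))

# The subquery returns the single average value; the outer WHERE keeps strictly larger scores
above_sql <- sqldf("
  SELECT score
  FROM toy
  WHERE score > (SELECT avg(score) FROM toy)
  ORDER BY score
")

# The same filter in base R, for comparison
above_r <- toy$score[toy$score > mean(toy$score)]

print(above_sql$score)
print(above_r)
```

Because the comparison is strict, a score exactly equal to the average (6 here) is excluded by both versions.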