
1 Teaching Students to Work with Big Data through Visualizations
Shonda Kuiper

2 Can these provide a new paradigm for the first or second course?
KEY POINTS: ASA Curriculum Guidelines for Undergraduate Programs in Statistical Science
● Increased importance of data science
● Real applications
● More diverse models and approaches
● Ability to communicate
Can these provide a new paradigm for the first or second course? We can start by changing the way we think about data visualization and exploratory data analysis.

3 Nolan, D., Statistical Thinking in a Data Science Course, 2016 UseR Conference

4 Changing the way we think about EDA
In this talk we explain how interactive data visualizations can encourage students to think about multiple models and approaches to authentic questions. We train students to better communicate statistical concepts and understand how statistics is relevant to their own lives.

5 Exploring Racial Disparities in New York City's Stop-and-Frisk Policies
New York City’s stop-and-frisk policies gave police officers the right to stop, search, or arrest any suspicious person with reasonable grounds for action.
“Civil liberties groups say the practice is racist and fails to deter crime.”
“The NYPD has defended the…policy on the grounds that most stops are conducted in high-crime neighborhoods with high concentrations of people of color.”
Police stop more than 1 million people on the street.
Source: New York City Bar Association, Report on the NYPD’s Stop-and-Frisk Policy (page 10)
FREE online visualizations: rstudio.grinnell.edu

6 New York City's Stop-and-Frisk Policies
Figure labels: Natural Log
Sources:
Center for Constitutional Rights. (2016, March 23). Floyd, et al. v City of New York, et al. Retrieved from:
Cohen, S., Golding, B. (2015, March 3). Stop-and-frisk law so strict, cops should ‘travel with an attorney’. New York Post.

7 New York City's Stop-and-Frisk Policies
“on August 12, 2013… a federal judge found the New York City Police Department liable for a pattern and practice of racial profiling and unconstitutional stops.”
Currently, the law does not allow police officers to stop individuals who match a “generalized description of a crime suspect” or those who make “furtive movements”.
Figure labels: Natural Log
Sources:
Center for Constitutional Rights. (2016, March 23). Floyd, et al. v City of New York, et al. Retrieved from:
Cohen, S., Golding, B. (2015, March 3). Stop-and-frisk law so strict, cops should ‘travel with an attorney’. New York Post.

8 New York City's Stop-and-Frisk Policies

9 New York City's Stop-and-Frisk Policies
Once stopped, whites have a slightly higher probability of being arrested. How can we determine if this pattern is consistent among all years?
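One way to approach the question above is to compute the arrest rate separately for each year and group, then check in which years the pattern holds. The following is a minimal sketch using only the Python standard library. The `stops` records, their layout, and the group labels are illustrative assumptions with made-up counts, not values from the NYPD data.

```python
from collections import defaultdict

# Hypothetical records: (year, race, number of stops, number of arrests).
# These numbers are invented for illustration; they are NOT the NYPD data.
stops = [
    (2010, "White", 1000, 70),
    (2010, "Black", 5000, 300),
    (2011, "White", 1100, 80),
    (2011, "Black", 5200, 310),
    (2012, "White", 900, 50),
    (2012, "Black", 4000, 260),
]

def arrest_rates_by_year(records):
    """Return {year: {race: arrests / stops}}."""
    rates = defaultdict(dict)
    for year, race, n_stops, n_arrests in records:
        rates[year][race] = n_arrests / n_stops
    return dict(rates)

def years_where_higher(rates, group, other):
    """List the years in which `group` has a higher arrest rate than `other`."""
    return sorted(y for y, r in rates.items() if r[group] > r[other])

rates = arrest_rates_by_year(stops)
higher_years = years_where_higher(rates, "White", "Black")

# The pattern is "consistent" only if it holds in every year in the data.
consistent = len(higher_years) == len(rates)
print(higher_years, consistent)
```

With these invented counts the pattern holds in two of the three years, which is the kind of result that should prompt students to look at each year rather than only at the pooled total.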

10 New York City's Stop-and-Frisk Policies
Police tend to use force more often in stops of male suspects than in stops of female suspects. The most common type of “force” is where police place their hands on a suspect. The percentage of stops where police use force is increasing.
Figure labels: Female, Male

11 New York City's Stop-and-Frisk Policies
Counts can lead to very different conclusions than proportions. The number of stops where police use force is decreasing. What patterns exist for other gender identities?
Figure labels: Female, Male
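The contrast between counts and proportions can be reproduced with a few lines of arithmetic. In this sketch the yearly totals are invented for illustration (they are not taken from the stop-and-frisk dataset), and `trend` is a hypothetical helper that only classifies the direction of a sequence.

```python
# Hypothetical yearly totals: (year, total stops, stops involving force).
# Invented for illustration only; not taken from the NYPD dataset.
yearly = [
    (2011, 600_000, 120_000),
    (2012, 500_000, 105_000),
    (2013, 200_000, 46_000),
]

def trend(values):
    """Classify a sequence as 'increasing', 'decreasing', or 'mixed'."""
    pairs = list(zip(values, values[1:]))
    if all(b > a for a, b in pairs):
        return "increasing"
    if all(b < a for a, b in pairs):
        return "decreasing"
    return "mixed"

force_counts = [force for _, _, force in yearly]
force_percent = [100 * force / total for _, total, force in yearly]

count_trend = trend(force_counts)      # number of stops with force
percent_trend = trend(force_percent)   # share of stops with force
print(count_trend, percent_trend)
```

Here the count of stops with force falls because the total number of stops falls faster, while the percentage of stops with force rises. Both statements are true of the same data.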

12 Total number of Arrests

13 Percentage of Arrests
Figure labels: Percentage African American

14 Log(Percentage of Arrests)
Figure labels: Natural Log, Percentage African American

15 ASA Curriculum Guidelines for Undergraduate Programs in Statistical Science
Increased importance of data science: access and manipulate data in various ways
● Introduces multivariate thinking
● A variety of graphics, statistics, and models are meaningful: counts, percentages, and logs; scatterplots, bar charts, and choropleth maps
● Data is provided (with an RMD file) so students can manipulate the data and create their own visualizations in any software.
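To make the three summaries mentioned above concrete (counts, percentages, logs), the sketch below computes each one for a handful of precincts. The precinct labels and numbers are hypothetical, and the materials themselves ship an RMD file, so this Python version is only an illustration of the arithmetic.

```python
import math

# Hypothetical precinct summaries: (precinct label, stops, arrests).
# Invented for illustration only; not taken from the NYPD dataset.
precincts = [("A", 20_000, 1_200), ("B", 5_000, 400), ("C", 800, 40)]

def summarize(n_stops, n_arrests):
    """Return the arrest count, the arrest percentage, and its natural log."""
    pct = 100 * n_arrests / n_stops
    return {"count": n_arrests, "percent": pct, "log_percent": math.log(pct)}

summary = {name: summarize(s, a) for name, s, a in precincts}

# The "top" precinct depends on which summary is used.
top_by_count = max(summary, key=lambda k: summary[k]["count"])
top_by_percent = max(summary, key=lambda k: summary[k]["percent"])
print(top_by_count, top_by_percent)
```

The natural log does not change the ordering of the percentages. It compresses the scale, which can make skewed values easier to compare on a plot or map.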

16 ASA Curriculum Guidelines for Undergraduate Programs in Statistical Science
Real applications:
 - emphasize concepts and approaches for working with complex data
 - analyzing non-textbook data
● Authentic problem of a kind that tends to be reserved for humanities or social science courses
● Emphasizes the importance of data-based decision making and the investigative process of problem-solving

17 ASA Curriculum Guidelines for Undergraduate Programs in Statistical Science
More diverse models and approaches: issues of design, confounding, and bias
● Precinct 114: Rikers Island Correctional Center
● Precinct 22: Central Park
● Precinct 14: Times Square
Are arrests correlated with income or education?

18 ASA Curriculum Guidelines for Undergraduate Programs in Statistical Science
Ability to communicate:
 - communicate complex statistical methods in basic terms
 - visualize results in an accessible manner
 - ethical standards
Assignment: Create a short report with a visualization that tells a story or interesting idea hidden within the available data (can use online apps or class software).
Next Day: Small group or class discussion. What is the best graphic and why?
● Visually appealing
● Easy to understand
● Persuasive
● Demonstrates originality

19 Quiz/Discussion Questions
All information in this dataset is self-reported by the police officers. Police officers are required to fill out a form after every stop. Could this lead to any biases in the data?
After people are stopped for suspected contraband, the proportion of whites arrested is much higher than the proportion of Hispanics (p-value < ). Does this provide significant evidence that the NYPD is discriminating against whites?
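For discussion, it can help to see how a p-value for a difference in two proportions might be computed. The slides do not say which test produced the reported p-value, so the pooled two-proportion z-test below is an assumption, and the counts passed to it are hypothetical rather than taken from the NYPD data.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for H0: p1 == p2. Returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts (arrests, stops) for two groups; not NYPD data.
z, p_value = two_proportion_z_test(x1=300, n1=2000, x2=500, n2=5000)
print(round(z, 2), p_value)
```

A small p-value only says that a difference this large would be unlikely if the two underlying arrest rates were equal. It does not by itself say why the rates differ, which is where the questions about self-reporting, confounding, and who gets stopped in the first place come in.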

20 Quiz/Discussion Questions
This data is clearly not from a simple random sample, but a limited attempt to collect information on all NYPD stops in the last 10 years. What should we do with datasets that are limited or incomplete?

21 Materials are available on the Stat2Labs website.

22 Multiple Student handouts
Links to interactive websites

23 Cleaned datasets and sample code are available in GitHub
Source data

24 Example of a student handout
This guides students through the process of working through a research question.

25 Other Mapping Resources (that do not require programming skills)
While the interactive visualizations are freely available online, the following tools allow instructors and students to build upon the online apps to create their own visualizations.
● Tableau
● Google Maps: mapping.withgoogle.com
● Google Fusion Tables
● CartoDB
● General Map Making

26 New Paradigm
Teach students how to think with data by having them work with authentic problems, and train them to better communicate nuanced statistical ideas.
● Move beyond carefully vetted and cleaned textbook datasets
● Instead of p-values, find patterns that matter (tell a story with your data)
● Determine what questions should be asked about a dataset and analysis
● Articulate how assumptions we make about the dataset can lead to misleading conclusions and poor decisions
web.grinnell.edu/individuals/kuipers/stat2labs/

