Messy Data: Teaching Students Early on About the Realities of Data.


Messy Data: Teaching Students Early on About the Realities of Data

Cornell College
- Small liberal arts college (1100 students)
- Mathematics and Statistics Department with 4.5 tenure track lines
- Teach on the block plan

Statistics History at Cornell
- Intro stat
- Probability/Math Stat
- Stat 2
- “New Frontiers”
- Epidemiology
- Dealing with Data: Data Manipulation, Data Visualization, and Big Data

Data Course
- Team-taught with a computer scientist
- Prerequisite: either intro stat or CS 1
- Focused on hands-on work
- Morning: two hours of lecture
- Afternoon: two hours of computer lab

Data Course - Plan
- 1/3 of the course on each topic
  - Data Cleaning
  - Data Visualization
  - Big Data
- Relevant computer science fundamentals addressed in a just-in-time fashion
- Use R as the software tool

Data Course - Reality
- 1/3 Data Cleaning
- 1/2 Data Visualization
- 1/6 Big Data

Daily Structure
- Morning (2 hours)
  - Monday-Thursday: Lecture
    - 1 hour stat
    - 1 hour CS
  - Friday: Student presentations
- Afternoon (2 hours)
  - Computer lab

Data Cleaning
- Simple issues
  - Clearly wrong entries
  - Potentially wrong entries
  - Functions of a variable
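The distinction between clearly wrong and potentially wrong entries can be sketched in code. The course used R; the Python below is only an illustration, and the function names, the age thresholds, and the time format are assumptions rather than anything taken from the slides. The second function shows one possible reading of "functions of a variable": deriving a usable numeric value from a raw text field.

```python
from typing import Optional


def classify_age(raw: str) -> str:
    """Label a raw age entry as 'ok', 'suspect', or 'wrong'.

    The cutoffs are illustrative assumptions, not values from the course.
    """
    try:
        age = int(raw.strip())
    except ValueError:
        return "wrong"        # not a whole number at all
    if age <= 0 or age > 120:
        return "wrong"        # impossible for a human runner
    if age < 10 or age > 90:
        return "suspect"      # possible, but worth a second look
    return "ok"


def time_to_minutes(raw: str) -> Optional[float]:
    """Convert 'h:mm:ss' or 'mm:ss' text to minutes; None if unparseable."""
    parts = raw.strip().split(":")
    if len(parts) not in (2, 3) or not all(p.isdigit() for p in parts):
        return None
    nums = [int(p) for p in parts]
    if len(nums) == 2:
        nums = [0] + nums     # no hours field present
    hours, minutes, seconds = nums
    return hours * 60 + minutes + seconds / 60
```

Returning a label instead of silently dropping rows keeps the "potentially wrong" cases visible, so a person can decide what to do with them.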

Data Cleaning
- More complex issues
  - Combining data sets
    - Linking variable issues
    - Making sure data sets are combined properly
    - Different variable formatting in different data sets
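A minimal sketch of combining two data sets on a linking variable, written in Python for illustration (the course itself used R). The helper names and the row layout (lists of dictionaries) are hypothetical. The idea is to normalize the key before joining and to report what did not match, so a bad merge is noticed, not hidden.

```python
def normalize_key(raw):
    """Put a linking variable into one canonical form (trimmed, upper case)."""
    return raw.strip().upper()


def merge_on_key(left, right, key):
    """Left-join two lists of dict rows on `key`.

    Returns (merged rows, unmatched left rows, duplicated right keys).
    """
    lookup = {}
    duplicates = set()
    for row in right:
        k = normalize_key(row[key])
        if k in lookup:
            duplicates.add(k)      # same key twice: the join is ambiguous
        lookup[k] = row            # the last row seen wins
    merged, unmatched = [], []
    for row in left:
        k = normalize_key(row[key])
        if k in lookup:
            combined = {**lookup[k], **row}
            combined[key] = k      # store the normalized key
            merged.append(combined)
        else:
            unmatched.append(row)  # keep for inspection, do not drop silently
    return merged, unmatched, sorted(duplicates)
```

Checking that the unmatched and duplicate lists are empty (or explainable) is one simple way to confirm the data sets were combined properly.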

Data Visualizations
- Look at published visualizations
- Discuss ways to improve published visualizations
- Specific visualizations created:
  - Stream graphs
  - Tree graphs
  - Maps

Big Data
- Described “big data”
  - Volume
  - Velocity
  - Variety
- Discussed computer science issues
  - MapReduce
  - Hadoop
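To make the MapReduce idea concrete, here is a single-machine sketch in Python of the classic word-count example. It is not Hadoop and does not reflect how the course presented the topic; the function names are illustrative. It only shows the three conceptual steps: map each input to key/value pairs, group the pairs by key, and reduce each group to one result.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of strings."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

In a real cluster the map and reduce calls would run on different machines; because each call depends only on its own input, the work can be split up that way.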

Projects
- 3 Projects
  - Chapter 2 of Data Science in R: A Case Studies Approach to Computational Reasoning and Problem Solving by Deborah Nolan and Duncan Temple Lang
  - Twitter project
  - Group project

Project 1
- Introduce students to R
- 10 years of data from the Cherry Blossom Road Race in DC
- Lots of data cleaning
- Introduced some visualization issues with larger data sets
- Introduced the idea of smoothing
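The slides do not say which smoother was used. As a stand-in, a centered moving average is about the simplest way to illustrate the idea of smoothing: each point is replaced by the mean of its neighbors. The Python sketch below is an assumption-laden illustration, not the method from the course or the textbook chapter.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the two ends.

    `window` must be a positive odd integer so it can be centered.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

A wider window gives a smoother curve but hides more local detail, which is the trade-off every smoother has to manage.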

Project 1
- Done in pairs
  - Deliberately formed with one “stat” and one “cs” student
- In class work following the steps given for the men’s data
- Written report due for women’s data
  - Includes both code and statistical report

Project 2
- Download public tweets
- Filter for a query term
- Assign a sentiment score
- Aggregate tweets by state
- Produce geographic visualization of data
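The middle three steps (filter, score, aggregate) can be sketched as follows. This is a Python illustration only; the course used R, and the tweet layout (a dictionary with "text" and "state" fields), the tiny word list, and the function names are all hypothetical. Downloading tweets and drawing the map are left out.

```python
from collections import defaultdict

# Hypothetical toy lexicon; a real project would use a published word list.
LEXICON = {"good": 1, "great": 2, "bad": -1, "awful": -2}


def sentiment(text):
    """Sum the lexicon scores of the words in a tweet (unknown words score 0)."""
    words = (w.strip(".,!?") for w in text.lower().split())
    return sum(LEXICON.get(w, 0) for w in words)


def mean_sentiment_by_state(tweets, query):
    """Filter tweets containing `query`, score them, and average per state."""
    scores = defaultdict(list)
    for tweet in tweets:
        if query.lower() in tweet["text"].lower():
            scores[tweet["state"]].append(sentiment(tweet["text"]))
    return {state: sum(v) / len(v) for state, v in scores.items()}
```

The resulting state-to-score mapping is the kind of table a choropleth map would then color in.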

Project 2
- Again done in cs/stat pairs
- Final report
  - Required an extension of the basic lab
  - Required both code and statistical report

Project 3
- Term-long 4-person group project
- First week
  - Individual brainstorming about topics
  - Friday morning: elevator pitches
- Second week
  - Teams find data and refine goals
  - Friday morning: check-in report from all teams, class feedback provided

Project 3
- Third week
  - Lab time devoted to project
  - Finish data cleaning and do much of the analysis
  - Friday morning: check-in report from all teams, class feedback provided

Project 3
- Last 3 days of class
  - Finishing touches on the analysis
  - Create project website
  - Final presentation to both class and other visitors

Examples of group projects

Lessons Learned
- Slower introduction to R
- Small individual assignments as we go
- More faculty input for statistical analysis of group projects

For more information
Ann Cannon
Department of Math and Stat
Cornell College
600 First St SW
Mt. Vernon, IA
(319)