Introduction to Data Programming CSE 160 University of Washington Spring 2015 Ruth Anderson 1 Slides based on previous versions by Michael Ernst and earlier.

Slides:



Advertisements
Similar presentations
Draft Online Course Template Development Nnannah C. James
Advertisements

John Hurley Cal State LA
Helping your student with homework
Lecture 0 CSIS10A Overview. Welcome to CSIS10A (5 mins) – Typical format for class meetings New material first (monitors off, notebooks out) Practice.
CSE Spring 2015 INTERMEDIATE PROGRAMMING
Introduction to programming with Visual Basic.NET Dr. Marty Sirkin.
Introduction to Computer Programming I CSE 113
Welcome Data Imports Instant Imports & How to Create an Import File Ryan McIntire Digital Measures.
James Tam Introduction To CPSC 231 James Tam Administrative (James Tam) Contact Information -Office: ICT 707 -
The Information School of the University of Washington University of Washington1 Introduction INFO/CSE 100, Spring 2005.
CS 201: Introduction To Programming With Java
CS 410 Applied Algorithms Applied Algorithms Lecture #1 Introduction, class information, first problems.
CSC 171 – FALL 2004 COMPUTER PROGRAMMING LECTURE 0 ADMINISTRATION.
A-1 © 2000 UW CSE University of Washington Computer Programming I Lecture 1: Overview and Welcome Dr. Martin Dickey University of Washington.
The Information School of the University of Washington Sept University of Washington1 Introduction INFO/CSE 100, Fall 2006 Fluency.
COMPUTER SCIENCE 10: INTRODUCTION TO COMPUTER SCIENCE Dr. Natalie Linnell with credit to Cay Horstmann and Marty Stepp.
CS 115 TA Orientation Fall More students! Enrollment up to sections + night about 22% CS majors (50 on 8/16)
1 COMPUTER SCIENCE 1026A COMPUTER SCIENCE FUNDAMENTALS Topic 1 Introduction to Computer Science and Programming Notes adapted from Introduction to Computing.
WEEK 1 CS 361: ADVANCED DATA STRUCTURES AND ALGORITHMS Dong Si Dept. of Computer Science 1.
CSE 1111 Week 1 CSE 1111 Introduction to Computer Science and Engineering.
PLEASE GRAB A SEAT ANYWHERE FOR NOW. Welcome to the CMSC 201 Class!!! Mr. Lupoli ITE 207.
CSE 501N Fall ‘09 00: Introduction 27 August 2009 Nick Leidenfrost.
COMP Introduction to Programming Yi Hong May 13, 2015.
Lecture 1 Page 1 CS 111 Summer 2015 Introduction CS 111 Operating System Principles.
EECE 310 Software Engineering Lecture 0: Course Orientation.
Introduction to Information Systems and Technology MIS 213, Spring 2015 CIS 2005, CIS 1007.
1 Agenda Administration Background Our first C program Working environment Exercise Memory and Variables.
CSE 140 wrapup Michael Ernst CSE 140 University of Washington.
Prof. Matthew Hertz WTC 207D /
Computer Science 10: Introduction to Computer Science Dr. Natalie Linnell with credit to Cay Horstmann and Marty Stepp.
Introduction to EGR115 1.Welcome! 2.Your instructors 3.Class format 4.Requirements 5.Topics 6.Grading 7.Help 1.
Course Overview SYS 7340 Logistics Systems Engineering.
CSE 403 Software Engineering Richard Anderson, David Notkin, Valentin Razmov Spring 2005.
Introduction to CSE 331 Software Design & Implementation Winter 2011
+ Introduction to Class IST210 Class Lecture. + Course Objectives Understand the importance of data, databases, and database management Design and implement.
Lecture 1 Page 1 CS 111 Summer 2013 Introduction CS 111 Operating System Principles Peter Reiher.
Principles of Computer Science I Honors Section Note Set 1 CSE 1341 – H 1.
IOS110 Fall 2007 Getting Acquainted & Course Expectations.
Ministry of Higher Education Sohar College of Applied Sciences IT department Comp Introduction to Programming Using C++ Fall, 2011.
Institute for Personal Robots in Education (IPRE)‏ CSC 170 Computing: Science and Creativity.
CSE 1105 Week 1 CSE 1105 Course Title: Introduction to Computer Science & Engineering Classroom Lecture Times: Section 001 W 4:00 – 4:50, 202 NH Section.
INTRODUCTION TO PROGRAMMING ISMAIL ABUMUHFOUZ | CS 146.
Introduction to Python Lesson 1 First Program. Learning Outcomes In this lesson the student will: 1.Learn some important facts about PC’s 2.Learn how.
CSE 1105 Week 1 CSE 1105 Introduction to Computer Science & Engineering Time: Wed 4:00 – 4:50 Thurs 9:30 – 10:20 Thurs 4:00 – 4:50 Place: 100 Nedderman.
CSE 190p wrapup Michael Ernst CSE 190p University of Washington.
Welcome to Back to School Night Mrs. Wiltz 8 th Grade Science.
CSE 160 Wrap-Up CSE 160 Spring 2015 University of Washington 1.
1 CS 101 Today’s class will begin about 5 minutes late We will discuss the lab scheduling problems once class starts.
FROM SAGE ON THE STAGE TO GUIDE ON THE SIDE HELEN ADAMS BOPMA CONFERENCE 20 NOVEMBER 2015 Using BYOD to enhance student achievement.
James Tam Introduction To CPSC 233 James Tam Java Object-Orientation Graphical-user interfaces.
Dr. Sajib Datta CSE Spring 2016 INTERMEDIATE PROGRAMMING.
Dr. Sajib Datta Jan 15,  Instructor: Sajib Datta ◦ Office Location: ERB 336 ◦ Address: ◦ Web Site:
The Information School of the University of Washington Information System Design Info-440 Autumn 2002.
1 Computer Science 1021 Programming in Java Geoff Draper University of Utah.
CSC 241: Introduction to Computer Science I
CS101 Computer Programming I
Introduction to Data Programming
University of Washington Computer Programming I
Week 1 Gates Introduction to Information Technology cosc 010 Week 1 Gates
Introduction to Data Programming
Introduction to Data Programming
Introduction to Data Programming
Accelerated Introduction to Computer Science
Lecture 1a- Introduction
CSE 143: Computer Programming II
Introduction INFO/CSE 100, Spring 2006
Course Introduction Data Visualization & Exploration – COMPSCI 590
CS201 – Course Expectations
CSC 241: Introduction to Computer Science I
Presentation transcript:

Introduction to Data Programming CSE 160 University of Washington Spring 2015 Ruth Anderson 1 Slides based on previous versions by Michael Ernst and earlier versions by Bill Howe

Welcome to CSE 160! CSE 160 teaches core programming concepts with an emphasis on real data manipulation tasks from science, engineering, and business Goal by the end of the quarter: Given a data source and a problem description, you can independently write a complete, useful program to solve the problem 2

Course staff Lecturer: – Ruth Anderson TAs: – Lee Organick – Trevor Perrier – Nicholas Shahan Ask us for help! 3

Learning Objectives Computational problem-solving – Writing a program will become your “go-to” solution for data analysis tasks Basic Python proficiency – Including experience with relevant libraries for data manipulation, scientific computing, and visualization. Experience working with real datasets – astronomy, biology, linguistics, oceanography, open government, social networks, and more. – You will see that these are easy to process with a program, and that doing so yields insight. 4

What this course is not A “skills course” in Python – …though you will become proficient in the basics of the Python programming language – …and you will gain experience with some important Python libraries A data analysis / “data science” / data visualization course – There will be very little statistics knowledge assumed or taught A “project” course – the assignments are “real,” but are intended to teach specific programming concepts A “big data” course – Datasets will all fit comfortably in memory – No parallel programming 5

“It’s a great time to be a data geek.” -- Roger Barga, Microsoft Research 6 “The greatest minds of my generation are trying to figure out how to make people click on ads” -- Jeff Hammerbacher, co-founder, Cloudera

7 All of science is reducing to computational data manipulation Old model: “ Query the world ” (Data acquisition coupled to a specific hypothesis) New model: “ Download the world ” (Data acquisition supports many hypotheses) – Astronomy: High-resolution, high-frequency sky surveys (SDSS, LSST, PanSTARRS) – Biology: lab automation, high-throughput sequencing, – Oceanography: high-resolution models, cheap sensors, satellites 40TB / 2 nights ~1TB / day 100s of devices Slide from Bill Howe, eScience Institute

Example: Assessing treatment efficacy Zip code of clinic Zip code of patient number of follow ups within 16 weeks after treatment enrollment. Question: Does the distance between the patient’s home and clinic influence the number of follow ups, and therefore treatment efficacy? 8

Python program to assess treatment efficacy # This program reads an Excel spreadsheet whose penultimate # and antepenultimate columns are zip codes. # It adds a new last column for the distance between those zip # codes, and outputs in CSV (comma-separated values) format. # Call the program with two numeric values: the first and last # row to include. # The output contains the column headers and those rows. # Libraries to use import random import sys import xlrd # library for working with Excel spreadsheets import time from gdapi import GoogleDirections # No key needed if few queries gd = GoogleDirections('dummy-Google-key') wb = xlrd.open_workbook('mhip_zip_eScience_121611a.xls') sheet = wb.sheet_by_index(0) # User input: first row to process, first row not to process first_row = max(int(sys.argv[1]), 2) row_limit = min(int(sys.argv[2]+1), sheet.nrows) def comma_separated(lst): return ",".join([str(s) for s in lst]) headers = sheet.row_values(0) + ["distance"] print comma_separated(headers) for rownum in range(first_row,row_limit): row = sheet.row_values(rownum) (zip1, zip2) = row[-3:-1] if zip1 and zip2: # Clean the data zip1 = str(int(zip1)) zip2 = str(int(zip2)) row[-3:-1] = [zip1, zip2] # Compute the distance via Google Maps try: distance = gd.query(zip1,zip2).distance except: print >> sys.stderr, "Error computing distance:", zip1, zip2 distance = "" # Print the row with the distance print comma_separated(row + [distance]) # Avoid too many Google queries in rapid succession time.sleep(random.random()+0.5) 23 lines of executable code! 9

Course logistics Website: See the website for all administrative details Read the handouts and required texts, before the lecture – There is a brief reading quiz due before each lecture Take notes! Homework 1 part 1 is due Wednesday – As is a survey (and a reading quiz before lecture) You get 5 late days throughout the quarter – No assignment may be submitted more than 3 days late. (contact the instructor if you are hospitalized) If you want to join the class, sign sheet at front of class, from 10

Academic Integrity Honest work is required of a scientist or engineer Collaboration policy on the course web. Read it! – Discussion is permitted – Carrying materials from discussion is not permitted – Everything you turn in must be your own work Cite your sources, explain any unconventional action – You may not view others’ work – If you have a question, ask I trust you completely I have no sympathy for trust violations – nor should you 11

How to succeed No prerequisites Non-predictors for success: – Past programming experience – Enthusiasm for games or computers Programming and data analysis are challenging Every one of you can succeed – There is no such thing as a “born programmer” – Work hard – Follow directions – Be methodical – Think before you act – Try on your own, then ask for help – Start early 12

Me (Ruth Anderson) Grad Student at UW: in Programming Languages, Compilers, Parallel Computing Taught Computer Science at the University of Virginia for 5 years PhD at UW: in Educational Technology, Pen Computing Current Research: Computing and the Developing World, Computer Science Education

14 Introductions Name address Major Year (1,2,3,4,5) Hometown Interesting Fact or what I did over break.