Data Science – ITEC/CSCI/ERTH-4350/6350
Definitions, History of Data (and Information), Data Science, Current Challenges
Peter Fox
Module 1, September 5, 2017

Admin info (keep/print this slide)
Class: ITWS/CSCI/ERTH-4350/6350
Instructor: Peter Fox
Instructor contact: pfox@cs.rpi.edu
Contact hours: Monday 3:00-4:00pm (or by appt)
Contact location: Winslow 2120 / Lally 207A
TA: Eliyah Afzal - afzale@rpi.edu
Web site: http://tw.rpi.edu/web/courses/DataScience/2017
This course is in “Blended” format (low residency):
Online/asynchronous: LMS (http://lms.rpi.edu/ 1709_Data Science [1709_ITWS_CSCI_ERTH_6350_4350_00])
Adobe Connect: http://connect.mms.rpi.edu/itws6350
Synchronous hours: 3pm-5:50pm Tuesday
Synchronous location: Lally 102, Troy campus

Contents
Please review slides ~3 – 73 from
https://lms9.rpi.edu:8443/bbcswebdav/pid-140067-dt-content-rid-678604_1/xid-678604_1 (login required)
or http://tw.rpi.edu/media/latest/DataScience2016_module1_Intro.ppt (no login required)
AND watch/listen to
https://lms9.rpi.edu:8443/webapps/blackboard/content/contentWrapper.jsp?content_id=_140067_1&displayName=Linked+File&navItem=content&attachment=true&course_id=_1425_1&tab_group=courses&href=https%3A%2F%2Fconnect.mms.rpi.edu%2Fp7yeu50qigh%2F
or https://connect.mms.rpi.edu/p7yeu50qigh/ (Adobe Connect login as guest)
Ignore slides that are specific to the 2016 course

Current Syllabus/Schedule
Module 1: History of Data and Information, Data, Information, Knowledge Concepts and State-of-the-Art
Module 2: Data and information acquisition (curation, preservation) and metadata - management
Module 3: Data formats, metadata standards, conventions, reading and writing data and information
Module 4: Data Analysis I
Module 5: Class exercise - collecting data - individual
Module 6: Class Presentations: present your data (3-4 groups)
Note: no classes on the Tuesday after Columbus Day (Tuesday follows the Monday schedule)
Module 7: Data Analysis II and Class exercise - group project - working with someone else's data
Module 8: Introduction to Data Mining for Data Science, Data Quality, Uncertainty and Bias
Module 9: Data as Service Paradigms and Data Infrastructures
Module 10: Data Workflow Management, Preservation and Data Stewardship
Module 11: Academic basis for Data Science, Data Models, Schema, Markup Languages
Module 12: Webs of Data and Data on the Web, the Deep Web, Data Discovery, Data Integration, Data Citation
Module 13: Final Project Presentations

Questions so far?
There is a discussion forum on LMS for Module 1. Please post questions there and watch for answers.

Introductions
Listen to the video recording for 2016 to hear about last year's students
Blog posts on LMS under “Introductions”: enter your responses (text or recorded voice/video), just a few minutes
Who you are, background?
Why you are here?
What you expect to learn?

Next
Start at slide 13 in the 2016 recording and slide deck, and continue through to the end
Complete readings for week 1