Intro to Data Science: how it fits in our course
A Society that is “Always On”
Society, organizations, and people are “Always On”. Your “data plan” keeps you (always) in touch.
Data are collected about anything, at any time, and at any place:
• Register for courses
• Check/post on social media
• Pay tolls with your Peach Pass
• Track your exercise with Fitbit
Tell Me about Your Data Plan
Your data plan may be 4 GB a month. How big is a gigabyte?
• Bit (0 or 1) and byte (8 bits: big enough for one character)
• Kilobyte ≈ 1000 bytes (exactly 1024 in the binary convention)
• Mega-, giga-, and tera- are common now
• Peta-, exa-, and zetta- are the next prefixes up
An IDC study estimates that the amount of digital information stored in 2014 already exceeded 4 zettabytes and predicts that the “digital universe” will grow to 44 zettabytes in 2020. The study characterizes 44 zettabytes as “6.6 stacks of iPads from Earth to the Moon”.
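To get a feel for these prefixes, the scale can be worked out directly. The sketch below assumes the binary convention (each prefix is a factor of 1024) and the 4 GB monthly plan from the example above; `UNITS` and `to_bytes` are illustrative names, not part of any library.

```python
# Byte-size prefixes in ascending order. Under the binary convention each
# step is a factor of 1024 (the decimal convention would use 1000 instead).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def to_bytes(amount, unit, base=1024):
    """Convert an amount expressed in `unit` to a number of bytes."""
    return amount * base ** UNITS.index(unit)


plan_bytes = to_bytes(4, "GB")       # one month of a 4 GB data plan
universe_bytes = to_bytes(44, "ZB")  # the 44 ZB "digital universe" estimate

print(f"4 GB  = {plan_bytes:,} bytes")
print(f"44 ZB = {universe_bytes:,} bytes")
print(f"44 ZB is about {universe_bytes / plan_bytes:.2e} monthly 4 GB plans")
```

Passing `base=1000` gives the decimal convention instead, which is why a "kilobyte" is sometimes quoted as 1000 bytes and sometimes as 1024.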
A Big Challenge …
One of the main challenges of today’s organizations is to extract information and value from data stored in their information systems.
What is information? How is it different from data?
• Data are the facts of the world.
• Information is context- and format-specific, and may need to be processed from raw data.
Example: sales amount & transaction time
• Sales associates: check discount applicability; accurate to the second
• Top executives: review annual sales total; accurate to the year
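The sales example can be sketched in code: the same raw transactions yield different information depending on who is asking. The records, the promotion window, and the function names below are all hypothetical, invented purely for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Raw data: (timestamp, amount) facts. These records are made up for illustration.
transactions = [
    (datetime(2019, 11, 29, 8, 59, 58), 120.00),
    (datetime(2019, 11, 29, 9, 0, 3), 75.50),
    (datetime(2020, 1, 15, 14, 30, 0), 200.00),
]

# Hypothetical promotion window: here precision to the second matters.
PROMO_START = datetime(2019, 11, 29, 9, 0, 0)
PROMO_END = datetime(2019, 11, 29, 12, 0, 0)


def discount_applies(timestamp):
    """Sales associate's view: did the sale fall inside the promotion window?"""
    return PROMO_START <= timestamp < PROMO_END


def annual_totals(records):
    """Executive's view: total sales per year; only the year of each sale is used."""
    totals = defaultdict(float)
    for timestamp, amount in records:
        totals[timestamp.year] += amount
    return dict(totals)


for timestamp, amount in transactions:
    status = "discount" if discount_applies(timestamp) else "no discount"
    print(timestamp.isoformat(), amount, status)

print(annual_totals(transactions))
```

The first two sales are five seconds apart, yet only the second qualifies for the discount, while in the annual totals both simply count toward the same year.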
Internet of Events
A Travel by Train Episode
Data Science: definition
Data Science: illustration
Data science aims to turn data into real value…
[Figure: data is turned into value through extraction, transformation, and learning. Data labels: structured (DB; spreadsheet) or unstructured (email; text); streaming or static; small or big. Output label: any type of visualization delivering insights.]
Data Scientists
Assist organizations in turning data into value. A data scientist answers questions like:
• (Reporting) What happened?
• (Diagnosis) Why did it happen?
• (Prediction) What will happen?
• (Recommendation) What is the best that can happen?
Contributing Disciplines
In this course, we’ll explore the areas marked in the figure: spreadsheets, databases, programming, and DM/ML methods.