Data Science with Python

Presentation on theme: "Data Science with Python"— Presentation transcript:

1 Data Science with Python
University of Cincinnati Philip Bohun

2 Day 1
Introduction
Overview of Python
Data Munging/Wrangling
Data Visualization
Regression

3 Introduction
Goals:
Set up and run a Python environment
Understand strengths and weaknesses of Python
Do basic data wrangling/manipulation with Python
Use packages for data analysis

4 Data Science
Is it A or B? → classification
Is this weird? → anomaly detection
How much/many? → regression
How is this organized? → clustering
What should I do next? → reinforcement learning

5 Python Overview
Environment setup: Python interpreter and text editor
Python is an interpreted language
Whitespace is important
[in class exercise]
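To illustrate why whitespace matters, here is a minimal sketch in which indentation alone decides which statements belong to the function and to the `if` block. The function name `describe` is hypothetical and not taken from the slides.

```python
def describe(n):
    # The indented lines below form the function body.
    if n % 2 == 0:
        # Indented one level deeper: runs only when the condition is true.
        return "even"
    # Back at the function's indentation level: reached only for odd numbers.
    return "odd"


print(describe(4))
print(describe(7))
```

Changing the indentation of the last `return` would change the program's meaning, which is why a consistent indent (commonly four spaces) is used.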

6 Variables, Functions, Modules
Variable: named object that can change value
Literal: an unchangeable string or number
Function: reusable block of code
Module: reusable group of functions and variables
[in class exercise]
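The four terms above can be seen together in a short sketch. The names `radius` and `circle_area` are made up for illustration; `math` is a standard-library module.

```python
import math  # module: a reusable group of functions and variables

radius = 2.0  # variable `radius` bound to the literal 2.0
radius = 3.0  # the same name can later refer to a different value


def circle_area(r):
    """Function: a reusable block of code."""
    return math.pi * r ** 2


area = circle_area(radius)
print(area)
```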

7 Control Flow
Control flow allows us to make decisions in code
Types of control flow:
Sequence
Selection
Iteration
[in class exercise]
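A minimal sketch showing the three types of control flow in one place. The temperature readings and thresholds are hypothetical values chosen for illustration.

```python
temperatures = [18, 25, 31]  # hypothetical readings

labels = []
for t in temperatures:       # iteration: repeat for every item
    if t >= 30:              # selection: choose one branch
        label = "hot"
    elif t >= 20:
        label = "warm"
    else:
        label = "cool"
    labels.append(label)     # sequence: statements run top to bottom

print(labels)
```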

8 File I/O
Files can be opened and closed
Files can be text or binary
[in class exercise]
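As a rough sketch of both points, the example below writes and reads one text file and one binary file. The file names are hypothetical, and a temporary directory is used so nothing is left on disk. The `with` statement closes each file automatically.

```python
import os
import tempfile

# Work in a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "notes.txt")  # hypothetical file name
    bin_path = os.path.join(tmp, "data.bin")    # hypothetical file name

    # Text mode: write and read str values.
    with open(text_path, "w", encoding="utf-8") as f:
        f.write("first line\nsecond line\n")
    with open(text_path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()

    # Binary mode: write and read raw bytes.
    with open(bin_path, "wb") as f:
        f.write(bytes([0, 1, 2, 255]))
    with open(bin_path, "rb") as f:
        raw = f.read()

print(lines)
print(list(raw))
```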

9 Simple Database Interaction
Python has a built-in library to create and interact with SQLite databases
Great for small and local projects
[in class exercise]
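The built-in library is the standard `sqlite3` module. A minimal sketch follows, assuming an in-memory database and a made-up `students` table; none of the table or column names come from the slides.

```python
import sqlite3

# ":memory:" creates a throwaway database that lives only in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, score REAL)")

# Use ? placeholders rather than string formatting to pass values safely.
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Ada", 91.5), ("Grace", 88.0), ("Alan", 79.0)],
)
conn.commit()

rows = conn.execute(
    "SELECT name, score FROM students WHERE score > ? ORDER BY score DESC",
    (80,),
).fetchall()
conn.close()

print(rows)
```

Passing a file path instead of `":memory:"` would store the database in a single local file, which suits small projects.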

10 Data Types and Data Structures
Major data types in Python are:
Numeric
Sequences
Sets
Mappings
[in class exercise]
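One built-in example of each category is sketched below. The variable names and values are arbitrary.

```python
count = 3                      # numeric: int
ratio = 0.75                   # numeric: float
names = ["a", "b", "a"]        # sequence: list (mutable, ordered)
point = (1, 2)                 # sequence: tuple (immutable, ordered)
unique = set(names)            # set: unordered, duplicates removed
ages = {"ann": 31, "bob": 27}  # mapping: dict of key -> value

print(type(count).__name__, type(ratio).__name__)
print(len(names), len(unique))
print(ages["bob"])
```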

11 Data Munging/Wrangling
Dates and times are complicated
Dealing with empty values (null, NaN, etc.)
Handling strings and unstructured data
[in class exercise]
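A small sketch of the three topics using only the standard library. The rows, field names, and the helper `parse_value` are invented for illustration; a real project might use a package such as pandas instead.

```python
from datetime import datetime
import math

# Hypothetical raw records, as they might arrive from a CSV file.
raw_rows = [
    {"date": "2024-01-15", "value": "3.5", "city": "  cincinnati "},
    {"date": "2024-02-01", "value": "", "city": "DAYTON"},
    {"date": "2024-03-09", "value": "nan", "city": "Columbus"},
]


def parse_value(text):
    """Return a float, or None when the field is empty or NaN."""
    if text.strip() == "":
        return None
    number = float(text)
    return None if math.isnan(number) else number


clean_rows = []
for row in raw_rows:
    clean_rows.append({
        # Dates: parse the text into a real date object.
        "date": datetime.strptime(row["date"], "%Y-%m-%d").date(),
        # Empty values: map "" and NaN to None.
        "value": parse_value(row["value"]),
        # Strings: trim whitespace and normalize capitalization.
        "city": row["city"].strip().title(),
    })

# Keep only the values that are actually present.
present = [r["value"] for r in clean_rows if r["value"] is not None]
print(clean_rows)
print(present)
```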

12 Data Visualization
Visualizing data can be important for exploratory data analysis (EDA) as well as for communicating results
There are tools specifically designed for data visualization
We will cover the basics
[in class exercise]

13 Regression
Regressions allow us to answer the questions "how much?" and "how many?"
There are many types of regressions
Let’s look at some simple regressions
[in class exercise]
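As one example of a simple regression, the sketch below fits a straight line by ordinary least squares in plain Python. The data (hours studied versus test score) is made up, and `fit_line` is a hypothetical helper, not something named in the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


hours = [1, 2, 3, 4, 5]         # made-up data
scores = [52, 54, 61, 64, 69]   # made-up data

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # "how much?" for 6 hours of study

print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```

For this data the fitted slope is about 4.4 points per hour, so the model answers "how much?" with a number rather than a category.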

14 Day 2
Program Design
Modules
Machine Learning

15 Program Design
Problem decomposition is one of the most important parts of program design
Rule of thumb: functions should be fewer than 30 lines
A function should do one thing well
[in class exercise]
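A sketch of problem decomposition: counting word frequencies is split into three short functions that each do one thing, plus one function that composes them. All function names are hypothetical.

```python
def tokenize(text):
    """Split text into lowercase whitespace-separated tokens."""
    return text.lower().split()


def strip_punctuation(words):
    """Remove common punctuation from both ends of each token."""
    return [w.strip(".,!?;:") for w in words]


def count_words(words):
    """Count how often each non-empty word occurs."""
    counts = {}
    for w in words:
        if w:
            counts[w] = counts.get(w, 0) + 1
    return counts


def word_frequencies(text):
    """Compose the small steps into the full task."""
    return count_words(strip_punctuation(tokenize(text)))


freqs = word_frequencies("Data, data everywhere. Clean data!")
print(freqs)
```

Because each step is its own function, each can be tested and replaced independently.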

16 Algorithm Basics
Algorithms determine how much work your programs do
If not using a library function for something, always search for the best algorithm
Don’t start with code: solve the problem first, then write the code
[in class example]
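To make "how much work" concrete, the sketch below counts the comparisons made by two search algorithms on the same sorted list. The step counters are added purely for illustration, and the function names are hypothetical.

```python
def linear_search(items, target):
    """Check items one by one; returns (index, comparisons)."""
    steps = 0
    for i, item in enumerate(items):
        steps += 1
        if item == target:
            return i, steps
    return -1, steps


def binary_search(items, target):
    """Halve the search range each step; requires sorted items."""
    lo, hi = 0, len(items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid, steps
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps


data = list(range(1_000_000))  # already sorted
target = 999_999               # worst case for linear search

linear_result = linear_search(data, target)
binary_result = binary_search(data, target)
print("linear:", linear_result)
print("binary:", binary_result)
```

Linear search needs a million comparisons here, while binary search needs at most about twenty. In practice the standard-library `bisect` module already provides binary search on sorted lists.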

17 Modules
Code that is reusable and covers a single topic should be gathered into a module
Mixing concerns in software leads to unnecessary complexity, making software difficult to test and debug
[in class exercise]

18 Machine Learning
Machine learning solves problems of optimization
It is very useful for searching very large possibility spaces
It is also useful when approximate answers are good enough
[in class lab]
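One way to see learning as optimization is gradient descent on a one-parameter model. This is a minimal sketch under stated assumptions: the data is made up, the model is y = w * x, and the learning rate and iteration count are arbitrary choices, not values from the course.

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # made-up data, roughly y = 2x


def mean_squared_error(w):
    """The quantity being minimized: average squared prediction error."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


w = 0.0               # initial guess
learning_rate = 0.01  # assumed step size
for _ in range(500):
    # Step downhill: move w against the gradient of the error.
    w -= learning_rate * gradient(w)

print(round(w, 4), round(mean_squared_error(w), 4))
```

The loop never tries every possible value of `w`; it repeatedly improves an approximate answer, which is the same idea that larger models apply over much bigger parameter spaces.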

19 Machine Learning [ FULL PROJECT ]

