Transforming Data (Python®)


Transforming Data (Python®) Computer Science and Software Engineering © 2014 Project Lead The Way, Inc.

Transforming Data: Why?

Examples:

    Change representation
        ['$1', '$2', '$3', …] → [1, 2, 3, …]
    Arithmetic to convert units or calculate a result
        [1, 2, 3, …] → [1.05, 2.10, 3.15, …]

What if you want to increase all your data by 5% to account for inflation? How would you apply some arithmetic or other operation to each data value? In Python, the for loop comes to mind. A column of formulas in Excel® sounds appropriate too. This presentation will show three other ways to transform a list of data using Python.
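The "change representation" example above can be sketched with a plain for loop. This is a minimal illustration, assuming every string is a dollar sign followed by a whole number; the variable names are hypothetical and not from the slides.

```python
# Hypothetical sample data in the '$N' format shown on the slide.
old_data = ['$1', '$2', '$3']

new_data = []
for text in old_data:
    # Strip the leading dollar sign, then convert the remaining digits to an int.
    new_data.append(int(text.lstrip('$')))

print(new_data)  # [1, 2, 3]
```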

Solution 1: for Loop

    old_data = [1, 2, 3, 4, 5]
    new_data = []
    for element in old_data:
        new_data.append(element * 1.05)

    In [ ]: new_data
    Out[ ]: [1.05, 2.10, 3.15, 4.20, 5.25]

To do this in Python with a for loop, either change the elements in place or use an aggregator to create a new list for the results, as shown here.
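The slide shows only the aggregator version. The in-place alternative it mentions might look like the sketch below, which loops over index positions so each element can be overwritten. Rounding to two decimal places is an addition here (not in the slides) to keep floating-point results tidy.

```python
data = [1, 2, 3, 4, 5]

# Loop over index positions so each element can be replaced in the same list.
for i in range(len(data)):
    # round(..., 2) is an added step: 3 * 1.05 is not exactly 3.15 in floating point.
    data[i] = round(data[i] * 1.05, 2)

print(data)  # [1.05, 2.1, 3.15, 4.2, 5.25]
```

Note that Python prints 2.1 and 4.2 rather than 2.10 and 4.20; the trailing zeros on the slides are only for readability.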

Solution 2: Array Operations

    import numpy

    old_data = [1, 2, 3, 4, 5]
    data = numpy.array(old_data)
    new_data = 1.05 * data

    In [ ]: new_data
    Out[ ]: array([1.05, 2.10, 3.15, 4.20, 5.25])

The numpy library provides the "array" data type and defines addition and multiplication for arrays like this. We'll say more about arrays in numpy, and arrays in general, in a few slides.

Solution 3: map(function, list)

map() applies a function to each element in a list.

    old_data = [1, 2, 3, 4, 5]

    def inflate(x):
        return x * 1.05

    new_data = list(map(inflate, old_data))

    In [ ]: new_data
    Out[ ]: [1.05, 2.10, 3.15, 4.20, 5.25]

The Python built-in function map() will do the trick here, too. Here we've defined a new function, inflate(), and used map() to apply it to each element. In Python 3, map() returns an iterator rather than a list, so the result is wrapped in list(); in Python 2, map() returns a list directly.
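A short sketch of how map() behaves under Python 3, assuming the same inflate() function as on the slide. The names mapped, is_list, and leftover are illustrative only.

```python
def inflate(x):
    """Increase a value by 5%."""
    return x * 1.05

old_data = [1, 2, 3, 4, 5]

mapped = map(inflate, old_data)
is_list = isinstance(mapped, list)   # False in Python 3: map() gives an iterator

new_data = list(mapped)              # consuming the iterator builds the list
leftover = list(mapped)              # the iterator is now exhausted, so this is []

print(is_list, len(new_data), leftover)
```

Because the iterator can be consumed only once, converting it to a list right away is the simplest choice when the results are needed more than once.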

Solution 4: Python List Comprehension

    [ <expression> for <element> in <list> ]
    [f(x) for x in list]

    old_data = [1, 2, 3, 4, 5]
    new_data = [x * 1.05 for x in old_data]

    In [ ]: new_data
    Out[ ]: [1.05, 2.10, 3.15, 4.20, 5.25]

For loops and arrays are common across nearly all programming languages. The solution shown here, however, uses a compact syntax that is characteristic of Python. A list comprehension looks like a for loop inside of a list and builds the whole new list at once. Writing the same expression inside parentheses instead of square brackets gives a generator expression, which is lazy: it produces values one at a time as they are requested, which can conserve memory. Those details are not relevant to us here, but they are one of the more powerful aspects of Python.
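To make the bracket-versus-parenthesis distinction concrete, here is a minimal sketch. The names as_list, as_gen, first, and rest are illustrative and not from the slides.

```python
old_data = [1, 2, 3, 4, 5]

# Square brackets: a list comprehension. All five results exist immediately.
as_list = [x * 1.05 for x in old_data]

# Parentheses: a generator expression. Nothing is computed yet.
as_gen = (x * 1.05 for x in old_data)

first = next(as_gen)   # computes only the first value
rest = list(as_gen)    # computes the remaining four values on demand

print(len(as_list), first, len(rest))
```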

Transforming Data Examples

    x → int(x) → max(0, int(x))

How do these calculations connect to the picture that tells a thousand words, the histogram? A normal distribution of sample data is shown in the left plot. These thousand values are floats. Applying the int() function to each value in the data set transforms the data: int() truncates, sending every value toward 0. In the third plot, all negative values have been changed to 0, which is the effect of max(0, int(x)).
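The two transformations can be tried on a handful of values. The five sample floats below are made up for illustration; the slides use a thousand normally distributed values.

```python
# Hypothetical sample values standing in for the normally distributed data.
sample = [-2.7, -0.4, 0.9, 1.5, 2.2]

# int() truncates toward zero, so -2.7 becomes -2 and -0.4 becomes 0.
truncated = [int(x) for x in sample]

# max(0, ...) replaces every negative value with 0 and leaves the rest alone.
clipped = [max(0, int(x)) for x in sample]

print(truncated)  # [-2, 0, 0, 1, 2]
print(clipped)    # [0, 0, 0, 1, 2]
```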

A Bit More About Arrays in Python

list + list concatenates instead of transforming:

    In [ ]: a = [1, 2, 3]
    In [ ]: b = [4, 5, 6]
    In [ ]: a + b
    Out[ ]: [1, 2, 3, 4, 5, 6]

But numpy arrays allow + and *.

array + array:

    In [ ]: import numpy as np
    In [ ]: np.array(a) + np.array(b)
    Out[ ]: array([5, 7, 9])

scalar + array and scalar * array:

    In [ ]: 2*np.array(a) + 100
    Out[ ]: array([102, 104, 106])

The earlier example with numpy arrays multiplied all elements by 1.05. A number like 1.05 is called a scalar. If you multiply an array by a scalar, you get another array. Add a scalar to an array, and you still get an array. So the elements [1, 2, 3] of array a have been doubled and then increased by 100. You can also add two arrays together, item by item: the 5 is the sum of the first elements of a and b, 1 and 4. You can't add lists that way; Python will just concatenate them.
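For comparison, the same item-by-item results can be produced from plain lists with zip() and a list comprehension. This sketch only illustrates what the numpy operators compute; it is not how numpy is implemented.

```python
a = [1, 2, 3]
b = [4, 5, 6]

# The + operator on lists concatenates.
concatenated = a + b

# zip() pairs up matching positions so they can be added item by item.
elementwise_sum = [x + y for x, y in zip(a, b)]

# Scalar operations must be applied to each element explicitly.
scaled = [2 * x + 100 for x in a]

print(concatenated)     # [1, 2, 3, 4, 5, 6]
print(elementwise_sum)  # [5, 7, 9]
print(scaled)           # [102, 104, 106]
```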

A Lot More About Arrays – All Languages

Arrays are faster than lists.
An array has elements of one data type.

Binding table:

    Name   Starting Address   Address Increment   Class   Element Type
    foo    0x32F2             0x0008              array   int

Additional memory:

    Address   Contents
    0x32F2    foo[0]
    0x32FA    foo[1]
    0x3302    foo[2]

The last two slides here are more advanced, but they might help you understand what is going on in the computer at a lower level. Arrays are different from lists. Arrays are faster than lists because they are stored differently. In an array, each element takes up the same number of bits in memory, so the address of the 10th element can easily be calculated. The computer can access any element quickly, using simple arithmetic to calculate the element's address.
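The address arithmetic described above can be written out directly. This is a sketch using the starting address and increment from the binding table; element_address() is a hypothetical helper, not a real Python or numpy function.

```python
# Values taken from the binding table for foo.
START_ADDRESS = 0x32F2
ADDRESS_INCREMENT = 0x0008

def element_address(index):
    """Return the memory address of foo[index] under the table's layout."""
    return START_ADDRESS + index * ADDRESS_INCREMENT

print(hex(element_address(1)))  # 0x32fa, matches foo[1] in the table
print(hex(element_address(2)))  # 0x3302, matches foo[2] in the table
print(hex(element_address(9)))  # 0x333a, the 10th element (index 9)
```

One multiplication and one addition are enough, no matter how far into the array the element sits.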

Lists – All Languages

Lists store an address and type for each element.

Binding table:

    Name   Starting Address   Class
    foo    0x32F2             list

Additional memory:

    Address   Contents: Starting Address   Contents: Element Type
    0x32F2    0x8C32                       int
    0x3332    0xE333                       float
    0x3372    0x9A12                       tuple

    Address   Contents
    0x8C32    foo[0]
    0xE333    foo[1]
    0x9A12    foo[2]

The references are each the same size, e.g., 64 bit (0x0040).

Lists are more flexible since any data type can be stored in any element, but they're slower. Good data skills include thinking about how to write code that will work in reasonable time even when scaled up to terabyte data.
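The flexibility trade-off can be seen in Python itself. The sketch below uses the standard library's array module as a stand-in for a single-type array; that choice is an assumption made for illustration, since the slides use numpy arrays.

```python
from array import array

# A list can hold any mix of types, as in the table above.
mixed = [1, 2.5, (3, 4)]

# An array('i', ...) holds only integers, each stored in the same number of bytes.
ints = array('i', [1, 2, 3])

try:
    ints.append(2.5)      # a float does not fit the declared element type
    rejected = False
except TypeError:
    rejected = True

print([type(v).__name__ for v in mixed])  # ['int', 'float', 'tuple']
print(rejected)                           # True
```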