Programming For Big Data
Darren Redmond
Programming Languages
Python
R
Java
C, C++
Ruby
Why, Why, Why
History of Python
Why Python
The End Game

History of Python
Guido van Rossum, 1989: was bored at Christmas and started writing Python

Why Python
Easy to learn
Powerful data structures
Modular
Embedding
Map Reduce / Lambda / Yield
Interactive shell
http://www.python-course.eu/index.php

The End Game
http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
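The "Map Reduce / Lambda / Yield" item can be illustrated with a small sketch. The word list and the names word_lengths, lengths, and total_length are assumptions made up for this example, and the code is written to run on both Python 2 and Python 3.

```python
from functools import reduce  # also a builtin on Python 2; this import works on 2.6+ and 3


def word_lengths(words):
    """Generator: yield one (word, length) pair at a time."""
    for word in words:
        yield word, len(word)


words = ["big", "data", "python"]  # illustrative input, not from the lecture

# Map step: a lambda turns each word into its length.
lengths = list(map(lambda word: len(word), words))

# Reduce step: fold the list of lengths into a single total.
total_length = reduce(lambda left, right: left + right, lengths)

# Yield: the generator produces its pairs lazily; list() consumes them.
pairs = list(word_lengths(words))

print(lengths)
print(total_length)
print(pairs)
```

The same map-then-reduce shape is what a Hadoop streaming job uses, only spread over many machines.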
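Interactive Interpreter
Start the interpreter from the command prompt:
    python
    >>> print "Hello World"
Easier? The shell echoes the value of any expression:
    >>> "Hello World"
    >>> 12 / 7        # 1 (integer division in Python 2)
    >>> 12.0 / 7      # 1.7142857142857142
    >>> 3 + 2 * 4     # 11
    >>> _             # the most recent value
    >>> _ * 3         # 33
Ctrl-D exits the interpreter (Ctrl-Z then Enter on Windows).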
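The shell session above can also be written as a script. This is a minimal sketch: the variable names are illustrative, and // is used for the integer division so the result is the same on Python 2 and Python 3. A script has no _ variable, so the most recent value is stored explicitly.

```python
whole = 12 // 7            # floor division, gives an integer
exact = 12.0 / 7           # floating point division
precedence = 3 + 2 * 4     # multiplication binds tighter than addition
tripled = precedence * 3   # in the shell this would be _ * 3

print(whole)
print(round(exact, 3))
print(precedence)
print(tripled)
```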
Execute Script
There are multiple ways to execute a script. Below are 4 ways for a script called script-name.py:
From the command prompt, uncompiled:
    python script-name.py
From the Python interpreter (for now, make sure to start python from the directory of the script):
    import py_compile
    py_compile.compile('script-name.py')
Compile a single file from the command line:
    python -m py_compile script-name.py
Compile every file in the current directory from the command line:
    python -m compileall .
Afterwards both the .py and the .pyc files are available.
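The py_compile route can be tried end to end without touching any existing files. The sketch below writes a throwaway script-name.py into a temporary directory and compiles it. The temporary directory and the explicit cfile target are assumptions made so the example is self-contained and the output location is the same on Python 2 and Python 3.

```python
import os
import py_compile
import tempfile

# Create a throwaway script so the example does not depend on existing files.
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "script-name.py")
with open(source_path, "w") as handle:
    handle.write('print("Hello World")\n')

# Compile it; cfile names the .pyc explicitly, doraise turns errors into exceptions.
compiled_path = os.path.join(workdir, "script-name.pyc")
py_compile.compile(source_path, cfile=compiled_path, doraise=True)

compiled_exists = os.path.exists(compiled_path)
print(compiled_exists)
```

Without cfile, Python 2 writes the .pyc next to the source file, while Python 3 places it in a __pycache__ subdirectory.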
Indentation
Scope is achieved through indentation, not brackets.
Variables are created automatically and their type is inferred from the value:
    i = 42
    i = i + 1    # 43
    print id(i)
Types
Numbers: integers, long integers, floating point numbers, complex numbers
Strings: concatenation (+), slicing [2:4]
Input functions: input, raw_input
Casting to list etc.
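The points on this slide can be combined in one short sketch. The function describe and all variable names are hypothetical, chosen only for illustration; the code runs on both Python 2 and Python 3.

```python
def describe(number):
    # The indented lines below form the body of the function.
    if number % 2 == 0:
        # One level deeper: runs only when the test is true.
        return "even"
    return "odd"


i = 42                       # i is created on first assignment
i = i + 1                    # the name i now refers to the integer 43
label = describe(i)

language = "Py" + "thon"     # string concatenation with +
middle = language[2:4]       # slicing: characters at index 2 and 3
letters = list(middle)       # casting a string to a list of characters
number = int("42") + 1       # casting a string to an integer

print(label)
print(language)
print(middle)
print(letters)
print(number)
```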
Conditional
if, elif, else
C has a conditional (ternary) operator:
    max = (a > b) ? a : b;
This is an abbreviation for the following C code:
    if (a > b)
        max = a;
    else
        max = b;
C programmers have to get used to a different notation in Python:
    max = a if (a > b) else b
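Both forms of the conditional can be sketched side by side. The function names compare and larger are assumptions made for this example; larger avoids the name max so the builtin of that name is not shadowed.

```python
def compare(a, b):
    """Full if / elif / else chain."""
    if a > b:
        return "a is larger"
    elif a < b:
        return "b is larger"
    else:
        return "equal"


def larger(a, b):
    """Conditional expression, the Python counterpart of C's ?: operator."""
    return a if a > b else b


print(compare(3, 5))
print(compare(5, 5))
print(larger(3, 5))
```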
Looping
#!/usr/bin/env python
n = 100
sum = 0
i = 1
while i <= n:
    sum = sum + i
    i = i + 1
print "Sum of 1 until %d: %d" % (n, sum)
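The same sum can also be written with a for loop over range, which removes the manual counter. This is a sketch rather than the lecture's own version: the names total and closed_form are assumptions, and the closed form n(n+1)/2 is included only as a cross-check.

```python
n = 100
total = 0
for i in range(1, n + 1):        # range stops before n + 1, so i runs from 1 to n
    total = total + i

closed_form = n * (n + 1) // 2   # Gauss formula, used here to check the loop

print("Sum of 1 until %d: %d" % (n, total))
print(total == closed_form)
```

Using total instead of sum keeps the builtin sum function available.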
Bibliography
Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. Wes McKinney, O'Reilly, 2012
Programming Python, 4th Edition: Powerful Object-Oriented Programming. Mark Lutz, O'Reilly, 2010
Agile Data Science: Building Data Analytics Applications with Hadoop. Russell Jurney, O'Reilly, 2013
Functional Python Programming. Steven Lott, Packt Publishing, 2015
Practice, Practice, Practice
From Lecture 1 you should be able to write a Python script file that does calculations and prints them to the screen.
Write a program to print 'Hello World' to the screen
Write a program to sum the first 100 numbers
Write a program to multiply the first 10 numbers
Summary
Programming Languages for Big Data
Why Python
Hello World
Executing a Script
Indentation
Conditional Programming
Looping
Bibliography
Practice