CSC 458 - Predictive Analytics I, Fall 2017, Intro. to Python
This is an outline. Come to class or attend via RTVC and play along as we experiment with Python to learn how it works. Updated 8/31/2017 for Python 3.x.
Introduction to Python
~parson/DataMine/prepdata1 has two examples. lsTOarff.py is my completed, working code example. psTOarff.py is the starting point for assignment 1. Both of these Python programs extract data from a “raw data” text file and format it as an ARFF (Attribute-Relation File Format) file for use by the Weka data mining tool. In ~parson/DataMine see:
lsTOarff.rawtestdata.txt
lsTOarff.arff.ref
psTOarff.rawtestdata.txt
psTOarff.arff.ref
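To make the raw-text-to-ARFF idea concrete, here is a minimal sketch. It is not the lsTOarff.py code. It assumes a made-up raw format of whitespace-separated "name size" lines, and the function name to_arff, the relation name, and the attribute names are hypothetical. The @relation, @attribute, and @data lines are the standard ARFF header elements.

```python
# Hypothetical sketch: turn whitespace-separated "name size" lines into ARFF text.
# The raw format, relation name, and attribute names are assumptions for illustration.
def to_arff(raw_lines, relation="files"):
    out = ["@relation " + relation,
           "@attribute name string",
           "@attribute size numeric",
           "@data"]
    for line in raw_lines:
        fields = line.split()
        if len(fields) != 2:      # skip lines that do not match the assumed format
            continue
        name, size = fields
        out.append("'%s',%d" % (name, int(size)))
    return "\n".join(out) + "\n"

raw = ["notes.txt 120", "malformed line here", "data.csv 4096"]
arff_text = to_arff(raw)
print(arff_text)
```

Each data row follows the order of the @attribute declarations, which is what lets Weka match values to attributes.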
Python’s read-eval-print UI
You can interact with Python to compute interactively. It can also interpret script files. You create variables on the fly. They hold whatever type of data you put into them.
$ python -V        # Can be version 2.x or 3.x. Use 3.x per course page.
Python 3.6.0
$ python
>>> a = 2 ; b = 4.7 ; (a - 7) * b        # ; and newline are command separators
-23.5
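The point that a variable holds whatever type of data you put into it can be checked with the built-in type() function. A small sketch, with the names x, first, second, and third chosen only for illustration:

```python
x = 2                        # x refers to an int
first = type(x).__name__
x = 4.7                      # the same name now refers to a float
second = type(x).__name__
x = "text"                   # and now to a string
third = type(x).__name__
print(first, second, third)  # int float str
```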
Python uses indentation, not {}, to delimit flow-of-control constructs
>>> a = 7
>>> if a <= 7:
...     print(a, "Is low")
... else:
...     print(a, "is high")
...
7 Is low
# Do NOT mix leading spaces with TABS in assignments.
# Use leading spaces to be compatible with handouts.
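Indentation also decides which statements belong to which block when constructs nest. The following sketch, written as a script file rather than typed at the prompt, uses illustrative thresholds and a hypothetical variable named label:

```python
a = 7
if a <= 7:
    label = "low"
    if a == 7:
        # nested block: indented one level deeper than the outer if body
        label = label + " (at the boundary)"
elif a <= 20:
    label = "medium"
else:
    label = "high"
print(a, "is", label)   # 7 is low (at the boundary)
```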
A for loop iterates over a list of values.
>>> a = 7
>>> mylist = [a, 'a', "Strings use either delimiter"]
>>> for s in mylist:
...     print(s)
...
7
a
Strings use either delimiter
range() creates a sequence of numbers (a list in Python 2.x, a lazy range object in Python 3.x)
>>> for i in range(1,3):
...     print("i is", i)        # Note that the final value is exclusive
...
i is 1
i is 2
>>> for i in range(3,-3,-2):    # -2 here is the increment (step)
...     print("i is", i)
...
i is 3
i is 1
i is -1
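In Python 3.x, range() produces a lazy range object, so wrapping it in list() is a quick way to see every value it generates. A short sketch; the variable names are illustrative:

```python
ascending = list(range(1, 3))        # stop value 3 is exclusive
descending = list(range(3, -3, -2))  # step of -2, stop value -3 is exclusive
from_zero = list(range(4))           # a single argument means start at 0
print(ascending)    # [1, 2]
print(descending)   # [3, 1, -1]
print(from_zero)    # [0, 1, 2, 3]
```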
Use and, or, not instead of &&, ||, ! as used in Java or C++
>>> a = 1 ; b = 5
>>> while (a <= 3) and (b >= 3):
...     print("a, b", a, b)
...     a += 1 ; b = b - 2
...
a, b 1 5
a, b 2 3
>>> print("a, b", a, b)
a, b 3 1
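For readers coming from Java or C++, a side-by-side sketch of the three operators may help. The variable names both, either, and neither are illustrative:

```python
a = 1 ; b = 5
both = (a <= 3) and (b >= 3)      # Java/C++: (a <= 3) && (b >= 3)
either = (a > 3) or (b >= 3)      # Java/C++: (a > 3) || (b >= 3)
neither = not (a <= 3 or b >= 3)  # Java/C++: !(a <= 3 || b >= 3)
print(both, either, neither)      # True True False
```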
Basic data types
Basic data types include strings, ints, floats, and None, which is Python’s “no value” object. Use a raw string to make escape sequences literal.
>>> a = "a string" ; b = 'another string' ; c = -45 ; d = 4.5 ; e = None
>>> print(a,b,c,d,e)
a string another string -45 4.5 None
>>> raws = r'a\n\nraw string'
>>> print(raws)
a\n\nraw string
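Comparing lengths is one way to see what the r prefix changes. In this sketch, plain is an ordinary string added for contrast:

```python
plain = "a\n\nraw string"   # each \n is a single newline character
raws = r"a\n\nraw string"   # each \n stays as two characters: backslash and n
print(len(plain))           # 13
print(len(raws))            # 15
```

Raw strings are handy for regular expression patterns, which tend to contain many backslashes.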
Aggregate data types
A Python list is a sequence of values. A dictionary maps keys to values. We won’t use sets or tuples in assignment 1.
>>> L = ['a', 1, ["b", 2]]
>>> for e in L:
...     print(e)
...
a
1
['b', 2]
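Beyond looping, lists support indexing, appending, and length queries. A brief sketch using the same list; the names first, inner, count, and last are illustrative:

```python
L = ['a', 1, ["b", 2]]
first = L[0]       # indexing starts at 0
inner = L[2][0]    # index into the nested list
L.append(3.5)      # lists grow in place
count = len(L)
last = L[-1]       # negative indexes count from the end
print(first, inner, count, last)   # a b 4 3.5
```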
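A dictionary maps keys to values
>>> m = {'a': 1, "b" : 2} ; m['c'] = 3
>>> for k in m.keys():
...     print(k, m[k])
...
a 1
b 2
c 3
# Python 3.7+ (and CPython 3.6) lists keys in insertion order; older versions may print them in any order.
>>> 'b' in m        # same as 'b' in m.keys()        # Python 2.x allows: m.has_key('b')
True
>>> 'z' in m
False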
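A common use of a dictionary is counting how often each value occurs. This is a minimal sketch with a made-up word list; sorting the keys makes the printed order independent of the Python version:

```python
words = ["ls", "ps", "ls", "cat", "ls"]   # made-up sample data
counts = {}
for w in words:
    if w in counts:
        counts[w] += 1      # key already present: increment its count
    else:
        counts[w] = 1       # first time this key is seen
for k in sorted(counts.keys()):
    print(k, counts[k])
```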
Python has functions and classes
We will not use them in assignment 1.
>>> def f(a, b):
...     return a + b
...
>>> f(1, 3.5)
4.5
>>> f("prefix", 'suffix')
'prefixsuffix'
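For completeness, here is what a small class looks like. This is only a sketch; the class name Counter and its method bump are hypothetical and are not part of any assignment:

```python
class Counter:
    def __init__(self, start=0):
        self.value = start          # each object carries its own value

    def bump(self, amount=1):
        self.value += amount
        return self.value

c = Counter(10)
after_one = c.bump()       # default amount of 1
after_two = c.bump(5)
print(after_one, after_two)   # 11 16
```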
Library modules & more
We will go over the functions from the re (regular expression), sys (system), and datetime modules that assignment 1 uses when we go over the assignment. Come to class. We will explore Python in class in person or via RTVC. https://docs.python.org/3/ has a tutorial and more detailed documentation.
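As a preview, the sketch below touches one feature from each of the three modules. Which functions assignment 1 actually uses is not stated here, so the choice of re.compile, sys.stderr, sys.exit, and datetime.time is an assumption, and the input line and its field layout are made up:

```python
import re
import sys
import datetime

line = "parson   4242  09:15  python lsTOarff.py"   # made-up raw data line

# Raw string pattern: user, numeric id, HH:MM start time, rest of the line.
pattern = re.compile(r"^(\S+)\s+(\d+)\s+(\d\d):(\d\d)\s+(.*)$")
m = pattern.match(line)
if m is None:
    sys.stderr.write("no match: " + line + "\n")
    sys.exit(1)

user = m.group(1)
pid = int(m.group(2))
start = datetime.time(int(m.group(3)), int(m.group(4)))
command = m.group(5)
print(user, pid, start.isoformat(), command)
```

The parenthesized groups in the pattern are what m.group(1) through m.group(5) retrieve.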