Python Programming in Context Chapter 7. Objectives To use Python lists as a means of storing data To implement a nontrivial data mining application To.

Slides:



Advertisements
Similar presentations
Algorithms and applications
Advertisements

Python Mini-Course University of Oklahoma Department of Psychology Day 2 – Lesson 8 Fruitful Functions 05/02/09 Python Mini-Course: Day 2 - Lesson 8 1.
Programming Patterns CSC 161: The Art of Programming Prof. Henry Kautz 9/30/2009.
Adapted from John Zelle’s Book Slides
Python Programming: An Introduction to Computer Science
Vahé Karamian Python Programming CS-110 CHAPTER 2 Writing Simple Programs.
Computer programming Lecture 3. Lecture 3: Outline Program Looping [Kochan – chap.5] –The for Statement –Relational Operators –Nested for Loops –Increment.
Alice in Action with Java Chapter 10 Flow Control in Java.
©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 4 th Ed Chapter Chapter 6 Repetition Statements.
Alice in Action with Java Chapter 10 Flow Control in Java.
Chapter 2 Writing Simple Programs
K-means Clustering. What is clustering? Why would we want to cluster? How would you determine clusters? How can you do this efficiently?
1 Python Chapter 4 Branching statements and loops © Samuel Marateck 2010.
INTRODUCTION TO PYTHON PART 3 - LOOPS AND CONDITIONAL LOGIC CSC482 Introduction to Text Analytics Thomas Tiahrt, MA, PhD.
Computer Science 111 Fundamentals of Programming I Iteration with the for Loop.
Loops and Iteration Chapter 5 Python for Informatics: Exploring Information
COMPE 111 Introduction to Computer Engineering Programming in Python Atılım University
Python – Part 4 Conditionals and Recursion. Modulus Operator Yields the remainder when first operand is divided by the second. >>>remainder=7%3 >>>print.
Python Programming, 2/e1 Python Programming: An Introduction to Computer Science Chapter 2.
Copyright © 2012 Pearson Education, Inc. Publishing as Pearson Addison-Wesley C H A P T E R 6 Value- Returning Functions and Modules.
CS 127 Writing Simple Programs.  Stages involved  Analyze the problem  Understand as much as possible what is trying to be solved  Determine Specifications.
Copyright © 2009 Pearson Education, Inc. Publishing as Pearson Addison-Wesley STARTING OUT WITH Python Python First Edition by Tony Gaddis Chapter 6 Value-Returning.
Python Programming in Context Chapter 2. Objectives To understand how computers can help solve real problems To further explore numeric expressions, variables,
Copyright © 2010 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Starting Out with Programming Logic & Design Second Edition by Tony Gaddis.
Programming for Engineers in Python Sawa 2015 Lecture 2: Lists and Loops 1.
Chapter 4: Decision Making with Control Structures and Statements JavaScript - Introductory.
Computer Science 111 Fundamentals of Programming I The while Loop and Indefinite Loops.
Clustering Methods K- means. K-means Algorithm Assume that K=3 and initially the points are assigned to clusters as follows. C 1 ={x 1,x 2,x 3 }, C 2.
 Expression Tree and Objects 1. Elements of Python  Literals, Strings, Tuples, Lists, …  The order of file reading  The order of execution 2.
Python Programming in Context Chapter 12. Objectives To introduce the concept of inheritance To create a working object-oriented graphics package To provide.
Chapter 1. Objectives To provide examples of computer science in the real world To provide an overview of common problem- solving strategies To introduce.
Python Programming in Context
Python Programming in Context Chapter 13. Objectives To write an event driven program To understand and write callback functions To practice with lists.
Python Programming in Context Chapter 4. Objectives To understand Python lists To use lists as a means of storing data To use dictionaries to store associative.
If..else Use random numbers to compute an approximation of pi Simulation of a special game of darts Randomly place darts on the board pi can be computed.
Python uses boolean variables to evaluate conditions. The boolean values True and False are returned when an expression is compared or evaluated.
Clustering Clustering is a technique for finding similarity groups in data, called clusters. I.e., it groups data instances that are similar to (near)
List comprehensions (and other shortcuts) UW CSE 160 Spring 2015.
Chapter 8: MuPAD Programming I Conditional Control and Loops MATLAB for Scientist and Engineers Using Symbolic Toolbox.
Copyright © 2009 Pearson Education, Inc. Publishing as Pearson Addison-Wesley STARTING OUT WITH Python Python First Edition by Tony Gaddis Chapter 5 Repetition.
1 Standard Version of Starting Out with C++, 4th Brief Edition Chapter 5 Looping.
Chapter 6 Looping Structures. Do…LoopDo…Loop Statement Can operate statements repetitively Do intx=intx + 1 Loop While intx < 10 –The Loop While operates.
2. WRITING SIMPLE PROGRAMS Rocky K. C. Chang September 10, 2015 (Adapted from John Zelle’s slides)
Python Mini-Course University of Oklahoma Department of Psychology Day 3 – Lesson 11 Using strings and sequences 5/02/09 Python Mini-Course: Day 3 – Lesson.
Midterm Exam Topics (Prof. Chang's section) CMSC 201.
Clustering (1) Chapter 7. Outline Introduction Clustering Strategies The Curse of Dimensionality Hierarchical k-means.
COMP Loop Statements Yi Hong May 21, 2015.
9. ITERATIONS AND LOOP STRUCTURES Rocky K. C. Chang October 18, 2015 (Adapted from John Zelle’s slides)
Canopy Clustering Given a distance measure and two threshold distances T1>T2, 1. Determine canopy centers - go through The list of input points to form.
While ( number
 Python for-statements can be treated the same as for-each loops in Java Syntax: for variable in listOrstring: body statements Example) x = "string"
Python: Iteration Damian Gordon. Python: Iteration We’ll consider four ways to do iteration: – The WHILE loop – The FOR loop – The DO loop – The LOOP.
Introduction to Python Developed by Dutch programmer Guido van Rossum Named after Monty Python Open source development project Simple, readable language.
Learning Objectives 1. Understand how the Small Basic Turtle operates. 2. Be able to draw geometric patterns using iteration.
Computer C programming Chapter 3. CHAPTER 3 Program Looping –The for Statement –Nested for Loops –for Loop Variants –The while Statement –The do Statement.
Control Statements 1. Conditionals 2 Loops 3 4 Simple Loop Examples def echo(): while echo1(): pass def echo1(): line = input('Say something: ') print('You.
Data Mining – Algorithms: K Means Clustering
Chapter 4.
Python Loops and Iteration
loops for loops iterate over a given sequence.
Basic operators - strings
While Loops in Python.
Python - Loops and Iteration
Frozen Graphics Lesson 3.
Coding Concepts (Basics)
Topics Introduction to Value-returning Functions: Generating Random Numbers Writing Your Own Value-Returning Functions The math Module Storing Functions.
Python Basics with Jupyter Notebook
Introduction to Computer Science
Programming II Vocabulary Review Loops & functions
Neal Kurande, WinaGodwin Anyanwu Jr., Adam Chau
Presentation transcript:

Python Programming in Context Chapter 7

Objectives To use Python lists as a means of storing data To implement a nontrivial data mining application To understand and implement cluster analysis To use visualization as a means of displaying patterns

Cluster Data points that have something in common Clusters are dissimilar to each other Use simple Euclidean distance to measure how close one point is to another Centroid is a point that represents a cluster (not necessarily a real data point)

Figure 7.1

Figure 7.2

Figure 7.3

Figure 7.4

Listing 7.1 def euclidD(point1, point2): sum = 0 for index in range(len(point1)): diff = (point1[index]-point2[index]) ** 2 sum = sum + diff euclidDistance = math.sqrt(sum) return euclidDistance

Figure 7.5

Listing 7.2 def readFile(filename): datafile = open(filename, "r") datadict = {} key = 0 for aline in datafile: key = key + 1 score = int(aline) datadict[key] = [score] return datadict

Indefinite Iteration Repeating a process an unknown number of times Control is based on a boolean expression Infinite loop is possible Any for loop can be written as a while loop

Listing 7.3 while : statement1 statement2... statementn

Figure 7.6

Listing 7.4 sum = 0 for anum in range(1,11): sum = sum + anum print(sum)

Listing7.5 sum = 0 anum = 1 #initialization while anum <= 10: #condition sum = sum + anum anum = anum + 1 #change of state print(sum)

Listing 7.6 sum = 0 anum = 1 while anum <= 10: sum = sum + anum print(sum)

Listing 7.7 def readFile(filename): datafile = open(filename, "r") datadict = {} key = 0 aline = datafile.readline() while aline != "": key = key + 1 score = int(aline) datadict[key] = [score] aline = datafile.readline() return datadict

Creating Clusters Decide on number of clusters Choose data points to be initial centroids Assign data points to be members of a centroid Recompute centroids Repeat

Listing 7.8 def createCentroids(k, datadict): centroids=[] centroidCount = 0 centroidKeys = [] while centroidCount < k: rkey = random.randint(1,len(datadict)) if rkey not in centroidKeys: centroids.append(datadict[rkey]) centroidKeys.append(rkey) centroidCount = centroidCount + 1 return centroids

Listing 7.9 def createClusters(k, centroids, datadict, repeats): for apass in range(repeats): print("****PASS",apass,"****") clusters = [] for i in range(k): clusters.append([]) for akey in datadict: distances = [] for clusterIndex in range(k): dist = euclidD(datadict[akey],centroids[clusterIndex]) distances.append(dist) mindist = min(distances) index = distances.index(mindist) clusters[index].append(akey) dimensions = len(datadict[1])

Listing 7.9 continued for clusterIndex in range(k): sums = [0]*dimensions for akey in clusters[clusterIndex]: datapoints = datadict[akey] for ind in range(len(datapoints)): sums[ind] = sums[ind] + datapoints[ind] for ind in range(len(sums)): clusterLen = len(clusters[clusterIndex]) if clusterLen != 0: sums[ind] = sums[ind]/clusterLen centroids[clusterIndex] = sums for c in clusters: print ("CLUSTER") for key in c: print(datadict[key], end=" ") print() return clusters

Figure 7.7

Listing 7.10 def clusterAnalysis(dataFile): examDict = readFile(dataFile) examCentroids = createCentroids(5, examDict) examClusters = createClusters(5, examCentroids, examDict, 3) clusterAnalysis("cs150exams.txt")

Visualizing Clusters Earthquake data Show clusters on a map Use turtle module to plot data

Figure 7.8

Listing 7.11 def visualizeQuakes(dataFile): datadict = readFile(dataFile) quakeCentroids = createCentroids(6, datadict) clusters = createClusters(6, quakeCentroids, datadict, 7) quakeT = turtle.Turtle() quakeWin = turtle.Screen() quakeWin.bgpic("worldmap.gif") quakeWin.screensize(448,266) quakeWin.setup(width=500, height=300) wFactor = (quakeWin.screensize()[0]/2)/180 hFactor = (quakeWin.screensize()[1]/2)/90 quakeT.hideturtle() quakeT.up() colorlist = ["red","green","blue","orange","cyan","yellow"] for clusterIndex in range(6): quakeT.color(colorlist[clusterIndex]) for akey in clusters[clusterIndex]: lon = datadict[akey][0] lat = datadict[akey][1] quakeT.goto(lon*wFactor,lat*hFactor) quakeT.dot() quakeWin.exitonclick()

Figure 7.9