Modeling Performance.

In addition to solving the model quickly, getting an answer also requires building the model quickly. In this section we present some methods for building models with docplex in Python. The constraint we build is for a classroom-scheduling model in which no potential sections may overlap at any time in a specific room:

    ∑_{s ∈ o(t)} C_s ≤ |o(t)| − |o(t)| · ∑_{u ∈ ∂(t)} C_u    ∀ t ∈ time periods

Here C_s is the binary variable that selects potential section s, ∂(t) is the set of potential sections offered at time t, and o(t) is the set of potential sections at times that overlap t (excluding t itself). If any section at time t is selected, the right-hand side drops to zero and every overlapping section is forced out.
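The big-M logic of this constraint can be sanity-checked with plain integers (a minimal sketch; overlap_ok and the 0/1 lists are hypothetical stand-ins for the docplex variables):

```python
# Numeric check of the overlap constraint with 0/1 stand-ins for C.
# 'overlapping' holds C_s for s in o(t); 'at_t' holds C_u for u in ∂(t).
def overlap_ok(overlapping, at_t):
    n = len(overlapping)
    return sum(overlapping) <= n - n * sum(at_t)

print(overlap_ok([0, 0, 0], [1]))  # → True: section at t chosen, no overlaps used
print(overlap_ok([1, 0, 0], [1]))  # → False: overlap chosen alongside t
print(overlap_ok([1, 1, 0], [0]))  # → True: nothing chosen at t, overlaps are free
```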

We apply five approaches, using as much out-of-the-box functionality as possible:

Looping: iterate through lists and add variables to create the constraints.
Pandas: leverage the pandas package to do the looping with single commands and group the data together.
Multiprocessing: use the Python multiprocessing package to send multiple jobs to run in parallel.
Pandas, revisited: adjust how we sum values and observe the performance change.
Dask: use the Python dask library to compute sums from the data frames in parallel.

We randomly generate three potential sections (start times) for each of 10,000 different classes, and then for 1,000 classes, and build the constraint to ensure there are no overlaps. The timing does not include the time required to create the data structures.
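The test data can be generated along these lines. This is a hedged sketch: the column names timeID/timeID_x/timeID_y match the code on the following slides, but the slot pool, the adjacency overlap rule, and the classID column are illustrative assumptions, not taken from the original:

```python
import random
import pandas as pd

random.seed(0)
N_CLASSES = 1_000            # the experiments also use 10_000
SLOTS = range(40)            # hypothetical pool of weekly time slots

# Three potential start times per class, one row each
potentialTimes = pd.DataFrame(
    [{'classID': c, 'timeID': random.choice(SLOTS)}
     for c in range(N_CLASSES) for _ in range(3)])

# Illustrative overlap rule: a slot conflicts with itself and its neighbors
overlappingTimes = pd.DataFrame(
    [{'timeID_x': t, 'timeID_y': u}
     for t in SLOTS for u in SLOTS if abs(t - u) <= 1])
```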

Looping

for t in potentialTimes.timeID.unique():
    # find the times that overlap with the given time
    lhsVars = []
    for idx, row in overlappingTimes[(overlappingTimes.timeID_x == t) &
                                     (overlappingTimes.timeID_y != t)].iterrows():
        for idx2, row2 in potentialTimes[potentialTimes.timeID == row.timeID_y].iterrows():
            lhsVars.append(row2.scheduleSection)
    rhsVars = []
    for idx3, row3 in potentialTimes[potentialTimes.timeID == t].iterrows():
        rhsVars.append(row3.scheduleSection)
    m.add_constraint(m.sum(lhsVars) <= len(lhsVars) - len(lhsVars) * m.sum(rhsVars),
                     'overlap_%s' % t)

Looping results

10,000 classes: 𝜇 = 3.18, 17.40; 𝜎 = 0.11, 0.16; 𝑛 = 10, 5
1,000 classes: 𝜇 = 1.00, 2.03; 𝜎 = 0.026, 0.09; 𝑛 = 10, 5
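Much of the looping slowdown comes from per-row iterrows overhead, which a quick micro-benchmark illustrates (a sketch on toy data; the sizes and timings are illustrative, not the experiment above):

```python
import time
import pandas as pd

df = pd.DataFrame({'v': range(50_000)})

t0 = time.perf_counter()
total = 0
for _, row in df.iterrows():   # builds a Series object per row: slow
    total += row.v
loop_s = time.perf_counter() - t0

t0 = time.perf_counter()
total_vec = df.v.sum()         # single vectorized call: fast
vec_s = time.perf_counter() - t0

print(loop_s > vec_s)          # vectorized sum wins by a wide margin
```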

Pandas

gb = potentialTimes.groupby(by='timeID').scheduleSection.sum()
overlaps = overlappingTimes[overlappingTimes.timeID_x != overlappingTimes.timeID_y].merge(
    potentialTimes, how='inner', left_on='timeID_y', right_on='timeID')
numOverlaps = overlaps.groupby(by='timeID_x').timeID_y.count()
overlappingSections = overlaps.groupby(by='timeID_x').scheduleSection.sum()
m.add_constraints([(overlappingSections[idx] <= numOverlaps[idx] - numOverlaps[idx] * sections,
                    'overlap_%s' % idx)
                   for idx, sections in gb.iteritems()])
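The merge/groupby pattern can be exercised on toy data; note how the many-to-many merge multiplies rows, which is what slows this variant (the values below are made up for illustration):

```python
import pandas as pd

potential = pd.DataFrame({'timeID':  [1, 1, 2, 2],
                          'section': [10, 11, 20, 21]})
overlap = pd.DataFrame({'timeID_x': [1, 1, 2, 2],
                        'timeID_y': [1, 2, 2, 1]})

# Drop self-pairs, then join each overlapping time to its sections
merged = overlap[overlap.timeID_x != overlap.timeID_y].merge(
    potential, how='inner', left_on='timeID_y', right_on='timeID')
print(len(merged))  # → 4: 2 qualifying pairs × 2 sections per time
counts = merged.groupby(by='timeID_x').timeID_y.count()
print(counts[1])    # → 2
```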

Pandas results

10,000 classes: 𝜇 = 14.26, 15.02; 𝜎 = 0.20, 0.38; 𝑛 = 10, 5
1,000 classes: 𝜇 = 0.27, 0.27; 𝜎 = 0.02; 𝑛 = 5

The pandas code is more compact but requires getting used to creating merges and groupbys. Pandas is slowed by the merge, which in this case has close to a Cartesian multiplier.

Multiprocessing

from multiprocessing import Process, Queue

# Create queues
task_queue = Queue()
done_queue = Queue()

# Start worker processes
for i in range(NUMBER_OF_PROCESSES):
    Process(target=worker, args=(task_queue, done_queue)).start()

NUM_TASKS = 0
task_queue.put((aggregate, (potentialTimes[['timeID', 'varNames']],
                            'timeID', 'varNames', 'gb')))
gb = 0
NUM_TASKS += 1
task_queue.put((merge, (overlappingTimes[overlappingTimes.timeID_x != overlappingTimes.timeID_y],
                        potentialTimes[['timeID', 'varNames']],
                        'inner', 'timeID_y', 'timeID', 'overlaps')))
overlaps = 0
NUM_TASKS += 1
overlappingSections = 0
numOverlaps = 0

# Collect results; the 'overlaps' result spawns two follow-up tasks
resultsNeeded = NUM_TASKS
collected = 0
while collected < resultsNeeded:
    result = done_queue.get()
    collected += 1
    if result[0] == 'gb':
        gb = result[1].map(lambda x: m.sum(getVarList(m, x, '!')))
    if result[0] == 'overlaps':
        overlaps = result[1]
        task_queue.put((aggregate, (result[1], 'timeID_x', 'varNames', 'overlappingSections')))
        task_queue.put((getCount, (result[1], 'timeID_x', 'varNames', 'numOverlaps')))
        resultsNeeded += 2
    if result[0] == 'overlappingSections':
        overlappingSections = result[1].map(lambda x: m.sum(getVarList(m, x, '!')))
    if result[0] == 'numOverlaps':
        numOverlaps = result[1]

m.add_constraints([(overlappingSections[idx] <= numOverlaps[idx] - numOverlaps[idx] * sections,
                    'overlap_%s' % idx)
                   for idx, sections in gb.iteritems()])
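The multiprocessing slide uses worker, aggregate, merge, and getCount without showing them. A minimal sketch consistent with how they are called might look like this; the signatures are inferred from the call sites, not taken from the original:

```python
import pandas as pd

# Hypothetical task functions; each returns (tag, result) for the collector.
def aggregate(df, by, col, tag):
    # Group-and-sum one column of a DataFrame
    return (tag, df.groupby(by=by)[col].sum())

def getCount(df, by, col, tag):
    # Group-and-count one column of a DataFrame
    return (tag, df.groupby(by=by)[col].count())

def merge(left, right, how, left_on, right_on, tag):
    return (tag, left.merge(right, how=how, left_on=left_on, right_on=right_on))

def worker(task_queue, done_queue):
    # Each task is (function, args); a 'STOP' sentinel shuts the worker down
    for func, args in iter(task_queue.get, 'STOP'):
        done_queue.put(func(*args))

# Tiny demo of the tagged-result convention
tag, summed = aggregate(pd.DataFrame({'a': [1, 1, 2], 'b': [3, 4, 5]}),
                        'a', 'b', 'demo')
print(tag)       # → 'demo'
print(summed[1]) # → 7
```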


Multiprocessing results

10,000 classes: 𝜇 = 2.90, 2.79; 𝜎 = 0.23, 0.10; 𝑛 = 5, 10
1,000 classes: 𝜇 = 1.70, 1.71; 𝜎 = 0.03, 0.04; 𝑛 = 5, 10

Multiprocessing requires far more coding but can leverage more of the processors on your system. Docplex variable objects cannot be pickled (serialized for storage until a worker process picks the data up), so the variable names must be passed back and forth instead; as a result, packages such as dask will not work to aggregate docplex variables in a DataFrame. There is some overhead in creating the queues and starting the processes, so multiprocessing is not always the fastest.

Pandas with text “summing”

def getVarList(model, namesList, delimiter):
    return [model.get_var_by_name(name) for name in namesList.split(delimiter)[:-1]]

gb = potentialTimes.groupby(by='timeID').varNames.sum()
overlaps = overlappingTimes[overlappingTimes.timeID_x != overlappingTimes.timeID_y].merge(
    potentialTimes[['timeID', 'varNames']], how='inner', left_on='timeID_y', right_on='timeID')
numOverlaps = overlaps.groupby(by='timeID_x').timeID_y.count()
overlappingSections = overlaps.groupby(by='timeID_x').varNames.sum()
m.add_constraints([(m.sum(getVarList(m, overlappingSections[idx], '!')) <= numOverlaps[idx]
                    - numOverlaps[idx] * m.sum(getVarList(m, sections, '!')),
                    'overlap_%s' % idx)
                   for idx, sections in gb.iteritems()])
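The text “summing” trick relies on pandas concatenating strings under sum. A toy example (the variable names and the '!' delimiter are illustrative, following the code above):

```python
import pandas as pd

df = pd.DataFrame({'timeID':   [1, 1, 2],
                   'varNames': ['x1!', 'x2!', 'x3!']})
# Summing object-dtype columns concatenates, so each group becomes
# one delimited list of variable names
gb = df.groupby(by='timeID').varNames.sum()
print(gb[1])                  # → 'x1!x2!'
# Splitting on the delimiter recovers the individual names
print(gb[1].split('!')[:-1])  # → ['x1', 'x2']
```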

Pandas with text “summing” results

10,000 classes: 𝜇 = 1.18; 𝜎 = 0.07; 𝑛 = 10
1,000 classes: 𝜇 = 0.091; 𝜎 = 0.003; 𝑛 = 10

The pandas code is more compact but requires getting used to creating merges and groupbys, and the merge in this case has close to a Cartesian multiplier. However, summation of string values is much faster in pandas, and translation from name to variable object is fast in Python.

Dask with text “summing”

import dask.dataframe as dd

def getVarList(model, namesList, delimiter):
    return [model.get_var_by_name(name) for name in namesList.split(delimiter)[:-1]]

pt = dd.from_pandas(potentialTimes, npartitions=8)
ot = dd.from_pandas(overlappingTimes, npartitions=8)
gb = pt.groupby(by='timeID').varNames.sum()
overlaps = ot[ot.timeID_x != ot.timeID_y].merge(
    pt[['timeID', 'varNames']], how='inner', left_on='timeID_y', right_on='timeID')
numOverlaps = overlaps.groupby(by='timeID_x').timeID_y.count()
numOverlaps = numOverlaps.compute()
overlappingSections = overlaps.groupby(by='timeID_x').varNames.sum()
overlappingSections = overlappingSections.compute()
m.add_constraints([(m.sum(getVarList(m, overlappingSections[idx], '!')) <= numOverlaps[idx]
                    - numOverlaps[idx] * m.sum(getVarList(m, sections, '!')),
                    'overlap_%s' % idx)
                   for idx, sections in gb.compute().iteritems()])

Dask with text “summing” results

10,000 classes: 𝜇 = 1.28, 1.30, 1.75; 𝜎 = 0.059, 0.039, 0.101; 𝑛 = 10
1,000 classes: 𝜇 = 0.29, 0.44, 0.76; 𝜎 = 0.059, 0.061, 0.074; 𝑛 = 10

The dask code requires converting back to pandas for some of the constraint generation, which slows the implementation. The number of partitions also affects performance; the results represent two, four, and eight partitions.

Conclusion

Docplex supports many ways to build model constraints. It may be easiest to start with an intuitive method and migrate to a faster one. Leverage Python packages such as pandas. Dask may hold promise for very large problems.