Modeling Performance.

In addition to solving the model quickly, getting an answer also requires building the model quickly. In this section we present some methods for building models with docplex in Python. The constraint we build is for a classroom-scheduling model in which no potential sections may overlap at any time in a specific room:

    ∑_{s ∈ o(t)} C_s ≤ |o(t)| − |o(t)| · ∑_{u ∈ ∂(t)} C_u    ∀ t ∈ time periods

Here C_s is the binary variable that selects potential section s, ∂(t) is the set of potential sections offered at time t, and o(t) is the set of potential sections at times that overlap t (excluding t itself). If any section at time t is selected, the right-hand side drops to zero and every overlapping section is forced out.
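The big-M logic of this constraint can be sanity-checked with plain integers (a minimal sketch; overlap_ok and the 0/1 lists are hypothetical stand-ins for the docplex variables):

```python
# Numeric check of the overlap constraint with 0/1 stand-ins for C.
# 'overlapping' holds C_s for s in o(t); 'at_t' holds C_u for u in ∂(t).
def overlap_ok(overlapping, at_t):
    n = len(overlapping)
    return sum(overlapping) <= n - n * sum(at_t)

print(overlap_ok([0, 0, 0], [1]))  # → True: section at t chosen, no overlaps used
print(overlap_ok([1, 0, 0], [1]))  # → False: overlap chosen alongside t
print(overlap_ok([1, 1, 0], [0]))  # → True: nothing chosen at t, overlaps are free
```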

We apply five approaches, using as much out-of-the-box functionality as possible:

Looping: iterate through lists and add variables to create the constraints.
Pandas: leverage the pandas package to do the looping with single commands and group the data together.
Multiprocessing: use the Python multiprocessing package to send multiple jobs to run in parallel.
Pandas, revisited: adjust how we sum values and observe the performance change.
Dask: use the Python dask library to compute sums from the data frames in parallel.

We randomly generate three potential sections (start times) for each of 10,000 different classes, and then for 1,000 classes, and build the constraint to ensure there are no overlaps. The timing does not include the time required to create the data structures.
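The test data can be generated along these lines. This is a hedged sketch: the column names timeID/timeID_x/timeID_y match the code on the following slides, but the slot pool, the adjacency overlap rule, and the classID column are illustrative assumptions, not taken from the original:

```python
import random
import pandas as pd

random.seed(0)
N_CLASSES = 1_000            # the experiments also use 10_000
SLOTS = range(40)            # hypothetical pool of weekly time slots

# Three potential start times per class, one row each
potentialTimes = pd.DataFrame(
    [{'classID': c, 'timeID': random.choice(SLOTS)}
     for c in range(N_CLASSES) for _ in range(3)])

# Illustrative overlap rule: a slot conflicts with itself and its neighbors
overlappingTimes = pd.DataFrame(
    [{'timeID_x': t, 'timeID_y': u}
     for t in SLOTS for u in SLOTS if abs(t - u) <= 1])
```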

Looping

for t in potentialTimes.timeID.unique():
    # find the times that overlap with the given time
    lhsVars = []
    for idx, row in overlappingTimes[(overlappingTimes.timeID_x == t) &
                                     (overlappingTimes.timeID_y != t)].iterrows():
        for idx2, row2 in potentialTimes[potentialTimes.timeID == row.timeID_y].iterrows():
            lhsVars.append(row2.scheduleSection)
    rhsVars = []
    for idx3, row3 in potentialTimes[potentialTimes.timeID == t].iterrows():
        rhsVars.append(row3.scheduleSection)
    m.add_constraint(m.sum(lhsVars) <= len(lhsVars) - len(lhsVars) * m.sum(rhsVars),
                     'overlap_%s' % t)

Looping results

10,000 classes: 𝜇 = 3.18, 17.40; 𝜎 = 0.11, 0.16; 𝑛 = 10, 5
1,000 classes: 𝜇 = 1.00, 2.03; 𝜎 = 0.026, 0.09; 𝑛 = 10, 5
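Much of the looping slowdown comes from per-row iterrows overhead, which a quick micro-benchmark illustrates (a sketch on toy data; the sizes and timings are illustrative, not the experiment above):

```python
import time
import pandas as pd

df = pd.DataFrame({'v': range(50_000)})

t0 = time.perf_counter()
total = 0
for _, row in df.iterrows():   # builds a Series object per row: slow
    total += row.v
loop_s = time.perf_counter() - t0

t0 = time.perf_counter()
total_vec = df.v.sum()         # single vectorized call: fast
vec_s = time.perf_counter() - t0

print(loop_s > vec_s)          # vectorized sum wins by a wide margin
```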

Pandas

gb = potentialTimes.groupby(by='timeID').scheduleSection.sum()
overlaps = overlappingTimes[overlappingTimes.timeID_x != overlappingTimes.timeID_y].merge(
    potentialTimes, how='inner', left_on='timeID_y', right_on='timeID')
numOverlaps = overlaps.groupby(by='timeID_x').timeID_y.count()
overlappingSections = overlaps.groupby(by='timeID_x').scheduleSection.sum()
m.add_constraints([(overlappingSections[idx] <= numOverlaps[idx] - numOverlaps[idx] * sections,
                    'overlap_%s' % idx)
                   for idx, sections in gb.iteritems()])
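The merge/groupby pattern can be exercised on toy data; note how the many-to-many merge multiplies rows, which is what slows this variant (the values below are made up for illustration):

```python
import pandas as pd

potential = pd.DataFrame({'timeID':  [1, 1, 2, 2],
                          'section': [10, 11, 20, 21]})
overlap = pd.DataFrame({'timeID_x': [1, 1, 2, 2],
                        'timeID_y': [1, 2, 2, 1]})

# Drop self-pairs, then join each overlapping time to its sections
merged = overlap[overlap.timeID_x != overlap.timeID_y].merge(
    potential, how='inner', left_on='timeID_y', right_on='timeID')
print(len(merged))  # → 4: 2 qualifying pairs × 2 sections per time
counts = merged.groupby(by='timeID_x').timeID_y.count()
print(counts[1])    # → 2
```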

Pandas results

10,000 classes: 𝜇 = 14.26, 15.02; 𝜎 = 0.20, 0.38; 𝑛 = 10, 5
1,000 classes: 𝜇 = 0.27, 0.27; 𝜎 = 0.02; 𝑛 = 5

The pandas code is more compact but requires getting used to creating merges and groupbys. Pandas is slowed by the merge, which in this case has close to a Cartesian multiplier.

Multiprocessing

from multiprocessing import Process, Queue

# Create queues
task_queue = Queue()
done_queue = Queue()

# Start worker processes
for i in range(NUMBER_OF_PROCESSES):
    Process(target=worker, args=(task_queue, done_queue)).start()

NUM_TASKS = 0
task_queue.put((aggregate, (potentialTimes[['timeID', 'varNames']],
                            'timeID', 'varNames', 'gb')))
gb = 0
NUM_TASKS += 1
task_queue.put((merge, (overlappingTimes[overlappingTimes.timeID_x != overlappingTimes.timeID_y],
                        potentialTimes[['timeID', 'varNames']],
                        'inner', 'timeID_y', 'timeID', 'overlaps')))
overlaps = 0
NUM_TASKS += 1
overlappingSections = 0
numOverlaps = 0

# Collect results; the 'overlaps' result spawns two follow-up tasks
resultsNeeded = NUM_TASKS
collected = 0
while collected < resultsNeeded:
    result = done_queue.get()
    collected += 1
    if result[0] == 'gb':
        gb = result[1].map(lambda x: m.sum(getVarList(m, x, '!')))
    if result[0] == 'overlaps':
        overlaps = result[1]
        task_queue.put((aggregate, (result[1], 'timeID_x', 'varNames', 'overlappingSections')))
        task_queue.put((getCount, (result[1], 'timeID_x', 'varNames', 'numOverlaps')))
        resultsNeeded += 2
    if result[0] == 'overlappingSections':
        overlappingSections = result[1].map(lambda x: m.sum(getVarList(m, x, '!')))
    if result[0] == 'numOverlaps':
        numOverlaps = result[1]

m.add_constraints([(overlappingSections[idx] <= numOverlaps[idx] - numOverlaps[idx] * sections,
                    'overlap_%s' % idx)
                   for idx, sections in gb.iteritems()])
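The multiprocessing slide uses worker, aggregate, merge, and getCount without showing them. A minimal sketch consistent with how they are called might look like this; the signatures are inferred from the call sites, not taken from the original:

```python
import pandas as pd

# Hypothetical task functions; each returns (tag, result) for the collector.
def aggregate(df, by, col, tag):
    # Group-and-sum one column of a DataFrame
    return (tag, df.groupby(by=by)[col].sum())

def getCount(df, by, col, tag):
    # Group-and-count one column of a DataFrame
    return (tag, df.groupby(by=by)[col].count())

def merge(left, right, how, left_on, right_on, tag):
    return (tag, left.merge(right, how=how, left_on=left_on, right_on=right_on))

def worker(task_queue, done_queue):
    # Each task is (function, args); a 'STOP' sentinel shuts the worker down
    for func, args in iter(task_queue.get, 'STOP'):
        done_queue.put(func(*args))

# Tiny demo of the tagged-result convention
tag, summed = aggregate(pd.DataFrame({'a': [1, 1, 2], 'b': [3, 4, 5]}),
                        'a', 'b', 'demo')
print(tag)       # → 'demo'
print(summed[1]) # → 7
```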


Multiprocessing results

10,000 classes: 𝜇 = 2.90, 2.79; 𝜎 = 0.23, 0.10; 𝑛 = 5, 10
1,000 classes: 𝜇 = 1.70, 1.71; 𝜎 = 0.03, 0.04; 𝑛 = 5, 10

Multiprocessing requires far more coding but can leverage more of the processors on your system. Docplex variable objects cannot be pickled (serialized for storage until a worker process picks the data up), so the variable names must be passed back and forth instead; as a result, packages such as dask will not work to aggregate docplex variables in a DataFrame. There is some overhead in creating the queues and starting the processes, so multiprocessing is not always the fastest.

Pandas with text “summing”

def getVarList(model, namesList, delimiter):
    return [model.get_var_by_name(name) for name in namesList.split(delimiter)[:-1]]

gb = potentialTimes.groupby(by='timeID').varNames.sum()
overlaps = overlappingTimes[overlappingTimes.timeID_x != overlappingTimes.timeID_y].merge(
    potentialTimes[['timeID', 'varNames']], how='inner', left_on='timeID_y', right_on='timeID')
numOverlaps = overlaps.groupby(by='timeID_x').timeID_y.count()
overlappingSections = overlaps.groupby(by='timeID_x').varNames.sum()
m.add_constraints([(m.sum(getVarList(m, overlappingSections[idx], '!')) <= numOverlaps[idx]
                    - numOverlaps[idx] * m.sum(getVarList(m, sections, '!')),
                    'overlap_%s' % idx)
                   for idx, sections in gb.iteritems()])
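The text “summing” trick relies on pandas concatenating strings under sum. A toy example (the variable names and the '!' delimiter are illustrative, following the code above):

```python
import pandas as pd

df = pd.DataFrame({'timeID':   [1, 1, 2],
                   'varNames': ['x1!', 'x2!', 'x3!']})
# Summing object-dtype columns concatenates, so each group becomes
# one delimited list of variable names
gb = df.groupby(by='timeID').varNames.sum()
print(gb[1])                  # → 'x1!x2!'
# Splitting on the delimiter recovers the individual names
print(gb[1].split('!')[:-1])  # → ['x1', 'x2']
```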

Pandas with text “summing” results

10,000 classes: 𝜇 = 1.18; 𝜎 = 0.07; 𝑛 = 10
1,000 classes: 𝜇 = 0.091; 𝜎 = 0.003; 𝑛 = 10

The pandas code is more compact but requires getting used to creating merges and groupbys, and the merge in this case has close to a Cartesian multiplier. However, summation of string values is much faster in pandas, and translation from name to variable object is fast in Python.

Dask with text “summing”

import dask.dataframe as dd

def getVarList(model, namesList, delimiter):
    return [model.get_var_by_name(name) for name in namesList.split(delimiter)[:-1]]

pt = dd.from_pandas(potentialTimes, npartitions=8)
ot = dd.from_pandas(overlappingTimes, npartitions=8)
gb = pt.groupby(by='timeID').varNames.sum()
overlaps = ot[ot.timeID_x != ot.timeID_y].merge(
    pt[['timeID', 'varNames']], how='inner', left_on='timeID_y', right_on='timeID')
numOverlaps = overlaps.groupby(by='timeID_x').timeID_y.count()
numOverlaps = numOverlaps.compute()
overlappingSections = overlaps.groupby(by='timeID_x').varNames.sum()
overlappingSections = overlappingSections.compute()
m.add_constraints([(m.sum(getVarList(m, overlappingSections[idx], '!')) <= numOverlaps[idx]
                    - numOverlaps[idx] * m.sum(getVarList(m, sections, '!')),
                    'overlap_%s' % idx)
                   for idx, sections in gb.compute().iteritems()])

Dask with text “summing” results

10,000 classes: 𝜇 = 1.28, 1.30, 1.75; 𝜎 = 0.059, 0.039, 0.101; 𝑛 = 10
1,000 classes: 𝜇 = 0.29, 0.44, 0.76; 𝜎 = 0.059, 0.061, 0.074; 𝑛 = 10

The dask code requires converting back to pandas for some of the constraint generation, which slows the implementation. The number of partitions also affects performance; the results represent two, four, and eight partitions.

Conclusion

Docplex supports many ways to build model constraints. It may be easiest to start with an intuitive method and migrate to a faster one. Leverage Python packages such as pandas. Dask may hold promise for very large problems.