Parallelizing the graph isomorphism portion of an automatic reaction mechanism generation algorithm Geoff Oxberry 18.337 Project, Spring 2009.



Automatic reaction mechanism generation yields models quickly
- Reaction mechanisms are used to model chemistry in a wide range of applications
- Generating first-principles reaction mechanisms can take years and requires a lot of expertise
- The Bill Green group developed software (RMG) that automatically generates these models based on rules

For some problems, RMG takes days to generate a mechanism
- We want it to take a day or less on a cluster
- A big bottleneck for us is that we have to repeatedly solve a colored graph isomorphism (GI) problem
- If we can speed it up, we can solve many more interesting chemistry problems
- Parallelism is one option

We want to see if parallelism can be used to speed up RMG
- Want to see if a parallel version of RMG is faster than a serial version
- Due to time constraints, I chose to implement skeletal prototypes of serial and parallel versions of RMG in Python
- The idea is to use results from the prototypes to see if it is worth parallelizing the production-scale code

Parallelism does speed up RMG on intermediate-sized case studies
- When searching for graph isomorphisms in collections of 20 or fewer graphs, serial code is faster
- When searching for graph isomorphisms in collections of ~100 graphs, parallel code is faster
- When searching for graph isomorphisms in collections of ~2000 graphs, serial code is faster again

Outline
- Brief overview of graph isomorphism
- Discussion of existing RMG algorithm and how to parallelize
- Python prototypes of serial and parallel versions of RMG algorithm
- Results
- Discussion of obstacles
- Conclusions

Two graphs are isomorphic if there exists an edge-preserving bijection between their nodes
- These two graphs are isomorphic: [figure not preserved in transcript]
- Bijection here (L-R): 1-1, 3-4, 2-5, 5-2, 4-3
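A bijection between node sets only counts as an isomorphism if it maps edges onto edges. The following is a minimal brute-force sketch of that definition, not the algorithm RMG or igraph uses; the function name `are_isomorphic` and the example graphs are illustrative assumptions.

```python
from itertools import permutations


def are_isomorphic(n, edges_a, edges_b):
    """Brute-force isomorphism test for two undirected graphs on nodes 0..n-1.

    Tries every bijection of the node set, so it is O(n!) and only
    suitable for illustrating the definition on tiny graphs.
    """
    set_a = {frozenset(e) for e in edges_a}
    set_b = {frozenset(e) for e in edges_b}
    if len(set_a) != len(set_b):
        return False
    for perm in permutations(range(n)):
        # Relabel every edge of graph A through the candidate bijection.
        mapped = {frozenset((perm[u], perm[v])) for u, v in set_a}
        if mapped == set_b:
            return True
    return False


# Hypothetical example graphs (not the ones from the missing slide figure).
path_a = [(0, 1), (1, 2), (2, 3)]   # path 0-1-2-3
path_b = [(2, 0), (0, 3), (3, 1)]   # the same path with nodes relabeled
star = [(0, 1), (0, 2), (0, 3)]     # one central node: not a path

print(are_isomorphic(4, path_a, path_b))
print(are_isomorphic(4, path_a, star))
```

Production GI codes avoid enumerating all permutations by pruning on invariants such as node degree.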

In RMG, ChemGraphs represent species
- ChemGraphs are graphs with node labels and edge labels
- Species are represented by a class of graphs equivalent under isomorphism
- Example (methane): node labels refer to atom types, edge labels refer to bond types
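For labeled ("colored") graphs, an isomorphism must also preserve the node and edge labels. Below is a minimal sketch under assumed conventions: atoms are a list of element symbols, bonds are a dict from node-index pairs to a bond label ('S' for single, 'D' for double). This representation and the function name are hypothetical and are not RMG's ChemGraph class.

```python
from itertools import permutations


def labeled_isomorphic(atoms_a, bonds_a, atoms_b, bonds_b):
    """Brute-force labeled isomorphism: atom and bond labels must both match."""
    norm_a = {frozenset(pair): label for pair, label in bonds_a.items()}
    norm_b = {frozenset(pair): label for pair, label in bonds_b.items()}
    # Cheap invariants first: same multiset of atoms, same number of bonds.
    if sorted(atoms_a) != sorted(atoms_b) or len(norm_a) != len(norm_b):
        return False
    n = len(atoms_a)
    for perm in permutations(range(n)):
        # The bijection must send each atom to an atom of the same type.
        if any(atoms_a[i] != atoms_b[perm[i]] for i in range(n)):
            continue
        # Every bond must land on a bond carrying the same label.
        if all(
            norm_b.get(frozenset(perm[i] for i in pair)) == label
            for pair, label in norm_a.items()
        ):
            return True
    return False


# Methane written with two different node orderings.
methane_1 = (["C", "H", "H", "H", "H"],
             {(0, 1): "S", (0, 2): "S", (0, 3): "S", (0, 4): "S"})
methane_2 = (["H", "H", "C", "H", "H"],
             {(2, 0): "S", (2, 1): "S", (2, 3): "S", (2, 4): "S"})

# Toy fragments that differ only in the bond label.
co_single = (["C", "O"], {(0, 1): "S"})
co_double = (["C", "O"], {(0, 1): "D"})

print(labeled_isomorphic(*methane_1, *methane_2))
print(labeled_isomorphic(*co_single, *co_double))
```

The two methane graphs belong to the same equivalence class and therefore represent one species, while the two toy fragments do not match because their bond labels differ.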

RMG classifies species as one of three types
- Core species make up all of the reactants of the reaction mechanism
- Edge species are products of the reaction mechanism not included in the core; they may be added to the core over the course of the algorithm
- Postulated species are proposed species that may be added to the edge over the course of the algorithm

RMG algorithm manipulates graphs to generate a reaction mechanism
1. Initialize set of core species.
2. Generate postulated species using some rules.
3. Use GI to discard postulated species based on various criteria.
4. Add remaining postulated species to edge species.
5. Determine if any edge species should be added to core.
6. Are the termination criteria met? If no, return to step 2; if yes, stop.

Checking for duplicate graphs using GI looks parallelizable
- For example, could scatter postulated species over all processors and check for duplicates against core species in parallel
- Could also do this with forbidden configurations, etc.
- The step "Use GI to discard postulated species based on various criteria" breaks down into:
  1. Use GI to check for forbidden configurations.
  2. Use GI to check for duplicates among postulated species.
  3. Use GI to check that postulated species aren't duplicated in core.
  4. Discard any duplicates.

Instead of working with RMG directly, I created a prototype
- RMG takes 18 months for a developer to get up to speed; this project was ~6 weeks
- To save time, I built a prototype in Python because its syntax and available libraries enable rapid development
- This also enabled me to focus on the parts of the code that matter (GI algorithms) and ignore the rest

Serial prototype throws out everything but GI checking
1. Initialize set of core species.
2. Select postulated species from existing RMG output.
3. Use GI to discard postulated species based on various criteria.
4. Add remaining postulated species to core species.
5. Are the termination criteria met? If no, return to step 2; if yes, stop.
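The serial prototype's loop can be sketched as follows. This is an illustrative reconstruction, not the author's code: the function name, the `batch_size` parameter, and the pluggable `is_isomorphic` predicate are assumptions, while the default core fraction (50%) and iteration cap (100) follow the run settings reported later in the transcript. The demo uses a toy predicate on strings as a stand-in for a real GI check.

```python
import random


def run_serial_prototype(database, is_isomorphic, core_fraction=0.5,
                         batch_size=5, max_iterations=100, seed=0):
    """Move species from a database into the core, discarding GI duplicates."""
    rng = random.Random(seed)
    pool = list(database)
    rng.shuffle(pool)
    n_core = int(len(pool) * core_fraction)
    core, remaining = pool[:n_core], pool[n_core:]   # initialize core species
    iterations = 0
    while remaining and iterations < max_iterations:
        iterations += 1
        # Select postulated species from the pre-generated database.
        postulated, remaining = remaining[:batch_size], remaining[batch_size:]
        # Discard duplicates among the postulated species themselves.
        unique = []
        for species in postulated:
            if not any(is_isomorphic(species, u) for u in unique):
                unique.append(species)
        # Discard postulated species that already appear in the core.
        new = [s for s in unique
               if not any(is_isomorphic(s, c) for c in core)]
        core.extend(new)
    return core, iterations


def toy_isomorphic(a, b):
    """Toy stand-in for GI: two strings match if they are rearrangements."""
    return sorted(a) == sorted(b)


database = ["CH4", "H4C", "C2H6", "H2O", "OH2", "O2", "CO2", "O2C"]
core, iterations = run_serial_prototype(database, toy_isomorphic, batch_size=2)
print(len(core), iterations)
```

Each pass performs on the order of (postulated species) x (core species) isomorphism checks, which is the portion the parallel prototype targets.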

Parallel prototype parallelizes part of the GI comparisons
- Checking postulated species against core species is embarrassingly parallel
- Postulated species are essentially independent in that step
- In the prototype, the step "Use GI to discard postulated species based on various criteria" breaks down into:
  1. Use GI to check for duplicates among postulated species.
  2. Use GI in parallel to check that postulated species aren't duplicated in core.
  3. Discard any duplicates.
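The scatter/check/gather pattern for the core comparison can be sketched without MPI by simulating the ranks in a loop. All function names here are hypothetical; in an mpi4py program the chunk distribution and result collection would plausibly map to `comm.scatter(chunks, root=0)` and `comm.gather(local, root=0)`, with each rank running only the per-chunk check. The toy predicate again stands in for a real GI test.

```python
def split_round_robin(items, n_ranks):
    """Deal items out to n_ranks chunks, as a root process would before a scatter."""
    return [items[r::n_ranks] for r in range(n_ranks)]


def check_chunk_against_core(chunk, core, is_isomorphic):
    """Work done independently on each rank: keep species not found in the core."""
    return [s for s in chunk if not any(is_isomorphic(s, c) for c in core)]


def simulated_parallel_core_check(postulated, core, is_isomorphic, n_ranks=4):
    """Serial simulation of scatter -> independent checks -> gather."""
    chunks = split_round_robin(postulated, n_ranks)            # scatter
    partial = [check_chunk_against_core(chunk, core, is_isomorphic)
               for chunk in chunks]                            # one entry per rank
    return [s for part in partial for s in part]               # gather and flatten


def toy_isomorphic(a, b):
    """Toy stand-in for GI: two strings match if they are rearrangements."""
    return sorted(a) == sorted(b)


postulated = ["CH4", "H2O", "O2", "C2H6", "NH3"]
core = ["4HC", "OH2"]
survivors = simulated_parallel_core_check(postulated, core, toy_isomorphic,
                                          n_ranks=2)
print(sorted(survivors))
```

Because no chunk depends on another chunk's result, the only communication is sending the species out and collecting the survivors, which is why the cost of serializing graph objects matters so much.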

Prototypes were implemented in Python/MPI on a cluster
- Software:
  - Python 2.5 (with C extensions)
  - igraph module (graph data structure, GI algorithms)
  - mpi4py module (MPI bindings for Python)
- Hardware:
  - 64-node cluster (pharos.mit.edu)
  - 8 GB RAM per node
  - Each node has 2 quad-core Xeon processors (either 2.33 GHz or 2.66 GHz)

Parallel prototype was faster on intermediate-sized problems
- Species database was obtained from existing RMG output
- Initial set of core species was 50% of database, randomly chosen
- Program ran until all species in database were moved into core, or it reached 100 iterations
- [Figure: serial time / parallel time (a.u.) versus database size (species)]

Communication is slow in large test cases due to passing graph objects
- Graphs are implemented using a class in the igraph library
- mpi4py converts non-native Python objects using cPickle, which is compute-intensive
- cPickle is probably why the serial code is faster in large test cases
- An alternative approach would use NumPy and define an MPI derived data type; this would be faster
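One way to avoid pickling whole graph objects is to flatten each graph into a contiguous integer buffer before sending it. The sketch below uses the standard-library `array` module as a stand-in for the NumPy array and MPI derived data type the slide proposes; the layout, the atom-code table, and the function names are all assumptions made for illustration.

```python
from array import array

# Hypothetical integer codes for atom labels (atomic numbers).
ATOM_CODES = {"H": 1, "C": 6, "N": 7, "O": 8}
CODE_ATOMS = {code: symbol for symbol, code in ATOM_CODES.items()}


def encode_graph(atoms, bonds):
    """Flatten a labeled graph into one buffer of C ints.

    Layout: [n_atoms, n_bonds, atom codes..., (u, v, bond_order) triples...]
    """
    flat = [len(atoms), len(bonds)]
    flat.extend(ATOM_CODES[a] for a in atoms)
    for (u, v), order in sorted(bonds.items()):
        flat.extend((u, v, order))
    return array("i", flat)


def decode_graph(buf):
    """Rebuild the atom list and bond dict from the flat buffer."""
    n_atoms, n_bonds = buf[0], buf[1]
    atoms = [CODE_ATOMS[code] for code in buf[2:2 + n_atoms]]
    bonds = {}
    base = 2 + n_atoms
    for k in range(n_bonds):
        u, v, order = buf[base + 3 * k: base + 3 * k + 3]
        bonds[(u, v)] = order
    return atoms, bonds


# Methane with integer bond orders (1 = single bond).
methane_atoms = ["C", "H", "H", "H", "H"]
methane_bonds = {(0, 1): 1, (0, 2): 1, (0, 3): 1, (0, 4): 1}

buf = encode_graph(methane_atoms, methane_bonds)
print(len(buf), decode_graph(buf) == (methane_atoms, methane_bonds))
```

A buffer like this exposes raw memory, so it can in principle be handed to MPI's buffer-based send and receive calls directly rather than going through object serialization; whether that recovers the lost speedup on large cases would have to be measured.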

Many technical problems occurred during the project
- Laptop experienced hardware failures
- Difficulties installing igraph and mpi4py on pharos:
  - System libraries had to be recompiled
  - Environment variables were reset so igraph and mpi4py could be recognized on all nodes
- Incomplete mpi4py documentation
- Python extended debugger not installed; no graphical front-end

Parallelism can be used to speed up RMG for some case studies
- Saw speedup for intermediate-sized case studies on parallel prototype
- Additional opportunities for parallelism within RMG algorithm
- Can also decrease MPI communication costs with additional development and use of a debugger/profiler

Future Work
- Install extended Python debugger/profiler
- Use NumPy and an MPI derived data type to reduce communication overhead
- Try alternative strategies for parallelization:
  - Reorganize algorithm (check core species, then postulated species)
  - Parallelize checks of postulated species against themselves

Acknowledgments
- RMG team: Franklin Goldsmith, Sandeep Sharma, Josh Allen, Richard West, Michael Harper, Greg Magoon
- Ray Speth
- Kushal Kedia
- Prof. Bill Green
- DOE CSGF for funding