Parallel System for Interactive Multi-Experiment Computational Studies (pSIMECS)


Simecs – Problem Description
● Multi-Experiment Computational Studies:
  – Computational studies involving multiple experiments, each corresponding to an individual execution of simulation software
● Example: Design Space Exploration
  – Goal: Given a set of possible parameter values (a parameter space) and an experiment that maps a parameter value to a performance metric, find a subset of the parameter space whose performance metrics fit certain criteria.

Simecs – Problem Description
● Model Application: Pareto Frontier Discovery.
● The Pareto frontier is the set of points in the parameter space that are not completely dominated by any other point in the parameter space.
  – p “completely dominates” q iff every component of p's performance metric is better than the corresponding component of q's.
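The dominance test above can be sketched in a few lines. This is a minimal illustration, assuming every metric component is minimised (as with deflection and cost in the bridge study); the function names and sample data are hypothetical, not taken from the slides.

```python
def completely_dominates(p, q):
    """True if every component of metric p is strictly better (smaller) than q's."""
    return len(p) == len(q) and all(a < b for a, b in zip(p, q))


def pareto_frontier(metrics):
    """Return the parameter points whose metric is not completely dominated
    by the metric of any other point.

    metrics: dict mapping parameter point -> performance metric tuple.
    """
    return [
        point
        for point, metric in metrics.items()
        if not any(
            completely_dominates(other, metric)
            for other_point, other in metrics.items()
            if other_point != point
        )
    ]


# Hypothetical data: parameter point -> (deflection, cost), both minimised.
metrics = {
    (0.2, 0.8): (1.0, 5.0),
    (0.3, 0.7): (2.0, 3.0),
    (0.4, 0.6): (3.0, 6.0),  # worse than (1.0, 5.0) in both components
    (0.5, 0.5): (4.0, 1.0),
}
frontier = pareto_frontier(metrics)
```

Only the point (0.4, 0.6) drops out here, because it is the only one for which some other point is better in both components.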

Simecs – Pareto Frontier Insights
● Simulations are independent – embarrassingly parallel
● An experiment corresponds to one execution of the simulation software, which can itself be parallel or sequential
● The result of one simulation can be used to speed up simulations at nearby parameter values (e.g., as the initial guess for Newton iteration).
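To illustrate why a nearby result helps, here is a toy sketch of warm-starting Newton iteration. The "simulation" is a stand-in (finding the cube root of the parameter), not the bridge solver, and all names are hypothetical.

```python
def newton(f, df, x0, tol=1e-10, max_iter=100):
    """Plain Newton iteration; returns (root, number of iterations used)."""
    x = x0
    for iterations in range(1, max_iter + 1):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x, iterations
    raise RuntimeError("Newton iteration did not converge")


def solve(param, guess):
    """Toy 'simulation': solve x**3 = param starting from the given guess."""
    return newton(lambda x: x**3 - param, lambda x: 3 * x**2, guess)


# Solve one parameter value, then reuse its root for a nearby value.
root_a, _ = solve(8.0, guess=1.0)
cold_root, cold_iters = solve(8.5, guess=1.0)     # generic starting guess
warm_root, warm_iters = solve(8.5, guess=root_a)  # neighbour's result as guess
```

Both runs reach the same root, but the warm-started run needs fewer iterations because its starting guess already lies close to the solution.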

Simecs – Pareto Frontier Insights
● Decisions can be made with imprecise results: precision can be traded off against resources
● If the parameter space is large, sweeps are inefficient.
● Portions of the space need to be pruned as the study progresses, either automatically or interactively.
● An Active Sampler can automatically pick "interesting" simulations (e.g., close to the boundary)

Simecs – Example Problem
● Bridge design computational study: 1D bridge in 2D space, with end points clamped. Two elastic supports are added to the middle of the bridge.
● Parameter space: distances of the two supports from the end of the bridge.
● Performance measures: maximum deflection of the bridge, and the cost of the supports
● The bridge is clamped at all support points and is subject to bending and stretching forces and a uniform load.

Simecs – Example Problem
● Test problem: parameter, performance metric, and cost function c(r) [formulas not recoverable from the slide]

Simecs – Goal
● Simecs: Software on parallel systems that manages simulation processes in a Multi-Experiment Computational Study.
● Frees users and application developers from micromanaging every simulation process
● Goal: Interactive, Steerable Design Space Exploration

Simecs – User View
● Two types of parameters
  – technique parameters (e.g., discretisation of nodes, convergence tolerance)
  – model parameters (e.g., Young's modulus of a material, viscosity of a fluid).
● Goal: As the Pareto frontier obtained from one set of parameters is forming, the user can switch to another setup and continue the study.
  – e.g., limit the exploration space but increase the resolution.

Simecs – Developer View
● The application developer provides 3 modules:
  – Simulation: Maps a parameter space point to a performance space point
  – Visualisation & interaction: Displays the relevant information to the user; collects information from the user and maps it into the Simulation module
  – Transformation: Transforms the state of a simulation from one technique-parameter setting to another.
    ● e.g., interpolate checkpoints from different resolutions
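As an illustration of what a Transformation module might do, the sketch below linearly interpolates a 1D checkpoint vector from one resolution to another. It assumes equally spaced nodes; the function name and the sample checkpoint are hypothetical.

```python
def resample(checkpoint, new_size):
    """Linearly interpolate a 1D checkpoint onto new_size equally spaced nodes."""
    old_size = len(checkpoint)
    if old_size < 2 or new_size < 2:
        raise ValueError("need at least two nodes on both grids")
    out = []
    for k in range(new_size):
        # Position of new node k, measured in old-grid index units.
        pos = k * (old_size - 1) / (new_size - 1)
        lo = min(int(pos), old_size - 2)  # left neighbour on the old grid
        frac = pos - lo                   # fractional distance from that neighbour
        out.append((1 - frac) * checkpoint[lo] + frac * checkpoint[lo + 1])
    return out


coarse = [0.0, 1.0, 0.0]   # checkpoint computed on 3 nodes
fine = resample(coarse, 5)  # starting state for a 5-node run
```

The interpolated vector can then serve as the starting state for a simulation at the new resolution.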

Simecs – System View ● Shared object layer, Active sampler, Resource Allocator

Simecs – System View
● Shared object space layer: System-wide repository of shared objects (e.g., checkpoints, error estimations, results)
● Sampler: Based on users' specifications, issues sample points where simulations will be run
● Resource Allocator / Manager: Maps simulations onto computing elements, decides whether to use a checkpoint.

Simecs – SISOL
● Spatially-Indexed Shared Object Layer (SISOL)
● Used for storing system-wide shared objects.
● For the model problem: checkpoints and results (performance metric at each parameter point).
● names a unique object in the system.

Simecs – SISOL
● Objects are typed: SISOL requires pack() and unpack() implementations for each type. Parallel object types also require a function that maps parallel objects into different decompositions.
● Supports split-phase create, delete, read and write, to enforce read-modify-write consistency
● Supports neighborhood queries

Simecs – SISOL Implementation
● Ideal implementation: directory-based cache, where each node participates in storing objects.
● Current implementation:
  – Single TCP server
  – In core
  – Hash-map based lookup
  – Linear lookup for nearest neighbor
  – Supports only sequential objects

Simecs – SISOL Implementation
  – Object sets created on server
  – Nearest neighbor query retrieves coordinates only
  – Supports the sequential PETSc vector object type by default.
● Sufficient for small sets, small objects
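The current implementation described above (in-core storage, hash-map lookup, linear nearest-neighbor scan that returns coordinates only, per-type pack and unpack) can be approximated by the following sketch. It is not the SISOL API: the class and method names are hypothetical, pickle stands in for the type-specific pack()/unpack() pair, and the TCP server and split-phase operations are left out.

```python
import math
import pickle


class ObjectSet:
    """Toy in-core object set indexed by coordinates (hypothetical API)."""

    def __init__(self, pack=pickle.dumps, unpack=pickle.loads):
        self._pack = pack      # serialiser for this object type
        self._unpack = unpack  # matching deserialiser
        self._store = {}       # hash map: coordinates -> packed bytes

    def write(self, coords, obj):
        self._store[tuple(coords)] = self._pack(obj)

    def read(self, coords):
        return self._unpack(self._store[tuple(coords)])

    def nearest(self, coords):
        """Linear scan over all keys; returns coordinates only (None if empty)."""
        if not self._store:
            return None
        return min(self._store, key=lambda c: math.dist(c, coords))


checkpoints = ObjectSet()
checkpoints.write((0.2, 0.8), [1.0, 2.0])
checkpoints.write((0.6, 0.4), [3.0, 4.0])
where = checkpoints.nearest((0.25, 0.75))  # coordinates of the closest checkpoint
state = checkpoints.read(where)            # a second call fetches the object itself
```

Returning only coordinates from the neighbor query keeps the query cheap; the caller decides whether the object is worth fetching.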

Simecs – SISOL Use
● The current Pareto frontier problem uses two object sets:
  – Result set (parameter point => performance metric)
  – Checkpoint set (parameter point => sequential PETSc vectors)
● In the test problem, the parameter point is a 2D vector, so the result set and checkpoint set have 2D indices.

Simecs – FUEL
● Frame/Update Exchange Layer: Control layer between the manager and simulation processes
● Code that represents a functional aspect of a steerable application is grouped together (called a Satellite).
● Event-based on the manager process; poll-based on simulation processes
● Dynamic model: Satellites can be activated and decommissioned while a simulation is running

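Simecs – FUEL Interaction
● As the simulator runs one simulation for a parameter point, the manager is processing the previous one(s).
● Timeline (simulator process and manager process work concurrently in each step):
  – Step 1: the simulator calculates point X; the manager queries the sampler and gets point Y. The simulator then sends the X result and receives point Y.
  – Step 2: the simulator calculates point Y; the manager registers the X result, queries the sampler, and gets point Z. The simulator then sends the Y result and receives point Z.
  – Step 3: the simulator calculates point Z; the manager registers the Y result, queries the sampler, and gets point A. The simulator then sends the Z result and receives point A.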

Simecs – Active Sampler
● Resolves the Pareto frontier progressively
  – Maintains a task queue and a result set
  – Task queue = points of interest in the parameter space; result set = points discovered so far that are undominated (i.e., current Pareto set candidates)
  – Seeds the task queue with points from a lattice on the parameter space.
  – Runs the task queue.

Simecs – Active Sampler
  – For each result that comes back, decides whether the point is undominated by all points in the result set. If so, removes all points in the result set that it dominates, adds it to the result set, and inserts its lattice neighbors into the task queue.
  – Continues until the task queue is empty.
  – Refines the lattice, then repeats
● Effect: the result set contains Pareto point candidates that originated from a lattice. The lattice becomes finer as more time is spent.
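The sampling loop described on these two slides can be sketched sequentially as below. Several details are assumptions, not taken from the slides: a 2D unit-square parameter space, metrics that are minimised, halving the lattice spacing at each refinement, and re-seeding the queue with the finer-lattice neighbors of the current candidates. All names are hypothetical.

```python
from collections import deque


def dominates(p, q):
    """Complete domination, assuming smaller is better in every component."""
    return all(a < b for a, b in zip(p, q))


def neighbors(pt, step, n):
    """Lattice neighbors of pt at the given spacing, clipped to [0, n]^2."""
    i, j = pt
    for di in (-step, 0, step):
        for dj in (-step, 0, step):
            ni, nj = i + di, j + dj
            if (di or dj) and 0 <= ni <= n and 0 <= nj <= n:
                yield (ni, nj)


def active_sample(simulate, coarse=4, levels=3):
    n = coarse * 2 ** (levels - 1)  # integer index range of the finest lattice
    step = n // coarse              # spacing of the initial (coarsest) lattice
    results = {}                    # current Pareto candidates: point -> metric
    seen = set()                    # points already simulated
    queue = deque((i, j) for i in range(0, n + 1, step)
                  for j in range(0, n + 1, step))
    for _ in range(levels):
        while queue:
            pt = queue.popleft()
            if pt in seen:
                continue
            seen.add(pt)
            metric = simulate(pt[0] / n, pt[1] / n)
            if any(dominates(m, metric) for m in results.values()):
                continue            # dominated: do not explore around it
            results = {q: m for q, m in results.items()
                       if not dominates(metric, m)}
            results[pt] = metric
            queue.extend(neighbors(pt, step, n))
        step //= 2                  # refine the lattice
        if step == 0:
            break
        for pt in list(results):    # assumption: re-seed around the candidates
            queue.extend(neighbors(pt, step, n))
    return {(i / n, j / n): m for (i, j), m in results.items()}, len(seen)


def toy(x, y):
    """Stand-in simulation: squared distances to two competing optima."""
    return ((x - 0.3) ** 2 + (y - 0.3) ** 2, (x - 0.7) ** 2 + (y - 0.7) ** 2)


frontier, evaluated = active_sample(toy)
```

Because dominated points never enqueue their neighbors, the loop simulates fewer points than a full sweep of the finest lattice (17 x 17 points with these defaults).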

Simecs – Active Sampler: Initial Grid

Simecs – Active Sampler: 1st-level results

Simecs – Active Sampler: 1st-level Pareto frontier

Simecs – Active Sampler: 1st refinement

Simecs – Active Sampler: 2nd-level results

Simecs – Active Sampler: 2nd-level Pareto frontier

Simecs – Active Sampler: 2nd refinement

Simecs – Active Sampler: 3rd-level results

Simecs – Active Sampler: 3rd-level Pareto frontier

Simecs – Manager
● Spawns simulation processes
● When the result of a simulation comes back (via a FUEL callback):
  – Registers the result
  – Asks the active sampler for the next point to run
  – Looks up a checkpoint in SISOL to jump-start the next point
  – Sends the parameters of the next simulation, the coordinates of the checkpoint, and the error tolerances to the simulation process.
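The callback sequence above might look roughly like the following sketch. It is an illustration only: the class, method, and field names are hypothetical, plain dicts stand in for the SISOL object sets, an iterator stands in for the sampler, and the sketch assumes the checkpoint arrives together with the result.

```python
import math


class Manager:
    """Toy manager: reacts to a finished simulation by issuing the next one."""

    def __init__(self, sampler, tolerance=1e-6):
        self.sampler = sampler      # iterator yielding parameter points
        self.tolerance = tolerance  # error tolerance passed to simulations
        self.results = {}           # parameter point -> performance metric
        self.checkpoints = {}       # parameter point -> checkpoint object

    def on_result(self, point, metric, checkpoint):
        """Callback for a finished simulation; returns the next job or None."""
        self.results[point] = metric           # register the result
        self.checkpoints[point] = checkpoint
        next_point = next(self.sampler, None)  # ask the sampler
        if next_point is None:
            return None
        # Pick the stored checkpoint closest to the next point as a jump-start.
        nearest = min(self.checkpoints, key=lambda c: math.dist(c, next_point))
        return {
            "point": next_point,
            "checkpoint_at": nearest,
            "tolerance": self.tolerance,
        }


manager = Manager(iter([(0.5, 0.5), (0.9, 0.9)]))
job = manager.on_result((0.4, 0.4), (1.0, 2.0), "checkpoint-data")
```

Only the checkpoint's coordinates are sent, matching the slide; the simulation process would fetch the object itself from the shared store.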

Simecs – Test System
● Single-server implementation of SISOL to store the checkpoint set
● Three sampler versions: Active, Random, and Sweep
● TCP-based FUEL
● Simulation implemented with the PETSc SNES solver.
● Jump-start from checkpoints: use the checkpoint's configuration as the starting guess

Simecs – Test System
● Heterogeneous cluster:
  – One 1.5 GHz Athlon node (manager, SISOL server)
  – GHz Duron nodes (simulation processes)
  – Ten 3 GHz Pentium 4 nodes (simulation processes)
  – 100 Mbps switched Ethernet network between Athlon and Duron nodes, 10 Mbps Ethernet between Pentium 4 nodes.

Simecs – Test Result (Sampler)
● Active Sampler compared against: 1) a grid-based sampler, which performs a parameter sweep on the grid with increasing refinement, and 2) a random sampler
● Each sampler runs for 1500 simulations, and the partial frontiers are dumped at periodic intervals. The Hausdorff distance is measured, using the final Active Sampler-based frontier with 1500 simulations as the ground truth.
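For reference, the Hausdorff distance between two finite point sets can be computed as in this sketch. The slides do not say which space or point metric was used, so Euclidean distance between points is an assumption here, and the sample data is made up.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


partial = [(0.0, 0.0), (1.0, 0.0)]            # a partial frontier
truth = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]  # the reference frontier
distance = hausdorff(partial, truth)
```

The distance shrinks to zero as the partial frontier fills in every region of the reference frontier, which is what makes it usable as a convergence measure.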

Simecs – Test Result (Sampler)

Simecs – Test Results (Sampler)

Simecs – Test Result (Checkpoints)
● Reusing checkpoints cuts down the number of iterations per simulation.

Simecs – Test Result (Scaling) Duron nodes added (Slower speed, faster communication)

Simecs – Test Result (Scaling)

Simecs – Conclusions
● Multiple experiments can be managed automatically
● Interactive speed can be achieved via reuse of checkpoints, active sampling, and partial results
  – Run time goes from 3088 seconds down to 17 seconds, and lower if partial frontiers can be used

Simecs – Conclusions
● The TCP-based communication framework makes the system portable: it can be used on heterogeneous clusters
● Spatially-indexed object sets are a useful communication substrate

Simecs – Future work
● Distributed implementation of SISOL
● Parallelise individual simulations (SISOL support for parallel objects)
● MPI-based communication for SISOL and FUEL
● Interactivity