Computer Science Overview


Computer Science Overview
Laxmikant Kale, Department of Computer Science
June 5, 2001
©2000 Board of Trustees of the University of Illinois

CS Faculty and Staff
Investigators: T. Baker, M. Bhandarkar, M. Campbell, E. de Sturler, H. Edelsbrunner, R. Fiedler, M. Heath, J. Hoeflinger, L. Kale, J. Liesen, J. Norris, D. Padua, D. Reed, P. Saylor, K. Seamons, A. Sheffer, S. Teng, M. Winslett, plus numerous students

Computer Science Research Overview
Computational Mathematics and Geometry:
- Linear solvers and preconditioners
- Eigensolvers
- Mesh generation and adaptation
- Interface propagation and interpolation
Computational Environment:
- Software integration framework
- Parallel I/O and data migration
- Performance tools and techniques
- Computational steering
- Visualization

Linear Solvers
- Analysis of Krylov subspace methods
- Development of faster and more robust Krylov subspace methods
- Development of more robust methods for ill-conditioned linear and nonlinear systems
- Improvement of Jacobi-Davidson methods for eigenvalue problems
- Derivation of sharper error estimates and stopping criteria for iterative methods
- Preconditioners for radiation transport problems

Mesh Generation and Adaptation
- Mesh adjustment for moving boundaries
- Data structures for non-conformal meshes in discontinuous Galerkin methods
- Space-time mesh generation
- Model simplification for meshing
- Surface parameterization and element shape improvement
- Skin model for evolving-boundary space-time meshing

Mesh Generation and Adaptation
1) Library for mixed 3D cohesive element meshes: a program for introducing cohesive elements based on material types, in interaction with Geubelle
2) Mesh quality measures and Laplace smoothing in the ALE code, with Mike Brandyberry
3) Continuing research on space-time meshing in 2DxTIME
4) Surface parameterization with E. de Sturler's group, in collaboration with Sandia

Software Integration Framework
- Flexible framework for coupling stand-alone application codes
- Encapsulation via objects and threads
- Runtime environment to support dynamic behavior (e.g., refinement, load balancing)
- Intelligent interface for mediating communication between component modules

Roccom -- Component Manager
- Mechanism for inter-component data and function sharing
- Roccom API: programming interface for application modules
- Roccom developers interface: C++ interface for service modules
- Roccom implementations: runtime systems of Roccom

Rationales of Roccom
- Object-oriented philosophy: enforce encapsulation of data; enable runtime polymorphism of functions
- Minimal changes to existing applications: each component manages its own data and publishes data and functions by registering them with Roccom
- Maximal concurrency in code development: no need to worry about the details of other components' data structures
- Maximal flexibility for integration: switch application components, service components, and the runtime system with minimal changes to code

Data and Function Organization
- Window: distributed object. Geometrically, the portion of a mesh in contact with another; more generally, a collection of interface data and functions.
- Pane: chunk of a distributed object. The portion of a window specific to a thread, with its own arrays; each thread can have multiple panes.
- Attribute: public data member of a window. Window, pane, node, and element attributes; composite attributes, e.g., "mesh" and "all".
- Function: public function member of a window.

Status of Roccom
- Roccom implementation: complete base implementation for SPMD-style integration; Charm++- and Autopilot-based implementations are ongoing
- Current Roccom service components: Rocface 2.0 (data transfer between nonmatching meshes); interface convergence check; two switchable HDF output modules, one sequential and one parallel using Panda
- Application modules: Rocflo, Rocsolid, Rocfrac, and Rocburn

Rocface -- Interface Component
Robust and efficient algorithm for overlaying two surface meshes. Rocface handles transferring data between nonmatching surface meshes. To do so, Rocface first constructs a reference mesh, which is the overlay, i.e., the common refinement of the two meshes.

Example Overlay on Star Grain
Example of overlaying two meshes on a star-grain geometry. It demonstrates Rocface's ability to handle relatively complex geometry with sharp edges and corners.

Least Squares Data Transfer
- Minimizes error and enforces conservation
- Handles node- and element-centered data
- Made possible by the overlay
- Achieved superb experimental results
After computing the overlay, Rocface transfers data using a least-squares formulation. This formulation is both accurate and conservative, works for both node- and element-centered data, and can be solved accurately and efficiently using the overlay of the two meshes. Our experimental results show that this method works much better than others. The figures show the displacements computed by two different methods for a burning-cavity problem with uniform pressure and uniform regression after 500 time steps: the left figure ("Our method") uses the new method, which is very accurate; the right figure ("Load transfer (Farhat)") uses the conservative load-transfer algorithm by Farhat and has an error of about 20%.

Performance of GEN1 Using Charm++

Load Balancing with Charm++ (times in seconds)

Phase          16P3     16P2     8P3,8P2 w/o LB   8P3,8P2 w. LB
Fluid update   75.24    97.50    96.73            86.89
Solid update   41.86    52.50    52.20            46.83
Pre-Cor Iter   117.16   150.08   149.01           133.76
Time Step      235.19   301.56   299.85           267.75

New Capabilities
- Shrink and expand: the set of assigned processors can be changed at runtime; based on Charm++
- FEM Framework: Fortran 90- and C++-based parallelization of unstructured mesh-based (FEM) codes; based on Charm++
  - Components can be used for generation of communication lists; used for the new ROCCrack
  - Planned: insertion and deletion of elements

AMPI
What is AMPI: adaptive load balancing for MPI programmers
- Uses Charm++'s load balancing framework
- Uses multiple MPI threads per processor (lightweight threads)
Recent progress:
- Compiler support for automatic conversion of global variables and of packing/unpacking functions
- Cross-communicators: allow multiple components to communicate; two independent MPI "worlds" can communicate; implemented for the ROCFLO/ROCSOLID separation

AMPI and ROC* (diagram: Rocflo, Rocface, Rocsolid coupled over AMPI)

Plans for the Framework
- Automatic out-of-core execution: take advantage of data-driven execution
- Cluster management: job scheduler to maximize throughput, using stretchable jobs (as well as fixed-size ones)

Parallel I/O and Data Migration
- Parallel output of snapshots for GEN1: combine arrays for different blocks into a single virtual array; output multiple arrays at once using an array group; manage metadata for outputting HDF files for Rocketeer
- Automatic tuning of parallel I/O performance
- Data migration concurrent with the application; automatic choice of data migration strategy
- Alpha testing and benchmarking of Panda 3.0

Autopilot Interfacing Library (AP Roccom implementation)
- Library for interfacing multiple applications over a network, for parallel multi-physics or multi-component scientific applications
- Little to no source code changes required; supports C, C++, F77, F90, HPF
- Independent of the application's parallelization implementation
- Cross-platform interfacing with Linux, AIX, Solaris, and Irix
- Built on Globus

Autopilot ROCCOM
- Provides mechanisms for runtime computational steering; requires some (little) source code change
- Mechanisms for user/client-based or automatic steering
- Dynamic starting, stopping, and swapping of application components at runtime
- Provides mechanisms for runtime performance tuning and visualization; built on top of the existing Pablo performance suite
- Mechanisms for automatic performance-based steering at runtime
- Remote performance visualization on workstations or the I-desk using Virtue

Visualization with Rocketeer
Built on the Visualization Toolkit (VTK) and OpenGL. Supports:
- Structured and unstructured grids
- Cell-centered and node-centered data
- Ghost cells
- Seamless merging of multiple data files
- Automated animation
- Smart HDF reader
- Translucent isosurfaces
- New features: spheres, etc.
Parallel, client-server implementation in progress

Department of Computer Science
Prof. Laxmikant Kale
Department of Computer Science
University of Illinois at Urbana-Champaign
2262 Digital Computer Laboratory
1304 West Springfield Avenue
Urbana, IL 61801 USA
kale@cs.uiuc.edu
http://www.cs.uiuc.edu/contacts/faculty/heath.html
telephone: 217-333-6268
fax: 217-333-1910