Problem Solving with NetSolve


Problem Solving with NetSolve
Michelle Miller, Keith Moore, Susan Blackford
NetSolve group, Innovative Computing Lab, University of Tennessee

NetSolve Problem Statement
– Software libraries could be easier to install and use
    Locate library
    Configure and install library on local machine
– Need access to bigger/different machines

NetSolve Solution
– Simple, consistent interface to numeric packages
– Ease of use: locate and use already configured, installed, and running solvers through a simple procedure call
– Resource sharing (hardware and software)
    Greater access to machine resources
    New software packages made available simply

NetSolve Architecture
[Diagram: a Matlab client with its client proxy, the NetSolve agent, and five servers with their installed libraries: Server 1 (blas, petsc), Server 2 (lapack, mcell), Server 3 (blas, itpack), Server 4 (superLU), Server 5 (aztec, MA28).]
Successive builds of the diagram follow one request:
– The Matlab client calls netsolve('problemX', A, rhs).
– The agent is asked which servers can handle the problem ("Servers?").
– The agent takes workload into account and answers with a list of servers: Server1, Server3.
– problemX, A, and rhs are sent to a server.
– The server sends back the result.
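The lookup step in the architecture diagram can be sketched as a small registry query. This is only an illustration, not NetSolve's agent code: the `Server` class, the `agent_lookup` function, and the workload numbers are hypothetical, and it assumes that problemX needs the blas library, which is the one library Server 1 and Server 3 share on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    libraries: frozenset
    workload: float  # hypothetical load metric; lower means less busy


# Server/library pairs come from the architecture slide.
# The workload values are invented for illustration.
SERVERS = [
    Server("Server1", frozenset({"blas", "petsc"}), 0.20),
    Server("Server2", frozenset({"lapack", "mcell"}), 0.10),
    Server("Server3", frozenset({"blas", "itpack"}), 0.70),
    Server("Server4", frozenset({"superLU"}), 0.05),
    Server("Server5", frozenset({"aztec", "MA28"}), 0.40),
]


def agent_lookup(library, servers=SERVERS):
    """Return the names of servers offering `library`, least loaded first."""
    capable = [s for s in servers if library in s.libraries]
    return [s.name for s in sorted(capable, key=lambda s: s.workload)]


if __name__ == "__main__":
    # Assumption: problemX is a blas routine.
    print(agent_lookup("blas"))
```

The client side would then send the problem name and its inputs to the first server in the returned list.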

Parallelism in NetSolve
Task Farming
– Single request issued that specifies data partitioning
– Data parallel, SPMD support

Task Farming Interface

/*** BEFORE ***/
preamble and initializations;
status1 = netslnb("iqsort", size1, array1, sorted1);
status2 = netslnb("iqsort", size2, array2, sorted2);
...
status20 = netslnb("iqsort", size20, array20, sorted20);
program continues;

/*** AFTER ***/
preamble and initializations;
status_array = netsl_farm("iqsort", "i=0,19",
                          netsl_int_array(size_array, "$i"),
                          netsl_ptr_array(input_array, "$i"),
                          netsl_ptr_array(sorted_array, "$i"));
program continues;
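A rough local analogue of the farming call may help show what the single request expands to. The sketch below is not the NetSolve API: `parse_iterator` and `farm` are hypothetical names, a thread pool stands in for the remote servers, and the iterator string "i=0,19" is assumed to have inclusive bounds, since it replaces twenty separate calls.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_iterator(spec):
    """Parse an iterator string such as "i=0,19" into (name, range)."""
    name, bounds = spec.split("=")
    first, last = (int(x) for x in bounds.split(","))
    return name.strip(), range(first, last + 1)  # bounds treated as inclusive


def farm(problem, iterator, *arg_arrays, max_workers=4):
    """Call problem(arr[i] for each argument array) for every index i.

    Results are returned in index order, like status_array on the slide.
    """
    _, indices = parse_iterator(iterator)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(problem, *(arr[i] for arr in arg_arrays))
            for i in indices
        ]
        return [f.result() for f in futures]


if __name__ == "__main__":
    # Twenty small arrays to sort, standing in for array1 ... array20.
    input_array = [[j, 0, -j] for j in range(20)]
    sorted_array = farm(sorted, "i=0,19", input_array)
    print(sorted_array[5])
```

The point of the design is that the caller states the partitioning once and leaves the dispatch of the individual requests to the system.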

Request Sequencing
– Sequence of computations
– Data dependency analysis to reduce extra data transfers between sequence steps
– Transmit the superset of all input/output parameters and keep it persistent near the server(s) for the duration of sequence execution

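Data Persistence

Without sequencing, each call is a separate client-server exchange:

netsl("command1", A, B, C);
netsl("command2", A, C, D);
netsl("command3", D, E, F);

Client -> Server: command1(A, B); Server -> Client: result C
Client -> Server: command2(A, C); Server -> Client: result D
Client -> Server: command3(D, E); Server -> Client: result F

With sequencing, the inputs are sent once and the intermediate results stay on the server side:

netsl_begin_sequence( );
netsl("command1", A, B, C);
netsl("command2", A, C, D);
netsl("command3", D, E, F);
netsl_end_sequence(C, D);

Client -> Server: sequence(A, B, E)
Server -> Server: input A, intermediate output C
Server -> Server: intermediate output D, input E
Server -> Client: result F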
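The data dependency analysis behind the sequence above can be sketched in a few lines. This is an illustrative reconstruction, not NetSolve's implementation: `analyze_sequence` and `transfers_without_sequencing` are hypothetical names, and the split of each call's arguments into inputs and outputs (last argument is the output) is read off the diagram.

```python
def analyze_sequence(calls, keep_remote=()):
    """Work out what a sequence must send and what it must return.

    calls: list of (command, inputs, outputs) in program order.
    keep_remote: outputs that stay near the servers (the arguments
    passed to netsl_end_sequence on the slide).
    """
    produced = set()
    to_send = []
    for _command, inputs, outputs in calls:
        for name in inputs:
            # Send an object only if no earlier step produced it
            # and it has not already been scheduled for sending.
            if name not in produced and name not in to_send:
                to_send.append(name)
        produced.update(outputs)
    to_return = [
        name
        for _command, _inputs, outputs in calls
        for name in outputs
        if name not in keep_remote
    ]
    return to_send, to_return


def transfers_without_sequencing(calls):
    """Count objects moved when every call is a separate round trip."""
    return sum(len(inputs) + len(outputs) for _c, inputs, outputs in calls)


if __name__ == "__main__":
    calls = [
        ("command1", ["A", "B"], ["C"]),
        ("command2", ["A", "C"], ["D"]),
        ("command3", ["D", "E"], ["F"]),
    ]
    print(analyze_sequence(calls, keep_remote={"C", "D"}))
    print(transfers_without_sequencing(calls))
```

For the three commands on the slide this gives A, B, E to send and F to return, four objects in total, against nine object transfers when each call runs on its own.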

NetSolve Applications
MCell (Salk Institute)
– Monte Carlo simulator of cellular microphysiology (synaptic transmission)
– Large numbers of the same computation with different parameters (diffusion and chemical reaction calculations)
– Task farming used for parallel runs

NetSolve and Metacomputing Backends
[Diagram: the NetSolve client reaches each backend through a common client-proxy interface.]
– NetSolve proxy: NetSolve services (NetSolve agent, NetSolve servers)
– Globus proxy: MDS, GASS, GRAM
– Ninf proxy: Ninf services
– Legion proxy: Legion services

NetSolve Authentication with Kerberos
– Kerberos is used to maintain Access Control Lists and manage access to computational resources.
– NetSolve properly handles authorized and non-authorized components together in the same system.

NetSolve Authentication with Kerberos
[Diagram: NetSolve client, NetSolve agent, NetSolve servers, and Kerberos KDC. The legend distinguishes a typical NetSolve transaction from a Kerberized interaction.]
– Servers register their presence with the agent and KDC.
– Client issues problem request; agent responds with list of servers.
– Client sends work request to server; server replies requesting authentication credentials.
– Client requests ticket from KDC.
– Client sends ticket and input to server; server authenticates and returns the solution set.

NWS Integration
[Diagram: two host machines, each running a NetSolve server and a CPU sensor, plus the NetSolve agent, the NWS forecaster, and NWS memory.]
– Sensors report to NWS memory.
– Agent probes NWS forecaster.
– Forecaster probes memory.
– Forecaster makes forecast.
– Agent chooses server.
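The five steps of the NWS integration can be mimicked with a toy model. Everything here is a stand-in: the class and function names are hypothetical, and the forecast is a plain mean over a short window, which is an assumption made for brevity and not a description of how the NWS forecaster actually predicts.

```python
from collections import defaultdict, deque


class NwsMemory:
    """Keeps the most recent sensor readings per host."""

    def __init__(self, history=10):
        self._series = defaultdict(lambda: deque(maxlen=history))

    def report(self, host, cpu_available):
        # Step 1: a CPU sensor reports a measurement.
        self._series[host].append(cpu_available)

    def readings(self, host):
        return list(self._series[host])


class Forecaster:
    """Predicts the next reading as the mean of the stored window."""

    def __init__(self, memory):
        self.memory = memory

    def forecast(self, host):
        data = self.memory.readings(host)  # Step 3: forecaster probes memory
        if not data:
            return 0.0
        return sum(data) / len(data)       # Step 4: forecaster makes forecast


def agent_choose_server(hosts, forecaster):
    """Steps 2 and 5: probe the forecaster and pick the best host."""
    return max(hosts, key=forecaster.forecast)


if __name__ == "__main__":
    memory = NwsMemory(history=3)
    for value in (0.9, 0.1, 0.1, 0.1):
        memory.report("hostA", value)
    for value in (0.5, 0.5):
        memory.report("hostB", value)
    print(agent_choose_server(["hostA", "hostB"], Forecaster(memory)))
```

Keeping only a bounded window in memory means an old burst of availability on hostA no longer influences the choice.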

Newly enabled libraries

Sparse Matrices/Solvers
Packages: PETSc, SuperLU, SPOOLES, MA28
– Support for compressed row/column sparse matrix storage significantly reduces network data transmission.
– Iterative and direct solvers: PETSc, Aztec, SuperLU, MA28, …
– All available solver packages will be made available from UTK NetSolve servers and others.
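To make the transmission saving concrete, here is a minimal sketch of compressed sparse row (CSR) storage and a count of how many numbers each representation would put on the wire. The function names are hypothetical and the matrix is a small tridiagonal example chosen for illustration; the slides do not say how NetSolve encodes the arrays.

```python
def to_csr(dense):
    """Convert a dense matrix (list of rows) to CSR arrays.

    Returns (values, col_idx, row_ptr): the nonzero values, their column
    indices, and the offset in `values` where each row starts.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def transfer_counts(dense):
    """Return (numbers sent as dense, numbers sent as CSR)."""
    n_rows, n_cols = len(dense), len(dense[0])
    values, col_idx, row_ptr = to_csr(dense)
    return n_rows * n_cols, len(values) + len(col_idx) + len(row_ptr)


def tridiagonal(n):
    """Build an n x n tridiagonal test matrix."""
    return [
        [2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    print(to_csr([[4, 0, 0], [0, 5, 6], [0, 0, 7]]))
    print(transfer_counts(tridiagonal(100)))
```

For the 100 x 100 tridiagonal matrix the dense form carries 10000 numbers while the CSR arrays carry 697, and the gap widens as the matrix grows.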

Matlab interface
Calls to PETSc, Aztec
– [x, its] = netsolve('iterative_solve_parallel', 'PETSC', A, b, 1.e-6, 500);
– [x, its] = netsolve('iterative_solve_parallel', 'AZTEC', A, b, 1.e-6, 500);
Similar for SuperLU, MA28
– [x] = netsolve('direct_solve_serial', 'SUPERLU', A, b, 0.3, 1);
– [x] = netsolve('direct_solve_serial', 'MA28', A, b, 0.3, 1);
Calls to LAPACK, ScaLAPACK
– [lu, p, x, info] = netsolve('dgesv', A, b);
– [lu, p, x, info] = netsolve('pdgesv', A, b);

'LinearSolve' interface
– Uncertain which library to choose?
– The 'LinearSolve' interface chooses the library and appropriate routine for the user
– [x] = netsolve('LinearSolve', A, b);

Heuristics
Interface analyzes the matrix A
– Matrix shape (square, rectangular)?
    If rectangular, choose a linear least squares solver from LAPACK
– Matrix element density
    If square, is the matrix sparse or dense?
    Manually check the percentage of nonzeros and transform to sparse matrix format
    If dense, is it symmetric?

Heuristics cont'd
– If sparse with a dense band, use LAPACK
– If sparse with a block or diagonal structure, a solver that exploits it (Aztec) can yield higher performance (level 3 BLAS)
– If sparse, direct or iterative solver?
    Size of the matrix (for a large matrix, fill-in can make a direct method need more storage than an iterative one)

Heuristics cont'd
    Numerical properties (direct solvers can handle more complicated matrices than iterative methods)
– How to estimate fill-in and gauge numerical properties of A?
Future Work: 'Eigensolve' interface
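The heuristics on the preceding slides read like a decision tree, which might be sketched as follows. This is a guess at the shape of the logic, not the 'LinearSolve' implementation: `choose_solver` is a hypothetical name, the density threshold and band limit are invented values, and the direct-versus-iterative choice is left open because the slides themselves leave it as a question.

```python
def choose_solver(A, sparse_threshold=0.1, band_fraction=0.1):
    """Suggest a (library, kind) pair for solving A x = b.

    A is a list of rows. Both thresholds are assumptions made for this
    sketch; the slides give no numeric cut-offs.
    """
    n_rows, n_cols = len(A), len(A[0])

    # Shape: a rectangular system goes to a least squares solver.
    if n_rows != n_cols:
        return ("LAPACK", "linear least squares")

    # Density: fraction of nonzero entries.
    nonzeros = sum(1 for row in A for v in row if v != 0)
    density = nonzeros / (n_rows * n_cols)
    if density > sparse_threshold:
        symmetric = all(
            A[i][j] == A[j][i] for i in range(n_rows) for j in range(i)
        )
        return ("LAPACK", "dense symmetric" if symmetric else "dense general")

    # Sparse: nonzeros confined to a narrow band around the diagonal?
    half_bandwidth = max(
        (abs(i - j) for i, row in enumerate(A) for j, v in enumerate(row) if v != 0),
        default=0,
    )
    if half_bandwidth <= band_fraction * n_rows:
        return ("LAPACK", "band")

    # Fill-in and numerical properties would decide between the two.
    return ("sparse package", "direct or iterative")


if __name__ == "__main__":
    print(choose_solver([[1, 2, 3], [4, 5, 6]]))
    print(choose_solver([[2, 1], [1, 2]]))
```

A fuller version would also look for block or diagonal structure before falling through to the last branch.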

Interfacing to Parallel Libraries
Improved task scheduling and load balance
– NWS memory sensor, latency and bandwidth sensors
– Provide matrix distributions for the user
– Heuristics for the best choice of the number of processors, amount of matrix per processor, process grid dimension, ...
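One of the listed heuristics, the process grid dimension, has a simple common starting point: pick the factorization of the processor count that is closest to square. The sketch below shows only that rule of thumb; `process_grid` is a hypothetical name and nothing in the slides says this is the rule NetSolve applies.

```python
import math


def process_grid(num_procs):
    """Return (rows, cols) with rows * cols == num_procs and rows <= cols,
    choosing the factor pair closest to a square grid."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    rows = math.isqrt(num_procs)
    # Walk down from the integer square root to the nearest divisor.
    while num_procs % rows:
        rows -= 1
    return rows, num_procs // rows


if __name__ == "__main__":
    for p in (7, 12, 16):
        print(p, process_grid(p))
```

A prime processor count degenerates to a one-row grid, which is one reason a heuristic might prefer to use fewer processors than are available.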

Get NetSolve 1.3 Now!
– Release date: April 2000
– UNIX client/agent/server source code
– UNIX client binaries available
– Win32 DLLs for C/Matlab/Mathematica clients