Published by Elizabeth May. Modified over 9 years ago.
1
Towards the Design of an Automatically Tuned Linear Algebra Library
Javier Cuenca, José González (Department of Ingeniería y Tecnología de Computadores)
Domingo Giménez (Department of Informática y Sistemas)
University of Murcia, Spain
2
Current Situation of Linear Algebra Parallel Routines
Linear algebra: highly optimizable operations, but the optimizations are platform-specific.
Traditional method: hand-optimization for each platform.
- Time-consuming
- Incompatible with hardware evolution
- Incompatible with changes in the system (architecture and basic libraries)
- Unsuitable for systems with variable workload
- Misuse by non-expert users
3
Solutions to this situation?
Some groups and projects: ATLAS, GrADS, LAWRA, FLAME, I-LIB.
But the problem is very complex.
4
Our approach
Parameterised routines: system parameters and algorithmic parameters.
System parameters are obtained at installation time:
- An analytical model of the routine and simple installation routines are used to obtain the system parameters
- Only a reduced number of executions is needed at installation time
Algorithmic parameters:
- Obtained from the analytical model, using the system parameters obtained in the installation process
5
Our approach: the scheme
[Diagram. Design (LAR-DESIGNER): LAR, Modelling, LAR-MOD, Implementation of LAR-ERs, LAR-ERs. Installation (SYSTEM MANAGER): Execution of LAR-ERs, BL, LAR-SPF, OAP Selection, LAR-OAPF, LAR-IF, Library Inclusion Process.]
6
Design: Modelling the LAR
[Diagram. Design (LAR-DESIGNER): LAR, Modelling, LAR-MOD.]
7
LAR-MOD: Analytical Model of LAR
The behaviour of the algorithm on the platform is defined by:
$T_{exec} = f(SPs, n, APs)$, with $SPs = f(n, APs)$
- SPs: System Parameters
- APs: Algorithmic Parameters
- n: Problem Size
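A minimal sketch of how a model of this shape could be written in code. The cost formula (a cubic arithmetic term shared among p processors plus a per-step message cost) is a hypothetical placeholder, not the model from the slides, and every name and numeric value is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SystemParams:
    t_c: float  # cost of one arithmetic operation (seconds)
    t_s: float  # message start-up time (seconds)
    t_w: float  # time to send one word (seconds)


@dataclass
class AlgorithmicParams:
    b: int  # block size
    p: int  # number of processors


def system_params(n: int, aps: AlgorithmicParams) -> SystemParams:
    """SPs = f(n, APs): hypothetical lookup with made-up values.

    Here only t_c depends on the APs (cheaper operations for larger blocks).
    """
    t_c = 2e-9 if aps.b >= 32 else 4e-9
    return SystemParams(t_c=t_c, t_s=5e-5, t_w=1e-8)


def t_exec(n: int, aps: AlgorithmicParams) -> float:
    """T_exec = f(SPs, n, APs) for an illustrative, made-up cost formula."""
    sps = system_params(n, aps)
    arithmetic = (2.0 / 3.0) * n**3 * sps.t_c / aps.p
    steps = n // aps.b  # number of block steps
    if aps.p == 1:
        communication = 0.0  # no messages in the sequential case
    else:
        communication = steps * (sps.t_s + n * aps.b * sps.t_w)
    return arithmetic + communication
```

With these made-up values, `t_exec(1024, AlgorithmicParams(b=32, p=4))` comes out lower than the sequential `t_exec(1024, AlgorithmicParams(b=32, p=1))`, which is the kind of comparison the model is meant to support.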
8
LAR-MOD: Analytical Model of LAR
System Parameters (SPs):
- Hardware platform: physical characteristics, current conditions
- Basic libraries
LARs performance
9
LAR-MOD: Analytical Model of LAR
System Parameters (SPs):
- Hardware platform: physical characteristics, current conditions
- Basic libraries
LARs performance
Two kinds of SPs:
- Communication System Parameters (CSPs)
- Arithmetic System Parameters (ASPs)
10
LAR-MOD: Analytical Model of LAR
System Parameters (SPs):
- Hardware platform: physical characteristics, current conditions
- Basic libraries
LARs performance
Two kinds of SPs:
- Communication System Parameters (CSPs): t_s (start-up time), t_w (word-sending time)
- Arithmetic System Parameters (ASPs)
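As an illustration of how t_s and t_w could be obtained from measurements, the sketch below assumes the usual linear cost model, in which sending a message of m words takes about t_s + m * t_w, and fits both parameters by least squares. The slides do not state the fitting procedure; the function name and the synthetic timings are assumptions.

```python
def fit_comm_params(samples):
    """Least-squares fit of time = t_s + t_w * words.

    `samples` is a list of (words, seconds) pairs.
    """
    count = len(samples)
    mean_x = sum(w for w, _ in samples) / count
    mean_y = sum(t for _, t in samples) / count
    sxx = sum((w - mean_x) ** 2 for w, _ in samples)
    sxy = sum((w - mean_x) * (t - mean_y) for w, t in samples)
    t_w = sxy / sxx               # slope: time per word
    t_s = mean_y - t_w * mean_x   # intercept: start-up time
    return t_s, t_w


# Synthetic timings generated from t_s = 5e-5 s and t_w = 1e-8 s/word
# (made-up values, not measurements from the slides).
samples = [(m, 5e-5 + 1e-8 * m) for m in (1_000, 10_000, 100_000)]
t_s, t_w = fit_comm_params(samples)
```

On a real platform the samples would come from timing messages of several sizes between two processes.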
11
LAR-MOD: Analytical Model of LAR
System Parameters (SPs):
- Hardware platform: physical characteristics, current conditions
- Basic libraries
LARs performance
Two kinds of SPs:
- Communication System Parameters (CSPs)
- Arithmetic System Parameters (ASPs): t_c (arithmetic cost). Using BLAS: k_1, k_2 and k_3
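One possible reading, which the slide does not spell out, is that k_1, k_2 and k_3 are per-operation costs for BLAS levels 1, 2 and 3. Under that assumption the arithmetic part of the model becomes a weighted sum of operation counts, as in this sketch (all values are made up):

```python
def arithmetic_time(ops_by_level, k):
    """Sum the operation counts per BLAS level, weighted by that level's cost."""
    return sum(ops_by_level[level] * k[level] for level in ops_by_level)


# Hypothetical per-operation costs in seconds for BLAS levels 1, 2 and 3.
k = {1: 8e-9, 2: 4e-9, 3: 1e-9}
# Hypothetical operation counts of some routine, split by BLAS level.
ops = {1: 1_000, 2: 1_000_000, 3: 1_000_000_000}
estimate = arithmetic_time(ops, k)
```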
12
LAR-MOD: Analytical Model of LAR
System Parameters (SPs):
- Hardware platform: physical characteristics, current conditions
- Basic libraries
LARs performance
How to estimate each SP?
1. Obtain the kernel that determines the performance cost of the LAR
2. Build an Estimation Routine from this kernel
13
Design
[Diagram. Design (LAR-DESIGNER): LAR, Modelling, LAR-MOD.]
14
Design: Making the LAR-ERs
[Diagram. Design (LAR-DESIGNER): LAR, Modelling, LAR-MOD, Implementation of LAR-ERs, LAR-ERs.]
15
LAR-ERs: Estimation Routines
Arithmetic System Parameters (ASPs): Estimation Routine built from the computation kernel of the LAR
- Similar storage scheme
- Similar quantity of data
Communication System Parameters (CSPs): Estimation Routine built from the communication kernel of the LAR
- Similar kind of communication
- Similar quantity of data
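A minimal sketch of an arithmetic estimation routine: it times a small kernel and divides by the operation count to get a per-operation cost. The naive matrix product stands in for whatever computation kernel the real routine uses; the names, the kernel choice and the best-of-three timing policy are assumptions, not taken from the slides.

```python
import time


def matmul_kernel(a, b_mat, size):
    """Naive size x size matrix product, standing in for the LAR's computation kernel."""
    c = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for k in range(size):
            aik = a[i][k]
            for j in range(size):
                c[i][j] += aik * b_mat[k][j]
    return c


def estimate_t_c(size, repeats=3):
    """Return the best-of-`repeats` time per floating-point operation."""
    a = [[1.0] * size for _ in range(size)]
    b_mat = [[2.0] * size for _ in range(size)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        matmul_kernel(a, b_mat, size)
        best = min(best, time.perf_counter() - start)
    return best / (2 * size**3)  # a matrix product performs 2 * size^3 operations


t_c = estimate_t_c(32)
```

Running the routine with a data size and storage layout close to those of the target routine follows the "similar storage scheme, similar quantity of data" guidance on the slide.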
16
Design
[Diagram. Design (LAR-DESIGNER): LAR, Modelling, LAR-MOD, Implementation of LAR-ERs, LAR-ERs.]
17
Design: Process has finished
[Diagram. Design (LAR-DESIGNER; hand-made, only once): LAR, Modelling, LAR-MOD, Implementation of LAR-ERs, LAR-ERs.]
18
Installation: Running the LAR-ERs
[Diagram. Design (LAR-DESIGNER): LAR, Modelling, LAR-MOD, Implementation of LAR-ERs, LAR-ERs. Installation (SYSTEM MANAGER): Execution of LAR-ERs, BL, LAR-SPF, LAR-IF.]
19
Installation: obtaining the OAP
[Diagram. Design (LAR-DESIGNER): LAR, Modelling, LAR-MOD, Implementation of LAR-ERs, LAR-ERs. Installation (SYSTEM MANAGER): Execution of LAR-ERs, BL, LAR-SPF, OAP Selection, LAR-OAPF, LAR-IF.]
20
Installation: obtaining the OAP
Algorithmic Parameters (APs)
Once the SP values are known, the optimum values for the APs (OAP) are calculated:
- b: block size
- p: number of processors
- r × c: logical topology, grid configuration (logical 2D mesh)
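Once the SP values are known, selecting the OAP can be sketched as an exhaustive search over candidate (b, r, c) combinations that minimises the modelled time. The cost model below is a hypothetical stand-in (the slides do not give the formula), and the function names and search ranges are assumptions.

```python
def grids(p):
    """All mesh shapes (r, c) with r * c == p."""
    return [(r, p // r) for r in range(1, p + 1) if p % r == 0]


def modelled_time(n, b, r, c, t_c, t_s, t_w):
    """Hypothetical cost model, for illustration only."""
    p = r * c
    arithmetic = (2.0 / 3.0) * n**3 * t_c / p
    steps = n / b
    # Made-up communication term that grows with the mesh dimensions
    # and vanishes on a 1 x 1 mesh.
    communication = steps * ((r - 1) + (c - 1)) * (t_s + n * b * t_w / p)
    return arithmetic + communication


def select_oap(n, block_sizes, max_p, t_c, t_s, t_w):
    """Return the (b, r, c) combination with the lowest modelled time."""
    candidates = [
        (b, r, c)
        for b in block_sizes
        for p in range(1, max_p + 1)
        for r, c in grids(p)
    ]
    return min(candidates, key=lambda ap: modelled_time(n, *ap, t_c, t_s, t_w))


b_opt, r_opt, c_opt = select_oap(
    n=2048, block_sizes=[16, 32, 64], max_p=8, t_c=2e-9, t_s=5e-5, t_w=1e-8
)
```

Because the search evaluates only the model, not the routine itself, it needs no further executions once the system parameters are available.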
21
Installation
[Diagram. Design (LAR-DESIGNER): LAR, Modelling, LAR-MOD, Implementation of LAR-ERs, LAR-ERs. Installation (SYSTEM MANAGER): Execution of LAR-ERs, BL, LAR-SPF, OAP Selection, LAR-OAPF, LAR-IF.]
22
Installation: putting it all together
[Diagram. Design (LAR-DESIGNER): LAR, Modelling, LAR-MOD, Implementation of LAR-ERs, LAR-ERs. Installation (SYSTEM MANAGER): Execution of LAR-ERs, BL, LAR-SPF, OAP Selection, LAR-OAPF, LAR-IF, Library Inclusion Process.]
23
Installation process finished
[Diagram. Design (LAR-DESIGNER): LAR, Modelling, LAR-MOD, Implementation of LAR-ERs, LAR-ERs. Installation (SYSTEM MANAGER): Execution of LAR-ERs, BL, LAR-SPF, OAP Selection, LAR-OAPF, LAR-IF, Library Inclusion Process.]
24
Experiments
- LAR: Least Squares Toeplitz Routine. Platform: network of PCs
- LAR: One-sided Block Jacobi Method to solve the Symmetric Eigenvalue Problem. Platform: SGI Origin 2000
- LAR: Gaussian elimination. Platform: NoW (heterogeneous system)
- LAR: block LU factorization. Platforms: IBM SP2, SGI Origin 2000, NoW
Basic libraries: reference BLAS, machine BLAS, ATLAS
25
LU on IBM SP2
Quotient between the execution time with the parameters provided by the model and the optimum execution time, in the sequential case and in parallel with 4 and 8 processors.
26
LU on Origin 2000
Quotient between the execution time with the parameters provided by the model and the optimum execution time, in the sequential case and in parallel with 4, 8 and 16 processors.
27
LU on NoW
Quotient between the execution time with the parameters provided by the model and the optimum execution time, in the sequential case and in parallel with 4 processors, using machine BLAS and ATLAS as basic libraries.
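The quotient reported in the LU result slides can be computed as in the sketch below: the time measured with the model-selected parameters divided by the best time over all tested parameter values. A value of 1 means the model picked the optimum. The timings here are made up for illustration and are not results from the presentation.

```python
def quotient(model_time, measured_times):
    """Time with model-chosen parameters divided by the best measured time."""
    return model_time / min(measured_times)


# Hypothetical measurements: block size -> execution time in seconds.
measured = {16: 2.4, 32: 2.0, 64: 2.2}
model_choice = 64  # block size the model is assumed to have selected
q = quotient(measured[model_choice], measured.values())
```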
28
Future Work
- We aim to develop a methodology valid for a wide range of systems and to include it in the design of linear algebra libraries; the methodology still needs to be analysed on more systems and with more routines.
- The basic linear algebra library to use can be considered as another parameter.
- An installation strategy common to a set of routines must be developed.
- At the moment we are analysing routines individually, but it could be preferable to analyse algorithmic schemes.
- We are working on the design of a strategy for parameter selection in dynamic systems.