Published by Brandon Neal. Modified over 8 years ago.
1
Automatically Tuned Linear Algebra Software (ATLAS)
R. Clint Whaley
University of Tennessee
www.netlib.org/atlas
2
What is ATLAS
A package that adapts to differing architectures via AEOS techniques
  - Initially, supply BLAS
Automated Empirical Optimization of Software (AEOS)
  - Machine searches the optimization space
  - Finds the application-apparent architecture
AEOS requires:
  - Method of code variation
    » Code generation
    » Multiple implementations
    » Parameterization
  - Sophisticated timers
  - Robust search heuristic
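The three AEOS requirements listed above (code variation, timers, search) can be illustrated with a small sketch. This is not ATLAS code: `blocked_matmul`, `best_time`, and `search_block_size` are hypothetical names, the only tuning parameter is a block size, and the search is a plain exhaustive scan rather than the heuristic ATLAS uses.

```python
import time

def blocked_matmul(a, b, n, nb):
    """Code variation by parameterization: one routine, tunable block size nb."""
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, nb):
        for k0 in range(0, n, nb):
            for j0 in range(0, n, nb):
                for i in range(i0, min(i0 + nb, n)):
                    ci, ai = c[i], a[i]
                    for k in range(k0, min(k0 + nb, n)):
                        aik, bk = ai[k], b[k]
                        for j in range(j0, min(j0 + nb, n)):
                            ci[j] += aik * bk[j]
    return c

def best_time(fn, repeats=3):
    """Timer: keep the minimum over several runs to damp timing noise."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - t0)
    return best

def search_block_size(n=48, candidates=(4, 8, 16, 24, 48)):
    """Search: time every candidate on this machine and keep the fastest."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {
        nb: best_time(lambda nb=nb: blocked_matmul(a, b, n, nb))
        for nb in candidates
    }
    return min(timings, key=timings.get), timings
```

Calling `search_block_size()` returns the fastest block size found on the current machine together with all measured timings; the winner is expected to differ between machines, which is the point of an empirical search.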
3
Why ATLAS is needed
BLAS require many man-hours / platform
  - Only done if financial incentive is there
    » Many platforms will never have an optimal version
  - Lags behind hardware
  - May not be affordable by everyone
  - Improves vendor code
Allows for portably optimal codes
  - Obsolescence insurance
Operations may be important, but not general enough for standard
4
ATLAS Software
Coming soon
  - pthread support
  - Open source kernels
    » SSE & 3DNOW!
    » GOTO ev5/6 BLAS
  - Performance for banded and packed
  - More LAPACK
Coming not-so-soon
  - Sparse support
  - User customization
Currently provided
  - Full BLAS (C & F77)
    » Level 3 BLAS
      • Generated GEMM
        - 1-2 hours install time per precision
      • Recursive GEMM-based L3 BLAS
        - Antoine Petitet
    » Level 2 BLAS
      • GEMV & GER kernels
    » Level 1 BLAS
  - Some LAPACK
    » LU, LL^T
5
Algorithmic Approach for Matrix Multiply
Only generated code is on-chip multiply
All BLAS operations written in terms of generated on-chip multiply
All transpose cases coerced through data copy to 1 case of on-chip multiply
  - Only 1 case generated per platform
[Figure: C (M x N) = A (M x K) * B (K x N), with block size NB]
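The copy-then-multiply idea on this slide can be sketched as follows. This is an illustrative assumption, not the ATLAS implementation: `copy_to_canonical`, `on_chip_multiply`, and `gemm` are hypothetical names, the canonical layout chosen here is simply "neither operand transposed", and plain Python lists stand in for the copied, cache-friendly buffers.

```python
def copy_to_canonical(mat, transposed):
    """Copy an operand into the single layout the kernel expects."""
    if transposed:
        return [list(col) for col in zip(*mat)]
    return [row[:] for row in mat]

def on_chip_multiply(c, a, b, i0, j0, k0, nb):
    """The one kernel: C block += A block * B block, at most nb x nb x nb."""
    m, k_dim, n = len(a), len(b), len(b[0])
    for i in range(i0, min(i0 + nb, m)):
        for k in range(k0, min(k0 + nb, k_dim)):
            aik = a[i][k]
            for j in range(j0, min(j0 + nb, n)):
                c[i][j] += aik * b[k][j]

def gemm(a, b, trans_a=False, trans_b=False, nb=2):
    """C = op(A) * op(B); every transpose case goes through the same kernel."""
    a = copy_to_canonical(a, trans_a)
    b = copy_to_canonical(b, trans_b)
    m, k_dim, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, nb):
        for j0 in range(0, n, nb):
            for k0 in range(0, k_dim, nb):
                on_chip_multiply(c, a, b, i0, j0, k0, nb)
    return c
```

Because the transpose handling lives entirely in the copy step, only `on_chip_multiply` would need to be generated and tuned per platform in this sketch.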
6
Algorithmic approach for Level 3 BLAS
Recur down to L1 cache block size
Need kernel at bottom of recursion
  - Use gemm-based kernel for portability
[Figure: Recursive TRMM]
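A recursive triangular multiply of the kind this slide describes can be sketched as below, computing B := L * B in place for a lower-triangular L. The names `gemm_acc`, `trmm_base`, and `recursive_trmm` are hypothetical, `block` stands in for the L1 cache block size, and the base case is a plain triangular loop used here as a stand-in for the gemm-based kernel the slide mentions.

```python
def gemm_acc(l, b, r0, r1, c0, c1):
    """B[r0:r1, :] += L[r0:r1, c0:c1] * B[c0:c1, :] (the GEMM update)."""
    ncols = len(b[0])
    for i in range(r0, r1):
        for k in range(c0, c1):
            lik = l[i][k]
            for j in range(ncols):
                b[i][j] += lik * b[k][j]

def trmm_base(l, b, lo, hi):
    """Bottom of the recursion: B[lo:hi] := L[lo:hi, lo:hi] * B[lo:hi]."""
    ncols = len(b[0])
    # Work from the last row upward so the rows above are still unmodified.
    for i in range(hi - 1, lo - 1, -1):
        for j in range(ncols):
            b[i][j] = sum(l[i][k] * b[k][j] for k in range(lo, i + 1))

def recursive_trmm(l, b, lo=0, hi=None, block=2):
    """In-place B := L * B, with L lower triangular, by recursive halving."""
    if hi is None:
        hi = len(l)
    if hi - lo <= block:
        trmm_base(l, b, lo, hi)
        return
    mid = (lo + hi) // 2
    recursive_trmm(l, b, mid, hi, block)  # B2 := L22 * B2
    gemm_acc(l, b, mid, hi, lo, mid)      # B2 += L21 * B1 (B1 still original)
    recursive_trmm(l, b, lo, mid, block)  # B1 := L11 * B1
```

In this sketch most of the arithmetic for large matrices lands in `gemm_acc`, which is why a fast GEMM carries over to the other Level 3 routines.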
7
500x500 DGEMM Across Various Architectures
8
500x500 Double Precision RB LU factorization
9
500x500 Recursive BLAS on UltraSparc 2200