Published by Christopher Preston. Modified over 9 years ago.
SKA – Static Kernel Analysis using LLVM IR
Kartik Ramkrishnan and Ben Bergen
Applied Computer Science (CCS-7), Los Alamos National Laboratory
DC Reviewed by Kei Davis
Operated by Los Alamos National Security, LLC for DOE/NNSA
Slide 2: What is SKA
SKA – Static Kernel Analyzer
SKA is a tool for improving the development process.
It performs static, architecture-aware analysis of kernels.
It outputs code metrics during the development process.
It visualizes code execution on the specified pipeline.
Slide 3: SKA-Enhanced Development Cycle
Slide 4: Example kernel – saxpy.ll

define i32 @main(i32 %argc, i8** nocapture %argv) nounwind uwtable readnone {
entry:
  %a1 = alloca [32 x float], align 4
  %b2 = alloca [32 x float], align 4
  %c3 = alloca [32 x float], align 4
  br label %"3"

"3":                                              ; preds = %"3", %entry
  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %"3" ]
  %0 = getelementptr [32 x float]* %a1, i64 0, i64 %indvars.iv
  %1 = load float* %0, align 4
  %2 = getelementptr [32 x float]* %b2, i64 0, i64 %indvars.iv
  %3 = load float* %2, align 4
  %4 = getelementptr [32 x float]* %c3, i64 0, i64 %indvars.iv
  %5 = load float* %4, align 4
  %6 = fmul float %3, %5
  %7 = fadd float %1, %6
  store float %7, float* %4, align 4
  %indvars.iv.next = add i64 %indvars.iv, 1
  %lftr.wideiv = trunc i64 %indvars.iv.next to i32
  %exitcond = icmp eq i32 %lftr.wideiv, 32
  br i1 %exitcond, label %"5", label %"3"

"5":                                              ; preds = %"3"
  ret i32 0
}
Slide 5: Register allocation support for SKA
LLVM IR is in SSA (static single assignment) form, which assumes an unlimited number of virtual registers.
ISAs (instruction set architectures) have a limited number of registers.
We improve SKA's fidelity by allocating registers to the IR based on the target ISA.
Slide 6: Register Allocation algorithm
Simple register allocation algorithm.
Slide 7: Build Liveness Tables
Slide 8: Build Liveness Tables
SKA takes an LLVM IR module as input and builds a liveness table.
Figure: Partial liveness table for saxpy.ll
Slide 9: Build Liveness Tables
Code figure annotations: Top-level loop; Single basic block (BB) liveness calculation; Populate liveness table
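The single-basic-block liveness step can be sketched as a backward scan over the block. This is a minimal illustration, not SKA's code: the tuple-based instruction format and the function name `block_liveness` are assumptions made for the example, and the instructions are a fragment of the saxpy.ll loop body.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical simplified instruction: (defined value or None, used values).
Instr = Tuple[Optional[str], List[str]]

def block_liveness(block: List[Instr], live_out: Set[str]) -> List[Set[str]]:
    """Return, per instruction, the set of values live just before it."""
    live = set(live_out)
    table: List[Set[str]] = [set() for _ in block]
    # Walk the block backwards: a definition ends a live range, a use starts one.
    for i in range(len(block) - 1, -1, -1):
        dest, uses = block[i]
        if dest is not None:
            live.discard(dest)
        live.update(uses)
        table[i] = set(live)
    return table

# Fragment of the saxpy.ll loop body, reduced to defs and uses.
block = [
    ("%1", ["%0"]),        # %1 = load float* %0
    ("%3", ["%2"]),        # %3 = load float* %2
    ("%5", ["%4"]),        # %5 = load float* %4
    ("%6", ["%3", "%5"]),  # %6 = fmul float %3, %5
    ("%7", ["%1", "%6"]),  # %7 = fadd float %1, %6
    (None, ["%7", "%4"]),  # store float %7, float* %4
]
liveness_table = block_liveness(block, live_out=set())
```

A full analysis would iterate this over all basic blocks until the live-in and live-out sets stop changing; the sketch covers only the per-block step.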
Slide 10: Build Interference Graph
Slide 11: Build Interference Graph
Traverse the liveness table to create the interference graph.
Figure: Partial igraph for saxpy.ll
Slide 12: Build Interference Graph
Code figure annotations: Top-level loop; Populate igraph
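One way to traverse a liveness table into an interference graph is to connect every pair of values that are live at the same point. The sketch below assumes the table is a list of live sets (one per instruction); the function name and the adjacency-set representation are illustrative choices, not taken from SKA.

```python
from itertools import combinations
from typing import Dict, List, Set

def build_interference_graph(liveness_table: List[Set[str]]) -> Dict[str, Set[str]]:
    """Connect every pair of values that appear together in some live set."""
    graph: Dict[str, Set[str]] = {}
    for live in liveness_table:
        for value in live:
            graph.setdefault(value, set())
        for a, b in combinations(sorted(live), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

# Hypothetical live sets for a fragment of the saxpy.ll loop body.
liveness_table = [
    {"%0", "%2", "%4"},
    {"%1", "%2", "%4"},
    {"%1", "%3", "%4"},
    {"%1", "%3", "%4", "%5"},
    {"%1", "%4", "%6"},
    {"%4", "%7"},
]
igraph = build_interference_graph(liveness_table)
```

Here `%4` (the address that is loaded from and later stored to) is live throughout the fragment, so it interferes with every other value.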
Slide 13: Simplify Interference Graph
Slide 14: Simplify Interference Graph
Populate a stack that records whether each register (node) is simple or not.
Figure: Partial node stack for saxpy.ll
Slide 15: Simplify Interference Graph
Code figure annotation: Populate simple node stack
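The simplify step resembles the classic graph-coloring approach: repeatedly remove a node with fewer than k neighbors (a "simple" node) and push it on a stack; when none exists, push a spill candidate marked as not simple. The sketch below assumes that interpretation; the highest-degree spill heuristic, the function name, and the example graph are assumptions, not SKA's actual choices.

```python
from typing import Dict, List, Set, Tuple

def simplify(graph: Dict[str, Set[str]], k: int) -> List[Tuple[str, bool]]:
    """Return a stack of (node, is_simple) in removal order."""
    work = {n: set(adj) for n, adj in graph.items()}  # working copy
    stack: List[Tuple[str, bool]] = []
    while work:
        simple = sorted(n for n in work if len(work[n]) < k)
        if simple:
            node, is_simple = simple[0], True
        else:
            # No simple node left: pick a potential spill (assumed heuristic:
            # highest degree, ties broken by name).
            node = max(sorted(work), key=lambda n: len(work[n]))
            is_simple = False
        stack.append((node, is_simple))
        for neighbor in work.pop(node):
            work[neighbor].discard(node)
    return stack

# Hypothetical graph: a triangle %a-%b-%c plus %d attached to %a.
graph = {
    "%a": {"%b", "%c", "%d"},
    "%b": {"%a", "%c"},
    "%c": {"%a", "%b"},
    "%d": {"%a"},
}
node_stack = simplify(graph, k=2)
```

With k = 2 registers, `%d` is removed first as simple, `%a` is pushed as a potential spill, and the remaining two nodes then become simple.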
Slide 16: Assign ISA Registers to IR
Slide 17: Assign ISA Registers to IR
Assign ISA registers to IR values when there is no true spill.
We choose between int, float, and vector registers.
Figure: Partial register allocation for saxpy.ll
Slide 18: Assign ISA Registers to IR
Code figure annotation: Assign register if no true spill
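The assignment step can be sketched as popping the node stack and giving each value the first register of its class (int, float, or vector) that no already-assigned neighbor holds; a value with no free register is a true spill. The register names, the register file sizes, and the data layout below are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

def assign_registers(
    stack: List[Tuple[str, bool]],
    graph: Dict[str, Set[str]],
    reg_class: Dict[str, str],
    register_files: Dict[str, List[str]],
) -> Dict[str, Optional[str]]:
    """Pop the stack and pick a free register per node; None marks a true spill."""
    assignment: Dict[str, Optional[str]] = {}
    for node, _is_simple in reversed(stack):
        taken = {assignment[nb] for nb in graph[node] if assignment.get(nb)}
        free = [r for r in register_files[reg_class[node]] if r not in taken]
        assignment[node] = free[0] if free else None
    return assignment

# Hypothetical inputs: two registers per class, a float triangle plus one int value.
register_files = {"int": ["r0", "r1"], "float": ["f0", "f1"], "vector": ["v0", "v1"]}
graph = {
    "%a": {"%b", "%c", "%d"},
    "%b": {"%a", "%c"},
    "%c": {"%a", "%b"},
    "%d": {"%a"},
}
reg_class = {"%a": "float", "%b": "float", "%c": "float", "%d": "int"}
node_stack = [("%d", True), ("%a", False), ("%b", True), ("%c", True)]
allocation = assign_registers(node_stack, graph, reg_class, register_files)
```

In this example `%a` interferes with two float values that already hold `f0` and `f1`, so it becomes a true spill and would be handled by rewriting the IR.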
Slide 19: Rewrite IR
Slide 20: Rewrite IR
The live range of %a1 is shown in red. It shrinks after the IR is rewritten.
Slide 21: Rewrite IR
Code figure annotations: Store instruction into stack; Load, use and store
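Rewriting for a spilled value typically means storing it to a stack slot right after its definition and reloading it into a short-lived temporary before each use, which is what shortens the live range. The following is a sketch under that assumption; the tuple instruction format and the `.slot` / `.reload` naming are invented for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical simplified instruction: (dest or None, opcode, operands).
Instr = Tuple[Optional[str], str, List[str]]

def rewrite_for_spill(block: List[Instr], spilled: str) -> List[Instr]:
    """Insert a store after the definition and a load before each use."""
    slot = spilled + ".slot"  # assumed stack-slot name
    out: List[Instr] = []
    reloads = 0
    for dest, opcode, operands in block:
        if spilled in operands:
            tmp = f"{spilled}.reload{reloads}"
            reloads += 1
            out.append((tmp, "load", [slot]))
            operands = [tmp if o == spilled else o for o in operands]
        out.append((dest, opcode, operands))
        if dest == spilled:
            out.append((None, "store", [spilled, slot]))
    return out

block = [
    ("%x", "fmul", ["%a", "%b"]),
    ("%y", "fadd", ["%c", "%d"]),
    ("%z", "fadd", ["%x", "%y"]),
]
rewritten = rewrite_for_spill(block, "%x")
```

After the rewrite, `%x` is live only between its definition and the following store, and each reload temporary is live only up to the instruction that uses it.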
Slide 22: Register allocation done!
Slide 23: Virtual architecture specification
The virtual architecture is specified in an XML file.
The file specifies logical units, the instructions they process, latencies, issue width, …
Figure: Partial architecture example
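To make the idea of an XML architecture description concrete, here is a small sketch that parses one with Python's standard `xml.etree.ElementTree`. The schema (element and attribute names such as `unit`, `instruction`, `latency`, `issue_width`) and all values are invented for illustration; SKA's actual file format is not shown in the slides.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

# Hypothetical architecture description; not SKA's real schema.
ARCH_XML = """
<architecture name="example" issue_width="2">
  <unit name="fpu">
    <instruction opcode="fadd" latency="3"/>
    <instruction opcode="fmul" latency="5"/>
  </unit>
  <unit name="lsu">
    <instruction opcode="load" latency="4"/>
    <instruction opcode="store" latency="1"/>
  </unit>
</architecture>
"""

def load_architecture(xml_text: str) -> Tuple[int, Dict[str, Tuple[str, int]]]:
    """Return (issue width, opcode -> (unit name, latency))."""
    root = ET.fromstring(xml_text)
    table: Dict[str, Tuple[str, int]] = {}
    for unit in root.findall("unit"):
        for ins in unit.findall("instruction"):
            table[ins.get("opcode")] = (unit.get("name"), int(ins.get("latency")))
    return int(root.get("issue_width")), table

issue_width, latency_table = load_architecture(ARCH_XML)
```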
Slide 24: Pipeline simulation
Figure: Pipeline simulation of saxpy.ll
Slide 25: Skaview
Figure: Graphical visualization of saxpy.ll
Slide 26: Code metrics
SKA outputs metrics about the code.
Primitive statistics are basic performance counters, such as instructions, cycles, and stalls.
Derived statistics are computed from the primitive statistics.
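As an illustration of how derived statistics follow from primitive ones, the sketch below computes CPI (cycles per instruction, the metric reported in the results slides) along with two other ratios. The counter names, the extra ratios, and the sample numbers are assumptions, not SKA output.

```python
from typing import Dict

def derived_statistics(primitive: Dict[str, int]) -> Dict[str, float]:
    """Compute ratios from raw counters (instructions, cycles, stalls)."""
    instructions = primitive["instructions"]
    cycles = primitive["cycles"]
    return {
        "cpi": cycles / instructions,            # cycles per instruction
        "ipc": instructions / cycles,            # instructions per cycle
        "stall_fraction": primitive["stalls"] / cycles,
    }

# Made-up counter values, for illustration only.
stats = derived_statistics({"instructions": 200, "cycles": 300, "stalls": 60})
```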
Slide 27: Results for residual.ll
The CPI prediction is better after register allocation.
Slide 28: Results for ef_operator.ll
No change in CPI prediction. Why?
Slide 29: Results for KNC (Knights Corner)
SKA predicts CPI > 1.0 on KNC for single-threaded workloads.
Slide 30: Conclusion
SKA now supports register allocation.
Register allocation improves SKA's fidelity by 5-10% across three architectures for a compute-intensive benchmark.
Dynamic scheduling and cache models can further improve SKA's fidelity.
Slide 31: Questions? Thank you!