HPC Forum, September 8-10, 2009. Steve Rowan, srowan at conveycomputer.com

1 HPC Forum, September 8-10, 2009. Steve Rowan, srowan at conveycomputer.com

2 Convey Hybrid-Core Computing
An x86 processor is combined with a coprocessor that implements highly parallel instructions.
Diagram labels: Applications; Convey Compilers; x86-64 Instructions; Coprocessor Instructions; Intel® Processor; Coprocessor; Cache-coherent shared virtual memory; Application-Specific Personalities (Oil & Gas, Financial, Custom, CAE, Sciences).
Copyright 2009, 09/10/09

3 Using Personalities
Program using ANSI standard C/C++ and Fortran
User specifies personality at compile time
OS demand loads personalities at runtime
Personalities description file specifies available instructions
Personality loaded at runtime by OS
Diagram labels: C/C++, Fortran; Convey Software Development Suite; Hybrid-Core Executable (x86-64 and Coprocessor Instructions); Convey HC-1 (Intel x86, Coprocessor); P, P.

4 Language/Library Support for Massive Parallelism
Apply massive amounts of logic for a single thread of execution
– Do that via specialization
– Have hardware adapt to the application rather than the application adapting to the hardware
– Parallelizing at the instruction level, not the core level
C/C++ and Fortran programming
– No special languages
– Code can run on x86 servers without coprocessors
Diagram labels: Crossbar, Dispatch (each shown four times); multiple function pipes for data parallelism; multiple units in each pipe for instruction-level parallelism; instructions can be very complex.

5 Development Tools
Program in ANSI standard C/C++ and Fortran
Unified compiler generates x86 & coprocessor instructions
Seamless debugging environment for Intel & coprocessor code
Executable can run on x86_64 nodes or on Convey Hybrid-Core nodes
Diagram labels: C/C++, Fortran 95; Common Optimizer; Intel® 64 Optimizer & Code Generator; Convey Vectorizer & Code Generator; Procedural Personality Interface; Intel® 64 code; Coprocessor code; other objects; Linker; executable.

6 Multi Mode Compilation
Original code:
    for (j=0; j<N; j++) a[j] = b[j]+scalar*c[j];
Generated code:
    if(CP available) { coprocessor instructions } else { x86 instructions }
Convey systems are inherently heterogeneous
Can select from a set of architectures
Required architectures are dynamically loaded at runtime
Higher level parallelism supported via MPI or threads
Diagram labels: Convey Multi Mode Compiler; Personality Definition Files; Convey backend; x86-64 backend (shown twice).
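
The "Generated code" shape on this slide can be sketched in plain C. This is a minimal illustration, not Convey compiler output: `cp_available`, `triad_cp`, `triad_x86` and `triad` are hypothetical names, and the coprocessor path is a placeholder that reuses the scalar loop so the sketch runs on any x86-64 host.

```c
#include <stddef.h>

/* Hypothetical stand-in for the runtime check the slide writes as
   "CP available". On a plain x86-64 host it reports no coprocessor. */
static int cp_available(void)
{
    return 0;
}

/* Portable x86-64 path: the triad loop from the slide. */
static void triad_x86(double *a, const double *b, const double *c,
                      double scalar, size_t n)
{
    for (size_t j = 0; j < n; j++)
        a[j] = b[j] + scalar * c[j];
}

/* Placeholder for the coprocessor path. A multi-mode compiler would
   emit coprocessor instructions here; this sketch simply reuses the
   scalar loop so that it stays runnable everywhere. */
static void triad_cp(double *a, const double *b, const double *c,
                     double scalar, size_t n)
{
    triad_x86(a, b, c, scalar, n);
}

/* Dispatch wrapper with the same if/else shape as the slide. */
void triad(double *a, const double *b, const double *c,
           double scalar, size_t n)
{
    if (cp_available())
        triad_cp(a, b, c, scalar, n);
    else
        triad_x86(a, b, c, scalar, n);
}
```

Callers only ever see `triad`, so the same source and the same executable work whether or not a coprocessor is present, which is the point the slide makes.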

7 Custom Convey Runtime
If dlopen of shared library fails, Intel 64 code executed
Personalities are demand loaded by OS at runtime
gdb debugging on HW & simulator
SPAT performance simulator
Diagram labels: executable (Intel® 64 code, Coprocessor code, cny_runtime.o); Convey Shared Libraries; Convey Simulator; shared library launched by OS; x86-64 hardware; coprocessor hardware; FAP, DP, SP.
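
The fallback rule on this slide (if dlopen of the shared library fails, the Intel 64 code is executed) can be sketched with the standard POSIX `dlopen`/`dlsym` calls. This is an assumption-laden illustration, not the Convey runtime: `resolve_triad`, `triad_host`, and the library path and symbol passed in are all hypothetical. On older glibc versions the program may need to be linked with `-ldl`.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Signature shared by the host kernel and any accelerated kernel. */
typedef void (*triad_fn)(double *, const double *, const double *,
                         double, size_t);

/* Host (x86-64) version, used whenever the library cannot be loaded. */
static void triad_host(double *a, const double *b, const double *c,
                       double scalar, size_t n)
{
    for (size_t j = 0; j < n; j++)
        a[j] = b[j] + scalar * c[j];
}

/* Try to load an accelerated kernel from a shared library.
   lib_path and symbol are hypothetical, caller-supplied names.
   If dlopen or dlsym fails, fall back to the host code. */
triad_fn resolve_triad(const char *lib_path, const char *symbol)
{
    void *handle = dlopen(lib_path, RTLD_NOW);
    if (handle == NULL)
        return triad_host;

    void *sym = dlsym(handle, symbol);
    if (sym == NULL) {
        dlclose(handle);
        return triad_host;
    }

    /* Conversion idiom from the POSIX dlsym documentation. */
    triad_fn fn;
    *(void **)(&fn) = sym;
    return fn;
}
```

Resolving the function pointer once at startup keeps the availability check out of the inner loop; every later call goes straight to whichever kernel was selected.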

8 Debugging Hybrid-Core Applications
(gdb) run
Starting program: /home/guest/Desktop/DEMOS/compiler_demo/vec_auto.exe
Breakpoint 1, main (argc=1, argv=0x7fffa9111ee8) at vec_main.c:19
19 for (i=0; i<n; i++) {
(gdb) disass
Dump of assembler code for function main:
0x0000000000405818 : push %rbp
0x0000000000405819 : mov %rsp,%rbp
0x000000000040581c : add $0xffffffffffffffb0,%rsp
0x0000000000405820 : add $0xfffffffffffffff8,%rsp
0x0000000000405824 : fnstcw (%rsp)
0x0000000000405827 : andw $0xfcff,(%rsp)
0x000000000040582d : orw $0x300,(%rsp)
(gdb) cont
Continuing.
Breakpoint 4, 0x00000000008000a0 in __cny_region_triad0 ()
(gdb) disass 0x8000a0 0x8000c0
Dump of assembler code from 0x8000a0 to 0x8000c0:
0x00000000008000a0 : mov %a11,%VL
0x00000000008000a8 : ld.dw $0x0(%a10),%v0r
0x00000000008000ac : or %a11,$0,%a13
0x00000000008000b0 : ld.dw $0x0(%a9),%v1r
0x00000000008000b4 : add.sq %a12,%a13,%a12
0x00000000008000b8 : fma.fs %v0r,%s1,%v1r,%v0r
0x00000000008000c0 : st.dw %v0r,$0x0(%a8)
End of assembler dump.
(gdb)

10 Third Party Libraries
Third party libraries run unmodified on the x86
Key kernels have been optimized by Convey
Third party libraries can call Convey optimized routines
– BLAS
– LAPACK
– etc.

