GPU baseline architecture and gpgpu-sim


1 GPU baseline architecture and gpgpu-sim
Presented by 王建飞

2 A typical GPGPU:
  Related terminology:
    GPC: SM cluster
    SM: streaming multiprocessor
    SIMT core: single instruction, multiple threads (compare with SIMD)
  On-chip memory:
    RF: register file, large
    L1D cache: private, weakly coherent
    Shared memory: programmer-controlled
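
The SIMT/SIMD distinction on this slide can be illustrated with a small host-side model. This is only a sketch in Python, not GPU code: the function name `simt_branch` and its parameters are hypothetical, and the model assumes the common description of SIMT in which the threads of one warp that take different sides of a branch are executed serially under complementary active masks.

```python
def simt_branch(values, cond, then_fn, else_fn):
    """Model one warp executing `if cond(v): then_fn(v) else: else_fn(v)`.

    Each list element stands for one thread's register value.
    Returns the per-thread results and the number of serialized passes.
    """
    mask = [cond(v) for v in values]   # active mask for the "then" path
    out = list(values)
    passes = 0
    if any(mask):                      # at least one thread takes "then"
        passes += 1
        out = [then_fn(v) if m else o for v, m, o in zip(values, mask, out)]
    if not all(mask):                  # at least one thread takes "else"
        passes += 1
        out = [o if m else else_fn(v) for v, m, o in zip(values, mask, out)]
    return out, passes
```

A divergent warp needs two passes, while a warp whose threads all agree needs one, which is why divergence costs throughput even though each thread is written as independent scalar code.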

3 Runtime of GPGPU 1:

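4 Runtime of GPGPU 2:
  Scheduler: LRR (loose round robin), GTO (greedy-then-oldest)
  SIMT stack: handles branch divergence; reconverges at the post-dominator
  Operand collector: accesses the RF
  Lanes: SP, SFU, MEM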
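
The two warp-scheduling policies named on this slide can be sketched as follows. This is a simplified Python model, not GPGPU-sim code: the function names, the `ready` set, and the `age_order` list are assumptions made for illustration, and real schedulers also consider scoreboard and pipeline state.

```python
def pick_lrr(ready, last_issued, num_warps):
    """Loose round robin: scan warps starting just after the last issued
    one and return the first warp that is ready, or None if none is."""
    for offset in range(1, num_warps + 1):
        w = (last_issued + offset) % num_warps
        if w in ready:
            return w
    return None


def pick_gto(ready, last_issued, age_order):
    """Greedy-then-oldest: keep issuing from the last warp while it is
    ready; otherwise fall back to the oldest ready warp.
    `age_order` lists warp ids from oldest to youngest."""
    if last_issued in ready:
        return last_issued
    for w in age_order:
        if w in ready:
            return w
    return None
```

With warps 0, 2 and 3 ready and warp 2 issued last, `pick_lrr` moves on to warp 3 while `pick_gto` stays on warp 2; once warp 2 stalls, `pick_gto` falls back to warp 0, the oldest ready warp.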

5 A typical code study 1:
  Constants: gridDim.x, blockDim.x
  Variables: blockIdx.x, threadIdx.x
  blocksPerGrid = 32 and threadsPerBlock = 256, so gridDim.x = 32 and blockDim.x = 256
  __global__: called from the host
  __device__: called from the device
  Source: CUDA by Example
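
The kernel source for this slide is not part of the transcript, so the following is only a host-side Python sketch of how the built-in variables are typically combined. It assumes the usual global index `threadIdx.x + blockIdx.x * blockDim.x` and a grid-stride loop with stride `blockDim.x * gridDim.x`; the function name `elements_for_thread` and the array length `n` are hypothetical.

```python
def elements_for_thread(block_idx, thread_idx, grid_dim_x, block_dim_x, n):
    """Return the array indices one thread would visit in a grid-stride loop.

    block_idx / thread_idx vary per thread (blockIdx.x, threadIdx.x);
    grid_dim_x / block_dim_x are the same for every thread of a launch.
    """
    tid = thread_idx + block_idx * block_dim_x   # global thread index
    stride = block_dim_x * grid_dim_x            # total threads in the grid
    return list(range(tid, n, stride))


# Launch shape from the slide: 32 blocks of 256 threads = 8192 threads.
BLOCKS_PER_GRID = 32
THREADS_PER_BLOCK = 256
```

For example, with an assumed array of 20000 elements, thread 0 of block 1 starts at index 256 and then advances by 8192 each iteration, so it handles indices 256, 8448 and 16640.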

6 A typical code study 2:

7 GPGPU-sim: a cycle-level GPU performance simulator that focuses on "GPU computing" (general-purpose computation on GPUs)
  Replaces the CUDA API and supplies a configurable GPU
  Simulation model: functional simulation (cuda-sim.h/cc) and timing simulation (shader.h/cc)
  Cache model: gpu-cache.h/cc

8 Simulation line:
  register_set: temporary instruction buffer
  m_fu: sp, sfu, ldst_unit
  Reference: GPGPU-sim manual; NVIDIA Fermi/Kepler architecture whitepapers

9 Instruction Set Architecture:
  PTX: Parallel Thread eXecution, a pseudo-assembly instruction set
  ptxas: compiles PTX to SASS (strength reduction, instruction scheduling, register allocation)
  SASS: a native GPU ISA
  PTXPlus: extends PTX with the features required to provide a one-to-one mapping to SASS

10 Instruction Set Architecture:

11 Instruction Set Architecture:
  // SASS
  S2R R0, SR_CTAid_X;
  S2R R2, SR_Tid_X;
  // PTX
  mov.u32 %r3, %ctaid.x;
  mov.u32 %r5, %tid.x;
  // PTXPlus
  mad.lo.u16 $r0, %ctaid.x, 0x , $r0;
  mov.u16 $r4.lo, 0x ;

12 Thanks

