
1 Rapid Exploration of Accelerator-Rich Architectures: Automation from Concept to Prototyping
Please do not distribute 5/10/2018 GYW
David Brooks, Jason Cong, Zhenman Fang, Yakun Sophia Shao, and Sam Xi
Harvard University & UCLA

2 Tutorial Outline
Time | Topic | Speaker
8:30 am – 9:00 am | Accelerator Research Infrastructure Overview | Sophia Shao
9:00 am – 9:30 am | Aladdin: Accelerator Pre-RTL Modeling |
9:30 am – 10:00 am | Rapid Hardware Specialization with HLS: Glass Half Full | Prof. Zhiru Zhang
10:00 am – 10:30 am | PARADE: HLS-Based Accelerator-Rich Architecture Simulation | Zhenman Fang
10:30 am – 11:00 am | Break |
11:00 am – 11:30 am | gem5-Aladdin: Accelerator System Co-Design | Sam Xi
11:30 am – 12:00 pm | ARAPrototyper: FPGA Prototyping |
12:00 pm – 1:30 pm | Lunch |
1:30 pm – 2:00 pm | Virtual Machine Setup | Sophia Shao & Sam Xi
2:00 pm – 2:30 pm | Hands-on: Accelerator Design Space Exploration using Aladdin |
2:30 pm – 3:00 pm | Hands-on: SoC Design Space Exploration using gem5-Aladdin |

3 Moore’s Law

4 CMOS Scaling is Slowing Down
Process nodes: 180 nm, 130 nm, 90 nm, 65 nm, 45 nm, 32 nm, 22 nm, 14 nm, 10 nm

5 CMOS Technology Scaling
Technological Fallow Period

6 Potential for Specialized Architectures
Chart labels: 16: Encryption; 17: Hearing Aid; 18: FIR for disk read; 19: MPEG Encoder; 20: Baseband [Zhang and Brodersen]

7 Cores, GPUs, and Accelerators: Apple A8 SoC
Out-of-Core Accelerators

8 Cores, GPUs, and Accelerators: Apple A8 SoC
Out-of-Core Accelerators

9 Cores, GPUs, and Accelerators: Apple A8 SoC
Out-of-Core Accelerators
Maltiel Consulting estimates; Our estimates

10 Challenges in Accelerators
Flexibility: Fixed-function accelerators are only designed for the target applications.
Programmability: Today’s accelerators are explicitly managed by programmers.

11 Today’s SoC
OMAP 4 SoC

12 Today’s SoC
OMAP 4 SoC
Diagram labels: DMA, ARM Cores, GPU, DSP, SD, USB, Audio, Video, Face, Imaging, System Bus, Secondary Bus, Tertiary

13 Challenges in Accelerators
Flexibility: Fixed-function accelerators are only designed for the target applications.
Programmability: Today’s accelerators are explicitly managed by programmers.
Design Cost: Accelerator (and RTL) implementation is inherently tedious and time-consuming.

14 Today’s SoC
Diagram labels: GPU/DSP, CPU, Buses, Mem Interface, Acc

15 Future Accelerator-Centric Architectures
Diagram labels: GPU/DSP, Big Cores, Small Cores, Shared Resources, Memory Interface, Sea of Fine-Grained Accelerators
How to decompose applications into accelerators?
How to rapidly design lots of accelerators?
How to design and manage the shared resources?
Flexibility, Design Cost, Programmability

16 PARADE: Platform for Accelerator-Rich Architectural Design & Exploration [ICCAD 15]
extended gem5 (McPAT) for X86 CPU, with OS
auto-generated accelerators based on HLS (AutoPilot)
added SPM, DMA, GAM & TLB model
extended Garnet (DSENT) for NoC
extended Ruby (CACTI) for coherent cache hierarchy
gem5 memory model [ISPASS 14]

17 ARAPrototyper: Prototyping an ARA on FPGA
Using Xilinx Zynq SoC (FPGA fabrics + ARM)
Major components of an ARA: general processor cores; a sea of heterogeneous accelerators; memory system + interconnects (NoC)

18 Contributions
WIICA: Accelerator Workload Characterization [ISPASS’13]
MachSuite: Accelerator Benchmark Suite [IISWC’14]
Aladdin: Accelerator Pre-RTL, Power-Performance Simulator [ISCA’14, TopPicks’15]
Accelerator Design w/ High-Level Synthesis [ISLPED’13_1]
gem5-Aladdin: Accelerator-System Co-Design [MICRO’16]
Diagram labels: GPU/DSP, Big Cores, Small Cores, Shared Resources, Memory Interface, Sea of Fine-Grained Accelerators

19 Tutorial Outline
Time | Topic | Speaker
8:30 am – 9:00 am | Accelerator Research Infrastructure Overview | Sophia Shao
9:00 am – 9:30 am | Aladdin: Accelerator Pre-RTL Modeling |
9:30 am – 10:00 am | Rapid Hardware Specialization with HLS: Glass Half Full | Prof. Zhiru Zhang
10:00 am – 10:30 am | PARADE: HLS-Based Accelerator-Rich Architecture Simulation | Zhenman Fang
10:30 am – 11:00 am | Break |
11:00 am – 11:30 am | gem5-Aladdin: Accelerator System Co-Design | Sam Xi
11:30 am – 12:00 pm | ARAPrototyper: FPGA Prototyping |
12:00 pm – 1:30 pm | Lunch |
1:30 pm – 2:00 pm | Virtual Machine Setup | Sophia Shao & Sam Xi
2:00 pm – 2:30 pm | Hands-on: Accelerator Design Space Exploration using Aladdin |
2:30 pm – 3:00 pm | Hands-on: SoC Design Space Exploration using gem5-Aladdin |

