Published by Adam Maxwell. Modified over 9 years ago.
1
Provided by: Ali Teymouri
Based on the article “Jaguar: A Next-Generation Low-Power x86-64 Core”
Course: Custom Implementation of DSP Systems
University of Tehran, School of Electrical and Computer Engineering
2
Outline
- Introduction
- Motivation
- Comparing the two cores
- Architecture
- Improvements
- Conclusion
3
List of AMD microprocessors [5]
1 AMD-originated architectures
  1.1 Am2900 series (1975)
  1.2 29000 (29K) (1987–95)
2 Non-x86 architecture processors
  2.1 2nd source (1974)
  2.2 2nd source (1982)
3 x86 architecture processors
  3.1 2nd source (1979–91)
  3.2 Am X86 series (1991–95)
  3.3 K5 architecture (1995)
  3.4 K6 architecture (1997–2001)
  3.5 K7 architecture (1999–2005)
  3.6 K8 core architecture
  3.7 K10 core architecture
  3.8 Bulldozer module architecture
  3.9 Bobcat core architecture
4
Bobcat
- Core power gating and a microarchitecture optimized for low power
- Designed for mobile and tablet devices to address specific customer demands
- 4.5–18 W power range
Figure: Bobcat low-power core [2]
5
Jaguar core
Figures: Bobcat low-power core [4]; Jaguar core [4]
6
Jaguar CU
- First AMD 28 nm quad-core x86-64 CU
- Building block to deploy into a wide variety of SoCs for different applications
- Spans a wide array of applications, from sub-5 W to 25 W
Figure: Jaguar CU [4]
7
Motivation: Jaguar [1]
Build an SoC to fit a range of markets:
- Tablets, hybrids
- Value notebooks
- Ultrathin notebooks
- Value desktops
8
Comparing the two cores [1]
9
Architecture
- Improved IPC, frequency, and power relative to Bobcat (BT)
- Estimated typical IPC improvement over “Bobcat”: >15%*
- Redesigned load-store unit
- 4x32B instruction cache loop buffer for power
- Improved instruction cache prefetcher for IPC
- Added L2 prefetcher
- Added hardware integer divider
- Improved C6 and CC6 entry/exit latencies
- Clock gating of >92% of flops in typical applications
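As a rough illustration of how the IPC and frequency gains listed above would combine, the sketch below uses the standard relation that, for a fixed instruction count, execution time is instructions / (IPC x frequency). The function name `relative_performance` and the 10% frequency gain are assumptions made for this example only; the slide gives only the >15% IPC figure.

```python
def relative_performance(ipc_gain: float, freq_gain: float) -> float:
    """Speedup factor when IPC and clock frequency both improve.

    Execution time = instructions / (IPC * frequency), so for a fixed
    instruction count the speedup is the product of the two ratios.
    """
    return (1.0 + ipc_gain) * (1.0 + freq_gain)


# 0.15 is the lower bound on the IPC gain quoted on the slide (>15%).
# 0.10 is a made-up frequency gain used only for illustration.
ipc_only = relative_performance(0.15, 0.0)
combined = relative_performance(0.15, 0.10)
print(f"IPC only: {ipc_only:.3f}x, IPC + frequency: {combined:.3f}x")
```

The point of the sketch is only that the two gains multiply rather than add; actual speedup depends on the workload.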
10
Architecture [1]
- The Jaguar (JG) core is optimized at two main frequency targets, at low and high voltage, giving the core a dynamic range for application in several markets
- 3-Vt solution: HVT/RVT/LVT
- Longer lengths for each Vt
- BT used a 10-metal stack; JG uses an 11-metal stack
11
High-Speed Flop
- Custom-built flip-flops [4] maximize performance over traditional master-slave flops
- The larger flops consume more dynamic power
- To minimize the power and area impact, they are inserted only in critical paths [1]
- Custom flops account for <8%
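The idea of placing the faster, larger flops only on critical paths can be sketched as a simple slack filter. This is a minimal illustration, not the flow used in [1]: the function `pick_fast_flops`, the flop names, the slack values, and the zero-slack threshold are all hypothetical.

```python
from typing import Dict, List


def pick_fast_flops(slack_ps: Dict[str, float],
                    threshold_ps: float = 0.0) -> List[str]:
    """Return flops whose worst timing slack is below the threshold, worst first."""
    critical = [name for name, slack in slack_ps.items() if slack < threshold_ps]
    return sorted(critical, key=lambda name: slack_ps[name])


# Hypothetical per-flop worst slack in picoseconds (negative = failing timing).
slacks = {"ff_a": -12.0, "ff_b": 35.0, "ff_c": -3.5, "ff_d": 80.0, "ff_e": 5.0}

fast = pick_fast_flops(slacks)      # flops that would get the high-speed cell
share = len(fast) / len(slacks)     # fraction of flops swapped in this toy case
print(fast, f"{share:.0%}")
```

In a real design the threshold would be tuned so that only a small share of flops is swapped, which keeps the added dynamic power and area contained.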
12
CU-Level Clock Distribution [1]
- Matched clock delay to all endpoints to minimize latency
- Extensive clock gating
- Each unit’s clock is independently gated to reduce dynamic power
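To see why gating saves dynamic power, the sketch below applies the usual switching-power relation P = C x V^2 x f to the clock pins of flops that still receive a clock. It is a simplified model under stated assumptions: the function `clock_pin_power`, the flop count, the per-pin capacitance, the voltage, and the frequency are hypothetical, and the model ignores the clock tree upstream of the gaters, the gating cells themselves, and leakage. Only the 92% gated fraction comes from the Architecture slide.

```python
def clock_pin_power(n_flops: int, gated_fraction: float,
                    cap_per_flop_f: float, vdd_v: float,
                    freq_hz: float) -> float:
    """Dynamic power (W) spent toggling the clock pins of ungated flops.

    Uses P = C * V^2 * f per clocked pin; a gated flop sees no clock
    edges and contributes nothing in this simplified model.
    """
    if not 0.0 <= gated_fraction <= 1.0:
        raise ValueError("gated_fraction must be between 0 and 1")
    active_flops = n_flops * (1.0 - gated_fraction)
    return active_flops * cap_per_flop_f * vdd_v ** 2 * freq_hz


# Hypothetical design parameters; only the 0.92 gated fraction is from the slides.
params = dict(n_flops=100_000, cap_per_flop_f=1e-15, vdd_v=1.0, freq_hz=1.5e9)
ungated = clock_pin_power(gated_fraction=0.0, **params)
gated = clock_pin_power(gated_fraction=0.92, **params)
print(f"remaining clock-pin power: {gated / ungated:.0%}")
```

Because the power in this model scales linearly with the number of clocked flops, the remaining fraction is simply one minus the gated fraction, whatever the other parameters are.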
13
Power Gating [1]
- Integrated power gating
- Headers have 4 independent enables
- Area overhead is ~3%
Figure: diagram showing highlighted headers within the JG core
14
Conclusion
- “Jaguar” is the first AMD 28 nm bulk CPU
- Quad core with shared L2
- Supports a wide range of applications
- Low power, with a focus on high density and smaller chip area
- Improved IPC, frequency, and power relative to BT
- Worthy successor to the “Bobcat” x86-64 core
15
References
[1] T. Singh, J. Bell, and S. Southard, “Jaguar: A Next-Generation Low-Power x86-64 Core,” in 2013 IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 17–21, 2013, section 3.
[2] D. Foley, P. Bansal, D. Cherepacha, R. Wasmuth, A. Gunasekar, S. Gutta, and A. Naini, “A Low-Power Integrated x86-64 and Graphics Processor for Mobile Computing Devices,” IEEE Journal of Solid-State Circuits, vol. 47, no. 1, Jan. 2012.
[3] www.hitechreview.com
[4] www.semiaccurate.com
[5] www.wikipedia.org