1
IClass – A Many-core processor based on RISC-V
RISE Lab, IIT Madras
2
Objective
To build an out-of-order core that can compete with present-day cores in desktop and mobile environments.
To develop interconnects with cache-coherence support.
To create a many-core processor using hybrid interconnects with a uniform interface across the interconnects.
3
Features of Out-of-Order core.
Supports the RV64IMAFD ISA as defined by RISC-V spec version 2.1.
Supports all modes of the RISC-V privilege spec.
8-stage core with out-of-order execution through explicit register renaming.
Dual issue.
Parameterized set-associative I-Cache and D-Cache.
VIPT caches + non-blocking AXI bus support with multiple masters.
Parameterized tournament branch prediction unit.
2 ALUs and 1 FPU.
Instructions are selected from the issue queue with priority based on age.
MMU support modeling the Power 3-level PTW.
CAM-based speculative load-store unit.
Single- and double-precision pipelined floating-point unit optimized for maximum performance.
4
Overview
5
Bypass Network
Dependent instructions have a 3-cycle bubble between them.
[Diagram: Producer: Select, Drive, Execute, Broadcast. Consumer: Wakeup, Select, Drive, Execute.]
In the bypass network, instructions are predicted to finish in a certain number of cycles, and the instructions that depend on them are woken up accordingly.
[Diagram: Producer: Select, Drive, Execute, Broadcast. Consumer: Wakeup, Select, Drive, Execute.]
6
Implementation of Bypass Network
Instead of having operand-ready registers, every instruction is assigned a delay register.
The contents of the delay register are moved to a shift register at the time of broadcast.
The contents of the shift register are right-shifted every cycle.
When the rightmost bit of the shift register is set, the corresponding instruction is released for execution.
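The delay and shift register scheme can be illustrated with a small software model. This is a minimal sketch, not the IClass RTL: the class name `WakeupEntry`, the helper `cycles_until_release`, and the one-hot encoding of the delay (bit position equal to the predicted delay) are assumptions made for illustration.

```python
class WakeupEntry:
    """Hypothetical issue-queue entry with a delay register and a shift register."""

    def __init__(self, delay):
        self.delay = delay   # predicted cycles until the producer's result is usable
        self.shift = 0       # shift register, empty until the producer broadcasts

    def on_broadcast(self):
        # Assumed encoding: load a one-hot bit at position `delay`, so it
        # reaches the rightmost position after exactly `delay` right shifts.
        self.shift = 1 << self.delay

    def tick(self):
        # The shift register is right-shifted once per cycle.
        self.shift >>= 1

    def ready(self):
        # The instruction is released when the rightmost bit is set.
        return (self.shift & 1) == 1


def cycles_until_release(delay, max_cycles=64):
    """Count cycles from broadcast until the entry is released."""
    entry = WakeupEntry(delay)
    entry.on_broadcast()
    for cycle in range(max_cycles):
        if entry.ready():
            return cycle
        entry.tick()
    return None  # not released within max_cycles


if __name__ == "__main__":
    for d in (0, 1, 3):
        print(d, cycles_until_release(d))
```

In this model an entry with delay 0 is released in the broadcast cycle itself, and larger delays postpone the release by one cycle each, which is the behavior a consumer needs to line up with its producer's predicted completion.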
7
CAM based Load Store Unit
Each memory access instruction is allotted an entry in one of the LS queues.
The value from a store is forwarded in case of an address match.
The alias bit is set in case of wrong speculation, and the pipeline is flushed at the time of commit.
[Diagram labels: EAC, CAM search, Broadcast load result, Store Queue, Memory Access, Load Queue, Cache, Store Commit, CAM search, Flush wire, Load Commit]
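The forwarding and alias-bit behavior described on this slide can be sketched in software as follows. This is a simplified model under stated assumptions, not the IClass design: the class `LoadStoreQueues`, the use of a program-order sequence number `seq`, and the conservative alias rule (any later-executing older store to the same address marks the load) are all hypothetical choices for illustration.

```python
class LoadStoreQueues:
    """Hypothetical model of a CAM-searched load queue and store queue."""

    def __init__(self):
        self.stores = []  # executed, uncommitted stores: {"seq", "addr", "data"}
        self.loads = []   # executed, uncommitted loads: {"seq", "addr", "value", "alias"}

    def execute_load(self, seq, addr, memory):
        # CAM search of the store queue: pick the youngest store that is
        # older than this load and writes the same address.
        match = None
        for st in self.stores:
            if st["seq"] < seq and st["addr"] == addr:
                if match is None or st["seq"] > match["seq"]:
                    match = st
        # Forward the store value on a match, otherwise read memory/cache.
        value = match["data"] if match else memory.get(addr, 0)
        self.loads.append({"seq": seq, "addr": addr, "value": value, "alias": False})
        return value

    def execute_store(self, seq, addr, data):
        self.stores.append({"seq": seq, "addr": addr, "data": data})
        # CAM search of the load queue: a younger load to the same address
        # that already executed may hold stale data, so set its alias bit.
        # This rule is conservative and can flag loads that were in fact correct.
        for ld in self.loads:
            if ld["seq"] > seq and ld["addr"] == addr:
                ld["alias"] = True

    def commit_load(self, seq):
        """Remove the load and return True if the pipeline must be flushed."""
        for i, ld in enumerate(self.loads):
            if ld["seq"] == seq:
                self.loads.pop(i)
                return ld["alias"]
        raise KeyError(seq)
```

A load that executes after an older store to the same address receives the forwarded value and commits cleanly. A load that executes before such a store is marked with the alias bit, and `commit_load` reports that a flush is needed.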
8
Verification Environment
The verification environment uses Spike as the golden reference.
Each test case generated by AAPG (written in Python) consists of roughly 20,000 instructions.
Our in-house Instruction Set Simulator (ISS) dumps the state of the processor by recording all register and memory values for each instruction executed.
Tests performed:
RISC-V tests
AAPG
RISC-V Torture test cases
CSMITH tests
[Flow diagram labels: AAPG, ISS dump, RTL dump, MATCH, YES: done, NO]
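The dump comparison step in this flow can be sketched as below. The dump format is an assumption: each dump is taken to be a list with one dictionary per executed instruction, mapping a register or memory location name to its value. The function name `first_mismatch` is hypothetical and is not part of AAPG, Spike, or the in-house ISS.

```python
def first_mismatch(ref_dump, rtl_dump):
    """Return (index, key, ref_value, rtl_value) for the first divergence, or None.

    Each dump is assumed to be a list of dicts, one per executed instruction.
    """
    for i, (ref, rtl) in enumerate(zip(ref_dump, rtl_dump)):
        # Compare every location that appears in either snapshot.
        for key in sorted(set(ref) | set(rtl)):
            if ref.get(key) != rtl.get(key):
                return (i, key, ref.get(key), rtl.get(key))
    if len(ref_dump) != len(rtl_dump):
        # One side executed more instructions than the other.
        i = min(len(ref_dump), len(rtl_dump))
        return (i, "<length>", len(ref_dump), len(rtl_dump))
    return None


if __name__ == "__main__":
    ref = [{"x1": 1, "x2": 0}, {"x1": 1, "x2": 5}]
    rtl = [{"x1": 1, "x2": 0}, {"x1": 1, "x2": 6}]
    print(first_mismatch(ref, rtl))
```

Reporting the first divergent instruction, rather than only a pass or fail result, narrows debugging to the point where the RTL state first departs from the reference.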
9
IClass Performance Results.
Benchmarks:
CoreMark: 3.6 CoreMark/MHz
Dhrystone: 2.6 DMIPS/MHz
Synthesis results:
FPGA LUT count: 110K
FPGA frequency: 100 MHz
10
Manager-Client Pairing.
Acquire: request from client to manager
Probe: request from manager to client
Release: response from client to manager
Grant: response from manager to client
Finish: acknowledgement by client
Jesse G. Beu et al. "Manager-Client Pairing: A Framework for Implementing Coherence Hierarchies." '11
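One way the five message types could fit together in a single transaction is sketched below. The ordering shown (Acquire, then a Probe and Release per other sharer, then Grant, then Finish) and the choice to address Finish to the manager are assumptions for illustration, and probes are serialized only to keep the trace simple. The names `Msg` and `acquire_trace` are hypothetical.

```python
from enum import Enum


class Msg(Enum):
    ACQUIRE = "Acquire"   # request from client to manager
    PROBE = "Probe"       # request from manager to client
    RELEASE = "Release"   # response from client to manager
    GRANT = "Grant"       # response from manager to client
    FINISH = "Finish"     # acknowledgement by client


def acquire_trace(requester, sharers):
    """Return an assumed message trace as (message, source, destination) tuples."""
    trace = [(Msg.ACQUIRE, requester, "manager")]
    for client in sharers:
        if client != requester:
            # The manager probes each other sharer and waits for its release.
            trace.append((Msg.PROBE, "manager", client))
            trace.append((Msg.RELEASE, client, "manager"))
    trace.append((Msg.GRANT, "manager", requester))
    trace.append((Msg.FINISH, requester, "manager"))
    return trace


if __name__ == "__main__":
    for msg, src, dst in acquire_trace("c0", ["c1", "c2"]):
        print(f"{msg.value}: {src} -> {dst}")
```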
11
Bi-directional Ring Bus with MCP
[Diagram labels: Manager, Client]
12
Mesh with Notification Network
Number of hops on an M x N mesh network
Total hops: M + N
[Diagram labels: M, N]
Daya, Bhavya K., et al. "SCORPIO: a 36-core research chip demonstrating snoopy coherence on a scalable mesh NoC with in-network ordering." '14
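The hop count can be checked with a short calculation. The sketch assumes that a hop is one link traversal and that packets follow dimension-ordered (XY) routing, neither of which the slide states. Under those assumptions the corner-to-corner distance is (M - 1) + (N - 1), which the M + N figure bounds from above. The function names are hypothetical.

```python
def xy_hops(src, dst):
    """Link traversals between two (x, y) nodes under assumed XY routing."""
    (sx, sy), (dx, dy) = src, dst
    return abs(sx - dx) + abs(sy - dy)


def worst_case_hops(m, n):
    """Corner-to-corner hop count on an m x n mesh."""
    return xy_hops((0, 0), (m - 1, n - 1))


if __name__ == "__main__":
    m, n = 6, 6  # a 36-core mesh
    print(worst_case_hops(m, n), "<=", m + n)
```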
13
Hybrid Interconnects.
[Diagram labels: Manager, Manager, Client, Client, Manager, Manager]
14
Source Code
All our code is open-sourced.
You can find it at
Contact us for further discussions and collaborations.