Architecture and Synthesis for Multi-Cycle Communication


SOC Group, VLSI CAD Lab, led by Jason Cong
Yiping Fan, Guoling Han, Xun Yang, Zhiru Zhang

Motivation
What is happening now: interconnect delays dominate timing in deep-submicron (DSM) technologies.
What is about to happen: single-cycle full-chip synchronization is no longer possible.

Our Approach
Regular Distributed Register (RDR) micro-architecture: highly regular, with direct support for multi-cycle on-chip communication.
MCAS (Architectural Synthesis for Multi-cycle Communication): integrates architectural synthesis (e.g. binding, scheduling) with physical planning, and targets the RDR architecture.

MCAS vs. Conventional Flow
On average, MCAS reduces the clock period by 31% and the total latency by 24%, with 18% resource overhead and an 11% increase in clock cycles.

[Figure: MCAS (Multi-Cycle Architectural Synthesis) flow. Stages shown: CDFG generation; resource allocation & functional unit binding; scheduling-driven placement; placement-driven rescheduling & rebinding; register and port binding; datapath & FSM generation. Data shown: C program, CDFG, ICG, locations, RTL VHDL, floorplan constraints, multi-cycle path constraints.]

[Figure: chip span in mm versus clock cycles. Axis marks: 7.52, 15.04, 22.56, 24.9 (mm); regions labeled 1 clock through 7 clock.]

[Figure: RDR architecture. Labels: island (width Wi, height Hi); Local Computational Cluster (LCC) with ALU, MUL, and MUX; register file; FSM; cluster with area constraint; global interconnect; 1 cycle, 2 cycles, ..., K cycles.]

[Figure: example Control Data Flow Graph (CDFG) with operations 1 to 12 (+, -, *), and the Interconnected Component Graph (ICG) over Mul1, Mul2, Alu1, Alu2.]

MCAS vs. Synopsys Behavioral Compiler
On average, MCAS reduces the clock period by 21% and the total latency by 29%, without area overhead.

[Figure: RDR placement with register file. Operation binding: Alu1: 1, 5, 10; Mul2: 3, 7, 11; Alu2: 2, 6, 9; Mul1: 4, 8, 12.]

MCAS System
Scheduling-driven placement: integrates list scheduling with a simulated-annealing (SA) based global placement to minimize the total latency, and employs a net weighting technique to shorten the critical global connections.
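To illustrate why placement matters to the schedule, here is a minimal sketch of a resource-constrained list scheduler in which a data transfer between two functional units may cost extra cycles. It is not the MCAS algorithm itself: the function name `list_schedule`, the `comm_cycles` table, the non-pipelined unit model, and the tiny three-operation example are all assumptions made for illustration.

```python
def list_schedule(ops, preds, binding, unit_latency, comm_cycles):
    """Schedule ops (given in priority order) onto their bound units.

    ops          : list of operation names, highest priority first (assumed given)
    preds        : dict op -> list of predecessor ops (must be acyclic)
    binding      : dict op -> functional unit name
    unit_latency : dict unit -> execution latency in cycles (>= 1)
    comm_cycles  : dict (src_unit, dst_unit) -> extra transfer cycles
                   (hypothetical table; missing pairs cost 0 extra cycles)
    Returns dict op -> start cycle.
    """
    start = {}
    unit_free_at = {u: 0 for u in set(binding.values())}
    remaining = list(ops)
    cycle = 0
    while remaining:
        for op in list(remaining):
            unit = binding[op]
            if unit_free_at[unit] > cycle:
                continue  # unit is still busy with an earlier operation
            ready = True
            for p in preds.get(op, ()):
                if p not in start:
                    ready = False
                    break
                src = binding[p]
                # Operand arrives after execution plus any multi-cycle transfer.
                arrival = start[p] + unit_latency[src] + comm_cycles.get((src, unit), 0)
                if arrival > cycle:
                    ready = False
                    break
            if ready:
                start[op] = cycle
                unit_free_at[unit] = cycle + unit_latency[unit]
                remaining.remove(op)
        cycle += 1
    return start


# Hypothetical example: one multiply and one add feed a second add.
ops = ["m1", "a1", "a2"]
preds = {"a2": ["m1", "a1"]}
binding = {"m1": "Mul1", "a1": "Alu1", "a2": "Alu2"}
unit_latency = {"Mul1": 2, "Alu1": 1, "Alu2": 1}

near = list_schedule(ops, preds, binding, unit_latency, {})
far = list_schedule(ops, preds, binding, unit_latency, {("Mul1", "Alu2"): 1})
print(near["a2"], far["a2"])
```

In this sketch, placing Mul1 far from Alu2 delays the dependent operation by the transfer cost, which is the kind of effect a placement step that shortens critical global connections is meant to reduce.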
Placement-driven rescheduling & rebinding: integrates force-directed list scheduling with simultaneous rescheduling and rebinding to further minimize the latency.

RDR Architecture
Distribute registers to each "island".
Choose the island size such that local computation and communication in each island can be done in a single cycle of period T:

$$D_{\text{intra-island}} = D_{\text{logic}} + D_{\text{opt-int}} \le D_{\text{logic}} + 2\,D_{\text{opt-int}}(W_i + H_i) \le T$$
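The island-size constraint can be made concrete with a small calculation. The sketch below assumes a linear delay model for optimized interconnect, D_opt-int(x) = delay_per_mm * x, and counts cycles between islands from the Manhattan distance of their grid positions. Both of these modeling choices, the function names, and all numeric values are hypothetical and are not taken from the presentation.

```python
import math


def max_island_half_perimeter(t_ps, d_logic_ps, delay_per_mm_ps):
    """Largest W_i + H_i (mm) with D_logic + 2 * D_opt_int(W_i + H_i) <= T.

    Assumes D_opt_int(x) = delay_per_mm_ps * x (linear model, an assumption).
    """
    slack = t_ps - d_logic_ps
    if slack <= 0:
        raise ValueError("logic delay alone exceeds the clock period")
    return slack / (2 * delay_per_mm_ps)


def cycles_between_islands(a, b, island_w_mm, island_h_mm, t_ps, delay_per_mm_ps):
    """Clock cycles for a transfer between islands a and b.

    a, b are (column, row) grid positions; distance is Manhattan distance
    in island pitches (an assumption). Every transfer takes at least 1 cycle.
    """
    dist_mm = abs(a[0] - b[0]) * island_w_mm + abs(a[1] - b[1]) * island_h_mm
    return max(1, math.ceil(dist_mm * delay_per_mm_ps / t_ps))


# Hypothetical numbers: 2000 ps clock, 1000 ps logic delay, 100 ps per mm.
T_PS, D_LOGIC_PS, PER_MM_PS = 2000, 1000, 100
limit = max_island_half_perimeter(T_PS, D_LOGIC_PS, PER_MM_PS)
print("W_i + H_i must not exceed", limit, "mm")

# A 2 mm x 3 mm island sits exactly at that limit.
W_MM, H_MM = 2, 3
for dst in [(0, 0), (3, 2), (8, 8), (10, 7)]:
    print(dst, cycles_between_islands((0, 0), dst, W_MM, H_MM, T_PS, PER_MM_PS))
```

With a table like this, each inter-island connection gets an integer cycle count that a scheduler can treat as a multi-cycle path.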

