Yield Enhancement for 3D-Stacked Memory by Redundancy Sharing across Dies. Li Jiang, Rong Ye and Qiang Xu.

Presentation transcript:

Yield Enhancement for 3D-Stacked Memory by Redundancy Sharing across Dies. Li Jiang, Rong Ye and Qiang Xu. Presenter: Qiang Xu. CUhk REliable Computing Laboratory, Department of Computer Science & Engineering, The Chinese University of Hong Kong.

Outline: Introduction; Motivation; Redundancy Sharing in 3D-Stacked Memory; Die Matching for Yield Enhancement; Conclusion.

Why 3D-Stacked Memory? The CPU-memory performance gap (relative performance over time, up to now) creates a memory wall. 3D stacking offers: small die size and routing cost; large bandwidth; reduced bus capacitance and latency.

3D-Stacked DRAMs Are Already Here … 2002: Tezzaron, 1 Gb SDRAM. NEC: 4 Gb (Gbit density, 3 Gbps/pin, 8 strata, TSV). 2009: Samsung, 8 Gb. IMEC: DRAM + logic. GATech + Tezzaron, 2010. Trend: higher bandwidth, faster, closer to the processor. [Figure labels: interposer, peripherals, PCB, TSV, DRAM, I/O buffer, RD/WR.]

To Guarantee Yield for 3D-Stacked Memory … Flow: memory test (fault bitmap), then redundancy analysis, then stacking of self-reparable dies. More self-reparable dies means high redundancy cost!

Redundancy Analysis for Reparability. Two example dies, each with 1R, 2C (R: spare rows, C: spare columns): one is irreparable, the other is self-reparable.

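Redundancy Sharing for Yield Enhancement. Without sharing, one die with 1R, 2C is irreparable and the other die with 1R, 2C is self-reparable. With sharing, the same spares are used as 0R, 3C on one die and 2R, 1C on the other, and both dies are reparable. With the same amount of resources, memory yield can be improved by redundancy sharing!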

Redundancy Sharing across Dies. Components: programmable decoder; pre-fabricated multiplexor. Full sharing: Num TSV = Num Spare Row + Num Spare Col. Each die repairs its own block first and uses the rest of its spares to repair others. What if there are defective TSVs?

Redundancy Sharing across Dies. Components: programmable decoder; pre-fabricated multiplexor. Partial sharing: fewer TSVs.

Matching is Critical for the Final Yield. Self-reparable matching: yield = 25%. Aggressive matching: yield = 0%. Effective matching: yield = 75%. Conservative matching: yield = 50%.

How to Conduct Die Matching? Construct an undirected graph with each die as a vertex. Add an edge between two dies if they are reparable with redundancy sharing. Run a maximum matching algorithm on the graph.
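The graph formulation above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `max_die_matching` and the edge list are hypothetical, and the search is a memoized exhaustive one that is practical for a handful of dies only. A production flow would use a polynomial-time general-graph matching algorithm such as Edmonds' blossom algorithm.

```python
from functools import lru_cache

def max_die_matching(num_dies, compatible_pairs):
    """Return a maximum set of disjoint die pairs (exhaustive search).

    num_dies: dies are numbered 0 .. num_dies-1 (one vertex per die).
    compatible_pairs: edges (a, b), meaning dies a and b are judged
    reparable when stacked with redundancy sharing.
    """
    adj = [0] * num_dies              # adjacency as bitmasks
    for a, b in compatible_pairs:
        adj[a] |= 1 << b
        adj[b] |= 1 << a

    @lru_cache(maxsize=None)
    def best(free):
        # free: bitmask of dies that have not been decided yet
        if free == 0:
            return ()
        d = (free & -free).bit_length() - 1   # lowest-numbered free die
        rest = free & ~(1 << d)
        result = best(rest)                   # option: leave die d unmatched
        candidates = adj[d] & rest
        while candidates:
            low = candidates & -candidates
            e = low.bit_length() - 1
            candidates ^= low
            option = ((d, e),) + best(rest & ~low)   # option: pair d with e
            if len(option) > len(result):
                result = option
        return result

    return list(best((1 << num_dies) - 1))

# Four dies in a chain 0-1-2-3: pairing (1, 2) first would waste dies 0 and 3.
pairs = max_die_matching(4, [(0, 1), (1, 2), (2, 3)])
print(pairs)
```

The chain example shows why the choice of pairs matters: a careless pairing yields one stack, while the maximum matching yields two.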

How to Conduct Die Matching? How do we know whether two dies are reparable after bonding? Running the final repair algorithm on every pair gives the best yield, but it is time-consuming. We have to efficiently estimate whether two dies matched together can form a reparable stack.

Die Matching a.t. Reparability Condition. Fr: faulty bits suitable for row repair. Fc: faulty bits suitable for column repair. Fo: orthogonal faulty bits.

Matching a.t. reparability condition is rather conservative. [Figure: optimal matched dies.]

Irreparability Condition. Given a bipartite graph $G = (V, E)$, the minimum number of vertices that cover all the edges is equal to the number of edges in any maximum bipartite matching of the graph (König's theorem). Given two memory blocks $a$ and $b$ with redundancy $R_a/C_a$ and $R_b/C_b$, let $|M_a|$ and $|M_b|$ be the sizes of the maximum bipartite matchings of their fault graphs $G_a$ and $G_b$. The stacked memory is considered to be "reparable" if $|M_a| + |M_b| \le R_a + C_a + R_b + C_b$.
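A minimal sketch of this check, assuming the usual redundancy-analysis modeling in which each faulty cell is an edge between its row vertex and its column vertex (the slide does not spell out how the graphs are built). The function names and the fault lists are hypothetical; the matching routine is the standard augmenting-path method.

```python
def max_bipartite_matching(faults):
    """Size of a maximum matching in the row/column fault graph.

    faults: iterable of (row, col) faulty cells. Assumption: each
    faulty cell is an edge between its row and its column.
    """
    cols_of = {}
    for r, c in faults:
        cols_of.setdefault(r, set()).add(c)
    row_of_col = {}                      # column -> row matched to it

    def augment(r, seen):
        # Try to match row r, re-routing earlier rows if necessary.
        for c in cols_of[r]:
            if c in seen:
                continue
            seen.add(c)
            if c not in row_of_col or augment(row_of_col[c], seen):
                row_of_col[c] = r
                return True
        return False

    return sum(1 for r in cols_of if augment(r, set()))

def passes_irreparability_check(faults_a, spares_a, faults_b, spares_b):
    """True if |Ma| + |Mb| <= Ra + Ca + Rb + Cb.

    spares_a, spares_b: (spare_rows, spare_cols) of each block.
    """
    need = max_bipartite_matching(faults_a) + max_bipartite_matching(faults_b)
    have = sum(spares_a) + sum(spares_b)
    return need <= have

row_fault = [(0, 0), (0, 1), (0, 2)]     # one faulty row: matching size 1
scattered = [(0, 0), (1, 1), (2, 2)]     # three isolated cells: size 3
print(passes_irreparability_check(row_fault, (1, 1), scattered, (1, 1)))
```

By König's theorem the matching size equals the minimum number of spare lines needed to cover all faults, so a pair that fails this check cannot be repaired with the pooled spares. A pair that passes is only a candidate.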

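Die Matching a.t. Irreparability Condition. Reparability is NOT guaranteed for dies that pass this condition, because the condition counts only the total number of spares and ignores how they are configured as rows and columns.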
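A small sketch of why the redundancy configuration matters. The helper `exactly_reparable` is hypothetical and brute-force (it tries every choice of rows to replace), so it is meant for tiny examples only and is not the final repair algorithm used in the talk.

```python
from itertools import combinations

def exactly_reparable(faults, spare_rows, spare_cols):
    """Exhaustively decide whether spare rows/columns can cover all faults."""
    faults = set(faults)
    faulty_rows = sorted({r for r, _ in faults})
    max_rows = min(spare_rows, len(faulty_rows))
    for k in range(max_rows + 1):
        for rows in combinations(faulty_rows, k):
            replaced = set(rows)
            # Columns still holding a fault after the chosen rows are replaced.
            left_cols = {c for r, c in faults if r not in replaced}
            if len(left_cols) <= spare_cols:
                return True
    return False

# One faulty row with three bad cells. Its fault graph has a maximum
# matching of size 1, so a budget of 2 spares passes the counting check.
row_fault = [(0, 0), (0, 1), (0, 2)]
print(exactly_reparable(row_fault, 0, 2))   # spares are 0R, 2C
print(exactly_reparable(row_fault, 1, 0))   # spares are 1R, 0C
```

With 0R, 2C the block cannot be repaired even though two spares exceed the matching size of one, while a single spare row repairs it. The counting condition alone cannot distinguish the two configurations.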

Die Matching a.t. Irreparability Condition. Matching a.t. irreparability condition is rather aggressive. [Figure: optimal matched dies; reparable dies a.t. irreparability condition; matched dies a.t. irreparability condition.]

Iterative Die Matching. Iterative matching a.t. tightened irreparability condition in each run. Run 0: $0 + |M_a| + |M_b| \le R_a + C_a + R_b + C_b$. [Figure: optimal matched dies; reparable dies a.t. irreparability condition; matched dies a.t. irreparability condition.]

Iterative Die Matching. Iterative matching a.t. tightened irreparability condition in each run. Run 1: $1 + |M_a| + |M_b| \le R_a + C_a + R_b + C_b$. [Figure: rest of dies; reparable dies a.t. irreparability condition; matched dies a.t. irreparability condition.]

Iterative Die Matching. Iterative matching a.t. tightened irreparability condition in each run. Run 2: $2 + |M_a| + |M_b| \le R_a + C_a + R_b + C_b$.

Iterative Die Matching. Iterative matching a.t. tightened irreparability condition in each run. Run $k$: $k + |M_a| + |M_b| \le R_a + C_a + R_b + C_b$. No more reparable dies found.
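One possible reading of the iterative procedure, written as a hedged sketch. Everything here is an assumption for illustration: the callbacks `passes_check(a, b, k)` (the irreparability condition tightened by an integer k) and `verify_pair(a, b)` (an exact repair check), the greedy pairing that stands in for a maximum matching algorithm, and the rule that dies whose proposed pair fails verification return to the pool for the next run.

```python
def iterative_die_matching(dies, passes_check, verify_pair, max_k=16):
    """Pair dies in runs, tightening the check by k = 0, 1, 2, ...

    Returns (accepted_pairs, leftover_dies).
    """
    remaining = list(dies)
    accepted = []
    for k in range(max_k + 1):
        # Greedy pairing: a simple stand-in for maximum matching.
        used, proposed = set(), []
        for i, a in enumerate(remaining):
            if a in used:
                continue
            for b in remaining[i + 1:]:
                if b not in used and passes_check(a, b, k):
                    proposed.append((a, b))
                    used.update((a, b))
                    break
        if not proposed:
            break                      # no more candidate pairs at this k
        good = [pair for pair in proposed if verify_pair(*pair)]
        accepted.extend(good)
        bonded = {d for pair in good for d in pair}
        remaining = [d for d in remaining if d not in bonded]
    return accepted, remaining

# Toy data: each die needs `load[d]` spares; a pair shares 6 spares,
# and the exact check (hypothetically) needs one spare of margin.
load = {0: 3, 1: 3, 2: 1, 3: 2}
check = lambda a, b, k: k + load[a] + load[b] <= 6
verify = lambda a, b: load[a] + load[b] <= 5
print(iterative_die_matching([0, 1, 2, 3], check, verify))
```

In the toy run, dies 0 and 1 pass the loose check at k = 0 but fail verification, and no pair survives once the check is tightened to k = 1, so the loop stops.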

Experiment Setup
Memory: Gb memory, stacked into 2-layer chips; 4×4 memory blocks, 8k×8k bit-cells.
Fault injection: Poisson distribution with parameter λ; Polya-Eggenberger distribution with λ = 2.130 and two α settings (more clustered faults, and evenly distributed faults); random TSV faults with a faulty rate of 0.1%; six kinds of faults:

Fault    Single Cell  Double Cell  Single Row  Single Col  Double Row  Double Col
Case 1   40%          4%           20%         20%         8%          8%
Case 2   70%          4%           8%          8%          5%          5%
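A rough sketch of how such fault injection could be simulated for one memory block. The fault-type mix is the Case 1 row above; everything else is an assumption for illustration: the function names, the λ value of 3.0 (the Poisson parameter is not preserved in the transcript), the small block size, and uniform placement of faults. The Polya-Eggenberger (clustered) model is not covered here.

```python
import math
import random

# Case 1 fault-type mix from the setup table.
CASE_1 = {"single_cell": 0.40, "double_cell": 0.04, "single_row": 0.20,
          "single_col": 0.20, "double_row": 0.08, "double_col": 0.08}

def poisson(rng, lam):
    """Draw a Poisson variate (Knuth's method; fine for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def inject_faults(rng, lam, mix, size):
    """Return a list of (kind, row, col) faults for one size x size block."""
    kinds, weights = zip(*mix.items())
    faults = []
    for _ in range(poisson(rng, lam)):
        kind = rng.choices(kinds, weights=weights)[0]
        # Anchor position; size - 1 leaves room for "double" faults
        # that also affect the adjacent row or column.
        row, col = rng.randrange(size - 1), rng.randrange(size - 1)
        faults.append((kind, row, col))
    return faults

rng = random.Random(0)                 # fixed seed for repeatable runs
block_faults = inject_faults(rng, 3.0, CASE_1, size=8)   # lam=3.0 is hypothetical
print(block_faults)
```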

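Experimental Results: Poisson Distribution. [Figure: yield comparison of self-repair, matching a.t. reparability condition, matching a.t. irreparability condition, and iterative matching a.t. irreparability condition.]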

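Experimental Results. [Figure: panels for Case 1 and Case 2, comparing self-repair, matching a.t. reparability condition, matching a.t. irreparability condition, and iterative matching a.t. irreparability condition.]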

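Experimental Results: Polya-Eggenberger Distribution. [Figure: one panel per α setting (one labeled α = 2.38), comparing self-repair, matching a.t. reparability condition, matching a.t. irreparability condition, and iterative matching a.t. irreparability condition.]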

Summary. We propose to conduct redundancy sharing across vertical dies in 3D-stacked memory: significant yield enhancement; minor TSV and routing cost. We present novel solutions for selective die matching to maximize 3D-stacked memory yield.

Thank you for your attention!