High Speed Cache
ECE 432 Group 4: Aaron Albin, Jisoon Kim, Kiwamu Sato
Bitcell
- Wide cell, 16.35 x 7.2 um
- Advantages: noise margin, less variation
- Noise margin: static 1.58 V, read 0.76 V, write 1.6 V
Peripheral Circuitry
- Decoders: static CMOS. Delay of the 6-64 decoder: 3.14 ns
- Mux: transmission gates. Delay of the 16-1 mux: 2.81 ns
- Pulse generator for WL and SE: decreases the power consumed by the WL and BL/BLB, and ensures WL is not high during precharge time
- Why not dynamic logic: with pull-down, the signals overlap; with pull-up, there is dead space during precharge. Probably more power as well.
- Decoder notes (burst mode): we use only one row decoder to save space, unlike the textbook, which has one row decoder per block.
- Advantage: fewer decoders, so less area. Disadvantage: each decoder drives more load, but we added logic to choose which bank of WLs to charge up. The original plan was to expand this for burst mode.
- The other decoders are for read (column, bank) and write (column, bank).
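The shared row decoder with per-bank word-line selection described above can be sketched behaviorally. This is only an illustrative model, not the group's circuit: the function names are hypothetical, and gating each word line with the bank select and the WL pulse is an assumption based on the notes about choosing which bank of WLs to charge and keeping WL low during precharge.

```python
ROWS = 64    # outputs of the single 6-64 row decoder
BANKS = 16   # banks that share that decoder

def row_one_hot(row):
    """Behavioral 6-64 decode: exactly one of the 64 outputs is 1."""
    if not 0 <= row < ROWS:
        raise ValueError("row must fit in 6 bits")
    return [1 if i == row else 0 for i in range(ROWS)]

def bank_wordlines(row, bank, wl_pulse):
    """Word lines for every bank (assumed gating, see lead-in).

    A word line goes high only if its row is decoded, its bank is
    selected, and the WL pulse is active (i.e. not during precharge).
    """
    if not 0 <= bank < BANKS:
        raise ValueError("bank must fit in 4 bits")
    decoded = row_one_hot(row)
    return [
        [bit & int(b == bank) & int(wl_pulse) for bit in decoded]
        for b in range(BANKS)
    ]

if __name__ == "__main__":
    wl = bank_wordlines(row=5, bank=3, wl_pulse=True)
    print("active word lines:", sum(map(sum, wl)))
```

With the pulse inactive, no word line in any bank is driven, which is the behavior the pulse generator is meant to guarantee during precharge.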
Block Addressing Scheme
- 11-bit address: bit 0 for the column, bits 1-4 for the bank, bits 5-10 for the row
- Allows for Burst Mode
- 2 column and 2 bank decoders (for read and write), 1 row decoder
We chose to use an unusual addressing scheme. Since our cache is 8 kBytes in size, it needs 11 bits for addressing. We used the first bit… This means that instead of reading within a block, we read across all 16 of our blocks. After reading 2 words, instead of going to the next row to read the next word, we go to the next block. The reason for this addressing scheme was to implement our special feature, called Burst Mode. Burst Mode would allow us to read multiple words in one cycle. It is built around the idea of spatial locality: when one word is read, there is a very high chance that the next few words will be read as well. Our addressing scheme would allow words to be read in parallel. To achieve this we used these decoders… However, Burst Mode meant more complex logic. Unfortunately, we have not been able to implement it in testing because of time constraints.
- Cache: 8 kBytes, 2 words per column in a bank, 16 banks, 64 rows => 1 column bit, 4 bank bits, 6 row bits of decoding
- The address progresses horizontally, unlike the usual scheme
- Intended to work for Burst Mode: reading not just one word but multiple words (4 words) in parallel
- Burst Mode: spatial locality; needs a bigger output bus and logic to choose banks and WLs
- Did not have enough time
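The address split above can be illustrated with a short sketch. It assumes that bit 0 is the least significant bit of the address; the function name is hypothetical. The 4-byte word size is not stated on the slide and is only what 8 kBytes divided by 2^11 addresses implies.

```python
COLUMN_BITS, BANK_BITS, ROW_BITS = 1, 4, 6   # 11 address bits in total

def split_address(addr):
    """Split an 11-bit word address into (column, bank, row).

    Assumes bit 0 is the LSB: bit 0 = column, bits 1-4 = bank,
    bits 5-10 = row.
    """
    if not 0 <= addr < 1 << (COLUMN_BITS + BANK_BITS + ROW_BITS):
        raise ValueError("address must fit in 11 bits")
    column = addr & 0x1
    bank = (addr >> COLUMN_BITS) & 0xF
    row = (addr >> (COLUMN_BITS + BANK_BITS)) & 0x3F
    return column, bank, row

# 8 kBytes over 2**11 addresses implies 4-byte words (derived, not stated).
BYTES_PER_WORD = (8 * 1024) // (1 << 11)

if __name__ == "__main__":
    # Consecutive addresses fill the 2 words of one bank, then move to the
    # next bank; the row only changes after all 16 banks have been visited.
    for addr in (0, 1, 2, 3, 31, 32):
        print(addr, split_address(addr))
```

Because the bank bits sit directly above the column bit, neighboring words land in different banks, which is what would let several of them be read in parallel in Burst Mode.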
Simulation Results
- Area of SRAM: 6.5 mm x 2.5 mm
- Area: 20.7 x 10^6 um^2 (20.7 mm^2)
- Total power: 669.8 mW
- Total delay: 6.8 ns (read delay: 6.8 ns, write delay: 6 ns)
- Precharge delay: 4.4 ns
- Total cycle: 11 ns (91 MHz)
- Metric = 6.411 x 10^8 W·ns^2·um^2
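The units of the metric suggest it is power x delay^2 x area. As a quick check under that assumption, the sketch below recomputes it from the reported power, total delay, and the 20.7 x 10^6 um^2 area, and derives the clock rate from the 11 ns cycle. Variable names are illustrative only.

```python
power_w = 669.8e-3   # total power, W
delay_ns = 6.8       # total (read) delay, ns
area_um2 = 20.7e6    # area, um^2
cycle_ns = 11.0      # total cycle time, ns

# Assumed definition, inferred from the units W * ns^2 * um^2.
metric = power_w * delay_ns ** 2 * area_um2

# 1 / (cycle in ns) gives GHz; multiply by 1e3 for MHz.
clock_mhz = 1e3 / cycle_ns

print(f"metric = {metric:.3e} W*ns^2*um^2")   # 6.411e+08
print(f"clock  = {clock_mhz:.0f} MHz")        # 91
```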
Improvements
- Other types of faster decoder and mux: Lyon-Schediwy decoder and mux
- Burst Mode