Hamming Transcoders for Power Reduction on Internal Buses Victor Wen Jan. 13, 2000 University of California, Berkeley.

Outline
- Motivations
- Related Work
- Initial Approaches
- Transition Code Technique
- Preliminary Results
- Future Work/Conclusion

Power reduction through coding
- Can we encode information in a way that takes less power?
- Do this on chip?!
[Diagram: Input -> Encoder -> Encoded Version -> Decode -> Output]

Reasoning
- Increasing importance of wires relative to transistors
  - Spend transistors to drive wires more efficiently?
  - Try to reduce transitions over wires
- Orthogonal to other power-saving techniques, e.g.:
  - Voltage reduction, low-swing drive
  - Clock gating
  - Parallelism (like vectors!)
- Portable devices are becoming more important

Related Work
- Bus-Invert Coding, by M. R. Stan and W. P. Burleson
  - Reduces peak power by 50%, average power by up to 25%
- Working-Zone Encoding, by E. Musoll et al.
  - Compares favorably with other techniques
- Test Vector Ordering, by P. Girard et al.
  - Result: 8.2% to 54.1% less activity
- Minimizing Power Consumption, by A. Chandrakasan and R. Brodersen
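
For readers unfamiliar with bus-invert coding, the following is a minimal Python sketch of the basic idea: if more than half of the data lines would toggle, send the complemented word and raise an extra invert line. The function names, the 8-bit width, and the all-zero starting bus are assumptions made for illustration. This simplified variant compares data lines only, so it is not a faithful reproduction of the published scheme.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, width=8):
    """Return (bus_value, invert_bit) pairs for a bus with one extra invert line.

    Assumption: the bus starts at all zeros.
    """
    mask = (1 << width) - 1
    prev_bus = 0
    encoded = []
    for word in words:
        # Invert when more than half of the data lines would toggle.
        if hamming_distance(prev_bus, word) > width // 2:
            bus, invert = (~word) & mask, 1
        else:
            bus, invert = word, 0
        encoded.append((bus, invert))
        prev_bus = bus
    return encoded


def bus_invert_decode(encoded, width=8):
    """Undo bus_invert_encode: complement the words whose invert bit is set."""
    mask = (1 << width) - 1
    return [(~bus) & mask if invert else bus for bus, invert in encoded]


words = [0x00, 0xFF, 0x0F, 0xF0]
encoded = bus_invert_encode(words)
decoded = bus_invert_decode(encoded)
```

In this example the data lines toggle only when 0x0F is sent; the two words that would have flipped all eight lines are signalled through the invert line instead.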

Huffman-based Compression
- Variable bit length: a problem!
  - Possible solution: macro clock
- Fewer bits != fewer transitions
[Diagram: Input -> Encoder -> ... -> Decode -> Output]
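
The point that fewer bits do not imply fewer transitions can be seen in a small Python sketch. The two bit strings are made-up examples sent serially on a single wire; they are not taken from the slides or from any Huffman code.

```python
def serial_transitions(bits: str) -> int:
    """Count 0->1 and 1->0 changes along a bit string sent on one wire."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


# Hypothetical streams: the longer one toggles the wire far less often.
long_stream = "0000000011111111"   # 16 bits
short_stream = "01010101"          # 8 bits

long_count = serial_transitions(long_stream)
short_count = serial_transitions(short_stream)
```

The 16-bit stream changes level once, while the 8-bit stream changes level seven times, so a compressed stream can cost more switching activity than the uncompressed one.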

Hamming Weight
- Find a map function to minimize transitions
- Search space is large: 256! (for an 8-bit bus)
- Leads to the transition code idea
[Diagram: Input -> Encoder (map function) -> ... -> Decode -> Output]
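
To give a feel for the size of that search space, the sketch below counts the one-to-one maps from bus values to codes. It assumes the map is a bijection on all 2^width values; the function name is illustrative.

```python
import math


def bijective_map_count(width: int) -> int:
    """Number of one-to-one maps from 2**width bus values onto 2**width codes."""
    return math.factorial(1 << width)


# Number of decimal digits in 256!, the count for an 8-bit bus.
digits_for_8_bit_bus = len(str(bijective_map_count(8)))
```

For an 8-bit bus the count has 507 decimal digits, which rules out exhaustive search.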

Hamming Transcoder
- Most frequent arcs are assigned low-weight codes
- Use output codes to XOR the transmission line
  - Every 1 in the coded version causes a transition
  - Most frequent arcs cause the least number of transitions
- 256x256 table for an 8-bit bus
[Diagram: state transition diagram; example arcs: code 0x00 with frequency 2620, code 0xFF with frequency 10]

Hamming Transcoder (cont'd)
- Only transitions matter, not absolute values
- Recognize more frequent transitions and assign low-weight codes to them
- Guarantees that more frequent transitions cause fewer bit changes on the wire
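
The transition-code idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's implementation: the table is built per previous value (one row of the 256x256 table each), ties are broken by numeric value, and both the input history and the bus start at zero. All function names are hypothetical.

```python
from collections import Counter


def popcount(x: int) -> int:
    """Hamming weight: number of 1 bits in x."""
    return bin(x).count("1")


def build_tables(trace, width=8):
    """Build encode[prev][cur] -> code and decode[prev][code] -> cur tables."""
    trace = list(trace)
    size = 1 << width
    # Count (previous input, current input) arcs; history is assumed to start at 0.
    freq = Counter(zip([0] + trace[:-1], trace))
    codes_by_weight = sorted(range(size), key=lambda c: (popcount(c), c))
    encode, decode = [], []
    for prev in range(size):
        # Most frequent successor first, so it receives the lowest-weight code.
        ranked = sorted(range(size), key=lambda cur: (-freq[(prev, cur)], cur))
        enc_row, dec_row = [0] * size, [0] * size
        for cur, code in zip(ranked, codes_by_weight):
            enc_row[cur] = code
            dec_row[code] = cur
        encode.append(enc_row)
        decode.append(dec_row)
    return encode, decode


def transcode(trace, encode):
    """XOR each code onto the bus; every 1 in the code toggles one wire."""
    prev_in, bus, out = 0, 0, []
    for cur in trace:
        bus ^= encode[prev_in][cur]
        out.append(bus)
        prev_in = cur
    return out


def untranscode(bus_values, decode):
    """Recover the inputs from successive bus values."""
    prev_in, prev_bus, out = 0, 0, []
    for bus in bus_values:
        cur = decode[prev_in][bus ^ prev_bus]
        out.append(cur)
        prev_in, prev_bus = cur, bus
    return out


def count_transitions(values):
    """Total wire toggles, assuming the bus starts at 0."""
    prev, total = 0, 0
    for v in values:
        total += popcount(prev ^ v)
        prev = v
    return total


trace = [0xFF, 0x00] * 8
encode, decode = build_tables(trace)
coded = transcode(trace, encode)
```

On this artificial trace the raw bus toggles all eight wires on every word, while the transcoded bus never toggles, because the only arcs that occur are the most frequent ones in their rows and therefore receive code 0x00.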

Transition Code – Setup
[Diagram: coder and decoder blocks. Labels: transition table indexed by previous input and current input; 8-bit transcode; "Coded?" flag; XOR with current bus value; 9-bit path to the bus]

Simulation Setup
- Verilog-XL Sim: Verilog simulation on picoJava core RTL
- Monitor: custom monitoring component outputs the bits on selected buses
- Post-process: post-processes the output files into a format suitable for the transcoder simulator
- Transcode Sim: reads the file, sets up the transition table, and performs the simulation
- Sun is offering processor descriptions in Verilog
  - picoJava (for now)
  - UltraSparc (soon)
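
The post-processing step might look like the sketch below. The trace format (one hexadecimal bus value per line) is an assumption made for illustration; the slides do not describe the actual monitor output or the real tool's file format.

```python
def parse_trace(text: str):
    """Parse a hypothetical trace: one hexadecimal bus value per non-blank line."""
    return [int(line, 16) for line in text.splitlines() if line.strip()]


def toggles_per_line(values, width=8):
    """Count how often each bus wire changes between consecutive samples."""
    counts = [0] * width
    for prev, cur in zip(values, values[1:]):
        diff = prev ^ cur
        for bit in range(width):
            counts[bit] += (diff >> bit) & 1
    return counts


sample = "00\nff\n0f\n"
values = parse_trace(sample)
per_line = toggles_per_line(values)
total = sum(per_line)
```

Per-wire counts like these would be the baseline against which a transcoder's savings are measured.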

Simulation Results (1)
- Savings
  - Rank 9 saves 79.52%
  - Rank 256 saves 79.68%
- 9th bit overhead
  - Rank 1: 23%
  - Rank 9: 0.29%

Simulation Results (2)
- Number of transitions drops quickly as rank increases
- 256x256 table might not be necessary
- Other trace files show similar trends
- Note: icu_data connects the instruction cache unit and the integer unit. It is a fairly long bus according to picoJava's floorplan.
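
One way a smaller table could work is sketched below, under heavy assumptions. The slides do not define "rank" precisely; here it is taken to mean the number of most-frequent successors kept per previous value. A ninth flag line is assumed to tell the decoder whether a word was table-coded or sent raw. All names are hypothetical.

```python
from collections import Counter, defaultdict


def popcount(x: int) -> int:
    """Hamming weight: number of 1 bits in x."""
    return bin(x).count("1")


def build_ranked_tables(trace, rank, width=8):
    """Map each previous value's `rank` most frequent successors to low-weight codes."""
    trace = list(trace)
    arcs = Counter(zip([0] + trace[:-1], trace))  # history assumed to start at 0
    successors = defaultdict(list)
    for (prev, cur), count in arcs.items():
        successors[prev].append((count, cur))
    codes = sorted(range(1 << width), key=lambda c: (popcount(c), c))[:rank]
    encode, decode = {}, {}
    for prev, items in successors.items():
        top = [cur for _, cur in sorted(items, key=lambda t: (-t[0], t[1]))[:rank]]
        encode[prev] = dict(zip(top, codes))
        decode[prev] = {code: cur for cur, code in encode[prev].items()}
    return encode, decode


def encode_ranked(trace, encode):
    """Return (bus, flag) pairs; flag=1 means the bus was changed by a table code."""
    prev_in, bus, out = 0, 0, []
    for cur in trace:
        code = encode.get(prev_in, {}).get(cur)
        if code is None:
            bus, flag = cur, 0          # table miss: send the raw value
        else:
            bus, flag = bus ^ code, 1   # table hit: toggle only the code's bits
        out.append((bus, flag))
        prev_in = cur
    return out


def decode_ranked(pairs, decode):
    """Recover the inputs from (bus, flag) pairs."""
    prev_in, prev_bus, out = 0, 0, []
    for bus, flag in pairs:
        cur = decode[prev_in][bus ^ prev_bus] if flag else bus
        out.append(cur)
        prev_in, prev_bus = cur, bus
    return out


def total_transitions(pairs):
    """Toggles on the data lines plus toggles on the flag line."""
    prev_bus, prev_flag, total = 0, 0, 0
    for bus, flag in pairs:
        total += popcount(prev_bus ^ bus) + (prev_flag ^ flag)
        prev_bus, prev_flag = bus, flag
    return total


trace = [0xFF, 0x00] * 8
enc, dec = build_ranked_tables(trace, rank=1)
pairs = encode_ranked(trace, enc)
```

With rank 1 this artificial trace needs a single toggle, on the flag line. Real traces would show a trade-off between table size and flag-line overhead; the percentages on the slides come from the author's simulations, not from this sketch.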

Conclusion & Future Work
- Conclusion
  - Transition coding attacks the root of the problem
  - Minimal change to existing circuits
  - Orthogonal to other low-power techniques
- Future work
  - Simulate SPEC on Sparc & UltraSparc RTL
  - Build adaptability into coder/decoder
  - Use more history
  - Implement actual hardware