Evaluating Register File Size

Evaluating Register File Size in ASIP Design
Manoj Kumar Jain, M. Balakrishnan
Indian Institute of Technology Delhi, India
Lars Wehmeyer, Stefan Steinke, Peter Marwedel
University of Dortmund, Germany

Overview
- Introduction
- Experimental Setup
- Methodology and Results
- Analysis
- Conclusion

Application Specific Instruction Set Processors
- Designed for a specific application
- Exploit special characteristics of the application to meet the desired constraints
- Efficient for applications such as digital signal processing, automatic control systems, and cellular phones

GPP-ASIP-ASIC
- Performance: GPP Low, ASIP High, ASIC Very High
- Flexibility: GPP Excellent, ASIP Good, ASIC Poor
- HW Design Effort: GPP Nil, ASIP Large, ASIC Very Large
SW Design Effort Small Power Medium Reuse Markets Relatively large Cost Mainly on SW S-O-C Volume sensitive

Flow Diagram of a typical ASIP Design Methodology
- Application & Design Constraints
- Application Analysis
- Architectural Design Space Exploration
- Instruction Set Generation
- Code Synthesis
- Hardware Synthesis
- Object Code
- Processor Description

Objectives
Study the effect of a change in register file size on
- Power
- Performance
- Code Size

Experimental Setup
[Diagram: encc Compiler, Instruction Set Simulator, Benchmark Suite, Register File Size, Trace Data]

encc Compiler Environment
[Diagram: C Code, encc, assembly, Assembler & Linker, executable, energy database, profiling information, trace analyzer, trace file, ISS]

ARM7TDMI processor
Features:
- 32-bit RISC
- 16 general-purpose registers
- ALU, multiplier, shifter
- 2 instruction sets: ARM and THUMB
Evaluation board:
- 4 KB on-chip memory
- 512 KB external RAM

Benchmark Suite
DSP algorithms:
- biquad_N_sections
- lattice_init
- matrix-mult
Media application:
- me_ivlin
Standard sorting algorithms:
- bubble_sort
- heap_sort
- insertion_sort
- selection_sort
http://www.cse.iitd.ernet.in/~manoj/research/benchmarks.html

Power Model
- Based on Tiwari’s model
- Considers processor and memory power
- Power based on actual measurements
- Power models associated with each instruction for two different configurations
Off-chip data and instructions:
P_total(inst) = P_cpu(inst) + P_offchip(read, 16) + P_offchip(read/write, width)
On-chip instructions and off-chip data:
P_total(inst) = P_cpu(inst) + P_onchip(read, 16) + P_offchip(read/write, width)
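
The two per-instruction formulas above can be sketched in a few lines of Python. This is only an illustration: the class name `MemoryPower`, the function `total_power`, and every numeric value are hypothetical, and treating the data-access term as zero for instructions that do not touch data memory is an assumption, not something the slides state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryPower:
    """Hypothetical power figures for memory accesses (arbitrary units)."""
    onchip_read_16: float   # 16-bit instruction fetch from on-chip memory
    offchip_read_16: float  # 16-bit instruction fetch from off-chip memory
    offchip_data: dict      # access width in bits -> off-chip read/write power


def total_power(p_cpu, mem, instr_onchip, data_width=None):
    """P_total(inst) = P_cpu(inst) + P_fetch(read, 16) + P_offchip(read/write, width).

    instr_onchip selects the configuration: on-chip or off-chip instruction fetch.
    data_width is None for instructions without a data access (assumption: the
    data term is then zero).
    """
    fetch = mem.onchip_read_16 if instr_onchip else mem.offchip_read_16
    data = mem.offchip_data[data_width] if data_width is not None else 0.0
    return p_cpu + fetch + data


# Usage with made-up numbers: the same 32-bit load in both configurations.
mem = MemoryPower(onchip_read_16=10.0, offchip_read_16=50.0,
                  offchip_data={8: 40.0, 32: 60.0})
all_offchip = total_power(100.0, mem, instr_onchip=False, data_width=32)
onchip_instr = total_power(100.0, mem, instr_onchip=True, data_width=32)
```

Only the instruction-fetch term differs between the two configurations, so any gap between them comes from where the code is stored.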

Assumptions
- The processor cycle time does not change with the number of registers
- The power consumed by each instruction does not change significantly with the number of registers

Methodology
Steps to generate the data:
- Generate code using encc
- Evaluate code quality
  - Static: analysis of assembly code
  - Dynamic: analysis of trace generated by ISS
- Change the number of registers in the compiler configuration file
- Differences in code quality are caused by spilling

Results
Range:
- Number of registers: 3 to 8
Memory configurations:
- off-chip only
- on-chip instructions, off-chip data
Results collected:
- number of instructions executed
- number of cycles
- ratio of spilling instructions (static)
- power consumption
- energy consumption
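
Once a metric such as the cycle count has been collected for each register count, the effect of adding one register can be tabulated as below. This is a minimal sketch: the function name `stepwise_reduction` and the sample numbers are hypothetical, and defining the improvement as the fractional reduction relative to the previous register count is an assumption about how such figures might be computed.

```python
def stepwise_reduction(metric_by_regs):
    """Map each register count n to the fractional reduction of the metric
    when going from n - 1 to n registers. Counts with no n - 1 entry are skipped."""
    reductions = {}
    for n in sorted(metric_by_regs):
        prev = metric_by_regs.get(n - 1)
        if prev:  # skips a missing or zero previous value
            reductions[n] = (prev - metric_by_regs[n]) / prev
    return reductions


# Usage with made-up cycle counts for 3, 4 and 5 registers.
cycles = {3: 1000, 4: 500, 5: 400}
per_step = stepwise_reduction(cycles)
largest_step = max(per_step, key=per_step.get)  # register count with the biggest gain
```

The same function works for executed instructions or energy, since it only compares neighbouring entries.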

Number of executed instructions

Number of Cycles (off-chip memory)

Number of Cycles (on-chip instr. off-chip data)

Average power consumption (off-chip memory)

Average power consumption (on-chip instr. off-chip data)

Energy Consumption (off-chip memory)

Energy Consumption (on-chip instr. off-chip data)

Ratio of spill instructions to total static code size

Maximum variation in results

Results for the program lattice_init

Results for the program me_ivlin

Time saving and Power saving contributions in Energy Saving

Energy Saving due to Voltage Scaling
Here we assume that the total execution time is constant. To keep the execution time constant when execution requires fewer cycles, we increase the clock period. With the increased clock period, we can reduce the supply voltage. To estimate the supply voltage for a given clock period, we referred to the paper “Low Power CMOS Digital Design”, A. P. Chandrakasan et al., IEEE J. Solid-State Circuits, Vol. 27, No. 4, pp. 473-484, April 1992. With this estimated voltage we calculated the energy. Energy is the product of average power consumption and execution time; here the execution time is constant and power depends quadratically on voltage. Taking these facts into account, we computed the energy consumption.
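
The voltage-scaling estimate can be sketched as follows. The slides do not say which relation was used to map clock period to supply voltage; this sketch assumes the first-order CMOS delay model, delay proportional to V / (V - Vt)^2, and assumes that the energy of a fixed amount of work scales with V^2. The function names, the 3.3 V reference supply and the 0.7 V threshold are all hypothetical.

```python
def relative_delay(v, v_t):
    """First-order CMOS gate delay up to a constant factor: V / (V - Vt)^2."""
    return v / (v - v_t) ** 2


def scaled_voltage(slowdown, v_ref, v_t, tol=1e-9):
    """Find the supply voltage at which gate delay is `slowdown` times the
    delay at v_ref, using bisection on the interval (v_t, v_ref]."""
    if slowdown < 1.0:
        raise ValueError("slowdown must be >= 1")
    target = slowdown * relative_delay(v_ref, v_t)
    lo, hi = v_t + 1e-6, v_ref  # delay decreases monotonically as v increases
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if relative_delay(mid, v_t) > target:
            lo = mid  # circuit too slow at mid: the voltage must be higher
        else:
            hi = mid
    return hi


def energy_with_scaling(energy_nominal, cycles_new, cycles_base, v_ref, v_t):
    """Energy after stretching the clock so that cycles_new cycles take as long
    as cycles_base cycles did, then lowering the supply voltage to match."""
    slowdown = cycles_base / cycles_new  # allowed increase in clock period
    v = scaled_voltage(slowdown, v_ref, v_t)
    return energy_nominal * (v / v_ref) ** 2


# Usage with made-up numbers: half the cycles, so the clock period may double.
e_scaled = energy_with_scaling(1.0, cycles_new=500, cycles_base=1000,
                               v_ref=3.3, v_t=0.7)
```

Bisection is used because the delay relation has no convenient closed-form inverse and is monotonic above the threshold voltage.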

Conclusion
- Studied the number of instructions executed, number of cycles, spilling, power, and energy consumption for the ARM7TDMI processor.
- Similar results for the LEON processor.
- The number of registers ranged from 3 to 8.
- Adding a single register results in up to 57.5% performance improvement and 62.9% reduction in energy consumption.

Future work
- Identify and extract application parameters to assist early estimation of the optimal number of registers.
- Consider the effect of changing the number of registers on instruction encoding and instruction bit-width.

References
- Ghazal, N. et al. “Retargetable estimation scheme for DSP architecture selection.” ASP-DAC 2000, pp. 485-489.
- Gupta, T.V.K. et al. “Processor evaluation in an embedded system design environment.” VLSI 2000, pp. 98-103.
- http://www.arm.com/
- Jain, M.K., Balakrishnan, M., Kumar, A. “ASIP Design Methodologies: Survey and Issues.” To appear in VLSI 2001.
- http://ls12-www.cs.uni-dortmund.de/~leupers/lanceV2/lanceV2.html
- Sato, J. et al. “An integrated design environment for application specific integrated processor.” ICCAD 1991, pp. 414-417.
- Tiwari, V. et al. “Power analysis of embedded software.” ICCAD 1994, pp. 384-390.

Thanks