Specific Choice of Soft Processor Features Mark Grover Prof. Greg Steffan Dept. of Electrical and Computer Engineering
Hard and Soft Processors: Hard Processors: made from transistors; cost millions to make; faster in speed; consume less power. Soft Processors: built on FPGA fabric; customizable; can cater to application-specific needs. (Diagram labels: Verilog, Processor Architecture)
Research Problem: Choose the best micro-architectural features – Want to optimize the use of resources: power consumption (as low as possible), area (as small as possible), wall-clock time (the lower the better). Time Spent
SPREE (Soft Processor Rapid Exploration Environment): Scans the whole design space. Is that viable enough? – What if a new application comes into the picture? – What if the performance criteria change? Say, the user doesn't care about area any more?
Research Objective: Enhanced Simulator (MINT). Inputs: software application; maximum power and area. Stages: Enhanced Simulator Part 1; Enhanced Simulator Part 2 (labelled "Approximates" in the diagram). Output: fastest micro-architectural combination. Questions to handle: What if a new application comes into the picture? What if the performance criteria change?
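The objective on this slide (given a power and area budget, return the fastest micro-architectural combination) can be sketched as a constrained selection. This is only an illustration, not MINT itself: the `Variant` type, its fields, the units, and every number below are hypothetical. The only fact relied on is that wall-clock time equals cycle count divided by clock frequency.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Variant:
    """One processor variant; every field and value here is hypothetical."""
    name: str
    power_mw: float    # power consumption
    area_les: float    # area, e.g. in FPGA logic elements
    cycles: int        # cycle count for the application
    clock_mhz: float   # clock frequency

    @property
    def wall_clock_us(self) -> float:
        # MHz means cycles per microsecond, so cycles / MHz is microseconds.
        return self.cycles / self.clock_mhz


def fastest_within_budget(variants: Iterable[Variant],
                          max_power_mw: float,
                          max_area_les: float) -> Optional[Variant]:
    """Return the feasible variant with the lowest wall-clock time, or None."""
    feasible = [v for v in variants
                if v.power_mw <= max_power_mw and v.area_les <= max_area_les]
    return min(feasible, key=lambda v: v.wall_clock_us, default=None)


# Made-up numbers for illustration only (not measurements).
VARIANTS = [
    Variant("pipe5_hardmul", 120.0, 1500.0, 1_000_000, 80.0),
    Variant("pipe5_softmul", 100.0, 1200.0, 1_600_000, 85.0),
    Variant("pipe3_softmul", 90.0, 1000.0, 2_000_000, 90.0),
]

if __name__ == "__main__":
    best = fastest_within_budget(VARIANTS, max_power_mw=105.0, max_area_les=1300.0)
    print(best.name if best else "no variant fits the budget")
```

If the user stops caring about area, the same function can be called with `max_area_les=float("inf")`, which is one way to read the "performance criteria change" question.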
Outline Motivation Implementation – Implementation Scheme(in general) – Data deciphering Results – Multiplier option Discussion Conclusion Long Term Goal
Implementation Scheme: Gather experimental data for some benchmarks; look for trends and dependencies; propose a suitable relationship; compare with the trade-offs and provide the best solution.
Data Deciphering: Multiplier option (hard/soft multiplier) – What is the approximate change in cycle count when switching between them? Without a hard multiplier, each multiplication is converted to a set of shifts and adds – Simulated the algorithm to find the equivalent number of instructions – Plotted the number of equivalent instructions vs. the change in cycle count (experimental data)
Hard and Soft Multiplier: Hard Multiplier: does the multiply operation as a single instruction; occupies finite area; delays the clock by a finite time; consumes a finite amount of power. Soft Multiplier: no dedicated multiplier; each multiply instruction is converted into simpler instructions; no change in area, frequency or power.
Method of Analysis: Each multiply A*B is expanded into a set of branch, shift and add instructions. Doing this for all multiply instructions in the benchmark gives the total change in equivalent instructions. This total is plotted against the (experimental) change in cycle count for all processor variants.
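The expansion of A*B into branches, shifts and adds can be sketched as follows. This is a minimal model, not the routine actually used in the study: the per-iteration instruction costs (one branch to test the low bit, one add when it is set, two shifts, one loop branch) and the 32-bit width are assumptions chosen for illustration, and the function names are hypothetical.

```python
def soft_multiply(a: int, b: int, width: int = 32):
    """Shift-and-add multiply.

    Returns (product modulo 2**width, equivalent instruction count) under an
    assumed cost model; setup and the final loop exit are not counted.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    product = 0
    instructions = 0
    while b != 0:
        instructions += 1          # branch: test the lowest bit of b
        if b & 1:
            product = (product + a) & mask
            instructions += 1      # add
        a = (a << 1) & mask        # shift multiplicand left
        b >>= 1                    # shift multiplier right
        instructions += 2          # the two shifts
        instructions += 1          # loop-back branch
    return product, instructions


def total_change_in_instructions(multiplies, width: int = 32) -> int:
    """Sum, over all (a, b) multiplies, of the soft count minus 1.

    The 1 is the single instruction a hard multiplier would have used.
    """
    return sum(soft_multiply(a, b, width)[1] - 1 for a, b in multiplies)


if __name__ == "__main__":
    print(soft_multiply(6, 5))
    print(total_change_in_instructions([(6, 5), (3, 3)]))
```

Under this model the count depends on the bit pattern of the second operand, which is why the operands of every multiply in the benchmark have to be simulated rather than multiplying a fixed cost by the number of multiplies.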
Results: Gnuplot was used to plot the graphs on a log scale. A linear correlation was obtained between the plotted points.
Example 1: Plot of the increase in cycle counts (log scale) against the change in equivalent instructions from hard multiplier to soft multiplier on the pipe5, barrelshift processor.
Example 2: Plot of the increase in cycle counts (log scale) against the change in equivalent instructions from hard multiplier to soft multiplier on the serial shift, high rise processor.
Discussion: "fit.log", generated by gnuplot, serves as a good measure of correlation. Percentage uncertainty is expressed by the Asymptotic Standard Error (A.S.E.). Example 1: A.S.E. is 4.132%. Example 2: A.S.E. is 3.166%. A linear dependence is found on the log scale.
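For readers without gnuplot, the kind of number reported here can be reproduced with an ordinary least-squares line fit. The sketch below is a stand-in, not the gnuplot fit used in the talk: it computes the slope, the intercept, and their standard errors scaled by the residual variance, which is how gnuplot's asymptotic standard errors behave for an unweighted linear fit. The slides do not say whether the A.S.E. quoted is for the slope or the intercept, nor whether both axes were logarithmic, so taking log10 of both is an assumption, and the sample data are made up.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = m*x + c.

    Returns (m, c, se_m, se_c), where the standard errors use the residual
    variance RSS / (n - 2).
    """
    n = len(xs)
    if n < 3 or n != len(ys):
        raise ValueError("need at least 3 points and equal-length inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    rss = sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - 2)                       # residual variance
    se_m = math.sqrt(s2 / sxx)
    se_c = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return m, c, se_m, se_c


if __name__ == "__main__":
    # Hypothetical (change in equivalent instructions, increase in cycles) pairs.
    delta_instr = [1e3, 1e4, 1e5, 1e6]
    delta_cycles = [2.1e3, 1.9e4, 2.2e5, 1.8e6]
    # Assumption: fit in log-log space.
    lx = [math.log10(v) for v in delta_instr]
    ly = [math.log10(v) for v in delta_cycles]
    m, c, se_m, se_c = fit_line(lx, ly)
    print(f"slope = {m:.4f} +/- {se_m:.4f} ({100 * se_m / abs(m):.3f}%)")
```

A small percentage error on the fitted parameter is what the slide means by a good linear correlation on the log scale.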
A.S.E of all Processor Variants
Conclusion: The linear fit makes it possible to predict quite accurately the change in cycle count caused by a change in a feature. This change, computed for all the features, serves as input to part 2 of the enhanced simulator. The analysis is a template for future work.
Example 2: Plot of the increase in cycle counts (log scale) against the change in equivalent instructions from hard multiplier to soft multiplier on the serial shift, high rise processor. The input is obtained from part 1 of MINT by running the application on it. The fit then gives the approximate change in cycle count for a new application.
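Reading a prediction off a fitted line amounts to inverting the log transform. The sketch below assumes the fit was done in log-log space, i.e. log10(increase in cycles) = slope * log10(change in equivalent instructions) + intercept; the slides do not confirm this form, and the slope and intercept used in the example are placeholders, not the fitted values from the talk.

```python
import math


def predict_cycle_increase(delta_instructions: float,
                           slope: float,
                           intercept: float) -> float:
    """Invert log10(y) = slope * log10(x) + intercept to get y for a new x."""
    if delta_instructions <= 0:
        raise ValueError("change in equivalent instructions must be positive")
    return 10 ** (slope * math.log10(delta_instructions) + intercept)


if __name__ == "__main__":
    # Placeholder fit parameters for one processor variant (hypothetical).
    print(predict_cycle_increase(50_000, slope=1.0, intercept=0.3))
```

One such pair of fit parameters would be needed per processor variant, since each variant has its own fitted line.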
Future Work: So far, only the multiplier option has been dealt with. Similar analysis on other features. Comparison between user demands and approximate cycle counts.
References: Martin Labrecque and J. Gregory Steffan, "Improving Pipelined Soft Processors with Multithreading"; Peter Yiannacouras, J. Gregory Steffan and Jonathan Rose, "Application-Specific Customization of Soft Processor Microarchitecture".
Special Thanks: Prof. Greg Steffan; CARG (Compiler & Architecture Reading Group); PaCRaT (Parallelism and Customization Research at the University of Toronto)
What I learnt: Research is not a 9-to-5 job; it's a lifestyle of discovering something small but relevant from time to time. At times nothing seems to bear fruit; that is the time to get up from your seat.
Thanks. Any questions?