1
October 9, 2003
2
Acknowledgements
The team would like to acknowledge the technical assistance of Dr. Tyagi and Sriram Nadathur.
3
Definitions
Clock cycle time - The duration of one clock period; its reciprocal is the clock frequency.
Functional units - Individual blocks of logic in the processor.
Hazards - Situations in the processor where more than one instruction tries to access the same storage location or resource at the same time.
IPC - Instructions per clock cycle, a measure of the performance of a processor.
Issue buffer - A memory that determines which instructions can be executed in parallel.
Pipeline - An architectural scheme in which instruction processing is divided into stages.
Pipeline latch - A memory device between pipeline stages.
Rename space - Temporary storage space inside the processor.
Superscalar - A computer architecture in which multiple instructions can be executed in one clock cycle.
Stalls - Cycles in which the processor cannot keep issuing instructions, for example because the rename space is full.
4
Superscalar Processors
Superscalar processors have a pipeline capable of issuing multiple instructions per cycle. The resulting control complexity is managed with a reorder buffer, which keeps instructions committing in program order. The pipeline is still forced to stall when the reorder buffer is full.
In the illustrated example, the purple instruction is waiting on I/O in a reserve unit and sits at the front of the reorder buffer. The blue instructions in the commit unit have been processed, but they cannot commit until the purple instruction completes. Because the reorder buffer is full, the processor must stall: no new instructions can be dispatched, even though two reserve units sit idle.
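The stall described above can be reproduced with a toy model. The following is a minimal sketch, not the SimpleScalar model used in the project: the function name `simulate`, the fixed per-instruction latencies, and the single `width` parameter for both dispatch and commit are all assumptions made for illustration.

```python
from collections import deque


def simulate(latencies, rob_size, width):
    """Toy in-order-commit machine; returns (total_cycles, stall_cycles).

    latencies[i] is the number of cycles instruction i needs after dispatch.
    All names and the machine model are illustrative assumptions.
    """
    rob = deque()        # completion cycle of each in-flight instruction, oldest first
    next_instr = 0
    cycle = 0
    stall_cycles = 0
    while next_instr < len(latencies) or rob:
        # Commit: only the oldest entries may leave, and only once finished.
        committed = 0
        while rob and rob[0] <= cycle and committed < width:
            rob.popleft()
            committed += 1
        # Dispatch: up to `width` new instructions while the buffer has room.
        dispatched = 0
        while (next_instr < len(latencies)
               and len(rob) < rob_size
               and dispatched < width):
            rob.append(cycle + latencies[next_instr])
            next_instr += 1
            dispatched += 1
        # A stall: instructions remain, but none could be dispatched.
        if next_instr < len(latencies) and dispatched == 0:
            stall_cycles += 1
        cycle += 1
    return cycle, stall_cycles


if __name__ == "__main__":
    # One long-latency instruction at the head blocks the short ones behind it.
    workload = [20, 1, 1, 1, 20, 1, 1, 1]
    for size in (4, 8):
        print(size, simulate(workload, rob_size=size, width=2))
```

With this hypothetical workload, the 4-entry buffer fills behind the long-latency instruction and the model records stall cycles, while the 8-entry buffer never fills.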
5
Problem Statement
Achieve a net gain in superscalar processor performance by adaptively changing the rename space size.
6
General Solution
Use the idle functional units in a stalled pipeline as additional rename space when the reorder buffer is full.
7
Approach
Use simulations on an Alpha processor model to determine whether the possible performance enhancement from such a scheme outweighs the extended time per clock cycle.
Design and implement a control algorithm to use the additional rename space.
Verify the correctness of the control logic.
Implement the control algorithm in SPICE.
Quantify the architectural performance gains using the SPEC2000 benchmark.
8
Operating Environment
The design will be tested using processor simulations and hardware models.
Software simulations will be done in SimpleScalar.
The modified processor will not actually be fabricated, but the basic environment is that of a typical superscalar processor.
9
Intended Users
Dr. Tyagi and his research assistants
Microprocessor companies
Other researchers in the field
10
Intended Uses
Supporting Dr. Tyagi’s research in computer architecture performance
Improving the performance of sequentially executed programs
Providing research into increasing superscalar processor performance
11
Assumptions
There will be a performance gain from using pipeline latches for rename space.
When rename space is full, there are functional units that cannot be utilized.
Any control strategy that would yield gains is feasible in CMOS technology.
12
Limitations
Using pipeline latches for rename space will increase capacitance and extend the time per clock cycle.
There are hazards that increasing the rename space size will not fix.
There will be a limited number of pipeline latches available.
Any implementation of the control strategy would be processor dependent.
13
End Product/Deliverables
A research paper detailing the team’s results.
Modified SimpleScalar code that simulates the new control algorithm. The code will be documented and maintainable so further work can be done if necessary.
SPICE simulations and results quantifying the effect on processor performance.
14
Approaches Considered 1/3
Determine how performance is affected by rename space stalls
Selected Approach: Simulate using SimpleScalar
  Advantages:
    SimpleScalar is familiar to the client
    SimpleScalar is open source and easily modified
  Disadvantages:
    None
15
Approaches Considered 3/3
Find an optimal size for the rename space that will decrease cycle time
Approach 1: Run many simulations varying the rename space size
  Advantages:
    Gives a detailed picture of how rename space size relates to performance
  Disadvantages:
    Requires running a large number of simulations
    Doesn’t reveal at what size the rename space is used most efficiently
Approach 2: Run simulations and determine which rename space size is filled to capacity the largest percentage of the time
  Advantages:
    Gives a detailed picture of how the rename space fills up
  Disadvantages:
    Doesn’t reveal what size yields the best performance-to-size ratio
Selected: Approaches 1 and 2, to get as much information as possible
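The measurement in Approach 2 amounts to counting, for each simulated size, the share of cycles in which the rename space is at capacity. The sketch below assumes a per-cycle occupancy trace is available for each run; the trace format and the names `fraction_full` and `most_often_full` are hypothetical and are not SimpleScalar outputs.

```python
def fraction_full(occupancy_trace, capacity):
    """Fraction of cycles in which occupancy reached `capacity`.

    occupancy_trace is an assumed per-cycle list of entries in use.
    """
    if not occupancy_trace:
        return 0.0
    full_cycles = sum(1 for occupied in occupancy_trace if occupied >= capacity)
    return full_cycles / len(occupancy_trace)


def most_often_full(traces):
    """Given {capacity: occupancy_trace}, return the capacity that is
    full the largest fraction of the time."""
    return max(traces, key=lambda cap: fraction_full(traces[cap], cap))


if __name__ == "__main__":
    # Made-up traces for two hypothetical rename space sizes.
    traces = {
        4: [4, 4, 2, 4],   # full in 3 of 4 cycles
        8: [5, 8, 6, 7],   # full in 1 of 4 cycles
    }
    for cap, trace in traces.items():
        print(cap, fraction_full(trace, cap))
    print("most often full:", most_often_full(traces))
```

As the slide notes, this statistic shows how the rename space fills up but does not by itself say which size gives the best performance for its cost, which is why it is paired with the sweep in Approach 1.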
16
Approaches Considered 2/3
Develop an algorithm to adaptively increase rename space using functional units
Approach 1: Use standard functional units to store instructions or data
  Advantages:
    Doesn’t involve changing the functional units
  Disadvantages:
    May cause a significant capacitance increase
Approach 2: Use specially designed functional units
  Advantages:
    May decrease capacitance compared with Approach 1
  Disadvantages:
    Would take a lot of work that might not be worth the gain
Selected: Approach 1. Approach 2 is too large a risk without being able to quantify potential gains. If Approach 1 proves infeasible, we will switch to Approach 2.
17
Research Activities 1/3
Research performance results of different rename space sizes
18
Research Activities 2/3
Can functional units be used for additional rename space?
Find out which functional units are available when stalls happen
Find out how long functional units are available when stalls happen
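One way to answer the availability question is to tally, over the stalled cycles only, how often each functional unit is idle. This is a sketch under assumptions: the input format (a list of `(stalled, busy_units)` pairs per cycle), the unit names, and the function name `idle_during_stalls` are invented for illustration and do not correspond to SimpleScalar output.

```python
from collections import Counter


def idle_during_stalls(cycles, units):
    """Return {unit: fraction of stalled cycles in which the unit was idle}.

    cycles: assumed list of (stalled: bool, busy_units: set of str) per cycle.
    units:  names of all functional units being tracked.
    """
    stall_count = 0
    idle_counts = Counter()
    for stalled, busy_units in cycles:
        if not stalled:
            continue                      # only stalled cycles are of interest
        stall_count += 1
        for unit in units:
            if unit not in busy_units:
                idle_counts[unit] += 1
    if stall_count == 0:
        return {unit: 0.0 for unit in units}
    return {unit: idle_counts[unit] / stall_count for unit in units}


if __name__ == "__main__":
    # Made-up trace: three stalled cycles and one normal cycle.
    trace = [
        (True, {"fp_add"}),
        (True, {"fp_add"}),
        (False, {"int_alu"}),
        (True, set()),
    ]
    print(idle_during_stalls(trace, ["int_alu", "fp_add"]))
```

Extending the tally to record the length of each consecutive idle run would address the second question, how long the units stay available.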
19
Research Activities 3/3
Research the relationship between rename space size, clock speed, and performance
Decide under what conditions additional rename space should be used
Determine how much adaptive rename space is optimal
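The trade-off between rename space size, clock speed, and performance follows from the standard relation: execution time = instruction count × cycle time / IPC. A smaller rename space may lower IPC while shortening the cycle, and the net effect is the product of the two ratios. The sketch below only encodes that relation; the function name and the numbers in the example are hypothetical, not project results.

```python
def speedup(ipc_base, cycle_time_base, ipc_new, cycle_time_new):
    """Speedup of the new design over the baseline for a fixed instruction count.

    time = instructions * cycle_time / IPC, so
    speedup = (ipc_new / ipc_base) * (cycle_time_base / cycle_time_new).
    """
    return (ipc_new / ipc_base) * (cycle_time_base / cycle_time_new)


if __name__ == "__main__":
    # Hypothetical values: IPC drops 5% while the cycle time shrinks 10%.
    print(speedup(ipc_base=2.0, cycle_time_base=1.0,
                  ipc_new=1.9, cycle_time_new=0.9))
```

A result above 1 means the shorter cycle more than pays for the lost IPC; a result below 1 means the design is a net loss.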
20
Present Accomplishments
Determined that less rename space means less capacitance in the chip, so the clock can be set faster.
Determined that beyond a certain size, the benefit of increasing rename space drops sharply.
Determined that a two-cycle access to adaptive rename space lets us keep the clock cycle time gain obtained by shrinking the traditional rename space.
Determined that integer ALU and integer multiplier functional units are often available while the processor is stalled on a full rename space.
Determined that additional rename space should be issued in blocks of 8 and for at least 10 cycles.
Determined that using both functional units and dedicated memory for adaptive rename space is the best approach.
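The block size of 8 and the 10-cycle minimum come from the findings above; everything else in the following sketch is an assumption made for illustration, including the class name, the `max_blocks` cap, and the release rule (give back the newest block once it has been held long enough and the space is no longer full). It is not the team's control algorithm, and it ignores whether entries in a block are still occupied.

```python
BLOCK_SIZE = 8   # entries granted per block (from the slide)
MIN_HOLD = 10    # minimum cycles a block stays allocated (from the slide)


class AdaptiveRenameController:
    """Hypothetical policy: grant extra rename space in fixed-size blocks."""

    def __init__(self, max_blocks):
        self.max_blocks = max_blocks   # assumed cap on borrowed blocks
        self.grant_cycles = []         # grant cycle of each block, oldest first

    def extra_entries(self):
        return len(self.grant_cycles) * BLOCK_SIZE

    def step(self, cycle, space_full):
        """Advance one cycle and return the extra entries now available.

        space_full: True if all currently available rename space is in use.
        """
        if space_full:
            # Borrow another block if the cap allows it.
            if len(self.grant_cycles) < self.max_blocks:
                self.grant_cycles.append(cycle)
        elif self.grant_cycles and cycle - self.grant_cycles[-1] >= MIN_HOLD:
            # Release the newest block only after its minimum hold time.
            self.grant_cycles.pop()
        return self.extra_entries()


if __name__ == "__main__":
    ctrl = AdaptiveRenameController(max_blocks=2)
    for cycle, full in [(0, True), (1, False), (9, False), (10, False)]:
        print(cycle, ctrl.step(cycle, full))
```

In the usage example the block granted at cycle 0 is kept through cycle 9 even though the space is no longer full, and is released at cycle 10.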
21
Design Activities
Designed test cases and simulations with varying rename space sizes
Developed a rudimentary control algorithm as a proof of concept
22
Implementation Activities
Coding of the control strategy in SimpleScalar
Evaluation of the clock speed increase from reducing the rename space size from 64 to 40
23
Future Required Activities
Develop a more advanced control strategy to increase gains.
Design a physical implementation for fabrication.
Write a paper discussing rename space implementation and control strategy.
24
Resources
Personnel (hours)
Hentzel     55
Brandt      86
Thompson    65
Taylor      52
Total      258
Other Resources
Poster     $40
Printing   $10
Total      $50
25
Schedules
26
Project Evaluation
Milestones Successfully Completed
  Determine how processor performance is affected by rename space
  Determine what functional units can be used to increase rename space
  Find an optimal rename space size that will decrease the cycle time of the processor
Milestones in Progress
  Develop an algorithm using functional units to adaptively increase rename space size
  Use SPICE simulations to determine the effects of changes on capacitance and cycle time
Milestones Not Yet Begun
  Quantification of the increase in performance
  A research paper detailing the results
The project will be a success! The team is on schedule to complete all milestones, and the thorough preparation for the implementation stage has yielded a viable solution.
27
Commercialization
This project may have future commercial considerations, but our interest is in the academic research.
28
Recommendations for Further Work
The algorithm could be further optimized and ported to other processor architectures.
The instruction fetch buffer could be examined to find new optimal points with the new architecture.
29
Lessons Learned
Details of superscalar processors.
Computer architecture research and design flow.
Group motivation and task management of complex and simple tasks.
30
Risk Management 1/2
Anticipated
Team motivation
  Handled by continually checking group members’ attitudes.
Members falling behind on knowledge and understanding
  Handled by weekly meetings where questions were asked of members.
Loss of a member or graduate advisor
  Handled by working with a new graduate student with a background in a similar area.
31
Risk Management 2/2
Unanticipated
Time requirements of classes and difficulty scheduling meetings
  Handled by distributing projects between team members and meeting to review each other’s sections.
Little gain from increasing the rename space size
  Handled by making a smaller issue stage with a smaller traditional rename space, so that the clock rate could be increased while adaptive rename space made up the lost capacity.
32
Closing Summary
The goal was to come up with a viable strategy to enhance processor performance by implementing an adaptive rename space. The solution is on track to be a success. This project may lead to more efficient processor designs.