Procedure Return Predictors
  - Use a buffer (stack) of return addresses
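
A minimal sketch of the return-address stack idea, assuming a small fixed-depth buffer that discards its oldest entry on overflow. The class name, method names, depth, and addresses are all hypothetical illustrations, not taken from any particular processor.

```python
class ReturnAddressStack:
    """Fixed-depth return-address predictor (illustrative sketch only)."""

    def __init__(self, depth=8):
        self.depth = depth      # assumed buffer size
        self.entries = []       # most recent return address is last

    def on_call(self, return_address):
        # A call pushes the address of the instruction after the call.
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # overflow: discard the oldest entry
        self.entries.append(return_address)

    def predict_return(self):
        # A return pops the most recent entry; None means no prediction.
        return self.entries.pop() if self.entries else None


ras = ReturnAddressStack(depth=2)
ras.on_call(0x1004)
ras.on_call(0x2008)
ras.on_call(0x300C)  # third nested call overflows the 2-entry buffer
predictions = [ras.predict_return() for _ in range(3)]
```

With a depth of 2, the two innermost returns are predicted correctly, and the outermost one gets no prediction because its entry was overwritten. This is one way the size of the buffer limits accuracy for deeply nested calls.
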
Performance Issues
  - Limitations of branch prediction schemes
      - Prediction accuracy (80% - 95%)
          - Type of program
          - Size of buffer
      - Penalty of misprediction
  - Fetch from both directions to reduce penalty
      - Memory system should:
          - Be dual-ported
          - Have an interleaved cache
          - Fetch from one path and then from the other
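
To see how prediction accuracy and misprediction penalty combine, here is a rough sketch using the usual stall model: effective CPI = base CPI + branch fraction x misprediction rate x penalty. The 20% branch fraction and the 3-cycle penalty are assumed values chosen for illustration; only the 80% and 95% accuracies come from the slide.

```python
def effective_cpi(base_cpi, branch_fraction, accuracy, penalty_cycles):
    """Effective CPI when each mispredicted branch costs penalty_cycles stalls."""
    mispredict_rate = 1.0 - accuracy
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Assumed workload: 20% branches, 3-cycle misprediction penalty, ideal CPI of 1.
for accuracy in (0.80, 0.95):
    cpi = effective_cpi(1.0, 0.20, accuracy, 3)
    print(f"accuracy {accuracy:.0%}: effective CPI = {cpi:.2f}")
```

Under these assumptions the effective CPI is about 1.12 at 80% accuracy and about 1.03 at 95% accuracy, so the remaining gap to CPI = 1 scales directly with the penalty.
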
Approaches to Improve Performance
  - Goal so far: achieve CPI = 1
      - Eliminate structural, data, and control stalls
  - Additional performance improvements
      - Make clock rate faster
          - Improve manufacturing process
          - Increase the number of stages
              - Superpipelining
      - Multiple issue of instructions
          - Superscalar
          - VLIW
          - IPC instead of CPI!
Superscalar Processors
  - Issue more than one instruction per cycle
  - Duplication of functional units
  - Constraints
      - Structural
      - Data dependencies
      - Control dependencies
  - Scheduling of instructions
      - Static
      - Dynamic
  - Sound familiar?
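
The three constraints can be sketched as a check on whether two adjacent instructions may issue in the same cycle. This is a simplified illustration, assuming in-order dual issue, no register renaming, and no speculation past a branch. The `Instr` record, the unit names, and the register names are hypothetical.

```python
from collections import namedtuple

# unit: functional unit class needed; dest: register written (or None);
# srcs: tuple of registers read.
Instr = namedtuple("Instr", ["unit", "dest", "srcs"])


def can_issue_together(first, second, units_available):
    """Return True if `second` may issue in the same cycle as `first`."""
    # Structural constraint: enough functional units of each class.
    needed = {}
    for ins in (first, second):
        needed[ins.unit] = needed.get(ins.unit, 0) + 1
    if any(units_available.get(u, 0) < n for u, n in needed.items()):
        return False
    # Control constraint (assumed policy): nothing issues alongside a
    # preceding branch, since its outcome is not yet known.
    if first.unit == "branch":
        return False
    # Data constraints: read-after-write and write-after-write on `first`.
    if first.dest is not None:
        if first.dest in second.srcs or first.dest == second.dest:
            return False
    return True


units = {"alu": 2, "mem": 1, "branch": 1}
add = Instr("alu", "r1", ("r2", "r3"))
sub = Instr("alu", "r4", ("r1", "r5"))   # reads r1, written by add
load = Instr("mem", "r6", ("r7",))
```

Here `add` and `load` can issue together, `add` and `sub` cannot (data dependence on `r1`), and two loads cannot (only one memory unit). A static scheduler would apply this kind of check at compile time, a dynamic one in hardware at issue time.
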
Midterm Performance
Problem Presentations
  - Problem 1: Kai Kang
  - Problem 2: Martin Jansche
  - Problem 3: Jeff Kostrzewski
  - Problem 4: Muthu M. Pugalanthiran