CE 478: Microcontroller Systems
University of Wisconsin-Eau Claire
Dan Ernst

The Pentium Pro® (P6) Bus
Reference: “Pentium Pro and Pentium II System Architecture”


A Bus for a Different Purpose

Context:
- The “P6” architecture was originally used in the “Pentium Pro” line, which targeted high-performance and SMP (symmetric multiprocessing) users.
- The P5 (Pentium) bus was very simple:
  - At most one thing at a time
  - Among other things, it had very poor support for multiprocessing
- Intel decided to redesign the bus interface to natively support:
  - More complicated transactions for higher performance
  - Basic cache coherence signals for multiprocessor efficiency

3 Types of “Agents” on the Bus

- Request agents
- Response agents
- Snooping agents

Who Can Do What?

Examples of each type:
- Request agents
  - CPU(s)
  - PCI or other bridge or I/O agents
- Response agents
  - Memory controller
  - PCI or other bridge or I/O agents
- Snooping agents
  - CPU(s)
  - Stand-alone bus-attached caches

Phases

The P6 bus divides each transaction into 6 phases:
- Arbitration
- Request
- Error
- Snoop
- Response
- Data

Each phase uses a different (mutually exclusive) set of signals on the bus.
- Key issue: why?

CS 352 Flashback

In our basic processor, each instruction had 5 “steps” it had to work through:
- Fetch, Decode, Execute, Memory, Write-back

Single-cycle and multi-cycle designs dictated:
- No new instruction can start until the previous instruction is completely finished.

Architects used pipelining to greatly increase the throughput of the machine:
- When an instruction leaves a stage, another can use those resources.

    add $t4, $t5, $t6      F D E M W
    beq $t1, $t2, loop       F D E M W
    lw  $t3, 300($zero)        F D E M W

Bus Pipelining

We can accomplish the same goal (increased throughput) on buses:
- Substitute “transaction” for “instruction”

Allow multiple transactions to be in flight at once:
- Each in a different “step” (phase) of the transaction

Key requirement: physical resources (i.e., bus lines) cannot be shared among phases.
- Just as hardware could not be shared between processor pipeline stages
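
The overlap described above can be sketched as a small timeline model. This is only an illustration under loud assumptions: every phase is taken to last exactly one bus cycle and nothing ever stalls, which the real bus does not guarantee (the request phase, for instance, lasts 2 cycles). The function names `schedule`, `total_cycles`, and `phases_never_collide` are hypothetical helpers, not part of any Intel specification.

```python
# Phase names come from the slides; the one-cycle-per-phase timing is an
# assumption made only to keep the illustration small.
PHASES = ["Arbitration", "Request", "Error", "Snoop", "Response", "Data"]


def schedule(num_transactions):
    """Map each transaction to its (cycle, phase) slots in an ideal pipeline.

    Transaction t enters Arbitration in cycle t and advances one phase per
    cycle, with no stalls.
    """
    return {
        t: [(t + p, PHASES[p]) for p in range(len(PHASES))]
        for t in range(num_transactions)
    }


def total_cycles(num_transactions, pipelined):
    """Cycles needed to finish all transactions under the same assumptions."""
    if num_transactions == 0:
        return 0
    if pipelined:
        return len(PHASES) + num_transactions - 1
    return len(PHASES) * num_transactions


def phases_never_collide(num_transactions):
    """Check that no phase is used by two transactions in the same cycle."""
    seen = set()
    for slots in schedule(num_transactions).values():
        for slot in slots:
            if slot in seen:
                return False
            seen.add(slot)
    return True


if __name__ == "__main__":
    # Print a staggered diagram, one row per transaction.
    for t, slots in schedule(3).items():
        row = ["---"] * total_cycles(3, pipelined=True)
        for cycle, phase in slots:
            row[cycle] = phase[:3]
        print(f"T{t}: " + " ".join(row))
```

Because each phase owns its own signal group, two transactions can share a cycle only when they are in different phases, which is exactly what `phases_never_collide` checks in this model.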

Step-by-Step – Part 1 – Arbitration

The P6 bus uses symmetric arbitration, meaning that all request agents on the bus decide among themselves who is granted the bus next.
- Recall that the PowerPC uses a central bus arbiter; the P6 bus has no arbiter.

Who gets to go in the case of a conflict?
- The bus uses a rotational priority scheme.
- Each agent keeps track of who has the bus and who is next in line.
- Exception: a designated priority agent can trump the rotational scheme.
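
A minimal sketch of a rotational priority scheme follows, assuming four symmetric agents numbered 0 to 3 and a single priority agent that always wins when it requests. The class `RotatingArbiter` and its method names are hypothetical; the slides do not give the actual signal-level protocol. The point of the sketch is that every agent can run an identical copy of this state and reach the same decision without a central arbiter.

```python
class RotatingArbiter:
    """Rotational-priority state that every agent would track identically."""

    def __init__(self, num_agents):
        self.num_agents = num_agents
        # Start as if the last agent just owned the bus, so agent 0 is
        # first in line. This starting point is an assumption.
        self.last_owner = num_agents - 1

    def grant(self, requests, priority_request=False):
        """Return the winning agent id, "priority", or None if nobody asks.

        requests: set of symmetric agent ids currently requesting the bus.
        priority_request: True when the priority agent wants the bus.
        """
        if priority_request:
            # The priority agent trumps the rotation and does not disturb it.
            return "priority"
        # Scan agents in rotation order, starting just after the last owner.
        for offset in range(1, self.num_agents + 1):
            candidate = (self.last_owner + offset) % self.num_agents
            if candidate in requests:
                self.last_owner = candidate
                return candidate
        return None


if __name__ == "__main__":
    arbiter = RotatingArbiter(4)
    for _ in range(3):
        print(arbiter.grant({0, 2}))
```

With agents 0 and 2 both requesting continuously, ownership alternates between them, since the agent that just held the bus drops to the back of the line.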

Step-by-Step – Part 2 – Request

Once you have successfully taken control of the bus, you can make a request (provided no other agent is already making one).

The request phase lasts 2 cycles:
- 2 “packets” of data are transferred across the request lines.

Step-by-Step – Part 3 – Error

Pretty simple: if the request’s parity check fails at the receiving end, an error signal (AERR#) is asserted, and the transaction must start over.
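
The parity check can be illustrated with a short sketch. It assumes even parity over a single integer standing in for the request bits; which bits the real bus covers, and its parity convention, are not stated in the slides, so treat both as assumptions. The helper names are hypothetical.

```python
def even_parity_bit(value):
    """Parity bit that makes the total number of 1 bits even (assumed convention)."""
    return bin(value).count("1") % 2


def receiver_detects_error(received_value, received_parity):
    """True when the receiver would flag a parity error (the AERR# case)."""
    return even_parity_bit(received_value) != received_parity


if __name__ == "__main__":
    sent = 0b1011
    parity = even_parity_bit(sent)
    corrupted = sent ^ 0b0001  # flip one bit in transit
    print(receiver_detects_error(sent, parity))
    print(receiver_detects_error(corrupted, parity))
```

A single parity bit catches any odd number of flipped bits but misses an even number, which is why the reaction is a simple restart of the transaction and not a correction.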

Step-by-Step – Part 4 – Snoop

If another (snoop) agent on the bus is holding a more up-to-date copy of a certain piece of data, it gets a chance to intervene.
- This is necessary to support “cache coherence” (CE 452).

If a snooper wants to intervene, its signal will dictate how the next 2 phases develop.

This is also the phase where a response agent can opt to DEFER.

Step-by-Step – Part 5 – Response

The response agent gives its official response. This can include:
- Retry/Defer
- No Data
- Normal Data
- Hard Failure

Sets us up for…

Step-by-Step – Part 6 – Data

The data is transferred over the data bus (64-bit).

Who does the transferring depends on:
- Request type (R/W)
- Response type
- Snoop results
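
One way to picture how those three inputs combine is a small decision function. The mapping below is a simplified assumption for illustration only: the slides do not spell out the rules, and the real protocol has more cases (for example, a write that also hits modified data in another cache). The names `Response` and `data_driver` are hypothetical.

```python
from enum import Enum


class Response(Enum):
    """Response types listed on the Response slide."""
    RETRY_OR_DEFER = "retry/defer"
    NO_DATA = "no data"
    NORMAL_DATA = "normal data"
    HARD_FAILURE = "hard failure"


def data_driver(is_write, response, snoop_hit_modified):
    """Return which agent drives the data bus, or None if no data moves.

    Assumed simplification:
    - a snooper holding a newer copy supplies the data itself;
    - a read answered with Normal Data is driven by the response agent;
    - a write answered with No Data is driven by the request agent;
    - anything else transfers no data in this transaction.
    """
    if snoop_hit_modified:
        return "snooping agent"
    if not is_write and response is Response.NORMAL_DATA:
        return "response agent"
    if is_write and response is Response.NO_DATA:
        return "request agent"
    return None


if __name__ == "__main__":
    print(data_driver(False, Response.NORMAL_DATA, False))
    print(data_driver(True, Response.NO_DATA, False))
    print(data_driver(False, Response.NORMAL_DATA, True))
```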