Multiprocessor Initialization


Multiprocessor Initialization: An introduction to the use of Interprocessor Interrupts

Multiprocessor topology

[Figure: block diagram showing CPU #0 and CPU #1 (each with a Local APIC), the IO APIC, the Front Side Bus, the Back Side Bus, a bridge, system memory, and peripheral devices]

The Local-APIC ID register

Bits 31-24: APIC ID
Bits 23-0: reserved

This register is initially zero, but its APIC ID field (8 bits) is programmed by the BIOS during system startup with a unique processor identification number, which is subsequently used when specifying the processor as a recipient of inter-processor interrupts.

Memory-Mapped Register-Address: 0xFEE00020

The Local-APIC EOI register

Bits 31-0: write-only

This write-only register is used by Interrupt Service Routines to issue an ‘End-Of-Interrupt’ command to the Local-APIC. Any value written to this register will be interpreted by the Local-APIC as an EOI command. The value stored in this register is initially zero (and it will remain unchanged).

Memory-Mapped Register-Address: 0xFEE000B0

The Spurious Interrupt register

Bits 31-9: reserved
Bit 8: EN, Local-APIC is Enabled (1=yes, 0=no)
Bits 7-0: spurious vector

This register is used to enable or disable the functioning of the Local-APIC and, when enabled, to specify the interrupt-vector number to be delivered to the processor in case the Local-APIC generates a ‘spurious’ interrupt. (In some processor models, the vector’s lowest 4 bits are hardwired 1s.)

Memory-Mapped Register-Address: 0xFEE000F0

Interrupt Command Register

Each Pentium’s Local-APIC has a 64-bit Interrupt Command Register.
It can be programmed by system software to transmit messages (via the Back Side Bus) to one or several other processors.
Each processor has a unique identification number in its Local-APIC ID Register that can be used for directing messages to it.

ICR (upper 32-bits)

Bits 31-24: Destination field
Bits 23-0: reserved

The Destination Field (8 bits) can be used to specify which processor (or group of processors) will receive the message.

Memory-Mapped Register-Address: 0xFEE00310

ICR (lower 32-bits)

Bits 31-20: reserved
Bits 19-18: Destination Shorthand
    00 = no shorthand
    01 = only to self
    10 = all including self
    11 = all excluding self
Bit 15: Trigger Mode (0 = Edge, 1 = Level)
Bit 14: Level (0 = De-assert, 1 = Assert)
Bit 12: Delivery Status, read-only (0 = Idle, 1 = Pending)
Bit 11: Destination Mode (0 = Physical, 1 = Logical)
Bits 10-8: Delivery Mode
    000 = Fixed
    001 = Lowest Priority
    010 = SMI
    011 = (reserved)
    100 = NMI
    101 = INIT
    110 = Start Up
    111 = (reserved)
Bits 7-0: Vector field

Memory-Mapped Register-Address: 0xFEE00300
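The field layout above can be checked by composing a command word from its parts. The following is a minimal sketch in C, not code from the demo program; the names icr_low, DELIVERY_INIT, DELIVERY_STARTUP and SHORTHAND_ALL_EXCLUDING_SELF are hypothetical helpers introduced only for illustration.

```c
#include <stdint.h>

/* Hypothetical names for the field codes listed on the slide. */
enum { DELIVERY_INIT = 5, DELIVERY_STARTUP = 6 };   /* 101 and 110 */
enum { SHORTHAND_ALL_EXCLUDING_SELF = 3 };          /* 11 */

/* Compose the lower 32 bits of the ICR from four of its fields.
 * Destination Mode (bit 11) and Trigger Mode (bit 15) are left at 0,
 * and Delivery Status (bit 12) is read-only, so it is not set here. */
uint32_t icr_low(uint32_t shorthand, uint32_t level_assert,
                 uint32_t delivery_mode, uint32_t vector)
{
    return ((shorthand     & 0x3u) << 18)   /* bits 19-18 */
         | ((level_assert  & 0x1u) << 14)   /* bit 14     */
         | ((delivery_mode & 0x7u) << 8)    /* bits 10-8  */
         |  (vector        & 0xFFu);        /* bits 7-0   */
}
```

With shorthand 'all excluding self' and Level = Assert, delivery mode INIT with vector 0 gives 0x000C4500, and delivery mode Start Up with vector 0x11 gives 0x000C4611, which are the two constants the deck loads into EAX when it issues the INIT and Startup IPIs.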

MP initialization protocol

Set shared processor-counter equal to 1
Step 1: issue an ‘INIT’ IPI to all-except-self
        Delay for 10 milliseconds
Step 2: issue ‘Startup’ IPI to all-except-self
        Delay for 200 microseconds
Step 3: issue ‘Startup’ IPI to all-except-self
Check the value of the processor-counter

Issue ‘INIT’ IPI

        # address Local-APIC via register FS
        mov     $sel_fs, %ax
        mov     %ax, %fs

        # broadcast ‘INIT’ IPI to ‘all-except-self’
        mov     $0x000C4500, %eax
        mov     %eax, %fs:(0xFEE00300)
.B0:    btl     $12, %fs:(0xFEE00300)   # test the Delivery Status bit
        jc      .B0                     # loop while delivery is pending

Issue ‘Startup’ IPI

        # broadcast ‘Startup’ IPI to all-except-self
        # using vector 0x11 to specify entry-point
        # at real memory-address 0x00011000
        mov     $0x000C4611, %eax
        mov     %eax, %fs:(0xFEE00300)
.B1:    btl     $12, %fs:(0xFEE00300)   # test the Delivery Status bit
        jc      .B1                     # loop while delivery is pending
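The relation between the Startup vector and the entry-point can be written out as a small calculation. This is a sketch in C with hypothetical function names: the 8-bit vector VV selects the 4K page at physical address VV000 (hex), which a processor starting in real mode reaches as segment VV00, offset 0000.

```c
#include <stdint.h>

/* Physical address at which an AP begins executing for a given Startup vector. */
uint32_t sipi_entry_address(uint8_t vector)
{
    return (uint32_t)vector << 12;      /* 4K page number -> byte address */
}

/* Real-mode code-segment value that corresponds to that address (offset is 0). */
uint16_t sipi_entry_segment(uint8_t vector)
{
    return (uint16_t)((uint16_t)vector << 8);
}
```

For the vector 0x11 used above, sipi_entry_address returns 0x00011000, which agrees with the comment in the assembly code, and sipi_entry_segment returns 0x1100.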

Timing delays

Intel’s MP Initialization Protocol specifies the use of some timing-delays:
    10 milliseconds ( = 10,000 microseconds)
    200 microseconds

We can use the 8254 Timer’s Channel 2 for implementing these timed delays, by programming it for ‘one-shot’ countdown mode, then polling bit #5 at i/o port 0x61.

Mathematical examples

EXAMPLE 1
Delaying for 10 milliseconds means delaying for 1/100-th of a second (because 100 times 10 milliseconds = one-thousand milliseconds).

EXAMPLE 2
Delaying for 200 microseconds means delaying for 1/5000-th of a second (because 5000 times 200 microseconds = one-million microseconds).

GENERAL PRINCIPLE
Delaying for x microseconds means delaying for 1/(1000000/x)-th of a second, that is, x/1000000 seconds (because 1000000/x times x microseconds = one-million microseconds).

Mathematical theory

PROBLEM: Given the desired delay-time in microseconds, express the desired delay-time in clock-frequency pulses and program that number into the PIT’s Latch-Register.

RECALL: Clock-Frequency = 1193182 Hertz (pulses per second)
ALSO: One second equals one-million microseconds

APPLYING DIMENSIONAL ANALYSIS
Pulses-Per-Microsecond = Pulses-Per-Second / Microseconds-Per-Second
Delay-in-Clock-Pulses = Delay-in-Microseconds * Pulses-Per-Microsecond

CONCLUSION
For a desired time-delay of x microseconds, the number of clock-pulses may be computed as

    x * (1193182 / 1000000) = 1193182 / (1000000 / x)

as dividing by a fraction amounts to multiplying by that fraction’s reciprocal.
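The conclusion can be tried out with integer arithmetic. The sketch below, in C with hypothetical function names, computes the pulse count in two ways: with the two successive divisions that the formula 1193182 / (1000000 / x) suggests, and by multiplying first in 64-bit arithmetic. It assumes 0 < x <= 1000000 for the two-division form, since otherwise the inner quotient would be zero.

```c
#include <stdint.h>

#define PIT_PULSES_PER_SEC 1193182u
#define USEC_PER_SEC       1000000u

/* 1193182 / (1000000 / x), using two integer divisions.
 * Requires 0 < usec <= 1000000 so that the inner quotient is nonzero. */
uint32_t pit_count_two_divisions(uint32_t usec)
{
    return PIT_PULSES_PER_SEC / (USEC_PER_SEC / usec);
}

/* x * 1193182 / 1000000, multiplying first in 64 bits to avoid overflow. */
uint32_t pit_count_multiply_first(uint32_t usec)
{
    return (uint32_t)(((uint64_t)usec * PIT_PULSES_PER_SEC) / USEC_PER_SEC);
}
```

Both forms give 11931 pulses for the 10-millisecond delay and 238 pulses for the 200-microsecond delay. They agree whenever x divides one-million evenly; for other values the inner integer division discards a remainder (for x = 30000 the two-division form gives 36157 where multiplying first gives 35795), so the multiply-first form is the safer choice for arbitrary delays. Channel 2 has a 16-bit counter, so the result must also fit in 16 bits.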

Delaying for EAX microseconds

        # We use the 8254 Timer/Counter Channel 2 to generate a
        # timed delay (expressed in microseconds by value in EAX)
        mov     %eax, %ecx          # copy delay-time to ECX
        mov     $1000000, %eax      # microseconds-per-sec
        xor     %edx, %edx          # extended to quadword
        div     %ecx                # perform dword division
        mov     %eax, %ecx          # copy quotient into ECX
        mov     $1193182, %eax      # input-pulses-per-sec
        xor     %edx, %edx          # extended to quadword
        div     %ecx                # perform dword division
        # now transfer the quotient from AX to the Channel 2 Latch

Mutual Exclusion

Shared variables must not be modified by more than one processor at a time (‘mutual exclusion’).
The Pentium’s ‘lock’ prefix helps enforce this.

Example: every processor adds 1 to count

        lock    incl (count)

Example: all processors need private stacks

        mov     $0x1000, %ax        # amount by which to advance new_SS
        lock    xadd %ax, new_SS    # AX = old new_SS; new_SS += 0x1000
        mov     %ax, %ss            # old value becomes this CPU's stack segment
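The private-stack idiom is an atomic fetch-and-add: each processor receives the old value of the shared variable and leaves behind a value advanced by 0x1000 for the next one. A minimal sketch in C11 follows; the starting value 0x2000 and the function name claim_stack_segment are assumptions made for illustration, not values from the demo program.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Next unclaimed stack-segment value (0x2000 is a hypothetical starting point). */
_Atomic uint16_t new_ss = 0x2000;

/* Atomically claim one segment value and advance the shared variable,
 * which is what a locked 'xadd' with 0x1000 in AX accomplishes. */
uint16_t claim_stack_segment(void)
{
    return atomic_fetch_add(&new_ss, 0x1000);
}
```

Because the read and the update happen as one indivisible step, no two callers can receive the same segment value, whichever order they arrive in.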

ROM-BIOS isn’t ‘reentrant’

The video service-functions in ROM-BIOS that we use to display a message-string at the current cursor-location (and afterward advance the cursor) modify global storage locations (as well as i/o ports), and hence must be called by one processor at a time.
A shared memory-variable (called ‘mutex’) is used to enforce this mutual exclusion.

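Implementing a ‘spinlock’

mutex:  .word   1                   # 1 = available, 0 = held

spin:   btw     $0, mutex           # is the mutex available?
        jnc     spin                # no, keep watching it
        lock    btrw $0, mutex      # yes, try to claim it
        jnc     spin                # another CPU claimed it first

        # <CRITICAL SECTION OF CODE GOES HERE>

        btsw    $0, mutex           # release the mutex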
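The same test-then-claim pattern can be expressed with C11 atomics. This is a sketch under the slide's convention (1 = available, 0 = held), not a translation of the demo's code; the type and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* 1 = available, 0 = held, matching the 'mutex' word on the slide. */
typedef struct { atomic_int available; } spinlock;

void spin_init(spinlock *l)
{
    atomic_init(&l->available, 1);
}

/* One attempt: atomically store 0 and report whether the old value was 1. */
bool spin_try_acquire(spinlock *l)
{
    return atomic_exchange_explicit(&l->available, 0, memory_order_acquire) == 1;
}

void spin_acquire(spinlock *l)
{
    for (;;) {
        /* Plain read first (like 'btw'), so waiting CPUs do not issue
         * locked bus cycles while the lock is held. */
        while (atomic_load_explicit(&l->available, memory_order_relaxed) == 0)
            ;
        /* Locked read-modify-write (like 'lock btrw'); retry if we lost the race. */
        if (spin_try_acquire(l))
            return;
    }
}

void spin_release(spinlock *l)
{
    atomic_store_explicit(&l->available, 1, memory_order_release);
}
```

The second check after the plain read is what makes the lock correct: two processors may both see the mutex as available, but only one of them gets the old value 1 back from the atomic exchange.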

Demo: ‘smphello.s’

Each CPU needs to access its Local-APIC.
The BSP (“Boot-Strap Processor”) wakes up other processors by broadcasting the ‘INIT-SIPI-SIPI’ message-sequence.
Each AP (“Application Processor”) starts executing at a 4K page-boundary, and needs its own private stack-area.
Shared variables need ‘exclusive’ access.

In-class exercise

Include this procedure that multiple CPUs will execute simultaneously (without ‘lock’):

total:  .word   0                   # the shared variable

add_one_thousand:
        mov     $1000, %cx
nxinc:  addw    $1, (total)
        loop    nxinc
        ret
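Why the unlocked version can lose counts is easier to see when the add-to-memory instruction is split into the read, add and write steps that the hardware performs. The sketch below is a deterministic single-threaded simulation in C of one unlucky interleaving on two CPUs; it is an illustration with hypothetical names, not a measurement of real hardware.

```c
#include <stdint.h>

/* Two CPUs each add 1 to 'total', but their read and write steps interleave. */
uint16_t unlocked_interleaving(void)
{
    uint16_t total = 0;
    uint16_t cpu0 = total;          /* CPU 0 reads 0                      */
    uint16_t cpu1 = total;          /* CPU 1 reads 0 before CPU 0 writes  */
    cpu0 = (uint16_t)(cpu0 + 1);
    cpu1 = (uint16_t)(cpu1 + 1);
    total = cpu0;                   /* CPU 0 writes 1                     */
    total = cpu1;                   /* CPU 1 writes 1: one update is lost */
    return total;
}

/* With 'lock', each read-add-write completes before the other CPU's begins. */
uint16_t locked_interleaving(void)
{
    uint16_t total = 0;
    total = (uint16_t)(total + 1);  /* CPU 0: indivisible read-add-write  */
    total = (uint16_t)(total + 1);  /* CPU 1: indivisible read-add-write  */
    return total;
}
```

In the unlocked interleaving both CPUs add 1 but the total ends at 1 rather than 2. Repeated over a thousand iterations per CPU, losses of this kind are why the final value of total can come out below the expected sum.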

We may need a ‘barrier’

We can use a software construct (known as a ‘barrier’) to stop CPUs from entering a block of code until a prescribed number of them are all ready to enter it together.

arrived: .word  0                   # shared variable

barrier:
        lock    incw (arrived)      # announce this CPU's arrival
await:  cmpw    $2, (arrived)       # have both CPUs arrived?
        jb      await               # no, keep waiting
        call    add_one_thousand
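The barrier's two parts, announcing arrival and waiting for the count, can be sketched in C11 as follows. The names are hypothetical, and the barrier is single-use like the one on the slide: the counter is never reset.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_uint arrived; } barrier;

/* Announce arrival (the 'lock incw (arrived)' step). */
void barrier_arrive(barrier *b)
{
    atomic_fetch_add(&b->arrived, 1);
}

/* True once at least 'needed' CPUs have arrived (the 'cmpw / jb' test). */
bool barrier_is_open(barrier *b, unsigned needed)
{
    return atomic_load(&b->arrived) >= needed;
}

/* Arrive, then spin until the prescribed number of CPUs have arrived. */
void barrier_wait(barrier *b, unsigned needed)
{
    barrier_arrive(b);
    while (!barrier_is_open(b, needed))
        ;
}
```

Each CPU would call barrier_wait(&b, 2) just before the code that all of them should enter together, which corresponds to placing the barrier immediately ahead of the call to add_one_thousand.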