ECM583 Special Topics in Computer Systems
Lecture 1. General-Purpose Computer Systems
Prof. Taeweon Suh, Computer Science Education, Korea University
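A Computer System (till 2008)
  [Figure: the CPU connects over the FSB (Front-Side Bus) to the North Bridge, which attaches the main memory (DDR2) and the graphics card; the North Bridge connects over DMI (Direct Media I/F) to the South Bridge, which attaches peripheral devices such as the hard disk, USB, and PCIe cards]
  But don’t forget the big picture!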
Present, Near Future and More…
  Core 2 Duo-based systems: CPU, FSB (Front-Side Bus), North Bridge, main memory (DDR2), DMI (Direct Media I/F), South Bridge
  Core i7-based systems: CPU, main memory (DDR3), QuickPath (Intel) or HyperTransport (AMD), North Bridge, DMI (Direct Media I/F), South Bridge
  Keep in mind that CPUs and computer systems are evolving at a fast pace
x86 History (as of 2008)
x86 History (Cont.) 2009 Core i7 32-bit (i386) 4-bit 8-bit 16-bit
x86?
  What is x86?
    Generic term referring to processors from Intel, AMD, and VIA
    Derived from the model numbers of the first few generations of processors: 8086, 80286, 80386, 80486
  x86-16: 16-bit processor
  x86-32 (aka IA-32): 32-bit processor (IA: Intel Architecture)
  x86-64: 64-bit processor
  Intel holds about 80% of the PC market and AMD about 20%
  Apple has also been shipping Intel-based Macs since 2006
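Example: Intel’s Core 2 Duo
  Two cores (Core0, Core1), each with an L1 instruction cache (IL1) and an L1 data cache (DL1), sharing one L2 cache
  L1: 32 KB, 8-way, 64 bytes/line, LRU, write-back (WB), 3-cycle latency
  L2: 4.0 MB, 16-way, 64 bytes/line, LRU, write-back (WB), 14-cycle latency
  Source: http://www.sandpile.org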
Example: Intel’s Core i7
  4 cores on one chip
  Three levels of caches (L1, L2, L3) on chip
    L1: 32 KB, 8-way
    L2: 256 KB, 8-way
    L3: 8 MB, 16-way
  731 million transistors in 263 mm² with 45 nm technology
Example: AMD’s Opteron (Barcelona)
  4 cores on one chip
  1.9 GHz clock
  65 nm technology
  Three levels of caches (L1, L2, L3) on chip
    L1: 64 KB, L2: 512 KB, L3: 2 MB
  Integrated North Bridge
Chipset
  The North Bridge and South Bridge together are called the chipset
  The chipset has many PCIe devices inside
  North Bridge
    Memory controller
    PCI Express ports to connect the graphics card
    http://www.intel.com/Assets/PDF/datasheet/316966.pdf
  South Bridge
    HDD (hard disk) controller
    USB controller
    Various peripherals connected: keyboard, mouse, timer, etc.
    PCI Express ports
    http://www.intel.com/Assets/PDF/datasheet/316972.pdf
  Note that the landscape is changing! For example, the memory controller is now integrated into the CPU
PCI, PCI Express Devices
  PCI (Peripheral Component Interconnect)
    Computer bus connecting the peripheral devices to the computer motherboard
  PCIe (PCI Express)
    Began replacing PCI in 2004
    Point-to-point connection
  [Figure: motherboard with a PCI Express slot, x16 PCI Express slots, and a PCI slot]
  http://www.pcisig.com/specifications/pciexpress/
An Old GP Computer System Example
PCI Express Slots in GP Systems
GP Computer System in terms of PCIe North Bridge South Bridge
Core i7-based Systems Core i7 860 (Lynnfield) – based system Core i7 920 (Bloomfield) – based system
Hardware/Software Stack in Computer
  Application software
    Written in high-level language
  System software
    Compiler: translates code written in high-level language to machine code
    Operating System
      Handling input/output
      Managing memory and storage
      Scheduling tasks & sharing resources
    BIOS (Basic Input/Output System): provides common I/Fs
  ISA
    Interface between hardware and low-level software
  Hardware
    Processor, memory, I/O controllers
  Stack, from top to bottom
    Applications (MS Office, Google Earth, …)
    API (Application Program I/F)
    Operating System (Linux, Vista, Mac OS, …)
    BIOS (AMI, Phoenix Technologies, …)
    Instruction Set Architecture (ISA)
    Computer Hardware (CPU, chipset, PCIe cards, ...)
How Does the GP Computer System Work?
  An x86-based system starts executing from the reset address 0xFFFF_FFF0
    The first instruction, fetched from the BIOS ROM, is a jump (“jmp xxx”)
  BIOS (Basic Input/Output System)
    Detects and initializes all the devices on the system (including PCI devices, via PCI enumeration)
    Provides common interfaces to the OS
    Hands over control to the OS
  OS
    Manages the system resources, including main memory
    Controls and coordinates the use of the hardware among the application programs of the various users
    Provides APIs for system and application programming
GP Systems’ Differences from Other Computer Systems
  How are they different from other computer systems such as embedded systems?
  General-purpose computer systems provide programmability to end users
    You can do any kind of programming on your PC: C, C++, C#, Java, etc.
  General-purpose systems should provide backward compatibility
    A new system should be able to run legacy software, which may exist only as binaries written 30 years ago, with no source code
  As a result, general-purpose computer systems become messy and complicated, still containing all the legacy hardware functionality
Abstraction
  Abstraction helps us deal with complexity
    Hide lower-level detail
  Instruction set architecture (ISA)
    An abstract interface between the hardware and the low-level software
Abstraction Analogies
  Customer: the abstraction layer hides the machine details, such as the hardware board in a vending machine
  Driver: the abstraction layer hides the machine details, such as the combustion engine and the brake system in a car
Abstraction in Computer
  Users
    Application programming using APIs
  Abstraction layer: Operating Systems
    Provide APIs (Application Programming Interfaces)
  Abstraction layer: Instruction Set Architecture (ISA)
    Machine language, assembly language
  Hardware implementation
    Core0, Core1, L2 cache
A Typical Memory Hierarchy
  Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology at the speed offered by the fastest technology
  Levels, from higher (highest cost) to lower (lowest cost)
    Level | Speed (cycles) | Size (bytes)
    On chip, CPU core: Reg File | ½’s | 100’s
    On chip, CPU core: L1I (instr cache) with ITLB, L1D (data cache) with DTLB | 1’s | 10K’s
    On chip: L2 (second-level) cache | 10’s | M’s
    Main memory (DRAM) | 100’s | G’s
    Secondary storage (disk) | 10,000’s | T’s
Typical and Essential Instructions
  Instruction categories
    Arithmetic and logical (integer)
    Memory access instructions: load and store
    Branch
    Floating point
  Registers in x86
    EAX, EBX, ECX, EDX, …
    CS, DS, SS, ES, …
Levels of Program Code
  High-level language
    Level of abstraction closer to problem domain
    Provides for productivity and portability
  Assembly language
    Textual and symbolic representation of instructions
  Hardware representation
    Binary digits (bits)
    Encoded instructions and data
Instructions and Instruction Set
  If you want to talk to people from another country, you need to speak their language
  Likewise, to talk to a computer, you must speak its language
  The words of a computer’s language are called instructions
  The collection of instructions is called the instruction set
  Different CPUs have different instruction sets
    x86 has its own instruction set
    ARM has its own instruction set
  But they have many aspects in common
x86 Instruction Examples
  For more information on the complete x86 instruction set, refer to the following link
  http://www.intel.com/products/processor/manuals/
High Level Code to Assembly to Executable
  What steps do you take to run your program after writing your code “hello.c” on your Linux machine?
    %gcc hello.c -o hello    // hello is machine code (a binary executable)
    %./hello
    Hello World!
    %objdump -D hello    // shows the machine code as human-readable assembly
  hello.c
    #include <stdio.h>
    int main(void)
    {
        printf("Hello World!\n");
        return 0;
    }
Reality check: High Level Code to Assembly to Executable
  C program
  → preprocessor (cpp, the C preprocessor, in Linux GNU C) → expanded C program
  → compiler (gcc in Linux GNU C) → assembly code (human-readable)
  → assembler (as in Linux GNU) → object code
  → linker (ld in Linux GNU), which also takes in library routines → executable (machine code)
  → loader (the Linux kernel loads the executable into memory) → memory
Reality check: High Level Code to Assembly to Executable (Cont)
  The command “gcc” hides all the details
  Try compiling hello.c with “gcc -v hello.c -o hello”
    You will see the details of what gcc does during compilation
  Compilation goes through several steps to generate machine code
    Preprocessor
    Compilation
    Assembler
    Linker
  hello.c
    #include <stdio.h>
    int main(void)
    {
        printf("Hello World!\n");
        return 0;
    }
Reality check: High Level Code to Assembly to Executable (Cont)
  Preprocessing
    Used to expand macros and included header files
    %cpp hello.c > hello.i
    Open “hello.i” to see what you got
  Compilation
    Actual compilation of the preprocessed code to assembly language for a specific processor
    %gcc -Wall -S hello.i
    The result is stored in hello.s
    Open “hello.s” to see what you got
  Assembler
    Converts assembly language into machine code and generates an object file
    %as hello.s -o hello.o
    The resulting file “hello.o” contains the machine instructions for the Hello World program, with an undefined reference to printf
Reality check: High Level Code to Assembly to Executable (Cont)
  Linker
    Final stage of compilation
    Links object files to create an executable
    In practice, an executable requires many external functions from system and C run-time (crt) libraries
    Consequently, the actual link commands used internally by GCC are complicated
    Example
      %ld -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o /usr/lib/gcc/i386-redhat-linux/4.3.0/crtbegin.o -L/usr/lib/gcc/i386-redhat-linux/4.3.0 hello.o -lgcc -lgcc_eh -lc -lgcc -lgcc_eh /usr/lib/gcc/i386-redhat-linux/4.3.0/crtend.o /usr/lib/crtn.o -o hello
    Note that the “i386-redhat-linux/4.3.0/” part of the paths depends on your Linux distribution and GCC version
  Now run your program
    %./hello    // Linux kernel loads the program into memory
    Hello World!    // output
Stored Program Concept
  [Figure: the “Hello World” source code in C is translated by the C compiler (itself machine code) into the Hello World binary (machine code), which is stored in main memory (DDR) as a sequence of bits; the CPU reaches memory through the North Bridge over the FSB (Front-Side Bus), using the address bus and the data bus; the South Bridge is attached over DMI (Direct Media I/F)]
  Instructions are represented in binary, just like data
  Instructions and data are stored in memory
  The CPU fetches instructions and data to execute
  Programs can operate on programs
    e.g., compilers, linkers, …
  Binary compatibility allows compiled programs to work on different computers
    Standardized ISAs
CPU Operation
  [Figure: a MIPS CPU with a PC, 32-bit registers ($zero, $at, $v0, $v1, …, $fp, $ra), and an adder, connected to memory by an address bus and a data bus. The CPU executes lw v0, 4(s8); lw v1, 8(s8); add v0, v1, v0. The two loads bring the data words 0x00110011 and 0x00220022 from memory into registers, and the add produces 0x00330033.]