© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 CS252 Graduate Computer Architecture Fall 2015 Lecture 18: I/O Krste Asanovic

Slides:



Advertisements
Similar presentations
Accessing I/O Devices Processor Memory BUS I/O Device 1 I/O Device 2.
Advertisements

1  1998 Morgan Kaufmann Publishers Interfacing Processors and Peripherals.
Computer Science & Engineering
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches I Steve Ko Computer Sciences and Engineering University at Buffalo.
CS-334: Computer Architecture
Avishai Wool lecture Introduction to Systems Programming Lecture 8 Input-Output.
Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University I/O See: P&H Chapter
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches II Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Virtual Memory I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Putting it all together: Intel Nehalem Steve Ko Computer Sciences and Engineering University.
Copyright © 2006 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill Technology Education Copyright © 2006 by The McGraw-Hill Companies,
I/O Channels I/O devices getting more sophisticated e.g. 3D graphics cards CPU instructs I/O controller to do transfer I/O controller does entire transfer.
University College Cork IRELAND Hardware Concepts An understanding of computer hardware is a vital prerequisite for the study of operating systems.
1 Lecture 21: Virtual Memory, I/O Basics Today’s topics:  Virtual memory  I/O overview Reminder:  Assignment 8 due Tue 11/21.
5.1 Chaper 4 Central Processing Unit Foundations of Computer Science  Cengage Learning.
2. Methods for I/O Operations
Module I Overview of Computer Architecture and Organization.
CS-334: Computer Architecture
Input/Output. Input/Output Problems Wide variety of peripherals —Delivering different amounts of data —At different speeds —In different formats All slower.
Chapter 8 Input/Output. Busses l Group of electrical conductors suitable for carrying computer signals from one location to another l Each conductor in.
1 (Based on text: David A. Patterson & John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 3 rd Ed., Morgan Kaufmann,
Input and Output Computer Organization and Assembly Language: Module 9.
1 CS503: Operating Systems Spring 2014 Dongyan Xu Department of Computer Science Purdue University.
Windows Operating System Internals - by David A. Solomon and Mark E. Russinovich with Andreas Polze Unit OS6: Device Management 6.1. Principles of I/O.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Principles of I/0 hardware.
2007 Oct 18SYSC2001* - Dept. Systems and Computer Engineering, Carleton University Fall SYSC2001-Ch7.ppt 1 Chapter 7 Input/Output 7.1 External Devices.
I/O Example: Disk Drives To access data: — seek: position head over the proper track (8 to 20 ms. avg.) — rotational latency: wait for desired sector (.5.
Top Level View of Computer Function and Interconnection.
Computer Architecture Lecture10: Input/output devices Piotr Bilski.
CS 342 – Operating Systems Spring 2003 © Ibrahim Korpeoglu Bilkent University1 Input/Output CS 342 – Operating Systems Ibrahim Korpeoglu Bilkent University.
2009 Sep 10SYSC Dept. Systems and Computer Engineering, Carleton University F09. SYSC2001-Ch7.ppt 1 Chapter 7 Input/Output 7.1 External Devices 7.2.
Dr Mohamed Menacer College of Computer Science and Engineering Taibah University CE-321: Computer.
Interrupts, Buses Chapter 6.2.5, Introduction to Interrupts Interrupts are a mechanism by which other modules (e.g. I/O) may interrupt normal.
I/O Computer Organization II 1 Interconnecting Components Need interconnections between – CPU, memory, I/O controllers Bus: shared communication channel.
MBG 1 CIS501, Fall 99 Lecture 18: Input/Output (I/O): Buses and Peripherals Michael B. Greenwald Computer Architecture CIS 501 Fall 1999.
EEE440 Computer Architecture
Accessing I/O Devices Processor Memory BUS I/O Device 1 I/O Device 2.
Organisasi Sistem Komputer Materi VIII (Input Output)
L/O/G/O Input Output Chapter 4 CS.216 Computer Architecture and Organization.
CS2100 Computer Organisation Input/Output – Own reading only (AY2015/6) Semester 1 Adapted from David Patternson’s lecture slides:
Chapter 13 – I/O Systems (Pgs ). Devices  Two conflicting properties A. Growing uniformity in interfaces (both h/w and s/w): e.g., USB, TWAIN.
1 Lecture 1: Computer System Structures We go over the aspects of computer architecture relevant to OS design  overview  input and output (I/O) organization.
Lecture on Central Process Unit (CPU)
Chapter 6 Storage and Other I/O Topics. Chapter 6 — Storage and Other I/O Topics — 2 Introduction I/O devices can be characterized by Behaviour: input,
Processor Memory Processor-memory bus I/O Device Bus Adapter I/O Device I/O Device Bus Adapter I/O Device I/O Device Expansion bus I/O Bus.
Copyright © 2006 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill Technology Education Copyright © 2006 by The McGraw-Hill Companies,
Chapter 3 System Buses.  Hardwired systems are inflexible  General purpose hardware can do different tasks, given correct control signals  Instead.
ECE 456 Computer Architecture Lecture #9 – Input/Output Instructor: Dr. Honggang Wang Fall 2013.
Interrupts and Exception Handling. Execution We are quite aware of the Fetch, Execute process of the control unit of the CPU –Fetch and instruction as.
Computer Architecture Chapter (7): Input / Output
CSCE 385: Computer Architecture Spring 2014 Dr. Mike Turi I/O.
Lecture 2. A Computer System for Labs
DIRECT MEMORY ACCESS and Computer Buses
Department of Computer Science and Engineering
CS 152 Computer Architecture and Engineering Lecture 18: Snoopy Caches
Operating Systems (CS 340 D)
CS 286 Computer Organization and Architecture
Computer Architecture
Peng Liu Lecture 14 I/O Peng Liu
Overview of Computer Architecture and Organization
CSC3050 – Computer Architecture
Presentation transcript:

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 CS252 Graduate Computer Architecture Fall 2015 Lecture 18: I/O Krste Asanovic

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 (I/O) Input/Output Computers useless without I/O -Over time, literally thousands of forms of computer I/O: punch cards to brain interfaces Broad categories:  Secondary/Tertiary storage (flash/disk/tape)  Network (Ethernet, WiFi, Bluetooth, LTE)  Human-machine interfaces (keyboard, mouse, touchscreen, graphics, audio, video, neural,…)  Printers (line, laser, inkjet, photo, 3D, …)  Sensors (process control, GPS, heartrate, …)  Actuators (valves, robots, car brakes, …) Mix of I/O devices is highly application-dependent 2

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 Interfacing to I/O Devices Two general strategies  Memory-mapped -I/O devices appear as memory locations to processor -Reads and writes to I/O device locations configure I/O and transfer data (using either programmed I/O or DMA)  I/O channels -Architecture specifies commands to execute I/O commands over defined channels -I/O channel structure can be layered over memory- mapped device structure  In addition to data transfer, have to define synchronization method -Polling: CPU checks status bits -Interrupts: Device interrupts CPU on event 3

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 Memory-Mapped I/O  Programmed I/O uses CPU to control I/O device using load and store instructions, with address specifying device register to access -Load and store can have side effect on device  Usually, only privileged code can access I/O devices directly, to provide secure multiprogramming -System calls sometimes provided for application to open and reserve a device for exclusive access  Processors provide “uncached” loads and stores to prevent caching of device registers -Usually indicated by bits in page table entries or by reserving portions of physical address space 4

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 Simple I/O Bus Structure  Some range of physical memory addresses map to I/O bus devices  I/O bus bridge reduces loading on critical CPU-DRAM bus  Devices can be “slaves”, only responding to I/O bus requests  Devices can be “masters”, initiating I/O bus transfers 5 CPU Caches DRAM I/O Bus Bridge Memory Bus I/O Bus I/O Device #1 I/O Device #2 I/O Device #3

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 DMA (Direct Memory Access)  DMA engines offload CPU by autonomously transferring data between I/O device and main memory. Interrupt/poll for done -DMA programmed through memory-mapped registers -Some systems use dedicated processors inside DMA engines  Often, many separate DMA engines in modern systems -Centralized in I/O bridge (usually supporting multiple concurrent channels to different devices), works on slave-only I/O busses -Directly attached to each peripheral (if I/O bus supports mastering) 6 CPU Caches DRAM I/O Bus Bridge Memory Bus I/O Bus I/O Device #1 I/O Device #2 I/O Device #3 DMA

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 More Complex Bus Structures 7 CPU Caches DRAM I/O Bus Bridge Memory Bus Fast I/O Bus I/O Device #1 I/O Device #3 Slow I/O Bus Bridge DMA I/O Device #2  Match speed of I/O connection to device demands -Special direct connection for graphics -Fast I/O bus for disk drives, ethernet -Slow I/O bus for keyboard, mouse, touchscreen -Reduces load on fast I/O bus + less bus logic needed on device Slow I/O Bus DMA I/O Device #4 Graphics DMA

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 Move from Parallel to Serial I/O Off-chip 8 CPU I/O IF I/O 1I/O 2 Central Bus Arbiter Shared Parallel Bus Wires CPU I/O IF I/O 1 I/O 2 Dedicated Point-to-point Serial Links Parallel bus clock rate limited by clock skew across long bus (~100MHz) High power to drive large number of loaded bus lines Central bus arbiter adds latency to each transaction, sharing limits throughput Expensive parallel connectors and backplanes/cables (all devices pay costs) Examples: VMEbus, Sbus, ISA bus, PCI, SCSI, IDE Point-to-point links run at multi-gigabit speed using advanced clock/signal encoding (requires lots of circuitry at each end) Lower power since only one well-behaved load Multiple simultaneous transfers Cheap cables and connectors (trade greater endpoint transistor cost for lower physical wiring cost), customize bandwidth per device using multiple links in parallel Examples: Ethernet, Infiniband, PCI Express, SATA, USB, Firewire, etc.

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 Move from Bus to Crossbar On-Chip  Busses evolved in era where wires were expensive and had to be shared  Bus tristate drivers problematic in standard cell flows, so replace with combinational muxes  Crossbar exploits density of on-chip wiring, allows multiple simultaneous transactions 9 ABC Tristated Bus AA BB CC Crossbar

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 I/O and Memory Mapping  I/O busses can be coherent or not -Non-coherent simpler, but might require flushing caches or only non-cacheable accesses (much slower on modern processors) -Some I/O systems can cache coherently also (SGI Origin)  I/O can use virtual addresses -Simplifies DMA into user address space, otherwise contiguous user segment needs scatter/gather by DMA engine -Provides protection from bad device drivers -Adds complexity to I/O device 10

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 Interrupts versus Polling Two ways to detect I/O device status:  Interrupts +No CPU overhead until event −Large context-switch overhead on each event (trap flushes pipeline, disturbs current working set in cache/TLB) −Can happen at awkward time  Polling -CPU overhead on every poll -Difficult to insert in all code +Can control when handler occurs, reduce working set hit  Hybrid approach: -Interrupt on first event, keep polling in kernel until sure no more events, then back to interrupts 11

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 Example ARM SoC Structure 12 [©ARM]

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 ARM Sample Smartphone Diagram 13 [©ARM]

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 Intel Ivy Bridge Server Chip I/O 14 [©Intel]

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 Intel Romley Server Platform 15 [©Intel]

© Krste Asanovic, 2015CS252, Fall 2015, Lecture 18 Acknowledgements  This course is partly inspired by previous MIT and Berkeley CS252 computer architecture courses created by my collaborators and colleagues: -Arvind (MIT) -Joel Emer (Intel/MIT) -James Hoe (CMU) -John Kubiatowicz (UCB) -David Patterson (UCB) 16