1 Lecture 26: Networks, Storage Topics: router microarchitecture, disks, RAID (Appendix D) Final exam: Monday 30th Apr 10:30-12:30 Same rules as the midterm.

1 Lecture 26: Networks, Storage
Topics: router microarchitecture, disks, RAID (Appendix D)
Final exam: Monday 30th Apr, 10:30-12:30
- Same rules as the midterm
- Topics: everything, but with an emphasis on topics not included in the midterm (roughly split)

2 Router Functions
Crossbar, buffer, arbiter, VC state and allocation, buffer management, ALUs, control logic, routing
Typical on-chip network power breakdown:
- 30% link
- 30% buffers
- 30% crossbar

3 Router Pipeline
Four typical stages:
- RC (routing computation): the head flit indicates the VC that it belongs to, the VC state is updated, the headers are examined, and the next output channel is computed (note: this is done for all the head flits arriving on various input channels)
- VA (virtual-channel allocation): the head flits compete for the available virtual channels on their computed output channels
- SA (switch allocation): a flit competes for access to its output physical channel
- ST (switch traversal): the flit is transmitted on the output channel
A head flit goes through all four stages; the other flits do nothing in the first two stages (this is an in-order pipeline and flits cannot jump ahead). A tail flit also de-allocates the VC.

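4 Router Pipeline
Four typical stages:
- RC (routing computation): compute the output channel
- VA (virtual-channel allocation): allocate a VC for the head flit
- SA (switch allocation): compete for the output physical channel
- ST (switch traversal): transfer data on the output physical channel
[Figure: two pipeline timing diagrams with cycles on the horizontal axis and one row each for the head flit, body flit 1, body flit 2, and tail flit. The head flit performs RC, VA, SA, ST; the other flits are idle during RC and VA and then perform SA and ST. The second diagram shows a failed SA causing a stall.]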
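
The timing shown for the four-stage pipeline can be sketched in a few lines of Python. This is only an illustration: the function name `pipeline_schedule` is hypothetical, and it assumes that flits arrive back to back (one per cycle) and that no allocation ever fails, which the slides do not state.

```python
STAGES = ["RC", "VA", "SA", "ST"]

def pipeline_schedule(num_flits):
    """Return {flit_index: {stage: cycle}} for one packet at one router.

    Assumption (not from the slides): flits arrive one per cycle and
    every allocation succeeds on the first attempt.
    """
    schedule = {}
    for i in range(num_flits):
        if i == 0:
            # The head flit uses all four stages in cycles 1..4.
            schedule[i] = {stage: c + 1 for c, stage in enumerate(STAGES)}
        else:
            # Later flits skip RC and VA: they reuse the head flit's route
            # and VC, and trail the previous flit by one cycle (in order).
            schedule[i] = {"SA": 3 + i, "ST": 4 + i}
    return schedule

if __name__ == "__main__":
    for flit, stages in pipeline_schedule(4).items():
        print(flit, stages)
```

With four flits, the tail flit (index 3) traverses the switch three cycles after the head flit does.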

5 Speculative Pipelines
Perform VA and SA in parallel
- Note that SA only requires knowledge of the output physical channel, not the VC
- If VA fails, the successfully allocated channel goes unutilized
[Figure: pipeline timing diagram with cycles on the horizontal axis and one row each for the head flit, body flit 1, body flit 2, and tail flit; stage labels RC, VA, SA, ST for the head flit and SA, ST for the other flits.]
Perform VA, SA, and ST in parallel (can cause collisions and re-tries)
- Typically, VA is the critical path; SA and ST can possibly be performed sequentially
Router pipeline latency is a greater bottleneck when there is little contention
When there is little contention, speculation will likely work well!
Single-stage pipeline?
[Figure: RC, VA, SA, ST drawn as a single stage.]
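
To see why pipeline depth matters most when there is little contention, a rough zero-load latency estimate helps. The sketch below is an assumed textbook-style model, not something taken from the slides: the function name `zero_load_latency` and the `link_cycles` parameter are hypothetical, and contention, stalls, and failed speculation are ignored.

```python
def zero_load_latency(hops, stages_per_router, num_flits, link_cycles=1):
    """Cycles for a whole packet to cross `hops` routers with no contention.

    Assumed model: the head flit pays the router pipeline depth plus the
    link delay at every hop; the remaining flits follow one per cycle.
    """
    head_latency = hops * (stages_per_router + link_cycles)
    serialization = num_flits - 1
    return head_latency + serialization

if __name__ == "__main__":
    # 4 stages: baseline; 3: VA and SA in parallel;
    # 2: VA, SA, and ST in parallel; 1: single-stage router.
    for depth in (4, 3, 2, 1):
        print(depth, zero_load_latency(hops=4, stages_per_router=depth, num_flits=4))
```

Under these assumptions, shrinking the router from four stages to one cuts the four-hop latency of a four-flit packet from 23 to 11 cycles.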

6 Recent Intel Router
- Used for a 6x6 mesh
- 16 B, > 3 GHz
- Wormhole with VC flow control
Source: Partha Kundu, “On-Die Interconnects for Next-Generation CMPs”, talk at On-Chip Interconnection Networks Workshop, Dec 2006

7 Recent Intel Router Source: Partha Kundu, “On-Die Interconnects for Next-Generation CMPs”, talk at On-Chip Interconnection Networks Workshop, Dec 2006

8 Current Trends
Growing interest in eliminating the area/power overheads of router buffers; traffic levels are also relatively low, so virtual-channel buffered routed networks may be overkill
- Option 1: use a bus for short distances (16 cores) and use a hierarchy of buses to travel long distances
- Option 2: hot-potato or bufferless routing

9 Magnetic Disks
A magnetic disk consists of 1-12 platters (metal or glass disk covered with magnetic recording material on both sides), with diameters between inches.
Each platter consists of concentric tracks (5-30K), and each track is divided into sectors (100-500 per track, each about 512 bytes).
A movable arm holds the read/write heads for each disk surface and moves them all in tandem, so a cylinder of data is accessible at a time.
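
The geometry above implies a simple capacity estimate. The sketch below is illustrative only: the function name `disk_capacity_bytes` is hypothetical, and it assumes every track holds the same number of sectors, which ignores zoned recording on real drives.

```python
def disk_capacity_bytes(platters, tracks_per_surface, sectors_per_track,
                        bytes_per_sector=512):
    """Raw capacity from disk geometry.

    Assumption: every track has the same sector count (no zoned recording).
    """
    surfaces = 2 * platters  # both sides of each platter are recorded
    return surfaces * tracks_per_surface * sectors_per_track * bytes_per_sector

if __name__ == "__main__":
    # Low and high ends of the ranges quoted on the slide.
    print(disk_capacity_bytes(1, 5_000, 100))
    print(disk_capacity_bytes(12, 30_000, 500))
```

The two calls use the extremes of the quoted ranges, so they bracket the capacities this geometry can describe.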

10 Disk Latency
To read/write data, the arm has to be placed on the correct track. This seek time is usually 5 to 12 ms on average, and can be less if there is spatial locality.
Rotational latency is the time taken to rotate the correct sector under the head. The average is half a rotation: 2 ms at 15,000 RPM, and more on slower disks.
Transfer time is the time taken to transfer a block of bits out of the disk; transfer rates are typically 3 to 65 MB/second.
A disk controller maintains a disk cache (spatial locality can be exploited) and sets up the transfer on the bus (controller overhead).
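
The latency components above add up to an average access time. The following is a minimal sketch under stated assumptions: the function names are hypothetical, 1 MB is taken as 10^6 bytes, and disk-cache hits and queueing delay are ignored.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half of one revolution, in ms."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution

def transfer_time_ms(block_bytes, rate_mb_per_s):
    """Time to move one block at the given rate (assumes 1 MB = 10**6 bytes)."""
    return block_bytes / (rate_mb_per_s * 1_000_000) * 1000.0

def avg_access_time_ms(seek_ms, rpm, block_bytes, rate_mb_per_s,
                       controller_ms=0.0):
    """Seek + rotational latency + transfer + controller overhead, in ms."""
    return (seek_ms
            + avg_rotational_latency_ms(rpm)
            + transfer_time_ms(block_bytes, rate_mb_per_s)
            + controller_ms)

if __name__ == "__main__":
    # Illustrative inputs chosen from the ranges quoted on the slide.
    print(avg_access_time_ms(seek_ms=5.0, rpm=15_000,
                             block_bytes=4096, rate_mb_per_s=65))
```

For small blocks the seek and rotational terms dominate, which is why spatial locality and the disk cache matter.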

11 RAID
Reliability and availability are important metrics for disks.
RAID: redundant array of inexpensive (independent) disks.
Redundancy can deal with one or more failures.
Each sector of a disk records check information that allows it to determine if the disk has an error or not (in other words, redundancy already exists within a disk).
When the disk read flags an error, we turn elsewhere for correct data.

12 RAID 0 and RAID 1
RAID 0 has no additional redundancy (a misnomer). It uses an array of disks and stripes (interleaves) data across the disks in the array to improve parallelism and throughput.
RAID 1 mirrors or shadows every disk, so every write happens to two disks.
Reads to the mirror may happen only when the primary disk fails; alternatively, both disks may be read together and the quicker response accepted.
Expensive solution: high reliability at twice the cost.
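
Striping and mirroring can be sketched as follows. This is an illustration under assumptions, not a description of any real controller: `raid0_locate` and `Raid1` are hypothetical names, striping is assumed to be round-robin at block granularity, and disks are modeled as Python dictionaries.

```python
def raid0_locate(block, num_disks):
    """Map a logical block to (disk, block offset on that disk).

    Assumption: round-robin striping, one block per disk per stripe.
    """
    return block % num_disks, block // num_disks

class Raid1:
    """Toy mirror: every write goes to both copies."""

    def __init__(self):
        self.primary = {}
        self.mirror = {}

    def write(self, block, data):
        # Every write happens to two disks.
        self.primary[block] = data
        self.mirror[block] = data

    def read(self, block, primary_failed=False):
        # Fall back to the mirror only when the primary has failed.
        source = self.mirror if primary_failed else self.primary
        return source[block]

if __name__ == "__main__":
    print([raid0_locate(b, 4) for b in range(8)])
    pair = Raid1()
    pair.write(0, b"data")
    print(pair.read(0, primary_failed=True))
```

Consecutive logical blocks land on different disks under RAID 0, which is where the parallelism comes from.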

13 RAID 3
Data is bit-interleaved across several disks and a separate disk maintains parity information for a set of bits.
For example: with 8 data disks, bit 0 is in disk-0, bit 1 is in disk-1, …, bit 7 is in disk-7; disk-8 maintains parity for all 8 bits.
For any read, 8 disks must be accessed (as we usually read more than a byte at a time), and for any write, 9 disks must be accessed as parity has to be re-calculated.
High throughput for a single request, low cost for redundancy (overhead: 12.5%), low task-level parallelism.
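
The parity scheme in the 8-disk example can be illustrated with XOR. The sketch below assumes even parity (the parity bit is the XOR of the data bits), which the slides do not spell out; the function names `parity` and `reconstruct` are hypothetical.

```python
from functools import reduce
from operator import xor

def parity(bits):
    """Even parity: XOR of all the bits."""
    return reduce(xor, bits, 0)

def reconstruct(surviving_bits, parity_bit):
    """Recover the single missing data bit from the survivors and parity."""
    return parity(surviving_bits) ^ parity_bit

if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1, 0]   # bit i lives on disk-i
    p = parity(data)                  # stored on disk-8
    lost = 2                          # pretend disk-2 failed
    survivors = data[:lost] + data[lost + 1:]
    print(reconstruct(survivors, p) == data[lost])
```

One parity disk protecting 8 data disks is where the 12.5% overhead figure comes from.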

14 RAID 4 and RAID 5
Data is block-interleaved, which allows us to get all our data from a single disk on a read; in case of a disk error, read all 9 disks.
Block interleaving reduces throughput for a single request (as only a single disk drive is servicing the request), but improves task-level parallelism as other disk drives are free to service other requests.
On a write, we access the disk that stores the data and the parity disk. Parity can be updated without reading the other data disks: flip each parity bit whose data bit differs between the old and new data.
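
The small-write parity update described above amounts to two XORs. In this sketch blocks are modeled as Python integers and the function name `small_write_parity` is hypothetical; it assumes XOR parity across the stripe.

```python
def small_write_parity(old_data, new_data, old_parity):
    """New parity after overwriting one data block.

    old_data ^ new_data has a 1 wherever the block changed; XOR-ing that
    into the old parity flips exactly those parity bits.
    """
    changed_bits = old_data ^ new_data
    return old_parity ^ changed_bits

if __name__ == "__main__":
    stripe = [0b1010, 0b0110, 0b1111]          # three data blocks
    old_parity = stripe[0] ^ stripe[1] ^ stripe[2]
    new_block = 0b0001                         # overwrite block 1
    new_parity = small_write_parity(stripe[1], new_block, old_parity)
    # Same result as recomputing parity over the whole stripe.
    print(new_parity == (stripe[0] ^ new_block ^ stripe[2]))
```

Only the target data disk and the parity disk are touched, regardless of how many disks are in the array.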

15 RAID 5
If we have a single disk for parity, multiple writes cannot happen in parallel (as all writes must update parity info).
RAID 5 distributes the parity blocks across the disks to allow simultaneous writes.
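
One way to picture distributed parity is a layout that rotates the parity block from stripe to stripe. The rotation rule below is an assumption chosen for illustration (real implementations use several different layouts), and all function names are hypothetical.

```python
def raid5_parity_disk(stripe, num_disks):
    """Disk holding parity for a stripe (assumed rotating layout)."""
    return (num_disks - 1 - stripe) % num_disks

def raid5_locate(block, num_disks):
    """Return (stripe, data disk, parity disk) for a logical block."""
    stripe, index = divmod(block, num_disks - 1)
    parity_disk = raid5_parity_disk(stripe, num_disks)
    data_disks = [d for d in range(num_disks) if d != parity_disk]
    return stripe, data_disks[index], parity_disk

def writes_can_overlap(block_a, block_b, num_disks):
    """True if two small writes touch disjoint sets of disks."""
    _, data_a, parity_a = raid5_locate(block_a, num_disks)
    _, data_b, parity_b = raid5_locate(block_b, num_disks)
    return {data_a, parity_a}.isdisjoint({data_b, parity_b})

if __name__ == "__main__":
    print([raid5_parity_disk(s, 5) for s in range(6)])
    print(writes_can_overlap(0, 5, 5), writes_can_overlap(0, 1, 5))
```

Two writes in the same stripe still share a parity block, but writes to different stripes can often proceed on disjoint disks, which a dedicated parity disk never allows.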

16 Advanced Courses
For GPU architectures and programming, see Mary Hall’s CS 6235, Parallel Programming for Many-Core Arch
Spr’13: CS 7810: Advanced Computer Architecture
- Tu/Th 10:45am-12:05pm
- Core design, cache coherence, networks, memory systems, datacenters
- Major course project that evaluates original ideas with simulators (often leads to publications)
- No assignments
- Take-home final
