COSC 1306 COMPUTER SCIENCE AND PROGRAMMING

Presentation on theme: "COSC 1306 COMPUTER SCIENCE AND PROGRAMMING"— Presentation transcript:

1 COSC 1306 COMPUTER SCIENCE AND PROGRAMMING
Jehan-François Pâris

2 CHAPTER VIII COMPUTER ORGANIZATION
This shortened version contains all the materials that were discussed in class (and nothing else).

3 WARNING This presentation contains advanced materials that will not be on the quiz The relevant slides are marked I will still discuss them in class, time allowing Skip

4 Chapter Overview
We will focus on the main challenges of computer architecture:
Managing the memory hierarchy
Caching, multiprogramming, virtual memory

5 Managing the Memory Hierarchy

6 The memory hierarchy
CPU registers
Main memory (RAM)
Secondary storage (Disks)
Mass storage (Often offline)

7 CPU registers
Inside the processor itself
Some can be accessed by our programs; others cannot
Can be read from/written to in one processor cycle
If the processor speed is 2 GHz:
2,000,000,000 cycles per second
2 cycles per nanosecond
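The arithmetic on this slide is easy to check in Python (a throwaway sketch, using the 2 GHz figure from the slide):

```python
# Cycle-time arithmetic from the slide: a 2 GHz processor completes
# 2,000,000,000 cycles per second, i.e. 2 cycles every nanosecond.
clock_hz = 2_000_000_000            # 2 GHz
cycles_per_ns = clock_hz / 1_000_000_000
cycle_time_ns = 1 / cycles_per_ns   # one cycle lasts 0.5 ns

print(cycles_per_ns)   # 2.0
print(cycle_time_ns)   # 0.5
```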

8 Main memory (I)
Byte accessible: each group of 8 bits has an address
Dynamic random access memory (DRAM)
Slower but much cheaper than static RAM
Contents must be refreshed every 64 ms
Otherwise the contents are lost: DRAM is volatile

9 Main memory (II)
Memory is organized as a sequence of 8-bit bytes
Each byte has an address
Bytes can contain one character
Roman alphabet with accents

10 Main memory (III)
Groups of four bytes starting at addresses that are multiples of 4 form words
Better suited to hold numbers
Also have half-words, double words, quad words
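A small sketch of this byte/word layout, using Python's struct module (the values 10, 20, 30 are made up for illustration):

```python
import struct

# Pack three 32-bit integers into a byte sequence: each word occupies
# four consecutive bytes, so word k starts at byte address 4 * k.
memory = struct.pack("<3i", 10, 20, 30)   # little-endian 32-bit ints

assert len(memory) == 12                  # 3 words = 12 bytes
# Recover the word stored at byte address 4 (the second word):
(word,) = struct.unpack_from("<i", memory, offset=4)
print(word)   # 20
```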

11 Accessing main memory (I)
When we look for some item, our search criteria can include the location of the item
The book on the table
The student behind you, …
More often our main search criterion is some attribute of the item
The color of a folder
The title or the authors of a book
The name of an individual

12 Accessing main memory (II)
Computers always access their memory by location
The byte at address 4095
The word at location 512
States the address of the first byte in the word
Why? It is the fastest way to access an item
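Access by location can be sketched with a bytearray standing in for RAM (the addresses and stored value here are invented for the example):

```python
# Model main memory as a flat bytearray: a request names a location,
# and the memory returns whatever bytes are there -- meaningful or not.
ram = bytearray(1024)
ram[512:516] = b"\x2a\x00\x00\x00"        # store a word at address 512

def fetch_word(address):
    """Return the word (4 bytes) starting at `address`."""
    return int.from_bytes(ram[address:address + 4], "little")

print(fetch_word(512))   # 42 -- the value we stored
print(fetch_word(700))   # 0  -- uninitialized "junk" is returned just the same
```

Note that `fetch_word(700)` succeeds even though nothing was ever stored there: like the memory on the next slides, it simply hands back whatever the addressed bytes contain.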

13 An analogy (I)
Some research libraries have a closed-stack policy
Only library employees can access the stacks
Patrons wanting to get an item fill a form containing a call number specifying the location of the item
Could be the Library of Congress classification if the stacks are organized that way

14 An analogy (II)
The procedure followed by the employee fetching the book is fairly simple:
Go to the location specified by the book's call number
Check if the book is there
Bring it to the patron

15 An analogy (III)
The memory operates in an even simpler manner
It always fetches the contents of the addressed bytes
Junk or not

16 Disk drives (I)
Sole part of the computer architecture with moving parts:
Data stored on circular tracks of a disk
Spinning speed between 5,400 and 15,000 rotations per minute
Accessed through a read/write head

17 Disk drives (II)
[Diagram: platter, read/write head, arm, servo]

18 Disk drives (III)
Data can be accessed by blocks of 4 KB, 8 KB, …
Depends on disk partition parameters
User selectable
To access a disk block:
The read/write head must be over the right track (seek time)
The data to be accessed must pass under the head (rotational latency)

19 Estimating the rotational latency
On the average, half a disk rotation
If the disk spins at 15,000 rpm:
250 rotations per second
Half a rotation corresponds to 2 ms
Most desktops have disks that spin at 7,200 rpm
Most notebooks have disks that spin at 5,400 or 7,200 rpm
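The slide's 2 ms figure follows directly from the rotation speed; a quick sketch that also covers the slower desktop drives mentioned above:

```python
# Average rotational latency = time for half a disk rotation.
def half_rotation_ms(rpm):
    rotations_per_second = rpm / 60
    rotation_ms = 1000 / rotations_per_second
    return rotation_ms / 2

print(half_rotation_ms(15_000))  # 2.0 ms, as stated on the slide
print(half_rotation_ms(7_200))   # ~4.17 ms on a typical desktop drive
```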

20 Accessing disk contents
Each block on a disk has a unique address
Normally a single number: logical block addressing (LBA)
Older PCs used a different scheme

21 The memory hierarchy (II)
Level  Device                          Access time
1      Fastest registers (2 GHz CPU)   0.5 ns
2      Main memory                     10-70 ns
3      Secondary storage (disk)        7 ms
4      Mass storage (CD-ROM library)   a few s

22 The memory hierarchy (III)
To make sense of these numbers, let us consider an analogy

23 Writing a paper (I)
Level  Resource           Access time
1      Open book on desk  1 s
2      Book on desk
3      Book in library
4      Book far away

24 Writing a paper (II)
Level  Resource           Access time
1      Open book on desk  1 s
2      Book on desk       s
3      Book in library
4      Book far away

25 Writing a paper (III)
Level  Resource           Access time
1      Open book on desk  1 s
2      Book on desk       s
3      Book in library    162 days
4      Book far away

26 Writing a paper (IV)
Level  Resource           Access time
1      Open book on desk  1 s
2      Book on desk       s
3      Book in library    162 days
4      Book far away      63 years
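The numbers in this analogy come from scaling slide 21 so that one register access (0.5 ns) becomes 1 second; the sketch below checks that the 162 days and 63 years figures do follow (the 50 ns RAM access is an assumed mid-range value from the 10-70 ns row):

```python
# Scale every access time so that one 0.5 ns register access takes 1 s.
scale = 1.0 / 0.5e-9          # factor of 2 billion

ram_s  = 50e-9 * scale        # a ~50 ns RAM access (mid-range of 10-70 ns)
disk_s = 7e-3 * scale         # a 7 ms disk access
mass_s = 1.0  * scale         # a ~1 s mass-storage access

print(ram_s)                        # ~100 s: grabbing a book on the desk
print(disk_s / 86_400)              # ~162 days: the trip to the library
print(mass_s / (86_400 * 365))      # ~63 years: the book far away
```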

27 The two gaps (I)
Gap between CPU and main memory speeds:
Will add intermediary levels: L1, L2, and L3 caches
Will store the contents of the most recently accessed memory addresses
Most likely to be needed in the future
Purely hardware solution
The software does not see it

28 The two gaps (II)
Gap between main memory and disk speeds:
Will store in main memory the contents of the most recently accessed disk blocks
Most likely to be needed in the future
Purely software solution
User processes do not see it
Can also replace the hard disk by faster flash memory

29 Why?
Having the hardware handle an issue:
Complicates hardware design
Offers a very fast solution
Best approach for very frequent actions
Letting the software handle an issue:
Cheaper
Has a much higher overhead
Best approach for less frequent actions

30 Will the problem go away?
It will become worse:
RAM access times are not improving as fast as CPU power
Disk access times are limited by the rotational speed of the disk drive

31 What are the solutions?
To bridge the CPU/DRAM gap:
Interpose between the CPU and the DRAM smaller, faster memories that cache the data that the CPU currently needs (cache memories)
Managed by the hardware and invisible to the software (OS included)

32 What are the solutions?
To bridge the DRAM/disk drive gap:
Store in main memory the data blocks that are currently accessed (I/O buffer)
Manage memory space and disk space as a single resource (virtual memory)
The I/O buffer and virtual memory are managed by the OS and invisible to the user processes

33 Why do these solutions work?
Locality principle:
Spatial locality: at any time a process only accesses a small portion of its address space
Temporal locality: this subset does not change too frequently

34 The true memory hierarchy
CPU registers
L1, L2 and L3 caches
Main memory (RAM)
Secondary storage (Disks)
Mass storage (Often offline)

35 Handling the CPU/DRAM speed gap

36 The technology
Caches use faster static RAM (SRAM)
(D flip-flops, if you really want to know)
Can have:
Separate caches for instructions and data (great for pipelining)
A unified cache

37 Basic principles
Assume we want to store in a faster memory the 2^n words that are currently accessed by the CPU
Can be instructions or data or even both
When the CPU needs to fetch an instruction or load a word into a register:
It will look first into the cache
Can have a hit or a miss

38 Cache hits
Occur when the requested word is found in the cache
The cache avoided a memory access
The CPU can proceed

39 Cache misses
Occur when the requested word is not found in the cache
Will need to access the main memory
Will bring the new word into the cache
Must make space for it by expelling one of the cache entries
Need to decide which one

40 Cache design challenges
The cache contains a small subset of memory addresses
Must find a very fast access mechanism
No linear search, no binary search
Would like to have an associative memory
Can search all memory entries in parallel by content
Like human brains do

41 An associative memory
[Diagram: a search for "ice cream" probes all entries in parallel; unrelated entries ("COSC 1306 program", "Finding a parking spot") do not match, while "My last ice cream" and "Other ice cream moment" are found]

42 An analogy (I)
Let us go back to our closed-stack library example
Librarians have noted that some books get asked for again and again
They want to put them closer to the circulation desk
This would result in much faster service
The problem is how to locate these books
They will not be at the right location!

43 An analogy (II)
The librarians come up with a great solution
They put behind the circulation desk shelves with 100 book slots numbered from 00 to 99
Each slot is a home for the most recently requested book whose call number ends with the slot number
A book whose call number ends in 93 can only go in slot 93
One whose call number ends in 67 can only go in slot 67

44 An analogy (III)
Patron: "The call number of the book I need is …"
Librarian: "Let me see if it's in bin 93"

45 An analogy (IV)
To let the librarian do her job, each slot must contain either:
Nothing, or
A book and its reference number
There are many books whose reference number ends in 93 or 67 or any two given digits

46 An analogy (V)
Librarian: "Sure"
Patron: "Could I get this time the book whose call number is …?"

47 An analogy (VI)
This time the librarian will:
Go to bin 93
Find it contains a book with a different call number
She will:
Bring that book back to the stacks
Fetch the new book

48 A very basic cache
Has 2^n entries
Each entry contains:
A word (4 bytes)
Its memory address (the sole way to identify the word)
A bit indicating whether the cache entry contains something useful

49 A very basic cache (I)
[Diagram: a table of cache entries, indexed by the low-order address bits 000 to 111, each holding a tag (RAM address), contents (word), and a valid bit]
Actual caches are much bigger

50 Handling the DRAM/disk speed gap

51 What can be done?
Two main techniques:
Making disk accesses more efficient
Doing something else while waiting for an I/O operation
Not very different from what we are doing in our everyday lives

52 Optimizing read accesses (I)
When we shop in a market that's far away from our home, we plan ahead and buy food for several days
The OS will read as many bytes as it can during each disk access
In practice, whole blocks (4 KB or more)
Blocks are stored in the I/O buffer

53 Optimizing read accesses (II)
[Diagram: a process's read operations are served from the I/O buffer; physical I/O goes to the disk drive only when needed]
Most short read operations can be completed without any disk access

54 Optimizing read accesses (III)
Buffered disk reads work quite well
Most systems use them
They have a major limitation:
If we try to read too far ahead of the program, we risk bringing into main memory data that will never be used

55 Optimizing read accesses (IV)
Can also keep in a buffer recently accessed blocks, hoping they will be accessed again (caching)
Works very well because we keep accessing again and again the data we are working with
Caching is a fundamental technique of OS and database design
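One common way to decide which cached block to evict is to drop the least recently used one. A sketch of such a block cache (capacity, block numbers, and contents are invented; real OS caches are far more elaborate):

```python
from collections import OrderedDict

# Sketch of an OS block cache: keep the most recently used disk blocks
# in memory and evict the least recently used one when the cache is full.
CAPACITY = 3
block_cache = OrderedDict()          # block number -> block data

def read_block(n, disk):
    if n in block_cache:             # cache hit: no disk access
        block_cache.move_to_end(n)   # mark as most recently used
        return block_cache[n], "hit"
    data = disk[n]                   # cache miss: go to the disk
    block_cache[n] = data
    if len(block_cache) > CAPACITY:
        block_cache.popitem(last=False)   # evict least recently used
    return data, "miss"

disk = {n: f"block-{n}" for n in range(10)}
for n in (0, 1, 2, 0, 3, 0):
    print(n, read_block(n, disk)[1])
# 0 miss, 1 miss, 2 miss, 0 hit, 3 miss (evicts block 1), 0 hit
```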

56 Optimizing write accesses (I)
If we live far away from a library, we wait until we have several books to return before making the trip
The OS will delay writes for a few seconds, then write an entire block
Since most writes are sequential, most short writes will not require any disk access

57 Optimizing write accesses (II)
Delayed writes work quite well
Most systems use them
They have a major drawback:
We will lose data if the system or the program crashes
After the program issued a write but before the data were saved to disk
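Both the benefit and the drawback of delayed writes show up in a few lines (a toy sketch with lists standing in for the write buffer and the disk):

```python
# Sketch of delayed writes: short writes accumulate in a buffer and
# reach the disk only when the buffer is flushed; anything still
# buffered when the system crashes is lost.
write_buffer = []
disk = []

def write(data):
    write_buffer.append(data)   # fast: no disk access yet

def flush():
    disk.extend(write_buffer)   # one physical write for many calls
    write_buffer.clear()

write("a"); write("b")
flush()                         # "a" and "b" are now safe on disk
write("c")                      # buffered, not yet on disk
print(disk)                     # ['a', 'b'] -- a crash here loses 'c'
```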

58 Doing something else
When we order something on the web, we do not remain idle until the goods are delivered
The OS can implement multiprogramming and let the CPU run another program while a program waits for an I/O operation
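The payoff of overlapping I/O with computation can be sketched with two threads, using a sleep to stand in for a slow disk access (the 0.2 s delay and the busy-work sum are invented for the demo):

```python
import threading
import time

# Sketch of multiprogramming: while one "program" waits for a
# (simulated) I/O operation, the CPU runs another one.
def slow_io():
    time.sleep(0.2)              # stands in for a disk access

start = time.perf_counter()
io_thread = threading.Thread(target=slow_io)
io_thread.start()                # program A starts waiting for its I/O...
busy = sum(range(100_000))       # ...while program B computes
io_thread.join()
elapsed = time.perf_counter() - start

# Total time is roughly the 0.2 s I/O wait, not wait + compute time:
# the computation ran during the wait.
print(round(elapsed, 1))
```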

59 Conclusion
As computer architecture becomes more complex:
Some old problems continue to bother us:
Wide access time gaps between:
CPU and main memory
Main memory and disk (or even flash)
Some solutions bring new challenges

