
CS162 Week 5 Kyle Dewey

Overview: Announcements, Reactive Imperative Programming, Parallelism, Software transactional memory

TA Evaluations

unicode_to_ascii.sh

Reactive Imperative Programming Clarification

Tips: Never execute a constraint when you are in atomic mode. Self-recursive constraints should never trigger themselves. Reactive addresses can be both added and removed via newCons, depending on what gets used in the newCons' body. If a previously reactive address is not used when a constraint's newCons is executed again, that address is no longer reactive.

deletedAddress.not

Any lingering Reactive Imperative Programming Questions?

Parallelism

Previous Experience: Anyone taken CS170? CS240A? Any pthreads users? Threads in any language? MPI users? Cloud? (AWS / Azure / AppEngine...) Comfort with parallel code?

Going back a bit... A processor executes instructions. The faster it can execute instructions, the faster our program runs. Ideal world: one processor that runs really, really fast.

Moore’s Law: The number of transistors that can be packed onto an integrated circuit doubles roughly every two years. That transistor count is roughly correlated with how fast the processor runs.

In the Beginning: The Land was At Peace. Processors got faster and faster. Using clock speed as a (poor) estimator of actual speed: Pentium: up to 300 MHz. Pentium II: up to 450 MHz. Pentium III: up to 1.4 GHz. Pentium 4: up to 3.8 GHz.

Then Physics Happened

Heat: More transistors means more power is needed. More power means more heat is generated in the same amount of space. Too much heat and the processor stops working.

So Just Cool It: Again, physics is evil. Normal heatsinks and fans can only push heat away so quickly. To get heat away fast enough, you need to start getting drastic: water cooling, Peltier (thermoelectric) coolers, a liquid nitrogen drip.

Even if it Could be Cooled... When transistors get too close together, quantum physics kicks in. Electrons will more or less teleport between wires, preventing the processor from working correctly. Not cool, physics. Not cool.

But the Computer Architects had a Plan...

...one that would allow processors to keep getting faster...

“Eh, I give up. Let the software people handle it.”

Multicore is Born: Put multiple execution units (cores) on the same processor. This uses transistors more efficiently. Individual cores are slower, but the sum of the cores is faster; e.g., 2 cores at 2.4 GHz are “faster” than a single processor at 3.8 GHz, provided the work can be spread across both cores.

Problem: The software itself needs to be written to use multiple cores. If it is not written this way, then it will only use a single core. Nearly all existing software was (and still is) written to use only a single core. Oops.

Why it is Hard: People generally do not think in parallel. (Want to spend more time getting less done, poorly? Just multitask.) Many problems have subproblems that must be done sequentially, known as sequential dependencies. The parallel pieces often require some sort of communication with each other.

In the Code: With multiple cores, you can execute multiple threads in parallel. Each thread executes its own bit of code. Typical single-core programs only have a single thread of execution. One explicitly requests threads and specifies what they should run.
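
As a concrete sketch (not from the slides) of explicitly requesting threads and saying what they should run, here is a minimal C++ example using std::thread; the worker names and messages are invented for illustration.

    #include <iostream>
    #include <thread>

    // Each thread executes its own bit of code.
    void workerA() { std::cout << "hello from worker A\n"; }
    void workerB() { std::cout << "hello from worker B\n"; }

    int main() {
        // Explicitly request two threads and specify what they should run.
        std::thread a(workerA);
        std::thread b(workerB);
        // Wait for both to finish before the program exits.
        a.join();
        b.join();
        return 0;
    }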

Example
int x = -1;
void thread1() {
  if (x == -1) {
    x = 5;
  }
}
void thread2() {
  if (x == -1) {
    x = 6;
  }
}

Race Conditions: This example still may get executed correctly; it depends on what gets run when. This is called a race condition: one computation “races” another one, and depending on who “wins” you get different results. IMO: these are the most difficult bugs to find and to fix.

Fundamental Problem: We need to manage shared, mutable state. Only certain states and certain state transitions are valid. In the example, it is valid to go from -1 to 5, or from -1 to 6, but not from 5 to 6 or from 6 to 5. We need a way of enforcing that we will not derive invalid states or execute invalid state transitions.

A Solution: Locks. Shared state is under a lock. If you want to modify it, you need to hold a key. Only one process can hold a key at a time.

Example With Locks
int x = -1;
void proc1() {
  lock (x) {
    if (x == -1) {
      x = 5;
    }
  }
}
void proc2() {
  lock (x) {
    if (x == -1) {
      x = 6;
    }
  }
}
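
The lock (x) { ... } blocks above are pseudocode. As a hedged sketch of the same idea against a real API, here is one way to write it in C++ with a std::mutex guarding x; the mutex name and overall structure are choices made for this illustration.

    #include <mutex>

    int x = -1;
    std::mutex x_mutex;  // protects x

    void proc1() {
        // lock_guard acquires the mutex here and releases it
        // automatically at the end of the block.
        std::lock_guard<std::mutex> guard(x_mutex);
        if (x == -1) {
            x = 5;
        }
    }

    void proc2() {
        std::lock_guard<std::mutex> guard(x_mutex);
        if (x == -1) {
            x = 6;
        }
    }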

Problems With Locks: Very low-level and error-prone. Can absolutely kill performance. Because of the locks, the example before is now purely sequential, with locking overhead added.

Deadlock
int x = 1;
int y = 2;
void proc1() {
  lock(x) {
    lock(y) {
      ...
    }
  }
}
void proc2() {
  lock(y) {
    lock(x) {
      ...
    }
  }
}
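
The procedures above can deadlock because they acquire the two locks in opposite orders: proc1 holds x while waiting for y, and proc2 holds y while waiting for x. One common remedy, sketched here in C++ and not taken from the slides, is to acquire both locks as a unit; std::scoped_lock (C++17) uses a deadlock-avoidance algorithm internally. The mutex names are invented for the example.

    #include <mutex>

    int x = 1;
    int y = 2;
    std::mutex x_mutex;  // protects x
    std::mutex y_mutex;  // protects y

    void proc1() {
        // Acquires both mutexes without deadlocking, regardless of
        // the order in which the other thread requests them.
        std::scoped_lock both(x_mutex, y_mutex);
        // ... use x and y ...
    }

    void proc2() {
        std::scoped_lock both(y_mutex, x_mutex);
        // ... use x and y ...
    }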

Other Solutions: There are a LOT: atomic operations, semaphores, monitors, software transactional memory.
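
As one hedged illustration of the first item, atomic operations, the earlier example's valid transitions (from -1 to 5, or from -1 to 6, but never from 5 to 6 or back) can be expressed with a compare-and-swap; this C++ sketch is an example of the technique, not something from the slides.

    #include <atomic>

    std::atomic<int> x(-1);

    void proc1() {
        int expected = -1;
        // Atomically set x to 5 only if it is still -1; if the other
        // transition already happened, do nothing.
        x.compare_exchange_strong(expected, 5);
    }

    void proc2() {
        int expected = -1;
        x.compare_exchange_strong(expected, 6);
    }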

Software Transactional Memory: A very different approach from locks. Code that needs to run as a single unit is put into an atomic block. Everything in an atomic block is executed in a single transaction.

Transactions: Execute the code in the atomic block. If it did not conflict with anything, then commit it. If there was a conflict, then roll back and retry. All or nothing.
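
To make the execute / check / commit-or-retry loop concrete, here is a toy C++ sketch of the optimistic idea for a single shared integer, using a compare-and-swap as the "commit" step. A real STM tracks whole read and write sets, so this only illustrates the control flow described above; all names are invented for the example.

    #include <atomic>

    std::atomic<int> x(-1);

    // Run a "transaction" that computes a new value of x from the old one:
    // execute optimistically, commit only if x has not changed underneath us,
    // and on conflict roll back (discard the result) and retry.
    template <typename F>
    void run_transaction(F body) {
        while (true) {
            int snapshot = x.load();         // read the current state
            int result = body(snapshot);     // execute the block on a copy
            if (x.compare_exchange_strong(snapshot, result)) {
                return;                      // no conflict: commit
            }
            // conflict: someone else changed x; retry from scratch
        }
    }

    void proc1() {
        run_transaction([](int v) { return v == -1 ? 5 : v; });
    }

    void proc2() {
        run_transaction([](int v) { return v == -1 ? 6 : v; });
    }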

Example With STM
int x = -1;
void proc1() {
  atomic {
    if (x == -1) {
      x = 5;
    }
  }
}
void proc2() {
  atomic {
    if (x == -1) {
      x = 6;
    }
  }
}
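
The atomic { ... } blocks above are pseudocode, but they map fairly directly onto GCC's experimental transactional memory extension (compiled with -fgnu-tm), shown here as a hedged sketch; whether this is the notation the course has in mind is an assumption on my part.

    // Compile with: g++ -fgnu-tm example.cpp  (GCC's experimental TM support)
    int x = -1;

    void proc1() {
        __transaction_atomic {
            if (x == -1) {
                x = 5;
            }
        }
    }

    void proc2() {
        __transaction_atomic {
            if (x == -1) {
                x = 6;
            }
        }
    }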

Not a Lock: We do not explicitly state what we are locking on. We only roll back if there was a change. With locks, we could lock something and never change it. Atomic blocks automatically determine what needs to be “locked”.

Performance: Atomic blocks scale much better than locks. Oftentimes conflicts are possible but infrequent, and the performance hits come mostly at conflicts. Depending on the implementation, atomic blocks can have a much lower overhead than locking.