Introduction to Concurrency CSE 333 Winter 2019


Introduction to Concurrency CSE 333 Winter 2019
Instructor: Hal Perkins
Teaching Assistants: Alexey Beall, Renshu Gu, Harshita Neti, David Porter, Forrest Timour, Soumya Vasisht, Yifan Xu, Sujie Zhou

Administrivia
- Sections tomorrow: pthread tutorial/demo
  - pthread exercise posted after sections, due Monday morning
- Much more about concurrency in this and the next several lectures, but we will not repeat section material
- hw4 due next Thursday night
  - Yes, you can still use up to 2 late days on hw4 (if you haven't used them up already – check!)
- Gates Center Linux workstation networking is OK now once the workstation is rebooted (only needs to be done once)

Outline
- Understanding Concurrency
  - Why is it useful
  - Why is it hard
- Concurrent Programming Styles
  - Threads vs. processes
  - Asynchronous or non-blocking I/O
  - "Event-driven programming"

Building a Web Search Engine
We need:
- A web index
  - A map from <word> to <list of documents containing the word>
  - This is probably sharded over multiple files (https://en.wikipedia.org/wiki/Shard_(database_architecture))
- A query processor
  - Accepts a query composed of multiple words
  - Looks up each word in the index
  - Merges the result from each word into an overall result set

Web Search Architecture
[Diagram: many clients send queries to the query processor, which reads from the index files and returns results]

Sequential Implementation
Pseudocode for sequential query processor:

    doclist Lookup(string word) {
      bucket = hash(word);
      hitlist = file.read(bucket);
      foreach hit in hitlist {
        doclist.append(file.read(hit));
      }
      return doclist;
    }

    main() {
      while (1) {
        string query_words[] = GetNextQuery();
        results = Lookup(query_words[0]);
        foreach word in query_words[1..n] {
          results = results.intersect(Lookup(word));
        }
        Display(results);
      }
    }

Sequential Execution Timeline
[Timeline diagram, assuming 3 words in the query: for each query, main() alternates between network I/O in GetNextQuery(), CPU work, disk I/O in file.read() inside Lookup(), and more CPU work in results.intersect() and Display(), with queries handled one after another]

Sequential Queries – Simplified
[Timeline diagram, now assuming only 1 word per query: each query n proceeds as CPU n.a, I/O n.b, CPU n.c, I/O n.d, CPU n.e, where n.b is GetNextQuery() and n.d is file.read(); query 2 starts only after query 1 finishes, and query 3 only after query 2]

Sequential Queries – Simplified
[Same timeline diagram, annotated:]
- Only one I/O request at a time is "in flight"
- The CPU is idle most of the time!
- Queries don't run until earlier queries finish

Sequential Can Be Inefficient
- Only one query is being processed at a time
  - All other queries queue up behind the first one
- The CPU is idle most of the time
  - It is blocked waiting for I/O to complete
  - Disk I/O can be very, very slow
- At most one I/O operation is in flight at a time
  - Missed opportunities to speed I/O up: separate devices in parallel, better scheduling of a single device, etc.

Concurrency
- A version of the program that executes multiple tasks simultaneously
  - Example: our web server could execute multiple queries at the same time
    - While one is waiting for I/O, another can be executing on the CPU
  - Example: execute queries one at a time, but issue I/O requests against different files/disks simultaneously
    - Could read from several index files at once, processing the I/O results as they arrive
- Concurrency != parallelism
  - Parallelism is executing multiple CPU instructions simultaneously

A Concurrent Implementation
- Use multiple threads or processes
  - As a query arrives, fork a new thread (or process) to handle it
  - The thread reads the query from the console, issues read requests against files, assembles results, and writes to the console
  - The thread uses blocking I/O; the thread alternates between consuming CPU cycles and blocking on I/O
- The OS context switches between threads/processes
  - While one is blocked on I/O, another can use the CPU
  - Multiple threads' I/O requests can be issued at once

Introducing Threads
- Separate the concept of a process from an individual "thread of control"
  - Usually called a thread (or a lightweight process): a sequential execution stream within a process
- In most modern OSes:
  - Process: address space, OS resources/process attributes
  - Thread: stack, stack pointer, program counter, registers
  - Threads are the unit of scheduling and processes are their containers; every process has at least one thread running in it
- Faster context switching between threads
- System can support more threads
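Not from the slides: a minimal sketch of creating a thread with the pthreads API in C, showing that a thread gets its own stack and registers but shares the process's globals and heap. The worker function and shared counter are made up for illustration; compile with gcc -pthread.

    #include <pthread.h>
    #include <stdio.h>

    // Hypothetical shared state: all threads in the process share one address
    // space, so this global is visible to every thread.
    static int shared_counter = 0;

    // Thread entry point: private stack/registers, shared globals and heap.
    static void *worker(void *arg) {
      shared_counter++;   // unsafe without a lock; see the synchronization slide later
      return NULL;
    }

    int main(void) {
      pthread_t thr;
      if (pthread_create(&thr, NULL, worker, NULL) != 0) {
        perror("pthread_create");
        return 1;
      }
      pthread_join(thr, NULL);  // wait for the thread to finish
      printf("counter = %d\n", shared_counter);
      return 0;
    }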

Multithreaded Pseudocode

    main() {
      while (1) {
        string query_words[] = GetNextQuery();
        ForkThread(ProcessQuery());
      }
    }

    doclist Lookup(string word) {
      bucket = hash(word);
      hitlist = file.read(bucket);
      foreach hit in hitlist {
        doclist.append(file.read(hit));
      }
      return doclist;
    }

    ProcessQuery() {
      results = Lookup(query_words[0]);
      foreach word in query_words[1..n] {
        results = results.intersect(Lookup(word));
      }
      Display(results);
    }
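An illustrative C sketch (not from the slides) of the ForkThread() step using pthread_create() and pthread_detach(); the Query struct, the fgets()-based stand-in for GetNextQuery(), and the ProcessQuery() stub are all hypothetical placeholders for the pseudocode above.

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    // Hypothetical per-query task state (made up for illustration).
    typedef struct {
      char text[256];
    } Query;

    // Stand-in for the real query processing from the pseudocode above.
    static void ProcessQuery(Query *q) {
      printf("processing: %s", q->text);
    }

    // Thread entry point: owns its Query and frees it when done.
    static void *query_thread(void *arg) {
      Query *q = (Query *)arg;
      ProcessQuery(q);
      free(q);
      return NULL;
    }

    int main(void) {
      char line[256];
      while (fgets(line, sizeof(line), stdin) != NULL) {  // "GetNextQuery()"
        Query *q = malloc(sizeof(Query));
        if (q == NULL) continue;
        strncpy(q->text, line, sizeof(q->text) - 1);
        q->text[sizeof(q->text) - 1] = '\0';

        pthread_t thr;
        if (pthread_create(&thr, NULL, query_thread, q) == 0) {
          pthread_detach(thr);   // detached thread cleans itself up when done
        } else {
          free(q);
        }
      }
      return 0;
    }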

Multithreaded Queries – Simplified
[Timeline diagram: queries 1, 2, and 3 each run in their own thread, so their CPU n.a – I/O n.b – CPU n.c – I/O n.d – CPU n.e phases overlap in time instead of running back-to-back]

Why Threads?
- Advantages:
  - You (mostly) write sequential-looking code
  - Threads can run in parallel if you have multiple CPUs/cores
- Disadvantages:
  - If threads share data, you need locks or other synchronization
    - Very bug-prone and difficult to debug
  - Threads can introduce overhead
    - Lock contention, context switch overhead, and other issues
  - Need language support for threads
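To make the "locks or other synchronization" point concrete, here is a minimal pthread mutex sketch (not from the slides); result_count and the critical section are made up for illustration.

    #include <pthread.h>
    #include <stdio.h>

    static int result_count = 0;                              // shared between threads
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;  // protects result_count

    static void *worker(void *arg) {
      for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    // without this, increments can be lost
        result_count++;
        pthread_mutex_unlock(&lock);
      }
      return NULL;
    }

    int main(void) {
      pthread_t a, b;
      pthread_create(&a, NULL, worker, NULL);
      pthread_create(&b, NULL, worker, NULL);
      pthread_join(a, NULL);
      pthread_join(b, NULL);
      printf("result_count = %d\n", result_count);  // reliably 200000 with the lock
      return 0;
    }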

Alternative: Processes
What if we forked processes instead of threads?
- Advantages:
  - No shared memory between processes
  - No need for language support; the OS provides fork()
- Disadvantages:
  - More overhead than threads during creation and context switching
  - Cannot easily share memory between processes – typically communicate through the file system
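A rough sketch of the process-per-query version using fork() (again illustrative, not from the slides); handle_query() and the fgets() loop are hypothetical stand-ins for the real work.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    // Stand-in for the real per-query work (hypothetical).
    static void handle_query(const char *query) {
      printf("[pid %d] processing: %s", (int)getpid(), query);
    }

    int main(void) {
      char line[256];
      while (fgets(line, sizeof(line), stdin) != NULL) {
        pid_t pid = fork();
        if (pid == 0) {
          // Child: gets a *copy* of the parent's address space, handles one query.
          handle_query(line);
          exit(EXIT_SUCCESS);
        } else if (pid > 0) {
          // Parent: keep accepting queries; reap any children that already finished.
          while (waitpid(-1, NULL, WNOHANG) > 0) { }
        } else {
          perror("fork");
        }
      }
      return 0;
    }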

Alternate: Asynchronous I/O
- Use asynchronous or non-blocking I/O
- Your program begins processing a query
  - When your program needs to read data to make further progress, it registers interest in the data with the OS and then switches to a different query
  - The OS handles the details of issuing the read on the disk, or waiting for data from the console (or other devices, like the network)
  - When data becomes available, the OS lets your program know
- Your program (almost) never blocks on I/O

Event-Driven Programming
Your program is structured as an event loop:

    void dispatch(task, event) {
      switch (task.state) {
        case READING_FROM_CONSOLE:
          query_words = event.data;
          async_read(index, query_words[0]);
          task.state = READING_FROM_INDEX;
          return;
        case READING_FROM_INDEX:
          ...
      }
    }

    while (1) {
      event = OS.GetNextEvent();
      task = lookup(event);
      dispatch(task, event);
    }

Asynchronous, Event-Driven I/O
[Timeline diagram: a single thread interleaves the CPU phases of queries 1–3 (1.a, 2.a, 1.c, 2.c, 3.a, 1.e, 2.e, 3.c, 3.e) while their I/O phases (n.b, n.d) are outstanding in the background]

Non-blocking vs. Asynchronous
- Reading from the network can truly block your program
  - The remote computer may wait arbitrarily long before sending data
- Non-blocking I/O (network, console)
  - Your program enables non-blocking I/O on its file descriptors
  - Your program issues read() and write() system calls; if the read/write would block, the system call returns immediately
  - The program can ask the OS which file descriptors are readable/writeable
  - The program can choose to block while no file descriptors are ready
  - Set a file descriptor to non-blocking using fcntl() with the O_NONBLOCK flag; a read/write that would block returns the special error code EWOULDBLOCK or EAGAIN
  - Use select() or poll() to find out when it is a good time to retry (see the sketch below)
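A minimal sketch of that pattern (not from the slides), assuming fd is an already-open socket or console descriptor; here stdin stands in for it, and only the fcntl()/read()/select() calls are real APIs.

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/select.h>
    #include <unistd.h>

    int main(void) {
      int fd = STDIN_FILENO;   // stands in for any network/console descriptor

      // Turn on non-blocking mode with fcntl() and O_NONBLOCK.
      int flags = fcntl(fd, F_GETFL, 0);
      fcntl(fd, F_SETFL, flags | O_NONBLOCK);

      char buf[512];
      while (1) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
          fwrite(buf, 1, (size_t)n, stdout);        // got data; process it
        } else if (n == 0) {
          break;                                    // EOF
        } else if (errno == EWOULDBLOCK || errno == EAGAIN) {
          // Nothing ready yet: use select() to sleep until fd is readable.
          fd_set readfds;
          FD_ZERO(&readfds);
          FD_SET(fd, &readfds);
          select(fd + 1, &readfds, NULL, NULL, NULL);
        } else {
          perror("read");
          break;
        }
      }
      return 0;
    }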

Non-blocking vs. Asynchronous
- Asynchronous I/O (disk)
  - The program tells the OS to begin reading/writing
    - The "begin_read" or "begin_write" returns immediately
    - When the I/O completes, the OS delivers an event to the program
  - According to the Linux specification, the disk never blocks your program (it just delays it)
    - Asynchronous I/O is primarily used to hide disk latency
  - Asynchronous I/O system calls are messy and complicated
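For reference, a minimal sketch of one flavor of asynchronous disk I/O, POSIX AIO from <aio.h> (aio_read(), aio_error(), aio_return()); this is just one way the idea is exposed on Linux, the file name is made up, and on Linux it typically needs to be linked with -lrt.

    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
      int fd = open("index.dat", O_RDONLY);   // hypothetical index file
      if (fd < 0) { perror("open"); return 1; }

      char buf[4096];
      struct aiocb cb;
      memset(&cb, 0, sizeof(cb));
      cb.aio_fildes = fd;
      cb.aio_buf    = buf;
      cb.aio_nbytes = sizeof(buf);
      cb.aio_offset = 0;

      if (aio_read(&cb) != 0) { perror("aio_read"); return 1; }  // the "begin_read"

      // The read proceeds in the background; real code would do useful
      // work here instead of spinning on the completion status.
      while (aio_error(&cb) == EINPROGRESS) {
        /* ... process other queries ... */
      }

      ssize_t n = aio_return(&cb);            // collect the result
      printf("read %zd bytes asynchronously\n", n);
      close(fd);
      return 0;
    }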

Why Events?
- Advantages:
  - Don't have to worry about locks and race conditions
  - For some kinds of programs, especially GUIs, this leads to a very simple and intuitive program structure
    - One event handler for each UI event
- Disadvantages:
  - Can lead to very complex structure for programs that do lots of disk and network I/O
    - Sequential code gets broken up into a jumble of small event handlers
    - You have to package up all task state between handlers

One Way to Think About It
- Threaded code:
  - Each thread executes its task sequentially, and per-task state is naturally stored in the thread's stack
  - The OS and thread scheduler switch between threads for you
- Event-driven code:
  - *You* are the scheduler
  - You have to bundle up task state into continuations (data structures describing what to do next); tasks do not have their own stacks
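As an illustration of "bundling up task state," here is roughly what a continuation for the query-processor event loop might look like in C; every name and field here is hypothetical.

    #include <stddef.h>

    // What the task should do when its next event arrives
    // (the cases of dispatch() from the earlier slide).
    typedef enum {
      READING_FROM_CONSOLE,
      READING_FROM_INDEX,
      INTERSECTING_RESULTS,
      DISPLAYING_RESULTS
    } TaskState;

    // A continuation: everything a thread would have kept on its stack,
    // packaged into a struct so the event loop can pick up where it left off.
    typedef struct {
      TaskState state;        // which case of dispatch() to run next
      char    **query_words;  // the words of this query
      size_t    num_words;
      size_t    next_word;    // which word we are currently looking up
      int       index_fd;     // the outstanding read is against this file
      void     *partial_results;
    } QueryTask;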