Concurrency, Parallelism, Events, Asynchronicity, Oh My threads, processes, queues & workers, multiprocessing, gevent, twisted, yo momma SoCal Piggies.


Concurrency, Parallelism, Events, Asynchronicity, Oh My
threads, processes, queues & workers, multiprocessing, gevent, twisted, yo momma
SoCal Piggies
Tommi Virtanen / DreamHost

One thing at a time per core
hyperthreading is a filthy lie
SMP/NUMA = think of multiple computers in one box

Concurrency vs Parallelism flip-flops vs thongs

Parallelism
Speed-oriented
Physically parallel

Concurrency
Multiplexing operations
Reasoning about program structure

Von Neumann architecture meet Von Neumann bottleneck

Multiple processes

    wget -q -O- \
      | perl -ne 'for (/\b[[:upper:]]\w*\b/g) {print "$_\n"}' \
      | tr '[:upper:]' '[:lower:]' \
      | sort \
      | uniq -c \
      | sort -nr \
      | head

    dreamhost 11 hosting 10 wordpress 9 web 9 we

waits for network
waits for disk
OS can schedule to avoid IO wait
(HT can schedule to avoid RAM wait)

Pipelines are lovely, but two-way communication is harder

IO waiting deadlock

    # parent
    data = child.stdout.read(8192)

    # child
    data = sys.stdin.read(8192)
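
One way around the deadlock above can be sketched as follows, in Python 3 rather than the Python 2 of the slides. `Popen.communicate` writes the child's input and drains its output together, so neither side ends up parked in a blocking read waiting for the other. The child program (`CHILD_SOURCE`) and the helper `talk_to_child` are made up for this illustration.

```python
import subprocess
import sys

# Hypothetical child: reads all of stdin and writes it back upper-cased.
CHILD_SOURCE = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def talk_to_child(payload):
    """Send payload (bytes) to the child and return everything it prints."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # communicate() feeds stdin and reads stdout together, then closes
    # stdin so the child sees EOF. Calling child.stdout.read() first
    # would hang: the child is still waiting for input nobody has sent.
    out, _ = child.communicate(payload)
    return out
```

Calling `talk_to_child(b"hello\n")` returns the child's output once the child has exited.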

"Blocking" on IO Like refusing to participate in a dinner conversation because your glass is empty.

Nonblocking "if I asked you to do X, would you do it immediately?"
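
A small sketch of what nonblocking means in practice, assuming Python 3 on a POSIX system. With the read end of a pipe switched to nonblocking mode, a read on an empty pipe fails at once with `BlockingIOError` instead of waiting. The helper names `try_read` and `demo` are invented for this example.

```python
import os


def try_read(fd, size=8192):
    """Return available bytes, or None if the read would have blocked."""
    try:
        return os.read(fd, size)
    except BlockingIOError:
        return None


def demo():
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)  # reads now fail fast instead of waiting
    try:
        before = try_read(read_fd)   # pipe is empty: the read would block
        os.write(write_fd, b"wine")
        after = try_read(read_fd)    # data is ready: returned immediately
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return before, after
```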

Asynchronous abstraction on top of nonblocking; queue outgoing data until writable, notify of incoming data just read

Event loop
GUIs: "user did X, please react"
services: lots of async operations, concurrently

Callback-oriented programming services: "received bytes B on connection C"
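
The event loop and callback ideas can be sketched with the standard `selectors` module in Python 3. This is a toy under stated assumptions, not how Twisted or gevent are implemented: connections are simulated with `socket.socketpair()`, every message is non-empty and small enough to arrive in one `recv`, and `run_echo_loop` and `on_readable` are hypothetical names.

```python
import selectors
import socket


def run_echo_loop(messages):
    """Send each message over its own socket pair; collect callback events."""
    sel = selectors.DefaultSelector()
    received = []

    def on_readable(conn, name):
        # The callback: "received bytes B on connection C".
        received.append((name, conn.recv(4096)))

    pairs = []
    for i, msg in enumerate(messages):
        left, right = socket.socketpair()
        right.setblocking(False)
        # Store the callback and its argument next to the socket.
        sel.register(right, selectors.EVENT_READ, (on_readable, "conn%d" % i))
        left.sendall(msg)
        pairs.append((left, right))

    # The event loop: wait for readiness, then dispatch to callbacks.
    while len(received) < len(messages):
        for key, _events in sel.select(timeout=1):
            callback, name = key.data
            callback(key.fileobj, name)

    for left, right in pairs:
        sel.unregister(right)
        left.close()
        right.close()
    sel.close()
    return sorted(received)
```

The loop itself never blocks on any single connection; it only sleeps inside `select`, which is what lets one thread service many connections.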

Deferred Twisted API for delayed-response function calls

Deferred example

    import something

    def get_or_create(container, name):
        d = container.lookup(name)
        def or_create(failure):
            failure.trap(something.NotFoundError)
            return container.create(name)
        d.addErrback(or_create)
        return d

    d = get_or_create(cont, 'foo')
    def cb(foo):
        print 'Got foo:', foo
    d.addCallback(cb)

Twisted Use to poke fun at Node.js fanboys

concurrent.futures (PEP 3148)
Python fails to learn from Twisted
More like Java every day
Twisted dev not taken by surprise
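
For comparison, here is a minimal Python 3 sketch of the `concurrent.futures` style. A `Future` accepts callbacks through `add_done_callback`, but the callback's return value is ignored, so callbacks do not chain the way Deferred callbacks do. `slow_lookup` and `lookup_with_callback` are hypothetical stand-ins, not part of any library.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_lookup(name):
    """Hypothetical stand-in for a slow lookup."""
    return {"name": name}


def lookup_with_callback(name):
    results = []

    def cb(future):
        # Runs once the future holds a result (or an exception).
        results.append(future.result())

    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_lookup, name)
        future.add_done_callback(cb)
    # Leaving the with-block waits for the worker thread, so the
    # callback has already run by the time we return.
    return results
```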

Co-operative scheduling like the nice guy at the ATM, letting you go in between because he still has 20 checks to deposit
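
Co-operative scheduling can be sketched in a few lines of Python 3 with generators: each task runs until it chooses to `yield`, and a round-robin loop picks the next one. The names `task` and `run_round_robin` are invented for this illustration.

```python
from collections import deque


def task(name, steps, log):
    """A task that records its progress and yields after every step."""
    for i in range(steps):
        log.append("%s%d" % (name, i))
        yield  # voluntarily give up the CPU


def run_round_robin(tasks):
    """Run tasks in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the task's next yield
        except StopIteration:
            continue               # task finished; drop it
        ready.append(current)      # back of the line
```

A task that never yields would starve every other task, which is the weakness preemptive scheduling addresses.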

Preemptive scheduling the CPU gets taken away from you (plus you usually get punished for hogging)

So.. back to processes

Communication overhead (most of the time, just premature optimization)

Creation cost (5000 per second on a 900MHz Pentium 3, don't know if anyone's cared since.. )

Idea: Communicate by sharing memory

Threads... now you have two problems

Unix has mmap you know (aka share memory by sharing memory)
Voluntarily giving up that precious MMU counts as pretty stupid in my book...
Threads originate from operating systems with bad schedulers and VM
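
A rough sketch of sharing memory between processes with `mmap`, assuming Python 3 on a Unix system where `os.fork` is available. The parent creates an anonymous shared mapping, the forked child writes into it, and the parent reads the value back. `child_writes_parent_reads` is a hypothetical helper.

```python
import mmap
import os
import struct


def child_writes_parent_reads(value):
    """Fork a child that stores value in shared memory; return what the parent sees."""
    # Anonymous mapping. On Unix the default flag is MAP_SHARED, so the
    # pages stay shared with any child created by fork().
    shared = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:
        # Child process: write one 64-bit integer, then exit immediately.
        try:
            struct.pack_into("q", shared, 0, value)
        finally:
            os._exit(0)
    os.waitpid(pid, 0)  # parent: wait until the child is done
    (seen,) = struct.unpack_from("q", shared, 0)
    shared.close()
    return seen
```

Everything outside the mapping stays private to each process, so the MMU still protects the rest of the address space.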

GIL Python threads are only good for concurrency, not for parallelism
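
The concurrency half of that claim can be illustrated in Python 3: threads that spend their time waiting overlap, because CPython releases the GIL during blocking calls such as `time.sleep`. `fake_io` is a hypothetical stand-in for a network request, and the timings depend on the machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(seconds):
    """Hypothetical stand-in for a blocking network call."""
    time.sleep(seconds)
    return seconds


def timed_fetch(count, delay):
    """Run count fake requests in count threads; return elapsed seconds."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=count) as pool:
        list(pool.map(fake_io, [delay] * count))
    return time.monotonic() - start
```

Four 0.2-second waits finish in roughly 0.2 seconds rather than 0.8. Replacing the sleep with a pure-Python CPU-bound loop would be expected to lose that speedup, since only one thread can run Python bytecode at a time.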

Multiprocessing Ingenious hack to emulate thread API with multiple processes. Eww, pickles.

Do not trust the pickle
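
A harmless Python 3 demonstration of why: unpickling calls whatever callable the pickle names. The class `Sneaky` below is made up for this example and asks for `sorted`; a hostile pickle could just as easily ask for something destructive.

```python
import pickle


class Sneaky:
    def __reduce__(self):
        # Tells pickle: "to rebuild me, call sorted([3, 1, 2])".
        # The unpickler will call any importable callable named here.
        return (sorted, ([3, 1, 2],))


def roundtrip(obj):
    """Pickle obj and load it straight back."""
    return pickle.loads(pickle.dumps(obj))
```

`roundtrip(Sneaky())` does not return a `Sneaky` instance at all. It returns the result of the call the pickle requested, which is why data from an untrusted source should never be unpickled.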

    import html5lib
    import multiprocessing
    import requests
    import urlparse

    def process(work):
        while True:
            item = work.get()
            try:
                r = requests.get(item['url'])
                if r.status_code != 200:
                    print 'Broken link {source} -> {url}'.format(**item)
                    continue
                level = item['level'] - 1
                if level > 0:
                    if r.headers['content-type'].split(';', 1)[0] != 'text/html':
                        print 'Not html: {url}'.format(**item)
                        continue
                    doc = html5lib.parse(r.content, treebuilder='lxml')
                    links = doc.xpath( | namespaces=dict(html=' )
                    for link in links:
                        if link.startswith('#'):
                            continue
                        url = urlparse.urljoin(item['url'], link, allow_fragments=False)
                        parsed = urlparse.urlsplit(url)
                        if parsed.scheme not in ['http', 'https']:
                            print 'Not http: {url}'.format(url=url)
                            continue
                        work.put(
                            dict(
                                level=level,
                                url=url,
                                source=item['url'],
                                ),
                            )
            finally:
                work.task_done()

    def main():
        work = multiprocessing.JoinableQueue(100)
        work.put(
            dict(
                level=2,
                url=' source=None,
                ),
            )
        processes = [
            multiprocessing.Process(
                target=process,
                kwargs=dict(
                    work=work,
                    ),
                )
            for _ in xrange(10)
            ]
        for proc in processes:
            proc.start()
        work.join()
        for proc in processes:
            proc.terminate()
        for proc in processes:
            proc.join()

    if __name__ == '__main__':
        main()

Communicating Sequential Processes more about the thought process than the implementation
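
As a way of thinking, this can be sketched even with plain threads in Python 3: workers own no shared state and talk only through queues, so the code contains no explicit locks. The names `worker` and `square_all` are hypothetical, and `queue.Queue` here stands in for a CSP channel only loosely.

```python
import queue
import threading


def worker(inbox, outbox):
    """Square every number received until the None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:   # sentinel: no more work
            break
        outbox.put(item * item)


def square_all(numbers, workers=3):
    """Square a list of numbers using worker threads that only exchange messages."""
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for n in numbers:
        inbox.put(n)
    for _ in threads:
        inbox.put(None)    # one sentinel per worker
    for t in threads:
        t.join()
    # Results arrive in whatever order the workers finished, so sort them.
    return sorted(outbox.get() for _ in numbers)
```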

"Share memory by communicating" Radically less locking; radically less locking bugs.

Prime number sieve with CSP

Prime number sieve with Gevent

    from gevent import monkey; monkey.patch_all()
    import gevent
    import gevent.queue
    import itertools

    def generate(queue):
        """Send the sequence 2, 3, 4,... to queue."""
        for i in itertools.count(2):
            queue.put(i)

    def filter_primes(in_, out, prime):
        """Copy the values from in_ to out, removing those divisible by prime."""
        for i in in_:
            if i % prime != 0:
                out.put(i)

    def main():
        ch = gevent.queue.Queue(maxsize=10)
        gevent.spawn(generate, ch)
        for i in xrange(100):
            # Print the first hundred primes.
            prime = ch.get()
            print prime
            ch1 = gevent.queue.Queue(maxsize=10)
            gevent.spawn(filter_primes, ch, ch1, prime)
            ch = ch1

    if __name__ == '__main__':
        main()
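
The same chain-of-filters shape can be sketched in Python 3 with plain generators and no gevent. This is a swapped-in technique, not the talk's code: generators are pulled lazily by the consumer, whereas the greenlets above push values through bounded queues. `first_primes` is a hypothetical helper.

```python
import itertools


def generate():
    """Yield the sequence 2, 3, 4, ..."""
    return itertools.count(2)


def filter_primes(in_, prime):
    """Yield the values from in_, skipping those divisible by prime."""
    return (i for i in in_ if i % prime != 0)


def first_primes(n):
    """Return the first n primes by stacking one filter per prime found."""
    primes = []
    ch = generate()
    for _ in range(n):
        prime = next(ch)               # head of the chain is always prime
        primes.append(prime)
        ch = filter_primes(ch, prime)  # add a new stage to the chain
    return primes
```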

Look ma, no threads!

monkey.patch_all()

Coroutines

    def foo():
        a = yield
        b = yield
        yield a+b

    f = foo()
    next(f)
    f.send(1)
    print f.send(2)

Coroutines #2

    def ticktock():
        messages = ['tick', 'tock']
        while True:
            cur = messages.pop(0)
            new = yield cur
            if new is not None:
                print '# Okay, {new} not {cur}.' \
                    .format(new=new, cur=cur)
                cur = new
            messages.append(cur)

    t = ticktock()
    print next(t)  # tick
    print next(t)  # tock
    print next(t)  # tick
    t.send('bip')  # Okay, bip not tick.
    print next(t)  # bip
    print next(t)  # tock
    print next(t)  # bip
    print next(t)  # tock
    t.send('bop')  # Okay, bop not tock.
    print next(t)  # bop
    print next(t)  # bip
    print next(t)  # bop
    print next(t)  # bip
    print next(t)  # bop
    print next(t)  # bip

Laziness without hubris

    with open(path) as f:
        words_on_lines = (line.split() for line in f)
        words = itertools.chain.from_iterable(words_on_lines)
        lengths = (len(word) for word in words)
        (a, b) = itertools.tee(lengths)
        count = sum(1 for _ in a)
        summed = sum(b)
        avg = summed / count

Go watch Jesse Noller's PyCon talks

Thank you! Questions?
Image credits:
self-portrait of anonymous macaque 83/monkey-business-can-monkey-license-its-copyrights-to-news-agency.shtml
fair use from wiki/Gil_Grissom
<3 CC photos/bitzcelt/ /
<3 CC photos/pedro mourapinheiro/ /
fair use from com/