Barry Wimlett Technical Specialist BSc MBCS blackmarble.com.

Presentation transcript:

Barry Wimlett Technical Specialist BSc MBCS blackmarble.com

Concurrent Programming: A lap around what’s new in Visual Studio 2010 and .NET 4.0

Moore’s Law April
Transistor count and computing power double every 2* years for the same cost.
Law #2: Manufacturing plant costs double at the same time.
ftp://download.intel.com/museum/Moores_Law/Articles-Press_Releases/Gordon_Moore_1965_Article.pdf
* originally 18 months

Some Data (table: Year, Processor, Transistor Count, MHz)
Processors listed, in order: SX, DX, P60, PPro, K6-200, Pentium 2, Athlon, Pentium 3, P4, Athlon TBird, Athlon 64, Athlon 64 X2, Phenom, Phenom II
Last row: Phenom II, 758,000,000 transistors, 2,800 MHz

Transistor Counts & Clock Speeds

Transistor Count and Clock Speeds – Log10

Wintel - “Grove giveth, and Gates taketh away.”
Lazy programmers’ dream: “The Honeymoon”

Moore’s Law is NOT dead.
moores-law-is-dead law-is-not-dead-its-merely-pining-for-the-fjords.ars
Over the last few years clock speeds have flattened out at approx. 3 GHz, limited by silicon technology.
Transistor counts continue to increase, but...
More cores, not faster processors.

Problems With More Cores
It all goes wrong once you leave the package.
“The Flat”: a metaphor for shared resources.
Cooperation is required, just like on development projects: blindly throwing more processors at a problem will not necessarily increase performance, and the increase is rarely linear.
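
The "rarely linear" point is commonly quantified with Amdahl's law. The slides do not name it, so treat this as an illustrative aside: assume a fraction p of the work can run in parallel and the rest is serial, and ignore contention for shared resources (which only makes matters worse). The best-case speedup on N cores is then:

```latex
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}

% Worked example with assumed values p = 0.9, N = 8:
S(8) = \frac{1}{0.1 + 0.9/8} = \frac{1}{0.2125} \approx 4.7
```

With 90% of the work parallelisable, eight cores give under a 5x speedup, and no number of cores can give more than 10x.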

Single Core (diagram: one CPU Core with its Cache, plus Memory and I/O, on a shared Data/Address Bus)

Multi Core (diagram: several CPU Cores, each with its own Cache, sharing a further Cache, Memory and I/O over the Data/Address Bus)

ManyCore (diagram: many CPU Cores, each with its own Cache, sharing Memory and I/O over the Data/Address Bus)

Windows 7/Server 2008
Kernel and I/O contention
– Mention Windows 7 kernel
– Kernel_Engineers_Talk_About_Windows_7_s_Kernel

For Windows 7, Microsoft removed several locks that seriously hindered performance - all without breaking a single application. The global dispatcher lock, for instance, is gone completely, and replaced by fine-grained locking which provides 11 types of more specific locks as well as rules on how locks can be obtained so that you no longer run into deadlocks. The pre-7 dispatcher spent 15% of the CPU time waiting to acquire contended locks. "If you think about it, 15% of the time on a 128-processor system is, more than 15 of these CPUs are pretty much full-time just waiting to acquire contended locks. So we're not getting the most out of this hardware," kernel engineer Arun Kishan explained.

That has obviously changed in modern times, and in Vista, this architecture simply gave in. The statistic Wang gave during the talk was pretty... Disconcerting. "As you went to 128 processors, SQL Server itself had an 88% PFN lock contention rate. Meaning, nearly one out of every two times it tried to get a lock, it had to spin to wait for it... Which is pretty high, and would only get worse as time went on." The more fine-grained approach in Windows 7 and Windows Server 2008R2 yields some serious performance improvements: on 32-processor configurations, some operations in SQL and other applications perform 15 times faster than on Vista. And remember, the new fine-grained method has been implemented without any application breakage.

Multi-threading
Programming tasks to run in parallel, for compute-intensive work.

Async Programming
BeginFirstThing..... Do Something Else... FinishSecondThing()
Allows the processor to “do something more useful” while waiting for disk, network or other I/O.
Helps keep the UI responsive in Windows apps, by allowing the UI to execute while I/O is done in the background.
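
The Begin / do something else / Finish shape above can be sketched with the Begin/End pattern that .NET streams expose. This is a minimal illustration, not the speaker's demo code: the class name `AsyncReadSketch`, the method `ReadWhileWorking` and the `doSomethingElse` callback are hypothetical names chosen for the example, and the buffer size and UTF-8 decoding are assumptions.

```csharp
using System;
using System.IO;
using System.Text;

public static class AsyncReadSketch
{
    // Starts a read, runs other work while it is pending, then collects the result.
    // Reads at most one 4096-byte buffer; a real reader would loop until EndRead returns 0.
    public static string ReadWhileWorking(Stream source, Action doSomethingElse)
    {
        var buffer = new byte[4096];

        // "Begin": kick off the I/O without waiting for it.
        IAsyncResult pending = source.BeginRead(buffer, 0, buffer.Length, null, null);

        // "Do something else" while the read is in flight.
        doSomethingElse();

        // "Finish": EndRead blocks only if the read has not completed yet.
        int bytesRead = source.EndRead(pending);
        return Encoding.UTF8.GetString(buffer, 0, bytesRead);
    }
}
```

In a UI application the blocking `EndRead` call would normally move into a completion callback, so that the UI thread never waits.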

Problems
Atomic data access operations, Locking, Deadlocks
Race conditions
Order of execution
All subtle, infrequent, and difficult to detect and replicate – attaching a debugger affects how the software behaves
Not very unit testable.
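
A small sketch of the race-condition problem and two standard fixes. The class and method names (`CounterSketch`, `CountUnsafe` and so on) are hypothetical; `Parallel.For`, `Interlocked.Increment` and `lock` are standard .NET 4 facilities.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class CounterSketch
{
    // Racy: count++ is a read-modify-write, so concurrent increments can be lost.
    // The result is at most 'iterations' and can vary from run to run.
    public static int CountUnsafe(int iterations)
    {
        int count = 0;
        Parallel.For(0, iterations, i => { count++; });
        return count;
    }

    // Fix 1: an atomic increment.
    public static int CountWithInterlocked(int iterations)
    {
        int count = 0;
        Parallel.For(0, iterations, i => { Interlocked.Increment(ref count); });
        return count;
    }

    // Fix 2: a lock around the shared state (simpler to extend, but more contention).
    public static int CountWithLock(int iterations)
    {
        int count = 0;
        object gate = new object();
        Parallel.For(0, iterations, i => { lock (gate) { count++; } });
        return count;
    }
}
```

The racy version may well return the right answer on a given run, which is exactly why these bugs are hard to detect and replicate.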

User as a bottleneck
User does not scale out
Office apps and other heavily interactive software are limited more by the user than by the processor.
– Think about Excel™ and recalculation of spreadsheets
“JFDI” versus “thinking about it too hard”: a real-time scheduling type problem

The future
New languages: F#, Axum
Probably as big a change as the shift from assembler to ‘C’
Task orientated
Less imperative: what you want doing, not how you want it doing.
– Making a cup of tea.

Task-Orientated
Focus on what we want to achieve, not how to do it.
Read-only values where possible (immutability, less conflict)
Using “workflow” and “workflow like” programming (Azure) to scale out
– PDC’08
– PDC’09

New for .NET 4 for NOW
ThreadPool.QueueUserWorkItem is dead, long live Tasks
Work-stealing queues in the ThreadPool
AsParallel and P/LINQ
Collections/Bags/Lists/Queues/Locks
Rx Extensions - IObservable
ParallelExtensionsExtras

Parallel Computing and .NET 4 (platform diagram)
Managed Languages: Axum, Visual F#
Managed Libraries: DryadLINQ, Parallel LINQ, Rx, Task Parallel Library, Data Structures
Native Libraries: Async Agents Library, Parallel Pattern Library
Tools: Visual Studio 2010 (Parallel Debugger Windows, Profiler Concurrency Analysis); Microsoft Research (Race Detection, Fuzzing)
Managed Concurrency Runtime: ThreadPool
Native Concurrency Runtime: Task Scheduler, Resource Manager
Operating System: Threads, UMS Threads; HPC Server
Key: Research / Incubation; Visual Studio 2010 / .NET 4; Windows 7 / Server 2008 R2

ThreadPool in .NET 3.5 (diagram: the Program Thread puts Items 1 to 6 on a single Global Queue, which Worker Thread 1 and the other workers drain)

ThreadPool in .NET 4 (diagram: the Program Thread puts Tasks 1 to 6 on a Lock-Free Global Queue; Worker Thread 1 to Worker Thread p each have a Local Work-Stealing Queue)

Demo Time “All Hail Murphy!”* * An appeasement for the mischievous demo gods.

Easy wins with P/LINQ
Uses TPL
ParallelQuery<T> (IParallelEnumerable<T> in the early CTPs)
AsParallel()
Migration to LINQ is a good first step to parallelisation
Also Parallel.ForEach
Choose carefully for best performance, but either is probably better than the alternatives.
Lots of knobs
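
A minimal sketch of the "easy win" and of two of the knobs. The helper `Square` is a hypothetical stand-in for an expensive computation and the degree of parallelism of 4 is an arbitrary choice; `AsParallel`, `AsOrdered` and `WithDegreeOfParallelism` are standard PLINQ operators in .NET 4.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PlinqSketch
{
    // Hypothetical stand-in for a CPU-bound computation.
    private static long Square(int n) { return (long)n * n; }

    // AsOrdered keeps results in source order; WithDegreeOfParallelism caps the workers used.
    public static List<long> SquaresInOrder(IEnumerable<int> source)
    {
        return source
            .AsParallel()
            .AsOrdered()
            .WithDegreeOfParallelism(4)
            .Select(n => Square(n))
            .ToList();
    }

    // An order-insensitive aggregate needs no AsOrdered, which is usually cheaper.
    public static long SumOfSquares(IEnumerable<int> source)
    {
        return source.AsParallel().Select(n => Square(n)).Sum();
    }
}
```

For work as cheap as squaring an integer the sequential LINQ query will usually be faster; the parallel version pays off only when each element is expensive to process.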

public static IEnumerable<T> Zipping<T>(IEnumerable<T> a, IEnumerable<T> b)
{
    // Run both expensive projections in parallel, keeping source order so the pairs line up.
    return a.AsParallel().AsOrdered()
            .Select(element => ExpensiveComputation(element))
            .Zip(
                b.AsParallel().AsOrdered()
                 .Select(element => DifferentExpensiveComputation(element)),
                (a_element, b_element) => Combine(a_element, b_element));
}

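public static IEnumerable<T> Zipping<T>(IEnumerable<T> a, IEnumerable<T> b)
{
    // Materialise both inputs once so that counting and indexing are cheap.
    var aItems = a.ToList();
    var bItems = b.ToList();
    var numElements = Math.Min(aItems.Count, bItems.Count);
    var result = new T[numElements];
    // Visit only the first numElements items of a, so index never runs past b or result.
    Parallel.ForEach(aItems.Take(numElements), (element, loopState, index) =>
    {
        var a_element = ExpensiveComputation(element);
        var b_element = DifferentExpensiveComputation(bItems[(int)index]);
        result[index] = Combine(a_element, b_element);
    });
    return result;
}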

TPL - Task is Your New Best Friend
ThreadPool.QueueUserWorkItem
– Great for fire-and-forget
– But what about…
Waiting, Canceling, Continuing, Composing, Exceptions, Dataflow, Debugging, …
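
A short sketch of three of the things Task adds over fire-and-forget work items: waiting and continuing, exceptions, and cancelling. The class `TaskSketch` and its method names are hypothetical; every Task call used here is part of the .NET 4 Task Parallel Library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskSketch
{
    // Waiting and continuing: the continuation runs once the first task has finished.
    public static int ComputeThenDouble(int input)
    {
        Task<int> first = Task.Factory.StartNew(() => input + 1);
        Task<int> second = first.ContinueWith(t => t.Result * 2);
        return second.Result; // Result blocks until the whole chain has completed.
    }

    // Exceptions: a failure is rethrown, wrapped in AggregateException, when the task is waited on.
    public static string RunAndReportFailure()
    {
        Action work = () => { throw new InvalidOperationException("boom"); };
        Task failing = Task.Factory.StartNew(work);
        try
        {
            failing.Wait();
            return "no error";
        }
        catch (AggregateException ex)
        {
            return ex.InnerException.Message;
        }
    }

    // Cancelling: a task whose token is already cancelled never runs and ends up Canceled.
    public static bool WasCancelled()
    {
        var cts = new CancellationTokenSource();
        cts.Cancel();
        Task task = Task.Factory.StartNew(() => { }, cts.Token);
        try { task.Wait(); }
        catch (AggregateException) { /* wraps a TaskCanceledException */ }
        return task.IsCanceled;
    }
}
```

None of these three is available from `ThreadPool.QueueUserWorkItem` without hand-written signalling code.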

Demo Time

IObserver<T>, IObservable<T>
Part of Rx, the Reactive Framework
The dual of IEnumerable<T>, i.e. push versus pull
Good for replacing events and for asynchronous I/O programming
Used in the Silverlight Toolkit for unit testing
See “BurningMonk”’s article on Drag and Drop and IObservable.
cf. ProducerConsumer
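
To make the push model concrete, here is a minimal sketch built only on the `IObservable<T>` and `IObserver<T>` interfaces that ship in .NET 4, without the Rx library itself. `SimpleSubject<T>` and `CollectingObserver<T>` are hypothetical names for this illustration, and the sketch is deliberately not thread-safe; Rx supplies hardened equivalents plus the LINQ operators.

```csharp
using System;
using System.Collections.Generic;

// A source that pushes values to whoever has subscribed.
public sealed class SimpleSubject<T> : IObservable<T>
{
    private readonly List<IObserver<T>> observers = new List<IObserver<T>>();

    public IDisposable Subscribe(IObserver<T> observer)
    {
        observers.Add(observer);
        return new Unsubscriber(observers, observer);
    }

    // Push: the source decides when the consumer sees a value.
    public void Publish(T value)
    {
        foreach (var observer in observers.ToArray()) observer.OnNext(value);
    }

    public void Complete()
    {
        foreach (var observer in observers.ToArray()) observer.OnCompleted();
        observers.Clear();
    }

    // Disposing the subscription is the counterpart of removing an event handler.
    private sealed class Unsubscriber : IDisposable
    {
        private readonly List<IObserver<T>> observers;
        private readonly IObserver<T> observer;

        public Unsubscriber(List<IObserver<T>> observers, IObserver<T> observer)
        {
            this.observers = observers;
            this.observer = observer;
        }

        public void Dispose() { observers.Remove(observer); }
    }
}

// An observer that simply records what it is sent.
public sealed class CollectingObserver<T> : IObserver<T>
{
    public readonly List<T> Received = new List<T>();
    public bool Completed { get; private set; }

    public void OnNext(T value) { Received.Add(value); }
    public void OnError(Exception error) { }
    public void OnCompleted() { Completed = true; }
}
```

With `IEnumerable<T>` the consumer pulls by calling `MoveNext`; here the producer calls `OnNext`, which is the duality the slide refers to.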

Code Show and Tell
GPS Example from MSDN

New Sync Primitives in .NET 4
Thread-safe, scalable collections
– IProducerConsumerCollection<T>: ConcurrentQueue<T>, ConcurrentStack<T>, ConcurrentBag<T>
– ConcurrentDictionary<TKey,TValue>
Phases and work exchange
– Barrier
– BlockingCollection<T>
– CountdownEvent
Partitioning
– {Orderable}Partitioner<T>, Partitioner.Create
Exception handling
– AggregateException
Initialization
– Lazy<T>, LazyInitializer.EnsureInitialized<T>
– ThreadLocal<T>
Locks
– ManualResetEventSlim
– SemaphoreSlim
– SpinLock
– SpinWait
Cancellation
– CancellationToken{Source}
Public, and used throughout PLINQ and TPL
Address many of today’s core concurrency issues
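
As one example from the list above, a sketch of `ConcurrentDictionary<TKey,TValue>` used from a parallel loop without any explicit lock. The class `ConcurrentCollectionsSketch` and the word-counting task are assumptions made for illustration.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ConcurrentCollectionsSketch
{
    // Counts occurrences of each word; many threads update the dictionary at once.
    public static ConcurrentDictionary<string, int> CountWords(IEnumerable<string> words)
    {
        var counts = new ConcurrentDictionary<string, int>();
        Parallel.ForEach(words, word =>
        {
            // Adds 1 for a new key, otherwise applies the update function.
            // The update function may be retried under contention, so it must be side-effect free.
            counts.AddOrUpdate(word, 1, (key, current) => current + 1);
        });
        return counts;
    }
}
```

Doing the same with a plain `Dictionary<string,int>` would need a lock around every access, which is the contention these collections are designed to reduce.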

Parallel Extensions Extras
Useful, but either too specific or not mature enough to properly enter the framework.
Built on top of .NET 4.0 objects
Not fully tested and being augmented continually; feedback welcome.
LINQ to Tasks
Task.ToObservable
Additional Task extension methods
BlockingCollectionExtensions
StaTaskScheduler
ConcurrentExclusiveInterleave
Additional TaskSchedulers
ReductionVariable
ObjectPool
Pipeline
ParallelDynamicInvoke
AsyncCache
More to come...

LINQ for Tasks
GOAL:
    Task<string> result =
        from x in Task.Factory.StartNew(() => ProduceInt())
        from y in Task.Factory.StartNew(() => Process(x))
        select y.ToString();
The LinqToTasks.cs file in ParallelExtensionsExtras provides a set of more complete implementations, covering Select, SelectMany, Where, Join, GroupJoin, GroupBy, OrderBy, and more.
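
The query syntax in the goal compiles as soon as a suitable `SelectMany` extension method exists for `Task<T>`. The following is a simplified sketch of how such an operator could be written. It is not the LinqToTasks.cs implementation, and `TaskLinqSketch`, `TaskLinqDemo` and the bodies given to `ProduceInt` and `Process` are assumptions for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskLinqSketch
{
    // Lets the compiler translate "from x in t1 from y in t2(x) select f(x, y)" for tasks.
    public static Task<TResult> SelectMany<TSource, TMiddle, TResult>(
        this Task<TSource> source,
        Func<TSource, Task<TMiddle>> next,
        Func<TSource, TMiddle, TResult> project)
    {
        // Chain the second task onto the first, then flatten Task<Task<TResult>> with Unwrap.
        return source.ContinueWith(first =>
                next(first.Result).ContinueWith(second =>
                    project(first.Result, second.Result)))
            .Unwrap();
    }
}

public static class TaskLinqDemo
{
    // Hypothetical stand-ins for the ProduceInt and Process methods named on the slide.
    private static int ProduceInt() { return 20; }
    private static int Process(int x) { return x * 2 + 2; }

    public static Task<string> Run()
    {
        return from x in Task.Factory.StartNew(() => ProduceInt())
               from y in Task.Factory.StartNew(() => Process(x))
               select y.ToString();
    }
}
```

No thread blocks while the first task runs: the second task is only started from the continuation, once `x` is available.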

Pipeline
Stage 1, Stage 2, Stage 3
    Pipeline.Create(rawChunk => Compress(rawChunk))
            .Next(compressedChunk => Encrypt(compressedChunk));
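
The idea behind a pipeline can be sketched with nothing more than `BlockingCollection<T>` and two tasks. This is an assumption-laden illustration of the pattern, not the `Pipeline` class from ParallelExtensionsExtras: `PipelineSketch.Run`, the two-stage limit and the hand-off capacity of 8 are all choices made for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PipelineSketch
{
    // Runs stage1 and stage2 concurrently, connected by a bounded hand-off queue.
    public static List<TOut> Run<TIn, TMid, TOut>(
        IEnumerable<TIn> input, Func<TIn, TMid> stage1, Func<TMid, TOut> stage2)
    {
        // The bound stops a fast first stage from running arbitrarily far ahead.
        var handoff = new BlockingCollection<TMid>(8);
        var results = new List<TOut>();

        Task first = Task.Factory.StartNew(() =>
        {
            try
            {
                foreach (var item in input) handoff.Add(stage1(item));
            }
            finally
            {
                // Always signal the end, so that the second stage cannot wait forever.
                handoff.CompleteAdding();
            }
        }, TaskCreationOptions.LongRunning);

        Task second = Task.Factory.StartNew(() =>
        {
            // Blocks while the queue is empty and ends once CompleteAdding has been called.
            foreach (var mid in handoff.GetConsumingEnumerable()) results.Add(stage2(mid));
        }, TaskCreationOptions.LongRunning);

        Task.WaitAll(first, second);
        return results;
    }
}
```

With one task per stage and a FIFO hand-off, output order matches input order; while stage 2 handles chunk n, stage 1 is already working on chunk n+1.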

Visual Studio Some of the new tooling.

Summary and Links
Understand the Impact of Low-Lock Techniques in Multithreaded Apps.
Key Links
– Parallel Computing Dev Center
– Code samples
– Blogs (Managed, Tools)
– Forums
blackmarble.com