Visual Studio 2010 和 .NET Framework 4 培训研讨会
面向托管开发人员 的并行计算 ESTIMATED TIME: 60 minutes 姓名 职务 组织 电子邮件
目标 理解“并行计算转型”的重要性 理解为了帮助这种转型而在 .NET Framework 4 中引入的技术 MESSAGING: Objectives for the session.
“我一直以为数字空间还需要五十年。我认为需要五十年的事情仅十年就实现了,而认为需要十年的事情 “我一直以为数字空间还需要五十年。我认为需要五十年的事情仅十年就实现了,而认为需要十年的事情......已经实现了。我现在对此没有想法了。” - Bruce Sterling MESSAGING: It’s amazing how fast technology can sneak up on us. It’s important to remember that what we think is still years away from us is something that is actually already here and that we have to starting thinking about it now.
Baby Names DEMO: Use BabyNames
“根据摩尔定律,我们应该轻松地在下一个十年甚至更短的时间内,在主流处理器上打到 80 分。” - Justin Rattner,CTO,Intel MESSAGING: Now this has a great impact on the capabilities of our applications. Thanks to the ever-increasing power of our computers, computing power has been on a steady incline. And for the most part, our applications have been able to track this computing power increase merely by relying on the fact that faster computers will be released that will, in turn, make their own applications faster (just by running on the faster hardware). In this world, sequential programming wasn’t a problem because the ever-increasing computer power allowed our sequential programs to keep on running faster and faster as new hardware was released. However, along this computing power trend, a shift was made to CPUs containing multiple cores. Computers now contained multiple cores, where each single core was slightly less powerful than a complete processor was before. And while it would be nice to think our applications could continue to follow the computing power trend just as they did before, that’s not reality. The reality is that if we continue to use the same programming models as we use today, our performance will plateau. Not only will it plateau, there is a possibility that performance will decrease a bit as many cores are introduced that each are just a little less powerful than their predecessors (for less power consumption, less heat dissapation, etc.). Cleary, new tools for us developers to write applications with are necessary.
并行计算计划 “大众的并发” 让最聪明的开发人员去解决业务问题,而不是并发问题。 MESSAGING: Today, concurrency usually has to be done by the brightest developers in a business. The problem with this is that the brightest developers are then being consumed by concurrency minutia rather than being able to be focused on core business problems and helping the business’s bottom line. By providing new libraries and tools that make it easy to write parallel code, we hope to allow the best and brightest developers focus on the business problems at hand. This in turn enables other developers to be able to address concurrency and parallelism. This is the long term goal of Microsoft with the Parallel Computing Initiative. Visual Studio 2010 and .NET Framework 4 is merely the first step being taken in this direction. “大众的并发”
并发蓝图 对于 Visual Studio 2010 和 .NET Framework 4…… 并行 LINQ (PLINQ) 任务并行库 MESSAGING: There are four major improvement areas in Visual Studio 2010 and .NET Framework 4 that we will discuss in this talk: new Concurrency primitives, a new set of concurrent collections/containers, the Task Parallel Library, and Parallel LINQ (PLINQ). 并发集合 原语
任务并行库是一个公共类型和 API 集合,可以简化向应用程序添加并行和并发的流程,提高开发人员的生产力。 从线程 到任务 任务并行库是一个公共类型和 API 集合,可以简化向应用程序添加并行和并发的流程,提高开发人员的生产力。 MESSAGING: Asd “关注内容而不是方式”
从线程到任务 DEMO: Use TaskDemo
实际操作中的 “Work Stealing” MESSAGING: The task scheduler in Parallel Extensions is very smart and powerful. There are two aspects that make it so: the ability for different worker threads to “steal work” from other threads when there is work to be done, and the addressing of data locality when a thread is determining what work to actually do. When a thread needs to find more work to do, it looks in three different places (in the following order): It’s local queue The global queue Other thread’s queues that have work to be done It’s important to know that data locality is addressed when a thread needs to grab more work to do: When pulling from the local queue, the worker pulls the last item added to the queue. This is because the last item added to the queue has the greatest chance of still being fresh in the cache, as opposed to older items which are more likely have expired and been flushed from cache. When pulling from another worker’s queue, the current worker pulls the oldest item added to the queue. This is because the oldest item added to the queue has the greatest chance of not being in the cache even for its local worker (and hence needing to load in anyways). 员工线程 1 员工线程 p 程序线程 任务 3 Task 1 任务 4 任务 5 任务 2
平行静态类 如果程序内容是独立的…… ……它们可以是平行的 StatementA(); StatementB(); StatementC(); MESSAGING: With the introduction of the new Parallel static class, it becomes much easier to parallelize independent statements. The great thing is that if program statements are completely independent from each other (i.e. they don’t share any common shared stated between the various calls are require the calls to take place in a given order), they can easily be parallelized using the new Parallel.Invoke() method. There are other static methods declared off the Parallel class that make other parallel operations easy as well (like Parallel.For() and Parallel.ForEach(). We’ll take a look at some of these in the demo. So one might ask: “Well, if it’s that simple, why doesn’t the compiler just do this for automatically?” The problem is that it’s not quite that simple. In our world of imperative code and shared state, there’s no easy way to guarantee that these methods being called don’t have some sort of state somewhere that they are sharing. It could be five method calls deep, it could be ten, it could be more. You, as the developer, are the one with the necessary knowledge to know where this is true. Therefore, Parallel.Invoke() is strictly an opt-in process. Otherwise, we could do more damage than good. Parallel.Invoke( () => StatementA(), () => StatementB(), () => StatementC() );
平行静态类 DEMO: Use ParallelDemo
PLINQ 并行 LINQ (PLINQ) 使开发人员能够轻松利用manycore,而对现有 LINQ 编程模型的影响最小 var q = from p in people where p.Name == queryInfo.Name && p.State == queryInfo.State && p.Year >= yearStart && p.Year <= yearEnd orderby p.Year ascending select p; .AsParallel() MESSAGING: PLINQ is a technology that allows developers to _easily_ leverage manycore. The great thing about PLINQ is that if you are using LINQ-to-objects, there is a very minimal impact to your code in order for it to use PLINQ. All it takes to use PLINQ is adding “.AsParallel()” to your query. This will turn the query into a PLINQ query and will use the PLINQ execution engine when executed. One small change, and your code now takes advantage of all the hardware available to you. NOTES: AsParallel() works by returning a ParallelQuery<T> so every subsequent query operator works against the new ParallelQuery rather than the normal IEnumerable. See the hands-on lab for Parallel Extensions for more details.
PLINQ DEMO: Use PLINQDemo
障碍 障碍 “所有的车都向西雅图进发!我们在加油站见并从那里出发。” - Charlie Dennis Mac Charlie 加油站 MESSAGING: The new Barrier synchronization primitive allows you to provide “sync-points” where all threads/tasks of execution meet at. Then, no threads/tasks continue until all of them have arrived at the barrier. This is a way that now you can easily introduce “check points” into your parallel operations if there are naturally areas where you want different execution units to meet. A Barrier can be re-used multiple times to issue many “check points” that could occur in the code. As an example, think about a road trip. If some friends are taking a road trip to Seattle, for instance, they might all meet at the gas station first. They shouldn’t leave the gas station until all of them arrive so they can leave together. In this case, the gas station itself could represent the barrier. There is a natural “check point” that occurs before all of them leave and start driving again to Seattle. Charlie 加油站 西雅图
统一取消 取消令牌源 “先生,您的房间已经准备好了……” - 女招待 取消令牌 MESSAGING: “先生,您的房间已经准备好了……” - 女招待 取消令牌源 MESSAGING: A new unified cancellation model is being introduced with .NET Framework 4. This new model allows many different operations (like separate threads of execution, I/O calls, etc.) to be organized in such a way that they can all be cancelled with one simple call in a unified way (no matter what the operation itself is). The new CancellationTokenSource and CancellationToken constructs is what the new unified cancellation model is comprised of. The CancellationTokenSource is the core of this model. From a CancellationTokenSource, you can pass out as many CancellationTokens as you wish to various types of operations that are or are not happening in parallel. Then you can use the CancellationTokenSource to signify that everything needs to be cancelled by calling a Cancel() method. Then the cancellation is reflected in all the various tokens. One key aspect of this new model is that the individual tokens that are passed out have no way to signify cancellation themselves. There is also no way to get from a reference to a CancellationToken back to the original CancellationTokenSource. This is by design. When you are passing out tokens to as many operations as you need to in order to keep your cancellation unified, the last thing you want to do is to allow some random token to be able to signify a Cancel() request and then cancel all the other processing that is happening. Think of this another way, compare cancellation tokens by the devices handed out at some restaurants to signify that you (as the customer) are ready to be seated. Though you have the device in your hand, there is no possible way for you personally to make it go off (that would kind of defeat the purpose). The only party that can signal that you are “ready to be seated” is the restaurant itself. This is the same relationship the CancellationTokenSource and CancellationToken have to each other. 取消令牌
System.Threading和新的统一取消模型 DEMO: Use BarrierDemo. Also use BarrierDemo to show the new unified cancellation model via CancellationToken/CancellationTokenSource
重述要点 对于 Visual Studio 2010 和 .NET Framework 4.0…… 并行LINQ (PLINQ) 任务并行库 MESSAGING: So to recap, parallel computing enhancements for managed developers in Visual Studio 2010 and .NET Framework 4 are comprised of four major improvement areas: new Concurrency primitives, a new set of concurrent collections/containers, the Task Parallel Library, and Parallel LINQ (PLINQ). 并发集合 原语
Questions?