Published by Chastity Russell. Modified over 9 years ago.
1
Data Parallel Application Development and Performance with Windows Azure
Advisor: Professor Gagan Agrawal
Presented by: Yu Zhang
2
Agenda
3
Motivation
4
Goals
5
The same facilities that a desktop OS provides, but on a set of connected servers:
  Abstract execution environment
  Shared file system
  Resource allocation
  Programming environments
Utility computing
  24/7 operation
  Pay for what you use
  Simpler, transparent administration
7
Windows Azure PaaS
  Applications: Windows Azure Service Model
  Runtimes: .NET 3.5/4, ASP.NET, PHP
  Operating System: Windows Server 2008/R2-Compatible OS
  Virtualization: Windows Azure Hypervisor
  Server: Microsoft Blades
  Database: SQL Azure
  Storage: Windows Azure Storage (Blob, Queue, Table)
  Networking: Windows Azure-Configured Networking
8
A Windows Azure application is called a “service”
  Definition information
  Configuration information
  At least one “role”
Service definition is in ServiceDefinition.csdef
  Defines aspects of a service that cannot be changed without redeployment
  Types of roles and static role configuration
  Set of configuration settings for a role
  Contract with the environment the code runs in
9
Service configuration is in ServiceConfiguration.cscfg
  Defines values for properties that can be dynamically updated for a running deployment
  Values of configuration parameters
  Number of running instances
10
Definition:
  Role name
  Role type
  VM size (e.g. small, medium, etc.)
  Network endpoints
Code:
  Web/Worker Role: hosted DLL and other executables
  VM Role: VHD
Configuration:
  Number of instances
  Number of update and fault domains
11
Desktop And Related Azure Concepts
12
[Figure labels: Public Internet, Load Balancer, Web Role, Storage Services]
13
[Figure labels: Web Role, Worker Role, Storage Service]
14
Windows Azure Storage Abstractions
16
Queue Usage Example
[Figure: producers P1 and P2, consumers C1 and C2, and a queue holding numbered messages 1 to 4]
17
Communicating sequential processes
  Each process runs in its own local address space.
  Processes exchange data and synchronize via message passing.
  (Usually, but not always, the same code is executed by all processes.)
  Locality must be managed to achieve good performance; message passing makes data movement explicit.
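The message passing model above can be illustrated with a minimal Python sketch. It is not from the slides: it uses threads and `queue.Queue` as a stand-in for separate processes and a message channel, and the names `worker` and `run` are hypothetical. Each worker reads only the data handed to it and shares its result solely by sending a message.

```python
import queue
import threading


def worker(rank, local_data, outbox):
    # Each worker computes on its own local data only ...
    partial = sum(local_data)
    # ... and publishes the result as a message instead of writing shared state.
    outbox.put((rank, partial))


def run(chunks):
    """Start one worker per chunk and collect one message from each."""
    inbox = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, chunk, inbox))
        for rank, chunk in enumerate(chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Messages may arrive in any order, so they are keyed by sender rank.
    return dict(inbox.get() for _ in chunks)
```

Keying the messages by rank keeps the result independent of arrival order, which is the usual concern once synchronization happens only through messages.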
18
Azure Parallel Programming Model
[Figure labels: LB, IIS, VMs, Web Role, Worker Role, Queue or WCF]
19
MPI_Reduce(inbuf, outbuf, count, type, op, root, comm)
  inbuf: address of input buffer
  outbuf: address of output buffer
  count: number of elements in input buffer
  type: datatype of input buffer elements
  op: operation
  root: process id of root process
  comm: communicator

Azure worker role code:
  public class WorkerRole : RoleEntryPoint
  {
      public override void Run()
      {
          DoWork();
          // Enqueue this role's result for the reducer.
          var msg = new CloudQueueMessage();
          queue.AddMessage(msg);
      }
  }
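As a rough illustration of what the reduce operation computes, here is a small Python model. It is a sketch of the semantics only, not MPI or Azure code; `reduce_to_root` is a hypothetical name, and each list in `inbufs` stands in for one process's input buffer. The operation is applied element by element across all buffers, and only the root ends up with an output buffer.

```python
from functools import reduce as fold
import operator


def reduce_to_root(inbufs, op, root):
    """Model of a reduce: inbufs[r] is the input buffer of rank r.

    Returns one output buffer per rank; only the root's is filled.
    """
    count = len(inbufs[0])
    if any(len(buf) != count for buf in inbufs):
        raise ValueError("all input buffers must hold the same number of elements")
    # Combine element i of every rank's buffer with the binary operation.
    combined = [fold(op, (buf[i] for buf in inbufs)) for i in range(count)]
    return [combined if rank == root else None for rank in range(len(inbufs))]


# Example: three ranks, element-wise sum delivered to rank 0.
outbufs = reduce_to_root([[1, 2], [3, 4], [5, 6]], operator.add, root=0)
```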
20
MPI_Allreduce(inbuf, outbuf, count, type, op, comm)
  inbuf: address of input buffer
  outbuf: address of output buffer
  count: number of elements in input buffer
  type: datatype of input buffer elements
  op: operation
  comm: communicator

Azure worker role code:
  public class WorkerRole : RoleEntryPoint
  {
      public override void Run()
      {
          if (queue.Exists())
          {
              var msg = queue.GetMessage();
              if (msg != null)
              {
                  DoWork();
                  // Delete from the same queue the message was read from.
                  queue.DeleteMessage(msg);
              }
          }
          DoWork();
          var result = new CloudQueueMessage();
          queue.AddMessage(result);
      }
  }
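One way to picture an all-reduce built on queues is "reduce at one role, then broadcast back". The Python sketch below is an assumption about how such a scheme could be organized, not the presenter's implementation; it uses in-memory `queue.Queue` objects in place of Azure queues, runs the phases sequentially, and `allreduce_via_queues` is a hypothetical name.

```python
import queue


def allreduce_via_queues(local_values, op):
    """Combine one value per worker and hand the result back to every worker."""
    if not local_values:
        return []
    to_root = queue.Queue()                            # workers -> reducing role
    to_worker = [queue.Queue() for _ in local_values]  # reducing role -> each worker

    # Phase 1: every worker enqueues its local value.
    for value in local_values:
        to_root.put(value)

    # Phase 2: the reducing role dequeues one message per worker and combines them.
    result = to_root.get()
    for _ in range(len(local_values) - 1):
        result = op(result, to_root.get())

    # Phase 3: broadcast, so that every worker receives the combined value.
    for q in to_worker:
        q.put(result)
    return [q.get() for q in to_worker]
```

Unlike a plain reduce, every participant ends up holding the combined value, at the cost of one extra message per worker.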
21
Each worker role reads matrix B.
Matrix A is split into n parts, where n is the number of worker roles.
Each worker role gets one part of matrix A. For an N×N matrix, each worker role therefore holds two data sets: matrix B and one part of matrix A, say A_k (1 ≤ k ≤ n).
Each worker role computes A_k × B and adds the result to its queue.
The web role performs the reduce operation and obtains the final result.
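The decomposition described above can be sketched in plain Python. The slides do not say how A is partitioned, so this sketch assumes contiguous row blocks; the function names are hypothetical, the workers run sequentially, and the "reduce" step is simply the concatenation of the partial products in worker order.

```python
def split_rows(a, n):
    """Split the rows of A into n contiguous blocks of near-equal size."""
    base, extra = divmod(len(a), n)
    blocks, start = [], 0
    for k in range(n):
        size = base + (1 if k < extra else 0)
        blocks.append(a[start:start + size])
        start += size
    return blocks


def worker_multiply(a_k, b):
    """What one worker role computes: its row block A_k times the full B."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a_k]


def parallel_matmul(a, b, n):
    # Each partial product would be placed on that worker's queue.
    partials = [worker_multiply(a_k, b) for a_k in split_rows(a, n)]
    # The web role stitches the row blocks together in worker order.
    return [row for block in partials for row in block]


product = parallel_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], n=2)
```

With row blocks, every worker needs all of B but only its share of A, which matches the two data sets per worker role described above.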
22
1. The web role calculates the initial means.
2. The web role broadcasts the k centroids to all worker roles.
3. Each worker role computes the distance from each local document vector to the centroids.
4. Each worker role assigns its points to the closest centroid and computes the local MSE (mean squared error).
5. A reduction produces the global centroids and the global MSE value.
6. The web role broadcasts the new centroids to all worker roles; steps 3 to 6 repeat until no points move.
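The steps above can be sketched as a small Python simulation. This is an illustrative sketch under assumptions, not the presenter's code: partitions are plain lists of points processed sequentially, the function names are hypothetical, each worker returns per-centroid sums and counts so the reduction can form global means, and the loop stops when the centroids no longer change (which is what happens once no points move).

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def worker_step(points, centroids):
    """Local phase on one worker: assign points, accumulate sums, counts, error."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    sq_err = 0.0
    for p in points:
        j = min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        counts[j] += 1
        sums[j] = [s + x for s, x in zip(sums[j], p)]
        sq_err += sq_dist(p, centroids[j])
    return sums, counts, sq_err


def kmeans(partitions, centroids, max_iters=100):
    """Assumes at least one point overall; returns (centroids, global MSE)."""
    total = sum(len(part) for part in partitions)
    mse = 0.0
    for _ in range(max_iters):
        results = [worker_step(part, centroids) for part in partitions]
        # Reduction: combine the per-worker sums and counts into global centroids.
        new_centroids = []
        for j, old in enumerate(centroids):
            count = sum(r[1][j] for r in results)
            if count == 0:
                new_centroids.append(old)  # keep a centroid that attracted no points
            else:
                new_centroids.append(
                    [sum(r[0][j][d] for r in results) / count for d in range(len(old))]
                )
        # MSE is measured against the centroids used for this assignment.
        mse = sum(r[2] for r in results) / total
        if new_centroids == centroids:
            break
        centroids = new_centroids  # "broadcast" for the next round
    return centroids, mse
```

Sending sums and counts rather than local means keeps the reduction exact when partitions have different sizes.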
23
1. The web role is the master; the N worker roles are slaves.
2. The master divides the training samples into N subsets and distributes one subset to each worker role.
3. Each worker role computes its distance measures independently and stores the computed measures in a local array.
4. When a worker role finishes its distance calculation, it sends a message to the web role indicating the end of processing.
5. The web role notes the end of processing for the sender and acquires the computed measures by reduction.
6. After the web role has collected the distance measures from all worker roles, the following steps are performed:
   Sort all distance measures in ascending order
   Select the top k measures
   Count the number of occurrences of each class among the top k measures
   Assign the input element to the class with the highest count among the top k measures
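A compact Python sketch of this master/worker kNN scheme follows. It is a sequential simulation under assumptions not stated on the slide: squared Euclidean distance as the distance measure, a round-robin split of the training samples, and hypothetical function names.

```python
from collections import Counter


def worker_distances(subset, query):
    """One worker's share: (squared distance, label) for each local sample."""
    return [
        (sum((a - b) ** 2 for a, b in zip(features, query)), label)
        for features, label in subset
    ]


def knn_classify(training, query, k, n_workers):
    # Master: divide the training samples into one subset per worker.
    subsets = [training[i::n_workers] for i in range(n_workers)]
    # Gather phase: collect every worker's distance measures.
    gathered = []
    for subset in subsets:
        gathered.extend(worker_distances(subset, query))
    # Sort ascending, keep the k nearest, and take the majority class.
    gathered.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in gathered[:k])
    return votes.most_common(1)[0][0]


samples = [([0, 0], "a"), ([0, 1], "a"), ([5, 5], "b"), ([6, 5], "b"), ([1, 0], "a")]
label = knn_classify(samples, [0.5, 0.5], k=3, n_workers=2)
```

Because only distance/label pairs travel back to the master, the training vectors themselves never leave the worker that holds them.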
24
What is Windows Communication Foundation (WCF)?
WCF is Microsoft’s implementation of industry standards for a communication subsystem that lets applications communicate across process boundaries on one machine or across multiple machines.
WCF is a core component of .NET Framework 3.0 and later versions, which is included with the Windows 7 and Vista platforms as well as the future version of Windows Server.
The WCF API unifies ASMX Web Services, .NET Remoting, distributed transactions and messaging into a single service-oriented programming model.
Fundamental to the .NET Framework.
[Figure labels: ASMX, WSE, .NET Remoting, COM+ (Enterprise Services), MSMQ, WCF]
25
WCF: Address, Binding, Contract
  Address: Where?
  Binding: How?
  Contract: What?
Client and Service exchange messages through endpoints; each endpoint is an Address, a Binding and a Contract (ABC).
WCF services are deployed, discovered and consumed as endpoints.
26
WCF : Endpoint
27
WCF in Azure
maxBufferSize="10485760" maxReceivedMessageSize="10485760"
28
Object-Oriented (1980s): Polymorphism, Encapsulation, Subclassing
Component-Based (1990s): Interface-based, Dynamic Loading, Runtime Metadata
Service-Oriented (2000s): Message-based, Schema+Contract, Binding via Policy
C & C++ with MPI, Queue with Azure, WCF with Azure
29
Experimental Evaluation

Matrix Multiplication, time (sec)
  Processors   MPI      Queue     WCF
  8            0.0993   8.8726    4.4533
  4            0.1656   13.9872   6.349
  2            0.4723   20.6536   11.5783

Kmeans, time (sec)
  Processors   MPI      Queue     WCF
  8            0.1023   2.8902    1.9234
  4            0.2512   4.1224    3.4267
  2            0.5420   7.6238    5.5263

KNN, time (sec)
  Processors   MPI      Queue     WCF
  8            0.4272   1.0623    0.8976
  4            1.2567   2.3457    1.5214
  2            2.0233   5.2356    4.1218

Queue Performance
  Fastest Read: 31 ms     Slowest Read: 203 ms
  Fastest Write: 31 ms    Slowest Write: 234 ms
  Fastest Delete: 0 ms    Slowest Delete: 593 ms
A queue is simply a reliable method of delivering messages between processes.
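To make the scaling behaviour easier to compare, the reported times can be turned into speedups relative to the 2-processor run. The sketch below is not part of the slides; it copies the numbers from the first MPI/Queue/WCF table on the evaluation slide, and the names `times` and `speedup` are hypothetical.

```python
# Seconds, keyed by method and then by processor count (first table on the slide).
times = {
    "MPI": {2: 0.4723, 4: 0.1656, 8: 0.0993},
    "Queue": {2: 20.6536, 4: 13.9872, 8: 8.8726},
    "WCF": {2: 11.5783, 4: 6.349, 8: 4.4533},
}


def speedup(series, base=2):
    """Speedup of each run relative to the run on `base` processors."""
    return {procs: series[base] / t for procs, t in sorted(series.items())}


for method, series in times.items():
    ratios = ", ".join(f"{p} procs: {s:.2f}x" for p, s in speedup(series).items())
    print(f"{method}: {ratios}")
```

Going from 2 to 8 processors, the MPI time in this table drops by a factor of roughly 4.8, while the queue and WCF versions drop by roughly 2.3 and 2.6.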
30
Azure vs. Traditional Cluster
           CPU       RAM    Bandwidth
  Glenn    2.7 GHz   8 GB   20 Gbps
  Azure    1.6 GHz   2 GB   10 Gbps
31
Conclusion