Windows HPC Server 2008 High Productivity Computing With Windows Neil Foster HPC Partner Mgr Microsoft
High Productivity for HPC Overview Windows HPC Server 2008 Partnerships Discussion Agenda
‘The purpose of computing is insight, not numbers.’ Richard Hamming The purpose of computing is?
Competitive Advantages Pressure to improve operational performance (cost, quality and time to market) Quality driven regulatory compliance Rapid cycles of product innovation HPC Drivers
“Make high-end computing easier and more productive to use. Emphasis should be placed on time to solution, the major metric of value to high-end computing users… A common software environment for scientific computation encompassing desktop to high-end systems will enhance productivity gains by promoting ease of use and manageability of systems.” High-End Computing Revitalization Task Force, 2004 (Office of Science and Technology Policy, Executive Office of the President) High integration pain Lack of seamless integration between workstations, clusters, data Lack of user workflow integration across applications and departments Isolated technology islands High manual touch Lack of end-to-end IT process integration Cannot leverage existing investments in broad IT skills and infrastructure Application availability Limited eco-system of parallel applications Lack of developer-friendly tools, difficult to program The Challenge: High Productivity Computing
Changing face of HPC Costs and pain points have moved: Manpower more expensive than hardware Software is more expensive than hardware Every system will be multi-core Power, Cooling, facilities much more expensive than hardware! NSF Grants only cover hardware costs
Current Issues HPC and IT data centers merging: isolated cluster management Developers can’t easily program for parallelism Users don’t have broad access to the increase in processing cores and data How can Microsoft help? Well positioned to mainstream integration of application parallelism Have already begun to enable parallelism broadly to the developer community Can expand the value of HPC by integrating productivity and management tools Microsoft Investments in HPC Comprehensive software portfolio: Client, Server, Management, Development, and Collaboration Dedicated teams focused on Cluster Computing Unified Parallel development through the Parallel Computing Initiative Partnerships with the Technical Computing Institutes Why Microsoft in HPC?
“Provide the platform, tools and broad ecosystem to reduce the complexity of HPC by making parallelism more accessible to address future computational needs.” Microsoft’s Vision for HPC Ease deployment for larger scale clusters Simplify management for clusters of all scale Integrate with existing infrastructure Enable non-technical users to harness the power of HPC Address emerging cross-industry computation trends Address needs of traditional supercomputing Increase number of parallel applications and codes Offer choice of parallel development tools, languages and libraries Drive larger universe of developers and ISVs
Integrated HPC Environment
Complete, integrated platform for computational clustering Built on top of the proven Windows Server 2008 platform Integrated development environment Windows Server Operating System Secure, Reliable, Tested Support for high performance hardware (x64, high-speed interconnects) HPC Pack Job Scheduler Resource Manager Cluster Management Message Passing Interface Microsoft Windows HPC Server 2008 Integrated Solution out-of-the-box Leverages investment in Windows administration and tools Makes cluster operation easy and secure as a single system Beta1 available from Windows HPC Server 2008
Systems Management: New System Center UI; PowerShell for CLI Management; High Availability for Head Nodes; Windows Deployment Services; Diagnostics/Reporting; Support for Operations Manager
Job Scheduling: Support for SOA and WCF; Granular resource scheduling; Improved scalability for larger clusters; New Job scheduling policies; Interoperability via HPC Profile
Networking & MPI: NetworkDirect (RDMA) for MPI; Improved Network Configuration Wizard; Shared Memory MS-MPI for multi-core; MS-MPI integrated with Windows Event Tracing
Storage: Improved iSCSI SAN Support in Win2008; Improved Server Message Block (SMB v2); New 3rd party parallel file system support for Windows; New Memory Cache Vendors
What’s New in the HPC Pack 2008
– Next generation of cluster services – Major improvement in configuration validation and management HPC Pack Includes – Setup integration with Failover Clustering Services: Head Node and Failover Node set up with SQL Failover Cluster; Job Scheduler services failover – Management console linked to Windows Server Failover Management console [Diagram labels: Shared Disk; Private Network; Head node (Win2008 Enterprise, Clustered SQL Server); Failover Head node (Win2008 Enterprise, Clustered SQL Server); Windows Failover Clustered] Eliminates single point of failure with support for high availability Requires Windows Server 2008 Enterprise Failover Clustering Services Head Node High Availability
Priorities – Comparable with hardware-optimized MPI stacks Focus on MPI-Only Solution for version 2 – Verbs-based design for close fit with native, high-perf networking interfaces – Coordinated w/ Win Networking team’s long-term plans Implementation – MS-MPIv2 capable of 4 networking paths: Shared Memory between processors on a motherboard; TCP/IP Stack (“normal” Ethernet); Winsock Direct (and SDP) for sockets-based RDMA; New RDMA networking interface – HPC team partners with networking IHVs to develop/distribute drivers for this new interface [Diagram: networking stack split into User Mode and Kernel Mode. Labels: MPI App; Socket-Based App; (ISV) App; MS-MPI; Windows Sockets (Winsock + WSD); WinSock Direct Provider; NetworkDirect Provider; User Mode Access Layer; TCP; IP; NDIS; Mini-port Driver; Hardware Driver; Networking Hardware. Paths: TCP/Ethernet Networking; Kernel By-Pass; RDMA Networking. Legend: OS Component; CCP Component; IHV Component] NetworkDirect A new RDMA networking interface built for speed and stability
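The four networking paths listed on this slide can be pictured as a per-connection choice. The sketch below is only an illustration in Python: the `Path` enum, the `choose_path` function, its parameters, and above all the preference order are assumptions made here and are not taken from MS-MPI.

```python
from enum import Enum


class Path(Enum):
    SHARED_MEMORY = "shared memory"
    NETWORK_DIRECT = "NetworkDirect (RDMA)"
    WINSOCK_DIRECT = "Winsock Direct"
    TCP = "TCP/IP"


def choose_path(same_node, has_networkdirect, has_winsock_direct):
    """Pick a transport for one pair of MPI ranks.

    Hypothetical helper: the preference order below is an assumption
    for illustration, not documented MS-MPI behaviour.
    """
    if same_node:
        return Path.SHARED_MEMORY      # ranks share a motherboard
    if has_networkdirect:
        return Path.NETWORK_DIRECT     # new RDMA interface, if a driver exists
    if has_winsock_direct:
        return Path.WINSOCK_DIRECT     # sockets-based RDMA
    return Path.TCP                    # "normal" Ethernet fallback


# Two ranks on different nodes, with only plain Ethernet available.
print(choose_path(same_node=False, has_networkdirect=False,
                  has_winsock_direct=False).value)
```

The point of the sketch is that the application code stays the same whichever path is chosen; only the transport underneath the MPI library changes.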
Support for larger clusters – Create new designs for clusters of size, including “heterogeneous” clusters – Scale deployment and administration technologies – Provide interfaces for those accustomed to *nix Improve interoperability with existing IT infrastructure – Interoperability with existing job schedulers – High speed file I/O through native support for parallel and clustered file systems Broader application support – Simplify the integration of new applications with the job scheduler – Addressing needs of in-house and open source developers Platform Support – Built for Windows Server 2008 – Cluster nodes with different hardware / software Job Scheduling
[Diagram: eight Service (DLL) boxes] Job Scheduler: Resource allocation, Process Launching, Resource usage tracking, Integrated MPI execution, Integrated Security. WCF Service Router: WS Virtual Endpoint Reference, Request load balancing, Integrated Service activation, Service life time management, Integrated WCF Tracing. V1 (focusing on batch jobs) + V2 (focusing on interactive jobs). Scenario: Broaden Application Support
Diagram labels: Private Network; Public Network; Highly Available Head Node; WCF Brokers; Head node; Failover Head node […]; Compute Nodes; Workstation. 1. User submits job. 2. Session Manager assigns WCF Broker node for client job. 3. HN provides WCF Broker node. 4. Client connects to Broker and submits requests. 5. Requests. 6. Responses. 7. Responses return to client. Service-Oriented Jobs
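The numbered flow on this slide can be walked through with a small in-process simulation. This is a sketch in Python, not WCF code: the `HeadNode`, `Broker` and `ComputeNode` classes, the round-robin dispatch, and the squaring "service" are all hypothetical stand-ins chosen for illustration.

```python
from itertools import cycle


class ComputeNode:
    def __init__(self, name):
        self.name = name

    def handle(self, request):
        # Stand-in for a service call; here the "service" squares a number.
        return {"node": self.name, "request": request, "result": request * request}


class Broker:
    """Hypothetical stand-in for a WCF broker node."""

    def __init__(self, name, nodes):
        self.name = name
        self._nodes = cycle(nodes)  # assumed round-robin load balancing

    def submit(self, requests):
        # Steps 5-7: forward each request to a compute node, return the responses.
        return [next(self._nodes).handle(r) for r in requests]


class HeadNode:
    """Hypothetical head node whose session manager hands out brokers."""

    def __init__(self, brokers):
        self._brokers = cycle(brokers)

    def create_session(self, user):
        # Steps 1-3: accept the job and tell the client which broker to use.
        return next(self._brokers)


nodes = [ComputeNode(f"cn{i}") for i in range(3)]
head = HeadNode([Broker("broker0", nodes)])

broker = head.create_session(user="alice")   # steps 1-3
responses = broker.submit([1, 2, 3, 4])      # steps 4-7
for response in responses:
    print(response)
```

After the session is created, the client talks only to the broker, which is why the head node drops out of the request path in steps 4 to 7.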
What is it? A draft OGSA (Open Grid Services Architecture) interoperability standard for batch job scheduler task submission and management Based on web services standards (HTTP, XML, SOAP) What is its value? Enables integration of HPC applications executing on different platforms and schedulers via web services standards What’s the Status? Passed the public comment period Working on new extensions Windows Cluster Window Center Windows Center LSF / PBS / SGE / Condor Linux, AIX, Solaris, HP-UX, Windows Interoperability & Open Grid Forum
Winter 2005, Microsoft, 4 procs, 9.46 GFlops; Spring 2006, NCSA, # cores, 4.1 TF; Spring 2007, Microsoft, # cores, 9 TF, 58.8%; Fall 2007, Microsoft, # cores, 11.8 TF, 77.1%; Spring 2008, Aachen, # cores, 18.8 TF, 76.5%; Spring 2008, Umea, # cores, 46 TF, 85.5%; Spring 2008, NCSA, # cores, 68.5 TF, 77.7%. Chart annotation: 30% efficiency improvement. Series labels: Windows Compute Cluster 2003, Windows HPC Server 2008
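The percentages on this slide can be checked with a little arithmetic. The Python sketch below assumes each percentage is efficiency in the usual benchmark sense, measured performance divided by theoretical peak, and that the "30% efficiency improvement" annotation compares the two Microsoft entries; neither reading is stated on the slide. The helper names are made up for this example.

```python
# Assumption: each percentage is efficiency = measured TFlops / theoretical peak TFlops.
RESULTS = [
    # (list edition, site, measured TFlops, efficiency as a fraction)
    ("Spring 2007", "Microsoft", 9.0, 0.588),
    ("Fall 2007", "Microsoft", 11.8, 0.771),
    ("Spring 2008", "Aachen", 18.8, 0.765),
    ("Spring 2008", "Umea", 46.0, 0.855),
    ("Spring 2008", "NCSA", 68.5, 0.777),
]


def implied_peak(measured_tf, efficiency):
    """Theoretical peak implied by a measured result and its efficiency."""
    return measured_tf / efficiency


def relative_gain(old_eff, new_eff):
    """Relative improvement of new_eff over old_eff, as a fraction."""
    return (new_eff - old_eff) / old_eff


for edition, site, tf, eff in RESULTS:
    peak = implied_peak(tf, eff)
    print(f"{edition:12} {site:10} {tf:5.1f} TF at {eff:.1%} -> peak ~{peak:.1f} TF")

gain = relative_gain(0.588, 0.771)
print(f"Microsoft, Spring 2007 -> Fall 2007: {gain:.1%} relative efficiency gain")
```

Under these assumptions the step from 58.8% to 77.1% is a relative gain of about 31%, which is consistent with the rounded "30%" on the chart.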
“The Umeå cluster with Windows HPC Server 2008 performed well enough to become the fastest academic system in Sweden. We are very happy with that result.” -- Bo Kågström, Professor and Director, High Performance Computing Center North, Umeå University
“Ferrari is always looking for the most advanced technological solutions and, of course, the same applies for software and engineering. To achieve industry leading power-to-weight ratios, reduction in gear change times, and revolutionary aerodynamics, we can rely on Windows HPC Server 2008. It provides a fast, familiar, high performance computing platform for our users, engineers and administrators.” -- Antonio Calabrese, Responsabile Sistemi Informativi (Head of Information Systems), Ferrari
“Financial analysts in Europe mainly use Windows systems. As such, the deployment of a Windows HPC Server 2008 cluster renders our HPC services extremely attractive to a large potential user base.” -- Dr. M. Rosati, Manager of the Computational Materials Science and Finance Group, CASPUR
“We are really impressed with many of the new features of Windows HPC Server 2008. Microsoft is a pretty young player in the HPC market, but this is already a very solid product.” -- Christian Terboven, Project Lead for HPC on Windows, Center for Computing and Communication, RWTH Aachen University
Customers
Available Now – Development and Parallel debugging in Visual Studio – 3rd party Compilers, Debuggers, Runtimes etc. available
Emerging Technologies – Parallel Framework – LINQ/PLINQ – natural OO language for SQL queries in .NET – C# Futures – way to explicitly make loops parallel
For the future: Parallel Computing Initiative (PCI) – Triple investment with a new engineering team – Focused on common tools for developing multi-core codes from desktops to clusters
Compilers: Visual Studio; Intel C++; GCC; PGI Fortran; Intel Fortran; Absoft Fortran; Fujitsu
Profilers and Tracers: PerfMon; ETW (for MS-MPI); VSPerf / VSCover; CLRProfiler; Vampir (being ported to Windows); Intel Collector/Analyzer (runs on CCS with Intel MPI); VTune & CodeAnalyst; Marmot (being ported to Windows); MPI Lint++
Debuggers: Visual Studio; WinDbg; DDT
Runtimes and Libraries: MPI; OpenMP; C# Futures; MPI.C++ and MPI.NET; PLINQ
Parallel Programming
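The slide describes futures as a "way to explicitly make loops parallel". As a rough illustration of that pattern, the sketch below uses Python's standard `concurrent.futures` module as a stand-in; it is not the C# Futures API, and the `simulate` function and the input list are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(cell):
    """Stand-in for one independent loop iteration."""
    return cell * cell


cells = list(range(8))

# Ordinary sequential loop.
sequential = [simulate(c) for c in cells]

# The same loop with futures: each iteration is submitted as a task,
# and the results are collected afterwards in submission order.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(simulate, c) for c in cells]
    parallel = [f.result() for f in futures]

print(parallel == sequential)
```

The loop body is unchanged; only the way iterations are launched and collected differs. In CPython, threads do not speed up CPU-bound work, so a `ProcessPoolExecutor` would be the usual choice for real numerical loops.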
Microsoft approach to HPC Best of breed software and Partners: Partner with every hardware vendor Partner with every software vendor (even Novell!) One integrated platform (Windows) Stick to our competencies: OS, management, development, tools, user interface, etc.
“Here and Now” Technologies Emerging Technologies Research Product Teams Windows HPC Server SQL Office Visual Studio Microsoft Research eScience External Research & Programs External Research Office Cross Microsoft Effort
Automotive Aerospace Geo Services Financial Services Academia Government Life Sciences Industry Focused Solutions
Company Introduction Cluster Resources (CRI) are based in Utah, USA and Cambridge, UK CRI’s core product is Moab Moab’s pedigree goes back more than 10 years to the Maui scheduler CRI develop, maintain and support 2 open source products: TORQUE compute resource manager GOLD resource allocation and accounting suite Best efforts support for SLURM Moab is widely installed on the TOP 500 systems, and is the scheduler for the world’s first petaflop system at LANL
ACADEMIC INSTALLATIONS OF MOAB Cambridge Cardiff Birmingham Bristol UCL UCD St. Andrews ICHEC
Adaptive/Dynamic: Windows/Linux Cluster Definition: Moab is able to dynamically monitor and then adjust the operating system or other environmental factors to meet the needs of current and upcoming workload. Moab can manipulate, grow and shrink the allocated resources in order to meet QoS targets. Examples of Dynamic Adaptations: Operating Systems, Services, Network / Bandwidth, Application Resources, Storage Space, etc. Note: Moab's Dynamic Adaptation capability is based on its abstracted workload concept and its ability to import data from external resource managers. Note: RM is responsible for monitoring and job execution. [Diagram labels: Moab; Linux RM; Windows RM; Linux; Windows; Linux Workload; Windows Workload; Upcoming Workload]
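The idea of growing and shrinking the Windows and Linux portions of a cluster to match workload can be shown with a toy calculation. The Python sketch below is not Moab's algorithm or API: the `plan_os_switch` function, its inputs, and the simple surplus-versus-deficit rule are assumptions made purely for illustration.

```python
def plan_os_switch(provisioned, demand):
    """Return how many nodes to move from Linux to Windows.

    Hypothetical policy: a positive result means "reboot that many Linux
    nodes into Windows", a negative result means the reverse, 0 means no change.
    Both arguments map "linux" and "windows" to node counts.
    """
    linux_spare = provisioned["linux"] - demand["linux"]
    windows_spare = provisioned["windows"] - demand["windows"]

    if windows_spare < 0 and linux_spare > 0:
        # Windows is short of nodes and Linux has idle ones.
        return min(linux_spare, -windows_spare)
    if linux_spare < 0 and windows_spare > 0:
        # Linux is short of nodes and Windows has idle ones.
        return -min(windows_spare, -linux_spare)
    return 0


# 10 Linux and 6 Windows nodes; queued work needs 4 Linux and 9 Windows nodes.
move = plan_os_switch({"linux": 10, "windows": 6}, {"linux": 4, "windows": 9})
print(f"nodes to switch from Linux to Windows: {move}")
```

A real scheduler would also weigh reprovisioning time, job priorities and QoS targets; the sketch only shows the direction and size of the adjustment.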
Allinea Software Offers next generation tools for parallel application development – Traditionally for clusters, SMPs and MPPs – Focus on usability and scalability – Cross platform with Windows and Linux / UNIX Addressing future requirements – Growth in processors / cores... – Growth in parallel and distributed programming... Application of this technology – Experience from HPC to the desktop – Experience of embedded applications
High Profile Clients (extract) National research centres – AWE, BSC, CASPUR, CEA, CINECA, HLRS, ICHEC, IDRIS, LLNL, ONERA, PROUDMAN, RAL, … Universities – Bristol, Dresden, Edinburgh, HLRS, IPGP, Jülich, Karlsruhe, Leicester, LRZ, North West Grid, Oxford, Penn State, Nottingham, Sharcnet, TACC, Tokyo, UFA, Vanderbilt, etc. Aerospace research – CIRA, EADS CCR, DLR, MBDA, etc. Commercial research – Airbus, AVL, CGGVeritas, Fujitsu (Japan & UK), IFP, MTEM, OHM, Total, etc.
DDTLite Plugin for Visual Studio® New product from Allinea – Simplifies Parallel development on Microsoft® platforms – Bringing popular features from DDT to Visual Studio® – Makes easy path from Linux/Unix to Microsoft® world – Available Q4 2007
Microsoft HPC Web site – download Beta 1 Today! – Windows HPC Community site – Windows Server x64 information – Windows Server System information – Get the Facts Web site – Resources
© 2008 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.