© 2004 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
The Microsoft Perspective On Where High Performance Computing Is Heading
Kyril Faenov, Director of HPC, Windows Server Division, Microsoft Corporation
Talk Outline
- Market/technology trends
- Personal supercomputing
- Grid computing
- Leveraging IT industry investments
- Decoupling domain science from computer science
Top 500 Supercomputer Trends
- Clusters over 50%
- Industry usage rising
- x86 is winning
- GigE is gaining
HPC Market Trends

Report of the High-End Computing Revitalization Task Force, 2004 (Office of Science and Technology Policy, Executive Office of the President): "Make high-end computing easier and more productive to use. Emphasis should be placed on time to solution, the major metric of value to high-end computing users… A common software environment for scientific computation encompassing desktop to high-end systems will enhance productivity gains by promoting ease of use and manageability of systems."

Market segments (Source: IDC, 2005):

Segment                  Price band    2004 systems   CAGR
Capability, Enterprise   $1M+          1,167          4.2%
Divisional               $250K-$1M     3,915          5.7%
Departmental             $50K-$250K    22,712         7.7%
Workgroup                <$50K         127,802        13.4%

Systems under $250K account for 97% of systems and 52% of revenue. In 2004, cluster revenue grew 96%, reaching 37% of HPC revenue. Average cluster size: … nodes.

Top challenges to implementing clusters (IDC 2004, N=229):
- System management capability: 18%
- Apps availability: 17%
- Parallel algorithm complexity: 14%
- Space, power, cooling: 11%
- Interconnect BW/latency: 10%
- I/O performance: 9%
- Interconnect complexity
- Other: 12%
Major Implications
- Market pressures demand an accelerated innovation cycle, overall cost reduction, and thorough outcome modeling
- Leverage volume markets of industry-standard hardware and software
- Rapid procurement, installation, and integration of systems
- Workstation-cluster integrated applications are accelerating market growth: engineering, bioinformatics, oil and gas, finance, entertainment, government/research

The convergence of affordable high-performance hardware and commercial apps is making supercomputing personal.
Supercomputing Goes Personal

1991 — Cray Y-MP C916: 16 x Vector, 4GB, Bus; OS: UNICOS; ~10 GFlops; Top500 #1; Price: $40,000,000; Customers: government labs; Applications: classified, climate, physics research

1998 — Sun HPC10000: 24 x 333MHz UltraSPARC II, 24GB, SBus; OS: Solaris 2.5.1; ~10 GFlops; Top500 #500; Price: $1,000,000 (40x drop); Customers: large enterprises; Applications: manufacturing, energy, finance, telecom

2005 — NewEgg.com build: 4 x 2.2GHz x64, 4GB, GigE; OS: Windows Server 2003 SP1; ~10 GFlops; Top500: N/A; Price: <$4,000 (250x drop); Customers: every engineer & scientist; Applications: bioinformatics, materials sciences, digital media
The Future: Supercomputing on a Chip

IBM Cell processor:
- 256 Gflops today
- 4-node personal cluster => 1 Tflops
- 32-node personal cluster => Top100

Microsoft Xbox:
- 3 custom PowerPCs + ATI graphics processor
- 1 Tflops today, for $300
- 8-node personal cluster => "Top100" for $2,500 (ignoring all that you don't get for $300)

Intel many-core chips:
- "100's of cores on a chip in 2015" (Justin Rattner, Intel)
- "4 cores"/Tflop => 25 Tflops/chip
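The arithmetic behind these cluster claims can be sketched in a few lines. This is a back-of-envelope illustration only: the per-chip flops figures are the slide's own estimates, and `cluster_gflops` is a hypothetical helper that assumes perfect (embarrassingly parallel) scaling, which real applications rarely achieve.

```python
# Flops figures as quoted on the slide, not measured numbers.
CELL_GFLOPS = 256    # IBM Cell processor, per chip
XBOX_GFLOPS = 1000   # "1 Tflops today"

def cluster_gflops(nodes, gflops_per_node):
    """Peak aggregate flops, assuming perfect linear scaling across nodes."""
    return nodes * gflops_per_node

print(cluster_gflops(4, CELL_GFLOPS))   # 1024 -> ~1 Tflops from 4 Cell nodes
print(cluster_gflops(8, XBOX_GFLOPS))   # 8000 -> ~8 Tflops from 8 Xbox-class nodes
```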
The Continuing Trend Towards Decentralized, Networked Resources

Mainframes → minicomputers → personal workstations & departmental servers → grids of personal and departmental clusters
Key To Evolution: Tackling System Complexity

Departmental cluster (conventional scenario) — manual, batch execution, owned by the IT manager:
- IT owns large clusters due to cost and complexity, and allocates resources on a per-job basis
- Users submit batch jobs via scripts
- In-house and ISV apps, many based on MPI
- Scheduling multiple users' applications onto scarce compute cycles
- Cluster systems administration

Personal/workgroup cluster (emerging scenario) — interactive computation and visualization:
- Clusters are pre-packaged OEM appliances, purchased and managed by end users
- Desktop HPC applications transparently and interactively make use of cluster resources
- Desktop development tools integration
- Interactive applications
- Workstation clusters, accelerator appliances
- Distributed, policy-based management and security

HPC application integration (future scenario) — data-centric, "whole-system" workflows:
- Multiple simulations and data sources integrated into a seamless application workflow
- Network topology and latency awareness for optimal distribution of computation
- Structured data storage with rich metadata
- Applications and data potentially span organizational boundaries
- Rapid prototyping of HPC applications
- Grids: distributed application, systems, and data management
- Interoperability
"Grid Computing": A Catch-All Marketing Term

"Grid" computing means many different things to many different people and companies:
- Desktop cycle-stealing
- Managed HPC clusters
- Internet access to giant, distributed repositories
- Virtualization of data center IT resources
- Out-sourcing to "utility data centers"
- …

Originally this was all called "distributed systems".
HPC Grids And Web Services

HPC grid ~ compute grid + data grid.

Compute grid:
- A forest of clusters and workstations within an organization
- Coordinated scheduling of resources

Data grid:
- Distributed storage facilities within an organization
- Coordinated management of data

Web Services:
- The means to achieve interoperable Internet-scale computing, including federation of organizations
- Loosely coupled, service-oriented architecture
Service-Oriented Architectures

Lessons learned:
- Boundaries are explicit
- Services are autonomous and loosely coupled
- Services share schema and contract (not classes)
- Services establish explicit relationships
- Service compatibility is based on policy

Virtual organizations versus service-oriented architectures:
- Virtual organizations tend to be tightly coupled
- Commercial organizations want controlled interactions with each other by means of a loosely coupled SOA
Computational Grid Economics*

What $1 will buy you (roughly):
- Computers cost ~$1,000, so 1 CPU-day (~10 tera-ops) == $1 (assuming a 3-year use cycle)
- 10 TB of LAN transfer == $1 (assuming a 1 Gbps interconnect)
- Internet bandwidth costs roughly $100/Mbps/month (not including routers and management), so 1 GB of WAN transfer == $1

Some observations:
- HPC cluster communication is 10,000x cheaper than WAN communication
- Break-even point for instructions computed per byte transferred:
  - Cluster: O(1) instructions/byte => many parallel applications are economical to run on a cluster or across a GigE LAN
  - WAN: O(10,000) instructions/byte => few parallel applications are economical to run across the Internet

*Computational grid economics material courtesy of Jim Gray
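The break-even figures above follow directly from the dollar estimates. A minimal sketch of the arithmetic, using the slide's rough numbers (courtesy of Jim Gray, not current prices), with `break_even_ops_per_byte` as an illustrative helper:

```python
# All constants are the slide's order-of-magnitude estimates.
OPS_PER_DOLLAR = 10e12         # 1 CPU-day ~ 10 tera-ops, ~$1 over a 3-year cycle
LAN_BYTES_PER_DOLLAR = 10e12   # ~10 TB moved per $1 on a GigE cluster LAN
WAN_BYTES_PER_DOLLAR = 1e9     # ~1 GB moved per $1 across the Internet

def break_even_ops_per_byte(bytes_per_dollar):
    """Instructions per byte at which compute cost equals transfer cost."""
    return OPS_PER_DOLLAR / bytes_per_dollar

print(break_even_ops_per_byte(LAN_BYTES_PER_DOLLAR))  # 1.0     -> O(1) on a cluster
print(break_even_ops_per_byte(WAN_BYTES_PER_DOLLAR))  # 10000.0 -> O(10,000) on a WAN
```

An application must do at least this many instructions per byte shipped before moving the data pays off, which is why so few parallel codes make economic sense across the Internet.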
Exploding Data Sizes

Experimental data: TBs → PBs.

Modeling data today:
- 10's to 100's of GB per simulation is the common case
- Applications mostly run in isolation

Modeling data tomorrow:
- 10's to 100's of TBs, all of it to be archived
- Whole-system modeling and multi-application workflows
How Do You Move A Terabyte?*

Context      Speed (Mbps)       Rent ($/month)   $/Mbps   $/TB sent   Time/TB
Home phone   0.04               40               1,000    3,086       6 years
Home DSL     0.6                70               117      360         5 months
T1           1.5                1,200            800      2,469       2 months
T3           43                 28,000           651      2,010       2 days
OC3          155                49,000           316      976         14 hours
OC192        9,600              1,920,000        200      617         14 minutes
FedEx        ~100 (effective)   —                —        50          24 hours

LAN setting:
- 100 Mbps → 1 day per TB
- 1 Gbps → 2.2 hours per TB
- 10 Gbps → 13 minutes per TB

*Material courtesy of Jim Gray
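The LAN rows of the table can be reproduced with one formula. A small sketch, assuming full, sustained link utilization and no protocol overhead (real transfers do worse):

```python
def hours_per_terabyte(mbps):
    """Hours needed to move 1 TB (8e12 bits) over a link of the given Mbps,
    assuming the link runs flat out with no overhead."""
    bits = 8e12
    return bits / (mbps * 1e6) / 3600

print(hours_per_terabyte(100))         # ~22 h   (about a day at 100 Mbps)
print(hours_per_terabyte(1000))        # ~2.2 h  (1 Gbps)
print(hours_per_terabyte(10000) * 60)  # ~13 min (10 Gbps)
```

The same formula applied to the WAN rows shows why shipping disks by FedEx stays competitive for bulk data.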
Anticipated HPC Grid Topology

- Islands of high connectivity
- Simulations done on personal and workgroup clusters
- Data stored in data warehouses
- Data analysis best done inside the data warehouse
- Wide-area data sharing/replication via FedEx?

(Diagram: personal cluster, workgroup cluster, data warehouse)
Data Analysis And Mining

Traditional approach:
- Keep data in flat files
- Write C or Perl programs to compute specific analysis queries

Problems with this approach:
- Imposes significant development time
- Scientists must reinvent DB indexing and query technologies
- The data must be copied from the file system to the compute cluster for every query

Results from the astronomy community:
- Relational databases can yield speed-ups of one to two orders of magnitude
- SQL plus application/domain-specific stored procedures greatly simplifies the creation of analysis queries
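To make the contrast concrete, here is a minimal sketch of the in-database style the astronomy results refer to, using Python's standard-library `sqlite3` as a stand-in for a real data warehouse. The `obs` table, its columns, and the data are entirely hypothetical; the point is that the database supplies the indexing and the query engine, replacing a hand-written C or Perl scan-and-filter program.

```python
import sqlite3

# Hypothetical observation table: right ascension, declination, magnitude.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE obs (ra REAL, dec REAL, mag REAL)")
con.execute("CREATE INDEX idx_mag ON obs (mag)")  # the DB indexes for you
con.executemany(
    "INSERT INTO obs VALUES (?, ?, ?)",
    [(i * 0.1, i * 0.05, 10 + i % 8) for i in range(1000)],  # synthetic rows
)

# One declarative query replaces a custom flat-file analysis program,
# and runs where the data lives instead of copying it to the cluster.
n, avg_ra = con.execute(
    "SELECT COUNT(*), AVG(ra) FROM obs WHERE mag < 12").fetchone()
print(n, avg_ra)
```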
Is That The End Of The Story?

(Diagram: personal cluster, workgroup cluster, relational data warehouse)
Too Much Complexity

2004 NAS supercomputing report: O(35) new computational scientists graduated per year.

Parallel application development:
- Chip-level, node-level, cluster-level, LAN grid-level, and WAN grid-level parallelism
- OpenMP, MPI, HPF, Global Arrays, …
- Component architectures
- Performance configuration & tuning
- Debugging/profiling/tracing/analysis

Digital experimentation:
- Experiment management
- Provenance (data & workflows)
- Version management (data & workflows)

Distributed systems issues:
- Security
- System management
- Directory services
- Storage management

All of this comes in addition to the domain science itself.
(Partial) Solution: Leverage IT Industry's Existing R&D

Parallel applications development:
- High-productivity IDEs
- Integrated debugging/profiling/tracing/analysis
- Code designer wizards
- Concurrent programming frameworks
- Platform optimizations
- Dynamic, profile-guided optimization
- New programming abstractions

Distributed systems issues:
- Web Services & HPC grids: security, interoperability, scalability
- Dynamic systems management: self-(re)configuration & tuning, reliability & availability

RDBMS + data mining:
- Ease of use
- Advanced indexing & query processing
- Advanced data mining algorithms

Digital experimentation:
- Collaboration-enhanced Office productivity tools: structure experiment data and derived results in a manner appropriate for human reading/reasoning (as opposed to optimizing for query processing and/or storage efficiency); enable collaboration among colleagues
- (Scientific) workflow environments: automated orchestration, visual scripting, provenance
Separating The Domain Scientist From The Computer Scientist

Layers, top to bottom:
- Domain scientist: abstract workflow — (interactive) scientific workflow, integrated with collaboration-enhanced office automation tools
- Computational scientist: abstract concurrency, concrete concurrency, concrete workflow — parallel domain application development
- Infrastructure: parallel/distributed file systems, relational data warehouses, dynamic systems management, Web Services & HPC grids

Example workflow:
- Write scientific paper (Word)
- Record experiment data (Excel)
- Individual experiment run (workflow orchestrator)
- Analyze data (SQL Server)
- Share paper with co-authors (SharePoint)
- Collaborate with co-authors (NetMeeting)
Scientific Information Worker: Past and Future

Past:
- Buy lab equipment
- Keep lab notebook
- Run experiments by hand
- Assemble & analyze data (using stat pkg)
- Collaborate by phone/…
- Write up results with LaTeX
Metaphor: physical experimentation; "do it yourself"; lots of disparate systems/pieces.

Future:
- Buy hardware and software
- Automatic provenance
- Workflow with 3rd-party domain packages
- Excel and Access/SQL Server
- Office tool suite with collaboration support
Metaphor: digital experimentation; turn-key desktop supercomputer; single integrated system.
Microsoft Strategy: Reducing Barriers to Adoption for HPC Clusters

Easy to develop:
- Familiar Windows dev environment + key HPC extensions (MPI, OpenMP, parallel debugger)
- Best-of-breed Fortran, numerical libraries, and performance analysis tools through partners
- Long-term, strategic investments in developer productivity

Easy to use:
- Familiarity/intuitiveness of Windows
- Cluster computing integrated into workstation applications and the user workflow

Easy to manage and own:
- Integration with Active Directory and the rest of the IT infrastructure
- Lower TCO through integrated turnkey clusters
- Price/performance advantage of industry-standard hardware components

Focused approach to market:
- Application support in three key HPC verticals
- Engagement with the top HPC ISVs
- Enabling open-source applications via university relationships
- Leveraging a breadth of standard knowledge-management tools: Web Services, SQL, SharePoint, InfoPath, Excel

Enabling broad HPC adoption and making HPC into a high-volume market.
© 2005 Microsoft Corporation. All rights reserved.