Future of Scientific Computing
Marvin Theimer, Software Architect
Windows Server High Performance Computing Group, Microsoft Corporation
Supercomputing Goes Personal

|              | Cray Y-MP C916                         | Sun HPC10000                           | NewEgg.com                                      |
| Architecture | 16 x Vector, 4GB, Bus                  | 24 x 333MHz UltraSPARC II, 24GB, SBus  | 4 x 2.2GHz x64, 4GB, GigE                       |
| OS           | UNICOS                                 | Solaris 2.5.1                          | Windows Server 2003 SP1                         |
| GFlops       | ~10                                    | ~10                                    | ~10                                             |
| Top500 #     | 1                                      | 500                                    | N/A                                             |
| Price        | $40,000,000                            | $1,000,000 (40x drop)                  | < $4,000 (250x drop)                            |
| Customers    | Government labs                        | Large enterprises                      | Every engineer & scientist                      |
| Applications | Classified, climate, physics research  | Manufacturing, energy, finance, telecom| Bioinformatics, materials sciences, digital media |
Molecular Biologist's Workstation

- High-end workstation with internal cluster nodes: 8 Opterons, a 20 Gflops workstation/cluster for O($10,000)
- Turn-key system purchased from a standard OEM, with a pre-installed set of bioinformatics applications
- Runs interactive workstation applications that offload computationally intensive tasks to the attached cluster nodes (see the sketch below)
- Runs workflows consisting of visualization and analysis programs that process the outputs of simulations running on the attached cluster nodes
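A minimal sketch of the offload pattern above, using Python's standard ProcessPoolExecutor as a stand-in for the attached cluster nodes. The alignment kernel and all names here are invented for illustration; a real system would hand tasks to a cluster scheduler rather than local worker processes.

```python
from concurrent.futures import ProcessPoolExecutor

def score_alignment(pair):
    """Placeholder for a compute-heavy bioinformatics kernel (hypothetical)."""
    a, b = pair
    return sum(1 for x, y in zip(a, b) if x == y)

def main():
    # The interactive front end stays responsive; heavy tasks are farmed
    # out to worker processes standing in for the 8 cluster nodes.
    pairs = [("GATTACA", "GATTGCA"), ("ACGT" * 100, "ACGA" * 100)]
    with ProcessPoolExecutor(max_workers=8) as pool:
        for pair, score in zip(pairs, pool.map(score_alignment, pairs)):
            print(len(pair[0]), score)

if __name__ == "__main__":
    main()
```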
The Future: Supercomputing on a Chip

- IBM Cell processor: 256 Gflops today
  - 4-node personal cluster => 1 Tflops
  - 32-node personal cluster => Top100
- Intel many-core chips: "100's of cores on a chip in 2015" (Justin Rattner, Intel)
  - At "4 cores"/Tflop => 25 Tflops/chip

(Back-of-the-envelope arithmetic below.)
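As a quick check on the arithmetic quoted above, assuming the stated 256 Gflops per Cell processor and the "4 cores" per Tflop ratio:

```latex
\[
  4 \times 256\ \text{Gflops} \approx 1\ \text{Tflops},
  \qquad
  32 \times 256\ \text{Gflops} \approx 8\ \text{Tflops}
\]
\[
  \frac{100\ \text{cores/chip}}{4\ \text{cores/Tflop}} = 25\ \text{Tflops/chip}
\]
```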
The Continuing Trend Towards Decentralized, Dedicated Resources

Mainframes -> Minicomputers -> Personal workstations & departmental servers -> Grids of personal & departmental clusters
The Evolving Nature of HPC

Departmental Cluster (conventional scenario)
- IT owns large clusters due to their complexity and allocates resources on a per-job basis
- Users submit batch jobs via scripts
- In-house and ISV apps, many based on MPI
- Focus: scheduling multiple users' applications onto scarce compute cycles; cluster systems administration

Personal/Workgroup Cluster (emerging scenario)
- Clusters are pre-packaged OEM appliances, purchased and managed by end users
- Desktop HPC applications transparently and interactively make use of cluster resources
- Desktop development tools integration
- Focus: interactive applications; compute grids: distributed systems management

HPC Application Integration (future scenario)
- Multiple simulations and data sources integrated into a seamless application workflow
- Network topology and latency awareness for optimal distribution of computation
- Structured data storage with rich meta-data
- Applications and data potentially span organizational boundaries
- Focus: data-centric, "whole-system" workflows; data grids: distributed data management

[Diagram: evolution from manual, batch execution toward interactive computation and visualization]
Exploding Data Sizes

- Experimental data: TBs -> PBs
- Modeling data:
  - Today: 10's to 100's of GB per simulation is the common case; applications mostly run in isolation
  - Tomorrow: 10's to 100's of TBs, all of it to be archived; whole-system modeling and multi-application workflows
How Do You Move A Terabyte?*

| Context                | Speed (Mbps) | Rent ($/month) | $/Mbps | $/TB sent | Time/TB    |
| Home phone             | 0.04         | 40             | 1,000  | 3,086     | 6 years    |
| Home DSL               | 0.6          | 70             | 117    | 360       | 5 months   |
| T1                     | 1.5          | 1,200          | 800    | 2,469     | 2 months   |
| T3                     | 43           | 28,000         | 651    | 2,010     | 2 days     |
| OC3                    | 155          | 49,000         | 316    | 976       | 14 hours   |
| OC192                  | 9,600        | 1,920,000      | 200    | 617       | 14 minutes |
| 100 Mbps (LAN setting) | 100          | -              | -      | -         | 1 day      |
| Gbps (LAN setting)     | 1,000        | -              | -      | -         | 13 minutes |
| FedEx                  | ~100         | -              | -      | 50        | 24 hours   |

*Material courtesy of Jim Gray
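The table's time and cost figures follow from a few lines of arithmetic. A minimal sketch, assuming a 30-day billing month, decimal terabytes, and no protocol overhead (speeds and rents are taken from the table above):

```python
# Back-of-the-envelope: time and cost to move 1 TB over a rented link.
TERABYTE_BITS = 8 * 10**12          # 1 TB = 10^12 bytes = 8x10^12 bits
SECONDS_PER_MONTH = 30 * 24 * 3600

def time_to_move_tb(mbps: float) -> float:
    """Hours to push 1 TB through a link of the given speed."""
    return TERABYTE_BITS / (mbps * 10**6) / 3600

def cost_to_move_tb(mbps: float, rent_per_month: float) -> float:
    """Dollars to move 1 TB, charging the link's rent for the transfer time."""
    hours = time_to_move_tb(mbps)
    return rent_per_month * (hours * 3600) / SECONDS_PER_MONTH

for name, mbps, rent in [("T1", 1.5, 1200), ("OC3", 155, 49000)]:
    print(f"{name}: {time_to_move_tb(mbps):,.0f} h, "
          f"${cost_to_move_tb(mbps, rent):,.0f}/TB")
```

Running this reproduces, for example, the T1 row: roughly 1,480 hours (about two months) and about $2,469 per terabyte sent.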
Anticipated HPC Grid Topology

- Islands of high connectivity:
  - Simulations done on personal & workgroup clusters
  - Data stored in data warehouses
  - Data analysis best done inside the data warehouse
- Wide-area data sharing/replication via FedEx?

[Diagram: personal clusters and workgroup clusters connected to a data warehouse]
Data Analysis and Mining

Traditional approach:
- Keep data in flat files
- Write C or Perl programs to compute specific analysis queries

Problems with this approach:
- Imposes significant development times
- Scientists must reinvent DB indexing and query technologies
- Have to copy the data from the file system to the compute cluster for every query

Results from the astronomy community:
- Relational databases can yield speed-ups of one to two orders of magnitude
- SQL + application/domain-specific stored procedures greatly simplify creation of analysis queries (illustrated below)
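A hedged illustration of the point, with sqlite3 standing in for a real data warehouse. The catalog schema and all names are invented for the example; what it shows is that once the data lives in a database, an index plus a declarative query replaces a hand-written scan-and-filter program.

```python
import sqlite3

# Toy stand-in for an astronomy catalog; schema and values are invented.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE objects (id INTEGER, ra REAL, dec REAL, mag REAL)")
con.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)",
                [(i, i * 0.01 % 360, i * 0.007 % 90, 15 + (i % 50) / 10)
                 for i in range(100_000)])

# The DB engine maintains the index; no hand-rolled B-tree, no full-file scan.
con.execute("CREATE INDEX idx_mag ON objects(mag)")

# The analysis "program" is one declarative query instead of a Perl script.
bright = con.execute(
    "SELECT id, ra, dec FROM objects WHERE mag < 15.5 ORDER BY mag LIMIT 5"
).fetchall()
print(bright)
```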
Is That the End of the Story?

[Diagram: personal clusters and workgroup clusters connected to a relational data warehouse]
Too Much Complexity

[Diagram: relational data warehouse plus workgroup and personal clusters]

Distributed systems issues:
- Security
- System management
- Directory services
- Storage management

Digital experimentation:
- Experiment management
- Provenance (data & workflows)
- Version management (data & workflows)

Parallel application development (see the MPI sketch below):
- Chip-level, node-level, cluster-level, LAN grid-level, WAN grid-level parallelism
- OpenMP, MPI, HPF, Global Arrays, ...
- Component architectures
- Performance configuration & tuning
- Debugging/profiling/tracing/analysis

And, on top of all this, the domain science itself. 2004 NAS supercomputing report: O(35) new computational scientists graduated per year.
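For a flavor of what cluster-level parallelism asks of an application writer today, here is a minimal MPI example using the mpi4py binding (the numeric kernel is a stub; launch with, e.g., `mpiexec -n 4 python script.py`):

```python
from mpi4py import MPI  # requires an MPI installation plus mpi4py

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each rank works on its own slice of the problem; the "science" is a stub.
local_values = [x * x for x in range(rank * 1000, (rank + 1) * 1000)]
local_sum = sum(local_values)

# Explicit communication: the programmer, not the system, manages parallelism.
total = comm.reduce(local_sum, op=MPI.SUM, root=0)
if rank == 0:
    print(f"sum over {size} ranks: {total}")
```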
Separating the Domain Scientist from the Computer Scientist

- Domain scientist: (interactive) scientific workflow, integrated with collaboration-enhanced office automation tools (abstract workflow)
- Computational scientist: parallel domain application development (concrete workflow, abstract concurrency)
- Computer scientist: parallel/distributed file systems, relational data warehouses, dynamic systems management, Web Services & HPC grids (concrete concurrency)

Example (a toy version is sketched in code below):
- Write scientific paper (Word)
- Record experiment data (Excel)
- Individual experiment run (Workflow orchestrator)
- Analyze data (SQL Server)
- Share paper with co-authors (SharePoint)
- Collaborate with co-authors (NetMeeting)
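A toy sketch of the abstract-workflow idea, with step names loosely following the example above. Everything here is hypothetical; a real orchestrator would add scheduling, provenance capture, and error handling on top of this composition pattern.

```python
# The scientist composes named steps; the orchestrator (here, a plain loop)
# handles execution order and passes each step's output to the next.
def run_experiment(params):
    return {"params": params, "raw": [x * 0.5 for x in range(10)]}

def analyze(data):
    vals = data["raw"]
    return {"mean": sum(vals) / len(vals), "n": len(vals)}

def write_up(result):
    return f"Experiment over {result['n']} samples: mean = {result['mean']:.2f}"

workflow = [run_experiment, analyze, write_up]
artifact = {"temperature": 300}
for step in workflow:
    artifact = step(artifact)
print(artifact)
```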
Scientific Information Worker: Past and Future

| Past                                                        | Future                                   |
| Buy lab equipment                                           | Buy hardware & software                  |
| Keep lab notebook                                           | Automatic provenance                     |
| Run experiments by hand                                     | Workflow with 3rd-party domain packages  |
| Assemble & analyze data (using stat pkg)                    | Excel & Access/SQL Server                |
| Collaborate by phone/e-mail; write up results with LaTeX    | Office tool suite with collaboration support |
| Metaphor: physical experimentation                          | Metaphor: digital experimentation        |
| "Do it yourself"                                            | Turn-key desktop supercomputer           |
| Lots of disparate systems/pieces                            | Single integrated system                 |