Published by Barrie Hutchinson. Modified over 9 years ago.
1
Scientific Computing on Amazon Web Services
Dave Cuthbert, Solutions Architect
cuthbert@amazon.com
2
Two Facets (That I’ll Mention Today)
Facet 1: Availability of scientific applications
  General-purpose analysis
    Python (SciPy, NumPy, IPython notebooks), Octave, R, …
    C, C++, Fortran, …
  Databases/data formats
    NetCDF, HDF, …
    Cassandra, MongoDB, CouchDB, Redis, Berkeley DB, …
    MySQL/MariaDB, PostgreSQL, …
  Commercial applications are widely available. Licensing can be thorny.
3
Two Facets (That I’ll Mention Today)
Facet 2: Cycles
  What everyone thinks: HPC.
  Mental trap 1: It’s not “real” science if it’s not running on an HPC cluster.
  Mental trap 2: If your lab has an HPC cluster, you should be coding for it.
  So everyone demands cluster time, and…
4
A Typical HPC Cluster Workload
6
But What Is HPC, Anyway? If I wanted to start a flame war: “What is ‘real’ HPC?”
7
HPC Is Not A Panacea!
[Diagram labels: Hadoop, GPU, Low Latency]
8
It’s A Trap!
Facet 2: Cycles
  What everyone thinks: HPC.
  Mental trap 1: It’s not “real” science if it’s not running on an HPC cluster.
  Mental trap 2: If your lab has an HPC cluster, you should be coding for it.
  The right systems for the job.
9
HOW AWS IS ATTACKING THE PROBLEM
10
[Architecture diagram]
Region: us-west-2 (Oregon)
VPC Space: 192.168.0.0/16
  Availability Zone us-west-2a: Subnet 192.168.0.0/24
  Availability Zone us-west-2b: Subnet 192.168.1.0/24
  Availability Zone us-west-2c: Subnet 192.168.2.0/24
  Internet gateway
Instances (AmazonLinux with SLURM AMI): controller, node-0 through node-11
VBL S3 Bucket: Scripts, Code, Input Decks, Output Files, CloudFormation Template
SQS Queues: Work Request Queue, Work Response Queue
CloudFormation (Bootstrap controller)
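The address plan on this slide can be sanity-checked with a few lines of Python. This is only a sketch using the standard-library `ipaddress` module: the CIDR blocks and Availability Zone names come from the slide, while the `check_layout` helper is a hypothetical name introduced here for illustration. The usable-address count assumes the five addresses AWS reserves in every VPC subnet.

```python
import ipaddress

# CIDR blocks and Availability Zone names as shown on the slide.
vpc = ipaddress.ip_network("192.168.0.0/16")
subnets = {
    "us-west-2a": ipaddress.ip_network("192.168.0.0/24"),
    "us-west-2b": ipaddress.ip_network("192.168.1.0/24"),
    "us-west-2c": ipaddress.ip_network("192.168.2.0/24"),
}

# Assumption: AWS reserves five addresses in every VPC subnet.
AWS_RESERVED_PER_SUBNET = 5


def check_layout(vpc_net, az_subnets):
    """Validate the subnet plan and return usable addresses per AZ.

    Hypothetical helper: raises ValueError if a subnet falls outside
    the VPC range or if two subnets overlap.
    """
    nets = list(az_subnets.values())
    for net in nets:
        if not net.subnet_of(vpc_net):
            raise ValueError(f"{net} is outside {vpc_net}")
    for i, first in enumerate(nets):
        for second in nets[i + 1:]:
            if first.overlaps(second):
                raise ValueError(f"{first} overlaps {second}")
    return {
        az: net.num_addresses - AWS_RESERVED_PER_SUBNET
        for az, net in az_subnets.items()
    }


usable = check_layout(vpc, subnets)
for az, count in usable.items():
    print(f"{az}: {count} usable addresses")
```

Each /24 leaves far more room than the controller and twelve nodes in the diagram need, so the same plan would accommodate a larger cluster without renumbering.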
13
[Architecture diagram as on slide 10, annotated with three sets of latency measurements. The transcript does not preserve which links each set annotates.]
Set   min       p50       p90       p99       max
1     229 µs    239 µs    258 µs    280 µs    472 µs
2     329 µs    340 µs    354 µs    377 µs    611 µs
3     1048 µs   1052 µs   1094 µs   1182 µs   2125 µs
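For readers unfamiliar with the min/p50/p90/p99/max notation, here is a minimal sketch of how such a summary could be produced from raw latency samples. The slide does not say which tool or percentile method produced its numbers. The nearest-rank method below is an assumption, and `percentile` and `summarize` are hypothetical helper names. The sample data is synthetic, not a measurement.

```python
def percentile(sorted_samples, p):
    """Nearest-rank percentile of an ascending list, for integer p in 1..100."""
    if not sorted_samples:
        raise ValueError("no samples")
    # Integer form of ceil(p * n / 100), avoiding floating-point rounding.
    rank = (p * len(sorted_samples) + 99) // 100
    return sorted_samples[max(rank, 1) - 1]


def summarize(samples_us):
    """Return the five statistics shown on the slide for latencies in µs."""
    ordered = sorted(samples_us)
    return {
        "min": ordered[0],
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p99": percentile(ordered, 99),
        "max": ordered[-1],
    }


# Synthetic placeholder data: 100 samples of 1..100 µs in reverse order.
synthetic_samples = list(range(100, 0, -1))
summary = summarize(synthetic_samples)
for name, value in summary.items():
    print(f"{name}: {value} µs")
```

The gap between p99 and max is where occasional outliers show up. A single slow round trip moves the max but barely affects the median.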
15
[Architecture diagram with a single Availability Zone and a placement group, annotated with two sets of latency measurements]
Region: us-west-2 (Oregon)
VPC Space: 192.168.0.0/16
  Availability Zone us-west-2a: Subnet 192.168.0.0/24
  Internet gateway
Placement Group A
Instances (AmazonLinux with SLURM AMI): controller, node-0 through node-11
VBL S3 Bucket: Scripts, Code, Input Decks, Output Files, CloudFormation Template
SQS Queues: Work Request Queue, Work Response Queue
CloudFormation (Bootstrap controller)
Set   min      p50      p90       p99       max
1     85 µs    96 µs    106 µs    189 µs    233 µs
2     87 µs    99 µs    174 µs    189 µs    246 µs
16
Is AWS The Silver Bullet?
No silver bullets – Fred Brooks
Commonly heard latency number: 10 µs
Proximity to other resources might be an issue.
People-hours are more expensive than core-hours.
Enable facilities like NERSC to focus on harder problems not served (or currently served) by COTS.
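To put the 10 µs figure next to the measurements on slides 13 and 15, the short sketch below divides each median (p50) latency by it. The p50 values are copied from those slides. The set labels are placeholders because the transcript does not preserve which links each set refers to, and `ratio_to_reference` is a hypothetical helper name.

```python
# Median (p50) latencies in microseconds, copied from slides 13 and 15.
# Set labels are placeholders; the transcript does not say which links
# each set was measured on.
p50_us = {
    "slide 13, set 1": 239,
    "slide 13, set 2": 340,
    "slide 13, set 3": 1052,
    "slide 15, set 1 (placement group)": 96,
    "slide 15, set 2 (placement group)": 99,
}

# The "commonly heard latency number" quoted on this slide.
COMMONLY_HEARD_US = 10


def ratio_to_reference(latencies_us, reference_us):
    """Return each latency expressed as a multiple of the reference."""
    return {name: value / reference_us for name, value in latencies_us.items()}


ratios = ratio_to_reference(p50_us, COMMONLY_HEARD_US)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the 10 µs figure")
```

Even the placement-group medians sit roughly an order of magnitude above 10 µs, which is consistent with the slide's point that AWS is not a silver bullet for latency-sensitive workloads.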
17
THANK YOU!