Published by Prosper Golden. Modified over 9 years ago.
1
Scientific Computing at Amazon: Disruptive Innovations in Distributed Computing
Dave Ward, Principal Product Manager
Adam Gray, Senior Product Manager
2
Innovation #1:
3
42
4
Building your own virtual programmable datacenter
5
ec2-run-instances
7
On Demand Global Infrastructure
8
Programmable
12
Elastic
14
Instance Types
15
Standard (m1), High Memory (m2), High CPU (c1)
16
High Performance
17
“Our 40-instance (m2.2xlarge) cluster can scan, filter, and aggregate 1 billion rows in 950 milliseconds.” Mike Driscoll – Meta Markets
18
Cluster Computing
19
MPI
20
Bandwidth Intensive
21
Cluster Compute Instance
22
cc1.4xlarge: 2 x Intel Xeon 5570, 8 cores w/ HT, 23 GB RAM, 1.7 TB disk, HVM
26
linpack
28
231 (November 2010)
29
451 (June 2011)
30
New Cluster Compute Instances
31
cc2.8xlarge: 2 x Intel Xeon, 16 cores w/ HT, 60.5 GB RAM, 3.4 TB disk, HVM
32
linpack
34
42 (November 2011)
35
Innovation #2:
36
Lowering the cost of developing a distributed system
37
Case Study: Amazon’s Associates Program
39
Text Links, Enhanced Links
41
how much to pay each associate?
42
orders → C++ app → bi-hourly flat files
43
orders → C++ app → bi-hourly flat files → C++ app → daily aggregations
44
orders → C++ app → bi-hourly flat files → C++ app → daily aggregations → C++ app → to payments service…
46
“just one more Q4”
47
distributed computing
48
[Chart with axes: Difficulty, Number of Machines]
49
[Chart with axes: Difficulty, Number of Machines]
51
distributed computing is hard
52
distributed computing requires god-like engineers
53
Hadoop is… The MapReduce computational paradigm
54
Hadoop is… The MapReduce computational paradigm … implemented as an Open-source, Scalable, Fault-tolerant, Distributed System
55
Person   Start     End
Bob      00:44:48  00:45:11
Charlie  02:16:02  02:16:18
Charlie  11:16:59  11:17:17
Charlie  11:17:24  11:17:38
Bob      11:23:10  11:23:25
Alice    16:26:46  16:26:54
David    17:20:28  17:20:45
Alice    18:16:53  18:17:00
Charlie  19:33:44  19:33:59
Bob      21:13:32  21:13:43
David    22:36:22  22:36:34
Alice    23:42:01  23:42:11
56
Person   Start     End       Duration
Bob      00:44:48  00:45:11
Charlie  02:16:02  02:16:18
Charlie  11:16:59  11:17:17
Charlie  11:17:24  11:17:38
Bob      11:23:10  11:23:25
Alice    16:26:46  16:26:54
David    17:20:28  17:20:45
Alice    18:16:53  18:17:00
Charlie  19:33:44  19:33:59
Bob      21:13:32  21:13:43
David    22:36:22  22:36:34
Alice    23:42:01  23:42:11
57
Person   Start     End       Duration
Bob      00:44:48  00:45:11  23
Charlie  02:16:02  02:16:18
Charlie  11:16:59  11:17:17
Charlie  11:17:24  11:17:38
Bob      11:23:10  11:23:25
Alice    16:26:46  16:26:54
David    17:20:28  17:20:45
Alice    18:16:53  18:17:00
Charlie  19:33:44  19:33:59
Bob      21:13:32  21:13:43
David    22:36:22  22:36:34
Alice    23:42:01  23:42:11
58
Person   Start     End       Duration
Bob      00:44:48  00:45:11  23
Charlie  02:16:02  02:16:18  16
Charlie  11:16:59  11:17:17
Charlie  11:17:24  11:17:38
Bob      11:23:10  11:23:25
Alice    16:26:46  16:26:54
David    17:20:28  17:20:45
Alice    18:16:53  18:17:00
Charlie  19:33:44  19:33:59
Bob      21:13:32  21:13:43
David    22:36:22  22:36:34
Alice    23:42:01  23:42:11
59
Person   Start     End       Duration
Bob      00:44:48  00:45:11  23
Charlie  02:16:02  02:16:18  16
Charlie  11:16:59  11:17:17  18
Charlie  11:17:24  11:17:38  14
Bob      11:23:10  11:23:25  15
Alice    16:26:46  16:26:54  8
David    17:20:28  17:20:45  17
Alice    18:16:53  18:17:00  7
Charlie  19:33:44  19:33:59  15
Bob      21:13:32  21:13:43  11
David    22:36:22  22:36:34  12
Alice    23:42:01  23:42:11  10
60
Person   Duration
Bob      23
Charlie  16
Charlie  18
Charlie  14
Bob      15
Alice    8
David    17
Alice    7
Charlie  15
Bob      11
David    12
Alice    10
61
Person   Start     End
Bob      00:44:48  00:45:11
Charlie  02:16:02  02:16:18
Charlie  11:16:59  11:17:17
Charlie  11:17:24  11:17:38
Bob      11:23:10  11:23:25
Alice    16:26:46  16:26:54
David    17:20:28  17:20:45
Alice    18:16:53  18:17:00
Charlie  19:33:44  19:33:59
Bob      21:13:32  21:13:43
David    22:36:22  22:36:34
Alice    23:42:01  23:42:11

map

Person   Duration
Bob      23
Charlie  16
Charlie  18
Charlie  14
Bob      15
Alice    8
David    17
Alice    7
Charlie  15
Bob      11
David    12
Alice    10
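The map step on this slide can be sketched in a few lines of Python. This is only an illustration, not Hadoop code: the function name `map_call` is hypothetical, and it assumes the timestamps are HH:MM:SS within a single day and that Duration is measured in seconds, which matches the numbers on the slide.

```python
from datetime import datetime

# Call records copied from the slide: (person, start, end).
CALLS = [
    ("Bob", "00:44:48", "00:45:11"),
    ("Charlie", "02:16:02", "02:16:18"),
    ("Charlie", "11:16:59", "11:17:17"),
    ("Charlie", "11:17:24", "11:17:38"),
    ("Bob", "11:23:10", "11:23:25"),
    ("Alice", "16:26:46", "16:26:54"),
    ("David", "17:20:28", "17:20:45"),
    ("Alice", "18:16:53", "18:17:00"),
    ("Charlie", "19:33:44", "19:33:59"),
    ("Bob", "21:13:32", "21:13:43"),
    ("David", "22:36:22", "22:36:34"),
    ("Alice", "23:42:01", "23:42:11"),
]

TIME_FORMAT = "%H:%M:%S"  # assumption: times are HH:MM:SS on the same day


def map_call(record):
    """Map step: turn one (person, start, end) record into (person, seconds)."""
    person, start, end = record
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return person, int(delta.total_seconds())


# Each record is mapped independently, which is what lets the work be
# spread across many machines.
mapped = [map_call(record) for record in CALLS]
```

Because `map_call` looks at one record at a time and keeps no shared state, the same function could run on any number of machines over disjoint slices of the input.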
62
Person   Duration
Bob      23
Charlie  16
Charlie  18
Charlie  14
Bob      15
Alice    8
David    17
Alice    7
Charlie  15
Bob      11
David    12
Alice    10
63
Person   Duration
Alice    8
Alice    7
Alice    10
Bob      23
Bob      15
Bob      11
Charlie  16
Charlie  18
Charlie  14
Charlie  15
David    12
David    17
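The regrouping between the map output and this sorted table is the shuffle/sort step. A minimal single-process sketch follows; the function name `shuffle` is hypothetical and the input pairs are copied from the map output on the slides.

```python
from collections import defaultdict

# (person, duration) pairs as produced by the map step on the slides.
MAPPED = [
    ("Bob", 23), ("Charlie", 16), ("Charlie", 18), ("Charlie", 14),
    ("Bob", 15), ("Alice", 8), ("David", 17), ("Alice", 7),
    ("Charlie", 15), ("Bob", 11), ("David", 12), ("Alice", 10),
]


def shuffle(pairs):
    """Group values by key and return the groups sorted by key."""
    groups = defaultdict(list)
    for person, duration in pairs:
        groups[person].append(duration)
    return dict(sorted(groups.items()))


grouped = shuffle(MAPPED)
```

The order of values within a key depends on arrival order, which is why David's durations can appear as 12, 17 on the slide and 17, 12 here. A reducer should not rely on that order.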
64
Person   Duration
Alice    8
Alice    7
Alice    10
Bob      23
Bob      15
Bob      11
Charlie  16
Charlie  18
Charlie  14
Charlie  15
David    12
David    17

Person   Total
Alice    25
65
Person   Duration
Alice    8
Alice    7
Alice    10
Bob      23
Bob      15
Bob      11
Charlie  16
Charlie  18
Charlie  14
Charlie  15
David    12
David    17

Person   Total
Alice    25
Bob      49
66
Person   Duration
Alice    8
Alice    7
Alice    10
Bob      23
Bob      15
Bob      11
Charlie  16
Charlie  18
Charlie  14
Charlie  15
David    12
David    17

Person   Total
Alice    25
Bob      49
Charlie  63
67
Person   Duration
Alice    8
Alice    7
Alice    10
Bob      23
Bob      15
Bob      11
Charlie  16
Charlie  18
Charlie  14
Charlie  15
David    12
David    17

Person   Total
Alice    25
Bob      49
Charlie  63
David    29
68
Person   Total
Alice    25
Bob      49
Charlie  63
David    29
69
Person   Duration
Alice    8
Alice    7
Alice    10
Bob      23
Bob      15
Bob      11
Charlie  16
Charlie  18
Charlie  14
Charlie  15
David    12
David    17

reduce

Person   Total
Alice    25
Bob      49
Charlie  63
David    29
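The reduce step collapses each person's list of durations into one total. The sketch below is illustrative only: `reduce_person` is a hypothetical name, and the grouped input is copied from the slide rather than produced by Hadoop.

```python
# Durations grouped by person, as shown on the slide.
GROUPED = {
    "Alice": [8, 7, 10],
    "Bob": [23, 15, 11],
    "Charlie": [16, 18, 14, 15],
    "David": [12, 17],
}


def reduce_person(person, durations):
    """Reduce step: sum all durations that share the same key."""
    return person, sum(durations)


# Each key is reduced independently, so different keys can be handled
# by different machines.
totals = dict(reduce_person(person, durations) for person, durations in GROUPED.items())
```

Running this reproduces the totals on the slide: Alice 25, Bob 49, Charlie 63, David 29.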
70
Person   Start     End
Bob      00:44:48  00:45:11
Charlie  02:16:02  02:16:18
Charlie  11:16:59  11:17:17
Charlie  11:17:24  11:17:38
Bob      11:23:10  11:23:25
Alice    16:26:46  16:26:54
David    17:20:28  17:20:45
Alice    18:16:53  18:17:00
Charlie  19:33:44  19:33:59
Bob      21:13:32  21:13:43
David    22:36:22  22:36:34
Alice    23:42:01  23:42:11
71
Person   Duration
Alice    8
Alice    7
Alice    10
Bob      23
Bob      15
Bob      11
Charlie  16
Charlie  18
Charlie  14
Charlie  15
David    12
David    17
72
Hadoop is… The MapReduce computational paradigm
73
Hadoop is… The MapReduce computational paradigm … implemented as an Open-source, Scalable, Fault-tolerant, Distributed System
74
distributed computing requires god-like engineers
75
distributed computing (with Hadoop) requires talented, not god-like, engineers
76
how much to pay each associate?
77
orders → C++ app → bi-hourly flat files → C++ app → daily aggregations → C++ app → to payments service…
78
orders → C++ app → bi-hourly flat files → C++ app → daily aggregations → C++ app → to payments service…

Person   Total
Alice    25
Bob      49
Charlie  63
David    29
79
Orders Filter S3 Other Services
80
Orders Filter S3 Hadoop Cluster
81
[Chart with axes: Difficulty, Number of Machines]
82
[Chart with axes: Difficulty, Number of Machines]
More data? Smarter engineers.
83
[Chart with axes: Difficulty, Number of Machines]
84
[Chart with axes: Difficulty, Number of Machines]
More data? Smarter engineers.
More data? More boxes.
85
Hadoop lowers the cost of developing a distributed system.
86
What about the cost of operating a distributed system?
87
November traffic at amazon.com
89
76% 24%
90
Orders Filter S3 Hadoop Cluster
91
Amazon Elastic Compute Cloud “provides resizable compute capacity in the cloud.”
92
Amazon Elastic MapReduce = Amazon EC2 + Hadoop
93
Orders Filter S3 Hadoop Cluster
94
Filter S3 EMR Cluster Orders
98
Filter S3 Orders
100
Amazon EC2 lowers the cost of operating a distributed system.
101
Hadoop lowers the cost of developing a distributed system.
102
Amazon Elastic MapReduce changes the economics of data processing.
103
AMAZON ELASTIC MAPREDUCE
Managed Apache Hadoop Service
Removes MUCK from Big Data processing
Provides tight integration with AWS services
104
> elastic-mapreduce --create --instance-type m1.large \
    --instance-count 1000 --name "My Hadoop Cluster" \
    --jar s3://elasticmapreduce/samples/cloudburst/cloudburst.jar
105
What is big data?
106
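[Chart with axes: Dataset size, Number of datasets]
107
[Chart with axes: Dataset size, Number of datasets; label: fits on a single machine]
108
[Chart with axes: Dataset size, Number of datasets; label: Big Data]
109
[Chart with axes: Dataset size, Number of datasets; label: Extremely Big Data]
110
[Chart with axes: Dataset size, Difficulty]
111
[Chart with axes: Dataset size, Difficulty]
112
[Chart with axes: Dataset size, Difficulty; labels: Extremely valuable, Marginally valuable]
113
[Chart with axes: Dataset size, Difficulty; labels: Extremely valuable, Marginally valuable]
114
[Chart with axes: Dataset size, Number of datasets; label: Extremely Big Data]
115
[Chart with axes: Dataset size, Difficulty]
116
[Chart with axes: Dataset size, Difficulty]
117
[Chart with axes: Dataset size, Difficulty]
118
[Chart with axes: Dataset size, Difficulty]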
119
cheap experimentation
120
Innovation #3:
121
Lowering the cost of accessing data
122
Over 50 free data sets
123
Nearly 1 PB of free data
124
Stored at no cost to providers; also free access to consumers
125
1000 Genomes Project (110 TB)
Common Crawl Corpus (60 TB)
Sloan Digital Sky Survey (180 GB)
United States Census (200 GB)
Million Song Dataset (500 GB)
Google Books Corpus (2.2 TB)
Marvel Universe Social Graph (50 GB)
126
aws.amazon.com/datasets
127
Innovation #4: Creating a Market for Capacity
128
Finding Research Dollars for AWS
129
Educators
130
Up to $100 per Student in AWS Credits for intro courses
131
Researchers
132
Infrastructure Credits (EC2, S3, …)
133
4 Grant Review Cycles Per Year
134
February 10, 2012
135
Students
136
Student Organizations, Self Learning, Entrepreneurial Projects
137
aws.amazon.com/education
138
Stretching your Research Dollars (even further) on AWS
139
On-Demand
140
Reserved
141
Spot
142
Unused EC2 Capacity
143
Bid
145
July 2011
146
Interruption
147
July 2011
148
Manage Interruption
149
Grid Computing
150
MIT StarCluster
157
http://youtu.be/2Ym7epCYnSk
159
Harvard Medical School Lab of Personalized Medicine
160
Temple University Spot MPI
161
Elastic MapReduce
162
Scenario #1: Cost without Spot
Allocate 4 instances
Job flow duration: 14 hours
4 instances * 14 hrs * $0.50 = $28

Scenario #2: Cost with Spot
Add 5 Spot Instances
Job flow duration: 7 hours
4 instances * 7 hrs * $0.50 = $14
+ 5 instances * 7 hrs * $0.25 = $8.75
Total = $22.75

Time Savings: 50%
Cost Savings: ~19%
Save Time and Money
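The two scenarios can be recomputed directly from the stated instance counts, hourly prices, and durations. The helper name `job_cost` is hypothetical, and the prices are the illustrative $0.50 and $0.25 per hour from the slide, not current AWS pricing.

```python
def job_cost(fleets, hours):
    """Total cost of a job flow.

    fleets: list of (instance_count, hourly_price_in_dollars) tuples.
    hours:  how long the whole job flow runs.
    """
    return sum(count * price * hours for count, price in fleets)


# Scenario #1: 4 On-Demand instances for 14 hours.
without_spot = job_cost([(4, 0.50)], hours=14)

# Scenario #2: the same 4 instances plus 5 Spot instances, done in 7 hours.
with_spot = job_cost([(4, 0.50), (5, 0.25)], hours=7)

time_savings = 1 - 7 / 14
cost_savings = 1 - with_spot / without_spot

print(f"without Spot: ${without_spot:.2f}")
print(f"with Spot:    ${with_spot:.2f}")
print(f"time savings: {time_savings:.0%}, cost savings: {cost_savings:.1%}")
```

With these inputs the job costs $28.00 without Spot and $22.75 with Spot, a saving of 18.75% in cost and 50% in time.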
163
Queue-Based Architecture
[Diagram: Applications, Queue, Amazon EC2 Spot, Amazon EC2 On-Demand / Reserved]
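One way to read the queue-based architecture: workers pull tasks from a shared queue, and a task whose worker is interrupted goes back on the queue for another worker to pick up. The sketch below simulates that in a single process. It assumes an in-memory `queue.Queue` standing in for a durable queue service, and the names `run_workers` and `interrupt_on` are hypothetical.

```python
import queue


def run_workers(tasks, interrupt_on):
    """Process tasks from a shared queue, re-queueing interrupted ones.

    interrupt_on: tasks whose first attempt is lost to a simulated
    Spot interruption. Returns (completed_tasks, total_attempts).
    """
    work = queue.Queue()
    for task in tasks:
        work.put(task)

    pending_interrupts = set(interrupt_on)
    done = []
    attempts = 0
    while not work.empty():
        task = work.get()
        attempts += 1
        if task in pending_interrupts:
            # The worker disappeared mid-task: put the task back so that
            # another worker (Spot or On-Demand) can retry it.
            pending_interrupts.discard(task)
            work.put(task)
            continue
        done.append(task)
    return done, attempts


done, attempts = run_workers(["a", "b", "c", "d"], interrupt_on={"b"})
```

This only works if tasks are safe to run more than once, since an interrupted task may have partially completed before being retried.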
164
Checkpointing
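Checkpointing means saving progress often enough that an interrupted job can resume instead of starting over. A minimal sketch follows, assuming progress fits in a small JSON file on local disk. The function name `sum_with_checkpoints` and the `stop_after` parameter (used here to simulate an interruption) are hypothetical.

```python
import json
import os
import tempfile


def sum_with_checkpoints(items, path, stop_after=None):
    """Sum items, saving (next index, partial total) after every item.

    stop_after simulates an interruption after that many items in this run.
    Returns the total, or None if this run was interrupted.
    """
    state = {"next": 0, "total": 0}
    if os.path.exists(path):
        # Resume from the last saved checkpoint.
        with open(path) as f:
            state = json.load(f)

    processed = 0
    for i in range(state["next"], len(items)):
        if stop_after is not None and processed == stop_after:
            return None  # simulated interruption
        state = {"next": i + 1, "total": state["total"] + items[i]}
        with open(path, "w") as f:
            json.dump(state, f)
        processed += 1
    return state["total"]


checkpoint = os.path.join(tempfile.mkdtemp(), "progress.json")
data = [23, 16, 18, 14, 15]
first_run = sum_with_checkpoints(data, checkpoint, stop_after=2)  # interrupted
second_run = sum_with_checkpoints(data, checkpoint)               # resumes at item 3
```

A production version would write the checkpoint to durable storage and replace the file atomically, so that an interruption during the write cannot corrupt it.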
166
30,000+ Cores, 95,078 Instance Hours
167
$1,279/hour
168
We are Hiring!
FT/Interns: amazon.com/careers
Experienced: aws.amazon.com/jobs