
Lecture: Amazon AWS Instructor: Weidong Shi (Larry), PhD


1 Lecture: Amazon AWS Instructor: Weidong Shi (Larry), PhD
COSC6376 Cloud Computing Lecture: Amazon AWS Instructor: Weidong Shi (Larry), PhD Computer Science Department University of Houston

2 Outline
AWS Elastic Compute Cloud (EC2)
Simple Storage Service (S3)
Elastic MapReduce
CloudFront
Simple Queue Service

3 Reading Assignment
Google Bigtable, Thursday in class

4 Overview Amazon Web Services
Infrastructure as a Service: Amazon Simple Storage Service, Amazon Elastic Compute Cloud, Amazon Simple Queue Service, Amazon SimpleDB, Amazon CloudFront
Commerce as a Service: Amazon Flexible Payments Service, Fulfillment Web Service
People as a Service: Amazon Mechanical Turk
Alexa Web Services: Alexa Web Information Service, Alexa Top Sites, Alexa Site Thumbnail, Alexa Web Search Platform

5 EC2

6 EC2 A typical example of utility computing
Functionality:
Launch instances with a variety of operating systems (Windows/Linux)
Load them with your custom application environment (customized AMI); full root access to a blank Linux machine
Manage your network's access permissions
Run your image on as many or as few systems as you desire (scaling up/down)

7 Backyard... Powered by Xen virtual machines
Different from VMware: high performance
Hardware contributions by Intel (VT-x/Vanderpool) and AMD (AMD-V)
Supports "live migration" of a virtual machine between hosts
We will dedicate one class to virtualization ...

8 Amazon Machine Images
Public AMIs: use pre-configured, template AMIs to get up and running immediately; choose from Fedora, Movable Type, Ubuntu configurations, and more
Private AMIs: create an Amazon Machine Image (AMI) containing your applications, libraries, data, and associated configuration settings
Paid AMIs: set a price for your AMI and let others purchase and use it (single payment and/or per hour); includes AMIs with commercial DBMSs

9 Normal way to use EC2
For web applications:
Run your base system on a minimum number of VMs
Monitor the system load (user traffic); load is distributed across the VMs
If load rises above some threshold, increase the number of VMs
If load falls below some threshold, decrease the number of VMs
(a sketch of this scaling loop follows below)
For data-intensive analysis:
Estimate the optimal number of nodes (tricky!)
Load data
Start processing
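A minimal sketch of the threshold-based scaling loop just described, using the boto calls introduced later in this lecture; get_average_load, HIGH, LOW, and MIN_VMS are hypothetical placeholders, not AWS APIs:

import time
import boto

HIGH, LOW, MIN_VMS = 0.80, 0.20, 2   # hypothetical thresholds

def get_average_load(instances):
    # Placeholder: return the average load across the VMs,
    # e.g. gathered from CloudWatch or your own monitoring.
    raise NotImplementedError

conn = boto.connect_ec2()
image = conn.get_all_images(image_ids=['ami-3c47a355'])[0]
instances = image.run(MIN_VMS, MIN_VMS, 'gsg-keypair').instances

while True:
    load = get_average_load(instances)
    if load > HIGH:                                # overloaded: add a VM
        instances += image.run(1, 1, 'gsg-keypair').instances
    elif load < LOW and len(instances) > MIN_VMS:  # underloaded: remove a VM
        instances.pop().stop()
    time.sleep(60)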

10 Tools (most are for web apps)
Elastic Block Store: mountable storage, local to each VM instance
Elastic IP address: programmatically remap a public IP to any instance (see the sketch after this list)
Virtual Private Cloud: bridge your private cloud and AWS resources
CloudWatch: monitoring of EC2 resources
Elastic Load Balancing: automatically distribute incoming traffic across instances
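As a taste of driving these tools programmatically, a hedged boto sketch of Elastic IP remapping (the instance ids are placeholders):

import boto

conn = boto.connect_ec2()

# Allocate a public Elastic IP owned by this account
address = conn.allocate_address()
print address.public_ip

# Map it to a running instance (placeholder id)
conn.associate_address(instance_id='i-395bf151', public_ip=address.public_ip)

# Later, remap the same IP to a replacement instance
conn.disassociate_address(address.public_ip)
conn.associate_address(instance_id='i-0123abcd', public_ip=address.public_ip)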

11 Types of instances
Standard instances (micro, small, large, extra large)
E.g., small: 1.7 GB memory, 1 EC2 Compute Unit (roughly a 1-1.2 GHz core?), 160 GB instance storage
High-CPU instances: more CPU with the same amount of memory

12 AMIs with special software
IBM DB2, Informix Dynamic Server, Lotus Web Content Management, WebSphere Portal Server
MS SQL Server, IIS/ASP.NET
Hadoop, Open MPI, Apache web server, MySQL, Oracle 11g

13 Pricing (2010)

14 Preparation Security credentials
Access Key ID
Secret access key
X.509 certificate: "create a certificate", then download the private key and the certificate (i.e., the public key) and save them to ~/.ec2/

15 AWS keys
AWS access key: this is effectively a username. It is an alphanumeric text string that uniquely identifies the user who owns the account. No two accounts can have the same AWS access key.
AWS secret key: this key plays the role of a password. It is called secret because it is assumed to be known to the owner only.

16 AWS keys

17 Preparation Methods for accessing EC2:
Command-line tools
The boto Python library

18 Preparation The EC2 command-line tools have been installed at /usr/local/ec2
You have to set up environment variables:
JAVA_HOME
EC2_HOME (and add $EC2_HOME/bin to PATH)
EC2_PRIVATE_KEY=~/.ec2/pk-XXXXX.pem
EC2_CERT=~/.ec2/cert-XXXXXXX.pem
(Both pk-*.pem and cert-*.pem come from the X.509 certificate you downloaded for your account.)

19 Ready to start! Check AMIs
ec2-describe-images -o self -o amazon | grep machine | less
Looking for ...
IMAGE ami-3c47a355 ec2-public-images/getting-started.manifest.xml amazon available public i386

20 Generate key pair
1. ec2-add-keypair gsg-keypair
2. Paste the private-key part of the output into the file ~/.ec2/id_rsa-gsg-keypair
-----BEGIN RSA PRIVATE KEY-----
....
-----END RSA PRIVATE KEY-----
3. chmod 600 ~/.ec2/id_rsa-gsg-keypair
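The same key pair can also be created from boto instead of the command-line tools; a small sketch (the target directory is an assumption):

import boto

conn = boto.connect_ec2()
key_pair = conn.create_key_pair('gsg-keypair')  # returns a KeyPair object
key_pair.save('~/.ec2')  # writes ~/.ec2/gsg-keypair.pem with mode 0600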

21 Run an instance
ec2-run-instances ami-3c47a355 -k gsg-keypair
ec2-describe-instances i-395bf151
RESERVATION r-29f... default
INSTANCE i-395bf151 ami-3c47a355 pending gsg-keypair 0 m1.small ...T05:16:... us-east-1b aki-a71cf9ce ari-a51cf9cc monitoring-disabled
Once the instance is running:
RESERVATION r-29f... default
INSTANCE i-395bf151 ami-3c47a355 ec2-....compute-1.amazonaws.com domU-...-AC-33.compute-1.internal running gsg-keypair 0 m1.small ...T05:16:... us-east-1b aki-a71cf9ce ari-a51cf9cc monitoring-disabled

22 Get connected
Authorize access to ports (enable ssh and web):
ec2-authorize default -p 22
ec2-authorize default -p 80
Connect to your instance:
ec2-get-console-output i-395bf151
ssh -i ~/.ec2/id_rsa-gsg-keypair root@<instance public DNS>

23 Clean up Terminate the instance
ec2-terminate-instances i-395bf151
Or, inside the instance, run: shutdown -h now

24 Use boto to access EC2 Create a connection
>>> from boto.ec2.connection import EC2Connection
>>> conn = EC2Connection('<aws access key>', '<aws secret key>')
Or, if you have set the keys in AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY:
>>> import boto
>>> conn = boto.connect_ec2()

25 Images
>>> images = conn.get_all_images()
>>> images
>>> for i in range(len(images)):
...     print i, images[i].location

26 Run instance
>>> image = images[xxx]  # some selected image
>>> reservation = image.run()  # takes various parameters, such as key, security group, instance type, etc.
>>> reservation.instances
[Instance:i-...]
>>> instance = reservation.instances[0]
>>> instance.state
u'pending'
>>> instance.update()
u'pending'
>>> # wait a few seconds to minutes
>>> instance.update()
u'running'

27 Retrieve information of instance
>>> instance.dns_name
u'ec2-....z-2.compute-1.amazonaws.com'
>>> instance.public_dns_name
u'ec2-....z-2.compute-1.amazonaws.com'
>>> instance.private_dns_name
u'domU-....z-2.compute-1.internal'

28 Run multiple instances
>>> reservation = image.run(2, 2, 'gsg-keypair')
>>> reservation.instances
[Instance:i-5f618536, Instance:i-5e618537]
>>> for i in reservation.instances:
...     print i.state
u'pending'
u'pending'
>>>

29 Terminate instances
>>> instance.stop()
>>> instance.update()
u'shutting-down'
>>> # wait a minute
>>> instance.update()
u'terminated'
For multiple instances:
>>> reservation.stop_all()
>>> instances = conn.get_all_instances()
>>> # then check each instance

30 Security Set launch permissions for private AMIs
image.get_launch_permissions()
image.set_launch_permissions(user_ids=list_of_AWS_user_IDs)
image.remove_launch_permissions(user_ids=list_of_AWS_user_IDs)
image.reset_launch_permissions()

31 Security
>>> rs = conn.get_all_security_groups()
>>> sg = rs[1]
>>> sg.name
u'default'
>>> sg.rules
[IPPermissions:tcp(0-65535), IPPermissions:udp(0-65535), IPPermissions:icmp(-1--1)]
>>>

32 Create a security group
>>> web = conn.create_security_group('apache', 'Our Apache Group')
>>> web
SecurityGroup:apache
>>> web.authorize('tcp', 80, 80, '0.0.0.0/0')
True
>>> web.authorize(ip_protocol='tcp', from_port=22, to_port=22, cidr_ip='<your IP>/32')

33 Revoke permission
>>> web.rules
[IPPermissions:tcp(80-80), IPPermissions:tcp(22-22)]
>>> web.revoke('tcp', 22, 22, cidr_ip='<your IP>/32')
True
>>> web.rules
[IPPermissions:tcp(80-80)]
>>>

34 Simple AWS S3 Tutorial

35 S3
Write, read, delete objects (1 byte to 5 GB each)
Namespace: buckets, keys, objects
Accessible using URLs

36 S3 - quick review Objects are organized in a two-level directory
Bucket: container of objects; globally unique name
Key: like a file name; unique within the same bucket
Object: indexed by (bucket, key)

37 # of objects stored
August 06: 800 Million
April 07: 5 Billion
October 07: 10 Billion
January 08: 14 Billion

38 S3 namespace
[Diagram: Amazon S3 contains buckets; each bucket contains objects]

39 S3 namespace
[Diagram: example buckets and their keys]
mculver-images: Beach.jpg, 2005/party/hat.jpg
media.mydomain.com: img1.jpg, img2.jpg
public.blueorigin.com: index.html, img/pic1.jpg

40 Accessing objects Bucket: my-images, key: jpg1, object: a jpg image
Accessible with plain S3 URLs (see the sketch below)
Also accessible by mapping your subdomain to S3 with a DNS CNAME configuration, e.g. media.yourdomain.com -> media.yourdomain.com.s3.amazonaws.com
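A small sketch of the two standard S3 URL styles, using this slide's example names:

bucket, key = 'my-images', 'jpg1'

# Path-style URL
print 'http://s3.amazonaws.com/%s/%s' % (bucket, key)
# -> http://s3.amazonaws.com/my-images/jpg1

# Virtual-hosted-style URL (the form a DNS CNAME builds on)
print 'http://%s.s3.amazonaws.com/%s' % (bucket, key)
# -> http://my-images.s3.amazonaws.com/jpg1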

41 Access control
Objects are private to the user account
Access logs
Authentication and authorization
ACL: AWS users, users identified by e-mail address, any user, ...
Digital signatures to ensure integrity
Encrypted access: https
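Boto can produce the signed, time-limited URLs implied by this authentication scheme; a minimal sketch, assuming the bucket and key names used later in this lecture:

import boto

conn = boto.connect_s3()
bucket = conn.get_bucket('mybucket')
k = bucket.get_key('foobar')

# Query-string-authenticated URL, valid for 5 minutes
url = k.generate_url(expires_in=300)
print url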

42 Open Source Backup

43 Access methods Python library: boto
Includes access methods for almost all AWS services

44 S3 programming tools
Java (REST API): a Java compiler and runtime that supports the JDK
C#: a C# compiler and runtime environment that supports .NET 2.0 (e.g., Visual C# 2005 or 2008 Express Edition)
Perl: Perl 5 with the CPAN modules Digest::SHA1, Bundle::LWP, and XML::Simple
PHP: the base installation of PHP5
Python: Boto

45 Check out the AWS Developer Resource Center for more programming examples
We will take a look at the boto library

46 Create a connection
>>> from boto.s3.connection import S3Connection
>>> conn = S3Connection('<aws access key>', '<aws secret key>')
These two keys can be found in your security credentials

47 Keys If you have set the keys in AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY:
>>> import boto
>>> conn = boto.connect_s3()

48 Creating a bucket
>>> bucket = conn.create_bucket('mybucket')
Note that mybucket must be globally unique (across the entire S3 system)

49 Storing data
>>> from boto.s3.key import Key
>>> k = Key(bucket)
>>> k.key = 'foobar'
>>> k.set_contents_from_string('This is a test of S3')

50 Retrieve data
>>> import boto
>>> c = boto.connect_s3()
>>> b = c.create_bucket('mybucket')  # substitute your bucket name here
>>> from boto.s3.key import Key
>>> k = Key(b)
>>> k.key = 'foobar'
>>> k.get_contents_as_string()
'This is a test of S3'

51 Work on files
>>> k = Key(b)
>>> k.key = 'myfile'
>>> k.set_contents_from_filename('foo.jpg')
>>> k.get_contents_to_filename('bar.jpg')

52 Check all created buckets
>>> rs = conn.get_all_buckets()
rs is a list of buckets
>>> len(rs)
>>> for b in rs:
...     print b.name
...
This lists all available buckets

53 Set access control
Set public-readable for an entire bucket:
>>> b.set_acl('public-read')
For one object:
>>> b.set_acl('public-read', 'foobar')
Or, if k is a Key:
>>> k.set_acl('public-read')

54 Metadata with objects
>>> k = Key(b)
>>> k.key = 'has_metadata'
>>> k.set_metadata('meta1', 'This is the first metadata value')
>>> k.set_metadata('meta2', 'This is the second metadata value')
>>> k.set_contents_from_filename('foo.txt')
>>> k = b.lookup('has_metadata')
>>> k.get_metadata('meta1')
'This is the first metadata value'

55 Amazon Elastic MapReduce

56 Elastic MapReduce
Based on Hadoop AMIs
Data stored on S3
Work is organized into "job flows"

57 Example
elastic-mapreduce --create --stream \
  --mapper s3://elasticmapreduce/samples/wordcount/wordSplitter.py \
  --input s3://elasticmapreduce/samples/wordcount/input \
  --output s3://my-bucket/output \
  --reducer aggregate
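The mapper above is Amazon's sample wordSplitter.py; a hedged sketch of what such a Hadoop-streaming mapper looks like (the built-in aggregate reducer sums the values of keys prefixed with LongValueSum):

#!/usr/bin/env python
# Hadoop-streaming mapper: read raw text on stdin and emit
# 'LongValueSum:<word>\t1' lines for the aggregate reducer to sum.
import re
import sys

WORD = re.compile(r'[A-Za-z]+')

for line in sys.stdin:
    for word in WORD.findall(line):
        sys.stdout.write('LongValueSum:%s\t1\n' % word.lower())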

58 Wiki page access statistics

59 Elastic MapReduce elastic-mapreduce is a command-line interface to Amazon's Elastic MapReduce

60 Launching a virtual Hadoop cluster
$ elastic-mapreduce --create --name "Wiki log crunch" --alive --num-instances 20 --instance-type c1.medium
Created job flow <job flow id>
The --alive option tells the job flow to keep running even when it has finished all its steps.

61 Hadoop
[Diagram: the master node runs the NameNode and JobTracker; worker nodes each run a DataNode + TaskTracker]

62 Add a step
$ elastic-mapreduce --jobflow <jfid> --stream \
  --step-name "Wiki log crunch" \
  --input s3n://dsikar-wikilogs-2009/dec/ \
  --output s3n://dsikar-wikilogs-output/21 \
  --mapper s3n://dsikar-wiki-scripts/wikidictionarymap.pl \
  --reducer s3n://dsikar-wiki-scripts/wikireduce.pl
Track progress at http://<master node public DNS>:9100

63 S3cmd s3cmd is a command-line interface to Amazon's S3 cloud storage.

64 s3cmd
# make bucket
$ s3cmd mb s3://dsikar-wikilogs
# put log files
$ s3cmd put pagecounts*.gz s3://dsikar-wikilogs/dec
$ s3cmd put pagecounts*.gz s3://dsikar-wikilogs/apr
# list log files
$ s3cmd ls s3://dsikar-wikilogs/
# put scripts
$ s3cmd put *.pl s3://dsikar-wiki-scripts/
# delete log files
$ s3cmd del --recursive --force s3://dsikar-wikilogs/
# remove bucket
$ s3cmd rb s3://dsikar-wikilogs/

65 elastic-mapreduce options: --create, --list, --jobflow, --describe, --stream, --terminate

66 Output files
part-00000 part-00001 part-00002 (...)
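One way to collect these part files with boto, assuming the s3://my-bucket/output URI from the earlier example:

import boto

conn = boto.connect_s3()
bucket = conn.get_bucket('my-bucket')

# Each reducer writes one part-NNNNN file under the output prefix
out = open('results.txt', 'w')
for key in bucket.list(prefix='output/part-'):
    out.write(key.get_contents_as_string())
out.close()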

67 Aggregation

68 DNA sequencing example

69 Word count example

70 ItemSimilarity

71 CloudFront

72 CloudFront For content delivery: distribute content to end users through a global network of edge locations
"Edges": servers close to the user's geographical location
Objects are organized into distributions; each distribution has a domain name
The objects of a distribution are stored in an S3 bucket
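A hedged boto sketch of creating a distribution over an S3 bucket (the bucket name is a placeholder; API as in boto 2.x):

import boto
from boto.cloudfront.origin import S3Origin

conn = boto.connect_cloudfront()
dist = conn.create_distribution(
    origin=S3Origin('mybucket.s3.amazonaws.com'),  # placeholder bucket
    enabled=True,
    comment='Content delivery for mybucket')

# CloudFront assigns the distribution's domain name
print dist.domain_name  # e.g. d1234567890.cloudfront.net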

73 Edge servers
Located in the US, EU, Hong Kong, and Japan
The US and EU are partitioned into different regions

74 Use cases
Hosting your most frequently accessed website components: small pieces of your website cached in the edge locations are ideal for Amazon CloudFront
Distributing software: deliver applications, updates, or other downloadable software to end users
Publishing popular media files: rich media (audio or video) that is frequently accessed

75 Simple Queue Service

76 Simple Queue Service Stores messages traveling between computers
Makes it easy to build automated workflows
Implemented as a web service: read/add messages easily
Scalable to millions of messages a day

77 Some features
Message body: < 8 KB, in any format
Messages are retained in queues for up to 4 days
Messages can be sent and read simultaneously
Messages can be "locked", keeping them from simultaneous processing (see the sketch after this list)
Accessible with SOAP/REST
Simple: only a few methods
Secure sharing
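A minimal boto sketch of the send/read/delete cycle, including the "locking" (visibility timeout) mentioned above; the queue name is a placeholder:

import boto
from boto.sqs.message import Message

conn = boto.connect_sqs()
q = conn.create_queue('myqueue')

# Producer: add a message (body must stay under the 8 KB limit)
m = Message()
m.set_body('This is my first message.')
q.write(m)

# Consumer: read one message, locking it for 60 seconds
msg = q.read(visibility_timeout=60)
if msg is not None:
    print msg.get_body()
    q.delete_message(msg)  # remove it once processed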

78 A typical workflow

79 Workflow with AWS

