AMAZON CLOUD SERVICES – A WALKTHROUGH FOR COMPARISON TO GAE
Installing developer tools for Amazon Web Services (AWS) into Eclipse
Download Eclipse Enterprise Edition
Start Eclipse, use Help -> Install New Software
– Probably can omit the AWS tools for Android
Restart Eclipse, go to Workbench
Creating a HelloWorld app
Create a new AWS Java Web Project (orange box icon)
Notice the project structure
– Follows current web application standards
– (Google App Engine probably will get up to date!)
– Your JSPs will go in the WebContent folder
Right click on WebContent -> New -> JSP
Insert
Tour through an AWS app
Running your HelloWorld app
Choose Run -> Run As -> Run on Server
Manually create a new server
– Apache / Tomcat 6
– You probably will need to download and install using the button provided
– Wait for Tomcat to install and start up
Open up a web browser
Go to
– Edit the project name to match yours (HelloWorld)
Amazon Elastic Beanstalk
Load-balanced web hosting platform
– Very similar to the Google App Engine environment
– Somewhat more configurable in terms of specifying when the number of servers should grow
Create a deployment environment
Ready to deploy?
Choose Run -> Run As -> Run on Server
Choose your new server environment
Finally, hit the app in the Eclipse mini browser
All this and more…
Amazon provides cloud-based storage for your data
– But you need to sign up for an account
– And provide them a credit card
– After you sign up, you get an accessKey & secretKey
– Put these in your Eclipse project within Java Resources / src / AwsCredentials.properties
Then you can hit and see the resources that Amazon wants you to use online
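To make the role of AwsCredentials.properties concrete, here is a minimal sketch of how the two keys could be read with nothing but java.util.Properties. The class name CredentialsFile and the placeholder values are assumptions made for illustration; in a real project you would more likely let the AWS SDK load the file (for example through its PropertiesCredentials class) than parse it by hand.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of the AWS SDK or of the generated project.
public class CredentialsFile {

    /** Parses properties-file text and returns {accessKey, secretKey}. */
    static String[] parse(String fileText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String accessKey = props.getProperty("accessKey");
        String secretKey = props.getProperty("secretKey");
        if (accessKey == null || secretKey == null) {
            throw new IllegalArgumentException("accessKey and secretKey are both required");
        }
        return new String[] { accessKey.trim(), secretKey.trim() };
    }

    public static void main(String[] args) {
        // Placeholder values only; never commit real credentials to source control.
        String sample = "accessKey = EXAMPLE_ACCESS_KEY\nsecretKey = EXAMPLE_SECRET_KEY\n";
        String[] keys = parse(sample);
        System.out.println("Loaded an access key of length " + keys[0].length());
    }
}
```

Failing fast when either key is missing gives a clearer error than a rejected request later on.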
The resources I currently am using…
S3 Buckets: Basically a blob store
SimpleDB: non-RDBMS data store
EC2: Computation servers
Amazon S3 Buckets
A "bucket" is just a place to store chunks of data
Every chunk has a unique URL
You store data to the URL; you retrieve data later
You can set restrictions to control whether other people can retrieve the content by URL (with/without authenticating first)
Every bucket is associated with a specific region
– High replication within that region (for reliability)
– No replication outside that region (for legal reasons)
Creating a bucket
    // instantiate an S3 connection as shown in default index.jsp
    // (I'd prefer to move it into a separate class, similar to
    // how PMF is created for Google App Engine)
    String BUCKET = "cs496-bucket";
    s3.listBuckets(); // you can use this to see existing buckets
    if (!s3.doesBucketExist(BUCKET))
        s3.createBucket(BUCKET);
Storing an object in a bucket
Key: Value:
    <%
    if (request.getParameter("key") != null) {
        s3.putObject(
            BUCKET,
            request.getParameter("key"),
            new java.io.ByteArrayInputStream(
                request.getParameter("value").getBytes()),
            null);
    }
    // note that you use InputStreams to read/write objects
    %>
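The snippet above calls getBytes() with no argument, which uses the platform default charset, and passes null for the metadata. As a hedged sketch of a slightly more predictable variant, the helper below pins the encoding to UTF-8 and keeps the byte array around so its length is known. ValueStream is a hypothetical class name; the comment about the SDK's ObjectMetadata describes one way the length could be passed along, not something the slides do.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for preparing a form value before uploading it.
public class ValueStream {

    /** Encodes the value with an explicit charset instead of the platform default. */
    static byte[] encode(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    /** Wraps the bytes in a stream, the form in which object content is handed over. */
    static ByteArrayInputStream toStream(byte[] bytes) {
        return new ByteArrayInputStream(bytes);
    }

    public static void main(String[] args) {
        byte[] bytes = encode("caf\u00e9");
        // bytes.length is the content length; with the AWS SDK it could be set on
        // an ObjectMetadata object instead of passing null (assumption, not shown here).
        System.out.println("content length = " + bytes.length);
        System.out.println("stream bytes available = " + toStream(bytes).available());
    }
}
```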
Listing all the objects in a bucket
    out.write("Existing Data: ");
    ObjectListing listing = s3.listObjects(BUCKET);
    for (S3ObjectSummary item : listing.getObjectSummaries()) {
        out.write(" ");
        // escape "<" so stored keys cannot inject markup into the page
        out.write(item.getKey().replaceAll("<", "&lt;"));
        out.write(" ");
        S3Object object = s3.getObject(BUCKET, item.getKey());
        java.io.InputStream in = object.getObjectContent();
        java.io.ByteArrayOutputStream bos = new java.io.ByteArrayOutputStream();
        byte[] buffer = new byte[2048];
        int nread;
        // copy the object's content into memory, one buffer at a time
        while ((nread = in.read(buffer)) > 0)
            bos.write(buffer, 0, nread);
        in.close();
        String value = new String(bos.toByteArray());
        out.write(value.replaceAll("<", "&lt;"));
        out.write(" ");
    }
    out.write(" ");
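The read loop and the escaping in the listing code are easy to try outside a JSP. The sketch below pulls them into a hypothetical StreamText class that uses only the standard library: it reads any InputStream to the end as UTF-8 text and escapes the characters that matter in HTML. The fuller escaping (&, <, >, ") goes beyond what the slide does and is offered as a suggestion.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
public class StreamText {

    /** Reads the whole stream as UTF-8 text and closes it afterwards. */
    static String readAll(InputStream in) {
        try (InputStream stream = in) {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            byte[] buffer = new byte[2048];
            int nread;
            // read() returns -1 at end of stream
            while ((nread = stream.read(buffer)) != -1) {
                bos.write(buffer, 0, nread);
            }
            return new String(bos.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Escapes HTML-significant characters; '&' is handled first to avoid double escaping. */
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        // Stand-in for object.getObjectContent(); no AWS connection is needed here.
        InputStream content = new ByteArrayInputStream(
                "<b>stored value</b>".getBytes(StandardCharsets.UTF_8));
        System.out.println(escapeHtml(readAll(content)));
    }
}
```

In the JSP, the stream returned by getObjectContent() would take the place of the ByteArrayInputStream used in main.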
SimpleDB
Non-RDBMS data storage, very similar to the feature set of Google App Engine
– No joins, queries with limited filtering
– Limited transactions
– May have temporary inconsistency
– However, very highly scalable
Every SimpleDB is in a certain region
A SimpleDB is subdivided into domains
– Similar to the concept of an entity kind (or an object-oriented class)
– To repeat: fairly similar to using JDO on GAE
Amazon Elastic Compute Cloud (EC2)
Analogous to Google App Engine instances
Except that you have much more control
– You control how many machines you lease
– You control when the machines are turned on
– You control how powerful the machines should be
– You control which operating system they run, selected from existing Amazon Machine Images (AMIs), which are virtual machine images
– You have root access, so you can ssh into the server and do anything
Once you have machines, you can deploy onto them
Example: 1. Choosing an AMI
Example: 2. Choosing its capabilities
EC2 Instance types
Ranges from micro (i.e., free)…
– < 1 GB of memory, approximately CPU of dual GHz 2007 Opteron or 2007 Xeon processor "for short periodic bursts"
…up to M3 Double Extra Large Instance…
– Around 30 GB of RAM, and up to 13 times the compute power of a micro EC2 instance
… or to even higher amounts of RAM & CPU
Example: 3. Create a keypair (Required to load code onto EC2 instance)
Example: 4. Launch the instance (After choosing defaults for other options)
Waiting for the instance to start up
Log in via ssh
Do anything you please with the server
Summary: Comparison to GAE
Elastic Beanstalk: Similar to the GAE hosting environment
S3: Similar to GAE datastore blobs
SimpleDB: Similar to JDO on GAE datastore
EC2: Similar to GAE backends
Except that in all cases, Amazon gives you more control and more complexity