INFO 344 Web Tools And Development CK Wang University of Washington Spring 2014.


1 INFO 344 Web Tools And Development CK Wang University of Washington Spring 2014

2 Announcements
PA2 grading = started! Many are done.
– Let me know if you have any questions
– Please check Canvas for messages from me!
Start PA3 ASAP!
– PA2 is conceptually harder but…
– PA3 = lowest grade last quarter, “real server-side”
– Key to success = Start Early! Ask Early!

3 Announcements
Startup weekend!
– This weekend! Highly recommended!
– Help in finding jobs/internships
– http://uw.startupweekend.org/
I’ll give you Extra Credit
– Doesn’t matter where I add it but I’ll add to PA3
– +5pts if you go to demo day (Sunday)
– +20pts if you participate and write code to make something!
I’m going :D
Email me a write-up (100 words) + screenshot if you built a prototype!

4 PA2
Josh Edwards – http://coertan.azurewebsites.net/
– fit everything (list/trie)
– spell check (steve jozs) user query
Super fast, just works. Nice! I owe you lunch :)

5 Information Architecture on Azure
Recall Amazon Web Services
– EC2, S3, RDS, CloudFront, Load Balancer, etc.
Azure
– A lot of similarities
  EC2 => Virtual Machines
  S3 => Blob
  RDS => SQL Azure
  CloudFront => App Services CDN
  Load Balancer = built in

6 New Services on Azure *
* Either Azure only, or services we didn’t talk about in AWS

7 Compute
Virtual Machine = EC2
Web Site
– Easy way to host & scale websites
– Very little customization
Mobile Service (New)
– Backend for mobile apps
– Database, Auth, Push to client, Scheduled jobs

8 Compute
Cloud Service
– Web Role (user-facing instance)
  OnStart() – initialization, such as copying files (e.g. wiki titles) and reading them into a Trie, so your instance starts with everything ready!
– Worker Role (off-line processing)
  OnStart() – initialization
  Run() – do work here
Open VS and try this!
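As a rough illustration of the OnStart() idea above (load the titles once, so that queries are served from memory), here is a minimal Python sketch. It is not the Azure role API: `on_start`, `Trie`, and `starts_with` are hypothetical names, and lower-casing the titles is an assumption made for this example.

```python
class Trie:
    """Minimal prefix tree; each node is a dict keyed by character."""

    _END = None  # sentinel key marking "a title ends at this node"

    def __init__(self):
        self.root = {}

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[self._END] = True

    def starts_with(self, prefix, limit=10):
        """Return up to `limit` stored titles beginning with `prefix`, in sorted order."""
        node = self.root
        for ch in prefix:
            node = node.get(ch)
            if node is None:
                return []
        results = []
        self._collect(node, prefix, results, limit)
        return results

    def _collect(self, node, text, results, limit):
        if len(results) >= limit:
            return
        if self._END in node:
            results.append(text)
        for ch in sorted(k for k in node if k is not self._END):
            self._collect(node[ch], text + ch, results, limit)


def on_start(title_lines):
    """Hypothetical start-up hook: read every title into the trie once."""
    trie = Trie()
    for line in title_lines:
        title = line.strip().lower()  # assumption: queries are case-insensitive
        if title:
            trie.insert(title)
    return trie
```

In a real role the lines would come from the file copied during start-up, and the resulting trie would be kept in memory for the lifetime of the instance.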

9 Storage
Blob = same as S3
SQL Azure = same as RDS

10 Table Storage
Large key-value store
– Key = {PartitionKey, RowKey}
– Put different partitions on different machines = improve throughput
Read more - http://msdn.microsoft.com/en-us/library/windowsazure/hh508997.aspx

11 Choosing PartitionKey/RowKey
Clustered index (unique) {PartitionKey, RowKey}
Choose carefully depending on query patterns
For example: Amazon Books {Name, Price, PublishDate}
– Query by date (get all books from 02/10/13 – 04/10/13)
  PartitionKey = year, RowKey = month+date
– Query by Price (get all books between 10-30 USD)
  PartitionKey = price, RowKey = name
– Query by Name
  PartitionKey = name
– Query by Category (get all horror books)
  PartitionKey = category, RowKey = name
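The query-by-date choice above can be sketched with an in-memory stand-in for a table. This is not the Azure Table Storage SDK: `BookTable`, `stamp`, and `query_by_date` are hypothetical names. One assumption goes beyond the slide: the book name is appended to the month+date RowKey, because {PartitionKey, RowKey} must be unique and two books can share a publish date.

```python
from collections import defaultdict
from datetime import date


def stamp(d):
    """Format a date as a sortable YYYYMMDD string."""
    return f"{d.year:04d}{d.month:02d}{d.day:02d}"


class BookTable:
    """In-memory stand-in: partition key -> {row key -> entity}."""

    def __init__(self):
        self.partitions = defaultdict(dict)

    def insert(self, name, price, publish_date):
        partition_key = f"{publish_date.year:04d}"  # PartitionKey = year
        # RowKey = month+date; the name suffix is an assumption to keep keys unique
        row_key = f"{publish_date.month:02d}{publish_date.day:02d}_{name}"
        self.partitions[partition_key][row_key] = {
            "name": name,
            "price": price,
            "publish_date": publish_date,
        }

    def query_by_date(self, start, end):
        """Scan only the year partitions the range touches, in row-key order."""
        low, high = stamp(start), stamp(end)
        names = []
        for year in range(start.year, end.year + 1):
            partition_key = f"{year:04d}"
            rows = self.partitions.get(partition_key, {})
            for row_key in sorted(rows):
                if low <= partition_key + row_key[:4] <= high:
                    names.append(rows[row_key]["name"])
        return names
```

The point of the design is that a date-range query touches only the partitions for the years involved, and inside a partition the rows are already ordered by month and day.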

12 Queue Storage
Message passing between
– Web
– Worker
– Worker
Off-line jobs. Off-line => async, does not block user
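The "async, does not block user" point can be illustrated with Python's standard `queue` and `threading` modules standing in for Queue Storage and a worker role. This is only a local sketch: `handle_request` and `worker_loop` are hypothetical names, and using `None` as a stop signal is an assumption of the example, not an Azure convention.

```python
import queue
import threading


def handle_request(job_queue, url):
    """Web side: enqueue the job and return right away without doing the work."""
    job_queue.put(url)
    return "accepted"


def worker_loop(job_queue, results):
    """Worker side: pull messages off the queue and process them off-line."""
    while True:
        url = job_queue.get()
        if url is None:  # assumed stop signal for this sketch
            job_queue.task_done()
            break
        results.append(f"processed {url}")
        job_queue.task_done()


def run_demo(urls):
    """Wire one web-side producer to one worker thread and return the results."""
    job_queue = queue.Queue()
    results = []
    worker = threading.Thread(target=worker_loop, args=(job_queue, results))
    worker.start()
    for url in urls:
        handle_request(job_queue, url)
    job_queue.put(None)
    worker.join()
    return results
```

With a single worker the messages are handled in the order they were enqueued; with several workers that ordering would no longer be guaranteed.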

13 HDInsight
Similar to AWS Elastic MapReduce
Big Data processing
MapReduce – programming model to process large data sets in parallel, distributed across machines in a cluster
Algorithm
– Map: distribute input via PartitionKey to N machines
– Reduce: collect the output of the Map step and combine it into the final output
For example:
– Sort 10 billion numbers
– Map step: key = number, map to 10 machines, 0-1B => #1, etc.
optional
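The sorting example above can be sketched on a single machine, with lists standing in for the 10 machines. This is a simplification and not Hadoop code: `map_step`, `reduce_step`, and `mapreduce_sort` are hypothetical names, and the sketch assumes non-negative integers no larger than `max_value`.

```python
def map_step(numbers, machine_count, max_value):
    """Send each number to the "machine" that owns its value range."""
    # Width of each range, rounded up so max_value still lands on the last machine.
    width = (max_value + machine_count) // machine_count
    buckets = [[] for _ in range(machine_count)]
    for n in numbers:
        buckets[n // width].append(n)
    return buckets


def reduce_step(buckets):
    """Sort each machine's share, then concatenate the shares in machine order."""
    output = []
    for bucket in buckets:
        output.extend(sorted(bucket))
    return output


def mapreduce_sort(numbers, machine_count, max_value):
    return reduce_step(map_step(numbers, machine_count, max_value))
```

Because every value on machine #1 is smaller than every value on machine #2, and so on, the locally sorted shares can simply be concatenated; no global merge is needed.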

14 HDInsight
Let’s try it! Your first big-data processing?
http://www.windowsazure.com/en-us/documentation/articles/hdinsight-get-started/
– Go to Azure portal, create HDInsight cluster, 1 data node, password = “Password344!”, WestUS; make sure you use the same storage account as where your wiki data set is for PA2
– Install Microsoft Web Platform (http://go.microsoft.com/fwlink/p/?linkid=320376&clcid=0x409)
– Load Windows Azure PowerShell, type in “Add-AzureAccount” and log in
– Type the following commands to run a job & download to /example/data/…
Part 2: Run this on your wiki dataset!
# Subscription and cluster to run against (replace with your own)
$subscriptionName = "azpad286JYQ7806"
$clusterName = "chunkaiwinfo344"
# Define the word-count MapReduce job: jar, class, input file, output folder
$wordCountJobDefinition = New-AzureHDInsightMapReduceJobDefinition -JarFile "wasb:///example/jars/hadoop-examples.jar" -ClassName "wordcount" -Arguments "wasb:///example/data/gutenberg/davinci.txt", "wasb:///example/data/WordCountOutput"
# Submit the job, wait up to an hour for it, then print its standard error
Select-AzureSubscription $subscriptionName
$wordCountJob = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $wordCountJobDefinition
Wait-AzureHDInsightJob -Job $wordCountJob -WaitTimeoutInSeconds 3600
Get-AzureHDInsightJobOutput -Cluster $clusterName -JobId $wordCountJob.JobId -StandardError
# Download the result blob from the cluster's storage container
$storageAccountName = "uwinfo344"
$containerName = "chunkaiwinfo344-1"
Select-AzureSubscription $subscriptionName
$storageAccountKey = Get-AzureStorageKey $storageAccountName | %{ $_.Primary }
$storageContext = New-AzureStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey
Get-AzureStorageBlobContent -Container $containerName -Blob example/data/WordCountOutput/part-r-00000 -Context $storageContext -Force
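To get a feel for what a word-count job computes before running it on a cluster, here is a small local sketch of the map, shuffle, and reduce phases in Python. It is not the Hadoop example itself: the function names are hypothetical, and splitting on whitespace without normalizing case is an assumption about how the sample job tokenizes text.

```python
from collections import defaultdict


def map_words(line):
    """Map: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all the emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_counts(grouped):
    """Reduce: sum the values for each word, keeping the keys in sorted order."""
    return {word: sum(counts) for word, counts in sorted(grouped.items())}


def word_count(lines):
    pairs = [pair for line in lines for pair in map_words(line)]
    return reduce_counts(shuffle(pairs))
```

For Part 2, the same three phases apply to the wiki dataset; only the input changes.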

15 Questions?

