Austin Donnelly | July 2010.

Simulation and data analysis with Austin Donnelly | July 2010.
Presentation transcript:

Austin Donnelly | July 2010

BIG DATA: automated observations of the world

Endcap for CMS (Compact Muon Solenoid), Large Hadron Collider. 2 million hair-like wires for capture; 310,000 channels of fast detectors. 1 GB/s in proton mode, 2 GB/s in heavy-ion mode: 15 petabytes/year. Source: http://www.psl.wisc.edu/projects/large/cms Data flow graphic: http://cerncourier.com/cws/article/cern/31529

BIG SIMULATIONS: machine-generated data

New VW Polo wind tunnel airflow simulation

Simulations: pool fire simulation, 2040 nodes on Sandia National Lab’s Red Storm supercomputer (from SC05) http://www.sandia.gov/NNSA/ASC/pubs/sc05/prog_posters/Poster-Visualization-halfsize.pdf

HUMAN MACHINES: the unwitting cyborg

von Ahn and Dabbish, 2004. Extension by MSFT: http://research.microsoft.com/apps/pubs/default.aspx?id=70638

1770: by Wolfgang von Kempelen. 1820: unmasked by Londoner Robert Willis. 1854: destroyed by fire.

Cloud Computing Resources. What for? Statistical analysis; simulation; Mechanical Turk / ESP Game. Where from? Departmental cluster; project based; Windows Azure.

Windows Azure

Windows Azure. Key features: scalable compute; scalable storage; pay-as-you-go (CPU, disk, network); higher-level API (PaaS).

Cloud models. Software as a Service (“SaaS”): consume it. Platform as a Service (“PaaS”): build on it. Infrastructure as a Service (“IaaS”): migrate to it. Example workloads: Email, Application Development, Caching, CRM, Networking, Collaborative, Decision Support, Security, File, Web, ERP, Technical, Streaming, System Mgmt.

Your Applications, built on: Service Bus, Workflow, Database, Analytics, Access Control, …, Reporting, Data Sync, Compute, Storage, Manage, …

MANAGE

Declarative Services. Diagram labels: LB, Web Roles, Worker Roles, Storage.

Fabric Controller. Diagram labels: highly-available Fabric Controller; node (can be a VM or a physical machine) with WS08 hypervisor, control VM, control agent, VMs and service roles; load-balancers; switches. In-band communication: software control. Out-of-band communication: hardware control.

Hardware specs. 64-bit Windows Server 2008. Choose from four different VM sizes:
S: 1x 1.6GHz, medium IO, 1.75GB / 250GB
M: 2x 1.6GHz, high IO, 3.5GB / 500GB
L: 4x 1.6GHz, high IO, 7GB / 1000GB
XL: 8x 1.6GHz, high IO, 14GB / 2000GB
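The size list can be treated as a small lookup table when deciding which VM a job needs. The sketch below is illustrative only: it assumes the two figures after the IO rating are memory and local disk, and the `VMSize` type and `smallest_fit` helper are hypothetical names, not part of any Azure API.

```python
from typing import NamedTuple, Optional


class VMSize(NamedTuple):
    name: str
    cores: int        # number of 1.6GHz cores
    memory_gb: float  # assumed: first figure of the "x / y" pair
    disk_gb: int      # assumed: second figure of the "x / y" pair


# Figures copied from the slide, ordered smallest to largest.
VM_SIZES = [
    VMSize("S", 1, 1.75, 250),
    VMSize("M", 2, 3.5, 500),
    VMSize("L", 4, 7.0, 1000),
    VMSize("XL", 8, 14.0, 2000),
]


def smallest_fit(cores: int, memory_gb: float, disk_gb: int = 0) -> Optional[VMSize]:
    """Return the smallest listed size meeting every requirement, or None."""
    for size in VM_SIZES:
        if (size.cores >= cores
                and size.memory_gb >= memory_gb
                and size.disk_gb >= disk_gb):
            return size
    return None
```

With pay-as-you-go pricing, picking the smallest size that fits is usually the sensible default; a `None` result means the job has to be split across several instances.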

STORAGE: blobs, queues, tables

Blobs. URL format: http://<Account>.blob.core.windows.net/<Container>/<BlobName>
Example: Account: sally; Container: music; BlobName: rock/rush/xanadu.mp3
URL: http://sally.blob.core.windows.net/music/rock/rush/xanadu.mp3
Hierarchy (Account > Container > Blob): sally > pictures > IMG001.JPG, IMG002.JPG; sally > movies > MOV1.AVI
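The URL scheme on this slide is simple enough to build by hand. A minimal sketch follows; `blob_url` is a hypothetical helper, not a storage client library function, and percent-encoding the blob name is an added assumption.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build a blob URL following the pattern shown on the slide."""
    # Keep "/" unescaped so names such as rock/rush/xanadu.mp3
    # retain their path-like appearance.
    encoded_name = quote(blob_name, safe="/")
    return f"http://{account}.blob.core.windows.net/{container}/{encoded_name}"


print(blob_url("sally", "music", "rock/rush/xanadu.mp3"))
```

The "/" characters in a blob name only look like directories; the container itself is a flat namespace.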

Blobs. Block Blob vs. Page Blob; snapshots; copy; xDrive. Geo-replication: Dublin, Amsterdam, Chicago, Texas, Singapore, Hong Kong. CDN: 18 global locations.

Azure Queues
Operations: PutMessage, GetMessage (Timeout), RemoveMessage.
Diagram: Web Role and Worker Roles exchanging messages (Msg 1 to Msg 4) through a Queue.
Requests:
  POST http://myaccount.queue.core.windows.net/myqueue/messages
  DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Response:
  HTTP/1.1 200 OK
  Transfer-Encoding: chunked
  Content-Type: application/xml
  Date: Tue, 09 Dec 2008 21:04:30 GMT
  Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

  <?xml version="1.0" encoding="utf-8"?>
  <QueueMessagesList>
    <QueueMessage>
      <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
      <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
      <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
      <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
      <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
      <MessageText>PHRlc3Q+dG...dGVzdD4=</MessageText>
    </QueueMessage>
  </QueueMessagesList>
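The three operations and the PopReceipt / TimeNextVisible fields suggest a "get, work, then delete" protocol with an invisibility timeout. The toy in-memory model below is a sketch of that idea under stated assumptions: `ToyQueue` and its methods are hypothetical, time is passed in explicitly as `now` to keep the example deterministic, and details such as rejecting a stale pop receipt are modelling choices rather than a statement of the service's exact behaviour.

```python
import uuid
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass
class Message:
    text: str
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    pop_receipt: Optional[str] = None
    next_visible: float = 0.0  # time at which the message becomes visible again


class ToyQueue:
    """In-memory stand-in for a queue with invisibility timeouts."""

    def __init__(self) -> None:
        self._messages: List[Message] = []

    def put_message(self, text: str) -> str:
        msg = Message(text)
        self._messages.append(msg)
        return msg.message_id

    def get_message(self, now: float, timeout: float) -> Optional[Message]:
        """Return the first visible message and hide it for `timeout` seconds."""
        for msg in self._messages:
            if msg.next_visible <= now:
                msg.next_visible = now + timeout
                msg.pop_receipt = str(uuid.uuid4())  # fresh receipt per delivery
                return replace(msg)  # snapshot, so callers keep their own receipt
        return None

    def remove_message(self, message_id: str, pop_receipt: str) -> bool:
        """Delete only if the caller holds the most recent pop receipt."""
        for i, msg in enumerate(self._messages):
            if msg.message_id == message_id and msg.pop_receipt == pop_receipt:
                del self._messages[i]
                return True
        return False
```

If a worker crashes before calling `remove_message`, the message reappears once the timeout passes and another worker picks it up, which is why work items should be safe to run more than once.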

Tables. Simple entity store, designed for billions of rows. Entity is a set of properties; PartitionKey, RowKey, Timestamp are required. (PartitionKey, RowKey) defines the key: PartitionKey controls the scaling and locality; RowKey provides uniqueness.

Partitions. Table = Movies
PartitionKey (Genre) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
Animation | Open Season 2 | … |
Animation | The Ant Bully | … | 2006
Comedy | Office Space | … | 1999
SciFi | X-Men Origins: Wolverine | … | 2009
War | Defiance | … | 2008
Partition ranges: Server A serves [Action - Comedy), i.e. Action and Animation; Server B serves [Comedy - Western), i.e. Comedy, SciFi and War.
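The half-open ranges on this slide can be sketched as a sorted list of lower bounds searched with bisection. This is only an illustration of range partitioning on PartitionKey: the names `RANGE_STARTS`, `server_for` and `titles_in_partition` are hypothetical, plain string comparison stands in for whatever ordering the service uses, and only rows whose release date appears on the slide are included.

```python
import bisect
from typing import List

# Entities keyed by (PartitionKey, RowKey); the value is the ReleaseDate.
MOVIES = {
    ("Action", "Fast & Furious"): 2009,
    ("Action", "The Bourne Ultimatum"): 2007,
    ("Animation", "The Ant Bully"): 2006,
    ("Comedy", "Office Space"): 1999,
    ("SciFi", "X-Men Origins: Wolverine"): 2009,
    ("War", "Defiance"): 2008,
}

# Range i covers [RANGE_STARTS[i], next start), the last one ends at RANGE_END.
RANGE_STARTS = ["Action", "Comedy"]
RANGE_SERVERS = ["Server A", "Server B"]
RANGE_END = "Western"


def server_for(partition_key: str) -> str:
    """Find the server whose half-open key range contains partition_key."""
    if not (RANGE_STARTS[0] <= partition_key < RANGE_END):
        raise KeyError(f"no partition range covers {partition_key!r}")
    index = bisect.bisect_right(RANGE_STARTS, partition_key) - 1
    return RANGE_SERVERS[index]


def titles_in_partition(partition_key: str) -> List[str]:
    """All RowKeys in one partition: a query that touches a single server."""
    return sorted(title for (genre, title) in MOVIES if genre == partition_key)
```

Queries that name a PartitionKey stay on one server, so choosing the key decides both how well the table spreads out and which lookups are cheap.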

Tables. What tables don’t do: not relational; no referential integrity; no joins; limited queries; no group by; no aggregations; no transactions. What tables can do: cheap; very scalable; flexible; durable.

Scalability targets. 100TB storage per account (can ask for more). Blobs: 200GB max block-blob size; 1TB max page-blob size. Tables: max 255 properties, totalling 1MB. Queues: 8KB messages, 1 week max age.

TACTICS

HPC jobs. Use worker roles; maybe web-role as front-end. Good for parameter sweeps. Increase the invisibility time (max 2hrs).
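A parameter sweep maps naturally onto queue messages: one message per parameter combination, each picked up by a worker role. The sketch below is an assumption-laden illustration; `sweep_messages` is a hypothetical helper, JSON is an arbitrary choice of encoding, and the size check uses the 8KB message limit from the scalability targets while ignoring any transport encoding overhead.

```python
import itertools
import json
from typing import Dict, Iterator, List

MAX_MESSAGE_BYTES = 8 * 1024  # queue message limit quoted in the scalability targets


def sweep_messages(parameters: Dict[str, List]) -> Iterator[str]:
    """Yield one JSON task description per combination of parameter values."""
    names = sorted(parameters)
    for values in itertools.product(*(parameters[name] for name in names)):
        body = json.dumps(dict(zip(names, values)), sort_keys=True)
        if len(body.encode("utf-8")) > MAX_MESSAGE_BYTES:
            # Large inputs belong in a blob; the message should only point at them.
            raise ValueError("task description too large for one queue message")
        yield body


# Hypothetical sweep: 3 mesh resolutions x 2 Reynolds numbers = 6 tasks.
tasks = list(sweep_messages({"mesh": ["coarse", "fine", "ultra"],
                             "reynolds": [100, 1000]}))
```

Each worker should set the invisibility time longer than its expected run time, otherwise the same task is handed to a second worker while the first is still computing.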

Interpreters. Python, Perl etc.; IronPython. Remember to upload runtime DLLs. Think about security!

Data management. Blobs for large input files: upload may take a while, hopefully one-off (storage explorers: http://blogs.msdn.com/b/windowsazurestorage/archive/2010/04/17/windows-azure-storage-explorers.aspx). Dump outputs to a blob. Reduce output to graphable size.
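"Reduce output to graphable size" can be as simple as averaging fixed-size buckets before downloading results. The sketch below shows one such reduction; `downsample` is a hypothetical helper and bucket averaging is just one option (keeping the minimum and maximum of each bucket is another when spikes matter).

```python
from statistics import mean
from typing import List, Sequence


def downsample(values: Sequence[float], max_points: int) -> List[float]:
    """Reduce a long series to at most max_points bucket averages."""
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    n = len(values)
    if n <= max_points:
        return list(values)
    bucket = -(-n // max_points)  # ceiling division: items per bucket
    return [mean(values[i:i + bucket]) for i in range(0, n, bucket)]


print(downsample(list(range(10)), 3))  # three bucket means
```

Doing the reduction in the worker role, next to the data, means only the small summary has to leave the data centre.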

Azure MODIS

Azure MODIS implementation

DATA ANALYSIS

Data curation. Where did your data come from? How was it processed? Do you have the original, master data? Can you regenerate derived data? Keep the data. Keep the code. Use a revision control system.
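One lightweight way to answer these questions later is to store a provenance record next to every derived file. The sketch below is a suggestion rather than anything from the talk: `provenance_record` and its field names are hypothetical, and the revision string is whatever identifier your revision control system provides.

```python
import hashlib
import json
from typing import Any, Dict


def provenance_record(source: str, raw_bytes: bytes,
                      code_revision: str, parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Describe how a derived data set was produced, so it can be regenerated."""
    return {
        "source": source,                                      # where the data came from
        "input_sha256": hashlib.sha256(raw_bytes).hexdigest(), # which master data was used
        "code_revision": code_revision,                        # which version of the code ran
        "parameters": parameters,                              # how it was processed
    }


record = provenance_record("sensor-17", b"abc", "r1234", {"window": 5})
print(json.dumps(record, indent=2, sort_keys=True))
```

Comparing the stored hash with the hash of the master file is a quick check that the derived data still corresponds to the original.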

Accuracy vs. Precision. Figure: 2x2 grid of targets (accurate / not accurate by precise / not precise), with X marks showing where the shots land.
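The distinction can be made numeric: accuracy is how far the average measurement sits from the true value, precision is how tightly repeated measurements cluster. The sketch below uses bias and sample standard deviation as stand-ins for the two; `accuracy_and_precision` is a hypothetical helper and the measurements are made up for illustration.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def accuracy_and_precision(measurements: Sequence[float],
                           true_value: float) -> Tuple[float, float]:
    """Return (bias, spread): small bias = accurate, small spread = precise."""
    bias = mean(measurements) - true_value  # systematic offset from the truth
    spread = stdev(measurements)            # sample standard deviation
    return bias, spread


# Precise but not accurate: tightly clustered, but around the wrong value.
print(accuracy_and_precision([12.0, 12.1, 11.9], true_value=10.0))
# Accurate but not precise: centred on the truth, but widely scattered.
print(accuracy_and_precision([8.0, 12.0, 10.0], true_value=10.0))
```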

Common mistakes in eval 1/2 (from Jain, pg 17)
- No goals, or biased goals (them vs. us)
- Unsystematic approach: don’t just measure stuff at random
- Analysis without understanding the problem: up to 40% of effort might be in defining problems
- Incorrect metrics: right metric is not always the convenient one
- Wrong workload
- Wrong technique: measurement, simulation, emulation, analytics?
- Missed parameter or factor
- Bad experimental design, e.g. factors which interact not being varied sensibly together
- Wrong level of detail

Common mistakes in eval 2/2
- No analysis: measurement is not the endgame
- Bad analysis
- No sensitivity analysis
- Ignoring errors
- Outliers: let the wrong ones in
- Assume no changes in the future
- Ignore variability: mean is good enough
- Too complex a model
- Bad presentation of results
- Ignore social aspects
- Omit assumptions and limitations
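The "mean is good enough" and outlier mistakes are easy to demonstrate: report spread and a confidence interval alongside the mean, and compare mean with median. The sketch below is a rough illustration; `summarize` is a hypothetical helper, and the interval uses a normal approximation, which is an assumption that is too optimistic for small samples (a Student t interval would be wider).

```python
import math
from statistics import NormalDist, mean, median, stdev
from typing import Dict, Sequence


def summarize(samples: Sequence[float], confidence: float = 0.95) -> Dict[str, object]:
    """Summarise repeated measurements without hiding their variability."""
    n = len(samples)
    m = mean(samples)
    s = stdev(samples)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * s / math.sqrt(n)
    return {
        "n": n,
        "mean": m,
        "median": median(samples),
        "stdev": s,
        "ci": (m - half_width, m + half_width),
    }


# Made-up run times in seconds; one run hit an outlier.
summary = summarize([10, 11, 9, 10, 50])
print(summary)
```

A mean far from the median, or a confidence interval that is wide relative to the mean, is a signal to look at the raw runs before quoting a single number.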

Steps for a good eval
1. State goals, define boundaries
2. Select metrics
3. List system and workload parameters
4. Select factors and their values
5. Select evaluation technique
6. Select workload
7. Design and run experiments
8. Analyse and interpret the data
9. Present results
Iterate if needed.

Books

http://www.azure.com/ THANKS!