© 2009 Infosys Technologies Limited Designing scalable applications for cloud Raghavan Subramanian, Associate Vice President, Head of Cloud-computing CoE, Infosys

Presentation transcript:

© 2009 Infosys Technologies Limited Designing scalable applications for cloud Raghavan Subramanian, Associate Vice President, Head of Cloud-computing CoE, Infosys

Agenda
- Overview of scalability
- Scale-up and scale-out
- Cloud design considerations
- Key principles of scale-out design
- A few techniques for transforming a scale-up design into a scale-out design
- Scale-out apps on Windows Azure

Overview of Scalability
Scalability is a property of a system, a network, or a process that indicates its ability either to handle growing amounts of work gracefully or to be enlarged. For example, it can refer to the capability of a system to increase total throughput under increased load when resources (typically hardware) are added.

Scale up / Vertical scalability
1. Scaling up (vertical scalability) means adding more memory and CPUs to a single box.
2. Its benefits show in scenarios such as:
   - Large shared memory space
   - Many dependent threads
   - Tightly coupled internal interconnect
   - Small-scale apps

Scale out / Horizontal scalability
1. Scaling out (horizontal scalability) means adding more boxes of similar memory and CPU.
2. Its benefits show in scenarios such as:
   - Work which can be broken up into smaller tasks
   - Activity which can run independently and as an atomic unit
   - Processing of large volumes of data

Law of diminishing returns
In economics, diminishing returns (also called diminishing marginal returns) refers to how the marginal production of a factor of production starts to progressively decrease as the factor is increased.
According to this relationship, in a production system with fixed and variable inputs (say factory size and labor), each additional unit of the variable input (i.e., man-hours) yields smaller and smaller increases in output, also reducing each worker's mean productivity.
Consequently, producing one more unit of output costs increasingly more, because a larger amount of the variable input is being used to little effect. This concept is also known as the law of diminishing marginal returns or the law of increasing relative cost.

Concept of Linear Scalability
Linear scalability, relative to load or demand, means that with fixed resources, performance decreases at a constant rate relative to load or demand increases.
Linear scalability, relative to server resources, means that with a constant load or demand, performance improves at a constant rate relative to changes in resources.
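
One way to check the second definition against measurements is to compare observed speedup with the ideal, linear speedup. The sketch below is illustrative only: the function name and the throughput figures are assumptions, not numbers from the presentation.

```python
def scaling_efficiency(base_throughput, base_nodes, throughput, nodes):
    """Ratio of observed speedup to ideal (linear) speedup.

    1.0 means perfectly linear scaling; lower values indicate
    diminishing returns as resources are added.
    """
    observed_speedup = throughput / base_throughput
    ideal_speedup = nodes / base_nodes
    return observed_speedup / ideal_speedup


# Hypothetical measurements: (node count, requests per second).
measurements = [(1, 100.0), (2, 198.0), (4, 380.0), (8, 640.0)]

base_nodes, base_throughput = measurements[0]
for nodes, throughput in measurements:
    efficiency = scaling_efficiency(base_throughput, base_nodes, throughput, nodes)
    print(f"{nodes} nodes: {throughput:.0f} req/s, efficiency {efficiency:.2f}")
```

An efficiency that stays close to 1.0 as nodes are added suggests near-linear scale-out; a steady drop is the diminishing-returns curve described on the previous slide.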

Scale up vs. scale out: which one scores more?

Criterion | Scale up | Scale out
Administration effort | Easy to administer one machine. | More complex.
Hardware failure | A single failure could take out a large chunk of data. | A single failure would not impact all the data.
Infinite scalability | There is a limit to single-machine power. | Linear and infinite scale.
Fitment | For small-scale scenarios/applications. | For large-scale applications, or multiple applications with conflicting resource requirements.
Hardware cost | Buying a bigger machine is expensive. | Comes out to be cheaper: $(1 x 64-way processor) >>> $(64 x 1-way processors).
Software license cost | Seat-based licenses are cheaper; CPU-based licenses tend to be expensive. | Licensing costs can tend to be higher. Scale-out encourages usage of free software.

Cloud encourages the usage of scale-out design by solving some of the scale-out issues
- On-demand provisioning
- Infrastructure management outsourced to the cloud vendor
- Elastic scale: go up or down instantaneously
- Higher level of abstractions with fault tolerance and resilience built in
- Pay-per-use: optimize capital expenditure (hardware and software)
However, if apps don't scale within the data center, they certainly won't scale when just moved to the cloud.

Key principles of scale-out design
- Asynchronous, event-driven design
- Parallelization
  - Divide and conquer
  - MapReduce / Master-Worker
- Idempotent operations
- De-normalized, partitioned data (sharding)
- Shared-nothing architecture
- Go stateless
- Fault tolerance by redundancy and replication (design for failure)
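
As a rough illustration of the divide-and-conquer and master-worker principles listed above, the sketch below splits a word count into independent chunks, processes them in a worker pool, and merges the partial results. It is a single-machine stand-in, not the presenters' code: the function names are hypothetical, and in a real scale-out system the workers would be separate processes or machines (threads in standard CPython do not speed up CPU-bound work).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    """Divide: break the input into independent units of work."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def count_words(lines):
    """Map step: uses only its own input, no shared state (shared nothing)."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, chunk_size=2, workers=4):
    """Master: hand chunks to workers, then merge (reduce) the partial counts."""
    chunks = split_into_chunks(lines, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

Because each chunk is processed without touching shared state, adding workers changes only how many chunks run at once, not the result.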

Typical scale-out design pattern
[Figure: scale-out design pattern in three panels: Parallel, Event Driven, Fault-Tolerance. Labels: Task A to Task D; Controller; Publish; Data; Status Watch; Merge; Dead-letter Q; Compensate; Idempotency; Sharding; X, 5X, 3X.]
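
The fault-tolerance elements named in the diagram (retries, a dead-letter queue, idempotency) can be sketched with an in-memory queue. This is a minimal illustration under assumptions: the message shape (a dict with an "id" key), the function name, and the retry limit are all hypothetical and not taken from the slide.

```python
from collections import deque


def process_with_retries(messages, handler, max_attempts=3):
    """Drain a queue; retry failed messages, then park them in a dead-letter list."""
    queue = deque((msg, 1) for msg in messages)
    processed_ids = set()   # remembering handled ids makes redelivery harmless
    dead_letters = []       # messages that failed every attempt

    while queue:
        msg, attempt = queue.popleft()
        if msg["id"] in processed_ids:
            continue  # duplicate delivery: already handled, so skip it
        try:
            handler(msg)
            processed_ids.add(msg["id"])
        except Exception:
            if attempt < max_attempts:
                queue.append((msg, attempt + 1))  # re-queue for another attempt
            else:
                dead_letters.append(msg)          # give up; keep for inspection

    return processed_ids, dead_letters
```

Messages that end up in the dead-letter list are not lost; they can be inspected, compensated for, or replayed once the underlying fault is fixed.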

Techniques to transform a scale-up design to a scale-out design
- Analyze the existing code flow
  - Prepare diagrams: flowcharts, component diagrams
  - Identify logical units of work and extract them into separate components which can work independently (think contract-first design)
  - Each thread can be considered as a separate task
  - Extract common functionality to a shared service
- Make component interaction message- or document-driven
  - Consolidate public variables into messages/entities
  - Implement the DTO pattern
- Re-arrange the code flow
  - Activity and state diagrams
  - Parallelize
  - Asynchronize
- Choose NoSQL over RDBMS
  - De-normalize data to reduce firing multiple join-like queries
  - Partition data to form smaller data sets which can be sharded
- Implement message queuing to build loosely coupled components
- Question yourself, "What if this fails?" Put your failover mechanisms in place
  - Dead-letter queues
  - Retry handlers
  - Alerts/notifications
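
Two of the techniques above, consolidating state into DTO-style messages and partitioning data so it can be sharded, might look like the following sketch. The `OrderMessage` fields, the choice of `customer_id` as partition key, and the shard count are assumptions made for illustration. A SHA-256 digest is used instead of Python's built-in `hash()` because string hashes are randomized per process, which would send the same key to different shards on different machines.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderMessage:
    """DTO: everything a worker needs travels in the message, not in shared variables."""
    order_id: str
    customer_id: str
    amount_cents: int


def shard_for(key, shard_count):
    """Map a partition key to a shard with a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def partition(messages, shard_count):
    """Group messages by shard so each shard can be processed independently."""
    shards = {i: [] for i in range(shard_count)}
    for msg in messages:
        shards[shard_for(msg.customer_id, shard_count)].append(msg)
    return shards
```

Keeping all of one customer's messages on the same shard means a worker can serve that customer without join-like lookups across shards.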

Scale-out apps on Windows Azure
[Figure: Windows Azure deployment. Load balancers (LB) in front of Web Roles; Worker Roles 1 to 3 processing jobs; Storage: Queues, Tables, Blobs.]

Windows Azure components
- Compute: Web and Worker Roles
- Storage: Blobs, Tables and Queues

Best practices
- Break jobs into smaller chunks so as to avoid inefficient workload distribution
- Spawn multiple threads per role to speed up job execution
- Rationalize tasks within fewer roles to have more redundancy at the same cost
- For handling large datasets, store the dataset file in Blobs, with reference information contained in messages pushed to Queues
- For applications operating in phases, store transient messages in Queues and statuses of jobs/messages in Tables
- Use affinity groups: keep compute roles and storage closer
- For debugging, instrument code to trace code execution paths across the application
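
The large-dataset practice above (payload in Blobs, a small reference message in Queues, job status in Tables) can be sketched without any Azure SDK calls. In this illustration a dict, a deque, and another dict are in-memory stand-ins for blob, queue, and table storage; every name here is hypothetical and none of it is the Windows Azure API.

```python
import json
import uuid
from collections import deque

blob_store = {}       # stand-in for blob storage: blob name -> bytes
job_queue = deque()   # stand-in for a storage queue of small messages
job_status = {}       # stand-in for a status table: job id -> status


def submit_job(dataset):
    """Web-role side: store the payload, enqueue only a reference to it."""
    job_id = str(uuid.uuid4())
    blob_name = f"datasets/{job_id}"
    blob_store[blob_name] = dataset
    job_queue.append(json.dumps({"job_id": job_id, "blob": blob_name}))
    job_status[job_id] = "queued"
    return job_id


def run_worker_once():
    """Worker-role side: take one message, fetch the payload, record the status."""
    if not job_queue:
        return None
    message = json.loads(job_queue.popleft())
    data = blob_store[message["blob"]]
    job_status[message["job_id"]] = "done"
    return len(data)  # placeholder for real processing of the dataset
```

The queue message stays small regardless of dataset size, so any worker can pick up the job and load the payload only when it is ready to process it.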

Thank You
Authored by:
1. Bhausaheb Jadhav
2. Raghavan Subramanian
3. Sidharth Subhash Ghag
4. Sonal Arora