Research Directions in Internet-scale Computing Manweek 3rd International Week on Management of Networks and Services San Jose, CA Randy H. Katz

Presentation transcript:

Research Directions in Internet-scale Computing Manweek 3rd International Week on Management of Networks and Services San Jose, CA Randy H. Katz 29 October 2007

2 Growth of the Internet Continues … billion in 2Q % of world population 225% growth

3 Mobile Device Innovation Accelerates …
Close to 1 billion cell phones will be produced in 2007

4 These are Actually Network-Connected Computers!

5 Announcements by Microsoft and Google
Microsoft and Google race to build next-gen DCs
– Microsoft announces a $550 million DC in TX
– Google confirms plans for a $600 million site in NC
– Google plans two more DCs in SC; may cost another $950 million -- about 150,000 computers each
Internet DCs are a new computing platform
Power availability drives deployment decisions

6 Internet Datacenters as Essential Net Infrastructure

7

8 Datacenter is the Computer
Google program == Web search, Gmail, …
Google computer == Warehouse-sized facilities and workloads likely more common
Luiz Barroso’s talk at RAD Lab 12/11/06
Sun Project Blackbox 10/17/06
Compose datacenter from 20 ft. containers!
– Power/cooling for 200 KW
– External taps for electricity, network, cold water
– 250 Servers, 7 TB DRAM, or 1.5 PB disk in 2006
– 20% energy savings
– 1/10th? cost of a building

9 “Typical” Datacenter Network Building Block

10 Computers + Net + Storage + Power + Cooling

11 Datacenter Power Issues
[Figure: power distribution hierarchy. Labels: Main Supply, Transformer, Generator, ATS, Switch Board, UPS, STS, PDU, Panel, Rack Circuit; capacities shown: 1000 kW, 200 kW, 50 kW, 2.5 kW (rack circuit)]
Typical structure: 1 MW Tier-2 datacenter
Reliable Power
– Mains + Generator
– Dual UPS
Units of Aggregation
– Rack (10-80 nodes)
– PDU (20-60 racks)
– Facility/Datacenter
X. Fan, W-D Weber, L. Barroso, “Power Provisioning for a Warehouse-sized Computer,” ISCA’07, San Diego, (June 2007).

12 Nameplate vs. Actual Peak
Component      Peak Power   Count   Total
CPU            40 W         2       80 W
Memory         9 W          4       36 W
Disk           12 W         1       12 W
PCI Slots      25 W         2       50 W
Mother Board   25 W         1       25 W
Fan            10 W         1       10 W
System Total                        213 W
Nameplate peak: 213 W
Measured peak (power-intensive workload): 145 W
In Google’s world, for given DC power budget, deploy (and use) as many machines as possible
X. Fan, W-D Weber, L. Barroso, “Power Provisioning for a Warehouse-sized Computer,” ISCA’07, San Diego, (June 2007).
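
The gap between nameplate and measured peak translates directly into how many machines fit under a fixed power budget. The following is a minimal sketch of that arithmetic, using the per-component totals and the 145 W measured peak from the slide. The 1 MW budget is an assumption chosen for illustration and is treated as available to servers only (no cooling or distribution losses); the function names are hypothetical.

```python
# Per-component nameplate totals in watts, taken from the slide's table.
NAMEPLATE_TOTALS_W = {
    "CPU": 80,
    "Memory": 36,
    "Disk": 12,
    "PCI slots": 50,
    "Motherboard": 25,
    "Fan": 10,
}
MEASURED_PEAK_W = 145  # measured under a power-intensive workload (slide)


def nameplate_total(totals):
    """Sum the per-component nameplate totals for one server."""
    return sum(totals.values())


def servers_within_budget(budget_w, per_server_w):
    """Whole servers that fit under a power budget at a given per-server draw."""
    return budget_w // per_server_w


BUDGET_W = 1_000_000  # assumed budget, not a figure from the talk
by_nameplate = servers_within_budget(BUDGET_W, nameplate_total(NAMEPLATE_TOTALS_W))
by_measured = servers_within_budget(BUDGET_W, MEASURED_PEAK_W)
print(nameplate_total(NAMEPLATE_TOTALS_W), by_nameplate, by_measured)
```

Under these assumptions, provisioning against measured peak rather than nameplate admits roughly 47% more machines in the same budget.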

13 Typical Datacenter Power
The larger the machine aggregate, the less likely its machines are to be operating near peak power simultaneously
X. Fan, W-D Weber, L. Barroso, “Power Provisioning for a Warehouse-sized Computer,” ISCA’07, San Diego, (June 2007).

14 FYI--Network Element Power
A 96 x 1 Gbit port Cisco datacenter switch consumes around 15 kW -- equivalent to about 100 typical dual-processor Google servers at 145 W each
High port density drives network element design, but such high power density makes it difficult to tightly pack them with servers
Is an alternative distributed processing/communications topology possible?
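
The "100x" figure can be checked with a quick calculation. This sketch uses only the numbers on the slide (15 kW, 96 ports, 145 W per server); the function names are hypothetical.

```python
SWITCH_POWER_W = 15_000  # 96 x 1 Gbit port datacenter switch (slide)
SWITCH_PORTS = 96
SERVER_POWER_W = 145     # measured peak of a dual-processor server (slide)


def server_equivalents(switch_w, server_w):
    """How many servers draw the same power as one switch."""
    return switch_w / server_w


def watts_per_port(switch_w, ports):
    """Average switch power attributable to each port."""
    return switch_w / ports


print(round(server_equivalents(SWITCH_POWER_W, SERVER_POWER_W)))  # about 103
print(watts_per_port(SWITCH_POWER_W, SWITCH_PORTS))               # 156.25
```

On these figures, the switch power attributable to a single port exceeds the measured peak of the server attached to it.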

15 Energy Expense Dominates

16 Climate Savers Initiative
Improving the efficiency of power delivery to computers as well as usage of power by computers
– Transmission: 9% of energy is lost before it even gets to the datacenter
– Distribution: 5-20% efficiency improvements possible using high voltage DC rather than low voltage AC
– Chill air to mid 50s (°F) vs. low 70s (°F) to deal with the unpredictability of hot spots

17 DC Energy Conservation
DCs limited by power
– For each dollar spent on servers, add $0.48 (2005)/$0.71 (2010) for power/cooling
– $26B spent to power and cool servers in 2005, expected to grow to $45B in 2010
Intelligent allocation of resources to applications
– Load balance power demands across DC racks, PDUs, clusters
– Distinguish between user-driven apps that are processor intensive (search) or data intensive (mail) vs. backend batch-oriented (analytics)
– Save power when peak resources are not needed by shutting down processors, storage, network elements

18 Power/Cooling Issues

19 Thermal Image of Typical Cluster Rack
[Figure: thermal image of a rack; label: Rack Switch]
M. K. Patterson, A. Pratt, P. Kumar, “From UPS to Silicon: an end-to-end evaluation of datacenter efficiency”, Intel Corporation

20 DC Networking and Power
Within DC racks, network equipment often the “hottest” components in the hot spot
Network opportunities for power reduction
– Transition to higher speed interconnects (10 Gb/s) at DC scales and densities
– High function/high power assists embedded in network element (e.g., TCAMs)

21 DC Networking and Power
Selectively sleep ports/portions of net elements
Enhanced power-awareness in the network stack
– Power-aware routing and support for system virtualization
    Support for datacenter “slice” power down and restart
– Application and power-aware media access/control
    Dynamic selection of full/half duplex
    Directional asymmetry to save power, e.g., 10 Gb/s send, 100 Mb/s receive
– Power-awareness in applications and protocols
    Hard state (proxying), soft state (caching), protocol/data “streamlining” for power as well as b/w reduction
Power implications for topology design
– Tradeoffs in redundancy/high-availability vs. power consumption
– VLANs support for power-aware system virtualization

22 Bringing Resources On-/Off-line
Save power by taking DC “slices” off-line
– Resource footprint of Internet applications hard to model
– Dynamic environment, complex cost functions require measurement-driven decisions
– Must maintain Service Level Agreements, no negative impacts on hardware reliability
– Pervasive use of virtualization (VMs, VLANs, VStor) makes feasible rapid shutdown/migration/restart
Recent results suggest that conserving energy may actually improve reliability
– MTTF: stress of on/off cycle vs. benefits of off-hours
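
One way to picture a measurement-driven on-/off-line decision is a capacity plan driven by a load forecast. The sketch below is illustrative only: the slide does not specify a policy, so the fixed headroom rule, the per-server capacity figure, and all names here are assumptions. It also ignores migration cost and on/off cycle stress, which the slide flags as real constraints.

```python
def servers_needed(predicted_rps, per_server_rps, headroom_pct=20):
    """Servers to keep on-line so predicted load plus headroom fits capacity.

    Integer ceiling division avoids floating-point rounding surprises.
    """
    demand = predicted_rps * (100 + headroom_pct)
    capacity = per_server_rps * 100
    return max(1, -(-demand // capacity))


def plan_slices(forecast_rps, per_server_rps, total_servers, headroom_pct=20):
    """For each forecast interval, return (online, offline) server counts."""
    plan = []
    for rps in forecast_rps:
        online = min(total_servers,
                     servers_needed(rps, per_server_rps, headroom_pct))
        plan.append((online, total_servers - online))
    return plan


# Hypothetical forecast: night, daytime, and a peak that exceeds the cluster.
plan = plan_slices([200, 1000, 4000], per_server_rps=100, total_servers=40)
print(plan)
```

The headroom term stands in for the SLA constraint: it keeps spare capacity on-line so that a forecast error does not immediately translate into missed service levels.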

23 “System” Statistical Machine Learning
S²ML Strengths
– Handle SW churn: Train vs. write the logic
– Beyond queuing models: Learns how to handle/make policy between steady states
– Beyond control theory: Coping with complex cost functions
– Discovery: Finding trends, needles in data haystack
– Exploit cheap processing advances: fast enough to run online
S²ML as an integral component of DC OS

24 Datacenter Monitoring
To build models, S²ML needs data to analyze -- the more the better!
Huge technical challenge: trace 10K++ nodes within and between DCs
– From applications across application tiers to enabling services
– Across network layers and domains

25 RIOT: RadLab Integrated Observation via Tracing Framework
Trace connectivity of distributed components
– Capture causal connections between requests/responses
Cross-layer
– Include network and middleware services such as IP and LDAP
Cross-domain
– Multiple datacenters, composed services, overlays, mash-ups
– Control to individual administrative domains
“Network path” sensor
– Put individual requests/responses, at different network layers, in the context of an end-to-end request

26 X-Trace: Path-based Tracing
Simple and universal framework
– Building on previous path-based tools
– Ultimately, every protocol and network element should support tracing
Goal: end-to-end path traces with today’s technology
– Across the whole network stack
– Integrates different applications
– Respects Administrative Domains’ policies
Rodrigo Fonseca, George Porter

27 Example: Wikipedia
Many servers, four worldwide sites
A user gets a stale page: What went wrong?
Four levels of caches, network partition, misconfiguration, …
[Figure: DNS Round-Robin, 33 Web Caches, 4 Load Balancers, 105 HTTP + App Servers, 14 Database Servers]
Rodrigo Fonseca, George Porter

28 Task
Specific system activity in the datapath
– E.g., sending a message, fetching a file
Composed of many operations (or events)
– Different abstraction levels
– Multiple layers, components, domains
[Figure: task graph across HTTP Client, HTTP Proxy, HTTP Server, TCP 1 Start/End, TCP 2 Start/End, and IP routers]
Task graphs can be named, stored, and analyzed
Rodrigo Fonseca, George Porter

29 Basic Mechanism
All operations: same TaskId
Each operation: unique OpId
Propagate on edges: [TaskId, OpId]
Nodes report all incoming edges to a collection infrastructure
[Figure: task graph across HTTP Client, HTTP Proxy, HTTP Server, TCP 1 Start/End, TCP 2 Start/End, and IP routers, with operations labeled A through N and edge metadata such as [id, A] and [id, G]]
X-Trace Report: OpID: id,G; Edge: from A; Edge: from F
Rodrigo Fonseca, George Porter

30 Some Details
Metadata
– Horizontal edges: encoded in protocol messages (HTTP headers, TCP options, etc.)
– Vertical edges: widened API calls (setsockopt, etc.)
Each device
– Extracts the metadata from incoming edges
– Creates a new operation ID
– Copies the updated metadata into outgoing edges
– Issues a report with the new operation ID
    Set of key-value pairs with information about operation
No Layering Violation
– Propagation among same and adjacent layers
– Reports contain information about specific layer
[Figure: HTTP Client, HTTP Proxy, TCP 1 Start, TCP 1 End]
Rodrigo Fonseca, George Porter
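
The per-device steps above (extract metadata, create a new operation ID, copy updated metadata outward, issue a report) can be sketched in a few lines. This is a simplified, single-process illustration and not the X-Trace implementation or its wire format: the function names, the report fields, the in-memory `collector` list, and the `host` key are all hypothetical. Real deployments carry the metadata in protocol messages and API calls, as the slide describes.

```python
import secrets


def new_id():
    """Return a random 32-bit identifier as 8 hex characters."""
    return secrets.token_hex(4)


def start_task(layer, collector):
    """Begin a task: choose a TaskId and report the first operation."""
    task_id, op_id = new_id(), new_id()
    collector.append({"task_id": task_id, "op_id": op_id,
                      "layer": layer, "edges_from": []})
    return (task_id, op_id)


def handle(layer, incoming, collector, **info):
    """Run one operation at a device or layer.

    `incoming` holds one (task_id, op_id) pair per incoming edge.
    Returns the updated metadata to copy into outgoing edges.
    """
    task_id = incoming[0][0]
    if any(t != task_id for t, _ in incoming):
        raise ValueError("incoming edges belong to different tasks")
    op_id = new_id()  # new operation ID for this operation
    collector.append({"task_id": task_id, "op_id": op_id, "layer": layer,
                      "edges_from": [parent for _, parent in incoming],
                      **info})
    return (task_id, op_id)


def task_graph(task_id, collector):
    """Rebuild (parent_op, child_op) edges for one task from the reports."""
    return [(parent, r["op_id"])
            for r in collector if r["task_id"] == task_id
            for parent in r["edges_from"]]


collector = []  # stand-in for the report collection infrastructure
client = start_task("HTTP client", collector)
tcp_start = handle("TCP 1 start", [client], collector)   # vertical edge
tcp_end = handle("TCP 1 end", [tcp_start], collector)    # horizontal edge
proxy = handle("HTTP proxy", [client, tcp_end], collector, host="proxy-1")
edges = task_graph(client[0], collector)
print(len(collector), "reports,", len(edges), "edges")
```

Because every report names its incoming edges, the collector can rebuild the task graph offline without the devices ever talking to each other about tracing.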

31 Example: DNS + HTTP
Different applications
Different protocols
Different Administrative domains
(A) through (F) represent 32-bit random operation IDs
[Figure: Client (A), Resolver (B), Root DNS (C) (.), Auth DNS (D) (.xtrace), Auth DNS (E) (.berkeley.xtrace), Auth DNS (F) (.cs.berkeley.xtrace), Apache (G)]
Rodrigo Fonseca, George Porter

32 Example: DNS + HTTP Resulting X-Trace Task Graph Rodrigo Fonseca, George Porter

33 Map-Reduce Processing
Form of datacenter parallel processing, popularized by Google
– Mappers do the work on data slices, reducers process the results
– Handle nodes that fail or “lag” others -- be smart about redoing their work
Dynamics not very well understood
– Heterogeneous machines
– Effect of processor or network loads
Embed X-Trace into open source Hadoop
Andy Konwinski, Matei Zaharia
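
The mapper/reducer split can be illustrated with a word count, the same job traced on the Hadoop slides. This is a single-process sketch of the programming model only; it does not use Hadoop's API, and it leaves out the scheduling, failure handling, and laggard re-execution the slide mentions. All function names are hypothetical.

```python
from collections import defaultdict


def map_chunk(chunk):
    """Mapper: emit (word, 1) pairs for one slice of the input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped):
    """Group intermediate pairs by key so each reducer sees one word."""
    groups = defaultdict(list)
    for pairs in mapped:
        for word, count in pairs:
            groups[word].append(count)
    return groups


def reduce_word(word, counts):
    """Reducer: combine all values emitted for one word."""
    return word, sum(counts)


def word_count(chunks):
    """Run map, shuffle, and reduce over a list of text chunks."""
    mapped = [map_chunk(c) for c in chunks]  # parallel across nodes in a cluster
    return dict(reduce_word(w, c) for w, c in shuffle(mapped).items())


print(word_count(["the cat", "the dog the"]))
```

In a cluster, each `map_chunk` call would run on a different node, which is where the multiway fork and join visible in the traces come from.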

34 Hadoop X-traces Long set-up sequence Multiway fork Andy Konwinski, Matei Zaharia

35 Hadoop X-traces
Word count on 600 Mbyte file: 10 chunks, 60 Mbytes each
Multiway fork
Multiway join -- with laggards and restarts
Andy Konwinski, Matei Zaharia

36 Summary and Conclusions
Internet Datacenters
– It’s the backend to billions of network capable devices
– Plenty of processing, storage, and bandwidth
– Challenge: energy efficiency
DC Network Power Efficiency is a Management Problem!
– Much known about processors, little about networks
– Faster/denser network fabrics stressing power limits
Enhancing Energy Efficiency and Reliability
– Consider the whole stack from client to web application
– Power- and network-aware resource management
– SLAs to trade performance for power: shut down resources
– Predict workload patterns to bring resources on-line to satisfy SLAs, particularly user-driven/latency-sensitive applications
– Path tracing + SML: reveal correlated behavior of network and application services

37 Thank You!

38 Internet Datacenter

39 What is the Limiting in Modern Networks?
Cheap processing, cheap storage, and plentiful network b/w …
Oh, but network b/w is limited … Or is it?
[Figure labels: Rapidly declining system cost; Rapidly increasing transistor density; “Spectral Efficiency”: More bits/m³]

40 Not So Simple …
Speed-Distance-Cost Tradeoffs
Rapid Growth: Machine-to-Machine Devices (mostly sensors)