Networking Recap Storage Intro

Slides:



Advertisements
Similar presentations
SDN Controller Challenges
Advertisements

Data Center Fabrics. Forwarding Today Layer 3 approach: – Assign IP addresses to hosts hierarchically based on their directly connected switch. – Use.
Improving Datacenter Performance and Robustness with Multipath TCP Costin Raiciu, Sebastien Barre, Christopher Pluntke, Adam Greenhalgh, Damon Wischik,
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment Chapter 1: Introduction to Windows Server 2003.
CS4315A. Berrached:CMS:UHD1 Operating System Structures Chapter 3.
A Scalable, Commodity Data Center Network Architecture.
1 The Google File System Reporter: You-Wei Zhang.
Computer System Architectures Computer System Software
Local Area Networks (LAN) are small networks, with a short distance for the cables to run, typically a room, a floor, or a building. - LANs are limited.
Vic Liu Liang Xia Zu Qiang Speaker: Vic Liu China Mobile Network as a Service Architecture draft-liu-nvo3-naas-arch-01.
Distributed Computing Systems CSCI 4780/6780. Geographical Scalability Challenges Synchronous communication –Waiting for a reply does not scale well!!
Configuring File Services. Using the Distributed File System Larger enterprises typically use more file servers Used to improve network performce Reduce.
Distributed Computing Systems CSCI 4780/6780. Scalability ConceptExample Centralized servicesA single server for all users Centralized dataA single on-line.
SDN challenges Deployment challenges
CIS 700-5: The Design and Implementation of Cloud Networks
Lecture 2: Cloud Computing
Introduction to Operating Systems
University of Maryland College Park
Lecture 2: Leaf-Spine and PortLand Networks
Operating Systems : Overview
Hydra: Leveraging Functional Slicing for Efficient Distributed SDN Controllers Yiyang Chang, Ashkan Rezaei, Balajee Vamanan, Jahangir Hasan, Sanjay Rao.
Flash Storage 101 Revolutionizing Databases
Revisiting Ethernet: Plug-and-play made scalable and efficient
Improving Datacenter Performance and Robustness with Multipath TCP
NOX: Towards an Operating System for Networks
Objectives Differentiate between the different editions of Windows Server 2003 Explain Windows Server 2003 network models and server roles Identify concepts.
The Client/Server Database Environment
Improving Datacenter Performance and Robustness with Multipath TCP
CHAPTER 3 Architectures for Distributed Systems
Introduction to Networks
Introduction to Networks
Gregory Kesden, CSE-291 (Storage Systems) Fall 2017
Gregory Kesden, CSE-291 (Cloud Computing) Fall 2016
Software Architecture in Practice
Storage Virtualization
Main Memory Management
Introduction to Operating Systems
EECS 582 Final Review Mosharaf Chowdhury EECS 582 – F16.
The Tail At Scale Dean and Barroso, CACM 2013, Pages 74-80
Chapter 17: Database System Architectures
Chapter 2: System Structures
Directory Structure A collection of nodes containing information about all files Directory Files F 1 F 2 F 3 F 4 F n Both the directory structure and the.
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
Chapter 8: Memory management
Outline Module 1 and 2 dealt with processes, scheduling and synchronization Next two modules will deal with memory and storage Processes require data to.
Research Opportunities in IP Wide Area Storage
Cloud computing mechanisms
Architectures of distributed systems Fundamental Models
Distributed File Systems
Distributed File Systems
Architectures of distributed systems Fundamental Models
Secondary Storage Management Brian Bershad
Operating Systems : Overview
CSE 451: Operating Systems Spring Module 21 Distributed File Systems
Internet and Web Simple client-server model
Specialized Cloud Architectures
Chapter 2: Operating-System Structures
Distributed File Systems
Introduction to Operating Systems
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
Operating Systems : Overview
Architectures of distributed systems
Secondary Storage Management Hank Levy
THE GOOGLE FILE SYSTEM.
Architectures of distributed systems Fundamental Models
Database System Architectures
Distributed File Systems
Distributed File Systems
In-network computation
Presentation transcript:

Networking Recap Storage Intro CSE-291 (Cloud Computing), Fall 2016 Gregory Kesden

Networking Recap Storage Intro

Long Haul/Global Networking Speed of light is limiting; Latency has a lower bound (.) Throughput is a cost, but not otherwise a limitation Running more fiber optic is technologically reasonable, within reason Demand will drive growth

Vs. Networking Within A Data Center Speed of light not limiting, short distance Serialization can be limiting, as bits clocked into and off of media Parallelization helps, but quickly limited in practice Bandwidth (Throughput) needs to be managed, both by distributing load and relieving bottlenecks Task drives access pattern, drives hot spots

Traditional 3-Tier Network Convenient for distribution All links same width More links at lower layers = More bandwidth lower layers Works if strong locality at leaf (access layer) Challenged if leaf-to-leaf up-and-down patterns Challenged if request distributed and wait reply (inflow) Maximizing work within time budget means replies at same time collide

Fat Tree Topology Multi-Rooted Tree = All paths same width Only useful if load distributed Doesn’t work if all pick same spanning tree Can distribute via Layer-1,-2, or -3 techniques

Utilizing Multiple Paths Layer-2 Forwarding tables; Consider PortLand Hard to be adaptive Layer-3 ECMP; Can do via routing Suffer collision of paths under load Slow to adapt, slow implementation Handles failure slowly Layer-4 Schedule flows Not all flows are equal Can change Big vs small can’t be balance DCTCP; Multiplex flows Positive Feedback to balance, not based upon interpreting drops. Adaptive, balanceable

PortLand Uses FatTree Topology Provides way of using multiple paths Separates Identifier (IP address) from Locator (PMAC) Facilitates migration (Benefit over other solutions) Fabric Manager enables failure management, mapping, etc Link Discovery enables Auto-Configuration

Clos Networks Obvious benefit to flatter, all-to-all Connectivity Presently enables 2-Layer topologies in many cases Fabric has limits. Only possible to a point. Scale may more commonly outgrow fabric in near future leading to migration b ack to 3-Tier Manage scale limitations by using locally within pods, among cores, etc. Provides local density where needed, without blowing up interconnectivity scale globally

Software Defined Networks (SDNs) Separate Control Plane (Management) from Data Plane (Forwarding) Enable central management of entire control plane Essentially makes a programmable network Enables management applications with appropriate abstraction for use case (enterprise, public cloud, etc) and management goal (traffic management, isolation, etc) Dramatically reduces management overhead, correctness, and security This is a “killer app”, by itself, at scale Critically important for virtualization Enable on-demand management, and real-time adaptation to failure. Replace multiple types of devices with single, generic, programmable devices

Networking Recap Storage Intro

Types of Storage General Purpose File System Special Purpose File System NoSQL Database (Key-Value Store, Column-oriented, Etc) Relational Databases, i.e. SQL In Memory (Critical)

General Purpose File System Emulate traditional file systems at larger scale and/or distributing near users E.g., Manage named user data with hierarchical access and protections, Andrew File System (AFS) is classical global example, many more like NFS approximate at DC or campus scale Friendly for end users, but how useful? Commonness of global use case Can be very distant with reasonable latency. Disk = 10mS; Network = 10mS/1,000mi (very approximate) High cost Synchronization, concurrency, etc

Special Purpose File System Relax constraints to limit complexity Accessing programmatically? Use index, drop directory hierarchy. Upload-based creation? Version and don’t support edits. Limits concurrency problem to version number Logging? Append-only writes. Limits concurrency problem to allocation. Single application? Access model limited to roles, let app manage permissions Stale okay? Easier replication. Etc. Relieving concurrency problems Improves caching, replication Decreases latency

NoSQL Databases Break up limitations associated with table (relation) CAP: Trade consistency for accessibility and partition tolerance. Speed at scale Not likely ACID Can do some SQL-like things, but some are hard, e.g. generalized join Many kinds, e.g. key-value, column-oriented, etc

In-Memory When reacting to human users in real time, time budget is limited Trip from client to service, trip from front-end to back-ends, processing, trip from back-ends to front- end, processing, trip back to client, etc. There isn’t time to constantly hit disk in many ways Would need a huge amount of disk time to be able to sustain throughout at scale What is reasonable response time for human user? 0.5 second (500 mS)? 75mS round trip to east coast The rest is data center latency, queuing for services, and actual work Best solution is to have the answer waiting Or, at least the components of the answer Sadly, sometimes block for slowest component