Ewen Cheslack-Postava

Presentation transcript:

When One Data Center Is Not Enough: Building Large-scale Stream Infrastructures Across Multiple Data Centers with Apache Kafka
Ewen Cheslack-Postava

Outline
- Kafka overview
- Common multi-data-center patterns
- Future stuff

What’s Apache Kafka
- Distributed, high-throughput pub/sub system

Kafka usage

Common use case
- Large-scale real-time data integration

Other use cases
- Scaling databases
- Messaging
- Stream processing
- …

Why multiple data centers (DC)
- Disaster recovery
- Geo-localization
- Saving cross-DC bandwidth
- Security

What’s unique with Kafka multi DC
- Consumers run continuously and have state (offsets)
- Challenge: recovering the state during DC failover

Pattern #1: stretched cluster
- Typically done on AWS in a single region
- Deploy ZooKeeper and brokers across 3 availability zones
- Rely on intra-cluster replication to replicate data across DCs

On DC failure
- Producers/consumers fail over to new DCs
- Existing data preserved by intra-cluster replication
- Consumer resumes from last committed offsets and will see the same data

When DC comes back
- Intra-cluster replication automatically re-replicates all missing data
- When re-replication completes, switch producers/consumers back

Be careful with replica assignment
- Don’t want all replicas in same AZ
- Rack-aware support in 0.10.0
  - Configure brokers in same AZ with same broker.rack
- Manual assignment pre 0.10.0
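The goal of spreading replicas across AZs can be illustrated with a small sketch. This is not Kafka's actual rack-aware assignment algorithm; `assign_replicas`, `racks_used`, the rack names, and the broker ids are all hypothetical. It only shows the property the slide asks for: every partition ends up with replicas in distinct racks.

```python
def assign_replicas(brokers_by_rack, num_partitions, replication_factor):
    """Toy assignment: rotate over racks so each partition's replicas sit in distinct racks."""
    racks = sorted(brokers_by_rack)
    if replication_factor > len(racks):
        raise ValueError("need at least as many racks as replicas")
    next_in_rack = {rack: 0 for rack in racks}  # next broker index to use per rack
    assignment = {}
    for partition in range(num_partitions):
        replicas = []
        for i in range(replication_factor):
            rack = racks[(partition + i) % len(racks)]
            brokers = brokers_by_rack[rack]
            replicas.append(brokers[next_in_rack[rack] % len(brokers)])
            next_in_rack[rack] += 1
        assignment[partition] = replicas
    return assignment


def racks_used(replicas, brokers_by_rack):
    """Return the set of racks that hold at least one of the given replicas."""
    rack_of = {b: rack for rack, brokers in brokers_by_rack.items() for b in brokers}
    return {rack_of[b] for b in replicas}


# Hypothetical layout: two brokers in each of three AZs.
brokers_by_rack = {"az-a": [1, 2], "az-b": [3, 4], "az-c": [5, 6]}
assignment = assign_replicas(brokers_by_rack, num_partitions=6, replication_factor=3)
```

A check like `racks_used(...)` can also be run against a manual assignment (the pre-0.10.0 case) to confirm that no partition keeps all replicas in one AZ.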

Stretched cluster NOT recommended across regions
- Asymmetric network partitioning
- Longer network latency => longer produce/consume time
- Cross-region bandwidth: no read affinity in Kafka

Pattern #2: active/passive
- Producers in active DC
- Consumers in either active or passive DC

Cross Datacenter Replication
- Consumer & producer: read from a source cluster and write to a target cluster
- Per-key ordering preserved
- Asynchronous: target always slightly behind
- Offsets not preserved
  - Source and target may not have same # partitions
  - Retries for failed writes
- Options:
  - Confluent Multi-Datacenter Replication
  - MirrorMaker
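A toy in-memory model may help show why per-key ordering survives replication while offsets do not. Nothing here uses a real Kafka client: `ToyCluster`, `toy_partition`, and `replicate` are hypothetical names, and the partitioner (sum of key bytes modulo partition count) is an assumption chosen only to be easy to trace by hand.

```python
def toy_partition(key, num_partitions):
    # Toy partitioner (assumption): sum of the key's bytes modulo partition count.
    return sum(key.encode("utf-8")) % num_partitions


class ToyCluster:
    """In-memory stand-in for a cluster: one append-only list per partition."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        p = toy_partition(key, len(self.partitions))
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def values_for(self, key):
        p = toy_partition(key, len(self.partitions))
        return [v for k, v in self.partitions[p] if k == key]


def replicate(source, target):
    """Read each source partition in order and re-produce every record to target."""
    mapping = []  # pairs of ((source partition, offset), (target partition, offset))
    for sp, log in enumerate(source.partitions):
        for so, (key, value) in enumerate(log):
            mapping.append(((sp, so), target.produce(key, value)))
    return mapping


# Source has 3 partitions, target has 1, so offsets cannot line up.
source, target = ToyCluster(3), ToyCluster(1)
for key, value in [("a", "a1"), ("b", "b1"), ("a", "a2"), ("c", "c1"), ("b", "b2")]:
    source.produce(key, value)
mapping = replicate(source, target)
```

In this run the record `a1` sits at offset 0 of its source partition but lands at offset 1 on the target, while the values for each key still appear in their original order.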

On active DC failure
- Fail over producers/consumers to passive cluster
- Challenge: which offset to resume consumption from
  - Offsets not identical across clusters

Solutions for switching consumers
- Resume from smallest offset
  - Duplicates
- Resume from largest offset
  - May miss some messages (likely acceptable for real-time consumers)
- Set offset based on timestamp
  - Current API hard to use and not precise
  - Better and more precise API being worked on (KIP-33)
- Preserve offsets during replication
  - Harder to do
  - No timeline yet
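The timestamp-based option can be sketched as a lookup over a per-partition timestamp index. This is an illustration of the idea only, not the KIP-33 design or any Kafka API: `offset_for_timestamp`, `resume_offset`, and `safety_margin_ms` are hypothetical, and the sketch assumes the list index equals the offset and that timestamps never decrease within a partition.

```python
from bisect import bisect_left


def offset_for_timestamp(timestamps, target_ts):
    """Return the earliest offset whose timestamp is >= target_ts, or None if there is none."""
    i = bisect_left(timestamps, target_ts)
    return i if i < len(timestamps) else None


def resume_offset(timestamps, last_processed_ts, safety_margin_ms=0):
    """Pick a restart offset on the failover cluster from the last processed timestamp.

    Rewinding by safety_margin_ms trades extra duplicates for fewer missed messages.
    """
    offset = offset_for_timestamp(timestamps, last_processed_ts - safety_margin_ms)
    # If no record is that recent, start at the log end: there is nothing to re-read.
    return len(timestamps) if offset is None else offset


# Timestamps (ms) of the records at offsets 0..3 in one partition of the failover cluster.
timestamps = [100, 200, 300, 400]
restart = resume_offset(timestamps, last_processed_ts=300, safety_margin_ms=100)
```

With a margin of zero this behaves like a precise seek; a large margin approaches "resume from smallest offset", and skipping to the log end corresponds to "resume from largest offset".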

When DC comes back
- Need to reverse replication
- Same challenge: determining the offsets

Limitations
- Reconfiguration of replication after failover
- Resources in passive DC underutilized

Pattern #3: active/active
- Local → aggregate replication to avoid cycles
- Producers/consumers in both DCs
- Producers only write to local clusters

On DC failure
- Same challenge on moving consumers on aggregate cluster
  - Offsets in the 2 aggregate clusters not identical

When DC comes back
- No need to reconfigure replication

An alternative
- Challenge: reconfigure replication on failover, similar to active/passive

Another alternative: avoid aggregate clusters
- Prefix topic names with DC tag
- Configure replication to replicate remote topics only
- Consumers need to subscribe to topics with both DC tags
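The DC-tag scheme can be sketched with two small helpers. The `dc.topic` naming with a dot separator, the DC names, and the functions `tagged`, `topics_to_pull`, and `subscription` are assumptions for illustration; the slides do not specify a naming format. Pulling only topics that carry the remote DC's tag is what keeps a record from being replicated back to where it came from.

```python
def tagged(dc, topic):
    """Build a DC-tagged topic name, e.g. 'dc1.clicks' (separator is an assumption)."""
    return f"{dc}.{topic}"


def topics_to_pull(remote_topics, remote_dc):
    """Select only the topics that originated in the remote DC.

    Topics tagged with any other DC were replicated into the remote cluster,
    so copying them again would create a replication cycle.
    """
    prefix = remote_dc + "."
    return sorted(t for t in remote_topics if t.startswith(prefix))


def subscription(topic, dcs):
    """Topic names a consumer must subscribe to in order to see data from every DC."""
    return [tagged(dc, topic) for dc in dcs]


# Topics visible on DC 2's cluster: two local ones and one replicated from DC 1.
remote_topics = ["dc2.clicks", "dc1.clicks", "dc2.orders"]
pull = topics_to_pull(remote_topics, "dc2")
subscribe_to = subscription("clicks", ["dc1", "dc2"])
```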

Beyond 2 DCs
- More DCs → better resource utilization
  - With 2 DCs, each DC needs to provision 100% traffic
  - With 3 DCs, each DC only needs to provision 50% traffic
- Setting up replication with many DCs can be daunting
  - Only set up aggregate clusters in 2-3
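The 100% and 50% figures follow from a simple calculation, sketched here under the assumption that total traffic is redistributed evenly over the DCs that survive a failure. The function name `required_capacity` and the `tolerated_failures` parameter are hypothetical.

```python
from fractions import Fraction


def required_capacity(num_dcs, tolerated_failures=1):
    """Fraction of total traffic each DC must be able to carry.

    Assumes traffic is spread evenly across the DCs that remain after
    `tolerated_failures` DCs are lost.
    """
    survivors = num_dcs - tolerated_failures
    if survivors < 1:
        raise ValueError("at least one DC must survive")
    return Fraction(1, survivors)


two_dcs = required_capacity(2)    # each DC must handle all traffic
three_dcs = required_capacity(3)  # each DC must handle half of the traffic
```

Under the same assumption, a fourth DC lowers the requirement to one third of total traffic per DC.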

Comparison
- Stretched
  - Pros: better utilization of resources; easy failover for consumers
  - Cons: still need cross-region story
- Active/passive
  - Pros: needed for global ordering
  - Cons: harder failover for consumers; reconfiguration during failover; resource under-utilization
- Active/active
  - Cons: extra aggregate clusters

Multi-DC beyond Kafka
- Kafka often used together with other data stores
- Need to make sure multi-DC strategy is consistent

Example application
- Consumer reads from Kafka and computes 1-min count
- Counts need to be stored in DB and available in every DC
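The counting step of the example application might look like the sketch below. It leaves out the Kafka consumer and the database entirely; the record shape `(timestamp_ms, key)` and the names `one_minute_counts` and `WINDOW_MS` are assumptions. Keying each count by its window start means the result does not depend on the order in which records arrive.

```python
from collections import Counter

WINDOW_MS = 60_000  # one-minute tumbling windows


def one_minute_counts(records):
    """Count records per (window start, key).

    `records` is an iterable of (timestamp_ms, key) pairs, standing in for
    messages read from Kafka.
    """
    counts = Counter()
    for timestamp_ms, key in records:
        window_start = timestamp_ms - timestamp_ms % WINDOW_MS
        counts[(window_start, key)] += 1
    return counts


records = [(0, "a"), (59_999, "a"), (60_000, "a"), (61_000, "b")]
counts = one_minute_counts(records)
```

Storing each count under its `(window start, key)` pair is one way to let a consumer overwrite a value when it re-reads data after a failover, rather than counting it twice.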

Independent database per DC
- Run same consumer concurrently in both DCs
  - No consumer failover needed

Stretched database across DCs
- Only run one consumer per DC at any given point of time

Future work
- KIP-33: timestamp index
  - Allow consumers to seek based on timestamp
- Integration with Kafka Connect for data ingestion
- Offset preservation

THANK YOU!
Ewen Cheslack-Postava | ewen@confluent.io | @ewencp
Learn more about Kafka at Strata + Hadoop World NY
- Securing Apache Kafka - Jun Rao, River Pavilion @ 2:05pm
- Ask Me Anything: Apache Kafka – Jun Rao & Ewen Cheslack-Postava, 1E09 @ 4:35pm
- Visit Confluent’s Booth (#758)
Kafka Training with Confluent University
- Kafka Developer and Operations Courses
- Visit www.confluent.io/training
Want more Kafka?
- Download Confluent Platform Enterprise at http://www.confluent.io/product
- Apache Kafka 0.10 upgrade documentation at http://docs.confluent.io/3.0.1/upgrade.html
- Kafka Summit recordings now available at http://kafka-summit.org/schedule/