River: A data workflow management system. Harel Ben Attia, Senior Software Engineer.

Presentation transcript:

– Tens of billions of recommendations per month
– Most major publishers in the world
– Hundreds of GBs of new data every day

Context
Data Processing Workflows
Multiple Types of Processing
– Rollups, Grouping, Filtering, Algorithm Calculations
Multiple Stages of Processing
– Using the output of other processes as input

Problems
Dependency “Management”
– Hardcoded into code/scripts
– Time-based using cron or another scheduler
Logic is scattered around the system
– Developers need to take care of monitoring, alerts, permissions etc.
– Multiple Locations of Execution

River: Data Processing Management Infrastructure

River
Execution Management
– Full Execution History and Filtering
– Monitoring and Actionable Alerting
– Automatic Retries
– Web UI
Ease of Development
– Declarative Data Processing Definitions
– Decentralized Shared Data, separate development
– JobLogs
Data Driven Dependencies
– Why?
Ops / NOC, Developers

Other Approaches
[Diagram: processes A, B, C and job J on a time axis t, shown as Option 1 and Option 2]

Other Approaches
[Diagram: Option 2, processes A, B, C and job J on a time axis t]

Other Approaches
– D fails
– D sends
– Developer of D still works here
– Where is the code?

Other Approaches
– 2am is a great hour for troubleshooting!
– D = Data from C is missing…
– C = The data of C is all there!

Other Approaches
[Diagram: processes A, B, C, D… and job J on a time axis t]
– X:37 seems like a good time… C never finished after X:30 anyway
– Job J has been working for more than a week before the incident

Other Approaches
– Need to rerun processes B, C and D
– Without running A again?
– Without colliding with ongoing executions?
– Which hours failed?
– How to run all of them for the specific hours?

Other Approaches
[Diagram: process A and job J on a time axis t, starting at X:00]
– “A will never take more than 15 minutes, so X:20 is more than enough”
– A WILL eventually take longer
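The weakness of a fixed time offset can be shown with a few lines of Python. This is an illustrative sketch only, not River code: the helper name and the minute values (12 and 27) are assumptions chosen to match the X:20 example on the slide.

```python
def time_based_run_is_safe(upstream_minutes: int, downstream_offset: int) -> bool:
    """Return True if the upstream process finished before the downstream job starts.

    Both arguments are minutes past the hour (X:00). Hypothetical helper.
    """
    return upstream_minutes <= downstream_offset


# J is scheduled by cron at X:20 because A "never takes more than 15 minutes".
J_OFFSET = 20

typical_hour = time_based_run_is_safe(upstream_minutes=12, downstream_offset=J_OFFSET)
slow_hour = time_based_run_is_safe(upstream_minutes=27, downstream_offset=J_OFFSET)

print(typical_hour)  # True: J reads complete data
print(slow_hour)     # False: J starts while A is still running
```

Nothing in the schedule itself detects the second case, which is the motivation for triggering J from the availability of A's data rather than from the clock.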

River
Execution Management
– Full Execution History + Filtering and Searching
– Monitoring and Actionable Alerting
– Automatic Retries
– Web UI
– JobLogs
Ease of Development
– Declarative Data Processing Definitions
– Decentralized Shared Data, separate development
Data Driven Dependencies
– Why? Robustness, Reliability, Parallelism

River: What? When? Where? How?

Execution Layer – the “What”
– Importing from MySQL to Hive
– Hive Queries
– JDBC Queries
– Transfer data from Hive into MySQL and to Cassandra
– Running External Commands: MapReduce, Java, bash, Legacy code, etc.
Every data processing task is called a Job
A Job can contain multiple Steps
Jobs use Parameters

Scheduling Layer – the “When”
– Events that describe Data Availability
– Each job registers to an event, which will trigger its execution
– Each job emits an event at job completion
– Events that are time dependent
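The register/emit cycle described on this slide can be sketched as a toy in-memory scheduler. This is an assumption-laden illustration, not River's implementation: the class `EventScheduler`, the event names, the job names and the example date are all hypothetical.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

Params = Dict[str, str]
Action = Callable[[Params], None]


class EventScheduler:
    """Toy data-driven scheduler: jobs register to events and emit events on completion."""

    def __init__(self) -> None:
        # event name -> list of (job name, action, event emitted on completion)
        self._subscribers: Dict[str, List[Tuple[str, Action, Optional[str]]]] = {}
        self.history: List[Tuple[str, Params]] = []

    def register(self, event: str, job_name: str, action: Action,
                 emits: Optional[str] = None) -> None:
        self._subscribers.setdefault(event, []).append((job_name, action, emits))

    def fire(self, event: str, params: Params) -> None:
        """Run every job registered to the event, then follow the events they emit."""
        pending = deque([(event, params)])
        while pending:
            name, event_params = pending.popleft()
            for job_name, action, emits in self._subscribers.get(name, []):
                action(event_params)
                self.history.append((job_name, dict(event_params)))
                if emits is not None:
                    # The same parameters travel on to the next event.
                    pending.append((emits, event_params))


scheduler = EventScheduler()
scheduler.register("rawDataAvailable", "importRawData", lambda p: None,
                   emits="rawDataImported")
scheduler.register("rawDataImported", "hourlyRollup", lambda p: None)

scheduler.fire("rawDataAvailable", {"date": "2014-01-01", "hour": "03"})
print([job for job, _ in scheduler.history])  # ['importRawData', 'hourlyRollup']
```

The downstream job never refers to a clock time; it runs when, and only when, the upstream job reports that its data exists. A real system would also need persistence, failure handling and cycle detection, none of which this sketch attempts.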

The “How” and the “Where”
Both handled by the infrastructure
Integration to other systems
– Connecting to Hive/Hadoop/Cassandra
– Connecting to JDBC Databases
– Retries, throttling, timeouts
Location of Execution
– Actual location is hidden from the developer/ops
– Logical names to all data sources: “readOnlyDataWarehouse”, “productionCassandra”
Centralized Management, notifications and dashboards
Monitoring and Alerts
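One way to picture logical data source names is a central registry that maps each name to its connection details. The sketch below is hypothetical: only the two logical names come from the slide, while the `DataSource` type, the `resolve` function and the addresses are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    kind: str     # e.g. "hive" or "cassandra"
    address: str  # physical location, known only to the infrastructure


# Hypothetical central registry; the addresses are placeholders.
REGISTRY = {
    "readOnlyDataWarehouse": DataSource("hive", "warehouse.example.internal:10000"),
    "productionCassandra": DataSource("cassandra", "cassandra.example.internal:9042"),
}


def resolve(logical_name: str) -> DataSource:
    """Translate a logical name used in a job definition into connection details."""
    try:
        return REGISTRY[logical_name]
    except KeyError:
        raise KeyError(f"Unknown logical data source: {logical_name!r}") from None


print(resolve("productionCassandra").kind)  # cassandra
```

Because job definitions mention only the logical name, moving a cluster means editing one registry entry rather than every job that reads from it.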

River UI
– Restart Job
– Fail Job and Dependents
– Download JobLog

Monitoring Dashboard

Steps
Steps only contain what needs to be done
Copy Data From JDBC to Hive:
sourceDB = “productionDatabase”
sourceTable = “myRawData”
targetCluster = “onlineHadoopCluster”
targetHiveTable = “rawDataTable”
Filter = “date=#handledDate#”
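The `#handledDate#` placeholder in the step definition suggests that parameters are substituted into the step at run time. A minimal sketch of that idea follows, assuming a `#name#` placeholder syntax inferred from the slide; the `render` function and the example date are hypothetical and this is not River's actual definition format.

```python
import re
from typing import Dict

# Step definition copied from the slide, expressed as a plain dictionary.
step = {
    "sourceDB": "productionDatabase",
    "sourceTable": "myRawData",
    "targetCluster": "onlineHadoopCluster",
    "targetHiveTable": "rawDataTable",
    "Filter": "date=#handledDate#",
}

PLACEHOLDER = re.compile(r"#(\w+)#")


def render(definition: Dict[str, str], params: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the step with every #name# placeholder replaced."""
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"Missing parameter: {name}")
        return params[name]

    return {key: PLACEHOLDER.sub(substitute, value) for key, value in definition.items()}


rendered = render(step, {"handledDate": "2014-01-01"})
print(rendered["Filter"])  # date=2014-01-01
```

The step states what to copy and which slice of data to take; where the databases live and how to connect to them is left to the infrastructure.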

A bit more about triggers
– Triggers have parameters as well: Date= ,hour=15 and Date= ,hour=19
– Parameters propagate through jobs and to other triggers

Developer’s Point-of-View
– Automatic Retries
– Parameters Pass-through
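From the developer's side, automatic retries mean the job code does not contain its own retry loop. The wrapper below is a generic sketch of what an infrastructure layer might do on the job's behalf; the function name, the default of three attempts and the failing step are assumptions, not River's behavior.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_with_retries(action: Callable[[], T], max_attempts: int = 3,
                     delay_seconds: float = 0.0) -> T:
    """Call action until it succeeds or max_attempts is reached, then re-raise."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")


attempts: List[int] = []


def flaky_step() -> str:
    """Hypothetical step that fails twice with a transient error, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient environment issue")
    return "done"


result = run_with_retries(flaky_step, max_attempts=3)
print(result, len(attempts))  # done 3
```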

River Topology
[Diagram components: External Systems, Trigger Manager, Trigger Queue, Execution Queue, Execution Manager, Spring Batch, Spring Batch DB, Hive/Hadoop Interface, OS Interface, Cassandra Interface, JDBC Interface]

Dependencies for detailed example

River Topology: Success Example
[Diagram: the River topology annotated with triggers T1, T2, T3 and jobs Job1, Job2, Job3. Labels include “T1 Date= hour=03: Job1, Job2”, “(from Job1)”, “(from Job2)” and “T3 Date= hour=03”.]

River Topology: Failure Example
[Diagram: the River topology annotated with Job2, Job3, trigger T3 and the UI. Labels include “Job2 Date= hour=03” and “T3 Date= hour=03”.]

Notable Features
Parameter Enrichment
– Example: #beginningOfMonth
Precondition Expressions
– Example: isLastDayOfMonth(#handleDate)
Data Comparison Capabilities
– Data Validations
– Supports Tolerance: Absolute and Percentage margins
Command Line and Java Clients
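The three features named on this slide can each be approximated with a small function. These are sketches under stated assumptions: the Python names mirror the slide's `#beginningOfMonth` and `isLastDayOfMonth` but are not River's API, and the rule that a comparison passes when it is within either the absolute or the percentage margin is a guess at the intended semantics.

```python
import calendar
from datetime import date


def beginning_of_month(handled: date) -> date:
    """Parameter enrichment: derive the first day of the handled date's month."""
    return handled.replace(day=1)


def is_last_day_of_month(handled: date) -> bool:
    """Precondition: True only when the handled date is the last day of its month."""
    last_day = calendar.monthrange(handled.year, handled.month)[1]
    return handled.day == last_day


def within_tolerance(expected: float, actual: float,
                     absolute: float = 0.0, percentage: float = 0.0) -> bool:
    """Data comparison: pass if the difference fits either margin (assumed semantics)."""
    difference = abs(expected - actual)
    return difference <= absolute or difference <= abs(expected) * percentage / 100.0


print(beginning_of_month(date(2014, 2, 28)))            # 2014-02-01
print(is_last_day_of_month(date(2014, 2, 28)))          # True
print(is_last_day_of_month(date(2016, 2, 28)))          # False (2016 is a leap year)
print(within_tolerance(1000, 1004, absolute=5))         # True
print(within_tolerance(1000, 1020, percentage=1))       # False
```

A monthly rollup job, for instance, could use the precondition to skip every run except the one for the month's final day, and a validation step could compare source and target row counts with a small percentage margin.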

River at
– 6 River Instances Running
– 5 Teams
– ~4100 Jobs running every day
– ~50 Different Job Types
– Job Failures due to environment issues have almost no overhead
– Automatic restarts of jobs when data arrives late

Future Plans
– Multiple Dependencies
– Offline Job Testing Capabilities
– Improved DSL for Job Definitions
– Support for Master/Worker River machines
– Job Priorities
– Analysis Tools
Outbrain is working on Open Sourcing River
Illustration by Chris Whetzel

Questions

Thank you. Harel Ben Attia, on Twitter