ONAP DataLake
Guobiao Mo (China Mobile), Xin Miao (Huawei), Zhaoxing Meng (ZTE)
October 29, 2018

Project Goals
Collect and permanently store the data flowing through the ONAP system in several Big Data stores, each of a different category. Serve as a common, easily accessible data store for all ONAP components. Provide APIs and other means for ONAP components and external systems (e.g. BSS/OSS) to consume the data. Provide sophisticated, ready-to-use data analytics tools built on the data.

Architecture
[Architecture diagram. Labels: OSS/BSS; ONAP Components; DMaaP/Kafka; Other Sources; JSON/XML/YAML; DataLake Dispatcher; DL Admin (UI); OLAP Store (Druid); Document Store (Couchbase); Other Stores; Superset; Query/UI; Spark]

Data Sources
DataLake monitors all or selected DMaaP topics, reads their data in real time via the Kafka API, and persists it. Other ONAP components can use DataLake as storage for application-specific data, through DMaaP or the DataLake REST APIs. Other data sources will be supported if needed.
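The monitor-and-persist flow described above can be sketched as follows. This is a minimal illustration only, not DataLake's implementation: the names `select_topics` and `dispatch` are hypothetical, glob-style topic patterns are an assumption, the record stream is a plain list standing in for a Kafka consumer, and each store is a plain dictionary standing in for a real database.

```python
from fnmatch import fnmatchcase


def select_topics(available_topics, patterns):
    """Return the topics that match any configured pattern.

    Glob patterns are an assumption for this sketch; "*" selects all topics.
    """
    return [
        topic for topic in available_topics
        if any(fnmatchcase(topic, pattern) for pattern in patterns)
    ]


def dispatch(records, selected_topics, stores):
    """Persist each record from a selected topic into every store.

    records: iterable of (topic, payload) pairs, standing in for messages
             read through the Kafka API.
    stores:  list of dicts mapping topic name -> list of payloads, standing
             in for the real storage back ends.
    Returns the number of records persisted.
    """
    selected = set(selected_topics)
    persisted = 0
    for topic, payload in records:
        if topic not in selected:
            continue  # topic is not monitored, skip the message
        for store in stores:
            store.setdefault(topic, []).append(payload)
        persisted += 1
    return persisted
```

In a real deployment the list of records would be replaced by a consumer subscribed to the selected topics, and each dictionary by a client for the corresponding store.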

Document Store
The POC runs on MongoDB, which supports flexible database schemas and powerful ad hoc queries. Because of a MongoDB license issue, we plan to replace it with Couchbase, a distributed document-oriented database. DataLake pulls the data in real time and inserts it into the store, one table per topic, with the table name matching the topic name. JSON, XML, and YAML data are automatically converted into the native store schema. DL provides a REST API for data queries, and applications can also access the data through the store’s native API. Couchbase supports Spark, which reads the documents into DataFrames for easy processing; this suits complex analytics models.
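As a rough illustration of the "one table per topic" rule and the automatic format conversion, the sketch below turns JSON and XML payloads into dictionaries and files them under the topic name. Everything here is assumed for illustration: `to_document`, `xml_to_dict`, and `InMemoryDocumentStore` are hypothetical names, the store is an in-memory stand-in rather than MongoDB or Couchbase, XML attributes are ignored, and YAML is left out because parsing it needs a third-party library.

```python
import json
import xml.etree.ElementTree as ET
from collections import defaultdict


def xml_to_dict(element):
    """Convert an XML element to nested dicts; repeated child tags become lists.

    Attributes are ignored in this simplified conversion.
    """
    children = list(element)
    if not children:
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = xml_to_dict(child)
        if child.tag in result:
            existing = result[child.tag]
            if not isinstance(existing, list):
                result[child.tag] = [existing]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def to_document(payload, fmt):
    """Parse a raw payload string into a dict suitable for a document store."""
    if fmt == "json":
        doc = json.loads(payload)
    elif fmt == "xml":
        root = ET.fromstring(payload)
        doc = {root.tag: xml_to_dict(root)}
    else:
        raise ValueError(f"unsupported format: {fmt}")
    if not isinstance(doc, dict):
        doc = {"value": doc}  # wrap scalars and arrays so every record is a document
    return doc


class InMemoryDocumentStore:
    """Stand-in store: one table (a list of documents) per topic name."""

    def __init__(self):
        self.tables = defaultdict(list)

    def insert(self, topic, payload, fmt):
        doc = to_document(payload, fmt)
        self.tables[topic].append(doc)  # table name equals topic name
        return doc
```

For example, inserting `'{"a": 1}'` as JSON under the topic `example.topic` leaves `store.tables["example.topic"]` holding `[{"a": 1}]`.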

OLAP Store
Apache Druid is a popular large-scale OLAP data store. Superset is a UI tool for interactive analytics. Both are actively developed on GitHub. DL extracts the dimensions and metrics from JSON files and pre-configures the Druid settings for each topic. DL also pre-builds interactive Superset dashboards.
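The slides do not say how dimensions and metrics are identified. One plausible heuristic, shown here purely as an assumption, is to flatten a sample JSON message and treat numeric leaf values as metrics and everything else as dimensions. The function names are hypothetical, and the output is just two lists of field paths, not an actual Druid configuration.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict keyed by dotted paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def split_dimensions_and_metrics(sample_json):
    """Classify the fields of a sample message (assumed heuristic).

    Numeric leaves (int or float, but not bool) are treated as metrics;
    all other leaves, including strings, booleans, and lists, are treated
    as dimensions.
    """
    flat = flatten(json.loads(sample_json))
    dimensions, metrics = [], []
    for path, value in flat.items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        (metrics if is_number else dimensions).append(path)
    return sorted(dimensions), sorted(metrics)
```

The two lists could then feed whatever per-topic Druid settings the deployment uses. A real classifier would likely need per-topic overrides, since numeric identifiers are better treated as dimensions.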

Other Storages
Other storages may be supported based on future requirements. Some candidates:
A search engine, such as Elasticsearch or Apache Solr.
OpenTSDB, a distributed, scalable time series database.

Summary
Storage | Target
Document Store (Couchbase) | Document Storage and Retrieval
Document Store + Spark | Customized Computation
OLAP (Druid and Superset) | Interactive Analytics
Others | Based on future requirements

In Relation to Other Components
DCAE: DCAE focuses on being part of the automated closed control loop on VNFs; storing collected data for archiving is not covered by the DCAE scope (see the ONAP wiki forum). We envision that some DCAE analytics applications may use the data in DataLake.
PNDA: PNDA is an infrastructure that bundles a wide variety of big data technologies for data processing. Applications are to be developed on the technologies PNDA provides. The goal of DataLake is to store DMaaP and other data and to build ready-to-use applications around that data, using suitable technologies whether or not they are provided by PNDA. Currently, Couchbase, Druid, and Superset are not included in PNDA.

Thank You
DataLake proposal on ONAP wiki.
Contact: Guobiao Mo (guobiaomo@chinamobile.com)