NIST BIG DATA WG Reference Architecture Subgroup Intermediate Report Co-chairs: Orit Levin (Microsoft) James Ketner (AT&T) Don Krapohl (Augmented Intelligence)


NIST BIG DATA WG Reference Architecture Subgroup Intermediate Report Co-chairs: Orit Levin (Microsoft) James Ketner (AT&T) Don Krapohl (Augmented Intelligence) July 24th, 2013

Reference Architecture Objectives
- Addresses a broad range of stakeholders (e.g., data owners, industries, academia, policy makers)
- Wide scope: encompasses the whole data life cycle or ecosystem
- Can be applied to different use cases (including various verticals)
- Represents different system architectures (e.g., an enterprise data warehouse, a distributed cloud-based system using multiple service providers)
- Focus: potentially an initial focus on Big Data analytics and tools
- Assists in identifying security and privacy issues
- Agnostic to any specific technologies
7/24/2013 NIST Big Data WG / Ref Arch Sub-group

RA Diagram Independent Submissions
- Different styles and perspectives, but easy to map between them:
  - Data centric (Wo Chang)
  - Data Flow centric (Orit Levin, Bob Marcus)
  - Technology Layers / Stack diagram (Gary Mazzaferro)
- The vocabulary used in these submissions and on the mailing list has been compiled and submitted as M

Abstract Reference Architecture by Wo Chang / NIST

Independent RA Proposals: Big Data Sources, Usage, Transformation, and Infrastructure
- Data Flow Diagram by Bob Marcus
- Technology Stack / Layers Diagram by G. Mazzaferro
- Data Flow Ecosystem Diagram by Orit Levin

Data Sources and Usage
- Data Flow Diagram by Bob Marcus
- Technology Stack / Layers Diagram by G. Mazzaferro
- Data Flow Ecosystem Diagram by Orit Levin

Infrastructure: Storage, Security, and Management
- Data Flow Diagram by Bob Marcus
- Technology Stack / Layers Diagram by G. Mazzaferro
- Data Flow Ecosystem Diagram by Orit Levin

Data Transformation: Processing, Analytics, and Visualization
- Data Flow Diagram by Bob Marcus
- Technology Stack / Layers Diagram by G. Mazzaferro
- Data Flow Ecosystem Diagram by Orit Levin

Draft Agreement / Rough Consensus
- Transformation includes:
  - Processing functions
  - Analytic functions
  - Visualization functions
- Data Infrastructure includes:
  - Data stores
  - In-memory DBs
  - Analytic DBs
Diagram blocks: Sources; Transformation; Usage; Data Infrastructure; Security; Management; Cloud Computing; Network
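The draft agreement above can be written down as a small lookup structure, which may help when checking that later diagrams use the same block names. This is only an illustrative sketch: the block and component names are taken from the slide, while the dictionary layout and the `block_of` helper are assumptions introduced here and are not part of the subgroup's material.

```python
from typing import Optional

# Block names and members come from the slide; the dict layout is an assumption.
# Blocks whose contents the slide does not enumerate are left empty.
REFERENCE_ARCHITECTURE = {
    "Sources": [],
    "Transformation": [
        "Processing functions",
        "Analytic functions",
        "Visualization functions",
    ],
    "Usage": [],
    "Data Infrastructure": ["Data stores", "In-memory DBs", "Analytic DBs"],
    "Security": [],
    "Management": [],
    "Cloud Computing": [],
    "Network": [],
}


def block_of(component: str) -> Optional[str]:
    """Return the top-level block that lists `component`, or None if unlisted.

    Hypothetical helper, not defined anywhere in the presentation.
    """
    for block, members in REFERENCE_ARCHITECTURE.items():
        if component in members:
            return block
    return None
```

For example, `block_of("In-memory DBs")` returns `"Data Infrastructure"`, and a component the slide does not list returns `None`.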

Next Steps and AIs
Deliverable I: Write the White Paper draft showing one or more approaches (e.g., Data Flow and Stack) using the same or similar terminology
- AI: Chairs will start the draft of the document incorporating the submissions to the Ref Arch subgroup
- AI: Close cooperation between the “Ref Arch” and “Def&Tax” sub-groups (Output: taxonomy for the RA diagrams with definitions for major entities/blocks; Input: M)
Deliverable II: A draft of a single RA requires more discussion and inputs based on the work of all sub-groups
- AI: Chairs will start the draft of the document incorporating the findings of the Ref Arch subgroup
- AI: Review the latest contributions to the Ref Arch and incorporate their findings (See from Yuri Demchenko / University of Amsterdam)
- AI: Close cooperation with the “Use Cases” and “Security” sub-groups to identify the areas of focus for “zooming” into their architecture

Backup Slides

Submitted RAs

Data Centric by Wo Chang / NIST

Data Flow Diagram by Bob Marcus

Data Flow Ecosystem Diagram by Orit Levin
Diagram labels: Individual Data Transfer; Big Data Transfer; Selected Data Storage and Retrieval; Big Data Storage and Retrieval; Data Objects; Data Sources; Data Usage; Government (incl. health & financial institutions); Industries / Businesses; Network Operators / Telecom; Academia; Data Transformation; Collection; Aggregation; Matching; Data Mining; Conditioning; Data Infrastructure; Storage & Retrieval; Management; Security; Anonymized; Pseudo-anonymized; PII; Volume; Variety; Velocity

Technology Layers / Stack diagram by Gary Mazzaferro

Mapping to Technologies and Use Cases
Prepared by the authors of the original RAs


An Example of Cloud Computing Usage in Big Data Ecosystem
Diagram labels: Individual Data Transfer; Big Data Transfer; Selected Data Storage and Retrieval; Big Data Storage and Retrieval; Data Objects; Data Sources; Data Usage; Government (incl. health & financial institutions); Industries / Businesses; Network Operators / Telecom; Academia; Data Transformation; Collection; Aggregation; Matching; Data Mining; Data Infrastructure; Data Warehouse; Cloud Provider / Service Layer (SaaS, PaaS, IaaS); Volume; Variety; Velocity

Use Case: Advertising
Diagram labels: Online Data Aggregator; Data Subject / Person; Online Sources; Public Records (commons, government, etc.); Offline Sources; Internal Records; Other devices (Smart Grid, surveillance, scientific, etc.); End User devices incl. OS (mobile phones, etc.); Applications (search, publishers, etc.); Match/Bridge Service; Networks; Government, health, financial institutions, academia; Industries / Businesses; Network Operators; Collection; Data Management Platforms (DMPs); UI: Do Not Track (DNT); HTTP: DNT; Analytic Cookie; DMP Cookie; DPI; Match Cookie; Appl. with customers (communications, social network, etc.); Match Container Tag or Pixel request; Offline Data Aggregator; Web Browsers; Data Mining; Person Attribution; Users; SSP; DSP; AdNet; AdX; Agency; Publisher; Advertiser; Advertising Industry Ecosystem; DMP Container Tag or Pixel request; Control; Aggregated; 1st Party; 2nd Party; De-identified; PII; 3rd Party; Contextual Data Collection; Behavioral Data Creation; Big Data Transfer; Individual Data Transfer

Use Case: Enterprise Data Warehouse
Diagram labels: Individual Data Transfer; Big Data Transfer; Selected Data Storage and Retrieval; Big Data Storage and Retrieval; Online Analytical Processing (OLAP); Data Usage; Department Data Mart; Regional Data Mart; Subject Data Mart; Application Data Mart; Data Mining / Knowledge Discovery in Databases (KDD); Extraction, Transformation, and Loading (ETL); Data Transformation; Data Infrastructure; Central Data Warehouse; Management; Security; Archives; Files; Online Transaction Processing (OLTP) Systems; MS Office Documents; Functional Data Mart; Operational Data Store; Staging Area; Data Sources; Manual; Managed Report Environment (MRE); Data Objects
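The enterprise data warehouse slide names ETL, a staging area, a central data warehouse, and data marts, but the transcript does not preserve how the diagram connects them. As a generic illustration of how such pieces commonly fit together, here is a minimal sketch in plain Python. Every function name, field name, and sample row is hypothetical, and the flow shown (source rows, then staging, then warehouse, then a regional mart) is an assumption, not a reading of the original diagram.

```python
from collections import defaultdict


def extract(oltp_rows):
    """Copy raw rows from a source system into a staging area (a list here)."""
    return [dict(row) for row in oltp_rows]


def transform(staged_rows):
    """Normalize region names, convert amounts, and drop rows with no amount."""
    cleaned = []
    for row in staged_rows:
        if row.get("amount") is None:
            continue  # incomplete record: leave it out of the warehouse
        cleaned.append({
            "region": row["region"].strip().upper(),
            "amount": float(row["amount"]),
        })
    return cleaned


def load(warehouse, rows):
    """Append transformed rows to the central warehouse (a list here)."""
    warehouse.extend(rows)
    return warehouse


def regional_mart(warehouse):
    """Derive a regional data mart: total amount per region."""
    totals = defaultdict(float)
    for row in warehouse:
        totals[row["region"]] += row["amount"]
    return dict(totals)


# Hypothetical sample rows standing in for an OLTP source.
oltp_rows = [
    {"region": " east ", "amount": "10.5"},
    {"region": "West", "amount": "4"},
    {"region": "east", "amount": None},
    {"region": "EAST", "amount": "2.5"},
]

warehouse = load([], transform(extract(oltp_rows)))
mart = regional_mart(warehouse)
```

Keeping extraction, transformation, and loading as separate functions mirrors the separate blocks on the slide; a real deployment would replace the in-memory lists with actual source systems and stores.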
