Published by Griselda McGee. Modified over 9 years ago.
1
BIG DATA SECURITY & PRIVACY
NIST Public Working Group
Version 2 Possible Directions
Version 1.2
2
NIST S&P Version 2: The Big Two
1. “New” Big Data Security and Privacy Design Patterns
2. Big Data Security Fabric
3
4
Reality Check in Apache Ecosystem
Secure, Multi-Tenant Deployment

Much like the early days of Hadoop, Apache Storm originally evolved in an environment where security was not a high-priority concern. Rather, it was assumed that Storm would be deployed to environments suitably cordoned off from security threats. While a large number of users were comfortable setting up their own security measures for Storm, this proved a hindrance to broader adoption among larger enterprises where security policies prohibited deployment without specific safeguards.

Yahoo! hosts one of the largest Storm deployments in the world, and the engineering team recognized the need for security early on, so it implemented many of the features necessary to secure its own Apache Storm deployment. Yahoo!, Hortonworks, Symantec, and the broader Apache Storm community have been working on integrating those security innovations into the main Apache code base.

That work is nearing completion, and is slated to be included in an upcoming Apache Storm release. Some of the highlights of that release include:
- Kerberos Authentication with Automatic Credential Push and Renewal
- Multi-Tenant Scheduling
- Secure integration with other Hadoop Projects (such as ZooKeeper, HDFS, HBase, etc.)
- User isolation (Storm topologies run as the user who submitted them)

In the future, you can expect to see further integration between Apache Storm and security-focused projects like Apache Argus (formerly XA Secure).

http://bit.ly/1Dlf2UP
5
Implications | Directions
- NIST Big Data PWG documentation should show awareness of trends and current efforts (good and bad)
- NIST Big Data PWG should be a step or two ahead
- Incorporate or link to work in grid, VLDB, and distributed computing
- May need to separate “Expository” from “Technical” documents (a la OASIS TCs)
- What elements should be fabric?
- What elements should be design patterns?
6
Security & Privacy (& Management)
[Figure: NIST Big Data reference architecture, enclosed by the Security & Privacy and Management fabric.]
- System Orchestrator
- Data Provider
- Big Data Application Provider: Collection, Curation, Analytics, Visualization, Access
- Data Consumer
- Big Data Framework Provider: Processing Frameworks (analytic tools, etc.), Platforms (databases, etc.), Infrastructures, Physical and Virtual Resources (networking, computing, etc.); each layer marked Horizontally Scalable (VM clusters) or Vertically Scalable
- Flows between components are labeled DATA and SW
7
What is a security fabric?
Fabric computing has an accepted definition. We must clarify and amplify from that starting point:

Fabric computing or unified computing involves the creation of a computing fabric consisting of interconnected nodes that look like a 'weave' or a 'fabric' when viewed collectively from a distance. [1] Usually this refers to a consolidated high-performance computing system consisting of loosely coupled storage, networking and parallel processing functions linked by high bandwidth interconnects (such as 10 Gigabit Ethernet and InfiniBand) [2] but the term has also been used to describe platforms like the Azure Services Platform and grid computing in general (where the common theme is interconnected nodes that appear as a single logical unit). [3]

The fundamental components of fabrics are "nodes" (processor(s), memory, and/or peripherals) and "links" (functional connection between nodes). [2] While the term "fabric" has also been used in association with storage area networks and switched fabric networking, the introduction of compute resources provides a complete "unified" computing system. Other terms used to describe such fabrics include "unified fabric", [4] "data center fabric" and "unified data center fabric". [5]

According to Ian Foster, director of the Computation Institute at the Argonne National Laboratory and University of Chicago, "grid computing 'fabrics' are now poised to become the underpinning for next-generation enterprise IT architectures and be used by a much greater part of many organizations." [3]
8
Big Data S&P Fabric
Possible starting points
- Orchestrator as workflow manager for policy propagation
- Collection: Event triggers for collection of PII
- Curation: Provenance; human-mediated processes; automated curation tools
- Visualization: Risks around images of people in context; e.g., Google Street View, facial recognition
- Analytics: Controls over de-anonymization analytics apps with demonstrable commercial or forensic value
- Organization-specific issues: Tied to framework providers (internal roles, platform-specific features)
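The first two starting points (an orchestrator that propagates policy, and event triggers on PII collection) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the NIST documents: the `Orchestrator` and `Record` classes, the `pii:` tag format, and the two regular expressions, which stand in for real PII detection and would miss most real-world cases.

```python
import re
from dataclasses import dataclass, field

# Hypothetical detectors; real PII detection needs far more than two regexes.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass
class Record:
    payload: str
    policy_tags: set = field(default_factory=set)


def detect_pii(text):
    """Return the names of all hypothetical PII patterns found in text."""
    return {name for name, pattern in PII_PATTERNS.items() if pattern.search(text)}


class Orchestrator:
    """Assumed role: fires registered handlers whenever collected data contains PII."""

    def __init__(self):
        self.handlers = []

    def on_pii(self, handler):
        self.handlers.append(handler)

    def collect(self, payload):
        record = Record(payload)
        found = detect_pii(payload)
        if found:
            # Tag the record so the policy travels with the data.
            record.policy_tags |= {f"pii:{name}" for name in found}
            for handler in self.handlers:
                handler(record)
        return record


def curate(record):
    """A downstream stage: transforms the payload but carries the tags forward."""
    return Record(record.payload.strip().lower(), set(record.policy_tags))
```

The design choice worth noting is that tags are attached at collection time and copied by every later stage, so a curation or analytics step never has to re-detect PII to know which policy applies.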
9
“Fabric” Not Original, Which is Good
10
Use Case: Image Processing
11
Big Data: Risks & Solutions
http://bit.ly/1CaUTyZ
12
Audit to Provenance
- Policy-preserving pipelines, processes
- Can the systems orchestrator do this?
- Curator: Use cases from Internet of Things
- Threats to data providers
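One way to read "audit to provenance" is an audit log whose entries are hash-chained, so that the history of a dataset can be both replayed and checked for tampering. The sketch below is an assumption about what such a log could look like, not a design from the working group; the field names (`actor`, `action`, `dataset`) and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(log, actor, action, dataset):
    """Append an audit entry that commits to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    body = {"actor": actor, "action": action, "dataset": dataset, "prev": prev}
    log.append({**body, "hash": _digest(body)})
    return log


def verify(log):
    """Return True only if every entry is intact and correctly linked."""
    prev = GENESIS
    for entry in log:
        body = {key: value for key, value in entry.items() if key != "hash"}
        if entry["prev"] != prev or entry["hash"] != _digest(body):
            return False
        prev = entry["hash"]
    return True


def lineage(log, dataset):
    """Provenance view: who did what to one dataset, in order."""
    return [(e["actor"], e["action"]) for e in log if e["dataset"] == dataset]
```

A chain like this detects edits to past entries but does not by itself prevent an attacker from rewriting the whole log; anchoring the latest hash somewhere outside the system would be needed for that.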
13
Existing Models Guidance
- Leverage existing models that integrate roles, organizations and technologies. Don’t reinvent; show what’s new or different.
- Organized around the V’s or RA components
- Example: ITIL Security Management
- Example: OASIS Privacy-by-Design for software engineers, http://bit.ly/1Cb7gLA
- PbD-SE offers a privacy extension/complement to OMG’s Unified Modeling Language (UML) and serves as a complement to OASIS’ eXtensible Access Control Mark-up Language (XACML) and Privacy Management Reference Model (PMRM).
- OASIS XACML: core XML schema for representing authorization and entitlement policies
- OASIS Privacy Management Reference Model
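To make the XACML reference concrete, here is a minimal sketch of attribute-based policy evaluation with a deny-overrides combining rule, which is one of the combining algorithms XACML defines. It deliberately does not use XACML's XML syntax or any XACML library; the rule dictionaries, attribute names, and the `evaluate` function are hypothetical, and the Indeterminate decision is omitted for brevity.

```python
def rule_applies(rule, request):
    """A rule applies when every attribute in its target matches the request."""
    return all(request.get(key) == value for key, value in rule["target"].items())


def evaluate(rules, request):
    """Deny-overrides: any applicable Deny wins; otherwise Permit if any rule permits."""
    decision = "NotApplicable"
    for rule in rules:
        if rule_applies(rule, request):
            if rule["effect"] == "Deny":
                return "Deny"
            decision = "Permit"
    return decision


# Hypothetical policy: analysts may read, but PII may not be used for marketing.
RULES = [
    {"target": {"role": "analyst", "action": "read"}, "effect": "Permit"},
    {"target": {"resource_class": "pii", "purpose": "marketing"}, "effect": "Deny"},
]
```

For example, `evaluate(RULES, {"role": "analyst", "action": "read", "resource_class": "pii", "purpose": "marketing"})` returns `"Deny"`, because the privacy rule overrides the role-based permit.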
14
Coordination
15
Concepts in Play
Possible S&P Design Patterns for PII
- PII should be called out in Big Data models
- Borrow from DoDAF where S&P is life-and-death
- Kantara Initiative User-Managed Access
- Privacy by Design (next slide)
- IoT standards: end point protection
- Operations on encrypted content
- Variety-influenced policy management
- New Big Data roles for curation, risk management, governance
16
Definitions: Privacy and Security
Possible Blurred Lines
- Information Assurance
- Provenance
- Risk Management
- De-anonymizing analytics
- Time-dependent information value
17
Privacy by Design
Good red zone markings, but not fabric or Big Data design patterns
18
MARK UNDERWOOD
Mark.underwood@kryptonbrothers.com
http://bigdatawg.nist.gov/home.php