Securing Native Big Data Deployments
Steven C. Markey, MSIS, PMP, CISSP, CIPP/US, CISM, CISA, STS-EV, CCSK, Cloud+
Principal, nControl, LLC
Adjunct Professor
Presentation Overview
– Why Should You Care?
– Big Data Overview
– Securing Native Big Data Deployments
Securing Big Data
Why Should You Care
– Organizational Cost Reduction Requirements
   Justify Investments
   Improve Efficiencies (Productivity, Time to Market)
– Digital Information: ~60% Annual Growth Rate (AGR)
– Data Storage: 15-20% AGR
   Capital Expense (CapEx)
– Categorization, Classification & Retention
   Magnify Compliance, Legal & Privacy Regulations
– Prevalent & Interconnected Business Ecosystems
   Supply Chains
   Business Process Outsourcers (BPO)
   Information Technology Outsourcers (ITO)
   Vendor’s Vendors
Source: IDC
Source: Flickr
Big Data Overview
– Aggregated Data from the Following Sources
   Traditional
   Sensory
   Social
Traditional Data
– Database Management Systems
   Relational Database Management Systems (RDBMS)
   Object-Oriented Database Management Systems (OODBMS)
   Non-Relational, Distributed DB Management Systems (NRDBMS)
   Mobile Databases (SQLite, Oracle Lite)
– Online Transaction Processing (OLTP)
   Real-Time Data Warehousing
– Online Analytical Processing (OLAP)
   Operational Data Stores (ODS)
   Enterprise Data Warehouses (EDW)
Traditional Data
– OLAP
   Business Intelligence (BI)
   – Data Mining
   – Reporting
   – OLAP (Continued)
      » Relational OLAP (ROLAP)
      » Multi-Dimensional OLAP (MOLAP)
      » Hybrid OLAP (HOLAP)
[Diagram: three data flows]
   OLTP → ODS → EDW (Data Marts) → BI (Data Mining)
   OLTP → ODS → EDW (Data Marts) → BI (Reporting)
   OLTP → ODS → EDW (Data Marts) → BI (OLAP)
Source Data
– Log Files
   Event Logs / Operating System (OS)-Level
   Appliance / Peripherals
   Analyzers / Sniffers
– Multimedia
   Image Logs
   Video Logs
– Web Content Management (WCM)
   Web Logs
   Search Engine Optimization (SEO)
– Web Metadata
[Diagram: OpenStack]
– Services: User Interface (Horizon), Object Store (Swift), Image Store (Glance), Compute (Nova), Block Storage (Cinder), Network Services (Neutron), Key Service (Barbican), Monitoring/Metering (Ceilometer), Web Messaging (Zaqar), Messaging (Oslo)
– Labeled features: Trusted Compute Pools (Extended with Geo Tagging), OVF Meta-Data Import, Intel® DPDK vSwitch, Enhanced Platform Awareness, Erasure Code, Expose Enhancements, Filter Scheduler, Object Storage Policy, Key Encryption & Management, Advanced Services in VMs, Intelligent Workload Scheduling, Metrics, VPN-as-a-Service (with Intel® QuickAssist Technology)
Big Data Overview
– Aggregators
   Mostly NRDBMS Implementations
   – Not only Structured Query Language (NoSQL)
   NRDBMS Examples
   – Column Family Stores: BigTable (Google), Cassandra & HBase (Apache)
   – Key-Value Stores: App Engine DataStore (Google) & DynamoDB
   – Document Databases: CouchDB, MongoDB
   – Graph Databases: Neo4J
Big Data Overview
– Serial Processing
   Hadoop
   – Hadoop Distributed File System (HDFS)
   – Hive (Data Warehouse)
   – Pig (Querying Language)
   Riak
– Parallel Processing
   HadoopDB
– Analytics
   Google MapReduce
   Apache MapReduce
   Splunk (for Security Information / Event Management [SIEM])
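The MapReduce engines named on this slide share a map, shuffle, reduce pattern. The following is a minimal single-process sketch of that pattern in plain Python, not the Google or Apache implementation. The function names and the sample log lines are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit a (token, 1) pair for every whitespace-separated token in one record."""
    return [(token.lower(), 1) for token in record.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse the values for one key into a single count."""
    return key, sum(values)


def token_count(records):
    """Run map, shuffle, and reduce in sequence over an iterable of records."""
    mapped = chain.from_iterable(map_phase(record) for record in records)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


# Hypothetical web-log fragments used as input.
logs = ["GET /index GET", "POST /login", "GET /login"]
counts = token_count(logs)
```

In a real cluster the map and reduce calls run on different nodes and the shuffle moves data across the network, which is why access controls and logging on each stage matter.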
Source: Cloudera
Source: Wikispaces
Source: Google
Source: Cloudera
Source: Flickr
Securing Cloud-Based NRDBMS Solutions
– General
   Focus on Application / Middleware-Level Security
   – Single Sign-on (SSO)
   – SQL Injections Are Still Possible
   – Leverage Application IAM for NRDBMS User Rights Mgmt (URM)
   – Leverage Application & System Logging for Accounting
   Segregation of Duties
   – Read / Write Namespaces
   – Read-Only Namespaces
– Specific
   Cryptography & Obfuscation
   – Homomorphic Encryption
   – Stateless Tokenization
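The point that injections are still possible can be illustrated with document stores that accept JSON-style filters. If an application copies a parsed JSON value straight into a filter, a client can send an object such as {"$ne": null} where a string was expected, and the store evaluates it as a query operator instead of a literal. The sketch below shows one application-level guard, assuming a MongoDB-style filter syntax in which operators are keys starting with "$". The function names are hypothetical and no database driver is involved.

```python
def contains_operator(value):
    """Return True if any dict key nested inside value starts with '$'."""
    if isinstance(value, dict):
        return any(
            (isinstance(key, str) and key.startswith("$")) or contains_operator(item)
            for key, item in value.items()
        )
    if isinstance(value, (list, tuple)):
        return any(contains_operator(item) for item in value)
    return False


def build_login_filter(username, password_hash):
    """Build an equality-only filter, rejecting anything that is not a plain string."""
    for name, value in (("username", username), ("password_hash", password_hash)):
        if not isinstance(value, str):
            raise ValueError(f"{name} must be a string")
    return {"username": username, "password_hash": password_hash}


# Attacker-controlled JSON body, already parsed by the web framework.
malicious_body = {"username": "admin", "password_hash": {"$ne": None}}

try:
    build_login_filter(malicious_body["username"], malicious_body["password_hash"])
    rejected = False
except ValueError:
    rejected = True
```

Type checks at the application or middleware layer complement, but do not replace, the IAM and logging controls listed on the slide.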
SSO: Good, Bad & Ugly
SSO Standards & Categories:
– Network: LDAP, Kerberos, RADIUS, RDBMS
   – e.g., OpenLDAP, AD, Tivoli Access Manager
– Federated: SAML, OpenID, OAuth, WS-Federation, XACML
   – e.g., Keycloak, PingFederate, ADFS, RSA Federated
Source: Microsoft
SSO: Good, Bad & Ugly Source: OASIS
SSO: Good, Bad & Ugly Source: OASIS
Source: Apache
Presentation Take-Aways
– Big Data is Here to Stay
– It Has to be Secure
   – Segregation of Data
   – Access Controls
   – Separation / Segregation of Duties
   – Federated Identities
   – Logging
   – Crypto v2.0
      » Homomorphic Encryption
      » Stateless Tokenization
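As a rough illustration of the stateless tokenization take-away, the sketch below derives a token from the sensitive value and a secret key, so no lookup vault has to be stored or replicated across the cluster. It is a one-way variant built on HMAC-SHA256 and is an assumption made for illustration: production vaultless schemes are often reversible and format-preserving, which this sketch is not. The key, the sample value, and the function names are hypothetical.

```python
import hashlib
import hmac


def tokenize(value: str, key: bytes, length: int = 32) -> str:
    """Derive a deterministic token from value; the same key and value always match."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncation shortens the token at the cost of a higher collision probability.
    return digest[:length]


def tokens_match(value: str, token: str, key: bytes) -> bool:
    """Check a candidate value against a stored token in constant time."""
    return hmac.compare_digest(tokenize(value, key, len(token)), token)


# Hypothetical key; in practice it would come from a key management service.
demo_key = b"example-key-not-for-production"
card_token = tokenize("4111111111111111", demo_key)
```

Because the token is deterministic, joins and aggregations over tokenized columns still work, while anyone without the key cannot recompute tokens from guessed values.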
Questions? Contact – – Twitter: markes1 – LI: