Using Big Data clustering for improved security intelligence

Presentation transcript:

Using Big Data clustering for improved security intelligence
By Abubakar Sheriff, Federal University of Technology Minna
3rd Big Data Analytics and Innovation Conference, 22-25 November, NDC, Abuja, Nigeria

Introduction: Big Data
Big data refers to extremely large data sets consisting of structured, semi-structured and unstructured data that may be computationally analysed. These data sets are usually characterised by their volume, velocity, variety, veracity and value. Analysing them can reveal patterns, trends and associations, especially relating to human behavior and interactions.

Introduction: Clustering
Clustering is a method of aggregating large volumes of data into groups based on specified characteristics. This makes the analysis, understanding, interpretation and assimilation of large data sets easier and faster.

Introduction: Clustering (2)
Clustering data can be used to provide valuable information in several fields, including:
- Security: Human behavior analysis
- Medicine: DNA and genotype patterns
- Business: Market research and analysis
- Computing: Evolutionary algorithms
- Geography: Climatology and seasonal weather predictions

Introduction: Security Intelligence
Data intelligence is the continuous real-time collection, normalization and analysis of data generated by users, applications and infrastructure. This typically includes log management, security event correlation and network activity monitoring. Security intelligence is actionable information that can be used to prevent or resolve criminal activities or intent.

Introduction: Security Intelligence (2)
Security intelligence has become an essential tool for crime management because perpetrators are increasingly able to circumvent traditional security systems. Furthermore, the continuing dissolution of traditional defensive perimeters, coupled with attackers' enhanced abilities, requires organizations to adopt an intelligence-driven security model that is more risk-aware, contextual and agile.

Overview: Big Data & Security Intelligence
Intelligence-driven security relies on big data analytics. Big data encompasses both the breadth of sources and the depth of information needed for intelligent monitoring. This intelligence can then be used to assess risks accurately and to defend against illicit activity and advanced cyber threats.

Clustering, Analytics & Security Intelligence
The ability of clustering to highlight information in vast amounts of data makes it invaluable for deriving security intelligence from data. Security operatives can use the following data sources in conjunction with clustering to enhance intelligence gathering:
- Existing crime data
- Mobile phone call logs
- Mobile phone geo-positions (location)
- Financial transaction data
- Financial market data

Clustering, Analytics & Security Intelligence (2)
Using clustering on big data can help achieve the following:
- Consolidation of otherwise isolated and incompatible silo stores of data
- Easy detection of anomalies in large data sets
- Establishing correlation and causality between otherwise independent and isolated data
- Real-time analysis of data to aid crime prevention and crime resolution
- Flexibility in adapting to the constantly changing data and data environment

Clustering, Analytics & Security Intelligence: Example
The image below shows the geographical positions of calls registered by users on a particular cell tower (mast).
[Figure: call positions plotted against Longitude, Latitude and Altitude]

Clustering, Analytics & Security Intelligence: Example
An immediate observation from the image is a small group of calls made outside the normal perimeter of the other calls. If this location is not marked as a residential or inhabited area, it can be flagged immediately for review.
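The observation above, a small group of calls separated from the main body, can be sketched in code. This is a minimal illustration and not the method used in the presentation: it swaps in a simple distance-threshold (single-linkage) grouping rather than k-means, the coordinates are made up, and `max_gap` and `max_share` are hypothetical parameters. It also treats longitude and latitude as plain Euclidean coordinates, which is only reasonable over a small area.

```python
import math
from collections import deque

def group_by_distance(points, max_gap):
    """Give the same label to points linked by a chain of neighbours that are
    each within max_gap of one another (single-linkage grouping via BFS)."""
    labels = [None] * len(points)
    group = 0
    for start in range(len(points)):
        if labels[start] is not None:
            continue
        labels[start] = group
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(len(points)):
                if labels[j] is None and math.dist(points[i], points[j]) <= max_gap:
                    labels[j] = group
                    queue.append(j)
        group += 1
    return labels

def flag_small_groups(labels, max_share):
    """Return the ids of groups that hold at most max_share of all points."""
    counts = {}
    for g in labels:
        counts[g] = counts.get(g, 0) + 1
    return sorted(g for g, n in counts.items() if n / len(labels) <= max_share)

# Hypothetical (longitude, latitude) pairs: a dense 5 x 4 grid of calls
# plus three calls made well outside that area.
calls = [(7.490 + 0.001 * (i % 5), 9.060 + 0.001 * (i // 5)) for i in range(20)]
calls += [(7.560, 9.120), (7.561, 9.121), (7.5605, 9.1205)]

labels = group_by_distance(calls, max_gap=0.005)
suspicious = flag_small_groups(labels, max_share=0.2)
flagged_calls = [p for p, g in zip(calls, labels) if g in suspicious]
```

Here the three remote calls end up in their own small group and are the only ones in `flagged_calls`. In practice the flagged locations would still need to be checked against land-use data before review, as the slide notes.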

Clustering, Analytics & Security Intelligence: Example (2)
This example shows stock market trade prices and transactions from the FTSE 100 in 2015.
[Figure: axes are Number of Customers, Share Price and Volume of Trades]

Clustering, Analytics & Security Intelligence: Example (2)
From the graph, it can be seen that a few isolated transactions do not conform to the general pattern of the other transactions. These can be flagged immediately for further investigation into their nature.

Clustering Research Methodology
K-means is used as the clustering technique for the big data sources. K-means can be initialised in several ways, and the number of clusters (the initial "k" value) can be chosen by several methods, each of which operates differently. K-means++ is a common initialisation (seeding) method; common methods for choosing k include the Elbow Method, Gap Statistic Method, Silhouette Method, Calinski-Harabasz Criterion and Hartigan's Method.
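As a rough illustration of k-means with k-means++ seeding, the sketch below uses only the Python standard library. It is not the implementation behind the presentation's experiments: the function names, the synthetic three-blob data and the choice of k = 3 are all assumptions made for the example.

```python
import math
import random

def kmeans_pp_seeds(points, k, rng):
    """k-means++ seeding: each new seed is drawn with probability proportional
    to its squared distance from the nearest seed chosen so far."""
    seeds = [rng.choice(points)]
    while len(seeds) < k:
        weights = [min(math.dist(p, s) ** 2 for s in seeds) for p in points]
        seeds.append(rng.choices(points, weights=weights, k=1)[0])
    return seeds

def kmeans(points, k, rng, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until the
    assignment stops changing (or max_iter is reached)."""
    centroids = kmeans_pp_seeds(points, k, rng)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels

# Synthetic data (an assumption for illustration): three tight blobs.
rng = random.Random(42)
centres = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [(cx + rng.gauss(0, 0.1), cy + rng.gauss(0, 0.1))
          for cx, cy in centres for _ in range(30)]

centroids, labels = kmeans(points, k=3, rng=rng)
```

Because the seeds are spread out before the first assignment step, k-means++ tends to need fewer iterations and to avoid poor local optima compared with picking all starting centroids uniformly at random, although it still depends on the random draw.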

Clustering Research Methodology (cont'd)
These different k-values provide different levels of accuracy depending on the data being analysed. The performance of the initialisation methods depends on various factors. This research aims to explore the relationship between initialisation methods and the characteristics of the data.
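One way to compare candidate k-values is the silhouette coefficient, one of the methods the methodology names. The sketch below is a simplified illustration under stated assumptions: the nine points and both labelings are invented, and the labelings are written by hand rather than produced by a k-means run, so that the scoring step can be read on its own.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient, (b - a) / max(a, b) averaged over points,
    where a is the mean distance to the other members of a point's own cluster
    and b is the mean distance to the members of the nearest other cluster.
    Assumes at least two clusters and no duplicate points."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # by convention a singleton cluster contributes 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != l)
        total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical data with three visually obvious groups.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
labels_k3 = [0, 0, 0, 1, 1, 1, 2, 2, 2]   # one label per group
labels_k2 = [0, 0, 0, 1, 1, 1, 1, 1, 1]   # two of the groups merged

score_k3 = silhouette_score(points, labels_k3)
score_k2 = silhouette_score(points, labels_k2)
best_k = 3 if score_k3 > score_k2 else 2
```

On this data the three-cluster labeling scores higher, so `best_k` is 3. Applied to real data, the same comparison would be run over the labelings produced by k-means for each candidate k.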

Clustering Research Methodology (cont'd)
The five main characteristics of big data are volume, velocity, variety, veracity and value. The following table summarises metrics established based on these characteristics.

Experiment: Different Number of Clusters (Mobile Phone Geopositions)

Experiment: Different Number of Clusters (Financial Data)

Experiment: Metrics
Big Data Feature | Metric | Unit
Volume | Actual size of the data | GB (gigabytes)
Volume | Data volume changes per second | GB/s
Volume | Percentage of data change | %
Variety | Volume of structured data |
Variety | Volume of semi-structured data |
Variety | Volume of unstructured data |
Variety | Percentage of structured data |
Variety | Relative volume of structured to semi-structured data | Ratio
Variety | Relative volume of structured to unstructured data |
Velocity | Rate of change of data volume | Per second
Velocity | Rate of change of structured data volume |
Velocity | Rate of change of semi-structured data volume |
Velocity | Rate of change of unstructured data volume |
Veracity | Rate of error | Number of errors per second
Veracity | Volume of error |
Veracity | Percentage error |
Veracity | Real-time accuracy |
Value | Amount of big data volume used |
Value | Percentage of volume of data used |
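A few of the volume, variety and velocity metrics listed above are simple arithmetic on data sizes, which the following sketch works through. The `Snapshot` class, the function names and the sample sizes are hypothetical and only illustrate how such metrics might be computed from two measurements taken some seconds apart.

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    """Size in GB of each data category at one point in time (made-up values)."""
    structured_gb: float
    semi_structured_gb: float
    unstructured_gb: float

    @property
    def total_gb(self):
        return self.structured_gb + self.semi_structured_gb + self.unstructured_gb

def variety_metrics(s):
    """Percentage of structured data and the two relative-volume ratios."""
    return {
        "pct_structured": 100.0 * s.structured_gb / s.total_gb,
        "structured_to_semi": s.structured_gb / s.semi_structured_gb,
        "structured_to_unstructured": s.structured_gb / s.unstructured_gb,
    }

def volume_change_gb_per_s(before, after, seconds):
    """Data volume change per second, in GB/s."""
    return (after.total_gb - before.total_gb) / seconds

def pct_volume_change(before, after):
    """Percentage of data change relative to the earlier snapshot."""
    return 100.0 * (after.total_gb - before.total_gb) / before.total_gb

t0 = Snapshot(structured_gb=40.0, semi_structured_gb=10.0, unstructured_gb=50.0)
t1 = Snapshot(structured_gb=44.0, semi_structured_gb=11.0, unstructured_gb=55.0)

variety = variety_metrics(t0)                       # 40 % structured, ratios 4.0 and 0.8
rate = volume_change_gb_per_s(t0, t1, seconds=10)   # 1.0 GB/s
change = pct_volume_change(t0, t1)                  # 10 %
```

The veracity and value metrics would need additional inputs (error counts, volume of data actually used) that the slide does not specify, so they are left out here.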

Challenges in Implementation
- Cluster visualisation: Improving the human-computer interaction offered by visual interfaces can make big data easier for security outfits to consume.
- User privacy: Existing regulations prevent security agencies from accessing information protected under privacy laws.
- Public/private partnerships: Several current big data sources are generated by private companies, which limits the level of access available to security agencies.
- Data veracity: Because big data is derived from several data sources, it can be difficult to ascertain the trustworthiness of the data.

Solutions to Facilitate Implementation
- Review current legislation on data capture, storage and usage by both public and private organisations.
- Provide an enabling, collaborative environment with incentives to enhance the exchange of data.
- Provide the infrastructure required to support high-speed, high-volume data transfer across large areas.
- Promote the education of scientists on the use of data, and provide software tools that foster a culture of data sharing in the scientific environment, thereby improving research quality and overall data quality.

Thank You