Using Big Data clustering for improved security intelligence


1 Using Big Data clustering for improved security intelligence
By Abubakar Sheriff, Federal University of Technology Minna
3rd Big Data Analytics and Innovation Conference, November, NDC, Abuja, Nigeria

2 Introduction: Big Data
Big data refers to extremely large data sets, consisting of structured, semi-structured and unstructured data, that can be analysed computationally. These data sets are usually characterised by their volume, velocity, variety, veracity and value. Analysing them can reveal patterns, trends and associations, especially those relating to human behaviour and interactions.

3 Introduction: Clustering
Clustering is a method of aggregating large volumes of data into groups based on specified characteristics. This makes the analysis, understanding, interpretation and assimilation of large data sets easier and faster.

4 Introduction: Clustering (2)
Clustering data can provide valuable information in several fields, including:
- Security: human behaviour analysis
- Medicine: DNA and genotype patterns
- Business: market research and analysis
- Computing: evolutionary algorithms
- Geography: climatology and seasonal weather predictions

5 Introduction: Security Intelligence
Data intelligence is the continuous, real-time collection, normalization and analysis of data generated by users, applications and infrastructure. This typically includes log management, security event correlation and network activity monitoring. Security intelligence is actionable information that can be used to prevent or resolve criminal activities or intent.

6 Introduction: Security Intelligence (2)
Security intelligence has become an essential tool for crime management because perpetrators are increasingly able to circumvent traditional security systems. Furthermore, the continuing dissolution of traditional defensive perimeters, coupled with attackers' enhanced abilities, requires organizations to adopt an intelligence-driven security model that is more risk-aware, contextual and agile.

7 Overview: Big Data & Security Intelligence
Intelligence-driven security relies on big data analytics. Big data provides both the breadth of sources and the depth of information needed for intelligent monitoring. This intelligence can then be used to assess risks accurately and to defend against illicit activity and advanced cyber threats.

8 Clustering, Analytics & Security Intelligence
The ability of clustering to highlight information in vast amounts of data makes it invaluable for deriving security intelligence. Security operatives can use the following data sources in conjunction with clustering to enhance intelligence gathering:
- Existing crime data
- Mobile phone call logs
- Mobile phone geo-positions (location)
- Financial transaction data
- Financial market data

9 Clustering, Analytics & Security Intelligence (2)
Using clustering on big data can help achieve the following:
- Consolidation of otherwise isolated and incompatible silo stores of data
- Easy detection of anomalies in large data sets
- Establishing correlation and causality between otherwise independent and isolated data
- Real-time analysis of data to aid crime prevention and crime resolution
- Flexibility in adapting to constantly changing data and data environments

10 Clustering, Analytics & Security Intelligence: Example
The image below shows the geographical positions of calls registered by users on a particular cell tower (mast).
[Figure: call positions. Axes: Longitude, Latitude, Altitude]

11 Clustering, Analytics & Security Intelligence: Example
An immediate observation from the image is a small group of calls made outside the normal perimeter of the other calls. If this location is not marked as a residential or inhabited area, it can be flagged immediately for review.
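The kind of flagging described on this slide can be sketched in a few lines. This is only an illustration, not the method used in the presentation: the coordinates are invented, the function name `flag_remote_calls` and the threshold are assumptions, and plain Euclidean distance on degrees is used as a rough stand-in for geographic distance over a small area.

```python
from math import hypot
from statistics import median


def flag_remote_calls(points, threshold=5.0):
    """Return indices of points unusually far from the bulk of the calls.

    Uses the component-wise median as a robust centre and the median
    absolute deviation (MAD) of the distances as a robust spread.
    The threshold is an illustrative choice, not a recommended value.
    """
    centre_lon = median(p[0] for p in points)
    centre_lat = median(p[1] for p in points)
    dists = [hypot(p[0] - centre_lon, p[1] - centre_lat) for p in points]
    med = median(dists)
    mad = median(abs(d - med) for d in dists)
    if mad == 0:
        return []  # no spread to compare against
    return [i for i, d in enumerate(dists) if (d - med) / mad > threshold]


# Hypothetical (longitude, latitude) pairs: eight calls close together
# and two calls well outside the usual perimeter.
calls = [
    (6.550, 9.610), (6.552, 9.612), (6.549, 9.611), (6.551, 9.609),
    (6.553, 9.613), (6.548, 9.608), (6.550, 9.612), (6.552, 9.610),
    (6.600, 9.660), (6.601, 9.661),
]
flagged = flag_remote_calls(calls)
print(flagged)
```

In practice the flagged positions would still need to be checked against land-use data (for example, whether the area is residential) before being sent for review.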

12 Clustering, Analytics & Security Intelligence: Example (2)
This example shows stock market trade prices and transactions from the FTSE 100 in 2015.
[Figure: FTSE 100 trades. Axes: Number of Customers, Share Price, Volume of Trades]

13 Clustering, Analytics & Security Intelligence: Example (2)
The graph shows a few isolated transactions that do not conform to the general pattern of the other transactions. These can be flagged immediately for further investigation into their nature.
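One simple way to pick out such isolated transactions is a robust outlier score. The sketch below uses the modified z-score (based on the median and the median absolute deviation) with the commonly quoted cutoff of 3.5. The trade records, field names and the function `flag_unusual_trades` are hypothetical and are not taken from the FTSE 100 data shown on the slide.

```python
from statistics import median


def modified_z_scores(values):
    """Modified z-score: 0.6745 * (x - median) / MAD for each value."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def flag_unusual_trades(trades, cutoff=3.5):
    """Return ids of trades whose price or volume is an outlier."""
    price_z = modified_z_scores([t["price"] for t in trades])
    volume_z = modified_z_scores([t["volume"] for t in trades])
    return [
        t["id"]
        for t, pz, vz in zip(trades, price_z, volume_z)
        if abs(pz) > cutoff or abs(vz) > cutoff
    ]


# Hypothetical trades: T6 has an unusual volume, T7 an unusual price.
trades = [
    {"id": "T1", "price": 101.0, "volume": 1200},
    {"id": "T2", "price": 100.5, "volume": 1100},
    {"id": "T3", "price": 101.2, "volume": 1300},
    {"id": "T4", "price": 100.8, "volume": 1250},
    {"id": "T5", "price": 101.1, "volume": 1150},
    {"id": "T6", "price": 100.9, "volume": 98000},
    {"id": "T7", "price": 140.0, "volume": 1180},
]
suspicious = flag_unusual_trades(trades)
print(suspicious)
```

Scoring each feature separately is a simplification; a clustering-based approach would look at price, volume and customer count jointly.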

14 Clustering Research Methodology
K-means is used as the clustering technique for the big data sources. K-means can be initialised in several ways; the methods operate differently and can lead to a different number of clusters (the initial "k" value). Common methods include k-means++ (which seeds the initial centroids) and the Elbow Method, Gap Statistic Method, Silhouette Method, Calinski-Harabasz Criterion and Hartigan's Method (which estimate a suitable value of k).
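As a rough illustration of the technique named on this slide, the sketch below implements k-means with k-means++ seeding using only the Python standard library. It is a minimal teaching version under stated assumptions (small in-memory data, Euclidean distance, invented sample points), not the implementation used in the research.

```python
import random
from math import dist


def kmeans_pp_init(points, k, rng):
    """Pick k starting centroids using k-means++ seeding."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Weight each point by its squared distance to the nearest
        # centroid chosen so far, so far-away points are favoured.
        weights = [min(dist(p, c) ** 2 for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = kmeans_pp_init(points, k, rng)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Hypothetical 2-D points forming three loose groups.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (8.0, 8.0), (8.2, 7.9), (7.9, 8.1),
    (1.0, 8.0), (1.1, 8.2), (0.8, 7.9),
]
centroids, labels = kmeans(points, k=3, seed=42)
print(centroids, labels)
```

Note that the value of k is supplied by the caller here; the seeding step only decides where the k centroids start.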

15 Clustering Research Methodology (cont’d)
These different k-values provide different levels of accuracy, depending on the data being analysed. The performance of the initialisation methods depends on various factors. This research aims to explore the relationship between the initialisation methods and the characteristics of the data.
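To make the effect of different k-values concrete, the sketch below scores two candidate groupings of the same points with the mean silhouette coefficient, one of the methods listed on the earlier slide. The points and both labelings are invented for illustration, and the function name `mean_silhouette` is an assumption.

```python
from math import dist
from statistics import mean


def mean_silhouette(points, labels):
    """Mean silhouette coefficient of a labelling (higher is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself adds nothing to the sum).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            mean(dist(p, q) for q in other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Hypothetical points with three visible groups.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]
labels_k3 = [0, 0, 0, 1, 1, 1, 2, 2, 2]  # one label per visible group
labels_k2 = [0, 0, 0, 0, 0, 0, 1, 1, 1]  # first two groups merged

score_k3 = mean_silhouette(points, labels_k3)
score_k2 = mean_silhouette(points, labels_k2)
print(score_k3, score_k2)
```

For this toy data the three-cluster labelling scores higher than the two-cluster one, which is the kind of comparison a silhouette-based choice of k relies on.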

16 Clustering Research Methodology (cont’d)
The five main characteristics of big data are Volume, Velocity, Variety, Veracity and Value. The following table summarises metrics established based on these characteristics.

17 Experiment: Different Number of Clusters (Mobile Phone Geopositions)

18 Experiment : Different Number of Clusters (Financial Data)

19 Experiment: Metrics

| Big Data Feature | Metric | Unit |
|---|---|---|
| Volume | Actual size of the data | GB (gigabytes) |
| Volume | Data volume changes per second | GB (gigabytes)/s |
| Volume | Percentage of data change | % |
| Variety | Volume of structured data | |
| Variety | Volume of semi-structured data | |
| Variety | Volume of unstructured data | |
| Variety | Percentage of structured data | |
| Variety | Relative volume of structured to semi-structured data | Ratio |
| Variety | Relative volume of structured to unstructured data | |
| Velocity | Rate of change of data volume | Per second |
| Velocity | Rate of change of structured data volume | |
| Velocity | Rate of change of semi-structured data volume | |
| Velocity | Rate of change of unstructured data volume | |
| Veracity | Rate of error | Number of errors per second |
| Veracity | Volume of error | |
| Veracity | Percentage error | |
| Veracity | Real-time accuracy | |
| Value | Amount of big data volume used | |
| Value | Percentage of volume of data used | |
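A few of these metrics can be computed directly from two size measurements of a data store. The sketch below is illustrative only: the `Snapshot` class, its field names and the sample figures are assumptions, not values or definitions from the experiment.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical size measurement of a data store at one moment."""
    timestamp_s: float
    structured_gb: float
    semi_structured_gb: float
    unstructured_gb: float

    @property
    def total_gb(self):
        return self.structured_gb + self.semi_structured_gb + self.unstructured_gb


def variety_metrics(s):
    """Percentage of structured data and the two relative-volume ratios."""
    return {
        "pct_structured": 100.0 * s.structured_gb / s.total_gb,
        "structured_to_semi": s.structured_gb / s.semi_structured_gb,
        "structured_to_unstructured": s.structured_gb / s.unstructured_gb,
    }


def velocity_gb_per_s(earlier, later):
    """Rate of change of total data volume, in GB per second."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    return (later.total_gb - earlier.total_gb) / elapsed


def pct_volume_change(earlier, later):
    """Percentage change in total data volume between two snapshots."""
    return 100.0 * (later.total_gb - earlier.total_gb) / earlier.total_gb


# Invented figures: 100 GB growing to 110 GB over one minute.
t0 = Snapshot(0.0, 40.0, 10.0, 50.0)
t1 = Snapshot(60.0, 43.0, 11.0, 56.0)

print(variety_metrics(t0))
print(velocity_gb_per_s(t0, t1))
print(pct_volume_change(t0, t1))
```

The veracity and value metrics would need additional inputs (error counts and the volume of data actually used), which are left out of this sketch.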

20 Challenges in Implementation
- Cluster Visualisation: Improving human-computer interaction through visual interfaces can make big data easier for security outfits to consume.
- User Privacy: Existing regulations limit security agencies' access to information protected under privacy laws.
- Public/Private Partnerships: Several current big data sources are generated by private companies, which limits the level of access available to security agencies.
- Data Veracity: Because big data is derived from several data sources, it can be difficult to ascertain the trustworthiness of the data.

21 Solutions to Facilitate Implementation
- Review current legislation on data capture, storage and usage by both public and private organisations.
- Provide an enabling, collaborative environment with incentives to enhance the exchange of data.
- Provide the infrastructure required to support high-speed, high-volume data transfer across large areas.
- Promote the education of scientists on the use of data, and provide software tools to enhance the culture of data sharing in the scientific environment, thereby improving research quality and overall data quality.

22 Thank You…

