Detecting Insider Information Theft Using Features from File Access Logs Every action, on your phone, on your computer, online, has some risk associated.

Presentation transcript:

Detecting Insider Information Theft Using Features from File Access Logs
Every action, on your phone, on your computer, online, has some risk associated with it.
Christopher Gates, Ninghui Li, Zenglin Xu (Purdue University)
Suresh N. Chari, Ian Molloy, Youngja Park (IBM TJ Watson Research)

Detecting Malicious File Access
- Intellectual property theft is an important security problem
- Insider threat
  - Legitimate access
  - In-depth knowledge of resources
  - Knowledge of deployed security mechanisms
- Stolen credentials
  - Can utilize another person's legitimate access

Current Prevention Techniques
- Limit exposure via access control
  - Users need access
  - Productivity is often seen as more important
- Encrypt data at rest
  - Does not stop legitimate access
- Use high-level statistics for detection
  - Does not capture fine-grained detail
  - Does not give specific guidance on a violation

Goal
- Exploit knowledge about resources to detect deviation from access history
- Can also be viewed as estimating/controlling risks of aggregated accesses by one user
- Two kinds of malicious insiders
  - Impetuous
  - Patient

Our Approach
- Generate a score for a set of accesses given a history
  - Score between two files
  - Related to all history
  - All files in current period
  - Normalize

Similarity between Files
- Files are not accessed randomly within a hierarchy
- There are reasons to access specific areas
  - Job function
  - Project
  - Related content
- Similarity can also have many facets
  - Distance
  - Access similarity
  - File type/content: source-code/file-system/web

Distance Score Functions
Columns: Name, Formula (the Formula column is empty in this transcript)
- Binary: exact match. Good if there is high overlap with files from previous time periods (as with source code).
- Full Distance: when similarity up both sides of the branch matters.
- Lowest Common Ancestor (LCA): when distance to an accessed branch is useful.
- Log LCA: penalizes files closer to the root differently than files deeply nested in the hierarchy.
- Access Similarity: can capture similarity for other reasons, so source code and documents in otherwise unrelated areas of the hierarchy can still be similar based on user overlap.
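The slide's formulas are not present in this transcript, so the following Python sketch only illustrates one plausible reading of each function name over slash-separated paths. Every definition here is an assumption made for illustration, not the authors' formula: in particular the normalization used for `log_lca_distance` and the Jaccard form of `access_distance` are hypothetical choices.

```python
from math import log


def split(path):
    """Split a slash-separated path into its components."""
    return [p for p in path.strip("/").split("/") if p]


def lca_depth(a, b):
    """Number of leading path components shared by a and b."""
    depth = 0
    for x, y in zip(split(a), split(b)):
        if x != y:
            break
        depth += 1
    return depth


def binary_distance(a, b):
    """0 for an exact match, 1 otherwise."""
    return 0.0 if split(a) == split(b) else 1.0


def full_distance(a, b):
    """Tree edges from a up to the common ancestor and down to b."""
    d = lca_depth(a, b)
    return (len(split(a)) - d) + (len(split(b)) - d)


def lca_distance(a, b):
    """Tree edges from a up to the lowest common ancestor only."""
    return len(split(a)) - lca_depth(a, b)


def log_lca_distance(a, b):
    """Assumed log-scaled variant in [0, 1]: a shallow common ancestor
    (near the root) gives a value near 1, an exact match gives 0."""
    depth_a = len(split(a))
    if depth_a == 0:
        return 0.0
    return 1.0 - log(1 + lca_depth(a, b)) / log(1 + depth_a)


def access_distance(users_a, users_b):
    """Assumed user-overlap measure: 1 minus the Jaccard similarity
    of the sets of users who accessed each file."""
    union = set(users_a) | set(users_b)
    if not union:
        return 1.0
    return 1.0 - len(set(users_a) & set(users_b)) / len(union)
```

Under these assumed definitions, `/src/net/tcp.c` and `/src/fs/ext.c` share only `/src`, so the full distance counts both branches while the LCA distance counts only the climb from the first file.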

Aggregation Function
Three aggregation functions relate a file f to all files in the history:
- min: the lowest
- ave: average of all
- k-nearest: compares to the k lowest
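As a rough sketch of the three aggregation functions, the Python below takes the list of distances from a file f to every file in the history and reduces it to one score. The function name `aggregate` and its parameters are hypothetical, and averaging the k lowest distances is an assumption: the slide only says that k-nearest "compares to k lowest".

```python
def aggregate(distances, how="min", k=3):
    """Reduce the distances from one file to all history files to one score."""
    if not distances:
        raise ValueError("history is empty")
    ordered = sorted(distances)
    if how == "min":
        # Distance to the single closest file in the history.
        return ordered[0]
    if how == "ave":
        # Average distance to every file in the history.
        return sum(ordered) / len(ordered)
    if how == "k-nearest":
        # Assumption: average over the k closest files.
        nearest = ordered[:k]
        return sum(nearest) / len(nearest)
    raise ValueError(f"unknown aggregation: {how}")
```

For example, `aggregate([4, 1, 3, 2], "k-nearest", k=2)` averages the two smallest distances, 1 and 2.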

Data
- CMVC (Configuration Management Version Control) source code management system
- Log data: [user, timestamp, action, resource]
- For evaluation we used 1 year of log data
  - ~512k unique files
  - ~133k unique directories
  - ~2k users
- 1 period to bootstrap, 10 to train, 1 to test
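A minimal Python sketch of how records in the [user, timestamp, action, resource] shape could be grouped into periods and split into bootstrap, train, and test sets. It assumes calendar-month periods, which the slide does not state, and the names `Access`, `group_by_period`, and `split_periods` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from typing import NamedTuple


class Access(NamedTuple):
    user: str
    timestamp: datetime
    action: str
    resource: str


def group_by_period(records):
    """Map (year, month) -> user -> set of resources accessed.

    Assumption: one period is one calendar month.
    """
    periods = defaultdict(lambda: defaultdict(set))
    for r in records:
        key = (r.timestamp.year, r.timestamp.month)
        periods[key][r.user].add(r.resource)
    return periods


def split_periods(period_keys):
    """First period bootstraps, last period tests, the rest train."""
    ordered = sorted(period_keys)
    if len(ordered) < 3:
        raise ValueError("need at least three periods")
    return ordered[:1], ordered[1:-1], ordered[-1:]
```

With twelve monthly periods this yields the 1 / 10 / 1 split described on the slide.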

Self Similarity
- Check a user's current accesses against their own history
  - Simple
  - Easy to understand
  - Detects deviations from past behavior

Adversary
- Self similarity can catch an impetuous attacker.
- A patient adversary can seed file accesses in previous time periods to influence distance-based similarity scores.

Similarity Between Users
- Gives a relation of expected behavior across all profiles.
- A malicious user can only affect their own history.

  user1   u1Score   u2Score   …   uNScore
  user2
  userN

Features to Find Anomalies
Feature: Description
- Unique File Count: main technique currently used in practice.
- New Unique File Count: Binary method; new unique files in the window.
- Average Similarity Score: LogLCA self score values, in [0,1].
- Sum Similarity Score: LogLCA sum of score values.
- Mean Distance: find a single point that summarizes previous periods over the similarity-between-users features, then use cosine similarity to find the distance between the current point and the expected point.
- Mean Distance * New Unique: since the goal is to detect theft of files, and mean distance has no component representing the number of files accessed, we multiply the mean distance by the number of new unique files.
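The Mean Distance and Mean Distance * New Unique features can be sketched in Python as follows. This assumes each period is represented by a vector of a user's similarity scores against all users, that the "single point" is the component-wise mean of the previous periods, and that distance is one minus cosine similarity. All function names are hypothetical.

```python
from math import sqrt


def cosine_distance(u, v):
    """1 minus cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def mean_point(history):
    """Component-wise mean of the per-period score vectors."""
    n = len(history)
    return [sum(column) / n for column in zip(*history)]


def mean_distance(history, current):
    """Distance between the current vector and the expected point."""
    return cosine_distance(mean_point(history), current)


def new_unique_count(past_files, current_files):
    """Files accessed in the current period and never before."""
    return len(set(current_files) - set(past_files))


def mean_distance_times_new_unique(history, current, past_files, current_files):
    """Scale the behavioral deviation by how many new files were touched."""
    return mean_distance(history, current) * new_unique_count(
        past_files, current_files
    )
```

Multiplying by the new-file count means a user whose behavior shifts but who touches few new files scores lower than one who shifts and also pulls many previously unseen files.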

Exposure to Data

Values for Self Scores

Self Identification Performance

Generating Malicious Behavior
- No ground truth data for malicious behavior
- Generate simulated attacks by injecting directories
  - Represents targeted attacks on specific data
- Three size ranges for the injection
  - 500-1000: 10 unique attacks
  - 1000-2500: 12 unique attacks
  - 5000+: 2 unique attacks
- Inject in two ways
  - Impetuous attacker: inject X accesses in the current period
  - Patient attacker: seed the current user's history with files from the injection, then inject
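The two injection modes can be illustrated with a short Python sketch over sets of file paths. How the authors seeded the history is not described on the slide; spreading a fixed number of attack files over each past period is an assumption, and `inject_impetuous`, `inject_patient`, and `seed_per_period` are hypothetical names.

```python
def inject_impetuous(current, attack_files):
    """Add every attack file to the current period only."""
    return set(current) | set(attack_files)


def inject_patient(history, current, attack_files, seed_per_period):
    """Seed a few attack files into each past period, then inject all.

    Assumption: each past period receives the next `seed_per_period`
    attack files in sorted order. Returns (seeded_history, new_current).
    """
    files = sorted(attack_files)
    seeded = []
    for i, period in enumerate(history):
        chunk = files[i * seed_per_period:(i + 1) * seed_per_period]
        seeded.append(set(period) | set(chunk))
    return seeded, set(current) | set(attack_files)
```

In the patient case the seeded files already appear in the history, so distance-based scores for the final injection are computed against a history the attacker has shaped.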

Impetuous Attacker

Impetuous Attacker

Patient Attacker Injecting

Patient Attacker Injecting

Discussion: How to Present to Users
- Similarity scores may help communicate events
- Better detection of truly anomalous activity
  - Go beyond simple file counts
  - Create a ranking of the most anomalous users
- Better understanding of what is causing the score
  - Ranking the files that a user is accessing
  - Allows an incident response team to understand more quickly why a user received a high score

Summary
- Explored using file similarity features to identify malicious insiders
- Evaluated with real access logs and synthetic attacks