Real-Time RAT-based APT Detection

Our Focus
Goal: design a detection mechanism that targets the key step (gaining a foothold) in the APT life cycle.
[Figure: APT life cycle with four stages: Initial Compromise, Gaining Foothold, Lateral Movement, High Value Asset Acquisition. Diagram labels: attacker, phishing, malicious web, exploit browser, malware (e.g., RAT), victim, network scan, exploit vulnerability, malware propagation, code repo, database, confidential. Detection layers: behavior-based malware detection and provenance-based analytics.]

APT Malware Remote Access Trojan (RAT)
Based on a study of 300+ APT whitepapers, the RAT is a core component of an APT attack, and more than 90% are Windows-based.
A RAT allows an adversary to remotely control a system.
It bundles a complex set of potentially harmful functions (PHFs), e.g., keylogger, screengrab, remote desktop, remote shell, audiograb.
A Windows RAT typically embodies 10 to 40 PHFs.

Analysis of Engagement 1 Data

Issues with FAROS Kafka Topics
Kafka A and Kafka B were not usable: because the FAROS tool was unstable, TA5.1 suggested not consuming either Kafka A or Kafka B produced by FAROS.
The Stretch Goal topic became available very late. Data errors were found, and FAROS re-produced the topic on 10/7.
Despite those issues, we finished ingesting the topic, submitted the initial report to TA5.1, and received positive feedback.

How to Figure Out the Attack Graph
Data reduction: 71M records in the Stretch topic; 30 minutes of processing time.
529 processes in total; 22 processes (4%) were identified as involved in malware activities.
Three processes were reported by our RAT detector: profile.exe (2) matched the remoteshell signature, and prodat.exe matched the screengrab signature.
Backtracking: based on the artifacts (network IP/port connected, files created) and the pid-ppid relationship, we identify all relevant processes.
23 KB/s = 1.38 MB/min
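The backtracking step can be sketched roughly as follows. This is only an illustration under stated assumptions, not the tool used in the engagement: the record layout (`ppid`, `name`, `artifacts`), the pids, and the artifact strings are hypothetical. Starting from the processes flagged by the detector, it collects their descendants and any process that shares an artifact (a file path or network endpoint) with a collected process, then records direct ancestors for context without expanding their other children.

```python
from collections import deque

def backtrack(processes, seeds):
    """Collect processes related to the seed pids.

    processes: dict pid -> {"ppid": int, "name": str, "artifacts": set of str}
               (hypothetical record layout)
    seeds:     pids reported by the detector
    Returns (related, ancestors) as two sets of pids.
    """
    related = set(seeds)
    queue = deque(seeds)
    while queue:
        pid = queue.popleft()
        info = processes[pid]
        for other, other_info in processes.items():
            if other in related:
                continue
            is_child = other_info["ppid"] == pid
            shares_artifact = bool(info["artifacts"] & other_info["artifacts"])
            if is_child or shares_artifact:
                related.add(other)
                queue.append(other)
    # Walk up the parent chain for context, without expanding siblings.
    ancestors = set()
    for pid in related:
        parent = processes[pid]["ppid"]
        while parent in processes and parent not in related and parent not in ancestors:
            ancestors.add(parent)
            parent = processes[parent]["ppid"]
    return related, ancestors

# Hypothetical sample data; pids and artifact labels are made up.
processes = {
    100: {"ppid": 1, "name": "explorer.exe", "artifacts": set()},
    200: {"ppid": 100, "name": "profile.exe",
          "artifacts": {"file:proout.png", "net:remote:19985"}},
    210: {"ppid": 200, "name": "cmd.exe", "artifacts": set()},
    220: {"ppid": 210, "name": "prodat.exe", "artifacts": {"file:proout.png"}},
    300: {"ppid": 100, "name": "notepad.exe", "artifacts": {"file:notes.txt"}},
}
related, ancestors = backtrack(processes, [200])
```

Keeping ancestors separate is a design choice of this sketch: expanding a common parent such as a shell process would otherwise pull in every unrelated child.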

Attack Graph for FAROS Stretch Goal Dataset

Breakdown of the Attack (1)
The attack begins with triggering an executable "C:\Users\User\Downloads\profile.exe" at Sep :12:06 GMT.

At 18:13:33, the malware "profile.exe" invoked "cmd.exe", which in turn invoked another malware, "C:\Users\User\Downloads\prodat.exe", at 18:13:58. However, the current data traces do not allow us to determine how the malware gained its foothold; we can only speculate that it could have been downloaded by the compromised Firefox browser. prodat.exe mainly performed a screengrab and saved the result in "C:\Users\User\Downloads\proout.png". This file was then read and sent out by profile.exe to :19985.

At 18:16:58, the malware "profile.exe" invoked cmd.exe again to run hostname.exe, whoami.exe, and netstat.exe to collect sensitive information. The results were written to the log file "C:\Windows\Temp\1283.log".

Breakdown of the Attack (3) – Cont’d
At 18:19:44, the malware "profile.exe" invoked "cmd.exe" again, which in turn executed the malware "proup.exe". "proup.exe" then initiated a TCP connection to the attacker machine at :1050 and sent "1283.log" for data exfiltration. At 18:21:42, "burnout.bat" was executed for cleanup.

At 20:03:55, a Firefox process ("firefox.exe") was launched, which subsequently invoked another Firefox process. The latter process was probably compromised: it downloaded another malicious executable with the same name, "profile.exe", from IP address :20480 (site lariat.world.net) at 20:06:41 and saved it as "C:\Users\User\Downloads\profile.exe".

Breakdown of the Attack (4) – Cont’d
The malware started running at 20:09:02 and soon invoked cmd.exe. The cmd.exe process executed both "systeminfo.exe" and "tasklist.exe" to collect system information and the list of currently running tasks. The results were saved in a file named "rfeed.dat". Then "profile.exe" sent the data file out to :19985. Finally, at 20:17:33, "profile.exe" executed "burnout.bat" to perform the cleanup.

Fine-Grained, Evasion-Resilient and Real-time RAT Detection
Our Approach: Fine-Grained, Evasion-Resilient and Real-time RAT Detection

Our Work
What is going on: we implement a fine-grained, evasion-resilient, real-time RAT detection system. Specifically, we detect whether malicious functionalities are present in the system call traces of a process.
Why not provenance-based causality analysis: FAROS does not provide usable provenance information for now. Missing data: provenance nodes, netflow object nodes, file object nodes.
What is next: design a system for both real-time APT malware detection and automatic causality analysis.

Overview
Observation: the number of PHFs possibly embodied in a RAT is limited (10 to 40). The core system calls, and the orders in which they must occur, required to exactly define a PHF are also limited, so it is possible to identify all of them.
Core idea: fine-grained, evasion-resilient, real-time RAT detection. Determine whether a program is a RAT by detecting its functionalities and examining its characteristics. Specifically:
Create signatures for each PHF possibly embodied in a RAT.
Train a classifier based on the unique characteristics of RATs to discern between RATs and benign programs.
Notes: signature generation is function-level and uses supervised training with ground truth. A PHF can be exactly defined by a limited set of core system calls, which adversaries must use to implement that PHF. Separating signatures per PHF increases detection specificity, because different types of signatures can be combined. In each signature sequence, the core system calls are irreplaceable and their order must be preserved.

Overview (Cont’d)
Advantages:
The generated signatures are finer-grained and semantics-aware: they identify what activity is going on while detecting a RAT.
They are hard to evade unless attackers find new ways of implementing PHFs, and attackers would have to do so for at least several major PHFs.
Detection is hard to evade simply by inserting irrelevant system calls or by manipulating the sequence of system calls.
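The resistance to inserted system calls can be illustrated with a small sketch. It assumes a signature is simply an ordered list of core system calls and that matching tolerates arbitrary gaps between them; the two-call signature below is illustrative only (built from call names that appear on the slides), and the real system models signatures as finite automata rather than plain lists.

```python
def matches_signature(trace, signature):
    """Return True if every call in `signature` occurs in `trace`
    in the same order, with any number of other calls in between."""
    remaining = iter(trace)
    # `call in remaining` advances the iterator, so order is enforced.
    return all(call in remaining for call in signature)

# Illustrative signature, not the authors' actual screengrab signature.
screengrab_sig = ["NtGdiCreateCompatibleDC", "NtGdiBitBlt"]

# Attacker pads the trace with irrelevant calls: still matched.
padded = [
    "NtGdiCreateCompatibleDC",
    "NtQueryInformationProcess",
    "NtDelayExecution",
    "NtGdiBitBlt",
]

# Core calls in the wrong order: not matched.
reordered = ["NtGdiBitBlt", "NtGdiCreateCompatibleDC"]

padded_hit = matches_signature(padded, screengrab_sig)
reordered_hit = matches_signature(reordered, screengrab_sig)
```

The reordered trace does not match, which is consistent with the slide's point that the order of the core calls must be preserved for the PHF to work at all.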

Our Approach: Design
Module 1: trace-based signature generation system (offline). It uses function-level supervised training on data with ground truth.
For each PHF (PHF1 ... PHFm), a set of labeled traces (Trace 1 ... Trace n) goes through self-repeated gadget identification and correlation analysis, producing the signatures for each PHF, used for determining the functionality.
RAT traces and benign traces go through characteristic analysis, feature generation and selection, and supervised learning, producing the classifier signatures, used for differentiating benign from malicious.
The generated signatures thus serve both to determine the functionality and to discern between malicious and benign programs.
Module 2: real-time RAT detection system. Incoming system call traces (e.g., NtGdiCreateCompatibleDC, NtGdiBitBlt, NtCreateSection, NtQueryInformationProcess, NtCreateThread, NtResumeThread) are matched against the PHF signatures (PHF1 Sig, PHF2 Sig, ..., PHFn-1 Sig) and the classifier signature; the resulting scores (Score 1 ... Score n) are combined into a malicious score.

PHF Signature Generation
Observation 1: most malicious activities, such as keylogger and screengrab, require frequent probes of input devices to collect coherent and meaningful user inputs. This characteristic shows up in the trace as small gadgets that repeat themselves multiple times, e.g., NtUserGetKeyboardState, NtUserMapVirtualKeyEx, NtUserGetForegroundWindow, ⁞
Insight: those gadgets can be automatically extracted from the traces and then potentially used for defining the malicious activities. Such signatures can be automatically extracted from the traces because they are the common part in the system call graphs.
Strength: we consider all categories of system calls, unlike previous approaches, which consider only the system call categories that seem intuitively informative, such as file system and registry calls.
Separating PHF-based signatures allows us to increase detection specificity by combining different types of signatures.
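One simple way to surface self-repeated gadgets is to count contiguous n-grams of system calls and keep those that recur often enough. The slides do not specify the extraction algorithm, so the n-gram counting below is a stand-in chosen for illustration, and the thresholds (`min_len`, `max_len`, `min_repeats`) and the surrounding `NtOpenKey`/`NtClose` calls in the sample trace are assumptions.

```python
from collections import Counter

def repeated_gadgets(trace, min_len=2, max_len=4, min_repeats=3):
    """Return contiguous call sequences (as tuples) that occur at
    least `min_repeats` times in `trace`, mapped to their counts."""
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(trace) - n + 1):
            counts[tuple(trace[i:i + n])] += 1
    return {gadget: c for gadget, c in counts.items() if c >= min_repeats}

# Keylogger-style probe loop from the slide, repeated three times and
# surrounded by unrelated calls (sample trace is hypothetical).
gadget = [
    "NtUserGetKeyboardState",
    "NtUserMapVirtualKeyEx",
    "NtUserGetForegroundWindow",
]
trace = ["NtOpenKey"] + gadget * 3 + ["NtClose"]

found = repeated_gadgets(trace)
longest = max(found, key=len)
```

Here the longest repeated sequence is the three-call probe loop itself; its two-call fragments are also reported, so a real extractor would need a step that keeps only maximal gadgets.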

PHF Signature Generation – cont’d
Observation 2: multiple RATs tend to implement a PHF in the same way at the system call level, and the ways to implement a PHF are quite limited.
Insight: leverage sequence alignment algorithms borrowed from bioinformatics to identify regions of similarity in system call sequences. Such similarity regions typically correspond to the execution of similar code. We build finite automata to model the similarity regions as our signatures.
Thus, with tens of RAT samples available, it is feasible to identify all the core system call sequences possibly used for a given PHF.
[Figure: regions of system calls and the resulting signatures. Calls shown, in slide order: ⁞ NtProtectVirtualMemory NtGdiCreateCompatibleDC NtGdiCreateCompatibleBitmap NtGdiBitBlt NtGdiDeleteObjectApp NtGdiExtGetObjectW ⁞ NtDelayExecution NtDelayExecution NtGdiCreateCompatibleDC NtGdiCreateDIBSection NtGdiStretchBlt NtGdiDeleteObjectApp NtGdiExtGetObjectW NtDelayExecution NtDelayExecution]
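As an illustration of the alignment idea, the sketch below applies Smith-Waterman local alignment, a standard bioinformatics algorithm, to two system call sequences. The slides do not say which alignment algorithm was used, so this choice, the scoring parameters, and the split of the slide's call names into two sample traces are all assumptions.

```python
def local_align(a, b, match=3, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment of two call sequences.

    Returns (best_score, pairs); each pair is (call_a, call_b),
    with None marking a gap on that side.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return best, pairs

# Two hypothetical screengrab traces built from call names on the slide.
trace_a = ["NtProtectVirtualMemory", "NtGdiCreateCompatibleDC",
           "NtGdiCreateCompatibleBitmap", "NtGdiBitBlt",
           "NtGdiDeleteObjectApp", "NtGdiExtGetObjectW"]
trace_b = ["NtDelayExecution", "NtGdiCreateCompatibleDC",
           "NtGdiCreateDIBSection", "NtGdiStretchBlt",
           "NtGdiDeleteObjectApp", "NtGdiExtGetObjectW",
           "NtDelayExecution"]

best, pairs = local_align(trace_a, trace_b)
shared = [x for x, y in pairs if x == y]
```

With these parameters the aligned region spans five positions, and the calls shared by both traces are NtGdiCreateCompatibleDC, NtGdiDeleteObjectApp, and NtGdiExtGetObjectW; the two mismatched positions are where the samples use different bitmap and blit calls, which a finite automaton signature could model as alternatives.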

Classifier Signature Generation
Selected features (also unique characteristics of RATs):
Persistence: modifies auto-execute functionality by setting/creating a value in the registry.
Environment awareness for reconnaissance and evasion: reads the active computer name or the machine identifier "MachineGuid"; tries to evade analysis by sleeping many times and for a long time (>2 min).
Spyware/information retrieval: accesses potentially sensitive information from local browsers; queries sensitive IE security settings.
Anti-detection and being stealthy: sets the process error mode to suppress error boxes; checks for the presence of an antivirus engine.
Note on "MachineGuid": this key is generated uniquely during the installation of Windows and does not change regardless of any hardware swap. It changes only on a fresh reinstall of Windows, so it uniquely identifies a Windows machine.

Classifier Signature Generation – cont’d
Selected features (cont’d):
System destruction: opens files with deletion access rights, probably for cleanup after the attack.
Unusual characteristics: spawns a lot of processes; creates/touches files in the Windows system directory and registry.
Running in background: no window, menu, or any visible components; no human interactions.
Remotely initiated: actions are initiated remotely rather than locally.
All those features can be observed in system call traces (in either the system call name or its arguments).
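A few of these features can be derived from a trace along the following lines. This is a rough sketch under several assumptions: the event format (a system call name plus a single argument string with registry handles already resolved to paths) is hypothetical, the Run key is used as one well-known auto-execute location, and only four of the listed features are covered.

```python
def extract_features(events):
    """Derive a small feature dict from (syscall_name, argument) tuples.

    The tuple format is a hypothetical simplification of a real trace.
    """
    features = {
        "sets_autorun_value": False,   # persistence
        "reads_machine_guid": False,   # environment awareness
        "sleep_calls": 0,              # evasion by sleeping
        "spawned_processes": 0,        # unusual characteristics
    }
    for name, arg in events:
        arg_lower = arg.lower()
        if name == "NtSetValueKey" and "\\currentversion\\run" in arg_lower:
            features["sets_autorun_value"] = True
        elif name == "NtQueryValueKey" and "machineguid" in arg_lower:
            features["reads_machine_guid"] = True
        elif name == "NtDelayExecution":
            features["sleep_calls"] += 1
        elif name == "NtCreateUserProcess":
            features["spawned_processes"] += 1
    return features

# Hypothetical events; paths and value names are made up for the example.
events = [
    ("NtQueryValueKey",
     r"HKLM\SOFTWARE\Microsoft\Cryptography\MachineGuid"),
    ("NtSetValueKey",
     r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\updater"),
    ("NtDelayExecution", ""),
    ("NtDelayExecution", ""),
    ("NtCreateUserProcess", "cmd.exe"),
]
features = extract_features(events)
```

A feature vector of this kind, computed for both RAT and benign traces, is what a supervised classifier would then be trained on.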

Classifier Signature Generation– cont’d
Training set and selected features
RAT traces: system call traces of RATs (Poison Ivy, Pandora, DarkComet, ...).
Benign traces: system call traces of popular benign applications (WinSCP, Skype, Notepad++, QuickTime Player, ...).
Features: persistence; reconnaissance and evasion; spyware/information retrieval; anti-detection/stealthiness; system destruction; unusual characteristics; background running; remotely initiated.
Output: classifiers for discerning between RATs and benign programs.

Previous Malware Detection Methods Fail RAT Detection

Previous Malware Detection Methods
Main idea of the state-of-the-art work:
Identify security-sensitive syscalls (e.g., those related to network connections).
Use data dependency to connect more syscalls, and hence construct a path ending at one security-sensitive syscall.
Use such a path as a detection signature.
In the resulting signature graph, each node without any children must be a security-relevant system call. In the example graph on the slide, the red nodes denote security-relevant system calls; whenever a path like the blue one or the yellow one is matched, the system reports the unknown program as malware.

Main problem 1: false positives. In the real world, RATs and benign programs share a lot of similar behavior. It is not reasonable to judge a program based only on a similar behavior (i.e., a matched path) without awareness of the semantics corresponding to that path. Either the blue path or the yellow one could represent benign behavior!

Main problem 2: evadable by RATs. Previous approaches extract data dependencies between system calls and build a signature graph for each malware sample based on those dependencies. The system calls marked in red on the slide are ignored, since they neither are security-sensitive syscalls nor have a data dependency with security-sensitive syscalls.
[Figure: system calls shown on the slide: NtConnectPort … NtRequestWaitReplyPort NtCreateSection NtQueryInformationProcess NtCreateThread NtResumeThread. Trace 1: NtUserGetDC NtGdiGetDeviceCaps ⁞ Trace 2: NtGdiCreateCompatibleDC NtGdiBitBlt]

Main problem 2: evadable by RATs (cont’d). RATs often stay inactive for a long time before sending out the data already collected. That is, the data collection actions are not necessarily followed by security-relevant system calls corresponding to abnormal network connections. In this case, the data collection behavior is not identified by signatures generated from the security-related syscalls, so the previous approaches can be evaded. In fact, the ignored syscalls may be generated exactly by the data collection behavior; e.g., the ignored syscalls represent part of the screengrab behavior (the right graph on the slide). The malicious behavior of RATs does not rely on data dependency, so such signatures lose the information about other important system calls.

Fine-Grained, Evasion-Resilient and Real-time RAT Detection
Conclusion: We proposed a fine-grained, evasion-resilient, real-time RAT detection approach. Our approach was evaluated in Engagement 1 and worked well.
