Exploiting Temporal Persistence to Detect Covert Botnet Channels
Authors: Frederic Giroire, Jaideep Chandrashekar, Nina Taft…
RAID 2009
Reporter: Jing Chiu
2015/11/29
Data Mining & Machine Learning Lab

Outline
Introduction
Methodology
Dataset Description
Evaluation
Conclusions

Introduction
How do bots evade existing defenses?
▫Polymorphic engines
▫Packing engines
▫AV vendors report 3000 distinct samples daily
Anomaly detection methods for botnets
▫Use traffic feature distributions for analysis
▫Detect bots once they are activated to generate attacks
▫Latency exists between infection and activation
Covert channel between bots and the C&C server
▫Lasts for an extended period
▫Lightweight and spaced out over irregular time periods

Methodology
Assumptions
▫Communication between a zombie and its C&C server is not limited to a few connections
▫A zombie is not programmed to use a completely new C&C server at each new attempt
Persistence and destination atoms
▫Destination atoms for building whitelists
▫Persistence for capturing lightweight repetition

Methodology (cont.)
Why use whitelists?
▫The set of hosts a user communicates with regularly is small and stable
   Examples:
      Work-related, news, and entertainment websites
      Mail servers, update servers, patch servers, RSS feeds
   Advantages:
      Fast to search
      Easy to manage
▫Whitelists built from these hosts require only infrequent updating

Methodology (cont.)
Destination atoms
▫(dstService, dstPort, proto)
▫Destinations in a different domain: second-level domain name
   yahoo.com, cisco.com
▫Destinations in the same domain: third-level domain name
   mail.intel.com, print.intel.com
▫Multiple ports are allowed
   (ftp.service.com, 21:>1024, tcp)
▫When an address cannot be mapped to a name, use the IP address as the service name
▫Examples (see the backup slide "Destination atoms")

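The mapping rules on this slide can be sketched roughly as follows. This is only an illustration: the function name `destination_atom`, the `local_domain` parameter, and the naive split on dots are assumptions, not taken from the paper. Port ranges such as 21:>1024 and multi-label public suffixes (e.g. co.uk) are not handled.

```python
import ipaddress


def destination_atom(dst, port, proto, local_domain="intel.com"):
    """Map a destination (hostname or IP string) to (dstService, dstPort, proto).

    Hypothetical helper: `local_domain` is the end host's own domain and is
    assumed to consist of exactly two labels.
    """
    try:
        # If dst parses as an IP address, no name is available:
        # fall back to the address itself as the service name.
        ipaddress.ip_address(dst)
        service = dst
    except ValueError:
        name = dst.lower().rstrip(".")
        labels = name.split(".")
        if name.endswith("." + local_domain):
            # Same domain as the end host: keep the third-level name.
            service = ".".join(labels[-3:])
        else:
            # Different domain: keep only the second-level name.
            service = ".".join(labels[-2:])
    return (service, port, proto.lower())


print(destination_atom("www.mail.yahoo.com", 80, "TCP"))
print(destination_atom("smtp.mail.intel.com", 25, "tcp"))
print(destination_atom("192.0.2.10", 6667, "tcp"))
```

Collapsing hostnames this way means that many servers of one external service fall into a single atom, which keeps the whitelist small.
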
Methodology (cont.)
Persistence metric
▫d: destination atom
▫W: observation window, W = [s_1, s_2, …, s_n]
▫s_i: measurement window
▫Timescale: (W, s)
▫For each timescale (W_j, s_j): 1 ≤ j ≤ k

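The formula itself did not survive on this slide. A plausible reading, consistent with the symbols above, is that the persistence of atom d over W is the fraction of the n measurement windows s_i in which at least one connection to d was seen, and that the per-timescale values are combined by taking the maximum over j. The sketch below assumes exactly that; the function names, the use of seconds, and the example timescales are illustrative assumptions.

```python
def persistence(timestamps, start, slot_len, n_slots):
    """Fraction of the n_slots measurement windows (each slot_len seconds,
    beginning at `start`) that contain at least one connection to the atom."""
    hit = set()
    for t in timestamps:
        idx = int((t - start) // slot_len)
        if 0 <= idx < n_slots:  # ignore connections outside W
            hit.add(idx)
    return len(hit) / n_slots


def max_persistence(timestamps, start, timescales):
    """Assumed combination rule: maximum persistence over all timescales,
    each given as a (slot_len, n_slots) pair."""
    return max(persistence(timestamps, start, s, n) for s, n in timescales)


# One connection per hour for six hours, starting one minute in.
ts = [60 + 3600 * i for i in range(6)]
print(persistence(ts, 0, 3600, 10))                       # 1-hour slots
print(persistence(ts, 0, 7200, 10))                       # 2-hour slots
print(max_persistence(ts, 0, [(3600, 10), (7200, 10)]))
```

Only whether a slot was hit matters, not how many connections it contains, which is why a low-volume but regular channel still scores high.
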
Methodology (cont.)
C&C Detection Implementation
▫Use a long bitmap to track connections at each timescale
▫Procedure
   Update the bitmap and recount persistence
   If the updated persistence crosses the threshold p*, raise an alarm
   If, after enough samples, the persistence is still below the threshold, free the bitmap
▫Bitmap example (see the backup slide "Bitmap Example")

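The procedure above could look roughly like this for a single timescale. It is a minimal sketch, not the authors' implementation: the class name `PersistenceTracker`, the integer-as-bitmap representation, the default of 10 slots, and the rule "free after n_slots windows below threshold" are all assumptions made for illustration.

```python
class PersistenceTracker:
    """Per-atom sliding bitmap over the last n_slots measurement windows."""

    def __init__(self, n_slots=10, threshold=0.6, whitelist=()):
        self.n_slots = n_slots
        self.threshold = threshold      # p* on the slide
        self.whitelist = set(whitelist)
        self.bitmaps = {}               # atom -> int used as a bitmap
        self.ages = {}                  # atom -> windows since first tracked
        self.seen_now = set()           # atoms contacted in the open window
        self.alarmed = set()

    def observe(self, atom):
        """Record a connection to a non-whitelisted atom in the open window."""
        if atom not in self.whitelist:
            self.seen_now.add(atom)

    def end_window(self):
        """Close the window; return atoms whose persistence newly crossed p*."""
        mask = (1 << self.n_slots) - 1
        alarms = []
        for atom in self.seen_now | set(self.bitmaps):
            # Shift in one bit for this window, dropping the oldest slot.
            bits = (self.bitmaps.get(atom, 0) << 1) & mask
            if atom in self.seen_now:
                bits |= 1
            age = self.ages.get(atom, 0) + 1
            p = bin(bits).count("1") / self.n_slots
            if p >= self.threshold:
                if atom not in self.alarmed:
                    self.alarmed.add(atom)
                    alarms.append(atom)
            elif age >= self.n_slots:
                # Enough samples and still below threshold: free the bitmap.
                self.bitmaps.pop(atom, None)
                self.ages.pop(atom, None)
                continue
            self.bitmaps[atom] = bits
            self.ages[atom] = age
        self.seen_now = set()
        return alarms
```

To cover several timescales, one such tracker per (W_j, s_j) could be run side by side, with `end_window` called at each tracker's own slot boundary.
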
Dataset Description
End-host traffic traces
▫Collected from the hosts of 350 enterprise users
▫Over 5 weeks
▫157 of the 350 traces are used, covering a common 4-week period
Botnet traffic traces
▫Collected 55 known botnet binaries
▫Each was executed inside a Windows XP SP2 VM and run for as long as a week
▫Experience
   Many binaries simply crashed the VM
   For some, the C&C server had been deactivated
   Only 27 binaries yielded traffic
   12 of the 27 binaries yielded traffic that lasted more than a day
▫List of sampled botnet binaries (see the backup slide "List of Botnet binaries")

Evaluation
System Properties
▫Figure: CDF of p(d) across all the atoms seen in training data
▫Figure: Distribution of per-host whitelist sizes (p* = 0.6)

Evaluation
C&C Detection
▫Other results (see the backup slide "C&C detection result")
▫Figure: ROC curve
▫Figure: False positives across users (p* = 0.6)

Evaluation
▫Figure: Improvement in detection rate after filtering

Conclusions
Introduces “persistence” as a temporal measure of the regularity of connections to “destination atoms”
Persistence can help detect malware without relying on
▫protocol semantics
▫payloads
Proposes a method for detecting C&C servers that produced no false negatives in the experiments
Both centralized and P2P infrastructures can be uncovered by this method
Low overhead and low user annoyance factor

Destination atoms
Bitmap Example
List of Botnet binaries
C&C detection result