Published by Eileen Pitts. Modified over 8 years ago.
1
Unsupervised Malware Classification: How Bad Software Can Find its own Kind
Shannon Steinfadt, Ph.D., Juston Moore, Micah Yates
Los Alamos National Laboratory
October 9, 2014
Released under LA-UR-14-27786
#GHC14
2
What is Malware?
Malicious software = Malware
  − Intentional or unintentional
  − Includes:
      Computer viruses
      Worms
      Trojan horses
      Ransomware
      Spyware
      Adware
      Scareware
      Poorly written software
3
Growth of Malware
Growth of malware now exceeds legitimate software releases
  − According to FireEye: “Enterprises experience a malware event up to once every three minutes” (2013)
Mobile applications – new digital frontier of hacking
4
(Some) Impacts of Malware
Financial impacts
  − Target
  − Home Depot
  − Banks
Intellectual property
  − Advanced science knowledge
  − Patented processes and concepts
Business impacts
  − Competitors with insider information
  − Loss of customers (Neiman Marcus)
5
Finding Malware
Time consuming
Highly skilled labor required
Difficult task
  − Easy to overlook
  − Hard to discern what it is doing
      Malware can obfuscate its true command-and-control (C2) channels
      Malware can change its behavior when run in VMs
Reverse Engineering (RE)
6
Classifying Malware
Detecting malware is easier if you already know something about existing malware
  − Code reuse is common across different malicious software files
We would like an automatic detection capability, but…
  − What happens after you find malware?
7
Disassembly
Example: an executable file suspected of being malicious (.exe or .dll)
How to “look under the hood?”
  − Disassembler (IDA, OllyDbg, …)
  − Get x86 assembly instructions and some organization by function calls

    call sub_4300A1
    call sub_437604
    test eax, eax
    jnz short loc_43124F
    push offset aSFatalError
    call sub_406B06
    pop ecx
    mov esi, eax
8
IDA Disassembler View
[Screenshot: https://www.hex-rays.com/products/ida/pix/pc3.gif]
9
APT-1
February 2013 – Security company Mandiant released the APT1 report
  − “APT1: Exposing One of China's Cyber Espionage Units”
  − APT = Advanced Persistent Threat
  − Data used here is well studied and publicly available at http://intelreport.mandiant.com/
  − Great for validating new tool sets against known data
10
Detecting Malware in Real Time
Create signatures
  − Common to malware / unique otherwise
  − Output format: something that can be fed directly into other systems for detection
      Snort Rules
      Bro Rules
      YARA Rules*
  − Example YARA Rule from APT-1:

    rule lightbolt : apt
    {
        strings:
            $a = "bits.exe a all.jpg.\\ALL -hp%s"
            $b = "The %s store has been opened"
            $c = "Machine%d"
            $d = "Service%d"
            $e = "7z;ace;arj;bz2;cab;gz;jpeg;jpg;lha;lzh;mp3;rar;taz;tgz;z;zip"
        condition:
            filesize < 300KB and (5 of ($a,$b,$c,$d,$e))
    }
11
Detecting Malware in Real Time
Create signatures
  − Example YARA Rule from APT-1:

    rule tarsip_eclipse : apt
    {
        strings:
            $a = "Eclipse"
            $b = "PIGG"
            $c = "WAKPDT"
            $d = "show.asp?"
            $e = "flink?"
        condition:
            filesize < 300KB and (5 of ($a,$b,$c,$d,$e))
    }
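To make the rule's condition concrete, here is a minimal Python sketch of what "filesize < 300KB and 5 of ($a..$e)" checks. It is an illustration only: the function name `matches_rule` and the constants are hypothetical, and a real deployment would run the rule through the YARA engine rather than reimplement it.

```python
# Hypothetical illustration of the tarsip_eclipse condition in plain Python.
# This is not the YARA engine; it only mirrors the rule's logic.

TARSIP_ECLIPSE_STRINGS = [b"Eclipse", b"PIGG", b"WAKPDT", b"show.asp?", b"flink?"]
MAX_SIZE = 300 * 1024  # YARA's KB suffix multiplies the number by 1024


def matches_rule(data, strings=TARSIP_ECLIPSE_STRINGS, required=5, max_size=MAX_SIZE):
    """Return True if data is smaller than max_size and contains at least
    `required` of the given byte strings (case-sensitive, like plain YARA text strings)."""
    if len(data) >= max_size:
        return False
    hits = sum(1 for s in strings if s in data)
    return hits >= required


if __name__ == "__main__":
    full = b"\x00".join(TARSIP_ECLIPSE_STRINGS)
    print(matches_rule(full))             # all five strings present, small file
    print(matches_rule(b"Eclipse only"))  # only one string present
```

Requiring all five strings keeps false positives low; lowering `required` trades precision for coverage of variants that drop a string.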
12
RED/UCE Tool
Reverse Engineering Deduction / Universal Classifying Engine
  − Suite of tools
Two main tools currently, more in the near future
  − Visual clustering of software samples and sample exploration
  − Signature generation for detection of certain families of software (YARA Rules)
13
RED/UCE Clustering Map
Input: clean OS files + a small set of malware samples
  − RED/UCE can help analysts make inferences about the authorship of unknown samples
14
RED/UCE pipeline (flow diagram)
Inputs: Malware Binaries, Clean Binaries
IDA Pro
  − Clean Sample 1 wildcarded functions, Clean Sample 2 wildcarded functions, …
  − Malware Sample 1 wildcarded functions, Malware Sample 2 wildcarded functions, …
Strings Extractor
  − Clean Sample 1 strings, Clean Sample 2 strings, …
  − Malware Sample 1 strings, Malware Sample 2 strings, …
Counter
  − Count per sample of: Strings, Words, Functions, Basic Blocks
Select samples of interest
Select samples to compare against
Information Gain Sorter
  − Display Feature Lists: Strings, Words, Functions, Basic Blocks
Select features
Generate YARA rule
Select thresholds
Select a sample for exploration
Select meta-groups to compare with
Select features based on clean samples and meta-groups
YARA generator
New Sample Visualizer
  − Identify K clusters to the sample of interest
  − Embed samples in a 2D space using Multidimensional Scaling
  − Display samples with edges connecting them. Each edge shall be annotated with the features in common between samples.
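The Information Gain Sorter stage can be sketched as follows. The slides do not give the exact formula RED/UCE uses, so this assumes the textbook definition of information gain for a binary present/absent feature against a malware/clean label. The function names and the sample data layout (a list of `(feature_set, is_malware)` pairs) are hypothetical.

```python
import math


def entropy(pos, neg):
    """Shannon entropy (in bits) of a two-class split with the given counts."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(samples, feature):
    """Reduction in label entropy from knowing whether `feature` is present.

    samples: list of (set_of_features, is_malware) pairs.
    """
    n = len(samples)
    pos = sum(1 for _, label in samples if label)
    base = entropy(pos, n - pos)
    present = [label for feats, label in samples if feature in feats]
    absent = [label for feats, label in samples if feature not in feats]
    conditional = 0.0
    for group in (present, absent):
        if group:
            g_pos = sum(group)
            conditional += len(group) / n * entropy(g_pos, len(group) - g_pos)
    return base - conditional


def rank_features(samples):
    """Sort every observed feature from most to least informative."""
    features = set().union(*(feats for feats, _ in samples))
    return sorted(features, key=lambda f: (-information_gain(samples, f), f))


if __name__ == "__main__":
    demo = [
        ({"PIGG", "kernel32.dll"}, True),
        ({"PIGG", "WAKPDT", "kernel32.dll"}, True),
        ({"kernel32.dll"}, False),
        ({"kernel32.dll", "Microsoft"}, False),
    ]
    for name in rank_features(demo):
        print(name, round(information_gain(demo, name), 3))
```

A feature present in every sample (such as a common import) scores zero, while a feature that cleanly separates the malware group from the clean group scores highest. That ordering is what makes the sorted list useful for picking signature candidates.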
15
RED/UCE pipeline (detail of the preceding diagram)
Counter: count per sample of Strings, Words, Functions, Basic Blocks
Information Gain Sorter: select samples of interest and samples to compare against; display feature lists
YARA generator: select features and thresholds; generate YARA rule
New Sample Visualizer: select a sample and meta-groups; identify K clusters to the sample of interest; embed samples in a 2D space using Multidimensional Scaling; display samples with edges annotated with the features in common between samples
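The visualizer's annotated edges can be illustrated with a small sketch. The slides do not say which distance feeds the multidimensional scaling step, so Jaccard distance over feature sets is an assumption here, and `shared_feature_edges` is a hypothetical name. The 2D embedding itself is omitted.

```python
from itertools import combinations


def shared_feature_edges(samples, min_shared=1):
    """Build one edge per pair of samples that share at least `min_shared` features.

    samples: dict mapping sample name -> set of features.
    Returns a list of (name_a, name_b, sorted_common_features, jaccard_distance).
    """
    edges = []
    for a, b in combinations(sorted(samples), 2):
        common = samples[a] & samples[b]
        if len(common) >= min_shared:
            union = samples[a] | samples[b]
            # Jaccard distance: 0.0 for identical sets, 1.0 for disjoint sets.
            distance = 1.0 - len(common) / len(union)
            edges.append((a, b, sorted(common), distance))
    return edges


if __name__ == "__main__":
    demo = {
        "unknown.exe": {"PIGG", "WAKPDT", "show.asp?"},
        "tarsip_1.exe": {"PIGG", "WAKPDT", "flink?"},
        "notepad.exe": {"Microsoft", "kernel32.dll"},
    }
    for edge in shared_feature_edges(demo):
        print(edge)
```

The pairwise distances could then be passed to any multidimensional scaling routine to place the samples in 2D, with the list of common features used as the edge label.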
16
RED/UCE Clustering Map
17
RED/UCE Clustering Map

    rule tarsip_eclipse : apt
    {
        strings:
            $a = "Eclipse"
            $b = "PIGG"
            $c = "WAKPDT"
            $d = "show.asp?"
            $e = "flink?"
        condition:
            filesize < 300KB and (5 of ($a,$b,$c,$d,$e))
    }
19
RED/UCE Focused Clustering
Focused feature selection set – reduced sample input
  − Green dot – sample of interest
  − Red dot – labeled malware sample(s)
  − Blue dot – labeled good sample(s)
20
RED/UCE Signature Creation
Chosen features (drag-and-drop)
Visual output for chosen features
21
RED/UCE
22
RED/UCE
23
RED/UCE Signature Creation
24
RED/UCE Signature Creation
Common YARA-style rule found across all 9 samples
  − String "c7 ?? ?? ?? ?? ?? 83 c0 04 41 75 ??"
  − A basic block common to 100% of these samples
(Score: 100) {c7 ?? ?? ?? ?? ?? 83 c0 04 41 75 ??}
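As a rough illustration of how such a wildcarded byte pattern matches a binary, the sketch below converts the hex string into a Python bytes regex. It handles only whole-byte `??` wildcards and literal bytes, not YARA's nibble wildcards, jumps, or alternatives. `hex_pattern_to_regex` is a hypothetical helper, not part of RED/UCE or YARA.

```python
import re


def hex_pattern_to_regex(pattern):
    """Compile a YARA-style hex string (literal bytes and ?? wildcards only)
    into a bytes regular expression."""
    tokens = pattern.strip("{} ").split()
    parts = []
    for tok in tokens:
        if tok == "??":
            parts.append(b".")  # any single byte (DOTALL makes . match newlines too)
        else:
            parts.append(re.escape(bytes([int(tok, 16)])))
    return re.compile(b"".join(parts), re.DOTALL)


if __name__ == "__main__":
    rx = hex_pattern_to_regex("{c7 ?? ?? ?? ?? ?? 83 c0 04 41 75 ??}")
    # Made-up byte sequence containing one occurrence of the pattern at offset 1.
    data = bytes.fromhex("90 c7 00 78 56 34 12 83 c0 04 41 75 f2 c3")
    match = rx.search(data)
    print(match.start() if match else None)
```

Wildcarding the operand bytes lets one signature match samples in which addresses and constants differ but the instruction sequence is the same.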
25
Signature Use
Output from RED/UCE can be deployed rapidly in operational environments
  − Open-source YARA format for signatures currently
      YARA signatures are commonly used by the security community
Tool can be extended to produce other IDS (intrusion detection system) formats
  − Bro, Snort / Sourcefire, …
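Emitting a rule in the YARA text format shown earlier in the talk could look roughly like the sketch below. This is not RED/UCE's generator: `make_yara_rule`, its parameters, and the `$s0, $s1, …` identifier scheme are assumptions chosen for illustration, and only plain text strings are handled.

```python
def make_yara_rule(name, strings, tag="apt", max_kb=300, required=None):
    """Render a YARA rule that fires when a file under max_kb kilobytes
    contains at least `required` of the given text strings."""
    if not strings:
        raise ValueError("at least one string feature is required")
    if required is None:
        required = len(strings)  # default: all selected features must match
    lines = ["rule %s : %s" % (name, tag), "{", "    strings:"]
    idents = []
    for i, text in enumerate(strings):
        ident = "$s%d" % i
        idents.append(ident)
        # Escape backslashes and double quotes for a YARA text string.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('        %s = "%s"' % (ident, escaped))
    lines.append("    condition:")
    lines.append("        filesize < %dKB and (%d of (%s))"
                 % (max_kb, required, ",".join(idents)))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(make_yara_rule("tarsip_eclipse",
                         ["Eclipse", "PIGG", "WAKPDT", "show.asp?", "flink?"]))
```

The `required` threshold corresponds to the "select thresholds" step in the pipeline: an analyst can relax it when a family's variants are known to drop some strings.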
26
Conclusions
The RED/UCE tool helps analysts:
  − Secure their networks from malicious software and users
  − Quickly find correlations across decompiled software code
  − Create effective signatures for new, emerging threats
The signature development process is guided by information-theoretic principles
  − Unlike other machine learning systems, RED/UCE lets the analyst control the selection of signatures based on operational awareness criteria (knowing what’s important to your institution)
The tool and signatures can be widely deployed and promote collaboration between sites
  − Does not require any vendor-specific hardware or network configuration in order to be useful
27
Looking for an Internship?
Visit the Los Alamos National Laboratory table here at GHC’14
Contact Shannon@lanl.gov if you are interested
See http://jobs.lanl.gov for other opportunities
Released under LA-UR-14-27786
28
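Intel x86 Assembler Instruction Set Opcodes
(Score: 100) {c7 ?? ?? ?? ?? ?? 83 c0 04 41 75 ??}
Read as 32-bit x86 instructions, the signature decodes to:
  c7 ?? ?? ?? ?? ??   MOV r/m32, imm32 (ModRM byte and 32-bit immediate wildcarded)
  83 c0 04            ADD EAX, 4 (opcode 83 with ModRM byte c0 selects ADD on EAX; 04 is the 8-bit immediate)
  41                  INC ECX
  75 ??               JNZ rel8 (jump offset wildcarded)
  ??                  Wildcard: matches any byte value
http://sparksandflames.com/files/x86InstructionChart.html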
29
Got Feedback?
Rate and review the session using the GHC Mobile App
To download, visit www.gracehopper.org