Greg Hoglund Martin Pillion Malware Threat Analysis with HBGary Responder™ Greg Hoglund Martin Pillion
Day 1, Part 1
Introduction
Introductions Trainers Participants: introduce yourselves to the class Greg Hoglund Martin Pillion Participants: introduce yourselves to the class Name Experience in IR & Reverse Engineering Why are you here? What would you like to learn in this class? NOTE: All trademarks referenced in this presentation are the property of their respective owners.
Class Structure Lecture for each section Demonstration Hands-on Lab Exercises Quiz This class is focused on Incident Response and Malware Analysis with the HBGary Responder™ Professional product
Key The start of a new training section Demo Movie that illustrates the concept Class exercise A helpful analysis hint
What You’ll Learn You will focus on reverse engineering malware programs with Responder™ You will use intelligent search terms to quickly find starting points for analysis You will learn the factors-based methodology for threat assessment
Class Materials Class DVD Responder PRO w/ Digital DNA™ Has all movies Has slide deck Has all malware samples Responder PRO w/ Digital DNA™ Full version HASP key w/ 6 months license
Detect, Diagnose, Respond Methodology
Detect Diagnose Respond Digital DNA™ Score Network Awareness Security Event Management Detect IP and DNS Protocol Patterns Registry Keys File Paths Search Patterns with other solutions NIDS signatures Firewall / Blackholes Network Scans Registry Key Scans Tag machines for reimaging Malware removal
Detect Detection Triggered by some activity / observation Immediate actions Preservation of live memory Quarantine the suspect machine(s) Image hard drive Pull the plug to contain any possible damage
Diagnose Analyze preserved memory image Quickly identify whether a compromise indeed occurred; if so, Quickly identify Risks Suspicious activity Intruder access method(s) Capabilities and behaviors of the malware Perform disk forensics
Respond Create signatures for IDS/IPS/HIDs Create black lists for routers to block URLs IP addresses file names, etc.
Architecture User View Digital DNA™ API to access code and data flows RE of all Code API to access Memory Objects OS Reconstruction Physical RAM Acquisition
Projects Physical Memory Projects Static Import Used with memory snapshots Full Windows™ OS analysis, all modules Static Import Used for stand alone binaries Used for livebins
Livebins This is the in-memory image of a running executable With Responder™ you can extract any module from a memory snapshot These are saved to disk in livebin format
Symbol Considerations Physical memory images offer more symbol information than livebins Livebins sometimes have “data_CALL_PTR” whereas with a physical memory snapshot these are usually labeled with actual symbol names (i.e., “CreateProcess”) In other words, you can look up the symbol on Google™ to learn more about what is going on
What to do with unknown EXE’s Many on-disk exe’s (not livebins) have obfuscation or packing that will interfere with analysis. In this case, load them in a VM with Flypaper! Once loaded, with Flypaper, snapshot the VM Import the .vmem as a physical memory project Alas, much better results! AND, you get DDNA scores for the binaries!
What if I don’t have time? If you don’t have VMWare installed, or don’t have time to run a VMWare session w/ the binary – there is still hope You can zip up to 50 malware programs and submit these to the HBGary Portal HBGary will automatically process all the binaries in our massive ESX server farm, and you get the DDNA results, and can also download the livebins for further analysis!
What if I have more than 50? You can upload thousands of samples at once The samples will be automatically broken into groups of fifty per job Each job will complete as a single unit of work
The HBGary Portal Click here to upload a zip Click here to see DDNA results
Search Patterns You can have one or more wordlist files The wordlist file has a special format String: “this is my string” (in quotes) Byte sequence: [00 11 22 33] (in brackets) Strings are searched case insensitive, ASCII & Unicode
Automatic Analysis When you are in a hurry, use “Extract and Analyze all suspicious binaries” Any high score DDNA will be offered for further analysis Any rule hit from baserules.txt will be offered for further analysis
Responder Project Import Instructor Demo Responder Project Import Movie: responder_import_project.avi
NOTES ON DEMO The instructor demo can be recorded with Camtasia. The movie should have popup text, a feature offered in Camtasia. You can pause the demo movie at any point, and narrate the movie to the class as the popup text indicates. The instructor demo should illustrate the concepts required for the following exercise, but without giving the quiz answers away.
Exercise Incident Response Infected Machine Focus Type Steps: Once VMware is configured, and Flypaper is running a. Drag the following files to the VM window - InstDrv.exe - _root_.exe - _root_.sys b. Run _root_.exe c. Look for that process in Task Manager (look at its PID) d. Run InstDrv - load _root_.sys - start _root_.sys e. See that the process has been removed in Task Manager f. Run netstat (what flags?) and see that the port is listening to a “non-existent” PID g. Telnet to that port and verify that it is indeed listening Incident Response Infected Machine Focus Type VMware Session (StudentExercise1.vmem) Description Search for indications of compromise on a VMware image Time 30 minutes
Exercise Step by Step Answer Questions 1-3 Create a new Responder project Import “Student Exercise1.vmem” Select “parishilton.exe” for analysis Browse the following from “Project” TAB “All Open Network Sockets” Answer Questions 1-3
Exercise Questions Question 1: What suspicious TCP/UDP port is listening on the machine? Question 2: What is the name of the process bound to this suspicious port? Question 3: What is the process ID?
INSTRUCTOR ANSWERS The answer is TCP port 12345 and 31337 The process names are: 31337 is parishilton.exe we already know that is suspicious 12345 is Solitaire.EXE, not something you expect to have a socket open 3. There are two Solitaire processes, PID 128 and 1584, and parishilton.exe, PID 1152
Parishilton Networking Exercise Recap Parishilton Networking Parishilton_network_recap.avi
NOTES ON EXERCISE RECAP The RECAP is similar to the demo, except it shows the specific exercise the students just did, and shows how to arrive at the answer to each of the quiz questions.
Introduction to Malware Threat Factors
Goals of Assessment Rapidly determine Identify traits of the malware Is it malicious? Does it warrant deeper investigation? Identify traits of the malware There are hundreds of traits Can be broadly grouped into six behavioral categories (“factors” )
Development Factors In what country was the malware created? Was it professionally developed? Are there multiple versions? Is there a platform involved? Is the a toolkit involved? Are there multiple parts developed by different groups or developers?
Communication Factors Where does it connect to on the Internet? Drop point IP addresses or DNS names Does it allow incoming connections? Does it use encryption? Does it use stego?
Command and Control Factors How is the malware controlled by its master? Do commands come from a cutout site? What commands does it support? Sniffing, logging, file system? Attack? Poison Pill - Self-destruct? Self-uninstall / silent modes?
Installation and Deployment Factors Does it use the registry? Does it drop any files? Does it attempt to infect other machines on the network? Does it sleep and awaken later?
Information Security Factors Identifies the risks associated with the binary What does it steal? Does it sniff keystrokes? Can it destroy data? Can it alter or inject data? Does it download additional tools?
Defensive Factors Does it have self-defense? Does it use stealth? Does it bypass parts of the operating system? Does it bypass virus scanners?
Computer Network Attack Connecting to drive shares Creating large numbers of sockets (DDOS) Port scanning
This was a lecture only section, no demo or exercise. NOTE This was a lecture only section, no demo or exercise.
Responder Overview GUI Overview
Creating a Project Two basic types Add details about the machine Physical Memory Snapshot Live memory analysis (all running processes) Static PE Import Binary import and analysis (static binaries from disk) Add details about the machine WHY you are analyzing this machine
Importing a Snapshot File → Import → Physical Memory Snapshot Select Snapshot File Add Details About the Snapshot Why is it of interest? Select Post-Import Options Extract and Analyze all Suspicious Binaries Generate the Malware Analysis report Same steps when importing a static binary File → Import → Import Executable Binary
The Scanning Process Import Memory Snapshot Validate the Page Table layout and size Identify PAE/Non PAE Identify OS and Service pack Reconstruct Object Manager Rebuild EPROCESS Blocks Rebuild the VAD Tree
UI: Project Panel Shows all harvested objects Processes, Modules, Drivers Strings, Symbols Macroscopic view of object data Allows drill-down on most objects Context-sensitive right-click menu Status icons
Responder Object Schema Project Memory Image Hardware IDT Operating System SSDT Processes Drivers Open Files Network Socket Information Open Registry Analyzed Binary Strings Analyzed Symbols
Project type Top level folders Leaf-node folders – double click these to see details view of the folder Expandable folders – single click these to expand contents of the folder Table – double click this to see contents of table.
Drill-down via the Project Pane The Project Pane allows you to get more details on almost any item in the report Double-clicking on items in the Project Pane typically shows detailed information about that item Provides automatic filtering Example: The SSDT can be explored in detail
Strings and Symbols Strings provide clues to origin and intention Usually in the language of the developer Typically use descriptive variable names Symbols provide clues to behavior Export calls must be explicitly named
UI: Working Canvas Panel Visually renders relationships and flow Control flow Data references Behavioral representation of the binary Relationships are displayed graphically No need to pore over disassembly Patterns become quickly evident
UI: Working Canvas Panel Function head Data reference Function end
UI: Report Panel Provides a repository for observations and step-wise refinement of the analysis You can edit the description fields in the Report Panel Descriptions are inserted into the final report You can choose which report items will be included in the final report
UI: Report Panel
Report: Descriptive Text Descriptive text can be added to the report item via In-place typing The “Edit Bookmark” feature Using “Edit Bookmark” lets you edit the bookmark name in addition to the description.
UI: Detail Panels Provide detailed information about the selected category in the Project Panel Data can be searched Data can be exported to a variety of formats PDF - XLS - CSV HTML - Image - Text RTF Panel contents can be “locked” Additional columns are available (per panel)
UI: Detail Panels Functions SSDT Strings IDT Symbols Processes Samples Modules Files Drivers Registry Network
Using Detail Panels The detail panels can be sorted by any column A quick way to find the hooked SSDT entries is to sort by target module
Leaf-node folders: double-click these to see the detail panel of the folder
Right click on header to get column chooser
Search the Detail Panel, which filters the panel’s contents to only those entries that match the search criteria
1. Lock the window after filtering 2. Double click the “All Open Registry Keys” folder again. Since the default window (the Registry Panel) is locked, a new (unfiltered) Registry View window is created. You can lock as many as you choose
Context-Sensitive Actions Every panel has a right-click context menu Menu choices based on selected object(s) Most common options Send to report: creates entry in the Report Pane for the selected item Google™ Text Search: uses Google™ search engine to find Internet references to the selected item Google™ Code Search: uses Google™ search engine to find source code that uses the selected item (typically a string or symbol)
Student should be able to drive now Notes Student should be able to drive now
Exercise TODO – verify this works Communications Factors Focus Type Interactive Analysis Description Use graphing techniques to quickly isolate the Communications Factors associated with both a memory image and a captured piece of malware. Time 25 minutes
Exercise Create a new project (Physical Memory Snapshot) Import memory image \Student Exercise 7\Student Exercise 7 CyberEspionage case.vmem Answer the set of questions in the handout (“Exercise 7 Questions”)
Review: Exercise Ipod.raw.exe Identify Communication Factors Identify if there are any hard coded IP’s Identify any domain names Identify 3 network protocols used Identify any e-mail addresses Identify Installation and Deployment Factors Registry locations to survive a reboot Dropped Files Taskdir.exe Identify Communication Factors Identify if there are any hard coded IPs Identify Installation and Deployment Factors Identify 5 network related strings & API calls
Review: Exercise Seigxbsg.exe: Identify Communication Factors Identify if there are any hard coded IP’s Identify any domain names Identify Installation and Deployment Factors Registry locations to survive a reboot Dropped Files Identify any Defensive Factors Taskdir.dll:
Day 1, Part 2
Directories, Files, and Downloads Basic Installation and Deployment Factors
Why I&D matters Knowing how it installs can help you: Design a cleanup script Scan for other infected machines Detect variants of the same malware
Objects of Interest Files, file extensions, paths .INI files Find/Next type loops, wildcards (“*.*”) Command line parameters Registry keys TEMP directory Checking for mutexes Sometimes used so they don’t infect twice
Directory and File Creation Start with these strings & symbols: CreateDirectory ExpandEnvironmentStrings “%ProgramFiles%” “%SystemRoot%” File extensions “.exe” “.dll” “.sys”
Starting points for Directory and File Creation CreateDirectory GetSystemDirectory CreateFile DeleteFile CopyFile OpenFile ExpandEnvironmentStrings %PROGRAM FILES% %SYSTEMROOT% C:\ .EXE .DLL .SYS .INI .INF .BAT *.* \\ (double backslash) MoveFile \\TEMP WINDOWS SYSTEM32 cmd /c del del %s GetTempPath
Directory Creation CreateDirectory, CreateDirectoryEx mkdir, wmkdir, _tmkdir _creat system(“mkdir … system(“md …
Environment Variables ExpandEnvironmentString GetEnvironmentVariable getenv, putenv
Common environment variables %ALLUSERSPROFILE% C:\Documents and Settings\All Users %APPDATA% C:\Documents and Settings\{username}\Application Data %COMPUTERNAME% {computername} %COMSPEC% C:\Windows\System32\cmd.exe %HOMEDRIVE% C: %HOMEPATH% \Documents and Settings\{username} %PATH% C:\Windows\System32\;C:\Windows\;C:\Windows\System32\Wbem %PATHEXT% .COM; .EXE; .BAT; .CMD; .VBS; .VBE; .JS ; .WSF; .WSH %PROGRAMFILES% Directory containing program files, usually C:\Program Files %PROMPT% Code for current command prompt format. Code is usually $P$G %SYSTEMDRIVE% The drive containing the Windows XP root directory, usually C: %SYSTEMROOT% The Windows XP root directory, usually C:\Windows %TEMP% and %TMP% C:\DOCUME~1\{username}\LOCALS~1\Temp %USERNAME% {username} %USERPROFILE% C:\Documents and Settings\{username} %WINDIR% C:\Windows %DATE% Current date in the format determined by the Date command %TIME% Current time in the format determined by the Time command %CD% Current directory with its full path %ERRORLEVEL% Number defining exit status of a previous command or program %RANDOM% Random number between 0 and 32767
Special Folder Locations SHGetSpecialFolderLocation CSIDL_ADMINTOOLS (FOLDERID_AdminTools) The file system directory that is used to store administrative tools for an individual user. CSIDL_ALTSTARTUP (FOLDERID_Startup) The file system directory that corresponds to the user's nonlocalized Startup program group. CSIDL_APPDATA (FOLDERID_RoamingAppData) C:\Documents and Settings\username\Application Data. CSIDL_BITBUCKET (FOLDERID_RecycleBinFolder) The virtual folder that contains the objects in the user's Recycle Bin. CSIDL_CDBURN_AREA (FOLDERID_CDBurning) C:\Documents and Settings\username\Local Settings\Application Data\Microsoft\CD Burning. Use MSDN to lookup all the possible values, there is a long list…
Shell32 API SHELL32.DLL SHGetPathFromIDList SHBrowseForFolder SHGetSpecialFolderLocation
Internet Downloads The WININET.DLL API winsock API InternetOpenFile InternetReadFile InternetOpenURL InternetConnect socket WSASocket connect WSAConnect … Addresses, URL, and web requests http:// www .com HTTP/1.0 Content-Type
Malware Boot Registry Keys Registry API Try searching… RegCreateKey RegOpenKey “CurrentControlSet” “CurrentVersion” “SOFTWARE” (all caps) Common registry keys to survive reboot HKLM\Software\SOFTWARE\Microsoft\Windows\CurrentVersion\Run HKCU\Software\SOFTWARE\Microsoft\Windows\CurrentVersion\Run HKCU\Software\Microsoft\Windows\CurrentVersion\RunServicesOnce HKLM\Software\Microsoft\Windows\CurrentVersion\RunServicesOnce HKCU\Software\Microsoft\Windows\CurrentVersion\RunServices HKLM\Software\Microsoft\Windows\CurrentVersion\RunServices HKLM\Software\Microsoft\Windows\CurrentVersion\RunOnceEx HKLM\SYSTEM\CurrentControlSet\Services\{Service Name}
Basic Installation Factors DEMO Basic Installation Factors Movie: huntpeck_install_factors.avi
Exercise TODO – Review this works Installation Factors Focus Livebin Paris hilton exe Description Use graphing techniques to quickly isolate installation behavior Time 15 minutes
Exercise Create a new project (Physical Memory Snapshot) Import memory image \Vmem\Student Exercise1.vmem Search the strings in memory of ParisHilton Go to search window, enter “command” Find “exefile\shell\open\command” Drop string onto canvas, grow up Answer the set of questions
Screen Shot
Exercise What is the purpose of exefile\shell\open\command? What is the filename being used? What file is it replacing? What file looks like it is being deleted? What file looks like it is being copied? BONUS –What does that accomplish? What file is this being copied from?
Instructor Questions and Answers What is the purpose of exefile\shell\open\command? - To launch an exe every time another exe is launched What is the filename being used? - wmmiexe.exe What file is it replacing? -wmiexe.exe What file looks like it is being deleted? wmiexe.exe What file looks like it is being copied? explorer.exe BONUS –What does that achieve? If explorer.exe is put to What file is this being copied from? Zzz.tmp
Exercise Installation Factors Focus Livebin Inhold_toolbar Description Use graphing techniques to quickly isolate installation behavior of inhold.toolbar Time 25 minutes
Exercise, Step by Step Start Responder and create a new project (Static Import) titled “inhold.1” Import the inhold.1.mapped.livebin Show symbols and filter for “CreateDirectory” Graph region around CreateDirectory Answer Questions 1-2 Look for the local path that is being used to store files Answer Questions 3-4 Discover how the files are being downloaded Answer Questions 5-6 Organize and flatten your graph Produce a concise RTF report with this information Continue on next slide…
Exercise Questions Questions What paths and URL’s stand out? What registry key is being created What environment string is being queried? What directory is being created locally? What API call is used to download files from ‘Net onto the computer? What are the remote and local names of the files, respectively
Instructor Answers What paths and URL’s stand out? Two URLs for www.inhold.co.uk InholdBar22.exe Uninstall22.exe /InholdBar/ What registry key is being created SOFTWARE /InholdBar What environment string is being queried? %ProgramFiles% What directory is being created locally? Program Files/InholdBar What API call is used to download files from ‘Net onto the computer? URLDownloadToFile What are the remote and local names of the files, respectively Remote InholdBar22 stores locally as InholdBar.exe Remote uninstall22.exe stores locally as uninstall.exe
Basic Installation Factors Exercise Recap Basic Installation Factors Movie: 5_min_install_factors.avi
Notes Student has analyzed a fairly decent malware sample, found network URL’s that can be used for firewall and IDS, and knows where its installed on the machine, and has a good chance of being able to uninstall it
Reconstructing string operations Format Strings Reconstructing string operations
Format strings Format strings are important to understand because you will encounter them so often.
Basic format strings %s – a string (also %S) %c – a character %d – a number (also %i) %x – a number in hex (also %02x, %08X, etc) %f – a number in float %l – a number (long) (also %ul) %% - a percent character (there are many variants of the above)
Functions that use format strings printf sprintf fprintf snprintf swprintf wsprintf vsprintf … variants of the above … %s %c %f %i %d %l %lu … etc …
Path Creation
GetSystemDirectory This is used to place files in the windows system32 directory If within a subroutine, grow up to see what kind of arguments are passed in Target: CreateRemoteThread livebin
Search registry for this Construction of a string under the Services key means this file could be registered as a service. Target: CreateRemoteThread livebin
Creates a directory in the Program Files directory. Target: inhold toolbar
Call setup 4 3 2 1 The args to a format string are almost always pushed directly before the format string in disassembly (reverse order) 1 2 3 4
%d argument to format string DEMO %d argument to format string Movie: res_url.avi
Batch Files %1, %2, %3 etc Used to indicate arguments passed to a batch file Look for .bat extension in strings
Using format strings as hints It is very likely that esi+4 stores a pointer to the nick used in IRC channel
IP Address Format String DEMO IP Address Format String Movie: format_string_ip_address.avi
Exercise Installation and Deployment Factors Focus VMEM VIRUS.VMEM Description Use graphing techniques to quickly isolate the Installation and Deployment Factors associated with both a memory image and a packed piece of malware. Time 25 minutes
Exercise Step by Step Create Physical Memory Snapshot Project called “virus.exe” Import memory image Analyze Virus.exe 9129837.exe Hid_evr2.sys Answer questions 1-5
Exercise Questions Questions Identify the registry locations used to survive reboot Is there anything peculiar about the EXE’s name that is registered to survive reboot? How does the malware choose the numerical filename? What directory does the numeric EXE get placed in What files, processes, drivers are dropped from Virus.exe?
Instructor Questions and Answers Identify the registry locations used to survive reboot "Software\Microsoft\Windows\CurrentVersion\Run“ Is there anything peculiar about the EXE’s name that is registered to survive reboot? "\%lu.exe“ – it’s always a number.EXE How does the malware choose the numerical filename? Uses GetTickCount What directory does the numeric EXE get placed in The windows directory What files, processes, drivers are dropped from Virus.exe? 9129837.exe Trust.exe
Exercise Recap Movie: virus.exe.filename.avi
Droppers & Multi-stage execution Installation and Deployment Factors
What is a Dropper? Malware is delivered in steps Dropper is initial downloaded package Can be a trojan or embedded exploit The dropper carries the malware in a payload Once “dropped”, the dropper decompresses and executes a secondary payload
Steps in Malware Deployment Dropper Embedded Resource Decompressed to Disk Embedded Resource Embedded Resource is Launched as an EXE
Multi-Stage Execution Child Process Spawning CreateProcess ShellExecute Creating Files WriteFile CopyFile
Launching External Processes Typical APIs used to launch processes RunDLL32.exe Shell32.dll Control_RunDLL Can be used to run a shell script Command line shows the path to the file on disk
Starting points for Process Creation CreateProcess Rundll32.exe cmd.exe cmd /c command.com /c %s ShellExec ShellExecute ShellExecuteA WinExec Shell32.DLL exec execve system
Cleanup using BAT files @echo off :%s del %%1 if exist %%1 goto %s rem %s"
CreateProcess w/ cmd.exe DEMO CreateProcess w/ cmd.exe Movie: shell_createprocess.avi
Exercise Installation and Deployment Factors Focus LIVEBIN inhold toolbar Description Recover ShellExecute arguments used within inhold toolbar Time 25 minutes
Exercise Step by Step Open your inhold toolbar livebin project Search symbols for “exec” Graph the region around ShellExecuteA Use the Code View to reconstruct the arguments being supplied to ShellExecuteA Answer questions 1-5
Exercise: Questions What does the lpOperation argument do for ShellExecuteA ? What action is being performed by inhold toolbar? What does inhold toolbar supply as the lpFile parameter? What environment variable is used to construct the file path? What is the path to the executable being launched?
Instructor Questions and Answers What does the lpOperation argument do for ShellExecuteA ? Defines the action to be performed What action is being performed by inhold toolbar? “open” What does inhold toolbar supply as the lpFile parameter? A file path to an executable What environment variable is used to construct the file path? %ProgramFiles% What is the path to the executable being launced? Program Files \ InHoldBar \ InHoldBar.exe
RECAP Shell Execution Movie: shell_exec.avi
Multi-Stage Execution Resource Extraction OpenResource Use of temporary directory GetTempDirectory Consult execution history in Flypaper
Starting points for Resource Extraction FindResource SizeOfResource Possible embedded kernel drivers PsCreateSystemThread \\DosDevices .sys drivers IoCreateSymbolicLink IoDeleteSymbolicLink IoCreateDevice IoDeleteDevice KeInitialize SpinLock ObReferenceObjectByHandle
When using flypaper, all of the child processes will be available in the memory snapshot. Examine them all and compare them to one another. MEP Target: MEP
Notice how the copy of the process has more loaded modules Target: MEP
Compare Copies Compare the secondary execution against the first one Different number of loaded modules The one with more modules implies that it has progressed farther in the execution lifetime Compare strings and symbols of both copies There may be additional unpacking in the secondary copy
MEP Target: MEP
RunDLL32
RunDLL32 Executes a subroutine exported from a DLL RUNDLL32.EXE <dllname>,<entrypoint> <optional arguments> Example: RUNDLL32.EXE SHELL32.DLL,Control_RunDLL desk.cpl,,0 void CALLBACK EntryPoint( HWND hwnd, HINSTANCE hinst, LPSTR lpszCmdLine, int nCmdShow);
This was a lecture only section, no demo or exercise NOTE This was a lecture only section, no demo or exercise
Bundled Kernel Drivers Installation and Deployment Factors
DevIoControl Method of communication between usermode and kernelmode See hide_evr.sys
IoCompleteRequest Called when a driver has finished processing an IRP Thus, can be used as a guidepost to find IO callback functions (IoCompletion routines) Open, Close, Read, Write, IOControl, etc. May be used as IofCompleteRequest Little ‘f’ means fastcall interface
ExAllocatePoolWithTag When pool tags are used, the allocations made by the driver can be tracked See hide_evr.sys
Pool Tags Used in kernel mode memory allocation Track buffers passed thru DeviceIoControl
Pool Tag Buffers Demo finding pool tag ‘Wdm ‘ in hide_evr.sys
Day 1, Part 3
DLL and Thread Injection Installation and Deployment Factors
Process Enumeration For any process to write to another process, it needs a handle to the target To get a handle to a process, the process must be located from the list of all processes Process enumeration primarily occurs using ToolHelp API NtQuerySystemInformation( ) Undocumented API
DLL & Thread Injection Look for ToolHelp library usage This will be used to enumerate running processes when finding one to inject into Look for CreateRemoteThread This is used to inject the code that will load the injected DLL Look for EnableDebug or AdjustPriv The process will need certain access rights to be able to perform the injection
DLL Injection Injected DLLs stand out clearly if they use Non-standard paths Unusual or odd-sounding names
Strings ending in .dll All modules
Remote Threads A remote thread is created in a second process, not part of the first process A remote thread is typically used to inject a DLL into another process, but not always A remote thread can also operate without a DLL injection This is a more advanced technique
Clear indication of process injection MEP Clear indication of process injection
Page Protections In order to inject against another process, memory protections will need to be unlocked This is done via the VirtualQuery API Use “Google™ Text Search” to get the API arguments
Searching Google™ on each of the MEP Searching Google™ on each of the symbols clearly indicates they all can access a remote process.
Starting points for DLL and Thread Injection CreateRemoteThread OpenProcess VirtualAlloc WriteProcessMemory WaitForSingleObject explorer.exe SeDebugPrivilege CreateToolhelp32Snapshot CreateEvent Process32First Process32Next Module32First Module32Next
DEMO Create Remote Thread Movie: create_remote_thread_RR.avi
Exercise Installation and Deployment Factors Focus LIVEBIN CreateRemoteThread Description What is being injected into the remote process? Time 25 minutes
Exercise Step by Step Open livebin CreateRemoteThread Search symbols for “remote” Graph the region around CreateRemoteThread Answer questions 1-5
Exercise: Questions What is WriteProcessMemory being used for What is being written to the remote process? What start address is being used for the remote thread? How does the malware determine what start address to use? What does the remote thread do?
Instructor Questions and Answers What is WriteProcessMemory being used for? -- it writes a string to the remote process What is being written to the remote process? -- the path “C:\WINDOWS\TEMP\~222e8c.~~~” What start address is being used for the remote thread? -- The address of “LoadLibrary” How does the malware determine what start address to use? -- uses GetProcAddress on kernel32 What does the remote thread do? -- it loads “C:\WINDOWS\TEMP\~222e8c.~~~” as a DLL
Recap Create Remote Thread Movie: create_remote_thread_RR2.avi
Installation and Deployment Factors Registry Keys Installation and Deployment Factors
Registry Key Creation Start with these: Search symbols for “reg” RegOpenKey RegCreateKey “CurrentControlSet” “CurrentVersion” “SOFTWARE” (all caps)
Starting points for Registry Modification RegCreateKey RegOpenKey .REG regedit RegCloseKey CreateService DeleteService OpenSCManager ServiceMain ServiceDll StartService CurrentControlSet SOFTWARE \\ (double backslash) CurrentVersion
Malware Boot Registry Keys Massive numbers of registry keys (too numerous to list here) See “Autoruns” See “MAP” Plug-in
The Run Keys HKLM\Software\SOFTWARE\Microsoft\Windows\CurrentVersion\Run HKCU\Software\SOFTWARE\Microsoft\Windows\CurrentVersion\Run HKLM\Software\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer\Run HKCU\Software\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer\Run HKCU\Software\Microsoft\Windows\CurrentVersion\RunServicesOnce HKLM\Software\Microsoft\Windows\CurrentVersion\RunServicesOnce HKCU\Software\Microsoft\Windows\CurrentVersion\RunServices HKLM\Software\Microsoft\Windows\CurrentVersion\RunServices HKLM\Software\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce HKCU\Software\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce HKLM\Software\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce\Setup HKCU\Software\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce\Setup HKLM\Software\Microsoft\Windows\CurrentVersion\RunOnceEx HKCU\Software\Microsoft\Windows NT\CurrentVersion\Windows\Load
User Init HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\Userinit The above key takes a comma delimited list of programs to execute. Malware may install additional programs, or even replace the existing userinit.exe with a trojan version. The normal location for userinit.exe will be something like C:\WINDOWS\system32\userinit.exe (the windows install directory will be system specific). Any strange looking path or additional executables should be examined in detail.
Creating Services Creating a service via API calls simply creates registry keys under the hood CreateService One alternative way to load a device driver is the SystemLoadAndCallImage method ZwSetSystemInformation api call
HKLM\SYSTEM\CurrentControlSet\Services\{Service Name} For any given service, there may be a value called ImagePath that will indicate the path to the file that implements the service. If the file in question ends in “.sys” there is a good chance that it’s a kernel mode driver. To be sure you can check the Type value: Type Description 1 Kernel mode driver 2 File system driver 4 Adapter Arguments 8 File system service 16 Win32 program that runs as it's own process 32 Win32 program that shares a process w/ other services (think services.exe) The Start value can tell you when the service is started: 0 Boot, very early during startup 1 System, after Boot, but still while booting Windows 2 Automatic, after System, but still while booting Windows 3 Manual, doesn't run unless the user or another program starts it 4 Disabled
DEMO Registry Keys Movie: disable_logoff_startmenu.avi
Exercise Installation Factors Focus vmem Beizhu_2 Description Use graphing techniques to quickly isolate temporary path used with runonce key Time 25 minutes
Exercise, Step by Step Start Responder and create/open project for “beizhu_2.vmem” Import beizhu_2.vmem (if creating new project) Find and extract malware.exe Search for “\” (backslash) in strings view, drop registry keys to canvas Look for the value being used with the RunOnce key Try to determine what the value used with the RunOnce key is used for Hint: look for a temporary path Answer Questions 1-6
Exercise Questions Questions What value is being used under the RunOnce key? What command is being written to the cleanup value? What path is being passed to the DelNodeRunDLL32 API call? What path is being prepended in front of advpack.dll? Any ideas on what this value is being used for? Bonus: How is the temp path generated?
What value is being used under the RunOnce key? wextract_cleanup0 Instructor Answers What value is being used under the RunOnce key? wextract_cleanup0 What command is being written to the cleanup value? rundll32.exe %sadvpack.dll,DelNodeRunDLL32 "%s“ What path is being passed to the DelNodeRunDLL32 API call? C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\IXP000.TMP\ What path is being prepended in front of advpack.dll? The system directory, obtained from GetSystemDirectory Any ideas on what this value is being used for? Deleting files on restart How is the temp path generated? Appended with a number, using "IXP%03d.TMP"
Registry RunOnce Value Exercise Recap Registry RunOnce Value Movie: BEIHZU_RunOnce.avi
Installation and Deployment Factors Shell Extensions Installation and Deployment Factors
The Shell Malware can install one or more DLL’s on the system that are tied into your shell, menu’s, mouseclicks, actions, browsing – almost anything you can imagine
Starting points for detecting Shell Injection ShellEx Classes\Folder Classes\CSLID InProcServer32 shell\open shell\open\command exefile batfile comfile ddeexec
Shell Column Handlers HKCR\Software\Classes\Folder\Shellex\ColumnHandlers\{GUID} To find the actual program that is handling the column extension, you need to locate the GUID elsewhere in the registry. It will likely be located under the Classes folder: HKLM\SOFTWARE\Classes\CLSID\{GUID} If you check the InProcServer32 subkey, you will find a path to the DLL that is implementing the column extension.
Shell Open commands Malware can register itself as the handler for certain file extension types, controlled from the HKEY_CLASSES_ROOT (HKCR) folder and the HKLM\Software\Classes folder. There are many file extension types under these folders, but the following are well known hook points for malware: HKCR\batfile\shell\open\command HKCR\comfile\shell\open\command HKCR\exefile\shell\open\command HKLM\Software\Classes\batfile\shell\open\command HKLM\Software\Classes\comfile\shell\open\command HKLM\Software\Classes\exefile\shell\open\command The default value for the command key should be "%1" %*, but malware can modify the entry to contain something like “malware.exe” “%1” %*, causing the malware program to execute if someone double-click launches one of the infected file extensions.
Command and DDE Exec HKCU\Software\Classes\<some registered extension/path>\shell\<path>\ddeexec\ These keys have the same problems as described above for Shell Open Commands. In addition, you may notice a neighboring command key, which can also be altered or infected by malware. HKCU\Software\Classes\<some registered extension/path>\shell\<path>\command\
Installation and Deployment Factors Browser Extensions Installation and Deployment Factors
The Browser Malware can install one or more DLL’s that are tied into your browser, browsing events, keylogging, user interface, and more
Starting points for detecting Browser Injection \Internet Explorer \Extensions \Explorer Bars \Script \Exec \Browser Helper Objects InprocServer32 URLSearchHook Implemented Categories\ {00021494-0000-0000-C000-000000000046} {00021493-0000-0000-C000-000000000046} {4D5C8C2A-D075-11d0-B416-00C04FB90376} InitPropertyBag\Url
Browser Extensions This includes adding menu’s and shortcuts, additional toolbars, explorer bars, and browser helper objects. A simple way to add an additional menu item or button is to register a GUID under the following key: HKCU\Software\Microsoft\Internet Explorer\Extensions\{GUID} HKLM\Software\Microsoft\Internet Explorer\Extensions\{GUID} You may also find subkeys that path to an executable or script: HKLM\Software\Microsoft\Internet Explorer\Extensions\{GUID}\Script HKLM\Software\Microsoft\Internet Explorer\Extensions\{GUID}\Exec The values stored under these keys may lead you to a program that implements whatever command is indicated by the menu item or button.
Browser Helper Objects The term BHO is used loosely in the security community, and you may run across the term BHO being used to describe any malware that extends the browser – even if the malware is not specifically a BHO. Just remember that a BHO is just one of many ways malware can inject into the browser. A BHO is registered by adding a GUID to the following registry key: HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer \Browser Helper Objects\{GUID} Similar to all the other GUID based extensions, the GUID can be looked up elsewhere in the registry to determine which DLL handles it. It will likely be found in a location like: HKCR\CLSID\{GUID} And, as usual, the InprocServer32 key will reveal which DLL is used to implement the BHO.
Day 2, Part 1
Browser Hijacking and Bank Info Stealers Information Security Factor
Browser Extensions UrlSearchHooks Extensions Browser Helper Objects
Filter and Grab Most BHO’s of this type have special search patterns that are very specific to a known banking site Once this pattern is “triggered”, the BHO injects, intercepts, or copies data
Injecting HTML Injection into an already displayed page objectReference.insertAdjacentText(position, textstring) objectReference.insertAdjacentHTML(position, textstring) position: where to put the injected text “beforeBegin”: immediately before the element, outside the element's enclosing tags (not affected by any formatting the element generates). “afterBegin”: just after the opening tag of the element “beforeEnd”: just before the closing tag of the element “afterEnd”: immediately after the closing tag of the element (not affected by any formatting the element generates). text: the text to insert
HTML Injection BHO has injected this HTML into the live banking page This screenshot provided by threatexpert.com
Scramble Pads Graphical representation of numbers or letters that the user clicks on instead of typing keys Intent is to bypass keyloggers As the data is processed by the browser, the BHO can intercept the data, irrespective of keyboard or mouse
Virtual Keyboards Even though the virtual keyboard bypasses the actual keyboard, the cleartext username and password details are stored all over in the browser RAM BHO’s can scan for the cleartext results
Transaction Authentication Number (TAN) 2-factor authentication, basically a one-time use password Used to authenticate a bank transaction Small lists of TAN’s are given to banking client for one-time use “TAN Grabber” BHO’s will steal correct TAN when its submitted to bank and substitute it with a fake, the user does not realize the transaction failed The valid TAN is then used to make a fraudulent transaction
Keylogging,Passwords, and Data Theft Information Security Factor
Information Security Factors DEMO Information Security Factors
Information Security Factors Goal: Rapidly identify if the malware can steal information Passwords Keystrokes Login credentials Intellectual property/secrets
What is Being Stolen? File scans Keystrokes Usernames and passwords Screen shots / screen scraping
Recently Used Passwords Common Targets GUIDs for Saved Passwords Outlook Internet Explorer Pstore.dll
File Searching Used for a variety of reasons Locate intellectual property Locate password files Search out DLL’s and executables Find files to infect Cleanup temporary files after installation
File Searching Typical API Calls Strings FindFirst( ) FindNext( ) Wildcards File Extensions
Target: soysauce
Keystroke Logging Windows Message Hooking Polling Interrupt Hooking IRP hooking 8042 Keyboard Controller Port 60, 64
DEMO Keylogging Movie: keylogging_windowshook.avi
Exercise Keylogging Focus LIVEBIN keylogger.1.mapped.livebin Description How is the malware sniffing keystrokes? Time 25 minutes
Exercise Step by Step Open/create project for livebin keylogger.1.mapped.livebin Search symbols for “key” Graph the region around GetAsyncKeyState and relabel the data_CALL_PTR used with it Find the code that actually calls GetAsyncKeyState Try to graph the key sniffing loop Try to find the keytable that is used by the sniffing loop Answer questions 1-4
Exercise: Questions How many places call GetAsyncKeyState? Are there any filenames that might be a keylog file? How many places call GetAsyncKeyState? How many keys can be checked at a time? Where is the table of keycodes (what address)?
Instructor Questions and Answers Are there any filenames that might be a keylog file? “keys.txt” How many places call GetAsyncKeyState? just one, loc_0040A692 How many keys can be checked at a time? Just one, each key has to be tested one at a time Where is the table of keycodes (what address)? 0x42F1EC
Password File Searching DEMO Password File Searching Movie: rdp_password_sniffer.avi
NOTE This following exercise should reinforce their previous work with format strings, while introducing FindFirstFile
Exercise File Searching Focus LIVEBIN FindFile.1.mapped.livebin Description What DLL naming scheme is being used by the malware? Time 25 minutes
Exercise Step by Step Open/create project for livebin findfile.1.mapped.livebin Search symbols for “find” Graph the region around FindFirstFile and FindNextFile Answer questions 1-4
Exercise: Questions What is the filename wildcard being used? What is the malware searching for on the filesystem? One or more DLL’s What is the filename wildcard being used? “sh” followed by a number, then “*.dll” What directory is the malware searching in? The system directory Can you recover a specific DLL name? sh9025.dll
Instructor Questions and Answers What is the malware searching for on the filesystem? - One or more DLL’s What is the filename wildcard being used? - “sh” followed by a number, then “*.dll” What directory is the malware searching in? -The system directory Can you recover a specific DLL name? - sh9025.dll
Recap File Searching Movie: findfile_dll.avi
Screenscrapers and Audio Bugs Information Security Factor
Screenshots BitBlt, screenshot methods
INSERT EXAMPLE OF SCREEN SCRAPER TODO
Communications Factor Data Exfiltration Communications Factor
Network Communications WS2_32.DLL WINET_XXX functions Wsock32.dll TCP/UDP Hard-coded IP addresses and web addresses
WSAStartup Initiates use of the Winsock DLL by a process Must be the first Windows Sockets function called by an application or DLL Use it to determine the order of operations
WSAGetLastError Returns the error status for the last Windows Sockets operation that failed Communications Factor Can be used to detect error-handler code vs. continued execution code Development Factor Showing how well the developer checks error codes is an indication of their formal training and programming skill
Winsock TCP / UDP Functions TCP (socket-based) recv WSARecv UDP (datagram-based) recvfrom WSARecvFrom
Imported functions from winsock recvfrom is UDP
Protocols E-mail strings SMTP HTTP POST / GET requests
Hosts and Addresses inet_ntoa inet_addr converts IP to a string (dynamically) inet_addr converts readable IP to number (dynamically)
Detecting the Protocol By examining the arguments that are pushed on the stack prior to the “socket” call, you can detect which protocol is being used 1,3,2 versus 6,1,2
Starting points for COMS factors InternetOpen InternetOpenURL InternetReadFile InternetCloseHandle http:// ftp:// .asp
Exercise Communications Factors Focus Type Interactive Analysis Description Use graphing techniques to quickly isolate all the coms factors in the realmBOT malware Time 25 minutes
Start Responder and create a new project (Static Import) titled “realmbot.1” Import the realmbot.mapped.livebin Use graph techniques to find all the callers of socket Use cheat sheet to determine if sockets are UDP or TCP Answer Questions 1-2 Continue on next slide… Questions How many locations make sockets, and how many of each type are there? Is there anything inconsistent about how the malware author creates the UDP and TCP sockets?
Exercise continued… Use graph techniques to build a single layer for each code region that makes a socket Look for information that reveals what each code region is doing Answer Questions 1-3 Build a final graph with each COMS factor in it’s own layer Generate a final report in RTF format Questions What protocols are being used by the code regions? Do any of the code regions run as a server and listen for incoming connection? What other features do the code regions appear to offer (computer network attack)?
Instructor Answers How many locations make sockets, and how many of each type are there? 7 sockets 2 UDP 5 TCP Is there anything inconsistent about how the malware author creates the UDP and TCP sockets? The author uses IPPROTO_IP sometimes, and IPPROTO_UDP or TCP sometimes These provide the same functionality, but are two different styles Maybe cut and paste code What protocols are being used by the code regions? IDENT This is a TCP IRC This is TCP SMTP / Email TFTP This is UDP
Instructor Answers Do any of the code regions run as a server and listen for incoming connection? The IDENT code listens for incoming TCP connections Bonus: which port does it listen on TCP port 113 What other features do the code regions appear to offer (computer network attack)? UDP pings Bonus: why did the malware author use UDP to ping? ICMP may be blocked by Windows firewall TCP Port scanning Bonus: traditional connect scanning, which is slow Port Redirection Represents path into internal network behind NAT
Exercise Recap Callers of socket Callers_to_socket.avi
Exercise Communications Factors Focus Type Interactive Analysis Description Use graphing techniques to quickly isolate the Communications Factors associated with both a memory image and a captured piece of malware. Time 25 minutes
Exercise Create a new project (Physical Memory Snapshot) Import memory image \Student Exercise 7\Student Exercise 7 CyberEspionage case.vmem Answer the set of questions in the handout (“Exercise 7 Questions”)
Review: Exercise Ipod.raw.exe Identify Communication Factors Identify if there are any hard coded IP’s Identify any domain names Identify 3 network protocols used Identify any e-mail addresses Identify Installation and Deployment Factors Registry locations to survive a reboot Dropped Files Taskdir.exe Identify Communication Factors Identify if there are any hard coded IPs Identify Installation and Deployment Factors Identify 5 network related strings & API calls
Review: Exercise Seigxbsg.exe: Identify Communication Factors Identify if there are any hard coded IP’s Identify any domain names Identify Installation and Deployment Factors Registry locations to survive a reboot Dropped Files Identify any Defensive Factors Taskdir.dll: S_65: Identify if there are any hard coded IPs Dropped Files
Day 2, Part 2
Loops
Loops Most designs will have a loop in the code that goes around and around getting packets There are many variations of this, but they all follow this general design Identifying the Communications Factors requires you to discover (and graph) this loop
Loops Most communications control flow is looped Multiple outbound connections Periodic check-in DDOS attack, large volume of sockets Multiple inbound connections (server)
Loops Outer loop Inner loop Typically is large Calls down into specific behaviors Usually links many different kinds of behaviors together Example: the outer loop may grab packets from the network and feed them to the Command and Control code Inner loop Usually small and only calls into one specific behavior Example: code that reads a file looking for passwords Both kinds of loops help you understand the malware but in different ways
Program continues around the outer loop until shutdown. Program Start Outer Loop At this point in the code, the outer loop makes a branching decision and enters the inner loop. The program can go around the inner loop many times before coming back out. Once the program leaves the inner loop, it continues along the outer loop. Inner loops can branch into even deeper inner loops. Nested Inner Loops Inner Loop Inner Loop
Loops: Timers and Triggers Most loops have some sort of pause; otherwise, they would consume 100% CPU Timer-based loop: usually uses the Sleep() API to wait a specific number of milliseconds between each iteration Trigger-based loop: responds to some external stimulus via a callback or a blocking call
Loops: Blocking Call Blocking call: Function call that effectively pauses until some event occurs Can have a “timeout” parameter, and will continue execution after the timeout if the event doesn’t occur In these cases, the return value from the call is usually checked to determine if the timeout expired or the event occurred (to tell the difference) A very common blocking call is “recv”, the function that gets packets from the network The “recv” function will pause until a packet arrives
Software stops here and waits for a signal. Only after the signal has fired does the software enter the inner loop.
A bunch of functions are called all in a row A bunch of functions are called all in a row. These functions are the main body of the software’s capabilities. A very basic Sleep architecture. After Sleep returns, program goes back to top of loop and runs again. Sleep anywhere from a few seconds to minutes or even hours
Wait for network packet. Check to see if entire command has arrived. Continues only after complete command has arrived. Read packet from buffer. Process the command
d Sleep if connection could not be made Continue if web connection made Attempt outbound web connection Send HTTP request Check to see if entire webpage has been read. Read webpage from network. Continue after entire webpage has been read Parse webpage for secret commands
Loop
Callback Functions Callback: Function that is called by an external component Not typically represented as a loop in the code Can be a timer, or an event such as a packet arriving Callbacks are “registered”, usually at program startup The actual callback function is usually not in the same place as the “registration” step, making these more difficult to identify
DEMO Loops Movie: file_delete_loop.avi
Exercise Defensive Factors Focus Type Interactive Analysis Description Use graphing techniques to quickly isolate the cleanup routine used by molebox packer Time 25 minutes
Start Responder and create a new project (Static Import) titled “molebox.1” Import the molebox.1.mapped.livebin Show strings and filter for “FindFirstFile” Grow the graph up and down and find the data_CALL_PTR associated with “FindFirstFile” Use graph techniques to find other functions and relabel the call pointers Answer Question 1 Continue on next slide… Questions Which function is performing the equivalent of a GetProcAddress?
Use graph and layer techniques to grow up from the call pointers Delete File Sleep GetCurrentDirectory Others… Try to connect all of the graph regions together into one graph, organize and flatten it Try to relabel the blocks Identify success and failure conditions after an API call Identify any sleeps Identify any loops Answer Questions 2-6 Identify the location which stores the filename that is being searched/deleted Answer Questions 7-9 Continue on next slide…
Questions Does the program attempt to delete more than once? Can it delete multiple different files? How long does the program sleep between delete attempts If the first attempt at finding files with FindFirstFile finds nothing, does the program enter the loop at all? Which directory does the program search for files to delete? What is the address of the filename pattern that is being deleted? What is the file pattern that is being used to search for files to delete? Bonus: did you find any other filename strings that match the pattern?
Instructor Answers Does the program attempt to delete more than once? Yes, it loops attempting multiple delete attempts Can it delete multiple different files? It uses FindFirstFile which supports a wildcard, so yes it can How long does the program sleep between delete attempts 500 ms If the first attempt at finding files with FindFirstFile finds nothing, does the program enter the loop at all? No, it does not enter the loop in this case Which directgory does the program search for files to delete? The current working directory
Instructor Answers What is the address of the filename pattern that is being deleted? Data_0047B914 What is the file pattern that is being used to search for files to delete? MBX@%X@*.### Bonus: did you find any other filename strings that match the pattern? There are a few other strings like this, but without the * wildcard
Molebox Cleanup Routine Exercise Recap Molebox Cleanup Routine Molebox_vs_responder.1.avi
Computer Network Attack DDOS Attack Computer Network Attack
DDOS attack Outer Loop Target: Realmbot.mapped.livebin Creates 400 sockets Creates 400 connections Sleeps 100 ms Inner Loop 2 Inner Loop 1 Outer Loop Inner Loop 3 Closes the sockets Comes around for another pass Target: Realmbot.mapped.livebin
Quiz Open up realmbot livebin and graph from the ws2_32 symbols… What does the ioctl do? Hint: use google code search
Answer Makes it non-blocking, which means you don’t have to wait for connects, which makes all 400 sockets go out faster, in parallel
Command and Control Factors Command Parsers Command and Control Factors
Starting Points Obvious Command Strings String Comparison Routines, e.g. strcmp strchr Grow outward from networking loop
Soysauce static These status messages tell a lot about the malware’s capabilities. They are only a few hops from the sendto function.
Command Strings Commands are typically processed by a parser There will be a string comparison There is a branching condition if the command matches There is a central location where the complete incoming command is stored This command buffer may be interrogated several times
Note the compare operation
Cmp bytes at these locations
Clean the graph so you only have the comparisons. The graph is a long series of compares. soysauce This is the “current” command
soysauce From the top node, grow down ONE step. The new blocks are the handlers for each individual command.
Promote one of the commands to a layer of its own and drag-n-drop it to a new graph canvas soysauce
Build a Diagram of a Specific Command Handler soysauce Note, the command handler “goes back” to the command buffer numerous times. Could the command contain arguments that are being parsed?
This is a pointer to the data being written to the file ECX must be a filename This is a pointer to the data being written to the file soysauce
The file is binary, not text. It is being appended to, not created fresh from scratch. soysauce
The file pointer appears to point to a “scratchpad” buffer
Both the fwrite() call and the packet handler touch this scratchpad buffer
Buffer Sharing Follow buffers from the networking layer to find functions that implement command/control logic
Likely candidate. Used for the fwrite() data. Review for pushes, we don’t see any. Packet handler is not using arguments. Must be shared buffer. The packet handler function would have pushes above if the function took arguments. Instead, we see the function accesses a global buffer – in other words, if coded in ‘c’, it would be a global variable like char gCommandBuffer[255] or something, and would not be passed around as an argument to the function. It’s hard coded. Likely candidate. Used for the fwrite() data.
A: entrypoint block into a network IO loop In this slide we see an outer loop – the loop accesses a function to send and receive packets. A: entrypoint block into a network IO loop B: subroutine that sends/recv packets
Observations / Discoveries Loops waiting for commands Commands are received over UDP Hard-coded IP address for packets (send/recv) Commands are stored in their own memory location short words There is a buffer area where binary data can be placed, like a scratchpad
Branch Logic CMP EAX, EBX CMP [EAX], EBX CMP EAX, 0x1000 TEST EAX, EAX
Branch Logic JNE jump if not equal JE jump if equal JA jump if above JAE jump if above or equal JB jump if below JBE jump if below or equal JG jump if greater JGE jump if greater or equal JL jump if less JLE jump if less or equal
Branch Logic To implement a branch decision, the software will first perform a CMP and then follow it with a conditional jump. You can figure out the branch logic by putting the two together. <instruction address> : CMP EAX, 5 <instruction address> : JNE <target address> In the above, the program will branch to <target address> if EAX is not equal to 5.
TEST result of CALL The TEST instruction will often be used after a function call returns. Most compilers will put the return code for a function into the EAX register. So, you may see code like this: <instruction address> : CALL <target address> <instruction address> : TEST EAX, EAX <instruction address> : JE <target address>
Detecting size of data CMP DWORD PTR [<address>], 0 CMP WORD PTR [<address>], 0 CMP BYTE PTR [<address>], 0 Byte register: al, ah, bl, bh, cl, ch 16 bits: ax, bx, cx 32 bits: eax, ebx, ecx
Exercise Advanced Graphing Focus Type Interactive Analysis (MEP.EXE) Description Use graphing techniques to quickly isolate Malware Analysis factors. Identify behavioral interrelationship between two of the factors. Time 25 minutes
Exercise Create a new project (Static PE Import) Import MEP (Be careful, DON’T run it) Rough cut the strings Create layers for Communications Factors Command and Control Factors Build a graph that connects the Communications code with the Command and Control code Answer the questions in the back section of the handout (“Exercise 5 Questions”) BONUS: Graph the e-mail network traffic’s outer control loop
Review: Exercise What are some example strings used with the e-mail protocol? Identify the e-mail protocol(s) in use. Why is the channel between the COMs code and the Command and Control code important?
Bonus Answer: E-mail Parent Loop
Exercise Command and Control Factors Focus Type Interactive Analysis Description Use graphing techniques to quickly isolate the Command and Control Factors associated with a captured piece of malware. Time 25 minutes
Exercise Create a new project (Static PE Import) Import soysauce2.dll Name it “Soysauce 2” Import soysauce2.dll Answer the set of questions in the handout (“Exercise 10 Questions”)
Review: Exercise Is there a command buffer? If so, where in memory is it located? Identify at least three commands that soysauce2 recognizes What do these commands cause soysauce2 to do? Hint: you may have to translate WS2_32 ordinals What protocol is used to receive commands?
Communications Factors Crypto & Stego Communications Factors
Cryptography If the packet stream is encrypted, there will usually be a function or set of functions that decrypt the information In some cases, if the decryption is very simple, the decryption will be in-lined into the networking loop (no separate function) Look for XOR between bytes or between a byte and a hard-coded value
Stego Steganography is the art and science of writing hidden messages in such a way that no one apart from the sender and intended recipient even realizes there is a hidden message. By contrast, cryptography obscures the meaning of a message, but it does not conceal the fact that there is a message.
Random Number Generation Another important factor of cryptography is the generation of random numbers. Most software does not do this very well, including malware. Look for “srand“ “srand” combined with a time function clock GetTickCount Etc.
Stego_example
Arithmetic Stego_example
Comparison Against a Value Stego_example
Reading and Storing a Data-byte
Identifying READ and WRITE Operations
Linear Shift of Bytes
A B C D B C D E
Arithmetic against the byte that was read with getchar(). The and/not pair makes this seem like a mask. Data argument on stack, passed into function Finally, the last thing that gets done is the value derived from getchar() is COMBINED with a value that was an argument passed on the stack.
B C D E Data in register B C D Shift Right (SAR) B C D Data in register AND Mask C
Data from getchar() Data passed on stack AND mask NOT AND mask B C D W X Y Z C W W C OR’d together
The Steganography Routine The routine is reading bytes The bytes are read into a shift register This means up to four bytes at a time are stored Many SHIFT and MASK operations AND, NOT operations There is a point where the byte is combined with a second, totally different byte This is the OR operation at the end Steganography is the art and science of writing hidden messages in such a way that no one apart from the sender and intended recipient even realizes there is a hidden message. By contrast, cryptography obscures the meaning of a message, but it does not conceal the fact that there is a message.
Unraveling Steganography The routine is very complex to reverse engineer (a lot of arithmetic) We know Data read from somewhere is being COMBINED with a second set of data This combination uses heavy masking/shifting One set (probably the set being read with getchar) is being combined with a secret message. The final result is probably an encrypted byte, or a stego byte.
Reconstructing a crypto routine DEMO Reconstructing a crypto routine encrypted_web_address.avi
Exercise Encryption Focus Type Interactive Analysis Description Reverse engineer SearchIndex.EXE Time 25 minutes
Start Responder and create a new project (Physical memory) titled “searchindex.1” Import the searchindex.vmem Extract searchindex.exe Show strings and filter for “crypto” Answer Question 1 Graph region around “crypto key set” Answer Question 2 Try to label subroutines and follow data locations Answer Questions 3-4 Pay close attention to the command line parameters Answer Questions 5-6 Try to build the entire communications backbone on the graph Try to find the functions which are responsible for encryption Answer Questions 7-8
Questions Does the program take command line parameters? What function sets the crypto key? Is there a function for printing debug statements? Is there a default encryption key? What global flag controls whether crypto is to be used? Are there any other global flags for other command line parameters? Which function encrypts data and sends it over the network? Which function is responsible for decrypting incoming data?
These answers don’t match the example Instructor Answers Does the program take command line parameters? Yes, -nocrypto, -debug What function sets the crypto key? Is there a function for printing debug statements? Yes Is there a default encryption key? Yes, "PiNkYsWoRlD" What global flag controls whether crypto is to be used? data_004082f8 Are there any other global flags for other command line parameters? Yes, -debug for example Which function encrypts data and sends it over the network? sub_00402AB0 Which function is responsible for decrypting incoming data? sub_00401810
Exercise Recap Network packet crypto Crypto_around_flag.avi
Day 2, Part 3
Basic Computer Network Attack
Infecting Other Computers Windows networking code Attack vectors
WNet API Enumerating other computers on the LAN Infecting drive shares .INI files WNetOpenEnum Starts an enumeration of network resources WNetEnumResource Continues a network enumeration WNetAddConnection3 Makes a connection to a network resource WNetCancelConnection2 Terminates an existing network connection WNetGetConnection Retrieves the remote name of a network resource WNetGetUniversalName Maps a local path for a network resource WNetGetUser Retrieves user name used for network connection
Exercise CNA Factors Focus Type Interactive Analysis Description Reverse engineer the USB media infection strategy Time 25 minutes
Start Responder and create a new project (Physical memory) titled “hellbot.1” Import the hellbot.vmem Extract hellbot.exe, kernel32.dll, and ntdll.dll Show strings and filter for “.ini” and “.inf” Answer Question 1 Graph region around the filenames Look for the path or directory creation Look for file writing Answer Question 2 Use the code view to reverse the function code Answer Questions 3-5 Make a bookmark on this function
Questions What filenames stand out? What paths stand out? What register is used to store the lstrcat function pointer? What register is used to store the SetFileAttributes function pointer Where does the autorunme.exe file get copied from?
Instructor Answers What filenames stand out? Autorun.inf Desktop.ini What paths stand out? \ RECYCLER \ S-1-6-21-2434476501-1644491937-600003330-1213 What register is used to store the lstrcat function pointer? ESI What register is used to store the SetFileAttributes function pointer EBX Where does the autorunme.exe file get copied from? It’s a copy of the hellbot executable Bonus, the source path to the hellbot exe is obtained by using GetModulePath with a NULL module handle
Hellbot USB Key Infector Exercise Recap Hellbot USB Key Infector Hellbot_autorun_recap.avi
Development Factors (Who Wrote It ?)
Who Wrote It? Characteristics of the Developer Code Quality Compiler Compile Time / Time Zone Language Hints (Strings)
Programming Language Detecting Delphi (Pascal) Detecting VB Detecting Compiler
Code Style / Quality Error Handling Functions Structured Coding Exception Handlers Checking Return Values Functions Size Cohesion Coupling Structured Coding Switch{ } Statements / Multiple-IF constructs
Source Code Clues Embedded Paths Revision Control PDB file Images / Resources Revision Control Tags indicating CVS usage
PDB Files Paths with .pdb files indicate the path where the source code was Can give many hints about the real name of a driver, even if it was renamed later Sometimes has project name, or even the user name of the creator
Control Classes Clues as to the libraries used i.e., "msctls_statusbar32“ == MFC statusbar
Exercise Development Factors Focus Type Interactive Analysis Description Use graphing techniques to quickly isolate the Development Factors associated with a VMware image. Time 25 minutes
Start Responder and create a new project (Physical memory project) titled “password.1” Import the password1.vmem Extract b2new.exe and wmsdkna.exe Answer Questions 1-4
Questions What language was used to make these exe’s? Can you tell what version / compiler was used? Can you tell what the original project names are for these files? Are they using source code control? How can you tell?
What language was used to make these exe’s? Wmsdkns: Visual Basic Instructor Answers What language was used to make these exe’s? Wmsdkns: Visual Basic Has “.vdp” file extension in string, Visual Basic Project File Has “__vba” in many function names, Visual Basic for Applications Can you tell what version / compiler was used? Wmsdkns: Visual Basic 6.0 98 "C:\Program Files\Microsoft Visual Studio\VB98\VB6.OLB “C:\WINDOWS\system32\msvbvm60.dll\3” B2new: Visual C It includes functions from the msvcrt library B2new.exe: c or c++ The API calls, like sprintf, are used in c code Can you tell what the original project names are for these files? Wmsdkns: “runbll” (like rundll but trojan?) B2new: “loader_our” or “a_loader” Are they using source code control? How can you tell? Yes: "e:\projects\cvs_port\port\tools\loader_our\Bin\i386\a_loader.pdb"
Anti-debugging & Packing
Defensive Factors Overview Firewall bypassing Anti-Virus Killing Rootkit/Stealth techniques Anti-Debugging Packing and Encrypting
Checking for Debugger IsDebuggerPresent Exception tricks Register an exception handler, then see if any exceptions are intercepted If a debugger intercepts the exception, the malware detects it
Packing Results in limited symbol information Many cross-references will not be available Requires data_CALL_PTR resolution
Packed Sections SECTION .Kaos12 SECTION ·Åµ´²»î¿
String and Path obfuscation \dri~~@vers\e~~@tc\hos~~@ts
Packers & Function Loaders Most malware will load a whole set of functions by name using GetProcAddress The functions appear by name as text No auto-labeling of the actual function pointers You can label them by hand
Function Loaders Function loaders usually load a whole set of functions at once You will see a series of ascii names and data_call_ptr’s used together, this is almost always a loading step
The graph shows the loading of a large set of named function pointers. The code loads functions with GetProcAddress and LoadLibrary. The names of all the loaded DLLs and functions are clustered around this single node. MEP The graph shows the loading of a large set of named function pointers. This must be the function-loader part of the packer.
We can label each un-named data_call object with the proper name. By examining the code, we see that each unlabeled address is associated with one of the named functions. MEP We can label each un-named data_call object with the proper name.
GetProcAddress Loading Function Pointers Target: LanScanner
Resolving Call Pointers DEMO Resolving Call Pointers Movie: resolving_data_call_ptrs.avi
Exercise Resolving Call Pointers Focus Type Interactive Analysis Description Use graphing techniques to name data_call_PTR nodes with their proper function names Time 25 minutes
Start Responder and create a new project (Static Import) titled “data_calls.1” Import the datacalls.1.mapped.livebin Drop the string “socket” onto the graph Grow up and find GetProcAddress Grow down to find data_CALL_PTR node associated with “socket” Use code view to determine how ASCII name and data_CALL_PTR node are related Answer Questions 1-2 Questions Which register is the function address returned in How many blocks are between “socket” and the data_CALL_PTR node it’s used with?
Use graph techniques to isolate the callers of socket Promote the data_CALL_PTR to its own layer Hide the nodes that were using GetProcAddress Grow up Use graph techniques to label the functions and determine the callers socket recvfrom listen connect Answer Questions 1-4 Questions How many callers to socket? How many callers to recvfrom? How many callers to listen? How many callers to connect?
Instructor Questions & Answers Which register is the function address returned in EAX How many blocks are between “socket” and the data_CALL_PTR node it’s used with? Two How many callers to socket? 8 How many callers to recvfrom? 1 How many callers to listen? 2 How many callers to connect? 4
Hooking and Stealth
Stealth Driver-assisted hiding Use of thread or DLL injection so that new processes don’t need to be created Removal of the DLL from the list of loaded modules
Hooks Several common points where execution can be hooked Many more non-obvious points Impossible problem to cover with 100% certainty In other words, the bad guys always have a way you didn’t think of yet
This is a diagram of all the places a rootkit can use in a hiding scheme. This slide requires a robust set of descriptive slides to go with it in order to be official “curriculum” so have Greg just talk to it in the class for now.
IDT Hooking Use of CLI and STI instructions around point of hook Use of KeSetTargetProcessorDPC Queuing callbacks on each CPU core for multi-IDT table hooking KeQueryActiveProcessors
User-mode Hooking IAT hooks Hooking code must run in or alter the address space of the target process If you try to patch a shared DLL such as KERNEL32.DLL or NTDLL.DLL, you will get a private copy of the DLL. Three documented ways to gain execution in the target address space CreateRemoteThread Globally hooking Windows messages Using the Registry HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs
IAT HOOK Import Address Table FunctionName or Ordinal 0x11223344 Table Entry
Some DLL IAT HOOK CODE FunctionName or Ordinal 0x11223344 CODE Table Entry
Some DLL IAT HOOK CODE Some Rootkit BAD CODE CODE
Some DLL IAT HOOK CODE Some Rootkit BAD CODE CODE
Modifying BaseRules.txt
Kernel-mode Hooking The operating system is global memory Does not rely on process context Except when portions of a driver are pageable By altering a single piece of code or a single pointer to code, the rootkit subverts every process on the system
System Call KiSystemService Call Number 0x11223344 Table Entry System Service Descriptor Table USER MODE KERNEL MODE
System Call Kernel or module Call Number 0x11223344 Table Entry System Service Descriptor Table USER MODE KERNEL MODE
System Call Kernel or module Some rootkit System Service Descriptor Table USER MODE KERNEL MODE
System Call Kernel or module System Service Descriptor Table Some rootkit USER MODE KERNEL MODE
System Call Kernel or module System Service Descriptor Table Some rootkit USER MODE KERNEL MODE
Memory Protection XP and later version of the operating system Protect the System Call Table Protect the Interrupt Descriptor Table Protect code segments Writing to “read and execute only” memory causes BSoD
The CR0 Modify Memory Here… // UNProtect memory __asm { push eax mov eax, CR0 and eax, 0FFFEFFFFh mov CR0, eax pop eax } Modify Memory Here… // REProtect memory __asm { push eax mov eax, CR0 or eax, NOT 0FFFEFFFFh mov CR0, eax pop eax }
BaseRule CodeBytes:1.0:1:50 0F 20 C0 25 FF FF FE FF 0F 22 C0 58:ALL:These code bytes disable memory protections, this is highly suspicious // UNProtect memory __asm { push eax mov eax, CR0 and eax, 0FFFEFFFFh mov CR0, eax pop eax }
DEMO System Call Hooks SSDT_HOOK_RESOLUTION.avi
NOTES ON DEMO Import virus.vmem and follow the three SSDT hooks that hide_evr puts in place
Exercise Firewall Killer Focus Type Interactive Analysis Description Reverse engineer the firewall defense Time 25 minutes
Start Responder and create a new project (Static Import) titled “firewall.1” Import the firewall_killer.vmem Extract 2e45.exe Find out how the malware protects against firewalls and anti-virus Answer Question 1 Find the function that is called multiple times in a row with firewall or antivirus program names Reverse engineer that function Answer Questions 2-3
Questions Does the program attempt to stop more than one tool or program? Which registry key does the program make values under? What is the name of the program which is set to debug the firewalls?
Answers Does the program attempt to stop more than one tool or program? Yes, there is a long list of programs Which registry key does the program make values under? Image File Execution Options What is the name of the program which is set to debug the firewalls? ALG.EXE
Exercise Recap Firewall Killer Firewall_killer_1.avi