Practical Malware Analysis — Chapter 1 — Basic Static Analysis
This chapter introduces the basic static analysis techniques used to gather information about a malicious executable. The main ones are:
- Scanning the executable with a number of antivirus engines (VirusTotal).
- Using hashes (MD5, SHA) to identify a particular piece of malware.
- Using the “Strings” program to find function names, DLL names, headers and other useful information.
The author has used other programs for the examples, but I will be using the executables provided with the book.
NOTE: I am using FLARE-VM, which is a Windows-based reverse engineering and malware analysis platform. It is maintained by FireEye. It is quite easy to set up (a single script installs all of the necessary programs for you). More information @Flare-VM. You can use any of the pre-built machines or create your own! Be sure that you have a safe environment :))
Also, the labs come pre-downloaded in FLARE-VM.
Let’s begin!
Antivirus Scanning — The first step
Our first step is always to scan the file with antivirus engines, so that we confirm it is flagged as malicious and do not waste time analyzing a file that is not. Let’s upload Lab01-01.exe to VirusTotal and check the results.
The results for Lab01-01.exe can be found here.
We can see that this file has been detected as malicious by 40 antivirus programs.
Now let’s look at more of the details!
We can see the functions this executable imports, along with header information, basic properties, hashes and the history of the file.
We will leave the rest of the details for now, as we will be coming back to them later!
Hashing — A fingerprint for Malware
Next, we create a hash of the file to identify the malware uniquely. Why? A hash acts as a fingerprint for a sample: we can share it online and with other analysts, signature-based defenses can use it to block the malware, and we can search for it to check whether the file has already been identified.
We will be using MD5 and SHA hashes. Let’s create the hash using the “md5deep” tool.
We can now clearly see that the hash we got is the same as the hash shown in the VirusTotal result. We can also compute hashes with other tools such as “HashCalc”, “HashMyFiles” and “HashDeep”.
Finding Strings
Once we are done with AV scanning and hashing, it’s time to check the strings in the executable. A string is a sequence of characters. An executable commonly contains strings when it prints a message, connects to a URL, copies a file to a specific location and so on.
The strings give us hints about what is going on behind the scenes, such as the DLLs, files and functions that the executable references.
The book notes that Microsoft uses the term wide character string for its implementation of Unicode strings, in which each character is stored in two bytes instead of one.
Both ASCII and Unicode strings are stored as a sequence of characters that ends with a NULL terminator. For example, consider the word NEW:
N => 0x4E, E => 0x45, W => 0x57, followed by 0x00
This is the ASCII representation: one byte per character and a single NULL byte as the terminator. In the wide character form, each of these characters is followed by a 0x00 byte, and the string ends with two NULL bytes as its terminator:
N => 0x4E, 0x00, E => 0x45, 0x00, W => 0x57, 0x00, then 0x00, 0x00
The full table of ASCII codes can be found here.
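The byte layout is easy to check for yourself. Here is a small Python sketch that encodes the word NEW both ways. Python’s "utf-16-le" codec stands in for the Windows wide character layout here, which is an assumption that holds for plain ASCII-range characters like these.

```python
word = "NEW"

# ASCII: one byte per character, then a single NULL terminator.
ascii_bytes = word.encode("ascii") + b"\x00"

# Wide character: two bytes per character (little-endian), then two NULL bytes.
wide_bytes = word.encode("utf-16-le") + b"\x00\x00"

print(" ".join(f"0x{b:02X}" for b in ascii_bytes))
# prints: 0x4E 0x45 0x57 0x00
print(" ".join(f"0x{b:02X}" for b in wide_bytes))
# prints: 0x4E 0x00 0x45 0x00 0x57 0x00 0x00 0x00
```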
Now, let’s run ‘strings’ on the executable.
Now we can clearly see some messages, the DLL files being loaded, the path to some DLLs and some functions such as ‘_stricmp’. We can also narrow down our search using the ‘grep’ and ‘uniq’ commands. Let’s grep for all the DLLs this executable references!
So now we know which DLLs the executable takes its functionality from. We can also see functions such as ‘CreateFileA’ and ‘CopyFileA’, which suggests that this executable creates and copies files.
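To get a feel for what a strings-style tool does, here is a rough Python sketch of the idea. It is not how the real Strings program is implemented; the function names and the default minimum length of 4 are choices made for this example only.

```python
import re


def extract_strings(data, min_length=4):
    """Return printable ASCII and wide (UTF-16LE) strings found in a bytes object."""
    # A run of at least min_length printable ASCII bytes.
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    # The same characters, each followed by a 0x00 byte (wide character form).
    wide_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)
    found = [m.group().decode("ascii") for m in ascii_pattern.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pattern.finditer(data)]
    return found


def dll_names(strings):
    """Rough equivalent of piping the output through grep for 'dll' and uniq."""
    return sorted({s for s in strings if ".dll" in s.lower()})
```

Reading the sample with open("Lab01-01.exe", "rb").read() and passing the bytes to extract_strings, then the result to dll_names, should give a list similar to the grep output. Like the real tool, this simple matching can occasionally glue unrelated bytes into a bogus string, so treat the output as hints only.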
Packed & Obfuscated Malware
Obfuscated programs are ones whose execution the malware author has tried to hide. Packed programs are a subset of obfuscated programs in which the malicious code is compressed, which makes it difficult to detect and analyze. Such programs usually show very few strings, or none at all, so we need to unpack them before we can analyze them statically.
NOTE: Packed and obfuscated code will often include at least the functions LoadLibrary and GetProcAddress, which are used to load and gain access to additional functions (Practical Malware Analysis).
When a packed program is run, a small wrapper program runs first, unpacks the packed code and then runs the unpacked code.
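The two hints mentioned above (very few strings, plus LoadLibrary and GetProcAddress showing up) can be turned into a quick check. This is only a sketch: the function name packing_hints and the threshold of 50 strings are arbitrary assumptions for illustration, and a match is a reason to look closer, not proof of packing.

```python
def packing_hints(strings, few_strings_threshold=50):
    """Collect weak indicators of packing from a list of extracted strings.

    few_strings_threshold is an arbitrary cut-off chosen for this example.
    """
    loader_apis = ("LoadLibrary", "GetProcAddress")
    # Substring match so that variants such as LoadLibraryA are also counted.
    present = [api for api in loader_apis if any(api in s for s in strings)]
    few_strings = len(strings) < few_strings_threshold
    return {
        "string_count": len(strings),
        "few_strings": few_strings,
        "loader_apis_present": present,
        "suspicious": few_strings and len(present) == len(loader_apis),
    }
```

The input is simply the list of strings extracted from the file, for example the output of the strings tool split into lines.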
Detecting Packers with PEiD
Let’s check whether the executable we are working on is packed or not. We will be using PEiD.
PEiD reports that this file is not packed. The VirusTotal result, however, mentions packer details. The file we have at the moment is indeed not packed; one possible explanation is that other samples with the same code were packed in order to hinder analysis.
Portable Executable File Format
The format of a file can reveal a lot about the program’s functionality. The Portable Executable (PE) format is used by Windows executables, object code and DLLs. It is a data structure that contains the information the Windows OS loader needs to manage the wrapped executable code.
Nearly every file with executable code that is loaded by Windows is in the PE format.
PE files begin with a header that includes information about the code, the type of application, required library functions, space requirements and so on.
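As a small illustration of that header, a PE file starts with a DOS header whose first two bytes are "MZ", and the 4-byte value at offset 0x3C points to the "PE\0\0" signature where the PE headers begin. The Python sketch below checks exactly that and nothing more; the function name is_pe is made up for this example.

```python
import struct


def is_pe(data):
    """Return True if the bytes start with a DOS header that points at a PE signature."""
    if len(data) < 64 or data[:2] != b"MZ":
        return False
    # e_lfanew: 4-byte little-endian file offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"
```

Passing it the contents of Lab01-01.exe (read in binary mode) should return True, while a text file or an ELF binary returns False.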
Linked Libraries & Functions
Some of the most useful information we can gather about an executable is the list of functions it loads or imports.
Imports are functions used by one program that are actually stored in a different program, such as a code library. These libraries are connected to the main executable through linking!
Static, Runtime & Dynamic Linking
Static Linking: The library code is copied into the executable, so the executable grows in size. During analysis it is difficult to tell statically linked code apart from the executable’s own code, because the PE header has no clue about this!
Runtime Linking: This is common in malware, mainly in packed or obfuscated executables. The executable connects to a library only when one of its functions is needed, rather than at program start.
Dynamic Linking: When the program is loaded, the OS searches for the necessary libraries. When the program calls a linked library function, that function executes within the library.
The PE header stores the information about every library that will be loaded and every function that will be used by the program.
Exploring Dynamically Linked Functions with Dependency Walker
This software is not pre-installed in FLARE-VM but can be downloaded from http://www.dependencywalker.com/
Dependency Walker lists only the dynamically linked functions of an executable. Let’s load the program we are working on to find out which libraries are dynamically linked.
Now we can see that functions from two DLLs are being imported. The list of imported functions is shown in the upper right pane. But there is one DLL that Dependency Walker did not list: Lab01-01.dll. That is because it is not a system DLL but a separate file created specifically for Lab01-01.exe. Let’s run strings on it.
Now we can see that this DLL imports functions like “CreateProcessA”, which suggests that it creates a separate process, so we should keep an eye on other programs being launched. We can also see an IP address, which suggests that the DLL communicates with that address. Let’s open it in Dependency Walker.
Now we can clearly see the DLLs and the functions being imported by this file.
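The DLL names that Dependency Walker displays come from the import directory in the PE header, and reading them by hand is a good way to understand the format. Below is a rough Python sketch that lists only the DLL names, not the individual functions. It assumes a well-formed file and does no bounds checking; the function names are made up for this example. For real work the third-party pefile package is a more robust option.

```python
import struct


def _read_cstring(data, offset):
    """Read a NULL-terminated ASCII string starting at a file offset."""
    end = data.index(b"\x00", offset)
    return data[offset:end].decode("ascii", errors="replace")


def imported_dlls(data):
    """Return the DLL names listed in the import directory of a PE file."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file")
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    opt_offset = pe_offset + 24
    (magic,) = struct.unpack_from("<H", data, opt_offset)
    # Data directories start 96 bytes into a PE32 optional header, 112 for PE32+.
    dir_offset = opt_offset + (112 if magic == 0x20B else 96)
    # Entry 1 of the data directory is the import table (RVA, size).
    import_rva, _import_size = struct.unpack_from("<II", data, dir_offset + 8)

    # Section table: needed to translate RVAs into file offsets.
    sections = []
    table = opt_offset + opt_size
    for i in range(num_sections):
        vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<IIII", data, table + 40 * i + 8
        )
        sections.append((vaddr, max(vsize, raw_size), raw_ptr))

    def to_offset(rva):
        for vaddr, size, raw_ptr in sections:
            if vaddr <= rva < vaddr + size:
                return rva - vaddr + raw_ptr
        raise ValueError("RVA 0x%X is not inside any section" % rva)

    names = []
    if import_rva == 0:
        return names
    offset = to_offset(import_rva)
    while True:
        # IMAGE_IMPORT_DESCRIPTOR: five 4-byte fields, the fourth is the name RVA.
        fields = struct.unpack_from("<IIIII", data, offset)
        if not any(fields):  # an all-zero descriptor ends the list
            break
        names.append(_read_cstring(data, to_offset(fields[3])))
        offset += 20
    return names
```

Running imported_dlls on the bytes of Lab01-01.exe should return the same DLL names that Dependency Walker shows for it.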
Common DLLs
PE File Headers & Sections
We have already learned a lot about PE files, so now let’s take a look at their sections.
.text This section contains the instructions that the CPU executes. It holds the code and is generally the only section that can execute.
.rdata This section typically contains the import and export information, the same information we get from Dependency Walker and PEview. It also holds read-only data that is accessible within the program.
.idata Sometimes present; stores the import function information. If it is not present, that information is stored in .rdata.
.edata Sometimes present; stores the export function information. If it is not present, that information is stored in .rdata.
.rsrc This section includes the resources used by the executable that are not considered part of the executable, such as icons, images etc. Strings can be stored in .rsrc or in the main program.
.reloc Contains information for the relocation of library files.
Examining PE Files with PEview
We have discussed the PE header and the information it stores. Now it’s time to examine it with PEview, which is pre-installed in FLARE-VM.
We can now clearly see the different parts of the file: the DOS header, the image headers and the .text, .rdata and .data sections. Each part provides its own information.
The IMAGE_FILE_HEADER provides basic information about the file, including when the executable was compiled.
The IMAGE_OPTIONAL_HEADER provides more information, including whether this is a CLI or GUI executable.
The IMAGE_SECTION_HEADER entries describe each section of the PE file. For all three sections, we should note the Virtual Size (the space allocated for the section in memory during loading) and the Size of Raw Data (the section’s size on disk).
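The fields PEview shows here can also be read directly from the file. The Python sketch below pulls out the compile timestamp, the subsystem and the per-section sizes. It is a simplified reading of the headers that assumes a well-formed file; the function name summarize_pe and the labels in SUBSYSTEMS are made up for this example.

```python
import struct
from datetime import datetime, timezone

# Subsystem values from the optional header: 2 is Windows GUI, 3 is console.
SUBSYSTEMS = {2: "GUI", 3: "console (CLI)"}


def summarize_pe(data):
    """Return compile time, subsystem and section sizes for a PE file's bytes."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file")
    # IMAGE_FILE_HEADER: NumberOfSections, then TimeDateStamp (seconds since 1970, UTC).
    num_sections, timestamp = struct.unpack_from("<HI", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    opt_offset = pe_offset + 24
    # IMAGE_OPTIONAL_HEADER: Subsystem sits at offset 68 for both PE32 and PE32+.
    (subsystem,) = struct.unpack_from("<H", data, opt_offset + 68)

    sections = []
    table = opt_offset + opt_size
    for i in range(num_sections):
        # IMAGE_SECTION_HEADER: Name, VirtualSize, VirtualAddress, SizeOfRawData.
        name, vsize, _vaddr, raw_size = struct.unpack_from(
            "<8sIII", data, table + 40 * i
        )
        label = name.rstrip(b"\x00").decode("ascii", errors="replace")
        sections.append((label, vsize, raw_size))

    return {
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "subsystem": SUBSYSTEMS.get(subsystem, "other (%d)" % subsystem),
        "sections": sections,
    }
```

Each entry in "sections" is a (name, virtual size, size of raw data) tuple, which makes the two sizes easy to compare. The book treats a Virtual Size that is much larger than the Size of Raw Data, particularly for .text, as a hint that the file may be packed. Keep in mind that the compile timestamp can be faked.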
Conclusion
With all this, we are done with Chapter 1. I really enjoyed learning new stuff in detail and I hope that you enjoyed it too. If so, stay tuned to this series. Keep practicing and keep learning.
If you find any errors, bugs or typos, do let me know.
Thanks for reading!