In this post, we will investigate a Vidar malware sample containing suspicious encrypted strings. We will use Ghidra cross references to analyse the strings and identify the locations where they are used.
Using this, we will locate a string decryption function and use a debugger to intercept its input and output to obtain decrypted strings.
We will then semi-automate the process, obtaining a full list of decoded strings that can be used to fix the previously obfuscated Ghidra database.
Summary
During basic analysis of a Vidar file, we can see a large number of base64 strings. These strings cannot be decoded using base64 alone, as there is an additional layer of encryption. By using Ghidra string references, we can see where the base64 is used and hence locate the function responsible for decoding.
With a decoding function found, it is trivial to find the "start" and "end" of the decryption process. Using this knowledge, we can load the file into a debugger and set breakpoints on the beginning and end of the decoding function. This enables us to view the input (encoded string) and output (decoded string) without needing to reverse engineer the decryption process.
By adding a simple log command to the breakpoints in the debugger (x32dbg), we can tell x32dbg to print the values at the start and end of the decryption function. This is a means of automation that is simple to implement without coding knowledge.
Once the encrypted and decrypted contents have been obtained, we can use them to manually edit the original Ghidra database and gain a deeper understanding of the malware's hidden functionality.
Obtaining the File
The file can be downloaded from Malware Bazaar.
SHA256: 0823253d24e0958fa20c6e0c4b6b24028a3743c5c895c577421bdde22c585f9f
Initial Analysis and Identifying Strings
Once the file has been downloaded from Malware Bazaar, we can unzip it using the password infected.
We like to create a copy of the original file with a shorter and more useful file name. In this case we have chosen vidar.bin.
We can perform some basic initial analysis using Detect-it-easy. A typical workflow in Detect-it-easy is to look for strings contained within the file.
If we select the "strings" option, we can see a large number of base64-like strings.
(You could also use PeStudio or any other tooling that can identify strings.)
The default minimum string length is 5, which results in a lot of junk strings. By increasing this to 10, we can more easily identify strings of interest.
In the screenshot below we can see a group of base64-like strings. In many cases, encoded strings like these are used to obfuscate functionality and Command-and-Control (C2) servers.
Hence, they are a useful indicator to home in on with tooling like Ghidra.
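For readers without Detect-it-easy to hand, the same filtering idea can be sketched in a few lines of Python. This is a minimal sketch, not how Detect-it-easy works internally: the function name find_base64_like and the "length is a multiple of 4" heuristic are assumptions made for illustration, and the sample bytes simply embed two of the strings seen in this post.

```python
import re


def find_base64_like(data: bytes, min_len: int = 10) -> list:
    """Return ASCII runs that look like base64 and are at least min_len long.

    min_len counts the characters before any '=' padding.
    """
    pattern = re.compile(
        rb"[A-Za-z0-9+/]{" + str(min_len).encode("ascii") + rb",}={0,2}"
    )
    hits = []
    for match in pattern.finditer(data):
        candidate = match.group()
        # Padded base64 always has a length that is a multiple of 4.
        # This heuristic drops many ordinary words and identifiers.
        if len(candidate) % 4 == 0:
            hits.append(candidate.decode("ascii"))
    return hits


# Stand-in for the contents of vidar.bin: binary junk around two strings.
sample = b"\x00\x01tw+lvmZw5kffvene\x00short\x00lgWSvkdzsA==\x00"
for hit in find_base64_like(sample):
    print(hit)
```

To run it against the real sample, replace the sample bytes with the result of reading vidar.bin in binary mode. Expect some false positives, since any long alphanumeric run can pass the filter.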
Now that we've identified some interesting strings within the file, we can use Ghidra to analyse them further and establish some context as to how they are used.
Loading the File Into Ghidra
To analyse these strings further, we can load the file into Ghidra.
This can be done by dragging the file into Ghidra, accepting all default options and allowing the Ghidra analysis to run for a few minutes.
We can then continue our analysis by locating the same strings we found during initial analysis. In this case we can start with the first base64 string, tw+lvmZw5kffvene.
The screenshots below demonstrate how to perform a string search with Ghidra, using Search -> For Strings.
Ghidra will present a window like the one below, and we can typically accept the defaults.
Make sure that Selection Scope -> Search All is selected. Ghidra sometimes changes this to Selection Scope -> Search Selection if you have something highlighted.
Once we've accepted the default search options, we can filter on the beginning of our previous string, tw+, to locate it.
This will reveal 3 strings starting with tw+.
We can double-click any of the returned strings, which will take us to the location of the string within the file.
Ghidra automatically recognises when the location storing the string is used elsewhere in the file. This is known as a cross reference (xref) and is an extremely useful concept to become familiar with.
In this view, we can see that there is one cross reference (XREF) available. This indicates that Ghidra has found one location where the string is used.
Double-clicking the xref value will show us where the string has been referenced.
After double-clicking the xref value, we can see the base64 string (as well as others) contained within function FUN_004016a6.
We can also see that each of these strings is passed to FUN_00401526. Since every string goes to the same function, it is very likely the one responsible for decryption.
Side note: these strings undergo additional obfuscation on top of base64, so we won't be able to decode them using base64 alone.
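The side note is easy to confirm with a quick check. The sketch below base64-decodes the first string from the sample and tests whether the result is readable text. The helper name is_printable_ascii is just an illustrative choice.

```python
import base64


def is_printable_ascii(data: bytes) -> bool:
    """True if every byte is a printable ASCII character."""
    return all(0x20 <= byte <= 0x7E for byte in data)


encoded = "tw+lvmZw5kffvene"
raw = base64.b64decode(encoded)

# The decode itself succeeds, but the bytes are not readable text,
# which points to a further layer of encryption or obfuscation.
print(len(raw), raw.hex(), is_printable_ascii(raw))
```

The decode succeeds, so the strings are valid base64, but the output is binary data rather than text. That is consistent with the malware applying another transformation after the base64 step.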
If we click on FUN_00401526, the function taking all the encoded strings, we can see that it's rather long, confusing and contains a lot of junk code.
Luckily, we don't need to analyse it in detail in order to decrypt the strings. Since we know the location of the function within the file, we can use a debugger to obtain the decrypted content for us.
Ghidra's default function name contains the function's address. This is all we need to be able to locate it within a debugger.
E.g. for function FUN_00401526, the address of the function will be 00401526.
As a side note, if we look at the same function within the disassembly view on the left-hand side, we can see that there are 542 xrefs available.
This means that FUN_00401526 is referenced 542 times throughout the file. A number this high is another strong indicator that the function is used for decoding.
We now know the location of a function that is likely responsible for decrypting the strings. Although we could analyse it statically, this is difficult, time-consuming and often unnecessary.
A better method is to load the file into a debugger and use breakpoints to monitor the location of the function. We can use this method to obtain input (encrypted string) and output (decrypted string) without needing to manually analyse the function. We just need to know where the function starts.
Loading the File Into x32dbg
Since we now have a function to monitor, we can load the file into x32dbg for further analysis.
We can start by dragging the file into x32dbg and allowing it to reach its entry point using F9 or Continue.
Confirming and Synchronising Base Addresses in Ghidra
Before continuing analysis in the debugger, we need to confirm that the base address is the same as in Ghidra. This ensures that the function will be found at the same address in both tools.
An address in both Ghidra and x32dbg is always <base address> + offset. If <base address> differs between the two tools, we need to fix it before the addresses will line up.
We can double-check the base address by clicking on the Memory Map option within x32dbg. The base address will be the one on the same line as your file name.
The base address in our case was 0x000f0000 (this address may differ for you).
We need to make sure that this base address is aligned with Ghidra.
In Ghidra, the base address can be found in Display Memory Map -> View Base Address.
In this case, Ghidra's base address is 0x00400000. We can manually change this to match the 0x000f0000 found in x32dbg.
Fixing the base address is as simple as changing the value to 0x000f0000.
After selecting OK, Ghidra will reload the file with the new base address.
After changing the base address, Ghidra will sometimes lose its place. You may need to do another string search + xref (same process as before) to identify the string decryption function again.
With the correct base address now loaded, the string decryption function will have a new name, FUN_000f1526, to reflect its new location.
We can now use this address of 000f1526 to create a breakpoint within x32dbg.
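The rebasing arithmetic is simple enough to check by hand or with a short script. The sketch below parses the address out of Ghidra's default function name and moves it from Ghidra's original base to the base reported by x32dbg. The helper names are hypothetical, and the two base addresses are the ones from this walkthrough, so they may differ on your machine.

```python
def ghidra_function_address(name: str) -> int:
    """Extract the address from a default Ghidra name such as FUN_00401526."""
    prefix = "FUN_"
    if not name.startswith(prefix):
        raise ValueError("not a default Ghidra function name: " + name)
    return int(name[len(prefix):], 16)


def rebase(address: int, old_base: int, new_base: int) -> int:
    """Keep the offset from the image base, swap the base itself."""
    return address - old_base + new_base


GHIDRA_BASE = 0x00400000    # base address shown by Ghidra in this post
DEBUGGER_BASE = 0x000F0000  # base address shown by x32dbg in this post

ghidra_addr = ghidra_function_address("FUN_00401526")
debugger_addr = rebase(ghidra_addr, GHIDRA_BASE, DEBUGGER_BASE)
print(f"bp {debugger_addr:08x}")  # prints "bp 000f1526"
```

Computing the address this way is an alternative to changing the base in Ghidra, although rebasing Ghidra has the advantage that every address in the listing then matches the debugger directly.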
Setting Breakpoints on the Decryption Function
We now want to create a breakpoint at the corrected address of the decryption function.
Using the new address of 000f1526, we can go back to x32dbg and create a breakpoint using bp 000f1526.
With the breakpoint set, we can let the malware run until the function is triggered.
When the breakpoint is hit, we can view the current encoded string within the stack window on the right-hand side of x32dbg.
If we allow the function to complete using the Execute Until Return option, we can jump to the end of the decryption function and see if any decrypted output is present.
Execute Until Return tells the debugger to let the current function run to its return instruction without continuing beyond it. This is an easy way to obtain function output without it getting lost somewhere during execution.
The "Execute Until Return" button looks like this.
After Execute Until Return has completed, we can observe the first decoded string, HAL9TH, within the register window.
The decoded string is referenced by EAX, which is the most common location where function output will be stored.
Now that the decoded string is visible, we should note the current location of EIP within the debugger. This is the exact address at which a decrypted copy of the string is available.
In the screenshot below, we can see that this location is 0x000f16a3. This is the end of the decryption function, and we should create another breakpoint here.
Creating a breakpoint here is functionally identical to using Execute Until Return every time we hit the function, but a second breakpoint is much more convenient.
The new breakpoint can be created with bp 000f16a3 or by pressing F2 on the address highlighted in green.
If we continue to execute using F9 or Continue, we will hit the start of the string decryption function again.
This time there is a new encoded string present in the stack window: lgWSvkdzsA==.
Allowing the malware to run with F9 again will trigger our second breakpoint, where we can see the decoded value JohnDoe.
As you obtain decrypted values, it can be useful to google them to determine their purpose within the context of malware.
According to CyberArk, the two values JohnDoe and HAL9TH are default values used by the Windows Defender emulator. The malware likely uses these values later to determine if it's being emulated inside of Windows Defender.
Obtaining Additional Decoded Values
By allowing the malware to continue executing with F9, we will keep hitting the existing breakpoints and observing decoded values.
Here we can see that the malware has decrypted some Windows API names (LoadLibraryA, VirtualAlloc) as well as strings related to crypto wallets (Ethereum, ElectronCash, Binance).
From this we can reasonably assume that the malware is dynamically loading APIs and likely stealing crypto wallet data.
Recall that there were 542 references to the string decryption function. This is too many to observe manually, so we can perform some basic automation using the debugger.
Automating the Process With Conditional Breakpoints
Now that we have breakpoints on the start and end of the decryption function, we can add log text to each one to print the interesting values to the log window.
We can do this by opening the breakpoints window and using Right-Click -> Edit on each of the two existing breakpoints.
Printing Encoded Strings With x32dbg
Our first breakpoint is at the "start" of the decryption function, and we know from previous analysis that the encoded value will be visible in the stack window.
Observing the stack window more closely, we can see that the exact location is [esp+4].
We can now tell the breakpoint to log the string contained at [esp+4].
We can do this with the log text Encoded: {s:[esp+4]}. The "Encoded: " part is not necessary but it makes the output easier to read.
Since we don't need to stop at every breakpoint (we just want to log the results), we can also enter run; in the Command Text field.
This will tell x32dbg to resume execution after printing the output.
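It can help to see why [esp+4] is the right place to look. On entry to a 32-bit function that takes its arguments on the stack, [esp] holds the return address and [esp+4] holds the first argument, which here is a pointer to the encoded string. The toy model below imitates that layout in plain Python. The ToyMemory class and every address in it are made up for illustration, and this is not how x32dbg is implemented.

```python
import struct


class ToyMemory:
    """A tiny flat memory region, just enough to imitate a 32-bit stack."""

    def __init__(self, base: int, size: int):
        self.base = base
        self.data = bytearray(size)

    def write(self, address: int, payload: bytes) -> None:
        offset = address - self.base
        self.data[offset:offset + len(payload)] = payload

    def read_dword(self, address: int) -> int:
        # Little-endian 32-bit value, as stored on x86.
        return struct.unpack_from("<I", self.data, address - self.base)[0]

    def read_cstring(self, address: int) -> str:
        offset = address - self.base
        end = self.data.index(0, offset)  # stop at the NUL terminator
        return self.data[offset:end].decode("ascii")


mem = ToyMemory(base=0x1000, size=0x100)

string_addr = 0x1040                                # hypothetical address
mem.write(string_addr, b"lgWSvkdzsA==\x00")         # the encoded string

esp = 0x1010                                        # hypothetical stack top
mem.write(esp, struct.pack("<I", 0x000F16B0))       # [esp]: made-up return address
mem.write(esp + 4, struct.pack("<I", string_addr))  # [esp+4]: first argument

first_arg = mem.read_dword(esp + 4)  # the pointer stored at [esp+4]
print("Encoded:", mem.read_cstring(first_arg))
```

The two reads at the end mirror the log text: first fetch the pointer stored at [esp+4], then format the bytes at that pointer as a string.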
Printing Decoded Strings With x32dbg
We can repeat the same process for the second breakpoint.
This time, instead of printing [esp+4], we want to print the decoded value referenced by eax.
After editing the second breakpoint, we want it to look something like this.
This should be identical to the previous breakpoint, with only [esp+4] being replaced with eax.
We can also change Encoded: to Decoded: to make the final output easier to read.
With the new breakpoints saved, we can restart the malware or allow it to continue its current execution. This will print all encoded and decoded values to the log window.
(You can find the log window next to the breakpoints window.)
After restarting the malware and leaving the breakpoints intact, we can see our initial encoded string and its decoded value of kernel32.dll.
We can also see additional decoded values related to Ethereum keystores.
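Once the log has been copied out of x32dbg, pairing each encoded value with its decoded value is straightforward. The sketch below is one way to do it, under assumptions: the exact format of the log lines (including whether x32dbg wraps strings in quotes) is not confirmed here, so the parser accepts both forms, and the first line of the sample log is invented noise. Only the lgWSvkdzsA== and JohnDoe pair comes from this post.

```python
import re

# Matches lines produced by the "Encoded: ..." and "Decoded: ..." log text.
# Surrounding double quotes are optional because the exact format is assumed.
LOG_LINE = re.compile(r'^(Encoded|Decoded):\s*"?(.*?)"?\s*$')


def pair_log_lines(lines):
    """Return (encoded, decoded) tuples from alternating log lines."""
    pairs = []
    pending = None
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # ignore unrelated debugger output
        kind, value = match.groups()
        if kind == "Encoded":
            pending = value
        elif pending is not None:
            pairs.append((pending, value))
            pending = None
    return pairs


sample_log = [
    "Some unrelated debugger message",  # hypothetical noise line
    'Encoded: "lgWSvkdzsA=="',
    'Decoded: "JohnDoe"',
]
for encoded, decoded in pair_log_lines(sample_log):
    print(encoded, "->", decoded)
```

A Decoded line with no preceding Encoded line is skipped, which keeps the pairs aligned if the log starts partway through a call.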
Obtaining Only Decrypted Values
By temporarily disabling the initial breakpoint (Right-Click -> Disable), we can print only the decoded values. Here we can see some potential encryption keys, as well as SQL commands used to steal Mozilla Firefox cookies.
We can also observe that the malware attempts to steal credit card information from web browsers.
Using Results to Edit Ghidra Output
If we go back to Ghidra, we can revisit the initial function containing references to encrypted strings.
Since we now have both the encrypted and decrypted values, we can edit the Ghidra view to reflect the decoded content.
Here we can see decoded values within x32dbg, corresponding to the encoded values in the screenshot above.
We can also note that after each call to the decoding function, the result is stored inside a global variable (indicated by a green DAT_00138e98 etc. on the left-hand side).
This usually means that the same variable will be referenced each time the decoded string is used. If we rename the variable once, it will be renamed in all other locations that reference it.
We will see this in action in a few more screenshots.
Using the output from x32dbg, we can begin renaming those global variables (DAT_000* etc.) to their decoded values.
This will significantly improve the readability of the Ghidra code.
This process can be done manually or by saving the x32dbg output and creating a Ghidra script. Scripting this in Ghidra is relatively complicated and will be covered in a later post.
For now, we can edit the names manually (Right Click -> Rename Global Variable).
Below we can see the same code after some slight renaming, making sure to reference the x32dbg output.
We like to prefix each variable with str_ to indicate that it's a string. This is optional but improves the readability of the code.
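If you have many decoded strings, it can be handy to generate the str_ names up front rather than typing each one. The sketch below is one possible naming scheme and nothing more: the helpers label_for and unique_labels, the 40-character cap and the numeric suffix for duplicates are all choices made for this example, not anything required by Ghidra.

```python
import re


def label_for(decoded: str, prefix: str = "str_", max_len: int = 40) -> str:
    """Turn a decoded string into a label-friendly name such as str_JohnDoe."""
    # Replace anything that is not a letter, digit or underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", decoded)
    # Collapse runs of underscores and trim them from both ends.
    cleaned = re.sub(r"_+", "_", cleaned).strip("_")
    return prefix + (cleaned[:max_len] or "empty")


def unique_labels(decoded_strings):
    """Generate labels, adding a numeric suffix when two strings collide."""
    seen = {}
    labels = []
    for value in decoded_strings:
        base = label_for(value)
        count = seen.get(base, 0) + 1
        seen[base] = count
        labels.append(base if count == 1 else f"{base}_{count}")
    return labels


for name in unique_labels(["kernel32.dll", "JohnDoe", "VirtualAlloc"]):
    print(name)
```

The generated names can then be pasted into Ghidra's rename dialog one at a time while working through the x32dbg output.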
With the DAT_* locations renamed to their decoded values, any location within Ghidra that references the same DAT_ value will now have a suitable name, making it much easier to infer the purpose of the function.
To determine where a variable is used, we can again use cross references. Double-clicking any of the DAT_* values will show its location and any available cross references where it is used.
For example, here is the function containing "JohnDoe" before the DAT_* values are renamed.
If we had encountered this function without first decrypting strings, it would be difficult to tell what the function is doing.
After marking up the DAT_* values with more appropriate names, the function now looks like this.
Since we googled these values and determined they are used for Defender emulation checks, we can infer that this is (most likely) the purpose of the function.
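To make the inference concrete, the logic of such a check can be written out in a few lines. This is a conceptual sketch only, not decompiled code from the sample. It assumes, as is commonly reported for the Defender emulator, that JohnDoe is the user name and HAL9TH is the computer name, and it assumes both must match. Whether the sample requires both or either is not established in this post.

```python
# Assumption: which value is the user name and which is the computer name
# comes from public reporting, not from this post.
DEFENDER_EMULATOR_USER = "JohnDoe"
DEFENDER_EMULATOR_HOST = "HAL9TH"


def looks_like_defender_emulator(user_name: str, computer_name: str) -> bool:
    """Return True when both names match the emulator defaults."""
    return (
        user_name == DEFENDER_EMULATOR_USER
        and computer_name == DEFENDER_EMULATOR_HOST
    )


print(looks_like_defender_emulator("JohnDoe", "HAL9TH"))      # emulator-like
print(looks_like_defender_emulator("analyst", "WORKSTATION"))  # ordinary host
```

Seeing two string comparisons against these values in the renamed Ghidra output is what supports reading the function as an emulation check.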
Using that assumption, we can change the function's name to something more useful.
Now, anywhere that function is called will be much more understandable.
To see where a function is called, we can double-click it and view the xrefs again.
Here is one such reference, which doesn't make much sense at first glance.
After renaming the function to mw_checkDefenderEmulation, it begins to make more sense.
After renaming all remaining DAT_* variables, it begins to make even more sense.
The malware is temporarily going to sleep and repeatedly checking for signs of Defender emulation.
A similar concept can be seen with the decoded string for VirtualAlloc.
Below is a function referencing VirtualAlloc, prior to renaming variables.
After renaming, we can see that its primary purpose is to allocate memory using VirtualAlloc.
(There are some other things going on, but the primary purpose is memory allocation, hence we can rename this function to mw_AllocateWithVirtualAlloc.)
This process can be repeated until all points of interest have been labelled with appropriate values.
This is time-consuming if you wish to mark up an entire file, but it is effective and will reveal a significant portion of the file's previously hidden functionality.
Once you're comfortable with performing this process manually, you can eventually create a script to do the same thing for you.
Creating a script will still require obtaining the decrypted strings through some means, but the renaming itself is well suited to a Ghidra script.
Conclusion
We have now looked at how to identify basic obfuscated strings, decrypt them, and fix their values within Ghidra.
Although this is a relatively simple example, the same overall process and workflow can be repeated across many malware samples.
As you become more confident, many of these steps can be automated further or scripted. The renaming process can be replaced with a Ghidra script, and the "debugger" process can be replaced with scripted emulation (Unicorn, Dumpulator etc.).
Regardless, this blog demonstrates some core skills that form a baseline for exploring future automation.
Originally published by Matthew: Ghidra Basics – Identifying, Decoding and Fixing Encrypted Strings