1
Environment Setup
1.1 Install codeql-cli, Visual Studio Code, and the CodeQL extension
1.2 clone vscode-codeql-starter
git clone --recursive https://github.com/github/vscode-codeql-starter
Add GitHub's IP address to /etc/hosts, for example:
140.82.113.3 github.com
GitHub's current IP address can be looked up at: https://www.ipaddress.com/site/github.com
1.3 Download the prebuilt U-Boot database
❯ unzip u-boot_u-boot_cpp-srcVersion_d0d07ba86afc8074d79e436b1ba4478fa0f0c1b5-dist_odasa-2019-07-25-linux64.zip
❯ ll
total 1011776
drwxr-xr-x 5 lzx staff 160B 6 6 23:33 attach
drwxr-xr-x 38 lzx staff 1.2K 6 5 23:02 codeql
drwxr-xr-x@ 22 lzx staff 704B 5 24 18:28 codeql-cli
-rw-r--r--@ 1 lzx staff 420M 6 5 23:01 codeql-cli.zip
-rw-r--r--@ 1 lzx staff 2.1K 6 7 00:00 codeql.md
-rw-r--r--@ 1 lzx staff 74M 6 5 23:21 u-boot_u-boot_cpp-srcVersion_d0d07ba86afc8074d79e436b1ba4478fa0f0c1b5-dist_odasa-2019-07-25-linux64.zip
drwxr-xr-x@ 9 lzx staff 288B 6 7 00:01 u-boot_u-boot_d0d07ba <-- the extracted database
drwxr-xr-x 21 lzx staff 672B 6 6 22:51 vscode-codeql-starter
1.4 Open vscode-codeql-starter in VS Code
File -> Open Workspace from File... -> select vscode-codeql-starter.code-workspace
1.5 Import the U-Boot database
Choose "From a folder" (because I already unzipped the archive earlier). Reference: choosing-a-database https://codeql.github.com/docs/codeql-for-visual-studio-code/analyzing-your-projects/#choosing-a-database
This database corresponds to U-Boot revision d0d07ba86afc8074d79e436b1ba4478fa0f0c1b5: https://github.com/u-boot/u-boot/tree/d0d07ba86afc8074d79e436b1ba4478fa0f0c1b5
To build the database yourself, clone that revision of the code and then create the database with codeql-cli:
proxychains4 git clone https://github.com/u-boot/u-boot.git
cd u-boot
git reset --hard d0d07ba86afc8074d79e436b1ba4478fa0f0c1b
1.6 Clone the course repository
git clone https://github.com/hluwa/codeql-uboot.git
1.7 Add the course repository to the starter workspace
File -> Add Folder to Workspace...
After it is added, the course folder appears in the workspace. The course describes the challenge as follows:
The goal of this challenge is to find the 13 remote-code-execution vulnerabilities that our security researchers found in the U-Boot loader. The vulnerabilities can be triggered when U-Boot is configured to use the network for fetching the next stage boot resources. MITRE has issued the following CVEs for the 13 vulnerabilities: CVE-2019-14192, CVE-2019-14193, CVE-2019-14194, CVE-2019-14195, CVE-2019-14196, CVE-2019-14197, CVE-2019-14198, CVE-2019-14199, CVE-2019-14200, CVE-2019-14201, CVE-2019-14202, CVE-2019-14203, and CVE-2019-14204.
Through these vulnerabilities an attacker in the same network (or controlling a malicious NFS server) could gain code execution at the U-Boot powered device. The first two occurrences of the vulnerability were plain memcpy overflows with an attacker-controlled size coming from the network packet without any validation.
The memcpy function copies n bytes from memory area src to memory area dest. This can be unsafe when the size being parsed is not appropriately validated, allowing an attacker to fully control the data and length being passed through.
U-Boot contains hundreds of calls to memcpy and libc functions that read from the network such as ntohl and ntohs. In this challenge, you will use CodeQL to find those calls. Of course many of those calls are safe, so throughout this challenge you will refine your query to reduce the number of false positives.
Upon completion of the challenge, you will have a query that is able to find many of the vulnerabilities that allow for remote execution of arbitrary code on U-Boot powered devices.
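To make the "attacker-controlled size" pattern concrete, here is a minimal C sketch. It is not U-Boot code: the packet layout (a 4-byte big-endian length followed by the payload), the name copy_payload, and DEST_SIZE are all assumptions made for illustration, and ntohl is taken from the POSIX header arpa/inet.h. The vulnerable variant would pass the converted length straight to memcpy; the sketch shows the two checks that are missing in that case.

```c
#include <arpa/inet.h>  /* ntohl; assumes a POSIX system */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEST_SIZE 64  /* hypothetical destination buffer size */

/*
 * Hypothetical packet layout: a 4-byte big-endian length, then the payload.
 * Returns the number of payload bytes copied, or -1 if the packet is rejected.
 */
static long copy_payload(uint8_t dest[DEST_SIZE], const uint8_t *pkt, size_t pkt_len)
{
    uint32_t raw, n;

    if (pkt_len < sizeof(raw))
        return -1;                    /* too short to hold the length field */
    memcpy(&raw, pkt, sizeof(raw));
    n = ntohl(raw);                   /* attacker-controlled value */

    /* The vulnerable pattern is memcpy(dest, pkt + 4, n) without these checks. */
    if (n > DEST_SIZE || n > pkt_len - sizeof(raw))
        return -1;                    /* would overflow dest or read past pkt */

    memcpy(dest, pkt + sizeof(raw), n);
    return (long)n;
}
```

A packet that claims a length of 0xffffffff is rejected here, whereas the unchecked version would copy far past the 64-byte destination.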
2
Step 0: Finding the definition of memcpy, ntohl, ntohll, and ntohs
2.1 Find all function definitions named strlen
The query goes in 3_function_definitions.ql:
import cpp
from Function f
where f.getName() = "strlen"
select f, "a function named strlen"
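Right-click 3_function_definitions.ql -> CodeQL: Run Queries in Selected Files.
import cpp: imports the C/C++ standard library.
from Function f: declares a variable f of class Function.
where f.getName() = "strlen": f.getName() returns the name of f, so only the Functions named strlen are selected.
select f, "a function named strlen": select chooses what to display in the results, with columns separated by commas. Yes, much like SQL.
2.2 Find all function definitions named memcpy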
import cpp
from Function f
where f.getName() = "memcpy"
select f, "a function named memcpy"
2.3 Find all function or macro definitions named ntohl, ntohll, and ntohs
Hints from the course:
- ntohl, ntohll, and ntohs can either be functions or macros (depending on the platform where the code is compiled).
- Press Ctrl-Space after the from clause to get the list of objects you can query. Wait a second after typing myObject. to get the list of methods.
Hmm... is that in the query console? Ctrl-Space?
With a regular expression, all three macro definitions can be found in a single query:
import cpp
from Macro m
where m.getName().regexpMatch("ntoh(l|ll|s)")
select m, "ntohl, ntohll, and ntohs"
The same query can also be written with a set literal instead of a regular expression:
import cpp
from Macro m
// Set-literal pattern: where <your_variable_name> in ["bar", "baz", "quux"]
where m.getName() in ["ntohs", "ntohl", "ntohll"]
select m, "ntohl, ntohll, and ntohs 22222"
The ntoh family of functions is typically used to convert values from network byte order to host byte order.
References:
https://bestwing.me/codeql.html
https://milkii0.github.io/2022/06/10/CodeQLU-BootChallenge%20(CC++)/
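As a rough illustration of what the ntoh family computes, the following C sketch assembles integers from bytes in network (big-endian) order. The helper names net_to_host16 and net_to_host32 are made up for this example; real code would normally call ntohs and ntohl, which give the same values for the same input bytes.

```c
#include <stdint.h>

/* Build a 16-bit value from two bytes in network (big-endian) order. */
static uint16_t net_to_host16(const uint8_t b[2])
{
    return (uint16_t)(((uint16_t)b[0] << 8) | (uint16_t)b[1]);
}

/* Build a 32-bit value from four bytes in network (big-endian) order. */
static uint32_t net_to_host32(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```

For example, the bytes 0x01 0x00 on the wire stand for 256, whatever the byte order of the host. Since those bytes come from the packet, the resulting number is fully chosen by the sender.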
3
Step 1: Finding the calls to memcpy, ntohl, ntohll, and ntohs
3.1 Find all calls to memcpy
Goal: find all the calls to memcpy.
import cpp
from FunctionCall fc
// FunctionCall.getTarget() returns a Function: the function that the call fc invokes
where fc.getTarget().getName() = "memcpy" // keep the calls whose target is named memcpy
select fc
3.2 Find all calls to ntohl, ntohll, and ntohs
Goal: find all the calls to ntohl, ntohll, and ntohs.
Hint: calls to ntohl, ntohll, and ntohs are macro invocations, unlike memcpy which is a function call.
- MacroInvocation is the counterpart of the FunctionCall used above.
- MacroInvocation.getMacro() returns the macro that the invocation refers to.
import cpp
from MacroInvocation mi
where mi.getMacro().getName().regexpMatch("ntoh(l|ll|s)")
select mi
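3.3 Find the expressions that correspond to the macro invocations above
The results of mi.getExpr() should be a subset of mi. Sure enough, after running the query and skimming the results, the count and the content match Q1.1.
One open question: what is a top-level expression? As far as I can tell from the MacroInvocation documentation, getExpr() returns the outermost expression that the macro invocation expands to, and has no result when the expansion is not an expression (for example, a statement).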
import cpp
from MacroInvocation mi
where mi.getMacro().getName().regexpMatch("ntoh(l|ll|s)")
select mi.getExpr()
4
Step 2: Data flow analysis
The course introduces a predicate called hasFlowPath that will tell us when some data coming from a source flows to a sink. Use the boilerplate provided below to complete your taint tracking query.
(The boilerplate is shown under Q2.1 below. I mention it because I added some material to Q2.0, so you might not spot it right away.)
Q2.0: write a class for the expressions that correspond to invocations of ntohl, ntohll, and ntohs.
import cpp
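// Defining a class:
// 1. It needs the class keyword
// 2. The class name must start with an uppercase letter
// 3. Its supertypes are declared with the extends or instanceof keyword
// 4. The class body must be enclosed in braces
class MyMacroInvocation extends MacroInvocation { // this class extends MacroInvocation
MacroInvocation mi; // field: a macro invocation
MyMacroInvocation() { // characteristic predicate, similar to a constructor
// mi satisfies the condition below, and this equals mi
mi.getMacro().getName().regexpMatch("ntoh(l|ll|s)") and this = mi
}
}
from MyMacroInvocation mmi
select mmi.getExpr() // the expressions of the macro invocations that satisfy the condition above
// Solution 3: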
import cpp
class MyExpr extends Expr {
MacroInvocation mi;
MyExpr(){
mi.getMacro().getName().regexpMatch("ntoh(l|ll|s)") and this = mi.getExpr()
}
}
from MyExpr me
select me, "33333"
import cpp
class NetworkByteSwap extends Expr {
NetworkByteSwap() {
exists(MacroInvocation mi | mi.getMacro().getName().regexpMatch("ntoh.*") | mi.getExpr() = this)
}
}
from NetworkByteSwap n
select n, "Network byte swap"
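Q2.1: the source should be calls to ntohl, ntohll, or ntohs. The sink should be the size argument of an unsafe call to memcpy.
The ntoh family of functions converts values from network byte order to host byte order. So the idea here is that an ntoh* call turns some field of an externally supplied packet into a number, and that number may end up as the size argument of memcpy. The copy length is then under the attacker's control, which is a security risk.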
/**
* @kind path-problem
*/
import cpp
import semmle.code.cpp.dataflow.TaintTracking
import DataFlow::PathGraph
class YOUR_CLASS_HERE extends Expr {
// 2.0 Todo
}
class Config extends TaintTracking::Configuration {
Config() { this = "NetworkToMemFuncLength" }
override predicate isSource(DataFlow::Node source) {
// 2.1 Todo
}
override predicate isSink(DataFlow::Node sink) {
// 2.1 Todo
}
}
from Config cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "ntoh flows to memcpy"
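// 1. Declare three variables: cfg of type Config, and source and sink of type PathNode
from Config cfg, DataFlow::PathNode source, DataFlow::PathNode sink
// 2. Pass source and sink to hasFlowPath which, as the name suggests, holds when there is a data flow path from source to sink
where cfg.hasFlowPath(source, sink)
// 3. Output sink and source. Why is sink selected twice? In a @kind path-problem query the
// select columns are: the element the alert is reported on, the path source, the path sink,
// and the message. Here the alert is reported at the sink, so sink appears twice.
select sink, source, sink, "ntoh flows to memcpy"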
/**
* Holds if data may flow from `source` to `sink` for this configuration.
*
* The corresponding paths are generated from the end-points and the graph
* included in the module `PathGraph`.
*/
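This is the library's documentation comment for hasFlowPath: the paths it reports are built from the end-points and the graph provided by the PathGraph module.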
class Config extends TaintTracking::Configuration {
Config() { this = "NetworkToMemFuncLength" }
override predicate isSource(DataFlow::Node source) {
// 2.1 Todo
}
override predicate isSink(DataFlow::Node sink) {
// 2.1 Todo
}
}
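Summarizing what the official Configuration documentation (https://codeql.github.com/codeql-standard-libraries/cpp/semmle/code/cpp/dataflow/internal/tainttracking1/TaintTrackingImpl.qll/type.TaintTrackingImpl$Configuration.html) says about this class:
1. The class defines the configurable options of an analysis, such as its sources and sinks. In other words, it is a class whose predicates define how data flows between source and sink.
2. To create a configuration, write a subclass of this class. Its characteristic predicate (think of it as a constructor) binds this to a unique string, such as NetworkToMemFuncLength in the template.
3. Among the member predicates that can be overridden (think of them as member functions), isSource and isSink must be overridden; all the others are optional.
4. To ask whether there is flow from source to sink, write: exists(MyAnalysisConfiguration cfg | cfg.hasFlow(source, sink))
5. Multiple Configurations can coexist, but an overridden predicate of one Configuration class may not depend on another Configuration. Hmm... I would need to see an example to know how to write that.
My first guess was that isSource and isSink, being predicates (functions) whose names start with "is", should test what conditions the source/sink satisfies and then return true or false. But a quick look at the official predicate documentation (https://codeql.github.com/docs/ql-language-reference/predicates/) and its examples shows that it is enough to state the conditions that must hold:
Predicates are used to describe the logical relations that make up a QL program.
Strictly speaking, a predicate evaluates to a set of tuples.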
predicate isCountry(string country) {
country = "Germany"
or
country = "Belgium"
or
country = "France"
}
/**
* @kind path-problem
*/
import cpp
import semmle.code.cpp.dataflow.TaintTracking
import DataFlow::PathGraph
class MyExpr extends Expr {
// 2.0 Todo
MacroInvocation mi;
MyExpr(){
mi.getMacro().getName().regexpMatch("ntoh(l|ll|s)") and this = mi.getExpr()
}
}
class Config extends TaintTracking::Configuration {
Config() { this = "NetworkToMemFuncLength" }
override predicate isSource(DataFlow::Node source) {
// 2.1 Todo
source.asExpr() instanceof MyExpr
}
override predicate isSink(DataFlow::Node sink) {
// 2.1 Todo
// there is a call to memcpy whose third argument (index 2, the size) is sink.asExpr()
exists(FunctionCall fc | fc.getTarget().hasName("memcpy") | sink.asExpr() = fc.getArgument(2))
}
}
from Config cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "ntoh flows to memcpy"
5
Step 3: Find additional vulnerabilities
Auditing the code based on the Q2.1 query results
CVE-2019-14192, CVE-2019-14193, and CVE-2019-14199
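As background for the integer underflow discussed in this group, here is a minimal C sketch of a "length field minus header size" computation. It is an illustration, not the U-Boot source: the function names are invented, and the only assumption about the real code is that a 16-bit length taken from the packet has a fixed header size subtracted from it before being used as an unsigned length.

```c
#include <stdint.h>

#define UDP_HDR_SIZE 8u  /* a UDP header is 8 bytes */

/* Unchecked: the subtraction wraps around when udp_len is smaller than 8. */
static unsigned payload_len_unchecked(uint16_t udp_len)
{
    return udp_len - UDP_HDR_SIZE;
}

/* Checked: stores the payload length and returns 0, or returns -1 for a bogus header. */
static int payload_len_checked(uint16_t udp_len, unsigned *out)
{
    if (udp_len < UDP_HDR_SIZE)
        return -1;
    *out = udp_len - UDP_HDR_SIZE;
    return 0;
}
```

With udp_len = 3 the unchecked version yields a value close to the maximum of unsigned, and any handler that trusts it as a copy length will run far past its buffers.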
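One result points to the end of net_process_received_packet, where an integer underflow can clearly occur.
Another result is in nc_input_packet, but line 149 checks the size of len, so I think that even if a very large len is passed in, the memcpy here will not write out of bounds. However, the write-up U-Boot NFS RCE Vulnerabilities (CVE-2019-14192) (https://securitylab.github.com/research/uboot-rce-nfs-vulnerability/) says that an out-of-bounds write does happen here. It spends only one sentence on it, and I did not follow.
CVE-2019-14197, CVE-2019-14200, CVE-2019-14201, CVE-2019-14202, CVE-2019-14203, and CVE-2019-14204
The call through udp_packet_handler does not show up in the QL query results.
udp_packet_handler is assigned in only two places.
Following net_set_udp_handler shows that handlers are registered for nc, dhcp, bootp, and nfs.
The one that matters here is nfs_handler, so I audit only that. As the code shows, the function still does not validate len and passes it straight to a different function in each branch:
static void nfs_handler(uchar *pkt, unsigned dest, struct in_addr sip,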
unsigned src, unsigned len)
{
int rlen;
int reply;
debug("%s\n", __func__);
if (dest != nfs_our_port)
return;
switch (nfs_state) {
case STATE_PRCLOOKUP_PROG_MOUNT_REQ:
if (rpc_lookup_reply(PROG_MOUNT, pkt, len) == -NFS_RPC_DROP)
break;
......
break;
case STATE_PRCLOOKUP_PROG_NFS_REQ:
if (rpc_lookup_reply(PROG_NFS, pkt, len) == -NFS_RPC_DROP)
break;
......
break;
case STATE_MOUNT_REQ:
reply = nfs_mount_reply(pkt, len);
......
break;
case STATE_UMOUNT_REQ:
reply = nfs_umountall_reply(pkt, len);
......
break;
case STATE_LOOKUP_REQ:
reply = nfs_lookup_reply(pkt, len);
......
break;
case STATE_READLINK_REQ:
reply = nfs_readlink_reply(pkt, len);
......
break;
case STATE_READ_REQ:
rlen = nfs_read_reply(pkt, len);
......
break;
}
}
- rpc_lookup_reply -> len is not validated before memcpy(&rpc_pkt.u.data[0], pkt, len); -> out-of-bounds write
- nfs_mount_reply -> len is not validated before memcpy(&rpc_pkt.u.data[0], pkt, len); -> out-of-bounds write
- nfs_umountall_reply -> len is not validated before memcpy(&rpc_pkt.u.data[0], pkt, len); -> out-of-bounds write
- nfs_lookup_reply -> len is not validated before memcpy(&rpc_pkt.u.data[0], pkt, len); -> out-of-bounds write
- nfs_readlink_reply -> len is not validated before memcpy((unsigned char *)&rpc_pkt, pkt, len); -> out-of-bounds write
- nfs_read_reply -> I nearly missed this one: memcpy(&rpc_pkt.u.data[0], pkt, sizeof(rpc_pkt.u.reply)); pkt may be smaller than sizeof(rpc_pkt.u.reply), so this is an out-of-bounds read.
struct rpc_t {
union {
uint8_t data[NFS_READ_SIZE + (6 + NFS_MAX_ATTRS) *
sizeof(uint32_t)];
struct {
uint32_t id;
uint32_t type;
uint32_t rpcvers;
uint32_t prog;
uint32_t vers;
uint32_t proc;
uint32_t data[1];
} call;
struct {
uint32_t id;
uint32_t type;
uint32_t rstatus;
uint32_t verifier;
uint32_t v2;
uint32_t astatus;
uint32_t data[NFS_READ_SIZE / sizeof(uint32_t) +
NFS_MAX_ATTRS];
} reply;
} u;
} __attribute__((packed));
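The two problems listed for the reply functions (a copy longer than the destination, and a fixed-size copy from a packet that may be shorter) can both be avoided with explicit bounds checks. The sketch below is one possible shape for such checks, not the fix that U-Boot adopted: struct reply_buf, REPLY_BUF_SIZE, copy_reply, and copy_fixed are hypothetical names, with reply_buf standing in for the rpc_pkt.u.data buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REPLY_BUF_SIZE 128  /* hypothetical; stands in for sizeof(rpc_pkt.u.data) */

struct reply_buf {
    uint8_t data[REPLY_BUF_SIZE];
};

/* Variable-length copy: refuse packets that do not fit (prevents the overflow write). */
static int copy_reply(struct reply_buf *dst, const uint8_t *pkt, size_t len)
{
    if (len > sizeof(dst->data))
        return -1;
    memcpy(dst->data, pkt, len);
    return 0;
}

/* Fixed-size copy: refuse packets shorter than `need` (prevents the out-of-bounds read). */
static int copy_fixed(struct reply_buf *dst, const uint8_t *pkt, size_t len, size_t need)
{
    if (need > sizeof(dst->data) || len < need)
        return -1;
    memcpy(dst->data, pkt, need);
    return 0;
}
```

Both helpers return -1 without touching memory when the lengths do not add up, so a caller can simply drop the packet.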
CVE-2019-14195
CVE-2019-14196
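This entry concerns a negative length reaching memcpy with a 64-byte destination (filefh). The C sketch below shows why a negative signed length is dangerous once it is converted to size_t, along with a guarded copy. The helper names as_memcpy_size and copy_filehandle are invented for illustration; FILEFH_SIZE uses the 64 bytes stated in this post.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILEFH_SIZE 64  /* length of filefh as given in this post */

/* The value memcpy actually sees when a signed length is passed as its size. */
static size_t as_memcpy_size(int32_t len)
{
    return (size_t)len;
}

/* Guarded copy: reject negative and oversized lengths before calling memcpy. */
static int copy_filehandle(uint8_t dst[FILEFH_SIZE], const uint8_t *src, int32_t len)
{
    if (len < 0 || len > FILEFH_SIZE)
        return -1;
    memcpy(dst, src, (size_t)len);
    return 0;
}
```

A length of -1 becomes SIZE_MAX after the conversion, so a check of the form "len > limit" on the signed value alone is not enough; the negative case has to be rejected explicitly.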
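In nfs_lookup_reply(), the destination buffer filefh is 64 bytes long, so a negative length leads to an out-of-bounds write. One more vulnerability.
CVE-2019-14194/CVE-2019-14198
In both the if and the else branch of nfs_read_reply, the attacker-controlled rlen is passed to store_block, which uses it in a memcpy, so an out-of-bounds write of arbitrary length is possible. The flash_write call on line 96 has the same problem.
The characteristics of ntoh*?
6
Summary
Open questions
- What is going on with CVE-2019-14192, CVE-2019-14193, and CVE-2019-14199?
- Q3.0 says the query results lead to 9 vulnerabilities. Which 9? Shouldn't it be 13 minus the 6 handler vulnerabilities?
- Q3.1?
- How can the cases that go through a registered handler be queried?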
Kanxue ID: ztree
https://bbs.kanxue.com/user-home-830671.htm
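Originally published on the WeChat official account 看雪学苑 (Kanxue Academy): CodeQL入门 – U-Boot Challenge (Getting Started with CodeQL – U-Boot Challenge)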