Originally published by Mattia Zignale: Attacking and securing Docker containers
Docker Architecture
If you are reading this article, I assume you know how Docker works under the hood. Still, let's quickly recap the major concepts:
Docker Client
The client is the part users interact with most: it is the CLI to the Docker daemon and is used to manage containers, images and registries.
It can also be used to manage the networks and volumes used by containers.
Some of the commands you can use in the Docker client:
# Run an nginx container, binding host port 80 to container port 80
docker run -d -p 80:80 nginx
# Pull the Ubuntu 22.04 Docker image
docker pull ubuntu:22.04
# Show the Docker images on your system
docker images
# Show help for Docker network commands
docker network
# Show help for Docker volume commands
docker volume
Docker Daemon
The Docker daemon is the core of the Docker environment: it runs containers, pulls images and manages all networks and volumes.
The daemon listens for API requests. You can reach it through a client or, if you want to make things hard, by connecting directly to the Docker socket.
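To make the "connect directly to the socket" option concrete, here is a minimal sketch that queries the Docker Engine API with curl over the UNIX socket. The function name docker_api_get and the DOCKER_SOCK variable are assumptions made for illustration, not part of Docker; /version and /containers/json are Engine API endpoints.

```shell
# Minimal sketch: talk to the Docker Engine API without the docker CLI.
# docker_api_get and DOCKER_SOCK are hypothetical names used for illustration.
docker_api_get() {
    sock="${DOCKER_SOCK:-/var/run/docker.sock}"
    # Fail early if the path is not a UNIX socket.
    if [ ! -S "$sock" ]; then
        echo "no UNIX socket at $sock" >&2
        return 1
    fi
    # With --unix-socket, the host in the URL is only used for the Host header.
    curl -s --unix-socket "$sock" "http://localhost$1"
}

# Example calls (they need access to the socket):
#   docker_api_get /version
#   docker_api_get /containers/json
```

Anything the docker CLI does goes through this same API, which is why access to the socket is equivalent to control over the daemon.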
Docker Registry
The Docker registry is where the Docker images are stored.
By default Docker looks for images in Docker Hub, but if another registry is configured the images are pulled from there. A registry can be public, or private with authentication.
Threats to Docker environment
Docker socket
The Docker socket /var/run/docker.sock is the UNIX socket that the Docker daemon listens on, and it is owned by the root user. This is the primary entry point for the Docker API.
Root cause:
docker run -d -v /var/run/docker.sock:/var/run/docker.sock <image_name>:<image_tag>
An attacker can issue requests to the Docker daemon (running as root on the Docker host) in order to:
- start a privileged container and break out of it;
- pull a public backdoored image, run it, get inside it and break out.
Attack example:
- find / -name docker.sock 2>/dev/null
- docker images
- docker run -it -v /:/host <image_name>:<image_tag> chroot /host bash
The Docker socket can be local, as we saw above, but it can also be exposed to the internet if you run the Docker daemon with this flag:
-H tcp://0.0.0.0:XXX
In this case the Docker API listens on all interfaces on the port specified in XXX and can be reachable from the internet.
Privileged container
By default, Docker containers are “unprivileged” and cannot, for example, run a Docker daemon inside a Docker container.
The --privileged flag gives all capabilities to the container. When the operator executes docker run --privileged, Docker enables access to all devices on the host and adjusts the AppArmor or SELinux configuration so that the container has nearly the same access to the host as processes running outside containers on the host.
Root cause:
docker run -d --privileged <image_name>:<image_tag>
An attacker can mount the host file system and use chroot to access the host.
Attack example:
- capsh --print
- fdisk -l
- mount /target/disk /mnt/
- chroot /mnt/ bash
Capabilities abuse
In general, you should check the capabilities of the container. If it has any of the following, you might be able to break out of it or otherwise cause damage (the exploitation is quite long and differs in each scenario):
CAP_SYS_ADMIN
CAP_SYS_PTRACE
CAP_SYS_MODULE
CAP_DAC_READ_SEARCH
CAP_DAC_OVERRIDE
CAP_SYS_RAWIO
CAP_SYSLOG
CAP_NET_RAW
CAP_NET_ADMIN
You can check the container's current capabilities using:
capsh --print
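If capsh is not installed in the container, the same information can be read from the CapEff bitmask in /proc/self/status. The sketch below decodes that mask for the capabilities listed above. The function name risky_caps is an assumption for illustration; the bit numbers are the ones defined in linux/capability.h.

```shell
# Minimal sketch: report which of the risky capabilities are set in a
# hexadecimal capability mask (the CapEff value from /proc/<pid>/status).
# risky_caps is a hypothetical helper name.
risky_caps() {
    cap_mask="$1"
    # Each entry is <bit number>:<capability name>.
    for cap_entry in \
        1:CAP_DAC_OVERRIDE 2:CAP_DAC_READ_SEARCH 12:CAP_NET_ADMIN \
        13:CAP_NET_RAW 16:CAP_SYS_MODULE 17:CAP_SYS_RAWIO \
        19:CAP_SYS_PTRACE 21:CAP_SYS_ADMIN 34:CAP_SYSLOG
    do
        cap_bit="${cap_entry%%:*}"
        cap_name="${cap_entry#*:}"
        # Shift the mask right by the bit number and test the lowest bit.
        if [ $(( (0x$cap_mask >> cap_bit) & 1 )) -eq 1 ]; then
            echo "$cap_name"
        fi
    done
}

# Inside a container:
#   risky_caps "$(awk '/^CapEff:/ { print $2 }' /proc/self/status)"
```

For a container started with Docker's default capability set the mask is typically 00000000a80425fb, for which the sketch reports CAP_DAC_OVERRIDE and CAP_NET_RAW.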
Shared network
Root cause:
docker run -d --network host <image_name>:<image_tag>
An attacker with an initial foothold on the vulnerable container can pivot to other containers. If management containers such as Portainer are present, the attacker can use them to start a privileged container that can be used to break out.
You can discover the interface for the Docker network by running:
ip addr
The host machine usually creates an interface that acts as the gateway for the Docker network and, generally, the first IP address of the range is used for it. By default, the IP range for the Docker network is 172.17.0.0/16 and the host machine has the IP address 172.17.0.1. If the container itself has the IP address 172.17.0.1, it can be concluded that the container shares the host network namespace.
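The heuristic above can be sketched as a small check on the output of ip -o -4 addr show. It assumes the default bridge range 172.17.0.0/16 described here, so a custom bridge or a user-defined network would need a different address. The function name has_bridge_gateway_ip is hypothetical.

```shell
# Minimal sketch: succeed if any interface carries the default Docker bridge
# gateway address 172.17.0.1. Input is `ip -o -4 addr show` output on stdin,
# where the fourth field is <address>/<prefix length>.
has_bridge_gateway_ip() {
    awk '{
        split($4, addr, "/")
        if (addr[1] == "172.17.0.1") found = 1
    }
    END { exit (found ? 0 : 1) }'
}

# Inside a container:
#   ip -o -4 addr show | has_bridge_gateway_ip && echo "host network namespace likely shared"
```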
Abuse group membership
Members of the docker group can run Docker operations. Membership lets non-root users use Docker while the Docker daemon runs as root. However, this membership can be abused to perform privilege escalation on the Docker host machine.
Check whether the user can run Docker operations, and verify it by checking the /etc/group file:
- docker images
- cat /etc/group | grep docker
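The /etc/group check can be sketched as a helper that prints one member per line, which is easier to review than the raw grep output. The function name docker_group_members is an assumption for illustration. Note that users whose primary group is docker are not listed in the member field of /etc/group, so this is only a partial view.

```shell
# Minimal sketch: list the supplementary members of the docker group.
# $1 is the group file to read (defaults to /etc/group).
# docker_group_members is a hypothetical helper name.
docker_group_members() {
    awk -F: '$1 == "docker" {
        n = split($4, members, ",")
        for (i = 1; i <= n; i++)
            if (members[i] != "") print members[i]
    }' "${1:-/etc/group}"
}

# Usage:
#   docker_group_members
```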
Containerd abuse
If the containerd daemon is running on the host machine and the ctr utility is installed, it can be abused to attack the host and obtain root privileges.
A user on the host machine other than root can use the ctr utility if it is available to them.
The attacker makes requests to the containerd daemon, which runs with root privileges, runs a container with the host file system mounted into it, then uses chroot to get root access from inside the container.
runc abuse
If the runc container runtime is installed on the machine and anyone can run the runc command, it can be abused to attack the host and obtain root privileges.
runc is a command-line runtime rather than a daemon, so the attacker invokes it directly: with runc running with root privileges, the attacker runs a container with the host file system mounted into it, then uses chroot to get root access from inside the container.
Management tool abuse
If a management tool, e.g. Portainer, is running with a weak password or is itself vulnerable, that could lead to takeover of the host machine.
An attacker can log into it, launch a privileged container and, from inside the newly launched container, chroot to the host file system.
Insecure Docker registry
If a private Docker registry does not use authentication, or its authentication is not strong enough, it can be attacked.
An attacker can interact with the Docker registry and the machine hosting it, and download images without using Docker.
Information gathering:
- curl http://<target_registry>:5000/v2/_catalog
- curl http://<target_registry>:5000/v2/<repository_name>/tags/list
- curl http://<target_registry>:5000/v2/<repository_name>/manifests/<tag>
- curl -s http://<target_registry>:5000/v2/<repository_name>/blobs/<blobSum_value> --output <output_filename>.tar
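The first two requests above can be chained to walk a whole registry. The following is a minimal sketch under stated assumptions: parse_catalog and enumerate_registry are hypothetical helper names, and the parsing assumes the flat {"repositories":[...]} response shape with no escaped characters (a real JSON parser such as jq would be more robust).

```shell
# Extract repository names, one per line, from a /v2/_catalog response
# read on stdin. Simplified text processing, not a JSON parser.
parse_catalog() {
    sed -e 's/.*\[//' -e 's/].*//' -e 's/"//g' | tr ',' '\n'
}

# Print the tag list of every repository on a registry.
# $1: registry base URL, e.g. http://<target_registry>:5000
enumerate_registry() {
    curl -s "$1/v2/_catalog" | parse_catalog | while read -r repo; do
        # Skip the empty line produced by an empty catalog.
        [ -n "$repo" ] || continue
        echo "== $repo"
        curl -s "$1/v2/$repo/tags/list"
        echo
    done
}

# Usage:
#   enumerate_registry http://<target_registry>:5000
```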
Fake images (backdoor)
The attacker can pull the original image from the registry, study its structure and build a new image with a backdoor. The attacker can then overwrite the image in the Docker registry and wait for someone or something to deploy it.
Alternatively, the attacker can pull a multi-purpose image from the registry, study its structure and modify it to provide shell access after deployment. The attacker can then overwrite the image in the Docker registry and wait for it to be deployed. At this point, the techniques described in this article can be used to get root access on the Docker host.
Auditing and security on Docker
Audit Docker socket and secure it
- Check permissions of UNIX socket: others should not have any access.
- Check for TCP socket and ensure that either the TCP socket is listening only on local interface or authentication is deployed on it.
The default Docker TCP socket, e.g. tcp://<host_ip>:2375, is not protected, so we need to implement:
- Authentication.
- Encrypted channel.
How?
- Using SSH.
- Using TLS (TCP port 2376).
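The two socket checks above can be sketched as follows. The helper names are assumptions for illustration, stat -c is the GNU coreutils form, and the TCP check only looks at the conventional ports 2375 and 2376, so a daemon started on another port would need the list adjusted.

```shell
# True if an octal mode string (as printed by `stat -c %a`) grants any
# permission to "others".
others_have_access() {
    [ $(( 0$1 & 7 )) -ne 0 ]
}

# Reads `ss -ltn` output on stdin and prints listeners on ports 2375/2376
# that are not bound to a loopback address.
exposed_docker_tcp() {
    awk '$1 == "LISTEN" {
        n = split($4, parts, ":")
        if ((parts[n] == "2375" || parts[n] == "2376") &&
            $4 !~ /^127\./ && $4 !~ /^\[::1\]/)
            print $4
    }'
}

# Hypothetical wrapper that runs both checks on the local host.
audit_docker_socket() {
    sock="${1:-/var/run/docker.sock}"
    mode=$(stat -c '%a' "$sock") || return 1
    if others_have_access "$mode"; then
        echo "WARN: $sock has mode $mode (others have access)"
    else
        echo "OK: $sock has mode $mode"
    fi
    ss -ltn | exposed_docker_tcp | sed 's/^/WARN: Docker TCP listener on /'
}

# Remote access with authentication and encryption (hypothetical host and user):
#   docker -H ssh://admin@docker-host.example info
#   docker --tlsverify -H tcp://docker-host.example:2376 info
```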
Audit Docker group
- Check the members (/etc/group): non-members won't be able to access Docker through the socket.
Audit Docker configuration
- Check for allowed insecure registries (/etc/docker/daemon.json).
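A rough way to run this check is to extract the insecure-registries entries from the daemon configuration. The sketch below uses plain text processing rather than a JSON parser, so it assumes a simple array of strings without escaped characters; list_insecure_registries is a hypothetical helper name.

```shell
# Minimal sketch: print each entry of "insecure-registries", one per line.
# $1 is the configuration file (defaults to /etc/docker/daemon.json).
list_insecure_registries() {
    cfg="${1:-/etc/docker/daemon.json}"
    # No readable configuration file means nothing to report.
    [ -r "$cfg" ] || return 0
    # Collapse the file to one line, capture the array body, split on commas.
    tr -d ' \n\t' < "$cfg" \
        | sed -n 's/.*"insecure-registries":\[\([^]]*\)].*/\1/p' \
        | tr -d '"' | tr ',' '\n' | sed '/^$/d'
}

# Usage:
#   list_insecure_registries
```

Any output deserves a second look, since registries listed there are contacted without TLS verification.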
Tools for Docker Security
Securing containers
Seccomp
Seccomp (Secure Computing mode) is a feature of the Linux kernel that acts as a syscall filter. It is not a sandbox by itself, but it is often used with sandboxes to block syscalls.
Docker supports seccomp-BPF (mode 2, filter mode) to restrict the syscalls available to containers, effectively decreasing the attack surface. A seccomp profile is defined in the form of a JSON file. Docker's default seccomp profile is an allowlist that leaves around 44 syscalls disabled.
Check out this link to learn more.
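To see which seccomp mode applies to a process, the kernel exposes a Seccomp field in /proc/<pid>/status. The sketch below translates that number into a name; seccomp_mode_name is a hypothetical helper, and the profile path in the comment is a placeholder.

```shell
# Minimal sketch: map the Seccomp field of /proc/<pid>/status to a name.
seccomp_mode_name() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "strict" ;;
        2) echo "filter" ;;
        *) echo "unknown" ;;
    esac
}

# Inside a container:
#   seccomp_mode_name "$(awk '/^Seccomp:/ { print $2 }' /proc/self/status)"
#
# Running a container with a custom profile (placeholder path):
#   docker run --security-opt seccomp=/path/to/profile.json <image_name>:<image_tag>
```

A container running under a seccomp profile should report "filter", which corresponds to mode 2.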
AppArmor
AppArmor is a kernel enhancement that confines programs to a limited set of resources. It binds access control attributes to programs rather than to users.
A deep-dive article on AppArmor will come soon!
User namespace remapping
This technique maps a container user to a user ID that does not exist on the host.
E.g. root in the container (UID = 0) is the same user as UID = 27072 on the host; as that user doesn't exist on the host, it has no privileges!
Impacts:
- no privileged mode;
- no chrooting using bind mount.
Learn more: https://docs.docker.com/engine/security/userns-remap/
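The mapping can be worked through with a little arithmetic. A subordinate ID entry has the form name:first_id:count, and a container UID maps to first_id plus that UID as long as it falls inside the range. The sketch below reuses the 27072 figure from the example above as an assumed first ID; map_container_uid is a hypothetical helper name.

```shell
# Minimal sketch: compute the host UID for a container UID.
# $1: an /etc/subuid-style entry, e.g. dockremap:27072:65536
# $2: the UID inside the container
map_container_uid() {
    first_id=$(printf '%s' "$1" | cut -d: -f2)
    id_count=$(printf '%s' "$1" | cut -d: -f3)
    # UIDs outside the delegated range have no mapping on the host.
    if [ "$2" -ge "$id_count" ]; then
        echo "unmapped" >&2
        return 1
    fi
    echo $(( first_id + $2 ))
}

# Usage:
#   map_container_uid 'dockremap:27072:65536' 0
#
# Enabling the feature on the daemon:
#   dockerd --userns-remap=default
```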
Rootless mode
It is possible to run the Docker daemon in rootless mode. Unfortunately, there are some known limitations: https://docs.docker.com/engine/security/rootless/#known-limitations
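One way to confirm that a daemon is actually running rootless is to look at the security options it reports. The sketch below assumes the rootless entry appears as name=rootless in that output; is_rootless is a hypothetical helper name.

```shell
# Minimal sketch: succeed if the given security options string mentions
# rootless mode. The name=rootless form is an assumption about the output.
is_rootless() {
    case "$1" in
        *name=rootless*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   is_rootless "$(docker info --format '{{.SecurityOptions}}')" && echo "rootless daemon"
```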
Conclusions
- Docker containers provide a secure method of isolating applications from the underlying host system, if not misused.
- It is important to ensure that all images used in Docker are from trusted sources and are regularly updated.
- Securing the Docker daemon, container images, and containers is an important step in ensuring that applications running in Docker are secure.
- Implementing role-based access control and monitoring can help ensure that Docker is used securely.
- Do not forget security on the host!