How to debug a Docker container from a separate container

Time:2020-2-3

Containers are great for encapsulating software, but you can go too far by blindly stripping the container image down to make it as small as possible. We need to find a good balance between “clean” images and images that cannot be debugged.

The usual way people debug a running container is to run docker exec -it $container sh and install debugging tools in the container as needed. But what if your container doesn’t have /bin/sh? What if there is no package manager? You can use docker cp to copy utilities into the container and then docker exec them there, but this is also cumbersome.

So a friend recently asked not how to debug from inside a container, but how to debug it from another container. I’m not that smart, so I asked a lot of smart people online and got some good answers.

Let’s create a minimal container that contains only caddy.

First, download and extract the caddy binary

$: curl https://getcaddy.com | bash -s personal && mv /usr/local/bin/caddy .

Then create a Dockerfile that copies the binary into an empty (scratch) image.

FROM scratch
ADD caddy /

Build the image

$: docker build -t caddy .
<output trimmed>

Now run this container

$: docker run -d --name caddy -p 2015:2015 caddy /caddy

Caddy is now running and publishing port 2015 (it currently serves 404 pages because there is no content, but that doesn’t matter). So how do you debug this container? Caddy isn’t buggy, so you don’t really need to, but let’s do it for the sake of argument.

Many people recommend using --link, but this only places the containers on the same network. They do not share namespaces; they are merely connected to each other on the same virtual network.

$: docker run -it --rm --link caddy:caddy alpine sh
/ # ping caddy -c 1
PING caddy (172.30.238.2): 56 data bytes
64 bytes from 172.30.238.2: seq=0 ttl=64 time=0.075 ms
/ # ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 sh
    8 root       0:00 ps aux

Others recommend using --volumes-from, but this does not let you install tools into an already running container unless that container exports a volume and the volume is already in its $PATH.

Instead, we’ll build a separate container with all the tools we need (in this case, strace) and run it in the same PID and network namespaces as the original container.

First, create a debugging image with strace

FROM alpine
RUN apk update && apk add strace
CMD ["strace", "-p", "1"]

Build the image

$: docker build -t strace .
<output trimmed>

Now, run the strace container in the same PID and network namespace.

$: docker run -t --pid=container:caddy \
  --net=container:caddy \
  --cap-add sys_admin \
  --cap-add sys_ptrace \
  strace
strace: Process 1 attached
futex(0xd72e90, FUTEX_WAIT, 0, NULL

strace attaches to the caddy process and follows it as it executes.
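
The long docker run invocation can be wrapped in a small shell helper. The sketch below is one possible way to do that, not something from the original workflow: the function name debug_cmd is made up for illustration, and the -it and --rm flags are a convenience choice. The namespace and capability flags are the ones used in this article. The helper only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): print the docker run command
# that joins TARGET's PID and network namespaces using a debug IMAGE.
# Usage: debug_cmd TARGET IMAGE [ARGS...]
debug_cmd() {
    target=$1
    image=$2
    shift 2
    # Namespace and capability flags as used in this article;
    # -it and --rm are assumptions added for interactive use.
    echo docker run -it --rm \
        "--pid=container:$target" \
        "--net=container:$target" \
        --cap-add sys_admin \
        --cap-add sys_ptrace \
        "$image" "$@"
}

debug_cmd caddy strace
debug_cmd caddy alpine sh
```

The printed line can be pasted into a terminal once it looks right. Dropping one of the --cap-add lines is a reasonable tweak when the tool in question does not need that capability.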

That’s good, but we can also get at the root file system of the target container (not that there is much in it). This time, we’ll use the alpine image and start a shell, again in the same PID and network namespaces.

$: docker run -it --pid=container:caddy \
  --net=container:caddy \
  --cap-add sys_admin \
  alpine sh

Now we can see caddy running alongside our own processes:

/ # ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 /caddy
   13 root       0:00 strace -p 1
   34 root       0:00 sh
   40 root       0:00 ps aux
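
To double-check that the debug shell really shares namespaces with caddy, one option is to compare the namespace links under /proc. This is only a sketch: same_ns is a name made up here, and it assumes /proc/<pid>/ns is readable from the debug shell, which may itself depend on the capabilities that were granted.

```shell
#!/bin/sh
# Hypothetical helper: succeed if two PIDs are in the same namespace of the
# given type (pid, net, mnt, ...). Each /proc/<pid>/ns/<type> is a symlink
# whose target identifies the namespace; equal targets mean it is shared.
# Returns 2 if either link cannot be read.
same_ns() {
    a=$(readlink "/proc/$1/ns/$3") || return 2
    b=$(readlink "/proc/$2/ns/$3") || return 2
    [ "$a" = "$b" ]
}

# Inside the debug shell: compare caddy (PID 1) with the current shell.
for ns in pid net mnt; do
    if same_ns 1 "$$" "$ns"; then
        echo "$ns: shared"
    else
        echo "$ns: not shared (or not readable)"
    fi
done
```

With the docker run flags used here, pid and net would be expected to report as shared, while mnt should not, since the alpine container still has its own root file system.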

The caddy container’s file system is available under /proc/1/root

/ # ls -l /proc/1/root/caddy 
-rwxr-xr-x    1 root     root      16099400 Jan 24 15:30 /proc/1/root/caddy
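
Because the target’s files are visible under /proc/<pid>/root, tools from the debug container can be pointed at them directly. A minimal sketch follows; target_path is a hypothetical helper name, and PID 1 is assumed to be caddy, as in the ps output above.

```shell
#!/bin/sh
# Hypothetical helper: translate an absolute path inside the target container
# into the path as seen from the debug container.
# Usage: target_path PID /absolute/path
target_path() {
    printf '/proc/%s/root%s\n' "$1" "$2"
}

target_path 1 /caddy        # prints /proc/1/root/caddy

# Possible uses inside the debug shell (left commented out because they
# only make sense in the shared PID namespace):
# cp "$(target_path 1 /caddy)" /tmp/caddy.copy
# sha256sum "$(target_path 1 /caddy)"
```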

After attaching this container to the original one, we can do more debugging. You can still debug the network, but make sure you use localhost, because the new sh process is running in the same network namespace.

/ # apk update && apk add curl lsof
/ # curl localhost:2015
404 Not Found
/ # lsof -i TCP
COMMAND PID USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
caddy     1 root    4u  IPv6 330044347      0t0  TCP *:2015 (LISTEN)

All of your standard debugging tools should work from the second container without polluting the original container. If you run into errors, check the kernel capabilities (note how strace needs --cap-add sys_ptrace, while the sh container only needs sys_admin).
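
When a tool fails with a permission error, it can help to see which capabilities the debug shell actually holds. The kernel exposes the effective set as a hex bitmask on the CapEff line of /proc/self/status; in linux/capability.h, CAP_SYS_PTRACE is bit 19 and CAP_SYS_ADMIN is bit 21. The sketch below decodes that mask. has_cap is a made-up helper name, and the mask in the comment is only an example value.

```shell
#!/bin/sh
# Hypothetical helper: succeed if bit BIT is set in the hex capability mask MASK.
# Usage: has_cap MASK BIT
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Effective capabilities of the current shell, e.g. "00000000a80425fb".
mask=$(awk '/^CapEff:/ {print $2}' /proc/self/status 2>/dev/null)
mask=${mask:-0}   # fall back to an empty set if the line could not be read

for entry in sys_ptrace:19 sys_admin:21; do
    name=${entry%%:*}
    bit=${entry##*:}
    if has_cap "$mask" "$bit"; then
        echo "$name: present"
    else
        echo "$name: missing"
    fi
done
```

If a capability shows up as missing, adding the matching --cap-add flag to the docker run command for the debug container is the first thing to try.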

This is obviously useful for Go containers, or any other container where you just need to bring in some extra debugging tools without changing the container itself.