Debugging File Permissions in AWS ECS Tasks
When working with containerized applications in AWS ECS, you might encounter file permission issues, especially when implementing security best practices. Recently, I had to debug file permissions in an AWS ECS task with the following requirements:
- The process within the container must run as an unprivileged user
- The root file system must be mounted as read-only
- The process must be able to write to the /tmp directory
- The contents of the /tmp directory should be cleared on container start
- The container runs in AWS ECS with Fargate backend
The team and I tried multiple approaches. Using EFS or S3 as volumes meant we had to implement the clearing of the /tmp directory on container start ourselves. Defining and mounting an ephemeral storage volume caused trouble because it was mounted as the root user, so it could not be used by the unprivileged user. Then I stumbled across a solution posted in a subreddit¹. The suggestion was to define a Docker volume in the Dockerfile and have ECS create an ephemeral storage mount automatically:
```dockerfile
VOLUME /tmp
```
In this post, I’ll share the debugging approach I used to validate this solution. The approach is not limited to AWS ECS; it can be of value in any container environment, such as Kubernetes or plain Docker.
Debugging with an Overridden Container Entrypoint
When troubleshooting container permission issues, it’s helpful to use simple CLI tools to check the current user and directory permissions. I modified my container’s entry point in the ECS task definition to run a series of diagnostic commands:
```hcl
resource "aws_ecs_task_definition" "my-task" {
  family                   = "service"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = "256"
  memory                   = "512"
  execution_role_arn       = aws_iam_role.ecs-task-execution-role.arn
  task_role_arn            = aws_iam_role.ecs-task-execution-role.arn

  container_definitions = jsonencode([
    {
      name                   = "my-service"
      image                  = local.image_uri
      cpu                    = 256
      memory                 = 512
      essential              = true
      readonlyRootFilesystem = true
      user                   = "185"
      portMappings = [
        {
          containerPort = 9090
          hostPort      = 9090
        }
      ]
      entryPoint = ["sh", "-c"]
      command = [
        "set -x; whoami; ls -la /tmp; touch /tmp/itWorks; ls -la /tmp; cat /proc/mounts; df -h"
      ]
    }
  ])
}
```
Let’s break down these commands and understand what each one does:
Command Chaining with Semicolons
In the shell, the semicolon `;` allows you to run multiple commands sequentially on a single line. Each command runs independently, regardless of whether the previous command succeeded or failed. This is different from `&&`, which runs the next command only if the previous command succeeded, and `||`, which runs the next command only if the previous one failed. In this context, failing means returning a non-zero exit code.
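The difference is easy to see in any POSIX shell (the echoed strings are arbitrary):

```shell
# ';' - the second command runs regardless of the first one's exit code:
sh -c 'false ; echo "runs anyway"'          # prints: runs anyway

# '&&' - the second command runs only if the first succeeds (exit code 0):
sh -c 'true && echo "runs on success"'      # prints: runs on success

# '||' - the second command runs only if the first fails (non-zero exit code):
sh -c 'false || echo "runs on failure"'     # prints: runs on failure
```

With `false && echo ...`, the `echo` would be skipped and the whole list would return `false`'s non-zero status, which is why `;` is the right choice for diagnostics that should all run.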
set -x
The `set -x` command enables trace mode in the shell. When this mode is active, the shell prints each command to stderr before executing it, prefixed with a `+` sign. This is useful for debugging scripts, as it shows which commands are being executed. Without it, it can be difficult to tell which command produced which output.
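A quick illustration (the trace goes to stderr, the regular output to stdout):

```shell
# With tracing on, each command is echoed to stderr before it runs:
sh -c 'set -x; whoami; echo done'
# stderr:  + whoami
#          + echo done
# stdout:  <your username>
#          done
```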
whoami
The `whoami` command prints the current user’s username. In our case, it showed that the container was running as the “jboss” user, confirming that we were indeed running as an unprivileged user defined in the Dockerfile of the base image.
ls -la /tmp
The `ls -la` command lists all files in a directory showing permissions, ownership, size, and modification time. This helped me see the current state of the /tmp directory and its permissions.
touch /tmp/itWorks
The `touch` command creates an empty file. I used it to test if the current user had write permissions in the /tmp directory. If this command succeeds, it confirms that the current user can write to the specified location.
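The same check can be scripted as a defensive probe, for example in an entrypoint wrapper; the marker file name here is arbitrary:

```shell
# Try to create (and immediately remove) a marker file to test writability.
if touch /tmp/.write-probe 2>/dev/null; then
    rm -f /tmp/.write-probe
    echo "/tmp is writable for $(whoami)"
else
    echo "/tmp is NOT writable for $(whoami)" >&2
fi
```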
cat /proc/mounts
The `cat /proc/mounts` command displays all mounted filesystems in the container. This is useful for seeing how filesystems are mounted. In this case, it showed that the root filesystem was indeed mounted as read-only (ro), while /tmp had its own mount point.
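Since the full list is long, it can help to filter it. `/proc/mounts` is whitespace-separated, with the mount point in column 2 and the mount options in column 4, so `awk` works well (a sketch, assuming a Linux `/proc`):

```shell
# Print mount point, filesystem type, and options for / and /tmp only.
awk '$2 == "/" || $2 == "/tmp" { print $2, $3, $4 }' /proc/mounts

# Test whether / is read-only: the options are comma-separated, so match
# 'ro' as a whole list item rather than as a substring.
awk '$2 == "/" { n = split($4, opts, ","); for (i = 1; i <= n; i++) if (opts[i] == "ro") print "/ is read-only" }' /proc/mounts
```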
df -h
The `df -h` command shows disk space usage. This helped confirm which filesystems were available and how much space was allocated to them.
Analyzing the Output
Let’s run the container and have a look at the output of these commands:
```
+ whoami
jboss
+ ls -la /tmp
total 16
drwxrwxrwt 2 root root 4096 Apr 24 11:55 .
drwxr-xr-x 1 root root 4096 Apr 24 11:55 ..
-rwx------ 1 root root 291 Mar 13 11:05 ks-script-fb_e6y2x
-rwx------ 1 root root 701 Mar 13 11:05 ks-script-s7hte0c5
+ touch /tmp/itWorks
+ ls -la /tmp
total 16
drwxrwxrwt 2 root root 4096 Apr 24 11:55 .
drwxr-xr-x 1 root root 4096 Apr 24 11:55 ..
-rw-r--r-- 1 jboss root 0 Apr 24 11:55 itWorks
-rwx------ 1 root root 291 Mar 13 11:05 ks-script-fb_e6y2x
-rwx------ 1 root root 701 Mar 13 11:05 ks-script-s7hte0c5
```
From this output, I could see:
- The container was running as the “jboss” user (unprivileged); otherwise `whoami` would have printed `root`
- The /tmp directory belongs to root, but it has the sticky bit set (the `t` in `drwxrwxrwt`), which allows any user to create files while only the owner can delete them. This is standard for the /tmp directory.
- The “jboss” user was able to successfully create a file in /tmp
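The sticky-bit behaviour is easy to reproduce outside the container; the directory name below is arbitrary:

```shell
# Create a directory with mode 1777 - world-writable plus the sticky bit,
# the conventional mode for /tmp.
demo_dir=$(mktemp -d)/shared
mkdir -m 1777 "$demo_dir"

# The permission string ends in 't' instead of 'x' when the sticky bit is set.
ls -ld "$demo_dir"
```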
The mount information was particularly revealing:
```
+ cat /proc/mounts
overlay / overlay ro,relatime,lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/103/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/102/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/101/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/100/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/99/fs:/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/93/fs,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/107/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/107/work 0 0
/dev/nvme1n1 /tmp ext4 rw,relatime 0 0
```
This showed that:
- The root filesystem (/) was mounted as read-only (ro)
- The /tmp directory was mounted as a separate filesystem with read-write permissions (rw)
The disk usage information confirmed this setup:
```
+ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          30G   11G   18G  37% /
tmpfs            64M     0   64M   0% /dev
shm             464M     0  464M   0% /dev/shm
tmpfs           464M     0  464M   0% /sys/fs/cgroup
/dev/nvme1n1     30G   11G   18G  37% /tmp
tmpfs           464M     0  464M   0% /proc/acpi
```
So in the end, defining a Docker volume for /tmp did the trick. When you define a volume in your Dockerfile, ECS with the Fargate backend automatically creates an ephemeral storage mount with the correct permissions for the user specified in your container, which eliminates the need for manual permission adjustments. The storage used is AWS’s ephemeral storage, meaning it is temporary and will be removed when the container stops. This is ideal for temporary data.
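Since this behaviour is undocumented, it may be worth guarding against it changing silently. Here is a sketch of a fail-fast check that could wrap the real entrypoint; the script name and probe path are illustrative:

```shell
#!/bin/sh
# entrypoint-check.sh - verify /tmp is usable before starting the application.
set -eu

tmp_is_writable() {
    # Create and remove a probe file; redirect errors so a failure is silent.
    touch /tmp/.startup-probe 2>/dev/null && rm -f /tmp/.startup-probe
}

tmp_is_separate_mount() {
    # Column 2 of /proc/mounts is the mount point.
    awk '$2 == "/tmp" { found = 1 } END { exit !found }' /proc/mounts
}

if ! tmp_is_writable; then
    echo "ERROR: /tmp is not writable for $(whoami)" >&2
    exit 1
fi

if ! tmp_is_separate_mount; then
    echo "WARNING: /tmp is part of the root filesystem, not its own mount" >&2
fi

exec "$@"   # hand control to the real application command
```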
Additional Considerations
ECS on Fargate automatically handles permissions for volumes: it creates the mount point directory owned by the user specified in your container (e.g., user 185, jboss). You can define multiple volumes if your application needs to write to different directories. If you need persistent rather than ephemeral storage, you would have to use EFS or another persistent storage option with ECS; this solution will not work in that case. I have not found any resource documenting this admittedly useful behaviour of ECS, so it might change in the future. There is an open GitHub issue with a discussion regarding this, though².
Conclusion
Debugging file permissions in containerized environments can be challenging, but using simple CLI tools can provide valuable insights into what’s happening inside your container. Hope that helps you out.