I'm studying containerization mechanisms. The man page for user namespaces states: "Each process is a member of exactly one user namespace."
I'm also trying to follow this article where the author states: "User namespaces allow for the UID of a process in User namespace 1 to be different to the UID for the same process in User namespace 2"
The above 2 statements seem contradictory. Can the same process be part of multiple user namespaces? What is the relationship between processes, UIDs and user namespaces? Some graphical explanation would be much appreciated.
2 Answers
The apparent contradiction comes from the omission, in the article, of the user namespace hierarchy. Quoting the manpage:
User namespaces can be nested; that is, each user namespace—except the initial ("root") namespace—has a parent user namespace, and can have zero or more child user namespaces. The parent user namespace is the user namespace of the process that creates the user namespace via a call to unshare(2) or clone(2) with the CLONE_NEWUSER flag.
In the article, process D is part of user namespace 2, which is nested inside user namespace 1. A single process belongs to a single user namespace, but that user namespace is nested inside its successive parents, all the way up to the root namespace.
Processes can be visible from multiple user namespaces; in particular, all processes are visible from the top-most user namespace (or if you prefer, from outside all user namespaces). The user ids attached to processes change values depending on the user namespace from which they’re queried, depending on the uid/gid maps used in each user namespace, and this is the point the article is trying to make.
You can see this, for example, by starting a rootless container running bash; ps inside the container will then show
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.3 0.0 12024 3196 pts/0 Ss 13:49 0:00 /bin/bash
but the same bash process will show up as
skitt 23345 0.0 0.0 12024 3208 pts/0 Ss+ 15:49 0:00 /bin/bash
outside the container. /proc/.../uid_map shows the uid map in use:
$ cat /proc/23345/uid_map
0 1000 1
1 624288 65536
This means that the "range" of uids from 0 to 0 inside the corresponding user namespace maps to 1000 in the user namespace I queried it from, and the range from 1 to 65536 maps to 624288–689823.
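As a rough illustration of how those two map lines are read, here is a minimal Python sketch that translates a uid seen inside the namespace into the uid seen from the namespace I queried it from. The function names (parse_uid_map, to_outside) are made up for this example, and the map text is hard-coded from the output above rather than read from /proc.

```python
def parse_uid_map(text):
    """Parse uid_map content into (inside_start, outside_start, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(int(f) for f in fields))
    return entries


def to_outside(uid, entries):
    """Translate a uid inside the namespace to the uid seen from outside.

    Returns None when the uid falls in no mapped range.
    """
    for inside_start, outside_start, length in entries:
        if inside_start <= uid < inside_start + length:
            return outside_start + (uid - inside_start)
    return None


# Hard-coded copy of the uid_map shown above (an assumption for this sketch).
UID_MAP = """\
0 1000 1
1 624288 65536
"""

entries = parse_uid_map(UID_MAP)
print(to_outside(0, entries))      # 1000
print(to_outside(1, entries))      # 624288
print(to_outside(65536, entries))  # 689823
print(to_outside(65537, entries))  # None (not mapped)
```

Each line is simply "start inside, start outside, length", so the translation is an offset within whichever range contains the uid.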
After many headaches, I think I have some answers. Every process is indeed part of exactly one user namespace (as confirmed here), and the image in the article portrays this accurately. Apart from init, processes are created by other processes. When a process creates a child process, it has the option to place the child in a new user namespace, which is called a "child namespace" of the parent process's user namespace.
So-called UID (user ID) and GID (group ID) mappings make it possible for a process's UID and GID to appear different when examined from another namespace. Mappings are important because a process may, for example, modify files on the system that processes in other namespaces will want to access, and those processes need to see a meaningful UID or GID on those files from outside the namespace.
Mapping UIDs (same for GIDs) works by taking the UID of the process that is creating the new namespace, matching that against /proc/<Process_ID>/uid_map and assigning the mapped UID for the new process in the new namespace.
If the mapping is not specified when creating the new user namespace, then the process's UID in the new namespace will take the value of /proc/sys/kernel/overflowuid.
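To make the lookup and the fallback concrete, here is a small Python sketch of the translation in the other direction: from a UID in the parent namespace to the UID seen inside the new namespace. It is only a model of the behaviour described above, not kernel code. The function name to_inside and the sample entries are made up for illustration, and the overflow value is hard-coded to 65534 (the usual kernel default) instead of being read from /proc/sys/kernel/overflowuid.

```python
# Assumed value for this sketch; on a real system read /proc/sys/kernel/overflowuid.
DEFAULT_OVERFLOW_UID = 65534


def to_inside(outside_uid, entries, overflow_uid=DEFAULT_OVERFLOW_UID):
    """Translate a parent-namespace UID to the UID seen inside the namespace.

    entries is a list of (inside_start, outside_start, length) tuples,
    one per uid_map line. Unmapped UIDs fall back to overflow_uid.
    """
    for inside_start, outside_start, length in entries:
        if outside_start <= outside_uid < outside_start + length:
            return inside_start + (outside_uid - outside_start)
    return overflow_uid


# Hypothetical map: parent UID 1000 becomes root inside the namespace.
mapped = [(0, 1000, 1)]

print(to_inside(1000, mapped))  # 0     (mapped to root inside)
print(to_inside(0, mapped))     # 65534 (parent root has no mapping)
print(to_inside(1000, []))      # 65534 (no map written at all)
```

The last call corresponds to the case in the paragraph above: with no mapping defined, every UID shows up as the overflow UID.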
More details of how such mappings can be defined can be found in this LWN article.