Understanding the concepts of commands, processes and namespaces

Question 1

I am not a strong Linux user, but I want to better understand the material in this post, which talks about Linux namespaces:

https://stackoverflow.com/questions/44666700/unshare-pid-bin-bash-fork-cannot-allocate-memory

I think my failure to comprehend might be related to an insufficient understanding of "command", "process", and maybe a few other things.

First, let me explain a simple experiment that I am using for my education. I opened two PuTTY terminal windows. In each window, I ran ssh root@[ip of machine]. Now that I have two SSH sessions to my Linux machine, I can begin my experiments.

In the first window, I did this:

root@localhost:~# unshare --pid /bin/bash
bash: fork: Cannot allocate memory

In the second window, I did this:

root@localhost:~# ps -aux | grep unshare
root 58188 0.0 0.0 6480 2284 pts/3 R+ 21:49 0:00 grep --color=auto unshare

Here are my questions:

The second window does not show any indication of unshare --pid /bin/bash. Is this because the /bin/bash command or the /bin/bash process had already terminated? Is this why many Linux users on the internet recommend using --fork, so that /bin/bash runs in the newly created namespace?
The accepted answer stated this: "After bash start to run, bash will fork several new sub-processes to do somethings." I do not understand the meaning of this sentence. So in the second terminal window, I ran this:

root@localhost:~# unshare -pf /bin/bash
root@localhost:~# ps -a
 PID TTY TIME CMD
 58278 pts/2 00:00:00 sudo
 58279 pts/2 00:00:00 su
 58280 pts/2 00:00:00 bash
 58291 pts/2 00:00:00 unshare
 58292 pts/2 00:00:00 bash
 58299 pts/2 00:00:00 ps

Is PID 58278 to PID 58299 what the author meant by "bash will fork several new sub-processes to do somethings"?

Comment

Adding the -H option to ps might make things clearer.

Answer

Background

I'll begin with some simplified background on how processes are created in Linux. I'll not cover all options or all the details, but instead I'll focus on the key ideas.

Generally, new processes are created using the fork() system call. On success, fork() will result in a new process that is running the same program as the original (effectively, it's a clone of the program that invoked fork() at the point at which it called fork()). The fork() function returns in both the "parent" process and in the "child" (the newly-created) process. Each process can examine the return value of fork() to determine if they're the "parent" or the "child," and they can use that to decide what to do next.
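To make the two return values concrete, here is a minimal sketch in Python, whose os.fork() is a thin wrapper around the system call. The function name fork_once and the exit code 7 are my own choices for illustration.

```python
import os


def fork_once():
    """Fork once; return (what fork() returned in the parent, the child's exit code)."""
    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. Exit right away with a recognizable code.
        os._exit(7)
    # Parent: fork() returned the PID of the new child (a positive integer).
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return pid, exit_code


if __name__ == "__main__":
    child_pid, exit_code = fork_once()
    print(f"parent {os.getpid()} created child {child_pid}; child exited with {exit_code}")
```

Both processes continue from the same line after os.fork(); only the return value tells them apart.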

Often creating a new process means we want to run a different program, and up until now we only have a way to create copies of the same program. Fortunately, there's a separate system call, exec(), that replaces the currently running program with a new program.

Consider the case where you have a shell (I'll assume bash here), and you type ls to list the contents of the current directory:

(P1:bash) calls fork()
--- the kernel creates P2 that is a copy of P1
--- the kernel starts running P2
(P1:bash) fork() returns with the PID of P2, so it knows it's the parent
(P1:bash) Waits for P2 to finish (details elided)
(P2:bash) fork() returns with 0, so it knows it's the child
(P2:bash) calls exec("ls")
--- the kernel replaces bash with ls in P2 and starts ls running
(P2:ls) starts running
...
(P2:ls) eventually terminates
(P1:bash) wakes up since P2 is finished, continues about its business
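
The sequence above can be sketched in a few lines of Python. This is only an approximation of what a shell does (a real shell also handles job control, signals, redirections, and so on); the function name run and the exit code 127 for a failed exec are my own choices, the latter borrowed from shell convention.

```python
import os


def run(argv):
    """Run argv roughly the way a shell would: fork, exec in the child, wait in the parent."""
    pid = os.fork()
    if pid == 0:
        # Child (P2): replace this copy of the program with the requested one.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; on success this code no longer exists in P2.
            os._exit(127)
    # Parent (P1): block until the child terminates, then collect its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    print("ls exited with", run(["ls"]))
```

Note that the PID of P2 does not change across the exec: it is the same process, running a different program.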

The Problem

You begin with:

# unshare --pid /bin/bash
bash: fork: Cannot allocate memory
bash-5.2#

Notice the error bash: fork: Cannot allocate memory – that's not a good sign.

In this case, the unshare program (1) creates a new PID namespace and (2) execs /bin/bash. Recall from the Background section that exec replaces the current running process (unshare) with a new program (/bin/bash) – it doesn't create a new process.

Up to now, no processes are running in the newly-created PID namespace. The namespace exists, but the process that created the namespace hasn't yet fork()-ed anything.

As bash starts running, it typically runs some set of programs. Here, "runs" means the fork/exec combination described in the Background section. The kernel places the first process that bash fork()s in the new PID namespace, and that process becomes the init process for that namespace (the process in that namespace with pid = 1). The program that bash runs is likely short-lived, so it runs and terminates. Once the init process of a PID namespace has terminated, the kernel refuses to create any new processes in that namespace, so the namespace is effectively dead.

Next, bash tries to run some other command. It wants to put that command in the new PID namespace, but that namespace is no longer usable. As a result, the fork() fails with ENOMEM ("Cannot allocate memory"), which is the error message that you see. You'll see it again if you try to run any other command:

bash-5.2# ls
bash: fork: Cannot allocate memory
bash-5.2#
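
If you'd like to reproduce this without bash in the picture, here is a rough sketch that calls unshare(2) through ctypes. The function names are my own, and the CLONE_NEWPID value is the one I believe <sched.h> defines on Linux. It needs root (or CAP_SYS_ADMIN); without it, the unshare call itself should fail and the sketch just reports that.

```python
import ctypes
import errno
import os

CLONE_NEWPID = 0x20000000  # assumed value of CLONE_NEWPID from <sched.h> on Linux


def _experiment():
    """Runs in a throwaway process; returns a one-line description of what happened."""
    libc = ctypes.CDLL(None, use_errno=True)
    unshare = getattr(libc, "unshare", None)
    if unshare is None:
        return "unshare failed: not provided by this C library"
    if unshare(CLONE_NEWPID) != 0:
        code = ctypes.get_errno()
        return "unshare failed: " + errno.errorcode.get(code, str(code))
    # First fork: the child becomes pid 1 of the new namespace and exits at once.
    pid = os.fork()
    if pid == 0:
        os._exit(0)
    os.waitpid(pid, 0)
    # Second fork: the namespace has already lost its init process.
    try:
        pid = os.fork()
    except OSError as exc:
        return "second fork failed: " + errno.errorcode.get(exc.errno, str(exc.errno))
    if pid == 0:
        os._exit(0)
    os.waitpid(pid, 0)
    return "second fork succeeded"


def second_fork_after_init_exit():
    """Run the experiment in a child so the caller's own process is left untouched."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        try:
            os.write(write_fd, _experiment().encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        message = reader.read()
    os.waitpid(pid, 0)
    return message


if __name__ == "__main__":
    print(second_fork_after_init_exit())
```

Run as root on a typical Linux system, I'd expect this to report that the second fork failed with ENOMEM, which matches what bash prints above.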

The Solution

As you note in your question, the unshare program has another option that is useful in this scenario. From man unshare:

-f, --fork

Fork the specified program as a child process of unshare rather than running it directly. This is useful when creating a new PID namespace. Note that when unshare is waiting for the child process, then it ignores SIGINT and SIGTERM and does not forward any signals to the child. It is necessary to send signals to the child process.

You can replace your first command with:

# unshare --fork --pid /bin/bash
#

Notice that in this case there is no error.

This option causes unshare to change its behavior. Instead of immediately using exec() to replace itself with /bin/bash, it uses the fork()/exec() behavior described in the Background section above:

(P1:unshare) calls fork()
--- the kernel creates P2 that is a copy of P1
--- the kernel starts running P2
(P1:unshare) fork() returns with the PID of P2, so it knows it's the parent
(P1:unshare) Waits for P2 to finish (details elided)
(P2:unshare) fork() returns with 0, so it knows it's the child
--- P2 is running in the new PID namespace and has pid = 1
(P2:unshare) calls exec("/bin/bash")
--- the kernel replaces unshare with /bin/bash in P2 and starts bash running
(P2:bash) starts running

You can confirm that in this case /bin/bash is the init process (i.e., process with pid 1) by printing its process id:

# echo $$
1
#
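
The same check can be sketched outside the shell. The following Python is only an approximation of what unshare --fork --pid does, not its actual implementation: it creates the namespace, forks, and has the child report its own PID instead of exec()-ing a program. The function name and the CLONE_NEWPID value are my assumptions, and it needs root (or CAP_SYS_ADMIN) for the unshare call to succeed.

```python
import ctypes
import errno
import os

CLONE_NEWPID = 0x20000000  # assumed value of CLONE_NEWPID from <sched.h> on Linux


def _create_namespace_and_fork(write_fd):
    """Plays the role of P1 (unshare): create the PID namespace, then fork P2."""
    libc = ctypes.CDLL(None, use_errno=True)
    unshare = getattr(libc, "unshare", None)
    if unshare is None:
        failure = errno.ENOSYS
    elif unshare(CLONE_NEWPID) != 0:
        failure = ctypes.get_errno()
    else:
        failure = 0
    if failure:
        name = errno.errorcode.get(failure, str(failure))
        os.write(write_fd, ("unshare failed: " + name).encode())
        return
    inner = os.fork()
    if inner == 0:
        # Plays the role of P2: the real unshare would exec() the program here;
        # this sketch only reports the PID that P2 sees for itself.
        os.write(write_fd, str(os.getpid()).encode())
        os._exit(0)
    os.waitpid(inner, 0)


def pid_of_first_child():
    """Return the PID seen by the first child forked after unshare, or a failure message."""
    read_fd, write_fd = os.pipe()
    outer = os.fork()
    if outer == 0:
        os.close(read_fd)
        try:
            _create_namespace_and_fork(write_fd)
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        text = reader.read()
    os.waitpid(outer, 0)
    return int(text) if text.isdigit() else text


if __name__ == "__main__":
    print(pid_of_first_child())
```

With sufficient privileges I'd expect this to print 1, for the same reason echo $$ does above: the first process forked after the namespace is created becomes its init process.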

Answers to your questions

The second window does not show any indication of unshare --pid /bin/bash. Is this because the /bin/bash command or the /bin/bash process had already terminated? This is why many Linux users on the internet recommend using the --fork so that the /bin/bash runs in the newly created namespace?

The second window does not show any indication of unshare because it is no longer running – it used exec() to replace itself with /bin/bash.

The --fork option changes the behavior of unshare so that it uses fork() to first create a new process — a process in the newly-created PID namespace — then that process uses exec() to replace itself with /bin/bash.

The accepted answer stated this: "After bash start to run, bash will fork several new sub-processes to do somethings." I do not understand the meaning of this sentence. So in the second terminal window, I ran this:

The new sub-processes are likely short-lived, so they're no longer running by the time you run ps.

Comment

I will paste some links that explain exec and fork, which I had to read to help me appreciate your answer: askubuntu.com/questions/428458/why-do-shells-call-fork and stackoverflow.com/questions/4204915/… .

Andy Dalton · Accepted Answer · 2024-12-08 02:41:51Z

