Re: [RFC][v6][PATCH 9/9]: Document clone_with_pids() syscall
From: Randy Dunlap
Date: Thu Sep 10 2009 - 11:28:39 EST
On Wed, 9 Sep 2009 23:14:13 -0700 Sukadev Bhattiprolu wrote:
>
> Subject: [RFC][v6][PATCH 9/9]: Document clone_with_pids() syscall
>
> This gives a brief overview of the clone_with_pids() system call. We should
> eventually describe more details either in clone(2) or in a new man page.
>
> Signed-off-by: Sukadev Bhattiprolu <sukadev@xxxxxxxxxxxxxxxxxx>
> ---
> Documentation/clone-with-pids | 58 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 58 insertions(+)
>
> Index: linux-2.6/Documentation/clone-with-pids
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/Documentation/clone-with-pids 2009-09-09 21:53:30.000000000 -0700
> @@ -0,0 +1,58 @@
> +
> +struct pid_set {
> + unsigned int num_pids;
> + pid_t pids[];
> +};
> +
> +clone_with_pids(int flags, void *child_stack_base, int *parent_tid_ptr,
> + int *child_tid_ptr, NULL, struct pid_set *pid_setp)
> +
> + The clone_with_pids() system call is identical to clone(), except
> + that it allows the user to specify a pid for the child process
> + in each of the child processes' pid name spaces.
> +
namespaces. {as below}
> + This system call is meant to be used when restarting an application
> + from an earlier checkpoint. When restarting the application, the
> + processes in the application must get the same pids they had at the
> + time of the checkpoint.
> +
> + The 'pid_setp' parameter defines a set of pids to use, one for each
> + pid-namespace of the child process. The order pids in '->pids[]'
order of pids
> + corresponds to the nesting order of pid-namespaces, with ->pids[0]
> + corresponding to the init_pid_ns.
> +
> + If a pid in the ->pids list is 0, the kernel will assign the next
> + available pid in the pid namespace, for the process.
> +
> + If a pid in the ->pids[] list is non-zero, the kernel tries to assign
> + the specified pid in that namespace. If that pid is already in use
> + by another process, the system call fails with -EBUSY.
> +
> + On success, the system call returns the pid of the child process in
> + the parent's active pid namespace.
> +
> + On failure, clone_with_pids() returns -1 and sets 'errno' to one of
> + following values (the child process is not created).
> +
> + EPERM Caller does not have the SYS_ADMIN privilege needed to excute
execute
> + this call.
> +
> + EINVAL The number of pids specified in 'pid_set.num_pids' exceeds
> + the current nesting level of parent process
> +
> + EBUSY A requested 'pid' is in use by another process in that name
> + space.
> +
> +Example:
> +
> + struct pid_set pid_set { 3, {0, 99, 177} };
> + void *child_stack = malloc(STACKSIZE);
> +
> + /* set up child_stack, like with clone() */
> + rc = clone_with_pids(clone_flags, child_stack, NULL, NULL, &pid_set);
> +
> + if (rc < 0) {
> + perror("clone_with_pids()");
> + exit(1);
> + }
What happens when one of the pids is busy? Say the last one in the
example above [177]. Are the first 2 children already cloned,
or are all pids checked for availability before cloning?
If the latter, is there a race there?
And what value is returned?
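
To make the question concrete, here is a minimal user-space sketch in C of
the "check every level first, then commit" alternative. It is only a toy
model, not code from the patch or the kernel: toy_ns, toy_pid_set,
toy_alloc_pids(), the fixed limits, the -EAGAIN case, and the choice to
treat unspecified levels as 0 are all made up for illustration. It has no
locking, so it does not address the race asked about above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_LEVELS 4	/* hypothetical nesting limit for this model */
#define TOY_MAX_PID    256	/* hypothetical pid range per namespace */

/* Toy model of one pid namespace: which pids are currently taken. */
struct toy_ns {
	bool in_use[TOY_MAX_PID];
};

/* Fixed-size stand-in for the patch's struct pid_set. */
struct toy_pid_set {
	unsigned int num_pids;
	int pids[TOY_MAX_LEVELS];
};

/* Return the lowest free pid >= 1, or -1 if the toy namespace is full. */
static int toy_next_free(const struct toy_ns *ns)
{
	for (int pid = 1; pid < TOY_MAX_PID; pid++)
		if (!ns->in_use[pid])
			return pid;
	return -1;
}

/*
 * Two-phase allocation over ns[0..levels-1], with ns[0] as the outermost
 * namespace. Phase 1 only checks; phase 2 commits. On any failure a
 * negative errno value is returned and no namespace has been modified.
 * Levels beyond req->num_pids are treated as 0 (assumption of this model).
 */
static int toy_alloc_pids(struct toy_ns *ns, unsigned int levels,
			  const struct toy_pid_set *req, int *out)
{
	int chosen[TOY_MAX_LEVELS];
	unsigned int i;

	if (levels > TOY_MAX_LEVELS || req->num_pids > levels)
		return -EINVAL;

	/* Phase 1: pick or verify a pid at every level, changing nothing. */
	for (i = 0; i < levels; i++) {
		int want = i < req->num_pids ? req->pids[i] : 0;

		if (want < 0 || want >= TOY_MAX_PID)
			return -EINVAL;
		if (want == 0) {
			want = toy_next_free(&ns[i]);
			if (want < 0)
				return -EAGAIN;	/* made-up choice for "full" */
		} else if (ns[i].in_use[want]) {
			return -EBUSY;
		}
		chosen[i] = want;
	}

	/* Phase 2: every level passed, so claim the pids. */
	for (i = 0; i < levels; i++) {
		assert(!ns[i].in_use[chosen[i]]);
		ns[i].in_use[chosen[i]] = true;
		out[i] = chosen[i];
	}
	return 0;
}
```

In this model, with three empty toy namespaces except for 177 being taken
at the last level, a request of { 3, {0, 99, 177} } returns -EBUSY and
leaves 99 unclaimed at the middle level. Once 177 is free again, the same
request succeeds and yields 1, 99 and 177. Whether the real syscall behaves
this way, or instead has to undo earlier allocations, is exactly what I am
asking.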
---
~Randy
LPC 2009, Sept. 23-25, Portland, Oregon
http://linuxplumbersconf.org/2009/