
TLDR

When spinning up multiple docker containers in which I run npm ci, I start getting pthread_create: Resource temporarily unavailable errors (with fewer than 5 containers everything runs fine). I deduce there is some kind of thread limit somewhere, but I cannot find which one is being hit.

configuration

  • a Jenkins instance spins up a docker container for each build (it connects to the container over ssh).
  • in each container some build commands are run; I see the error most often with npm ci, since it seems to create quite a few threads, but I don't think the problem is specific to npm itself.
  • all docker containers run on a single docker-host. Its specifications:

docker-host

  • Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz with 12 cores, 220 GB RAM
  • CentOS 7
  • Docker version 18.06.1-ce, build e68fc7a
  • systemd version 219
  • kernel 3.10.0-957.5.1.el7.x86_64

errors

I see the error in different forms:

  • Jenkins failing to contact the docker container, with errors like java.lang.OutOfMemoryError: unable to create new native thread
  • git clone failing inside the container with ERROR: Error cloning remote repo 'origin' ... Caused by: java.lang.OutOfMemoryError: unable to create new native thread
  • npm ci failing inside the container with node[1296]: pthread_create: Resource temporarily unavailable

Things I have investigated or tried

I have looked quite a lot at this question.

  • docker-host has systemd version 219 and hence does not have the TasksMax attribute.
  • /proc/sys/kernel/threads-max = 1798308
  • kernel.pid_max = 49152
  • the number of threads on the host (ps -elfT | wc -l) is typically around 700, but with multiple containers running I have seen it climb to 4500.
  • all builds run as a user with uid 1001 inside the docker container; however, there is no user with uid 1001 on the docker-host, so I don't know which limits apply to this user (a way to check this directly is sketched after this list).
  • I have already increased multiple limits for all users in /etc/security/limits.conf (see below)
  • I created a dummy user with uid 1001 on docker-host and made sure its nproc limit was also set to unlimited. Logging in as that user, ulimit -u returns unlimited. This still didn't solve the problem.
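
For reference, a minimal sketch of how the effective limits can be inspected directly, both for a running process of uid 1001 and for each container's pids cgroup (the cgroup path is an assumption: it applies to the cgroupfs driver and only if the pids controller is enabled; with the systemd driver the scopes live under system.slice):

# limits the kernel actually enforces on a running build process (uid 1001)
grep -E 'Limit|Max processes' /proc/$(pgrep -n -u 1001)/limits

# per-container pids ceiling, if the pids cgroup controller is available
for c in $(docker ps -q); do
  echo "$c: $(cat /sys/fs/cgroup/pids/docker/${c}*/pids.max 2>/dev/null)"
done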

/etc/security/limits.conf :

* soft nproc unlimited
* soft stack 65536
* soft nofile 2097152

output of ulimit -a as root:

core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 899154
max locked memory (kbytes, -l) 1048576
max memory size (kbytes, -m) unlimited
open files (-n) 1048576
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 65536
cpu time (seconds, -t) unlimited
max user processes (-u) 899154
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

limits of my dockerd process (cat /proc/16087/limits, where 16087 is the pid of dockerd):

Limit Soft Limit Hard Limit Units 
Max cpu time unlimited unlimited seconds 
Max file size unlimited unlimited bytes 
Max data size unlimited unlimited bytes 
Max stack size unlimited unlimited bytes 
Max core file size unlimited unlimited bytes 
Max resident set unlimited unlimited bytes 
Max processes unlimited unlimited processes 
Max open files 65536 65536 files 
Max locked memory 65536 65536 bytes 
Max address space unlimited unlimited bytes 
Max file locks unlimited unlimited locks 
Max pending signals 899154 899154 signals 
Max msgqueue size 819200 819200 bytes 
Max nice priority 0 0 
Max realtime priority 0 0 
Max realtime timeout unlimited unlimited us
asked Apr 25, 2019 at 16:12
  • If your Docker host is CentOS 7.4+, then I believe you are affected by the default TasksMax attribute of systemd (I saw from this bug report that it is enabled). Add TasksMax=infinity to your Docker service file override and see if it helps, or whether it prints the warning that the setting is not available (a sketch of such an override is shown after these comments). Commented Apr 25, 2019 at 19:13
  • Hi @GracefulRestart: my system is not affected by that, since I have systemd 219, which does not have the TasksMax parameter, as noted in my question. Commented Apr 25, 2019 at 19:14
  • If you are on CentOS 7.4 or higher, then you are affected by it as it was backported to systemd-219-42 as per this errata announcement. Commented Apr 25, 2019 at 19:29
  • I have CentOS 7.6 and systemd 219-62, yet when I run systemctl status docker it says Tasks: 135 with no maximum in brackets, so I still think this is not the reason. Also, my limit seems to lie at 4096 threads, not at 512. Commented Apr 25, 2019 at 19:45
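
For completeness, the override suggested in the first comment would look roughly like this (a sketch: the drop-in path is the conventional one, and whether systemd 219 honours TasksMax is exactly what is disputed above):

# /etc/systemd/system/docker.service.d/override.conf
[Service]
TasksMax=infinity

# then reload systemd and restart docker
systemctl daemon-reload
systemctl restart docker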

1 Answer


I have found a way to get access to more than 4096 threads.

My docker container is a centos7 image, which by default has a per-user limit of 4096 processes, as defined in /etc/security/limits.d/20-nproc.conf:

# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.
* soft nproc 4096
root soft nproc unlimited

When logging in to my docker container, I added the command ulimit -u unlimited to that user's ~/.bashrc, so that this limit is removed for that user. Now I can break through the 4096 ceiling.
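
A minimal sketch of that workaround; the ~/.bashrc line is what I added, while the Dockerfile variant is only a hypothetical alternative:

# appended to the build user's ~/.bashrc inside the container
ulimit -u unlimited

# hypothetical alternative: override the packaged limit in the image itself,
# e.g. from a Dockerfile
# RUN echo '*  soft  nproc  unlimited' > /etc/security/limits.d/20-nproc.conf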

I am not thoroughly happy with this solution, since it means I need to adapt every container that runs on docker-host, as each has its own limit. Also, since I run all build commands as user 1001, it seems that when a container asks how many threads it has running, it "sees" the threads of all containers together, not only those from its own instance.
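
A quick way to see this from docker-host (sketch): count every thread owned by uid 1001 across all containers, since that, as far as I understand, is the number the kernel compares against the per-UID nproc limit.

# on docker-host: all threads belonging to uid 1001, summed over all containers
ps -eLo uid= | grep -cw 1001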

I created an issue on the docker-for-linux GitHub for this: https://github.com/docker/for-linux/issues/654

answered Apr 26, 2019 at 7:35
