I have been running into this problem for the past few days. I have a MongoDB cluster with 3 shards, each shard being a replica set of 3 members:
    1  2  3
A   O  S  S
B   S  P  S
C   P  O  P

P - Primary state
S - Secondary state
O - Other state

Letters denote the replica set member machines and numbers denote the shards.
I am trying to resync all the data (around 2 TB) on the mongod machines that are in the "Other" state (A1 and C2). But while resyncing the data from the primary, the mongod service on the primary goes down with a "Too many open files" error:
2019-03-16T16:35:22.351+0000 E -        [conn28204] cannot open /dev/urandom Too many open files in system
2019-03-16T16:35:22.362+0000 I NETWORK  [listener] Error accepting new connection on 0.0.0.0:27017: Too many open files in system
2019-03-16T16:35:22.362+0000 I NETWORK  [listener] Error accepting new connection on 0.0.0.0:27017: Too many open files in system
2019-03-16T16:35:22.362+0000 I NETWORK  [listener] Error accepting new connection on 0.0.0.0:27017: Too many open files in system
I have already tried raising the ulimit values according to the MongoDB recommendations:
> ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 241518
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 64000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 241518
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
and set the following in /etc/security/limits.conf:
* soft nofile 64000
* hard nofile 64000
root soft nofile 64000
root hard nofile 64000
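To verify the limits actually apply to the running mongod process (and because the message says "in system", which points at the kernel-wide limit rather than the per-process one), something like the following can be checked. This is only a sketch; the service name `mongod` and the example value are assumptions about the setup:

```
# Limits of the running mongod process itself; these can differ from the
# shell's `ulimit -a` when mongod is started by systemd or an init script.
cat /proc/$(pidof mongod)/limits | grep "open files"

# "Too many open files in system" refers to the kernel-wide limit, not the
# per-process one. Check it and raise it if necessary:
sysctl fs.file-max
sudo sysctl -w fs.file-max=500000   # example value only

# If mongod runs under systemd, /etc/security/limits.conf may be ignored;
# LimitNOFILE in the unit file is what counts in that case (assumption):
systemctl show mongod --property=LimitNOFILE
```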
but none of this fixed my problem. The mongod service still goes down with "Too many open files in system". I have been stuck on this for 3 days. Does anyone have any ideas or solutions?
1 Answer
Since you have 2 TB of data to be resynced, try the following approach for each replica set as required:

1. Connect to the primary and back up the databases using `mongodump`; also save the latest document from the `oplog` collection.
2. Shut down the secondary whose data you want to resync.
3. Start this secondary as a standalone server.
4. Drop the `oplog` collection in the `local` database and recreate it by inserting the latest entry saved from the primary node.
5. Restore the backup taken in step 1 using `mongorestore`.
6. Once the restore is complete, shut down the server and restart it as a replica set member.
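A rough sketch of what these steps might look like from the shell. Hostnames, ports, paths, the database name `mydb`, the replica set name `rs0`, and the oplog size are all placeholders; the exact mongod options depend on your deployment:

```
# 1. On the primary: dump each user database (skip admin, local and
#    config, per the comments below). "mydb" is a placeholder name.
mongodump --host primary.example.com --port 27017 --db mydb --out /backup/dump

#    Also save the newest oplog entry from the primary for step 4.
mongo --host primary.example.com --quiet --eval \
  'printjson(db.getSiblingDB("local").oplog.rs.find().sort({$natural:-1}).limit(1).next())' \
  > /backup/last_oplog_entry.json

# 2. Shut down the secondary to be resynced.
mongo --host secondary.example.com --eval 'db.getSiblingDB("admin").shutdownServer()'

# 3. Restart that secondary as a standalone server: no --replSet, and a
#    different port so other members and clients do not reach it.
mongod --dbpath /data/db --port 37017

# 4. Drop and recreate the oplog, then insert the document saved in
#    step 1. The 50 GB size is an example; match your original oplog size.
mongo --port 37017 --eval '
  var local = db.getSiblingDB("local");
  local.oplog.rs.drop();
  local.createCollection("oplog.rs", { capped: true, size: 50 * 1024 * 1024 * 1024 });
  // local.oplog.rs.insert(<paste the document saved in step 1 here>);
'

# 5. Restore the backup taken in step 1.
mongorestore --port 37017 /backup/dump

# 6. Shut down and restart as a replica set member.
mongo --port 37017 --eval 'db.getSiblingDB("admin").shutdownServer()'
mongod --dbpath /data/db --port 27017 --replSet rs0
```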
- Thanks for your reply. In step 1, should I dump only the `oplog` collection? – Bongsakorn, Mar 17, 2019 at 5:21
- No, you shouldn't dump the `oplog` collection. You should dump all databases other than `admin`, `local` and `config`. For the `oplog`, you just need to restore the latest document from the primary node. You can do this by storing the latest document in a JavaScript variable, or by copying the record to a text editor and inserting it later on the secondary node. – Mani, Mar 17, 2019 at 5:32
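A sketch of the idea in that last comment, from the mongo shell. The variable name `lastOp` and the hostname are arbitrary, and since the later insert happens in a new session on the standalone secondary, the copied document has to be pasted in place of the variable:

```
$ mongo --host primary.example.com
> var lastOp = db.getSiblingDB("local").oplog.rs.find().sort({ $natural: -1 }).limit(1).next()
> printjson(lastOp)   // copy this output to a text editor

// Later, in a mongo shell on the standalone secondary (a new session, so
// paste the copied document in place of <lastOp document>):
> db.getSiblingDB("local").oplog.rs.insert(<lastOp document>)
```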