
I have been hitting this problem for several days. I have a MongoDB cluster with 3 shards, each a 3-member replica set.

    1  2  3
A   O  S  S
B   S  P  S
C   P  O  P

P - PRIMARY state
S - SECONDARY state
O - OTHER state

Letters are the replica set machines and numbers are the shards.

I tried to resync all the data (around 2 TB) on the mongod machines in OTHER state (A1 and C2). But while the data was resyncing, the primary's mongod service went down with "Too many open files":

2019-03-16T16:35:22.351+0000 E -        [conn28204] cannot open /dev/urandom Too many open files in system
2019-03-16T16:35:22.362+0000 I NETWORK  [listener] Error accepting new connection on 0.0.0.0:27017: Too many open files in system
2019-03-16T16:35:22.362+0000 I NETWORK  [listener] Error accepting new connection on 0.0.0.0:27017: Too many open files in system
2019-03-16T16:35:22.362+0000 I NETWORK  [listener] Error accepting new connection on 0.0.0.0:27017: Too many open files in system

I have already tried setting ulimit values as MongoDB recommends:

> ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 241518
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 64000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 241518
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

and set the limits in /etc/security/limits.conf like this:

* soft nofile 64000
* hard nofile 64000
root soft nofile 64000
root hard nofile 64000
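
For reference, the message "Too many open files in system" corresponds on Linux to ENFILE, the kernel-wide fs.file-max limit, rather than the per-process EMFILE that ulimit -n controls. A sketch to compare the two (assumes Linux and a single mongod process; the sysctl value is only an example):

# Per-process limit as seen by the running mongod (should show 64000)
grep 'open files' /proc/$(pidof mongod)/limits

# Kernel-wide ceiling, and current usage (allocated, unused, max)
cat /proc/sys/fs/file-max
cat /proc/sys/fs/file-nr

# Raise the kernel-wide limit if it is the bottleneck (example value only)
sudo sysctl -w fs.file-max=1000000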

But none of this fixed my problem. The mongod service still goes down with "Too many open files". I have been stuck for 3 days. Does anyone have any ideas or solutions?

asked Mar 17, 2019 at 4:25

1 Answer


Since you have 2 TB of data to resync, try the following approach.

Follow these steps for each replica set that needs it (a consolidated sketch of the whole procedure follows the list).

  1. Connect to the primary and back up the databases using mongodump; also capture the latest document from the oplog collection.

  2. Shut down the secondary where you want to resync the data.

  3. Start this secondary as a standalone server.

  4. Drop the oplog collection (local.oplog.rs) in the local database and recreate it by inserting the latest entry captured from the primary node (see the mongo shell sketch after the comments).

  5. Restore the backup taken in step 1 using mongorestore.

  6. Once the restore has completed, shut down the server and restart it as a replica set member.
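
A consolidated sketch of the steps above in shell form. This is not from the answer itself: the hostnames (primary.example, secondary.example), database name mydb, ports, dbpath, and replica set name rsShard1 are all placeholder assumptions, and the restart flags assume a manually started mongod rather than a service manager.

# Step 1: on the primary, dump each application database
# (skip admin, local, and config, as noted in the comments below)
mongodump --host primary.example:27017 --db mydb --out /backup/dump

# Step 2: shut down the secondary to be resynced
mongo --host secondary.example:27017 admin --eval 'db.shutdownServer()'

# Step 3: restart it as a standalone server, i.e. without --replSet,
# on a different port so replica set members and clients cannot reach it
mongod --port 27018 --dbpath /data/db --bind_ip localhost

# Step 4: recreate the oplog -- see the sketch after the comments below

# Step 5: restore the dump taken in step 1
mongorestore --host localhost:27018 /backup/dump

# Step 6: shut down and restart as a replica set member
mongo --host localhost:27018 admin --eval 'db.shutdownServer()'
mongod --port 27017 --dbpath /data/db --replSet rsShard1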

answered Mar 17, 2019 at 5:09
  • Thanks for your reply. In step 1, should I dump only the oplog collection? Commented Mar 17, 2019 at 5:21
  • No, you shouldn't dump the oplog collection. You should dump the databases other than admin, local, and config. For the oplog you just need to restore the latest document from the primary node. You can do this by storing the latest document in a JavaScript variable, or by copying the record to a text editor and inserting it later on the secondary node. Commented Mar 17, 2019 at 5:32
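
A sketch of the oplog step described in the comment above, run against the standalone secondary from step 3. It pulls the newest oplog document from the primary over a second connection, then rebuilds the oplog locally. The hostnames, the absence of authentication, and the 50 GB oplog size are assumptions; match the size to your existing oplog configuration.

mongo --host localhost:27018 <<'EOF'
// Fetch the newest oplog entry from the primary into a JS variable
var primary = connect("primary.example:27017/local");
var last = primary.oplog.rs.find().sort({$natural: -1}).limit(1).next();

// Drop and recreate the capped oplog collection on this standalone node,
// then seed it with the entry captured from the primary
var local = db.getSiblingDB("local");
local.oplog.rs.drop();
local.createCollection("oplog.rs", {capped: true, size: 50 * 1024 * 1024 * 1024});
local.oplog.rs.insert(last);
EOF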
