I ran into an issue on MySQL Cluster (5.7.28). I cleanly shut down vm6 (a data node), and MySQL replication broke. I'm trying to find the link between the data node going down and replication breaking, but I still can't find the cause (the relevant log output is below). Can someone help me find the link?
Slave: Got error 4009 'Cluster Failure' from NDB Error_code: 1296
[Warning] Slave: Can't lock file (errno: 157 - Could not connect to storage engine) Error_code: 1015
Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START".
1 Answer
In MySQL Cluster, the MySQL server that acts as the replication slave writes into the NDB data nodes. Since the NDB data nodes are down, it cannot write to them, and you get the errors mentioned above. Replication will resume once the data nodes are up again and the slave SQL thread is restarted.
Hello Mikael, thanks for the feedback. We have 4 data nodes (for redundancy) in the cluster and I only powered off a single one; the others were up and running. I'm just trying to understand why replication went down in this case.
– gojo, Mar 1, 2021 at 14:33
Hard to say for certain without more logs and so forth. But 4009 means that the MySQL Server has lost contact with ALL data nodes. This can happen even if some data nodes are still up, since each MySQL Server has a separate connection to each data node, and each connection is monitored with heartbeats. A general overload could cause it, or the MySQL Server VM not communicating for a while. Looking into the cluster log might provide more details.
– Mikael Ronström, Mar 2, 2021 at 22:58
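When going through the MySQL error log for incidents like this, it can help to pull out the NDB error code and the MySQL error codes from each slave error, so that the 4009 case (contact lost with all data nodes) is easy to spot. Below is a minimal sketch in Python, assuming the message layout shown in the question; the function name `summarize_slave_error` and the regular expressions are illustrative and not part of any MySQL tool.

```python
import re

# Assumed layout, taken from the error text in the question:
#   "Got error 4009 'Cluster Failure' from NDB Error_code: 1296"
NDB_ERROR_RE = re.compile(r"Got error (\d+) '([^']*)' from NDB")
MYSQL_CODE_RE = re.compile(r"Error_code: (\d+)")


def summarize_slave_error(text):
    """Extract the NDB error (code, message) and all MySQL error codes from text."""
    ndb = NDB_ERROR_RE.search(text)
    return {
        "ndb_error": (int(ndb.group(1)), ndb.group(2)) if ndb else None,
        "mysql_codes": [int(code) for code in MYSQL_CODE_RE.findall(text)],
    }


sample = (
    "Slave: Got error 4009 'Cluster Failure' from NDB Error_code: 1296 "
    "[Warning] Slave: Can't lock file (errno: 157 - Could not connect to "
    "storage engine) Error_code: 1015"
)
summary = summarize_slave_error(sample)
print(summary)
```

If `ndb_error` comes back with code 4009, the next place to look is the cluster log around the same timestamp, as suggested above.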