I have a master (m) and slave (s1) setup running MySQL 5.1.45.
When I try to add a second slave (s2), it lags behind and never catches up.
Even after I synced s2 with the whole system offline (Seconds_Behind_Master = 0), s2 falls out of sync again within a few hours.
The strange thing is that s1 is always in sync.
Any ideas?
SHOW SLAVE STATUS \G (on slave2)
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: xxx.xxx.xxx.xxx
Master_User: xxxx_xxxx5
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.013165
Read_Master_Log_Pos: 208002803
Relay_Log_File: xxxxxxxxxx-relay-bin.000100
Relay_Log_Pos: 1052731555
Relay_Master_Log_File: mysql-bin.013124
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB: xxxxxxxxx
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1052731410
Relay_Log_Space: 44233859505
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 69594
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
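A rough way to read the status output above: the I/O thread has already fetched up to mysql-bin.013165, while the SQL thread is still executing events from mysql-bin.013124, so the backlog sits in the relay logs on s2 rather than on the network. The sketch below only does that arithmetic on the values shown; the function names are hypothetical helpers, not part of any MySQL tool.

```python
def parse_slave_status(text):
    """Parse vertical SHOW SLAVE STATUS output into a dict of field -> value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Skip blank lines and the "**** 1. row ****" separator.
        if sep and not key.startswith("*"):
            fields[key.strip()] = value.strip()
    return fields


def binlog_index(name):
    """Return the numeric suffix of a binlog name such as mysql-bin.013165."""
    return int(name.rsplit(".", 1)[1])


def lag_summary(fields):
    """Return (binlog files the SQL thread is behind the I/O thread, hours behind)."""
    files_behind = (binlog_index(fields["Master_Log_File"])
                    - binlog_index(fields["Relay_Master_Log_File"]))
    hours_behind = int(fields["Seconds_Behind_Master"]) / 3600
    return files_behind, hours_behind


# Values copied from the status output above.
sample = """
Master_Log_File: mysql-bin.013165
Relay_Master_Log_File: mysql-bin.013124
Seconds_Behind_Master: 69594
"""

files_behind, hours_behind = lag_summary(parse_slave_status(sample))
print(files_behind, round(hours_behind, 1))  # 41 19.3
```

So s2 is roughly 41 binlog files and about 19 hours behind in applying events, even though it is receiving them on time.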
iperf results between servers:
M -> s2
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-10.0 sec 502 MBytes 420 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.05 GBytes 902 Mbits/sec
M -> s1
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 637 MBytes 534 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-10.0 sec 925 MBytes 775 Mbits/sec
vmstat for s2
vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 268 126568 199100 22692944 0 0 100 836 8 81 1 0 96 3
vmstat 2 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 268 1150144 197128 21670808 0 0 100 835 9 81 1 0 96 3 0
0 0 268 1144464 197160 21674940 0 0 644 3096 1328 1602 0 0 97 2 0
0 2 268 1140680 197176 21679624 0 0 846 5362 1002 1567 0 0 98 2 0
0 1 268 1135332 197192 21685040 0 0 960 3348 850 1193 0 0 98 1 0
0 0 268 1130776 197204 21688752 0 0 576 2894 978 1232 0 0 98 2 0
0 0 268 1127060 197264 21693556 0 0 586 5202 1075 1505 0 0 97 3 0
0 0 268 1122184 197272 21698412 0 0 896 1160 614 727 0 0 98 1 0
0 0 268 1118532 197300 21702780 0 0 586 5070 1279 1708 0 0 93 6 0
0 0 268 1114000 197324 21705820 0 0 402 1522 947 942 0 0 95 4 0
0 0 268 1109708 197336 21710188 0 0 704 9150 1224 2109 0 0 97 2 0
top output on s2
top - 14:44:25 up 16:36, 1 user, load average: 1.62, 1.47, 1.42
Tasks: 140 total, 1 running, 139 sleeping, 0 stopped, 0 zombie
Cpu0 : 2.9%us, 1.1%sy, 0.0%ni, 73.8%id, 21.8%wa, 0.0%hi, 0.4%si, 0.0%st
Cpu1 : 0.8%us, 0.3%sy, 0.0%ni, 95.5%id, 3.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.6%us, 0.3%sy, 0.0%ni, 97.7%id, 1.4%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.5%us, 0.2%sy, 0.0%ni, 98.9%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 24744184k total, 24005508k used, 738676k free, 199136k buffers
Swap: 1050616k total, 268k used, 1050348k free, 22078920k cached
Any ideas?
Is there any chance that the MySQL version is the culprit, in conjunction with the nearly fivefold increase in traffic to the master?
If that is the case, why does s1 stay in sync while s2 does not?
Does anyone know whether 5.6.x solves similar problems?
2 Answers
The answer to this is very straightforward. The two slaves must be sharing the same server_id. I wrote about this two years ago (Screwed up replication by sharing server ids). In that post, I quoted Baron Schwartz's blog post, Pop quiz: how can one slave break another slave.
The quick-and-dirty solution? Change the second slave's server_id. For example, if the master's server_id is 1000 and the first slave's server_id is 1001, go to the second slave and run the following:
mysql> SET GLOBAL server_id = 1002;
This will fix it right then and there.
Then, still on the second slave, set the same server_id in my.cnf so that the change survives a restart:
[mysqld]
server_id = 1002
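To confirm the diagnosis before changing anything, you can run SHOW VARIABLES LIKE 'server_id'; on each server and compare the values. As a minimal sketch of that comparison (the function name is a hypothetical helper, and the clashing value for s2 is an assumed example, not taken from the question):

```python
from collections import defaultdict


def find_server_id_clashes(server_ids):
    """Return {server_id: [hosts]} for every server_id used by more than one host."""
    hosts_by_id = defaultdict(list)
    for host, server_id in server_ids.items():
        hosts_by_id[server_id].append(host)
    return {sid: sorted(hosts) for sid, hosts in hosts_by_id.items() if len(hosts) > 1}


# Assumed example: s2 was cloned from s1 and kept s1's server_id.
before = {"m": 1000, "s1": 1001, "s2": 1001}
# After giving s2 its own id, as suggested above.
after = {"m": 1000, "s1": 1001, "s2": 1002}

print(find_server_id_clashes(before))  # {1001: ['s1', 's2']}
print(find_server_id_clashes(after))   # {}
```

Any non-empty result means two servers in the topology share an id, which is the situation described above.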
Give it a try!
Fixing this properly may require a look at the system itself, but you can try the following procedure.
First, bring s2 back in sync with the master, taking a full dump if possible. Once that is done and Seconds_Behind_Master is 0, run this command on the master:
SHOW MASTER STATUS;
Make a note of the log position. Now let the slave fall out of sync and let Seconds_Behind_Master grow. Once you are satisfied that it is out of sync, issue the following commands on the slave:
STOP SLAVE;
RESET SLAVE;
FLUSH LOGS;
SET GLOBAL innodb_file_per_table = 1;
Then:
CHANGE MASTER TO MASTER_USER = 'slave1';
START SLAVE;
Adjust the command above with your own user and password as needed.
This may sort it out. If not, I would be happy to look into it if you wish.
Thanks.
Masood
-
What is the point of running the above statements? This is a 50 GB database, and such tests cost a whole night's work, plus taking the system offline for fast backups and full network bandwidth. Please explain the point behind these statements. Thank you. – sonaht, Mar 31, 2013 at 21:14