I am trying to run a MySQL replication slave in a docker container. We are running MySQL 5.7.24-27-log in production, from the Percona repository (Ubuntu 18.04). I have used xtrabackup to back up, prepare and ship a starting data set for replication, then I started the percona docker image (docker pull percona) like so:
$ docker run --name mysql-replication -v /replication/data:/var/lib/mysql -v /replication/docker.cnf:/etc/mysql/docker.cnf:ro -e MYSQL_ROOT_PASSWORD=xxxx -P -d percona
My docker.cnf simply sets the server-id (the rest I copied from the percona image):
[mysqld]
skip-host-cache
skip-name-resolve
bind-address = 0.0.0.0
server-id = 4
After then using CHANGE MASTER etc. I have the replication running just fine.
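For reference, the statement was along these lines - the host, user, password and binlog coordinates (taken from the xtrabackup_binlog_info file that xtrabackup writes) are placeholders here:
CHANGE MASTER TO
    MASTER_HOST='master.example.com',
    MASTER_USER='repl',
    MASTER_PASSWORD='xxxx',
    MASTER_LOG_FILE='mysql-bin.000600',
    MASTER_LOG_POS=4;
START SLAVE;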
My intention (as per the volume mount -v /replication/data:/var/lib/mysql) is to keep all of the MySQL data on the host machine, and treat the replication docker container as ephemeral, i.e. no state held in the container. It should also be easy to start up another replication container should I need one, by stopping the existing container, copying the data elsewhere, changing the server-id and running a new container.
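For illustration, that would look roughly like this (the paths, the new container name and the new server-id are just examples):
$ docker stop mysql-replication
$ mkdir -p /replication2
$ cp -a /replication/data /replication2/data
$ cp /replication/docker.cnf /replication2/docker.cnf   # then edit server-id, e.g. to 5
$ docker start mysql-replication
$ docker run --name mysql-replication-2 -v /replication2/data:/var/lib/mysql -v /replication2/docker.cnf:/etc/mysql/docker.cnf:ro -e MYSQL_ROOT_PASSWORD=xxxx -P -d percona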
To test this, after it was set up and running properly (I watched Seconds_Behind_Master drop down to 0), I figured I should be able to delete the container and recreate it, and the replication would still work fine. I therefore tried this:
$ docker stop mysql-replication
$ docker rm mysql-replication
$ docker run ... // same command as before
When I do this and connect to MySQL running in the container I find that Slave_IO_Running is No, and after starting it (START SLAVE;) I get the following (as seen in SHOW SLAVE STATUS;):
Last_Error: Could not execute Update_rows event on table databasename.tablename; Can't find record in 'tablename', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000681, end_log_pos 9952
(databasename and tablename are real database and table names)
At first I thought that I had probably mucked something up, but I have tried this a number of times now to try and solve the problem. Using docker diff mysql-replication shows no changes to the running container that seem to be of significance:
$ docker diff mysql-replication
C /run
C /run/mysqld
A /run/mysqld/mysqld.pid
C /var
C /var/log
A /var/log/mysql
Googling has suggested that I need to use RESET SLAVE; and START SLAVE; but this doesn't seem to resolve it - it's as if the data (outside the container) is no longer in sync with the master and replication therefore cannot continue.
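The suggested sequence amounts to something like this (STOP SLAVE; first, since RESET SLAVE; will not run while the replication threads are active):
STOP SLAVE;
RESET SLAVE;
START SLAVE;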
Can anyone pick holes in what I'm doing please?
Thanks so much.
1 Answer
The root cause of this issue was the absence of the relay-log option in the mysql.cnf file (or in this case, due to the docker volume mounts, the docker.cnf file). This led to the creation and use of relay log files such as 89726507f176-relay-bin.000002, where 89726507f176 is the host name of the container (randomly assigned by the docker daemon when the container is created). When the container was stopped, removed and recreated, a new set of files was created and used (e.g. be0c801d95bc-relay-bin.000407), and this caused the sync issues.
By explicitly specifying a value for relay-log in the docker.cnf file, the container was able to be removed and recreated without problems.
As a side note, I also suggested that there was a problem with the /var/log/mysql directory not being mounted - this is not the case. If, however, you specify a value such as log_bin = /var/log/mysql/mysql-bin.log, then mounting that directory is a requirement. If you do not specify this path, it seems the binary logs are stored in /var/lib/mysql, which is already mounted outside the container.
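In that case the run command needs an extra volume mount, something like this (the /replication/log host path is just an example):
$ docker run --name mysql-replication \
    -v /replication/data:/var/lib/mysql \
    -v /replication/log:/var/log/mysql \
    -v /replication/docker.cnf:/etc/mysql/docker.cnf:ro \
    -e MYSQL_ROOT_PASSWORD=xxxx -P -d percona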
My final docker.cnf file is as follows:
[mysqld]
skip-host-cache
skip-name-resolve
bind-address = 0.0.0.0
binlog-ignore-db = mysql
replicate-ignore-db = mysql
log_bin = /var/log/mysql/mysql-bin.log
relay-log = replication-1
server_id = 1
Note: server_id = 2 on the replication slave.
Also note that without the relay-log option, the command SHOW MASTER STATUS; returned no results on the master database container.
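With log_bin configured as above it returns a row along these lines (the file name and position below are illustrative, not taken from the real server):
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      154 |              | mysql            |                   |
+------------------+----------+--------------+------------------+-------------------+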
There is a possible outstanding issue, which is that by default docker stop asks the container to terminate (by sending a SIGTERM to the docker entrypoint process) and, if it doesn't terminate within 10 seconds, it is forcefully stopped. I need to ensure that MySQL is given sufficient time to shut down, as it could take a little while to sort itself out while under load, possibly resulting in data loss otherwise.
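One way to handle that (a sketch - the 120-second timeout is an arbitrary value) is to give the container a longer grace period, either when stopping it or when it is created:
$ docker stop -t 120 mysql-replication
$ docker run --stop-timeout 120 ... // rest of the run command as before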
To summarise, the contributing issues were:
1. By default the relay log files are named after the container's host name (e.g. Relay_Log_File: ea09183fdcc6-relay-bin.000450), so when I recreate the container I end up with a different relay-bin file due to the new host name.
2. I did not have /var/log/mysql mounted outside of the container, therefore it was lost when the container was removed.
3. The container may be terminated early if it doesn't shut down within 10 seconds, leading to data loss.