How to use the cluster configuration file and database data to recover a failed cluster.
| Redis Enterprise Software |
|---|
When a Redis Enterprise Software cluster fails, you must use the cluster configuration file and database data to recover the cluster.
Cluster failure can be caused by:
To recover a cluster and re-create it as it was before the failure, you must restore the cluster configuration file, ccs-redis.rdb, to the cluster nodes.
To recover databases in the new cluster, you must restore the databases from persistence files such as backup files, append-only files (AOF), or RDB snapshots.
These files are stored in the persistent storage location.
The cluster recovery process includes:
/ccs/ccs-redis.rdb on the persistent storage for each node.
(Optional) If you want to recover the cluster to the original cluster nodes, uninstall Redis Enterprise Software from the nodes.
Install Redis Enterprise Software on the new cluster nodes.
The new servers must have the same basic hardware and software configuration as the original servers, including:
Mount the persistent storage drives that hold the recovery files on the new nodes. These drives must contain the cluster configuration backup files and database persistence files.
If you use local persistent storage, place all of the recovery files on each of the cluster nodes.
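Before you run the recovery command, it can help to confirm that the configuration file is actually readable on the node. The following is a minimal sketch, not part of Redis Enterprise Software: `check_recovery_file` is a hypothetical helper name, and it assumes the file sits at `ccs/ccs-redis.rdb` under the persistent storage path you pass in.

```shell
#!/usr/bin/env bash
# Pre-flight check (hypothetical helper, not an rladmin feature):
# confirm that ccs/ccs-redis.rdb exists and is non-empty under the
# persistent storage path given as the first argument.

check_recovery_file() {
  local persist_dir="$1"
  local ccs_file="${persist_dir}/ccs/ccs-redis.rdb"

  # -s is true only when the file exists and has a size greater than zero
  if [ -s "$ccs_file" ]; then
    echo "found: $ccs_file"
    return 0
  fi

  echo "missing or empty: $ccs_file" >&2
  return 1
}
```

For example, `check_recovery_file /tmp/persist` checks `/tmp/persist/ccs/ccs-redis.rdb`. Replace `/tmp/persist` with your own mount point.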
To recover the original cluster configuration, run rladmin cluster recover on the first node in the new cluster:
rladmin cluster recover filename [ <persistent_path> | <ephemeral_path> ]<filename> node_uid <node_uid> rack_id <rack_id>
For example:
rladmin cluster recover filename /tmp/persist/ccs/ccs-redis.rdb node_uid 1 rack_id 5
When the recovery command succeeds, this node is configured as the node from the old cluster that has ID 1.
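If you script this step, one cautious approach is to assemble the command and review it before you run it. The sketch below only prints the command in the form shown above. `recover_cmd` is a hypothetical helper name, and the numeric check on `node_uid` is an assumption based on the example value.

```shell
#!/usr/bin/env bash
# Sketch: build the recover command from its three inputs and print it.
# recover_cmd is a hypothetical helper; it does not call rladmin itself.

recover_cmd() {
  local ccs_file="$1" node_uid="$2" rack_id="$3"

  # Reject an empty or non-numeric node_uid (assumption: node IDs are integers)
  case "$node_uid" in
    ''|*[!0-9]*)
      echo "node_uid must be numeric: '$node_uid'" >&2
      return 1
      ;;
  esac

  echo "rladmin cluster recover filename $ccs_file node_uid $node_uid rack_id $rack_id"
}
```

Calling `recover_cmd /tmp/persist/ccs/ccs-redis.rdb 1 5` prints the example command from this page, which you can then run on the first node.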
To join the remaining servers to the new cluster, run rladmin cluster join from each new node:
rladmin cluster join nodes <cluster_member_ip_address> username <username> password <password> replace_node <node_id>
For example:
rladmin cluster join nodes 10.142.0.4 username [email protected] password mysecret replace_node 2
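When several nodes need to rejoin, it can be useful to write down which command runs on which server before you start. The sketch below prints one join command per remaining node. `join_plan` and the `host:node_id` argument format are hypothetical conventions for this example, hosts are assumed to be IPv4 addresses or hostnames without colons, and the password is left as a `<password>` placeholder so that it is not echoed to the terminal.

```shell
#!/usr/bin/env bash
# Sketch: print the join command to run on each remaining node.
# join_plan is a hypothetical helper; it does not call rladmin itself.
# Arguments: <cluster_member_ip_address> <username> <host:node_id>...

join_plan() {
  local member_ip="$1" username="$2"
  shift 2

  local pair host node_id
  for pair in "$@"; do
    host="${pair%%:*}"      # text before the first colon
    node_id="${pair##*:}"   # text after the last colon
    echo "on ${host}: rladmin cluster join nodes ${member_ip} username ${username} password <password> replace_node ${node_id}"
  done
}
```

For example, `join_plan 10.142.0.4 '<username>' 10.142.0.5:2 10.142.0.6:3` lists the commands for the nodes that replace node 2 and node 3. The addresses `10.142.0.5` and `10.142.0.6` are made up for illustration.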
Run rladmin status to verify the recovered nodes are now active and the databases are pending recovery:
rladmin status
After the cluster is recovered, you must recover the databases.