We want to set up a Patroni HA cluster for Citus with 3 coordinator nodes, 2 nodes in worker group 1, and 2 nodes in worker group 2.
I get a system ID mismatch when I start the 2nd coordinator. The first one runs fine; the problem only occurs when the 2nd coordinator tries to join the cluster.
I tried deleting the data directory and removing the key from etcd before starting again, but the issue is the same.
Below is my patroni.yml file.
scope: demo
name: ${INSTANCE_NAME}
namespace: /service/
citus:
  group: ${CITUS_GROUP} # 0 for coordinator and 1, 2, 3, etc for workers
  database: citus # must be the same on all nodes
restapi:
  listen: 0.0.0.0:8008
  connect_address: ${INVENTORY_HOSTNAME}:8008
etcd3:
  hosts: ${PATRONI_ETCD_HOSTS}
bootstrap:
  method: initdb
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    master_start_timeout: 300
    synchronous_mode: false
    synchronous_mode_strict: false
    synchronous_node_count: 1
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        max_connections: 1000
        superuser_reserved_connections: 5
        password_encryption: md5
        max_locks_per_transaction: 512
        max_prepared_transactions: 0
        huge_pages: try
        shared_buffers: 512MB
        effective_cache_size: 1024MB
        work_mem: 512MB
        maintenance_work_mem: 256MB
        checkpoint_timeout: 15min
        checkpoint_completion_target: 0.9
        min_wal_size: 2GB
        max_wal_size: 8GB
        wal_buffers: 16MB
        default_statistics_target: 1000
        seq_page_cost: 1
        random_page_cost: 1.1
        effective_io_concurrency: 200
        synchronous_commit: on
        autovacuum: on
        autovacuum_max_workers: 5
        autovacuum_vacuum_scale_factor: 0.01
        autovacuum_analyze_scale_factor: 0.01
        autovacuum_vacuum_cost_limit: 500
        autovacuum_vacuum_cost_delay: 2
        autovacuum_naptime: 1s
        max_files_per_process: 4096
        archive_mode: on
        archive_timeout: 1800s
        archive_command: cd .
        wal_level: logical
        wal_keep_size: 2GB
        max_wal_senders: 10
        max_replication_slots: 10
        hot_standby: on
        wal_log_hints: on
        wal_compression: on
        shared_preload_libraries: pg_stat_statements,auto_explain
        pg_stat_statements.max: 10000
        pg_stat_statements.track: all
        pg_stat_statements.track_utility: false
        pg_stat_statements.save: true
        auto_explain.log_min_duration: 10s
        auto_explain.log_analyze: true
        auto_explain.log_buffers: true
        auto_explain.log_timing: false
        auto_explain.log_triggers: true
        auto_explain.log_verbose: true
        auto_explain.log_nested_statements: true
        auto_explain.sample_rate: 0.01
        track_io_timing: on
        log_lock_waits: on
        log_temp_files: 0
        track_activities: on
        track_activity_query_size: 4096
        track_counts: on
        track_functions: all
        log_checkpoints: on
        logging_collector: on
        log_truncate_on_rotation: on
        log_rotation_age: 1d
        log_rotation_size: 0
        log_line_prefix: '%t [%p-%l] %r %q%u@%d '
        log_filename: postgresql-%a.log
        log_directory: /var/log/postgresql
        hot_standby_feedback: on
        max_standby_streaming_delay: 30s
        wal_receiver_status_interval: 10s
        idle_in_transaction_session_timeout: 10min
        jit: off
        max_worker_processes: 24
        max_parallel_workers: 10
        max_parallel_workers_per_gather: 2
        max_parallel_maintenance_workers: 2
        tcp_keepalives_count: 10
        tcp_keepalives_idle: 300
        tcp_keepalives_interval: 30
        citus.enable_change_data_capture: 'on'
        citus.max_client_connections: 300
    slots:
      demo_slot:
        type: logical
        database: demo
        plugin: pgoutput
  initdb:
    - encoding: UTF8
    - locale: en_US.UTF-8
    - data-checksums
  pg_hba:
    - host replication ${PATRONI_REPLICATION_USERNAME} 127.0.0.1/32 md5
    - host all all 0.0.0.0/0 md5
postgresql:
  listen: 0.0.0.0:5432
  connect_address: ${INVENTORY_HOSTNAME}:5432
  use_unix_socket: true
  data_dir: /data/postgresql
  bin_dir: /usr/lib/postgresql/16/bin
  config_dir: /etc/postgresql/16/main
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: ${PATRONI_REPLICATION_USERNAME}
      password: ${PATRONI_REPLICATION_PASSWORD}
    superuser:
      username: ${PATRONI_SUPERUSER_USERNAME}
      password: ${PATRONI_SUPERUSER_PASSWORD}
  parameters:
    unix_socket_directories: /var/run/postgresql
  remove_data_directory_on_rewind_failure: false
  remove_data_directory_on_diverged_timelines: false
  create_replica_methods:
    - basebackup
  basebackup:
    max-rate: '1000M'
    checkpoint: fast
watchdog:
  mode: automatic
  device: /dev/watchdog
  safety_margin: 5
tags:
  nosync: false
  noloadbalance: false
  nofailover: false
  clonefrom: false
On the 2nd coordinator, I also tried removing method: initdb and its arguments.
- Are you configuring the slave node using initdb? If so, that is not the correct way and you will get a system ID mismatch; you should take a backup from the master node. – manjunath, Jun 3, 2024 at 11:20
- @manjunath No, when setting up the slave node I removed the method and initdb config from patroni.yml. I still get the same error. – jack, Jun 3, 2024 at 11:36
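
One way to narrow this down is to compare the system identifier in the new node's data directory with the one the cluster registered in etcd. The sketch below is only an illustration, not something taken from Patroni itself: the helper names are made up, the sample pg_controldata text is invented, and the key layout /service/demo/0/initialize (namespace, scope, Citus group, then initialize) is my assumption about where Patroni keeps the identifier for the coordinator group.

```python
import re
from typing import Optional


def parse_system_identifier(controldata_output: str) -> Optional[str]:
    """Pull the 'Database system identifier' value out of pg_controldata output."""
    match = re.search(
        r"^Database system identifier:\s*(\d+)\s*$",
        controldata_output,
        re.MULTILINE,
    )
    return match.group(1) if match else None


def initialize_key(namespace: str, scope: str, citus_group: Optional[int] = None) -> str:
    """Build the DCS key assumed to hold the cluster's system identifier.

    Assumption: with a citus section, the group number is inserted
    between the scope and the 'initialize' key.
    """
    parts = [namespace.strip("/"), scope]
    if citus_group is not None:
        parts.append(str(citus_group))
    parts.append("initialize")
    return "/" + "/".join(parts)


def diagnose(local_sysid: Optional[str], dcs_sysid: Optional[str]) -> str:
    """Describe how the local data directory relates to the value stored in the DCS."""
    if dcs_sysid is None:
        return "no initialize key: the cluster has not been bootstrapped yet"
    if local_sysid is None:
        return "no local identifier: the data directory is empty or unreadable"
    if local_sysid == dcs_sysid:
        return "match: the data directory belongs to this cluster"
    return "mismatch: the data directory was initialized separately from this cluster"


# Invented sample text in the shape of pg_controldata output (values are illustrative).
SAMPLE_CONTROLDATA = """\
pg_control version number:            1300
Database system identifier:           7375835498217853952
Database cluster state:               in production
"""

if __name__ == "__main__":
    local = parse_system_identifier(SAMPLE_CONTROLDATA)
    print(initialize_key("/service/", "demo", 0))
    print(diagnose(local, "7375835498217853952"))
```

On a real node, the local value would come from running pg_controldata against /data/postgresql, and the stored value from something like etcdctl get on the key printed above. If the two differ even after the data directory was wiped, it may be worth checking that the directory was really empty when Patroni started and that ${CITUS_GROUP} resolves to 0 on every coordinator.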