backport replicaset expel

Georgy Moshkin requested to merge gmoshkin/backport-expel-replicaset into 24.6

Summary

Backport of !1325 (merged) and !1303 (merged) to 24.6.

  • chore: yet another compilation warning introduced by someone else

  • test: fix test_migration_lock flakiness (use alter system)

  • test: fix tests

  • fix: set auto_offline_timeout to 30 seconds by default

  • test: fix test_join_with_duplicate_instance_id

  • fix: expel is blocked until buckets are rebalanced from last instance

  • fix: don't update current_config_version for offline replicasets

  • chore: fix compilation warning

  • test: cleanup TODOs in test_replication.py

  • test: make test_vshard_updates_on_master_change less flaky

  • fix: don't update current_master_name until it synchronizes

This patch changes when we update current_master_name during consistent master switchover. Now it is updated as soon as we determine that the new master has synchronized its vclock with the old master.

As a result we can now block expel of a non-last replica in cases where other replicas are offline, which is crucial because instances can easily become offline due to temporary network issues.
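
For illustration only, the "synchronized its vclock" condition can be sketched as below. This is a simplified model and not the code from this patch: the `Vclock` alias and the helpers `has_caught_up` and `next_current_master` are hypothetical names, and a vclock is assumed to be a plain map from replica id to the last LSN seen from that replica.

```rust
use std::collections::HashMap;

/// Simplified vclock (assumption): replica id -> last LSN seen from that replica.
type Vclock = HashMap<u32, u64>;

/// Returns true if `candidate` has seen at least everything `old_master` has,
/// i.e. every component of the old master's vclock is matched or exceeded.
/// A component missing from `candidate` is treated as LSN 0.
fn has_caught_up(candidate: &Vclock, old_master: &Vclock) -> bool {
    old_master
        .iter()
        .all(|(id, lsn)| candidate.get(id).copied().unwrap_or(0) >= *lsn)
}

/// Hypothetical helper: keep reporting the old master's name until the
/// target master has caught up, then switch to the target's name.
fn next_current_master<'a>(
    current: &'a str,
    target: &'a str,
    candidate: &Vclock,
    old_master: &Vclock,
) -> &'a str {
    if has_caught_up(candidate, old_master) {
        target
    } else {
        current
    }
}
```

In this model the name changes only once the comparison succeeds, so a reader of current_master_name never sees a master that is still behind the old one.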

  • fix: previous commit broke transferring of raft leadership

  • fix: expel is blocked by master switchover even if replicas are offline

  • fix: change definition of Instance::may_respond

Now we assume an instance may not respond if:

  • target state is Offline (non-graceful assumed)
  • current state is Expelled
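
A minimal sketch of the rule above, assuming simplified stand-in types. The `State` enum and the `Instance` fields here are illustrative and do not mirror the real definitions in the codebase.

```rust
/// Simplified instance states (assumption: the real state type carries more data).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Online,
    Offline,
    Expelled,
}

/// Illustrative stand-in for the real `Instance` struct.
struct Instance {
    current_state: State,
    target_state: State,
}

impl Instance {
    /// Returns false when the instance is assumed to be unable to respond:
    /// either its target state is Offline (a non-graceful shutdown is assumed)
    /// or its current state is Expelled.
    fn may_respond(&self) -> bool {
        let assumed_down = self.target_state == State::Offline;
        let expelled = self.current_state == State::Expelled;
        !(assumed_down || expelled)
    }
}
```

Under this sketch an instance whose current state is still Online but whose target state is already Offline is treated as unresponsive.
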

  • test: add a failing test for bucket rebalancing before instance expel

  • test: don't panic libtestplug.so in case of unauthorized RPC

  • refactor: cleanup code in resolve_rpc_target

  • Update version tag for 24.6

  • release: 24.6.1

  • git: freeze submodules for 24.6

  • ci: feature freeze 24.6 branch