replication: add is_orphan field to ballot
A successfully fetched remote instance ballot isn't updated during the bootstrap procedure. This leads to a case where different instances choose different masters as their bootstrap leaders.

Consider the following scenario. You start instance A without replication set up. Instance A successfully bootstraps. You also have instances B and C, both with replication set up to {A, B, C} and replication_connect_quorum set to 3.

You first start instance B. It doesn't proceed to choosing a leader until one of two events happens: either replication_connect_timeout runs out, or instance C is up and starts listening on its port. B has established a connection to A and fetched its ballot, with some vclock, say, {1: 1}. B retries the connection to C every replication_timeout seconds.

Then you start instance C. Instance C succeeds in connecting to A and B right away and bootstraps from instance A. Instance A registers C in its _cluster table. This registration is replicated to instance C. Meanwhile, instance C is trying to sync with a quorum of instances (which is 3), and stays in orphan mode.

Now replication_timeout on instance B finally runs out. It retries the previously unsuccessful connection to C and succeeds. C sends its ballot to B with vclock = {1: 2, 2: 0} (in our example), since it has already incremented it after the _cluster registration. B sees that C has a greater vclock than A, and chooses to bootstrap from C instead of A. C is an orphan and rejects B's attempt to join. B dies.

To fix such ungentlemanlike behaviour of C, we should at least include the loading status in the ballot and prefer fully bootstrapped instances to the ones still syncing with other replicas. We also need to use a separate flag instead of the ballot's existing is_ro, since we still want to prefer loading instances over the ones explicitly configured to be read-only.

Closes #4527

(cherry picked from commit dc1e4009)
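The preference order described above can be illustrated with a minimal, self-contained C sketch. It is not the code from this commit: `struct ballot_sketch`, `ballot_is_better()` and `choose_bootstrap_leader()` are hypothetical names, and the vclock is reduced to a single number (`vclock_sum`) for brevity, whereas a real vclock has one component per instance.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ballot; not Tarantool's actual struct ballot. */
struct ballot_sketch {
	/* Instance is explicitly configured to be read-only. */
	bool is_ro;
	/* Instance is still syncing with other replicas. */
	bool is_orphan;
	/* Assumed scalar stand-in for the instance vclock. */
	int64_t vclock_sum;
};

/*
 * Return true if candidate `a` is a strictly better bootstrap
 * leader than candidate `b`.
 */
static bool
ballot_is_better(const struct ballot_sketch *a,
		 const struct ballot_sketch *b)
{
	/* Explicitly read-only instances are the least preferred. */
	if (a->is_ro != b->is_ro)
		return !a->is_ro;
	/* Prefer fully bootstrapped instances to those still syncing. */
	if (a->is_orphan != b->is_orphan)
		return !a->is_orphan;
	/* Otherwise the more advanced vclock wins. */
	return a->vclock_sum > b->vclock_sum;
}

/* Return the index of the preferred ballot, or -1 if count is 0. */
static int
choose_bootstrap_leader(const struct ballot_sketch *ballots, size_t count)
{
	int best = -1;
	for (size_t i = 0; i < count; i++) {
		if (best < 0 ||
		    ballot_is_better(&ballots[i], &ballots[best]))
			best = (int)i;
	}
	return best;
}
```

In the scenario from the commit message, A would be {is_ro = false, is_orphan = false, vclock_sum = 1} and C would be {is_ro = false, is_orphan = true, vclock_sum = 2}. With this ordering B picks A despite C's greater vclock. An orphan that is not configured read-only still ranks above an explicitly read-only instance, which is why a separate flag is needed.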
Changed files:
- src/box/box.cc: 8 additions, 0 deletions
- src/box/iproto_constants.h: 1 addition, 0 deletions
- src/box/replication.cc: 8 additions, 2 deletions
- src/box/xrow.c: 11 additions, 2 deletions
- src/box/xrow.h: 5 additions, 0 deletions
- test/replication/bootstrap_leader.result: 61 additions, 0 deletions
- test/replication/bootstrap_leader.test.lua: 27 additions, 0 deletions
- test/replication/replica_uuid_rw.lua: 26 additions, 0 deletions
- test/replication/replica_uuid_rw1.lua: 1 addition, 0 deletions
- test/replication/replica_uuid_rw2.lua: 1 addition, 0 deletions
- test/replication/replica_uuid_rw3.lua: 1 addition, 0 deletions