Commit 17958322 authored by Serge Petrenko, committed by Kirill Yukhin

replication: add is_orphan field to ballot

A successfully fetched remote instance ballot isn't updated during the
bootstrap procedure. As a result, different instances may choose
different masters as their bootstrap leaders.

Imagine the following situation.
You start instance A without replication set up. Instance A successfully
bootstraps.
You also have instances B and C, both with replication set to {A, B,
C} and replication_connect_quorum set to 3.
You first start instance B. It doesn't proceed to choosing a leader
until one of two events happens: either replication_connect_timeout
runs out, or instance C is up and starts listening on its port.
B has established connection to A and fetched its ballot, with some
vclock, say, {1: 1}.
B retries connection to C every replication_timeout seconds.
Then you start instance C. Instance C succeeds in connecting to A and B
right away and bootstraps from instance A. Instance A registers C in its
_cluster table. This registration is replicated to instance C.
Meanwhile, instance C is trying to sync with a quorum of instances
(which is 3), and stays in orphan mode.
Now replication_timeout on instance B finally runs out. It retries a
previously unsuccessful connection to C and succeeds. C sends its ballot
to B with vclock = {1: 2, 2: 0} (in our example), since it has already
incremented it after _cluster registration.
B sees that C has a greater vclock than A, and chooses to bootstrap from
C instead of A. C is orphan and rejects B's attempt to join. B dies.
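
The faulty choice made by B can be illustrated with a minimal sketch. It
is not the actual Tarantool code: the function names are hypothetical, and
"greater vclock" is simplified to comparing the sum of vclock components.

```python
# Illustrative model only; names and the comparison rule are assumptions.

def signature(vclock):
    """Sum of all vclock components (a simplified notion of 'greater')."""
    return sum(vclock.values())


def choose_leader_by_vclock(ballots):
    """Pick the instance whose ballot carries the most advanced vclock."""
    return max(ballots, key=lambda name: signature(ballots[name]))


# Ballots as instance B sees them in the scenario above.
ballots_seen_by_b = {
    "A": {1: 1},        # fetched early and never refreshed
    "C": {1: 2, 2: 0},  # fetched after C registered in _cluster
}

leader = choose_leader_by_vclock(ballots_seen_by_b)
print(leader)
```

With only the vclock taken into account, B selects C, even though C is
still an orphan and cannot accept the join request.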

To fix such ungentlemanlike behaviour of C, we should at least include
loading status in ballot and prefer fully bootstrapped instances to the
ones still syncing with other replicas.
We also need to use a separate flag instead of the ballot's existing
is_ro, since we still want to prefer loading instances over the ones
explicitly configured to be read-only.
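
One way to picture the intended preference order is the sketch below. It
is an assumption about how the ordering could be encoded, not the code
from this commit: the Ballot class, rank() and choose_leader() are
hypothetical, and is_ro here stands for an explicit read-only
configuration only.

```python
# Illustrative model only; the ranking scheme is an assumption.
from dataclasses import dataclass


@dataclass
class Ballot:
    name: str
    vclock: dict
    is_ro: bool = False      # explicitly configured read-only (assumed meaning)
    is_orphan: bool = False  # still syncing with other replicas


def signature(vclock):
    """Sum of all vclock components (a simplified notion of 'greater')."""
    return sum(vclock.values())


def rank(ballot):
    """Lower is better: bootstrapped, then orphan, then read-only."""
    if ballot.is_orphan:
        return 1
    if ballot.is_ro:
        return 2
    return 0


def choose_leader(ballots):
    """Prefer the best rank; break ties by the most advanced vclock."""
    return min(ballots, key=lambda b: (rank(b), -signature(b.vclock)))


a = Ballot("A", {1: 1})
c = Ballot("C", {1: 2, 2: 0}, is_orphan=True)
d = Ballot("D", {1: 2}, is_ro=True)  # hypothetical read-only instance

print(choose_leader([a, c]).name)
print(choose_leader([c, d]).name)
```

In this model B picks A over C despite C's greater vclock, while an
orphan is still preferred to an instance that is read-only by
configuration.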

Closes #4527

(cherry picked from commit dc1e4009)
parent e321e679