replication: stop syncing if quorum cannot be formed
If box.cfg() successfully connects to enough replicas to form a quorum (>= box.cfg.replication_connect_quorum), it does not return until it has synced with all of them (lag <= box.cfg.replication_sync_lag). If one of the replicas forming the quorum disconnects permanently while the sync is in progress, box.cfg() hangs forever. This behavior is unreasonable: syncing with a quorum is best-effort, after all. It is much more sensible in this case to return from box.cfg() and leave the instance in 'orphan' mode. This patch does exactly that: if we detect while syncing that too few replicas are connected to form a quorum, we stop syncing immediately.
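The decision described above can be illustrated with a minimal sketch. It is a simplified model, not the code from src/box/replication.cc: the names `SyncDecision`, `ReplicasetView` and `check_sync` are hypothetical, and the model assumes the instance only tracks how many replicas are connected and how many of those are already within the allowed lag.

```cpp
// Hypothetical outcome of a single sync check (illustrative only,
// not Tarantool's actual types).
enum class SyncDecision {
    KeepSyncing,   // quorum is still reachable, keep waiting
    Synced,        // enough replicas are within the allowed lag
    StopAsOrphan,  // quorum cannot be formed, give up and stay orphan
};

// Assumed bookkeeping: counts of replicas in each state.
struct ReplicasetView {
    int connected;  // replicas currently connected
    int synced;     // connected replicas with lag <= replication_sync_lag
};

// Decide what the syncing loop should do next.
// `quorum` stands in for box.cfg.replication_connect_quorum.
constexpr SyncDecision check_sync(const ReplicasetView &rs, int quorum)
{
    // Too few replicas are connected: waiting longer cannot help,
    // so stop syncing immediately instead of hanging forever.
    if (rs.connected < quorum)
        return SyncDecision::StopAsOrphan;
    // Enough replicas have caught up: syncing is complete.
    if (rs.synced >= quorum)
        return SyncDecision::Synced;
    // Quorum is connected but not yet synced: keep waiting.
    return SyncDecision::KeepSyncing;
}
```

In this model, a caller would re-run `check_sync` whenever a replica connects, disconnects, or changes its lag, and would return from the configuration call as soon as the result is anything other than `KeepSyncing`.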
Showing
- src/box/replication.cc (61 additions, 31 deletions)
- src/box/replication.h (20 additions, 10 deletions)
- test/replication/errinj.result (4 additions, 8 deletions)
- test/replication/errinj.test.lua (4 additions, 5 deletions)
- test/replication/replica_ack.lua (0 additions, 11 deletions)