replication: do not wait for all masters on recovery
If one cluster node is permanently down for some reason, no other node can restart: they all stall in box.cfg{} until every other node is up and running. This complicates Tarantool cluster deployment in real-world scenarios.

To address this issue, complete the configuration as soon as connections have been established to the number of hosts specified by the new configuration option, box.cfg.replication_quorum, assuming the rest will connect asynchronously. If the option is unset, it defaults to the number of entries in box.cfg.replication, so this patch should not affect the behavior of existing setups.

Closes #2958
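Below is a minimal configuration sketch illustrating the new option; the URIs, user, and port are hypothetical, not taken from the patch:

    -- Each node lists all cluster members in box.cfg.replication.
    -- With replication_quorum = 2, box.cfg{} returns as soon as two
    -- of the three listed masters are connected; the remaining node
    -- is allowed to join later, asynchronously.
    box.cfg{
        listen = 3301,
        replication = {
            'replicator:password@host1:3301',
            'replicator:password@host2:3301',
            'replicator:password@host3:3301',
        },
        replication_quorum = 2,
    }

If replication_quorum is left unset, it defaults to the number of entries in box.cfg.replication (3 here), reproducing the old wait-for-all-masters behavior.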
Showing 13 changed files with 455 additions and 31 deletions
- src/box/box.cc: 41 additions, 6 deletions
- src/box/box.h: 1 addition, 0 deletions
- src/box/lua/cfg.cc: 12 additions, 0 deletions
- src/box/lua/load_cfg.lua: 4 additions, 0 deletions
- src/box/replication.cc: 105 additions, 10 deletions
- src/box/replication.h: 32 additions, 3 deletions
- src/cfg.c: 9 additions, 0 deletions
- src/cfg.h: 3 additions, 0 deletions
- test/replication/autobootstrap.lua: 2 additions, 5 deletions
- test/replication/autobootstrap_guest.lua: 4 additions, 7 deletions
- test/replication/quorum.lua: 21 additions, 0 deletions
- test/replication/quorum.result: 152 additions, 0 deletions
- test/replication/quorum.test.lua: 69 additions, 0 deletions