replication: fix extraneous split-brain alerting
The current split-brain detector implementation raises an error each time a CONFIRM or ROLLBACK entry is received from the previous synchronous transaction queue owner. It assumes that the new queue owner must have witnessed all the previous CONFIRMs. Besides, according to Raft, a ROLLBACK should never happen.

In fact, there is a case when a CONFIRM from an old term is legal: during a leader transition the old leader may write a CONFIRM for the same transaction that is confirmed by the new leader's PROMOTE. If the PROMOTE and CONFIRM lsns match, there is nothing wrong with such a situation.

Symmetrically, when an old leader issues a ROLLBACK with the lsn right after the new leader's PROMOTE lsn, it is not a split-brain.

Allow such cases by tracking the last confirmed lsn for each synchronous transaction queue owner and silently nopifying CONFIRMs with an lsn less than the one recorded and ROLLBACKs with an lsn greater than that.

Closes #9138

NO_DOC=bugfix
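The relaxed check described above can be sketched roughly as follows. This is an illustration only, not the code from `src/box/applier.cc`: the names `SynchroType`, `SynchroRequest`, `Verdict` and `classify_stale_request` are hypothetical, and `confirmed_lsn` stands for the last confirmed lsn recorded for the old queue owner. The sketch also assumes that a CONFIRM whose lsn equals the recorded one is benign, following the "PROMOTE and CONFIRM lsns match" case in the message.

```cpp
#include <cstdint>

// Hypothetical request kinds; the real code works with IPROTO request types.
enum class SynchroType { Confirm, Rollback };

// Outcome of checking a request that came from a previous queue owner.
enum class Verdict { Nopify, SplitBrain };

struct SynchroRequest {
    SynchroType type;
    std::int64_t lsn;
};

// Classify a CONFIRM or ROLLBACK received from a previous synchronous
// transaction queue owner. confirmed_lsn is the last confirmed lsn
// recorded for that owner (an assumed bookkeeping value).
Verdict classify_stale_request(const SynchroRequest &req,
                               std::int64_t confirmed_lsn)
{
    // A stale CONFIRM covering only already confirmed transactions is
    // harmless. Treating an equal lsn as harmless is an assumption here.
    if (req.type == SynchroType::Confirm && req.lsn <= confirmed_lsn)
        return Verdict::Nopify;
    // A stale ROLLBACK that starts after the confirmed lsn cannot undo
    // anything the new owner has confirmed.
    if (req.type == SynchroType::Rollback && req.lsn > confirmed_lsn)
        return Verdict::Nopify;
    // Anything else contradicts the new owner's view of the queue.
    return Verdict::SplitBrain;
}
```

Under these assumptions, with a recorded lsn of 10, a CONFIRM for lsn 10 and a ROLLBACK for lsn 11 are both turned into no-ops, while a CONFIRM for lsn 11 or a ROLLBACK for lsn 10 is still reported as a split-brain.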
Changed files:
- changelogs/unreleased/gh-9138-relax-split-brain-check.md: 6 additions, 0 deletions
- src/box/applier.cc: 36 additions, 8 deletions
- src/box/txn_limbo.h: 10 additions, 0 deletions
- test/replication-luatest/gh_5295_split_brain_test.lua: 152 additions, 6 deletions