limbo: assign LSN to entries right after WAL write
An LSN used to be assigned to a limbo entry either before a WAL write, when the txn came from an applier, or one event loop iteration after the WAL write for the instance's own transactions. The latter meant that the WAL write callback only did fiber_wakeup() on a fiber which was supposed to assign the LSN later.

That made possible a bug when a remote PROMOTE was received during a local txn WAL write. They were written to WAL in that order, but the PROMOTE WAL write was handled *before* the txn's WAL write. That led to a crash, because PROMOTE processing was not prepared for the limbo having entries without an LSN.

The reason it happened is that finished WAL batches, even when sent to WAL far apart in time, can still be processed in the TX thread all at once, without yields. They keep stacking up in an inter-thread queue until the TX thread picks them up. If the TX thread is slow enough, the WAL batches form bigger "batches of batches" in this WAL->TX queue. When that happened for a txn + PROMOTE, the txn's WAL write trigger only called fiber_wakeup() without assigning the LSN. The PROMOTE WAL write trigger was then applied right away without yields, did not meet its assumptions, and crashed.

The patch makes the LSN be assigned to a limbo entry right in the WAL write trigger. The cost of this is storing the limbo entry as a member of struct txn.

The patch only fixes the issue for a PROMOTE covering the older transaction. The non-covering case still fails and is the subject of another commit.

A side effect which allowed to make the patch a bit simpler: an LSN is now assigned to all limbo entries, even the non-synchro ones.

The alternative solution was to create an on-WAL-write trigger for each synchro transaction and store the limbo entry in its data field. But that is more complicated. I decided it is time to add the entry to the txn.
For non-synchro transactions it shouldn't add any cost, because the limbo entry is only touched for newly allocated txns (which eventually stop being allocated and are only reused from txn_cache), for synchro transactions on their normal path, and for all transactions on the failure path.

Another alternative solution was to make the limbo's latch a read-write latch. "Readers" would be new transactions until they finish their WAL write; "writers" would be PROMOTE, DEMOTE, and probably ROLLBACK. That way a PROMOTE wouldn't start until all limbo entries have LSNs. But that looks like overkill, at least for this issue.

Part of #6842

NO_DOC=Bugfix
NO_CHANGELOG=To be added later
- src/box/box.cc: 11 additions, 0 deletions
- src/box/txn.c: 33 additions, 69 deletions
- src/box/txn.h: 4 additions, 0 deletions
- src/box/txn_limbo.c: 11 additions, 4 deletions
- src/lib/core/errinj.h: 1 addition, 0 deletions
- test/box/errinj.result: 1 addition, 0 deletions
- test/replication-luatest/gh_6842_qsync_applier_order_test.lua: 112 additions, 0 deletions