txn_limbo: introduce cascading rollback
Cascading rollback is a state in which existing transactions are being rolled back right now, and newer transactions can't be committed either. This preserves the 'reversed rollback order' rule. The WAL writer can enter this state when something goes wrong with writing to disk. The limbo didn't have that feature until now.

Consider an example of why the limbo should be able to turn on cascading rollback. Without it, a transaction can appear to be rolled back, yet after a restart it is committed and visible.

The scenario:

* Master writes a sync transaction to WAL with LSN1;
* It starts waiting for ACKs;
* No ACKs arrive before the timeout, so it starts writing the command ROLLBACK(LSN1) to WAL, to roll back everything with LSN >= LSN1 but < the LSN of the ROLLBACK record itself;
* Another fiber starts a new transaction while ROLLBACK is in progress;
* The limbo is not empty, so the new transaction is added there. Then it also starts writing itself to WAL;
* ROLLBACK finishes its WAL write. It rolls back all the transactions in the limbo, including the latest one, to conform with the 'reversed rollback order' rule;
* The latest transaction finishes its WAL write with LSN2 and sees that it was already rolled back by the limbo.

All seems fine, but what actually happened is that ROLLBACK(LSN1) was written to WAL *before* the latest transaction with LSN2. LSN2 is therefore greater than the LSN of the ROLLBACK record and falls outside the range that the record rolls back. On restart, ROLLBACK(LSN1) is replayed first and the LSN2 transaction is replayed second, so it is committed successfully and becomes visible. In summary: the transaction's rollback is undone by the instance restart.

The expected behaviour is that while ROLLBACK is in progress, newer transactions should not even try going to WAL. They should be rolled back immediately.

The patch implements cascading rollback for the limbo.

Closes #5140
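The expected behaviour can be illustrated with a minimal sketch. It is not the code from this patch: `struct toy_limbo`, its `is_in_rollback` flag, and the `toy_limbo_*` functions are hypothetical names made up for illustration, and WAL writes are reduced to explicit "begin" and "end" calls. It assumes the limbo keeps a flag that is set while the ROLLBACK record is being written and that new transactions check the flag before being queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_LIMBO_MAX = 16 };

/* Hypothetical, simplified limbo: a queue of pending sync transaction ids. */
struct toy_limbo {
	int64_t queue[TOY_LIMBO_MAX];
	int count;
	/* Set while a ROLLBACK record is being written to WAL. */
	bool is_in_rollback;
};

/*
 * Try to queue a new transaction. Returns false when the transaction
 * must be rolled back immediately instead of going to WAL.
 */
bool
toy_limbo_append(struct toy_limbo *limbo, int64_t txn_id)
{
	assert(limbo != NULL);
	if (limbo->is_in_rollback || limbo->count == TOY_LIMBO_MAX)
		return false;
	limbo->queue[limbo->count++] = txn_id;
	return true;
}

/* The ROLLBACK record starts its WAL write: enter cascading rollback. */
void
toy_limbo_rollback_begin(struct toy_limbo *limbo)
{
	assert(limbo != NULL);
	limbo->is_in_rollback = true;
}

/*
 * The ROLLBACK record is on disk: roll back the queued transactions from
 * newest to oldest, store their ids in 'order', and leave the state.
 * Returns the number of rolled back transactions.
 */
int
toy_limbo_rollback_end(struct toy_limbo *limbo, int64_t *order)
{
	assert(limbo != NULL && order != NULL);
	int n = 0;
	while (limbo->count > 0)
		order[n++] = limbo->queue[--limbo->count];
	limbo->is_in_rollback = false;
	return n;
}
```

In this sketch a transaction that arrives between `toy_limbo_rollback_begin()` and `toy_limbo_rollback_end()` is refused before it is queued, so it never reaches WAL and cannot be replayed after the ROLLBACK record on restart.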
Changed files:
- src/box/txn_limbo.c: 24 additions, 1 deletion
- src/box/txn_limbo.h: 17 additions, 0 deletions
- src/box/wal.c: 6 additions, 0 deletions
- src/lib/core/errinj.h: 1 addition, 0 deletions
- test/box/errinj.result: 1 addition, 0 deletions
- test/replication/gh-5140-qsync-casc-rollback.result: 224 additions, 0 deletions
- test/replication/gh-5140-qsync-casc-rollback.test.lua: 105 additions, 0 deletions
- test/replication/suite.ini: 1 addition, 1 deletion