Vladislav Shpilevoy authored
The limbo cleared the TXN_WAIT_SYNC and TXN_WAIT_ACK flags for all removed entries, both succeeded and failed. For succeeded entries that is fine. For failed ones it was not.

The reason is that a transaction could be rolled back after a successful WAL write but before its waiting fiber woke up. On wakeup the fiber would not see the TXN_WAIT_SYNC flag and would assert that the transaction signature is >= 0. That did not hold for transactions rolled back for synchro reasons, such as a foreign PROMOTE that does not include the transaction.

The patch makes a failed transaction keep its TXN_WAIT_SYNC flag, so that its owner fiber on wakeup reaches txn_limbo_wait_complete(), notices the bad signature, and follows the rollback path. TXN_WAIT_ACK is still dropped, because otherwise the transaction owner would try to call txn_limbo_ack() for the transaction even if the limbo doesn't belong to the instance anymore.

An alternative solution would be to check the signature value for all transactions even when journal_entry->res is >= 0. But that would slow down the common path even for non-synchro transactions.

Closes #6842

NO_DOC=Bugfix
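The flag handling described above can be illustrated with a small standalone model. This is only a sketch under assumptions, not the actual Tarantool code: `struct txn_model`, `limbo_remove_old()`, `limbo_remove_new()`, `owner_wakeup()`, the flag bit values, and the use of -1 as a rollback signature are all hypothetical names and values chosen for illustration. Only the flag names TXN_WAIT_SYNC and TXN_WAIT_ACK come from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values; the real definitions may differ. */
enum {
	TXN_WAIT_SYNC = 1 << 0,
	TXN_WAIT_ACK = 1 << 1,
};

/* Hypothetical, heavily simplified stand-in for a transaction. */
struct txn_model {
	unsigned flags;
	/* Negative means the transaction was rolled back. */
	int signature;
};

/* What the owner fiber ends up doing after a successful WAL write. */
enum wakeup_path {
	PATH_COMMIT,
	PATH_ROLLBACK,
	/* Models the failed "signature >= 0" assertion. */
	PATH_ASSERT_FAILURE,
};

/* Old behaviour: both flags are cleared for succeeded and failed entries. */
static void
limbo_remove_old(struct txn_model *txn, bool is_success)
{
	assert(txn != NULL);
	(void)is_success;
	txn->flags &= ~(unsigned)(TXN_WAIT_SYNC | TXN_WAIT_ACK);
}

/* Patched behaviour: a failed entry keeps TXN_WAIT_SYNC, ACK is dropped. */
static void
limbo_remove_new(struct txn_model *txn, bool is_success)
{
	assert(txn != NULL);
	txn->flags &= ~(unsigned)TXN_WAIT_ACK;
	if (is_success)
		txn->flags &= ~(unsigned)TXN_WAIT_SYNC;
}

/* Owner fiber wakes up after its WAL write has succeeded. */
static enum wakeup_path
owner_wakeup(const struct txn_model *txn)
{
	if ((txn->flags & TXN_WAIT_SYNC) == 0) {
		/* Fast path: the signature is assumed to be valid here. */
		return txn->signature >= 0 ? PATH_COMMIT : PATH_ASSERT_FAILURE;
	}
	/* Models txn_limbo_wait_complete(), which inspects the signature. */
	return txn->signature >= 0 ? PATH_COMMIT : PATH_ROLLBACK;
}
```

In this model, a transaction whose signature was set to a negative value and which was then removed with `limbo_remove_old()` reaches the fast path and trips the assertion, while the same transaction removed with `limbo_remove_new()` takes the rollback path. The signature check stays off the fast path, which matches the stated reason for rejecting the alternative solution.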