Commit f9299f24 authored by Vladislav Shpilevoy

limbo: don't clear txn sync flag in case of fail

The limbo cleared the TXN_WAIT_SYNC and TXN_WAIT_ACK flags for all
removed entries - both succeeded and failed ones. For the succeeded
ones that is fine. For the failed ones it was not.

The reason is that a transaction could be rolled back after a
successful WAL write but before its waiting fiber wakes up. Then on
wakeup the fiber wouldn't see the TXN_WAIT_SYNC flag and would assert
that the transaction signature is >= 0. That wasn't true for txns
rolled back for synchro reasons, such as a foreign PROMOTE not
including this transaction.

The patch makes a failed transaction keep its TXN_WAIT_SYNC flag, so
that its owner fiber on wakeup reaches txn_limbo_wait_complete(),
notices the bad signature, and follows the rollback path.

TXN_WAIT_ACK is still dropped, because otherwise the transaction
owner would try to call txn_limbo_ack() for the transaction even
though the limbo doesn't belong to the instance anymore.

An alternative solution would be to check the signature value for
all transactions even when journal_entry->res is >= 0. But that
would slow down the common path even for non-synchro transactions.

Closes #6842

NO_DOC=Bugfix
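
To make the reasoning above concrete, below is a minimal self-contained C
model of the wakeup logic. It is not the actual Tarantool source: the
toy_txn struct and the limbo_rollback_old()/limbo_rollback_new() and
owner_wakeup() helpers are invented for illustration, only the flag names
mirror TXN_WAIT_SYNC/TXN_WAIT_ACK. It shows why the owner fiber treats a
successful WAL write as a successful commit unless the sync flag is still
set, and therefore why a failed synchro txn must keep that flag to be
routed onto the rollback path.

#include <assert.h>
#include <stdio.h>

/* Toy model of the flag logic, not Tarantool source code. */
enum { WAIT_SYNC = 1 << 0, WAIT_ACK = 1 << 1 };

struct toy_txn {
        unsigned flags;
        int signature; /* < 0 means the limbo rolled the txn back. */
};

/* Limbo rollback before the patch: both flags were dropped. */
static void
limbo_rollback_old(struct toy_txn *t)
{
        t->flags = 0;
        t->signature = -1;
}

/* Limbo rollback after the patch: only the "wait ack" flag is dropped. */
static void
limbo_rollback_new(struct toy_txn *t)
{
        t->flags &= ~(unsigned)WAIT_ACK;
        t->signature = -1;
}

/* What the owner fiber does when it wakes up after a successful WAL write. */
static const char *
owner_wakeup(const struct toy_txn *t)
{
        if (t->flags & WAIT_SYNC) {
                /* Synchro path: the limbo verdict decides, rollback is legal. */
                return t->signature >= 0 ? "committed" : "rolled back";
        }
        /* Plain path: a successful WAL write is assumed to mean success. */
        assert(t->signature >= 0);
        return "committed";
}

int
main(void)
{
        struct toy_txn t = {WAIT_SYNC | WAIT_ACK, 0};
        (void)limbo_rollback_old; /* pre-patch behaviour, kept for comparison */
        limbo_rollback_new(&t);   /* the old variant here would trip the assert */
        printf("txn is %s\n", owner_wakeup(&t));
        return 0;
}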
parent 993f0f9a
## bugfix/raft
* Fixed several crashes and/or undefined behaviors (assertions in debug build)
which could appear when new synchronous transactions were made during ongoing
elections (gh-6842).
@@ -489,7 +489,7 @@ txn_limbo_read_rollback(struct txn_limbo *limbo, int64_t lsn)
 		return;
 	rlist_foreach_entry_safe_reverse(e, &limbo->queue, in_queue, tmp) {
 		txn_limbo_abort(limbo, e);
-		txn_clear_flags(e->txn, TXN_WAIT_SYNC | TXN_WAIT_ACK);
+		txn_clear_flags(e->txn, TXN_WAIT_ACK);
 		/*
 		 * Should be written to WAL by now. Rollback is always written
 		 * after the affected transactions.
@@ -575,6 +575,7 @@ txn_limbo_read_demote(struct txn_limbo *limbo, int64_t lsn)
 void
 txn_limbo_ack(struct txn_limbo *limbo, uint32_t replica_id, int64_t lsn)
 {
+	assert(!txn_limbo_is_ro(limbo));
 	if (rlist_empty(&limbo->queue))
 		return;
 	/*
@@ -6,7 +6,7 @@ local wait_timeout = 120
 local g = luatest.group('gh-6842')
-g.before_all(function(g)
+local function cluster_create(g)
     g.cluster = cluster:new({})
     local box_cfg = {
         replication_timeout = 0.1,
@@ -32,13 +32,16 @@ g.before_all(function(g)
             read_only = false
         }
     end)
-end)
+end
-g.after_all(function(g)
+local function cluster_drop(g)
     g.cluster:drop()
     g.server1 = nil
     g.server2 = nil
-end)
+end
+g.before_all(cluster_create)
+g.after_all(cluster_drop)
 g.after_each(function(g)
     -- Restore cluster state like it was on start.
@@ -119,6 +122,19 @@ local function wait_synchro_is_busy(server)
     end)
 end
+--
+-- Wait until the server sees the given upstream status for a specified
+-- replica ID.
+--
+local function wait_upstream_status(server, id_arg, status_arg)
+    luatest.helpers.retrying({timeout = wait_timeout}, server.exec, server,
+                             function(id, status)
+        if box.info.replication[id].upstream.status ~= status then
+            error('Waiting for upstream status')
+        end
+    end, {id_arg, status_arg})
+end
 --
 -- Server 1 was a synchro queue owner. Then it receives a foreign PROMOTE which
 -- goes to WAL but is not applied yet. Server 1 during that tries to make a
@@ -274,3 +290,81 @@ g.test_remote_promote_during_local_txn_including_it = function(g)
     end)
     luatest.assert_equals(rows, {{1}, {2}, {3}}, 'synchro transactions work')
 end
+--
+-- Server 1 was a synchro queue owner. It starts a synchro txn. The txn is
+-- not written to WAL yet. Server 2 makes a PROMOTE which doesn't include that
+-- txn.
+--
+-- Then the PROMOTE is delivered to server 1 and goes to WAL too. Then the txn
+-- and PROMOTE WAL writes are processed by the TX thread in a single batch
+-- without yields.
+--
+-- The bug was that the fiber which authored the txn wasn't prepared for the
+-- combination of the WAL write succeeding, the txn no longer being marked as
+-- synchro (that happened on rollback in the synchro queue), and the txn
+-- signature being bad.
+--
+g.test_remote_promote_during_local_txn_not_including_it = function(g)
+    -- Start a synchro txn on server 1.
+    local fids = g.server1:exec(function()
+        box.ctl.promote()
+        local s = box.schema.create_space('test', {is_sync = true})
+        s:create_index('pk')
+        box.cfg{
+            replication_synchro_quorum = 3,
+        }
+        box.error.injection.set('ERRINJ_WAL_DELAY', true)
+        local fiber = require('fiber')
+        -- More than one transaction to ensure that it works not just for one.
+        local f1 = fiber.new(s.replace, s, {1})
+        f1:set_joinable(true)
+        local f2 = fiber.new(s.replace, s, {2})
+        f2:set_joinable(true)
+        return {f1:id(), f2:id()}
+    end)
+    -- Deliver server 1's promotion to server 2. Otherwise server 2 might fail
+    -- trying to start its own promotion simultaneously.
+    g.server2:wait_vclock_of(g.server1)
+    -- Server 2 sends a PROMOTE not covering the txns to server 1.
+    g.server2:exec(function()
+        box.cfg{
+            replication_synchro_quorum = 1,
+        }
+        box.ctl.promote()
+    end)
+    -- Server 1 receives the PROMOTE.
+    wait_synchro_is_busy(g.server1)
+    local rows = g.server1:exec(function(fids)
+        -- Simulate the TX thread being slow to increase the likelihood of the
+        -- txn and PROMOTE WAL writes being processed by TX in a single batch.
+        box.error.injection.set('ERRINJ_TX_DELAY_PRIO_ENDPOINT', 0.1)
+        -- Let them finish WAL writes.
+        box.error.injection.set('ERRINJ_WAL_DELAY', false)
+        local fiber = require('fiber')
+        for _, fid in pairs(fids) do
+            fiber.find(fid):join()
+        end
+        box.error.injection.set('ERRINJ_TX_DELAY_PRIO_ENDPOINT', 0)
+        return box.space.test:select()
+    end, {fids})
+    luatest.assert_equals(rows, {}, 'txns were rolled back')
+    -- Replication is broken now because server 2 (the synchro queue owner) got
+    -- a txn from server 1. It was written before the ownership transfer, but
+    -- that doesn't matter. Although maybe some day this could be detected and
+    -- handled more gracefully.
+    wait_upstream_status(g.server2, g.server1:instance_id(), 'stopped')
+    rows = g.server2:exec(function()
+        return box.space.test:select()
+    end)
+    luatest.assert_equals(rows, {}, 'server 2 received it but did not apply')
+    -- Recreate the cluster because it is broken now.
+    cluster_drop(g)
+    cluster_create(g)
+end