Commit a9b99f0e authored by Vladislav Shpilevoy, committed by Kirill Yukhin

recovery: handle local sync txns during recovery

Recovery uses txn_commit_async() so that it does not block when it
meets a synchronous transaction. Such transactions are either
committed later, when CONFIRM is read, or stay in the limbo after
recovery.

However, txn_commit_async() assumed it was used for remote
transactions only, and had some assertions about that. One of them
crashed the master when it restarted with any synchronous
transaction in its WAL.

The patch makes txn_commit_async() not assume anything about the
transaction's origin.

Closes #5163
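
The dispatch described above can be illustrated with a minimal, self-contained C sketch. It is not the Tarantool implementation: `struct limbo`, `struct limbo_entry`, `self_id`, and the `assign_*` functions are hypothetical stand-ins, and the assertions are only assumed to resemble the kind of origin check the message mentions. The sketch shows why calling the remote-only path on a master that recovers its own transactions would trip an assertion, and how a general entry point avoids that.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the limbo structures. */
struct limbo {
	/* ID of the instance that created the queued transactions. */
	uint32_t owner_id;
};

struct limbo_entry {
	int64_t lsn;
	/* 1 if the LSN was assigned via the local path, 0 if remote. */
	int is_local;
};

/* ID of the current instance (an assumption of this sketch). */
static uint32_t self_id = 1;

/* Local path: valid only when this instance owns the limbo. */
static void
assign_local_lsn(struct limbo *limbo, struct limbo_entry *entry, int64_t lsn)
{
	assert(limbo->owner_id == self_id);
	entry->lsn = lsn;
	entry->is_local = 1;
}

/* Remote path: valid only when another instance owns the limbo. */
static void
assign_remote_lsn(struct limbo *limbo, struct limbo_entry *entry, int64_t lsn)
{
	assert(limbo->owner_id != self_id);
	entry->lsn = lsn;
	entry->is_local = 0;
}

/*
 * General path: the caller does not know the transaction's origin,
 * so the choice is made from the limbo owner.
 */
static void
assign_lsn(struct limbo *limbo, struct limbo_entry *entry, int64_t lsn)
{
	if (limbo->owner_id == self_id)
		assign_local_lsn(limbo, entry, lsn);
	else
		assign_remote_lsn(limbo, entry, lsn);
}
```

In this model, a master replaying its own WAL has `owner_id == self_id`, so `assign_lsn()` takes the local branch, while a replica replaying data received from a master takes the remote branch. Calling `assign_remote_lsn()` directly in the first case would hit the assertion.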
parent 604eb737
@@ -756,8 +756,14 @@ txn_commit_async(struct txn *txn)
 	if (txn_has_flag(txn, TXN_WAIT_ACK)) {
 		int64_t lsn = req->rows[txn->n_applier_rows - 1]->lsn;
-		txn_limbo_assign_remote_lsn(&txn_limbo, limbo_entry,
-					    lsn);
+		/*
+		 * Can't tell whether it is local or not -
+		 * async commit is used both by applier
+		 * and during recovery. Use general LSN
+		 * assignment to let the limbo rule this
+		 * out.
+		 */
+		txn_limbo_assign_lsn(&txn_limbo, limbo_entry, lsn);
 	}
 	/*
@@ -844,6 +850,11 @@ txn_commit(struct txn *txn)
 	if (is_sync) {
 		if (txn_has_flag(txn, TXN_WAIT_ACK)) {
 			int64_t lsn = req->rows[req->n_rows - 1]->lsn;
+			/*
+			 * Use local LSN assignment. Because
+			 * blocking commit is used by local
+			 * transactions only.
+			 */
 			txn_limbo_assign_local_lsn(&txn_limbo, limbo_entry,
 						   lsn);
 			/* Local WAL write is a first 'ACK'. */
@@ -148,6 +148,16 @@ txn_limbo_assign_local_lsn(struct txn_limbo *limbo,
 	entry->ack_count = ack_count;
 }
 
+void
+txn_limbo_assign_lsn(struct txn_limbo *limbo, struct txn_limbo_entry *entry,
+		     int64_t lsn)
+{
+	if (limbo->instance_id == instance_id)
+		txn_limbo_assign_local_lsn(limbo, entry, lsn);
+	else
+		txn_limbo_assign_remote_lsn(limbo, entry, lsn);
+}
+
 static int
 txn_limbo_write_rollback(struct txn_limbo *limbo, int64_t lsn);
@@ -188,6 +188,20 @@ void
 txn_limbo_assign_local_lsn(struct txn_limbo *limbo,
 			   struct txn_limbo_entry *entry, int64_t lsn);
 
+/**
+ * Assign an LSN to a limbo entry. Works both with local and
+ * remote transactions. The function exists to be used in a
+ * context, where a transaction is not known whether it is local
+ * or not. For example, when a transaction is committed not bound
+ * to any fiber (txn_commit_async()), it can be created by applier
+ * (then it is remote) or by recovery (then it is local). Besides,
+ * recovery can commit remote transactions as well, when works on
+ * a replica - it will recover data received from master.
+ */
+void
+txn_limbo_assign_lsn(struct txn_limbo *limbo, struct txn_limbo_entry *entry,
+		     int64_t lsn);
+
 /**
  * Ack all transactions up to the given LSN on behalf of the
  * replica with the specified ID.
-- test-run result file version 2
test_run = require('test_run').new()
| ---
| ...
engine = test_run:get_cfg('engine')
| ---
| ...
--
-- gh-5163: master during recovery treated local transactions as
-- remote and crashed.
--
_ = box.schema.space.create('sync', {is_sync=true, engine=engine})
| ---
| ...
_ = box.space.sync:create_index('pk')
| ---
| ...
box.space.sync:replace{1}
| ---
| - [1]
| ...
test_run:cmd('restart server default')
|
box.space.sync:select{}
| ---
| - - [1]
| ...
box.space.sync:drop()
| ---
| ...
test_run = require('test_run').new()
engine = test_run:get_cfg('engine')
--
-- gh-5163: master during recovery treated local transactions as
-- remote and crashed.
--
_ = box.schema.space.create('sync', {is_sync=true, engine=engine})
_ = box.space.sync:create_index('pk')
box.space.sync:replace{1}
test_run:cmd('restart server default')
box.space.sync:select{}
box.space.sync:drop()