Commit b5f0dc4d authored by Vladislav Shpilevoy

txn: stop TXN_SIGNATURE_ABORT override

When txn_commit() or txn_commit_try_async() failed before reaching
the WAL thread, they installed the TXN_SIGNATURE_ABORT signature,
meaning that the caller and the rollback triggers must look at the
global diag.

But before returning they called txn_rollback(), which overrode the
signature with TXN_SIGNATURE_ROLLBACK before the triggers ran, so
the original error was lost.

The patch installs TXN_SIGNATURE_ROLLBACK only when a real rollback
happens (via box_txn_rollback()).

As a result, original commit errors, such as a conflict in the
transaction manager or OOM, are no longer lost.

Besides, ERRINJ_TXN_COMMIT_ASYNC no longer needs its own diag_log(),
because since this commit the applier logs the correct error instead
of ER_WAL_IO/ER_TXN_ROLLBACK.

Closes #6027
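
The override described in the message can be modeled in a few lines. This is a minimal sketch, not the project's code: the `sketch_*` names, the enum values and the `seen_by_triggers` field are all hypothetical and only stand in for the real transaction structure and its rollback triggers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values; the real TXN_SIGNATURE_* constants may differ. */
enum {
	SKETCH_SIGNATURE_UNKNOWN = -3,
	SKETCH_SIGNATURE_ABORT = -2,
	SKETCH_SIGNATURE_ROLLBACK = -1,
};

/* Toy transaction: only the parts needed to show the override. */
struct sketch_txn {
	int signature;
	/* The signature the rollback triggers observed. */
	int seen_by_triggers;
};

/* Stand-in for running the rollback triggers. */
static void
sketch_run_triggers(struct sketch_txn *txn)
{
	txn->seen_by_triggers = txn->signature;
}

/* Behavior before the patch: rollback always stamps ROLLBACK. */
static void
sketch_rollback_old(struct sketch_txn *txn)
{
	txn->signature = SKETCH_SIGNATURE_ROLLBACK;
	sketch_run_triggers(txn);
}

/* Behavior after the patch: the caller must set the signature first. */
static void
sketch_rollback_new(struct sketch_txn *txn)
{
	assert(txn->signature != SKETCH_SIGNATURE_UNKNOWN);
	sketch_run_triggers(txn);
}

/* A commit that fails early: mark ABORT, then roll back. */
static int
sketch_failed_commit(bool patched)
{
	struct sketch_txn txn = {
		.signature = SKETCH_SIGNATURE_UNKNOWN,
		.seen_by_triggers = SKETCH_SIGNATURE_UNKNOWN,
	};
	txn.signature = SKETCH_SIGNATURE_ABORT;
	if (patched)
		sketch_rollback_new(&txn);
	else
		sketch_rollback_old(&txn);
	return txn.seen_by_triggers;
}
```

In this model `sketch_failed_commit(false)` hands the triggers the ROLLBACK signature, hiding the failure, while `sketch_failed_commit(true)` hands them ABORT, so they know to read the saved error.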
parent 26a8317f
## bugfix/replication
* When an error happened while applying a transaction received from a
remote instance via replication, it was always reported as "Failed to write
to disk" regardless of what really happened. Now the correct error is shown,
for example "Out of memory" or "Transaction has been aborted by conflict"
(gh-6027).
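
Why the signature matters to whoever reports the error can be sketched as follows. This is a simplified illustration under stated assumptions, not the applier's actual logic: `demo_signature`, `demo_generic_error` and `demo_reported_error()` are hypothetical names, and the real code distinguishes more cases than these two.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the TXN_SIGNATURE_* constants. */
enum demo_signature {
	DEMO_SIGNATURE_ROLLBACK,
	DEMO_SIGNATURE_ABORT,
};

/* Generic text quoted in the changelog entry above. */
static const char *const demo_generic_error = "Failed to write to disk";

/*
 * Pick the message to report. Only the ABORT signature tells the
 * reporter that the saved diag message describes the real failure;
 * any other signature falls back to the generic text.
 */
static const char *
demo_reported_error(enum demo_signature signature, const char *diag_msg)
{
	if (signature == DEMO_SIGNATURE_ABORT && diag_msg != NULL)
		return diag_msg;
	return demo_generic_error;
}
```

With the signature overridden to ROLLBACK, a saved "Out of memory" message is ignored in this model; with ABORT preserved, it is the message that gets reported.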
@@ -801,11 +801,6 @@ txn_commit_try_async(struct txn *txn)
 	ERROR_INJECT(ERRINJ_TXN_COMMIT_ASYNC, {
 		diag_set(ClientError, ER_INJECTION,
 			 "txn commit async injection");
-		/*
-		 * Log it for the testing sake: we grep
-		 * output to mark this event.
-		 */
-		diag_log();
 		goto rollback;
 	});
@@ -983,11 +978,11 @@ void
 txn_rollback(struct txn *txn)
 {
 	assert(txn == in_txn());
+	assert(txn->signature != TXN_SIGNATURE_UNKNOWN);
 	txn->status = TXN_ABORTED;
 	trigger_clear(&txn->fiber_on_stop);
 	if (!txn_has_flag(txn, TXN_CAN_YIELD))
 		trigger_clear(&txn->fiber_on_yield);
-	txn->signature = TXN_SIGNATURE_ROLLBACK;
 	txn_complete_fail(txn);
 	fiber_set_txn(fiber(), NULL);
 }
@@ -1086,6 +1081,8 @@ box_txn_rollback(void)
 		diag_set(ClientError, ER_ROLLBACK_IN_SUB_STMT);
 		return -1;
 	}
+	assert(txn->signature == TXN_SIGNATURE_UNKNOWN);
+	txn->signature = TXN_SIGNATURE_ROLLBACK;
 	txn_rollback(txn); /* doesn't throw */
 	fiber_gc();
 	return 0;
@@ -1221,7 +1218,10 @@ txn_on_stop(struct trigger *trigger, void *event)
 {
 	(void) trigger;
 	(void) event;
-	txn_rollback(in_txn()); /* doesn't yield or fail */
+	struct txn *txn = in_txn();
+	assert(txn->signature == TXN_SIGNATURE_UNKNOWN);
+	txn->signature = TXN_SIGNATURE_ROLLBACK;
+	txn_rollback(txn);
 	fiber_gc();
 	return 0;
 }
......
-- test-run result file version 2
test_run = require('test_run').new()
| ---
| ...
--
-- gh-6027: on attempt to a commit transaction its original error was lost.
--
box.schema.user.grant('guest', 'super')
| ---
| ...
s = box.schema.create_space('test')
| ---
| ...
_ = s:create_index('pk')
| ---
| ...
test_run:cmd('create server replica with rpl_master=default, '.. \
'script="replication/replica.lua"')
| ---
| - true
| ...
test_run:cmd('start server replica')
| ---
| - true
| ...
test_run:switch('replica')
| ---
| - true
| ...
box.error.injection.set('ERRINJ_TXN_COMMIT_ASYNC', true)
| ---
| - ok
| ...
test_run:switch('default')
| ---
| - true
| ...
_ = s:replace{1}
| ---
| ...
test_run:switch('replica')
| ---
| - true
| ...
test_run:wait_upstream(1, {status = 'stopped'})
| ---
| - true
| ...
-- Should be something about error injection.
box.info.replication[1].upstream.message
| ---
| - Error injection 'txn commit async injection'
| ...
test_run:switch('default')
| ---
| - true
| ...
test_run:cmd('stop server replica')
| ---
| - true
| ...
test_run:cmd('delete server replica')
| ---
| - true
| ...
box.error.injection.set('ERRINJ_TXN_COMMIT_ASYNC', false)
| ---
| - ok
| ...
s:drop()
| ---
| ...
box.schema.user.revoke('guest', 'super')
| ---
| ...
test_run = require('test_run').new()
--
-- gh-6027: on attempt to a commit transaction its original error was lost.
--
box.schema.user.grant('guest', 'super')
s = box.schema.create_space('test')
_ = s:create_index('pk')
test_run:cmd('create server replica with rpl_master=default, '.. \
'script="replication/replica.lua"')
test_run:cmd('start server replica')
test_run:switch('replica')
box.error.injection.set('ERRINJ_TXN_COMMIT_ASYNC', true)
test_run:switch('default')
_ = s:replace{1}
test_run:switch('replica')
test_run:wait_upstream(1, {status = 'stopped'})
-- Should be something about error injection.
box.info.replication[1].upstream.message
test_run:switch('default')
test_run:cmd('stop server replica')
test_run:cmd('delete server replica')
box.error.injection.set('ERRINJ_TXN_COMMIT_ASYNC', false)
s:drop()
box.schema.user.revoke('guest', 'super')
@@ -45,6 +45,7 @@
     "gh-5536-wal-limit.test.lua": {},
     "gh-5566-final-join-synchro.test.lua": {},
     "gh-5613-bootstrap-prefer-booted.test.lua": {},
+    "gh-6027-applier-error-show.test.lua": {},
     "gh-6032-promote-wal-write.test.lua": {},
     "gh-6057-qsync-confirm-async-no-wal.test.lua": {},
     "gh-6094-rs-uuid-mismatch.test.lua": {},
......
@@ -3,7 +3,7 @@ core = tarantool
 script = master.lua
 description = tarantool/box, replication
 disabled = consistent.test.lua
-release_disabled = catch.test.lua errinj.test.lua gc.test.lua gc_no_space.test.lua before_replace.test.lua qsync_advanced.test.lua qsync_errinj.test.lua quorum.test.lua recover_missing_xlog.test.lua sync.test.lua long_row_timeout.test.lua gh-4739-vclock-assert.test.lua gh-4730-applier-rollback.test.lua gh-5140-qsync-casc-rollback.test.lua gh-5144-qsync-dup-confirm.test.lua gh-5167-qsync-rollback-snap.test.lua gh-5506-election-on-off.test.lua gh-5536-wal-limit.test.lua hang_on_synchro_fail.test.lua anon_register_gap.test.lua gh-5213-qsync-applier-order.test.lua gh-5213-qsync-applier-order-3.test.lua gh-6032-promote-wal-write.test.lua gh-6057-qsync-confirm-async-no-wal.test.lua
+release_disabled = catch.test.lua errinj.test.lua gc.test.lua gc_no_space.test.lua before_replace.test.lua qsync_advanced.test.lua qsync_errinj.test.lua quorum.test.lua recover_missing_xlog.test.lua sync.test.lua long_row_timeout.test.lua gh-4739-vclock-assert.test.lua gh-4730-applier-rollback.test.lua gh-5140-qsync-casc-rollback.test.lua gh-5144-qsync-dup-confirm.test.lua gh-5167-qsync-rollback-snap.test.lua gh-5506-election-on-off.test.lua gh-5536-wal-limit.test.lua hang_on_synchro_fail.test.lua anon_register_gap.test.lua gh-5213-qsync-applier-order.test.lua gh-5213-qsync-applier-order-3.test.lua gh-6027-applier-error-show.test.lua gh-6032-promote-wal-write.test.lua gh-6057-qsync-confirm-async-no-wal.test.lua
 config = suite.cfg
 lua_libs = lua/fast_replica.lua lua/rlimit.lua
 use_unix_sockets = True
......