Commits · f0feb9230ae6abe1f7829df5ecd732f8f9afc12a · core / tarantool

Oct 12, 2023

asan: suppress leak reports caused by JIT · f0feb923

Nikolay Shirokovskiy authored 1 year ago

With ASAN-friendly small allocators there are a lot of test failures due
to leak reports, which disappear if JIT is off.

Fortunately, all the reports are related to a few functions. Let's
suppress such reports temporarily.

Part of #7327

NO_TEST=internal
NO_CHANGELOG=internal
NO_DOC=internal

asan: adapt misc stats test for ASAN · 1436eb41

Nikolay Shirokovskiy authored 1 year ago

When SMALL_MALLOC_IMPL is defined and ASAN-friendly allocators are used,
the arena allocator is not used at all, as we do not allocate memory
directly from it. The other ASAN-friendly allocators do not allocate
from it either. Thus box.slab.info().arena_size == 0. The same holds for
the runtime arena usage, box.runtime.info().used.

Also, memory usage with the ASAN-friendly lsregion is a bit different, as
it does not account for the size of alignment padding. Thus we need to
adapt the box.stat.vinyl().memory.level0 tests. The approach is to check
for lower and upper limits instead of checking for exact values.
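
The bounds-based check described above can be sketched as follows. This
is an illustration in Python only: the helper name `check_in_range` and
the sample numbers are hypothetical and are not taken from the test
suite.

```python
def check_in_range(value, lower, upper):
    """Return True if value lies within [lower, upper], inclusive."""
    return lower <= value <= upper


# Hypothetical numbers: the reported usage depends on whether the
# allocator accounts for alignment padding, so only bounds are asserted.
payload = 1000       # bytes of user data written (assumed)
max_padding = 64     # assumed upper bound on the padding overhead

assert check_in_range(1000, payload, payload + max_padding)  # no padding counted
assert check_in_range(1048, payload, payload + max_padding)  # padding counted
assert not check_in_range(999, payload, payload + max_padding)
```

A range check like this passes for both allocator variants, whereas an
exact comparison would pass for only one of them.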

Part of #7327

NO_DOC=test changes
NO_CHANGELOG=test changes

asan: prepare for ASAN-friendly ibuf · 4f542bb7

Nikolay Shirokovskiy authored 1 year ago

ASAN-friendly implementation poisons memory after allocation with
ibuf_alloc so we need to fix existing places in code where we access
memory after allocation.

Part of the ibuf implementation is inline functions in headers, so the
ibuf implementation in Lua reimplements these parts. We add poisoning to
these inline functions in the ASAN-friendly implementation, so we need to
add the same poisoning to the Lua implementation.

Part of #7327

NO_CHANGELOG=internal
NO_DOC=internal

salad: get rid of core memory dependency · d01609a4

Nikolay Shirokovskiy authored 1 year ago

We are going to include generated small_config.h into small allocator
headers (currently it is only included in small source files).
core/memory.h depends on small headers and salad/heap.h depends on
core/memory.h. As a result we need to provide a way for salad/heap.h
users to find small_config.h header.

Instead, let's drop the dependency on core/memory.h, as we only use it
for the typeof definition.

Part of #7327

NO_CHANGELOG=code cleanup
NO_DOC=code cleanup

fiber: disable fiber stack protection with ASAN temporarily · 2ee15793

Nikolay Shirokovskiy authored 1 year ago

If the leak sanitizer reaches memory protected from reads with mprotect,
it exhibits all sorts of odd behaviour: it can hang, crash, or return
errors with no leak backtraces.

We use mprotect to create guard zones at the end of the fiber stack so
that if the stack overflows we get a signal and crash. We take the
protection off when the fiber is destroyed. Unfortunately, we do not
destroy cords (and their fibers) that are cancelled through
cord_cancel_and_join. This is going to be addressed in the patch for
issue #8423 ("Get rid of pthread_cancel()"). Until then, let's disable
the protection for ASAN builds.

Note that we did not hit this behaviour before because LSAN only scans
memory allocated using malloc, and the regular slab cache uses mmap to
get memory.

Part of #7327

NO_CHANGELOG=internal
NO_DOC=internal

fiber: make madvise(2) arguments page aligned with ASAN slab cache · 130c7807

Nikolay Shirokovskiy authored 1 year ago

Normally the fiber stack slab is page aligned, so the upper stack border
is page aligned too when the stack grows down. But with the ASAN-friendly
slab cache implementation this border is not page aligned. As a result, a
madvise call on the stack may zero memory beyond the stack slab, which
will cause heap corruption. In a debug build the corruption is detected
by an assertion:

NO_WRAP
 >  Fatal glibc error: malloc.c:2593 (sysmalloc): assertion failed: (old_top
 >  == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >=
 >  MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize
 >  - 1)) == 0)
NO_WRAP

Interestingly enough, the issue cannot be investigated using ASAN. The
memory is zeroed by kernel code, which is not instrumented, so it is
invisible to the sanitizer.

It looks like non-ASAN builds are not affected. Even if stack_size is
not page aligned, the slab allocated for the stack is page aligned. Thus
memory zeroing will stay inside the slab and there will be no memory
corruption.

Also, when the stack grows up, the lower stack border is not aligned even
with the regular small implementation. So the madvise call will fail with
EINVAL, as the start address is required to be page aligned. We ignore
the error though. Let's fix this issue too while we are at it.

Let's introduce fiber_madvise_aligned to align the madvise range in the
proper direction before calling madvise(2). To justify its usage, note
that besides fixing the issues described above, in case of a stack
growing down fiber->stack is page aligned, and in case of a stack growing
up fiber->stack + fiber->stack_size is page aligned.
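
The alignment arithmetic can be illustrated with a small Python sketch.
It is not the code of fiber_madvise_aligned: the helper names and the
fixed 4096-byte page size are assumptions, and the sketch takes the
conservative route of shrinking the range inward, so the aligned range
never extends beyond the original one.

```python
PAGE_SIZE = 4096  # assumed page size; real code would query the OS


def align_down(addr, align=PAGE_SIZE):
    """Round addr down to a multiple of align (a power of two)."""
    return addr & ~(align - 1)


def align_up(addr, align=PAGE_SIZE):
    """Round addr up to a multiple of align (a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def shrink_to_pages(start, length, align=PAGE_SIZE):
    """Shrink [start, start + length) inward to page boundaries.

    Returns (aligned_start, aligned_length). The length is 0 when no
    whole page fits inside the original range.
    """
    new_start = align_up(start, align)
    new_end = align_down(start + length, align)
    if new_end <= new_start:
        return new_start, 0
    return new_start, new_end - new_start


# An already aligned range is left untouched.
assert shrink_to_pages(4096, 8192) == (4096, 8192)
# A range shifted by 4 bytes loses its partial pages on both sides.
assert shrink_to_pages(4100, 8192) == (8192, 4096)
```

Because both ends move inward, advice applied to the aligned range cannot
touch memory outside the stack, which is the property the fix relies on.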

Part of #7327

NO_TEST=tested by ASAN (debug build)
NO_CHANGELOG=has effect only with newly introduced ASAN friendly slab cache
NO_DOC=has effect only with newly introduced ASAN friendly slab cache

fiber: don't unpoison fiber stack · 0784f7b7

Nikolay Shirokovskiy authored 1 year ago

The unpoison was added in the initial commit 1.7.2-68-gafd229393 that
supported ASAN. It is not clear why we need it, as we don't poison
stack memory manually.

Part of #7327

NO_TEST=removing unfunctional code
NO_CHANGELOG=removing unfunctional code
NO_DOC=removing unfunctional code

test: tune tests hitting quota for ASAN · d456a986

Nikolay Shirokovskiy authored 1 year ago

The ASAN small object allocator implementation has a slightly different
quota leasing pattern when allocating memory. So we may need to allocate
more objects to hit the quota, etc.

Part of #7327

NO_CHANGELOG=test tuning
NO_DOC=test tuning

iproto: use obuf API to check whether buffer is destroyed · ea4d19ec

Nikolay Shirokovskiy authored 1 year ago

The reason is that the check is different for the ASAN and regular
versions of obuf.

Part of #7327

NO_DOC=internal
NO_CHANGELOG=internal
NO_TEST=<will be tested by asan-debug CI>

Oct 11, 2023

test: update gh_8083, gh_8445 and gh_7434 tests · 23b61351

Oleg Chaplashkin authored 1 year ago

These tests fail after commit [1] was added to luatest:

- app-luatest/gh_8083_fatal_signal_handler_test.lua
- app-luatest/gh_8445_crash_during_crash_report_test.lua
- box-luatest/gh_7434_yield_in_on_shutdown_trigger_test.lua

The issue is due to lack of necessary directories:

    sh: 1: cd: can't cd to /tmp/t/001_app-luatest/server-XXX

Just update the tests to use the simple `fio` module instead of
`luatest.server`.

[1] tarantool/luatest@7d1358c

NO_CHANGELOG=internal
NO_DOC=internal

test: bump test-run to new version · f4bc53e8

Oleg Chaplashkin authored 1 year ago

Bump test-run to new version with the following improvements:

- luatest: bump luatest to 0.5.7-48-g18859f6 [1]
- Adapt use luatest with new --no-clean option [2]
- luatest: bump luatest to 0.5.7-49-g9c7710e [3]

[1] tarantool/test-run@aa3b34d
[2] tarantool/test-run@8ebb3aa
[3] tarantool/test-run@82542d3

NO_DOC=test
NO_TEST=test
NO_CHANGELOG=test

box: move on_shutdown triggers to the trigger registry · b7489dab

Andrey Saranchin authored 1 year ago

The commit moves on_shutdown triggers to the trigger registry. The
triggers set by C API and internal triggers remain unchanged - only Lua
user triggers are affected.

Changelog entry of #8657 is populated with box.ctl triggers and is
slightly improved.

Closes #8657

NO_DOC=later

core: get rid of trigger_fiber_run function · 19c1387d

Andrey Saranchin authored 1 year ago

Function trigger_fiber_run is used only for on_shutdown triggers and
uses an internal structure, run_list. This structure is another list, but
all the triggers are popped from run_list instead of being iterated over,
because this approach is safe when triggers are deleted from the list
that is being run. Also, new triggers are not inserted into run_list.

Since we are running only on_shutdown triggers, which won't be used after
they are fired, we can move all the triggers to an internal trigger
list (so that no new triggers will appear) and pop them instead of
iterating. So let's remove the function trigger_fiber_run and run
on_shutdown core triggers in a new special function. Later, this new
function will run triggers from the on_shutdown event as well.
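
The pop-instead-of-iterate approach can be sketched in Python. This is a
simplified model, not the C implementation: `run_and_drain` and
`make_trigger` are hypothetical names, and a trigger is modelled as a
callable that receives the pending run list.

```python
from collections import deque


def run_and_drain(triggers):
    """Run every trigger once by popping from a private run list."""
    run_list = deque(triggers)  # private list: later registrations are not seen
    triggers.clear()            # the triggers are not reused after firing
    fired = []
    while run_list:
        trigger = run_list.popleft()
        # A trigger may delete other pending triggers from run_list;
        # popping stays valid, unlike iterating over a list being mutated.
        fired.append(trigger(run_list))
    return fired


def make_trigger(name, victim=None):
    """Build a trigger that optionally deletes another pending trigger."""
    def trigger(run_list):
        if victim is not None and victim in run_list:
            run_list.remove(victim)
        return name
    return trigger


b = make_trigger("b")
a = make_trigger("a", victim=b)  # "a" deletes "b" before it can run
c = make_trigger("c")
registered = [a, b, c]
assert run_and_drain(registered) == ["a", "c"]
assert registered == []
```

Moving the triggers to a private list first is what guarantees that no
new trigger joins the run, and popping is what keeps deletion safe.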

Part of #8657

NO_TEST=no behavior changes
NO_CHANGELOG=later
NO_DOC=later

box: move all box.ctl triggers except for on_shutdown · 267b0877

Andrey Saranchin authored 1 year ago

The patch moves all box.ctl triggers except for the on_shutdown trigger
to the trigger module. The on_shutdown triggers are run in separate
fibers, which makes it more difficult to move them to the event
subsystem, so they will be moved there in a separate commit.

Also, box_raft_on_broadcast triggers are renamed to box_raft_on_election.
Although they are fired on broadcast, the only place they are installed
across the whole tarantool organization is box.ctl.on_election.

NO_DOC=later
NO_CHANGELOG=later

Part of #8657

unit: fix undefined behaviour in prbuf test · 4a868563

Nikolay Shirokovskiy authored 1 year ago

The test started to fail in CI in the osx_debug (x86_64) workflow:

```
[033]  	*** test_buffer_foreach_copy_number ***
[033] -ok 13 - prbuf(size=256, payload=16, iterations=16) has been validated
[033] -ok 14 - prbuf(size=256, payload=16, iterations=32) has been validated
[033] -ok 15 - prbuf(size=256, payload=16, iterations=64) has been validated
[033] +ok 13 - prbuf(size=256, payload=4294967312, iterations=16) has been validated
[033] +ok 14 - prbuf(size=256, payload=4294967312, iterations=32) has been validated
[033] +ok 15 - prbuf(size=256, payload=4294967312, iterations=64) has been validated
[033]  	*** test_buffer_foreach_copy_number: done ***
```
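
The unexpected payload value in the diff above is exactly 2^32 + 16. One
plausible reading, inferred from the numbers alone and not stated in the
commit, is that the low 32 bits held the intended value 16 while the
upper bits held garbage. The arithmetic can be checked in Python:

```python
observed = 4294967312  # payload printed by the failing run
expected = 16          # payload printed by the reference output

assert observed - expected == 2 ** 32
assert (observed & 0xFFFFFFFF) == expected  # low 32 bits match
assert (observed >> 32) == 1                # stray bit above the low word
```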

NO_CHANGELOG=test fix
NO_DOC=test fix

Oct 10, 2023

ci: exclude app-tap/tarantoolctl.test.lua from packaging test · fb64a241

Yaroslav Lobankov authored 1 year ago

Tarantool packages don't provide `tarantoolctl` since series 3.
So there is nothing to test.

NO_DOC=ci
NO_TEST=ci
NO_CHANGELOG=ci

tarantoolctl: fix luarocks warnings issue · d6ae403e

Pavel Balaev authored 1 year ago

This patch fixes the following issue:

$ tarantoolctl rocks --version 1>/dev/null
Warning: failed to load command module luarocks.cmd.help

NO_DOC=bugfix
NO_CHANGELOG=not released yet

sql: assign collation to indexes in CREATE TABLE · 65608d87

Mergen Imeev authored 1 year ago

Before this patch, if an index was created due to a column's UNIQUE
constraint or a column's PRIMARY KEY constraint before adding a
collation, and if the column's fieldno was not equal to the index's
position in space->index, the collation would not be assigned to the
index.

Also, this patch fixes an assertion in debug build for the case when an
index with more than one field was created before a collation was added.

Closes #9229

NO_DOC=bugfix

box: mitigate the tuple.perftest gcc regression · ed21247b

Magomed Kostoev authored 1 year ago

Because of inlining rules, some parts of the comparators aren't optimized
properly by the gcc compiler. This causes a regression introduced by
the sort order implementation.

This patch introduces inline hints for the compiler in order to
mitigate the regression.

perf/tuple.cc test results (RelWithDebInfo, time in nanoseconds):

                             Tiger Lake

gcc 11.4.0:

                             Base    After #8915      Patched
       tuple_tuple_compare   40.5    41.5 (+2.5%)     39.4 (-2.7%)
  tuple_tuple_compare_hint   43.0    33.5 (-22.1%)    35.9 (-16.5%)

clang 14.0.0:

                             Base    After #8915      Patched
       tuple_tuple_compare   25.7    25.1 (-2.3%)     25.7
  tuple_tuple_compare_hint   33.1    32.5 (-1.8%)     33.1

                                Zen 3

gcc 11.4.0:

                             Base    After #8915      Patched
       tuple_tuple_compare   18.9    22.85 (+20.9%)   19.4 (+2.6%)
  tuple_tuple_compare_hint   24.25   22.95 (-5.4%)    23.5 (-3.1%)

clang 14.0.0:

                             Base    After #8915      Patched
       tuple_tuple_compare   17.3    17.0 (-1.7%)     17.0 (-1.7%)
  tuple_tuple_compare_hint   20.3    20.1 (-1.0%)     20.1 (-1.0%)
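
The percentages in the tables above are relative changes against the
Base column. A small Python helper (the name `rel_change` is made up for
this illustration) reproduces them from the raw timings:

```python
def rel_change(base, value):
    """Relative change of value against base, in percent, to 0.1."""
    return round((value - base) / base * 100, 1)


# Tiger Lake, gcc 11.4.0
assert rel_change(40.5, 41.5) == 2.5      # tuple_tuple_compare, after #8915
assert rel_change(40.5, 39.4) == -2.7     # tuple_tuple_compare, patched
assert rel_change(43.0, 33.5) == -22.1    # tuple_tuple_compare_hint, after #8915
# Zen 3, gcc 11.4.0
assert rel_change(18.9, 22.85) == 20.9    # tuple_tuple_compare, after #8915
assert rel_change(18.9, 19.4) == 2.6      # tuple_tuple_compare, patched
```

A negative value means the benchmark got faster than the base, since the
tables report time per operation.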

Closes #9216

NO_DOC=no code modification
NO_TEST=no code modification
NO_CHANGELOG=no code modification

config: add security.secure_erasing option · 61fbb31a

Vladimir Davydov authored 1 year ago

The new option is backed by `box.cfg.secure_erasing`. It is available
only in Enterprise Edition builds.

Needed for tarantool/tarantool-ee#540

NO_DOC=will be added to Enterprise Edition
NO_CHANGELOG=will be added to Enterprise Edition

xlog: allow to extend inprogress xlog file cleanup · ef8b002f

Vladimir Davydov authored 1 year ago

We call xdir_collect_inprogress() at startup to clean up the xlog
directory of files left from the previous run. Let's rename it to
xdir_remove_temporary_files() and make it delete all files for which
the new callback function xlog_file_is_temporary() returns true. By
default, the callback returns true only for .inprogress files but it
can be overridden to make xdir_remove_temporary_files() delete other
kinds of files. This is required for thorough file deletion.
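
The idea of a cleanup routine driven by an overridable predicate can be
sketched in Python. The function names below only mirror the C names for
readability; the code is an illustration, and the `.tmp` extension used
in the override is hypothetical.

```python
import os
import tempfile


def is_temporary_default(filename):
    """Default predicate: only .inprogress files are temporary."""
    return filename.endswith(".inprogress")


def remove_temporary_files(dirname, is_temporary=is_temporary_default):
    """Delete every file in dirname for which is_temporary() is true."""
    removed = []
    for name in sorted(os.listdir(dirname)):
        if is_temporary(name):
            os.unlink(os.path.join(dirname, name))
            removed.append(name)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    for name in ("1.xlog", "2.xlog.inprogress", "3.tmp"):
        open(os.path.join(tmp, name), "w").close()
    # The default predicate removes only the .inprogress file.
    assert remove_temporary_files(tmp) == ["2.xlog.inprogress"]
    # An overriding predicate widens the set of files to delete.
    assert remove_temporary_files(tmp, lambda n: n.endswith(".tmp")) == ["3.tmp"]
    assert os.listdir(tmp) == ["1.xlog"]
```

Keeping the directory walk in one place and swapping only the predicate
lets another build extend the cleanup without duplicating the walk.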

Needed for tarantool/tarantool-ee#540

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

xlog: introduce xlog_remove_file function for removing xlog files · d139f245

Vladimir Davydov authored 1 year ago

This commit introduces the xlog_remove_file() function that removes
a file by name and logs the error on failure. We use this function
everywhere we delete xlog files so that there's a single place where we
call unlink(). We also factor out the core functionality to a callback
function that can be overridden. This will help us implement thorough
file deletion.

Needed for tarantool/tarantool-ee#540

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

vinyl: delete run files in single coio call · 8a0c586c

Vladimir Davydov authored 1 year ago

Currently, vy_run_remove_files calls coio several times under the hood -
once per each run file and data directory. Apart from being inefficient,
this also prevents us from adding some extra logic for thorough file
deletion. So let's perform all the operations in a single coio call.
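
The difference between one call per file and one call per batch can be
modelled in Python. A thread pool stands in for coio here, which is an
assumption made for the sake of a runnable example. The helper names and
the file names are hypothetical too.

```python
import errno
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def remove_all(paths):
    """Worker-side routine: unlink every path and collect failures."""
    failed = []
    for path in paths:
        try:
            os.unlink(path)
        except OSError as exc:
            failed.append((path, exc.errno))
    return failed


def remove_files_in_one_call(executor, paths):
    """Hand the whole batch to the worker in a single submission."""
    return executor.submit(remove_all, list(paths)).result()


with tempfile.TemporaryDirectory() as tmp, ThreadPoolExecutor(1) as pool:
    paths = [os.path.join(tmp, n) for n in ("1.run", "1.index")]
    for p in paths:
        open(p, "w").close()
    missing = os.path.join(tmp, "absent.run")
    failed = remove_files_in_one_call(pool, paths + [missing])
    assert failed == [(missing, errno.ENOENT)]
    assert os.listdir(tmp) == []
```

With a single worker-side routine there is one natural place to add
extra per-file logic, which is the second motivation given above.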

Needed for tarantool/tarantool-ee#540

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

Oct 09, 2023

sql: drop struct drop_constraint_def · 8e60908e

Mergen Imeev authored 1 year ago

The structure is no longer used, so it is dropped.

Follow-up #9112

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

sql: introduce variations of DROP CONSTRAINT · 263777dc

Mergen Imeev authored 1 year ago

This patch introduces variations of DROP CONSTRAINT with a declared
constraint type.

Closes #9112

@TarantoolBot document
Title: upgrade of DROP CONSTRAINT

Now, instead of just `ALTER TABLE table DROP CONSTRAINT constraint;` we
have 8 operator variants:
1) Statement to drop PRIMARY KEY, UNIQUE, tuple FOREIGN KEY or tuple
CHECK constraints:
```
ALTER TABLE tab_name DROP CONSTRAINT constr_name;
```

This statement cannot drop a constraint if `constr_name` matches
more than one constraint.

2) Statement to drop field FOREIGN KEY or field CHECK constraints:
```
ALTER TABLE tab_name DROP CONSTRAINT field_name.constr_name;
```

This statement cannot drop a constraint if `constr_name` matches
more than one constraint for the `field_name` field.

3) Statement to drop PRIMARY KEY constraint:
```
ALTER TABLE tab_name DROP CONSTRAINT constr_name PRIMARY KEY;
```

4) Statement to drop UNIQUE constraint:
```
ALTER TABLE tab_name DROP CONSTRAINT constr_name UNIQUE;
```

5) Statement to drop tuple FOREIGN KEY constraint:
```
ALTER TABLE tab_name DROP CONSTRAINT constr_name FOREIGN KEY;
```

6) Statement to drop tuple CHECK constraint:
```
ALTER TABLE tab_name DROP CONSTRAINT constr_name CHECK;
```

7) Statement to drop field FOREIGN KEY constraint:
```
ALTER TABLE tab_name DROP CONSTRAINT field_name.constr_name FOREIGN KEY;
```

8) Statement to drop field CHECK constraint:
```
ALTER TABLE tab_name DROP CONSTRAINT field_name.constr_name CHECK;
```

sql: disallow DROP CONSTRAINT for ambiguous name · 25bd19fa

Mergen Imeev authored 1 year ago

This patch prohibits DROP CONSTRAINT if more than one constraint matches
a given name.
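
The ambiguity rule can be modelled in Python. This is a sketch of the
lookup logic only, not the SQL implementation: constraints are
represented as plain dictionaries, and `resolve_constraint` and
`AmbiguousConstraintError` are hypothetical names. The optional `ctype`
argument mirrors the idea of declaring the constraint type to narrow the
match.

```python
class AmbiguousConstraintError(Exception):
    """More than one constraint matches the given name."""


def resolve_constraint(constraints, name, ctype=None):
    """Return the single constraint matching name (and ctype, if given)."""
    matches = [
        c for c in constraints
        if c["name"] == name and (ctype is None or c["type"] == ctype)
    ]
    if not matches:
        raise LookupError(f"constraint {name!r} not found")
    if len(matches) > 1:
        raise AmbiguousConstraintError(
            f"{len(matches)} constraints are named {name!r}")
    return matches[0]


constraints = [
    {"name": "c1", "type": "UNIQUE"},
    {"name": "c1", "type": "CHECK"},
    {"name": "c2", "type": "PRIMARY KEY"},
]

assert resolve_constraint(constraints, "c2")["type"] == "PRIMARY KEY"
assert resolve_constraint(constraints, "c1", "CHECK")["type"] == "CHECK"
try:
    resolve_constraint(constraints, "c1")
except AmbiguousConstraintError:
    pass  # refusing to guess is the point of the rule
else:
    raise AssertionError("an ambiguous name must not resolve")
```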

Part of #9112

NO_DOC=will be added later
NO_CHANGELOG=will be added later

sql: syntax construction to drop field constraints · 71566f7d

Mergen Imeev authored 1 year ago

This patch introduces "ALTER TABLE table_name DROP CONSTRAINT
field_name.constraint_name" which can be used to drop field constraints.
Also, after this patch, field constraints cannot be dropped using
"ALTER TABLE table_name DROP CONSTRAINT constraint_name;".

Part of #9112

NO_DOC=will be added later
NO_CHANGELOG=will be added later

box: use xregion_alloc() in mp_vformat_on_region() · 001fee1b

Mergen Imeev authored 1 year ago

This patch replaces region_alloc() by xregion_alloc() in
mp_vformat_on_region().

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

ci: add debug_asan_clang workflow · 980ad3f4

Nikolay Shirokovskiy authored 1 year ago

Similar to release_asan_clang, but tests the debug build. It is also run
only under the `asan-ci` and `full-ci` labels.

Fiber stack size is 2 times bigger than in the release workflow for luajit
tests to pass. Note that this factor is a wild guess.

Part of #7327

NO_TEST=ci
NO_CHANGELOG=ci
NO_DOC=ci

test: fix flaky gh-2717-no-quit-sigint · 6f48b8d7

Nikolay Shirokovskiy authored 1 year ago

This test is quite flaky in the debug ASAN build. Let's fix it before
turning debug ASAN on in CI.

The issue is that under heavy load popen.read may return nil with a
'TimedOut: timed out' error. Just read again, as in the other cases of
this test.
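
The read-again pattern can be sketched in Python. The `read` callable
returning a `(data, err)` pair mimics the Lua convention of returning nil
plus an error. It is a stand-in for illustration, not the popen API, and
`read_with_retry` and `flaky_reader` are hypothetical names.

```python
TIMED_OUT = "TimedOut: timed out"


def read_with_retry(read, attempts=10):
    """Call read() until it yields data, retrying only on timeouts."""
    for _ in range(attempts):
        data, err = read()
        if data is not None:
            return data
        if err != TIMED_OUT:
            raise RuntimeError(err)  # any other error is a real failure
    raise TimeoutError(f"no data after {attempts} attempts")


def flaky_reader(results):
    """Build a fake read() that replays the given (data, err) pairs."""
    it = iter(results)
    return lambda: next(it)


reader = flaky_reader([(None, TIMED_OUT), (None, TIMED_OUT), ("ready", None)])
assert read_with_retry(reader) == "ready"
```

Bounding the number of attempts keeps a genuinely hung process from
turning a flaky test into an endless one.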

Part of #7327

NO_CHANGELOG=internal
NO_DOC=internal

asan: temporarily suppress leak reports related to luajit · 37d0fdbf

Nikolay Shirokovskiy authored 1 year ago

This currently blocks us from turning on debug ASAN CI. The ticket for
the leak is #9213.

Part of #7327

NO_TEST=internal
NO_CHANGELOG=internal
NO_DOC=internal

fiber: turn off max slice check with ASAN · 232c28f3

Nikolay Shirokovskiy authored 1 year ago

Introducing ASAN-friendly small allocators slows down execution notably.
As a result, several tests start to fail due to hitting the max slice
limit. I guess we are not interested in whether fibers in an ASAN build
grab control for too long, as we have the release build run in CI anyway.

Some tests set the max slice limit explicitly to some large value, thus
overwriting the default infinity value for ASAN. Unfortunately, this
large value is not large enough for ASAN. Let's set some really large
value.

Part of #7327

NO_CHANGELOG=internal
NO_DOC=internal

ci: add crud integration test run · 1a0fcc07

Oleg Jukovec authored 1 year ago

The patch brings back the integration test run with `crud`. The test run
was removed earlier [1] because `crud` did not support tests with
Tarantool 3.0. Now it does [2].

1. https://github.com/tarantool/tarantool/commit/7316d8165e80b3678b45fd1b42823a8f92b734f6
2. https://github.com/tarantool/crud/pull/381

NO_DOC=ci
NO_TEST=ci
NO_CHANGELOG=ci

perf: add memtx benchmark · 2b7d9027

Georgiy Lebedev authored 2 years ago

This first version is quite basic and only benchmarks random `get`s of
existing keys and `select`s of all keys for a tree index (these benchmarks
are needed for #6964) — its main goal is to provide a foundation (i.e., all
the necessary initialization logic) for benchmarking memtx. Extending this
benchmark using the provided memtx singleton and fixture should be fairly
simple.

The results of running this benchmark compiled with clang-16 on my Intel
MacBook Pro (13-inch, 2020) laptop [1]:

NO_WRAP
georgiy.lebedev@georgiy-lebedev perf % ./memtx.perftest --benchmark_min_warmup_time=10 --benchmark_repetitions=10 --benchmark_report_aggregates_only=true --benchmark_display_aggregates_only=true
2023-10-02T12:59:36+03:00
Running ./memtx.perftest
Run on (8 X 2000 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB
  L1 Instruction 32 KiB
  L2 Unified 512 KiB (x4)
  L3 Unified 6144 KiB
Load Average: 5.67, 10.05, 7.89
mapping 4398046511104 bytes for memtx tuple arena...
Actual slab_alloc_factor calculated on the basis of desired slab_alloc_factor = 1.090508
fiber has not yielded for more than 0.500 seconds
--------------------------------------------------------------------------------------------------------
Benchmark                                              Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------------------
MemtxFixture/TreeGetRandomExistingKeys_mean          682 ns          667 ns           10 items_per_second=1.51504M/s
MemtxFixture/TreeGetRandomExistingKeys_median        704 ns          693 ns           10 items_per_second=1.44387M/s
MemtxFixture/TreeGetRandomExistingKeys_stddev       81.7 ns         72.7 ns           10 items_per_second=169.696k/s
MemtxFixture/TreeGetRandomExistingKeys_cv          11.97 %         10.90 %            10 items_per_second=11.20%
MemtxFixture/TreeGet1RandomExistingKey_mean          253 ns          241 ns           10 items_per_second=4.20104M/s
MemtxFixture/TreeGet1RandomExistingKey_median        233 ns          229 ns           10 items_per_second=4.36911M/s
MemtxFixture/TreeGet1RandomExistingKey_stddev       46.7 ns         29.7 ns           10 items_per_second=464.187k/s
MemtxFixture/TreeGet1RandomExistingKey_cv          18.43 %         12.34 %            10 items_per_second=11.05%
MemtxFixture/TreeSelectAll_mean                  4766728 ns      4705622 ns           10 items_per_second=27.978M/s
MemtxFixture/TreeSelectAll_median                4605936 ns      4580478 ns           10 items_per_second=28.6184M/s
MemtxFixture/TreeSelectAll_stddev                 447495 ns       349499 ns           10 items_per_second=1.85573M/s
MemtxFixture/TreeSelectAll_cv                       9.39 %          7.43 %            10 items_per_second=6.63%
NO_WRAP

[1]: https://support.apple.com/kb/SP819?locale=en_US

Needed for #6964

NO_CHANGELOG=benchmark
NO_DOC=benchmark
NO_TEST=benchmark

box: fix tuple format and access subsystems initialization · 24bb3553

Georgiy Lebedev authored 2 years ago

The tuple format and access subsystems have static variables holding their
states which don't get reset during cleanup: initialize them explicitly in
`*_init` functions — that way we can re-initialize these subsystems
multiple times (e.g., when setting up and tearing down benchmarks). Opted
for initializing them in `*_init` functions rather than resetting them in
`*_free` functions for logical consistency.

Needed for #6964

NO_CHANGELOG=cleanup fix
NO_DOC=cleanup fix
NO_TEST=cleanup fix

box: fix force recovery for transactions with local rows · 85df1c96

Serge Petrenko authored 1 year ago

Force recovery first tries to collect all rows of a transaction into a
single list, and only then applies those rows.

The problem was that it collected rows based on the row replica_id. For
local rows replica_id is set to 0, but actually such rows can be part
of a transaction coming from any instance.

Fix recovery of such rows.

Follow-up #8746
Follow-up #7932

NO_DOC=bugfix
NO_CHANGELOG=the broken behaviour couldn't be seen due to bug #8746

box: get rid of dummy NOPs after transactions ending with local rows · f5e52b2c

Serge Petrenko authored 1 year ago

In order to preserve transaction boundaries over replication, Tarantool
writes a global NOP row after the last transaction row, if this row
happens to be local. This is done to make sure that the is_commit flag,
which is set only in the last transaction row, reaches the replica. This
wouldn't happen if the last row was local.

This workaround works fine for transactions completely authored by one
instance: when both global and local rows come from operations of a
single master.

However, it's possible to append local rows to a remote master's
transaction on a replica. For example, one can use on_replace triggers
to write to replica's local space on each new transaction coming from
master.

In this case essentially a global NOP entry is added at the end of a
remote master's transaction. This leads to several problems.

First of all, this bumps replica's LSN, which is counter-intuitive,
given that the replica might even be read-only. Besides, in a star
topology this leads to master being unable to connect to the replica
later on due to their vclocks becoming incompatible.

Secondly, even if replication channel between master and replica is
bidirectional, it creates a new row which should be replicated from
replica to master, but at the same time is the last row of the master's
transaction. Once master receives this row, it breaks its connection to
replica due to transaction boundary violation (the last row of the
transaction is received without its beginning).

Adding a NOP row became extraneous since the previous commit, which made
relay find transaction boundaries by itself.

Closes #8958

NO_DOC=bugfix

relay: send rows transactionally · f96782b5

Serge Petrenko authored 1 year ago

Some time ago we started writing transaction boundaries to WAL and
respecting them in the replication stream: replicas wait for a full
transaction receipt before applying it.

However, during all these changes relay remained transaction-agnostic:
it simply read single rows from WAL and sent them over to the receiver.

This led to a handful of ugly crutches: for example, tsn is not always
equal to the lsn of the first global row of the transaction: if the
first row is local, tsn is deduced from the first global row of the
transaction.

Also, a dummy NOP was appended to the end of a transaction ending with a
local row, so that the is_commit flag wasn't lost by the replication.

Let's make relay read a full transaction, filter out all the unnecessary
rows, set the transaction boundaries accordingly and then send the
transaction at once.
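
The read, filter, and re-mark sequence can be sketched in Python. This is
a simplified model and not the relay code: rows are plain dictionaries,
"unnecessary" is assumed to mean local rows, and setting tsn to the lsn
of the first remaining row is an assumption made for the illustration.

```python
def prepare_tx_for_relay(rows):
    """Drop local rows and recompute tsn and is_commit for the rest."""
    global_rows = [dict(r) for r in rows if not r["is_local"]]
    if not global_rows:
        return []  # nothing to send for a fully local transaction
    tsn = global_rows[0]["lsn"]
    for row in global_rows:
        row["tsn"] = tsn
        row["is_commit"] = False
    global_rows[-1]["is_commit"] = True  # the last sent row closes the tx
    return global_rows


# A transaction that both starts and ends with a local row.
tx = [
    {"lsn": 1, "is_local": True, "tsn": 1, "is_commit": False},
    {"lsn": 10, "is_local": False, "tsn": 1, "is_commit": False},
    {"lsn": 11, "is_local": False, "tsn": 1, "is_commit": False},
    {"lsn": 2, "is_local": True, "tsn": 1, "is_commit": True},
]

sent = prepare_tx_for_relay(tx)
assert [r["lsn"] for r in sent] == [10, 11]
assert [r["tsn"] for r in sent] == [10, 10]
assert [r["is_commit"] for r in sent] == [False, True]
```

Once the boundaries are computed over the rows that are actually sent, a
trailing local row no longer needs a dummy NOP to carry the commit flag.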

Since in relay a single fiber sends data to the remote peer, there is no
chance for a heartbeat to get in between rows of a single transaction:
they're all sent at once. Hence the deletion of a corresponding guard
`relay->is_sending_tx`.

Prerequisite #8958

NO_DOC=internal change
NO_CHANGELOG=internal change
NO_TEST=covered by existing tests

wal: fix transaction boundaries for replicated transactions · 099cb2da

Serge Petrenko authored 1 year ago

Transaction boundaries were not updated correctly for transactions in
which local space writes were made from a replication trigger. Existing
transaction boundaries and row flags from the master were written as is
on the replica. Actually, the replica should recalculate transaction
boundaries and even WAIT_SYNC/WAIT_ACK flags.

Transaction boundaries should be recalculated when a replica appends a
local write at the end of the master's transaction, and
WAIT_SYNC/WAIT_ACK should be overwritten when nopifying synchronous
transactions coming from an old term.

The latter fix has uncovered the bug in skipping outdated synchronous
transactions: if one replica replaces a transaction from an old term
with NOPs and then passes that transaction to the other replica, the
other replica raises a split brain error. It believes the NOPs are an
async transaction from an old term. This worked before the fix, because
the rows were written with the original WAIT_ACK = true bit. Now this
is fixed properly: we allow fully NOP async transactions from the old
term.

Closes #8746

NO_DOC=bugfix
NO_CHANGELOG=covered by the next commit

Oct 06, 2023

misc: add missing changelogs for trigger registry · 2d5e9874

Andrey Saranchin authored 1 year ago

The commit adds missing changelogs for tarantool.trigger.on_change and
triggers that were moved to the trigger registry. The second changelog
is especially important because it describes a breaking change of space
triggers behavior.

Follow-up #8664
Part of #8657

NO_TEST=changelog
NO_DOC=later
