- Oct 26, 2023
-
-
Nikolay Shirokovskiy authored
Normally the fiber stack slab is page aligned, so the upper stack border is page aligned too when the stack grows down. But with the ASAN-friendly slab cache implementation this border is not page aligned. As a result, the madvise call on the stack may zero memory beyond the stack slab, which causes heap corruption. In a debug build the corruption is detected by an assertion: NO_WRAP > Fatal glibc error: malloc.c:2593 (sysmalloc): assertion failed: (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0) NO_WRAP Interestingly enough, the issue cannot be investigated using ASAN. The memory is zeroed by kernel code, which is not instrumented, so it is invisible to the sanitizer. It looks like non-ASAN builds are not affected: even if stack_size is not page aligned, the slab allocated for the stack is page aligned. Thus memory zeroing stays inside the slab and there is no memory corruption. Also, when the stack grows up, the lower stack border is not aligned even with the regular small implementation. So the madvise call fails with EINVAL, as the start address is required to be page aligned. We ignore the error though. Let's fix this issue too while we are at it. Let's introduce fiber_madvise_aligned to align the madvise range in the proper direction before calling madvise(2). To justify its usage, note that besides fixing the issues described above, in case of the stack growing down fiber->stack is page aligned and in case of the stack growing up fiber->stack + fiber->stack_size is page aligned. Part of #7327 NO_TEST=tested by ASAN (debug build) NO_CHANGELOG=has effect only with newly introduced ASAN friendly slab cache NO_DOC=has effect only with newly introduced ASAN friendly slab cache (cherry picked from commit 130c7807)
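The alignment problem in the message above can be illustrated with a small arithmetic model. This is a sketch only, not the code of fiber_madvise_aligned: the helper names `shrink_to_pages` and `kernel_range`, the 4096-byte page size, and the policy of rounding the start up and the end down are all assumptions made for illustration. It relies on two documented properties of madvise(2): the start address must be page aligned, and the length is rounded up to a multiple of the page size, which is how an unaligned upper border can reach past the slab.

```python
PAGE_SIZE = 4096  # assumed page size for the example; real code would query it


def shrink_to_pages(start, end, page_size=PAGE_SIZE):
    """Return the largest page-aligned sub-range of [start, end), or None.

    Hypothetical helper: rounds start up and end down so that an
    madvise-style call never touches bytes outside the original range.
    """
    aligned_start = (start + page_size - 1) & ~(page_size - 1)
    aligned_end = end & ~(page_size - 1)
    if aligned_start >= aligned_end:
        return None  # no whole page fits inside the range
    return aligned_start, aligned_end


def kernel_range(start, length, page_size=PAGE_SIZE):
    """Model of the range madvise(2) acts on: length is rounded up to pages."""
    rounded = (length + page_size - 1) & ~(page_size - 1)
    return start, start + rounded


# A stack whose lower border is aligned but whose size is not a page multiple.
stack = 0x10000
stack_size = 3 * PAGE_SIZE + 100
unsafe = kernel_range(stack, stack_size)            # reaches past the stack end
safe = shrink_to_pages(stack, stack + stack_size)   # stays inside the stack
```

In this model `unsafe` ends a full page boundary beyond `stack + stack_size`, while `safe` covers only the three whole pages inside the stack.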
-
Nikolay Shirokovskiy authored
The unpoison was added in the initial commit 1.7.2-68-gafd229393 that supported ASAN. It is not clear why we need it, as we don't poison stack memory manually. Part of #7327 NO_TEST=removing unfunctional code NO_CHANGELOG=removing unfunctional code NO_DOC=removing unfunctional code (cherry picked from commit 0784f7b7)
-
Nikolay Shirokovskiy authored
The ASAN small object allocator implementation has a slightly different pattern of quota leasing when allocating memory. So we may need to allocate more objects to hit the quota etc. Part of #7327 NO_CHANGELOG=test tuning NO_DOC=test tuning (cherry picked from commit d456a986)
-
Mergen Imeev authored
This patch removes some deprecated code. This code had no user-visible effect, but caused problems when running the test with ASAN enabled. Closes #8761 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit d63a4bf2)
-
Nikolay Shirokovskiy authored
The regular region implementation supports allocations of size 0 with no extra effort. It returns a non-NULL pointer in this case. However, the ASAN-friendly implementation would require special care for this case. Instead, let's avoid allocations of size 0 for region. Also use xregion_ macros for allocations. Our current policy is to panic on OOM on runtime allocations. Part of tarantool/tarantool#7327 NO_TEST=internal NO_CHANGELOG=internal NO_DOC=internal (cherry picked from commit 8159347d)
-
Nikolay Shirokovskiy authored
Small library currently depends on Tarantool core through 'exception.h'. This is not the way to go. Let's drop this dependency and instead of moving _xc functions to Tarantool repo we can just stop using them. Our current policy is to panic on OOM in case of runtime allocation. Part of #7327 NO_DOC=<OOM behaviour is not documented> NO_CHANGELOG=<no OOM expectations> NO_TEST=<no test harness for checking OOM> (cherry picked from commit 3fccfc8f)
-
Nikolay Shirokovskiy authored
They are rather noisy. Also delete debug log on arena creation. These two make sense only with each other. Part of #7327 NO_TEST=internal NO_DOC=internal NO_CHANGELOG=internal (cherry picked from commit 0dc37356)
-
Nikolay Shirokovskiy authored
Panic if we fail to allocate internal temporary objects on region. We do not test allocation failures and this should not normally happen anyway (see #3534). Part of #8658 NO_DOC=code cleanup NO_TEST=code cleanup NO_CHANGELOG=code cleanup (cherry picked from commit b1a03a49)
-
Mergen Imeev authored
This patch replaces region_*() functions with xregion_*() functions. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 1ba84fe3)
-
Mergen Imeev authored
This patch removes the 'size' argument from macros, as it was only used to set an error on failure, which is not possible for x* versions. In addition, both macros now cast the value to the specified type, as is done in the original macros. Closes #8522 NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal (cherry picked from commit ae02f0cd)
-
Mergen Imeev authored
This patch fixes SQL memory leaks found by static analyzers and SQL fuzzer. Part of tarantool/security#120 NO_DOC=fix for memleak NO_TEST=fix for memleak NO_CHANGELOG=fix for memleak (cherry picked from commit cd173ce5)
-
Nikolay Shirokovskiy authored
The proposed ASAN implementation of the region allocator does not support double reservation for the sake of simplicity. Every reservation is supposed to be followed by one or more allocations. This restriction does not work well with mpstream currently. The issue is that mpstream_init/mpstream_reserve make a reservation of size 0. For example, in case of region a slab of min order is reserved (a chunk of memory of page size currently). If the first data we want to write to mpstream is larger than the reservation done, then we make a reservation again. Let's get rid of this reservation at the beginning, as it is suboptimal behaviour. Moreover, let's get rid of mpstream_reset, as mpstream_init is lightweight and we can create a new mpstream instead of reusing an existing one. Also, while we are at it, avoid allocation of size 0 in mpstream_flush, as is done in mpstream_reserve_slow (see 3.0.0-alpha3-19-g8159347d0 "misc: avoid allocations of size 0 for region" for details). NO_TEST=internal NO_CHANGELOG=internal NO_DOC=internal (cherry picked from commit 3b1de78d)
-
Nikolay Shirokovskiy authored
This way we will have access to build info in those modules. In particular, the build.asan flag is going to be used in buffer.lua in the scope of #7327. Part of #7327 NO_TEST=internal NO_DOC=internal NO_CHANGELOG=internal (cherry picked from commit f58cc96f)
-
Nikolay Shirokovskiy authored
We already use this info in one of the tests and are going to use it more. Part of #7327 @TarantoolBot document Title: new tarantool.build.asan flag It is `true` if `ENABLE_ASAN` build option is set and `false` otherwise. (cherry picked from commit 23012356)
-
Vladimir Davydov authored
The check_param and check_param_table Lua helpers are defined in box/lua/schema.lua but used across the whole code base. The problem is we can't use them in files that are loaded before box/lua/schema.lua, like box/lua/session.lua. Let's move them to a separate source file lua/utils.lua to overcome this limitation. Also, let's add some tests. NO_DOC=refactoring NO_CHANGELOG=refactoring (cherry picked from commit d8d267c5)
-
Nikolay Shirokovskiy authored
We hit #3807 in release/2.11 for release ASAN build with ASAN-friendly small allocators. Follow-up #7327 NO_CHANGELOG=internal NO_DOC=internal (cherry picked from commit 3fbd7fcb)
-
- Oct 24, 2023
-
-
Vladimir Davydov authored
Configuring log modules works differently with log.cfg and box.cfg: box.cfg{log_modules=...} overwrites the current config completely, while log.cfg{modules=...} overwrites the current config only for the specified modules. Let's fix this inconsistency by making log.cfg behave exactly as box.cfg. Closes #7962 NO_DOC=bug fix (cherry picked from commit c13e59a5)
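The difference between the two behaviours described above can be modelled with plain dictionaries. This is an illustrative sketch, not Tarantool code: the function names `cfg_merge` and `cfg_overwrite` and the module names `mod_a` and `mod_b` are hypothetical.

```python
def cfg_merge(current, update):
    """Model of the old log.cfg{modules=...}: only the listed modules change."""
    merged = dict(current)
    merged.update(update)
    return merged


def cfg_overwrite(current, update):
    """Model of box.cfg{log_modules=...}: the new table replaces the old one."""
    return dict(update)


current = {"mod_a": "debug", "mod_b": "warn"}  # hypothetical per-module levels
update = {"mod_a": "info"}

merged = cfg_merge(current, update)        # mod_b keeps its old level
replaced = cfg_overwrite(current, update)  # mod_b falls back to the default
```

After the fix described in the commit message, both entry points are expected to follow the second model.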
-
- Oct 20, 2023
-
-
Vladimir Davydov authored
We install a signal handler that prints the stack trace on SIGSEGV, SIGBUS, SIGILL, SIGFPE. The signal handler uses the current stack. This works fine for most issues, but not for stack overflow, because the latter makes the current stack unusable, leading to a crash in the signal handler. Let's install an alternative signal stack in each thread so that we can print the stack trace on stack overflow. Note that we skip this for ASAN because it installs its own signal stack. (Installing a custom stack would result in a crash.) Closes #9222 NO_DOC=bug fix (cherry picked from commit cb8e903b)
-
- Oct 17, 2023
-
-
Nikolay Shirokovskiy authored
The motivation is to reduce time slip on Tarantool startup before running init scripts. Internal ev time is set in fiber_init/ev_default_loop and is not updated until the event loop starts. This causes timeouts to slip by up to 0.3 in the debug ASAN build in the init script (see #9261). Let's run the event loop right at the beginning of run_script_f, before executing any script. This way, besides updating internal ev time, we make an explicit place for starting the script event loop. Currently it is started lazily when the config script yields. This will fix CI for PR https://github.com/tarantool/tarantool-ee/pull/572 for the debug ASAN workflow. We can also remove the start_loop condition. It does not make sense now. It was added in commit 3a851430 ("Fix tarantool -e "os.exit()" hang"), but since then we have started stopping the event loop after handling os.exit(). Also this fixes #9266. The issue is that we don't have an event loop to run on shutdown triggers if a -e command line expression adds such a trigger and then calls os.exit(). Follow-up #7327 Closes #9266 NO_DOC=bugfix (cherry picked from commit 1fcfb8c2)
-
Pavel Balaev authored
This patch fixes issue: $ tarantoolctl rocks --version 1>/dev/null Warning: failed to load command module luarocks.cmd.help NO_DOC=bugfix NO_CHANGELOG=not released yet (cherry picked from commit d6ae403e)
-
- Oct 16, 2023
-
-
Vladimir Davydov authored
Tarantool supports two console protocols: text and binary. The binary protocol is implemented with IPROTO EVAL request so the console module reuses the net.box module to establish and maintain a binary connection. Currently, instead of passing the original URI specified by the user to net.box.connect as is, the console module parses the URI and passes the host and port. As a result, extra information that may be specified in URI parameters is lost. This prevents the user from connecting to the binary console using the SSL transport because to use the SSL transport the user must specify transport=ssl URI parameter. Needed for tarantool/tarantool-ee#567 NO_DOC=no visible changes in CE NO_TEST=no visible changes in CE NO_CHANGELOG=no visible changes in CE (cherry picked from commit 33e72567)
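The loss of URI parameters described above is easy to reproduce in a model. The sketch below uses Python's `urllib.parse` as a stand-in for Tarantool's own URI parser, which is an assumption made only for illustration; the helper names are hypothetical, and the leading `//` is there only so that `urlsplit` treats the text as a network location.

```python
from urllib.parse import parse_qs, urlsplit


def host_port_only(uri):
    """Keep just host and port, as the console module is described to do."""
    parts = urlsplit(uri)
    return parts.hostname, parts.port


def uri_params(uri):
    """Parameters that survive only if the original URI is passed as is."""
    return parse_qs(urlsplit(uri).query)


uri = "//localhost:3301?transport=ssl"
reduced = host_port_only(uri)  # the transport parameter is gone
params = uri_params(uri)       # still available in the original URI
```

Once the URI is reduced to a host and a port, nothing downstream can tell that the user asked for `transport=ssl`.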
-
- Oct 13, 2023
-
-
Ilya Verbin authored
During building an index in background, some transaction can perform a dml request that affects space size (e.g. a replace), but the size will remain the same, because bsize is moved from the old space to the new space in memtx_space_prepare_alter() prior to space_execute_dml(). Fix this issue by calling space_finish_alter() in alter_space_do(). In fact, this patch partially reverts commit 9ec3b1a4 ("alter: zap space_vtab::commit_alter"). NO_DOC=bugfix Closes #9247 (cherry picked from commit 54a42186)
-
- Oct 12, 2023
-
-
Oleg Chaplashkin authored
These tests fail after the commit [1] has been added to the Luatest: - app-luatest/gh_8083_fatal_signal_handler_test.lua - app-luatest/gh_8445_crash_during_crash_report_test.lua - box-luatest/gh_7434_yield_in_on_shutdown_trigger_test.lua The issue is due to lack of necessary directories: sh: 1: cd: can't cd to /tmp/t/001_app-luatest/server-XXX Just switch the tests to the simple `fio` module instead of `luatest.server`. [1] tarantool/luatest@7d1358c NO_CHANGELOG=internal NO_DOC=internal (cherry picked from commit 23b61351)
-
Oleg Chaplashkin authored
Bump test-run to new version with the following improvements: - luatest: bump luatest to 0.5.7-48-g18859f6 [1] - Adapt use luatest with new --no-clean option [2] - luatest: bump luatest to 0.5.7-49-g9c7710e [3] [1] tarantool/test-run@aa3b34d [2] tarantool/test-run@8ebb3aa [3] tarantool/test-run@82542d3 NO_DOC=test NO_TEST=test NO_CHANGELOG=test (cherry picked from commit f4bc53e8)
-
- Oct 11, 2023
-
-
Nikolay Shirokovskiy authored
The test started to fail in CI on the osx_debug (x86_64) workflow:
```
[033] *** test_buffer_foreach_copy_number ***
[033] -ok 13 - prbuf(size=256, payload=16, iterations=16) has been validated
[033] -ok 14 - prbuf(size=256, payload=16, iterations=32) has been validated
[033] -ok 15 - prbuf(size=256, payload=16, iterations=64) has been validated
[033] +ok 13 - prbuf(size=256, payload=4294967312, iterations=16) has been validated
[033] +ok 14 - prbuf(size=256, payload=4294967312, iterations=32) has been validated
[033] +ok 15 - prbuf(size=256, payload=4294967312, iterations=64) has been validated
[033] *** test_buffer_foreach_copy_number: done ***
```
NO_CHANGELOG=test fix NO_DOC=test fix (cherry picked from commit 4a868563)
-
- Oct 10, 2023
-
-
Mergen Imeev authored
Before this patch, if an index was created due to a column's UNIQUE constraint or a column's PRIMARY KEY constraint before adding a collation, and if the column's fieldno was not equal to the index's position in space->index, the collation would not be assigned to the index. Also, this patch fixes an assertion in debug build for the case when an index with more than one field was created before a collation was added. Closes #9229 NO_DOC=bugfix (cherry picked from commit 65608d87)
-
Nikolay Shirokovskiy authored
Similarly to release_asan_clang but to test debug build. It is also run only under `asan-ci` and `full-ci` labels. Fiber stack size is 2 times bigger than in the release workflow for luajit tests to pass. Note that this factor is a wild guess. Part of #7327 NO_TEST=ci NO_CHANGELOG=ci NO_DOC=ci (cherry picked from commit 980ad3f4)
-
Vladimir Davydov authored
Required to suppress the ASAN leak detector. Closes #9158 NO_DOC=ASAN NO_TEST=ASAN NO_CHANGELOG=ASAN (cherry picked from commit bf62170f)
-
Nikolay Shirokovskiy authored
This test is quite flaky in the debug ASAN build. Let's fix it before turning debug ASAN on in CI. The issue is that under heavy load popen.read may return nil with a 'TimedOut: timed out' error. Just read again, as in the other cases of this test. Part of #7327 NO_CHANGELOG=internal NO_DOC=internal (cherry picked from commit 6f48b8d7)
-
Nikolay Shirokovskiy authored
This currently blocks us from turning on debug ASAN CI. The ticket for the leakage is #9213. Part of #7327 NO_TEST=internal NO_CHANGELOG=internal NO_DOC=internal (cherry picked from commit 37d0fdbf)
-
- Oct 09, 2023
-
-
Serge Petrenko authored
Force recovery first tries to collect all rows of a transaction into a single list, and only then applies those rows. The problem was that it collected rows based on the row replica_id. For local rows replica_id is set to 0, but actually such rows can be part of a transaction coming from any instance. Fix recovery of such rows. Follow-up #8746 Follow-up #7932 NO_DOC=bugfix NO_CHANGELOG=the broken behaviour couldn't be seen due to bug #8746 (cherry picked from commit 85df1c96)
-
Serge Petrenko authored
In order to preserve transaction boundaries over replication, Tarantool writes a global NOP row after the last transaction row, if this row happens to be local. This is done to make sure that the is_commit flag, which is set only in the last transaction row, reaches the replica. This wouldn't happen if the last row was local. This workaround works fine for transactions completely authored by one instance: when both global and local rows come from operations of a single master. However, it's possible to append local rows to a remote master's transaction on a replica. For example, one can use on_replace triggers to write to replica's local space on each new transaction coming from master. In this case essentially a global NOP entry is added at the end of a remote master's transaction. This leads to several problems. First of all, this bumps replica's LSN, which is counter-intuitive, given that the replica might even be read-only. Besides, in a star topology this leads to master being unable to connect to the replica later on due to their vclocks becoming incompatible. Secondly, even if replication channel between master and replica is bidirectional, it creates a new row which should be replicated from replica to master, but at the same time is the last row of the master's transaction. Once master receives this row, it breaks its connection to replica due to transaction boundary violation (the last row of the transaction is received without its beginning). Adding a NOP row became extraneous since the previous commit, which made relay find transaction boundaries by itself. Closes #8958 NO_DOC=bugfix (cherry picked from commit f5e52b2c)
-
Serge Petrenko authored
Some time ago we started writing transaction boundaries to WAL and respecting them in the replication stream: replicas wait for a full transaction receipt before applying it. However, during all these changes relay remained transaction-agnostic: it simply read single rows from WAL and sent them over to the receiver. This led to a handful of ugly crutches: for example, tsn is not always equal to the lsn of the first global row of the transaction: if the first row is local, tsn is deduced from the first global row of the transaction. Also a dummy NOP was appended to the end of a transaction ending with a local row, so that the is_commit flag wasn't lost by the replication. Let's make relay read a full transaction, filter out all the unnecessary rows, set the transaction boundaries accordingly and then send the transaction at once. Since in relay a single fiber sends data to the remote peer, there is no chance for a heartbeat to get in between rows of a single transaction: they're all sent at once. Hence the deletion of a corresponding guard `relay->is_sending_tx`. Prerequisite #8958 NO_DOC=internal change NO_CHANGELOG=internal change NO_TEST=covered by existing tests (cherry picked from commit f96782b5)
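The "read a full transaction, filter, then set boundaries" step described above can be sketched as follows. This is a simplified model rather than the relay code: the row layout (dicts with `lsn` and `is_local`), the function name `relay_filter_tx`, and the assumption that local rows are the only rows filtered out are all illustrative.

```python
def relay_filter_tx(rows):
    """Drop local rows from one transaction and recompute its boundaries.

    Each input row is a dict with 'lsn' and 'is_local'. The result contains
    copies of the remaining rows with 'tsn' and 'is_commit' set over the
    rows that will actually be sent.
    """
    sent = [dict(row) for row in rows if not row["is_local"]]
    if not sent:
        return []  # nothing to replicate from this transaction
    tsn = sent[0]["lsn"]  # tsn is the lsn of the first row that is sent
    for row in sent:
        row["tsn"] = tsn
        row["is_commit"] = False
    sent[-1]["is_commit"] = True  # only the last sent row carries the flag
    return sent


tx = [
    {"lsn": 5, "is_local": True},
    {"lsn": 6, "is_local": False},
    {"lsn": 7, "is_local": False},
    {"lsn": 8, "is_local": True},
]
out = relay_filter_tx(tx)
```

Because the boundaries are computed after filtering, the model needs neither a tsn workaround for a leading local row nor a trailing NOP to carry is_commit.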
-
Serge Petrenko authored
Transaction boundaries were not updated correctly for transactions in which local space writes were made from a replication trigger. Existing transaction boundaries and row flags from the master were written as is on the replica. Actually, the replica should recalculate transaction boundaries and even WAIT_SYNC/WAIT_ACK flags. Transaction boundaries should be recalculated when a replica appends a local write at the end of the master's transaction, and WAIT_SYNC/WAIT_ACK should be overwritten when nopifying synchronous transactions coming from an old term. The latter fix has uncovered a bug in skipping outdated synchronous transactions: if one replica replaces a transaction from an old term with NOPs and then passes that transaction to the other replica, the other replica raises a split brain error. It believes the NOPs are an async transaction from an old term. This worked before the fix, because the rows were written with the original WAIT_ACK = true bit. Now this is fixed properly: we allow fully NOP async transactions from the old term. Closes #8746 NO_DOC=bugfix NO_CHANGELOG=covered by the next commit (cherry picked from commit 099cb2da)
-
- Oct 05, 2023
-
-
Nikolay Shirokovskiy authored
If a non-terminal symbol is referenced in C code, then the destructor for the expression is not called. Thus we don't need to duplicate. Otherwise we get a memory leak. See https://www.sqlite.org/cgi/src/doc/trunk/doc/lemon.html#destructor Close #9159 NO_DOC=bugfix NO_TEST=tested by debug ASAN CI (to be turned on) (cherry picked from commit 36ef3fb4)
-
- Oct 03, 2023
-
-
Nikolay Shirokovskiy authored
It is convenient to have a label to run ASAN CI without running full CI. NO_DOC=ci NO_TEST=ci NO_CHANGELOG=ci (cherry picked from commit c0025ffb)
-
Sergey Bronnikov authored
Performance tests added to the perf directory are not automated, and currently we run these tests manually from time to time. On the other hand, source code that is rarely used is prone to software rot [1]. The patch adds the CMake target "test-perf" and a GitHub workflow that runs these tests in CI. The workflow is based on the release.yml workflow; it builds performance tests and runs them. 1. https://en.wikipedia.org/wiki/Software_rot NO_CHANGELOG=testing NO_DOC=testing NO_TEST=testing (cherry picked from commit 5edcb712)
-
Sergey Bronnikov authored
Note that targets for running performance tests are generated only when CMAKE_BUILD_TYPE is equal to Release or RelWithDebInfo. Additionally, C++ performance tests require the Google Benchmark library. Using a non-debug build and having the Google Benchmark library installed is a rare case, so I suppose we don't need to introduce a CMake option for performance testing. NO_CHANGELOG=testing NO_DOC=testing NO_TEST=testing infrastructure (cherry picked from commit a63d291b)
-
Sergey Bronnikov authored
The patch adds a target for each C performance test in the perf/ directory and a separate target "test-c-perf" that runs all C performance tests at once. NO_CHANGELOG=testing NO_DOC=testing NO_TEST=test infrastructure (cherry picked from commit 68623381)
-
Sergey Bronnikov authored
The patch adds a target for each Lua performance test in the perf/lua/ directory (1mops_write_perftest, box_select_perftest, uri_escape_unescape_perftest) and a separate target "test-lua-perf" that runs all Lua performance tests at once. NO_CHANGELOG=testing NO_DOC=testing NO_TEST=test infrastructure (cherry picked from commit 49d9a874)
-