- Sep 28, 2022
-
-
Georgiy Lebedev authored
`struct memtx_story` has a `space` field, which is basically used to identify that a tuple is unlinked from the history chain in `memtx_tx_index_invisible_count_slow` (though this can be determined by its presence in the index) and is used to get the space's index in `memtx_tx_story_link_top` (though it can be retrieved from the older story's link field): remove this redundant field. Needed for #7343 NO_CHANGELOG=<refactoring> NO_DOC=<refactoring> NO_TEST=<refactoring> (cherry picked from commit 55e64a8d)
-
Georgiy Lebedev authored
When a space is deleted, all transactions need to be aborted and all their stories need to be removed immediately, out of order. Currently we artificially roll back statements: instead, call this statement removal to logically distinguish it from rollback. It differs in that the whole space's tuple history is torn down, so no more transaction management is going to be done, as opposed to the rollback of an individual transaction. Needed for #7343 NO_CHANGELOG=refactoring NO_DOC=refactoring NO_TEST=refactoring (cherry picked from commit 88203d4f)
-
Georgiy Lebedev authored
Follow the `memtx_tx_history_{add, prepare}_{insert, delete}` pattern: split the code responsible for rolling back the addition and deletion of a story into separate functions. Needed for #7343 NO_CHANGELOG=refactoring NO_DOC=refactoring NO_TEST=refactoring (cherry picked from commit 9dd27681)
-
Georgiy Lebedev authored
When a statement is rolled back, we need to remove the delete statements attached to the story it adds by relinking them and making them delete an older story in the history chain: refactor this loop out into a separate function. Needed for #7343 NO_CHANGELOG=refactoring NO_DOC=refactoring NO_TEST=refactoring (cherry picked from commit 1da727f6)
-
Georgiy Lebedev authored
If a statement becomes prepared, the story it adds must be 'sunk' to the level of prepared stories: refactor this loop into a separate function. Needed for #7343 NO_CHANGELOG=refactoring NO_DOC=refactoring NO_TEST=refactoring (cherry picked from commit b25d3729)
-
- Sep 26, 2022
-
-
Vladislav Shpilevoy authored
If an update operation tried to insert a new key into a map or an array that was created by a previous update operation, the process would fail an assertion. That was because the first operation was stored as a bar update. The second operation tried to branch it, assuming that the entire JSON path of the bar update must exist, but this was not the case for the newly created part of the path. The solution is to fall back to branching before the end of the bar path if we can see that the next part of the path can't be found. Closes #7705 NO_DOC=bugfix (cherry picked from commit 8425ebfc)
-
- Sep 23, 2022
-
-
Georgiy Lebedev authored
TREE (HASH) index implements `random` method: if the space is empty from the transaction's perspective, which means we have to return nothing, add gap tracking of whole range (full scan tracking), since this result is equivalent to `index:select{}`, otherwise repeatedly call `random` and clarify result, until we get a non-empty one. We do not care about performance here, since all operations in context of transaction management currently have O(number of dirty tuples) complexity. Closes #7670 NO_DOC=bugfix (cherry picked from commit 1b82beb2)
-
Vladimir Davydov authored
This commit moves the code that gets the index of a random light record from the memtx hash index implementation to a new light method. This gives us more freedom of refactoring the light internals without modifying the code using it. After this change, LIGHT(pos_valid) isn't needed anymore so it's inlined in LIGHT(random). Needed for #7192 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 76add786)
-
Georgiy Lebedev authored
Since `key_def_merge` sets the merged key definition's unique part count equal to the new part count, the extra assignment in case the index is not unique is redundant: remove it. NO_CHANGELOG=<refactoring> NO_DOC=<refactoring> NO_TEST=<refactoring> (cherry picked from commit 1d6c92e5)
-
Georgiy Lebedev authored
If TREE index `get` result is empty, the key part count is incorrectly compared to the tree's `cmp_def->part_count`, though it should be compared with `cmp_def->unique_part_count`. But we can actually assume that by the time we get to the index's `get` method the part count is equal to the unique part count (partial keys are rejected and `get` is not supported for non-unique indexes): change check to correct assertion. Closes #7685 NO_DOC=<bugfix> (cherry picked from commit bfcd8ca7)
-
- Sep 21, 2022
-
-
Boris Stepanenko authored
Replaced the assertions that no one started new elections or promoted while the limbo was being acquired with checks that the raft term and the limbo term didn't change. If they did change, don't write DEMOTE/PROMOTE and just release the limbo, because it is already owned by someone else or soon will be. Closes #7086 NO_DOC=Bugfix (cherry picked from commit 8ee0e434)
-
- Sep 16, 2022
-
-
Ilya Verbin authored
Currently, it is possible to create a constraint with a name that does not match the rules for identifiers. Fix this by validating them by identifier_check. Closes #7201 NO_DOC=bugfix NO_CHANGELOG=minor bug (cherry picked from commit 1d00b544)
-
- Sep 15, 2022
-
-
Yaroslav Lobankov authored
Bump test-run to new version with the following improvements: - Improve getting iproto port for tarantool < 2.4.1 [1] [1] https://github.com/tarantool/test-run/pull/349 NO_DOC=testing stuff NO_TEST=testing stuff NO_CHANGELOG=testing stuff (cherry picked from commit 4668db62)
-
Ilya Verbin authored
Introduce cmake option ENABLE_HARDENING, which is TRUE by default for non-debug regular and static builds, excluding AArch64 and FreeBSD. It passes compiler flags that harden Tarantool (including the bundled libraries) against memory corruption attacks. The following flags are passed:
* -Wformat - Check calls to printf and scanf, etc., to make sure that the arguments supplied have types appropriate to the format string specified.
* -Wformat-security -Werror=format-security - Warn about uses of format functions that represent possible security problems, and turn the warning into an error.
* -fstack-protector-strong - Emit extra code to check for buffer overflows, such as stack smashing attacks.
* -fPIC -pie - Generate position-independent code (PIC). This makes it possible to take advantage of Address Space Layout Randomization (ASLR).
* -z relro -z now - Resolve all dynamically linked functions at the beginning of the execution, and then make the GOT read-only.
Also do not disable hardening for Debian and RPM-based Linux distros. Closes #5372 Closes #7536 NO_DOC=build NO_TEST=build (cherry picked from commit e6abe1c9)
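To illustrate the class of bug that `-Wformat-security` targets, here is a minimal sketch in C. The helper name `render_message()` is hypothetical and is not taken from the Tarantool sources; it only shows the safe pattern of passing untrusted text as an argument rather than as the format string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a Tarantool function): copy untrusted text
 * into buf and return the length snprintf() reports.
 *
 * The unsafe variant would be:
 *     snprintf(buf, size, untrusted);
 * With -Wformat-security -Werror=format-security the compiler is expected
 * to reject it, because conversions such as %n or %s inside the untrusted
 * text would be interpreted as format directives.
 */
int
render_message(char *buf, size_t size, const char *untrusted)
{
	assert(buf != NULL && size > 0 && untrusted != NULL);
	/* The text is data: it goes through a constant "%s" format. */
	return snprintf(buf, size, "%s", untrusted);
}
```

With the constant format string, a message such as `"100%n%s"` is copied verbatim instead of being parsed for conversions.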
-
Georgiy Lebedev authored
`directly_replaced` stories can potentially get garbage collected in `memtx_tx_handle_gap_write`, which is unexpected and leads to 'use after free': in order to fix this, limit garbage collection points only to external API calls. Wrap all possible garbage collection points with explicit warnings (see c9981a56). Closes #7449 NO_DOC=bugfix (cherry picked from commit 18e042f5)
-
- Sep 14, 2022
-
-
Alexander Turenko authored
All merge sources (including the merger itself) share the same `<merge source>:pairs()` implementation, which returns a `gen, param, state` triplet. `gen` is `lbox_merge_source_gen()`, `param` is `nil`, `state` is the merge source. `lbox_merge_source_gen()` returns `source, tuple`. The returned source is supposed to be the same object as the one passed to the function (`gen(param, state)`), so the function assumes the object is alive and neither increments the source's refcounter on entry nor decrements it on exit. This logic is fine, but there was a mistake in the implementation: the function returns a new cdata object (which holds the same pointer to the merge source structure) instead of the same cdata object. The new cdata object neither increases the source's refcounter when pushed to Lua nor decreases it when collected. As a result, if we lose the original merge source object (and the first `state` that is returned from `:pairs()`), the source structure may be freed. The pointer in the new cdata object then becomes invalid. A sketch of code that illustrates the problem:
```lua
gen, param, state0 = source:pairs()
assert(state0 == source)
source = nil
state1, tuple = gen(param, state0)
state0 = nil
-- assert(state1 == source) -- would fail
collectgarbage()
-- The cdata object that is referenced as `source` and as `state`
-- is collected. The GC handler is called and dropped the merge
-- source structure refcounter to zero. The structure is freed.
-- The call below will crash.
gen(param, state1)
```
In the fixed code `state1 == source`, so the GC handler is not called prematurely: the merge source object stays alive until the end of the iterator or until the traversal is stopped. Fixes #7657 NO_DOC=a crash is definitely not what we want to document (cherry picked from commit 3bc64229)
-
- Sep 13, 2022
-
-
Yaroslav Lobankov authored
- Remove unused imports - Remove unnecessary creation of 'replica' instance objects - Use `<instance>.iproto.uri` object attribute instead of calling `box.cfg.listen` via admin connection NO_DOC=testing stuff NO_TEST=testing stuff NO_CHANGELOG=testing stuff (cherry picked from commit d13b06bd)
-
Yaroslav Lobankov authored
Bump test-run to new version with the following improvements: - Report job summary on GitHub Actions [1] - Free port auto resolving for TarantoolServer and AppServer [2] Also, this patch includes the following changes: - removing `use_unix_sockets` option from all suite.ini config files due to permanent using Unix sockets for admin connection recently introduced in test-run - switching replication-py tests to Unix sockets for iproto connection - fixing replication-py/swap.test.py and swim/swim.test.lua tests [1] tarantool/test-run#341 [2] tarantool/test-run#348 NO_DOC=testing stuff NO_TEST=testing stuff NO_CHANGELOG=testing stuff (cherry picked from commit 4335b442)
-
Georgiy Lebedev authored
When conflicting transactions that made full scans in `memtx_tx_handle_gap_write`, we need to also track that the conflicted transaction has read the inserted tuple, just like we do in gap tracking for ordered indexes — otherwise another transaction can overwrite the inserted tuple in which case no gap tracking will be handled. Closes #7493 NO_DOC=bugfix (cherry picked from commit 7f52f445)
-
- Sep 12, 2022
-
-
Vladimir Davydov authored
strerror() is MT-Unsafe, because it uses a static buffer under the hood. We should use strerror_r() instead, which takes a user-provided buffer. The problem is there are two implementations of strerror_r(): XSI and GNU. The first one returns an error code and always writes the message to the beginning of the buffer while the second one returns a pointer to a location within the buffer where the message starts. Let's introduce a macro HAVE_STRERROR_R_GNU set if the GNU version is available and define tt_strerror() which writes the message to the static buffer, like tt_cstr() or tt_sprintf(). Note, we have to export tt_strerror(), because it is used by Lua via FFI. We also need to make it available in the module API header, because the say_syserror() macro uses strerror() directly. In order to avoid adding tt_strerror() to the module API, we introduce an internal helper function _say_strerror(), which calls tt_strerror(). NO_DOC=bug fix NO_TEST=code is covered by existing tests (cherry picked from commit 44f46dc8)
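The difference between the two `strerror_r()` flavors can be sketched as follows. This is not the actual `tt_strerror()`: the name `sketch_strerror()` is hypothetical, the sketch takes a caller-provided buffer instead of the static buffer mentioned above, and it assumes that the GNU flavor is the one exposed when glibc is used with `_GNU_SOURCE` defined (the commit uses its own `HAVE_STRERROR_R_GNU` macro for that decision).

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical wrapper: return a printable message for errnum using
 * whichever strerror_r() flavor the C library exposes.
 */
const char *
sketch_strerror(int errnum, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
#if defined(__GLIBC__) && defined(_GNU_SOURCE)
	/*
	 * GNU flavor: returns a pointer to the message, which is not
	 * necessarily the start of buf.
	 */
	return strerror_r(errnum, buf, size);
#else
	/*
	 * XSI flavor: returns 0 on success and writes the message to the
	 * beginning of buf.
	 */
	if (strerror_r(errnum, buf, size) != 0)
		snprintf(buf, size, "Unknown error %d", errnum);
	return buf;
#endif
}
```

Callers only ever use the returned pointer, so the same call site works with either flavor.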
-
- Sep 09, 2022
-
-
Alexander Turenko authored
In brief: `vfork()` on Mac OS 12 and newer doesn't suspend the parent process, so we should wait for `setpgrp()` before using `killpg()`. See a more detailed description of the problem in a comment in the `popen_wait_group_leadership()` function. The solution is to spin in a loop and check the child's process group. It looks like the simplest and most direct solution. Other possible solutions require weighing the pros and cons of using an extra file descriptor or assigning a signal number for child -> parent communication. There are the following alternatives and variations:
* Create a pipe and notify the parent from the child about the `setpgrp()` call. It costs an extra file descriptor, so I decided not to do that. However, if we need some channel to deliver information from the child to the parent for another task, it will be worth reimplementing this function too. One possible place where we may need such a channel is the delivery of the child's errors to the parent. Now the child writes them directly to the logger's fd, and this requires some tricky code to keep and close the descriptor at the right points. Also, it doesn't allow catching those errors in the parent, but we may need that for #4925.
* Notify the parent about `setpgrp()` using a signal. It seems too greedy to assign a specific signal to such a local problem. It is also unclear how to guarantee that it will not break any user's code: a user can load a dynamic library that uses some signals on its own. However, we can consider using this approach here if we design some common interprocess notification system.
* We can use the fiber cond or the `popen_wait_timeout()` function from PR #7648 to react to the child termination instantly. It would complicate the code and still wouldn't allow reacting instantly to `setpgrp()` in the child. Also, it assumes yielding during the wait (see below).
* Wait for `setpgrp()` in `popen_send_signal()` instead of `popen_new()`. It would add yielding/waiting inside `popen_send_signal()` and would likely extend the set of its possible exit situations. This is undesirable: this function should have simple and predictable behavior.
* Finally, we considered yielding in `popen_wait_group_leadership()` instead of sleeping the whole tx thread. `<popen handle>:new()` doesn't yield at the moment, and a user's code may rely on this fact. Yielding would allow achieving better throughput (the number of parallel requests per second), but we don't care much about performance on Mac OS. The primary goal for this platform is to offer the same behavior as on Linux to allow development of applications.
I didn't replace `vfork()` with `fork()` on Mac OS, because `vfork()` works and I don't know the consequences of calling `pthread_atfork()` handlers in a child created by popen. See the comment in `popen_new()` near the `vfork()` call: it warns about possible mutex double locks. This topic will be investigated further in #6674. Fixes #7658 NO_DOC=fixes incorrect behavior, no need to document the bug NO_TEST=already tested by app-tap/popen.test.lua (cherry picked from commit e2207fdc)
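The spin-and-check approach described above can be sketched in C roughly as follows. This is not the code of `popen_wait_group_leadership()`: the function names, the spin limit, and the 10 microsecond pause are assumptions made for illustration, and the sketch uses `fork()` and `setpgid(0, 0)` so that the race is reproducible on any POSIX system.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: spin until the child has become the leader of
 * its own process group, or give up after max_spins attempts.
 */
static int
sketch_wait_group_leadership(pid_t child, int max_spins)
{
	assert(child > 0);
	struct timespec pause_10us = {0, 10 * 1000};
	for (int i = 0; i < max_spins; i++) {
		pid_t pgid = getpgid(child);
		if (pgid == child)
			return 0;	/* The child's setpgid() took effect. */
		if (pgid == -1)
			return -1;	/* The child is gone or inaccessible. */
		nanosleep(&pause_10us, NULL);
	}
	return -1;
}

/* Spawn a child, wait for its group to exist, then kill the group. */
int
sketch_spawn_and_kill_group(void)
{
	pid_t child = fork();
	if (child == -1)
		return -1;
	if (child == 0) {
		if (setpgid(0, 0) != 0)
			_exit(127);
		for (;;)
			pause();	/* Wait to be killed by the parent. */
	}
	int rc = sketch_wait_group_leadership(child, 100000);
	if (rc == 0)
		rc = killpg(child, SIGKILL);	/* The group exists now. */
	else
		kill(child, SIGKILL);
	int status;
	while (waitpid(child, &status, 0) == -1 && errno == EINTR)
		;
	return rc;
}
```

Calling `killpg()` right after `fork()` without the wait could target a process group that does not exist yet, which is the race the commit addresses.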
-
- Sep 07, 2022
-
-
Vladislav Shpilevoy authored
If a node persisted a foreign term + vote request at the same time, it increased split-brain probability. A node could vote for a candidate having a smaller vclock than the local one. For example, via the following scenario:
- Node1, node2, node3 are started;
- Node1 becomes a leader;
- The topology becomes node1 <-> node2 <-> node3 due to network issues;
- Node1 sends a synchro txn to node2. The txn starts a WAL write;
- Node3 bumps term and votes for self. Sends it all to node2;
- Node2 votes for node3, because their vclocks are equal;
- Node2 finishes all pending WAL writes, including the txn from node1. Now its vclock is > node3's one and the vote was wrong.
- Node3 wins, writes PROMOTE, and it conflicts with node1 writing CONFIRM.
This patch makes it so that a node can't persist a vote in a new term in the same WAL write as the term bump. The term bump is written first and alone. It serves as a WAL sync after which the node's vclock is not supposed to change except for the 0 (local) component. The vote requests are re-checked after the term bump is persisted to see if they can still be applied. Part of #7253 NO_DOC=bugfix (cherry picked from commit c9155ac8)
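The re-check after the term bump can be pictured with a small C sketch. The function name and the flat-array representation of a vclock are hypothetical, and the exact comparison rule used by the raft implementation is an assumption here: the sketch grants a vote only if the candidate is not behind in any component other than 0 (local).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: may a vote be granted to a candidate with the
 * given vclock? Component 0 is local to each node and is ignored.
 */
bool
sketch_can_vote_for(const int64_t *candidate, const int64_t *local,
		    size_t component_count)
{
	assert(candidate != NULL && local != NULL && component_count > 0);
	for (size_t i = 1; i < component_count; i++) {
		if (candidate[i] < local[i])
			return false;	/* The candidate is behind. */
	}
	return true;
}
```

In the scenario above, the check passes while node1's txn is still in flight on node2 and fails once that WAL write completes, which is why it has to be repeated after the term bump is persisted.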
-
Vladislav Shpilevoy authored
If the limbo was fenced during a CONFIRM WAL write, then the confirmed txn was committed just fine, but its author fiber kept hanging. This is because when it was woken up, it checked whether the limbo was frozen and went into an infinite wait before actually checking whether the txn was completed. As a workaround, the fiber would resume if it was woken up explicitly. The fix is simple: change the order of the checks. Part of #7253 NO_DOC=bugfix (cherry picked from commit ec628100)
-
Vladislav Shpilevoy authored
box.ctl.promote() bumps the term, makes the node a candidate, and waits for the term outcome. The wait used to last until a leader was elected, the node lost the connection quorum, or the term was bumped again. There was a bug: a node could hang in box.ctl.promote() even after it became a voter. It could happen if the quorum was still there and a leader couldn't be elected in the current term at all. For instance, others could have `election_mode='off'`. The fix is to stop waiting for the term outcome if the node can't win anyway. NO_DOC=bugfix (cherry picked from commit ab08dad9)
-
Vladislav Shpilevoy authored
If box.ctl.promote() was called on more than one instance, then it could lead to infinite or extremely long elections bumping thousands of terms in just a few seconds. This was because box.ctl.promote() used to be a loop. The loop retried term bump + voted for self until the node won. Retry happened immediately as the node saw the term was bumped again and there was no leader elected or the connection quorum was lost. If 2 nodes started box.ctl.promote() almost at the same time, they could bump each other's terms, not see any winner, bump them again, and so on. For example:
- Node1 term=1, node2 term=2;
- Promote is called on both;
- Node1 term=2, node2 term=3. They receive the messages. Node2 ignores node1's old term. Node1 term is bumped and it votes for node2, but it didn't win, so box.ctl.promote() bumps its term to 4.
- Node2 receives term 4 from node1. Its own box.ctl.promote() sees the term was bumped and no winner, so it bumps it to 5 and the process continues for a long time.
It worked well enough in tests: the problem happened sometimes, terms could roll like 80k times in a few seconds, but the tests ended fine anyway. One of the next commits will make term bump + vote written in separate WAL records. That aggravates the problem drastically. Basically, this mutual term bump loop could end only if one node received a vote for self from another node and sent back the message 'I am a leader' before the other node's box.ctl.promote() noticed the term was bumped externally. This will get much harder to achieve. The patch simply drops the loop. Let box.ctl.promote() fail if the term was bumped outside. There was an alternative to keep running it in a loop with a randomized election timeout like it works inside of raft. But the current solution is just simpler. NO_DOC=bugfix NO_TEST=election_split_vote_test.lua catches it already (cherry picked from commit dd89c57e)
-
- Sep 06, 2022
-
-
Sergey Vorontsov authored
Add the redos_7.3.yml workflow to build Tarantool packages (x86_64) for the RedOS 7.3 system. Packages are created by https://github.com/packpack/packpack. NO_DOC=ci NO_TEST=ci (cherry picked from commit a6b48f14)
-
- Sep 05, 2022
-
-
Ilya Grishnov authored
Supplemented the implementation of the `src/lib/uri` parser. Before this fix, the call `uri.parse(uri.format(uri.parse(3301)))` returned an 'Incorrect URI' error. Now this call returns the correct `service: '3301'`. As a result, the possibility of using host=localhost by default has been restored for `tarantoolctl connect`, as well as for `console.connect`. Fixes #7479 NO_DOC=bugfix (cherry picked from commit 96d8dcec)
-
Alexander Turenko authored
This commit pursues several goals:
* Eliminate unused parameter/variable warnings when building module_api.c in a non-debug configuration. The problem was introduced in commit 5c1bc3da ("decimal: add the library into the module API").
* Eliminate the need to check newly added tests in two build configurations (Debug and RelWithDebInfo) and to remember to add `(void)x;` statements in addition to a test condition check.
* Fail the testing if the conditions required by the app-tap/module_api.test.lua test are not met -- not only in the Debug build, but also in RelWithDebInfo.
Fixes #7625 NO_DOC=a change in a test, purely development matter NO_CHANGELOG=see NO_DOC (cherry picked from commit aaf3bf91)
-
Ilya Verbin authored
Currently this script causes 100% CPU usage for 10 sec, because os.exit() infinitely yields to the scheduler until on_shutdown fiber completes and breaks the event loop. Fix this by a sleep.
```
box.ctl.set_on_shutdown_timeout(100)
box.ctl.on_shutdown(function() require('fiber').sleep(10) end)
os.exit()
```
Closes #6801 NO_DOC=bugfix NO_TEST=don't know how to catch this by a test Co-authored-by: Georgy Moshkin <louielouie314@gmail.com> (cherry picked from commit 6d91e44b)
-
Ilya Verbin authored
When Tarantool is stopped by Ctrl+D or by reaching the end of the script, run_script_f() breaks the event loop, then tarantool_exit() is called from main(). However, the fibers that execute on_shutdown triggers can no longer be scheduled, because the event loop is already stopped. Fix this by starting an auxiliary event loop for such cases. Closes #7434 NO_DOC=bugfix (cherry picked from commit cdd5674c)
-
- Sep 02, 2022
-
-
Vladimir Davydov authored
This reverts commit 0c3f9b37. If log_destroy and log_boot use the same fd (STDERR_FILENO), say() called after say_logger_free() will write to a closed fd. What's worse, the fd may be reused, in which case say() will write to a completely unrelated file or socket (maybe a data file!). This is what happened with flightrec: the flightrec finalization info message was written to an xlog file. Let's move say_logger_free() back to where it belongs, after the other subsystems have been finalized. Since 2.10.2 was released, this commit also adds a changelog. Reopens #4450 Needed for https://github.com/tarantool/tarantool-ee/issues/223 NO_DOC=bug fix NO_TEST=revert (cherry picked from commit 5cb688ed)
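The descriptor reuse hazard can be reproduced in isolation with a short C sketch. Nothing here is Tarantool code: the function name is hypothetical, and pipes stand in for the logger and for the unrelated data file. The sketch relies on the POSIX rule that a new descriptor gets the lowest free number.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demonstration: returns 1 if a write through a stale
 * "logger" descriptor ended up in an unrelated pipe, 0 if it did not,
 * and -1 if the setup failed.
 */
int
sketch_late_write_hits_unrelated_fd(void)
{
	int data[2], logger[2];
	if (pipe(data) != 0)
		return -1;
	if (pipe(logger) != 0) {
		close(data[0]);
		close(data[1]);
		return -1;
	}
	int log_fd = logger[1];
	close(log_fd);			/* The logger is freed too early. */
	int reused = dup(data[1]);	/* An unrelated open reuses the number. */

	const char msg[] = "late log line";
	char buf[sizeof(msg)] = {0};
	int hit = 0;
	if (reused == log_fd &&
	    write(log_fd, msg, sizeof(msg)) == (ssize_t)sizeof(msg) &&
	    read(data[0], buf, sizeof(buf)) == (ssize_t)sizeof(msg))
		hit = memcmp(buf, msg, sizeof(msg)) == 0;

	close(data[0]);
	close(data[1]);
	close(logger[0]);
	if (reused >= 0)
		close(reused);
	return hit;
}
```

The stale number silently refers to whatever was opened next, so the late message lands in the data pipe rather than producing an error.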
-
- Sep 01, 2022
-
-
Kirill Yukhin authored
Generate changelog for 2.10.2 release. Also, clean changelogs/unreleased folder. NO_DOC=no code changes NO_TEST=no code changes NO_CHANGELOG=no code changes
-
Pavel Semyonov authored
Fix wording, punctuation, and formatting. NO_CHANGELOG=changelog NO_DOC=changelog NO_TEST=changelog
-
- Aug 31, 2022
-
-
Nikolay Shirokovskiy authored
A non-privileged user (through the public role) has write access to the _truncate table in order to be able to truncate its own tables. Normally it should be able to modify records only for the tables it has write access to. Yet currently, due to the bootstrap check, this is not so. Closes tarantool/security#5 NO_DOC=bugfix (cherry picked from commit 941318e7)
-
Nikolay Shirokovskiy authored
Simple part is a part without any extra key besides 'field' and 'type'. Let's make a check in try_simplify_index_parts itself. NO_TEST=refactoring NO_DOC=refactoring NO_CHANGELOG=refactoring (cherry picked from commit bc0872fd)
-
Nikolay Shirokovskiy authored
If index parts are specified using the old syntax, like `parts = {1, 'number', 2, 'string'}`, then (unless the part count is 1) index options set in the space format are not taken into account. The solution is to continue after parsing 1.6.0-style parts so that the code that checks format options is used. Closes #7614 NO_DOC=bugfix (cherry picked from commit 91ba0a59)
-
- Aug 30, 2022
-
-
Nikita Zheleztsov authored
Currently internal tarantool fibers can be cancelled from the user's app, which can lead to critical errors. Let's mark these fibers as system ones in order to be sure that they won't be cancelled from the Lua world. Closes #7448 Closes #7473 NO_DOC=minor change (cherry picked from commit 3733ff25)
-
Nikita Zheleztsov authored
There are a number of internal system fibers which are not supposed to be cancelled. Let's introduce the `FIBER_IS_SYSTEM` flag, which indicates whether the fiber can be explicitly killed. If this flag is set, killing functions will just ignore the cancellation request. This commit blocks system fiber cancellation only from the Lua public API, as it is more important to have it right. The prohibition on cancelling fibers from the C API will be introduced later. Related to #7448 Part of #7473 NO_DOC=internal NO_TEST=will be added in subsequent commit NO_CHANGELOG=internal (cherry picked from commit 3a18a9bf)
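The idea of a flag that makes cancellation a no-op can be sketched in C as follows. The struct, the `SKETCH_` flag names, and the function are hypothetical stand-ins, not the actual fiber API: only the behavior (a cancel request is ignored when the system flag is set) follows the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags modelled after the FIBER_IS_SYSTEM idea. */
enum {
	SKETCH_FIBER_IS_CANCELLED = 1 << 0,
	SKETCH_FIBER_IS_SYSTEM = 1 << 1,
};

struct sketch_fiber {
	unsigned flags;
};

/*
 * Request cancellation of a fiber. Returns true if the request was
 * applied and false if it was ignored because the fiber is a system one.
 */
bool
sketch_fiber_cancel(struct sketch_fiber *fiber)
{
	assert(fiber != NULL);
	if (fiber->flags & SKETCH_FIBER_IS_SYSTEM)
		return false;
	fiber->flags |= SKETCH_FIBER_IS_CANCELLED;
	return true;
}
```

Keeping the check inside the cancel function means every caller gets the same protection without having to test the flag itself.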
-
- Aug 26, 2022
-
-
Yaroslav Lobankov authored
The `ubuntu-18.04` environment is deprecated, so let's switch to `ubuntu-latest` where it is safe. For more details see [1]. [1] https://github.com/actions/virtual-environments/issues/6002 NO_DOC=ci NO_TEST=ci NO_CHANGELOG=ci (cherry picked from commit 4572a584)
-
- Aug 25, 2022
-
-
Aleksandr Lyapunov authored
In commit (35334ca1) qsort was fixed, but unfortunately a small typo was introduced. Due to that typo, qsort sorted incorrectly. Fix the problem and add a unit test for qsort. Unfortunately, the test taken directly from the issue runs extremely long, so it should go to long-tests. Closes #7605 NO_DOC=bugfix (cherry picked from commit e1d96170)
-