- Jul 04, 2022
-
-
Boris Stepanenko authored
Problem with W214 in src/box/lua/net_box.lua was fixed in previous commit. Can bump luacheck version now. NO_DOC=testing NO_TEST=testing NO_CHANGELOG=testing
-
Boris Stepanenko authored
Since 0.26.0 luacheck emits a warning on the `_box` variable. From luacheck v.0.26.0 release notes: "Function arguments that start with a single underscore get an "unused hint". Leaving them unused doesn't result in a warning. Using them, on the other hand, is a new warning (№ 214)." Renamed `_box` to `__box`, which isn't considered unused. Closes #7304. NO_DOC=testing NO_TEST=testing NO_CHANGELOG=testing
-
Vladimir Davydov authored
We must not throw exceptions from C code. Currently, there's only one C function that uses diag_raise() - it's space_cache_find_xc. We move it under ifdef(__cplusplus). Follow-up #4735 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Vladimir Davydov authored
current_session() is called from C code so it must not throw, but it may if it fails to allocate a session. Practically, this is hardly possible, because we don't limit the runtime arena, which is used for allocation of session objects. Still, this looks potentially dangerous. Gracefully handling an allocation failure in all places where current_session() may be called would be complicated. Since it's more of a theoretical issue, let's panic on a session allocation error, like we do if we fail to allocate other mission critical system objects. Closes #4735 NO_DOC=code health NO_TEST=code health NO_CHANGELOG=code health
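The panic-on-failure policy described above can be sketched roughly as follows. This is only an illustration, not the project's code: `struct demo_session` and `demo_session_new_or_panic()` are hypothetical names, and `abort()` stands in for whatever panic routine the code base actually uses.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a session object. */
struct demo_session {
	unsigned long id;
};

/*
 * Allocate a session or terminate the process.
 * The caller never sees NULL, so it needs no error path and
 * nothing has to be thrown across the C boundary.
 */
static struct demo_session *
demo_session_new_or_panic(unsigned long id)
{
	struct demo_session *s = malloc(sizeof(*s));
	if (s == NULL) {
		/* abort() is an assumption standing in for a panic helper. */
		fprintf(stderr, "panic: failed to allocate a session\n");
		abort();
	}
	s->id = id;
	return s;
}
```

The trade-off is that an allocation failure becomes fatal, which is acceptable only when such a failure is practically impossible, as the message argues.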
-
Vladimir Davydov authored
The functions allocate and free a session so they should be called new/delete, not create/destroy according to our naming convention. While we are at it, also delete obsolete comments to these functions: they don't invoke session triggers. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Vladimir Davydov authored
C++ features are not used in this file. Note, we need to move ifdef(__cplusplus) in user.h to make guest_user and admin_user variables accessible from C code. Also, we need to move initialization of session_vtab_registry to session_init(), because most C compilers don't allow to initialize a global variable with a value of another global variable. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
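The restriction on global initializers mentioned above can be illustrated with a small sketch. All names here (`demo_vtab`, `demo_registry`, `demo_registry_init`) are hypothetical and only mirror the shape of the problem; they are not the real session vtab code.

```c
#include <stddef.h>

/* Hypothetical vtab type, for illustration only. */
struct demo_vtab {
	int (*push)(int value);
};

static int
demo_push(int value)
{
	return value + 1;
}

struct demo_vtab demo_default_vtab = { demo_push };

/*
 * In C, an initializer for an object with static storage duration
 * must be a constant expression, so most C compilers reject the
 * following (it is accepted in C++):
 *
 *     struct demo_vtab demo_registry[2] = {
 *             demo_default_vtab, demo_default_vtab,
 *     };
 */
struct demo_vtab demo_registry[2];

/* Fill the registry at run time instead, from an init function. */
void
demo_registry_init(void)
{
	for (size_t i = 0; i < 2; i++)
		demo_registry[i] = demo_default_vtab;
}
```

Moving the assignment into an init function keeps the file valid C at the cost of requiring the init call before first use.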
-
Serge Petrenko authored
Our txn_limbo_is_replica_outdated check works correctly only when there is a stream of PROMOTE requests. Only the author of the latest PROMOTE is writable and may issue transactions, no matter synchronous or asynchronous. So txn_limbo_is_replica_outdated assumes that everyone but the node with the greatest PROMOTE/DEMOTE term is outdated. This isn't true for DEMOTE requests. There is only one server which issues the DEMOTE request, but once it's written, it's fine to accept asynchronous transactions from everyone. Now the check is too strict. Every time there is an asynchronous transaction from someone who isn't the author of the latest PROMOTE or DEMOTE, replication is broken with ER_SPLIT_BRAIN. Let's relax it: when limbo owner is 0, it's fine to accept asynchronous transactions from everyone, no matter the term of their latest PROMOTE and DEMOTE. This means that now after a DEMOTE we will miss one case of true split-brain: when the old leader continues writing data in an obsolete term, and the new leader first issues PROMOTE and then DEMOTE. This is a tradeoff for making async master-master work after DEMOTE. The completely correct fix would be to write the term the transaction was written in with each transaction and replace txn_limbo_is_replica_outdated with txn_limbo_is_request_outdated, so that we decide whether to filter the request or not judging by the term it was applied in, not by the term we have seen in some past PROMOTE from the node. This fix seems too costly though, given that we only miss one case of split-brain at the moment when the user enables master-master replication (by writing a DEMOTE). And in master-master there is no such thing as a split-brain. Follow-up #5295 Closes #7286 NO_DOC=internal change
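A minimal sketch of the relaxed check, under heavy assumptions: `struct demo_limbo`, its fields, and `demo_async_tx_is_outdated()` are hypothetical simplifications, not the real limbo structures or the actual patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the limbo state. */
struct demo_limbo {
	/* Id of the node owning the synchronous queue, 0 if unowned. */
	uint32_t owner_id;
	/* Greatest PROMOTE/DEMOTE term seen from any node. */
	uint64_t greatest_term;
};

/*
 * Decide whether an asynchronous transaction should be treated as
 * coming from an outdated node. replica_term is the term of the
 * latest PROMOTE/DEMOTE seen from the transaction's author.
 */
static bool
demo_async_tx_is_outdated(const struct demo_limbo *limbo,
			  uint64_t replica_term)
{
	/* After a DEMOTE nobody owns the limbo: accept from everyone. */
	if (limbo->owner_id == 0)
		return false;
	/* Otherwise only the node with the greatest term is up to date. */
	return replica_term < limbo->greatest_term;
}
```

With an unowned limbo the term comparison is skipped entirely, which is exactly where the one missed split-brain case mentioned in the message comes from.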
-
Serge Petrenko authored
Currently there's only one place where applier_synchro_filter_tx accesses limbo state under a latch: this place is txn_limbo_is_replica_outdated. Soon there will be more accesses to limbo parameters and all of them should be guarded as well. Let's simplify things a bit and guard the whole synchro_filter_tx with the limbo latch. While we are at it remove txn_limbo_is_replica_outdated as not needed anymore. Part-of #7286 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Serge Petrenko authored
Starting with commit deca9749 ("replication: unify replication filtering with and without elections"), the filter always works, even when elections are turned off. Reflect that in the comments for applier_synchro_filter_tx and txn_limbo_is_replica_outdated. Follow-up #6133 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
- Jul 01, 2022
-
-
Yaroslav Lobankov authored
The 'small' lib test suite was not run for out-of-source builds since the wrong symlink was created for test binaries and test-run couldn't find them. Now it is fixed. When test-run loads tests, it first searches for the suite.ini file and, if it exists, test-run considers the dir a test suite. So it made sense to create a permanent link for 'small' lib tests. Closes #4485 NO_DOC=testing stuff NO_TEST=testing stuff NO_CHANGELOG=testing stuff
-
Yaroslav Lobankov authored
Disable tests while building packages in the reusable_build.yml workflow to speed up the build process in integration testing for tarantool and modules/connectors. NO_DOC=ci NO_TEST=ci NO_CHANGELOG=ci
-
Yaroslav Lobankov authored
Sometimes we need to disable testing while building deb/rpm packages to speed up the build process. Now it is possible via `MAKE_CHECK` env var. By default, testing is on, but if one defines `MAKE_CHECK=false`, tests will be off. NO_DOC=ci NO_TEST=ci NO_CHANGELOG=ci
-
Yaroslav Lobankov authored
This patch adds the tzdata package as a dependency for DEB/RPM tarantool package since some tarantool datetime functionality needs this. NO_DOC=ci NO_TEST=ci NO_CHANGELOG=ci
-
Yaroslav Lobankov authored
Now the test-run dependencies (pyyaml, gevent) have the corresponding deb packages installable via the 'apt' package manager and finally it's time to enable running tests in the package build process. Closes #1341 NO_DOC=ci NO_TEST=ci NO_CHANGELOG=ci
-
Vladimir Davydov authored
Vinyl doesn't support the hot standby mode. There's a ticket to implement it, see #2013. The behavior is undefined if an instance runs in the hot standby mode while the master has Vinyl spaces. It may result in a crash or even data corruption. Let's raise an explicit error in this case. Closes #6565 NO_DOC=bug fix
-
Vladimir Davydov authored
Since commit d2537d9d ("relay: cleanup error handling") recover_remaining_wals() doesn't log the error it throws - now callers of this function should catch and log the error. hot_standby_f() doesn't catch the error so the diagnostic message is lost if we fail to apply a row in the hot standby mode. Fix this. NO_DOC=bug fix NO_TEST=checked in next commit NO_CHANGELOG=minor bug in logging
-
Vladimir Davydov authored
If a nested tuple field is indexed, it can be accessed by [*] aka multikey or any token:

    s = box.schema.create_space('test')
    s:create_index('pk')
    s:create_index('sk', {parts = {{2, 'unsigned', path = '[1][1]'}}})
    t = s:replace{1, {{1}}}
    t['[2][1][*]'] -- returns 1!

If a nested field isn't indexed (remove creation of the secondary index in the example above), then access by [*] returns nil.

Call graph:

    lbox_tuple_field_by_path:
        tuple_field_raw_by_full_path
        tuple_field_raw_by_path
        tuple_format_field_by_path
        json_tree_lookup_entry
        json_tree_lookup

And json_tree_lookup matches the first node if the key is [*]. We shouldn't match anything to [*].

Closes #5226 NO_DOC=bug fix
-
- Jun 30, 2022
-
-
Boris Stepanenko authored
__gcov_flush was removed in gcc11. Since gcc11 __gcov_dump calls __gcov_lock at the start and __gcov_unlock before returning. Same is true for __gcov_reset. Because of that using __gcov_reset right after __gcov_dump since gcc11 is the same as using __gcov_flush before gcc11. Closes #7302 NO_CHANGELOG=internal NO_DOC=internal NO_TEST=internal
-
Boris Stepanenko authored
Covered most of box_promote and box_demote with tests: 1. Promote/demote unconfigured box 2. Promoting current leader with elections on and off 3. Demoting follower with elections on and off 4. Promoting current leader, but not limbo owner with elections on 5. Demoting current leader with elections on and off 6. Simultaneous promote/demote 7. Promoting voter 8. Interfering promote/demote while writing new term to wal 9. Interfering promote/demote while waiting for synchro queue to be emptied 10. Interfering promote while waiting for limbo to be acked (similar to replication/gh-5430-qsync-promote-crash.test.lua) Closes #6033 NO_DOC=testing stuff NO_CHANGELOG=testing stuff
-
Serge Petrenko authored
The test failed with the following output:

    TAP version 13
    1..3
    # Started on Tue Jun 28 13:36:03 2022
    # Starting group: pre-vote
    not ok 1 pre-vote.test_no_direct_connection
    # .../election_pre_vote_test.lua:46: expected: a value evaluating to true, actual: false
    # stack traceback:
    # .../election_pre_vote_test.lua:65: in function 'retrying'
    # .../election_pre_vote_test.lua:64: in function 'pre-vote.test_no_direct_connection'
    # ...
    # [C]: in function 'xpcall'
    ok 2 pre-vote.test_no_quorum
    ok 3 pre-vote.test_promote_no_quorum
    # Ran 3 tests in 6.994 seconds, 2 succeeded, 1 failed

This is the moment when one of the followers disconnects from the leader and expects its `box.info.election.leader_idle` to grow. It wasn't taken into account that this disconnect might lead to leader resign due to fencing, and then a new leader would emerge and `leader_idle` would still be small.

IOW, the leader starts with fencing turned off, and only resumes fencing, once it has connected to a quorum of nodes (one replica in this test). If the replica that we just connected happens to be the one we disconnect in the test, the leader might fence, if it hasn't yet connected to the other replica, because it immediately loses a quorum of healthy connections right after gaining it for the first time.

Fix this by waiting until everyone follows everyone before each test case. The test, of course, could be fixed by turning fencing off, but this might hide any possible future problems with fencing.

Follow-up #6654 Follow-up #6661 NO_CHANGELOG=test fix NO_DOC=test fix
-
Vladimir Davydov authored
After scanning disk, the Vinyl read iterator checks if it should restore the iterator over the active memory tree, because new tuples could have been inserted into it while we yielded reading disk. We assume that existing tuples can't be deleted from the memory tree, but that's not always true - a tuple may actually be deleted by rollback after a failed WAL write. Let's reevaluate all scanned sources and reposition the read iterator to the next statement if this happens. Initially, the issue was fixed by commit 83462a5c ("vinyl: restart read iterator in case L0 is changed"), but it introduced a performance degradation and was reverted (see #5700). NO_DOC=bug fix NO_TEST=already there NO_CHANGELOG=already there
-
Vladimir Davydov authored
The Vinyl read iterator, which is used for serving range select requests, works as follows: 1. Scan in-memory sources. If found an exact match or a chain in the cache, return it. 2. If not found, scan disk sources. This operation may yield. 3. If any new data was inserted into the active memory tree, go to step 1, effectively restarting the iteration from the same key. Apparently, such an algorithm doesn't guarantee any progress of a read operation at all - when we yield reading disk on step 2 after a restart, even newer data may be inserted into the active memory tree, forcing us to restart again. In other words, in the presence of an intensive write workload, the read ops rate may drop down to literally 0. It hasn't always been like this. Before commit 83462a5c ("vinyl: restart read iterator in case L0 is changed"), we only restored the memory tree iterator after a yield, without restarting the whole procedure. This makes sense, because only the memory tree may change after a yield so there's no point in rescanning other sources, including disk. By restarting iteration after a yield, the above-mentioned commit fixed bug #3395: initially we assumed that statements may never be deleted from a memory tree while actually they can be deleted by rollback after a failed WAL write. Let's revert this commit to fix the performance degradation. We will re-fix bug #3395 in the next commit. Closes #5700 NO_DOC=bug fix NO_TEST=should be checked by performance tests
-
Igor Munkin authored
* Avoid conflict between 64 bit lightuserdata and ITERN key. * Reorganize lightuserdata interning code. * test: fix path storage for non-concatable objects * ARM64: Fix assembly of HREFK. * FFI/ARM64: Fix pass-by-value struct calling conventions. * test: set DYLD_LIBRARY_PATH environment variable * x64/LJ_GC64: Fix fallback case of asm_fuseloadk64(). * FFI: Handle zero-fill of struct-of-NYI. * Fix interaction between profiler hooks and finalizers. * Flush and close output file after profiling run. * Fix debug.debug() for non-string errors. * Fix write barrier for lua_setupvalue() and debug.setupvalue(). * Fix FOLD rule for strength reduction of widening. * Fix bytecode dump unpatching. * Fix tonumber("-0") in dual-number mode. * Fix tonumber("-0"). * Give expected results for negative non-base-10 numbers in tonumber(). * Add missing LJ_MAX_JSLOTS check. * Add stricter check for print() vs. tostring() shortcut. Closes #6548 Fixes #4614 Fixes #4630 Fixes #5885 Fixes tarantool/tarantool-qa#234 Fixes tarantool/tarantool-qa#235 Follows up #2712 NO_DOC=LuaJIT submodule bump NO_TEST=LuaJIT submodule bump
-
Vladimir Davydov authored
Normally, there shouldn't be any upserts on disk if the space has secondary indexes, because we can't generate an upsert without a lookup in the primary index hence we convert upserts to replace+delete in this case. The deferred delete optimization only makes sense if the space has secondary indexes. So we ignore upserts while generating deferred deletes, see vy_write_iterator_deferred_delete. There's an exception to this rule: a secondary index could be created after some upserts were used on the space. In this case, because of the deferred delete optimization, we may never generate deletes for some tuples for the secondary index, as demonstrated in #3638. We could fix this issue by properly handling upserts in the write iterator while generating deferred deletes, but this wouldn't be easy, because in case of a minor compaction there may be no replace/insert to apply the upsert to so we'd have to keep intermediate upserts even if there is a newer delete statement. Since this situation is rare (happens only once in a space lifetime), it doesn't look like we should complicate the write iterator to fix it. Another way to fix it is to force major compaction of the primary index after a secondary index is created. This looks doable, but it could slow down creation of secondary indexes. Let's instead simply disable the deferred delete optimization if the primary index has upsert statements. This way the optimization will be enabled sooner or later, when the primary index major compaction occurs. After all, it's just an optimization and it can be disabled for other reasons (e.g. if the space has on_replace triggers). Closes #3638 NO_DOC=bug fix
-
Ilya Verbin authored
* doc: add allocators hierarchy diagram * build: fix CMake warning * small: fix compilation on macOS 12 NO_DOC=small submodule bump NO_TEST=small submodule bump NO_CHANGELOG=small submodule bump
-
Ilya Verbin authored
Asserts are disabled in the Release build, which leads to:

    $ cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_FLAGS=-Werror
    $ make
    [ 43%] Building C object src/lib/bitset/CMakeFiles/bitset.dir/bitset.c.o
    src/lib/bitset/bitset.c:169:9: error: variable 'cardinality_check' set but not used [-Werror,-Wunused-but-set-variable]
            size_t cardinality_check = 0;
                   ^

NO_DOC=build fix NO_TEST=build fix NO_CHANGELOG=build fix
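One common way to avoid this class of warning is sketched below. It is not necessarily what this commit does; `demo_count_bits()` is a hypothetical function that only reproduces the pattern of a variable consumed solely by an assert.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count set bits, cross-checked in debug builds. */
size_t
demo_count_bits(unsigned int word)
{
	size_t count = 0;
	size_t check = 0;
	for (unsigned int w = word; w != 0; w >>= 1) {
		count += w & 1;
		check += w & 1;
	}
	/* With NDEBUG defined this assert expands to nothing... */
	assert(check == count);
	/* ...so mark the variable as used to keep -Werror builds quiet. */
	(void)check;
	return count;
}
```

An alternative is to wrap both the variable and the assert in `#ifndef NDEBUG`, which also removes the extra work from Release builds.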
-
- Jun 29, 2022
-
-
Nikolay Shirokovskiy authored
This will fix compilation on recent Archlinux. The issue is that this distro installs libbpf in the base configuration, and nghttp2 1.46.0 tries to compile eBPF code if this library is present, which fails if the compiler is gcc. Closes #7292 NO_DOC=nghttp2 submodule bump NO_TEST=nghttp2 submodule bump
-
Ilya Verbin authored
Now fiber.top() does not use x86-specific instructions, so it can be enabled for ARM. Closes #4573 NO_TEST=<Tested in test/app/fiber.test.lua> NO_DOC=<x86 or ARM are not mentioned in the fiber.top doc>
-
Ilya Verbin authored
It doesn't make sense after switching from RDTSCP to clock_gettime(CLOCK_MONOTONIC). Part of #5869 @TarantoolBot document Title: fiber: get rid of cpu_misses in fiber.top() Since: 2.11 Remove any mentions of `cpu_misses` in `fiber.top()` description.
-
Ilya Verbin authored
clock_gettime(CLOCK_MONOTONIC) is implemented via the RDTSCP instruction on x86 and has the following advantages over the raw instruction: * It checks for RDTSCP availability in CPUID. If RDTSCP is not supported, it switches to RDTSC. * Linux guarantees that the clock is monotonic, hence, the CPU miss detection is not needed. * It works on ARM. As for the disadvantage, this function is about 2x slower compared to a single RDTSCP instruction. Performance degradation measured by the fiber switch benchmark [1] is about 3-7% for num_fibers == 10-1000. Closes #5869 [1] https://github.com/tarantool/tarantool/issues/2694#issuecomment-546381304 NO_DOC=bugfix NO_TEST=<Tested in test/app/fiber.test.lua>
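For reference, reading the monotonic clock portably looks roughly like this. `demo_monotonic_ns()` is a hypothetical helper written for this illustration, not the wrapper used in the code base; it also checks the return value of `clock_gettime()`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/*
 * Read the monotonic clock in nanoseconds.
 * Returns 0 on success and -1 on failure, mirroring clock_gettime().
 */
int
demo_monotonic_ns(uint64_t *out)
{
	struct timespec ts;
	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return -1;
	*out = (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
	return 0;
}
```

Two consecutive reads never go backwards, so the difference between them can be used directly as an elapsed-time measure without any miss detection.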
-
Ilya Verbin authored
Use this wrapper to simplify the code. Part of #5869 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Ilya Verbin authored
clock_gettime() returns 0 for success, or -1 for failure. Add missing checks for the return value. Part of #5869 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Vladislav Shpilevoy authored
These 3 modules are low hanging fruits which right now can be freed at return from main() without any effort. There are still a lot of other modules whose freeing is not that easy. A few hard to untangle knots, and there are more: - Session, credentials, iproto, and fibers are tied together via the latter. Each fiber potentially has a session, its current credentials object. Each iproto connection has a session and a file descriptor which is stored in the session too. The possible solution would be to walk all the fibers and destroy them before proceeding to destroy everything else. - Tuples depend on memtx, and Lua depends on tuples. Because there are tuples allocated on memtx->arena. Hence destruction of memtx and its arena makes the tuples still stored in Lua invalid. It seems Lua should be destroyed first, not last. It would free all the refs which might be kept at objects in C in the other modules. - IProto connections leak when iproto is destroyed. They are not freed and their descriptors are not closed properly. That requires additional preparatory work to destroy them correctly on iproto module deconstruction. Given amount of work, it should be done as a big separate ticket. Follow up #7259 NO_CHANGELOG=Not a visible change NO_DOC=Not a visible change NO_TEST=Not a visible change
-
Vladislav Shpilevoy authored
main() used to skip most of modules destruction in tarantool_free(). That got ASAN complaining on clang-13 about a leak of a fiber on_stop trigger which was allocated in Lua. The patch makes fiber_free() called for the main cord. It destroys and frees all the fibers together with their on_stop triggers. Closes #7259 NO_CHANGELOG=Not a visible change NO_DOC=Not a visible change NO_TEST=Not a visible change
-
- Jun 28, 2022
-
-
Nikita Pettik authored
Before this patch struct tuple had two boolean bit fields: is_dirty and has_uploaded_refs. It is worth mentioning that sizeof(boolean) is implementation dependent. However, in code it is assumed to be 1 byte (there's a static assertion restricting the whole struct tuple size to 10 bytes). So strictly speaking it may lead to a compilation error on some non-conventional system. Secondly, bit fields anyway consume at least one size of type (i.e. there's no space benefit in using two uint8_t bit fields - they anyway occupy 1 byte in total). There are several known pitfalls concerning bit fields: - Bit field's memory layout is implementation dependent; - sizeof() can't be applied to such members; - Compiler may raise unexpected side effects (https://lwn.net/Articles/478657/). Finally, in our code base as a rule we use explicit masks: txn flags, vy stmt flags, sql flags, fiber flags. So, let's replace bit fields in struct tuple with a single member called `flags` and several enum values corresponding to masks (to be more precise - bit positions in tuple flags). NO_DOC=<Refactoring> NO_CHANGELOG=<Refactoring> NO_TEST=<Refactoring>
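The flags-plus-enum layout described above might look like the following sketch. The `demo_` names, the enum values, and the helper functions are assumptions made for illustration; the real struct tuple has more members and may name things differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in the flags byte; names are illustrative. */
enum demo_tuple_flag {
	DEMO_TUPLE_HAS_UPLOADED_REFS = 0,
	DEMO_TUPLE_IS_DIRTY = 1,
	demo_tuple_flag_MAX,
};

/* Hypothetical, stripped-down tuple header. */
struct demo_tuple {
	uint8_t flags;
};

static inline void
demo_tuple_set_flag(struct demo_tuple *t, enum demo_tuple_flag flag)
{
	t->flags |= (uint8_t)(1u << flag);
}

static inline void
demo_tuple_clear_flag(struct demo_tuple *t, enum demo_tuple_flag flag)
{
	t->flags &= (uint8_t)~(1u << flag);
}

static inline bool
demo_tuple_has_flag(const struct demo_tuple *t, enum demo_tuple_flag flag)
{
	return (t->flags & (1u << flag)) != 0;
}

/* Unlike a bit field, the member has a well-defined size. */
_Static_assert(sizeof(((struct demo_tuple *)0)->flags) == 1,
	       "flags must occupy exactly one byte");
```

Because the member is a plain `uint8_t`, its size and layout no longer depend on how a particular compiler packs bit fields.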
-
- Jun 27, 2022
-
-
Timur Safin authored
The `hour` attribute was not retained correctly when the `min`, `sec` or `nsec` attributes were modified via the `:set` method.
```
tarantool> a = dt.parse '2022-05-05T00:00:00'
tarantool> a:set{min = 0, sec = 0, nsec = 0}
--
- 2022-05-05T12:00:00Z
...
```
Closes #7298 NO_DOC=bugfix
-
- Jun 24, 2022
-
-
Vladimir Davydov authored
If vy_point_lookup called by vy_squash_process yields (doing disk read), a dump may be triggered bumping the L0 generation counter, in which case we would insert a statement to a sealed vy_mem, as explained in #5080. Let's check the generation counter and rotate the active vy_mem if necessary after vy_point_lookup to avoid that. Closes #5080 NO_DOC=bug fix NO_TEST=complicated, need stress/perf test to catch bugs like this
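The "re-check the generation after a yield" pattern can be sketched as below. Everything here is hypothetical: `struct demo_lsm` and `demo_rotate_if_stale()` only model the idea and do not reflect the real vy_lsm or vy_mem structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified LSM tree state. */
struct demo_lsm {
	/* Generation of the active in-memory tree. */
	int64_t mem_generation;
	/* Number of rotations performed, kept for illustration. */
	int rotations;
};

/*
 * Call after any operation that may have yielded. If a dump bumped
 * the global generation meanwhile, the active tree is sealed and
 * must be replaced before anything is inserted into it.
 * Returns true if a rotation happened.
 */
static bool
demo_rotate_if_stale(struct demo_lsm *lsm, int64_t current_generation)
{
	if (lsm->mem_generation >= current_generation)
		return false;
	/* Model the rotation: a fresh tree at the current generation. */
	lsm->mem_generation = current_generation;
	lsm->rotations++;
	return true;
}
```

The important point is the ordering: the check has to run after the yielding lookup and before the insert, otherwise the statement can still land in a sealed tree.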
-
Vladimir Davydov authored
We often call vy_lsm_rotate_mem_if_required if its generation or schema version is older than the current one. Let's add a helper function for that. Needed for #5080 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Vladimir Davydov authored
The optimization is mostly useless, because it only works if there's no data on disk. As explained in #5080, it contains a potential bug: if L0 dump is triggered between 'prepare' and 'commit', it will insert a statement to a sealed vy_mem. Let's drop it. Part of #5080 NO_DOC=bug fix NO_CHANGELOG=later
-
Nikita Pettik authored
gh_6634_different_log_on_tuple_new_and_free_test.lua verifies that the proper debug message gets into logs for tuple_new() and tuple_delete(): occasionally tuple_delete() printed a wrong tuple address. However, there are still two debug logs: one in tuple_delete() and another one in memtx_tuple_delete(). So to avoid any possible confusion, let's fix the regular expression so that it definitely finds the memtx_tuple_delete() log. NO_CHANGELOG=<Test fix> NO_DOC=<Test fix>
-