- Jun 29, 2022
-
-
Ilya Verbin authored
clock_gettime() returns 0 on success and -1 on failure. Add the missing checks for the return value. Part of #5869 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Vladislav Shpilevoy authored
These 3 modules are low-hanging fruit that can be freed on return from main() right now without any effort. There are still a lot of other modules whose freeing is not that easy. A few hard-to-untangle knots (and there are more): - Session, credentials, iproto, and fibers are tied together via the latter. Each fiber potentially has a session and its current credentials object. Each iproto connection has a session and a file descriptor, which is stored in the session too. A possible solution would be to walk all the fibers and destroy them before proceeding to destroy everything else. - Tuples depend on memtx, and Lua depends on tuples, because there are tuples allocated on memtx->arena. Hence destruction of memtx and its arena makes the tuples still stored in Lua invalid. It seems Lua should be destroyed first, not last. That would free all the refs which might be kept on C objects in the other modules. - IProto connections leak when iproto is destroyed. They are not freed and their descriptors are not closed properly. Destroying them correctly on iproto module destruction requires additional preparatory work. Given the amount of work, it should be done as a big separate ticket. Follow up #7259 NO_CHANGELOG=Not a visible change NO_DOC=Not a visible change NO_TEST=Not a visible change
-
Vladislav Shpilevoy authored
main() used to skip most of modules destruction in tarantool_free(). That got ASAN complaining on clang-13 about a leak of a fiber on_stop trigger which was allocated in Lua. The patch makes fiber_free() called for the main cord. It destroys and frees all the fibers together with their on_stop triggers. Closes #7259 NO_CHANGELOG=Not a visible change NO_DOC=Not a visible change NO_TEST=Not a visible change
-
- Jun 28, 2022
-
-
Nikita Pettik authored
Before this patch struct tuple had two boolean bit fields: is_dirty and has_uploaded_refs. It is worth mentioning that sizeof(boolean) is implementation dependent. However, the code assumes it to be 1 byte (there's a static assertion restricting the whole struct tuple size to 10 bytes). So strictly speaking it may lead to a compilation error on some non-conventional system. Secondly, bit fields anyway occupy at least the size of their type (i.e. there's no space benefit in using two uint8_t bit fields - they anyway occupy 1 byte in total). There are several known pitfalls concerning bit fields: - Bit field's memory layout is implementation dependent; - sizeof() can't be applied to such members; - Compiler may introduce unexpected side effects (https://lwn.net/Articles/478657/). Finally, in our code base as a rule we use explicit masks: txn flags, vy stmt flags, sql flags, fiber flags. So, let's replace bit fields in struct tuple with a single member called `flags` and several enum values corresponding to masks (to be more precise - bit positions in tuple flags). NO_DOC=<Refactoring> NO_CHANGELOG=<Refactoring> NO_TEST=<Refactoring>
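The approach can be illustrated with a small sketch. All names prefixed with `demo_`/`DEMO_` are hypothetical and are not the identifiers used in the patch; the struct is reduced to the single `flags` byte:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names: each value is a bit position within `flags`. */
enum demo_tuple_flag {
    DEMO_TUPLE_IS_DIRTY = 0,
    DEMO_TUPLE_HAS_UPLOADED_REFS = 1,
    demo_tuple_flag_MAX,
};

/* Stand-in for the real struct: one explicit byte instead of bit fields. */
struct demo_tuple {
    uint8_t flags;
};

/* Unlike bit fields, the size of the member is explicit and checkable. */
_Static_assert(sizeof(struct demo_tuple) == 1, "flags must occupy one byte");

static inline void
demo_tuple_set_flag(struct demo_tuple *t, enum demo_tuple_flag flag)
{
    assert(flag < demo_tuple_flag_MAX);
    t->flags |= (uint8_t)(1u << flag);
}

static inline void
demo_tuple_clear_flag(struct demo_tuple *t, enum demo_tuple_flag flag)
{
    assert(flag < demo_tuple_flag_MAX);
    t->flags &= (uint8_t)~(1u << flag);
}

static inline bool
demo_tuple_has_flag(const struct demo_tuple *t, enum demo_tuple_flag flag)
{
    assert(flag < demo_tuple_flag_MAX);
    return (t->flags & (1u << flag)) != 0;
}
```

With explicit masks the layout is fully determined by the code, and `sizeof()` works on the member, which addresses two of the pitfalls listed above.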
-
- Jun 27, 2022
-
-
Timur Safin authored
We did not correctly retain the `hour` attribute when `min`, `sec` or `nsec` were modified via the `:set` method.
```
tarantool> a = dt.parse '2022-05-05T00:00:00'
tarantool> a:set{min = 0, sec = 0, nsec = 0}
--
- 2022-05-05T12:00:00Z
...
```
Closes #7298 NO_DOC=bugfix
-
- Jun 24, 2022
-
-
Vladimir Davydov authored
If vy_point_lookup called by vy_squash_process yields (doing disk read), a dump may be triggered bumping the L0 generation counter, in which case we would insert a statement into a sealed vy_mem, as explained in #5080. Let's check the generation counter and rotate the active vy_mem if necessary after vy_point_lookup to avoid that. Closes #5080 NO_DOC=bug fix NO_TEST=complicated, need stress/perf test to catch bugs like this
-
Vladimir Davydov authored
We often call vy_lsm_rotate_mem_if_required if its generation or schema version is older than the current one. Let's add a helper function for that. Needed for #5080 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
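A simplified model of such a helper is sketched below. The `demo_` types and fields are assumptions made for illustration and do not reproduce the real vinyl structures; the sketch only shows the "rotate when the active in-memory level is older than the current generation or schema version" check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the active in-memory level. */
struct demo_mem {
    int64_t generation;
    uint32_t schema_version;
};

/* Hypothetical stand-in for an LSM tree with one active level. */
struct demo_lsm {
    struct demo_mem active;
    unsigned sealed_count; /* how many levels have been sealed so far */
};

/*
 * Seal the active level and start a fresh one if the active level is
 * older than the given generation or schema version.
 * Returns true if a rotation happened.
 */
static bool
demo_lsm_rotate_mem_if_required(struct demo_lsm *lsm, int64_t generation,
                                uint32_t schema_version)
{
    assert(lsm != NULL);
    if (lsm->active.generation >= generation &&
        lsm->active.schema_version >= schema_version)
        return false;
    lsm->sealed_count++;
    lsm->active.generation = generation;
    lsm->active.schema_version = schema_version;
    return true;
}
```

Calling a helper like this again after any operation that may yield keeps a statement from being inserted into a level that was sealed in the meantime.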
-
Vladimir Davydov authored
The optimization is mostly useless, because it only works if there's no data on disk. As explained in #5080, it contains a potential bug: if L0 dump is triggered between 'prepare' and 'commit', it will insert a statement to a sealed vy_mem. Let's drop it. Part of #5080 NO_DOC=bug fix NO_CHANGELOG=later
-
Nikita Pettik authored
gh_6634_different_log_on_tuple_new_and_free_test.lua verifies that the proper debug message gets into the logs for tuple_new() and tuple_delete(): occasionally tuple_delete() printed a wrong tuple address. However, there are still two debug logs: one in tuple_delete() and another one in memtx_tuple_delete(). So to avoid any possible confusion let's fix the regular expression so that it definitely finds the memtx_tuple_delete() log. NO_CHANGELOG=<Test fix> NO_DOC=<Test fix>
-
Vladimir Davydov authored
Net.box triggers (on_connect, on_schema_reload) are executed by the net.box connection worker fiber so a request issued by a trigger callback can't be processed until the trigger returns execution to the net.box fiber. Currently, an attempt to issue a synchronous request from a net.box trigger leads to a silent hang of the connection, which is confusing. Let's instead raise an error until #7291 is implemented. We need to add the check to three places in the code: 1. luaT_netbox_wait_result for future:wait_result() 2. luaT_netbox_iterator_next for future:pairs() 3. conn._request for all synchronous requests. (We can't add the check to luaT_netbox_transport_perform_request, because conn._request may also call conn.wait_state, which would hang if called from on_connect or on_schema_reload trigger.) We also add an assertion to netbox_request_wait to ensure that we never wait for a request completion in the net.box worker fiber. Closes #5358 @TarantoolBot document Title: Synchronous requests are not allowed in net.box triggers An attempt to issue a synchronous request (e.g. `call`) from a net.box trigger (`on_connect`, `on_schema_reload`) now raises an error: "Synchronous requests are not allowed in net.box trigger" (Before https://github.com/tarantool/tarantool/issues/5358 was fixed, it silently hung.) Invoking an asynchronous request (see `is_async` option) is allowed, but the request will not be processed until the trigger returns and an attempt to wait for the request completion with `future:pairs()` or `future:wait_result()` will raise the same error.
-
Ilya Verbin authored
`wal_write` has been adapted to spurious wakeups, so this protection is no longer needed (see 4bf52367). Part of #7166 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Ilya Verbin authored
Currently it's possible to wake up a fiber that is waiting on `limbo->wait_cond` using the Tarantool C API, so the protection by `fiber_set_cancellable` doesn't make much sense. This patch removes it. Spurious wakeups are already handled correctly. Part of #7166 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Ilya Verbin authored
Part of #7166 NO_DOC=fix comment NO_TEST=fix comment NO_CHANGELOG=fix comment
-
- Jun 23, 2022
-
-
Georgiy Lebedev authored
Procedure name cache was removed during refactoring in 8ce76364 based off the interpretation that libunwind caches procedure names internally — turned out it only caches unwinding info, which causes a severe performance downgrade (#7207): return it to speed up frame resolving. Closes #7207 NO_DOC=performance improvement NO_TEST=performance improvement
-
Georgiy Lebedev authored
For the sake of maximizing backtrace collection performance, build the libunwind project submodule with "-O2" compiler flag. Also, build it with the "-g" compiler flag just in case to simplify debugging. Last but not least, pass the same archive-maintaining program used by the main CMake project to the libunwind project submodule to make the build homogeneous. Needed for #7207 NO_CHANGELOG=build NO_DOC=build NO_TEST=build
-
Vladimir Davydov authored
exclude_null is a special index option, which makes the index ignore tuples that contain null in any of the indexed fields. Currently, it doesn't work for json and multikey indexes, because: 1. index_filter_tuple ignores json path. 2. index_filter_tuple ignores multikey index. Issue no. 1 is easy to fix - we just need to use tuple_field_by_part instead of tuple_field when checking if a key field is null. Issue no. 2 is more complicated, because when we call index_filter_tuple we don't know the multikey index. We address this issue by pushing the index_filter_tuple call down to engine-specific index implementation. For Vinyl, we make vy_stmt_foreach_entry, which iterates over multikey tuple entries, skip entries that contain nulls. For memtx, we move the check to index-specific index_replace function implementation. Fortunately, only tree indexes support nullable fields so we just need to update the memtx tree implementation. Ideally, we should handle multikey indexes in memtx at the top level, because the implementation should essentially be the same for all kinds of indexes, but this refactoring is complicated and will be done later. For now, just fix the bug. Closes #5861 NO_DOC=bug fix
-
Vladimir Davydov authored
For some reason, some test cases create memtx spaces irrespective of the value of the engine parameter. NO_DOC=test NO_CHANGELOG=test
-
- Jun 22, 2022
-
-
Nikita Pettik authored
We may need the full (old) tuple to add some of its fields to WAL entry. NO_CHANGELOG=<Later for EE> NO_DOC=<Later for EE> NO_TEST=<Later for EE>
-
Nikita Pettik authored
NO_CHANGELOG=<No functional changes> NO_DOC=<Later for EE>
-
Nikita Pettik authored
These fields correspond to the tuple before the DML request is executed (old) and to the result after it (new). For example, let the index store tuple {1, 1}: replace{1, 2} -- old == {1, 1}, new == {1, 2} These fields mostly make sense for the update operation, which holds a key and an array of update operations (not the old tuple). `old_tuple`, `new_tuple` are going to be used as WAL extensions available in the enterprise version. Along with that, let's reserve the 0x2c and 0x2d Iproto keys for these members. NO_DOC=<No functional changes> NO_TEST=<No functional changes> NO_CHANGELOG=<No functional changes>
-
Mergen Imeev authored
This macro is not set, so the code depending on it is dropped. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Mergen Imeev authored
This macro is not set, so unused code is dropped. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Mergen Imeev authored
This macro does nothing, so it is dropped. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Mergen Imeev authored
These macros do nothing, so they are dropped. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Mergen Imeev authored
This macro does nothing, so it is dropped. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Mergen Imeev authored
This macro does nothing, so it is dropped. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Georgiy Lebedev authored
We assume that a transaction cannot conflict itself and that only a non-prepared transaction can be conflicted: track these assumptions with appropriate assertions. NO_CHANGELOG=internal assertions NO_DOC=internal assertions NO_TEST=internal assertions
-
Georgiy Lebedev authored
When point holes are checked on insertion, we must only conflict transactions other than the one that read the hole. NO_CHANGELOG=internal bugfix NO_DOC=bugfix Closes #7234 Closes #7235
-
Georgiy Lebedev authored
When full scans are checked on writes, we must only conflict transactions other than the one that did the full scan. NO_CHANGELOG=internal bugfix NO_DOC=bugfix Closes #7221
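The rule shared by these transaction manager commits (a writer conflicts every tracked reader except itself, and only non-prepared transactions can be conflicted) can be sketched with a toy model. The `demo_` names are hypothetical and the flat reader array is an assumption; the real code tracks readers differently:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced transaction state. */
struct demo_txn {
    bool is_prepared;
    bool is_conflicted;
};

/* Mark `victim` as conflicted by `breaker`, checking the assumptions. */
static void
demo_txn_conflict(struct demo_txn *breaker, struct demo_txn *victim)
{
    /* A transaction cannot conflict itself. */
    assert(breaker != victim);
    /* Only a non-prepared transaction can be conflicted. */
    assert(!victim->is_prepared);
    victim->is_conflicted = true;
}

/*
 * `readers` are the transactions that read a hole or did a full scan.
 * On a write by `writer`, conflict all of them except the writer.
 * Returns the number of conflicted transactions.
 */
static size_t
demo_conflict_readers(struct demo_txn *writer, struct demo_txn **readers,
                      size_t count)
{
    size_t conflicted = 0;
    for (size_t i = 0; i < count; i++) {
        if (readers[i] == writer)
            continue; /* the writer's own read must not abort it */
        demo_txn_conflict(writer, readers[i]);
        conflicted++;
    }
    return conflicted;
}
```

Without the `readers[i] == writer` check, a transaction that scanned a space and then wrote to it would abort itself.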
-
- Jun 21, 2022
-
-
Igor Munkin authored
This patch introduces a reusable workflow used by the integration testing machinery run within the tarantool/luajit repository. In the first attempt a GitHub action was used, but its fetch (or more precisely unpack) phase fails due to the test/test-run.py symlink into the test-run submodule (the action being used doesn't fetch it while packing the tarantool repository). As an alternative to removing this symlink, it was decided to use reusable workflows despite their known limitations (e.g. inability to use the testing matrix) until the issue with the symlink is resolved one way or another. Alternatively, a common action to be used in all submodules for integration testing can be added to the tarantool/actions repository. NO_DOC=ci NO_TEST=ci NO_CHANGELOG=ci Reviewed-by: Yaroslav Lobankov <y.lobankov@tarantool.org> Reviewed-by: Sergey Bronnikov <sergeyb@tarantool.org> Signed-off-by: Igor Munkin <imun@tarantool.org>
-
Vladimir Davydov authored
Commit 4d52199e ("box: fix transaction "read-view" and "conflicted" states") updated vy_tx_send_to_read_view so that now it aborts all RW transactions right away instead of sending them to read view and aborting them on commit. It also updated vy_tx_begin_statement to fail if a transaction sent to a read view tries to do DML. With all that, we assume that there cannot possibly be an RW transaction sent to read view so we have an assertion checking that in vy_tx_commit. However, this assertion may fail, because a DML statement may yield on disk read before it writes anything to the write set. If this is the first statement in a transaction, the transaction is technically read-only and we will send it to read-view instead of aborting it. Once it completes the disk read, it will apply the statement and hence become read-write, breaking our assumption in vy_tx_commit. Fix this by aborting RW transactions sent to read-view in vy_tx_set. Follow-up #7240 NO_DOC=bug fix NO_CHANGELOG=unreleased
-
- Jun 20, 2022
-
-
Sergey Bronnikov authored
NO_CHANGELOG=ci NO_DOC=ci NO_TEST=ci
-
Sergey Bronnikov authored
By default CMake generates Makefiles for building a project. However, it can also generate Ninja files. Ninja [1] may build a project a bit faster than Make, see [2]. The patch adds fixes for CMake files allowing to use Ninja for building Tarantool: 1. Fixed dependencies in ExternalProject_Add(), see explanation in [3] 2. Fixed ninja error due to presence of symbol '$' in cmake/rpm.cmake 3. Added propagation of CMAKE_GENERATOR to dependencies that use CMake for building, see [4] How to build with Ninja: $ cmake -G Ninja -B build -S . $ ninja -C build/ 1. https://ninja-build.org/ 2. https://mesonbuild.com/Simple-comparison.html 3. https://stackoverflow.com/a/65803911/3665613 4. https://cmake.org/cmake/help/latest/module/ExternalProject.html NO_DOC=internal NO_CHANGELOG=internal NO_TEST=internal
-
- Jun 18, 2022
-
-
Igor Munkin authored
* ci: add job for build using Ninja on Linux/x86_64 * build: create file lists outside of CMake commands * build: use unique names for CMake targets * Revert "test: disable PUC-Rio tests for several -l options" * ci: make GitHub workflows more CMake-ish * test: adapt PUC-Rio tests for debug line hook * test: adapt PUC-Rio test for tail calls debug info * test: adapt PUC-Rio test with reversed function Closes #5693 Closes #5702 Closes #5782 Follows up #5747 NO_DOC=LuaJIT submodule bump NO_TEST=LuaJIT submodule bump NO_CHANGELOG=LuaJIT submodule bump
-
- Jun 17, 2022
-
-
Cyrill Gorcunov authored
When a fiber has finished its work it ends up in one of two cases: 1) If the "joinable" attribute is not set, the fiber is simply recycled. 2) Otherwise it continues hanging around waiting to be joined. Our API allows calling fiber_wakeup() for dead but joinable fibers (2) in release builds without any side effects: such fibers are simply ignored. In debug builds, however, this triggers an assertion. We can't change our API for the sake of backward compatibility, but at the same time we must not preserve different behaviour between release and debug builds, since this brings inconsistency. Thus let's get rid of the assertion and allow calling fiber_wakeup() in debug builds as well. Fixes #5843 NO_DOC=bug fix Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
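The resulting behaviour can be modelled with a toy sketch. The `demo_` names and flag values are assumptions for illustration only and do not reproduce the real fiber implementation; the point is that a dead fiber is skipped the same way in every build type instead of tripping an assertion:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits of a toy fiber. */
enum {
    DEMO_FIBER_IS_DEAD = 1 << 0,
    DEMO_FIBER_IS_JOINABLE = 1 << 1,
    DEMO_FIBER_IS_READY = 1 << 2,
};

struct demo_fiber {
    unsigned flags;
};

/*
 * Toy wakeup: returns true if the fiber was scheduled.
 * A dead (possibly still joinable) fiber or an already ready fiber is
 * silently ignored; there is no assert on the DEAD flag, so debug and
 * release builds behave identically.
 */
static bool
demo_fiber_wakeup(struct demo_fiber *f)
{
    assert(f != NULL);
    if (f->flags & (DEMO_FIBER_IS_DEAD | DEMO_FIBER_IS_READY))
        return false;
    f->flags |= DEMO_FIBER_IS_READY;
    return true;
}
```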
-
Serge Petrenko authored
Once the split-brain detection is in place, it's fine to nopify obsolete data even on a node with elections disabled. Let's not keep a bug around anymore. This behaviour change leads to changing "gh_6842_qsync_applier_order_test.lua" a bit. It actually relied on old and buggy behaviour: it assumed old transactions would not be nopified and would trigger replication error. This doesn't happen anymore, because nopify works correctly, and the transactions are not followed by a conflicting CONFIRM. The test for this commit is simply altering the gh_5295_split_brain_detection_test.lua to work with elections disabled. Closes #6133 Follow-up #5295 NO_DOC=internal change NO_CHANGELOG=internal change
-
Cyrill Gorcunov authored
When we receive synchro requests we can't just apply them blindly, because in the worst case they may come from a split-brain configuration (where a cluster has split into several clusters, each with its own elected leader, and the clusters are then trying to merge back into the original one). We need to do our best to detect such disunity and force these nodes to rejoin from scratch for the sake of data consistency. Thus when we're processing requests we pass them to the packet filter first, which validates their contents and refuses to apply them if they violate consistency. Depending on the request type, each packet traverses an appropriate chain. filter_generic(): a common chain for any synchro packet. 1) request:replica_id = 0 is allowed for PROMOTE requests only. 2) request:replica_id should match limbo:owner_id, IOW the limbo migration should be noticed by all instances in the cluster. filter_confirm_rollback(): a chain for CONFIRM | ROLLBACK packets. 1) Zero lsn is disallowed for such requests. filter_promote_demote(): a chain for PROMOTE | DEMOTE packets. 1) The requests should come in with a nonzero term, otherwise the packet is corrupted. 2) The request's term should not be less than the maximal known one, IOW it should not come in from nodes which didn't notice raft epoch changes and are living in the past. filter_queue_boundaries(): a common finalization chain. 1) If the LSN of the request matches the current confirmed LSN, the packet is obviously correct to process. 2) If the LSN is less than the confirmed LSN, then the request is wrong: we have processed the requested LSN already. 3) If the LSN is greater than the confirmed LSN, then a) if the limbo is empty we can't do anything, since the data is already processed, and should issue an error; b) if there is some data in the limbo, then the requested LSN should be in the range of the limbo's [first; last] LSNs, so that the request will be able to commit and rollback the limbo queue.
Note that the filtration is disabled during initial configuration, where we apply requests from the only source of truth (either the remote master or our own journal), so no split brain is possible. In order to make split-brain checks work, the applier nopify filter now passes synchro requests from an obsolete term without nopifying them. Also, now ANY asynchronous request coming from an instance with an obsolete term is treated as a split-brain. Think of it as a synchronous request committed with a malformed quorum. Closes #5295 NO_DOC=it's literally below Co-authored-by: Serge Petrenko <sergepetrenko@tarantool.org> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> @TarantoolBot document Title: new error type: ER_SPLIT_BRAIN If for some reason the cluster had 2 leaders working independently (for example, the user has mistakenly lowered the quorum below N / 2 + 1), then once such leaders and their followers try connecting to each other, they will receive the ER_SPLIT_BRAIN error, and the connection will be aborted. This is done to preserve data integrity. Once the user notices such an error, he or she has to manually inspect the data on both split halves, choose a way to restore the data, and rebootstrap one of the halves from the other.
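The queue-boundary rules of the filter_queue_boundaries() chain described in this commit message can be sketched as follows. This is a model built only from the three rules in the message; the `demo_` struct, its fields and the return convention are assumptions, not the actual limbo code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced view of the limbo state the check needs. */
struct demo_limbo {
    int64_t confirmed_lsn;
    bool is_empty;
    int64_t first_lsn; /* LSN of the first queued entry, if any */
    int64_t last_lsn;  /* LSN of the last queued entry, if any */
};

/*
 * Returns 0 if a request with `req_lsn` may be applied and -1 if it
 * must be rejected (the real code would raise a split-brain error).
 */
static int
demo_filter_queue_boundaries(const struct demo_limbo *limbo, int64_t req_lsn)
{
    assert(limbo != NULL);
    /* 1) Matches the confirmed LSN: obviously correct to process. */
    if (req_lsn == limbo->confirmed_lsn)
        return 0;
    /* 2) Below the confirmed LSN: this LSN was processed already. */
    if (req_lsn < limbo->confirmed_lsn)
        return -1;
    /* 3a) Above the confirmed LSN, but nothing is queued. */
    if (limbo->is_empty)
        return -1;
    /* 3b) Must fall into the [first; last] range of queued LSNs. */
    if (req_lsn < limbo->first_lsn || req_lsn > limbo->last_lsn)
        return -1;
    return 0;
}
```

For instance, with a confirmed LSN of 10 and queued entries 11 to 15, a request for LSN 12 passes, while requests for LSN 9 or LSN 16 are rejected.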
-
Serge Petrenko authored
Change return types of txn_limbo_req_prepare, txn_limbo_process, txn_limbo_write_promote, txn_limbo_write_demote from void to int. This is a preparation for when these functions start returning errors. Part-of #5295 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Serge Petrenko authored
Make box_issue_promote and box_issue_demote return a return code. For now it's always 0, but soon they will return errors. Part-of #5295 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Serge Petrenko authored
limbo->confirmed_lsn was only filled on limbo owner in txn_limbo_write_confirm. Replicas and recovering limbo owner need to track it as well to correctly detect split-brains based on confirmed_lsn. So update confirmed_lsn in txn_limbo_read_confirm. Part-of #5295 NO_DOC=internal change NO_TEST=tested in future commits NO_CHANGELOG=internal change
-