- Dec 06, 2023
-
-
Sergey Ostanevich authored
Some changes are not new features but rather developer tool updates and improvements. There are also a number of tweaks we can introduce to improve testing and test backporting across branches, which are considered neither a feature nor a bug fix. NO_DOC=changelog NO_TEST=changelog NO_CHANGELOG=changelog
-
Nikita Zheleztsov authored
Sometimes the test fails with a "Peer closed" error, and the logs say that a fatal error happened: cfg_get('read_only'). This is caused by the instance processing its own ballot. The problem is that we set box.cfg to a function in the before_all trigger, so it's impossible to access the box.cfg table during the whole test execution. Let's instead set box.cfg to a function at the start of every test and restore box.cfg at the end. This way we shrink the time window in which such a fatal error can happen. Even though it's still possible to get it in theory, the problem is not reproduced anymore. The alternative solution of introducing an errinj seems excessive here. Closes tarantool/tarantool-qa#329 NO_DOC=testfix NO_CHANGELOG=testfix
-
Nikita Zheleztsov authored
We need to apply instance/replicaset_name as soon as the instance becomes RW, so currently we try to do so at every box.status broadcast. Even though the broadcast happens pretty often, it's not enough.

The bug is reproduced in config-luatest/set_names_reload, which checks the following situation:

1. The cluster is recovered from xlogs without names set.
2. The user forgets to set the UUID for one replica and starts the cluster.
3. The replica whose UUID has not been set fails to start.
4. The user notices that and updates the config, reloading it on the instances that started successfully and starting the failed one.
5. The master must apply the name for the failed replica.

The test worked all right in the majority of runs, because box.status is broadcast often: e.g. it's broadcast when the master's applier has synced with the replica. However, under heavy CPU load, the test sometimes failed when the master failed to subscribe to the replica and the broadcast didn't happen.

Let's try to set names not only when box.status is broadcast, but also immediately after reload, since new names that must be set might appear at this time. Let's also change the test so that it doesn't rely on the broadcast anymore.

Closes tarantool/tarantool-qa#328 NO_DOC=bugfix
-
Alexander Turenko authored
See the details in the documentation request below.

Fixes #9431

@TarantoolBot document
Title: config: failover mode and election mode consistency

`replication.failover: election` enables the Raft-based leader election mechanism on a replicaset. The instances can be configured in the following election modes: `off`, `candidate`, `voter`, `manual`. The mode is controlled by the `replication.election_mode` parameter.

However, the election mode parameter makes no sense and is confusing for the other failover modes (`off`, `manual`, `supervised`). So it is forbidden to set an election mode other than `off` in failover modes != `election`.

Summary:

* `replication.failover: off`
  * `replication.election_mode: off`: OK
  * `replication.election_mode: candidate`: FAIL
  * `replication.election_mode: voter`: FAIL
  * `replication.election_mode: manual`: FAIL
* `replication.failover: manual`
  * `replication.election_mode: off`: OK
  * `replication.election_mode: candidate`: FAIL
  * `replication.election_mode: voter`: FAIL
  * `replication.election_mode: manual`: FAIL
* `replication.failover: election`
  * `replication.election_mode: off`: OK
  * `replication.election_mode: candidate`: OK
  * `replication.election_mode: voter`: OK
  * `replication.election_mode: manual`: OK
* `replication.failover: supervised`
  * `replication.election_mode: off`: OK
  * `replication.election_mode: candidate`: FAIL
  * `replication.election_mode: voter`: FAIL
  * `replication.election_mode: manual`: FAIL
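
The rule in the summary can be sketched as a small check. This is an illustration in Python, not the config module's code: `check_election_mode` and `is_allowed` are hypothetical names, and only the mode names and the OK/FAIL outcomes come from the commit message.

```python
# Hypothetical validator; names are illustrative, not Tarantool's API.
ELECTION_MODES = {"off", "candidate", "voter", "manual"}
FAILOVER_MODES = {"off", "manual", "election", "supervised"}


def check_election_mode(failover: str, election_mode: str) -> None:
    """Raise ValueError if the pair violates the rule from the summary."""
    if failover not in FAILOVER_MODES:
        raise ValueError(f"unknown failover mode: {failover!r}")
    if election_mode not in ELECTION_MODES:
        raise ValueError(f"unknown election mode: {election_mode!r}")
    # Only failover = election may use an election mode other than off.
    if failover != "election" and election_mode != "off":
        raise ValueError(
            f"election_mode {election_mode!r} is not allowed "
            f"with failover {failover!r}"
        )


def is_allowed(failover: str, election_mode: str) -> bool:
    """Return True for the OK rows of the summary, False for the FAIL rows."""
    try:
        check_election_mode(failover, election_mode)
    except ValueError:
        return False
    return True
```

With this sketch, `is_allowed("election", "voter")` is true, while `is_allowed("supervised", "candidate")` is false.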
-
Alexander Turenko authored
The next commit adds one more check, which makes this snippet of code too large to be part of the constructor function. NO_DOC=refactoring NO_CHANGELOG=see NO_DOC NO_TEST=see NO_DOC
-
Alexander Turenko authored
It allows starting a replicaset from a given declarative configuration, performing checks on all or particular instances, adding a new instance to the replicaset (without stopping existing instances), updating the config file, and reloading it on the instances. NO_DOC=testing helper NO_CHANGELOG=see NO_DOC NO_TEST=see NO_DOC
-
Alexander Turenko authored
The module now has only one function, which reduces the boilerplate code needed to verify a scenario in which all instances of a replicaset should fail with the same error message. The module will be extended with replicaset management functions later. NO_DOC=testing helper NO_CHANGELOG=see NO_DOC NO_TEST=see NO_DOC
-
Alexander Turenko authored
It allows constructing a declarative configuration for a test case using less boilerplate code/options, especially when a replicaset is to be tested rather than a single instance. See the description at the top of the file for details. NO_DOC=testing helper NO_CHANGELOG=see NO_DOC NO_TEST=see NO_DOC
-
- Dec 05, 2023
-
-
Serge Petrenko authored
Relay sometimes decodes the PROMOTE packets to be sent in order to conditionally delay their dispatch. It was believed that own WALs can't be corrupted and hence there is no point in checking that decoding succeeds. Let's be more strict here and check the decoding result before proceeding. Closes #9265 NO_DOC=bugfix NO_TEST=hard to test NO_CHANGELOG=not user-visible
-
Mergen Imeev authored
Closes #9435 @TarantoolBot document Title: `login` and `password` fields in URI table Now `uri.parse()` can parse a table containing the `login` and `password` fields. Values from these fields take precedence over values obtained from the string URI. For example, the login and password of `{uri = 'one:two@localhost:3301', login = 'alpha', password = 'omega'}` will be `alpha` and `omega` respectively. If the `login` field is set and the `password` field is not set, the password is set to `nil`. If the `password` field is set, the `login` field must be present.
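
The precedence rules above can be sketched as follows. This is a Python illustration, not the `uri` module: `resolve_credentials` is a hypothetical helper, and it assumes the string URI carries credentials in the `login:password@host:port` form.

```python
def resolve_credentials(uri, login=None, password=None):
    """Return (login, password) following the rules described above.

    Hypothetical helper for illustration only; it is not uri.parse().
    """
    # A password without a login is rejected.
    if password is not None and login is None:
        raise ValueError("password is set, but login is not")
    # Table fields win over the string URI. A missing password stays None
    # even if the string URI contains one.
    if login is not None:
        return login, password
    # Otherwise fall back to the credentials embedded in the string URI
    # (assumed form: login:password@host:port).
    userinfo, sep, _ = uri.rpartition("@")
    if not sep:
        return None, None
    user, _, pwd = userinfo.partition(":")
    return user, (pwd or None)
```

For instance, `resolve_credentials("one:two@localhost:3301", login="alpha")` yields `("alpha", None)`: the password from the string is dropped because the `login` field is set and the `password` field is not.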
-
Magomed Kostoev authored
Before this patch it could happen that deletion of a function from the function cache didn't delete it from the funcs_by_name map. The reason is that the check of whether the function exists in the map compares the search result with the `end` bucket ID of the wrong hash table.

The situation in which this could happen is the following:

1. Insertion of a new function into the cache triggers a resize of the funcs_by_value map, but the size of the funcs map remains the same.
2. Then the user deletes a function. This removes the function from the funcs map. Then we check whether the function exists in the funcs_by_value map. The function exists there, but it so happens that its bucket ID equals the funcs map bucket count, so the incorrect check states that the function does not exist in the funcs_by_value map, and it's not dropped from the map.
3. Now we have the following result: the function is referenced in the funcs_by_value map, but not in the funcs map. This triggers an assertion failure on any attempt to insert a new function with the same name.

Closes #9426 NO_DOC=bugfix
-
- Dec 04, 2023
-
-
Mergen Imeev authored
This patch adds support for automatic master discovery in vshard. There is no longer a need to reapply the vshard storage configuration every time an instance becomes a master or ceases to be a master. Automatic master discovery also solves the problem with the rebalancer. Previously, the rebalancer could not work correctly if the masters of some replicasets were unknown, and since the vshard config generated by the config module did not contain information about all the masters, the rebalancer was disabled. The rebalancer can now perform master discovery on its own, which is why it is enabled. Before this patch, the config module was based on the "test: use dofile() for configs instead of require" commit in vshard. At a minimum, commit "storage: fix assertion error in conn_manager" is now required. Part of #8862 NO_DOC=will be added along with documentation for the rebalancer role NO_CHANGELOG=will be added along with changelog for the rebalancer role
-
Maxim Kokryashkin authored
This module became unused as a result of LuaJIT bump made in the commit 88333d13 ("luajit: bump new version"), so it can be purged safely from the Tarantool sources. Part of #8700 NO_DOC=internal NO_TEST=internal NO_CHANGELOG=added within the aforementioned commit
-
- Nov 30, 2023
-
-
Serge Petrenko authored
Current split-brain detector implementation raises an error each time a CONFIRM or ROLLBACK entry is received from the previous synchronous transaction queue owner. It is assumed that the new queue owner must have witnessed all the previous CONFIRMS. Besides, according to Raft, ROLLBACK should never happen. Actually there is a case when a CONFIRM from an old term is legal: it's possible that during leader transition old leader writes a CONFIRM for the same transaction that is confirmed by the new leader's PROMOTE. If PROMOTE and CONFIRM lsns match there is nothing bad about such situation. Symmetrically, when an old leader issues a ROLLBACK with the lsn right after the new leader's PROMOTE lsn, it is not a split-brain. Allow such cases by tracking the last confirmed lsn for each synchronous transaction queue owner and silently nopifying CONFIRMs with an lsn less than the one recorded and ROLLBACKs with lsn greater than that. Closes #9138 NO_DOC=bugfix
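
The filtering rule can be sketched roughly as below. This is a simplified Python model, not the applier code: `classify`, `Verdict`, and the `confirmed` dictionary are hypothetical names, and treating a CONFIRM whose lsn equals the recorded one as a no-op is an assumption based on the "lsns match" case described above.

```python
from enum import Enum


class Verdict(Enum):
    APPLY = "apply"
    NOP = "nop"
    SPLIT_BRAIN = "split-brain"


def classify(kind, lsn, sender, current_owner, confirmed):
    """Decide what to do with a CONFIRM or ROLLBACK entry.

    confirmed maps a queue owner id to the last lsn known to be confirmed
    for that owner (hypothetical structure for this sketch).
    """
    if sender == current_owner:
        return Verdict.APPLY
    last = confirmed.get(sender, 0)
    # An old owner's CONFIRM that is already covered by the recorded
    # confirmed lsn carries no new information (assumption: inclusive).
    if kind == "CONFIRM" and lsn <= last:
        return Verdict.NOP
    # An old owner's ROLLBACK past the confirmed lsn does not undo
    # anything that was confirmed.
    if kind == "ROLLBACK" and lsn > last:
        return Verdict.NOP
    return Verdict.SPLIT_BRAIN
```

With `confirmed = {1: 10}` and owner 2 holding the queue, a CONFIRM with lsn 10 from node 1 is turned into a no-op, while a CONFIRM with lsn 11 from node 1 is still reported as a split-brain.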
-
Serge Petrenko authored
Previously the replicas only persisted the confirmed lsn of the current synchronous transaction queue owner. As soon as the owner changed, the info about which lsn was confirmed by the previous owner was lost. Actually, this info is needed to correctly filter synchro requests coming from the old term, so start tracking the confirmed vclock instead of the confirmed lsn on replicas. In-scope of #9138 NO_TEST=covered by the next commit NO_CHANGELOG=internal change @TarantoolBot document Title: Document new IPROTO_RAFT_PROMOTE request field IPROTO_RAFT_PROMOTE and IPROTO_RAFT_DEMOTE requests receive a new key-value pair: IPROTO_VCLOCK : MP_MAP The vclock holds the confirmed vclock of the node sending the request.
-
Serge Petrenko authored
Synchronous requests will receive a new field encoding a full vclock soon. Theoretically a vclock may take up to ~ 300-400 bytes (3 bytes for a map header + 32 components each taking up 1 byte for replica id and up to 9 bytes for lsn). So it makes no sense to increase SYNCHRO_BODY_LEN_MAX from 32 to 400-500. It would become almost the same as plain BODY_LEN_MAX. Simply reuse the latter everywhere. In-scope-of #9138 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
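
The size estimate can be reproduced with a few lines of arithmetic. The constant names below are made up for this Python sketch; the numbers are the ones given in the message.

```python
# Numbers from the commit message; the names are illustrative.
MAP_HEADER_BYTES = 3   # map header
MAX_COMPONENTS = 32    # vclock components
REPLICA_ID_BYTES = 1   # replica id per component
MAX_LSN_BYTES = 9      # lsn per component, worst case


def max_vclock_bytes():
    """Upper bound on the encoded vclock size under the stated assumptions."""
    per_component = REPLICA_ID_BYTES + MAX_LSN_BYTES
    return MAP_HEADER_BYTES + MAX_COMPONENTS * per_component


print(max_vclock_bytes())  # 323
```

The result, 323 bytes, is about ten times the old 32-byte limit, which is why a separate synchro-specific limit stops being useful.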
-
Serge Petrenko authored
There was an error in xrow_decode_synchro: it compared the expected type of the value to the type of the key (MP_UINT) instead of the type of the actual value. This went unnoticed because all values in synchro requests were integers. This is going to change soon, when PROMOTE requests will start holding a vclock, so fix the wrong type check. In-scope-of #9138 NO_DOC=bugfix NO_CHANGELOG=not user-visible
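
The class of bug can be shown on a toy decoder. This Python sketch is not `xrow_decode_synchro`: the key numbers and the `EXPECTED` table are invented for illustration, with Python types standing in for MsgPack types.

```python
# Invented key ids and expected value types for the illustration.
EXPECTED = {1: int, 2: dict}


def validate_buggy(body):
    """Compare the expected value type with the type of the KEY (the bug)."""
    return all(type(key) is EXPECTED[key] for key in body)


def validate_fixed(body):
    """Compare the expected value type with the type of the actual VALUE."""
    return all(type(value) is EXPECTED[key] for key, value in body.items())
```

While every expected type is an integer, the buggy check accepts anything, including `{1: "oops"}`, because keys are always integers. Once a map-typed field such as a vclock appears, it starts rejecting valid input like `{1: 5, 2: {1: 10}}`, whereas the fixed check accepts it.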
-
Sergey Kaplun authored
Without checking the return value of `lua_pcall()` in `lua_field_inspect_ucdata()`, the error message itself is returned as a serialized result. The result status of `lua_pcall()` is not ignored now. NO_DOC=bugfix Closes #9396
-
Nikolay Shirokovskiy authored
Netbox internally watches 'box.shutdown' for the sake of graceful shutdown. The event subscription is asynchronous with respect to the connection API. Additionally, we check the error count on the server using a different connection. As a result, we may or may not account for the error caused by the netbox internal watch failure. Let's account for the internal watch failure reliably. Also, while we are at it, let's get rid of races in the error count check. Close #9423 NO_CHANGELOG=internal NO_DOC=internal
-
Alexander Turenko authored
If `config.etcd` is present and non-empty, `config.etcd.prefix` is required. This validation check was not performed due to a mistake in a schema node wrapper that adds a validator that checks an attempt to use an Enterprise Edition option on Community Edition. Part of #8862 NO_DOC=bugfix
-
- Nov 29, 2023
-
-
Nikolay Shirokovskiy authored
Looks like this is a typo introduced in the commit 0704ebb7 ("xlog: rework writer API"). Close #9428 NO_TEST=will be tested when fiber_cxx_invoke suppression will be removed NO_CHANGELOG=introduced in 3.0.0-alpha3 NO_DOC=bugfix
-
Serge Petrenko authored
Starting with commit f1c2127d ("replication: add META stage to JOIN") replication master appends a special section, called IPROTO_JOIN_META to the initial snapshot sent to the replica. This section contains the latest raft term and synchronous transaction queue owner and term. The section is only sent to nodes, which have a non-zero version_id. For some reason, version_id encoding for FETCH_SNAPSHOT (analog of JOIN for anonymous replicas) wasn't added in that commit, so anonymous replicas do not receive synchronous queue state. This leads to them raising ER_SPLIT_BRAIN errors later after join, when the first synchronous row arrives. In order to fix this, start encoding version_id in FETCH_SNAPSHOT requests. Closes #9401 @TarantoolBot document Title: new field in `IPROTO_FETCH_SNAPSHOT` request `IPROTO_FETCH_SNAPSHOT` request was bodyless (only contained a header) until now, but now it receives a body with a single field: `IPROTO_SERVER_VERSION` : MP_UINT -- an encoded representation of the server version of a replica issuing the request.
-
Yan Shtunder authored
Added a new `is_sync` parameter to `box.begin()`, `box.commit()`, and `box.atomic()`. To make the transaction synchronous, set the `is_sync` option to `true`. If any value other than `true/nil` is set, for example `is_sync = "some string"`, then an error will be thrown.

Example:

```Lua
-- Sync transactions
box.atomic({is_sync = true}, function() ... end)

box.begin({is_sync = true}) ... box.commit({is_sync = true})

box.begin({is_sync = true}) ... box.commit()

box.begin() ... box.commit({is_sync = true})

-- Async transactions
box.atomic(function() ... end)

box.begin() ... box.commit()
```

Closes #8650

@TarantoolBot document
Title: box.atomic({is_sync = true})

Added the new `is_sync` parameter to `box.atomic()`. To make the transaction synchronous, set the `is_sync` option to `true`. Setting `is_sync = false` is prohibited. If any value other than `true` is set, for example `is_sync = "some string"`, then an error will be thrown.
-
Mergen Imeev authored
This patch adds dependencies support for roles.

Part of #9078

@TarantoolBot document
Title: dependencies for roles

Roles can now have dependencies. This means that the verify() and apply() methods will be executed for these roles, taking into account the dependencies. Dependencies should be written in the "dependencies" field of the array type. Note, the roles will be loaded (not applied!) in the same order in which they were specified, i.e. not taking dependencies into account.

Example:

Dependencies of role A: B, C
Dependencies of role B: D
No other role has dependencies.
Order in which roles were given: [E, C, A, B, D, G]
They will be loaded in the same order: [E, C, A, B, D, G]
The order in which functions verify() and apply() will be executed: [E, C, D, B, A, G].
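
One ordering that reproduces the example is a depth-first walk over the roles in the order they were given, visiting each role's dependencies first. The Python sketch below shows that technique under stated assumptions: `apply_order` is a hypothetical name, the real implementation may differ, and the handling of circular dependencies and of dependencies missing from the list is a guess.

```python
def apply_order(roles, deps):
    """Return roles in an order where each role follows its dependencies.

    roles: role names in the order they were given.
    deps:  mapping from a role name to the list of its dependencies.
    """
    order = []
    done = set()

    def visit(role, stack=()):
        if role in done:
            return
        if role in stack:
            # Assumption: a circular dependency is an error.
            raise ValueError(f"circular dependency involving {role!r}")
        for dep in deps.get(role, []):
            visit(dep, stack + (role,))
        done.add(role)
        order.append(role)

    for role in roles:
        visit(role)
    return order


given = ["E", "C", "A", "B", "D", "G"]
dependencies = {"A": ["B", "C"], "B": ["D"]}
print(apply_order(given, dependencies))  # ['E', 'C', 'D', 'B', 'A', 'G']
```

The printed order matches the verify()/apply() order from the example, while the load order stays equal to `given`.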
-
Vladimir Davydov authored
Closes #9405 @TarantoolBot document Title: Document the new built-in system event `box.wal_error` The new event is broadcast whenever Tarantool fails to commit a transaction to the write-ahead log (WAL), which usually means there's a problem with the underlying disk storage. The new event's payload is a table that currently contains a single field, `count`, which stores the number of WAL errors that have happened so far, or nil if there haven't been any WAL errors.
-
- Nov 28, 2023
-
-
Nikolay Shirokovskiy authored
A test suite run can produce coredumps in case of bugs. Unfortunately, coredumps caused by bugs are mixed with coredumps produced under special test conditions, such as when we test Tarantool's response to a deadly signal. Avoid producing coredumps in a correct test suite run. NO_CHANGELOG=internal NO_DOC=internal
-
Vladimir Davydov authored
The fix is simple: look up the function in `box.func` by name and, if found, execute its `call` method. The only tricky part is to avoid the lookup before `box.cfg` is called because `box.func` is unavailable at the time. We achieve that by checking `box.ctl.is_recovery_finished`. Closes #9131 NO_DOC=bug fix
-
Nikolay Shirokovskiy authored
On Tarantool shutdown we destroy all the fibers in some sequence. We don't require that all the fibers are finished before shutdown. So it may turn out that we first destroy some alive fiber and then destroy another alive fiber which joins the first one. Currently we have a use-after-free issue in this case, because clearing the `link` field of the second fiber changes the `wake` field of the first fiber. Close #9406 NO_DOC=bugfix
-
Nikolay Shirokovskiy authored
Graceful shutdown is done in a special fiber, which is started, for example, on SIGTERM. So it can run concurrently with the fiber executing the Tarantool init script. On init fiber exit we break the event loop to pass control back to the Tarantool initialization code. But we fail to run the event loop a bit more to finish the graceful shutdown. The test is a bit contrived. A more real-world case is when Tarantool is terminated during a lingering box.cfg(). Close #9411 NO_DOC=bugfix
-
- Nov 27, 2023
-
-
Alexander Turenko authored
It was suggested by Igor Munkin (@igormunkin) in PR #9288. Part of #8862 Follows up PR #9288 NO_DOC=the help message is not an API, nothing to document NO_CHANGELOG=see NO_DOC NO_TEST=see NO_DOC
-
Mergen Imeev authored
According to ANSI, EXISTS is a predicate that tests a given subquery and returns true if it returns more than 0 rows, false otherwise. However, after 2a720d11, EXISTS worked correctly only if there were exactly 0 or 1 rows, and in all other cases it gave an error. This patch makes EXISTS work properly. Closes #8676 NO_DOC=bugfix
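
The expected semantics can be observed in any conforming SQL engine. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a stand-in; it is not Tarantool SQL, and the table name `t` is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER)")


def exists_rows():
    """Return the value of EXISTS over a subquery selecting all rows of t."""
    return conn.execute("SELECT EXISTS (SELECT a FROM t)").fetchone()[0]


results = [exists_rows()]  # the subquery returns 0 rows
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])
results.append(exists_rows())  # the subquery returns 3 rows
print(results)  # [0, 1]
```

A subquery returning three rows still yields a plain true value rather than an error, which is the behavior the patch restores.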
-
Magomed Kostoev authored
Before this commit the space rollback had been treated as a new space creation, so it caused creation of a new space object in Lua's box.space namespace. Since the preceding space drop removed the space object from the namespace, on the space rollback all the Lua users of the space lost track of its changes: the original space object was never updated anymore. This is fixed by detecting the space rollback and restoring the old space object instead of creating a new one. Closes #9120 NO_DOC=bugfix
-
Mergen Imeev authored
This patch fixes an assertion or segmentation error if a FOREIGN KEY or CHECK constraint is declared before the first column. Closes #8392 NO_DOC=bugfix
-
- Nov 23, 2023
-
-
Sergey Vorontsov authored
In the `.github/workflows/source.yml` workflow for preparing a tarball with the source code, a PackPack Docker container is already used. For uploading the tarball to the repo, the `aws` utility is used, which is installed before. To skip installation of additional packages on the self-hosted runners, we are moving to the GitHub-hosted runners, which already have the `aws` utility installed. Step `Prepare checkout` is removed because the GitHub-hosted runner is an ephemeral environment. NO_DOC=ci NO_TEST=ci NO_CHANGELOG=ci
-
Sergey Vorontsov authored
In this commit, we're fixing a problem with Docker in the workflow `.github/workflows/source.yml`. The mentioned workflow uses the `.github/actions/environment` action that needs a permission to make a loopback device for [1]. We didn't allow for that before due to missing container args, and it caused the following error:

```
umount: /tmp/luajit-test-vardir: must be superuser to unmount.
256000+0 records in
256000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 1.36702 s, 767 MB/s
mount: /tmp/luajit-test-vardir: mount failed: Operation not permitted.
Error: Process completed with exit code 1.
```

The problem has existed since commit af996bbb ("ci: dockerize linux workflows"). The simplest way to fix the issue is not to run the workflow inside a Docker container because a tarball with the source code is created via the `./packpack/packpack tarball` command that runs a Docker container as well.

[1] https://github.com/tarantool/tarantool/issues/7472

NO_DOC=ci NO_TEST=ci NO_CHANGELOG=ci
-
Mergen Imeev authored
This patch removes the sql_default_value field from the struct field_def and the sql_default_value_expr field from the struct tuple_field as they are no longer needed. Follow-up #8793 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
- Nov 22, 2023
-
-
Sergey Bronnikov authored
The patch introduces a new CMake target "checkpatch" that checks patches on top of the master branch using the script checkpatch.pl [1]. By default, CMake looks for `checkpatch.pl` in the directory "checkpatch" in the root directory of Tarantool's repository and in the directories specified in PATH. By default, the commit revision range checked by checkpatch is `origin/master..HEAD`; `origin/master` can be overridden with the environment variable `CHECKPATCH_GIT_REF`. 1. https://github.com/tarantool/checkpatch NO_CHANGELOG=build NO_DOC=build NO_TEST=build
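
The override logic for the revision range amounts to a default with an environment fallback. The Python sketch below only models that behavior; `checkpatch_range` is a hypothetical helper and says nothing about how the CMake target itself is written.

```python
import os


def checkpatch_range(env=None):
    """Build the revision range, honoring CHECKPATCH_GIT_REF if it is set."""
    env = os.environ if env is None else env
    base = env.get("CHECKPATCH_GIT_REF", "origin/master")
    return f"{base}..HEAD"
```

With an empty environment this gives `origin/master..HEAD`. Passing `{"CHECKPATCH_GIT_REF": "my-base"}` (an arbitrary example ref) gives `my-base..HEAD`.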
-
Vladimir Davydov authored
We run SVACE on static build. It doesn't compile unless libsvace is in the allow list. Follow-up #9242 NO_DOC=build NO_TEST=build NO_CHANGELOG=build
-
Vladimir Davydov authored
SVACE stopped working after commit 98b38e89 ("cmake: allow to bundle static dependencies in main project") changed the bundled libs directory layout. To fix this, let's introduce the new cmake option BUNDLED_LIBS_INSTALL_DIR and set it in static-build/CMakeLists.txt to the legacy location. Also, let's use the legacy directories for each external project's PREFIX, SOURCE_DIR, BINARY_DIR, and STAMP_DIR. Follow-up #9242 NO_DOC=build NO_TEST=build NO_CHANGELOG=build
-
- Nov 21, 2023
-
-
Igor Munkin authored
* Mark CONV as non-weak, to prevent elimination of its side-effect.
* Fix ABC FOLD rule with constants.
* test: add test for conversions folding
* Add NaN check to IR_NEWREF.
* test: fix flaky OOM error frame test
* LJ_GC64: Fix lua_concat().
* test: introduce asserts assert_str{_not}_equal
* ci: enable codespell
* cmake: introduce target with codespell
* codehealth: fix typos
* tools: add cli flag to run profile dump parsers
* profilers: purge generation mechanism
* memprof: refactor symbol resolution
* sysprof: fix crash during FFUNC stream
* Fix last commit.
* Print errors from __gc finalizers instead of rethrowing them.
* x86/x64: Fix math.ceil(-0.9) result sign.
* test: fix flaky fix-jit-dump-ir-conv.test.lua
* IR_MIN/IR_MAX is non-commutative due to underlying FPU ops.
* Fix jit.dump() output for IR_CONV.
* Fix FOLD rule for x-0.
* FFI: Fix pragma push stack limit check and throw on overflow.
* Prevent compile of __concat with tailcall to fast function.
* Fix base register coalescing in side trace.
* Fix register mask for stack check in head of side trace.
* x64: Properly fix __call metamethod return dispatch.

Closes #8594
Closes #8767
Closes #9339
Part of #9145

NO_DOC=LuaJIT submodule bump NO_TEST=LuaJIT submodule bump
-