- Jan 25, 2023
-
-
Nikolay Shirokovskiy authored
Flight recorder uses an mmapped file for performance reasons. Unfortunately, if for some reason the mapping is not possible, we get SIGBUS on accessing the mmapped memory. We already handle this issue on flight recorder start in [1]. But if we get SIGBUS later, we currently crash. Let's just panic. [1] https://github.com/tarantool/tarantool-ee/commit/e68b84768f3ae76aa9f03d36dd6f0587b884e1bb CE Part of https://github.com/tarantool/tarantool-ee/issues/198 NO_DOC=stub for EE part NO_CHANGELOG=stub for EE part NO_TEST=tested in EE part
-
Nikolay Shirokovskiy authored
Currently crash has code to report crashes to feedback URL. It does not belong to core. Also it brings dependencies from box to core with INSTANCE_UUID and REPLICASET_UUID variables. So let's move this part back to box. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Nikolay Shirokovskiy authored
Currently we keep nanoseconds but do not use them, and as a result we have unneeded scaling. Let's use second precision, which is all we need. Also drop ns_to_localtime. It can be written much shorter, and it feels like we don't need a distinct function here, just as when we format time in say.c. NO_TEST=refactoring NO_DOC=refactoring NO_CHANGELOG=refactoring
-
Mikhail Elhimov authored
1. Add pretty-printer for 'rlist' type

   The script holds a table of possible rlists in tarantool. It automatically recognizes the actual type of entries and whether the expression refers to a single item or to the entire list. Then either the single entry is displayed (along with its index in the list) or the entire list (entry by entry).

2. Add command 'tt-list' that uses the mentioned pretty-printer and provides additional functionality, namely:
   - walking over the list (both directions)
   - filtering entries with a predicate
   - explicitly specifying the list head and the actual entry type if automatic recognition doesn't work

See 'help tt-list' for details.

Close #7731

NO_DOC=gdb extension
NO_CHANGELOG=gdb extension
NO_TEST=gdb extension
-
Serge Petrenko authored
Currently all ballot_watcher fibers for different appliers have identical names: "applier_ballot_watcher". It's hard to distinguish them in logs and to know which ballot_watcher belongs to which applier, and that complicates debugging quite a bit. Let's fix this and name the fibers properly, with the master's uri at the end. NO_DOC=no behaviour changed NO_CHANGELOG=no behaviour changed NO_TEST=doesn't need one
-
Serge Petrenko authored
Add is_joinable parameter to applier_fiber_new. For now all usages of the function pass true, but the next commit will create a non-joinable applier fiber. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Mergen Imeev authored
Some of the SQL modules have not been used for a long time. This patch drops these modules. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
- Jan 24, 2023
-
-
Serge Petrenko authored
A node configured in 'manual' mode and promoted by `box.ctl.promote()` stays in is_candidate state for the whole term, even though it is not is_cfg_candidate. If such a node is the first one to notice leader death or to hit the election timeout, it bumps the term excessively, then immediately becomes a mere follower, because its is_candidate is reset with is_cfg_candidate.

This extra term bump (one term after the node was actually promoted) is unnecessary and might lead to strange errors:

    tarantool> box.ctl.promote()
    ---
    - error: 'The term is outdated: old - 3, new - 4'
    ...

Fix this by checking if a node is configured as a candidate before trying to start new elections.

Closes #8168

NO_DOC=bugfix
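The check described above can be pictured with a toy Python model. This is only a sketch under assumptions, not the actual raft code: the flag names `is_candidate` and `is_cfg_candidate` come from the commit message, while the `RaftNode` class and the `on_leader_death` method are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RaftNode:
    """Toy model; only the two flag names are taken from the commit message."""
    term: int
    is_cfg_candidate: bool  # configured as a candidate (False in 'manual' mode)
    is_candidate: bool      # may stay True for one term after a manual promote

    def on_leader_death(self) -> bool:
        """Return True if a new election (and a term bump) is started."""
        # The fix: consult the configuration, not the temporary per-term flag.
        if not self.is_cfg_candidate:
            return False
        self.term += 1
        return True


# A manually promoted node: candidate for this term only, not by configuration.
promoted = RaftNode(term=3, is_cfg_candidate=False, is_candidate=True)
started = promoted.on_leader_death()

# A node configured as a candidate still starts elections as before.
configured = RaftNode(term=3, is_cfg_candidate=True, is_candidate=True)
configured_started = configured.on_leader_death()
```

In this model the manually promoted node leaves the term untouched, so the extra bump from 3 to 4 seen in the error message cannot come from it.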
-
Serge Petrenko authored
There is a false assertion in raft_stop_candidate(): it assumes that the node must always have a running timer whenever it sees the leader. This is not true when the node is busy writing the new term on disk. Cover the mentioned case in the assertion. Closes #8169 NO_DOC=bugfix Co-authored-by: Sergey Ostanevich <sergos@tarantool.org>
-
Sergey Bronnikov authored
Commit 2be74a65 ("test/cmake: add a function for generating unit test targets") added a function for generating unit test targets in CMake. This function makes code simpler and less error-prone. Proposed patch adds a similar function for generating performance test targets in CMake. NO_CHANGELOG=build infrastructure updated NO_DOC=build infrastructure updated NO_TEST=build infrastructure updated
-
Sergey Bronnikov authored
Commit 2be74a65 ("test/cmake: add a function for generating unit test targets") added a function for generating unit test targets in CMake. This function makes code simpler and less error-prone. Proposed patch adds a similar function for generating fuzzing test targets in CMake. NO_CHANGELOG=build infrastructure updated NO_DOC=build infrastructure updated NO_TEST=build infrastructure updated
-
Sergey Bronnikov authored
Commit 2be74a65 ("test/cmake: add a function for generating unit test targets") added a function for generating unit test targets in CMake. However, due to a wrong scope UNIT_TEST_TARGETS remains empty after generating unit tests. Patch fixes that. NO_CHANGELOG=build infrastructure NO_DOC=build infrastructure NO_TEST=build infrastructure
-
Vladimir Davydov authored
We define flight_recorder_cfg struct, box_get_flightrec_cfg function, and box.internal.cfg_configure_flightrec Lua function in the CE repository although they are actually needed only in the EE repository. Let's drop them all from the CE repository and instead define stub functions box_check_flightrec and box_set_flightrec that would check/apply box.cfg flight recorder parameters. While we are at it, add missing comments to flightrec function stubs. NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Yaroslav Lobankov authored
The approach

    - name: Set output
      run: echo "::set-output name={name}::{value}"

is deprecated [1]. Switching to the new approach:

    - name: Set output
      run: echo "{name}={value}" >> $GITHUB_OUTPUT

[1] https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/

NO_DOC=ci
NO_TEST=ci
NO_CHANGELOG=ci
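For steps that are scripted rather than written as a shell one-liner, the same append can be sketched in Python. The helper name `set_output` and the output name `build_type` are hypothetical; the only thing taken from the commit is that a `{name}={value}` line is appended to the file named by `$GITHUB_OUTPUT`. The sketch handles single-line values only.

```python
import os
import tempfile


def set_output(name: str, value: str) -> None:
    """Append a single-line `name=value` pair to the file named by $GITHUB_OUTPUT."""
    path = os.environ["GITHUB_OUTPUT"]
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{name}={value}\n")


if "GITHUB_OUTPUT" not in os.environ:
    # Outside of GitHub Actions: point the variable at a scratch file
    # so that the sketch can be run locally.
    fd, scratch = tempfile.mkstemp()
    os.close(fd)
    os.environ["GITHUB_OUTPUT"] = scratch

set_output("build_type", "debug")  # hypothetical output name and value
```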
-
Serge Petrenko authored
The title is pretty self-explanatory. That's all this commit does. Now a couple of words on why this is needed.

Commit 2a0c4f2b ("replication: make replica subscribe to master's ballot") changed replica connect behaviour: instead of holding a single connection to the master, replica may have two: master's ballot retrieval is now performed in a separate connection owned by a separate fiber called ballot_watcher. First connection to master is initialized as always and then applier fiber creates the ballot_watcher, which connects to the same address on its own.

This led to some unexpected consequences: random cartridge integration tests started failing with the following error:

tarantool/tarantool/cartridge/test-helpers/cluster.lua:209: "localhost:13303": Replication setup failed, instance orphaned

Here's what happened. Cartridge has a module named remote control. The module mimics a tarantool server and "listens" on the same socket the tarantool is intended to listen on before box.cfg{listen=...} is called. For example, one can see such output in tarantool logs with cartridge:

NO_WRAP
13:07:43.210 [10265] main/132/applier/admin@localhost:13301 I> remote master 46a71a25-4328-4a41-985d-d93d6ed7fb7f at 127.0.0.1:13301 running Tarantool 2.11.0
13:07:43.210 [10265] main/133/applier/admin@localhost:13302 I> remote master 00000000-0000-0000-0000-000000000000 at 127.0.0.1:13302 running Tarantool 1.10.0
13:07:43.210 [10265] main/134/applier/admin@localhost:13303 I> remote master bcce45ad-38b7-4d8a-936a-133614a7775f at 127.0.0.1:13303 running Tarantool 2.11.0
NO_WRAP

The second "Tarantool" in the output (with zero instance uuid and running Tarantool 1.10.0) is the remote control on an unconfigured tarantool instance.

Before splitting applier connection in two, this was no problem: applier would try to get the instance's ballot from a remote control listener and fail (remote control doesn't answer replication requests). Applier would retry connecting to the same address until it got a reply, meaning that remote control had stopped and the real tarantool had started listening on the socket.

Now applier has two connections, and the following situation became possible: when applier connection is initialized, remote control is still working, and applier is connected to the remote control instance. Applier performs ballot receipt in a separate fiber, which is not yet initialized, so no errors are raised. As soon as applier creates the ballot watcher, remote control is stopped and the real tarantool starts listening on the socket. This means that no error happens in the ballot watcher either (normal tarantool answers replication requests, of course).

And we get to an unhandled situation when applier itself is connected to the (already dead) remote control instance, while its ballot watcher is connected to the real tarantool. As soon as applier sees the ballot is fetched, it continues the connection process to the already dead remote control instance and gets an error:

NO_WRAP
13:07:44.214 [10265] main/133/applier/admin@localhost:13302 I> failed to authenticate
13:07:44.214 [10265] main/133/applier/admin@localhost:13302 coio.c:326 E> SocketError: unexpected EOF when reading from socket, called on fd 1620, aka 127.0.0.1:54150: Broken pipe
13:07:44.214 [10265] main/133/applier/admin@localhost:13302 I> will retry every 1.00 second
13:07:44.214 [10265] main/115/remote_control/127.0.0.1:50242 C> failed to synchronize with 1 out of 3 replicas
13:07:44.214 [10265] main/115/remote_control/127.0.0.1:50242 I> entering orphan mode
NO_WRAP

Follow-up #5272
Closes #8185

NO_CHANGELOG=not user-visible
NO_DOC=not user-visible (can't create Tarantool with zero uuid)
-
- Jan 23, 2023
-
-
Georgiy Lebedev authored
When we rollback a prepared statement that deletes an MVCC story, we need to reset the deleted story's PSN. Closes #7930 NO_DOC=bugfix
-
Georgiy Lebedev authored
During transaction rollback, we unconditionally assign a PSN to it: we should do this only when necessary, i.e., a transaction is RW and is not already prepared. Needed for #7930 NO_CHANGELOG=refactoring NO_DOC=refactoring NO_TEST=refactoring
-
Georgiy Lebedev authored
Currently, if transaction preparation fails, the transaction is left in an inconsistent state: it has a PSN assigned to it, but its status is not 'prepared' — fix this by resetting its PSN. Needed for #7930 NO_CHANGELOG=refactoring NO_DOC=refactoring NO_TEST=refactoring
-
Georgiy Lebedev authored
During preparation of insert statements in MVCC, we define an old story and abort all transactions that delete this story. If there exists an older story in the history chain, but the story is deleted by a prepared (not necessarily committed) transaction, we consider that it de-facto does not exist anymore — this logic is consistent, since during preparation of the transaction deleting this story, the conflict resolution described above was already done. In this manner, there can be no more than one prepared statement deleting a story at any point in time. Closes #8104 NO_DOC=bugfix
-
Sergey Bronnikov authored
Changelog: https://curl.se/changes.html#7_87_0 New release contains fixes for 7 security problems [1]: https://curl.se/docs/CVE-2022-35252.html https://curl.se/docs/CVE-2022-32221.html https://curl.se/docs/CVE-2022-35260.html https://curl.se/docs/CVE-2022-42915.html https://curl.se/docs/CVE-2022-42916.html https://curl.se/docs/CVE-2022-43551.html https://curl.se/docs/CVE-2022-43552.html Patch adds a new option ENABLE_WEBSOCKETS defined in curl build infrastructure with its default value used in 7.87.0. 1. https://curl.se/docs/releases.html NO_DOC=libcurl submodule bump NO_TEST=libcurl submodule bump Closes #8150
-
Sergey Bronnikov authored
Curl 7.87 uses CMake keywords (for example GREATER_EQUAL [0]) available since CMake 3.7. However, we still support an old Ubuntu version where a CMake version lower than 3.7 is used. This patch adds a script that enables a CMake repository with newer CMake packages for Ubuntu 16.04 (Xenial) and bumps the required version of CMake. NOTE (regarding cmake3 package): Commit 1a62d874 ("build: update CMake minimum version to 3.1") [1] added an additional package requirement with "cmake3". This package was created in addition to package "cmake", because CMake 3 had breaking changes [2]. Package "cmake3" has been provided only in Ubuntu 14.04 LTS (Trusty Tahr) [4], which will be EOLed in 2024, and CentOS 7, which was EOLed in Aug 2020 and will have end of security support in Jun 2024 [5]. The latest version of package "cmake3" for Ubuntu 14.04 is 3.5.1 [3], so it is not worth bumping the version of cmake3 in requirements and I left it the same. 0. https://cmake.org/cmake/help/latest/command/if.html#greater-equal 1. https://github.com/tarantool/tarantool/commit/1a62d874db5f4780da5b35b6d4d0e3a296148920 2. https://cmake.org/cmake/help/latest/release/3.0.html#id4 3. https://launchpad.net/ubuntu/trusty/+package/cmake3 4. https://ubuntu.com/about/release-cycle 5. https://wiki.centos.org/About/Product Needed for #8150 NO_CHANGELOG=see the next commit NO_DOC=libcurl submodule bump NO_TEST=libcurl submodule bump
-
- Jan 20, 2023
-
-
Gleb Kashkin authored
Before the change, all tarantool.compat options' tests were running on the same instance, and because luatest uses multiple threads, one test could change the compat configuration while another test was checking the default behavior. This could cause such tests to flake. Now each option's tests are run in a separate instance via server:exec(). Closes #8033 NO_DOC=test fix NO_CHANGELOG=test fix
-
- Jan 19, 2023
-
-
Vladislav Shpilevoy authored
In a few places visible to users and in iproto naming the term "cluster" really means "replicaset". One of those places is a part of public API - box.iproto.key.CLUSTER_UUID - which is not yet released. The commit renames "cluster" in those places as a preparation for introduction of actual "cluster", like a set of replicasets. It will start from introduction of cluster name in addition to replicaset uuid/name. There are places which still mention 'cluster', but their rename would be breaking. It will be addressed in scope of a bigger patchset. Part of #5029 NO_CHANGELOG=Was not released @TarantoolBot document Title: Rename `IPROTO_CLUSTER_UUID` to `IPROTO_REPLICASET_UUID` This is a name for one of the IProto keys. The key value doesn't change and the protocol is still backward compatible. But better rename it to `IPROTO_REPLICASET_UUID`, because in future `IPROTO_CLUSTER_UUID` will most likely mean a different thing.
-
Pavel Semyonov authored
Proofread changelogs for 2.11.0-rc, part 2 Fix grammar, punctuation, and wording NO_CHANGELOG=changelog NO_DOC=changelog NO_TEST=changelog
-
Serge Petrenko authored
The event is used by appliers as a better alternative to IPROTO_VOTE. Besides, event subscribers receive exactly the same payload as the ones sending IPROTO_VOTE. So there's no need to guard against subscription to this particular event as long as IPROTO_VOTE isn't guarded. Follow-up #5272 NO_DOC=no user-visible changes NO_CHANGELOG=no user-visible changes NO_TEST=tested by ee
-
- Jan 18, 2023
-
-
Igor Munkin authored
As a result of the commit 98fcd437 ("ci: add CMAKE_EXTRA_PARAMS to LuaJIT workflow") both <inputs.buildtype> and <inputs.GC64> parameters have become obsolete. All jobs with LuaJIT integration testing have started to use these in scope of the commit tarantool/luajit@5b53850da30e532ced976e95af1f301667a6a272 ("ci: use CMAKE_EXTRA_PARAMS in LuaJIT integration"). Hence, the value of <inputs.CMAKE_EXTRA_PARAMS> has to be used to specify the build flavor, so <inputs.buildtype> and <inputs.GC64> can be dropped later. NO_DOC=ci NO_TEST=ci NO_CHANGELOG=ci
Reviewed-by: Yaroslav Lobankov <y.lobankov@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
-
Igor Munkin authored
As a result of the commit 1eb0a696 ("ci: change runner dispatch for LuaJIT testing") <inputs.host> parameter has become obsolete. The testing workflow has been updated in scope of the commit tarantool/luajit@fcaecf8fb42ff8a35582fbd8d034eb6f3b9b5b68 ("ci: use strategy matrix for integration workflow"). Hence, the only changes required to finish the transition from <inputs.host> to <inputs.arch> + <inputs.os> are the following:
* Drop <inputs.host> parameter from the LuaJIT integration workflow
* Make both <inputs.arch> and <inputs.os> parameters obligatory
Besides, there is no need to obtain the kernel name and the machine hardware name in a separate workflow step, since all the info that needs to be passed to .test.mk is already passed via workflow inputs. Anyway, .test.mk needs to be adjusted to the values used for the new workflow parameters. NO_DOC=ci NO_TEST=ci NO_CHANGELOG=ci
Reviewed-by: Yaroslav Lobankov <y.lobankov@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
-
Ilya Verbin authored
The function remove_root_directory, which is used for obtaining module names for per-module logging, throws an error when the current working directory is `/'. Rewrite it to fix the bug and rename it to strip_cwd_from_path to make the name clearer. Closes #8158 NO_DOC=unreleased NO_CHANGELOG=unreleased
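To show why a working directory of `/` is a corner case, here is a small Python sketch of prefix stripping. It is not the actual implementation: only the name `strip_cwd_from_path` comes from the commit, and the signature and exact behavior below are assumptions made for illustration.

```python
def strip_cwd_from_path(path: str, cwd: str) -> str:
    """Return `path` relative to `cwd` when it lies inside it, else unchanged."""
    # Normalize the prefix so that it always ends with exactly one '/'.
    # For cwd == '/' this yields '/', not '//' and not an empty string.
    prefix = cwd.rstrip("/") + "/"
    if path.startswith(prefix):
        return path[len(prefix):]
    return path


in_root = strip_cwd_from_path("/app/module.lua", "/")
in_home = strip_cwd_from_path("/home/user/app/module.lua", "/home/user")
outside = strip_cwd_from_path("/home/user2/module.lua", "/home/user")
```

Building the prefix with a trailing slash also keeps `/home/user2/...` from being treated as if it were inside `/home/user`.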
-
Serge Petrenko authored
See the docbot request for details.

Closes #5272

@TarantoolBot document
Title: new `bootstrap_strategy` configuration option

Default behaviour of replica set bootstrap, replica recovery when connecting to remote nodes and replication reconfiguration is changed. The new behaviour is controlled by the option `bootstrap_strategy`, which has the default value "auto". Now the `replication_connect_quorum` configuration option has no effect, and the effective quorum value for each stage of configuration (quorum of established connections, quorum of synced nodes) is determined automatically.

On replica set bootstrap, the nodes will refuse to boot unless a majority is reached (this would mean replication_connect_quorum = 3, when #box.cfg.replication is 4 or 5, for example, or replication_connect_quorum = 2, when #box.cfg.replication is 2 or 3). Moreover, the bootstrap leader will fail to boot unless it sees that every connected node chose it as the bootstrap leader.

On new replica join to an existing cluster, the replica will fail to boot only if it couldn't connect to anyone. As long as at least one connection is established, the replica will try to join like before. Moreover, the replica will check that its box.cfg.replication table contains every registered node in the cluster, thus ensuring that it has tried to connect to everyone and chose the best bootstrap leader possible.

On replication reconfiguration on a working instance and recovery from local WAL files, the node will try to connect to everyone specified in box.cfg.replication. Any number of connections (even no connections) will be deemed a success, but the replica will stay in orphan mode until it is synced with everyone connected.

If you wish to return to the old behavior, a deprecated setting `bootstrap_strategy` = "legacy" is left for now. With `bootstrap_strategy` = "legacy", the node behaves exactly like before: quorum for both connection and synchronisation is determined by `replication_connect_quorum`, and neither bootstrap leader nor joining replicas perform any additional checks on bootstrap.
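The majority rule quoted for replica set bootstrap can be checked with a one-line Python function. The function name `bootstrap_quorum` is hypothetical; the expected values are the ones given in the doc request (3 for 4 or 5 peers, 2 for 2 or 3 peers).

```python
def bootstrap_quorum(replication_count: int) -> int:
    """Smallest strict majority of the peers listed in box.cfg.replication."""
    return replication_count // 2 + 1


# Map each peer count to the effective quorum it implies.
quorums = {n: bootstrap_quorum(n) for n in (2, 3, 4, 5)}
```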
-
Serge Petrenko authored
The only observable behaviour of non-zero replication_sync_timeout is that it delays box.cfg{replication=...} return until either the node is synced with others or the timeout passes. If the timeout passes without reaching sync, box.cfg{} is exited and the node enters "orphan" state, in which it can't write anything until either a reconfiguration happens or replicaset is finally synced.

While the previous box.cfg{} call is running (probably waiting for replication_sync_timeout), the user can't issue another box.cfg{} call. So basically, while giving no guarantees that the node exits box.cfg{} in fully synced state, the timeout makes reconfiguration harder: even if the user knows that the sync won't be achieved, he will have to wait until the full timeout passes in order to reconfigure replication.

Let's make the default value of replication_sync_timeout 0 instead of 300 seconds. The user still may set the timeout to whatever he likes. Besides, we have recently introduced box.ctl.on_recovery_state triggers, which have a "synced" event, and this is the new recommended way to wait until the node is synced with others.

Part-of #5272

@TarantoolBot document
Title: Changed default value for `box.cfg.replication_sync_timeout`

The default value for `replication_sync_timeout` configuration option was changed from 300 seconds to 0.
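The blocking behaviour described above can be pictured with a toy wait loop in Python. It is a sketch under assumptions, not Tarantool code: `wait_synced`, its parameters and the returned strings are hypothetical and only mirror the "synced or orphan after the timeout" outcome from the commit message.

```python
import time
from typing import Callable


def wait_synced(is_synced: Callable[[], bool], timeout: float,
                poll: float = 0.01) -> str:
    """Block until `is_synced()` is true or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while not is_synced():
        if time.monotonic() >= deadline:
            return "orphan"  # the caller returns without reaching sync
        time.sleep(poll)
    return "synced"


# With the new default of 0 the call never blocks on an unsynced node.
immediate = wait_synced(lambda: False, timeout=0)

# A node that becomes synced after a few polls, with a non-zero timeout.
polls = iter([False, False, True])
eventual = wait_synced(lambda: next(polls), timeout=1.0)
```

With a timeout of 300 the first call would hold the caller for five minutes before returning the same "orphan" result, which is the reconfiguration delay the commit removes by default.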
-
Serge Petrenko authored
Now the instance appends a list of registered replica set members it knows of to its ballot. Prerequisite #5272 NO_CHANGELOG=not user-visible @TarantoolBot document Title: New fields in instance's ballot. Instance's ballot (a response to IPROTO_VOTE sent on replica connect) receives two new fields: 1) The uuid of the node this instance considers the bootstrap leader. Key: IPROTO_BALLOT_BOOTSTRAP_LEADER_UUID = 0x08 Value: uuid, encoded as 36-byte string (like "bfd2b31c-b740-43e5-bf3c-28538a74c9a6"). 2) An array of registered replica set members uuids. Key: IPROTO_BALLOT_REGISTERED_REPLICA_UUIDS = 0x09 Value: a MP_ARRAY of uuids, each uuid encoded as a 36-byte string (like in an example above).
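The encoding of the two new fields can be illustrated with a hand-rolled MsgPack sketch in Python. The key values 0x08 and 0x09 and the 36-byte string form of a uuid are taken from the doc request; the helper names are hypothetical, and the sketch encodes the two fields as a standalone map, which is an assumption rather than the full ballot layout.

```python
import uuid

IPROTO_BALLOT_BOOTSTRAP_LEADER_UUID = 0x08
IPROTO_BALLOT_REGISTERED_REPLICA_UUIDS = 0x09


def mp_str(s: str) -> bytes:
    """Encode a short ASCII string as MsgPack fixstr or str 8."""
    data = s.encode("ascii")
    if len(data) <= 31:
        return bytes([0xA0 | len(data)]) + data  # fixstr
    if len(data) <= 0xFF:
        return bytes([0xD9, len(data)]) + data   # str 8
    raise ValueError("string too long for this sketch")


def mp_array_header(count: int) -> bytes:
    """Encode a MsgPack array header (fixarray or array 16)."""
    if count <= 15:
        return bytes([0x90 | count])                     # fixarray
    if count <= 0xFFFF:
        return bytes([0xDC]) + count.to_bytes(2, "big")  # array 16
    raise ValueError("array too long for this sketch")


def encode_new_ballot_fields(leader: uuid.UUID, members: list) -> bytes:
    """Encode only the two new fields as a 2-entry map (illustration only)."""
    out = bytes([0x82])  # fixmap with two entries
    # Keys below 0x80 are encoded as single-byte positive fixints.
    out += bytes([IPROTO_BALLOT_BOOTSTRAP_LEADER_UUID]) + mp_str(str(leader))
    out += bytes([IPROTO_BALLOT_REGISTERED_REPLICA_UUIDS])
    out += mp_array_header(len(members))
    for member in members:
        out += mp_str(str(member))
    return out


leader = uuid.UUID("bfd2b31c-b740-43e5-bf3c-28538a74c9a6")
members = [leader, uuid.uuid4()]
payload = encode_new_ballot_fields(leader, members)
```

A 36-byte string does not fit into a fixstr (31 bytes at most), so each uuid costs 38 bytes on the wire: the `0xd9` marker, the length byte `0x24`, and the 36 characters.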
-
Serge Petrenko authored
Note that bootstrap leader uuid is not set when an anonymous replica registers, because technically it's not performing a bootstrap. Prerequisite #5272 NO_DOC=appended to next commit's doc request NO_CHANGELOG=not user-visible
-
Serge Petrenko authored
Previously replicas chose the remote master to boot from by comparing master ballots, which are received in response to IPROTO_VOTE request right on connection init. Such information is not enough in some scenarios. For example, when implementing anonymous replicas and retrying replica join, we had to restart all connections in order to get the latest ballot information.

Let's change that: make replica subscribe to the built-in "internal.ballot" event instead of relying on request-response scheme of IPROTO_VOTE. Now replicas always have up-to-date ballot information and there is no need to reinitialize connections to update the ballots.

Introduce a new fiber running in tx thread for this purpose: applier ballot watcher. The fiber subscribes to the "internal.ballot" event and watches it all the time while the connection to master is alive. In case the master isn't aware of IPROTO_WATCH request or of "internal.ballot" event, old behaviour is also implemented: ballot watcher simply waits for IPROTO_VOTE response and exits.

The ballot watcher is started whenever replica tries to connect or reconnect to the remote master and is cancelled whenever its parent connection to the master is closed. We do not put much effort into restarting the fiber and retrying to connect in case it fails. For now ballot info is only used during bootstrap, and not trying to keep the fiber alive at all costs simplifies the code quite a lot.

Later on ballot subscriptions will play a more significant role in choosing the bootstrap leader: replicas will re-check remote ballots every now and then during the bootstrap leader election.

Part-of #5272

NO_CHANGELOG=internal change
NO_TEST=tested by existing replication tests
NO_DOC=internal change
-
Serge Petrenko authored
Extract common connection initialization code in a helper. It'll be used in the next commit by auxiliary fibers connecting to the same master. Part-of #5272 NO_CHANGELOG=refactoring NO_TEST=refactoring NO_DOC=refactoring
-
Serge Petrenko authored
Extract ballot body decoding logic from xrow_decode_ballot; it will be reused to decode the "internal.ballot" event in the next commit. Prerequisite #5272 NO_CHANGELOG=refactoring NO_TEST=refactoring NO_DOC=refactoring
-
Serge Petrenko authored
Add a new builtin event carrying instance's ballot information (that is, what this instance would normally send in reply to IPROTO_VOTE request). The event will be watched by connecting replicas to find the bootstrap leader. Prerequisite #5272 NO_DOC=technically user-visible, but not intended for users NO_CHANGELOG=see NO_DOC
-
Serge Petrenko authored
In-scope-of #5272 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring
-
Serge Petrenko authored
luatest_helpers/cluster module was recently added to luatest trunk and renamed to replica_set. Let's update its name everywhere in gh_6260_add_builtin_events_test, since this test will be amended in the following commits and the new module name will be used. In-scope-of #5272 NO_DOC=refactoring NO_CHANGELOG=refactoring
-
Serge Petrenko authored
box.iproto table with iproto features and constants was exported to Lua in commit fe89aabe ("box: export IPROTO constants and features to Lua"). Add the table to the whitelist of what's available even before box.cfg. Prerequisite #5272 Closes #8053 NO_DOC=intermediate state wasn't released, no changes necessary NO_CHANGELOG=see NO_DOC NO_TEST=used in next commit's tests
-
Serge Petrenko authored
Extract mp_sizeof_ballot_max() and mp_encode_ballot() helpers from iproto_reply_vote(), since they will be used by builtin "internal.ballot" event soon. While I'm at it, fix mp_sizeof_ballot() calculation: add the forgotten map element and replace mp_sizeof_uint(UINT*_MAX) with sizes of actual values to be encoded. Prerequisite #5272 NO_CHANGELOG=refactoring NO_TEST=refactoring NO_DOC=refactoring
-