- Jul 19, 2021
-
-
VitaliyaIoffe authored
The build for Fedora 34 fails because of uninitialized variables in a few places. For example:
[100%] Built target merger.test
/source/build/usr/src/debug/tarantool-2.9.0.116/src/box/sql.c: In function 'tarantoolSqlNextSeqId':
/source/build/usr/src/debug/tarantool-2.9.0.116/src/box/sql.c:1186:13: error: 'key' may be used uninitialized [-Werror=maybe-uninitialized]
 1186 | if (box_index_max(BOX_SEQUENCE_ID, 0 /* PK */, key,
Needed for: #6074
-
- Jul 16, 2021
-
-
We use the `-notest` postfix when we want to share the code only, without running tests. The snippet was missing from the FreeBSD template. Add it.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
-
Serge Petrenko authored
Every error that happens while master processes a join or subscribe request is sent to the replica for better diagnostics. This could lead to the following situation with the TimedOut error: it could be written on top of a half-written row and make the replica stop replication with an ER_INVALID_MSGPACK error. The error is unrecoverable, and the only way to resume replication after it happens is to reset box.cfg.replication.
Here's what happened:
1) Replica is under heavy load, meaning its event loop is occupied by some fiber not yielding control to others.
2) Applier and other fibers aren't scheduled while the event loop is blocked. This means applier doesn't send heartbeat messages to the master and doesn't read any data coming from the master.
3) The unread master's data piles up: first in replica's receive buffer, then in master's send buffer.
4) Once master's send buffer is full, the corresponding socket stops being writeable and the relay yields waiting for the socket to become writeable again. The send buffer might contain a partially written row by now.
5) Replication timeout happens on master, because it hasn't heard from replica for a while. An exception is raised, and the exception is pushed to the replica's socket. Now two situations are possible:
a) The socket becomes writeable by the time the exception is raised. In this case the exception is logged to the buffer right after a partially written row. Once replica receives the half-written row with an exception logged on top, it errors with ER_INVALID_MSGPACK. Replication is broken.
b) The socket still isn't writeable (the most probable scenario). The exception isn't logged to the socket and the connection is closed. Replica eventually receives a partially written row and retries connection to the master normally.
In order to prevent case a) from happening, let's not push TimedOut errors to the socket at all. They're the only errors that could be raised while a row is being written, i.e. the only errors that could lead to the situation described in 5a.
Closes #4040
-
- Jul 15, 2021
-
-
Aleksandr Lyapunov authored
The problem was fixed in #5515, this commit just verifies that the test case works fine. Closes #6193
-
- Jul 14, 2021
-
-
Mergen Imeev authored
Prior to this patch, in some cases the type mismatch error description showed the value, and in some cases the type of the value. After this patch, both the type and the value are shown. The inconsistent types error description has also become more informative: previously it contained only the type of the value, now it contains the value and its type. Close #6176
-
Mergen Imeev authored
Prior to this patch, the type mismatch error description and the inconsistent types error description in some cases displayed type names that were different from the default ones. After this patch, all types in these descriptions are described using the default names. Part of #6176
-
Mergen Imeev authored
Currently, some values are displayed improperly in the type mismatch error description. For VARBINARY, the word "varbinary" is printed instead of the value. STRING values are printed without quotes, which can be confusing in some cases, such as when it consists of spaces. This patch introduces the following changes: 1) VARBINARY value will be printed as x'<value in hexadecimal format>'. 2) STRING value will be printed in single quotes. 3) UUID value will be printed in single quotes. UUID value does not need to be enclosed in single quotes, since there are no literals for UUIDs, but it looks more convenient. Part of #6176
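The printing rules listed above can be sketched in C as follows. This is an illustration only, not the code from the patch: the helper names `fmt_string_sketch` and `fmt_varbinary_sketch` are hypothetical, and the use of uppercase hex digits is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: render a STRING value in single quotes. */
static int
fmt_string_sketch(char *out, size_t out_size, const char *str)
{
	return snprintf(out, out_size, "'%s'", str);
}

/*
 * Hypothetical helper: render VARBINARY bytes as x'<hex digits>'.
 * Returns the number of characters written, or -1 if out is too small.
 * Uppercase hex digits are an assumption of this sketch.
 */
static int
fmt_varbinary_sketch(char *out, size_t out_size,
		     const unsigned char *data, size_t len)
{
	/* Room for "x'", two hex digits per byte, "'" and the NUL. */
	if (out_size < 2 * len + 4)
		return -1;
	char *p = out;
	p += sprintf(p, "x'");
	for (size_t i = 0; i < len; i++)
		p += sprintf(p, "%02X", data[i]);
	p += sprintf(p, "'");
	return (int)(p - out);
}
```

With quoting, a STRING made of a single space is rendered as `' '` instead of an invisible character, which is the confusing case the message mentions.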
-
Mergen Imeev authored
STRING, MAP, and ARRAY values that are too long can make the type mismatch error description less descriptive than necessary. This patch truncates values that are too long and adds "..." to indicate that the value has been truncated. Part of #6176
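A minimal sketch of the truncation idea, assuming a plain NUL-terminated string. The function name `truncate_value_sketch` and the `max_len` limit are hypothetical; the actual limit used by the patch is not stated in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into out, keeping at most max_len characters of it.
 * A longer value is cut and "..." is appended to mark the truncation.
 * max_len is a hypothetical limit chosen by the caller.
 */
static void
truncate_value_sketch(char *out, size_t out_size, const char *src,
		      size_t max_len)
{
	size_t len = strlen(src);
	/* Worst case: max_len characters, "..." and the terminating NUL. */
	assert(out_size >= max_len + 4);
	if (len <= max_len) {
		memcpy(out, src, len + 1);
		return;
	}
	memcpy(out, src, max_len);
	memcpy(out + max_len, "...", 4);
}
```

For example, with `max_len` set to 5 the value `abcdefghij` becomes `abcde...`, while `abc` is left unchanged.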
-
- Jul 12, 2021
-
-
Vladislav Shpilevoy authored
box_promote(), when called manually, used to wait for the existing transactions from a foreign limbo to end during a timeout, giving them a chance to end on their own terms. The waiting was done via polling like
while (!done) sleep(small_timeout);
Polling is almost always bad for both execution time and CPU usage. The patch replaces it with proper waiting based on events happening in the limbo. Closes #5190
-
Sergey Bronnikov authored
Updated the third_party/zstd submodule from v1.4.8 to v1.5.0. The changelog includes many changes, including improved (de)compression speed and decompression ratio, see [1]. However, performance evaluation of reading a Tarantool snapshot on start hasn't shown improvements [2].
1. https://github.com/facebook/zstd/releases/tag/v1.5.0
2. https://github.com/tarantool/tarantool/pull/6090
-
Aleksandr Lyapunov authored
The problem was fixed in #6131, this commit just verifies that the test case works fine. Closes #5892
-
- Jul 09, 2021
-
-
Andrey Kulikov authored
Fix build errors on arm64 with backtraces being enabled. Fixes #6142 See also: - https://github.com/libunwind/libunwind/pull/221 - #5471 - #6142
-
Aleksandr Lyapunov authored
The problem was fixed in #5515, this commit just verifies that the test case works fine. Closes #6137
-
Aleksandr Lyapunov authored
There was a serious problem in txm: index_id from struct index was used as an index in some arrays (for example, in the array of links in stories). As a result, if a user had created an index with a non-sequential ID, the array access would have been out of range, which could lead to a segfault. This patch makes use of indexes directly, and when it comes to array access, dense_id is used, which fits perfectly for that. As a part of #5515, this patch makes the cases in it at least stable. Part of #5515
-
Aleksandr Lyapunov authored
Historically, an index of a space may be accessed by iid (index ID), that is, the ID set in the index definition, or by sequential ID, that is, a number in [0..space->index_count]. In other words, a space holds two arrays of indexes: 1) sparse (by iid) and 2) dense (by sequential ID). Since an instance of an index belongs to one and only one space, any index implicitly has this sequential ID. We can simply save this ID in the index and distinguish indexes by it too. We could call this member 'sequential_id', but that name is too general, while dense_id directly refers to the dense array of a space. Part of #5515
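The difference between the two IDs can be sketched as below. The structures are simplified stand-ins with hypothetical names, not the real `struct index` and `struct space`; only the idea of storing the dense position in the index comes from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SKETCH_MAX_INDEXES = 4 };

/* Simplified stand-in for an index; field set is illustrative. */
struct index_sketch {
	uint32_t iid;      /* ID from the index definition, may be sparse */
	uint32_t dense_id; /* position in the space's dense array */
};

/* Simplified stand-in for a space holding a dense array of indexes. */
struct space_sketch {
	uint32_t index_count;
	struct index_sketch *index[SKETCH_MAX_INDEXES];
};

/* Append an index to the dense array and remember its position. */
static int
space_sketch_add_index(struct space_sketch *space, struct index_sketch *idx)
{
	if (space->index_count >= SKETCH_MAX_INDEXES)
		return -1;
	idx->dense_id = space->index_count;
	space->index[space->index_count++] = idx;
	return 0;
}

/*
 * Per-index data sized by index_count must be addressed by dense_id.
 * Addressing it by iid would run out of range for sparse IDs.
 */
static int *
link_for_index_sketch(int *links, const struct index_sketch *idx)
{
	return &links[idx->dense_id];
}
```

In this sketch an index created with `iid = 10` as the second index of a space gets `dense_id = 1`, so a two-element per-index array is accessed within bounds.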
-
Aleksandr Lyapunov authored
Before this patch, the garbage collector was executed right before allocation of a new story. That means that, for example, in memtx_tx_history_add_stmt GC could be called a couple of times. The garbage collector is free to delete stories if they are no longer used. Removing a story can cause an index modification with a subsequent tuple delete. For example, imagine a space with one index, where one tuple {1, 1, 1} is placed. Then a transaction comes, deletes that tuple and commits. At this moment the tuple {1, 1, 1} can still be in the index, marked as 'dirty' and having a corresponding story, which states that the tuple is deleted. This is a valid situation, even a necessary one, for the case when another transaction is in a read view and must see that {1, 1, 1} is not yet deleted. But when possible, GC would try to delete the story and remove the tuple from the index. Now imagine that this GC happens when a new transaction inserts, for example, {1, 1, 1, 4}. In memtx_tx_history_add_stmt the new tuple replaces the old one in the index, but the story of the new tuple is not created yet. Then the new story is created, which causes GC, which tries to remove {1, 1, 1} from the index and delete it from memory. At this moment memtx_tx_history_add_stmt relies on the existence of {1, 1, 1}, which doesn't exist anymore. That is an example of a general problem: a cleanup should not be done in the middle of a complex function that can have some half-made, invalid intermediate state. The cleanup, including GC, should be done at the end of functions. This patch moves story GC to the end of the functions that use it. Part of #5515
-
- Jul 08, 2021
-
-
Cyrill Gorcunov authored
When a new raft message comes in from the network, we need to be sure that the payload is suitable for processing. In particular, `raft_msg::state` must be valid because our code logic depends on it. For this sake, make `raft_msg::state` a uint64_t, which makes verification of the state field easier. At the same time, use panic() instead of the unreachable() macro, because the test for a valid state must be enabled all the time. Closes #6067
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
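A minimal sketch of the validation idea, under stated assumptions: the enum values and all names with the `sketch` prefix are illustrative, and `abort()` stands in for an always-enabled failure path. Keeping the decoded value as `uint64_t` means an out-of-range number reaches the check intact instead of being narrowed first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative state values; the real enum lives in the raft sources. */
enum sketch_raft_state {
	SKETCH_RAFT_FOLLOWER = 1,
	SKETCH_RAFT_CANDIDATE = 2,
	SKETCH_RAFT_LEADER = 3,
	sketch_raft_state_MAX,
};

/* Range check done on the full 64-bit value decoded from the network. */
static bool
sketch_raft_state_is_valid(uint64_t state)
{
	return state >= SKETCH_RAFT_FOLLOWER && state < sketch_raft_state_MAX;
}

static const char *
sketch_raft_state_str(uint64_t state)
{
	switch (state) {
	case SKETCH_RAFT_FOLLOWER:
		return "follower";
	case SKETCH_RAFT_CANDIDATE:
		return "candidate";
	case SKETCH_RAFT_LEADER:
		return "leader";
	}
	/*
	 * Always-on failure path: unlike an assert-style macro, this is
	 * not compiled out in release builds.
	 */
	fprintf(stderr, "unexpected raft state %llu\n",
		(unsigned long long)state);
	abort();
}
```

A value such as `(1ULL << 32) | 1` is rejected here, although it would look like a valid state after truncation to 32 bits.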
-
- Jul 07, 2021
-
-
Alexander Turenko authored
The main reason for writing this test is to learn how those serializer helpers work. It may also be a good base for future test cases. Part of #3228
-
Alexander Turenko authored
And added a general comment about the compilation unit. Everything is to simplify reading. Part of #3228
-
Alexander Turenko authored
It is a non-static function, so it looks logical to have the API comment in the header. Made a few code style fixes while I'm here. Part of #3228
-
Alexander Turenko authored
It is easier to glance at tightly coupled structures and functions when they're not mixed with others. Just move without actual changes. Part of #3228
-
- Jul 05, 2021
-
-
Aleksandr Lyapunov authored
With MVCC a case may happen: TX1 does something with some space and yields. TX2 deletes the space and commits. TX1 rolls back. The problem was that TX1 does something with already deleted space. This commit fixes that. Part of #6140
-
Aleksandr Lyapunov authored
That's a good practice in general. In particular, it makes the test case from #6140 stable. Part of #6140
-
Aleksandr Lyapunov authored
The problem occurred when the MVCC engine was enabled and a transaction that had been sent to a read view due to a conflict tried to read the key that was the cause of the conflict. Closes #6131
-
Alexander V. Tikhonov authored
Updated:
box/net.box_reconnect_after_gh-3164.test.lua gh-5081
replication/errinj.test.lua gh-3870
replication/qsync_basic.test.lua gh-5355
replication/anon.test.lua gh-5381
replication/status.test.lua gh-5409
replication/election_qsync.test.lua gh-5430
Added new:
box-py/iproto.test.py gh-qa-132
replication/gh-5435-qsync-clear-synchro-queue-co> gh-qa-129
replication/gh-5445-leader-inconsistency.test.lua gh-qa-129
replication/gh-3055-election-promote.test.lua gh-qa-127
replication/election_basic.test.lua gh-qa-133
-
- Jul 02, 2021
-
-
mechanik20051988 authored
The static buffer that holds the snapshot filename is reused later in the `xlog_cursor_open` function. So when we log the name after that call, we get a corrupted name that has nothing to do with the real one. We should use `cursor.name` instead.
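The pitfall can be reproduced with a small sketch. Everything here is a hypothetical stand-in (`sketch_format_filename`, `struct sketch_cursor`, `sketch_cursor_open`); it only illustrates why a pointer into a shared static buffer goes stale while a private copy such as `cursor.name` stays correct.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for a helper that returns a pointer to a shared static buffer. */
static const char *
sketch_format_filename(const char *dir, long long signature, const char *ext)
{
	static char buf[256];
	snprintf(buf, sizeof(buf), "%s/%020lld%s", dir, signature, ext);
	return buf;
}

/* Stand-in for a cursor that keeps its own copy of the file name. */
struct sketch_cursor {
	char name[256];
};

static void
sketch_cursor_open(struct sketch_cursor *cursor, const char *filename)
{
	/* Private copy: it survives any later reuse of the static buffer. */
	snprintf(cursor->name, sizeof(cursor->name), "%s", filename);
	/* Simulates internal code that formats another name in the same buffer. */
	sketch_format_filename("other", 0, ".xlog");
}
```

After `sketch_cursor_open()` returns, the pointer obtained from the first `sketch_format_filename()` call refers to the overwritten buffer, so only `cursor->name` is safe to log.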
-
mechanik20051988 authored
The test checked that the amount of valid data in a truncated snapshot is less than in a snapshot with garbage written to it. This is often true, but it depends on where the truncation or the garbage is located. Removed this check: the snapshot size differs slightly between systems and between test runs, so the check does not pass every time. Follow-up #5422
-
Nikita Pettik authored
In tree_iterator_start() it was assumed that an iterator always contains a valid space id. However, ephemeral spaces are known to have zero space id. So if we are starting an iterator which belongs to an ephemeral space, we can't simply find that space in the space cache. Moreover, we don't need to track ephemeral spaces in MVCC at all, since they can be accessed only via pointers and their lifespan is restricted to SQL query execution. So let's skip any MVCC-related routine while starting such an iterator. Closes #6095
-
Aleksandr Lyapunov authored
Follow up #6147
-
Vladimir Davydov authored
Follow-up 29e2931c ("vinyl: fix race between compaction and gc of dropped LSM").
-
Alexander V. Tikhonov authored
Odroid is a GNU/Linux ARM64 platform. This commit adds new GitHub Actions workflows for testing Tarantool on Odroid hosts:
Release: .github/workflows/odroid_arm64.yml
Debug: .github/workflows/odroid_debug_arm64.yml
Introduced new targets in the .travis.mk Makefile:
deps_odroid: Installs required dependencies.
build_odroid: Builds Tarantool with the following flags set in env of the .github/workflows/odroid_debug_arm64.yml file:
1. to avoid the issue #6142: -DENABLE_BACKTRACE=OFF
2. to avoid the issue #6143: -DCMAKE_C_FLAGS="-Wno-type-limits " -DCMAKE_BUILD_TYPE=Debug
test_odroid: Builds and tests the `LuaJIT-test` suite on Odroid.
Also, the v1 version of the GitHub checkout action is used, because action version v2 requires git version 2.18.0 or newer [1], while the latest available version on Odroid is the following:
git is already the newest version (1:2.17.1-1ubuntu0.8).
[1]: https://github.com/actions/checkout#readme
Closes tarantool/tarantool-qa#121
-
- Jul 01, 2021
-
-
Vladimir Davydov authored
An LSM tree (space index, that is) can be dropped while compaction is in progress for it. In this case compaction will still commit the new run to vylog upon completion. This usually works fine, but not if gc has already purged all the information about the dropped LSM tree from vylog by that time, in which case an attempt to commit the new run will result in a permanently broken vylog (because compaction will write vylog records for a non-existing object):
ER_INVALID_VYLOG_FILE: Invalid VYLOG file: Slice 13 deleted but not registered
To prevent this from happening, let's make compaction silently drop the new run without committing it to vylog if the LSM tree has been dropped. This should work just fine: since the LSM tree isn't used anymore, we don't need to have it compacted, nor do we need to delete the run, since gc will eventually clean up all artefacts left from the dropped LSM tree. One thing to be noted is that we must also exclude dropped LSM trees from further compaction. If we don't do that, we might end up picking the dropped LSM tree for compaction over and over again (because it isn't actually compacted). This patch also drops the gh-5141-invalid-vylog-file test, because the latter just ensured that the issue fixed by this patch is there. Closes #5436
-
Egor Elchinov authored
Now idle fibers are present in fiber.info() but without their stacks. Added test ensuring that fiber.info doesn't get cluttered by idle fibers stacks after dispatching multiple requests in short time. Closes #4235
-
Egor Elchinov authored
In some cases it's good to have an opportunity to detect if fiber is idle in a fiber_pool. Now this can be done as fiber->flags & FIBER_IS_IDLE. Needed for: #4235
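The flag test mentioned above is an ordinary bit check. A sketch with stand-in types follows; the bit values and every name carrying the `sketch` prefix are illustrative, and the real `FIBER_IS_IDLE` constant is defined in the fiber sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
	/* Bit values are illustrative, not the real ones. */
	SKETCH_FIBER_IS_CANCELLABLE = 1 << 0,
	SKETCH_FIBER_IS_IDLE = 1 << 1,
};

/* Stand-in for struct fiber with only the flags field. */
struct sketch_fiber {
	uint32_t flags;
};

/* Mark or unmark a pooled fiber as waiting for work. */
static void
sketch_fiber_set_idle(struct sketch_fiber *f, bool idle)
{
	assert(f != NULL);
	if (idle)
		f->flags |= SKETCH_FIBER_IS_IDLE;
	else
		f->flags &= ~(uint32_t)SKETCH_FIBER_IS_IDLE;
}

/* The check from the message: fiber->flags & FIBER_IS_IDLE. */
static bool
sketch_fiber_is_idle(const struct sketch_fiber *f)
{
	assert(f != NULL);
	return (f->flags & SKETCH_FIBER_IS_IDLE) != 0;
}
```

Setting and clearing the idle bit leaves the other flags untouched, so the check can be combined freely with existing flag tests.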
-
- Jun 24, 2021
-
-
VitaliyaIoffe authored
Make it possible to save packages in S3 buckets. Closes: #5825
-
VitaliyaIoffe authored
Add ubuntu-hirsute workflow, which runs on push and pull-requests. Fix lintian globbing-patterns-out-of-order warnings. Part of: #5825
-
VitaliyaIoffe authored
The build became out-of-source after the patch 781fd38, which deleted the path of the source dir, so the __FILE__ macro leads to a compilation failure on ubuntu_21_04. Change __FILE__ to the file path. Needed for: #5825
-
- Jun 23, 2021
-
-
Cyrill Gorcunov authored
We already have the `box.replication.upstream.lag` entry for monitoring. At the same time, in synchronous replication, timeouts are key properties of the quorum gathering procedure. Thus we would like to know how long it took a transaction to traverse the `initiator WAL -> network -> remote applier -> initiator ACK reception` path. Typical output is
| tarantool> box.info.replication[2].downstream
| ---
| - status: follow
|   idle: 0.61753897101153
|   vclock: {1: 147}
|   lag: 0
| ...
| tarantool> box.space.sync:insert{69}
| ---
| - [69]
| ...
|
| tarantool> box.info.replication[2].downstream
| ---
| - status: follow
|   idle: 0.75324084801832
|   vclock: {1: 151}
|   lag: 0.0011014938354492
| ...
Closes #5447
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
@TarantoolBot document
Title: Add `box.info.replication[n].downstream.lag` entry
`replication[n].downstream.lag` represents the lag between the moment the main node writes a certain transaction to its own WAL and the moment it receives an ack for this transaction from a replica.
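The documented meaning of the field boils down to a difference of two timestamps taken on the initiator. A minimal sketch, assuming timestamps are seconds stored as `double` and that a zero timestamp means no transaction has been reported yet; the function name is hypothetical and this is not the relay code.

```c
#include <assert.h>

/*
 * wal_write_tm: when the initiator wrote the transaction to its WAL.
 * ack_recv_tm:  when the initiator received the ack from the replica.
 * Both are seconds on the initiator's clock (assumption of this sketch).
 */
static double
sketch_downstream_lag(double wal_write_tm, double ack_recv_tm)
{
	/* Zero timestamp: nothing reported yet, so no lag to show. */
	if (wal_write_tm == 0)
		return 0;
	return ack_recv_tm - wal_write_tm;
}
```

Because both timestamps come from the same node in this sketch, the result does not depend on clock synchronization between the master and the replica.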
-
Cyrill Gorcunov authored
The applier fiber sends the current vclock of the node to the remote relay reader, pointing at the current state of fetched WAL data, so the relay knows which new data should be sent. The packet the applier sends carries the xrow_header::tm field as zero, but we can reuse it to provide information about the first timestamp in a transaction we wrote to our WAL. Since old instances of Tarantool simply ignore this field, such an extension won't cause any problems. The timestamp will be needed to account for the lag of downstream replicas, which is suitable for informational purposes and cluster health monitoring. We update applier statistics in WAL callbacks, but since both apply_synchro_row and apply_plain_tx are used not only in real data application but in the final join stage as well (in this stage we're not writing the data yet), apply_synchro_row is extended with a replica_id argument which is non-zero when the applier is subscribed. The calculation of the downstream lag itself will be addressed in the next patch, because sending the timestamp and observing it are independent actions. Part-of #5447
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Alexander V. Tikhonov authored
Checked and found that:
#4353 -> tarantool/tarantool-qa#13: engine/ddl.test.lua fixed in #6102.
#4926 -> tarantool/tarantool-qa#115: box/alter_limits.test.lua fixed in tarantool/tarantool-qa#126.
#5547 -> tarantool/tarantool-qa#50: box/net.box_schema_change_gh-2666.test.lua fixed in tarantool/tarantool-qa#126.
#5583 -> tarantool/tarantool-qa#22: box/net.box_methods_gh-3107.test.lua fixed in tarantool/tarantool-qa#126.
Closes tarantool/tarantool-qa#13
Closes tarantool/tarantool-qa#115
Closes #4926
Closes tarantool/tarantool-qa#50
Closes tarantool/tarantool-qa#22
-