- Aug 17, 2021
-
-
Vladimir Davydov authored
The test assumes that a version string looks like this: 2.9.0-123-gabcabcababc. We want to append a flow string after <major>.<minor>.<patch>. Fix the test accordingly. Needed for #6183
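The assumed version format can be illustrated with a small parser sketch (hedged: the regex, the `parse_version` name, and the flow string "alpha" below are invented for illustration and are not from the Tarantool code base):

```python
import re

# Hypothetical parser for a version string of the form
# <major>.<minor>.<patch>[-<flow>]-<rev>-g<commit>, where the optional
# alphabetic flow component may appear after <major>.<minor>.<patch>.
VERSION_RE = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)"   # major.minor.patch
    r"(?:-([A-Za-z]+))?"      # optional flow string
    r"-(\d+)-g([0-9a-f]+)$"   # revision count and abbreviated commit hash
)

def parse_version(s):
    m = VERSION_RE.match(s)
    if m is None:
        raise ValueError("unexpected version string: " + s)
    major, minor, patch, flow, rev, commit = m.groups()
    return int(major), int(minor), int(patch), flow, int(rev), commit
```

A test that hard-codes the old three-part shape would reject the flow variant; matching both shapes, as above, is the kind of relaxation the fix describes.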
-
Serge Petrenko authored
Direct upgrade support from pre-1.7.5 versions was removed in commit 7d3b80e7 ("Forbid upgrade from Tarantool < 1.7.5 and refactor upgrade.lua"). The reason for that was the mandatory space format checks introduced back then. With these checks, an old schema couldn't be recovered on new Tarantool versions, because newer versions had different system space formats. So an old schema couldn't be upgraded because it couldn't even be recovered. This was rather inconvenient: one had to perform an extra step when upgrading from, say, 1.6 to 2.x. Instead of a direct upgrade, one had to do 1.6 -> 1.10 -> 2.x, which takes twice the time. Make it possible to boot from snapshots coming from Tarantool versions 1.6.8 and above. In order to do so, introduce before_replace triggers on system spaces, which work during snapshot/xlog recovery. The triggers set tuple formats to the ones supported by the current Tarantool (2.x). This way the recovered data has the correct format for a usual schema upgrade. Also add an upgrade_to_1_7_5() handler, which finishes the transformation of the old schema to 1.7.5. The handler is fired together with the other box.schema.upgrade() handlers, so there's no user-visible behaviour change. Side note: it would be great to use the same technique to allow booting from pre-1.6.8 snapshots. Unfortunately, this is not possible. The current triggers don't break the order of schema upgrades, so 1.7.1 upgrades come before 1.7.2 and 1.7.5 ones. This works because all the upgrades in these versions replace existing tuples rather than insert new ones, so they may be handled by the before_replace triggers. Upgrade to 1.6.8 requires inserting new tuples: creating sysviews, like _vspace, _vuser and so on. This can't be done from the before_replace triggers, so we would have to run the triggers for 1.7.x first, which would allow Tarantool to recover the snapshot, and then run an upgrade handler for 1.6.8. This looks really messy. Closes #5894
-
Serge Petrenko authored
Introduce table.equals for comparing tables. The method respects the __eq metamethod, if provided. Needed for #5894 @TarantoolBot document Title: lua: new method table.equals Document the new Lua method table.equals. It compares two tables deeply. For example:
```
tarantool> t1 = {a=3}
---
...
tarantool> t2 = {a=3}
---
...
tarantool> t1 == t2
---
- false
...
tarantool> table.equals(t1, t2)
---
- true
...
```
The method respects the __eq metamethod. When both tables being compared have the same __eq metamethod, it is used for comparison (just like this is done in Lua 5.1).
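The semantics of table.equals can be mimicked outside Lua with a short sketch (hedged: the `Table` class and the `eq` hook below are invented stand-ins for Lua tables and a shared __eq metamethod):

```python
class Table:
    """Stand-in for a Lua table: plain == compares identity, as in Lua."""
    def __init__(self, **fields):
        self.fields = fields

def equals(t1, t2):
    """Sketch of table.equals: deep comparison that defers to a shared
    'eq' hook (the analogue of a common __eq metamethod) when present."""
    eq1 = getattr(t1, "eq", None)
    if eq1 is not None and eq1 is getattr(t2, "eq", None):
        return eq1(t1, t2)
    if isinstance(t1, Table) and isinstance(t2, Table):
        if t1.fields.keys() != t2.fields.keys():
            return False
        return all(equals(v, t2.fields[k]) for k, v in t1.fields.items())
    return t1 == t2
```

This mirrors the console session above: `t1 == t2` is identity-false, while `equals(t1, t2)` compares contents recursively.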
-
Serge Petrenko authored
Found the following error in our CI: Test failed! Result content mismatch: --- replication/election_basic.result Fri Aug 13 13:50:26 2021 +++ /build/usr/src/debug/tarantool-2.9.0.276/test/var/rejects/replication/election_basic.reject Sat Aug 14 08:14:17 2021 @@ -116,6 +116,7 @@ | ... box.ctl.demote() | --- + | - error: box.ctl.demote does not support simultaneous invocations | ... -- Even though box.ctl.demote() or box.ctl.promote() isn't called above the failing line, promote() is issued internally once the instance becomes the leader. Wait until previous promote is finished (i.e. box.info.synchro.queue.owner is set)
-
Serge Petrenko authored
upstream.lag is the delta between the moment when a row was written to master's journal and the moment when it was received by the replica. It's an important metric to check whether the replica has fallen too far behind master. Not all the rows coming from master have a valid time of creation. For example, RAFT system messages don't have one, and we can't assign correct time to them: these messages do not originate from the journal, and assigning current time to them would lead to jumps in upstream.lag results. Stop updating upstream.lag for rows which don't have creation time assigned. The upstream.lag calculation changes were meant to fix the flaky replication/errinj.test: Test failed! Result content mismatch: --- replication/errinj.result Fri Aug 13 15:15:35 2021 +++ /tmp/tnt/rejects/replication/errinj.reject Fri Aug 13 15:40:39 2021 @@ -310,7 +310,7 @@ ... box.info.replication[1].upstream.lag < 1 --- -- true +- false ... But the changes were not enough, because now the test may see the initial lag value (TIMEOUT_INFINITY). So fix the test as well by waiting until upstream.lag becomes < 1.
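The update rule can be sketched as follows (hypothetical names: `Upstream` and `on_row` are not the actual Tarantool structures; rows without a creation time are modeled as `row_tm == 0`):

```python
import time

TIMEOUT_INFINITY = float("inf")

class Upstream:
    """Sketch of the replica-side lag bookkeeping: the lag starts at
    TIMEOUT_INFINITY and is updated only by rows carrying a creation time."""
    def __init__(self):
        self.lag = TIMEOUT_INFINITY

    def on_row(self, row_tm):
        # Rows without a creation time (e.g. RAFT system messages) must not
        # update the lag: assigning the current time to them would make
        # upstream.lag jump around.
        if row_tm:
            self.lag = time.time() - row_tm
```

This also shows why the test had to wait: until the first timed row arrives, a reader observes the initial TIMEOUT_INFINITY, not a value below 1.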
-
- Aug 16, 2021
-
-
Vladimir Davydov authored
Before commit 954194a1 ("net.box: rewrite request implementation in C"), net.box future was a plain Lua table so that the caller could attach extra information to it. Now it isn't true anymore - a future is a userdata object, and it doesn't have indexing methods. For backward compatibility, let's add __index and __newindex fields and store user-defined fields in a Lua table, which is created lazily on the first __newindex invocation. __index falls back on the metatable methods if a field isn't found in the table. Follow-up #6241 Closes #6306
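The lazily created user-field table can be mimicked in Python (a sketch under assumptions: the `Future` class, its `result` method, and the `_user` storage attribute are invented; the real implementation uses Lua __index/__newindex metamethods on a userdata object):

```python
class Future:
    """Built-in methods live on the class (the analogue of the metatable);
    user-defined fields go into a dict created lazily on the first write,
    mirroring the __newindex/__index pair described above."""
    __slots__ = ("_user",)

    def result(self):
        return "built-in"

    def __setattr__(self, name, value):
        # __newindex analogue: create the storage table lazily.
        try:
            user = object.__getattribute__(self, "_user")
        except AttributeError:
            user = {}
            object.__setattr__(self, "_user", user)
        user[name] = value

    def __getattr__(self, name):
        # __index analogue: reached only when normal (class) lookup fails.
        try:
            user = object.__getattribute__(self, "_user")
        except AttributeError:
            raise AttributeError(name)
        try:
            return user[name]
        except KeyError:
            raise AttributeError(name)
```

Class methods resolve first, so user fields can never shadow the built-in API, which is the same precedence the __index fallback gives.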
-
Vladimir Davydov authored
It didn't yield before commit 954194a1 ("net.box: rewrite request implementation in C"). It shouldn't yield now. Follow-up #6241
-
Nikita Pettik authored
To avoid sharing metadata objects (ergo phantom reads) between different transactions in MVCC mode, let's do the following. Firstly, set an on_replace trigger on all system spaces (a content change in a system space is considered a DDL operation) which disables yields until the transaction is committed. The only exceptions are index build and space format check: during these operations yields are allowed, since they may take a while (without yields they would block execution). This is actually not a problem, because these two operations must be first-in-transaction: as a result, a transaction can't contain two yielding statements, so after any cache modification no yields take place for sure. Secondly, on committing a transaction that makes DDL changes, abort all other transactions, since they may refer to obsolete schema objects. The last restriction may seem too strict, but it is OK as a primitive workaround until transactional DDL is introduced. In fact we should only abort transactions that have read dirty (i.e. modified) objects. Closes #5998 Closes #6140 Workaround for #6138
-
- Aug 14, 2021
-
-
Aleksandr Lyapunov authored
It seems that the issue was fixed in one of the previous commits. Just add the test. No logical changes. Closes #5801
-
Aleksandr Lyapunov authored
There was a bug when a transaction made a wrong statement that was aborted because of a duplicate tuple in a primary or secondary index. The problem is that the check for an existing tuple is an implicit read with the usual side effects. This patch tracks that kind of read like ordinary reads. Part of #5999
-
Aleksandr Lyapunov authored
After the previous patch it became possible to link read trackers to in-progress stories. This patch uses one read tracker instead of a bunch of direct conflicts in tuple_clarify. This is a bit more accurate. It also allows avoiding an unnecessary conflict when a transaction reads its own change. Part of #5999
-
Aleksandr Lyapunov authored
Before this patch, when a transaction performed a write to a read gap (interval), a conflict record was created for the reader of this gap. That is wrong, since the next writer of the same value will not find a gap - the gap has been split into parts. This patch fixes that and creates a special read tracker designed specifically for further tracking of writes. This also requires the writer to search for read trackers not only in prepared stories but in in-progress stories too. Part of #5999
-
Aleksandr Lyapunov authored
There was an obvious bug in the transactional manager's GC. There can be stories about deleted tuples. In other words, tuples were deleted, but their stories remain in history for some time. That means that pointers to dirty tuples are left in indexes, while the stories say that these tuples are deleted. When GC comes, it must remove the pointers to the tuple from the indexes too. That is simple to check: if a story is on top of a chain, it must be in the index, and if it is a story about a deleted tuple, it must be removed from the index. But that story must also be unlinked from the chain, and the next story becomes the top of the chain; however, that next story (1) must not try to delete its tuple from the index - we have already done it when deleting the first tuple. For this purpose we mark the next story with space = NULL. The problem is that setting space = NULL works for every index at once, while sometimes we have to handle each index independently. Fortunately, the previous commit introduced the in_index member of the story's link, NULL by default. We can just leave that NULL in the older story as a mark that it is not in the index. This commit does so and fixes the bug. Closes #6234
-
Aleksandr Lyapunov authored
There was a tricky problem in the TX manager that could lead to a crash after deletion of a space. When a space is deleted, the TX manager uses a special callback to remove dirty tuples from indexes. That is necessary for the correct destruction of the space and its indexes. The problem is that an actual space drop works in several steps: deleting the secondary indexes and then deleting the primary index. Each step is an independent alter. And alters are tricky. For example, we had a struct space instance, namely S1, with two indexes I1 and I2. At the first step we have to delete the second index. By design, for that purpose a new instance of the space is created, namely S2, with one empty index I3. Then the spaces exchange their indexes: S1 ends up with I3 and I2, and S2 owns I1. After that S1 is deleted. That is fine until we try to make a story cleanup - all the dirty tuples remain in S2.I1, while we try to clean the empty S1.I3. The only way to fix it is to store the index pointer right in the story to make sure we are cleaning the right index. Part of #6234 Closes #6274
-
Egor Elchinov authored
MVCC used not to track hash index writes. This patch fixes the problem by transferring readers that use `ITER_ALL` or `ITER_GT` iterators of a hash index to a read view after any subsequent external write to this index. Closes #6040
-
Aleksandr Lyapunov authored
The previous commit fixed a bug that caused a dirty read, but also introduced a much less significant problem - an excess conflict in some cases. Usually, if a reader reads a tuple, a special record is stored in its story. Any write that replaces or deletes that tuple can then cause a conflict of the current transaction. The problem happened when a reader tried to execute a select from some index, but only a deleted story was found there. The record is stored, and that is good - we must know when somebody inserts a tuple into this place in the index. But actually we need to know it only for the index from which the reader executed the select. This patch introduces a special index mask in the read tracker that is used in the case above to be more precise in conflict detection. Closes #6206
-
Aleksandr Lyapunov authored
In order to preserve repeatable reads, the transactional manager tracks the reads of each transaction. Generally, reads can be of two types: those that have read a tuple and those that have found nothing. The first are stored in the tuple story, the second in special gap and hole structures. The problem was that a read that found a dirty tuple invisible to this transaction (the story says it is deleted) was stored neither in a story nor in a gap/hole. This patch fixes that. Part of #6206
-
Aleksandr Lyapunov authored
During iteration, a memtx tree index must write gap records to the TX manager. It is done in order to detect further writes to those gaps and execute some logic preventing phantom reads. There are two cases when such a gap is stored: * The iterator reads the next tuple; the gap is between two tuples. * The iterator finished reading; the gap is between the previous tuple and the key boundary. By mistake these two cases were not distinguished correctly, and that led to excess conflicts. This patch fixes it. Part of #6206
-
Aleksandr Lyapunov authored
There were several problems connected with broken pointers in tuple history. Another problem is that the code was quite huge and difficult to understand. This patch refactors all the code that is connected to lists of stories in history. A bunch of helper functions was added, and in fact these functions were carefully rewritten: * memtx_tx_history_add_stmt * memtx_tx_history_rollback_stmt * memtx_tx_history_prepare_stmt * memtx_tx_history_commit_stmt In addition to the refactoring, a couple of significant changes were made to the logic: * Now del_story in a statement points to the story of the tuple that was effectively deleted by this statement. * Conflicts in secondary indexes (previously named 'cross conflicts') are now handled transparently during statement preparation. Closes #6132 Closes #6021
-
EvgenyMekhanik authored
To fix some problems in the transaction manager, we disallow yields after a DDL operation in a TX. Thus, we can no longer perform DDL operations in streams. Needed for #5998
-
- Aug 13, 2021
-
-
mechanik20051988 authored
Implement `begin`, `commit` and `rollback` methods for the stream object in `net.box`, which allow beginning, committing and rolling back a transaction accordingly. Closes #5860 @TarantoolBot document Title: add interactive transaction support in net.box Implement `begin`, `commit` and `rollback` methods for the stream object in `net.box`, which allow beginning, committing and rolling back a transaction accordingly. Now there are multiple ways to begin, commit and rollback a transaction from `net.box`: using the appropriate stream methods, using `call` or `eval` methods, or using the `execute` method with SQL transaction syntax. Users can mix these methods; for example, start a transaction using `stream:begin()`, and commit it using `stream:call('box.commit')` or `stream:execute('COMMIT')`. A simple example of using interactive transactions via iproto from net.box:
```lua
stream = conn:new_stream()
space = stream.space.test
space_not_from_stream = conn.space.test
stream:begin()
space:replace({1})
-- returns the previously inserted tuple, because the request
-- belongs to the transaction.
space:select({})
-- empty select, because this select doesn't belong to the
-- transaction
space_not_from_stream:select({})
stream:call('box.commit')
-- now the transaction is committed, so all requests
-- return the tuple.
```
Different examples of using streams can be found in gh-5860-implement-streams-in-iproto.test.lua
-
mechanik20051988 authored
Implement interactive transactions over iproto streams. Each stream can start its own transaction, so they allow multiplexing several transactions over one connection. If any request fails during a transaction, it will not affect the other requests in the transaction. If a disconnect occurs while there is an active transaction in a stream, this transaction will be rolled back, unless it managed to commit before that moment. Part of #5860 @TarantoolBot document Title: interactive transactions were implemented over iproto streams The main purpose of streams is transactions via iproto. Each stream can start its own transaction, so they allow multiplexing several transactions over one connection. There are multiple ways to begin, commit and rollback a transaction: using IPROTO_CALL and IPROTO_EVAL with the corresponding functions (box.begin, box.commit and box.rollback), IPROTO_EXECUTE with the corresponding SQL requests ('TRANSACTION START', 'COMMIT', 'ROLLBACK'), or IPROTO_BEGIN, IPROTO_COMMIT and IPROTO_ROLLBACK accordingly. If a disconnect occurs while there is an active transaction in a stream, this transaction will be rolled back, unless it managed to commit before that moment. Add new command codes for beginning, committing and rolling back transactions: `IPROTO_BEGIN 14`, `IPROTO_COMMIT 15` and `IPROTO_ROLLBACK 16` accordingly.
-
mechanik20051988 authored
Add stream support to `net.box`. In `net.box`, a stream is an object over a connection that has the same methods, but all requests from it are sent with a non-zero stream ID. Since there can be a lot of streams, we do not copy the spaces from the connection to the stream immediately when creating a stream, but do it only on the first access to a space. Also, when updating the schema, we update the spaces lazily: each stream has its own schema_version; on access to a stream space we compare the stream schema_version with the connection schema_version, and if they differ, we clear the stream's space cache and wrap the space being accessed into the stream cache. Part of #5860 @TarantoolBot document Title: stream support was added to net.box In `net.box`, a stream is an object over a connection that has the same methods, but all requests from it are sent with a non-zero stream ID. The stream ID is generated on the client automatically. A simple example of stream creation using net.box:
```lua
stream = conn:new_stream()
-- all connection methods are valid, but send requests
-- with a non-zero stream_id.
```
-
mechanik20051988 authored
Implement streams in iproto. There is a hash table of streams for each connection. When a new request comes with a non-zero stream ID, we look for a stream with that ID in this table, and if it does not exist, we create it. The request is placed in the queue of pending requests, and if this queue was empty at the time of its receipt, the request is pushed to the tx thread for processing. When a request belonging to a stream returns to the network thread after processing is completed, we take the next request out of the queue of pending requests and send it to the tx thread for processing. If there are no pending requests, we remove the stream object from the hash table and destroy it. Requests with a zero stream ID are processed in the old way. Part of #5860 @TarantoolBot document Title: streams are implemented in iproto A distinctive feature of streams is that all requests in them are processed sequentially. The execution of the next request in a stream will not start until the previous one is completed. To separate requests belonging and not belonging to streams, we use the stream ID field of the binary iproto protocol: requests with a non-zero stream ID belong to some stream. The stream ID is unique within a connection and indicates which stream the request belongs to. For streams from different connections, the IDs may be the same.
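The queueing discipline described above can be sketched like this (hypothetical names: `Connection`, `on_request`, and the `process` callback are invented; the real code hands requests from the network thread to the tx thread rather than calling back synchronously):

```python
from collections import deque

class Connection:
    """Sketch of the per-connection stream table: requests with
    stream_id == 0 take the old path, while requests within one stream
    are processed strictly one at a time."""
    def __init__(self, process):
        self.streams = {}       # stream_id -> deque of pending requests
        self.process = process  # hands a request to the "tx thread"

    def on_request(self, stream_id, request):
        if stream_id == 0:
            self.process(request)       # old path: no ordering guarantee
            return
        queue = self.streams.setdefault(stream_id, deque())
        queue.append(request)
        if len(queue) == 1:             # queue was empty: start processing
            self.process(request)

    def on_request_done(self, stream_id):
        # Called when the tx thread finishes a stream request.
        queue = self.streams[stream_id]
        queue.popleft()
        if queue:
            self.process(queue[0])      # send the next pending request
        else:
            del self.streams[stream_id]  # no pending requests: drop stream
```

Keeping at most one in-flight request per stream is what gives streams their sequential-execution guarantee without blocking other streams on the same connection.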
-
mechanik20051988 authored
There was no check for successful memory allocation in the `new` and `clear` functions for the mhash table. If the memory was not allocated, a null pointer dereference occurred.
-
mechanik20051988 authored
For the further implementation of streams, we need to separate requests belonging and not belonging to streams. For this purpose, the stream ID field was added to the iproto binary protocol. For requests that do not belong to a stream, this field is omitted or equal to zero. For requests belonging to a stream, we use this field to determine which stream the request belongs to. Part of #5860 @TarantoolBot document Title: new field in binary iproto protocol Add a new field to the binary iproto protocol: `IPROTO_STREAM_ID 0x0a` determines whether a request belongs to a stream or not. If this field is omitted or equal to zero, the request doesn't belong to a stream.
-
- Aug 12, 2021
-
-
Serge Petrenko authored
Found the following error in our CI: [001] Test failed! Result content mismatch: [001] --- replication/gh-3055-election-promote.result Mon Aug 2 17:52:55 2021 [001] +++ var/rejects/replication/gh-3055-election-promote.reject Mon Aug 9 10:29:34 2021 [001] @@ -88,7 +88,7 @@ [001] | ... [001] assert(not box.info.ro) [001] | --- [001] - | - true [001] + | - error: assertion failed! [001] | ... [001] assert(box.info.election.term > term) [001] | --- [001] The problem was the same as in recently fixed election_qsync.test (commit 096a0a7d): PROMOTE is written to WAL asynchronously, and box.ctl.promote() returns earlier than this happens. Fix the issue by waiting for the instance to become writeable. Follow-up #6034
-
Aleksandr Lyapunov authored
Tuples are designed to store (almost) any size of msgpack data and a rather big count of field offsets. That requires the data_offset and bsize members of a tuple to be rather large - 16 and 32 bits. That is fine, but the problem is that in cases when the majority of tuples are small, that price is significant. This patch introduces compact tuples: if the tuple data size and its offset table are small, both data_offset and bsize are stored in one 16-bit integer, which saves 4 bytes per tuple. Compact tuples are used for memtx and runtime tuples. They are not implemented for vinyl, because in contrast to memtx, vinyl stores engine-specific fields after struct tuple and thus requires a different approach to compact tuples. Part of #5385
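The space saving can be illustrated with a toy bit-packing sketch (hedged: the 8/8 split and the `try_pack`/`unpack` names below are illustrative only, not the real tuple layout):

```python
# Toy sketch of the compact-tuple idea: when both the offset-table size
# (data_offset) and the msgpack data size (bsize) are small, pack them
# into a single 16-bit word instead of separate 16- and 32-bit fields.
DATA_OFFSET_BITS = 8               # illustrative split, not the real layout
BSIZE_BITS = 16 - DATA_OFFSET_BITS

def try_pack(data_offset, bsize):
    """Return a packed 16-bit word, or None when the tuple is too big
    and must fall back to the full-width representation."""
    if data_offset >= (1 << DATA_OFFSET_BITS) or bsize >= (1 << BSIZE_BITS):
        return None
    return (data_offset << BSIZE_BITS) | bsize

def unpack(word):
    return word >> BSIZE_BITS, word & ((1 << BSIZE_BITS) - 1)
```

The fallback path is the essential part: large tuples keep the wide representation, so compactness is an optimization, not a limit.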
-
Aleksandr Lyapunov authored
Tuples usually have a very low reference counter (I bet the majority of tuples have it less than 10), and we may rely on that fact for optimizations. On the other hand, it is not actually prohibited for a tuple to have a big reference counter, so the code must handle it properly. The obvious solution is to store a narrow reference counter right in struct tuple, and store it somewhere else once it hits a threshold. The previous implementation had a 15-bit counter and a 1-bit flag indicating that the actual counter is stored in a separate array. That worked fine, except that 15 bits are still overkill for real reference counts. And that solution introduced unions into struct tuple, which, generally speaking, causes UB, since by the standard it is UB to access one union member after setting another. The new solution is to store an 8-bit counter and a 1-bit flag. The external storage is made as a hash table to which a portion of the counter is uploaded (or acquired back) very seldom. That makes the counter in the tuple more compact, rather fast (even fastest for low reference counter values), and free of limitations such as a limited count of tuples that can have big reference counts. Part of #5385
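The spill scheme can be sketched as follows (hedged: `RefCounter`, the chunk size, and the dict-based "external hash table" are invented stand-ins; the real counter lives in struct tuple and the table is keyed by tuple pointer):

```python
class RefCounter:
    """Sketch of the scheme above: an 8-bit counter lives "in the tuple";
    when it would overflow, a large chunk is uploaded to an external hash
    table, and taken back when the local counter drains to zero."""
    LOCAL_MAX = 255            # 8-bit in-tuple counter
    CHUNK = 128                # portion moved to/from the external table

    def __init__(self):
        self.local = {}        # tuple_id -> small counter (0..255)
        self.external = {}     # tuple_id -> spilled count (rarely touched)

    def ref(self, tid):
        n = self.local.get(tid, 0)
        if n == self.LOCAL_MAX:
            # Upload a chunk to the external table; happens very seldom.
            self.external[tid] = self.external.get(tid, 0) + self.CHUNK
            n -= self.CHUNK
        self.local[tid] = n + 1

    def unref(self, tid):
        n = self.local[tid] - 1
        if n == 0 and self.external.get(tid, 0) > 0:
            # Acquire a chunk back from the external table.
            self.external[tid] -= self.CHUNK
            n += self.CHUNK
        self.local[tid] = n

    def count(self, tid):
        return self.local.get(tid, 0) + self.external.get(tid, 0)
```

Because a whole chunk moves at once, the external table is touched only once per ~128 operations even for heavily referenced tuples, which is why the common low-refcount path stays fast.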
-
Aleksandr Lyapunov authored
Due to the C/C++ standard layout, sizeof(struct vy_stmt) was 32 bytes. That is a pity, since it has only 20 bytes of payload (10 bytes for the base struct tuple and 10 for lsn (8) + type (1) + flags (1)). Repack struct vy_stmt to be 24 bytes long. Part of #5385
-
- Aug 11, 2021
-
-
Yan Shtunder authored
replicaset.applier.vclock is initialized in replication_init(), which happens before local recovery. If some changes have come from an instance via replication, applier.vclock will still be equal to 0. This means that if some wild master sends this node already applied data, the node will apply the same data twice. Closes #6028
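The duplicate-row filter the fix relies on can be sketched like this (hedged: `Applier` and `on_row` are invented names; the real check compares a row's lsn against the vclock component for its replica id):

```python
class Applier:
    """Sketch of applier-side deduplication: a row whose lsn is not greater
    than the recorded vclock component for its replica id has already been
    applied and must be skipped, not applied again."""
    def __init__(self, vclock):
        self.vclock = dict(vclock)   # replica_id -> last applied lsn
        self.applied = []

    def on_row(self, replica_id, lsn, row):
        if lsn <= self.vclock.get(replica_id, 0):
            return                   # already applied: skip the duplicate
        self.vclock[replica_id] = lsn
        self.applied.append(row)
```

If the vclock is still all zeros when rows arrive (the situation the commit describes), every resent row passes the `lsn <= 0` check and gets applied twice; initializing it after local recovery makes the filter effective.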
-
- Aug 10, 2021
-
-
Mergen Imeev authored
This patch prohibits creation of user-defined functions with SQL_BUILTIN engine. Closes #6106
-
Mergen Imeev authored
This patch removes SQL built-in functions from _func. These functions could be called directly from Lua; however, all they did was return an error. After this patch, no SQL built-in functions can be called directly from Lua. Part of #6106
-
Mergen Imeev authored
This patch introduces the sql_func_find() function. This function allows us to centralize the lookup of functions during parsing, which simplifies the code and fixes some incorrect error messages. Part of #6106
-
- Aug 09, 2021
-
-
Leonid Vasiliev authored
After changing the way symbols are exported, handling several cases in the "ssl-cert-paths-discover" test is no longer necessary. Let's remove it. Part of #5932
-
Leonid Vasiliev authored
Wrap the symbols used in the "ssl-cert-paths-discover" test to avoid clashes. Symbols from openssl have been wrapped to: crypto_X509_get_default_cert_dir_env crypto_X509_get_default_cert_file_env Tarantool symbols have been prefixed with "tnt_": tnt_ssl_cert_paths_discover tnt_default_cert_dir_paths tnt_default_cert_file_paths Part of #5932
-
Leonid Vasiliev authored
After unhiding all internal symbols ([1]) we experience a bunch of problems ([2], [3]). The second one (a clash of symbols from different versions of the "small" library) still has no good solution. You can find more on the topic in [4]. The situation for the tarantool executable is the same as for any other library: a library should expose only its public API and should not increase the probability of hard-to-debug problems due to a clash of a user's code with an internal name from the library. Let's hide all symbols by default and create a list of exported symbols. (In fact, this patch is a revert of patch 03790ac5 ([5]), taking into account the changes made to the code since then.) Explanation of adding some controversial symbols to the export list: * The following symbols are used in shared libraries used in tests ("cfunc*.so", "sql_uuid.so", "gh-6024-funcs-return-bin.so", "function1.so", "gh-5938-wrong-string-length.so", "module_api.so"): mp_check mp_encode_array mp_encode_bin mp_encode_bool mp_encode_int mp_encode_map mp_encode_nil mp_encode_str mp_encode_uint mp_decode_array_slowpath mp_decode_str mp_next_slowpath mp_load_u8 mp_next mp_sizeof_array mp_sizeof_str mp_type_hint decimal_from_string * These symbols are used in "crypto.lua" and, if absent, will lead to the failure of "static_build_cmake_linux" on CI (the crypto prefix was used to avoid name clashes): crypto_ERR_error_string crypto_ERR_get_error crypto_EVP_DigestInit_ex crypto_EVP_DigestUpdate crypto_EVP_DigestFinal_ex crypto_EVP_get_digestbyname crypto_HMAC_Init_ex crypto_HMAC_Update crypto_HMAC_Final * For the correct work of "schema.lua" in "static_build_cmake_linux": rl_get_screen_size * The following symbols are used in "ssl-cert-paths-discover.test.lua" (I think these symbols will have to be wrapped to avoid clash problems): X509_get_default_cert_dir_env X509_get_default_cert_file_env ssl_cert_paths_discover From "exports.test.lua", the checking of ZSTD symbols has been removed (see [6]), as well as "tt_uuid_str" (see [7]). 1. https://github.com/tarantool/tarantool/issues/2971 2. https://github.com/tarantool/tarantool/issues/5001 3. https://github.com/tarantool/memcached/issues/59 4. https://lists.tarantool.org/pipermail/tarantool-discussions/2020-September/000095.html 5. https://github.com/tarantool/tarantool/commit/03790ac5510648d1d9648bb2281857a7992d0593 6. https://github.com/tarantool/tarantool/issues/4225 7. https://github.com/tarantool/tarantool/commit/acf8745ed8fef47e6d1f1c31708c7c9d6324d2f3 Part of #5932
-
Aleksandr Lyapunov authored
At some point the ER_TUPLE_FOUND error message was extended to contain the old tuple and the new tuple. It seems that after a rebase this change was ignored in the MVCC code, which used the previous format. This patch fixes that. Closes #6247
-
Mergen Imeev authored
This patch removes the deprecated implicit cast from the OP_MakeRecord opcode, which was used in some rare cases, for example during an IN operation with a subselect as the right-hand value. Closes #4230 Part of #4470
-
Mergen Imeev authored
Since the OP_Seek* opcodes now work using the new implicit casting rules, we don't need to call OP_ApplyType before them. This patch removes such calls. Part of #4230 Part of #4470
-