- Apr 20, 2020
-
-
Mergen Imeev authored
Closes #4384
-
Igor Munkin authored
Fixes the regression from 335f80a0 ('test: adjust luajit-tap testing machinery').
-
Alexander Turenko authored
As stated in the 'OS/X AND DARWIN BUGS' section of the libev documentation [1], kqueue() and poll() have known problems on Mac OS, so the library uses select() there (it is the build-time default). However, the library uses a trick to overcome the 1024 fds limit: libev sets the undocumented macro _DARWIN_UNLIMITED_SELECT, which enables linking against a select() implementation without the limit.

The magic macro stopped working at some point around Mac OS 10.10 (see [2]), because it was defined after the <sys/time.h> inclusion. On recent Mac OS versions the macro takes effect only when it is defined before the <sys/time.h> inclusion.

The macro definition was [moved][3] in libev 4.25. Excerpt from the changelog [4]:

| 4.25 Fri Dec 21 07:49:20 CET 2018
| <...>
| - move the darwin select workaround higher in ev.c, as newer versions of
| darwin managed to break their broken select even more.

A more proper fix would be to update libev to a newer version; however, I would postpone that until we have time to properly test everything with the new version of the library.

[1]: http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#OS_X_AND_DARWIN_BUGS
[2]: http://lists.schmorp.de/pipermail/libev/2018q2/002788.html
[3]: http://cvs.schmorp.de/libev/ev.c?r1=1.482&r2=1.483
[4]: http://cvs.schmorp.de/libev/Changes?view=markup

Fixes #3867
Fixes #4673

Investigated-by: Maria Khaydich <maria.khaydich@tarantool.org>
Co-authored-by: Maria Khaydich <maria.khaydich@tarantool.org>
(cherry picked from commit 8acc011a)
-
Alexander Turenko authored
After 2.2.0-633-gaa0964ae1 ('net.box: fix schema fetching from 1.10/2.1 servers') net.box expects that the _vcollation system view exists on a tarantool server of version 2.2.1+. This is, however, not always so: a server may run a new version of tarantool but still work on a schema of an old version.

Running on a schema that is not the latest is usual for a replication cluster in the process of upgrading: first, all instances are run on the new version of tarantool (tarantool instances in a cluster perform no auto-upgrade). Then box.schema.upgrade() should be called, but the instances should be operable even before the call.

Before this commit, net.box was unable to connect to a server that runs on a schema without the _vcollation system view (say, 2.1.3) while the server executable is of version 2.2.1 or newer.

Note: I trimmed the tests from the commit to polish them a bit more, but included the fix itself in the 2.4.1 release.

Follows up #4307
Fixes #4691

(cherry picked from commit 06edcbe1)
-
Igor Munkin authored
This changeset makes it possible to run luajit-tap tests requiring libraries implemented in C:
* the symlink to the luajit test is created in the configuration phase instead of the build one.
* a CMake function is introduced for building shared libraries required for luajit tests.

Furthermore, this commit enables the CMake build for the following luajit-tap tests:
* gh-4427-ffi-sandwich
* lj-flush-on-trace

Reviewed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Reviewed-by: Sergey Ostanevich <sergos@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
(cherry picked from commit 335f80a0)
-
Kirill Yukhin authored
- jit: abort trace execution on JIT mode change
- jit: abort trace recording and execution for C API

Reviewed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
(cherry picked from commit a1594091)
-
- Apr 16, 2020
-
-
Serge Petrenko authored
relay_schedule_pending_gc() is executed after the relay status update, which made perfect sense before we introduced the local spaces rework, which makes local space operations use a special instance id: 0.

The relay status update is performed only when the remote instance has reported a vclock bigger than its previous one. However, we may have an entire WAL file filled with local space changes, in which case the changes won't be transmitted to the replica, and it will report the same vclock as before, postponing the scheduled gc until a non-local row is created on the master.

Fix this by reordering relay_schedule_pending_gc() and the relay status update. In case nothing new is added to the pending_gc queue and the replica clock is not updated, relay_schedule_pending_gc() will exit on the first loop iteration, so it doesn't add overhead.

Also make relay_schedule_pending_gc() use vclock_compare_ignore0() instead of plain vclock_compare().

Follow-up #4114

(cherry picked from commit e7ffddce)
-
Serge Petrenko authored
Sign local space requests with a zero instance id. This allows separating local changes from the changes that should be visible to the whole cluster, and stops sending NOPs to replicas to follow the local vclock.

Moreover, it fixes the following bug with local spaces and replication. When a master and a replica are set up, the replica may still write to local spaces even if it's read-only. Local space operations used to promote the instance's lsn before this patch. Eventually, the master would have vclock {1:x} and the replica would have vclock {1:x, 2:y}, where y > 0, due to local space requests performed by the replica.

If a snapshot happens on the replica side, the replica will delete its .xlog files prior to the snapshot, since no one replicates from it and thus it doesn't have any registered GC consumers. From this point, every attempt to configure replication from the replica to the master will fail, since the master will try to fetch records which account for the difference between the master's and the replica's vclocks: {1:x} vs {1:x,2:y}, even though the master will never need the rows in range {2:1} - {2:y}, since they'll be turned into NOPs during replication.

Starting from this patch, in the situation described above, the replica's clock will be {0:y, 1:x}, and, since local space requests are now not replicated at all, the master will be able to follow the replica, turning the configuration into master-master.

Closes #4114

(cherry picked from commit 7eb4650e)
-
Serge Petrenko authored
The current WAL GC implementation tracks consumers (i.e. remote replicas) by their vclock signature, which is the sum of all vclock components. This approach is wrong, as a small example shows. The example is a little synthetic, but it illustrates the problem.

Say, you have 2 masters, A and B, with ids 1 and 2 respectively, and a replica C with id 3. Say, C replicates from both A and B, and there is no replication between A and B (say, the instances were reconfigured to not replicate from each other). Now, say replica C has followed A and B to vclock {1:5, 2:13}. At the same time, A has lsn 10 and B has lsn 15. A and B do not know about each other’s changes, so A’s vclock is {1:10} and B’s vclock is {2:15}.

Now imagine A does a snapshot and creates a new xlog with signature 10. A’s directory will look like:

00…000.xlog
00…010.snap
00….010.xlog

Replica C reports its vclock {1:5, 2:13} to A, and A uses the vclock to update the corresponding GC consumer. Since signatures are used, the GC consumer is assigned a signature = 13 + 5 = 18. This is greater than the signature of the last xlog on A (which is 10), so the previous xlog (00…00.xlog) can be deleted (at least A assumes it can be). Actually, the replica still needs 00…00.xlog, because it contains rows corresponding to vclocks {1:6} - {1:10}, which haven’t been replicated yet.

If gc consumers used vclocks instead of vclock signatures, such a problem wouldn’t arise. The replica would report its vclock {1:5, 2:13}. The vclock is NOT strictly greater than A’s most recent xlog vclock ({1:10}), so the previous log is kept until the replica reports a vclock {1:10, 2:something} or {1:11, …} and so on.

Rewrite gc to perform cleanup based on finding minimal vclock components present in at least one of the consumer vclocks instead of just comparing vclock signatures.

Prerequisite #4114

(cherry picked from commit 11804b46)
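The A/B/C example above can be replayed with a small sketch. This is a toy model, not the actual gc code: a vclock is assumed to be a plain dict of {instance id: lsn}, and the helper names signature() and covers() are hypothetical. The component-wise check is a simplification of the rewritten gc logic.

```python
# Toy model: a vclock is a dict {instance_id: lsn}. Names are hypothetical.

def signature(vclock):
    """Sum of all vclock components (the old gc criterion)."""
    return sum(vclock.values())

def covers(consumer, xlog):
    """True if the consumer has reached every component of the xlog vclock."""
    return all(consumer.get(i, 0) >= lsn for i, lsn in xlog.items())

replica_c = {1: 5, 2: 13}  # vclock reported by replica C
last_xlog_a = {1: 10}      # vclock of the most recent xlog on A

# Signature rule: 18 >= 10, so A wrongly assumes the older xlog is unneeded.
old_rule_deletes = signature(replica_c) >= signature(last_xlog_a)

# Component-wise rule: C is at 1:5, below 1:10, so the older xlog is kept.
new_rule_deletes = covers(replica_c, last_xlog_a)

print(old_rule_deletes, new_rule_deletes)
```

With signatures the older xlog is considered removable; with the component-wise check it is kept until C reaches at least {1:10}.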
-
sergepetrenko authored
The 0th vclock component will be used to count replica-local rows. These rows won't be replicated, and different instances will have different values in vclock[0]. Add a function vclock_compare_ignore0(), which doesn't take the 0th component into account when ordering vclocks, and use it where appropriate.

Part of #4114

(cherry picked from commit 1a2037b1)
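A minimal sketch of the idea, assuming a vclock is modeled as a dict of {instance id: lsn} and incomparable vclocks are reported as None. The function below is a hypothetical Python model and does not mirror the C implementation or its return values.

```python
# Toy model of vclock ordering; not the C implementation.

def vclock_compare(a, b, ignore0=False):
    """Return -1, 0 or 1 if a is <, == or > b; None if incomparable."""
    le = ge = True
    for i in set(a) | set(b):
        if ignore0 and i == 0:
            continue  # replica-local rows do not take part in ordering
        x, y = a.get(i, 0), b.get(i, 0)
        if x < y:
            ge = False
        if x > y:
            le = False
    if le and ge:
        return 0
    if le:
        return -1
    if ge:
        return 1
    return None

master = {0: 3, 1: 10}
replica = {0: 7, 1: 5}

plain = vclock_compare(replica, master)                  # differs in both directions
ignoring0 = vclock_compare(replica, master, ignore0=True)  # only component 1 counts
print(plain, ignoring0)
```

Here the plain comparison cannot order the two vclocks because of the local component, while the variant that ignores component 0 sees the replica as simply behind the master.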
-
Serge Petrenko authored
Once local space requests are signed with instance id 0, every instance writing to local spaces will have its own 0th vclock component. In order not to pollute other instances' 0th vclock components, we have to omit it in replication responses. To do so, introduce a new function, vclock_size_ignore0(), which doesn't count the 0th clock component, and patch xrow_encode_vclock() to skip the 0th clock component if it's present.

Prerequisite #4114

(cherry picked from commit 93ae77fc)
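As an illustration only, the effect of skipping the 0th component can be modeled as follows. The dict-based vclock and the helper encode_vclock_ignore0() are assumptions made for this sketch; the real xrow_encode_vclock() produces a binary encoding, which is not reproduced here.

```python
# Toy model: a vclock is a dict {instance_id: lsn}.

def vclock_size_ignore0(vclock):
    """Number of components, not counting the local (0th) one."""
    return sum(1 for i in vclock if i != 0)

def encode_vclock_ignore0(vclock):
    """Hypothetical stand-in for the encoder: keep everything except id 0."""
    return {i: lsn for i, lsn in sorted(vclock.items()) if i != 0}

local = {0: 4, 1: 10, 2: 3}
sent = encode_vclock_ignore0(local)
print(vclock_size_ignore0(local), sent)
```

The receiver never sees the sender's component 0, so its own local counter stays untouched.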
-
Mergen Imeev authored
Before this patch, if an ephemeral space was used during INSERT or REPLACE, the inserted values were sorted by the first column, since this was the first part of the index. This can lead to an error when using the AUTOINCREMENT feature, since changing the order of the inserted values can change the value inserted instead of NULL. To avoid this, the patch makes the rowid of the inserted row in the ephemeral space the only part of the ephemeral space index.

Closes #4256

(cherry picked from commit 2cc7e608)
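Why the order matters can be shown with a toy autoincrement model. This is a sketch under assumptions: the counter below only imitates "next value after the largest one seen so far" and is not Tarantool's sequence implementation, and NULL is assumed to sort before any number.

```python
# Toy autoincrement: NULL (None) is replaced by the next counter value,
# and an explicit value moves the counter forward if it is larger.

def insert_with_autoincrement(values):
    last = 0
    stored = []
    for value in values:
        if value is None:
            last += 1
            value = last
        else:
            last = max(last, value)
        stored.append(value)
    return stored

rows = [10, None]  # order in which the statement lists the rows

# Index on rowid only: rows are processed in the original order.
by_rowid = insert_with_autoincrement(rows)

# Index on the first column: NULL sorts first, so the order changes.
sorted_rows = sorted(rows, key=lambda v: (v is not None, v or 0))
by_first_column = insert_with_autoincrement(sorted_rows)

print(by_rowid, by_first_column)
```

The same statement yields different generated values depending only on the order in which the ephemeral space returns the rows.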
-
Mergen Imeev authored
This patch specifies field types in ephemeral space format in SQL. Prior to this patch, all fields had a SCALAR field type. This patch allows us to not use the primary index to obtain field types, since now the ephemeral space has field types in the format. This allows us to change the structure of the primary index, which helps to solve the issue #4256. In addition, since we can now set the field types of the ephemeral space, we can use this feature to set the field types according to the left value of the IN operator. This will fix issue #4692. Needed for #4256 Needed for #4692 Closes #3841 (cherry picked from commit 2103f587)
-
Mergen Imeev authored
This patch allows setting field types and names in ephemeral space formats.

Needed for #4256
Needed for #4692
Part of #3841

(cherry picked from commit 032de39f)
-
- Apr 15, 2020
-
-
Alexander V. Tikhonov authored
Moved sources tarball creation from travis-ci to gitlab-ci, moved its jobs for sources packing and sources deploying. Close #4895 (cherry picked from commit 34f87bc6)
-
Alexander V. Tikhonov authored
Removed Ubuntu 19.04 Disco, which is EOL, from testing.

Close #4896

(cherry picked from commit 05df6b31)
-
Alexander V. Tikhonov authored
Added the ability to remove a package given in the options from S3. To remove a package, set the '-r=<package name with version>' option, like:

  ./tools/update_repo.sh -o=<OS> -d=<DIST> -b=<S3 repo> \
      -r=tarantool-2.2.2.0

This removes all matching source and binary packages found in the given S3 repository and corrects the meta files there.

Close #4839

(cherry picked from commit d6c50af1)
-
Alexander.V Tikhonov authored
Added instructions on 'product' option with examples. Part of #4839 (cherry picked from commit cccc989c)
-
Alexander.V Tikhonov authored
Found that modules may have only binary packages without source packages. The script was changed so that it can work with only binary or only source packages.

Part of #4839

(cherry picked from commit 4527a4da)
-
Alexander V. Tikhonov authored
Added cleanup functionality for the meta files. The script may encounter the following situations:

- package files were removed from S3, but are still registered: the script stores and registers the new packages at S3 and removes all the other registered blocks for the same files from the meta files.
- package files already exist at S3 with the same hashes: the script skips them with a warning message.
- package files already exist at S3 with old hashes: the script fails without the force flag; otherwise it stores and registers the new packages at S3 and removes all the other registered blocks for the same files from the meta files.

Added the '-s|skip_errors' option flag to skip errors on changed packages, so that the script does not exit during its run.

Part of #4839

(cherry picked from commit ed491409)
-
Alexander V. Tikhonov authored
Returned the static build based on Dockerfile to gitlab-ci testing of release branches after the issues with the missing openssl version were fixed in PR #4831.

Follow up #4831

(cherry picked from commit b09f44b856e91f1006bd5b3e226a7be0b65b7859)
(cherry picked from commit 6f618d62)
-
Alexander V. Tikhonov authored
Found that the static build based on Dockerfile used an external link and did not notice when it was removed, as happened in #4830. To avoid such issues, the cache for building the Dockerfile was disabled with the '--no-cache' option of the docker build command.

Follow up #4830

(cherry picked from commit 1207821e4fc18312a9916d81a55a8eacd75a67b3)
(cherry picked from commit c5d27312)
-
Sergey Bronnikov authored
The test has been flaky from the beginning (39d0e427). The time of building indexes varies from run to run, and the problem was due to the absence of synchronization between building the indexes and checking their number.

Fixes #4353

(cherry picked from commit 5f96ee59)
-
- Apr 08, 2020
-
-
Cyrill Gorcunov authored
Currently the journal provides only one method -- write, which implies a callback to trigger upon write completion (in contrast with the 1.10 series, where all commits were processed synchronously). Let's make the difference between sync and async writes more notable: provide the journal::write_async method, which runs the completion function once the entry is written, while journal::write handles the transaction synchronously.

Redesign notes:
1) The callback for async writes is set once, at journal creation. There is no need to carry the callback in every journal entry. This allows us to save some memory;
2) txn_commit and txn_commit_async call txn_rollback where appropriate;
3) no need to call journal_entry_complete on sync writes anymore;
4) wal_write_in_wal_mode_none is too long, renamed to wal_write_none;
5) the wal engine uses async writes internally, but this is transparent to callers.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
This makes the code easier to read and allows reusing txn allocation in sync/async writes.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
Otherwise we won't be able to make a rollback in case of journal_entry_new allocation failure.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
We are going to diverge the sync and async code flows, so let's make txn_commit not use txn_commit_async.

Fixes #4776

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
We will use this function inside the wal engine right after the journal redesign is complete.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
To reuse in sync transactions once the journal redesign is complete.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
For the sake of unification, we will handle nop transactions via a common helper for both sync and async cases.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
To reflect the fact that we don't wait for the transaction to complete but rely on the completion callback.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
We already have a "get" function in this header, which is named in_txn(). Having both get/set in one place should be more consistent.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
There is no need for several instances of the recovery journal variable. Let's make it statically allocated inside the recovery_journal_create routine, and drop the inline annotation, because there is absolutely no need for this routine to be inline.

Suggested-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
Allows eliminating code duplication.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
Test that diag_raise doesn't happen if an async transaction fails inside the replication procedure.

Side note: I don't like merging tests with patches in general and I hate doing so for big tests with a passion because it hides the patch code itself. So here is a separate patch on top of the fix.

Test-of #4730

Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
Currently, when a transaction rollback happens, we just drop the existing error, setting ClientError to replicaset.applier.diag. This leaves the current fiber with diag=nil, which in turn leads to a sigsegv once diag_raise() is called right after applier_apply_tx():

| applier_f
| try {
| applier_subscribe
| applier_apply_tx
| // error happens
| txn_rollback
| diag_set(ClientError, ER_WAL_IO)
| diag_move(&fiber()->diag, &replicaset.applier.diag)
| // fiber->diag = nil
| applier_on_rollback
| diag_add_error(&applier->diag, diag_last_error(&replicaset.applier.diag))
| fiber_cancel(applier->reader);
| diag_raise() -> NULL dereference
| } catch { ... }

Thus:
- use diag_set_error() instead of diag_move() so that the error is not dropped from the current fiber(), preventing a nil dereference;
- put a fixme mark into the code: we need to rework it in a more sensible way.

Fixes #4730

Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
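The difference between moving and copying the error can be illustrated with a toy model. This is only a sketch: the Diag class and the Python functions below borrow the names from the trace for readability, but they are simplified stand-ins and not the real C diag API.

```python
# Toy model of a diagnostics area holding the last error (or None).

class Diag:
    def __init__(self):
        self.last = None

def diag_move(src, dst):
    """Transfer the error: the source diag is left empty."""
    dst.last, src.last = src.last, None

def diag_set_error(diag, err):
    """Store the error in diag without touching any other diag."""
    diag.last = err

def can_raise(diag):
    """In C, raising from an empty diag means a NULL dereference."""
    return diag.last is not None

# Before the fix: the fiber diag is emptied by the move.
fiber_diag, applier_diag = Diag(), Diag()
fiber_diag.last = IOError("ER_WAL_IO")
diag_move(fiber_diag, applier_diag)
broken = can_raise(fiber_diag)

# After the fix: the error is kept in the fiber diag as well.
fiber_diag2, applier_diag2 = Diag(), Diag()
fiber_diag2.last = IOError("ER_WAL_IO")
diag_set_error(applier_diag2, fiber_diag2.last)
fixed = can_raise(fiber_diag2)

print(broken, fixed)
```

In the first case the fiber has nothing left to raise; in the second both diags refer to the same error.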
-
Cyrill Gorcunov authored
To make it a bit more readable.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
For some reason the replica_by_id member (which is an array of pointers) is allocated dynamically. Moreover, VCLOCK_MAX = 32 by now, and extending it to some new limit will require way more effort than just increasing the number. Thus reserve memory for replica_by_id inside replicaset statically. This allows simplifying the code a bit and dropping the calloc/free calls.

The former code comes from edd76a2a without any explanation of why the dynamic member is needed.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
If we hit the memory limit while allocating triggers, we should set up a diag error to prevent a nil dereference in the diag_raise call (for example from applier_apply_tx).

Note that there are region_alloc_xc helpers which throw errors, but as far as I understand we need the rollback action to be processed first instead of an immediate throw/catch, thus we use diag_set.

Acked-by: Sergey Ostanevich <sergos@tarantool.org>
Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
In request_create_from_tuple and request_handle_sequence we may be unable to allocate memory for tuples. Don't forget to set up a diag error in that case, otherwise diag_raise will lead to a nil dereference.

Acked-by: Sergey Ostanevich <sergos@tarantool.org>
Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-