- Dec 07, 2020
-
-
Kirill Yukhin authored
* x64: Fix __call metamethod return dispatch.
-
- Dec 06, 2020
-
-
Alexander V. Tikhonov authored
In the previous commit the .tarantoolctl configuration file was placed into the test-run submodule repository as:

    <tarantool repository>/test-run/.tarantoolctl

This commit removes it from the tarantool repository. This unblocks the `./test-run.py --replication-sync-timeout <seconds>` option: all tests now actually receive test-run's value for the box.cfg() option (100 seconds by default instead of 300 seconds, which is tarantool's default).

Updated the tests that check the replication_sync_timeout value: the value is now hidden, because it may be set to something other than the default through the options of the test-run command.

There is no longer any need to copy the tarantoolctl configuration file to the binary path now that it has been moved to the test-run repository, so this commit reverts the changes from:

    aa609de2 ('cmake for tests updated: copy ctl config in builddir')

Needed for #5504
-
Alexander Turenko authored
See the commits in the PR [1] for a detailed description of the changes.

User-visible changes are the following:

1. test-run.py can now be invoked from any directory without changing the current working directory to `test/`.
2. The `test/.tarantoolctl` configuration file is no longer mandatory and can be removed. It is now shipped within the test-run repository.
3. test-run sets the `replication_sync_timeout` box.cfg() option when `test/.tarantoolctl` is not present in a parent repository. The value is controlled by the --replication-sync-timeout argument and defaults to 100 seconds (unlike tarantool's default, which is 300 seconds).

The reason for the changes is to set the default `replication_sync_timeout` for all tests to a value lower than `--no-output-timeout` (120 seconds). This allows instances to step into the orphan mode before that deadline, which gives a more descriptive picture when it leads to a test failure. What is also important, when a test fails before the `--no-output-timeout`, we are able to restart it based on the `fragile` suite.ini option and / or collect artifacts to store them in CI.

The `--no-output-timeout` deadline remains the show-stopper. We'll introduce a test execution timeout later to step into the general `--no-output-timeout` only in quite rare and unusual cases.

The next commit will actually remove `test/.tarantoolctl`, so the new `replication_sync_timeout` will be in effect.

[1]: https://github.com/tarantool/test-run/pull/242

Part of #5504
-
- Dec 04, 2020
-
-
Vladislav Shpilevoy authored
Fakesys is a collection of fake implementations of deep system things such as libev and libc. The fake subsystems provide an API just like their original counterparts (except for function names), but with full control of their behaviour in user-space for the sake of unit testing.

Fakeev is a fake version of libev whose main feature is virtual time. Fakeev has an internal clock that is fully controllable in user-space. That allows running hours of tests in milliseconds of real time. Fakeev is used in SWIM tests and will be used in Raft tests.

Part of #5303
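The virtual-time idea can be illustrated with a minimal sketch. It is written in Python rather than C, and the `VirtualLoop` class and its methods are hypothetical names chosen for illustration; they are not the fakeev API. The clock moves only when the test advances it, so an hour-long timer fires instantly in real time.

```python
import heapq
import itertools


class VirtualLoop:
    """Toy event loop whose clock only moves when the test advances it."""

    def __init__(self):
        self.now = 0.0                   # virtual time, in seconds
        self._timers = []                # heap of (deadline, seq, callback)
        self._seq = itertools.count()    # tie-breaker, callbacks never compared

    def call_later(self, delay, callback):
        """Schedule `callback` to fire `delay` virtual seconds from now."""
        heapq.heappush(self._timers,
                       (self.now + delay, next(self._seq), callback))

    def advance(self, duration):
        """Jump the clock forward, firing every timer that became due."""
        deadline = self.now + duration
        while self._timers and self._timers[0][0] <= deadline:
            when, _, callback = heapq.heappop(self._timers)
            self.now = when              # the callback observes its own deadline
            callback()
        self.now = deadline


loop = VirtualLoop()
fired = []
loop.call_later(3600.0, lambda: fired.append(loop.now))   # one virtual hour
loop.call_later(60.0, lambda: fired.append(loop.now))     # one virtual minute
loop.advance(2 * 3600.0)                                  # returns immediately
```

Timers fire in deadline order regardless of the order in which they were scheduled, and no real sleeping takes place.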
-
Vladislav Shpilevoy authored
SWIM unit tests contain a special library for emulating the event loop: swim_test_ev. It provides an API similar to libev, but implemented entirely in user-space, including clock functions. The latter is the most important point, as the original libev does not allow defining your own timing functions: internally it relies on select/kqueue/epoll/poll/... with a true clock. Because of that it is impossible to perform long tests with the original libev, as they could last for minutes, or for tens of seconds each when there are many of them. swim_test_ev uses virtual time, where hours can be played in milliseconds.

--

This commit extracts all swim code to swim_test_ev.c. Now this file is nothing but an implementation of swim_ev.h on top of the fakeev API. Fakeev, in turn, does not depend on SWIM anymore and can be moved to the fakesys library.

Part of #5303
-
Vladislav Shpilevoy authored
SWIM unit tests contain a special library for emulating the event loop: swim_test_ev. It provides an API similar to libev, but implemented entirely in user-space, including clock functions. The latter is the most important point, as the original libev does not allow defining your own timing functions: internally it relies on select/kqueue/epoll/poll/... with a true clock. Because of that it is impossible to perform long tests with the original libev, as they could last for minutes, or for tens of seconds each when there are many of them. swim_test_ev uses virtual time, where hours can be played in milliseconds.

The fake libev is going to be re-used for Raft unit tests. But for that it is necessary to detach it from all SWIM dependencies.

--

The patch renames swim_test_ev.c/.h to fakeev.c/.h because they will soon contain only fakeev functions. The swim methods, which implement swim_ev.h via fakeev, are moved to their own file in a separate commit, because that file will be named swim_test_ev.c. If they were moved in this commit, git would treat it as if everything *except* the swim functions was moved to fakeev.h/.c. That would ruin the git history, so the change is split into 2 commits.

Part of #5303
-
Vladislav Shpilevoy authored
SWIM unit tests contain a special library for emulating the event loop: swim_test_ev. It provides an API similar to libev, but implemented entirely in user-space, including clock functions. The latter is the most important point, as the original libev does not allow defining your own timing functions: internally it relies on select/kqueue/epoll/poll/... with a true clock. Because of that it is impossible to perform long tests with the original libev, as they could last for minutes, or for tens of seconds each when there are many of them. swim_test_ev uses virtual time, where hours can be played in milliseconds.

The fake libev is going to be re-used for Raft unit tests. But for that it is necessary to detach it from all SWIM dependencies.

--

This commit gives all swim_test_ev functions the 'fakeev' prefix instead of 'swim'. The functions implementing the swim_ev.h API are kept as one-line proxies to the fakeev functions.

Part of #5303
-
Vladislav Shpilevoy authored
Fakesys is going to be a collection of fake implementations of deep system things such as libev and libc. The fake subsystems will provide an API just like their original counterparts (except for function names), but with full control of their behaviour in user-space for the sake of unit testing.

This commit introduces the first part of fakesys: fakenet, a subset of the libc network API: sendto(), recvfrom(), bind(), close(), getifaddrs().

The main features of fakenet are:

- Integration with the event loop via fakenet_loop_update(). This could also be considered an issue if it is ever necessary to implement a fake epoll, or sockets not bound to any event loop;
- Filters to decide which packets to drop depending on their src, dst, and content;
- Socket block to suspend packet delivery until the socket is unblocked.

Fakenet implements a connection-less API, for UDP sockets. This is exactly what is needed in SWIM. The Raft fake transport will need reliable sockets with a broadcast API. Reliability can be ensured by setting the drop rate to 0 (which is the default). Broadcast functionality is already present: there is a broadcast interface in the fakenet_getifaddrs() result.

Part of #5303
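The drop filters and the socket block listed above can be pictured with a small in-memory model. This is only a sketch in Python under stated assumptions: the `FakeNet` class, its method names, and the string addresses are hypothetical and do not mirror the real fakenet C API.

```python
from collections import deque


class FakeNet:
    """Toy in-memory datagram network; all names here are hypothetical."""

    def __init__(self):
        self.inbox = {}        # address -> deque of delivered (src, payload)
        self.pending = {}      # address -> deque held back while blocked
        self.blocked = set()   # addresses whose delivery is suspended
        self.filters = []      # predicates; a packet is dropped if any is true

    def bind(self, addr):
        self.inbox[addr] = deque()
        self.pending[addr] = deque()

    def add_filter(self, predicate):
        self.filters.append(predicate)

    def sendto(self, src, dst, payload):
        if dst not in self.inbox:
            return False       # nobody is bound to dst: lost, as with UDP
        if any(f(src, dst, payload) for f in self.filters):
            return False       # dropped by a filter
        queue = self.pending if dst in self.blocked else self.inbox
        queue[dst].append((src, payload))
        return True

    def block(self, addr):
        self.blocked.add(addr)

    def unblock(self, addr):
        # Release everything that was held back while the socket was blocked.
        self.blocked.discard(addr)
        self.inbox[addr].extend(self.pending[addr])
        self.pending[addr].clear()

    def recvfrom(self, addr):
        return self.inbox[addr].popleft() if self.inbox[addr] else None


net = FakeNet()
net.bind("10.0.0.1")
net.bind("10.0.0.2")
net.add_filter(lambda src, dst, payload: payload == b"drop me")
dropped = net.sendto("10.0.0.1", "10.0.0.2", b"drop me")
net.block("10.0.0.2")
net.sendto("10.0.0.1", "10.0.0.2", b"ping")
while_blocked = net.recvfrom("10.0.0.2")   # nothing is visible yet
net.unblock("10.0.0.2")
after_unblock = net.recvfrom("10.0.0.2")   # the held packet arrives
```

With no filters installed nothing is dropped, which corresponds to the reliable mode mentioned for the Raft transport.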
-
Vladislav Shpilevoy authored
SWIM unit tests contain special libraries for emulating the event loop and network: swim_test_ev and swim_test_transport. They provide an API similar to libev and to the network part of libc, which internally is implemented entirely in user-space and allows simulating all kinds of errors, any time durations, etc.

These test libraries are going to be re-used for Raft unit tests. But for that it is necessary to detach them from all SWIM dependencies.

--

This commit extracts all swim code to swim_test_transport.c. Now this file is nothing but an implementation of swim_transport.h on top of the fakenet API. Fakenet, in turn, does not depend on SWIM anymore and can be moved to its own library.

Part of #5303
-
Vladislav Shpilevoy authored
SWIM unit tests contain special libraries for emulating the event loop and network: swim_test_ev and swim_test_transport. They provide an API similar to libev and to the network part of libc, which internally is implemented entirely in user-space and allows simulating all kinds of errors, any time durations, etc.

These test libraries are going to be re-used for Raft unit tests. But for that it is necessary to detach them from all SWIM dependencies.

--

This commit moves all fake network code to separate files, fakenet.c/.h, which are now easy to relocate to a new library. These files still contain some swim methods, which are moved to their own file in a separate commit, because that file will be named swim_test_transport.c. If they were moved there in the same commit, git would treat it as if everything *except* the swim methods was moved to fakenet.c/.h because of the name clash. That would destroy the git history, so the swim code movement is split into 2 commits.

Part of #5303
-
Vladislav Shpilevoy authored
SWIM unit tests contain special libraries for emulating the event loop and network: swim_test_ev and swim_test_transport. They provide an API similar to libev and to the network part of libc, which internally is implemented entirely in user-space and allows simulating all kinds of errors, any time durations, etc.

These test libraries are going to be re-used for Raft unit tests. But for that it is necessary to detach them from all SWIM dependencies.

--

The only dependency left in the fake network functions is the 'swim' prefix. This commit replaces it with 'fakenet' before the functions are moved to their own library.

Part of #5303
-
Vladislav Shpilevoy authored
SWIM unit tests contain special libraries for emulating the event loop and network: swim_test_ev and swim_test_transport. They provide an API similar to libev and to the network part of libc, which internally is implemented entirely in user-space and allows simulating all kinds of errors, any time durations, etc.

These test libraries are going to be re-used for Raft unit tests. But for that it is necessary to detach them from all SWIM dependencies.

--

This commit extracts libc-like functions from the fake implementation of the swim_transport.h functions. That allows moving the swim functions out of the fake network library to their own file, and building the fake network as an independent library not related to swim.

It is worth mentioning that the bind() emulator (swim_test_bind()) is not a true bind(). Its behaviour is more like socket() + bind() + close(). This API was designed for swim_transport, and it seems too hard to split it into separate methods, especially socket() and bind(), because in the fake network library there is a relation between an IPv4 address and a file descriptor number. This means you can't create a file descriptor without occupying an address. So socket() + bind() and rebind (= close() + socket() + bind()) must be atomic. swim_test_bind() does that. This will also work fine for Raft, so it was left as is for now.

Part of #5303
-
Vladislav Shpilevoy authored
SWIM unit tests contain special libraries for emulating the event loop and network: swim_test_ev and swim_test_transport. They provide an API similar to libev and to the network part of libc, which internally is implemented entirely in user-space and allows simulating all kinds of errors, any time durations, etc.

These test libraries are going to be re-used for Raft unit tests. But for that it is necessary to detach them from all SWIM dependencies.

--

One of the dependencies is swim_transport.addr, which was used in swim_transport_send() as an input parameter. swim_transport_send() simulates sendto(), and it can't be generalized while it depends on the source address being passed explicitly.

This patch makes swim_transport_send() deduce the source address from the file descriptor number. There are a couple of new functions for that: swim_test_sockaddr_in_to_fd and swim_test_fd_to_sockaddr_in. They were inlined earlier, but it seems the fd <-> source address translation is used often enough to extract these functions.

Part of #5303
-
Alexander V. Tikhonov authored
Found that the test failed in 2 common places, when it tried to start the replica and wait for it within the 'JOIN' or 'SUBSCRIBE' parts of the test. To wait for the replica to start, it used the 'wait_until_started()' function of the 'TarantoolServer' class from the test-run repository. But it did not try to resolve connection issues on replica creation, like:

    [30534] main/103/replica I> connecting to 1 replicas
    [30534] main/112/applier/localhost:49168 I> can't connect to master
    [30534] main/112/applier/localhost:49168 sio.c:208 !> SystemError connect to 127.0.0.1:49168, called on fd 27, aka 127.0.0.1:47954: Connection refused
    [30534] main/112/applier/localhost:49168 I> will retry every 0.10 second
    [30534] main/112/applier/localhost:49168 I> remote master c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168 running Tarantool 2.7.0
    [30534] main/103/replica I> connected to 1 replicas
    [30534] main/103/replica I> bootstrapping replica from c5d480c3-219c-11eb-ac14-080027727614 at 127.0.0.1:49168
    [30534] main/112/applier/localhost:49168 I> can't read row
    [30534] main/112/applier/localhost:49168 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
    [30534] main/103/replica box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
    [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.
    [30534] main/103/replica F> can't initialize storage: Can't modify data because this instance is in read-only mode.

To resolve it, the test was changed to catch the 'TarantoolStartError' exception from test-run. The test should also be restartable by test-run using the fragile list, so the 'crash_expected' flag was enabled to let the test fail with an exception.

Needed by #4949
-
Alexander V. Tikhonov authored
Found hanging test vinyl/ddl.test.lua on:

    [159] inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.tk:count() end)
    [159] ---
    [159] - true
    [159] ...
    [159] -box.snapshot()
    [159] ----
    [159] -- ok
    [159] -...

The real issue happened before it, when the test failed on:

    [091] --- engine/ddl.result Thu May 14 16:12:09 2020
    [091] +++ engine/ddl.reject Fri May 15 04:15:07 2020
    [091] @@ -2558,7 +2558,7 @@
    [091] ...
    [091] inspector:wait_cond(function() return box.space.test.index.pk:count() == box.space.test.index.sk:count() end)
    [091] ---
    [091] -- true
    [091] +- false
    [091] ...

Our test files are structured as a sequence of standalone subtests. To be able to check all of them, this hang must be neutralized so that the next standalone subtest gets a chance to pass. To avoid the hang, the box.snapshot check is now skipped if the previous check of the current subtest failed.

Needed for #4353
-
Alexander V. Tikhonov authored
Found that the previous fix of the engine/ddl.test.lua test, committed with:

    5f96ee59 ('Fix flaky test engine/ddl')

did not actually fix issue #4353, so it was reverted.

Needed for #4353
-
Alexander Turenko authored
Limit waiting for a tarantool process termination to 5 seconds. When this timeout is exceeded, print a warning to the terminal and send SIGKILL to the process.

We need to handle the situation with a stuck tarantool process on the testing system side to overcome a problem of this kind that appears on Mac OS (see #5573).

This changeset handles one particular case: stopping a tarantool instance that was either started for execution of a 'core = tarantool' test suite or started from a test using the `test_run:cmd('start server foo')` command. It does not handle stopping a tarantool instance that was started for execution of a 'core = app' test or started from a test directly using io.popen() or the built-in 'popen' module.

Related to #5573
Part of https://github.com/tarantool/test-run/issues/157

The changeset: https://github.com/tarantool/test-run/pull/186
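The terminate-then-kill pattern described above can be sketched with the Python standard library. This is an illustration only, not the test-run code: the `stop_process` helper and the warning text are assumptions, and the child process below merely stands in for a stuck tarantool instance. The example assumes a POSIX system.

```python
import subprocess
import sys


def stop_process(proc, timeout=5.0):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL."""
    proc.terminate()                      # SIGTERM on POSIX
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        print(f"warning: pid {proc.pid} ignored SIGTERM for {timeout} s, "
              "sending SIGKILL", file=sys.stderr)
        proc.kill()                       # SIGKILL on POSIX
        return proc.wait()


# A child that ignores SIGTERM stands in for a stuck process.
stuck = subprocess.Popen([
    sys.executable, "-c",
    "import signal, time; signal.signal(signal.SIGTERM, signal.SIG_IGN); "
    "print('ready', flush=True); time.sleep(60)",
], stdout=subprocess.PIPE)
stuck.stdout.readline()                   # wait until the handler is installed
exit_code = stop_process(stuck, timeout=0.5)
```

On POSIX, `Popen.wait()` reports a process killed by a signal as the negative signal number, so a forced stop is distinguishable from a clean exit.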
-
Alexander Turenko authored
This changeset fixes a problem that is unlikely to hit anyone, but in theory it may be triggered by incorrect behaviour of tarantool.

In brief, if tarantool does not react to SIGTERM after executing all tests and a test-run worker gets stuck waiting for termination of the tarantool process, the test-run listener would fail at an attempt to access a temporary result file that does not exist. See more details in [1].

[1]: https://github.com/tarantool/test-run/issues/245
-
Alexander Turenko authored
* Added --snapshot and --disable-schema-upgrade arguments (#240).
* Fixed reporting of an error for conflicting arguments (#241).

The `--snapshot path/to/snapshot` argument copies the given snapshot to the snapshot directory before starting a tarantool instance. This allows verifying various functionality in the case when tarantool is upgraded from a snapshot left by an older tarantool version (as opposed to testing it on a freshly bootstrapped instance).

There are limitations: when a test spawns a replica set, the option does not work correctly. The reason is that the same instance UUIDs (and IDs) cannot be used by different instances in a replica set. Maybe there are other pitfalls.

The `--disable-schema-upgrade` argument instructs tarantool to skip execution of the schema upgrade script (using ERRINJ_AUTO_UPGRADE). This way we can verify that, when an instance works on an old schema version, a feature still works or at least gives a correct error message.

This commit only brings the new options into test-run. It does NOT add any new testing targets / rules.

Part of #4801
-
- Dec 03, 2020
-
-
Serge Petrenko authored
Follow-up #5440
-
Alexander V. Tikhonov authored
Added replication_connect_timeout to the replication/*quorum.lua scripts to halve the run time of replication/quorum.test.lua, which was 150 seconds before. Before the patch the run time of this test was close to the 'test-timeout' limit of 110 seconds and even to the 'no-output-timeout' limit of 120 seconds, which caused the test to fail. The test also waits for the 3rd replica until it is connected, and this timeout helps to avoid long waits.
-
Kirill Yukhin authored
The index variable ran from 1 .. 5 and was used to index an array of size 4. Use iv - 1 instead. Discovered by Coverity.
-
Sergey Voinov authored
Check the schema version (stored in box.space._schema) on start and print a warning if it doesn't match the last available schema version. This is needed because some users forget to call box.schema.upgrade() after a Tarantool update and get stuck with an old schema version until they encounter some hard-to-debug problems.

Closes #4574

Co-developed-by: Roman Khabibov <roman.habibov@tarantool.org>
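The start-up check can be pictured roughly as follows. This is a Python sketch of the idea only: the helper names, the dotted version strings, and the warning text are assumptions for illustration and do not reproduce Tarantool's implementation or its log message.

```python
import logging

logger = logging.getLogger("schema_check")


def parse_version(text):
    """Turn '2.6.1' into (2, 6, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_schema_version(stored, latest):
    """Warn and return False when the stored schema differs from `latest`."""
    if parse_version(stored) != parse_version(latest):
        logger.warning(
            "schema version %s differs from the last available version %s; "
            "consider calling box.schema.upgrade()", stored, latest)
        return False
    return True


up_to_date = check_schema_version("2.7.1", "2.7.1")
outdated = check_schema_version("2.3.1", "2.7.1")
```

Comparing integer tuples rather than raw strings keeps a version such as 2.10.0 ordered after 2.9.1.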
-
- Dec 02, 2020
-
-
Sergey Ostanevich authored
Before this patch fiber.cond():wait() just returned for a cancelled fiber. In contrast, fiber.channel():get() throws a "fiber is canceled" error. This patch unifies the behaviour of channels and condvars. It also fixes a related net.box module problem, #4834, since fiber.cond now tests for fiber cancellation.

Closes #4834
Closes #5013

Co-authored-by: Oleg Babin <olegrok@tarantool.org>

@TarantoolBot document
Title: fiber.cond():wait() throws if fiber is cancelled

Currently fiber.cond():wait() throws an error if the waiting fiber is cancelled.
-
Sergey Ostanevich authored
fiber_cond_wait() will set an error in case the fiber is cancelled. As a result, the current diag in the fiber can be reset during wal_clear_watcher(). To prevent such an overwrite, copying the diag from the relay into the current fiber is moved to the exit of relay_subscribe_f().

Part of #5013
-
- Dec 01, 2020
-
-
Serge Petrenko authored
Users usually use box.ctl.wait_rw() to determine the moment when the instance becomes writeable. Since the introduction of synchronous replication, this function became pointless, because even when an instance is writeable, it may fail to write something because its limbo is not empty.

To fix the problem, introduce a new helper, txn_limbo_is_ro(), and start using it in box_update_ro_summary(). Call box_update_ro_summary() every time the limbo gets emptied out or changes its owner.

Closes #5440
-
Cyrill Gorcunov authored
Since the commit ae7e2103 we use an internal serializer, thus we no longer need the serpent code. The patch removes the references from the source code and the .gitmodules file; still, one might need to run

 | git submodule deinit -f third_party/serpent

manually to clean up the working tree, depending on the local git version.

Closes #5517

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
- Nov 26, 2020
-
-
Cyrill Gorcunov authored
When we load a fresh module, we put it into the module cache first, which allows us to avoid reloading the same module twice (say, there could be several functions in the same module). But if the module is loaded for the first time and symbol resolution fails, we keep this module loaded even if there may be no more use for it. Thus clean it up if needed.

As far as I know, there is no portable way to verify this in a test, only manually via "lsof -p `pidof tarantool`".

Fixes #5475

Reported-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
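The cleanup rule can be modelled in a few lines. The sketch below is Python, not the C code of the patch; `ModuleCache` and its `loader` / `unloader` hooks are hypothetical names. A module is dropped only when it was loaded by the very call whose symbol lookup failed, so a module that is already in use stays cached.

```python
class ModuleCache:
    """Toy cache of loaded modules; `loader`/`unloader` are hypothetical hooks."""

    def __init__(self, loader, unloader):
        self._loader = loader        # name -> dict of symbols
        self._unloader = unloader    # called when a module is dropped
        self._modules = {}

    def resolve(self, module_name, symbol):
        fresh = module_name not in self._modules
        if fresh:
            self._modules[module_name] = self._loader(module_name)
        module = self._modules[module_name]
        if symbol not in module:
            if fresh:
                # Nothing else can be using a module that was loaded just now.
                del self._modules[module_name]
                self._unloader(module_name)
            raise LookupError(f"{module_name}.{symbol} not found")
        return module[symbol]

    def is_loaded(self, module_name):
        return module_name in self._modules


unloaded = []
cache = ModuleCache(loader=lambda name: {"sum": sum}, unloader=unloaded.append)

try:
    cache.resolve("mathlib", "missing")      # first load, lookup fails
except LookupError:
    pass
loaded_after_failure = cache.is_loaded("mathlib")

func = cache.resolve("mathlib", "sum")       # successful load keeps the module
try:
    cache.resolve("mathlib", "missing")      # failure on an already cached module
except LookupError:
    pass
still_loaded = cache.is_loaded("mathlib")
```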
-
Alexander Turenko authored
Improve logging and error reporting of the testing system. The most visible change is the new --debug option, which shows debug logs on the terminal. See details in [1]. [1]: https://github.com/tarantool/test-run/pull/237
-
Roman Khabibov authored
Print the true name of _session_settings space in error messages. Closes #4732
-
Roman Khabibov authored
The context is just a string with a few characters before and after the wrong token, the wrong token itself, and a symbolic arrow pointing to this token.

Closes #4339
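A context string of this kind could be assembled as in the following Python sketch. The `error_context` helper, the margin of 10 characters, and the `^---` arrow style are assumptions made for illustration; the actual format of Tarantool's SQL error messages may differ.

```python
def error_context(text, start, length, margin=10):
    """Return a two-line snippet: source excerpt plus an arrow under the token."""
    left = max(0, start - margin)
    right = min(len(text), start + length + margin)
    excerpt = text[left:right]
    # The arrow starts under the first character of the offending token.
    arrow = " " * (start - left) + "^" + "-" * (length - 1)
    return excerpt + "\n" + arrow


query = "SELECT * FORM users WHERE id = 1"
context = error_context(query, query.index("FORM"), len("FORM"))
```

Clamping the excerpt to the bounds of the query keeps the arrow aligned even when the wrong token sits near the start or end of the statement.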
-
Roman Khabibov authored
Print tokens themselves instead of token names "T_*" in the error messages. Part of #4339
-
Alexander V. Tikhonov authored
Implemented ability to remove opensuse-leap OS packages.
-
Alexander V. Tikhonov authored
Updated help message on remove option.
-
Alexander V. Tikhonov authored
Added a message naming each file to be removed, to make sure that the right files were found for removal.
-
Alexander V. Tikhonov authored
Found that the Sources file gets destroyed when a module is uploaded without sources. The same could happen to the Packages file when modules are uploaded without binaries. To fix it, the file is additionally downloaded from S3 if it was not updated for the module and the routine was not used.
-
Alexander V. Tikhonov authored
Added checksums of result files for flaky tests:

    app-tap/logger.test.lua                                     gh-5346
    app-tap/tarantoolctl.test.lua                               gh-5059
    box/access.test.lua                                         gh-5373 gh-5411
    box/alter.test.lua                                          gh-5557
    box/before_replace.test.lua                                 gh-5546
    box/cfg.test.lua                                            gh-5530
    box/ddl_call_twice_gh-2336.test.lua                         gh-5560
    box/ddl_collation_deleted_gh-3290.test.lua                  gh-5555
    box/gh-4703-on_shutdown-bug.test.lua                        gh-5560
    box/hash_gh-1467.test.lua                                   gh-5476 gh-5504
    box/iterator.test.lua                                       gh-5523
    box/leak.test.lua                                           gh-5548
    box/net.box_connect_timeout_gh-2054.test.lua                gh-5548
    box/net.box_count_inconsistent_gh-3262.test.lua             gh-5532
    box/net.box_field_names_gh-2978.test.lua                    gh-5554
    box/net.box_get_connection_object.test.lua                  gh-5549
    box/net.box_gibberish_gh-3900.test.lua                      gh-5548
    box/net.box_incorrect_iterator_gh-841.test.lua              gh-5434
    box/net.box_index_unique_flag_gh-4091.test.lua              gh-5551
    box/net.box_iproto_hangs_gh-3464.test.lua                   gh-5548
    box/net.box_log_corrupted_rows_gh-4040.test.lua             gh-5548
    box/net.box_reload_schema_gh-636.test.lua                   gh-5550
    box/net.box_schema_change_gh-2666.test.lua                  gh-5547
    box/on_shutdown.test.lua                                    gh-5562
    box/schema_reload.test.lua                                  gh-5552
    box/select.test.lua                                         gh-5548
    box/tree_pk_multipart.test.lua                              gh-5528 gh-5556
    box-tap/gh-4231-box-execute-locking.test.lua                gh-5558
    box-tap/session.test.lua                                    gh-5346
    box-tap/session.storage.test.lua                            gh-5346
    engine/conflict.test.lua                                    gh-5516
    engine/tuple.test.lua                                       gh-5480
    replication/bootstrap_leader.test.lua                       gh-5478
    replication/box_set_replication_stress.test.lua             gh-4992
    replication/gh-3160-misc-heartbeats-on-master-changes.test.> gh-4940
    replication/ddl.test.lua                                    gh-5337
    replication/election_basic.test.lua                         gh-5368
    replication/election_qsync.test.lua                         gh-5430
    replication/election_qsync_stress.test.lua                  gh-5395
    replication/gh-5287-boot-anon.test.lua                      gh-5412
    replication/gh-5426-election-on-off.test.lua                gh-5506
    replication/prune.test.lua                                  gh-5361
    replication/rebootstrap.test.lua                            gh-5524
    replication/show_error_on_disconnect.test.lua               gh-5371
    replication/sync.test.lua                                   gh-3835
    replication/transaction.test.lua                            gh-5563
    sql/prepared.test.lua                                       gh-5359
    sql/checks.test.lua                                         gh-5477
    sql/gh2808-inline-unique-persistency-check.test.lua         gh-5479
    swim/swim.test.lua                                          gh-5403 gh-5561
    vinyl/deferred_delete.test.lua                              gh-5089
    vinyl/errinj_tx.test.lua                                    gh-5539
    vinyl/gh-4810-dump-during-index-build.test.lua              gh-5031
    vinyl/gh-4957-too-many-upserts.test.lua                     gh-5378
    vinyl/gh-5141-invalid-vylog-file.test.lua                   gh-5141
    vinyl/gc.test.lua                                           gh-5474
    vinyl/iterator.test.lua                                     gh-5141
    vinyl/replica_rejoin.test.lua                               gh-4985
    vinyl/snapshot.test.lua                                     gh-4984
    vinyl/tx_gap_lock.test.lua                                  gh-4309
    xlog/panic_on_broken_lsn.test.lua                           gh-4991
-
- Nov 23, 2020
-
-
Cyrill Gorcunov authored
It is never used and was placed here accidentally.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
The instance_id name is too general: we use it for node identification, while the limbo simply "belongs" to whoever tracks the current transaction queue. Let's rename it to owner_id to distinguish it from the global instance_id and for better grepability.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
To make sure we won't access the lsn array out of bounds.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-