Commits · fb224554319dddf2afdcbb73a1fb94230604bd9f · core / tarantool

Dec 22, 2020

github-ci: update docker images creation routine · fb224554

Alexander V. Tikhonov authored 4 years ago

Since all CI activity is moving from Gitlab-CI to Github Actions, the
docker image creation routine is updated to use the new image naming
and container registry:

  GITLAB_REGISTRY?=registry.gitlab.com

changed to

  DOCKER_REGISTRY?=docker.io

Part of #5294

fb224554

test: add test filter for vinyl tests · 566b1af7

Alexander V. Tikhonov authored 4 years ago

Added test-run filter on box.snapshot error message:

  'Invalid VYLOG file: Slice [0-9]+ deleted but not registered'

to avoid printing changing data in the results file. This makes it
possible to use the file's checksum in the test-run fragile list and
rerun the test as a flaky issue.
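
The idea behind the filter can be sketched as follows. This is only an
illustration in Python: `normalize`, `checksum` and the `<id>`
placeholder are hypothetical names, and the real test-run filter syntax
and checksum algorithm may differ. Only the regular expression is taken
from the message above.

```python
import hashlib
import re

# Pattern from the commit message; the placeholder text is an assumption.
VYLOG_ERROR = re.compile(
    r"Invalid VYLOG file: Slice [0-9]+ deleted but not registered")
PLACEHOLDER = "Invalid VYLOG file: Slice <id> deleted but not registered"


def normalize(result_text: str) -> str:
    """Replace the changing slice id with a fixed placeholder."""
    return VYLOG_ERROR.sub(PLACEHOLDER, result_text)


def checksum(result_text: str) -> str:
    """Checksum of the filtered output (md5 is chosen only for the sketch)."""
    return hashlib.md5(normalize(result_text).encode("utf-8")).hexdigest()


# Two failures that differ only in the slice id.
run_a = "- error: 'Invalid VYLOG file: Slice 253 deleted but not registered'\n"
run_b = "- error: 'Invalid VYLOG file: Slice 1141 deleted but not registered'\n"
same_checksum = checksum(run_a) == checksum(run_b)
```

With the id filtered out, both failures produce the same checksum, so a
single entry in the fragile list can match either of them.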

Found issues:

 1) vinyl/deferred_delete.test.lua
    https://gitlab.com/tarantool/tarantool/-/jobs/913623306#L4552

  [036] 2020-12-15 19:10:01.996 [16602] coio vy_log.c:2202 E> failed to process vylog record: delete_slice{slice_id=744, }
  [036] 2020-12-15 19:10:01.996 [16602] main/103/vinyl vy_log.c:2068 E> ER_INVALID_VYLOG_FILE: Invalid VYLOG file: Slice 744 deleted but not registered

 2) vinyl/gh-4864-stmt-alloc-fail-compact.test.lua
    https://gitlab.com/tarantool/tarantool/-/jobs/913810422#L4835

  [052] @@ -56,9 +56,11 @@
  [052]  --
  [052]  dump(true)
  [052]   | ---
  [052] - | ...
  [052] -dump()
  [052] - | ---
  [052] + | - error: 'Invalid VYLOG file: Slice 253 deleted but not registered'
  [052] + | ...

 3) vinyl/misc.test.lua
    https://gitlab.com/tarantool/tarantool/-/jobs/913727925#L5284

  [014] @@ -62,14 +62,14 @@
  [014]  ...
  [014]  box.snapshot()
  [014]  ---
  [014] -- ok
  [014] +- error: 'Invalid VYLOG file: Slice 1141 deleted but not registered'
  [014]  ...

 4) vinyl/quota.test.lua
    https://gitlab.com/tarantool/tarantool/-/jobs/914016074#L4595

  [025] 2020-12-15 22:56:50.192 [25576] coio vy_log.c:2202 E> failed to process vylog record: delete_slice{slice_id=522, }
  [025] 2020-12-15 22:56:50.193 [25576] main/103/vinyl vy_log.c:2068 E> ER_INVALID_VYLOG_FILE: Invalid VYLOG file: Slice 522 deleted but not registered

 5) vinyl/update_optimize.test.lua
    https://gitlab.com/tarantool/tarantool/-/jobs/913728098#L2512

  [051] 2020-12-15 20:18:43.365 [17147] coio vy_log.c:2202 E> failed to process vylog record: delete_slice{slice_id=350, }
  [051] 2020-12-15 20:18:43.365 [17147] main/103/vinyl vy_log.c:2068 E> ER_INVALID_VYLOG_FILE: Invalid VYLOG file: Slice 350 deleted but not registered

 6) vinyl/upsert.test.lua
    https://gitlab.com/tarantool/tarantool/-/jobs/913623510#L6132

  [008] @@ -441,7 +441,7 @@
  [008]  -- Mem has DELETE
  [008]  box.snapshot()
  [008]  ---
  [008] -- ok
  [008] +- error: 'Invalid VYLOG file: Slice 1411 deleted but not registered'
  [008]  ...

 7) vinyl/replica_quota.test.lua
    https://gitlab.com/tarantool/tarantool/-/jobs/914272656#L5739

  [023] @@ -41,7 +41,7 @@
  [023]  ...
  [023]  box.snapshot()
  [023]  ---
  [023] -- ok
  [023] +- error: 'Invalid VYLOG file: Slice 232 deleted but not registered'
  [023]  ...

 8) vinyl/ddl.test.lua
    https://gitlab.com/tarantool/tarantool/-/jobs/914309343#L4538

  [039] @@ -81,7 +81,7 @@
  [039]  ...
  [039]  box.snapshot()
  [039]  ---
  [039] -- ok
  [039] +- error: 'Invalid VYLOG file: Slice 206 deleted but not registered'
  [039]  ...

 9) vinyl/write_iterator.test.lua
    https://gitlab.com/tarantool/tarantool/-/jobs/920646297#L4694

  [059] @@ -80,7 +80,7 @@
  [059]  ...
  [059]  box.snapshot()
  [059]  ---
  [059] -- ok
  [059] +- error: 'Invalid VYLOG file: Slice 351 deleted but not registered'
  [059]  ...
  [059]  --
  [059]  -- Create a couple of tiny runs on disk, to increate the "number of runs"

 10) vinyl/gc.test.lua
     https://gitlab.com/tarantool/tarantool/-/jobs/920441445#L4691

  [050] @@ -59,6 +59,7 @@
  [050]  ...
  [050]  gc()
  [050]  ---
  [050] +- error: 'Invalid VYLOG file: Run 1176 deleted but not registered'
  [050]  ...
  [050]  files = ls_data()
  [050]  ---

 11) vinyl/gh-3395-read-prepared-uncommitted.test.lua
     https://gitlab.com/tarantool/tarantool/-/jobs/921944705#L4258

  [019] @@ -38,7 +38,7 @@
  [019]   | ...
  [019]  box.snapshot()
  [019]   | ---
  [019] - | - ok
  [019] + | - error: 'Invalid VYLOG file: Slice 634 deleted but not registered'
  [019]   | ...
  [019]
  [019]  c = fiber.channel(1)

566b1af7

test: setup workspace in tmpfs for OOS build · dfcefb63

Alexander V. Tikhonov authored 4 years ago

Found that running the vinyl test suite in parallel with the test-run
vardir on a real hard drive may cause a lot of tests to fail. It happens
because hard drive usage becomes a bottleneck at up to 100%, which can
be seen with tools like atop while vinyl tests run in parallel. To avoid
it, all heavily loaded testing processes should use tmpfs for the vardir
path. The out-of-source (OOS) build had to be updated to use tmpfs too.
This patch mounts an additional tmpfs mount point for the test-run
vardir in the OOS build docker run process. The mount point is set with
the '--tmpfs' flag because '--mount' does not support the 'exec' option,
which is needed to be able to execute commands in it [2][3].

Issues met on OOS before the patch, like described in #5504 and [1]:

  Test hung! Result content mismatch:
  --- vinyl/write_iterator.result	Fri Nov 20 14:48:24 2020
  +++ /rw_bins/test/var/081_vinyl/write_iterator.result	Fri Nov 20 15:01:54 2020
  @@ -200,831 +200,3 @@
   ---
   ...
   for i = 1, 100 do space:insert{i, ''..i} if i % 2 == 0 then box.snapshot() end end
  ----
  -...
  -space:delete{1}
  ----
  -...

Closes #5622
Part of #5504

[1] - https://gitlab.com/tarantool/tarantool/-/jobs/863266476#L5009
[2] - https://stackoverflow.com/questions/54729130/how-to-mount-docker-tmpfs-with-exec-rw-flags
[3] - https://github.com/moby/moby/issues/35890

dfcefb63

rfc: luajit metrics · 32358f4f

Sergey Kaplun authored 4 years ago

Part of #5187

32358f4f

Dec 21, 2020

raft: fix crash on death timeout decrease · 4042b5c0

Vladislav Shpilevoy authored 4 years ago

If the death timeout was decreased while waiting for leader death or
discovery, and the new value made the current wait end immediately,
it could crash in libev.

It happened because the remaining time until leader death became
negative. The negative timeout was passed to libev without any
checks, and libev asserts that a timeout is always >= 0.
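
The failure mode can be sketched as follows. This is an illustration in
Python and not the actual patch: `remaining_timeout` is a hypothetical
helper, and clamping to zero is only one plausible way to keep the value
passed to the timer non-negative.

```python
def remaining_timeout(started_at: float, timeout: float, now: float) -> float:
    """Time left until the deadline, never negative.

    If the timeout is decreased after the wait has started, the plain
    difference can drop below zero; clamp it so that the timer simply
    fires immediately.
    """
    remaining = started_at + timeout - now
    return max(remaining, 0.0)


# The wait started at t=0 with a 5 second timeout; at t=3 the timeout is
# reconfigured to 1 second. The raw difference would be -2 seconds.
before_change = remaining_timeout(started_at=0.0, timeout=5.0, now=3.0)
after_change = remaining_timeout(started_at=0.0, timeout=1.0, now=3.0)
```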

This commit brings raft code coverage to almost 100%, not counting
one 'unreachable()' place.

Closes #5303

4042b5c0

raft: fix crash on election timeout decrease · ad713399

Vladislav Shpilevoy authored 4 years ago

If the election timeout was decreased during an election, and the new
value made the current election expire immediately, it could crash in
libev.

It happened because the remaining time until the election end became
negative. The negative timeout was passed to libev without any
checks, and libev asserts that a timeout is always >= 0.

Part of #5303

ad713399

raft: fix ignorance of bad state receipt · 3fe5367c

Vladislav Shpilevoy authored 4 years ago

raft_process_msg() only validated that the state is specified. But
it didn't check if the state is inside of the allowed value range.

Such messages were considered valid, and even their other fields
were accepted. For instance, an invalid message could bump term.

It is safer to reject such messages.
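
A minimal sketch of such a check, written in Python for illustration.
The state names and numeric values, the message layout and the
`process_msg` helper are assumptions and do not mirror the real
raft_process_msg() code.

```python
from enum import IntEnum


class RaftState(IntEnum):
    # Hypothetical numbering, used only for this sketch.
    FOLLOWER = 1
    CANDIDATE = 2
    LEADER = 3


def is_valid_state(state) -> bool:
    """True only if the state is present and inside the allowed range."""
    return any(state == s.value for s in RaftState)


def process_msg(node: dict, msg: dict) -> bool:
    """Reject a message with a bad state before any of its fields is used."""
    if not is_valid_state(msg.get("state")):
        return False
    # Only a validated message is allowed to bump the term.
    if msg["term"] > node["term"]:
        node["term"] = msg["term"]
    return True
```

The point of the ordering is that an invalid message returns before the
term comparison, so it cannot change the node's term.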

Part of #5303

3fe5367c

raft: fix crash when received 0 term message · 2f5522dd

Vladislav Shpilevoy authored 4 years ago

Term in raft can never be 0. It starts from 1 and can only grow.
It was assumed it can't be received from other nodes because they
do the same. There was an assertion for that.

But in raft_msg, used as a transport unit between raft nodes, it
was still possible to send 0 term. It could happen as a result of
a bug, or if someone would try to mimic the protocol but made a
mistake.

That led to a crash in the assert in debug build.

Part of #5303

2f5522dd

test: introduce raft unit tests · e688280b

Vladislav Shpilevoy authored 4 years ago

Raft algorithm was tested only by functional Lua tests, as a part
of the Tarantool executable.

Functional testing of something like raft algorithm has drawbacks:

- Not possible or too hard to cover some cases without error
  injections and/or good stability. Such as invalid messages, or
  small time durations, or a complex case which needs multiple
  steps to be reproduced. For instance, start WAL write, receive a
  message, finish the WAL write, and see if an expected thing
  happens.

- Tests take too long to run when timeouts are not tiny. On the
  other hand, with tiny timeouts the test would become unstable.

- Poor reproducibility due to randomness used in raft, due to system
  load, and the number of other tests running in parallel.

- Hard to debug, because for raft it is necessary to start one
  Tarantool process per raft instance.

- Involves too many other systems, such as threads, real journal,
  relays, appliers, and so on. They can affect the process on the
  whole and reduce reproducibility and debug simplicity even more.

Exactly the same problem existed for SWIM algorithm implemented as
a module 'swim'. In order to test it, swim was moved to a separate
library, refactored to be able to start many swims in the process
(no global variables), its functions talking to other systems
were virtualized (swim_ev, swim_transport), and substituted in the
unit tests with fake analogue systems.

In the unit tests these virtual functions were implemented
differently, but the core swim algorithm was left intact and
properly tested.

The same is done for raft. This patch implements a set of helper
functions and objects to unit test raft in raft_test_utils.c/.h
files, and uses it to cover almost all the raft algorithm code.

During implementation of the tests some bugs were found, which are
not covered here, but fixed and covered in next commits.

Part of #5303

e688280b

raft: introduce raft_ev · 1d01394e

Vladislav Shpilevoy authored 4 years ago

Raft_ev.h/.c encapsulates usage of libev, to a certain extent. All
libev functions are wrapped into raft_ev_* wrappers. Objects and
types are left intact.

This is done in order to be able to replace raft_ev.c in the
upcoming unit tests with fakeev functions. That will make it
possible to simulate time and catch all the raft events with 100%
reproducibility, and without actually waiting for the events in
real time.

A similar approach is used for swim unit tests.

Original raft core file is built into a new library 'raft_algo'.
It is done for the sake of code coverage in unit tests. A test
could be built by directly referencing raft.c in
unit/CMakeLists.txt, but it can't apply compilation options to it,
including gcov options.

When raft.c is built into a library right where it is defined, it
gets the gcov options, and the code covered by unit tests can be
properly shown.

Part of #5303

1d01394e

fakesys: introduce fakeev_timer_remaining() · bce18d3c

Vladislav Shpilevoy authored 4 years ago

ev_timer_remaining() in libev returns number of seconds until the
timer will expire. It is used in raft code.

Raft is going to be tested using fakesys, and it means it needs
fakeev analogue of ev_timer_remaining().

Part of #5303

bce18d3c

fakesys: fix ev_is_active not working on fake timers · bacf206c

Vladislav Shpilevoy authored 4 years ago

fakeev timers didn't set 'active' flag for ev_watcher objects.

Because of that if fakeev_timer_start() was called, the timer
wasn't visible as 'active' via libev API.

Fakeev is supposed to be a drop-in simulation of libev, so it
should "work" exactly the same.

bacf206c

memtx: fix a bug with insertion to space during recovery · 3bc4a156

mechanik20051988 authored 4 years ago

There was a problem with the on_schema_init trigger.
This trigger gives a way to create an on_replace trigger
that will modify temporary or is_local spaces during recovery
from a snapshot, but at that stage of the recovery process
all space indexes are in a special build mode in which no
uniqueness checks are made. This patch adds a new function
'is_recovery_finished' to box.ctl, which lets the user
check whether snapshot recovery is still in progress, in which
case insert/replace/update/upsert are not allowed. It also adds a
check to the corresponding operations: now they fail
if the user tries to do them during snapshot recovery.

@TarantoolBot document
Title: Add 'is_recovery_finished' function
Add 'is_recovery_finished' function to box.ctl
so that the user can check whether snapshot recovery
is still in progress, in which case
insert/replace/update/upsert are not allowed.

Closes #5304

3bc4a156

Dec 20, 2020

swim: don't call swim_quit via FFI · bf9549e3

Vladislav Shpilevoy authored 4 years ago

swim_quit yields, because it joins the event handler fiber. Hence
it can't be called via FFI, where a yield might lead to platform
panic.

Closes #4570

bf9549e3

Dec 19, 2020

test: update test-run (--test-timeout) · 6e6e7b29

Alexander Turenko authored 4 years ago

Added `--test-timeout <seconds>` argument. 110 seconds by default. The
main idea is to avoid reaching --no-output-timeout when possible and so
be able to restart a failed test (according to fragile test checksums)
and store artifacts. PR #244.

Fixed various cases when test-run doesn't wait for a stopping instance,
doesn't try to stop it at all, or issues just SIGTERM without SIGKILL
after some delay. PR #257.

6e6e7b29

Dec 16, 2020

sql: update temporary file name format · 427fdaa7

Leonid Vasiliev authored 4 years ago


The bug was a failure when working with temporary files
created by VDBE to sort a large result of a `SELECT` statement with
`ORDER BY`, `GROUP BY` clauses.

What happens (step by step):
- We have two instances on one node (sharded cluster).
- A query is created that executes on both.
- The first instance creates the name of the temporary file and
  checks whether a file with that name exists.
- The second instance creates the name of the temporary file
  (the same as in the first instance) and checks whether a file with
  that name exists.
- The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
  flag.
- The second instance opens (tries to open) the same file.
- The first instance closes (and removes) the temporary file.
- The second instance tries to work with the file and fails.

Why did it happen:
The temporary file name format has a random part, but the random
generator uses a fixed seed.

When it was decided to use a fixed seed:
32cb1ad2
("sql: drop useless code from os_unix.c")

How the patch fixes the problem:
The patch injects the PID in the temporary file name format.
The generated name is unique for a single process (due to a random part)
and unique between processes (due to the PID part).
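
The naming scheme can be sketched as follows. This is a Python
illustration, not the actual C code: the `tmp_` prefix, the suffix
length, the alphabet and the `temp_file_name` helper are all
assumptions. Only the idea of combining the PID with a fixed-seed random
part comes from the description above.

```python
import os
import random
import string

ALPHABET = string.ascii_lowercase + string.digits


def temp_file_name(rng: random.Random, pid=None, prefix: str = "tmp_") -> str:
    """Build '<prefix><pid>_<random part>'.

    The random part distinguishes files inside one process, the PID
    distinguishes processes that share the same generator seed.
    """
    if pid is None:
        pid = os.getpid()
    suffix = "".join(rng.choice(ALPHABET) for _ in range(12))
    return f"{prefix}{pid}_{suffix}"


# Two processes with the same fixed seed produce the same random part,
# but the PID keeps the resulting names apart.
name_a = temp_file_name(random.Random(42), pid=100)
name_b = temp_file_name(random.Random(42), pid=200)
```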

Alternatives:
1) Use `O_TMPFILE` or `tmpfile()` (IMHO the best way to work with
  temporary files). In both cases, we need to update a significant
  part of the code, and some degradation can be added. It's hard to
  review.
2) Return a random seed for the generator. As far as I understand,
  we want to have good reproducible system behavior, in which case
  it's good to use a fixed seed.
3) Add reopening file with the flags `O_CREAT | O_EXCL` until we
  win the fight. Now we set such flags when opening a temporary file,
  but after that we try to open the file in `READONLY` mode and
  if ok - return the descriptor. This is strange logic for me and I
  don't want to add any additional logic here. Also, such a solution will
  add additional attempts to open the file.

So, it looks like such minimal changes will work fine and are simple
to review.

Co-authored-by: Mergen Imeev <imeevma@gmail.com>

Fixes #5537

427fdaa7

luacheck: change global vars to local in sql-tap · b1a4fed6

Artem Starshov authored 4 years ago

Fixed luacheck warning 111 (setting non-standard global variable)
in test/sql-tap directory.
Enabled this directory for checking W111 in
the config file (.luacheckrc).

Changed almost all variables in test/sql-tap from globals
to locals. In the cases where variables need to be global,
added an explicit _G. prefix (table of globals).

Fixes #5173
Part-of #5464

b1a4fed6

bitset: replace zero-length array with flexible-array member · 9b0278f4

Artem Starshov authored 4 years ago

Zero-length arrays are a GNU C extension.
ISO C99 provides the flexible array member (FAM), which
is the preferred mechanism to declare variable-length types.

A flexible array member has an incomplete type, so applying the
sizeof operator to it is an error at compile time, which prevents
such misuse. There are more reasons why it's better
to implement such structures via FAM:
https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html

This also fixes a gcc 10 warning:
"warning: writing 1 byte into
a region of size 0 [-Wstringop-overflow=]"

Closes #4966
Closes #5564

9b0278f4

sql: fix build with GCC 10 · a65d9c50

Artem Starshov authored 4 years ago

GCC 10 produces the following error:
cc1: warning: function may return address of local variable [-Wreturn-local-addr]

Fix it.

Part-of #4966

a65d9c50

Dec 11, 2020

box: refactor tuple_field_raw to omit path checks · 28f3b2f1

Serge Petrenko authored 4 years ago

tuple_field_raw is an alias to tuple_field_raw_by_path with zero path.
This involves multiple path != NULL checks which aren't needed for tuple
field access by field number. The checks make this function rather slow
compared to its 1.10 counterpart (see results below).

In order to fix perf problems when JSON path indices aren't involved,
factor out the part of tuple_field_raw_by_path which is responsible for
direct field access by number and place it in tuple_field_raw.

This patch was tested by snapshot recovery part involving secondary
index building for a 1.5G snapshot with
one space and one secondary index over 4 integer and one string field.
Comparison table is below:

    Version    |  Time (m:ss)   | Change relative to 1.10
---------------|----------------|------------------------
1.10           |      2:24      |           -/-
2.x(unpatched) |      3:03      |          + 27%
2.x (patched)  |      2:10      |          - 10%

Numbers below show cumulative time spent in tuple_compare_slowpath,
for 1.10 / 2.x(unpatched) / 2.x(patched) for 15, 19 and 14 second
profiles respectively: 13.9 / 17.8 / 12.5.

tuple_field_raw() isn't measured directly, since it's inlined, but all
its uses come from tuple_compare_slowpath.

As the results show, we manage to be even faster than 1.10 used to be
in this test. This must be due to tuple comparison hints, which are
present only in 2.x.

Closes #4774

28f3b2f1

box: speed up tuple_field_map_create · 420bacb2

Serge Petrenko authored 4 years ago

Since the introduction of JSON path indices tuple_init_field_map, which
was quite a simple routine traversing a tuple and storing its field data
offsets in the field map, was renamed to tuple_field_map_create and
optimised for working with JSON path indices.

The main difference is that tuple format fields are now organised in a
tree rather than an array, and the tuple itself may have indexed fields,
which are not plain array members, but rather members of some sub-array
or map. This requires more complex iteration over tuple format fields
and additional tuple parsing.

All the changes were, however, unneeded for tuple formats not supporting
fields indexed by JSON paths.

Rework tuple_field_map_create so that it doesn't go through all the
unnecessary JSON path-related checks for simple cases and restore most
of the lost performance.

Below are some benchmark results for the same workload that pointed to
the degradation initially.
Snapshot recovery time on RelWithDebInfo build for a 1.5G snapshot
containing a single memtx space with one secondary index over 4 integer
and 1 string field:

        Version            | Time (s) | Difference relative to 1.10
---------------------------|----------|----------------------------
1.10 (the golden standard) |    28    |             -/-
2.x (degradation)          |    39    |            + 39%
2.x (patched)              |    31    |            + 11%

Profile shows that the main difference is in memtx_tuple_new due to
tuple_init_field_map/tuple_field_map_create performance difference.

Numbers below show cumulative time spent in tuple_init_field_map (1.10) /
tuple_field_map_create (unpatched) / tuple_field_map_create (patched).
2.44 s / 8.61 s / 3.19 s

More benchmark results can be seen at #4774

Part of #4774

420bacb2

sql: remove unecessary execute of space_cache_find() · bc16e5df

Mergen Imeev authored 4 years ago

Because space_cache_find() is called unnecessarily, it can set the
diag "Space '0' does not exist", although a space id of 0 is not an
error in this case.

Part of #5592

bc16e5df

core: introduce evenly distributed int64 random in range · 31bf9ef1

Ilya Kosarev authored 4 years ago

Tarantool codebase had at least two functions to generate random
integer in given range and both of them had problems at least with
integer overflow. This patch brings nice functions to generate random
int64_t in given range without overflow while preserving uniform random
generator distribution using unbiased bitmask with rejection method.
It is now possible to use xoshiro256++ PRNG or random bytes as a basis.
Most relevant replacements have been made. Needed tests are introduced.
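
The bitmask-with-rejection method mentioned above can be sketched as
follows. This is a Python illustration, not the committed C code:
`random_in_range` and its `next_u64` parameter are hypothetical names,
and `secrets.randbits` stands in for the xoshiro256++ or random-bytes
source.

```python
import secrets


def random_in_range(lo: int, hi: int,
                    next_u64=lambda: secrets.randbits(64)) -> int:
    """Uniform random integer in [lo, hi] via bitmask with rejection."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    # Python integers never overflow. In C the span would have to be
    # computed in uint64_t to stay safe for the full int64_t range.
    span = hi - lo
    # Smallest all-ones mask that covers the span.
    mask = (1 << span.bit_length()) - 1
    while True:
        candidate = next_u64() & mask
        # Reject values above the span instead of folding them back with
        # a modulo, which would bias the distribution.
        if candidate <= span:
            return lo + candidate
```

Because the mask is the smallest power of two minus one that covers the
span, each attempt is accepted with probability above one half, so the
loop needs fewer than two iterations on average.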

Closes #5075

31bf9ef1

test: add filter to box/net.*after_gh-3164 test · e519954a

Alexander V. Tikhonov authored 4 years ago

Found issue:

  [079] @@ -115,5 +115,14 @@
  [079]  -- connection is deleted by 'collect'.
  [079]  weak.c
  [079]  ---
  [079] -- null
  [079] +- peer_uuid: 035d7b36-f205-45f4-9e16-e5b0b99a9b0b
  [079] +  opts:
  [079] +    reconnect_after: 0.1
  [079] +  host: unix/
  [079] +  schema_version: 78
  [079] +  protocol: Binary
  [079] +  state: error_reconnect
  [079] +  error: Connection refused
  [079] +  peer_version_id: 132864
  [079] +  port: /tmp/tnt/079_box/proxy.socket-iproto
  [079]  ...

This test could not be restarted by checksum because the UUID value
changes on each run. To avoid it, added a filter on the 'peer_uuid:'
output.

e519954a

test: add test filter for vinyl/errinj.test.lua · 718cba1b

Alexander V. Tikhonov authored 4 years ago

Added test-run filter on box.snapshot error message:

  'Invalid VYLOG file: Slice [0-9]+ deleted but not registered'

to avoid printing changing data in the results file. This makes it
possible to use the file's checksum in the test-run fragile list and
rerun the test as a flaky issue.

Needed for #4346

718cba1b

Dec 10, 2020

test: fix replication/skip_conflict_row output · 7398d495

Alexander V. Tikhonov authored 4 years ago

Found that test replication/skip_conflict_row.test.lua fails with
output message in results file:

  [260] @@ -117,11 +117,23 @@
  [260]  -- lsn is not promoted
  [260]  lsn1 == box.info.vclock[1]
  [260]  ---
  [260] -- true
  [260] +- false
  [260]  ...
  [260]  test_run:wait_upstream(1, {status = 'stopped', message_re = "Duplicate key exists in unique index 'primary' in space 'test'"})
  [260]  ---
  [260] -- true
  [260] +- false
  [260] +- id: 1
  [260] +  uuid: bdbf6673-6ee4-47eb-a88d-81164f4e61c9

The test could not be restarted by checksum because values
like the UUID change on each failure. It happened because test-run uses
an internal chain of functions wait_upstream() ->
gen_box_info_replication_cond(), which returns instance information
when it fails. To avoid it, this output was redirected to the log file
instead of the results file.

7398d495

test: move error messages into logs · f79fbdd8

Alexander V. Tikhonov authored 4 years ago

Found that some tests use box.info* calls to print information on
failure, like:

  [024] --- replication/wal_rw_stress.result	Mon Nov 30 10:02:43 2020
  [024] +++ var/rejects/replication/wal_rw_stress.reject	Sun Dec  6 16:06:46 2020
  [024] @@ -77,7 +77,45 @@
  [024]              r.downstream.status ~= 'stopped')    \
  [024]  end) or box.info
  [024]  ---
  [024] -- true
  [024] +- version: 2.7.0-109-g0b3ad5d8a0
  [024] +  id: 2
  [024] +  ro: false
  [024] +  uuid: e0b8863f-7b50-4eb5-947f-77f92c491827

It prevents test-run from rerunning these tests using checksums because
the output changes on each failure, like the 'version:' or 'uuid:' field
values above. To avoid it, the output of these calls should be redirected
to log files using log.error(). The same fix was made for tests that call
fio.listdir() and fio.stat() on errors.

f79fbdd8

gitlab-ci: remove Ubuntu 19.10 from packing · 1bf44ccb

Alexander V. Tikhonov authored 4 years ago

Found issue on Tarantool package build for Ubuntu 19.10 [1]:

E: The repository 'http://archive.ubuntu.com/ubuntu eoan Release' does not have a Release file.
E: The repository 'http://archive.ubuntu.com/ubuntu eoan-updates Release' does not have a Release file.
E: The repository 'http://archive.ubuntu.com/ubuntu eoan-backports Release' does not have a Release file.
E: The repository 'http://security.ubuntu.com/ubuntu eoan-security Release' does not have a Release file.

Also found that Ubuntu 19.10 reached its end of life [2]
on July 17, 2020.

So packaging jobs for this OS are removed from Gitlab-CI.

[1] - https://gitlab.com/tarantool/tarantool/-/jobs/902339975#L172
[2] - https://fridge.ubuntu.com/2020/07/17/ubuntu-19-10-eoan-ermine-end-of-life-reached-on-july-17-2020/#:~:text=Ubuntu%20announced%20its%2019.10%20(Eoan,updated%20packages%20for%20Ubuntu%2019.10.

1bf44ccb

Dec 08, 2020

test: fix to resolve box/net_msg_max flaky · 9f3e1d0a

Sergey Ostanevich authored 4 years ago

The problem was that gh-4834-netbox-fiber-cancel left a request hanging,
so net_msg_max fails if it runs next on the same runner.

9f3e1d0a

Dec 07, 2020

uuid: support uuid comparison with strings · e27745ad

Oleg Babin authored 4 years ago

Before this patch it was impossible to compare uuid values with
string representations of uuid. However, there are cases where such
comparisons are possible (e.g. "decimal", where we can compare
decimal values with strings and numbers).

This patch extends the uuid comparators (__eq, __lt and __le): an
attempt is made to convert every string argument to a uuid value
before comparing.

Follow-up #5511

@TarantoolBot document
Title: uuid comparison rules

Currently comparison between uuid values is supported.

Example:
```lua
u1 = uuid.fromstr('aaaaaaaa-aaaa-4000-b000-000000000001')
u2 = uuid.fromstr('bbbbbbbb-bbbb-4000-b000-000000000001')

u1 > u2  -- false
u1 >= u2 -- false
u1 <= u2 -- true
u1 < u2  -- true
```

It's also possible to compare uuid values with their string
representations:
```lua
u1_str = 'aaaaaaaa-aaaa-4000-b000-000000000001'
u1 = uuid.fromstr(u1_str)
u2_str = 'bbbbbbbb-bbbb-4000-b000-000000000001'

u1 == u1_str -- true
u1 == u2_str -- false

u1 >= u1_str -- true
u1 < u2_str  -- true
```

e27745ad

uuid: support comparison of uuid values in Lua · 5495d8f7

Oleg Babin authored 4 years ago

Since Tarantool has a uuid data type, sometimes we want to compare
uuid values, as is possible for primitive types and decimals. This
patch exports a function for uuid comparison and implements the __le
and __lt metamethods for the uuid type.

Closes #5511

5495d8f7

luajit: bump new version · c4c626fa

Kirill Yukhin authored 4 years ago

* x64: Fix __call metamethod return dispatch.

c4c626fa

Dec 06, 2020

test: move .tarantoolctl to test-run submodule · 54667f6f

Alexander V. Tikhonov authored 4 years ago

In the previous commit the .tarantoolctl configuration file was placed
into the test-run submodule repository as:

  <tarantool repository>/test-run/.tarantoolctl

This commit removes it from the tarantool repository. In fact, it
unblocks the `./test-run.py --replication-sync-timeout <seconds>` option
and now all tests will actually receive test-run's value for the
box.cfg() option (100 seconds by default instead of 300 seconds, which
is tarantool's default).

Updated tests that check the replication_sync_timeout value. The value
is hidden in the output because test-run options may set it to
something other than the default.

Found that there is no need to copy the tarantoolctl configuration file
to the binary path any more after it was moved to the test-run
repository, so reverting changes from:

  aa609de2 ('cmake for tests updated:
  copy ctl config in builddir')

Needed for #5504

54667f6f

test: update test-run (--replication-sync-timeout) · 6c8efa84

Alexander Turenko authored 4 years ago

See commits in the PR [1] for detailed description of the changes. User
visible changes are the following.

1. Now test-run.py can be invoked from any directory without changing a
   current working directory to `test/`.
2. The `test/.tarantoolctl` configuration file is not mandatory and can
   be removed. It is shipped now within the test-run repository.
3. test-run sets the `replication_sync_timeout` box.cfg() option when
   the `test/.tarantoolctl` is not present in a parent repository. The
   value is controlled by the --replication-sync-timeout argument and
   defaults to 100 seconds (unlike tarantool's default, which is 300
   seconds).

The reason for the changes is to set the default
`replication_sync_timeout` for all tests to a value lower than
`--no-output-timeout` (120 seconds). It allows instances to step into
the orphan mode before this deadline and gives a more descriptive
picture when it leads to a test failure. What is also important, when a
test fails before the `--no-output-timeout`, we are able to restart it
based on the `fragile` suite.ini option and / or collect artifacts to
store them in CI.

The `--no-output-timeout` deadline remains the show-stopper. We'll
introduce a test execution timeout later to step into the general
`--no-output-timeout` only in quite rare and unusual cases.

The next commit will actually remove `test/.tarantoolctl`, so the new
`replication_sync_timeout` will be in effect.

[1]: https://github.com/tarantool/test-run/pull/242

Part of #5504

6c8efa84

Dec 04, 2020

fakesys: move fakeev to fakesys library · ac48b2ad

Vladislav Shpilevoy authored 4 years ago

Fakesys is a collection of fake implementations of deep system
things such as libev and libc.

The fake subsystems will provide API just like their original
counterparts (except for function names), but with full control of
their behaviour in user-space for the sake of unit testing.

Fakeev is a bogus version of libev, whose main feature is virtual
time. Fakeev has an internal clock, which is fully controllable in
user-space. That makes it possible to run hours of tests in
milliseconds of real time.
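
The virtual time idea can be sketched as follows. This is a Python
illustration and not the fakeev API: `FakeLoop`, `timer_start` and
`advance` are hypothetical names chosen for the sketch.

```python
import heapq
import itertools


class FakeLoop:
    """Event loop whose clock moves only when the test says so."""

    def __init__(self):
        self.now = 0.0
        self._timers = []               # heap of (deadline, seq, callback)
        self._seq = itertools.count()   # tie-breaker for equal deadlines

    def timer_start(self, delay: float, callback) -> None:
        heapq.heappush(self._timers,
                       (self.now + delay, next(self._seq), callback))

    def advance(self, seconds: float) -> None:
        """Move the virtual clock forward, firing all timers that are due."""
        deadline = self.now + seconds
        while self._timers and self._timers[0][0] <= deadline:
            when, _, callback = heapq.heappop(self._timers)
            self.now = when             # callbacks observe their own deadline
            callback()
        self.now = deadline
```

A test can start a one-hour timer and call `advance(3600)`: the callback
runs immediately in real time, while the virtual clock reports that an
hour has passed.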

Fakeev is used in SWIM tests, and will be used in Raft tests.

Part of #5303

ac48b2ad

test: factor out swim from fakeev.h/.c files · 5cf45266

Vladislav Shpilevoy authored 4 years ago

SWIM unit tests contain a special library for emulating the event
loop: swim_test_ev. It provides API similar to libev, but
implemented entirely in user-space, including clock functions.

The latter is the most important point, as the original libev
does not allow defining your own timing functions - internally it
relies on select/kqueue/epoll/poll/... with true clock.

Because of that it is impossible to perform long tests with the
original libev, which could last for minutes or even tens of
seconds if their count is big. swim_test_ev uses virtual time,
where hours can be played in milliseconds.

--

This commit extracts all swim code to swim_test_ev.c. Now this
file is nothing but an implementation of swim_ev.h on top of
fakeev API.

Fakeev, in turn, does not depend on SWIM anymore, and can be moved
to fakesys library.

Part of #5303

5cf45266

test: move fake libev code to fakeev.c/.h files · 6d9ac21c

Vladislav Shpilevoy authored 4 years ago

SWIM unit tests contain a special library for emulating the event
loop: swim_test_ev. It provides API similar to libev, but
implemented entirely in user-space, including clock functions.

The latter is the most important point, as the original libev
does not allow defining your own timing functions - internally it
relies on select/kqueue/epoll/poll/... with true clock.

Because of that it is impossible to perform long tests with the
original libev, which could last for minutes or even tens of
seconds if their count is big. swim_test_ev uses virtual time,
where hours can be played in milliseconds.

The fake libev is going to be re-used for Raft unit tests. But for
that it is necessary to detach it from all SWIM dependencies.

--

The patch renames swim_test_ev.c/.h to fakeev.c/.h because they
will soon contain only fakeev functions.

The swim methods implementing swim_ev.h via fakeev are moved to
their own file in a separate commit, because that file will be
named swim_test_ev.c. If they were moved here, git would treat
it as if everything *except* the swim functions had been moved
to fakeev.h/.c. That would ruin git history, so the change is
split into 2 commits to avoid this.

Part of #5303

6d9ac21c

test: rename fake libev methods to fakeev · ee09dc7b

Vladislav Shpilevoy authored 4 years ago

SWIM unit tests contain a special library for emulating the event
loop: swim_test_ev. It provides an API similar to libev, but
implemented entirely in user-space, including clock functions.

The latter is the most important point, as the original libev
does not allow defining your own timing functions: internally it
relies on select/kqueue/epoll/poll/... with the real clock.

Because of that it is not feasible to run long tests with the
original libev, that is, tests lasting minutes, or even tens of
seconds when there are many of them. swim_test_ev uses virtual
time, where hours can be played in milliseconds.

The fake libev is going to be re-used for Raft unit tests. But for
that it is necessary to detach it from all SWIM dependencies.

--

This commit makes all swim_test_ev functions have the 'fakeev'
prefix instead of 'swim'. The functions implementing the swim_ev.h
API are kept as one-line proxies to the fakeev functions.
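
The proxy layering can be sketched like this. The function names
and signatures below are illustrative guesses built from the
prefixes named above, not copied from the tree, and struct
demo_timer is a hypothetical stand-in for the libev timer type.

```c
#include <assert.h>

/* Hypothetical stand-in for a libev timer, reduced to one flag. */
struct demo_timer {
	int is_active;
};

/*
 * Generic fake event loop functions with no knowledge of SWIM.
 * Names and signatures are assumptions made for this sketch.
 */
static void
fakeev_timer_start(struct demo_timer *timer)
{
	timer->is_active = 1;
}

static void
fakeev_timer_stop(struct demo_timer *timer)
{
	timer->is_active = 0;
}

/* SWIM-facing functions: each one only forwards the call. */
static void
swim_ev_timer_start(struct demo_timer *timer)
{
	fakeev_timer_start(timer);
}

static void
swim_ev_timer_stop(struct demo_timer *timer)
{
	fakeev_timer_stop(timer);
}

/* Returns 1 if start and stop went through the proxies. */
static int
proxy_demo(void)
{
	struct demo_timer timer = { .is_active = 0 };
	swim_ev_timer_start(&timer);
	int was_active = timer.is_active;
	assert(was_active);
	swim_ev_timer_stop(&timer);
	return was_active && !timer.is_active;
}
```

Keeping the proxies one line long means the SWIM-specific file
carries no logic of its own, so the generic part can later be
moved out without touching the callers.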

Part of #5303

ee09dc7b

fakesys: introduce fake system library · 4321b64f

Vladislav Shpilevoy authored 4 years ago

Fakesys is going to be a collection of fake implementations of
low-level system facilities such as libev and libc.

The fake subsystems will provide an API just like their original
counterparts (except for function names), but with full control of
their behaviour in user-space for the sake of unit testing.

This commit introduces the first part of fakesys, fakenet: a
subset of the libc network API: sendto(), recvfrom(), bind(),
close(), getifaddrs().

Main features of fakenet are:

- Integration with the event loop via fakenet_loop_update().
  This could also be considered an issue if it ever becomes
  necessary to implement a fake epoll, or sockets not bound to
  any event loop;

- Filters to decide which packets to drop depending on their src,
  dst, and content;

- Socket blocking to suspend packet delivery until the socket is
  unblocked.
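
The filter and block features listed above can be sketched as
follows. This is a simplified illustration, not the fakenet
implementation: fpacket, fsocket, fsocket_deliver(),
fsocket_unblock() and the other names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the fakenet API. */
struct fpacket {
	int src;
	int dst;
	const char *data;
};

/* A filter returns true if the packet must be dropped. */
typedef bool (*ffilter_f)(const struct fpacket *packet, void *arg);

enum { FQUEUE_MAX = 8 };

struct fsocket {
	bool is_blocked;
	ffilter_f filter;
	void *filter_arg;
	/* Packets held back while the socket is blocked. */
	struct fpacket queue[FQUEUE_MAX];
	size_t queued;
	size_t delivered;
};

/* Returns false if the packet was dropped, true otherwise. */
static bool
fsocket_deliver(struct fsocket *sock, const struct fpacket *packet)
{
	if (sock->filter != NULL && sock->filter(packet, sock->filter_arg))
		return false;
	if (sock->is_blocked) {
		if (sock->queued == FQUEUE_MAX)
			return false;
		sock->queue[sock->queued++] = *packet;
		return true;
	}
	sock->delivered++;
	return true;
}

/* Unblock the socket and deliver everything that was held back. */
static void
fsocket_unblock(struct fsocket *sock)
{
	sock->is_blocked = false;
	sock->delivered += sock->queued;
	sock->queued = 0;
}

/* Example filter: drop packets coming from one source. */
static bool
drop_from_src(const struct fpacket *packet, void *arg)
{
	return packet->src == *(const int *)arg;
}

/* Returns the number of packets delivered after unblocking. */
static size_t
fsocket_demo(void)
{
	int bad_src = 2;
	struct fsocket sock = {
		.is_blocked = true,
		.filter = drop_from_src,
		.filter_arg = &bad_src,
	};
	struct fpacket a = { 1, 9, "ping" };
	struct fpacket b = { 2, 9, "ping" };
	/* Held back: the socket is blocked. */
	bool a_ok = fsocket_deliver(&sock, &a);
	/* Dropped by the filter. */
	bool b_ok = fsocket_deliver(&sock, &b);
	assert(a_ok && !b_ok);
	assert(sock.delivered == 0 && sock.queued == 1);
	fsocket_unblock(&sock);
	return sock.delivered;
}
```

In this sketch the filter runs before the block check, so a
dropped packet never reaches the queue. The ordering in the
commit's actual code is not stated in the message and may differ.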

Fakenet implements a connectionless API for UDP sockets. This is
exactly what is needed in SWIM.

The Raft fake transport will need reliable sockets with a
broadcast API. Reliability can be ensured by setting the drop rate
to 0 (which is the default). Broadcast functionality is already
present: there is a broadcast interface in the
fakenet_getifaddrs() result.

Part of #5303

4321b64f

test: factor out swim from fakenet.c/.h files · e463b454

Vladislav Shpilevoy authored 4 years ago

SWIM unit tests contain special libraries for emulating the event
loop and the network: swim_test_ev and swim_test_transport. They
provide APIs similar to libev and to the network part of libc,
which are implemented entirely in user-space and allow simulating
all kinds of errors, any time durations, etc.

These test libraries are going to be re-used for Raft unit tests.
But for that it is necessary to detach them from all SWIM
dependencies.

--

This commit extracts all swim code to swim_test_transport.c. Now
this file is nothing but an implementation of swim_transport.h on
top of the fakenet API.

Fakenet, in turn, does not depend on SWIM anymore, and can be
moved to its own library.

Part of #5303

e463b454