Commits · bd44ab518c64422182258a6085656827df692c5c · core / tarantool

Jan 25, 2022

raft: fix ev_timer.at incorrect usage · bd44ab51

Vladislav Shpilevoy authored 3 years ago

ev_timer.at was used as a timeout. But after ev_timer_start() it
turns into the deadline, a totally different value.

The patch makes sure ev_timer.at is not used in raft at all.

To test that, the fakeev subsystem is patched to start its time not
from 0. Otherwise ev_timer.at often matched the timeout even for an
active timer.
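
The pitfall can be shown with a toy model. This is only a sketch under stated assumptions: `toy_timer`, `toy_raft` and all functions below are hypothetical and are not the libev or raft API. They only mimic the behaviour described above, where the `at` field holds a timeout before start and a deadline after it.

```c
#include <stdbool.h>

/* Toy model of a libev-style timer. NOT the libev API. */
struct toy_timer {
    double at;   /* timeout before start, absolute deadline after start */
    bool active;
};

/* Hypothetical owner that remembers the timeout on its own. */
struct toy_raft {
    struct toy_timer timer;
    double election_timeout;   /* the only source of truth for the timeout */
};

/* Starting the timer silently turns 'at' into a deadline. */
static void
toy_timer_start(struct toy_timer *t, double now)
{
    t->at += now;
    t->active = true;
}

static void
toy_raft_start_timer(struct toy_raft *r, double now)
{
    r->timer.at = r->election_timeout;
    toy_timer_start(&r->timer, now);
}

/* Read the timeout from the owner, never from timer.at. */
static double
toy_raft_timeout(const struct toy_raft *r)
{
    return r->election_timeout;
}
```

With a clock that starts at 0 the deadline equals the timeout, which is why the bug could stay hidden until the fake clock was moved away from 0.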

(cherry picked from commit e51c61ae)


raft: fix crash on election_timeout reconfig · b428b2de

Vladislav Shpilevoy authored 3 years ago

It used to crash if done during an election on a node that has voted
for anybody, is a candidate, doesn't know a leader yet, but has
a WAL write in progress.

Thus it could only happen if the term was bumped by a message from
a non-leader node and wasn't flushed to the disk yet.

The patch makes the reconfig check whether there is a WAL write in
progress and, if so, do nothing.

It could also check for a volatile vote instead of a persistent one, but
that would create the same problem for the case when the node started
writing a vote for itself and hasn't finished yet. Reconfig would crash.

(cherry picked from commit 82757e55)


Jan 19, 2022

test: bump new test-run version · 7d35cd2b

Kirill Yukhin authored 3 years ago

* Enable parallel invocation of fragile tests

(cherry picked from commit 91d8fef5)


Jan 14, 2022

test: fix flaky vinyl/gh-4810-dump-during-index-build test · eed93fbb

Vladimir Davydov authored 3 years ago

The commit fixes the following test failure:

```
[082] vinyl/gh-4810-dump-during-index-build.test.lua                  Test timeout of 310 secs reached	[ fail ]
[082]
[082] Test failed! Result content mismatch:
[082] --- vinyl/gh-4810-dump-during-index-build.result	Thu Dec  9 05:31:17 2021
[082] +++ /build/usr/src/debug/tarantool-2.10.0~beta1.324.dev/test/var/rejects/vinyl/gh-4810-dump-during-index-build.reject	Thu Dec  9 06:51:03 2021
[082] @@ -117,34 +117,3 @@
[082]  for i = 1, ch:size() do
[082]      ch:get()
[082]  end;
[082] - | ---
[082] - | ...
[082] -
...
```

The test hangs waiting for the test fibers to exit. There are two test
fibers - one builds an index, another populates the test space. The
latter uses pcall so it always returns. The one that builds an index,
however, doesn't. The problem is index build may fail because it builds
a unique index while the fiber populating the space may insert
non-unique values. Fix this by building a non-unique index instead,
which should never fail. To reproduce the issue the test checks for,
one can build any index, unique or non-unique, so this should be fine.

Closes #5508

(cherry picked from commit 5cd399b7)


test: fix flaky vinyl/gh test failure · 698c3c7a

Vladimir Davydov authored 3 years ago

The commit fixes the following test failure:

```
[005] vinyl/gh.test.lua                                               [ fail ]
[005]
[005] Test failed! Result content mismatch:
[005] --- vinyl/gh.result	Mon Dec 13 15:03:45 2021
[005] +++ /root/actions-runner/_work/tarantool/tarantool/test/var/rejects/vinyl/gh.reject	Fri Dec 17 10:41:24 2021
[005] @@ -716,7 +716,7 @@
[005]  ...
[005]  test_run:wait_cond(function() return finished == 2 end)
[005]  ---
[005] -- true
[005] +- false
[005]  ...
[005]  s:drop()
[005]  ---
```

The reason for the failure is that the fiber doing checkpoints fails,
because a checkpoint may already be running in the checkpoint daemon.
Invoke box.snapshot() under pcall to make the test more robust.

Part of #5141

(cherry picked from commit cc6c328d)


test: fix flaky vinyl/deferred_delete test · 93f96a58

Vladimir Davydov authored 3 years ago

The commit fixes the following test failure:

```
[019] vinyl/deferred_delete.test.lua                                  [ fail ]
[019]
[019] Test failed! Result content mismatch:
[019] --- vinyl/deferred_delete.result	Tue Jan 11 11:10:22 2022
[019] +++ /build/usr/src/debug/tarantool-2.10.0~beta2.37.dev/test/var/rejects/vinyl/deferred_delete.reject	Fri Jan 14 11:45:26 2022
[019] @@ -964,7 +964,7 @@
[019]  ...
[019]  sk:stat().disk.dump.count -- 1
[019]  ---
[019] -- 1
[019] +- 0
[019]  ...
[019]  sk:stat().rows - dummy_rows -- 120 old REPLACEs + 120 new REPLACEs + 120 deferred DELETEs
[019]  ---
```

The test checks that compaction of a primary index triggers dump of
secondary indexes of the same space, because it generates deferred
DELETE statements. There's no guarantee that by the time compaction
completes, the secondary index dump has completed as well, because
compaction may ignore the memory quota (it uses vy_quota_force_use in
vy_deferred_delete_on_replace). Make the check more robust by using
wait_cond.
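
The idea behind waiting on a condition instead of checking it once can be sketched in C. This is an illustration only: test-run's `wait_cond` is not shown here, and `wait_cond_sketch`, `third_poll` and `never_true` are hypothetical names. The sketch polls a predicate until it holds or a timeout expires, sleeping between polls rather than spinning.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

typedef bool (*cond_fn)(void *arg);

static double
monotonic_now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Poll 'cond' every 'delay' seconds until it returns true or
 * 'timeout' seconds pass. Returns the last observed result.
 */
static bool
wait_cond_sketch(cond_fn cond, void *arg, double timeout, double delay)
{
    double deadline = monotonic_now() + timeout;
    for (;;) {
        if (cond(arg))
            return true;
        if (monotonic_now() >= deadline)
            return false;
        struct timespec pause;
        pause.tv_sec = (time_t)delay;
        pause.tv_nsec = (long)((delay - (double)pause.tv_sec) * 1e9);
        nanosleep(&pause, NULL);
    }
}

/* Example condition: becomes true on the third poll. */
static bool
third_poll(void *arg)
{
    int *polls = arg;
    return ++*polls >= 3;
}

/* Example condition that never holds, to exercise the timeout path. */
static bool
never_true(void *arg)
{
    (void)arg;
    return false;
}
```

A check such as "dump count equals 1" then becomes a predicate passed to the waiter, so a dump that finishes slightly later no longer fails the test.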

Follow-up #5089

(cherry picked from commit 7f8c549b)


test: use wait_cond in vinyl/deferred_delete test · a930f8a0

Vladimir Davydov authored 3 years ago

It's better than hand-written busy-wait.

(cherry picked from commit 8c913a10)

test: fix flaky vinyl/gc test · 191cf6e9

Vladimir Davydov authored 3 years ago

The commit fixes the following test failure:

```
[013] vinyl/gc.test.lua                                               [ fail ]
[013]
[013] Test failed! Result content mismatch:
[013] --- vinyl/gc.result	Fri Dec 24 12:27:33 2021
[013] +++ /build/usr/src/debug/tarantool-2.10.0~beta2.18.dev/test/var/rejects/vinyl/gc.reject	Thu Dec 30 10:29:29 2021
[013] @@ -102,7 +102,7 @@
[013]  ...
[013]  check_files_number(2)
[013]  ---
[013] -- true
[013] +- null
[013]  ...
[013]  -- All records should have been purged from the log by now
[013]  -- so we should only keep the previous log file.
```

The reason for the failure is that vylog files are deleted asynchronously
(`box.snapshot()` doesn't wait for `unlink` to complete) since commit
8e429f4b ("wal: remove old xlog files asynchronously"). So to fix the
test, we just need to make it wait for garbage collection to complete.

Follow-up #5383

(cherry picked from commit cd9fd77e)


Jan 13, 2022

gen-release-notes: group sections case insensitively · a13bde1d

Alexander Turenko authored 3 years ago

Now `## bugfix/luajit` and `## bugfix/LuaJIT` entries will form one
section.
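
The grouping rule can be illustrated with a small C sketch. It is not the script's code: `count_sections` is a hypothetical helper, and the comparison relies on the POSIX `strcasecmp` function. Headers that differ only in letter case are counted as one section.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/*
 * Count distinct sections among 'n' headers, treating headers that
 * differ only in letter case as the same section.
 */
static size_t
count_sections(const char *const *headers, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j;
        /* Look for an earlier header naming the same section. */
        for (j = 0; j < i; j++) {
            if (strcasecmp(headers[i], headers[j]) == 0)
                break;
        }
        if (j == i)
            count++;
    }
    return count;
}
```

For the headers `## bugfix/luajit`, `## bugfix/LuaJIT` and `## feature/core` this yields two sections.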

(cherry picked from commit 50c83f74)


gen-release-notes: prettify only lowercased headers · 7ebb23ae

Alexander Turenko authored 3 years ago

Now one can write either `## bugfix/luajit` or `## bugfix/LuaJIT`. The
latter will NOT be transformed into `### Luajit` anymore. Both variants
now give `### LuaJIT` section inside `## Bugs fixed` section.

See `REDEFINITIONS` variable inside the script to understand how well
known headers (such as `## <...>/LuaJIT`) are prettified.

There is a problem with section grouping, when headers are written in
lower/title/mixed case. It'll be resolved in the next commit.

(cherry picked from commit df5d69e6)


gen-release-notes: handle no changelog entries · 44504217

Alexander Turenko authored 3 years ago

We run the script in CI as a linter (see PR #6701), so it should handle
the lack of any changelog entries gracefully.

(cherry picked from commit b9c022f2)


gen-release-notes: handle no features/no bugfixes · d062327b

Alexander Turenko authored 3 years ago

Previously the script mistakenly required at least one feature and at
least one bugfix. However, it is quite natural to have only bugfixes in a
bugfix release. Moreover, we added the script to CI as a linter (see
PR #6701), so it should work even when we just start filling in release
notes.

(cherry picked from commit 8973b968)


gen-release-notes: keep changelogs/unreleased dir · 83b99231

Alexander Turenko authored 3 years ago

Git does not store empty directories, but it is convenient for us to
have this directory always.

(cherry picked from commit 06cd9373)


Jan 12, 2022

ci: mark 'unicode_de__phonebook_s3' as unstable · eded8278

Yaroslav Lobankov authored 3 years ago

The test for the 'unicode_de__phonebook_s3' collation from
sql-tap/collation_unicode.test.lua fails if the ICU version is >= 70.1.
So let's temporarily mark it as unstable until the issue is resolved.

See tarantool/tarantool#6695 for more details.


Dec 30, 2021

test: fix dynamic modules loading on macOS · 60e751a6

Sergey Kaplun authored 3 years ago

Since the auxiliary libraries are built as dynamically loaded modules on
macOS instead of shared libraries, as is done on Linux and BSD,
another environment variable should be used to guide `ffi.load()` while
searching for the extension. Hence the paths set in the test need to be
put into the `DYLD_LIBRARY_PATH` variable instead of `LD_LIBRARY_PATH` on macOS.
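
A minimal C sketch of the platform switch follows. It is not the actual change to the test setup: `library_path_var` and `prepend_library_path` are hypothetical helpers that pick the variable name at compile time and prepend a directory to it with the POSIX `setenv` call.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Name of the variable the dynamic loader consults on this platform. */
static const char *
library_path_var(void)
{
#if defined(__APPLE__)
    return "DYLD_LIBRARY_PATH";
#else
    return "LD_LIBRARY_PATH";
#endif
}

/* Prepend 'dir' to the loader search path. Returns 0 on success. */
static int
prepend_library_path(const char *dir)
{
    const char *name = library_path_var();
    const char *old = getenv(name);
    if (old == NULL || old[0] == '\0')
        return setenv(name, dir, 1);

    /* dir + ':' + old + terminating zero */
    size_t len = strlen(dir) + 1 + strlen(old) + 1;
    char *joined = malloc(len);
    if (joined == NULL)
        return -1;
    snprintf(joined, len, "%s:%s", dir, old);
    int rc = setenv(name, joined, 1);
    free(joined);
    return rc;
}
```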

(cherry picked from commit ee94aa69)


github-ci: introduce macOS ver. 12 testing on amd64 · 7c41a903

Kirill Yukhin authored 3 years ago

The new workflow was copy-pasted from the macOS 11 workflow
and patched in the obvious way.

NB: it seems like app-tap/popen test is constantly failing
on macOS ver. 12. Disable until it is fixed.

Part of #6739

(cherry picked from commit d011dffd)


github-ci: remove strange digit from OSX 11 workflow · d36d8a55

Kirill Yukhin authored 3 years ago

I have no idea what this trailing zero stands for.
Drop it.

(cherry picked from commit b3c1b09d)

github-ci: remove macOS ver. 10.15 · 20210007

Kirill Yukhin authored 3 years ago

We support the current and previous versions of macOS.
So let's remove 10.15 from CI.

(cherry picked from commit 28026722)


github-ci: move LTO testing on macOS to version 11 · fc36ae7b

Kirill Yukhin authored 3 years ago

We're about to stop CI for macOS version 10.15. But
it seems (for now) we are still interested in LTO on macOS.
So, move LTO testing to macOS version 11.

Part of #6739

(cherry picked from commit eb8e7218)


Dec 29, 2021

ci: add integration check for memcached module · 1e17b2eb

Yaroslav Lobankov authored 3 years ago

This patch extends the 'integration.yml' workflow and adds a new
workflow call for running tests to verify integration between tarantool
and the memcached module.

Part of #5265
Part of #6056
Closes #6563

(cherry picked from commit e493b777)


Dec 27, 2021

recovery: panic in case of recovery and replicaset vclock mismatch · f9e26802

Serge Petrenko authored 3 years ago

We assume that no one touches the instance's WALs, once it has taken the
wal_dir_lock. This is not the case when upgrading from an old setup
(running tarantool 1.7.3-6 or less). Such nodes either take a lock on
snap dir, which may be different from wal dir, or don't take the lock at
all.

So, it's possible that during upgrade an old node is not stopped
properly before a new node is started in the same data directory.

The old node might even write some extra data to WAL during new node's
startup.

This is obviously bad and leads to multiple issues. For example, new node
might start local recovery, scan the WALs and set replicaset.vclock to
some value {1 : 5}. While the node recovers the WALs, the old node
appends to them up to vclock {1 : 10}.
The node finishes local recovery with replicaset vclock {1 : 5}, but
data recovered up to vclock {1 : 10}.

The node will use the now outdated replicaset vclock to subscribe to
remote peers (leading to replication breaking due to duplicate keys
found), to initialize WAL (leading to new xlogs appearing with duplicate
LSNs). There might be a number of other issues we just haven't stumbled
upon.

Let's prevent situations like that and panic as soon as we see that the
initially scanned vclock (replicaset vclock) differs from actually
recovered vclock.
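
The check can be sketched as follows. This is a simplified model, not the recovery code: `toy_vclock`, `toy_vclock_equal` and `toy_check_recovery` are hypothetical, the component count is an arbitrary assumption, and the sketch returns an error code where the real node would panic.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum { TOY_VCLOCK_MAX = 32 };   /* assumed number of components */

/* Toy vector clock: one LSN per replica id. */
struct toy_vclock {
    int64_t lsn[TOY_VCLOCK_MAX];
};

static bool
toy_vclock_equal(const struct toy_vclock *a, const struct toy_vclock *b)
{
    for (int i = 0; i < TOY_VCLOCK_MAX; i++) {
        if (a->lsn[i] != b->lsn[i])
            return false;
    }
    return true;
}

/*
 * Compare the vclock found by the initial WAL scan with the one
 * reached after replaying the WALs. Returns 0 when they match and
 * -1 otherwise; the caller is expected to stop the instance then.
 */
static int
toy_check_recovery(const struct toy_vclock *scanned,
                   const struct toy_vclock *recovered)
{
    if (toy_vclock_equal(scanned, recovered))
        return 0;
    fprintf(stderr, "scanned and recovered vclocks differ: "
            "the WALs were modified during recovery\n");
    return -1;
}
```

In the scenario above, the scan yields {1 : 5} and the replay ends at {1 : 10}, so the check fails instead of letting the node continue with a stale vclock.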

Closes #6709

(cherry picked from commit 634f59c7)


Dec 24, 2021

github-ci: Remove Ubuntu 20.10 · 607740a4

Kirill Yukhin authored 3 years ago

Ubuntu 20.10 reached end-of-life a few months ago,
remove it from CI/CD.

(cherry picked from commit 19a161b5)


Dec 23, 2021

iproto: clear request::header for client requests · 3a9f8899

Vladimir Davydov authored 3 years ago

To apply a client request, we only need to know its type and body. All
the meta information, such as LSN, TSN, or replica id, must be set by
WAL. Currently, however, it isn't necessarily true: iproto leaves a
request header received over iproto as is, and tx will reuse the header
instead of allocating a new one in this case, which is needed to process
replication requests, see txn_add_redo().

Unless a client actually sets one of those meta fields, this causes no
problems. However, if we added transaction support to the replication
protocol, reusing the header would result in broken xlog, because
currently, all requests received over iproto have the is_commit field
set in xrow_header for the lack of TSN, while is_commit must only be set
for the final statement in a transaction. One way to fix it would be
clearing is_commit explicitly in iproto, but ignoring the whole header
received over iproto looks more logical and error-proof.
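
One way to picture "ignoring the header" is the sketch below. It is an assumption-laden illustration, not the patch: `toy_header` and `toy_clear_client_header` are hypothetical and only loosely mirror the fields named above. Only the request type survives; every meta field a client may have set is dropped so that WAL can assign it.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy request header with the meta fields mentioned in the text. */
struct toy_header {
    uint32_t type;
    uint32_t replica_id;
    int64_t lsn;
    int64_t tsn;
    bool is_commit;
};

/*
 * Keep only the request type of a client request and reset the
 * rest, so nothing sent by the client leaks into the log.
 */
static void
toy_clear_client_header(struct toy_header *h)
{
    uint32_t type = h->type;
    memset(h, 0, sizeof(*h));
    h->type = type;
}
```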

Needed for #5860

(cherry picked from commit 4fefb519)


Dec 22, 2021

Fix typo in changelogs/2.8.3.md · 01023dbc

Kirill Yukhin authored 3 years ago

Tag: 2.8.3

Generate changelog for 2.8.3 · 0d76f62a

Kirill Yukhin authored 3 years ago

ci: add linter job for changelog entries · 09b6408e

Alexander Turenko authored 3 years ago

What is bad: a considerable amount of boilerplate code should be added
to just run a simple script. I hope we'll do something with this
in #6604.

(cherry picked from commit 8b1ce351)


ci: rename luacheck.yml to lint.yml · ca003721

Alexander Turenko authored 3 years ago

The idea is to allow adding more checks here: I'm going to add
a changelog check in the next commit.

(cherry picked from commit f0980af8)


ci: drop useless chown call · 53bb2a09

Alexander Turenko authored 3 years ago

AFAIK we run self-hosted runners as root anyway due to problems
like [1]. No reason to include the hack into every workflow
file. If we are going to start runners from a non-root user, it is better
to pass `--user $PROPER_UID` to all docker jobs instead, see [2]: at
least it does not require placing boilerplate into all workflow
files.

Didn't touch perf_* jobs: don't want to dig inside this part of the
infrastructure ATM.

It reverts PR #5953.

[1]: https://github.com/actions/checkout/issues/211
[2]: https://github.com/actions/runner/issues/691

(cherry picked from commit a0a2e8b9)


ci: drop useless fail-fast clause · 1b62dc6b

Alexander Turenko authored 3 years ago

AFAIU it only has meaning for jobs constructed with matrix expansion.
See the documentation: [1].

It is relevant to [2] as well, but already fixed on nektos/act side.

[1]: https://docs.github.com/en/actions/learn-github-actions/workflow-syntax-for-github-actions#jobsjob_idstrategyfail-fast
[2]: https://github.com/tarantool/tarantool-qa/issues/118

(cherry picked from commit 474d1cc2)


ci: run full ci only on PRs with 'full-ci' label · 5beb9b29

Vladimir Davydov authored 3 years ago

After this commit only three workflows are run on pull request or push to
a developer branch:
 - luacheck
 - release
 - debug_coverage

To run all other tests, one should either name the branch `*-full-ci`
and push it to the main repository or set the 'full-ci' label on the
pull request.

It is also possible to disable all tests on push by naming the branch
`*-notest` or setting the 'notest' label on the pull request.

**Caveats**:
 - Unfortunately, currently it doesn't seem to be possible to run
   workflows automatically when a particular label is set - the best we
   can do is run workflows when *any* label is set. So labeling a PR
   that has the 'full-ci' label set will trigger all workflows!
 - For the same reason, removing the 'notest' label doesn't trigger ci.
   One has to synchronize the PR afterwards. We could trigger ci on the
   'unlabel' event, but this would trigger tests when any label is
   removed, not necessarily 'notest'. Since 'notest' is supposed to be
   used only by developers, who can sync the branch, this should be
   acceptable.

While we are at it:
 - Remove the check disabling certain workflow runs on forks - it's
   pointless, because forks don't have ci. Anyway, we don't bother
   disabling most of our workflows on forks, even those that we run on
   self-hosted machines, so that would only be consistent.
 - Remove the condition from the coverity workflow - coverity doesn't
   run on push or PR so it doesn't make any sense.
 - Remove the condition from the 'source' workflow. Instead trigger it
   only when a tag is pushed. This is needed to avoid showing it as a
   skipped workflow in PRs and commits.

Closes #6605

(cherry picked from commit 6c2e664e)


github-ci: don't deploy tarballs per push · ca8bf66e

Alexander Turenko authored 3 years ago

Follows up #6185

(cherry picked from commit f820a3a8)

ci: cancel outdated workflow runs · e6552caa

Yaroslav Lobankov authored 3 years ago

Because developers push a huge number of commits, we should
cancel all outdated workflow runs (previously scheduled and no longer
relevant due to new changes) to make CI more efficient. GitHub Actions
provides the 'concurrency' feature [1] for that, and this patch starts
using it.

How does it work?

Basically, an update of a developer branch cancels the previously
scheduled workflow run for this branch. However, the 'master' branch,
release branch (1.10, 2.8, etc.), and tag workflow runs are never
canceled.

[1] https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#concurrency

Closes tarantool/tarantool-qa#100

(cherry picked from commit d3f32d18)


Dec 21, 2021

build: fix build with glibc-2.34 · bba7a2fa

Andrey Saranchin authored 3 years ago

The SIGSTKSZ macro used to be an integral constant, but
in glibc-2.34 it turned into a runtime function call, so it
can no longer be used as a compile-time constant array size.

Beyond this, SIGSTKSZ is not enough for the alternate signal stack
when ASAN is used, so the size was increased.
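
A portable alternative to a fixed-size array is to size the alternate stack at run time. The sketch below is one possible approach and not necessarily what the patch does: `install_alt_stack` is a hypothetical helper, and the minimum size is an arbitrary parameter chosen by the caller. It uses only the POSIX `sigaltstack` call and `malloc`.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/*
 * Install an alternate signal stack of at least 'min_size' bytes.
 * SIGSTKSZ is evaluated at run time, so it does not matter whether
 * it expands to a constant or to a function call.
 * The buffer is intentionally never freed: it must stay valid for
 * as long as the alternate stack is installed.
 * Returns 0 on success, -1 on failure.
 */
static int
install_alt_stack(size_t min_size)
{
    size_t size = (size_t)SIGSTKSZ;
    if (size < min_size)
        size = min_size;

    void *mem = malloc(size);
    if (mem == NULL)
        return -1;

    stack_t ss;
    memset(&ss, 0, sizeof(ss));
    ss.ss_sp = mem;
    ss.ss_size = size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0) {
        free(mem);
        return -1;
    }
    return 0;
}
```

Passing a larger `min_size` leaves headroom for instrumented builds, in line with the note about ASAN above.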

Closes #6686

(cherry picked from commit 9c01b325)


Move xmalloc to trivia/util.h · 8662fb74

Vladimir Davydov authored 3 years ago

We want to use the xmalloc helper throughout the code, not only in
the core lib. Move its definition to trivia/util.h and use fprintf+exit
instead of say/panic in order not to create circular dependencies.
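
A minimal sketch of such helpers, assuming nothing beyond the C standard library: the names `xmalloc_sketch` and `xstrdup_sketch` and the error message are illustrative and are not copied from trivia/util.h. The point is that failure is reported with fprintf and exit, so the helper depends on nothing else in the code base.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate 'size' bytes or terminate the process.
 * A zero size is bumped to 1 so that a NULL result always means
 * a real allocation failure.
 */
static void *
xmalloc_sketch(size_t size)
{
    void *ptr = malloc(size != 0 ? size : 1);
    if (ptr == NULL) {
        fprintf(stderr, "Can't allocate %zu bytes\n", size);
        exit(EXIT_FAILURE);
    }
    return ptr;
}

/* Duplicate a string using the never-failing allocator. */
static char *
xstrdup_sketch(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = xmalloc_sketch(len);
    memcpy(copy, s, len);
    return copy;
}
```

Callers can then use the result without a NULL check.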

(cherry picked from commit f3b5ad97)


core: add x* memory allocation functions · badf030e

Vladimir Davydov authored 3 years ago

This patch adds xmalloc, xcalloc, xrealloc, xstrdup, and xstrndup helper
functions. Each of them calls the corresponding memory allocation
function and panics if it fails. See the issue description for the full
justification.

Closes #3534

(cherry picked from commit 60dc88ea)


Dec 13, 2021

lib: use memcpy for unaligned load/store · 69fc3de6

Aleksandr Lyapunov authored 3 years ago

There are now a lot of load_*/store_* functions that are designed
for unaligned access to values in a data stream. Unfortunately,
they are written in a way that makes new compilers warn about
out-of-bounds access.

Rewrite them in the safest, most cross-platform and efficient way -
using memcpy. Note that memcpy with a compile-time-defined size is
recognized by the majority of compilers, which turn it into the most
efficient operation.
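
The technique looks roughly like this. The function names are hypothetical and only the 32-bit case is shown; the sketch copies the bytes through memcpy instead of dereferencing a casted pointer, which is valid for any alignment.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly unaligned address. */
static inline uint32_t
load_u32_sketch(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Write a 32-bit value to a possibly unaligned address. */
static inline void
store_u32_sketch(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof(v));
}
```

For example, storing at `buf + 1` of a byte buffer and loading from the same address returns the stored value without touching the neighbouring bytes.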

Closes #6703

(cherry picked from commit 7b40dcce)


debian: eliminate dependency from binutils package · a3239425

Andrey Kulikov authored 3 years ago

The tarantool debian package MUST NOT depend on the binutils package. This is
because binutils includes a linker and an assembler, which in most
cases are forbidden on production servers.

This dependency is a leftover from the times when tarantool used libbfd
for stack unwinding. Now it is simply not required at all.

Fixes #6699

(cherry picked from commit a86f5963)


readme: update build instructions for FreeBSD · 9a364ecc

Pavel Balaev authored 3 years ago

libiconv must be installed before building tarantool. Otherwise the iconv
test will fail.
More info: https://github.com/tarantool/tarantool/issues/3791

python-daemon is not used anymore:
https://github.com/tarantool/test-run/commit/8797cb15bf34f8c70519221b3b9c78f08696aba6
https://github.com/tarantool/tarantool/commit/130cf4659bf938d1fa1bf0e936b0a43a53fa8dcb



python-msgpack is enabled in test-run as a submodule, while
python-gevent and python-yaml are actually used in test-run.

Signed-off-by: Pavel Balaev <balaev@tarantool.org>
(cherry picked from commit 30cba22e)


ci: fix python path for FreeBSD build target · f56db30b

Pavel Balaev authored 3 years ago


Python3 symlink is missing after installing python3X from pkg on FreeBSD
12.2 and 13. This breaks tests.

Signed-off-by: Pavel Balaev <balaev@tarantool.org>
(cherry picked from commit f3435836)


ci: use custom directory on FreeBSD tests · b08cff1b

Pavel Balaev authored 3 years ago


Running tests inside the FreeBSD GitHub runner sandbox directory fails with:
af_unix socket length exceed.

Signed-off-by: Pavel Balaev <balaev@tarantool.org>
(cherry picked from commit ecede20a)
