Commits · 8aa2c05e347a40097a31e1715e34ae74d994e365 · core / tarantool

Apr 04, 2022

ci: take sysbench out of docker · 8aa2c05e

VitaliyaIoffe authored 3 years ago

Previously, the sysbench benchmark ran inside Docker on CI.
According to performance testing best practices, this is a rather
doubtful approach. So let's switch to a more traditional practice:
run on hardware.

Closes tarantool/tarantool-qa#158

NO_DOC=ci
NO_TEST=ci
NO_CHANGELOG=ci

(cherry picked from commit 1b1ddd22)

8aa2c05e

Apr 01, 2022

replication: fix bootstrap failing with ER_READONLY · cfdc1d8c

Serge Petrenko authored 3 years ago

When the master is just starting up it's possible for replica's JOIN
request to arrive right in time to bypass ER_LOADING check (after master
is fully recovered) but still fail due to ER_READONLY: box.cfg.read_only
is only read and set after box_cfg() (its C part) returns.

In this case the joining replica simply exits with an error and doesn't
retry JOIN.

Let's fix that. Make ER_READONLY a recoverable error and let replica
retry joining after receiving ER_READONLY.

Anonymous nodes relied on ER_READONLY to forbid replication from
anonymous to normal replicas. That check doesn't work anymore.
So introduce explicit checks banning replication from anonymous nodes.

Note, there were some alternatives to this fix.

First of all, theoretically, we could stop firing ER_LOADING later,
after box_cfg() is complete. This solution wouldn't work because it
would lead to deadlocks: the nodes would be stuck in replicaset_sync(),
because each of them rejects replication with ER_LOADING.

Another solution would be to read the real box.cfg.read_only value
earlier, in order to allow replication right after the node finishes
recovery.
This would also be bad, because we should never let a node become
writeable before box.cfg is finished. Even after local_recovery is
complete, the node should stay read-only until it synchronizes with
other replicas.

That said, neither of the two alternatives fit, so the solution with
retrying JOIN on ER_READONLY was chosen.
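
The chosen behaviour can be illustrated with a minimal Python sketch. It
is not Tarantool's applier code: `BoxError`, `RECOVERABLE` and
`join_with_retry` are hypothetical names, and only the error names
ER_LOADING and ER_READONLY come from the message above.

```python
import time

# Hypothetical model: error codes the replica treats as "try again later".
RECOVERABLE = {"ER_LOADING", "ER_READONLY"}


class BoxError(Exception):
    """Illustrative stand-in for an error returned by the master."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def join_with_retry(send_join, max_attempts=5, delay=0.0):
    """Call send_join() until it succeeds or fails with a fatal error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send_join()
        except BoxError as err:
            # Fatal errors and the last attempt propagate to the caller.
            if err.code not in RECOVERABLE or attempt == max_attempts:
                raise
            time.sleep(delay)
```

The point of the sketch is only that ER_READONLY sits in the same
"retry" class as ER_LOADING, so a replica that arrives too early does
not exit on it.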

Since the bug is fixed, re-enable the test in which it was discovered:
replication-py/init_storage.test.py

Also, remove replication/ddl.test.lua from fragile list, since this bug
was the only reason for its fragility.

Closes #5337
Closes #6966

NO_DOC=minor bugfix

(cherry picked from commit c1c77782)

cfdc1d8c

Mar 30, 2022

lua: rewrite crc32 digest via Lua C API · bdc027a4

Igor Munkin authored 3 years ago

When the <crc32:update> method or the <digest.crc32> function is
recorded, wrong semantics are compiled (strictly speaking, the resulting
trace produces a different result from the one yielded by the
interpreter). The easiest solution is to disable JIT for these
particular functions; however, such an approach reduces overall platform
performance. Hence, the mentioned functions are rewritten line by line
via the Lua C API to avoid the JIT misbehaviour.

NO_DOC=no visible changes
NO_CHANGELOG=no visible changes

(cherry picked from commit 6b913198)

bdc027a4

test: disable init_storage.test.py due to bug · 045ca62f

Yaroslav Lobankov authored 3 years ago

Disable the init_storage.test.py test due to bug #6966.
Once the bug is fixed, the test should be enabled again.

NO_DOC=testing stuff
NO_CHANGELOG=testing stuff
NO_TEST=testing stuff

(cherry picked from commit cab6822f)

045ca62f

Mar 29, 2022

fiber: fix ignorance of flags for reused fibers · 15980d0d

Vladislav Shpilevoy authored 3 years ago

fiber_new_ex() used to ignore fiber_attr flags when the fiber was
taken from the cache, not created anew.

It didn't matter much though for the public API, because the only
public flag in fiber_attr was FIBER_CUSTOM_STACK (which can be
set via fiber_attr_setstacksize()).

Anyway that was a bug for internal API and would lead to issues in
the future when more public flags are added. The patch fixes it.
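
A minimal Python model of the fixed behaviour, under the assumption that
the cache is a simple list of dead fibers. `Cord`, `Fiber`, `fiber_new`
and `recycle` are illustrative names here, not the real fiber.c
functions, and the flag value used in any call is arbitrary.

```python
class Fiber:
    def __init__(self):
        self.flags = 0


class Cord:
    """Hypothetical model of a fiber cache; not the real fiber.c logic."""

    def __init__(self):
        self.dead = []  # recycled fibers available for reuse

    def fiber_new(self, attr_flags):
        if self.dead:
            fiber = self.dead.pop()  # reuse a cached fiber
        else:
            fiber = Fiber()          # create a new one
        # Apply the requested flags on both paths. The bug described
        # above corresponds to doing this only for newly created fibers.
        fiber.flags = attr_flags
        return fiber

    def recycle(self, fiber):
        fiber.flags = 0
        self.dead.append(fiber)
```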

NO_DOC=Bugfix
NO_CHANGELOG=No reproducer via public API

(cherry picked from commit 31d27599)

15980d0d

fiber: panic on cancel of a recycled fiber · bb64b689

Vladislav Shpilevoy authored 3 years ago

There was a user who complained about this code crashing:

    f = fiber_new_ex(...);
    fiber_start(f);
    fiber_cancel(f);

The crash was at cancel. It happened because the fiber finished
immediately. It was already recycled after fiber_start() return.

Recycled fiber didn't have any flags, so fiber_cancel() didn't
see the fiber was already dead and tried to wake it up. It crashed
when the fiber tried to call its 'fiber->f' function which was
NULL.

In debug build the process fails earlier with an assertion on
'fiber->fid != 0'.

It can't really be fixed because the problem is the same as with
use-after-free. The fiber might be not merely recycled but already
freed completely and returned to the mempool.

This patch tries to help the users by a panic with a message
saying that it wasn't just a crash, it is a bug in user's code.
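
The idea can be sketched in Python as follows. This is a simplified
model, assuming that a recycled fiber is recognizable by `fid == 0` (the
debug assertion mentioned above suggests this, but the real check may
differ); `fiber_recycle` and the exception text are illustrative.

```python
class Fiber:
    def __init__(self, fid):
        self.fid = fid          # 0 marks a recycled fiber in this model
        self.cancelled = False


def fiber_recycle(fiber):
    """Model of returning a finished fiber to the cache."""
    fiber.fid = 0


def fiber_cancel(fiber):
    if fiber.fid == 0:
        # Stand-in for panic(): report a bug in the caller's code
        # instead of crashing later in an unrelated place.
        raise RuntimeError("cancel of a finished and already recycled fiber")
    fiber.cancelled = True
```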

There is an alternative - make fibers never return to the mempool.
Then fiber_cancel() could ignore recycled fibers. But it would
lead to another problem that if the fiber is already reused, then
fiber_cancel() would hit a totally irrelevant fiber who was
unlucky to reuse that fiber pointer. It seems worse than panic.

Same problem exists for `fiber_wakeup()`, but I couldn't figure
out how to add a panic there and not add an `if` on the normal
execution path (which includes 'ready' and 'running' fibers).

Closes #6837

NO_CHANGELOG=The same crash remains, but happens a bit earlier and
  with a message.

@TarantoolBot document
Title: `fiber_cancel()` C API clarification

The documentation must warn that the fiber passed to
`fiber_cancel()` must not be already dead unless it was set to be
joinable. Same for `fiber_wakeup()` and all the other fiber
functions. A dead non-joinable fiber could already be freed or
reused.

(cherry picked from commit dbb90274)

bb64b689

fiber: fix fibers with custom stack leak · 1f73d81a

Vladislav Shpilevoy authored 3 years ago

Fibers with custom stack couldn't be reused via cord->dead list,
but neither were ever deleted via mempool_free(). They just leaked
until the cord was destroyed. Their custom stack also leaked.

It happened for all non-joinable custom-stack fibers. That was
because fiber_destroy() simply skipped the destruction if the
fiber is the current one.

It didn't affect joinable fibers because their fiber_destroy() is
done in another fiber. Their stack was deleted, but the fiber
itself still leaked.

The fix makes it so that fiber_destroy() is never called for the
current fiber. Instead, cord uses an approach like in the pthread
library - the fiber that wants to be deleted is saved into the
cord->garbage member. When some other fiber wants to be deleted in the
future, it first cleans up the previous one and puts itself into
its place. And so on - fibers clean up each other.

The process is optimized for the case when the fiber to delete is
not the current one - can delete it right away then.
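
The scheme can be modelled in a few lines of Python. This is a sketch
under simplifying assumptions, not the code from the patch: `destroy`
only sets a flag, and `fiber_delete` is a hypothetical name; only the
`garbage` slot comes from the description above.

```python
class Fiber:
    def __init__(self, name):
        self.name = name
        self.destroyed = False


class Cord:
    def __init__(self):
        self.garbage = None  # fiber waiting to be destroyed by another one

    def destroy(self, fiber):
        fiber.destroyed = True  # stands in for freeing stack and memory

    def fiber_delete(self, fiber, current):
        if fiber is not current:
            # Fast path: we are not running on this fiber's stack,
            # so it can be deleted right away.
            self.destroy(fiber)
            return
        # A fiber cannot free its own stack while running on it: first
        # clean up the previous garbage fiber, then take its place.
        if self.garbage is not None:
            self.destroy(self.garbage)
        self.garbage = fiber
```

At most one fiber per cord stays undeleted at any time in this model,
which is what replaces the unbounded leak.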

NO_DOC=Bugfix

(cherry picked from commit 4ea29055)

1f73d81a

build: in static-build download openSSL from backup storage · 75db32e2

Nick Volynkin authored 3 years ago

Use a new storage bucket, made specifically for open-source
third-party software distributions.

NO_DOC=testing
NO_TEST=testing
NO_CHANGELOG=testing

(cherry picked from commit 86a2edab)

75db32e2

ci: clean workspace more thoroughly · c536950b

Yaroslav Lobankov authored 3 years ago

This change adds execution of the `git clean -xffd` command to the
.github/actions/environment action to clean workspace more thoroughly.

NO_DOC=ci
NO_CHANGELOG=ci
NO_TEST=ci

(cherry picked from commit cc768ea8)

c536950b

ci: add server start timeout to PRESERVE_ENVVARS · 82f70337

Yaroslav Lobankov authored 3 years ago

This change adds the server start timeout to the PRESERVE_ENVVARS
environment variable to deliver it to 'packpack' docker containers
while running packaging workflows.

NO_DOC=ci
NO_CHANGELOG=ci
NO_TEST=ci

(cherry picked from commit 9c7c675d)

82f70337

ci: store failure logs as artifact in coverage.yml · f9e853ed

Yaroslav Lobankov authored 3 years ago

It looks like the step for storing the artifact with logs after workflow
failures was missed. Now it is added.

NO_DOC=ci
NO_CHANGELOG=ci
NO_TEST=ci

(cherry picked from commit 27f4ba7a)

f9e853ed

ci: store artifacts with logs in freebsd workflows · a327a33e

Yaroslav Lobankov authored 3 years ago

This change adds facility to store artifacts with logs after workflow
failures to FreeBSD workflows. The ChristopherHX/github-act-runner known
limitation

    You won't be able to run steps after a failure without using
    continue-on-error: true

is not relevant anymore [1]. So we can add steps with storing artifacts.

[1] https://github.com/ChristopherHX/github-act-runner/tree/58ae37abc6c2244d91822b8ba536aa0a8b829632#known-limitations

NO_DOC=ci
NO_CHANGELOG=ci
NO_TEST=ci

(cherry picked from commit 2959028f)

a327a33e

Mar 25, 2022

test: fix flaky test_cancel_and_errinj · a3658df5

Yaroslav Lobankov authored 3 years ago

This change tries to fix the following sporadic test error:

    [007] not ok 4	http_client.sock_family:"AF_INET".test_cancel_and_errinj
    [007] #   ...arantool/tarantool/test/app-luatest/http_client_test.lua:221: Timeout check - status
    [007] #   expected: 408, actual: 200

Fixes tarantool/tarantool-qa#154

NO_DOC=testing
NO_TEST=testing
NO_CHANGELOG=testing

(cherry picked from a5557281)

a3658df5

Mar 24, 2022

build: in static-build download zlib from backup storage · 7e7af89c

Nick Volynkin authored 3 years ago

zlib.net is unavailable, so we have to download zlib distributions
from a backup storage on VKCS S3.

NO_DOC=testing
NO_TEST=testing
NO_CHANGELOG=testing

(cherry picked from commit 63de77fd)

7e7af89c

ci: freeze luacheck version at 0.25.0 · 6a67b250

Nick Volynkin authored 3 years ago

The newly released 0.26.0 emits a warning on the `_box` variable.
From luacheck v.0.26.0 release notes:

"Function arguments that start with a single underscore
get an "unused hint". Leaving them unused doesn't result
in a warning. Using them, on the other hand, is a
new warning (№ 214)."

NO_DOC=testing
NO_TEST=testing
NO_CHANGELOG=testing

(cherry picked from commit aef0bf81)

6a67b250

Mar 21, 2022

github-ci: add server start timeout · f1538ee6

Kirill Yukhin authored 3 years ago

Added a 290-second timeout for starting the tarantool server.

NO_DOC=ci
NO_CHANGELOG=ci
NO_TEST=ci

(cherry picked from commit f1d750f0)

f1538ee6

test-run: bump to new version · b27646d4

Kirill Yukhin authored 3 years ago

Bump test-run to new version with the following improvements:
- Add timeout for starting tarantool server (tarantool/test-run#302)
- Kill hanging processes of not started servers (tarantool/test-run#332)
- Rerun all failed tests, not only marked as fragile
  (tarantool/test-run#329)

NO_DOC=testing
NO_TEST=testing
NO_CHANGELOG=testing

(cherry picked from commit 0b46154f)

b27646d4

luacheck: exclude test-run tests · f3f461b7

Kirill Yukhin authored 3 years ago

Running luacheck against the test-run submodule is redundant.
Add a dedicated bypass to .luacheckrc.

NO_DOC=ci
NO_CHANGELOG=ci
NO_TEST=ci

(cherry picked from commit 8d6a154f)

f3f461b7

ci: use 'concurrency' feature in fuzzing.yml · 4cc785fe

Yaroslav Lobankov authored 3 years ago

It looks like we just missed the fuzzing.yml workflow when we worked on
adding this feature to our CI process in #6446.

Follows up tarantool/tarantool-qa#100

NO_DOC=ci
NO_CHANGELOG=ci
NO_TEST=ci

(cherry picked from commit dbaf03dc)

4cc785fe

ci: remove unused 'repository_dispatch' trigger · d4c71c03

Yaroslav Lobankov authored 3 years ago

It looks like this trigger was added in advance for some purpose
but was never used. So there is no point in keeping it in our workflows.

NO_DOC=ci
NO_CHANGELOG=ci
NO_TEST=ci

(cherry picked from commit 88b46b4c)

d4c71c03

ci: remove logic for 'full-ci' branch suffix · 31ea3a8a

Yaroslav Lobankov authored 3 years ago

We have recently moved all development from local branches to forks and
disabled running workflows in forks. So there is no sense to keep logic
for running full testing in forks if a branch has the 'full-ci' suffix.

NO_DOC=ci
NO_CHANGELOG=ci
NO_TEST=ci

(cherry picked from commit 1a0b5764)

31ea3a8a

ci: not run workflows in forks · 498f0492

Yaroslav Lobankov authored 3 years ago

We have recently moved all development from local branches to forks.
Most workflows cannot be run in forks because many of them need access
to private runners and/or secrets.

Workflows in forks can be disabled manually, but it's hard to remember
and it needs an extra action from a developer. Instead, we can detect
the repo in workflows and run them if the repo is tarantool/tarantool.

Closes #6913

NO_DOC=ci
NO_CHANGELOG=ci
NO_TEST=ci

(cherry picked from commit c27e6c74)

498f0492

ci: remove Docker containers after running builds · 41994271

Nick Volynkin authored 3 years ago

Run `docker run` with `--rm` to remove containers after running
a container-based build.
These containers are never reused, so there's no reason to keep them.
Instead, a new container is created for each build.
Not removing them results in wasting disk space on CI runners.

NO_DOC=ci
NO_CHANGELOG=ci
NO_TEST=ci

(cherry picked from commit 065c2874)

41994271

ci: replace check-commits with checkpatch · 82b4e16d

Vladimir Davydov authored 3 years ago

See https://github.com/tarantool/checkpatch

NO_DOC=ci
NO_TEST=ci
NO_CHANGELOG=ci

(cherry picked from commit c86145b3)

82b4e16d

ci: add workflow to check commits on pull request · 5c50aad9

Vladimir Davydov authored 3 years ago

This commit adds a new script, tools/check-commits. The script takes git
revisions in the same format as git-rev-list (actually, it feeds all its
arguments to git-rev-list) and runs some checks on each of the commits.

For example, to check the head commit, run:

  ./tools/check-commits -1 HEAD

To check the last five commits, run:

  ./tools/check-commits HEAD~5..HEAD

Currently, there are only two checks (in the future we may add more,
e.g. checking diffs for trailing spaces):
 - The commit message contains a documentation request. Can be
   suppressed with NO_DOC=<reason> in the commit message.
 - A new changelog entry is added to changelog/unreleased by the commit.
   Can be suppressed with NO_CHANGELOG=<reason> in the commit message.
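
The two checks can be approximated with the Python sketch below. It is
not the tools/check-commits script itself: `check_commit` and its
arguments are hypothetical, and the list of touched files is assumed to
be obtained elsewhere (for example from git).

```python
import re


def check_commit(message, touched_files):
    """Return a list of error strings for one commit (empty if it passes)."""
    errors = []

    # Check 1: documentation request, unless suppressed by NO_DOC=<reason>.
    has_doc_request = "@TarantoolBot document" in message
    if not has_doc_request and not re.search(r"^NO_DOC=\S", message, re.M):
        errors.append("missing documentation request")

    # Check 2: new changelog entry, unless suppressed by NO_CHANGELOG=<reason>.
    has_changelog = any(
        path.startswith("changelog/unreleased/") for path in touched_files
    )
    if not has_changelog and not re.search(r"^NO_CHANGELOG=\S", message, re.M):
        errors.append("changelog not found in changelog/unreleased")

    return errors
```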

This commit also adds a new workflow triggered on pull request,
lint/commits. The workflow runs the tools/check-commits script on all
the commits added by the pull request. The workflow doesn't run on push,
because it's problematic to figure out what commits are new on a branch.
Besides, we don't want to run it on push to release branches, because
it's a pure dev workflow.

Example output:

Checking commit a33f3cc7 PASS
Checking commit 6f29f9d7 FAIL
SHA:     6f29f9d7
SUBJECT: iproto: introduce graceful shutdown protocol
  ERROR:
    Changelog not found in changelog/unreleased. If this commit
    doesn't require changelog, please add NO_CHANGELOG=<reason>
    to the commit message.
Checking commit fbc25aae FAIL
SHA:     fbc25aae
SUBJECT: Update small submodule
  ERROR:
    Missing documentation request ('@TarantoolBot document' not
    found in the commit message). If this commit doesn't need to
    be documented, please add NO_DOC=<reason> to the commit
    message.
  ERROR:
    Changelog not found in changelog/unreleased. If this commit
    doesn't require changelog, please add NO_CHANGELOG=<reason>
    to the commit message.

NO_DOC=ci
NO_CHANGELOG=ci

(cherry picked from commit 005fbb6b)

5c50aad9

ci: run fuzzing on tags and release branches too · d64367b8

Sergey Bronnikov authored 3 years ago

(cherry picked from commit 094c46b8)

d64367b8

ci: run fuzzing on PR when the 'full-ci' label is set · c4c9d9c7

Sergey Bronnikov authored 3 years ago

The patch allows running fuzzing in a PR where the 'full-ci' label is
set. Complements the patch in commit "ci: enable fuzzing for *-full-ci
branches" (0115aab0).

Follows up #6630

(cherry picked from commit dc9c2117)

c4c9d9c7

build: fix warnings on Clang 13+ · c950277a

Sergey Bronnikov authored 3 years ago

Part of #6681

(cherry picked from commit 78827e32)

c950277a

ci: enable fuzzing for *-full-ci branches · e85b219f

Sergey Bronnikov authored 3 years ago

Currently, it is not possible to run fuzzing on CI without merging to
the master branch. Last time we missed [1] a broken compilation when
merging new fuzzers to master [2]. This patch enables fuzzing for
branches with the postfix 'full-ci'.

1. https://github.com/tarantool/tarantool/pull/6757
2. https://github.com/tarantool/tarantool/pull/6627

(cherry picked from commit 0115aab0)

e85b219f

ci: run 'integration.yml' WF in pre-commit testing · 908f90f4

Yaroslav Lobankov authored 3 years ago

Problem:

For now, the integration testing runs only on a push of changes/tags
to the 'master' and release branches. This approach is not good enough
because we become aware of integration issues after the changes were
merged already to the target branch.

So this patch adds the facility to run the integration testing on a dev
branch to catch integration issues before the changes are merged to the
target branch. Now it can be done via naming the branch as `*-full-ci`
and pushing it to the main repository or setting the 'full-ci' label on
the pull request.

(cherry picked from commit 0c277b84)

908f90f4

Mar 17, 2022

test: fix flaky engine/errinj_ddl test · b34edbf1

Vladimir Davydov authored 3 years ago

The commit fixes the following test failure:

```
[011] engine/errinj_ddl.test.lua                      memtx           [ fail ]
[011]
[011] Test failed! Result content mismatch:
[011] --- engine/errinj_ddl.result      Tue Jan 18 15:28:21 2022
[011] +++ var/rejects/engine/errinj_ddl.reject  Tue Jan 18 15:28:26 2022
[011] @@ -343,7 +343,7 @@
[011]  s:create_index('sk', {parts = {2, 'unsigned'}}) -- must fail
[011]  ---
[011]  - error: Duplicate key exists in unique index "sk" in space "test" with old tuple
[011] -    - [101, 101, "xxxxxxxxxxxxxxxx"] and new tuple - [100, 101]
[011] +    - [100, 101] and new tuple - [101, 101, "xxxxxxxxxxxxxxxx"]
[011]  ...
[011]  ch:get()
[011]  ---
```

The test is inherently racy: a conflicting tuple may be inserted to the
new index either by the index build procedure or by the test fiber doing
DML in the background. The error messages will disagree regarding what
tuple should be considered old and which one new. Let's match the error
message explicitly.
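
One way to make such a check insensitive to the race is to compare only
the part of the message that does not depend on tuple order. The Python
sketch below illustrates that idea; the actual test change may match the
message differently, and `is_expected_duplicate_error` is a hypothetical
helper.

```python
# Stable part of the error text, taken from the failure log above.
EXPECTED_PREFIX = 'Duplicate key exists in unique index "sk" in space "test"'


def is_expected_duplicate_error(message):
    """Accept the error regardless of which tuple is reported as old or new."""
    return message.startswith(EXPECTED_PREFIX)
```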

The failure was introduced by d11fb306
("box: change ER_TUPLE_FOUND message") which enhanced error messages
with conflicting tuples.

(cherry picked from commit 197088c3)

b34edbf1

Fix crash in space on_replace triggers. · d7620bd6

EvgenyMekhanik authored 3 years ago

During DDL operations, the triggers of the old space and the newly
created space are swapped. This leads to a crash if the swap occurs
from within a space on_replace trigger. This patch bans all DDL
operations from on_replace triggers.
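
The ban can be modelled with a simple guard flag, as in the Python
sketch below. This is an illustration only: the real check lives in the
C code and may be implemented differently, and the `Space` class here is
hypothetical. The error text is the one quoted in the documentation
request below.

```python
class Space:
    def __init__(self):
        self.on_replace = []          # trigger functions: f(space, row)
        self.indexes = []
        self._in_on_replace = False   # set while triggers are running

    def create_index(self, name):
        # DDL operation: refused while an on_replace trigger is running.
        if self._in_on_replace:
            raise RuntimeError(
                "Space on_replace trigger does not support DDL operations")
        self.indexes.append(name)

    def replace(self, row):
        self._in_on_replace = True
        try:
            for trigger in self.on_replace:
                trigger(self, row)
        finally:
            self._in_on_replace = False
```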

Closes #6920

@TarantoolBot document
Title: Ban DDL operations from space on_replace triggers

Previously, a user could set a function for a space on_replace trigger
that performs a DDL operation. This may lead to a tarantool crash,
which is why this patch bans DDL operations from on_replace triggers.
All such operations fail with the error: "Space on_replace trigger
does not support DDL operations".

(cherry picked from commit e4e65e40)

d7620bd6

Mar 05, 2022

httpc: reset headers on redirect · b9699708

Vladimir Davydov authored 3 years ago

When a request is redirected, old headers are of no use anymore and
should be dropped. We can detect a redirect by checking the value of
CURLINFO_REDIRECT_COUNT.
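
A minimal Python model of this logic, assuming headers arrive one by one
together with the current redirect count. The class and method names are
hypothetical, and `redirect_count` stands in for the value that the real
code reads via CURLINFO_REDIRECT_COUNT.

```python
class ResponseHeaders:
    """Collects response headers, dropping those of redirected replies."""

    def __init__(self):
        self.redirect_count = 0
        self.headers = {}

    def on_header(self, name, value, redirect_count):
        if redirect_count != self.redirect_count:
            # A redirect happened since the last header was seen:
            # headers of the previous reply are of no use anymore.
            self.headers.clear()
            self.redirect_count = redirect_count
        self.headers[name.lower()] = value
```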

Note about the test: we change the redirect reply to 'redirecting...',
so that its length differs from 'hello world'.

Close #6101

NO_DOC=bug fix

(cherry picked from commit 1cde49d4)

b9699708

test: migrate app-tap/http_client test to luatest · 2a122c95

Vladimir Davydov authored 3 years ago

NO_DOC=test
NO_CHANGELOG=test

(cherry picked from commit 57102048)

2a122c95

test: enable parallel mode for app-luatest · 2800d29a

Vladimir Davydov authored 3 years ago

NO_DOC=test
NO_CHANGELOG=test

(cherry picked from commit fd142d71)

2800d29a

Mar 03, 2022

luajit: bump new version · 237e30e9

Igor Munkin authored 3 years ago

* ci: introduce GitHub Actions

NO_DOC=LuaJIT submodule bump
NO_CHANGELOG=no visible changes

237e30e9

Mar 01, 2022

box: check tuple format in before_replace triggers · 63e1539a

Andrey Saranchin authored 3 years ago

Currently, we don't check tuple format in before_replace triggers,
that's why some bugs happen if we don't use the triggers correctly.

Let's check tuple format before execution of before_replace triggers
and after each before_replace trigger. The check will be disabled
during recovery for backward compatibility.
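
The order of checks can be sketched in Python as follows. This is a
simplified model, not the box code: `run_before_replace` and `validate`
are hypothetical names, and a trigger is assumed to be a function of
(old, new) that returns the tuple to use next.

```python
def run_before_replace(triggers, old, new, validate, recovering=False):
    """Run before_replace triggers, validating the tuple at each step."""

    def check(tup):
        # The check is skipped during recovery for backward compatibility.
        if not recovering and tup is not None:
            validate(tup)

    check(new)                    # before executing any trigger
    for trigger in triggers:
        new = trigger(old, new)
        check(new)                # after each trigger
    return new
```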

Closes #6780

NO_DOC=bug fix

(cherry picked from commit 884b3ff3)

63e1539a

test: refactor #5093 test · ac4b031c

Andrey Saranchin authored 3 years ago

The test checks key validation in before_replace trigger, but it
uses replace() to check it, and, as we plan to check tuple format in
before_replace trigger, we need to use something that works with
key, not tuple: delete(), for example.

Part of #6780

NO_DOC=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 512ba6d5)

ac4b031c

Feb 25, 2022

vinyl: fix crash during secondary index recovery · 9161f19c

Vladimir Davydov authored 3 years ago

A secondary index creation proceeds as follows:
 1. Build the new index by inserting statements from the primary index,
    see vinyl_space_build_index().
 2. Dump the new index and wait for the dump to complete.
 3. Commit the index creation record to the WAL.

While the new index is being dumped at step 2, new statements may be
inserted into the space. We need to insert those statements during
recovery, see vy_build_recover(). We identify such statements by
comparing LSN to vy_lsm::dump_lsn, see vy_build_recover_stmt().

It might occur that the newly built index is empty while the primary
index memory level isn't - if all statements cancel each other. In this
case, the secondary index won't be dumped during creation and its
dump_lsn will be set to -1, see the vy_lsm_is_empty() check in
vinyl_space_build_index(). This would break the assumption made on
recovery: that all statements with LSN > vy_lsm::dump_lsn should be
inserted into the secondary index. If a statement like this isn't
compatible with the new index, we will get a crash trying to insert it.

Let's fix this issue by skipping vy_build_recover() in case the new
secondary index was never dumped.
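
The decision can be illustrated with a small Python sketch. It is a
model of the rule described above, not the vinyl code: statements are
plain dicts with an `lsn` key and `statements_to_recover` is a
hypothetical helper.

```python
def statements_to_recover(stmts, dump_lsn):
    """Pick statements to re-insert into the new secondary index."""
    if dump_lsn < 0:
        # The index was never dumped (dump_lsn is -1): the LSN
        # comparison is meaningless, so skip build recovery entirely.
        return []
    # Otherwise, recover statements written after the index dump.
    return [s for s in stmts if s["lsn"] > dump_lsn]
```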

Closes #6778

NO_DOC=bug fix

(cherry picked from commit dadb8d70)

9161f19c

Feb 22, 2022

test: remove remaining checksums from suite.ini config files · 5af9dc3d

Vladimir Davydov authored 3 years ago

Follow-up c28a3b2f ("test: remove checksums from suite.ini config files")

5af9dc3d