Commits · c92a1699d6c83ff2f3872f27f24b4e07474e11df · core / tarantool

Jun 20, 2022

cmake: support build using Ninja · c92a1699

By default, CMake generates Makefiles for building a project. However, it
can also generate Ninja files. Ninja [1] may build a project a bit faster
than Make, see [2].

This patch fixes the CMake files so that Ninja can be used to build
Tarantool:

1. Fixed dependencies in ExternalProject_Add(), see explanation in [3]
2. Fixed ninja error due to presence of symbol '$' in cmake/rpm.cmake
3. Added propagation of CMAKE_GENERATOR to dependencies that use CMake
   for building, see [4]

How to build with Ninja:

$ cmake -G Ninja -B build -S .
$ ninja -C build/

1. https://ninja-build.org/
2. https://mesonbuild.com/Simple-comparison.html
3. https://stackoverflow.com/a/65803911/3665613
4. https://cmake.org/cmake/help/latest/module/ExternalProject.html

NO_DOC=internal
NO_CHANGELOG=internal
NO_TEST=internal

c92a1699

Jun 18, 2022

luajit: bump new version · b1953b59

Igor Munkin authored 2 years ago

* ci: add job for build using Ninja on Linux/x86_64
* build: create file lists outside of CMake commands
* build: use unique names for CMake targets
* Revert "test: disable PUC-Rio tests for several -l options"
* ci: make GitHub workflows more CMake-ish
* test: adapt PUC-Rio tests for debug line hook
* test: adapt PUC-Rio test for tail calls debug info
* test: adapt PUC-Rio test with reversed function

Closes #5693
Closes #5702
Closes #5782
Follows up #5747

NO_DOC=LuaJIT submodule bump
NO_TEST=LuaJIT submodule bump
NO_CHANGELOG=LuaJIT submodule bump

b1953b59

Jun 17, 2022

fiber: don't crash on wakeup with dead fibers · 206137e7

Cyrill Gorcunov authored 2 years ago


When a fiber finishes its work, one of two things happens:
1) If the "joinable" attribute is not set, the fiber is
   simply recycled.
2) Otherwise it keeps hanging around, waiting to be
   joined.

Our API allows calling fiber_wakeup() for dead but joinable
fibers (2). In release builds this has no side effects, such
fibers are simply ignored, but in debug builds it triggers
an assertion. We can't change our API, for the sake of
backward compatibility, but at the same time we must not
keep different behaviour in release and debug builds,
since this brings inconsistency. So let's get rid of the
assertion and allow calling fiber_wakeup() in debug
builds as well.
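
The resulting behaviour can be modelled in a few lines. This is a sketch
in Python rather than Tarantool's C, and the `Fiber` class and its fields
are hypothetical names used only for illustration: waking a dead fiber is
a silent no-op, whatever the build type.

```python
class Fiber:
    """Toy model of a fiber; not Tarantool's struct fiber."""

    def __init__(self, joinable=False):
        self.joinable = joinable
        self.dead = False
        self.ready = False

    def finish(self):
        # A finished fiber is marked dead; a joinable one lingers until joined.
        self.dead = True
        self.ready = False


def fiber_wakeup(fiber):
    """Schedule the fiber, ignoring dead ones instead of asserting."""
    if fiber.dead:
        return False  # nothing to schedule, same result in any build
    fiber.ready = True
    return True
```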

Fixes #5843

NO_DOC=bug fix

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

206137e7

replication: unify replication filtering with and without elections · deca9749

Serge Petrenko authored 2 years ago

Once the split-brain detection is in place, it's fine to nopify obsolete
data even on a node with elections disabled. Let's not keep a bug around
anymore.

This behaviour change leads to changing
"gh_6842_qsync_applier_order_test.lua" a bit. It actually relied on old
and buggy behaviour: it assumed old transactions would not be nopified
and would trigger replication error.

This doesn't happen anymore, because nopify works correctly, and the
transactions are not followed by a conflicting CONFIRM.

The test for this commit is simply altering the
gh_5295_split_brain_detection_test.lua to work with elections disabled.

Closes #6133
Follow-up #5295

NO_DOC=internal change
NO_CHANGELOG=internal change

deca9749

txn_limbo: filter incoming synchro requests · af7d703f

Cyrill Gorcunov authored 2 years ago


When we receive synchro requests we can't just apply them blindly,
because in the worst case they may come from a split-brain configuration
(where a cluster has split into several clusters, each one has elected
its own leader, and the clusters are now trying to merge back into the
original one). We need to do our best to detect such disunity and force
these nodes to rejoin from scratch for the sake of data consistency.

Thus when we're processing requests we pass them to the packet filter
first, which validates their contents and refuses to apply them if they
violate consistency.

Depending on request type each packet traverses an appropriate chain.

filter_generic(): a common chain for any synchro packet.
 1) request:replica_id = 0 allowed for PROMOTE request only.
 2) request:replica_id should match limbo:owner_id, IOW the
    limbo migration should be noticed by all instances in the
    cluster.

filter_confirm_rollback(): a chain for CONFIRM | ROLLBACK packets.
 1) Zero lsn is disallowed for such requests.

filter_promote_demote(): a chain for PROMOTE | DEMOTE packets.
 1) The requests should come in with nonzero term, otherwise
    the packet is corrupted.
 2) The request's term should not be less than the maximal known
    one, IOW it should not come in from nodes which didn't notice
    raft epoch changes and are living in the past.

filter_queue_boundaries(): a common finalization chain.
 1) If LSN of the request matches current confirmed LSN the packet
    is obviously correct to process.
 2) If LSN is less than confirmed LSN then the request is wrong,
    we have processed the requested LSN already.
 3) If LSN is greater than confirmed LSN then
    a) If limbo is empty we can't do anything, since data is already
       processed and should issue an error;
    b) If there is some data in the limbo then requested LSN should
       be in range of limbo's [first; last] LSNs, thus the request
       will be able to commit and rollback limbo queue.
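
The final chain above can be sketched as follows. This is an illustrative
Python model, not the C implementation: `SplitBrainError` and the
`limbo_lsns` list (sorted LSNs of pending limbo entries) are assumptions
made for the example, and only the LSN checks of
filter_queue_boundaries() are covered.

```python
class SplitBrainError(Exception):
    """Stand-in for the error issued when a request is rejected."""


def filter_queue_boundaries(req_lsn, confirmed_lsn, limbo_lsns):
    """Check a request LSN against the confirmed LSN and the limbo queue.

    limbo_lsns is a sorted list of pending LSNs (hypothetical layout).
    Returns None when the request may be processed.
    """
    if req_lsn == confirmed_lsn:
        return  # case 1: matches the confirmed LSN, correct to process
    if req_lsn < confirmed_lsn:
        # case 2: this LSN has already been processed
        raise SplitBrainError("request LSN is already processed")
    # case 3: req_lsn > confirmed_lsn
    if not limbo_lsns:
        # case 3a: nothing is pending, so there is nothing to finalize
        raise SplitBrainError("limbo is empty")
    first, last = limbo_lsns[0], limbo_lsns[-1]
    if not first <= req_lsn <= last:
        # case 3b: the LSN must fall inside [first; last]
        raise SplitBrainError("request LSN is outside the limbo range")
```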

Note the filtration is disabled during initial configuration where we
apply requests from the only source of truth (either the remote master,
or our own journal), so no split brain is possible.

In order to make split-brain checks work, the applier nopify filter now
passes synchro requests from obsolete term without nopifying them.

Also, now ANY asynchronous request coming from an instance with obsolete
term is treated as a split-brain. Think of it as a synchronous
request committed with a malformed quorum.

Closes #5295

NO_DOC=it's literally below

Co-authored-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

@TarantoolBot document
Title: new error type: ER_SPLIT_BRAIN

If for some reason the cluster had 2 leaders working independently (for
example, the user has mistakenly lowered the quorum below N / 2 + 1), then
once such leaders and their followers try connecting to each other, they
will receive the ER_SPLIT_BRAIN error, and the connection will be
aborted. This is done to preserve data integrity. Once the user notices
such an error he or she has to manually inspect the data on both the
split halves, choose a way to restore the data, and rebootstrap one of
the halves from the other.
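
The arithmetic behind the N / 2 + 1 bound can be checked with a short
sketch. It assumes integer division, and the function names are
hypothetical: two disjoint groups of nodes can each gather a quorum only
when twice the quorum fits into the cluster size.

```python
def majority(n):
    """Smallest quorum for which any two quorums share a node."""
    return n // 2 + 1


def two_leaders_possible(n, quorum):
    """True if two disjoint groups of nodes can each gather `quorum` votes."""
    return 2 * quorum <= n
```

For a 5-node cluster the majority is 3, so two independent leaders are
impossible. Lowering the quorum to 2 lets groups of 2 and 3 nodes each
elect a leader.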

af7d703f

txn_limbo: change function return types · 9eab2868

Serge Petrenko authored 2 years ago

Change return types of txn_limbo_req_prepare, txn_limbo_process,
txn_limbo_write_promote, txn_limbo_write_demote from void to int.
This is a preparation for when these functions start returning errors.

Part-of #5295

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

9eab2868

box: change box_issue_promote(demote) return type · fd5e1439

Serge Petrenko authored 2 years ago

Make box_issue_promote and box_issue_demote return a return code.
For now it's always 0, but soon they will return errors.

Part-of #5295

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

fd5e1439

txn_limbo: track CONFIRM lsn on replicas · 129d83e9

Serge Petrenko authored 2 years ago

limbo->confirmed_lsn was only filled on limbo owner in
txn_limbo_write_confirm. Replicas and recovering limbo owner need to track
it as well to correctly detect split-brains based on confirmed_lsn.

So update confirmed_lsn in txn_limbo_read_confirm.

Part-of #5295

NO_DOC=internal change
NO_TEST=tested in future commits
NO_CHANGELOG=internal change

129d83e9

txn_limbo: do not confirm/rollback anything after restart · 6cc1b1f2

Serge Petrenko authored 2 years ago

It's important for the synchro queue owner to not finalize any of the
pending synchronous transactions after restart.

Since the node was down for some time the chances are pretty high it was
deposed by some new leader during its downtime. It means that the node
might not know yet that its transactions were already finalized by someone
else.

So, any arbitrary finalization might lead to a future split-brain, once the
remote PROMOTE finally reaches the local node.

Let's fix this by adding a new reason for the limbo to be frozen - a
queue owner has recovered but has not issued a new PROMOTE locally and
hasn't received any PROMOTE requests from the remote nodes.

Once the first PROMOTE is issued or received, it's safe to return to the
old mode of operation.

So, now the synchro queue owner starts in "frozen" state and can't
CONFIRM, ROLLBACK or issue new transactions until either issuing a
PROMOTE or receiving a PROMOTE from some remote node.

This also required modifying box.ctl.promote() behaviour: it's no
longer a no-op on a synchro queue owner, when elections are disabled and
the queue is frozen due to restart.

Also fix the tests, which assumed the queue owner is writeable after a
restart. gh-5298 test was partially deleted, because it became pointless.

And while we are at it, remove the double run of gh-5288 test. It is
storage engine agnostic, so there's no point in running it for both
memtx and vinyl.

Part-of #5295

NO_CHANGELOG=covered by previous commit

@TarantoolBot document
Title: ER_READONLY error receives new reasons

When box.info.ro_reason is "synchro" and some operation throws an
ER_READONLY error, this error now might include the following reason:
```
Can't modify data on a read-only instance - synchro queue with term 2
belongs to 1 (06c05d18-456e-4db3-ac4c-b8d0f291fd92) and is frozen due to
fencing
```
This means that the current instance is indeed the synchro queue owner,
but it has noticed that someone else in the cluster might start new
elections or might overtake the synchro queue soon.
This may be also detected by `box.info.election.term` becoming greater than
`box.info.synchro.queue.term` (this is the case for the second error
message).
There is also a slightly different error message:
```
Can't modify data on a read-only instance - synchro queue with term 2
belongs to 1 (06c05d18-456e-4db3-ac4c-b8d0f291fd92) and is frozen until
promotion
```
This means that the node simply cannot guarantee that it is still the
synchro queue owner (for example, after a restart, when a node still thinks
it is the queue owner, but someone else in the cluster has already
overtaken the queue).

6cc1b1f2

txn_limbo: fence upon receiving raft term greater than queue term · 0e48475d

Serge Petrenko authored 2 years ago

Receiving a raft term greater than the current queue term means that
someone has either already written PROMOTE (in case elections are
disabled), or is going to write PROMOTE once he wins the elections (in
case they are enabled).

In both cases the queue owner in an old term should freeze the limbo
until queue term catches up with raft term.

Unfreezing happens automatically once synchro queue term catches up.
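
A minimal model of this rule, written in Python for illustration only
(the class and method names are hypothetical): the fenced state is derived
from the two terms, so it clears by itself once the queue term catches up.

```python
class FencedLimbo:
    """Toy model of the queue owner's limbo; tracks only the two terms."""

    def __init__(self, term):
        self.queue_term = term
        self.raft_term = term

    def on_raft_term(self, term):
        # A newer raft term means a PROMOTE is written or about to be.
        self.raft_term = max(self.raft_term, term)

    def on_promote(self, term):
        # Applying a PROMOTE advances the synchro queue term.
        self.queue_term = max(self.queue_term, term)

    @property
    def fenced(self):
        # Frozen while the queue term lags behind the raft term.
        return self.raft_term > self.queue_term
```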

Part-of #5295

NO_DOC=covered by next commit

0e48475d

txn_limbo: rework limbo->frozen flag · ce0a83eb

Serge Petrenko authored 2 years ago

Soon there will be more reasons for a transaction limbo to be frozen.
Let's make the limbo->frozen flag a bitmap and rename it to
limbo->frozen_reasons.
The first bit, named frozen_due_to_fencing, represents the only current
reason for the limbo to be frozen.
While we are at it, rename txn_limbo_(un)freeze to txn_limbo_(un)fence
to better reflect the situation.
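
The bitmap idea can be illustrated with Python's `enum.IntFlag`. This is
a sketch only: the C code uses its own bit fields, and the `OTHER` member
is a hypothetical placeholder for a future reason. The point is that
clearing one reason leaves the limbo frozen while any other bit is set.

```python
from enum import IntFlag


class FrozenReason(IntFlag):
    NONE = 0
    FENCING = 1  # corresponds to frozen_due_to_fencing in the text
    OTHER = 2    # hypothetical placeholder for a reason added later


def fence(reasons):
    """Set the fencing bit, keeping all other reasons."""
    return reasons | FrozenReason.FENCING


def unfence(reasons):
    """Clear the fencing bit, keeping all other reasons."""
    return reasons & ~FrozenReason.FENCING


def is_frozen(reasons):
    """The limbo is frozen while at least one reason is set."""
    return bool(reasons)
```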

Part-of #5295

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

ce0a83eb

txn_limbo: preserve confirmed_lsn after reading a PROMOTE · 896a20e4

Serge Petrenko authored 2 years ago

Previously we assumed that every PROMOTE request changes limbo owner,
and thus limbo should have confirmed_lsn = 0 after the request is
processed, because new confirmed lsn is yet unknown.

This is not true for PROMOTE requests coming in JOIN or saved in
snapshot: such requests don't change limbo owner: they are like
savepoints, they notify the instance of the current limbo state.

Such promotions may be detected by the rule
replica_id (old limbo owner) == origin_id (new limbo owner)

So, for the sake of correct split-brain detection, confirmed_lsn should
be nonzero after such promotions.
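
The detection rule can be written down as a small function. This Python
sketch is a model of the description above, not the actual C code, and the
function name is hypothetical.

```python
def confirmed_lsn_after_promote(confirmed_lsn, replica_id, origin_id):
    """Return the confirmed LSN to keep after a PROMOTE is read.

    replica_id is the old limbo owner, origin_id is the new one.
    """
    if replica_id == origin_id:
        # Savepoint-like PROMOTE (JOIN or snapshot): the owner is
        # unchanged, so the known confirmed LSN is preserved.
        return confirmed_lsn
    # Ownership moved: the new owner's confirmed LSN is not known yet.
    return 0
```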

Part-of #5295

NO_DOC=internal change
NO_TEST=tested in future commits
NO_CHANGELOG=internal change

896a20e4

test: refactor gh_6036_qsync_order test · 978731b3

Serge Petrenko authored 2 years ago


The test involves creating a manual split-brain between nodes r1 and r2.
After the split-brain detection introduction it's impossible to reuse
the nodes in the next test without recreating them.

Let's fix that by switching nodes r1 and r3. Now there's a split-brain
between (r1, r2) and r3, and r3 isn't used in the following tests and
may be safely deleted.

Follow-up #5295

NO_DOC=refactoring
NO_CHANGELOG=refactoring

Signed-off-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

978731b3

relay: fix PROMOTE and raft term ordering · 67090419

Serge Petrenko authored 2 years ago

Fix two issues with sent_raft_term calculations:
* first of all, it doesn't matter during initial and final join, so set it
  to UINT64_MAX.
* secondly, it's nullified after a successful dispatch from the tx
  thread. This might make the relay stall forever. For example, when
  elections are disabled.

NO_DOC=bugfix
NO_TEST=tested in next commit

67090419

Jun 16, 2022

core: allow spurious wakeups in coio_waitpid · 7a582646

Ilya Verbin authored 2 years ago

Currently it's possible to wake up a fiber that is waiting for
child process termination using the Tarantool C API. This will leave
a zombie process behind. This patch reworks `coio_waitpid` in such
a way that it yields until `cw.data` is set to NULL in the process
status change callback.

Part of #7166

NO_DOC=refactoring
NO_CHANGELOG=refactoring

7a582646

core: allow spurious wakeups in cord_cojoin · 87e7d312

Ilya Verbin authored 2 years ago

Currently it's possible to wake up a fiber that is waiting for task
completion using the Tarantool C API. This will cause a "wrong fiber woken"
panic. This patch reworks `cord_cojoin` in such a way that it yields
until a completion flag is set.
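
The general pattern, waiting in a loop that re-checks a completion flag
after every wakeup, can be sketched in Python. Threads and a condition
variable stand in for fibers here, and the `Completion` class is a
hypothetical illustration rather than the `cord_cojoin` code.

```python
import threading


class Completion:
    """Wait primitive that tolerates spurious wakeups."""

    def __init__(self):
        self._cond = threading.Condition()
        self._done = False

    def complete(self):
        # The real completion: set the flag, then wake the waiters.
        with self._cond:
            self._done = True
            self._cond.notify_all()

    def spurious_wakeup(self):
        # Wakes the waiters without setting the flag, similar to an
        # unrelated wakeup call on the waiting fiber.
        with self._cond:
            self._cond.notify_all()

    def wait(self):
        with self._cond:
            while not self._done:  # re-check the flag after every wakeup
                self._cond.wait()
```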

Part of #7166

NO_DOC=refactoring
NO_CHANGELOG=refactoring

87e7d312

core: get rid of fiber_set_cancellable in hot_standby_f · 65470cb4

Ilya Verbin authored 2 years ago

Currently it's possible to wake up a `hot_standby_f` fiber from Lua.
This does not lead to any error, but it results in redundant
`recover_remaining_wals` calls.
This patch handles such spurious wakeups in `hot_standby_f`.

Part of #7166

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

65470cb4

core: get rid of fiber_set_cancellable in gc_checkpoint_fiber_f · 6e5b89e0

Ilya Verbin authored 2 years ago

Spurious wakeups are already handled correctly.

Part of #7166

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

6e5b89e0

box: fix transaction "read-view" and "conflicted" states · 4d52199e

Georgiy Lebedev authored 2 years ago

Currently, there is a fundamental logical inconsistency with read-view and
conflicted states of transactions.

Conflicted transactions see all prepared changes (e.g., #7238), because
they are handled differently than read-view ones. At the same time, one
does not know the state of the transaction until `box.commit` is called.

A similar problem arises with read-view transactions: if such transactions
do any DML statements, they are de-facto conflicted, but this will only be
determined at preparation stage:
https://github.com/tarantool/tarantool/blob/79245573dabf3c1eb4eb904fd80ee84270360476/src/box/txn.c#L1006-L1013

Fix this inconsistency by the following changes:
1. Conflict "read-view" transactions on attempt to perform DML statements
immediately — guarantee this with an assertion at preparation stage.
2. Make conflicted transactions unconditionally throw "Transaction has been
aborted by conflict" error on any CRUD operations (including read-only
ones) until they are either rolled back (which will return no error) or
committed (which will return the same error).
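
The second rule can be modelled as follows. This is a Python sketch with
hypothetical class and method names, not the box API: every operation on a
conflicted transaction fails, except rollback.

```python
class TransactionConflict(Exception):
    """Stand-in for the conflict error raised by box."""


class Txn:
    """Toy transaction that tracks only the 'conflicted' state."""

    def __init__(self):
        self.conflicted = False

    def _check(self):
        if self.conflicted:
            raise TransactionConflict(
                "Transaction has been aborted by conflict")

    def select(self):
        self._check()  # read-only operations fail too
        return []

    def replace(self, row):
        self._check()

    def commit(self):
        self._check()  # commit reports the same error

    def rollback(self):
        return None  # rollback always succeeds
```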

Closes #7238
Closes #7239
Closes #7240

@TarantoolBot document
Title: new behaviour of "conflicted" transactions

"Conflicted" transactions now return "Transaction has been aborted by conflict"
error on any CRUD operations (including read-only ones), until they are
either rolled back (which will return no error) or committed (which will
return the same error).

4d52199e

tutorial: use https links · 2f80fbf0

Sergey Bronnikov authored 2 years ago

NO_CHANGELOG=internal
NO_DOC=internal
NO_TEST=internal

2f80fbf0

tools: fix gdb.sh revision regex · 375ceaaa

Pavel Balaev authored 2 years ago

Regular expression now works on versions: alpha, beta, rc and so on.

NO_DOC=bugfix
NO_TEST=bugfix
NO_CHANGELOG=bugfix

375ceaaa

tools: edit gdb.sh code formatting · 83e8c50f

Pavel Balaev authored 2 years ago

Tabs were replaced with spaces to bypass checkpatch.

NO_DOC=bugfix
NO_TEST=bugfix
NO_CHANGELOG=bugfix

83e8c50f

Jun 14, 2022

test: fix flaky vinyl/tx_gap_lock test · b3f462bf

Vladimir Davydov authored 2 years ago

The `cmp_tuple` helper function is broken - it assumes that all tuple
fields, including the payload, are numeric. It isn't true - the payload
field is either nil or string. This results in a false-positive test
failure:

```
error: '[string "function cmp_tuple(t1, t2)     for i = 1, PAY..."]:1:
       attempt to compare nil with string'
```
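
One way to write a comparison helper that tolerates a nil payload is
sketched below. It is in Python rather than the test's Lua and is not the
actual fix, which the commit message does not describe. Treating a missing
value as smaller than any present one is an assumption of the example.

```python
def cmp_field(a, b):
    """Three-way compare; None sorts before any other value."""
    if a is None or b is None:
        return (a is not None) - (b is not None)
    return (a > b) - (a < b)


def cmp_tuple(t1, t2):
    """Compare two tuples field by field, then by length."""
    for a, b in zip(t1, t2):
        c = cmp_field(a, b)
        if c != 0:
            return c
    return (len(t1) > len(t2)) - (len(t1) < len(t2))
```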

Closes #6336

NO_DOC=test
NO_CHANGELOG=test

b3f462bf

test-run: bump to new version · e6e73423

Yaroslav Lobankov authored 2 years ago

Bump test-run to new version with the following improvements:

  - Fix issue with not detecting successful server start [1]

[1] tarantool/test-run#343

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff

e6e73423

Jun 09, 2022

test: set shutdown timeout to infinity for default luatest instance · ede831d3

Vladimir Davydov authored 2 years ago

With the default shutdown timeout of 3 seconds, a test that leaves
behind asynchronous requests would still pass, but it would take longer
to finish, because the server instance started by Tarantool would have
to wait for the dangling requests to complete. Setting the timeout to
infinity will result in a hang, making us fix the test.

Infinite timeout is also good for catching bugs like #7225 and #7256.

We don't set the timeout for diff and TAP tests because those are
deprecated and shouldn't be used for writing new tests. Nevertheless,
I manually checked that none of them hangs if the timeout is set to
infinity.

Closes #6820

NO_DOC=test
NO_CHANGELOG=test

ede831d3

iostream: shutdown socket fd before close · 9cf03555

Vladimir Davydov authored 2 years ago

If a socket fd is shared by a child process, closing it in the parent
will not shut down the underlying connection. As a result, the server
may hang executing the graceful shutdown protocol. Fix this problem by
explicitly shutting down the connection socket fd before closing it.

This is a recommended way to terminate a Unix socket connection, see
http://www.faqs.org/faqs/unix-faq/socket/#:~:text=2.6.%20%20When%20should%20I%20use%20shutdown()%3F
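
The same idea expressed with Python's standard `socket` module, as an
illustration only (the fix itself is in Tarantool's C iostream code, and
the helper name is hypothetical): `shutdown()` terminates the connection
for every holder of the descriptor, while `close()` alone only drops one
reference.

```python
import socket


def close_connection(sock):
    """Shut the connection down, then release the descriptor."""
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # already disconnected; the fd still has to be closed
    finally:
        sock.close()
```

With this helper the peer sees end-of-file even if a duplicate of the
descriptor, for example one inherited by a child process, is still open.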

Closes #7256

NO_DOC=bug fix

9cf03555

wal: allow spurious wakeups in wal_write · 4bf52367

Ilya Verbin authored 2 years ago

It's possible to wake up a fiber that is waiting for WAL write
completion using the Tarantool C API. This results in an error like:
```
main/118/lua F> Journal result code -1 can't be converted to an error
```

This patch introduces a flag, which is set when WAL write is
finished, that allows fibers to yield until the flag is set.

Closes #6506

NO_DOC=bugfix

4bf52367

test-run: bump to new version · 0dc60b5f

Yaroslav Lobankov authored 2 years ago

Bump test-run to new version with the following improvements:

  - Fail *.test.py tests in case of server start errors [1]

[1] tarantool/test-run#333

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff

0dc60b5f

Jun 08, 2022

sql: fix wrong ephemeral space format · a6818acc

Mergen Imeev authored 2 years ago

This patch fixes format building when an ephemeral space was used in
ORDER BY and ORDER BY uses at least two variables from the list of
selected columns.

Closes #7042

NO_DOC=Bugfix

a6818acc

decimal: fix index comparison with Inf, NaN · 22fc1f94

Serge Petrenko authored 3 years ago

There was an assertion failure when inserting a decimal into an index
which contained double Inf or NaN.

The reason for that was never checking decimal_from_*() return values,
and decimal_from_double() not being able to handle NaN or Inf, because
these values are not representable in decimal numbers.

Start handling decimal_from_<type> return values and fix decimal
comparison with Inf, NaN.
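
The approach can be sketched in Python. This is a model under stated
assumptions, not the C code: the function names mirror the text but are
reimplemented here, `None` stands for a failed conversion, and ordering
NaN below every number is an assumption of the example.

```python
import math
from decimal import Decimal


def decimal_from_double(value):
    """Return a Decimal, or None when the double is NaN or Inf."""
    if not math.isfinite(value):
        return None
    return Decimal(value)


def compare_decimal_with_double(dec, dbl):
    """Three-way compare of a Decimal with a double."""
    if math.isnan(dbl):
        return 1  # assumption: NaN sorts below every number
    if math.isinf(dbl):
        return -1 if dbl > 0 else 1
    other = decimal_from_double(dbl)
    return (dec > other) - (dec < other)
```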

Closes #6377

NO_DOC=bugfix

22fc1f94

Jun 07, 2022

test: use unix socket in replication-py/swim tests · cb6fc4a3

Yaroslav Lobankov authored 2 years ago

To reduce the chance of encountering the tarantool/test-run#141 issue in
replication-py/swim tests, let's switch to using unix sockets instead
of TCP ports for the tarantool console.

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff

cb6fc4a3

Jun 06, 2022

core: allow spurious wakeups in cbus_call · bd6fb06a

Ilya Verbin authored 2 years ago

Currently it's possible to wake up a fiber that is waiting for `cbus_call`
completion using the Tarantool C API. This will cause a misleading `TimedOut`
error. This patch reworks `cbus_call` in such a way that it yields until
a completion flag is set.

Part of #7166

NO_DOC=refactoring
NO_CHANGELOG=refactoring

bd6fb06a

core: get rid of unused cbus_flush · e568e7f0

Ilya Verbin authored 2 years ago

Part of #7166

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

e568e7f0

datetime: refactor interval_to_string · b7ff1615

Timur Safin authored 2 years ago

Simplify/shorten `interval_to_string()` implementation.

Part of #7045

NO_CHANGELOG=refactoring
NO_DOC=refactoring
NO_TEST=refactoring

b7ff1615

datetime: do not mess with nsec in interval · 36bc6f83

Timur Safin authored 2 years ago

Do not even try to make the secs/nsec output more readable,
but rather report them as is, without any [de]normalization.

Not the prior way:
```
tarantool> dt.interval.new{min=1, sec=59, nsec=2e9+1}
--
- +1 minutes, 61.000000001 seconds
...
```

But instead as:
```
tarantool> dt.interval.new{min=1, sec=59, nsec=2e9+1}
--
- +1 minutes, 59 seconds, 2000000001 nanoseconds
...
```
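
The new formatting rule can be sketched as below. This is a Python model
covering only the three fields from the example, not the real
`interval_to_string()`. The "0 seconds" fallback and the sign on the
leading component only are assumptions.

```python
def interval_to_string(minutes=0, sec=0, nsec=0):
    """Format the fields as given, without folding nsec into sec."""
    fields = [(minutes, "minutes"), (sec, "seconds"), (nsec, "nanoseconds")]
    shown = [(value, unit) for value, unit in fields if value != 0]
    if not shown:
        return "0 seconds"  # assumption: fallback for an empty interval
    # Only the leading component carries an explicit sign.
    head = f"{shown[0][0]:+d} {shown[0][1]}"
    tail = [f"{value} {unit}" for value, unit in shown[1:]]
    return ", ".join([head] + tail)
```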

Closes #7045

NO_DOC=internal

36bc6f83

net.box: fix hang in graceful shutdown protocol · 79245573

Vladimir Davydov authored 2 years ago

The graceful shutdown protocol works as follows:

 1. The server sends a shutdown request (the box.shutdown event) to all
    its clients that subscribed to it.
 2. Upon receiving a shutdown request, a client is supposed to close its
    connection.
 3. The server waits for all clients subscribed to box.shutdown event to
    exit.
 4. The server exits.

In net.box, the box.shutdown event is processed by `remote._callback`.
The problem is it may occur that `remote._callback` is garbage collected
while the `remote` object isn't. If this happens, the shutdown request
will never get processed, and the server won't exit until the `remote`
object is garbage collected, which may take forever.

Let's fix this issue by breaking the worker loop if we see that the
callback was garbage collected.
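
The shape of the fix can be illustrated with a weak reference in Python.
This is a sketch with hypothetical names, not the net.box Lua code: the
loop holds the callback only weakly and stops as soon as the reference is
dead.

```python
import weakref


def worker_loop(callback_ref, events):
    """Dispatch events until the weakly held callback disappears."""
    handled = []
    for event in events:
        callback = callback_ref()  # None once the callback is collected
        if callback is None:
            break  # nobody can process events any more, so stop
        handled.append(callback(event))
    return handled


def make_ref(callback):
    """Helper for the example: wrap a callback in a weak reference."""
    return weakref.ref(callback)
```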

Closes #7225

NO_DOC=bug fix

79245573

Jun 02, 2022

build: define TZDIR for tzcode build · b9c9a7b0

Boris Stepanenko authored 2 years ago

NixOS (and probably some other distributions) does not place the zoneinfo
directory in /usr/share (it may be in /etc, for example). TZDIR is set
accordingly. Currently zoneinfo is looked up in /usr/share, disregarding
the TZDIR env variable.

This commit adds compile definition for TZDIR if such env variable is
defined. This fixes zoneinfo lookup for nixos.

NO_CHANGELOG=build
NO_DOC=build
NO_TEST=build

b9c9a7b0

Revert "github-ci: use openssl@1.1" · 6605de25

Vladimir Davydov authored 2 years ago

This reverts commit 33830978.

Follow-up #6477

NO_DOC=ci
NO_TEST=ci
NO_CHANGELOG=ci

6605de25

Revert "ci: fix RPM spec to build packages for Fedora 36" · 7e1df16e

Vladimir Davydov authored 2 years ago

This reverts commit 9d1f9f0e.

Follow-up #6477

NO_DOC=ci
NO_TEST=ci
NO_CHANGELOG=ci

7e1df16e

crypto: OpenSSL 3.0 support · e3bf73c8

Vladimir Davydov authored 2 years ago

We need to do two things to fix the build with OpenSSL 3.0:

1. Use EVP_MAC_* functions instead of HMAC_*
   https://www.openssl.org/docs/man3.0/man3/HMAC_CTX_new.html

2. Load the Legacy provider to enable legacy algorithms, such as MD4
   https://wiki.openssl.org/index.php/OpenSSL_3.0#Programming_in_OpenSSL_3.0

Closes #6477

NO_DOC=build fix
NO_TEST=build fix
NO_CHANGELOG=build fix

e3bf73c8