Commits · 129d83e934d7d33580f2a403c41050f0c5df9eeb · core / tarantool

Jun 17, 2022

txn_limbo: track CONFIRM lsn on replicas · 129d83e9

Serge Petrenko authored 2 years ago

limbo->confirmed_lsn was only filled on limbo owner in
txn_limbo_write_confirm. Replicas and recovering limbo owner need to track
it as well to correctly detect split-brains based on confirmed_lsn.

So update confirmed_lsn in txn_limbo_read_confirm.

Part-of #5295

NO_DOC=internal change
NO_TEST=tested in future commits
NO_CHANGELOG=internal change

129d83e9

txn_limbo: do not confirm/rollback anything after restart · 6cc1b1f2

Serge Petrenko authored 2 years ago

It's important for the synchro queue owner to not finalize any of the
pending synchronous transactions after restart.

Since the node was down for some time, chances are high that it was
deposed by some new leader during its downtime. It means that the node
might not know yet that its transactions were already finalized by someone
else.

So, any arbitrary finalization might lead to a future split-brain, once the
remote PROMOTE finally reaches the local node.

Let's fix this by adding a new reason for the limbo to be frozen - a
queue owner has recovered but has not issued a new PROMOTE locally and
hasn't received any PROMOTE requests from the remote nodes.

Once the first PROMOTE is issued or received, it's safe to return to the
old mode of operation.

So, now the synchro queue owner starts in "frozen" state and can't
CONFIRM, ROLLBACK or issue new transactions until either issuing a
PROMOTE or receiving a PROMOTE from some remote node.

This also required modifying box.ctl.promote() behaviour: it's no
longer a no-op on a synchro queue owner, when elections are disabled and
the queue is frozen due to restart.
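
The startup behaviour described above can be modelled roughly as follows. This is an illustrative Python sketch, not Tarantool's C implementation; `RecoveredQueueOwner`, `ReadOnlyError` and all method names are hypothetical.

```python
class ReadOnlyError(Exception):
    """Stands in for ER_READONLY in this sketch."""


class RecoveredQueueOwner:
    """Hypothetical model of a synchro queue owner right after restart."""

    def __init__(self):
        # After recovery the owner cannot know whether it was deposed
        # during its downtime, so it starts frozen.
        self.frozen_until_promotion = True
        self.accepted = []

    def on_promote(self):
        # A PROMOTE issued locally or received from a remote node makes
        # it safe to return to the old mode of operation.
        self.frozen_until_promotion = False

    def submit(self, txn):
        if self.frozen_until_promotion:
            raise ReadOnlyError("synchro queue is frozen until promotion")
        self.accepted.append(txn)
        return txn
```

In this model a freshly recovered owner rejects `submit()` until `on_promote()` has been called once, which mirrors why `box.ctl.promote()` can no longer be a no-op in that state.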

Also fix the tests, which assumed the queue owner is writeable after a
restart. gh-5298 test was partially deleted, because it became pointless.

And while we are at it, remove the double run of gh-5288 test. It is
storage engine agnostic, so there's no point in running it for both
memtx and vinyl.

Part-of #5295

NO_CHANGELOG=covered by previous commit

@TarantoolBot document
Title: ER_READONLY error receives new reasons

When box.info.ro_reason is "synchro" and some operation throws an
ER_READONLY error, this error now might include the following reason:
```
Can't modify data on a read-only instance - synchro queue with term 2
belongs to 1 (06c05d18-456e-4db3-ac4c-b8d0f291fd92) and is frozen due to
fencing
```
This means that the current instance is indeed the synchro queue owner,
but it has noticed that someone else in the cluster might start new
elections or might overtake the synchro queue soon.
This may also be detected by `box.info.election.term` becoming greater than
`box.info.synchro.queue.term` (this is the case for the second error
message).
There is also a slightly different error message:
```
Can't modify data on a read-only instance - synchro queue with term 2
belongs to 1 (06c05d18-456e-4db3-ac4c-b8d0f291fd92) and is frozen until
promotion
```
This means that the node simply cannot guarantee that it is still the
synchro queue owner (for example, after a restart, when a node still thinks
it is the queue owner, but someone else in the cluster has already
overtaken the queue).

6cc1b1f2

txn_limbo: fence upon receiving raft term greater than queue term · 0e48475d

Serge Petrenko authored 2 years ago

Receiving a raft term greater than the current queue term means that
someone has either already written PROMOTE (in case elections are
disabled), or is going to write PROMOTE once they win the elections (in
case they are enabled).

In both cases the queue owner in an old term should freeze the limbo
until queue term catches up with raft term.

Unfreezing happens automatically once synchro queue term catches up.
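
A minimal sketch of the rule, assuming the fenced state can be derived from the two terms alone. `TermTracker` and its methods are hypothetical names for illustration, not Tarantool identifiers.

```python
class TermTracker:
    """Hypothetical model: fencing derived from the raft and queue terms."""

    def __init__(self, term):
        self.raft_term = term
        self.queue_term = term

    def on_raft_term(self, term):
        # Terms only grow; stale values are ignored.
        self.raft_term = max(self.raft_term, term)

    def on_promote(self, term):
        # A PROMOTE moves the queue term forward.
        self.queue_term = max(self.queue_term, term)

    @property
    def is_fenced(self):
        # Frozen while a newer raft term exists than the queue has seen.
        return self.raft_term > self.queue_term
```

Because `is_fenced` is computed rather than stored, the "automatic" unfreezing falls out of the comparison once `on_promote()` brings the queue term level with the raft term.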

Part-of #5295

NO_DOC=covered by next commit

0e48475d

txn_limbo: rework limbo->frozen flag · ce0a83eb

Serge Petrenko authored 2 years ago

Soon there will be more reasons for a transaction limbo to be frozen.
Let's make the limbo->frozen flag a bitmap and rename it to
limbo->frozen_reasons.
The first bit, named frozen_due_to_fencing, represents the only current
reason for the limbo to be frozen.
While we are at it, rename txn_limbo_(un)freeze to txn_limbo_(un)fence
to better reflect the situation.
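
To illustrate why a bitmap is more convenient than a single flag, here is a small Python sketch. Only the first bit corresponds to a name from the commit message; `FROZEN_EXAMPLE_REASON` and the `LimboFlags` class are hypothetical.

```python
FROZEN_DUE_TO_FENCING = 1 << 0   # the only reason named in this commit
FROZEN_EXAMPLE_REASON = 1 << 1   # hypothetical future reason, illustration only


class LimboFlags:
    """Hypothetical holder of the frozen_reasons bitmap."""

    def __init__(self):
        self.frozen_reasons = 0

    def freeze(self, reason):
        self.frozen_reasons |= reason

    def unfreeze(self, reason):
        self.frozen_reasons &= ~reason

    def is_frozen(self):
        # Frozen as long as at least one reason bit is still set.
        return self.frozen_reasons != 0
```

With independent bits, clearing one reason does not unfreeze the limbo while another reason is still in effect.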

Part-of #5295

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

ce0a83eb

txn_limbo: preserve confirmed_lsn after reading a PROMOTE · 896a20e4

Serge Petrenko authored 2 years ago

Previously we assumed that every PROMOTE request changes limbo owner,
and thus limbo should have confirmed_lsn = 0 after the request is
processed, because new confirmed lsn is yet unknown.

This is not true for PROMOTE requests coming in JOIN or saved in
snapshot: such requests don't change limbo owner: they are like
savepoints, they notify the instance of the current limbo state.

Such promotions may be detected by the rule
replica_id (old limbo owner) == origin_id (new limbo owner)

So, for the sake of correct split-brain detection, confirmed_lsn should
be nonzero after such promotions.
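
The detection rule can be written down as a tiny function. This is a sketch of the decision only; the function name and the way the current value is passed in are assumptions made for illustration.

```python
def confirmed_lsn_after_promote(replica_id, origin_id, confirmed_lsn):
    """Return the confirmed_lsn the limbo should keep after a PROMOTE.

    replica_id is the old limbo owner, origin_id is the new one.
    """
    if replica_id == origin_id:
        # The owner did not change (PROMOTE from JOIN or a snapshot):
        # keep what is already known.
        return confirmed_lsn
    # A real ownership change: the new confirmed lsn is not known yet.
    return 0
```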

Part-of #5295

NO_DOC=internal change
NO_TEST=tested in future commits
NO_CHANGELOG=internal change

896a20e4

test: refactor gh_6036_qsync_order test · 978731b3

Serge Petrenko authored 2 years ago


The test involves creating a manual split-brain between nodes r1 and r2.
After the split-brain detection introduction it's impossible to reuse
the nodes in the next test without recreating them.

Let's fix that by switching nodes r1 and r3. Now there's a split-brain
between (r1, r2) and r3, and r3 isn't used in the following tests and
may be safely deleted.

Follow-up #5295

NO_DOC=refactoring
NO_CHANGELOG=refactoring

Signed-off-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

978731b3

relay: fix PROMOTE and raft term ordering · 67090419

Serge Petrenko authored 2 years ago

Fix two issues with sent_raft_term calculations:
* First, it doesn't matter during initial and final join, so set it
  to UINT64_MAX.
* Second, it's nullified after a successful dispatch from the tx
  thread. This might make the relay stall forever, for example when
  elections are disabled.

NO_DOC=bugfix
NO_TEST=tested in next commit

67090419

Jun 16, 2022

core: allow spurious wakeups in coio_waitpid · 7a582646

Ilya Verbin authored 2 years ago

Currently it's possible to wake up a fiber that is waiting for a
child process termination, using the Tarantool C API. This will leave
a zombie process behind. This patch reworks `coio_waitpid` in such
a way that it yields until `cw.data` is set to NULL in the process
status change callback.

Part of #7166

NO_DOC=refactoring
NO_CHANGELOG=refactoring

7a582646

core: allow spurious wakeups in cord_cojoin · 87e7d312

Ilya Verbin authored 2 years ago

Currently it's possible to wake up a fiber that is waiting for task
completion, using the Tarantool C API. This will cause a "wrong fiber woken"
panic. This patch reworks `cord_cojoin` in such a way that it yields
until a completion flag is set.

Part of #7166

NO_DOC=refactoring
NO_CHANGELOG=refactoring

87e7d312

core: get rid of fiber_set_cancellable in hot_standby_f · 65470cb4

Ilya Verbin authored 2 years ago

Currently it's possible to wake up a `hot_standby_f` fiber from Lua.
This does not lead to any error, but it results in redundant
`recover_remaining_wals` calls.
This patch handles such spurious wakeups in `hot_standby_f`.

Part of #7166

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

65470cb4

core: get rid of fiber_set_cancellable in gc_checkpoint_fiber_f · 6e5b89e0

Ilya Verbin authored 2 years ago

Spurious wakeups are already handled correctly.

Part of #7166

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

6e5b89e0

box: fix transaction "read-view" and "conflicted" states · 4d52199e

Georgiy Lebedev authored 2 years ago

Currently, there is a fundamental logical inconsistency with read-view and
conflicted states of transactions.

Conflicted transactions see all prepared changes (e.g., #7238), because
they are handled differently than read-view ones. At the same time, one
does not know the state of the transaction until `box.commit` is called.

A similar problem arises with read-view transactions: if such transactions
do any DML statements, they are de-facto conflicted, but this will only be
determined at preparation stage:
https://github.com/tarantool/tarantool/blob/79245573dabf3c1eb4eb904fd80ee84270360476/src/box/txn.c#L1006-L1013

Fix this inconsistency by the following changes:
1. Conflict "read-view" transactions on attempt to perform DML statements
immediately — guarantee this with an assertion at preparation stage.
2. Make conflicted transactions unconditionally throw "Transaction has been
aborted by conflict" error on any CRUD operations (including read-only
ones) until they are either rolled back (which will return no error) or
committed (which will return the same error).
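
The two rules above can be summarised in a small state model. This is a hedged Python sketch: the `Txn` class and its methods are hypothetical and only mirror the behaviour described in this message, not the code in `txn.c`.

```python
class TransactionConflictError(Exception):
    pass


CONFLICT_MESSAGE = "Transaction has been aborted by conflict"


class Txn:
    """Hypothetical model of the states discussed in the commit message."""

    def __init__(self, read_view=False):
        self.read_view = read_view
        self.conflicted = False

    def _check(self):
        if self.conflicted:
            raise TransactionConflictError(CONFLICT_MESSAGE)

    def select(self):
        self._check()              # even read-only operations fail
        return []

    def replace(self, row):
        if self.read_view:
            self.conflicted = True  # DML in a read view conflicts at once
        self._check()

    def commit(self):
        self._check()              # committing reports the same error

    def rollback(self):
        pass                       # rolling back returns no error
```

The point of the model is that the transaction's state becomes observable on the very next operation instead of only at `commit`.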

Closes #7238
Closes #7239
Closes #7240

@TarantoolBot document
Title: new behaviour of "conflicted" transactions

"Conflicted" transactions now return the "Transaction has been aborted by
conflict" error on any CRUD operations (including read-only ones), until
they are either rolled back (which will return no error) or committed
(which will return the same error).

4d52199e

tutorial: use https links · 2f80fbf0

Sergey Bronnikov authored 2 years ago

NO_CHANGELOG=internal
NO_DOC=internal
NO_TEST=internal

2f80fbf0

tools: fix gdb.sh revision regex · 375ceaaa

Pavel Balaev authored 2 years ago

Regular expression now works on versions: alpha, beta, rc and so on.

NO_DOC=bugfix
NO_TEST=bugfix
NO_CHANGELOG=bugfix

375ceaaa

tools: edit gdb.sh code formatting · 83e8c50f

Pavel Balaev authored 2 years ago

Tabs were replaced with spaces to bypass checkpatch.

NO_DOC=bugfix
NO_TEST=bugfix
NO_CHANGELOG=bugfix

83e8c50f

Jun 14, 2022

test: fix flaky vinyl/tx_gap_lock test · b3f462bf

Vladimir Davydov authored 2 years ago

The `cmp_tuple` helper function is broken - it assumes that all tuple
fields, including the payload, are numeric. It isn't true - the payload
field is either nil or string. This results in a false-positive test
failure:

```
error: '[string "function cmp_tuple(t1, t2)     for i = 1, PAY..."]:1:
       attempt to compare nil with string'
```

Closes #6336

NO_DOC=test
NO_CHANGELOG=test

b3f462bf

test-run: bump to new version · e6e73423

Yaroslav Lobankov authored 2 years ago

Bump test-run to new version with the following improvements:

  - Fix issue with not detecting successful server start [1]

[1] tarantool/test-run#343

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff

e6e73423

Jun 09, 2022

test: set shutdown timeout to infinity for default luatest instance · ede831d3

Vladimir Davydov authored 2 years ago

With the default shutdown timeout of 3 seconds, a test that leaves
behind asynchronous requests would still pass, but it would take longer
to finish, because the server instance started by Tarantool would have
to wait for the dangling requests to complete. Setting the timeout to
infinity will result in a hang, making us fix the test.

Infinite timeout is also good for catching bugs like #7225 and #7256.

We don't set the timeout for diff and TAP tests because those are
deprecated and shouldn't be used for writing new tests. Nevertheless,
I manually checked that none of them hangs if the timeout is set to
infinity.

Closes #6820

NO_DOC=test
NO_CHANGELOG=test

ede831d3

iostream: shutdown socket fd before close · 9cf03555

Vladimir Davydov authored 2 years ago

If a socket fd is shared by a child process, closing it in the parent
will not shut down the underlying connection. As a result, the server
may hang executing the graceful shutdown protocol. Fix this problem by
explicitly shutting down the connection socket fd before closing it.

This is a recommended way to terminate a Unix socket connection, see
http://www.faqs.org/faqs/unix-faq/socket/#:~:text=2.6.%20%20When%20should%20I%20use%20shutdown()%3F
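
The effect is easy to reproduce outside Tarantool. The Python sketch below uses a duplicated descriptor to stand in for the copy a child process would inherit; the helper names are hypothetical and the 0.2 second timeout is an arbitrary choice for the demonstration.

```python
import socket


def close_connection(sock):
    """Shut the connection down first, then release the descriptor."""
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # e.g. the peer has already disconnected
    sock.close()


def peer_sees_eof(use_shutdown):
    """Return True if the peer observes EOF after we close our end."""
    ours, peer = socket.socketpair()
    inherited = ours.dup()  # stands in for a copy held by a child process
    peer.settimeout(0.2)
    try:
        if use_shutdown:
            close_connection(ours)
        else:
            ours.close()
        try:
            return peer.recv(1) == b""
        except socket.timeout:
            return False  # the connection is still open via `inherited`
    finally:
        inherited.close()
        peer.close()
```

Under these assumptions `peer_sees_eof(False)` times out because the duplicate keeps the connection alive, while `peer_sees_eof(True)` observes EOF right away.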

Closes #7256

NO_DOC=bug fix

9cf03555

wal: allow spurious wakeups in wal_write · 4bf52367

Ilya Verbin authored 2 years ago

It's possible to wake up a fiber that is waiting for WAL write
completion, using the Tarantool C API. This results in an error like:
```
main/118/lua F> Journal result code -1 can't be converted to an error
```

This patch introduces a flag, which is set when WAL write is
finished, that allows fibers to yield until the flag is set.
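
The "yield until the flag is set" pattern can be sketched with a Python generator playing the role of a fiber. This is an analogy rather than Tarantool code: `wal_write_wait`, `resume` and the dictionary keys are hypothetical.

```python
def wal_write_wait(entry):
    """Generator standing in for a fiber that waits for a WAL write."""
    while not entry["is_complete"]:
        # Analogue of yielding the fiber. Whoever resumes us, we re-check
        # the flag instead of assuming the write has finished.
        yield "waiting"
    return entry["result"]


def resume(fiber):
    """Wake the 'fiber' once and report what happened."""
    try:
        next(fiber)
        return ("waiting", None)
    except StopIteration as stop:
        return ("done", stop.value)
```

A wakeup that arrives before the flag is set simply sends the waiter back to sleep, so a spurious wakeup can no longer be mistaken for completion.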

Closes #6506

NO_DOC=bugfix

4bf52367

test-run: bump to new version · 0dc60b5f

Yaroslav Lobankov authored 2 years ago

Bump test-run to new version with the following improvements:

  - Fail *.test.py tests in case of server start errors [1]

[1] tarantool/test-run#333

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff

0dc60b5f

Jun 08, 2022

sql: fix wrong ephemeral space format · a6818acc

Mergen Imeev authored 2 years ago

This patch fixes format building when an ephemeral space was used in
ORDER BY and ORDER BY uses at least two variables from the list of
selected columns.

Closes #7042

NO_DOC=Bugfix

a6818acc

decimal: fix index comparison with Inf, NaN · 22fc1f94

Serge Petrenko authored 3 years ago

There was an assertion failure when inserting a decimal into an index
which contained double Inf or NaN.

The reason for that was never checking decimal_from_*() return values,
and decimal_from_double() not being able to handle NaN or Inf, because
these values are not representable in decimal numbers.

Start handling decimal_from_<type> return values and fix decimal
comparison with Inf, NaN.
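
A rough Python sketch of the idea: check the conversion result and handle the non-representable values explicitly. Python's own `Decimal` can hold NaN and Inf, so `decimal_from_double` deliberately returns `None` for them to mimic a library that cannot. The function names are illustrative, and the position of NaN in the ordering is an assumption of this sketch.

```python
import math
from decimal import Decimal


def decimal_from_double(value):
    """Return a Decimal, or None when the double is NaN or infinite."""
    if not math.isfinite(value):
        return None
    return Decimal(value)


def compare_decimal_with_double(dec, dbl):
    """Three-way comparison that never ignores a failed conversion."""
    converted = decimal_from_double(dbl)
    if converted is not None:
        return (dec > converted) - (dec < converted)
    if math.isnan(dbl):
        return 1  # assumption of this sketch: NaN sorts before any number
    return -1 if dbl > 0 else 1  # +Inf is above, -Inf below any decimal
```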

Closes #6377

NO_DOC=bugfix

22fc1f94

Jun 07, 2022

test: use unix socket in replication-py/swim tests · cb6fc4a3

Yaroslav Lobankov authored 2 years ago

To reduce the chance of encountering the tarantool/test-run#141 issue in
replication-py/swim tests, let's switch to using unix sockets instead
of TCP ports for the tarantool console.

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff

cb6fc4a3

Jun 06, 2022

core: allow spurious wakeups in cbus_call · bd6fb06a

Ilya Verbin authored 2 years ago

Currently it's possible to wake up a fiber that is waiting for `cbus_call`
completion, using the Tarantool C API. This will cause a misleading `TimedOut`
error. This patch reworks `cbus_call` in such a way that it yields until
a completion flag is set.

Part of #7166

NO_DOC=refactoring
NO_CHANGELOG=refactoring

bd6fb06a

core: get rid of unused cbus_flush · e568e7f0

Ilya Verbin authored 2 years ago

Part of #7166

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

e568e7f0

datetime: refactor interval_to_string · b7ff1615

Timur Safin authored 2 years ago

Simplify/shorten `interval_to_string()` implementation.

Part of #7045

NO_CHANGELOG=refactoring
NO_DOC=refactoring
NO_TEST=refactoring

b7ff1615

datetime: do not mess with nsec in interval · 36bc6f83

Timur Safin authored 2 years ago

Do not try to make the secs/nsec output more readable;
report them as is, without any [de]normalization.

Previously, the output was:
```
tarantool> dt.interval.new{min=1, sec=59, nsec=2e9+1}
--
- +1 minutes, 61.000000001 seconds
...
```

Now the output is:
```
tarantool> dt.interval.new{min=1, sec=59, nsec=2e9+1}
--
- +1 minutes, 59 seconds, 2000000001 nanoseconds
...
```
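
The difference between the two outputs can be reproduced with a short Python sketch. It covers only the three fields used in the example; both function names are hypothetical, and the real `interval_to_string()` handles more fields whose formatting rules are not reproduced here.

```python
NSEC_PER_SEC = 1_000_000_000


def format_normalized(minutes, sec, nsec):
    """Prior behaviour: fold nanoseconds into fractional seconds."""
    extra_sec, rest_nsec = divmod(nsec, NSEC_PER_SEC)
    return f"{minutes:+d} minutes, {sec + extra_sec}.{rest_nsec:09d} seconds"


def format_as_is(minutes, sec, nsec):
    """New behaviour: report every field exactly as it was given."""
    return f"{minutes:+d} minutes, {sec} seconds, {nsec} nanoseconds"
```

Called with `(1, 59, 2_000_000_001)`, the first function yields the old `61.000000001 seconds` form and the second keeps `59 seconds` and `2000000001 nanoseconds` apart.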

Closes #7045

NO_DOC=internal

36bc6f83

net.box: fix hang in graceful shutdown protocol · 79245573

Vladimir Davydov authored 2 years ago

The graceful shutdown protocol works as follows:

 1. The server sends a shutdown request (the box.shutdown event) to all
    its clients that subscribed to it.
 2. Upon receiving a shutdown request, a client is supposed to close its
    connection.
 3. The server waits for all clients subscribed to box.shutdown event to
    exit.
 4. The server exits.

In net.box, the box.shutdown event is processed by `remote._callback`.
The problem is that `remote._callback` may be garbage collected
while the `remote` object isn't. If this happens, the shutdown request
will never get processed, and the server won't exit until the `remote`
object is garbage collected, which may take forever.

Let's fix this issue by breaking the worker loop if we see that the
callback was garbage collected.
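
The shape of the fix can be illustrated with a weak reference in Python. net.box itself is written in Lua, so this is only an analogy: `worker_loop`, `Recorder` and the returned status strings are hypothetical.

```python
import weakref


class Recorder:
    """Example callback object that remembers the events it has seen."""

    def __init__(self):
        self.events = []

    def __call__(self, event):
        self.events.append(event)


def worker_loop(callback_ref, incoming_events):
    """Dispatch events until they run out or the callback is collected."""
    for event in incoming_events:
        callback = callback_ref()  # strong reference for this iteration only
        if callback is None:
            # Nobody can handle the shutdown request any more: stop the
            # worker so the connection can be closed instead of lingering.
            return "callback collected"
        callback(event)
    return "no more events"
```

Once the loop returns, the caller is free to close the connection, which is what lets the server finish its graceful shutdown.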

Closes #7225

NO_DOC=bug fix

79245573

Jun 02, 2022

build: define TZDIR for tzcode build · b9c9a7b0

Boris Stepanenko authored 2 years ago

NixOS (and probably some other distributions) places the zoneinfo directory
somewhere other than /usr/share (in /etc, for example) and sets TZDIR
accordingly. Currently zoneinfo is looked for in /usr/share, disregarding
the TZDIR env variable.

This commit adds a compile definition for TZDIR if such an env variable is
defined. This fixes zoneinfo lookup on NixOS.

NO_CHANGELOG=build
NO_DOC=build
NO_TEST=build

b9c9a7b0

Revert "github-ci: use openssl@1.1" · 6605de25

Vladimir Davydov authored 2 years ago

This reverts commit 33830978.

Follow-up #6477

NO_DOC=ci
NO_TEST=ci
NO_CHANGELOG=ci

6605de25

Revert "ci: fix RPM spec to build packages for Fedora 36" · 7e1df16e

Vladimir Davydov authored 2 years ago

This reverts commit 9d1f9f0e.

Follow-up #6477

NO_DOC=ci
NO_TEST=ci
NO_CHANGELOG=ci

7e1df16e

crypto: OpenSSL 3.0 support · e3bf73c8

Vladimir Davydov authored 2 years ago

Two things we need to do to fix the build with OpenSSL 3.0:

1. Use EVP_MAC_* functions instead of HMAC_*
   https://www.openssl.org/docs/man3.0/man3/HMAC_CTX_new.html

2. Load the Legacy provider to enable legacy algorithms, such as MD4
   https://wiki.openssl.org/index.php/OpenSSL_3.0#Programming_in_OpenSSL_3.0

Closes #6477

NO_DOC=build fix
NO_TEST=build fix
NO_CHANGELOG=build fix

e3bf73c8

ssl: move OpenSSL library initialization code to separate file · f9739160

Vladimir Davydov authored 2 years ago

We redefine ssl_init and ssl_free in the EE build, because we need to do
some extra work there. Currently, it's fine to duplicate the bulk of the
OpenSSL library initialization code between EE and CE repositories, but
with the introduction of OpenSSL 3.0 it's going to become more
complicated so duplicating would look bad. Let's move the common code to
ssl_init_impl() and ssl_free_impl() helper functions.

Needed for #6477

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

f9739160

crypto: use ERR_reason_error_string instead of ERR_error_string · 9cc130f0

Vladimir Davydov authored 2 years ago

ERR_error_string adds some extra information that depends on the OpenSSL
library version (code, module, method). This information says nothing to
the end user, and it results in different test results after updating to
OpenSSL 3.0. Let's use ERR_reason_error_string instead, which just
prints a human-readable error message.

Part of #6477

NO_DOC=minor change in error message
NO_CHANGELOG=minor change in error message

9cc130f0

crypto: fix openssl_err_str · f72662c5

Vladimir Davydov authored 2 years ago

openssl_err_str is used for reporting OpenSSL errors. It calls
crypto_ERR_* functions using FFI. There's a typo in the code:
ffi.crypto_ERR_error_string is used instead of ffi.C.*.

We don't normally step on this, because OpenSSL doesn't return errors
in our configuration, but if it did for some reason (e.g. a cipher was
disabled in the library), we'd get a confusing error message.

NO_DOC=bug fix
NO_TEST=occur only on internal error
NO_CHANGELOG=occur only on internal error

f72662c5

Jun 01, 2022

ci: clean workspace on self-hosted runners · 7ff87404

artembo authored 2 years ago

Added 'tarantool/actions/cleanup' action to each job which uses
self-hosted runners.

The action cleans the workspace directory of a self-hosted runner after
the previous run. The main reason to add this action is the 'Need a single
revision' error [1] caused by a conflict of submodule versions;
the standard 'actions/checkout' action fails with this error. It's a
well-known problem and the related issue [2] is still open.

[1] https://github.com/tarantool/tarantool-qa/issues/145
[2] https://github.com/actions/checkout/issues/418

NO_DOC=ci
NO_TEST=ci
NO_CHANGELOG=ci

Closes tarantool/tarantool-qa#145

7ff87404

core: introduce clock_lowres · 37d5ac5a

Andrey Saranchin authored 2 years ago

This patch introduces a non-thread-safe low-resolution
monotonic clock based on an interval timer. It should
be used only by the thread that initialized it.

Part of #6085

NO_CHANGELOG=internal feature
NO_DOC=internal feature

37d5ac5a

replace sigprocmask() with pthread_sigmask() · 50107cf2

Andrey Saranchin authored 2 years ago

Since the use of sigprocmask() is unspecified in a multithreaded
process, we should use pthread_sigmask() instead. This patch
replaces all sigprocmask() calls with their pthread analogue.
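
For readers unfamiliar with the per-thread call, Python exposes the same POSIX function as `signal.pthread_sigmask` (Unix only). The sketch below shows its block and restore usage; the three wrapper functions are hypothetical helpers, not part of the patch.

```python
import signal


def block_in_current_thread(signals):
    """Block `signals` for the calling thread; return the previous mask."""
    return signal.pthread_sigmask(signal.SIG_BLOCK, signals)


def restore_mask(previous_mask):
    """Put back a mask returned by block_in_current_thread()."""
    signal.pthread_sigmask(signal.SIG_SETMASK, previous_mask)


def current_mask():
    # Blocking an empty set changes nothing and returns the current mask.
    return signal.pthread_sigmask(signal.SIG_BLOCK, [])
```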

NO_TEST=refactoring
NO_CHANGELOG=refactoring
NO_DOC=refactoring

50107cf2

ci: don't overwrite job artifacts in pkg workflows · 37759699

Yaroslav Lobankov authored 2 years ago

To ensure that regular and GC64 jobs in packaging workflows don't
overwrite artifacts of each other, we need to use a different artifact
name per job.

NO_DOC=ci
NO_TEST=ci
NO_CHANGELOG=ci

37759699