Commits · 17289440bf3c75e573824e53db226084b241bd33 · core / tarantool

May 13, 2021

recovery: make it yield when positioning in a WAL · 17289440

Serge Petrenko authored 4 years ago

We had various places in box.cc and relay.cc which counted processed
rows and yielded every now and then. These yields didn't cover cases
when recovery has to position inside a long WAL file.

For example, tarantool may exit without leaving an empty WAL file,
which would otherwise be used to recover the instance vclock on restart.
In this case the instance freezes while processing the last available
WAL in order to recover the vclock.

Another issue is with replication. If a replica connects and needs data
from the end of a really long WAL, recovery will read up to the needed
position without yields, making relay disconnect by timeout.

In order to fix the issue, make recovery decide when a yield should
happen. Once recovery decides so, it calls an xstream callback,
schedule_yield. Currently schedule_yield is fired once recovery
processes (either skips or writes) WAL_ROWS_PER_YIELD rows.

schedule_yield either yields right away, in case of relay, or saves the
yield for later, in case of local recovery, because it might be in the
middle of a transaction.
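
The counting scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the value of WAL_ROWS_PER_YIELD, the `row_counter` struct and both callbacks are assumptions made for the example, and the callbacks only record that a yield was requested instead of touching real fibers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value; the commit message does not state the real one. */
enum { WAL_ROWS_PER_YIELD = 1024 };

/* Hypothetical stand-in for the xstream schedule_yield callback. */
typedef void (*schedule_yield_f)(void *ctx);

/* Illustrative per-recovery state: a row counter plus the callback. */
struct row_counter {
	size_t row_count;
	schedule_yield_f schedule_yield;
	void *ctx;
};

/*
 * Account for one processed (skipped or written) row and fire the
 * callback on every WAL_ROWS_PER_YIELD-th row.
 */
void
row_counter_on_row(struct row_counter *rc)
{
	assert(rc->schedule_yield != NULL);
	if (++rc->row_count % WAL_ROWS_PER_YIELD == 0)
		rc->schedule_yield(rc->ctx);
}

/* Relay-like callback: a real one would yield here; this one counts. */
void
count_yield(void *ctx)
{
	++*(size_t *)ctx;
}

/* Local-recovery-like callback: only remember that a yield is due. */
void
defer_yield(void *ctx)
{
	*(int *)ctx = 1;
}
```

With the deferring variant, the caller would check the flag once the current transaction is complete and yield only then.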

Closes #5979


May 04, 2021

fiber: fiber_join -- don't crash on misuse · 4500547d

Cyrill Gorcunov authored 4 years ago


If we call fiber_join() on a non-joinable fiber, we trigger an
assert and crash (on debug builds).

On release builds the asserts are compiled out and cause no
problems, but there is another issue: the target fiber will
get fiber_reset() called twice, which in turn calls
unregister_fid() with id = 0 (not causing a crash, but definitely
unintended), and we will drop stack protection which
might not be ours anymore.

Since we're not allowed to break the API at the C level, let's just
panic early on such misuse; it is far better than
continuing to operate on potentially corrupted data in memory.

Fixes #6046

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>


fiber: fiber_join -- drop redundant variable · 5cf721ff

Cyrill Gorcunov authored 4 years ago


No need for additional variable here.

In-scope-of #6046

Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>


Apr 30, 2021

build: fix linker flags for executable on MacOS · e50a6d96

Igor Munkin authored 4 years ago


This patch fixes an inaccuracy in the Tarantool build configuration
introduced by commit 07c83aab ('build: adjust LuaJIT build system').
All those MacOS-related tweaks for __PAGEZERO
size and preferred load address for the bundle are necessary only for
builds with 32-bit GC area on 64-bit host. The only case fitting these
conditions is x86_64 with no LUAJIT_ENABLE_GC64. All other 64-bit builds
use 64-bit GC area unconditionally.

Part of #5983
Needed for #5629
Follows up #4862

Reviewed-by: Sergey Kaplun <skaplun@tarantool.org>
Reviewed-by: Nikita Pettik <korablev@tarantool.org>
Reviewed-by: Sergey Ostanevich <sergos@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>


Apr 28, 2021

txn: destroy commit and rollback triggers · 7fd53b4c

Vladislav Shpilevoy authored 4 years ago

The commit and rollback triggers were never deleted. That worked
fine for DDL and replication, for which they were introduced in the
first place, because those triggers are allocated on the region.

But it didn't work once the triggers became available in the public
API, because those are allocated on the heap. As a result, all the
box.on_commit() and box.on_rollback() triggers leaked.

The patch ensures all the on_commit/on_rollback triggers are
destroyed.

The statement triggers on_commit/on_rollback are left intact since
they are private and never need deletion, but the patch adds
assertions in case they ever would need to be destroyed.

Another option was to force all the commit and rollback triggers
to clear themselves. For example, on commit all the on_commit
triggers would clear themselves and the rollback triggers would be
destroyed, and vice versa on rollback. This would avoid
destroying on_commit triggers manually on commit. But
it didn't work because the Lua triggers all run via a common
runner, lbox_trigger_run(), which can't destroy its argument in
most cases (space:on_replace, :before_replace, ...). It
could be patched, but that would require extra work in the Lua
triggers affecting all of them, which overall might be no better.

Closes #6025


Apr 27, 2021

test: fix regex in box-py/args.test.py · f44663ed

HustonMmmavr authored 4 years ago

Regex for validating version was expecting a single
character (digit) for version `patch`, but it's not correct.
This patch fixes test behaviour for tarantool 1.10.10
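
The difference between a single-digit and a multi-digit component can be illustrated with POSIX regular expressions. This is only a sketch: the pattern below is a simplified assumption (three numeric components and nothing else), not the expression used in box-py/args.test.py, and `version_is_valid` is a hypothetical helper.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if `s` looks like MAJOR.MINOR.PATCH, where every
 * component may have more than one digit.
 */
bool
version_is_valid(const char *s)
{
	assert(s != NULL);
	regex_t re;
	/*
	 * "[0-9]+" accepts one or more digits. A bare "[0-9]" for the
	 * patch component would reject "1.10.10".
	 */
	const char *pattern = "^[0-9]+\\.[0-9]+\\.[0-9]+$";
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	bool ok = regexec(&re, s, 0, NULL, 0) == 0;
	regfree(&re);
	return ok;
}
```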

Close #6039


fiber: use wakeup safely on self everywhere · 8d1ebd83

Vladislav Shpilevoy authored 4 years ago

The previous commit made fiber_wakeup() safe to use on the current
fiber. Leverage the new behaviour everywhere in the source code to
remove all checks f != fiber() before fiber_wakeup(f) calls.

Follow-up #5292


fiber: make wakeup in Lua and C nop on self · db0ded5d

Vladislav Shpilevoy authored 4 years ago

fiber.wakeup() in Lua and fiber_wakeup() in C could lead to a
crash or undefined behaviour when called on the currently running
fiber.

In particular, if a new fiber was started in a blocking way
(fiber.create() and fiber_start()) after the wakeup, it would crash
in a debug build and lead to unknown results in release.

If the wakeup was followed by a sleep with a non-zero timeout or an
infinite yield (fiber_yield()), the fiber woke up in the same
event loop iteration regardless of any timeout or other wakeups.
It was a spurious wakeup, which is not expected in most
places internally.

The patch makes the wakeup a no-op on the current fiber, making it
safe to use anywhere.

Closes #5292
Closes #6043

@TarantoolBot document
Title: fiber.wakeup() in Lua and fiber_wakeup() in C are nop on self
In Lua, calling `fiber.wakeup()` on the current fiber does nothing
and is safe to use. The same holds for `fiber_wakeup()` in C.


wal: sanitize wal_mode · 5bd34c97

Cyrill Gorcunov authored 4 years ago


 - add comments about modes
 - there is no need to carry NULL in wal_mode_STRS
 - use designated assignment in wal_mode_STRS
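
A rough sketch of what the last two points amount to: when the array is sized by the enum's trailing member and filled with designated initializers, no NULL terminator is needed. The enum members, the strings and the `wal_mode_by_name` helper are assumptions made for illustration and may not match the actual source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative mode list; the trailing member is the element count. */
enum wal_mode {
	WAL_NONE = 0,
	WAL_WRITE,
	WAL_FSYNC,
	wal_mode_MAX
};

/* Each string is tied to its enum value, independent of ordering. */
const char *wal_mode_STRS[wal_mode_MAX] = {
	[WAL_NONE] = "none",
	[WAL_WRITE] = "write",
	[WAL_FSYNC] = "fsync",
};

/* Return the mode named `name`, or wal_mode_MAX if it is unknown. */
enum wal_mode
wal_mode_by_name(const char *name)
{
	assert(name != NULL);
	for (int i = 0; i < wal_mode_MAX; i++) {
		if (strcmp(wal_mode_STRS[i], name) == 0)
			return (enum wal_mode)i;
	}
	return wal_mode_MAX;
}
```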

Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Acked-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>


Apr 26, 2021

txn_limbo: simplify txn_limbo_on_parameters_change · 63979fec

Cyrill Gorcunov authored 4 years ago


There is no need to test @confirm_lsn since we
continue traversing the list in any case.

I keep "continue" here to be consistent with other "if"s.

Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Acked-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>


txn_limbo: simplify txn_limbo_ack · d88e160f

Cyrill Gorcunov authored 4 years ago


There is no need to test @confirm_lsn at all because
even with value -1 we still continue iterating the queue.
Let's drop it and save a branch.

I keep "continue" here to be consistent with other "if"s.

Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Acked-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>


txn_limbo: simplify owner migration condition · 99854091

Cyrill Gorcunov authored 4 years ago


When the limbo owner is about to change, we should check
whether there are pending transactions which are not yet processed,
i.e. whether the queue is empty. There is no need to test whether the
current limbo owner is zero. The owner is set to zero only once, when
the limbo is created during initialization.

After all, I think even if the owner were ever zero and we were about
to change it, the queue simply must be empty; that is the only
safe state.

Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Acked-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>


Apr 22, 2021

digest: introduce FFI bindings for xxHash32/64 · f998ea39

Oleg Babin authored 4 years ago

This patch introduces new hash types for digest module - xxHash32
and xxHash64.

Closes #2003

@TarantoolBot document
Title: digest module supports xxHash32/64

```lua
-- Examples below demonstrate xxHash32.
-- xxHash64 has exactly the same interface

-- Calculate the 32-bit hash (default seed is 0).
digest.xxhash32(string[, seed])

-- Streaming
-- Start a new hash by initializing the state with a seed.
-- If no value is provided, 0 is used by default.
xxhash = digest.xxhash32.new([seed])
-- Reset the state. A new seed can be specified; if no value is
-- provided, the value initially passed to "new" is used.
-- Here and below "seed" is expected to be an unsigned
-- number. The function returns nothing.
xxhash:clear([seed])
-- Feed the hash state by calling "update" as many times as
-- necessary. Function returns nothing.
xxhash:update('string')
-- Produce a hash value.
xxhash:result()
```


digest: eliminate excess table lookup · 2f1df7eb

Oleg Babin authored 4 years ago

This patch removes an excess table lookup for FFI calls. So we save
one hash operation and do it in the manner used for other
modules such as "tuple", "uuid", etc.

For details see [1] section "To Cache or Not to Cache".

  [1] https://luajit.org/ext_ffi_tutorial.html


Apr 21, 2021

Dummy commit · 15bbdab9

Kirill Yukhin authored 4 years ago

Tag: 2.9.0

Create changelog for 2.8.1 release · 06a0f4f4

Kirill Yukhin authored 4 years ago

feedback_daemon: fix indexing a nil value issue · 3c16ddc0

Serge Petrenko authored 4 years ago

When running tarantool with disabled feedback daemon the following error
appeared on each space/index create/drop:

builtin/box/feedback_daemon.lua:380: attempt to index field 'cached_events'
(a nil value)

This happened because 'cached_events' table is initialized only on
feedback daemon start.
Fix the issue by checking for type of `cached_events` and only indexing
it when it's a table.

Follow-up #5750


rfc: consistent lua/sql types · 1abe1823

Timur Safin authored 4 years ago


There is a discussion in #5910 about all the inconsistencies
we see between the Lua and SQL worlds, possible future
directions of SQL development, and the addition of new types.

This RFC is the current state of this discussion.

Part of #5910

Co-authored-by: Igor Munkin <imun@tarantool.org>
Co-authored-by: Peter Gulutzan <pgulutzan@ocelot.ca>
Co-authored-by: Mergen Imeev <imeevma@tarantool.org>
Co-authored-by: Sergey Ostanevich <sergos@tarantool.org>


feedback_daemon: speedup the first send to 2 minutes · b1f00db4

Serge Petrenko authored 4 years ago

The first send should happen sooner than the default feedback interval
to catch short-lived instances. This replaces the commit we had that
sent feedback right from the initial box.cfg{} and on the first event,
such as creation or drop of a space or index.

The reason for this commit instead of "send feedback on server start"
is that the latter was quite hacky and didn't work correctly without
some ugly crutches, namely fiber.sleep(10) in the feedback daemon code.

Follow-up #5750

(cherry picked from commit 843fa23e74178b5eb1519e78cff36bad88b03587)


feedback_daemon: do not trigger feedback send on first occurred event · d5b66641

Serge Petrenko authored 4 years ago

Follow-up #5750

(cherry picked from commit 5f37ecb788b80f0e12eb6eebcacd048cbf624ff9)

Revert "feedback_daemon: send feedback on server start" · 09a6b17a

Serge Petrenko authored 4 years ago

This reverts commit bc15e0f0.

(cherry picked from commit 3ec624b87aeb632d2bcbece9f12484cbb3298f43)

txn: make NOPs fully asynchronous · 7d8b6feb

Serge Petrenko authored 4 years ago

When a transaction consists solely of NOPs, it shouldn't wait for other
synchronous transactions to finish. It may be committed right away.

Such transactions may appear when the applier filters out synchronous
rows from an outdated instance. Appending such transactions to the limbo
could lead to the ER_UNCOMMITTED_FOREIGN_SYNC_TXNS error, which we tried
to avoid in the first place when we replaced tx rows with NOPs.

Follow-up #5445


replication: send accumulated Raft messages after relay start · 660f3323

Serge Petrenko authored 4 years ago

It may happen that a Raft leader fails to send a broadcast to the
freshly connected follower.

Here's what happens: a follower subscribes to a candidate during
ongoing elections. box_process_subscribe() sends out the current node's
Raft state, which is candidate. Suppose a relay from follower to
candidate is already set up. Follower immediately responds to the vote
request. This makes the candidate become leader. But candidate's relay
is not yet ready to process Raft messages, and is_leader message from
the candidate gets rejected. Once relay starts, it relays all the xlogs,
but the follower rejects all the data, because it hasn't received
is_leader notification from the candidate.

Fix this by sending the last rejected message as soon as relay starts
dispatching Raft messages.

Also, while we're at it rework relay_push_raft to use a pair of
pre-allocated raft messages instead of allocating a new one on every
raft state update.

Follow-up #5445


box.ctl: rename clear_synchro_queue to promote · 14d5cfad

Serge Petrenko authored 4 years ago

New function name will be `box.ctl.promote()`. It's much shorter and
closer to the function's now enriched functionality.

Old name `box.ctl.clear_synchro_queue()` remains in Lua for the sake of
backward compatibility.

Follow-up #5445
Closes #3055

@TarantoolBot document
Title: deprecate `box.ctl.clear_synchro_queue()` in favor of `box.ctl.promote()`

Replace all the mentions of `box.ctl.clear_synchro_queue()` with
`box.ctl.promote()` and add a note that `box.ctl.clear_synchro_queue()`
is a deprecated alias to `box.ctl.promote()`


box: remove parameter from clear_synchro_queue · a58ef790

Serge Petrenko authored 4 years ago

The `try_wait` parameter became redundant with the introduction of manual
elections concept. It may be determined whether the node should wait for
pending confirmations or not by looking at election mode, so remove the
parameter.

Part of #3055


election: support manual elections in clear_synchro_queue() · 9f33de44

Serge Petrenko authored 4 years ago

This patch adds support for manual elections from
`box.ctl.clear_synchro_queue()`. When an instance is in
`election_mode='manual'`, calling `clear_synchro_queue()` will make it
start a new election round.

Follow-up #5445
Part of #3055

@TarantoolBot document
Title: describe election_mode='manual'

Manual election mode is introduced. It may be used when the user wants to
control which instance is the leader explicitly instead of relying on
Raft election algorithm.

When an instance is configured with `election_mode='manual'`, it behaves
as follows:
 1) By default, the instance acts like a voter: it is read-only and may
    vote for other instances that are candidates.
 2) Once `box.ctl.clear_synchro_queue()` is called, the instance becomes a
    candidate and starts a new election round. If the instance wins the
    elections, it remains leader, but won't participate in any new elections.


raft: introduce raft_start/stop_candidate · ae4f1dc7

Serge Petrenko authored 4 years ago

Extract raft_start_candidate and raft_stop_candidate functions from
raft_cfg_is_candidate.

These functions will be used in manual elections.

Prerequisite #3055


election: introduce a new election mode: "manual" · a9b84f32

Serge Petrenko authored 4 years ago

When an instance is configured in "manual" election mode, it behaves as
a voter for most of the time, until `box.ctl.promote()` is called.

Once `box.ctl.promote()` is called the instance will behave as a
candidate for a full election round, i.e. until the leader is known. If
the instance wins the elections, it remains in `leader` state until the
next elections. Otherwise the instance returns to `voter` state.
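
The transitions described above can be modelled as a tiny state machine. This is a toy sketch under stated assumptions: the `manual_node` type, the state names and both functions are hypothetical and model only the two transitions named in the text, not the Raft implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states for a node in "manual" election mode. */
enum election_state {
	ELECTION_VOTER,
	ELECTION_CANDIDATE,
	ELECTION_LEADER,
};

struct manual_node {
	enum election_state state;
};

/* Models a promote call: a voter becomes a candidate for one round. */
void
manual_node_promote(struct manual_node *n)
{
	assert(n != NULL);
	if (n->state == ELECTION_VOTER)
		n->state = ELECTION_CANDIDATE;
}

/* Models the end of an election round, once the leader is known. */
void
manual_node_on_round_end(struct manual_node *n, bool won)
{
	assert(n != NULL);
	if (n->state == ELECTION_CANDIDATE)
		n->state = won ? ELECTION_LEADER : ELECTION_VOTER;
}
```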

Follow-up #5445
Part of #3055


txn_limbo: filter rows based on known peer terms · 560d71b4

Serge Petrenko authored 4 years ago


Start writing the actual leader term together with the PROMOTE request
and process terms in PROMOTE requests on receiver side.

Make applier only apply synchronous transactions from the instance which
has the greatest term as received in PROMOTE requests.
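
One way to picture the filtering rule is a per-instance table of terms seen in PROMOTE requests. The sketch below is an assumption-laden illustration: `term_tracker`, MAX_REPLICAS and both functions are hypothetical names, and the real limbo code may track terms differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical upper bound on instance ids for this sketch. */
enum { MAX_REPLICAS = 32 };

struct term_tracker {
	/* Greatest term seen in a PROMOTE from each instance id. */
	uint64_t promote_term[MAX_REPLICAS];
	/* Greatest term seen in any PROMOTE. */
	uint64_t greatest_term;
};

/* Record the term carried by a PROMOTE issued by `replica_id`. */
void
term_tracker_on_promote(struct term_tracker *t, uint32_t replica_id,
			uint64_t term)
{
	assert(replica_id < MAX_REPLICAS);
	if (term > t->promote_term[replica_id])
		t->promote_term[replica_id] = term;
	if (term > t->greatest_term)
		t->greatest_term = term;
}

/*
 * An instance is outdated if its last known term is behind the
 * greatest one; its synchronous transactions would be filtered out.
 */
bool
term_tracker_is_outdated(const struct term_tracker *t, uint32_t replica_id)
{
	assert(replica_id < MAX_REPLICAS);
	return t->promote_term[replica_id] < t->greatest_term;
}
```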

Closes #5445

Co-developed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>


box: write PROMOTE even for empty limbo · a33dfe07

Serge Petrenko authored 4 years ago

PROMOTE entry will be used to mark limbo ownership transition besides
emptying the limbo. So it has to be written every time
`box.ctl.clear_synchro_queue()` succeeds. Even when the limbo was
already empty.

Part of #5445


box: make clear_synchro_queue() write a PROMOTE entry instead of CONFIRM + ROLLBACK · f1746512

Serge Petrenko authored 4 years ago

A successful box_clear_synchro_queue() call results in writing a
CONFIRM(N) ROLLBACK(N+1) pair, where N is the confirmed lsn.

Let's write a single PROMOTE(N) entry instead. It'll have the same
meaning as CONFIRM + ROLLBACK and it will give followers some additional
information regarding leader state change later.

Part of #5445


box: actualise iproto_key_type array · 588b6930

Serge Petrenko authored 4 years ago

iproto_key_type array is used while validating incoming requests, but it
was only half-filled. The last initialized field was 0x2b, while
IPROTO_KEY_MAX is currently 0x54.

We got away with it, since the array is only used in xrow_header_decode(),
xrow_decode_dml() and xrow_decode_synchro(), and all the keys usually present
in these requests were present in the array. This is not true anymore,
so it's time to make array contents up to date with all the IPROTO_KEY_*
constants we have.

Part of #5445


xrow: introduce a PROMOTE entry · 23a83352

Serge Petrenko authored 4 years ago

A PROMOTE entry combines the effect of CONFIRM, ROLLBACK and RAFT_TERM
entries with some additional semantics on top.

PROMOTE carries the following arguments:

1) former_leader_id - the id of previous limbo owner whose entries we
   want to confirm.
2) confirm_lsn - the lsn of the last former leader's transaction to be
   confirmed. In this sense PROMOTE(confirm_lsn) replaces
   CONFIRM(confirm_lsn) + ROLLBACK(confirm_lsn + 1).
3) replica_id - id of the instance issuing
   `box.ctl.clear_synchro_queue()`
4) term - the new term the instance issuing
   `box.ctl.clear_synchro_queue()` has just entered.

This entry will be written to WAL instead of the usual CONFIRM +
ROLLBACK pair on a successful `box.ctl.clear_synchro_queue()` call.

Note, the usual CONFIRM and ROLLBACK occurrences (after a confirmed or
rolled back synchronous transaction) are here to stay.
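
To make the equivalence with CONFIRM(confirm_lsn) + ROLLBACK(confirm_lsn + 1) concrete, here is a small sketch. The field names follow the list above, but the field types, the `txn_fate` enum and `promote_txn_fate` are assumptions made for the example, not the actual xrow structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field names follow the commit message; the types are assumed. */
struct promote_request {
	uint32_t former_leader_id;
	int64_t confirm_lsn;
	uint32_t replica_id;
	uint64_t term;
};

enum txn_fate {
	TXN_CONFIRMED,
	TXN_ROLLED_BACK,
};

/*
 * Decide what happens to a pending synchronous transaction of the
 * former leader: everything up to confirm_lsn is confirmed, as with
 * CONFIRM(confirm_lsn), and the rest is rolled back, as with
 * ROLLBACK(confirm_lsn + 1).
 */
enum txn_fate
promote_txn_fate(const struct promote_request *req, int64_t txn_lsn)
{
	assert(req != NULL);
	return txn_lsn <= req->confirm_lsn ? TXN_CONFIRMED : TXN_ROLLED_BACK;
}
```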

Part of #5445


xrow: enrich row's meta information with sync replication flags · 109502cc

Serge Petrenko authored 4 years ago

Introduce two new flags to xrow_header: `wait_ack` and `wait_sync`.
These flags are set for rows belonging to synchronous transactions in
addition to `is_commit`.

The new flags help to define whether the rows belong to a synchronous
transaction or not without parsing them all and checking whether any of
the rows touches a synchronous space.

This will be used in applier once it is taught to filter synchronous
transactions based on whether they are coming from a raft leader or not.

P.S. These flags will also be useful once we allow to turn any transaction
synchronous. Once this is done, the flags in row header will be the only
source of information on whether the transaction is synchronous or not.

Prerequisite #5445

@TarantoolBot document
Title: new values for IPROTO_FLAGS field

IPROTO_FLAGS bitfield is enriched with two new constants:
IPROTO_FLAG_WAIT_SYNC = 0x02
IPROTO_FLAG_WAIT_ACK = 0x04

IPROTO_FLAG_WAIT_SYNC is set for the last row of a transaction which
cannot be committed immediately: either because it is synchronous or
because it waits for other synchronous transactions to complete.
IPROTO_FLAG_WAIT_ACK is set for the last synchronous transaction row.
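
A minimal sketch of packing these flags into a bitfield and reading them back. The two new values are taken from the text above; IPROTO_FLAG_COMMIT = 0x01 is an assumption (its value is not stated here), and the `row_flags` struct with its encode/decode helpers is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum iproto_flag {
	IPROTO_FLAG_COMMIT = 0x01,	/* assumed value, not given above */
	IPROTO_FLAG_WAIT_SYNC = 0x02,
	IPROTO_FLAG_WAIT_ACK = 0x04,
};

/* Hypothetical decoded view of the flags of a single row. */
struct row_flags {
	bool is_commit;
	bool wait_sync;
	bool wait_ack;
};

/* Pack the three booleans into an IPROTO_FLAGS-style bitfield. */
uint8_t
row_flags_encode(const struct row_flags *f)
{
	assert(f != NULL);
	uint8_t bits = 0;
	if (f->is_commit)
		bits |= IPROTO_FLAG_COMMIT;
	if (f->wait_sync)
		bits |= IPROTO_FLAG_WAIT_SYNC;
	if (f->wait_ack)
		bits |= IPROTO_FLAG_WAIT_ACK;
	return bits;
}

/* Unpack a bitfield; unknown bits are ignored. */
struct row_flags
row_flags_decode(uint8_t bits)
{
	struct row_flags f = {
		.is_commit = (bits & IPROTO_FLAG_COMMIT) != 0,
		.wait_sync = (bits & IPROTO_FLAG_WAIT_SYNC) != 0,
		.wait_ack = (bits & IPROTO_FLAG_WAIT_ACK) != 0,
	};
	return f;
}
```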


wal: make wal_assign_lsn accept journal entry · 766c6754

Serge Petrenko authored 4 years ago

Refactor wal_assign_lsn() to accept a journal entry instead of a pair of
pointers to the first and last entry rows.

Journal entry will carry additional meta information for the last row
soon, which will be needed in wal_assign_lsn().

Prerequisite #5445


Apr 20, 2021

github-ci: trigger release packages creation · dbadc071

Alexander V. Tikhonov authored 4 years ago

To trigger release package creation on tagged commits, check whether
the GITHUB_REF environment variable starts with the 'refs/tags/' prefix.
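
The prefix check itself is simple; a sketch in C is shown here only to pin down the logic, since the actual workflow presumably does this in its CI configuration. `ref_is_tag` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if `ref` (e.g. the value of GITHUB_REF) names a tag. */
bool
ref_is_tag(const char *ref)
{
	static const char prefix[] = "refs/tags/";
	/* sizeof(prefix) - 1 excludes the terminating NUL byte. */
	return ref != NULL && strncmp(ref, prefix, sizeof(prefix) - 1) == 0;
}
```

A caller could pass `getenv("GITHUB_REF")` directly, since a NULL (unset) variable is treated as "not a tag".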


Apr 19, 2021

tools: implement toolchain for crash artefacts · e733f08d

Igor Munkin authored 4 years ago


This patch introduces two scripts to ease crash artefacts collecting and
loading for postmortem analysis:

* tarabrt.sh - the tool collecting a tarball with the crash artefacts
  the right way: the coredump with the binary, all loaded shared libs,
  Tarantool version (this is a separate exercise to get it from the
  binary built with -O2). Besides, the tarball has a unified layout, so
  it can be easily processed with the second script:
  - /coredump - core dump file on the root level
  - /binary - tarantool executable on the root level
  - /version - plain text file on the root level with
    `tarantool --version` output
  - /checklist - plain text file on the root level with the list of the
    collected entities
  - /etc/os-release - the plain text file containing operating system
    identification data
  - all shared libraries used by the crashed instance - their layout
    respects the one on the host machine, so they can be easily loaded
    with the following gdb command: set sysroot $(realpath .)

  The script can be easily used either manually or via
  kernel.core_pattern variable.

* gdb.sh - the auxiliary script originally written by @Totktonada, but
  needed to be adjusted to the crash artefacts layout every time. Since
  there is a unified layout, the original script is enhanced a bit to
  automatically load the coredump via gdb the right way.

Closes #5569

Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
Reviewed-by: Sergey Bronnikov <sergeyb@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>


luajit: bump new version · be10be8f

Igor Munkin authored 4 years ago

LuaJIT submodule is bumped to introduce the following change:
* test: fix directory detection in lua-Harness suite

This changeset fixes a hidden bug in the lua-Harness test suite that was
revealed by applying 9610c741 ('hotfix: update libcurl submodule to
7.76.0'). This was the 314th patch in the 2.8.0 release, so RPM-related
tools confuse the testing machinery by using the patch revision in the
build tree names[1].

[1]: https://github.com/tarantool/tarantool/runs/2361010932#step:5:10032

Follows up #5844


Apr 17, 2021

Add test and changelog for issue gh-5890 · d2bd4510

Mergen Imeev authored 4 years ago

Add changelog for select uuid in SQL · 3a637126

Mergen Imeev authored 4 years ago