Commits · 604eb73723a2f1edb54786f7d180b30fcb478f84 · core / tarantool

Jul 13, 2020

test: fix flaky qsync_basic.test.lua · 604eb737

Vladislav Shpilevoy authored 4 years ago

In one of the test cases two fibers were started, each making a
transaction. In the first fiber the transaction was rolled back,
and the second fiber was expected to do the same.

It did roll back too, but not always immediately after the first
one. The first fiber could not just do the rollback right away: it
had to write a ROLLBACK entry into the WAL before applying the
rollback to all subsequent transactions. This led to a yield, during
which it was possible to observe the second fiber not dead yet.

The patch makes the test explicitly wait for the fibers' death.
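
The idea of waiting for completion instead of checking state right away can be sketched as follows. This is a minimal illustration, not the test's code: the real test is written in Lua, Python threads stand in for fibers, and `wait_cond` and `worker` are hypothetical names introduced here.

```python
import threading
import time


def wait_cond(predicate, timeout=5.0, delay=0.01):
    """Poll predicate until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(delay)
    return predicate()


def worker():
    # Stand-in for a fiber whose rollback involves a yield (e.g. a WAL write).
    time.sleep(0.05)


t = threading.Thread(target=worker)
t.start()
# Checking t.is_alive() at this point could still observe a live worker,
# so the check is replaced with an explicit wait.
finished = wait_cond(lambda: not t.is_alive())
```

An immediate liveness check races with the yield, while the polling wait only depends on the worker eventually finishing.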

Closes #5162

604eb737

Jul 12, 2020

tx: introduce txn_stmt_destroy · 9438d074

Aleksandr Lyapunov authored 4 years ago

9438d074

Jul 11, 2020

qsync: txn_limbo_wait_complete -- use txn_limbo_abort · 8b85cc0b

Cyrill Gorcunov authored 4 years ago

Instead of open coding.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

8b85cc0b

qsync: txn_limbo_read_rollback -- use txn_limbo_abort · b58c0a36

Cyrill Gorcunov authored 4 years ago


Basically this is the same as what txn_limbo_abort does.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

b58c0a36

qsync: txn_limbo_pop -- drop fake reference · b35166e6

Cyrill Gorcunov authored 4 years ago


The limbo variable is accessed unconditionally,
thus there is no need for a fake reference.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

b35166e6

qsync: txn_limbo_assign_local_lsn -- drop redundant declaration · e3d65d95

Cyrill Gorcunov authored 4 years ago


We use the limbo variable when accounting acks, so there is no need
for a formal read here.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

e3d65d95

qsync: add a comment about sync txn in journal allocation · 390916e3

Cyrill Gorcunov authored 4 years ago


Otherwise it is not clear why we should set up a flag here.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

390916e3

Jul 10, 2020

test: update suites 'fragile' lists · daae9586

Alexander V. Tikhonov authored 4 years ago

Synchronized the suites' 'fragile' lists with the actual list of flaky tests.

daae9586

sql: use mem_mp_type() in sql_value_type() · 52c3af7e

Nikita Pettik authored 4 years ago

sql_value_type() and mem_mp_type() do the same thing: return the MessagePack
type corresponding to the value stored in a memory cell. However,
sql_value_type() operates on an opaque API wrapper, sql_value*. To avoid
duplicating code, let's invoke mem_mp_type() in sql_value_type().
At the same time, let's take into account that mp_type can now be not only
_BIN, but also _ARRAY and _MAP; this fact will be used when we introduce
arrays and maps in SQL.

52c3af7e

sql: introduce mem_mp_type() function · ac307cfb

Nikita Pettik authored 4 years ago

It takes a memory cell object and returns the MessagePack type
corresponding to its value (i.e. it maps MEM_* types to MP_* types). It's an
internal analogue of sql_value_type(). In other words, it operates
directly on struct Mem *.

ac307cfb

replication: test sync txn local rollback is reversed · ff3123a6

Vladislav Shpilevoy authored 4 years ago

Transactions are always rolled back in reverse order. For some
reason the limbo removed rolled back transactions from the beginning,
not from the end. The test ensures it is not so.

Closes #5147

ff3123a6

replication: add test for encoding CONFIRM/ROLLBACK on txn region · db1742aa

Vladislav Shpilevoy authored 4 years ago

In the original issue there were 2 bugs: one memory leak and one
memory corruption.

The leak was in txn_limbo_write_confirm_rollback(). This function
used fiber->gc region to encode CONFIRM/ROLLBACK, but never freed
it.

The corruption was in applier.cc in process_confirm_rollback().
CONFIRM/ROLLBACK were stored on the applier's ibuf. As a result,
if the applier experienced a relatively intensive load, the ibuf
could be quickly recycled right during a WAL write of the
CONFIRM/ROLLBACK stored on it.

DML requests would have the same problem, but they were copied in
txn_add_redo() inside of xrow_encode_dml() call.

The test checks whether CONFIRM/ROLLBACK are also copied.

Closes #5138

db1742aa

txn_limbo: introduce dynamic synchro config · 29766ce7

Vladislav Shpilevoy authored 4 years ago

Synchronous replication options, replication_synchro_quorum and
replication_synchro_timeout, were not updated for the existing
transactions on change. As a result, there could be weird
inconsistencies, when a new transaction could require a smaller
quorum than a previous transaction and could implicitly
confirm it. The same could be said about rollback on timeout: new
transactions could wake up earlier than older transactions.

This patch makes configuration dynamic. So if the mentioned
options are updated, they are applied to the existing transactions
too.

This opens up broad administrative capabilities. For example, when
the replica count becomes less than the quorum, an administrator can
lower the quorum dynamically, and it will be applied to all the
existing transactions.

Closes #5119

29766ce7

box.ctl: introduce clear_synchro_queue function · 9509a036

Serge Petrenko authored 4 years ago

Introduce a new function to the box.ctl API: box.ctl.clear_synchro_queue().
The function makes sure that after it is executed, the txn_limbo is free
of any transactions issued on a remote instance.
To achieve this, the instance first waits for 2
replication_synchro_timeouts so that confirmations and rollbacks from
the remote instance reach it.

If the limbo remains non-empty, the instance starts figuring out which
transactions should be confirmed and which should be rolled back. To
do so, the instance scans through the vclocks of all the instances
that replicate from it and determines which of the old leader's lsns is
the last one reached by replication_synchro_quorum replicas.

Then the instance writes appropriate CONFIRM and ROLLBACK entries.
After these actions the limbo must be empty.
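
The quorum computation described above can be sketched roughly as follows. This is an illustrative model under assumptions, not the code from the patch: `quorum_lsn` is a hypothetical helper, and its input is assumed to be the old leader's vclock component as seen by each instance that counts towards the quorum.

```python
def quorum_lsn(replica_lsns, quorum):
    """Return the highest LSN reached by at least `quorum` instances.

    Returns None when fewer than `quorum` LSNs are known.
    """
    if quorum < 1 or len(replica_lsns) < quorum:
        return None
    # After sorting in descending order, the element at index quorum - 1
    # is an LSN that `quorum` instances have reached or exceeded.
    return sorted(replica_lsns, reverse=True)[quorum - 1]


# Old leader's LSN as seen by four instances, with a quorum of 3.
confirm_up_to = quorum_lsn([10, 7, 12, 9], 3)
```

In this model, entries up to the returned LSN would get a CONFIRM and later ones a ROLLBACK.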

Closes #4849

9509a036

util: move cmp_i64 from xlog.c to util.h · c25d4cb9

Serge Petrenko authored 4 years ago

The comparator will be needed in other files too, e.g. box.cc

Prerequisite #4849

c25d4cb9

test: add test on local transactions wait synchronous · 4e0552a1

Vladislav Shpilevoy authored 4 years ago

Fully local transactions are expected to be blocked if there is
an unfinished synchronous transaction.

There is also a special case for when a transaction is not local,
but has a local row at the end, related to #4928.

4e0552a1

replication: add tests for sync replication with snapshots · 90e7a270

Sergey Bronnikov authored 4 years ago

Part of #5055

90e7a270

replication: add tests for sync replication with anon replica · 39cc2935

Sergey Bronnikov authored 4 years ago

Part of #5055

39cc2935

replication: add advanced tests for sync replication · 0fdd675a

Sergey Bronnikov authored 4 years ago

Part of #5055

0fdd675a

replication: add test for quorum 1 · 158c7404

Vladislav Shpilevoy authored 4 years ago

When the synchro quorum is 1, the final commit and the confirmation write
are done by the fiber that created the transaction, right after the WAL
write. This case got special handling in the previous patches,
and this commit adds a test for it.

Closes #5123

158c7404

replication: add test for async transactions block when not empty limbo · 7ea50cd9

Vladislav Shpilevoy authored 4 years ago

Follow-up #4845

7ea50cd9

replication: only send confirmed data during final join · 920efcb4

Serge Petrenko authored 4 years ago

The final join (or register) stage is needed to deliver the replica's
_cluster registration to it. Since this stage is followed by a snapshot
on the replica, the data received during this stage must be confirmed.

Make master check that there are no rollbacks for the data to be sent
during final join and that all the data is confirmed before final join
starts.

Closes #5097

920efcb4

replication: delay initial join until confirmation · 41e979f0

Serge Petrenko authored 4 years ago

All the data that master sends during the join stage (both initial and
final) is embedded into the first snapshot created on replica, so this
data mustn't contain any unconfirmed or rolled back synchronous
transactions.

Make sure that master starts sending the initial data, which contains a
snapshot-like dump of all the spaces, only after the latest synchronous
tx it has is confirmed. In case of rollback, the replica may retry
joining.

Part of #5097

41e979f0

txn_limbo: add diag_set in txn_limbo_wait_confirm · 9c88b6cd

Serge Petrenko authored 4 years ago

Add failure reason to txn_limbo_wait_confirm

Prerequisite #5097

9c88b6cd

applier: use WAL write event instead of commit for ACK · 7836fb49

Vladislav Shpilevoy authored 4 years ago

Applier used to send ACKs to master when commit happens. But for
sync transactions this is not enough - their ACK should be sent
after WAL write. Master doesn't really care whether a commit
will happen after WAL write on the replica. The only thing which
matters is whether the replica managed to persist the sync
transaction.

Now applier uses WAL write event instead of commit to send ACKs.
Nothing changed for async transactions (for them WAL write ==
commit). But sync transactions now send ACKs immediately, without
waiting for heartbeat timeout.

Closes #5100
Closes #5127

7836fb49

applier: don't miss WAL writes happened during ACK send · 6154d053

Vladislav Shpilevoy authored 4 years ago

Applier has a writer fiber sending vclock of the instance to the
master after each WAL write or when heartbeat timeout passes.

However, it missed WAL writes that happened *during* the sending of an
ACK for a previous WAL write. That made the applier sleep for the
heartbeat timeout even though it still had data to send. It is not a
problem for async replication, but becomes a bug when sync transactions
appear. For them an ACK should be sent as soon as possible.
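
One way to avoid losing a notification that arrives while an ACK is being sent is a "pending" flag checked after the send. The sketch below is a simplified single-threaded model and not the applier's code: `AckWriter`, `on_wal_write`, and `step` are hypothetical names introduced for illustration.

```python
class AckWriter:
    """Toy model of a writer that must not miss WAL writes during a send."""

    def __init__(self, send):
        self.send = send
        self.has_pending = False

    def on_wal_write(self):
        # Called on every WAL write, possibly while send() is in progress.
        self.has_pending = True

    def step(self):
        """Send one ACK; return True if the writer may sleep afterwards."""
        self.has_pending = False
        self.send()
        # A WAL write during send() set the flag again: do not sleep.
        return not self.has_pending


sent = []


def send():
    sent.append("ack")
    if len(sent) == 1:
        writer.on_wal_write()  # a WAL write arrives in the middle of the send


writer = AckWriter(send)
may_sleep_first = writer.step()
may_sleep_second = writer.step()
```

The first step reports that the writer must not sleep, so the second ACK goes out without waiting for the heartbeat timeout.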

Part of #5100

6154d053

txn: introduce on_wal_write trigger · cd033be4

Vladislav Shpilevoy authored 4 years ago

With synchronous replication a sync transaction passes 2 stages:
WAL write + commit. These are separate events, in contrast to
async transactions, where WAL write == commit.

The WAL write event is needed on non-leader nodes to be able to
send an ACK to the master.

Part of #5100

cd033be4

replication: add test for synchro CONFIRM/ROLLBACK · 0f7a722c

Serge Petrenko authored 4 years ago

Follow-up #4847
Follow-up #4848

0f7a722c

replication: support ROLLBACK and CONFIRM during recovery · b88a1a27

Serge Petrenko authored 4 years ago

Follow-up #4847
Follow-up #4848
Closes #4851

b88a1a27

box: rework local_recovery to use async txn_commit · a464ec38

Serge Petrenko authored 4 years ago

Local recovery should use asynchronous txn commit procedure in order to
get to CONFIRM and ROLLBACK statements for a transaction that needs
confirmation before confirmation timeout happens.
Using async txn commit doesn't harm other transactions, since the
journal used during local recovery fakes writes and its write_async()
method may reuse plain write().

Follow-up #4847
Follow-up #4848

a464ec38

txn_limbo: add ROLLBACK processing · 97bcfc6f

Serge Petrenko authored 4 years ago

Now txn_limbo writes a ROLLBACK entry to WAL when one of the limbo
entries fails to gather quorum during a txn_limbo_confirm_timeout.
All the limbo entries, starting with the failed one, are rolled back in
reverse order.

Closes #4848

97bcfc6f

txn_limbo: add timeout when waiting for acks. · 40a4f702

Serge Petrenko authored 4 years ago

Now txn_limbo_wait_complete() waits for acks only for txn_limbo_confirm_timeout
seconds. If a timeout is reached, the entry and all the ones following
it must be rolled back.

Part-of #4848

40a4f702

replication: add support of qsync to the snapshot machinery · 6492da52

Leonid Vasiliev authored 4 years ago

To support qsync replication, the snapshot machinery now waits,
up to a timeout, for confirmation of the current "sync"
transactions. In the case of a rollback or timeout
expiration, the snapshot is cancelled.

Closes #4850

6492da52

replication: write and read CONFIRM entries · 59f3edfb

Serge Petrenko authored 4 years ago

Make txn_limbo write a CONFIRM entry as soon as a batch of entries
receives its acks. The CONFIRM entry is written to WAL and later
replicated to all the replicas.

Now replicas put synchronous transactions into txn_limbo and wait for
corresponding confirmation entries to arrive and end up in their WAL
before committing the transactions.

Closes #4847

59f3edfb

txn: introduce various reasons for txn rollback · 6e1c848e

Serge Petrenko authored 4 years ago

Transaction on_rollback triggers will need to distinguish
txn_limbo-issued rollbacks from rollbacks that happened due to a failed
WAL write or memory error.

Prerequisite #4847
Prerequisite #4848

6e1c848e

xrow: introduce CONFIRM and ROLLBACK entries · 76be5119

Serge Petrenko authored 4 years ago

Add methods to encode/decode CONFIRM entry.
A CONFIRM entry will be written to WAL by synchronous replication master
as soon as it finds that the transaction was applied on a quorum of
replicas.
CONFIRM rows share the same header with other rows in WAL, but their body
differs: it's just a map containing replica_id and lsn of the last
confirmed transaction.

A ROLLBACK request contains the same data as a CONFIRM request.
The only difference is the request semantics. While a CONFIRM request
releases all the limbo entries up to the given lsn, a ROLLBACK request
rolls back all the entries with an lsn greater than the given one.
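
The difference in semantics can be illustrated with a toy queue. This is a sketch under assumptions, not the txn_limbo implementation: the `Limbo` class and its methods are hypothetical, entries are reduced to bare lsns, and "up to the given lsn" is assumed to be inclusive.

```python
from collections import deque


class Limbo:
    """Toy queue of pending synchronous transactions, ordered by lsn."""

    def __init__(self):
        self.queue = deque()

    def append(self, lsn):
        self.queue.append(lsn)

    def confirm(self, lsn):
        # CONFIRM releases entries from the head, up to the given lsn.
        done = []
        while self.queue and self.queue[0] <= lsn:
            done.append(self.queue.popleft())
        return done

    def rollback(self, lsn):
        # ROLLBACK removes entries with a greater lsn, newest first.
        undone = []
        while self.queue and self.queue[-1] > lsn:
            undone.append(self.queue.pop())
        return undone


limbo = Limbo()
for lsn in (1, 2, 3, 4, 5):
    limbo.append(lsn)
confirmed = limbo.confirm(2)
rolled_back = limbo.rollback(3)
```

Here `confirm(2)` releases entries 1 and 2 from the head, and `rollback(3)` drops entries 5 and 4 from the tail, leaving entry 3 in the queue.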

Part-of #4847
Part-of #4848

76be5119

replication: make sync transactions wait quorum · c927fce9

Vladislav Shpilevoy authored 4 years ago

A synchronous transaction (one that changes anything in a synchronous
space) waits before commit until it is replicated onto a quorum
of replicas.

When there is an uncommitted synchronous transaction, any attempt
to commit a subsequent transaction is suspended, even if it is an async
transaction.

This restriction comes from the theoretically possible dependency
of what is written in the async transactions on what was written
in the previous sync transactions.

So far all the 'synchronousness' is basically the same as the well
known 'wait_lsn' technique, with the exception that the
transaction really is not committed until replicated.

The problem of wait_lsn is still present though, in case the master
restarts, because there is no 'confirm' record in WAL telling
which transactions are replicated and can be applied.

Closes #4844
Closes #4845

c927fce9

Jul 09, 2020

replication: introduce replication_synchro_* cfg options · 4cc49f32

Vladislav Shpilevoy authored 4 years ago

Synchronous transactions are supposed to be replicated on a
specified number of replicas before being committed on the master. The
number of replicas can be specified using the
replication_synchro_quorum option. It is 1 by default, so sync
transactions work like asynchronous ones unless configured otherwise.
1 means a successful WAL write on the master is enough for commit.

When replication_synchro_quorum is greater than 1, an instance has to
wait for the specified number of replicas to reply with success. If
enough replies aren't collected during replication_synchro_timeout,
the instance rolls back the tx in question.
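
The decision rule can be modelled as below. This is a hypothetical sketch, not Tarantool code: `sync_txn_outcome` is an illustrative helper, acknowledgements are given as arrival times in seconds, and the master's own WAL write is assumed to count as the first acknowledgement, in line with a quorum of 1 meaning that the local write suffices.

```python
def sync_txn_outcome(ack_times, quorum, timeout):
    """Decide the fate of a synchronous transaction in a toy model.

    ack_times: seconds after the transaction start at which each instance
    (the master's own WAL write included) acknowledged it.
    """
    # Count only acknowledgements that arrived within the timeout.
    acks_in_time = sum(1 for t in ack_times if t <= timeout)
    return "commit" if acks_in_time >= quorum else "rollback"


# Quorum 1: the master's own WAL write is enough.
local_only = sync_txn_outcome([0.0], quorum=1, timeout=5.0)
# Quorum 3: the third ACK arrives after the timeout.
too_late = sync_txn_outcome([0.0, 0.2, 9.0], quorum=3, timeout=5.0)
```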

Part of #4844
Part of #5073

4cc49f32

replication: introduce space.is_sync option · c14563f5

Vladislav Shpilevoy authored 4 years ago

A synchronous space makes every transaction affecting its data
wait until it is replicated on a quorum of replicas before it is
committed.

Part of #4844
Part of #5073

c14563f5

vinyl: add NULL check of xrow_upsert_execute() retval · 35162fa4

Nikita Pettik authored 4 years ago

xrow_upsert_execute() can fail and return NULL for various reasons.
However, in vy_apply_upsert() the result of xrow_upsert_execute() is
used unconditionally, which may lead to a crash. Let's fix it: in case
xrow_upsert_execute() fails, return NULL from vy_apply_upsert().

Part of #4957

35162fa4