test · 7ec8267460a4042001d5232e229795c39cea31f8 · core / tarantool

replication: replicaset state machine assert fail

Nikita Zheleztsov authored 2 years ago

Currently, the replicaset state machine, which tracks the number of
connected, loading and synced appliers, may decrement these counters
when it should not. On a debug build this may lead to an assertion
failure. Here is how it may happen:
  1. Any kind of exception occurs in the applier thread and leads to
     invoking its destructor (applier_thread_data_destroy), which
     is set up with a scoped guard.
  2. A cbus call is made to remove the corresponding applier
     from the thread. Since cbus_call is synchronous, we yield,
     waiting for the result from the applier thread.
  3. During the yield the user triggers a reconfiguration, which
     invokes replicaset_update. Old appliers are pruned: for every
     replica, the trigger that updates the state machine counters is
     deleted, after which we stop the applier fiber and wait for it
     to be joined.
  4. If the first replica in replicaset_foreach is not the errored
     one and the errored fiber wakes up while we yield in
     fiber_join, then a counter that is already zero is decremented.

Let's clear the above-mentioned triggers for all replicas first
and only after that stop and join their applier fibers.
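
The difference between the two pruning orders can be sketched with a toy
model. This is only an illustration in Python, not Tarantool's code: every
class and function name below is hypothetical, a single counter stands in
for the connected/loading/synced counters, the yield inside the join is
modeled by letting every other applier run, and the counter is assumed to
be at zero already so that any stale trigger call is visible.

```python
# Toy model of the pruning order; all names are hypothetical, not Tarantool's API.


class Replicaset:
    """Holds one counter standing in for connected/loading/synced."""

    def __init__(self, connected):
        self.connected = connected

    def on_applier_off(self):
        # Stand-in for the debug assertion: never decrement a zero counter.
        if self.connected == 0:
            raise AssertionError("decrement of a zero counter")
        self.connected -= 1


class Applier:
    def __init__(self, name, errored=False):
        self.name = name
        self.errored = errored
        self.triggers = []  # callbacks fired on a state change
        self.stopped = False

    def wake_up(self):
        # An errored applier resumes and reports its state change once.
        if self.errored and not self.stopped:
            self.stopped = True
            for trigger in list(self.triggers):
                trigger()


def stop_and_join(applier, all_appliers):
    # Joining yields; in this model every other applier gets a chance to run.
    for other in all_appliers:
        if other is not applier:
            other.wake_up()
    applier.stopped = True


def prune_interleaved(appliers):
    # Problematic order: clear one trigger, then stop and join, per replica.
    for applier in appliers:
        applier.triggers.clear()
        stop_and_join(applier, appliers)


def prune_clear_first(appliers):
    # Proposed order: clear every trigger, only then stop and join.
    for applier in appliers:
        applier.triggers.clear()
    for applier in appliers:
        stop_and_join(applier, appliers)


def run(prune, errored_first=False):
    rs = Replicaset(connected=0)  # assumption: the counter is already zero
    appliers = [Applier("healthy"), Applier("errored", errored=True)]
    if errored_first:
        appliers.reverse()
    for applier in appliers:
        applier.triggers.append(rs.on_applier_off)
    try:
        prune(appliers)
    except AssertionError:
        return "assertion failed"
    return "ok"
```

In this model run(prune_interleaved) trips the assertion only when the
errored applier is not the first one pruned, while run(prune_clear_first)
completes regardless of the order, because no trigger is left to fire
during the join.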

Closes #7590

NO_DOC=bugfix


Name
app-luatest
app-tap
app
box-luatest
box-py
box-tap
box
engine-luatest
engine-tap
engine
engine_long
fuzz
long_run-py
replication-luatest
replication-py
replication
share
sql-luatest
sql-tap
sql
static/corpus
swim
unit
vinyl-luatest
vinyl
wal_off
xlog-py
xlog
.gitattributes
CMakeLists.txt
interactive_tarantool.lua
luajit-test-init.lua
small
test-run.py