Nikita Zheleztsov authored
Currently the replicaset state machine, which tracks the number of
connected, loading and synced appliers, may decrement these counters
when it should not. In a debug build this may lead to an assertion
failure. Here is how it may happen:
  1. An exception occurs in the applier thread and invokes its
     destructor (applier_thread_data_destroy), which is installed
     as a scoped guard;
  2. A cbus call is made to remove the corresponding applier from
     the thread. Since cbus_call is synchronous, we yield while
     waiting for the result from the applier thread;
  3. During the yield the user triggers a reconfiguration, which
     invokes replicaset_update. Old appliers are pruned: for every
     replica, the trigger that updates the state machine counters
     is deleted, after which we stop the applier fiber and wait
     for it to be joined;
  4. If the first replica in replicaset_foreach is not the errored
     one and the errored fiber wakes up during the yield in
     fiber_join, then a counter that is already zero is
     decremented.

Let's clear the above-mentioned triggers for all replicas first and
only then stop and join their applier fibers.
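
The ordering problem and the fix can be illustrated with a toy model.
This is only a sketch, not the actual Tarantool code: struct
toy_replica, toy_yield and both toy_prune_* functions are hypothetical
names, and a "yield" is simulated by letting every pending errored
fiber run. The model counts how many triggers fire while pruning is in
progress, since such a firing is what leads to the unwanted decrement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a replica and its applier fiber. */
struct toy_replica {
	/* The state-change trigger is still installed. */
	bool has_trigger;
	/* Errored applier fiber that runs at the next yield. */
	bool wakes_on_yield;
};

/*
 * Simulate one yield (e.g. inside a join): every pending errored
 * fiber runs once and fires its trigger if it is still installed.
 * Returns the number of triggers fired.
 */
static int
toy_yield(struct toy_replica *r, size_t n)
{
	int fired = 0;
	for (size_t i = 0; i < n; i++) {
		if (!r[i].wakes_on_yield)
			continue;
		r[i].wakes_on_yield = false;
		if (r[i].has_trigger)
			fired++;
	}
	return fired;
}

/* Old order: per replica, clear its trigger and join right away. */
int
toy_prune_interleaved(struct toy_replica *r, size_t n)
{
	assert(r != NULL || n == 0);
	int stale = 0;
	for (size_t i = 0; i < n; i++) {
		r[i].has_trigger = false;
		/* Joining replica i yields; other fibers may run. */
		stale += toy_yield(r, n);
	}
	return stale;
}

/* New order: clear all triggers first, join only afterwards. */
int
toy_prune_two_pass(struct toy_replica *r, size_t n)
{
	assert(r != NULL || n == 0);
	int stale = 0;
	for (size_t i = 0; i < n; i++)
		r[i].has_trigger = false;
	for (size_t i = 0; i < n; i++)
		stale += toy_yield(r, n);
	return stale;
}
```

With two replicas where only the second one has an errored fiber, the
interleaved order lets one trigger fire during the first join, while
the two-pass order lets none fire. If the errored replica comes first,
the interleaved order is also safe, which matches step 4 above.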

Closes #7590

NO_DOC=bugfix
7ec82674