  1. Oct 18, 2022
    • error: fix conflicting member names · b47b6c29
      George Vaintrub authored
      This patch fixes conflicting argument names in the error
      module.
      
      The arguments of the 'ROUTER_ALREADY_EXISTS' and
      'ROUTER_CFG_IS_IN_PROGRESS' errors have been renamed to
      'router_name', because the 'name' field in the error object
      is already in use.
      
      NO_DOC=undocumented behavior
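
      For illustration, a minimal sketch of how such a clash can
      arise when an error constructor merges its format arguments
      into the error table (the names here are illustrative, not
      vshard's actual internals):
      ```Lua
      local function make_error(err_def, ...)
          local err = {
              code = err_def.code,
              type = 'ShardingError',
              name = err_def.name, -- E.g. 'ROUTER_ALREADY_EXISTS'.
              message = string.format(err_def.msg, ...),
          }
          for i, arg_name in ipairs(err_def.args) do
              -- Had err_def.args contained 'name', this would
              -- silently overwrite the error's own 'name' field
              -- above. Hence the rename to 'router_name'.
              err[arg_name] = select(i, ...)
          end
          return err
      end
      ```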
  2. Aug 24, 2022
    • ci: add linters · c508c822
      Nikita Zheleztsov authored
      This patch introduces running luacheck and checkpatch in CI.
      
      It uses the latest version of luacheck available on luarocks.
      A warning has to be ignored in the 'version_test' file, since
      this luacheck version insists that all comparisons be written
      in positive form (without a 'not' boolean expression).
      Moreover, recent changes introduced mutation of the 'ivshard'
      global variable, which should also be ignored.
      
      Checkpatch should ignore NO_CHANGELOG and run only on PRs, so
      that CI does not go red when pushing directly to the master
      branch.
      
      Closes #369
      
      NO_DOC=ci
      NO_TEST=ci
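
      For reference, a sketch of what such a luacheck config could
      look like. The file path is a guess, and 581 is luacheck's
      "negation of a relational operator" warning; the actual
      .luacheckrc from the patch may differ:
      ```Lua
      -- .luacheckrc is itself a Lua file.
      std = 'luajit'
      
      -- The tests intentionally mutate this global on the instances.
      globals = {'ivshard'}
      
      files['test/instances/version_test.lua'] = {
          -- Keep the existing comparison style instead of rewriting
          -- all negated comparisons in "positive" form.
          ignore = {'581'},
      }
      ```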
    • ci: replace debug tarantool's master with release · e673b723
      Nikita Zheleztsov authored
      All integration tests in the main tarantool repository run on
      the RelWithDebInfo version. However, vshard's CI uses Debug,
      which increases build and test time and is inconsistent with
      all the other checks, which use release versions.
      
      Moreover, to disable a test in vshard's CI it has to be
      disabled on both the release and debug versions, but all
      currently disabled tests are disabled only on release
      versions.
      
      Let's test vshard's patches using a release build of
      tarantool's master.
      
      Part of #369
      
      NO_DOC=ci
      NO_TEST=ci
    • ci: drop testing on tarantool 2.9 · 83d9c2af
      Nikita Zheleztsov authored
      Currently tests run on 2.9. However, this version of tarantool
      was never released: it is the name of the intermediate
      versions between 2.8.1 and 2.10.0.
      
      Let's drop the 2.9 version from the testing matrix.
      
      Part of #369
      
      NO_DOC=ci
      NO_TEST=ci
  3. Aug 22, 2022
    • test: fix flaky router/router · bcf87e27
      Nikita Zheleztsov authored
      A part of the router/router test inserts a tuple into a space
      on the master, invokes 'vshard.storage.sync' on the same
      master and executes 'vshard.router.callro', which routes the
      request to a replica of the corresponding replicaset.
      
      The problem was that sometimes this replica didn't have the
      needed tuple and returned null, which wasn't the value the
      test expected. It was caused by the fact that sometimes
      replication didn't have time to happen, and
      'vshard.storage.sync' didn't check the vclocks of all replicas
      in the replicaset.
      
      The purpose of 'vshard.storage.sync()' is to wait until the
      dataset is successfully synchronized on all replicas. The
      function relies on the downstreams from 'box.info.replication'.
      These fields are not 'nil' only if the instance knows about
      the other instances which follow it. In the previous
      implementation of 'wait_lsn()', which is used in 'sync()', the
      function returned true even if there were no downstreams at
      all, i.e. it used to say that the data was fully synchronized
      without checking the vclocks of all replicas.
      
      The fix is to make 'wait_lsn()' return true only if there are
      downstreams to all replicas and the vclock of the current
      replica is less than or equal to each replica's vclock.
      
      Closes #366
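
      A condensed sketch of the fixed check, simplified relative to
      the real test helper; it only shows the idea:
      ```Lua
      -- Return true only when every other replica is visible as a
      -- downstream and has replicated everything this instance has.
      local function is_synced_with_replicas()
          local self_vclock = box.info.vclock
          for _, replica in pairs(box.info.replication) do
              if replica.id ~= box.info.id then
                  local down = replica.downstream
                  if down == nil or down.vclock == nil then
                      -- The old code effectively treated a missing
                      -- downstream as already synced. That was the
                      -- bug.
                      return false
                  end
                  for id, lsn in pairs(self_vclock) do
                      -- Component 0 is local-only, not replicated.
                      if id ~= 0 and (down.vclock[id] or 0) < lsn then
                          return false
                      end
                  end
              end
          end
          return true
      end
      ```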
    • router: prohibit simultaneous configuration · b6b95d0e
      Nikita Zheleztsov authored
      Currently a router's configuration can be executed
      concurrently from several fibers, which can lead to
      non-obvious errors.
      
      Let's prohibit such behavior and throw an error on an attempt
      to run `vshard.router.cfg()` or `router:cfg()` while another
      configuration of the corresponding router is still in
      progress.
      
      Closes #140
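
      The guard itself is a small flag around the configuration; a
      minimal sketch of the pattern (the field and function names
      are illustrative):
      ```Lua
      local M = {is_cfg_in_progress = false}
      
      local function cfg_apply(cfg)
          -- The actual configuration work would go here.
      end
      
      local function router_cfg(cfg)
          if M.is_cfg_in_progress then
              error('Router configuration is already in progress')
          end
          M.is_cfg_in_progress = true
          -- pcall guarantees the flag is released even if cfg fails.
          local ok, err = pcall(cfg_apply, cfg)
          M.is_cfg_in_progress = false
          if not ok then
              error(err)
          end
      end
      ```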
    • storage: prohibit simultaneous configuration · 92694575
      Nikita Zheleztsov authored
      Currently `vshard.storage.cfg()` can be executed concurrently
      from several fibers, which can lead to non-obvious errors.
      
      Let's prohibit such behavior and throw an error on an attempt
      to run `vshard.storage.cfg()` while another configuration is
      still in progress.
      
      Part of #140
    • test: fix dropping the server in router_test.lua · a25c50b7
      Nikita Zheleztsov authored
      Currently `router-luatest/router_test.lua` doesn't allow
      creating servers named `router_1`, as such a server wasn't
      deleted correctly.
      
      A server should not only be dropped but also deleted from the
      cluster's table. The patch introduces a function for deleting
      an instance properly and fixes
      `router-luatest/router_test.lua`.
  4. Aug 15, 2022
    • replicaset: reconnect after fiber kill · 2500190d
      Nikita Zheleztsov authored
      Currently, if the worker fiber of a connection initialized
      with the 'reconnect_after' option is killed, the connection
      goes into the 'error_reconnect' or 'error' state (depending on
      the tarantool version). Reconnecting doesn't happen in either
      case, and the only way for the user to return the router to
      working order is a reload or manual restoration of the
      connections.
      
      This patch introduces reconnecting in that case. It should be
      used wisely, though. Killing a fiber doesn't happen instantly,
      and if the user doesn't wait until the fiber's status is
      'dead' and makes a request immediately, an exception will
      probably be thrown, as the fiber can die in the middle of the
      request.
      
      Closes #341
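
      The idea can be sketched like this (not vshard's exact code;
      the replica table shape is assumed):
      ```Lua
      local netbox = require('net.box')
      
      local function ensure_connected(replica)
          local c = replica.conn
          if c ~= nil and c.state ~= 'error'
             and c.state ~= 'error_reconnect' and c.state ~= 'closed' then
              return c
          end
          -- The built-in auto-reconnect will not revive these states
          -- after the worker fiber was killed, so re-create the
          -- connection explicitly.
          replica.conn = netbox.connect(replica.uri, {
              reconnect_after = 0.5, wait_connected = false,
          })
          return replica.conn
      end
      ```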
    • test: fix hang in storage/recovery.test · 8b5dd90c
      Nikita Zheleztsov authored
      The problem is that the recovery fiber wakes up earlier than
      we want it to, which leads to test output we don't expect.
      
      Let's block the recovery fiber before making any changes to
      `_bucket`. It will start again as soon as the instance is
      restarted.
      
      Needed for #341
    • test: decrease wait_timeout · 5f2fc91b
      Nikita Zheleztsov authored
      Currently `wait_timeout` is set to 120 seconds, which is also
      the default value for luatest. The problem is that any timeout
      loop, if it hangs, will run until the process is forcefully
      killed by luatest. In that case the result file is not created
      and the programmer can't even see what happened.
      
      Let's set `wait_timeout` to 50 seconds, which should be plenty
      for any test to complete.
  5. Aug 09, 2022
    • gc: wait replication before checking SENT buckets · afe764ca
      Vladislav Shpilevoy authored
      GC goes to replicas to check whether a SENT bucket can be
      deleted. But it can happen that this call via netbox is faster
      than the replication, and some replicas can still have the
      bucket SENDING.
      
      That triggered a 5-second GC backoff. It was not a problem for
      GC itself, but it was for map calls: the router's map_callrw()
      had to wait until GC woke up after the backoff and finally
      deleted the bucket.
      
      All that time the pending map_callrw() was not only just
      waiting, but also disrupting rebalancing, because the
      "bucket-move vs storage-ref" scheduler tries to fairly divide
      time between moves and refs.
      
      The patch makes bucket GC synchronize with replicas before
      checking their buckets. It is done per batch. Doing it just
      once at the beginning of the _bucket space iteration wouldn't
      be enough, because new SENT buckets not covered by that sync
      would keep appearing (while the rebalancing is still ongoing).
      
      Part of #173
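
      Schematically, the GC step now looks like this. The helper
      names (sync_with_replicas, delete_if_no_refs_anywhere) are
      assumptions, not the actual internals:
      ```Lua
      local function gc_sent_buckets(batch_size, timeout)
          local batch = {}
          for _, b in box.space._bucket.index.status:pairs('sent') do
              table.insert(batch, b.id)
              if #batch >= batch_size then
                  -- Per batch: new SENT buckets keep appearing while
                  -- the rebalancer works, so a single sync at the
                  -- start of the scan would not cover them.
                  local ok, err = sync_with_replicas(timeout)
                  if not ok then
                      return nil, err
                  end
                  delete_if_no_refs_anywhere(batch)
                  batch = {}
              end
          end
          if #batch > 0 then
              local ok, err = sync_with_replicas(timeout)
              if not ok then
                  return nil, err
              end
              delete_if_no_refs_anywhere(batch)
          end
          return true
      end
      ```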
    • gc: do not check local buckets via netbox · f22993d2
      Vladislav Shpilevoy authored
      Bucket GC makes a map request on all nodes in the replicaset to
      check which buckets are eligible for turning into GARBAGE.
      
      Going to its own local instance via netbox of course makes no
      sense. The buckets are checked locally both before and after
      the map request anyway.
      
      The patch optimizes GC so that it doesn't visit itself via
      netbox during the map call.
      
      Part of #173
    • gc: protect bucket refs on replicas from GC · 96588fa9
      Vladislav Shpilevoy authored
      There is a bug: the master instance's bucket GC fiber does not
      respect RO refs on replicas. The idea of the fix is to make
      the bucket GC procedure consult each replica before marking a
      bucket as GARBAGE.
      
      That allows not affecting the requests coming to replicas at
      all. They can keep looking just at the bucket status and don't
      need to consult the master instance before taking an RO ref.
      All the hard work is done in the background and only after
      some buckets were actually sent.
      
      GC of a sent bucket now works by this plan:
      - ACTIVE bucket becomes SENT;
      - SENT status is delivered to replicas via plain replication;
      - The replicas stop accepting new RO requests to the SENT bucket;
      - Master periodically asks replicas if they still have RO refs on
        the SENT bucket;
      - Eventually none of the replicas have RO refs on the bucket;
      - Master can safely mark the bucket as GARBAGE and delete it
        being sure that no existing nor new requests could access it
        now.
      
      However, this is not all. While the patch improves the
      protection, data inconsistency is still reachable, via
      replication and/or configuration problems. One scenario:
      - Node1 is a master, node2 is a replica, bucket1 is ACTIVE on
        both;
      - Node1 sends bucket1 to another replicaset;
      - The bucket becomes SENT on both nodes;
      - Node1 and node2 lose replication. But netbox still works;
      - Node1 and node2 drop each other from config and from
        replication;
      - Each of them becomes master and deletes the SENT bucket.
      - Node2 receives the bucket from another replicaset again. Now it
        is ACTIVE here and not present on node1 at all. On node2 it is
        serving an RW or an RO request right now.
      - Node1 and node2 again get their old config and restore the
        replication.
      - Node2 receives removal of the bucket from node1.
      
      In this scenario node2 shouldn't apply the bucket removal.
      Firstly, a bucket should always follow the path ACTIVE ->
      SENT -> GARBAGE. Secondly, the removal would make the
      currently running requests access inconsistent data.
      
      This should be fixed separately. Most likely by adding sanity
      checks to the _bucket:on_replace trigger to raise an error
      when there is a threat of data loss or corruption.
      
      Part of #173
    • test: fix flaky reroute_wrong_bucket · 743b607f
      Vladislav Shpilevoy authored
      The test relied on the bucket being deleted by the time a
      router call reaches the storage. But if GC is not fast enough,
      the bucket can still be in the SENT or GARBAGE state. The
      router still retries the call, but in the end returns a
      slightly different error.
      
      The patch makes the test ignore irrelevant error fields.
    • test: fix flaky upgrade/upgrade · e063d82b
      Vladislav Shpilevoy authored
      The replica (storage_1_b) sometimes didn't have time to receive
      the schema upgrade from the master (storage_1_a). The fix is to
      wait for it explicitly.
      
      Closes #338
    • test: fix flaky storage/recovery · 98065ae2
      Vladislav Shpilevoy authored
      In one place the test assumed that after recovery_wakeup() +
      yield the recovery is fully done. That is not so even now, and
      it gets worse after the following commits.
      
      The fix is to wait for the needed state of _bucket space instead
      of assuming that it is immediately reachable.
    • replicaset: introduce map_call() · 6e246250
      Vladislav Shpilevoy authored
      There is a bug: the master instance's bucket GC fiber does not
      respect RO refs on replicas. The idea of the fix is to make
      the bucket GC procedure consult each replica before marking a
      bucket as GARBAGE. For the motivation of this approach, see
      the main commit.
      
      Consulting each replica means map-reduce. The present commit
      introduces replicaset:map_call(). It is like
      router.map_callrw(), but works on all replicas of a single
      replicaset instead of the masters of all replicasets.
      
      It is going to be used in the main commit.
      
      Needed for #173
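
      A hypothetical usage shape, assuming a signature analogous to
      router.map_callrw() (the exact signature is not quoted in the
      message):
      ```Lua
      -- Ask every replica of one replicaset whether the given
      -- buckets may be garbage collected.
      local function buckets_gc_allowed(replicaset, bids, timeout)
          local res, err, err_uuid = replicaset:map_call(
              'vshard.storage._call', {'bucket_test_gc', bids},
              {timeout = timeout})
          if res == nil then
              return nil, err, err_uuid
          end
          -- res is assumed to map replica uuid -> replica's answer.
          return res
      end
      ```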
    • vtest: rename most of storage_ funcs to cluster_ · a1726b67
      Vladislav Shpilevoy authored
      Functions like storage_new(), storage_cfg(),
      storage_for_each() and others operated on all storage nodes in
      the cluster. The naming was fine while individual storages
      didn't have any methods except storage_first_bucket(). But
      that is going to change in the next commits, and then the
      naming would be confusing: when one used vtest.storage_start(),
      would it start the entire cluster or a single storage node?
      
      The patch renames most of the storage_ functions, those which
      always operated on the entire cluster, to have the cluster_
      prefix. It wasn't done in the beginning because routers might
      seem like a part of the cluster. But the new assumption of
      'cluster' meaning only storages looks less confusing than
      'storage' meaning all the data nodes.
      
      Needed for #173
    • gc: introduce service_call.bucket_test_gc() · 1d566d12
      Vladislav Shpilevoy authored
      There is a bug: the master instance's bucket GC fiber does not
      respect RO refs on replicas. The idea of the fix is to make
      the bucket GC procedure consult replicas before marking a
      bucket as GARBAGE. For the motivation of this approach, see
      the main commit.
      
      The present commit introduces a helper - the service call
      bucket_test_gc(). The function takes bucket IDs and returns
      which of them are not ok to be GCed.
      
      The function returns "not ok" bids instead of "ok" ones,
      because most of the tested buckets are supposed to be
      collectible, so sending "not ok" bids produces less network
      traffic. It is also easier to process in a map-reduce way: it
      is easier to mark all bids which are not ok on at least one
      replica than to count, for each bid, how many replicas are ok
      with it being removed. This is clearly visible in the main
      commit later.
      
      Needed for #173
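
      A simplified sketch of the service call's contract (the
      internal structures, like M.bucket_refs, are assumptions):
      ```Lua
      -- Take a list of bucket ids, return the ids which must NOT be
      -- collected because this instance still holds RO refs on them.
      local function bucket_test_gc(bids)
          local bids_not_ok = {}
          for _, bid in ipairs(bids) do
              local ref = M.bucket_refs[bid]
              if ref ~= nil and ref.ro ~= 0 then
                  table.insert(bids_not_ok, bid)
              end
          end
          -- "Not ok" is expected to be the short list: most tested
          -- buckets should be collectible.
          return {bids_not_ok = bids_not_ok}
      end
      ```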
    • test: fix flaky test on unstable mvcc · 9e9d7923
      Vladislav Shpilevoy authored
      In core Tarantool there is a bug in MVCC which sometimes makes
      space:count() ~= #space:select(). An existing test works fine
      now, but it breaks on that bug after one of the next commits.
      The bug is present in the already released Tarantool version
      2.10.0, which means the test must bypass it somehow and live
      with that for as long as vshard supports 2.10.0 at all.
      
      Needed for #173
  6. Jul 27, 2022
    • test: fix flaky router/router2.test.lua · ce4c0a00
      Nikita Zheleztsov authored
      A part of the router/router2 test changes the discovery mode
      between different states. During reconfiguration, which is
      invoked after every such change, the previous discovery fiber
      is killed and a new one is created.
      
      The problem is that we check the statuses of the fibers
      without any waiting, and sometimes the fiber has no time to
      get killed. This happens because a fiber's cancellation is
      performed asynchronously: it is not instant.
      
      Let's wait until the status of the last tested fiber is
      'dead'. By that point all the other fibers will already be
      dead anyway. Moreover, the last fiber, which is executed only
      once, can already be dead by this time. Let's check whether it
      is done or still working.
      
      Closes #358
  7. Jul 18, 2022
    • ci: add a debug tarantool version · d7dab3e3
      Nikita Zheleztsov authored
      This patch introduces CI testing on the latest git version of
      Tarantool, which is built as `Debug`.
      
      By default the master branch is used. However, any other
      branch can be used too: just change everything after `debug-`
      to another name.
      
      Part of #339
  8. Jul 11, 2022
    • test: limit max count of tuples in stress test · 96b6d279
      Vladislav Shpilevoy authored
      The rebalancer stress tests could potentially generate an
      infinite number of tuples. The tests should have limits on all
      the resources they use.
      
      The patch sets the limit to a large value, but at least it is
      not infinite.
      
      Follow up #309
    • test: fix flaky rebalancer/stress · 68d564e5
      Vladislav Shpilevoy authored
      The tests stress_add_remove_several_rs and stress_add_remove_rs
      try to change the cluster topology during rebalancing to check
      that the rebalancing eventually ends without errors.
      
      Moreover, while doing that they also generate some RW load via
      a router.
      
      The RW load was generated in a fiber which is canceled at the
      end of a test case; then the test ensures that all written
      data is available for reading.
      
      The problem is that the RW generator fiber had no fiber
      cancellation checks. It only called `vshard.router.call()` in
      a loop until it succeeded. The function doesn't raise
      exceptions, so if the fiber was canceled before the `call()`
      went through, the call was constantly failing.
      
      What is worse, the requests were failing but could still get
      to the network. They were appended to netbox's internal
      buffer, but the fiber couldn't wait for a response.
      
      Here is how these orphan requests affected the tests. The
      tests did the RW load at least 2 times. If a fiber was
      canceled incorrectly during the first load, then during the
      second load it interfered with the new requests by keeping
      retrying the canceled request. That could, for example, lead
      to a duplicate PK error in the `test` space. Then the
      rebalancer wouldn't be able to finish: duplicate PKs in 2
      buckets would prevent them from being stored on the same
      instance.
      
      Besides, these canceled fibers constantly spammed the logs
      about the failures.
      
      Closes #309
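
      The essence of the fix as a sketch (the function and space
      names are made up for the example):
      ```Lua
      local fiber = require('fiber')
      
      local function rw_load_loop(router, bucket_id, tuple)
          while true do
              -- Without this check a canceled fiber keeps looping
              -- and keeps pushing doomed requests into the netbox
              -- buffer.
              fiber.testcancel()
              local ok = router:call(bucket_id, 'write',
                                     'insert_tuple', {tuple})
              if ok then
                  return
              end
              fiber.sleep(0.01)
          end
      end
      ```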
  9. Jul 08, 2022
    • router: auto and manual enable/disable · dd70cfb2
      Nikita Zheleztsov authored
      This patch introduces protection of the router's API while its
      configuration is not finished, as accessing these functions at
      that time is not safe and can cause low-level errors like 'bad
      arguments' or 'no such function'.
      
      Now all non-trivial vshard.router functions are disabled until
      `vshard.router.cfg` (or `vshard.router.new`) is finished, and
      an error is raised on an attempt to access them.
      
      Manual enabling/disabling of the API is also introduced in
      this patch.
      
      Closes #194
      Closes #291
      
      @TarantoolBot document
      Title: vshard.router.enable/disable()
      `vshard.router.disable()` makes most of the `vshard.router`
      functions throw an error, as a Lua exception, not via the
      `nil, err` pattern.
      
      `vshard.router.enable()` reverts the disable.
      
      `router_object:enable()/disable()`, where `router_object` is
      the return value of `vshard.router.new()`, can be used to
      configure API access manually for a specific non-static
      router.
      
      By default the router is enabled.
      
      Additionally, the router is forcefully disabled automatically
      until its configuration is finished and the instance has
      finished recovery (its `box.info.status` is `'running'`, for
      example).
      
      Auto-disable protects from usage of vshard functions before the
      router's global state is fully created.
      
      Manual router disabling helps to achieve the same for the
      user's application. For instance, a user might want to do some
      preparatory work after `vshard.router.cfg` before the
      application is ready.
      Then the flow would be:
      ```Lua
      vshard.router.disable()
      vshard.router.cfg(...)
      -- Do your preparatory work here ...
      vshard.router.enable()
      ```
      
      The behavior of the router's API enabling/disabling is similar to the
      storage's one.
  10. Jul 07, 2022
    • test: port bucket GC test to luatest · 20263e05
      Vladislav Shpilevoy authored
      Garbage collection is going to be significantly reworked in
      order to prevent the deletion of SENT buckets which can still
      have RO refs on replicas.
      
      It means more GC tests will be needed, and the old ones will
      need to change. The patch prepares the existing tests for that
      by porting them to luatest.
      
      Part of #173
    • storage: unify index:min() behaviour · b07d24cb
      Vladislav Shpilevoy authored
      At < 2.1.0 index:min(key) worked as
      
          index:select(key, {limit = 1, iterator='GE'})
      
      At >= 2.1.0 it started working as:
      
          index:select(key, {limit = 1})
      
      The new behaviour is good and correct, but vshard still needs
      to work on 1.10 too. The patch introduces
      vshard.util.index_min(), which behaves like index:min() at
      >= 2.1.0, and an index_has() helper to check whether a key
      exists in an index, which is usually the purpose of :min().
      
      Follow up tarantool/tarantool#3167
      Needed for #173
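
      A sketch matching the described semantics (the real helpers
      live in vshard.util):
      ```Lua
      -- Behave like index:min(key) at >= 2.1.0 regardless of the
      -- actual version: the first tuple exactly matching the key.
      local function index_min(index, key)
          return index:select(key, {limit = 1, iterator = 'EQ'})[1]
      end
      
      -- Key existence is the usual reason to call :min().
      local function index_has(index, key)
          return index_min(index, key) ~= nil
      end
      ```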
    • recovery: turn SENDING into SENT, not GARBAGE · e35e38fb
      Vladislav Shpilevoy authored
      Previously the SENT and GARBAGE statuses were in practice the
      same. Both could be deleted as soon as they had no RO refs
      (except that it didn't work on replicas).
      
      But soon it is going to change. GARBAGE will be assigned to a
      bucket only if it has no refs at all in the entire replicaset
      and won't be able to get new ones.
      
      SENT will mean the bucket still can have at least RO refs and
      needs validation whether it has them on any replica in its
      replicaset.
      
      The patch makes bucket recovery turn a SENDING bucket into
      SENT in case it is activated in its destination replicaset.
      The garbage collector is then trusted to deal with SENT
      buckets in a special way, to be done in future patches.
      
      Part of #173
  11. Jul 06, 2022
    • test: fix flaky router/reconnect_to_master · 5edceebe
      Nikita Zheleztsov authored
      The first counting of known buckets on the router in this test
      was not stable, as sometimes the discovery fiber didn't have
      time to start the discovery process.
      
      Let's start the router with discovery disabled, make sure the
      router doesn't know any buckets, then enable discovery and
      wait until the router gets all the buckets from the replica.
      
      Closes #304
  12. Jun 27, 2022
    • ci: include 2.10 · 969005a5
      Vladislav Shpilevoy authored
      Part of #339
    • storage: fail a call if couldn't unref the bucket · a1f1ca36
      Vladislav Shpilevoy authored
      vshard.storage.call() takes care of bucket referencing to
      prevent the bucket's move while the call is in progress. It
      assumed that if a ref succeeded, then an unref would also
      work.
      
      But that is not always true. Now it is quite easy to break,
      because if a ref is taken on a replica, the master can move
      the bucket even while it is being accessed on another
      instance.
      
      The patch makes vshard.storage.call() fail if the unref fails,
      signaling that the bucket was probably deleted together with
      its ref counter.
      
      In future patches there are going to be more fail-safe
      measures like this one. But some of them will first require
      fixing bucket GC so that it consults replicas about whether
      they still have refs.
      
      Part of #173
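
      The resulting control flow, sketched. This is simplified, and
      bucket_ref()/bucket_unref() stand in for the real internals:
      ```Lua
      local function storage_call(bucket_id, mode, func, args)
          local ok, err = bucket_ref(bucket_id, mode)
          if not ok then
              return nil, err
          end
          local call_ok, res = pcall(_G[func], unpack(args or {}))
          ok, err = bucket_unref(bucket_id, mode)
          if not ok then
              -- The bucket was probably deleted together with its
              -- ref counter while the call was running. Fail the
              -- whole call instead of reporting success.
              return nil, err
          end
          if not call_ok then
              return nil, res
          end
          return res
      end
      ```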
    • error: support stacked box errors · 3cd2aaed
      Vladislav Shpilevoy authored
      Box errors can be stacked via the 'prev' member. Such errors
      didn't raise any exceptions in vshard, but vshard.error.box()
      wouldn't unpack any errors beyond the first one.
      
      The patch makes vshard.error.box() walk the whole error stack.
      
      It is done to be consistent with a future patch which will use
      the 'prev' field for vshard errors too, thus producing stacks
      of errors as Lua tables.
      
      That in turn is needed in order to return more than one error
      from vshard.storage.call(), which will be able to fail on both
      the user function and the bucket unref.
      
      Needed for #173
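
      The unpacking can be sketched like this. Box errors expose
      :unpack() and the .prev member, which is what the walk relies
      on:
      ```Lua
      -- Convert a stacked box error into a chain of plain Lua
      -- tables.
      local function error_box(err)
          local res = err:unpack()
          local cur = res
          local prev = err.prev
          while prev ~= nil do
              cur.prev = prev:unpack()
              cur = cur.prev
              prev = prev.prev
          end
          return res
      end
      ```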
    • storage: make bucket_unref fail critically · 0c5d840f
      Vladislav Shpilevoy authored
      It used to fail with a WRONG_BUCKET error. That isn't a good
      idea, because if the error ever reaches the router, it will
      retry the request.
      
      That can't happen now, but in a future patch it will become
      possible. Then it could be that a ref succeeded, the request
      started execution, then the bucket was deleted, and the unref
      failed. The last step shouldn't let the router retry the
      request, especially if it was an RW request, which is not
      idempotent.
      
      Part of #173
    • storage: handle _bucket changes in commit triggers · 896bce78
      Vladislav Shpilevoy authored
      Previously _bucket changes only triggered wakeups of various
      background fibers. They didn't validate or change anything.
      
      While the patch doesn't do much in the field of validation
      (that is a part of the future patches), it makes _bucket
      updates apply some changes on commit, without the need to do
      that manually after an explicit transaction like
      
          _bucket:replace(...)
          bucket_ref.ro_lock = true
      
      Now the replace itself tries to do the right things on commit.
      
      This is important because otherwise replicas don't get any
      updates to their bucket_ref objects. For example, if a bucket
      was ACTIVE and became SENT/GARBAGE, and on the replica it had
      RO refs > 0, then the replica didn't install `ro_lock` to
      reject new RO refs. So new read requests could come and would
      be accepted.
      
      Now, when a bucket is deleted or becomes garbage, its ref
      descriptor is also deleted. That fixes a similar problem when
      an ACTIVE bucket is just deleted completely.
      
      There are also more updates which handle cases impossible
      during normal functioning, but which could still be produced
      manually or happen as a result of a bug. For example, it
      shouldn't be possible for a RECEIVING bucket to have a ref
      descriptor at all. Yet if that is detected, the ref is
      deleted.
      
      The patch intentionally doesn't install an rw_lock on a
      SENDING bucket, because it is supposed to be installed
      *before* the bucket becomes SENDING. Checks like that are the
      subject of a future patch which will make the _bucket
      on_replace trigger do sanity checks before the commit happens.
      
      The patch is a part of a bigger work on making _bucket updates
      more robust and making replicas aware of and agree with them.
      
      Part of #173
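
      A rough sketch of the mechanism. It is heavily simplified, and
      M.bucket_refs is an assumption, but it shows why commit
      triggers matter: they also fire on replicas applying the
      change via replication:
      ```Lua
      local function bucket_on_commit(iterator)
          for _, old, new in iterator() do
              local bid = (new or old).id
              local ref = M.bucket_refs[bid]
              if new == nil or new.status == 'garbage' then
                  -- A deleted or garbage bucket must not keep a ref
                  -- descriptor alive.
                  M.bucket_refs[bid] = nil
              elseif ref ~= nil and new.status == 'sent' then
                  -- Reject new RO refs as soon as SENT is applied.
                  ref.ro_lock = true
              end
          end
      end
      
      box.space._bucket:on_replace(function()
          -- Defer the bookkeeping to commit time; an aborted
          -- transaction then changes nothing.
          box.on_commit(bucket_on_commit)
      end)
      ```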
    • vtest: store wait_timeout in a single place · 2c28be75
      Vladislav Shpilevoy authored
      VShard luatests used to define wait_timeout in their test.lua
      files. That led to the timeout being duplicated in all the
      files and to somewhat clumsy code when the timeout needed to
      be passed to the instances.
      
      The patch stores the timeout in a single place - in vtest. It
      is the default timeout for the existing waiting functions and
      will be used in the future patches too.
      
      Needed for #173
    • storage: change ERRINJ_NO_RECOVERY behaviour · ba55c69d
      Vladislav Shpilevoy authored
      In new tests it will be necessary to pause the recovery again.
      But the old implementation of the injection didn't allow
      finding out whether the recovery had already gotten stuck on
      it. The new one makes it possible to wait until the recovery
      is paused by checking the injection value.
      
      The name is changed to ERRINJ_RECOVERY_PAUSE to reflect the
      behaviour better. A similar injection will be introduced for
      GC.
      
      Needed for #173
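
      The pause pattern can be sketched as follows (simplified; the
      'paused' value is an assumption about how the fiber reports
      that it is parked):
      ```Lua
      -- Inside the recovery fiber's main loop; 'fiber' is the
      -- standard module required at the top of the file.
      if M.errinj.ERRINJ_RECOVERY_PAUSE then
          -- Publish the fact that recovery is parked, so a test can
          -- wait for the 'paused' value instead of sleeping blindly.
          M.errinj.ERRINJ_RECOVERY_PAUSE = 'paused'
          while M.errinj.ERRINJ_RECOVERY_PAUSE do
              fiber.sleep(0.01)
          end
      end
      ```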