  1. Jun 22, 2023
    • memtx: fix lost gap and full scan items · f58ecaeb
      Aleksandr Lyapunov authored
      By mistake, commit 8a565144 added a shortcut to the procedure
      that handles gap writes: it assumed that if the writing
      transaction is the same as the reading one, there is no actual
      conflict that must be stored further. That was a wrong decision:
      if such a transaction yields and another transaction comes and
      commits a value with the same key, the first one must go to the
      conflicted state since it has read a state that is no longer
      possible.
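The scenario above can be sketched with a toy model. This is not Tarantool code: `Txn`, `Manager` and their methods are hypothetical names, and the model only assumes what the message states, namely that a read of an absent key must stay tracked even when the reader later writes that key itself.

```python
class Txn:
    """Hypothetical transaction: pending writes plus a conflict flag."""

    def __init__(self, name):
        self.name = name
        self.conflicted = False
        self.writes = {}


class Manager:
    """Toy tracker of which transactions read which keys, present or not."""

    def __init__(self):
        self.committed = {}
        self.readers = {}  # key -> set of transactions that read it

    def get(self, txn, key):
        # Track every read, including a "gap" read that finds nothing.
        self.readers.setdefault(key, set()).add(txn)
        return txn.writes.get(key, self.committed.get(key))

    def put(self, txn, key, value):
        # Writing a key does NOT drop the writer's own earlier read record;
        # dropping it is the shortcut the commit message calls a mistake.
        txn.writes[key] = value

    def commit(self, txn):
        if txn.conflicted:
            raise RuntimeError(f"{txn.name} is conflicted")
        for key, value in txn.writes.items():
            self.committed[key] = value
            for reader in self.readers.get(key, ()):
                if reader is not txn:
                    reader.conflicted = True


mgr = Manager()
first, second = Txn("first"), Txn("second")
mgr.get(first, 1)             # gap read: nothing by key 1
mgr.put(first, 1, "a")        # same transaction writes the key
mgr.put(second, 1, "b")       # meanwhile another transaction writes it
mgr.commit(second)            # ...and commits while `first` has yielded
assert first.conflicted       # `first` read a state that no longer exists
```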
      
      Another similar mistake was made in e6f5090c, where a write
      after a full scan by the same transaction was not tracked as a read.
      That was wrong too: if some other transaction overwrites the key
      and commits, this transaction must go to a read view, since it saw
      nothing by this key and that is no longer the case.
      
      Fix it by reverting the first commit and modifying the second, and
      add a test.
      
      Closes #8326
      
      NO_DOC=bugfix
      
      (cherry picked from commit b41c4546)
    • memtx: refactor handling point write · 38e84d71
      Aleksandr Lyapunov authored
      There's a special 'point hole' mechanism in mvcc transactional
      manager that manages point gap reads by full key when no raw
      tuple was found in the index. For instance, it's the only way
      to collect gap reads for non-tree indexes.
      
      Once a new tuple is inserted into the index, the read records are
      transferred to the normal read set in the corresponding story.
      After that the 'point hole' record is no longer needed.
      
      So let's remove it.
      
      While we are here, drop unused point_holes_size, improve names
      and comments.
      
      Part of #8648
      Part of #8654
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit 62c65639)
    • memtx: refactor mvcc story linking to the top of chain · 5001e92e
      Aleksandr Lyapunov authored
      Before this patch there were several different places in the code
      that deal with referencing tuple in space, setting in_index member
      and marking the story as retained or not. But logically all above
      is about the same - about placing a story to the top of a chain,
      i.e. the first story in version list to which index points.
      
      This commit refactors these things a bit. This mostly relates to
      two functions - memtx_tx_story_new and memtx_tx_story_link_top.
      
      Changes in memtx_tx_story_new are based on the fact that if a story
      is created by tuple, it is or immediately will be at the top of
      chain. Considering this we can omit argument `is_referenced_to_pk`
      and always create a story ready to be in top of chain. If a story
      is already in the top - nothing else is needed; if it is to become
      the top - memtx_tx_story_link_top must be called after.
      
      Further, linking to top of chain is needed exactly in two cases:
      * if a story just created by memtx_tx_story_new must become a top
      * if a chain is reordered involving the top story (the top and the
        next stories are swapped)
      These two cases are logically very close but still different.
      Moreover, previously there were two functions for that:
      memtx_tx_story_link_top_light and memtx_tx_story_link_top,
      respectively. This commit introduces one function for that
      (although with one more argument) that also encapsulates
      referencing tuples and marking stories as retained.
      
      After this patch the rules are logical and simple:
      * if a tuple is inserted - call _story_new and _link_top(.. true).
      * if a story of existing clean tuple is needed - call _story_new.
      * if a chain is reordered involving top story - _link_top(.. false).
      
      Part of #8648
      Part of #8654
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit 202340b7)
    • memtx: use xallocation in mvcc engine · cb88847b
      Aleksandr Lyapunov authored
      Remove runtime allocation error handling and use panic-on-fail
      versions of allocation functions. Reasons for that:
      * Memory error handling was never tested and probably doesn't work.
      * Some return codes were ignored so the code had obvious flaws.
      * Rollback in case of memory error made some code overcomplicated.
      
      Part of #8648
      Part of #8654
      
      NO_DOC=no new functionality added
      NO_TEST=no new functionality added
      NO_CHANGELOG=no new functionality added
      
      (cherry picked from commit c951c9de)
    • memtx: replace conflict trackers with read trackers · b793604c
      Aleksandr Lyapunov authored
      Conflict trackers are used to store information that if some
      transaction is committed then some other transaction must be
      aborted. This happens when the first transaction writes some
      key while the other reads the same key. On the other hand there
      are other trackers - read trackers - that are designed to
      handle exactly the same situation. That's why conflict trackers
      can be simply replaced with read trackers.
      
      That allows removing conflict trackers as no longer needed.
      
      Part of #8648
      Part of #8654
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit a6c2b9ff)
    • memtx: refactor rollbacked stories · bdf66dd5
      Aleksandr Lyapunov authored
      If addition of a tuple is rolled back while the corresponding
      story is needed for something else (for example it stores a read
      set of another transaction) - the story cannot be deleted.
      Now there's a special flag `rollbacked` that is set to true
      for such stories, and the flag must be considered in places
      where history chains are scanned. That approach also requires
      psn to be set for rolled-back transactions, which is surprisingly
      not as simple as it sounds. All that makes the code complicated
      and hard to maintain.
      
      There's another approach for managing rolled back stories: simply
      set their del_psn to a low enough value (lower than any existing
      transaction's PSN) and (if necessary) push them to the end of
      history chain. Such a story would be invisible to any transaction
      due to already existing mechanisms, that's what is needed.
      
      In order to provide a "low enough" del_psn it is natural to
      assign real PSNs starting from some predefined value, so any value
      below that predefined value will be less than any existing PSN and
      thus "low enough".
      
      Implement this simpler approach.
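A minimal sketch of the "low enough del_psn" idea, under stated assumptions: the constants `TXN_MIN_PSN` and `ROLLED_BACK_PSN`, the `Story` class and the visibility rule below are illustrative and are not taken from the Tarantool sources. The only point is that a deletion PSN below every real PSN hides the story through the ordinary visibility check, with no dedicated flag.

```python
from dataclasses import dataclass
from typing import Optional

TXN_MIN_PSN = 2      # assumed: the first PSN given to a real transaction
ROLLED_BACK_PSN = 1  # assumed: any value below TXN_MIN_PSN works


@dataclass(eq=False)
class Story:
    """One version of a tuple in a history chain (hypothetical model)."""
    value: str
    add_psn: int
    del_psn: Optional[int] = None  # None means "not deleted"


def is_visible(story, read_psn):
    # Assumed rule: added at or before the reader's PSN and not deleted by then.
    added = story.add_psn <= read_psn
    alive = story.del_psn is None or story.del_psn > read_psn
    return added and alive


def rollback(chain, story):
    # No special flag: mark the story as deleted "before everything"
    # and push it to the end of the chain.
    story.del_psn = ROLLED_BACK_PSN
    chain.remove(story)
    chain.append(story)


def visible_value(chain, read_psn):
    # The chain is ordered newest first; return the first visible version.
    for story in chain:
        if is_visible(story, read_psn):
            return story.value
    return None


old = Story("old", add_psn=TXN_MIN_PSN)
new = Story("new", add_psn=TXN_MIN_PSN + 1)
chain = [new, old]
rollback(chain, new)
assert visible_value(chain, read_psn=5) == "old"
```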
      
      Part of #8648
      Part of #8654
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit ba394a58)
    • memtx: add a couple of test cases to tx_man.test · 19b3aa56
      Aleksandr Lyapunov authored
      Strangely, the group of simple test cases in this test checks
      replaces, updates and deletes, but there happens to be no test
      case that checks inserts.
      
      Fix it and add simple test cases for inserts.
      
      No logical changes.
      
      Part of #8648
      Part of #8654
      
      NO_DOC=new test case
      NO_CHANGELOG=new test case
      
      (cherry picked from commit 37b4561f)
  2. Jun 21, 2023
  3. Jun 20, 2023
  4. Jun 19, 2023
    • ci: extend default tests run with osx workflows · c56e1835
      Yaroslav Lobankov authored
      It was decided to include the `osx_debug.yml` and `osx_release.yml`
      workflows in the default tests run (without the `full-ci` label).
      Now we can get test results for macOS faster and without an extra
      load on CI.
      
      NO_DOC=ci
      NO_TEST=ci
      NO_CHANGELOG=ci
      
      (cherry picked from commit de404cb0)
    • limbo: set user for triggers on sync transaction · f4d30856
      Nikita Zheleztsov authored
      Commit/rollback triggers are run asynchronously, upon receiving the
      write status from WAL. We can't run them in the original fiber that
      submitted the WAL request, because it would open a time window between
      writing a transaction to WAL and committing it in tx, which could lead
      to violating the cascading rollback principles. As a result,
      commit/rollback triggers run with admin privileges.
      
      The issue was already solved for confirming async transactions, but
      the session and user are still not correct when the transaction is
      confirmed by the limbo. Let's fix this issue by temporarily setting
      the session and credentials to those of the original fiber for running
      commit/rollback triggers.
      
      Closes #8742
      
      NO_DOC=bugfix
      
      (cherry picked from commit 8cd0cd09)
    • limbo: fix commit/rollback failures with triggers · a7a51aef
      Nikita Zheleztsov authored
      Currently some transactions on synchronous space fail to complete with
      the `ER_CURSOR_NO_TRANSACTION` error, when on_rollback/on_commit triggers
      are set.
      
      This happens because some rollback/commit triggers require the
      in_txn fiber variable to be set, but it's not done when a
      transaction is completed from the limbo. Callbacks which are used to
      work with iterators (`lbox_txn_pairs` and `lbox_txn_iterator_next`)
      acquire txn statements from the current transaction, but they cannot
      do that when this transaction is not assigned to the current fiber, so
      `ER_CURSOR_NO_TRANSACTION` is thrown.
      
      Let's assign the in_txn variable when we complete a transaction from
      the limbo. Moreover, let's add assertions which check whether in_txn()
      is correct, in order to be sure that `txn_complete_success/fail` always
      run with in_txn set.
      
      Closes #8505
      
      NO_DOC=bugfix
      
      (cherry picked from commit 6fadc8a0)
    • replication: recovery mixed transactions · 15271938
      Yan Shtunder authored
      See the docbot request for details.
      
      Closes #7932
      
      @TarantoolBot document
      Title: correct recovery of mixed transactions
      
      This patch implements correct recovery of mixed transactions. To
      enable it, set `box.cfg.force_recovery` to `true`. If you need to revert
      to the old behavior, don't set the `force_recovery` option.
      
      What to do when one node feeds the other an xlog with mixed transactions?
      Let there be two nodes (`node#1` and `node#2`), with data
      replicated from `node#1` to `node#2`. Suppose that at some point in time,
      `node#1` is restoring data from an xlog containing mixed transactions. To
      replicate data from `node#1` to `node#2`, do the following:
      1. Stop `node#2` and delete all xlog files from it.
      2. Restart `node#1` with `force_recovery` set to `true`.
      3. Make `node#2` rejoin `node#1`.
      
      (cherry picked from commit 2b1c8713)
    • trivia: add xlsregion_alloc macros and xlsregion_alloc_object · eb90fd87
      Yan Shtunder authored
      New macros have been introduced because OOM errors are not handled in the
      code.
      
      Needed for #7932
      
      NO_DOC=internal
      NO_TEST=internal
      NO_CHANGELOG=internal
      
      (cherry picked from commit 53857148)
    • box: store box.cfg.force_recovery in C · db3b6b99
      Vladislav Shpilevoy authored
      box.cfg.force_recovery used to be needed only during box.cfg() in
      a few places, but its usage is going to extend.
      
      In future commits about cluster/replicaset/instance names it will
      be needed to allow rename. It won't be entirely legal (hence can't
      be done without any flags), but won't be fully illegal either.
      
      The "valid" rename will be after upgrading, when an
      old cluster is updated to a new version and wants to start using the
      names. Then it will have to set force_recovery, set the names,
      sync the instances, drop force_recovery. It is a one-time action to
      allow old installations to use the new feature - the names.
      
      Part of #5029
      
      NO_DOC=refactoring
      NO_CHANGELOG=refactoring
      NO_TEST=already covered
      
      (cherry picked from commit ae4c96c7)
  5. Jun 16, 2023
    • core: fix wal saving an empty xlog during startup failure · 9f65512e
      Serge Petrenko authored
      After the commit a392eb76 ("box: destroy its modules even if box.cfg
      not done") wal_free() on exit is called even if box.cfg() isn't finished
      yet, and thus wal isn't yet enabled. This leads to a bug when wal would
      unconditionally write a 00...0.xlog regardless of the actual xlog
      directory contents or the actual vclock.
      
      Let's fix this by not writing an empty xlog on exit if wal writer's
      vclock is zero. It means the writer either wasn't enabled yet, or was
      enabled, but hasn't written anything. In both cases writing the empty
      xlog is extraneous.
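The exit-time condition described above can be illustrated as follows. This is a sketch only: the function name and the representation of a vclock as a mapping from replica id to LSN are assumptions made for the example, not the actual WAL code.

```python
def should_write_empty_xlog_on_exit(writer_vclock):
    """Return True only if the WAL writer has recorded some progress.

    `writer_vclock` is modelled as a dict {replica_id: lsn}. A zero vclock
    (empty, or all LSNs equal to 0) means the writer either was never
    enabled or was enabled but has not written anything.
    """
    return any(lsn > 0 for lsn in writer_vclock.values())


# box.cfg() did not finish: the writer vclock is still zero, nothing to save.
assert not should_write_empty_xlog_on_exit({})
# The writer has processed rows: an empty xlog marking the vclock is useful.
assert should_write_empty_xlog_on_exit({1: 42})
```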
      
      Closes #8704
      
      NO_DOC=bugfix
      
      (cherry picked from commit f78908c9)
    • replication: fix waiting for remote ballot updates during bootstrap · 7c057ec3
      Serge Petrenko authored
      There were two problems in applier_wait_bootstrap_leader_uuid_is_set():
       * it didn't set an error cause when the ballot watcher was dead,
         leading to a segfault when trying to diag_add() to the cause.
       * it didn't expect the ballot watcher to exit without an error
         before the bootstrap leader uuid is known, hanging forever.
      
      The first issue could be reproduced by trying to bootstrap a replicaset
      consisting of both "old" (Tarantool 2.10 or older) and "new" instances.
      Or by bootstrapping a replicaset consisting of "new" instances and
      stopping some of them in a specific manner.
      
      The second issue could be reproduced only by manually broadcasting an
      empty "internal.ballot" event.
      
      Fix both issues. The first one by setting an ER_UNKNOWN error when
      the ballot watcher fiber is dead. And the second one by checking if the
      ballot watcher has died during waiting for the ballot update.
      
      The test only covers the first issue: the second one can only happen
      with manual intervention and is hard to test: it involves broadcasting
      an empty "internal.ballot" from the replica while it's still connecting
      to remote nodes during an initial `box.cfg{}` call.
      
      Closes #8757
      
      NO_DOC=bugfix
      
      (cherry picked from commit 1c25efb4)
    • box: fix typo in pagination `after` position option validation · 98a3d5f0
      Georgiy Lebedev authored
      Closes #8716
      
      NO_DOC=bugfix
      
      (cherry picked from commit 3d0afc60)
  6. Jun 15, 2023
    • build: replace `string` with JOIN option to `string_join` helper in CMake · 2de4cb07
      Georgiy Lebedev authored
      `string` with JOIN option is only available since CMake 3.12, but we have
      developers using CMake 3.1: implement a utility `string_join` function to
      remove this dependency.
      
      Closes #5881
      
      NO_CHANGELOG=<build fix>
      NO_DOC=<build fix>
      NO_TEST=<build fix>
      
      (cherry picked from commit 55298308)
    • xrow: fix large bit shift error in xrow_decode_dml · d64a639b
      Vladimir Davydov authored
      Reported by ASAN. The issue was fixed in the master branch in commit
      b9550f19 ("box: support space and index names in IPROTO requests").
      
      NO_TEST=asan
      NO_DOC=bug fix
      NO_CHANGELOG=minor
    • xrow: ignore unknown IPROTO keys on decode · b6e31b42
      Vladimir Davydov authored
      The xrow_decode_* functions are written in such a way that they ignore
      unknown IPROTO keys. This is required for connectivity between different
      Tarantool versions. However, there's a bug in the code connected with the
      value type checking: we fail if the key is >= iproto_key_MAX. This
      worked fine as long as we added new IPROTO keys in the middle of the key
      space, without bumping iproto_key_MAX, but this assumption broke when we
      added IPROTO_AUTH_TYPE. The issue is exacerbated by the fact that
      IPROTO_AUTH_TYPE is used by IPROTO_ID, which is sent unconditionally on
      connect. Let's fix the value type check and add some tests.
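The intended decoding behaviour can be modelled as below. The key numbers, `KEY_MAX` and `KEY_TYPES` are hypothetical stand-ins for the real IPROTO key table, and the body is modelled as a plain dict rather than MsgPack. The sketch shows only the rule: a key outside the known range is skipped instead of failing the type check.

```python
KEY_MAX = 4                             # hypothetical end of the known key range
KEY_TYPES = {0: int, 1: str, 3: bytes}  # hypothetical key -> expected value type


def decode_body(body):
    """Decode a request body, ignoring keys this version does not know."""
    decoded = {}
    for key, value in body.items():
        # Look up the expected type only for keys inside the known range.
        expected = KEY_TYPES.get(key) if key < KEY_MAX else None
        if expected is None:
            # Unknown key, possibly sent by a newer peer: skip it.
            continue
        if not isinstance(value, expected):
            raise ValueError(f"key {key}: expected {expected.__name__}")
        decoded[key] = value
    return decoded


# Key 99 is beyond KEY_MAX; it is dropped rather than treated as an error.
assert decode_body({0: 7, 1: "x", 99: "from a newer version"}) == {0: 7, 1: "x"}
```

A type mismatch on a known key still raises, so the check stays strict for everything the decoder understands.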
      
      Notes:
       - xrow_decode_heartbeat turns out to be unused. Drop it.
       - Fix the net.box helpers response_body_decode and netbox_decode_table
         to handle unknown keys and empty body. This is needed to properly
         decode a response to an injection in tests.
       - Testing unknown keys in replication requests would be complicated.
         Instead we add a bunch of unit tests.
       - Convert the xrow unit test to TAP.
      
      Closes #8745
      
      NO_DOC=bug fix
      
      (cherry picked from commit ee0660b8)
    • box: refactor net.box response body decoding · 3c9b9b93
      Georgiy Lebedev authored
      Response body decoding of DML and call/eval requests is very ad-hoc and
      hard to extend: introduce a new `response_body_decode` helper that decodes
      the response body similarly to `xrow_decode_dml`, which will allow
      separating decoding from processing.
      
      Needed for #8147
      
      NO_CHANGELOG=refactoring
      NO_DOC=refactoring
      NO_TEST=refactoring
      
      (cherry picked from commit 1d6043fa)
  7. Jun 13, 2023
  8. Jun 08, 2023
  9. Jun 07, 2023
  10. Jun 06, 2023
    • test: write bad snap with errinj in gh_7974_force_recovery_bugs_test · 90d9d88f
      Vladimir Davydov authored
      Corrupted snap files that are used in the test were generated manually
      using a now old Tarantool version that has an outdated system schema.
      In the scope of #7149 DDL was forbidden until the system schema is
      upgraded. The problem is luatest tries to grant super privileges to
      the guest user (which is a DDL operation) after starting a test instance
      unless they are already granted. Since the snap files don't store the
      required privileges, luatest fails.
      
      To fix this issue, let's generate corrupted snap files right in the test
      using error injection.
      
      Closes #8702
      
      NO_DOC=test
      NO_CHANGELOG=test
      
      (cherry picked from commit 67598073)
  11. Jun 02, 2023
    • test: fix flaky feedback_daemon_metrics test · ea6064dd
      Andrey Saranchin authored
      The patch fixes two mistakes that were made while writing the test.
      Firstly, we should replace metrics module with an empty table before
      every test case - otherwise, metrics can be collected before the test.
      Secondly, all attempts to check if a required amount of metrics was
      collected are pointless, even with margin - some environments are
      incredibly slow. So let's check if some metrics were collected, and check
      that there are not too many of them.
      
      Also, the patch increases some timeouts to minimize the probability of
      failure due to a slow environment.
      
      Follows up #8192
      Closes tarantool/tarantool-qa#312
      
      NO_CHANGELOG=test
      NO_DOC=test
      
      (cherry picked from commit d187b142)
    • metrics: bump to new version · f28d9e24
      Oleg Chaplashkin authored
      Bump the metrics submodule to 1.0.0-2-gea83227 version.
      
      NO_DOC=metrics submodule bump
      NO_TEST=metrics submodule bump
      NO_CHANGELOG=metrics submodule bump
      
      (cherry picked from commit 2780e8e0)