  1. Jun 23, 2023
      memtx: refactor handling point write · 513d2c0a
      Aleksandr Lyapunov authored
      There's a special 'point hole' mechanism in mvcc transactional
      manager that manages point gap reads by full key when no raw
      tuple was found in the index. For instance, it's the only way
      to collect gap reads for non-tree indexes.
      
      Once a new tuple is inserted to the index, the read records are
      transferred to the normal read set in the corresponding story.
      After that, the 'point hole' record is no longer needed.
      
      So let's remove it.
      
      While we are here, drop unused point_holes_size, improve names
      and comments.
      
      Part of #8648
      Part of #8654
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit 62c65639)
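A minimal Python sketch of the bookkeeping described in this commit message, not the actual memtx code: reads of a missing full key are recorded as 'point hole' records, and once a tuple with that key is inserted, the readers move to the read set of the new story and the hole record is dropped. The names `PointHoleTracker`, `track_missing_read` and `on_insert` are hypothetical.

```python
from collections import defaultdict


class PointHoleTracker:
    """Toy model: remembers which transactions read a key that was absent."""

    def __init__(self):
        # full key -> ids of transactions that read the (missing) key
        self.holes = defaultdict(set)
        # full key -> ids of readers attached to the story of the tuple
        self.story_readers = defaultdict(set)

    def track_missing_read(self, key, txn_id):
        # A point read by full key found nothing: record a point hole.
        self.holes[key].add(txn_id)

    def on_insert(self, key):
        # Transfer gap readers to the new story and drop the hole record,
        # since it is not needed once the story owns the read set.
        readers = self.holes.pop(key, set())
        self.story_readers[key] |= readers
        return readers
```

After `on_insert` the key no longer appears in `holes`, which mirrors the removal this commit makes.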
      memtx: refactor mvcc story linking to the top of chain · 8b3f7425
      Aleksandr Lyapunov authored
      Before this patch there were several different places in the code
      that deal with referencing tuple in space, setting in_index member
      and marking the story as retained or not. But logically all above
      is about the same - about placing a story to the top of a chain,
      i.e. the first story in version list to which index points.
      
      This commit refactors these things a bit. This mostly relates to
      two functions - memtx_tx_story_new and memtx_tx_story_link_top.
      
      Changes in memtx_tx_story_new are based on the fact that if a story
      is created by tuple, it is or immediately will be at the top of
      chain. Considering this we can omit argument `is_referenced_to_pk`
      and always create a story ready to be in top of chain. If a story
      is already in the top - nothing else is needed; if it is to become
      the top - memtx_tx_story_link_top must be called after.
      
      Further, linking to top of chain is needed exactly in two cases:
      * if a story just created by memtx_tx_story_new must become a top
      * if a chain is reordered involving the top story (the top and the
        next stories are swapped)
      These two cases are logically very close but still different.
      Even more, previously there were two functions for that:
      memtx_tx_story_link_top_light and memtx_tx_story_link_top
      correspondingly. This commit introduces one function for that
      (although with one more argument) that also encapsulates
      activities about referencing tuples and marking stories as
      retained.
      
      After this patch the rules are logical and simple:
      * if a tuple is inserted - call _story_new and _link_top(.. true).
      * if a story of existing clean tuple is needed - call _story_new.
      * if a chain is reordered involving top story - _link_top(.. false).
      
      Part of #8648
      Part of #8654
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit 202340b7)
      memtx: use xallocation in mvcc engine · 04f9f1f9
      Aleksandr Lyapunov authored
      Remove runtime allocation error handling and use panic-on-fail
      versions of allocation functions. Reasons for that:
      * Memory error handling was never tested and probably doesn't work.
      * Some return codes were ignored, so the code had obvious flaws.
      * Rollback in case of memory error made some code overcomplicated.
      
      Part of #8648
      Part of #8654
      
      NO_DOC=no new functionality added
      NO_TEST=no new functionality added
      NO_CHANGELOG=no new functionality added
      
      (cherry picked from commit c951c9de)
      memtx: replace conflict trackers with read trackers · 7ae8c354
      Aleksandr Lyapunov authored
      Conflict trackers are used to store information that if some
      transaction is committed then some other transaction must be
      aborted. This happens when the first transaction writes some
      key while the other reads the same key. On the other hand, there
      are other trackers - read trackers - that are designed to
      handle exactly the same situation. That's why conflict trackers
      can be simply replaced with read trackers.
      
      That allows removing conflict trackers, as they are no longer
      needed.
      
      Part of #8648
      Part of #8654
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit a6c2b9ff)
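The write-versus-read situation above can be sketched with a toy read tracker in Python. This is an illustration under assumptions, not the memtx data structure: `Txn`, `ReadTrackers`, `track_read` and `on_commit_write` are hypothetical names, and the model simply marks conflicting readers as aborted, following the wording of the commit message.

```python
class Txn:
    def __init__(self, name):
        self.name = name
        self.aborted = False


class ReadTrackers:
    """Toy model: remembers which transactions have read which key."""

    def __init__(self):
        self.readers = {}  # key -> list of Txn objects that read it

    def track_read(self, txn, key):
        self.readers.setdefault(key, []).append(txn)

    def on_commit_write(self, writer, key):
        # When a writer of `key` commits, every other transaction
        # that has read the same key must be aborted.
        for txn in self.readers.get(key, []):
            if txn is not writer:
                txn.aborted = True
```

Because the read record already carries the "who read this key" information, a separate conflict record for the same pair of transactions adds nothing in this model.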
      memtx: refactor rollbacked stories · c80c6ed7
      Aleksandr Lyapunov authored
      If addition of a tuple is rolled back while the corresponding
      story is needed for something else (for example it stores a read
      set of another transaction) - the story cannot be deleted.
      Now there's a special flag `rollbacked` that is set to true
      for such stories, and the flag must be considered in places
      where history chains are scanned. That approach also requires
      psn to be set for rolled-back transactions, which is surprisingly
      not as simple as it sounds. All that makes the code complicated
      and hard to maintain.
      
      There's another approach for managing rolled back stories: simply
      set their del_psn to a low enough value (lower than any existing
      transaction's PSN) and (if necessary) push them to the end of
      history chain. Such a story would be invisible to any transaction
      due to already existing mechanisms, which is exactly what is needed.
      
      In order to provide "low enough" del_psn it will be natural to
      assign real PSN starting from some predefined value, so any value
      below that predefined value will be less than any existing PSN and
      thus "low enough".
      
      Implement this simpler approach.
      
      Part of #8648
      Part of #8654
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit ba394a58)
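A small Python sketch of the "low enough del_psn" idea, assuming a simplified visibility rule (a reader sees a story that was added before it and not deleted before it). The constants `MIN_PSN` and `ROLLED_BACK_PSN` and the helpers `is_visible` and `roll_back` are hypothetical; the real memtx visibility check has more cases.

```python
MIN_PSN = 2          # hypothetical: first PSN given to a real transaction
ROLLED_BACK_PSN = 1  # hypothetical: below every real PSN, so "low enough"


class Story:
    def __init__(self, add_psn, del_psn=None):
        self.add_psn = add_psn
        self.del_psn = del_psn


def is_visible(story, reader_psn):
    """Simplified rule: added before the reader, not deleted before it."""
    added = story.add_psn is not None and story.add_psn < reader_psn
    deleted = story.del_psn is not None and story.del_psn < reader_psn
    return added and not deleted


def roll_back(story):
    # No special flag: the ordinary "deleted before you" check hides
    # the story from every reader whose PSN is at least MIN_PSN.
    story.del_psn = ROLLED_BACK_PSN
```

The point of the design is that `is_visible` needs no extra branch for rolled-back stories; reserving the values below `MIN_PSN` is enough.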
      memtx: add a couple of test cases to tx_man.test · 3d90f662
      Aleksandr Lyapunov authored
      Strangely, the group of simple test cases in this test has
      test cases that check replaces, updates and deletes,
      but there happens to be no test case that checks inserts.
      
      Fix it and add simple test cases for inserts.
      
      No logical changes.
      
      Part of #8648
      Part of #8654
      
      NO_DOC=new test case
      NO_CHANGELOG=new test case
      
      (cherry picked from commit 37b4561f)
      box: fix memory leaks on `ER_MULTISTATEMENT_TRANSACTION` in DDL · c219f1ee
      Georgiy Lebedev authored
      Space index build and space format checking operations don't destroy space
      iterator on `txn_check_singlestatement` failure — fix this.
      
      Closes #8773
      
      NO_DOC=bugfix
      NO_TEST=<leak happens in small, cannot be detected by sanitizer>
      
      (cherry picked from commit 6689f511)
      core: use `malloc` instead of `region` allocator in procedure name cache · 95cdf5a4
      Georgiy Lebedev authored
      The procedure name cache uses a region for hash table entry allocation, but
      the cache is a thread local variable, while the `region` uses a `slab`
      allocator the lifetime of which is bounded by `main`, while thread locals
      are destroyed after `main`: use the `malloc` allocator instead.
      
      Benchmark from #7207 shows that this change does not affect performance.
      
      Closes #8777
      
      NO_CHANGELOG=<leak does not affect users>
      NO_DOC=bugfix
      NO_TEST=<detected by ASAN>
      
      (cherry picked from commit 7135910e)
  2. Jun 22, 2023
      test: testing tarantool in background mode · 9351a095
      Sergey Bronnikov authored
      Before commit ec1af129 ("box: do not close xlog file descriptors in
      the atfork handler") there was a bug where Tarantool with background
      mode enabled via an environment variable could crash:
      
      NO_WRAP
      ```
      $ TT_PID_FILE=tarantool.pid TT_LOG=tarantool.log TT_BACKGROUND=true TT_LISTEN=3301 tarantool -e 'box.cfg{}'
      $ tail -3 tarantool.log
      2021-11-02 16:05:43.672 [2341202] main init.c:696 E> LuajitError: cannot read stdin: Resource temporarily unavailable
      2021-11-02 16:05:43.672 [2341202] main F> fatal error, exiting the event loop
      2021-11-02 16:05:43.672 [2341202] main F> fatal error, exiting the event loop
      ```
      NO_WRAP
      
      With commit ec1af129 ("box: do not close xlog file descriptors in
      the atfork handler") the described bug can no longer be reproduced.
      
      The proposed patch adds a test that starts Tarantool in background mode
      enabled via a box.cfg option and via the environment variable TT_BACKGROUND
      to make sure this behaviour will not be broken in the future.
      
      Closes #6128
      
      NO_DOC=test
      
      (cherry picked from commit f676fb7c)
  3. Jun 20, 2023
  4. Jun 19, 2023
      ci: extend default tests run with osx workflows · 047b7a3c
      Yaroslav Lobankov authored
      It was decided to include the `osx_debug.yml` and `osx_release.yml`
      workflows in the default tests run (without the `full-ci` label).
      Now we can get test results for macOS faster and without an extra
      load on CI.
      
      NO_DOC=ci
      NO_TEST=ci
      NO_CHANGELOG=ci
      
      (cherry picked from commit de404cb0)
      limbo: set user for triggers on sync transaction · 81bfb000
      Nikita Zheleztsov authored
      Commit/rollback triggers are run asynchronously, upon receiving the
      write status from WAL. We can't run them in the original fiber that
      submitted the WAL request, because it would open a time window between
      writing a transaction to WAL and committing it in tx, which could lead
      to violating the cascading rollback principles. As a result,
      commit/rollback triggers run with admin privileges.
      
      The issue was already solved for confirming async transactions, but
      session and user are still not correct when the transaction is
      confirmed by the limbo. Let's fix this issue by temporarily setting
      session and credentials to those of the original fiber for running
      commit/rollback triggers.
      
      Closes #8742
      
      NO_DOC=bugfix
      
      (cherry picked from commit 8cd0cd09)
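The "temporarily set, run triggers, restore" pattern from this commit message can be sketched in Python with a context manager. This is only an illustration: `Fiber`, `as_user` and `run_triggers` are hypothetical names, and a single `user` field stands in for the session and credentials.

```python
from contextlib import contextmanager


class Fiber:
    def __init__(self, user):
        self.user = user


@contextmanager
def as_user(fiber, user):
    # Swap in the credentials of the transaction's original fiber and
    # restore the previous ones even if a trigger raises.
    saved = fiber.user
    fiber.user = user
    try:
        yield
    finally:
        fiber.user = saved


def run_triggers(fiber, txn_owner, triggers):
    with as_user(fiber, txn_owner):
        return [trigger(fiber) for trigger in triggers]
```

A trigger run through `run_triggers` observes the owner of the transaction as the current user, while the fiber that completes the transaction keeps its own user afterwards.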
      limbo: fix commit/rollback failures with triggers · c9d5e5b3
      Nikita Zheleztsov authored
      Currently some transactions on synchronous space fail to complete with
      the `ER_CURSOR_NO_TRANSACTION` error, when on_rollback/on_commit triggers
      are set.
      
      This happens because some rollback/commit triggers
      require the in_txn fiber variable to be set, but this is not done when a
      transaction is completed from the limbo. Callbacks which are used to
      work with iterators (`lbox_txn_pairs` and `lbox_txn_iterator_next`)
      acquire txn statements from the current transaction, but they cannot
      do that when this transaction is not assigned to the current fiber, so
      `ER_CURSOR_NO_TRANSACTION` is thrown.
      
      Let's assign in_txn variable when we complete transaction from the limbo.
      Moreover, let's add assertions, which check whether in_txn() is correct,
      in order to be sure, that `txn_complete_success/fail` always run with
      in_txn set.
      
      Closes #8505
      
      NO_DOC=bugfix
      
      (cherry picked from commit 6fadc8a0)
  5. Jun 15, 2023
      build: replace `string` with JOIN option to `string_join` helper in CMake · 3250910f
      Georgiy Lebedev authored
      `string` with JOIN option is only available since CMake 3.12, but we have
      developers using CMake 3.1: implement a utility `string_join` function to
      remove this dependency.
      
      Closes #5881
      
      NO_CHANGELOG=<build fix>
      NO_DOC=<build fix>
      NO_TEST=<build fix>
      
      (cherry picked from commit 55298308)
      xrow: fix large bit shift error in xrow_decode_dml · 992ea525
      Vladimir Davydov authored
      Reported by ASAN. The issue was fixed in the master branch in commit
      b9550f19 ("box: support space and index names in IPROTO requests").
      
      NO_TEST=asan
      NO_DOC=bug fix
      NO_CHANGELOG=minor
      
      (cherry picked from commit d64a639b)
      xrow: ignore unknown IPROTO keys on decode · 5bbc2b6e
      Vladimir Davydov authored
      The xrow_decode_* functions are written in such a way that they ignore
      unknown IPROTO keys. This is required for connectivity between different
      Tarantool versions. However, there's a bug in the code connected with the
      value type checking: we fail if the key is >= iproto_key_MAX. This
      worked fine as long as we added new IPROTO keys in the middle of the key
      space, without bumping iproto_key_MAX, but this assumption broke when we
      added IPROTO_AUTH_TYPE. The issue is exacerbated by the fact that
      IPROTO_AUTH_TYPE is used by IPROTO_ID, which is sent unconditionally on
      connect. Let's fix the value type check and add some tests.
      
      Notes:
       - xrow_decode_heartbeat turns out to be unused. Drop it.
       - Fix the net.box helpers response_body_decode and netbox_decode_table
         to handle unknown keys and empty body. This is needed to properly
         decode a response to an injection in tests.
       - Testing unknown keys in replication requests would be complicated.
         Instead we add a bunch of unit tests.
       - Convert the xrow unit test to TAP.
      
      Closes #8745
      
      NO_DOC=bug fix
      
      (cherry picked from commit ee0660b8)
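A minimal Python sketch of the decoding rule this commit describes: type-check only the keys the decoder knows, and skip every unknown key, including keys at or above the table size. The key numbers and the `KEY_TYPES` table are made up for illustration and do not correspond to real IPROTO keys; keys are assumed to be non-negative integers.

```python
# Hypothetical key table indexed by key number.
# None marks a slot the decoder has no type information for.
KEY_TYPES = [int, int, str, None]
KEY_MAX = len(KEY_TYPES)


def decode_body(body):
    """Decode a {key: value} map, ignoring keys the decoder doesn't know."""
    decoded = {}
    for key, value in body.items():
        if key >= KEY_MAX or KEY_TYPES[key] is None:
            # Unknown key, possibly sent by a newer peer: skip it
            # instead of failing the whole request.
            continue
        if not isinstance(value, KEY_TYPES[key]):
            raise ValueError(f"key {key}: expected {KEY_TYPES[key].__name__}")
        decoded[key] = value
    return decoded
```

The bug described above corresponds to raising an error in the `key >= KEY_MAX` branch instead of continuing.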
      box: refactor net.box response body decoding · 81ce2185
      Georgiy Lebedev authored
      Response body decoding of DML and call/eval requests is very ad-hoc and
      hard to extend: introduce a new `response_body_decode` helper that decodes
      the response body similarly to `xrow_decode_dml`; this will allow
      separating decoding from processing.
      
      Needed for #8147
      
      NO_CHANGELOG=refactoring
      NO_DOC=refactoring
      NO_TEST=refactoring
      
      (cherry picked from commit 1d6043fa)
  6. Jun 13, 2023
  7. Jun 08, 2023
  8. Jun 07, 2023
  9. Jun 06, 2023
      test: write bad snap with errinj in gh_7974_force_recovery_bugs_test · ab299c6a
      Vladimir Davydov authored
      Corrupted snap files that are used in the test were generated manually
      using a now old Tarantool version that has an outdated system schema.
      In the scope of #7149 DDL was forbidden until the system schema is
      upgraded. The problem is luatest tries to grant super privileges to
      the guest user (which is a DDL operation) after starting a test instance
      unless they are already granted. Since the snap files don't store the
      required privileges, luatest fails.
      
      To fix this issue, let's generate corrupted snap files right in the test
      using error injection.
      
      Closes #8702
      
      NO_DOC=test
      NO_CHANGELOG=test
      
      (cherry picked from commit 67598073)
  10. Jun 02, 2023
      test: ban direct calling of box.cfg() · 67dc40e2
      Oleg Chaplashkin authored
      Direct call and configuration of the runner instance is prohibited. Now,
      if you need to test something with a specific configuration, please use
      a server instance (see the luatest.Server module).
      
      In-scope-of tarantool/luatest#245
      
      NO_DOC=ban calling box.cfg
      NO_TEST=ban calling box.cfg
      NO_CHANGELOG=ban calling box.cfg
      
      (cherry picked from commit fc3426d8)
  11. May 29, 2023
      raft: fix spurious split-vote · 3e0229fb
      Serge Petrenko authored
      Due to a typo raft candidate counted a vote for another node as a vote
      for self in its split-vote detector. This could lead to spurious
      split-vote detection in cases when another node wins elections with a bare
      minimum of votes for it (exactly a quorum of votes).
      
      Closes #8698
      
      NO_DOC=bugfix
      
      (cherry picked from commit 2afde5b1)
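For illustration, a split-vote check can be sketched in Python as "no candidate can reach the quorum even if every node that has not voted yet votes for the current favourite". This is an assumed formulation, not Tarantool's raft code; `is_split_vote` and its arguments are hypothetical. The important detail is that each vote is tallied for the candidate it was actually cast for.

```python
from collections import Counter


def is_split_vote(votes, cluster_size, quorum):
    """votes maps a voter id to the candidate id it voted for this term."""
    tally = Counter(votes.values())          # votes per candidate
    undecided = cluster_size - len(votes)    # nodes that have not voted yet
    best = max(tally.values(), default=0)
    # Split vote: even the leading candidate cannot collect a quorum.
    return best + undecided < quorum
```

With three nodes and a quorum of two, a node that wins with exactly two votes does not count as a split vote here, which is the case the commit message says was misdetected.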
      raft: make promote bump term and vote at once · 27c550cf
      Serge Petrenko authored
      box.ctl.promote() was implemented as follows: an instance bumps the
      term and marks itself a candidate, but doesn't vote for self
      immediately. Instead it relies on the machinery which makes a candidate
      vote for self as soon as it persists a new term.
      
      This differs from a normal election start due to leader timeout: there
      term and vote are bumped at once.
      
      Besides, this increases probability of box.ctl.promote() resulting in
      another node getting elected: if a node first broadcasts a term without a
      vote, it is not considered a candidate, so other candidates might start
      elections and vote for themselves.
      
      Let's bring promote into line with automatic elections.
      
      Closes #8497
      
      NO_DOC=bugfix
      
      (cherry picked from commit 17371215)
      raft: persist vote for self together with term bump · 657e3f92
      Serge Petrenko authored
      Commit c9155ac8 ("raft: persist new term and vote separately") made
      the nodes persist new term and vote separately, using 2 WAL writes.
      Writing the term first is needed to flush all the ongoing transactions,
      so that the node's vclock is updated and can be checked against the
      candidate's vclock. Otherwise it could happen that the node persists a
      vote for some candidate only to find that its vclock would actually
      become incomparable with the candidate's.
      
      Actually, this guard is not needed when checking a vote for self,
      because a node can always vote for self. Besides, splitting term bump
      and vote can lead to increased probability of split-vote. It may happen
      that a candidate bumps and broadcasts the new term without a vote,
      making other nodes vote for self. Let's go back to writing term and vote
      together for self votes.
      
      This change makes raft candidate persist term bump and vote for self in
      one WAL write instead of two, so all the tests which count WAL writes or
      expect 2 separate state updates for term and vote are rewritten.
      
      Prerequisite #8497
      
      NO_DOC=not user-visible
      NO_CHANGELOG=not user-visible
      
      (cherry picked from commit 8a124e50)
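The difference in WAL writes can be sketched with a toy Python model. It assumes, as the commit message describes, that a vote for another candidate still persists the term first and the vote second, while a vote for self goes into one record together with the term bump. `Wal`, `Node`, `start_election` and `vote_for` are hypothetical names, and the vclock check is only marked by a comment.

```python
class Wal:
    def __init__(self):
        self.records = []

    def write(self, **state):
        self.records.append(state)


class Node:
    def __init__(self, node_id, wal):
        self.id = node_id
        self.term = 1
        self.vote = None
        self.wal = wal

    def start_election(self):
        # A node can always vote for itself, so the term bump and the
        # vote are persisted in a single WAL write.
        self.term += 1
        self.vote = self.id
        self.wal.write(term=self.term, vote=self.vote)

    def vote_for(self, candidate_id, new_term):
        # Voting for another node: persist the term first so ongoing
        # transactions are flushed, then (after the vclock check, which
        # is omitted here) persist the vote separately.
        self.term = new_term
        self.vote = None
        self.wal.write(term=self.term)
        self.vote = candidate_id
        self.wal.write(vote=self.vote)
```

Tests that count WAL writes see one record after `start_election` and two after `vote_for` in this model.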
  12. May 24, 2023
  13. May 23, 2023
      luajit: bump new version · 43297db7
      Igor Munkin authored
      * LJ_GC64: Make ASMREF_L references 64 bit.
      * lldb: introduce luajit-lldb
      * x64/LJ_GC64: Fix emit_rma().
      * Limit path length passed to C library loader.
      
      Part of #4808
      Part of #8069
      Part of #8516
      
      NO_DOC=LuaJIT submodule bump
      NO_TEST=LuaJIT submodule bump
      replication: replicaset state machine assert fail · 9128f50e
      Nikita Zheleztsov authored
      Currently replicaset state machine tracking the number of connected,
      loading and synced appliers may perform unnecessary decrementing of
      their count. On debug version this may lead to assertion failure.
      Here's the way it may happen:
        1. Any kind of exception occurs in applier thread and leads to
           invoking its destructor (applier_thread_data_destroy), which
           is set with scoped guard;
        2. Cbus call is made in order to remove the corresponding applier
           from the thread. According to the fact that cbus_call is
           synchronous, we yield, waiting for the result from the applier
           thread.
        3. During yielding user calls reconfiguration, which invokes
           replicaset_update. Old appliers are pruned: for every replica
           trigger on changing state machine counter is deleted after which
           we stop fiber and wait its join.
        4. If the first replica in replicaset_foreach is not the errored
           one and the errored fiber wakes up during yielding with
           fiber_join, then zero decrementing happens.
      
      Let's first clear the above-mentioned triggers for all replicas
      and only after that stop and join their applier fibers.
      
      Closes #7590
      
      NO_DOC=bugfix
      
      (cherry picked from commit 7ec82674)