  1. Sep 29, 2022
    • Serge Petrenko's avatar
      gc: replace vclockset_psearch with _match in wal_collect_garbage_f · d6fc95f6
      Serge Petrenko authored
      When using vclockset_psearch, the resulting vclock may be incomparable
      to the search key. For example, with a vclock set containing { } (the
      empty vclock), {0: 1, 1: 10} and {0: 2, 1: 11},
      vclockset_psearch(set, {0: 2, 1: 9}) might return {0: 1, 1: 10}, and
      not { }.
      This is known and avoided in other places, for example
      recover_remaining_wals(), where vclockset_match() is used instead.
      vclockset_match() starts with the same result as vclockset_psearch() and
      then unwinds the result until it finds the first vclock that is less than
      or equal to the search key.
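
The difference between the two lookups can be sketched with a small Python model. This is a simplified illustration, not Tarantool's code: the set is modeled as an ordered list rather than a tree, the helper names `vclock_compare`, `psearch_index` and `match` are made up for the example, and treating incomparable vclocks as equal during the search is an assumption about how the set comparator behaves.

```python
def vclock_compare(a, b):
    """Compare two vclocks given as {replica_id: lsn} dicts.

    Returns -1, 0 or 1, or None when the vclocks are incomparable.
    """
    le = ge = True
    for replica_id in set(a) | set(b):
        x, y = a.get(replica_id, 0), b.get(replica_id, 0)
        if x < y:
            ge = False
        elif x > y:
            le = False
    if le and ge:
        return 0
    if le:
        return -1
    if ge:
        return 1
    return None


def psearch_index(ordered, key):
    """Index of the last vclock the search treats as <= key, or -1."""
    found = -1
    for i, vclock in enumerate(ordered):
        order = vclock_compare(vclock, key)
        # Assumption: incomparable entries are treated as equal here.
        if order is None or order <= 0:
            found = i
    return found


def match(ordered, key):
    """Start from the psearch result and unwind to a vclock truly <= key."""
    i = psearch_index(ordered, key)
    while i >= 0:
        order = vclock_compare(ordered[i], key)
        if order is not None and order <= 0:
            return ordered[i]
        i -= 1
    return None


xlogs = [{}, {0: 1, 1: 10}, {0: 2, 1: 11}]
key = {0: 2, 1: 9}
psearch_result = xlogs[psearch_index(xlogs, key)]
match_result = match(xlogs, key)
```

With the set from the example, `psearch_result` is the incomparable `{0: 1, 1: 10}`, while `match_result` unwinds to the empty vclock.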
      
      Having vclockset_psearch in wal_collect_garbage_f could lead to issues
      even before local space changes started being written to the 0-th vclock
      component. Once a replica subscribes, its gc consumer is set to the
      vclock that the replica sent in the subscribe request. This vclock might
      be incomparable with the xlog vclocks of the master, leading to the same
      issue of potentially deleting a needed xlog during gc.
      
      Closes #7584
      
      NO_DOC=bugfix
      
      (cherry picked from commit c63bfb9a)
      d6fc95f6
  2. Sep 28, 2022
    • Georgiy Lebedev's avatar
      memtx: fix transaction manager MVCC invariant violation · 1fac9eef
      Georgiy Lebedev authored
      We hold the following invariant in MVCC: the story at the top of the
      history chain is present in index.
      
      If a story is subject to be deleted from index and there is an older story
      in the history chain, the older story starts to be at the top of the
      history chain and is not present in index, which violates our invariant:
      explicitly check for this case when evaluating whether a story can be
      garbage collected and add an assertion to check the invariant above is not
      violated.
      
      Rollbacked stories need to be handled in a special way: they are
      present at the end of some history chains and completely unlinked from
      others (which also implies they are not present in the corresponding
      indexes).
      
      `memtx_tx_story_full_unlink` is called in two contexts: space deletion, in
      which we delete all stories, and garbage collection step — the former case
      can break the invariant described above, while the latter must preserve it,
      hence add two different functions for the corresponding contexts.
      
      Closes #7490
      
      NO_CHANGELOG=<internal bugfix not user observable>
      NO_DOC=<bugfix>
      
      (cherry picked from commit c8eccfbb)
      1fac9eef
    • Georgiy Lebedev's avatar
      memtx: rework transaction rollback · 61be2c8f
      Georgiy Lebedev authored
      When we rollback a transaction statement, we relink its read trackers
      to a newer story in the history chain, if present (6c990a7b), but we do not
      handle the case when there is no newer story.
      
      If there is an older story in the history chain, we can relink the
      rollbacked story's reader to it, but if the rollbacked story is the
      only one left, we need to retain it, because it stores the reader list
      needed for conflict resolution — such stories are distinguished by the
      rollbacked flag, and there can be no more than one such story located
      strictly at the end of a given history chain (which means a story can be
      fully unlinked from some indexes and present at the end of others).
      
      There are several nuances we need to account for:
      
      Firstly, such rollbacked stories must be impossible to read from an index:
      this is ensured by `memtx_tx_story_is_visible`.
      
      Secondly, rollbacked transactions need to be treated as prepared with
      stories that have `add_psn == del_psn`, so that they are correctly deleted
      during garbage collection.
      
      After this logical change we have the following partially ordered set over
      tuple stories:
      ------------------------------------------------------> serialization time
      |---------------|-----------|----------|-------------|----------------
      | No more than  | Committed | Prepared | In-progress | One dirty
      | one rollbacked|           |          |             | story in index
      | story         |           |          |             |
      |---------------|-----------|----------|-------------|----------------
      
      Closes #7343
      
      NO_DOC=bugfix
      
      (cherry picked from commit 56cf737c)
      61be2c8f
    • Georgiy Lebedev's avatar
      memtx: remove redundant `space` field from `struct memtx_story` · 8ee6a2f2
      Georgiy Lebedev authored
      `struct memtx_story` has a `space` field, which is basically used
      to identify that a tuple is unlinked from the history chain in
      `memtx_tx_index_invisible_count_slow` (though this can be determined by its
      presence in the index) and is used to get the space's index in
      `memtx_tx_story_link_top` (though it can be retrieved from the older
      story's link field): remove this redundant field.
      
      Needed for #7343
      
      NO_CHANGELOG=<refactoring>
      NO_DOC=<refactoring>
      NO_TEST=<refactoring>
      
      (cherry picked from commit 55e64a8d)
      8ee6a2f2
    • Georgiy Lebedev's avatar
      memtx: refactor story cleanup on space delete · 0533d097
      Georgiy Lebedev authored
      When a space is deleted, all transactions need to be aborted and all their
      stories need to be removed immediately and out of order. Currently we
      artificially roll back statements; instead, call this statement
      removal to logically distinguish it from rollback. It differs in that
      the whole space's tuple history is torn down: no more transaction
      management is going to be done, as opposed to the rollback of an
      individual transaction.
      
      Needed for #7343
      
      NO_CHANGELOG=refactoring
      NO_DOC=refactoring
      NO_TEST=refactoring
      
      (cherry picked from commit 88203d4f)
      0533d097
    • Georgiy Lebedev's avatar
      memtx: refactor `memtx_tx_history_rollback_stmt` · 81fede2f
      Georgiy Lebedev authored
      Follow `memtx_tx_history_{add, prepare}_{insert, delete}` pattern: split
      code responsible for rollbacking addition and deletion of a story into
      separate functions.
      
      Needed for #7343
      
      NO_CHANGELOG=refactoring
      NO_DOC=refactoring
      NO_TEST=refactoring
      
      (cherry picked from commit 9dd27681)
      81fede2f
    • Georgiy Lebedev's avatar
      memtx: refactor removing of story's delete statements · a3f136cf
      Georgiy Lebedev authored
      When a statement gets rollbacked, we need to remove delete statements
      attached to the story it adds by relinking them and making them delete an
      older story in the history chain: refactor this loop out into a separate
      function.
      
      Needed for #7343
      
      NO_CHANGELOG=refactoring
      NO_DOC=refactoring
      NO_TEST=refactoring
      
      (cherry picked from commit 1da727f6)
      a3f136cf
    • Georgiy Lebedev's avatar
      memtx: refactor sinking of story added by prepared statement · f0c3ccb8
      Georgiy Lebedev authored
      If a statement becomes prepared, the story it adds must be 'sunk' to
      the level of prepared stories: refactor this loop into a
      separate function.
      
      Needed for #7343
      
      NO_CHANGELOG=refactoring
      NO_DOC=refactoring
      NO_TEST=refactoring
      
      (cherry picked from commit b25d3729)
      f0c3ccb8
  3. Sep 26, 2022
    • Vladislav Shpilevoy's avatar
      xrow: fix crash on nested map/array update ops · 3def2916
      Vladislav Shpilevoy authored
      If an update operation tried to insert a new key into a map or an
      array which was created by a previous update operation, then the
      process would fail an assertion.
      
      That was because the first operation was stored as a bar update.
      The second operation tried to branch it assuming that the entire
      bar update's JSON path must exist, but it wasn't so for the newly
      created part of the path.
      
      The solution is to fall back to branching before the entire
      bar path ends, if we can see that the next part of the path can't be
      found.
      
      Closes #7705
      
      NO_DOC=bugfix
      
      (cherry picked from commit 8425ebfc)
      3def2916
  4. Sep 23, 2022
    • Georgiy Lebedev's avatar
      memtx: track `index:random` reads and clarify result · 2af84a85
      Georgiy Lebedev authored
      The TREE (HASH) index implements the `random` method. If the space is
      empty from the transaction's perspective, which means we have to return
      nothing, add gap tracking of the whole range (full scan tracking), since
      this result is equivalent to `index:select{}`. Otherwise, repeatedly call
      `random` and clarify the result until we get a non-empty one.
      We do not care about performance here, since all operations in context of
      transaction management currently have O(number of dirty tuples)
      complexity.
      
      Closes #7670
      
      NO_DOC=bugfix
      
      (cherry picked from commit 1b82beb2)
      2af84a85
    • Vladimir Davydov's avatar
      salad: add LIGHT(random) method · a647b1d8
      Vladimir Davydov authored
      This commit moves the code that gets the index of a random light
      record from the memtx hash index implementation to a new light method.
      This gives us more freedom of refactoring the light internals without
      modifying the code using it.
      
      After this change, LIGHT(pos_valid) isn't needed anymore so it's
      inlined in LIGHT(random).
      
      Needed for #7192
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit 76add786)
      a647b1d8
    • Georgiy Lebedev's avatar
      memtx: refactor `index_def_new` · 07c0d3a6
      Georgiy Lebedev authored
      Since `key_def_merge` sets the merged key definition's unique part count
      equal to the new part count, the extra assignment in case the index is not
      unique is redundant: remove it.
      
      NO_CHANGELOG=<refactoring>
      NO_DOC=<refactoring>
      NO_TEST=<refactoring>
      
      (cherry picked from commit 1d6c92e5)
      07c0d3a6
    • Georgiy Lebedev's avatar
      memtx: fix TREE index `get` check for part count · b9d62fca
      Georgiy Lebedev authored
      If TREE index `get` result is empty, the key part count is incorrectly
      compared to the tree's `cmp_def->part_count`, though it should be compared
      with `cmp_def->unique_part_count`. But we can actually assume that by the
      time we get to the index's `get` method the part count is equal to the
      unique part count (partial keys are rejected and `get` is not
      supported for non-unique indexes): change check to correct assertion.
      
      Closes #7685
      
      NO_DOC=<bugfix>
      
      (cherry picked from commit bfcd8ca7)
      b9d62fca
  5. Sep 21, 2022
    • Boris Stepanenko's avatar
      limbo: fix assertions in box_issue_de/promote · 65b3bad6
      Boris Stepanenko authored
      Replaced the assertions that no one started new elections or got promoted
      while acquiring the limbo with checks that the raft term and the limbo term
      didn't change. If they did, don't write DEMOTE/PROMOTE and just release
      the limbo, because it is already owned, or soon will be, by someone else.
      
      Closes #7086
      
      NO_DOC=Bugfix
      
      (cherry picked from commit 8ee0e434)
      65b3bad6
  6. Sep 16, 2022
  7. Sep 15, 2022
    • Yaroslav Lobankov's avatar
      test: bump test-run to new version · 18b8a80d
      Yaroslav Lobankov authored
      Bump test-run to new version with the following improvements:
      
      - Improve getting iproto port for tarantool < 2.4.1 [1]
      
      [1] https://github.com/tarantool/test-run/pull/349
      
      NO_DOC=testing stuff
      NO_TEST=testing stuff
      NO_CHANGELOG=testing stuff
      
      (cherry picked from commit 4668db62)
      18b8a80d
    • Ilya Verbin's avatar
      cmake: add extra security compiler options · ce4b08eb
      Ilya Verbin authored
      Introduce cmake option ENABLE_HARDENING, which is TRUE by default for
      non-debug regular and static builds, excluding AArch64 and FreeBSD.
      It passes compiler flags that harden Tarantool (including the bundled
      libraries) against memory corruption attacks. The following flags are
      passed:
      
      * -Wformat - Check calls to printf and scanf, etc., to make sure that
        the arguments supplied have types appropriate to the format string
        specified.
      
      * -Wformat-security -Werror=format-security - Warn about uses of format
        functions that represent possible security problems. And make the
        warning into an error.
      
      * -fstack-protector-strong - Emit extra code to check for buffer
        overflows, such as stack smashing attacks.
      
      * -fPIC -pie - Generate position-independent code (PIC). This makes it
        possible to take advantage of Address Space Layout Randomization (ASLR).
      
      * -z relro -z now - Resolve all dynamically linked functions at the
        beginning of the execution, and then make the GOT read-only.
      
      Also do not disable hardening for Debian and RPM-based Linux distros.
      
      Closes #5372
      Closes #7536
      
      NO_DOC=build
      NO_TEST=build
      
      (cherry picked from commit e6abe1c9)
      ce4b08eb
    • Georgiy Lebedev's avatar
      memtx: fix 'use after free' of garbage collected MVCC stories · 0daf8382
      Georgiy Lebedev authored
      `directly_replaced` stories can potentially get garbage collected in
      `memtx_tx_handle_gap_write`, which is unexpected and leads to 'use after
      free': in order to fix this, limit garbage collection points only to
      external API calls.
      
      Wrap all possible garbage collection points with explicit warnings (see
      c9981a56).
      
      Closes #7449
      
      NO_DOC=bugfix
      
      (cherry picked from commit 18e042f5)
      0daf8382
  8. Sep 14, 2022
    • Alexander Turenko's avatar
      lua/merger: fix use-after-free during iteration · f9aecfb8
      Alexander Turenko authored
      All merge sources (including the merger itself) share the same
      `<merge source>:pairs()` implementation, which returns a `gen, param,
      state` triplet. `gen` is `lbox_merge_source_gen()`, `param` is `nil`,
      and `state` is the merge source.
      
      `lbox_merge_source_gen()` returns `source, tuple`. The returned
      source is supposed to be the same object as the one passed to the function
      (`gen(param, state)`), so the function assumes the object is alive and
      doesn't increment the source's refcounter on entry or decrement it on
      exit.
      
      This logic is sound, but there was a mistake in the implementation:
      the function returns a new cdata object (which holds the same pointer to
      the merge source structure) instead of the same cdata object.
      
      The new cdata object neither increases the source's refcounter when
      pushed to Lua nor decreases it when collected. As a result, if we lose
      the original merge source object (and the first `state` that is returned
      from `:pairs()`), the source structure may be freed. The pointer in the
      new cdata object then becomes invalid.
      
      A sketchy code that illustrates the problem:
      
      ```lua
      gen, param, state0 = source:pairs()
      assert(state0 == source)
      source = nil
      state1, tuple = gen(param, state0)
      state0 = nil
      -- assert(state1 == state0) -- would fail
      collectgarbage()
      -- The cdata object that was referenced as `source` and as `state0`
      -- is collected. The GC handler is called and drops the merge
      -- source structure refcounter to zero. The structure is freed.
      -- The call below will crash.
      gen(param, state1)
      ```
      
      In the fixed code `state1 == source`, so the GC handler is not called
      prematurely: we have the merge source object alive till the end of the
      iterator or till the stop of the traversal.
      
      Fixes #7657
      
      NO_DOC=a crash is definitely not what we want to document
      
      (cherry picked from commit 3bc64229)
      f9aecfb8
  9. Sep 13, 2022
    • Yaroslav Lobankov's avatar
      test: slight refactoring of replication-py tests · 852770c2
      Yaroslav Lobankov authored
      - Remove unused imports
      - Remove unnecessary creation of 'replica' instance objects
      - Use `<instance>.iproto.uri` object attribute instead of calling
        `box.cfg.listen` via admin connection
      
      NO_DOC=testing stuff
      NO_TEST=testing stuff
      NO_CHANGELOG=testing stuff
      
      (cherry picked from commit d13b06bd)
      852770c2
    • Yaroslav Lobankov's avatar
      test: bump test-run to new version · 96dfd98b
      Yaroslav Lobankov authored
      Bump test-run to new version with the following improvements:
      
      - Report job summary on GitHub Actions [1]
      - Free port auto resolving for TarantoolServer and AppServer [2]
      
      Also, this patch includes the following changes:
      
      - removing `use_unix_sockets` option from all suite.ini config files
        due to permanent using Unix sockets for admin connection recently
        introduced in test-run
      - switching replication-py tests to Unix sockets for iproto connection
      - fixing replication-py/swap.test.py and swim/swim.test.lua tests
      
      [1] tarantool/test-run#341
      [2] tarantool/test-run#348
      
      NO_DOC=testing stuff
      NO_TEST=testing stuff
      NO_CHANGELOG=testing stuff
      
      (cherry picked from commit 4335b442)
      96dfd98b
    • Georgiy Lebedev's avatar
      memtx: track read story when conflicting full scans due to gap write · 23b7d3cb
      Georgiy Lebedev authored
      When conflicting transactions that made full scans in
      `memtx_tx_handle_gap_write`, we need to also track that the conflicted
      transaction has read the inserted tuple, just like we do in gap tracking
      for ordered indexes — otherwise another transaction can overwrite the
      inserted tuple in which case no gap tracking will be handled.
      
      Closes #7493
      
      NO_DOC=bugfix
      
      (cherry picked from commit 7f52f445)
      23b7d3cb
  10. Sep 12, 2022
    • Vladimir Davydov's avatar
      Use MT-Safe strerror_r instead of strerror · 03ceaafc
      Vladimir Davydov authored
      strerror() is MT-Unsafe, because it uses a static buffer under the hood.
      We should use strerror_r() instead, which takes a user-provided buffer.
      The problem is there are two implementations of strerror_r(): XSI and
      GNU. The first one returns an error code and always writes the message
      to the beginning of the buffer while the second one returns a pointer to
      a location within the buffer where the message starts. Let's introduce a
      macro HAVE_STRERROR_R_GNU set if the GNU version is available and define
      tt_strerror() which writes the message to the static buffer, like
      tt_cstr() or tt_sprintf().
      
      Note, we have to export tt_strerror(), because it is used by Lua via
      FFI. We also need to make it available in the module API header, because
      the say_syserror() macro uses strerror() directly. In order to avoid
      adding tt_strerror() to the module API, we introduce an internal helper
      function _say_strerror(), which calls tt_strerror().
      
      NO_DOC=bug fix
      NO_TEST=code is covered by existing tests
      
      (cherry picked from commit 44f46dc8)
      03ceaafc
  11. Sep 09, 2022
    • Alexander Turenko's avatar
      popen: fix a race between setpgrp() and killpg() · 99040255
      Alexander Turenko authored
      In brief: `vfork()` on Mac OS 12 and newer doesn't suspend the parent
      process, so we should wait for `setpgrp()` to use `killpg()`. See more
      detailed description of the problem in a comment of the
      `popen_wait_group_leadership()` function.
      
      The solution is to spin in a loop and check the child's process group. It
      looks like the simplest and most direct solution. Other possible solutions
      require weighing the pros and cons of using an extra file descriptor or
      assigning a signal number for child -> parent communication.
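
The spin-and-check approach can be sketched in Python as follows. This is an illustrative model only: Python's `os.fork()` is not `vfork()` and does not reproduce the Mac OS behavior, the function names are hypothetical, and the timeout and polling interval are arbitrary values chosen for the example.

```python
import os
import signal
import time


def spawn_group_leader():
    """Fork a child that makes itself a process group leader, then idles."""
    pid = os.fork()
    if pid == 0:
        try:
            os.setpgid(0, 0)  # same effect as setpgrp() in the child
            time.sleep(30)
        finally:
            os._exit(0)
    return pid


def wait_group_leadership(pid, timeout=5.0, interval=0.001):
    """Spin until the child's process group id equals its pid."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if os.getpgid(pid) == pid:
                return True
        except ProcessLookupError:
            return False  # the child is already gone
        time.sleep(interval)
    return False


def spawn_and_kill_group():
    """Return (became_leader, wait_status) for a spawned child."""
    pid = spawn_group_leader()
    became_leader = wait_group_leadership(pid)
    if became_leader:
        os.killpg(pid, signal.SIGKILL)  # safe: the group exists now
    else:
        os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    return became_leader, status
```

Calling `spawn_and_kill_group()` on a POSIX system signals the group only after the parent has observed that the child leads its own process group.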
      
      There are the following alternatives and variations:
      
      * Create a pipe and notify the parent from the child about the
        `setpgrp()` call.
      
        It costs an extra file descriptor, so I decided not to do that.
        However, if we need some channel to deliver information from the
        child to the parent for another task, it will be worth reimplementing
        this function too.
      
        One possible place, where we may need such channel is delivery of
        child's errors to the parent. Now the child writes them directly to
        logger's fd and it requires some tricky code to keep and close the
        descriptor at right points. Also it doesn't allow to catch those
        errors in the parent, but we may need it for #4925.
      * Notify the parent about `setpgrp()` using a signal.
      
        It seems too greedy to assign a specific signal for such a local
        problem. It is also unclear how to guarantee that it won't break any
        user's code: a user can load a dynamic library, which uses some
        signals on its own.
      
        However, we can consider using this approach here if we design some
        common interprocess notification system.
      * We can use the fiber cond or the `popen_wait_timeout()` function from
        PR #7648 to react to the child termination instantly.
      
        It would complicate the code and still wouldn't allow reacting
        instantly to `setpgrp()` in the child.
      
        Also it assumes yielding during the wait (see below).
      * Wait until `setpgrp()` in `popen_send_signal()` instead of
        `popen_new()`.
      
        It would add yielding/waiting inside `popen_send_signal()` and likely
        will extend a set of its possible exit situations. It is undesirable:
        this function should have simple and predictable behavior.
      * Finally, we considered yielding in `popen_wait_group_leadership()`
        instead of sleeping the whole tx thread.
      
        `<popen handle>:new()` doesn't yield at the moment and a user's code
        may lean on this fact.
      
        Yielding would allow better throughput (the number of parallel
        requests per second), but we don't care much about performance on
        Mac OS. The primary goal for this platform is to offer the same
        behavior as on Linux to allow development of applications.
      
      I didn't replace `vfork()` with `fork()` on Mac OS, because `vfork()`
      works and I don't know the consequences of calling `pthread_atfork()`
      handlers in a child created by popen. See the comment in `popen_new()`
      near the `vfork()` call: it warns about possible mutex double locks. This
      topic will be investigated further in #6674.
      
      Fixes #7658
      
      NO_DOC=fixes incorrect behavior, no need to document the bug
      NO_TEST=already tested by app-tap/popen.test.lua
      
      (cherry picked from commit e2207fdc)
      99040255
  12. Sep 07, 2022
    • Vladislav Shpilevoy's avatar
      raft: persist new term and vote separately · 61a07baf
      Vladislav Shpilevoy authored
      If a node persisted a foreign term + vote request at the same
      time, it increased split-brain probability. A node could vote for
      a candidate having smaller vclock than the local one. For example,
      via the following scenario:
      
      - Node1, node2, node3 are started;
      - Node1 becomes a leader;
      - The topology becomes node1 <-> node2 <-> node3 due to network
          issues;
      - Node1 sends a synchro txn to node2. The txn starts a WAL write;
      - Node3 bumps term and votes for self. Sends it all to node2;
      - Node2 votes for node3, because their vclocks are equal;
      - Node2 finishes all pending WAL writes, including the txn from
          node1. Now its vclock is > node3's one and the vote was wrong.
      - Node3 wins, writes PROMOTE, and it conflicts with node1 writing
          CONFIRM.
      
      This patch makes it so that a node can't persist a vote in a new term in
      the same WAL write as the term bump. Term bump is written first
      and alone. It serves as a WAL sync after which the node's vclock
      is not supposed to change except for the 0 (local) component.
      
      The vote requests are re-checked after term bump is persisted to
      see if they still can be applied.
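
The scenario above can be replayed with a toy Python model. It is a sketch under simplifying assumptions, not the raft implementation: the `Node` class, `wal_sync()` and both vote functions are hypothetical names, and the WAL is reduced to a list of pending writes that complete before any later write.

```python
class Node:
    """Toy replica: a vclock plus WAL writes that are still in flight."""

    def __init__(self, vclock, pending):
        self.term = 1
        self.vote = None
        self.vclock = dict(vclock)
        self.pending = list(pending)  # [(replica_id, lsn), ...]

    def wal_sync(self):
        """Finish every queued WAL write, advancing the vclock."""
        for replica_id, lsn in self.pending:
            self.vclock[replica_id] = lsn
        self.pending.clear()


def candidate_is_up_to_date(node, candidate_vclock):
    """True if the candidate's vclock is >= ours, ignoring component 0."""
    return all(candidate_vclock.get(rid, 0) >= lsn
               for rid, lsn in node.vclock.items() if rid != 0)


def vote_with_term(node, term, candidate_id, candidate_vclock):
    """Old behaviour: decide first, persist term and vote in one write."""
    grant = candidate_is_up_to_date(node, candidate_vclock)
    node.wal_sync()  # the combined write lands after the pending txn
    node.term = term
    if grant:
        node.vote = candidate_id


def vote_after_term(node, term, candidate_id, candidate_vclock):
    """New behaviour: persist the term alone, then re-check the vote."""
    node.wal_sync()  # the term bump acts as a WAL sync
    node.term = term
    if candidate_is_up_to_date(node, candidate_vclock):
        node.vote = candidate_id


# Node2 has a synchro txn from node1 in flight when node3 asks for a vote.
old = Node({1: 10}, pending=[(1, 11)])
vote_with_term(old, 2, 3, {1: 10})
new = Node({1: 10}, pending=[(1, 11)])
vote_after_term(new, 2, 3, {1: 10})
```

In this model `old.vote` ends up as 3 even though the node's vclock has moved past the candidate's, while `new.vote` stays `None`.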
      
      Part of #7253
      
      NO_DOC=bugfix
      
      (cherry picked from commit c9155ac8)
      61a07baf
    • Vladislav Shpilevoy's avatar
      qsync: fix txn fiber hang on fencing at CONFIRM · 618bafe6
      Vladislav Shpilevoy authored
      If the limbo was fenced during CONFIRM WAL write, then the
      confirmed txn was committed just fine, but its author-fiber kept
      hanging. This is because when it was woken up, it checked if the
      limbo is frozen and went to infinite waiting before actually
      checking if the txn is completed.
      
      As a workaround, the fiber would unfreeze if it was woken up
      explicitly.
      
      The fix is simple - change the checks order.
      
      Part of #7253
      
      NO_DOC=bugfix
      
      (cherry picked from commit ec628100)
      618bafe6
    • Vladislav Shpilevoy's avatar
      promote: abort it when become non-candidate · cbebd024
      Vladislav Shpilevoy authored
      box.ctl.promote() bumps the term, makes the node a candidate, and
      waits for the term outcome. The waiting used to be until there is
      a leader elected or the node lost connection quorum or the term
      was bumped again.
      
      There was a bug where a node could hang in box.ctl.promote() even
      after it became a voter. It could happen if the quorum was still there
      and a leader couldn't be elected in the current term at all. For
      instance, others could have `election_mode='off'`.
      
      The fix is to stop waiting for the term outcome if the node can't
      win anyway.
      
      NO_DOC=bugfix
      
      (cherry picked from commit ab08dad9)
      cbebd024
    • Vladislav Shpilevoy's avatar
      promote: fix infinite elections with multi-promote · b200d298
      Vladislav Shpilevoy authored
      If box.ctl.promote() was called on more than one instance, then it
      could lead to infinite or extremely long elections bumping
      thousands of terms in just a few seconds.
      
      This was because box.ctl.promote() used to be a loop. The loop
      retried term bump + voted for self until the node won. Retry
      happened immediately as the node saw the term was bumped again
      and there was no leader elected or the connection quorum was lost.
      
      If 2 nodes would start box.ctl.promote() almost at the same time,
      they could bump each other's terms, not see any winner, bump them
      again, and so on. For example:
      
      - Node1 term=1, node2 term=2;
      - Promote is called on both;
      - Node1 term=2, node2 term=3. They receive the messages. Node2
          ignores node1's old term. Node1 term is bumped and it votes
          for node2, but it didn't win, so box.ctl.promote() bumps its
          term to 4.
      - Node2 receives term 4 from node1. Its own box.ctl.promote() sees
          the term was bumped and no winner, so it bumps it to 5 and the
          process continues for a long time.
      
      It worked well enough in tests - the problem happened sometimes,
      terms could roll like 80k times in a few seconds, but the tests
      ended fine anyway.
      
      One of the next commits will make term bump + vote written in
      separate WAL records. That aggravates the problem drastically.
      
      Basically, this mutual term bump loop could end only if one node
      would receive vote for self from another node and send back the
      message 'I am a leader' before the other node's box.ctl.promote()
      notices the term was bumped externally. This will get much harder
      to achieve.
      
      The patch simply drops the loop. Let box.ctl.promote() fail if the
      term was bumped outside.
      
      There was an alternative to keep running it in a loop with a
      randomized election timeout like it works inside of raft. But the
      current solution is just simpler.
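
The mutual term bump can be illustrated with a deliberately crude Python model. It is only a sketch: it ignores votes, leaders and timing, the function name `mutual_promote` is hypothetical, and the starting terms are taken from the example above.

```python
def mutual_promote(retry, max_rounds=100):
    """Count term bumps when two nodes call promote at the same time.

    retry=True models the old loop; retry=False models failing once the
    term is bumped by someone else.
    """
    terms = [1, 2]
    terms = [t + 1 for t in terms]  # both nodes bump their own term
    bumps = 2
    for _ in range(max_rounds):
        if terms[0] == terms[1]:
            break
        lagging = 0 if terms[0] < terms[1] else 1
        newest = max(terms)
        if not retry:
            terms[lagging] = newest  # adopt the term; promote() fails
            break
        terms[lagging] = newest + 1  # bump again without a winner
        bumps += 1
    return bumps, terms
```

With `retry=True` the number of bumps grows with every simulated round, whereas with `retry=False` it stops at the two initial bumps and both nodes settle on the same term.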
      
      NO_DOC=bugfix
      NO_TEST=election_split_vote_test.lua catches it already
      
      (cherry picked from commit dd89c57e)
      b200d298
  13. Sep 06, 2022
  14. Sep 05, 2022
    • Ilya Grishnov's avatar
      uri: fix resolve with only port specification · 399bea26
      Ilya Grishnov authored
      Supplemented the implementation of the `src/lib/uri` parser.
      Before this fix, the call `uri.parse(uri.format(uri.parse(3301)))`
      returned an 'Incorrect URI' error.
      Now this call returns the correct `service: '3301'`.
      As a result, using host=localhost by default has been restored
      for `tarantoolctl connect`, as well as for `console.connect`.
      
      Fixes #7479
      
      NO_DOC=bugfix
      
      (cherry picked from commit 96d8dcec)
      399bea26
    • Alexander Turenko's avatar
      test: always perform assertions in module API test · 3247d3d1
      Alexander Turenko authored
      This commit pursues several goals:
      
      * Eliminate unused parameter/variable warnings at building module_api.c
        in non-debug configuration. The problem was introduced in commit
        5c1bc3da ("decimal: add the library into the module API").
      * Eliminate the need to check newly added tests in two build
        configurations (Debug and RelWithDebInfo) and to remember to add
        `(void)x;` statements in addition to a test condition check.
      * Fail the testing if conditions required by the
        app-tap/module_api.test.lua test are not met -- not only in the Debug
        build, but also in RelWithDebInfo.
      
      Fixes #7625
      
      NO_DOC=a change in a test, purely development matter
      NO_CHANGELOG=see NO_DOC
      
      (cherry picked from commit aaf3bf91)
      3247d3d1
    • Ilya Verbin's avatar
      box: fix high CPU usage while on_shutdown triggers are running · 69a8a649
      Ilya Verbin authored
      
      Currently this script causes 100% CPU usage for 10 sec, because
      os.exit() infinitely yields to the scheduler until on_shutdown
      fiber completes and breaks the event loop. Fix this by a sleep.
      
      ```
      box.ctl.set_on_shutdown_timeout(100)
      box.ctl.on_shutdown(function() require('fiber').sleep(10) end)
      os.exit()
      ```
      
      Closes #6801
      
      NO_DOC=bugfix
      NO_TEST=don't know how to catch this by a test
      
      Co-authored-by: Georgy Moshkin <louielouie314@gmail.com>
      (cherry picked from commit 6d91e44b)
      69a8a649
    • Ilya Verbin's avatar
      main: run an event loop for on_shutdown triggers · 53347bcf
      Ilya Verbin authored
      When Tarantool is stopped by Ctrl+D or by reaching the end of the
      script, run_script_f() breaks the event loop, then tarantool_exit()
      is called from main(), however the fibers that execute on_shutdown
      triggers can no longer be scheduled, because the event loop is
      already stopped. Fix this by starting an auxiliary event loop for
      such cases.
      
      Closes #7434
      
      NO_DOC=bugfix
      
      (cherry picked from commit cdd5674c)
      53347bcf
  15. Sep 02, 2022
    • Vladimir Davydov's avatar
      Revert "log: free resources while event loop is running" · 8a97dccd
      Vladimir Davydov authored
      This reverts commit 0c3f9b37.
      
      If log_destroy and log_boot use the same fd (STDERR_FILENO), say()
      called after say_logger_free() will write to a closed fd. What's worse,
      the fd may be reused, in which case say() will write to a completely
      unrelated file or socket (maybe a data file!). This is what happened
      with flightrec - flightrec finalization info message was written to
      an xlog file. Let's move say_logger_free() back to where it belongs -
      after the other subsystems have been finalized.
      
      Since 2.10.2 was released, this commit also adds a changelog.
      
      Reopens #4450
      Needed for https://github.com/tarantool/tarantool-ee/issues/223
      
      NO_DOC=bug fix
      NO_TEST=revert
      
      (cherry picked from commit 5cb688ed)
      8a97dccd
  16. Sep 01, 2022
  17. Aug 31, 2022
    • Nikolay Shirokovskiy's avatar
      box: fix unauthorized inserts into _truncate table · 01c9ea9e
      Nikolay Shirokovskiy authored
      A non-privileged user (through the public role) has write access to the
      _truncate table in order to be able to truncate their own tables. Normally
      such a user should be able to modify records only for the tables they have
      write access to. Yet currently, due to the bootstrap check, this is not so.
      
      Closes tarantool/security#5
      
      NO_DOC=bugfix
      
      (cherry picked from commit 941318e7)
      01c9ea9e
    • Nikolay Shirokovskiy's avatar
      box: make part simplicity check easier · f2b8d63b
      Nikolay Shirokovskiy authored
      A simple part is a part without any extra keys besides 'field' and 'type'.
      Let's make the check in try_simplify_index_parts itself.
      
      NO_TEST=refactoring
      NO_DOC=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit bc0872fd)
      f2b8d63b
    • Nikolay Shirokovskiy's avatar
      box: fix inheriting format options for old-style parts · 4deb7663
      Nikolay Shirokovskiy authored
      If index parts are specified using old syntax like:
      
      	parts = {1, 'number', 2, 'string'},
      
      then (unless the part count is 1) index options set in the space format
      are not taken into account. The solution is to continue after parsing
      1.6.0-style parts so that the code that checks format options is used.
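
The control-flow change can be sketched with a small Python model. It is hypothetical throughout: `normalize_parts` is not a Tarantool function, the space format is reduced to a dict keyed by field number, and `is_nullable` and `collation` are assumed here to be the options inherited from the format.

```python
def normalize_parts(parts, space_format):
    """Turn index parts into dicts and inherit options from the format.

    `parts` is either the old flat form [1, 'number', 2, 'string'] or a
    list of dicts; `space_format` maps a 1-based field number to options.
    """
    if parts and not isinstance(parts[0], dict):
        # Old style: consecutive (field, type) pairs.
        parts = [{'field': parts[i], 'type': parts[i + 1]}
                 for i in range(0, len(parts), 2)]
        # The fix: fall through instead of returning here.
    result = []
    for part in parts:
        part = dict(part)
        fmt = space_format.get(part['field'], {})
        # Assumed set of options that a part inherits from the format.
        for option in ('is_nullable', 'collation'):
            if option not in part and option in fmt:
                part[option] = fmt[option]
        result.append(part)
    return result


space_format = {2: {'type': 'string', 'collation': 'unicode_ci'}}
old_style = normalize_parts([1, 'number', 2, 'string'], space_format)
```

In this model the second part of `old_style` picks up the `collation` declared in the format, which is what an early return after parsing the old syntax would have skipped.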
      
      Closes #7614
      
      NO_DOC=bugfix
      
      (cherry picked from commit 91ba0a59)
      4deb7663
  18. Aug 30, 2022
    • Nikita Zheleztsov's avatar
      core: mark some internal fibers as system ones · 00fab37c
      Nikita Zheleztsov authored
      Currently internal tarantool fibers can be cancelled from the user's app,
      which can lead to critical errors.
      
      Let's mark these fibers as system ones in order to be sure that they
      won't be cancelled from the Lua world.
      
      Closes #7448
      Closes #7473
      
      NO_DOC=minor change
      
      (cherry picked from commit 3733ff25)
      00fab37c