  1. Jun 29, 2022
    • clock: check for clock_gettime return value · fd3eb7e9
      Ilya Verbin authored
      clock_gettime() returns 0 on success, or -1 on failure.
      Add the missing checks for the return value.
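
A minimal sketch of the pattern this commit applies, in C. The helper name `checked_clock_seconds` is hypothetical and is not taken from the Tarantool sources; only `clock_gettime()` and its 0 / -1 return convention come from the message above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper (not Tarantool's actual code): read a clock and
 * report failure instead of using an uninitialized timespec.
 * Returns 0 on success, -1 on failure, mirroring clock_gettime().
 */
static int
checked_clock_seconds(clockid_t clock_id, double *out)
{
	struct timespec ts;
	assert(out != NULL);
	if (clock_gettime(clock_id, &ts) != 0) {
		/* errno is set by clock_gettime(); report it and bail out. */
		perror("clock_gettime");
		return -1;
	}
	*out = (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
	return 0;
}
```

A caller would check the result before using the value, for example `if (checked_clock_seconds(CLOCK_MONOTONIC, &now) != 0) return -1;`.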
      
      Part of #5869
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      fd3eb7e9
    • main: free port, random, and memory · 04e59ab1
      Vladislav Shpilevoy authored
      These 3 modules are low-hanging fruit which can be freed right now
      at return from main() without any effort.
      
      There are still a lot of other modules whose freeing is not that
      easy. Here are a few hard-to-untangle knots, and there are more:
      
      - Session, credentials, iproto, and fibers are tied together via
        the latter. Each fiber potentially has a session, its current
        credentials object. Each iproto connection has a session and a
        file descriptor which is stored in the session too.
      
        The possible solution would be to walk all the fibers and
        destroy them before proceeding to destroy everything else.
      
      - Tuples depend on memtx, and Lua depends on tuples, because there
        are tuples allocated on memtx->arena. Hence destruction of memtx
        and its arena makes the tuples still stored in Lua invalid.
      
        It seems Lua should be destroyed first, not last. It would free
        all the refs which might be kept at objects in C in the other
        modules.
      
      - IProto connections leak when iproto is destroyed. They are not
        freed and their descriptors are not closed properly. That
        requires additional preparatory work to destroy them correctly
        on iproto module deconstruction.
      
      Given the amount of work, it should be done as a big separate ticket.
      
      Follow up #7259
      
      NO_CHANGELOG=Not a visible change
      NO_DOC=Not a visible change
      NO_TEST=Not a visible change
      04e59ab1
    • main: destroy main fiber and whole cord on return · ec37c573
      Vladislav Shpilevoy authored
      main() used to skip most of the module destruction in
      tarantool_free(). That got ASAN complaining on clang-13 about a
      leak of a fiber on_stop trigger which was allocated in Lua.
      
      The patch makes fiber_free() called for the main cord. It destroys
      and frees all the fibers together with their on_stop triggers.
      
      Closes #7259
      
      NO_CHANGELOG=Not a visible change
      NO_DOC=Not a visible change
      NO_TEST=Not a visible change
      ec37c573
  2. Jun 28, 2022
    • tuple: refactor flags · 9da70207
      Nikita Pettik authored
      Before this patch struct tuple had two boolean bit fields: is_dirty and
      has_uploaded_refs. It is worth mentioning that sizeof(boolean) is
      implementation-dependent. However, the code assumes it to be 1 byte
      (there's a static assertion restricting the whole struct tuple size to 10
      bytes). So strictly speaking it may lead to a compilation error on
      some non-conventional system. Secondly, bit fields consume at least one
      unit of their underlying type anyway (i.e. there's no space benefit in
      using two uint8_t bit fields: they occupy 1 byte in total anyway). There
      are several known pitfalls concerning bit fields:
       - A bit field's memory layout is implementation-dependent;
       - sizeof() can't be applied to such members;
       - The compiler may introduce unexpected side effects
         (https://lwn.net/Articles/478657/).
      
      Finally, in our code base as a rule we use explicit masks:
      txn flags, vy stmt flags, sql flags, fiber flags.
      
      So, let's replace bit fields in struct tuple with single member called
      `flags` and several enum values corresponding to masks (to be more
      precise - bit positions in tuple flags).
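
The replacement described above could look roughly like the following sketch. The struct is a simplified stand-in rather than the real struct tuple, and the enum and helper names are illustrative assumptions, not a copy of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions inside the flags byte (illustrative names). */
enum tuple_flag {
	TUPLE_HAS_UPLOADED_REFS = 0,
	TUPLE_IS_DIRTY = 1,
	tuple_flag_MAX,
};

/* Simplified stand-in for struct tuple: one explicit flags byte. */
struct tuple_sketch {
	uint8_t flags;
};

/* Unlike a bit field, the size of the member is explicit and checkable. */
_Static_assert(sizeof(struct tuple_sketch) == 1,
	       "flags must occupy exactly one byte");

static inline void
tuple_set_flag(struct tuple_sketch *tuple, enum tuple_flag flag)
{
	assert(flag < tuple_flag_MAX);
	tuple->flags |= (uint8_t)(1u << flag);
}

static inline void
tuple_clear_flag(struct tuple_sketch *tuple, enum tuple_flag flag)
{
	assert(flag < tuple_flag_MAX);
	tuple->flags &= (uint8_t)~(1u << flag);
}

static inline bool
tuple_has_flag(const struct tuple_sketch *tuple, enum tuple_flag flag)
{
	assert(flag < tuple_flag_MAX);
	return (tuple->flags & (1u << flag)) != 0;
}
```

Storing bit positions in the enum and shifting in the helpers keeps the mask arithmetic in one place, which is the same idea as the explicit masks already used for txn, vy stmt, sql, and fiber flags.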
      
      NO_DOC=<Refactoring>
      NO_CHANGELOG=<Refactoring>
      NO_TEST=<Refactoring>
      9da70207
  3. Jun 27, 2022
    • datetime: fix set with hour=nil · ba140128
      Timur Safin authored
      We did not correctly retain the `hour` attribute when `min`, `sec`,
      or `nsec` were modified via the `:set` method.
      
      ```
      tarantool> a = dt.parse '2022-05-05T00:00:00'
      
      tarantool> a:set{min = 0, sec = 0, nsec = 0}
      ---
      - 2022-05-05T12:00:00Z
      ...
      ```
      
      Closes #7298
      
      NO_DOC=bugfix
      ba140128
  4. Jun 24, 2022
    • vinyl: rotate active memory index in vy_squash_process if necessary · e797a748
      Vladimir Davydov authored
      If vy_point_lookup called by vy_squash_process yields (doing disk read),
      a dump may be triggered bumping the L0 generation counter, in which case
      we would insert a statement to a sealed vy_mem, as explained in #5080.
      Let's check the generation counter and rotate the active vy_mem if
      necessary after vy_point_lookup to avoid that.
      
      Closes #5080
      
      NO_DOC=bug fix
      NO_TEST=complicated, need stress/perf test to catch bugs like this
      e797a748
    • vinyl: add vy_lsm_rotate_mem_if_required helper · 415d8b25
      Vladimir Davydov authored
      We often rotate the active memory index (vy_lsm_rotate_mem) if its
      generation or schema version is older than the current one. Let's add
      a helper function for that.
      
      Needed for #5080
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      415d8b25
    • vinyl: drop UPSERT squashing optimization when there is no disk data · afce0913
      Vladimir Davydov authored
      The optimization is mostly useless, because it only works if there's no
      data on disk. As explained in #5080, it contains a potential bug: if L0
      dump is triggered between 'prepare' and 'commit', it will insert a
      statement to a sealed vy_mem. Let's drop it.
      
      Part of #5080
      
      NO_DOC=bug fix
      NO_CHANGELOG=later
      afce0913
    • Fix gh_6634 test case · 47ad3bc9
      Nikita Pettik authored
      gh_6634_different_log_on_tuple_new_and_free_test.lua verifies that the
      proper debug message gets into the logs for tuple_new() and tuple_delete():
      occasionally tuple_delete() printed the wrong tuple address. However, there
      are still two debug logs: one in tuple_delete() and another one in
      memtx_tuple_delete(). To avoid any confusion, let's fix the regular
      expression so that it definitely matches the log line from
      memtx_tuple_delete().
      
      NO_CHANGELOG=<Test fix>
      NO_DOC=<Test fix>
      47ad3bc9
    • net.box: explicitly forbid synchronous requests in triggers · 0d944f90
      Vladimir Davydov authored
      Net.box triggers (on_connect, on_schema_reload) are executed
      by the net.box connection worker fiber so a request issued by
      a trigger callback can't be processed until the trigger returns
      execution to the net.box fiber. Currently, an attempt to issue
      a synchronous request from a net.box trigger leads to a silent
      hang of the connection, which is confusing. Let's instead raise
      an error until #7291 is implemented.
      
      We need to add the check to three places in the code:
       1. luaT_netbox_wait_result for future:wait_result()
       2. luaT_netbox_iterator_next for future:pairs()
       3. conn._request for all synchronous requests.
          (We can't add the check to luaT_netbox_transport_perform_request,
          because conn._request may also call conn.wait_state, which would
          hang if called from on_connect or on_schema_reload trigger.)
      
      We also add an assertion to netbox_request_wait to ensure that we
      never wait for a request completion in the net.box worker fiber.
      
      Closes #5358
      
      @TarantoolBot document
      Title: Synchronous requests are not allowed in net.box triggers
      
      An attempt to issue a synchronous request (e.g. `call`) from
      a net.box trigger (`on_connect`, `on_schema_reload`) now raises
      an error: "Synchronous requests are not allowed in net.box trigger"
      (Before https://github.com/tarantool/tarantool/issues/5358 was
      fixed, it silently hung.)
      
      Invoking an asynchronous request (see `is_async` option) is allowed,
      but the request will not be processed until the trigger returns and
      an attempt to wait for the request completion with `future:pairs()`
      or `future:wait_result()` will raise the same error.
      0d944f90
    • raft: get rid of fiber_set_cancellable in box/raft.c · e20e41df
      Ilya Verbin authored
      `wal_write` has been adapted to spurious wakeups, so this protection is
      no longer needed (see 4bf52367).
      
      Part of #7166
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      e20e41df
    • txn_limbo: get rid of fiber_set_cancellable in box/txn_limbo.c · e71d5950
      Ilya Verbin authored
      Currently it's possible to wake up a fiber which is waiting on
      `limbo->wait_cond`, using Tarantool C API, so the protection
      by `fiber_set_cancellable` doesn't make much sense. This patch
      removes it. Spurious wakeups are already handled correctly.
      
      Part of #7166
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      e71d5950
    • core: fix comment in fiber_make_ready · db928477
      Ilya Verbin authored
      Part of #7166
      
      NO_DOC=fix comment
      NO_TEST=fix comment
      NO_CHANGELOG=fix comment
      db928477
  5. Jun 23, 2022
    • core: return procedure name cache to speed up frame resolving · 39c91ead
      Georgiy Lebedev authored
      The procedure name cache was removed during the refactoring in 8ce76364
      on the assumption that libunwind caches procedure names internally. It
      turned out that it only caches unwinding info, which caused a severe
      performance degradation (#7207). Return the cache to speed up frame
      resolving.
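
The idea of caching resolved names can be sketched as below. This is a deliberately simplified direct-mapped cache keyed by instruction pointer. `slow_resolve` is a hypothetical stand-in for an expensive lookup such as libunwind's `unw_get_proc_name`, and none of the names or sizes are taken from the actual patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { PROC_CACHE_SIZE = 64, PROC_NAME_LEN = 64 };

struct proc_cache_entry {
	uintptr_t ip;             /* key: instruction pointer */
	char name[PROC_NAME_LEN]; /* value: resolved procedure name */
	int used;                 /* 0 until the slot is first filled */
};

static struct proc_cache_entry proc_cache[PROC_CACHE_SIZE];
static int slow_lookups; /* counts cache misses, for demonstration */

/* Hypothetical stand-in for an expensive symbol resolution call. */
static void
slow_resolve(uintptr_t ip, char *buf, size_t len)
{
	slow_lookups++;
	snprintf(buf, len, "proc_%lx", (unsigned long)ip);
}

/*
 * Return the procedure name for ip, resolving it only on a cache miss.
 * The returned pointer stays valid until a colliding ip evicts the slot.
 */
static const char *
proc_name_cached(uintptr_t ip)
{
	struct proc_cache_entry *entry = &proc_cache[ip % PROC_CACHE_SIZE];
	if (!entry->used || entry->ip != ip) {
		slow_resolve(ip, entry->name, sizeof(entry->name));
		entry->ip = ip;
		entry->used = 1;
	}
	return entry->name;
}
```

Backtraces tend to hit the same frames repeatedly, so even a small cache like this avoids most of the repeated lookups. A real implementation would more likely use a proper hash table.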
      
      Closes #7207
      
      NO_DOC=performance improvement
      NO_TEST=performance improvement
      39c91ead
    • build: pass necessary compiler flags to libunwind project submodule · bd026399
      Georgiy Lebedev authored
      For the sake of maximizing backtrace collection performance, build
      the libunwind project submodule with "-O2" compiler flag.
      
      Also, build it with the "-g" compiler flag just in case to simplify
      debugging.
      
      Last but not least, pass the same archive-maintaining program used by the
      main CMake project to the libunwind project submodule to make the build
      homogeneous.
      
      Needed for #7207
      
      NO_CHANGELOG=build
      NO_DOC=build
      NO_TEST=build
      bd026399
    • box: fix exclude_null for json and multikey indexes · 30cb5d6e
      Vladimir Davydov authored
      exclude_null is a special index option, which makes the index ignore
      tuples that contain null in any of the indexed fields. Currently, it
      doesn't work for json and multikey indexes, because:
       1. index_filter_tuple ignores json path.
       2. index_filter_tuple ignores multikey index.
      
      Issue no. 1 is easy to fix - we just need to use tuple_field_by_part
      instead of tuple_field when checking if a key field is null.
      
      Issue no. 2 is more complicated, because when we call index_filter_tuple
      we don't know the multikey index. We address this issue by pushing the
      index_filter_tuple call down to engine-specific index implementation.
      
      For Vinyl, we make vy_stmt_foreach_entry, which iterates over multikey
      tuple entries, skip entries that contain nulls.
      
      For memtx, we move the check to index-specific index_replace function
      implementation.  Fortunately, only tree indexes support nullable fields
      so we just need to update the memtx tree implementation.
      
      Ideally, we should handle multikey indexes in memtx at the top level,
      because the implementation should essentially be the same for all kinds
      of indexes, but this refactoring is complicated and will be done later.
      For now, just fix the bug.
      
      Closes #5861
      
      NO_DOC=bug fix
      30cb5d6e
    • test: make all engine/null test cases multi-engine · 7e605b13
      Vladimir Davydov authored
      For some reason, some test cases create memtx spaces irrespective of
      the value of the engine parameter.
      
      NO_DOC=test
      NO_CHANGELOG=test
      7e605b13
  6. Jun 22, 2022
  7. Jun 21, 2022
    • ci: add workflow for LuaJIT integration tests · 7f595759
      Igor Munkin authored
      
      This patch introduces a reusable workflow used by the integration testing
      machinery run within the tarantool/luajit repository.
      
      The first attempt used a GitHub action, but its fetch (or more precisely
      unpack) phase fails due to the test/test-run.py symlink into the
      test-run submodule (the action being used doesn't fetch it while packing
      the tarantool repository). As an alternative to removing this symlink, it
      was decided to use reusable workflows despite their known limitations
      (e.g. inability to use the testing matrix) until the issue with the symlink
      is resolved in some way.
      
      As an alternate way, a common action to be used in all submodules for
      integration testing can be added to tarantool/actions repository.
      
      NO_DOC=ci
      NO_TEST=ci
      NO_CHANGELOG=ci
      
      Reviewed-by: Yaroslav Lobankov <y.lobankov@tarantool.org>
      Reviewed-by: Sergey Bronnikov <sergeyb@tarantool.org>
      Signed-off-by: Igor Munkin <imun@tarantool.org>
      7f595759
    • vinyl: fix !vy_tx_is_in_read_view assertion failure in vy_tx_prepare · 2971f691
      Vladimir Davydov authored
      Commit 4d52199e ("box: fix transaction "read-view" and "conflicted"
      states") updated vy_tx_send_to_read_view so that now it aborts all RW
      transactions right away instead of sending them to read view and
      aborting them on commit. It also updated vy_tx_begin_statement to fail
      if a transaction sent to a read view tries to do DML. With all that,
      we assume that there cannot possibly be an RW transaction sent to read
      view so we have an assertion checking that in vy_tx_commit.
      
      However, this assertion may fail, because a DML statement may yield
      on disk read before it writes anything to the write set. If this is
      the first statement in a transaction, the transaction is technically
      read-only and we will send it to read-view instead of aborting it.
      Once it completes the disk read, it will apply the statement and hence
      become read-write, breaking our assumption in vy_tx_commit.
      
      Fix this by aborting RW transactions sent to read-view in vy_tx_set.
      
      Follow-up #7240
      
      NO_DOC=bug fix
      NO_CHANGELOG=unreleased
      2971f691
  8. Jun 20, 2022
  9. Jun 18, 2022
    • luajit: bump new version · b1953b59
      Igor Munkin authored
      * ci: add job for build using Ninja on Linux/x86_64
      * build: create file lists outside of CMake commands
      * build: use unique names for CMake targets
      * Revert "test: disable PUC-Rio tests for several -l options"
      * ci: make GitHub workflows more CMake-ish
      * test: adapt PUC-Rio tests for debug line hook
      * test: adapt PUC-Rio test for tail calls debug info
      * test: adapt PUC-Rio test with reversed function
      
      Closes #5693
      Closes #5702
      Closes #5782
      Follows up #5747
      
      NO_DOC=LuaJIT submodule bump
      NO_TEST=LuaJIT submodule bump
      NO_CHANGELOG=LuaJIT submodule bump
      b1953b59
  10. Jun 17, 2022
    • fiber: don't crash on wakeup with dead fibers · 206137e7
      Cyrill Gorcunov authored
      
      When a fiber finishes its work, it ends up in one of two states:
      1) If the "joinable" attribute is not set, the fiber is
         simply recycled.
      2) Otherwise it continues hanging around, waiting to be
         joined.
      
      Our API allows calling fiber_wakeup() for dead but joinable
      fibers (2) in release builds without any side effects: such
      fibers are simply ignored. In debug builds, however, this
      triggers an assertion. We can't change our API for the sake
      of backward compatibility, but at the same time we must not
      preserve different behaviour between release and debug
      builds, since this brings inconsistency. Thus let's get
      rid of the assertion and allow calling fiber_wakeup()
      in debug builds as well.
      
      Fixes #5843
      
      NO_DOC=bug fix
      
      Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
      206137e7
    • replication: unify replication filtering with and without elections · deca9749
      Serge Petrenko authored
      Once the split-brain detection is in place, it's fine to nopify obsolete
      data even on a node with elections disabled. Let's not keep a bug around
      anymore.
      
      This behaviour change leads to changing
      "gh_6842_qsync_applier_order_test.lua" a bit. It actually relied on old
      and buggy behaviour: it assumed old transactions would not be nopified
      and would trigger replication error.
      
      This doesn't happen anymore, because nopify works correctly, and the
      transactions are not followed by a conflicting CONFIRM.
      
      The test for this commit is simply altering the
      gh_5295_split_brain_detection_test.lua to work with elections disabled.
      
      Closes #6133
      Follow-up #5295
      
      NO_DOC=internal change
      NO_CHANGELOG=internal change
      deca9749
    • txn_limbo: filter incoming synchro requests · af7d703f
      Cyrill Gorcunov authored
      
      When we receive synchro requests we can't just apply them blindly,
      because in the worst case they may come from a split-brain configuration
      (where a cluster has split into several clusters, each one has elected its
      own leader, and the clusters are then trying to merge back into the original
      one). We need to do our best to detect such disunity and force these
      nodes to rejoin from scratch for the sake of data consistency.
      
      Thus when we're processing requests we pass them to the packet filter
      first, which validates their contents and refuses to apply them if they
      violate consistency.
      
      Depending on request type each packet traverses an appropriate chain.
      
      filter_generic(): a common chain for any synchro packet.
       1) request:replica_id = 0 allowed for PROMOTE request only.
       2) request:replica_id should match limbo:owner_id, IOW the
          limbo migration should be noticed by all instances in the
          cluster.
      
      filter_confirm_rollback(): a chain for CONFIRM | ROLLBACK packets.
       1) Zero lsn is disallowed for such requests.
      
      filter_promote_demote(): a chain for PROMOTE | DEMOTE packets.
       1) The requests should come in with nonzero term, otherwise
          the packet is corrupted.
       2) The request's term should not be less than the maximal known
          one, IOW it should not come in from nodes which didn't notice
          raft epoch changes and are living in the past.
      
      filter_queue_boundaries(): a common finalization chain.
       1) If LSN of the request matches current confirmed LSN the packet
          is obviously correct to process.
       2) If LSN is less than confirmed LSN then the request is wrong,
          we have processed the requested LSN already.
       3) If LSN is greater than confirmed LSN then
          a) If limbo is empty we can't do anything, since data is already
             processed and should issue an error;
          b) If there is some data in the limbo then requested LSN should
             be in range of limbo's [first; last] LSNs, thus the request
             will be able to commit and rollback limbo queue.
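
The finalization chain above can be read as the following decision function. This is a sketch over a simplified limbo model: the struct, its field names, and the 0 / -1 return convention are assumptions for illustration, not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the limbo state the filter looks at. */
struct limbo_sketch {
	int64_t confirmed_lsn; /* last confirmed LSN */
	bool is_empty;         /* true if no transactions are queued */
	int64_t first_lsn;     /* LSN of the first queued entry */
	int64_t last_lsn;      /* LSN of the last queued entry */
};

/* Returns 0 if the request may be applied, -1 if it must be rejected. */
static int
filter_queue_boundaries_sketch(const struct limbo_sketch *limbo,
			       int64_t req_lsn)
{
	/* 1) Matches the confirmed LSN: correct to process. */
	if (req_lsn == limbo->confirmed_lsn)
		return 0;
	/* 2) Below the confirmed LSN: already processed, reject. */
	if (req_lsn < limbo->confirmed_lsn)
		return -1;
	/* 3a) Above the confirmed LSN, but nothing is queued: reject. */
	if (limbo->is_empty)
		return -1;
	/* 3b) Must fall inside the queued [first; last] LSN range. */
	assert(limbo->first_lsn <= limbo->last_lsn);
	if (req_lsn < limbo->first_lsn || req_lsn > limbo->last_lsn)
		return -1;
	return 0;
}
```

For example, with a confirmed LSN of 10 and a queue covering LSNs 11 to 15, requests for LSN 10 and 12 pass, while requests for LSN 9 and 16 are rejected.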
      
      Note the filtration is disabled during initial configuration where we
      apply requests from the only source of truth (either the remote master,
      or our own journal), so no split brain is possible.
      
      In order to make split-brain checks work, the applier nopify filter now
      passes synchro requests from obsolete term without nopifying them.
      
      Also, now ANY asynchronous request coming from an instance with obsolete
      term is treated as a split-brain. Think of it as a synchronous
      request committed with a malformed quorum.
      
      Closes #5295
      
      NO_DOC=it's literally below
      
      Co-authored-by: Serge Petrenko <sergepetrenko@tarantool.org>
      Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
      
      @TarantoolBot document
      Title: new error type: ER_SPLIT_BRAIN
      
      If for some reason the cluster had 2 leaders working independently (for
      example, the user has mistakenly lowered the quorum below N / 2 + 1), then
      once such leaders and their followers try connecting to each other, they
      will receive the ER_SPLIT_BRAIN error, and the connection will be
      aborted. This is done to preserve data integrity. Once the user notices
      such an error he or she has to manually inspect the data on both the
      split halves, choose a way to restore the data, and rebootstrap one of
      the halves from the other.
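
The N / 2 + 1 threshold mentioned above can be checked with a little arithmetic. The sketch below uses hypothetical helper names and assumes integer division, as in C. Two leaders can each gather a quorum independently only if two disjoint groups of that size fit into the cluster.

```c
#include <assert.h>
#include <stdbool.h>

/* Smallest quorum that is a strict majority of n nodes: N / 2 + 1. */
static int
majority_quorum(int n)
{
	assert(n > 0);
	return n / 2 + 1;
}

/*
 * Two disjoint groups of `quorum` nodes each fit into n nodes exactly
 * when 2 * quorum <= n. In that case two leaders could each collect a
 * quorum without sharing a single node.
 */
static bool
quorum_allows_two_leaders(int n, int quorum)
{
	assert(n > 0 && quorum > 0);
	return 2 * quorum <= n;
}
```

With 5 nodes the majority quorum is 3 and two independent quorums are impossible. Lowering the quorum to 2 makes them possible, which is the misconfiguration the error is meant to surface.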
      af7d703f
    • txn_limbo: change function return types · 9eab2868
      Serge Petrenko authored
      Change return types of txn_limbo_req_prepare, txn_limbo_process,
      txn_limbo_write_promote, txn_limbo_write_demote from void to int.
      This is a preparation for when these functions start returning errors.
      
      Part-of #5295
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      9eab2868
    • box: change box_issue_promote(demote) return type · fd5e1439
      Serge Petrenko authored
      Make box_issue_promote and box_issue_demote return a return code.
      For now it's always 0, but soon they will return errors.
      
      Part-of #5295
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      fd5e1439
    • txn_limbo: track CONFIRM lsn on replicas · 129d83e9
      Serge Petrenko authored
      limbo->confirmed_lsn was only filled on limbo owner in
      txn_limbo_write_confirm. Replicas and recovering limbo owner need to track
      it as well to correctly detect split-brains based on confirmed_lsn.
      
      So update confirmed_lsn in txn_limbo_read_confirm.
      
      Part-of #5295
      
      NO_DOC=internal change
      NO_TEST=tested in future commits
      NO_CHANGELOG=internal change
      129d83e9