  1. Sep 17, 2024
    • memtx: fix use-after-free in mvcc on ddl · 8cbe42eb
      Andrey Saranchin authored
      When a space is being altered, `memtx_tx_on_space_delete` is called - it
      deletes all the stories associated with the old schema. However, before
      deleting a story, its `reader_list` member is not unlinked from the list,
      so other nodes can still access this memory. The commit fixes this
      problem and adds an assertion that checks that a story is always unlinked
      from the reader list when it is being deleted.
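
      A minimal sketch of the pattern described above, assuming a simplified
      intrusive list. `struct list_node`, `struct story` and all helper names
      are hypothetical stand-ins, not the actual Tarantool definitions:

      ```c
      #include <assert.h>
      #include <stdlib.h>

      /* Hypothetical intrusive doubly linked list node. */
      struct list_node {
          struct list_node *prev, *next;
      };

      static inline void
      list_create(struct list_node *n)
      {
          n->prev = n->next = n;
      }

      static inline int
      list_empty(const struct list_node *n)
      {
          return n->next == n;
      }

      /* Insert n right after head. */
      static inline void
      list_add(struct list_node *head, struct list_node *n)
      {
          n->next = head->next;
          n->prev = head;
          head->next->prev = n;
          head->next = n;
      }

      /* Unlink n from its list and make it self-linked. */
      static inline void
      list_del(struct list_node *n)
      {
          n->prev->next = n->next;
          n->next->prev = n->prev;
          list_create(n);
      }

      /* Hypothetical stand-in for struct memtx_story. */
      struct story {
          struct list_node reader_list;
      };

      static inline void
      story_delete(struct story *s)
      {
          /* Nobody may reach the story through the reader list anymore. */
          assert(list_empty(&s->reader_list));
          free(s);
      }

      /* Unlink the story's list member first, then free the story. */
      static inline void
      story_unlink_and_delete(struct story *s)
      {
          list_del(&s->reader_list);
          story_delete(s);
      }
      ```

      After `story_unlink_and_delete()` the remaining nodes link only to each
      other, so none of them points into the freed story.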
      
      Part of #10146
      
      NO_CHANGELOG=later
      NO_DOC=bugfix
      
      (cherry picked from commit a32f56dfbb4b56b410ac376fce079613cac0ccb6)
    • memtx: do not use memtx_build_on_replace trigger with mvcc enabled · 74584869
      Andrey Saranchin authored
      Now the background index build uses an index iterator that collects
      conflicts during iteration if MVCC is enabled. Thus, the trigger
      `memtx_build_on_replace` is not needed - if someone writes to the
      prefix we have already scanned, it will lead to a transaction conflict.
      Moreover, `memtx_ddl_state`, which is needed for rollback, is allocated
      on the stack of a function called from the DDL transaction, so if a
      conflicted transaction rolls back after the DDL is over (which is
      possible only with MVCC enabled), a segmentation fault will happen. So
      let's simply not set the trigger if MVCC is enabled.
      
      Closes #10147
      
      NO_CHANGELOG=later
      NO_DOC=bugfix
      
      (cherry picked from commit 9fe60c5754cf77686404fc7ee3d24af32b6c486c)
    • memtx: unlink all delete statements of mvcc stories on space delete · 1d72b80f
      Andrey Saranchin authored
      Since one tuple can be deleted by many concurrent transactions, member
      `del_stmt` of `struct memtx_story` is actually a list. It seems we
      forgot about it when implementing `memtx_tx_on_space_delete`, so the
      function unlinks only one of the delete statements. The commit fixes this
      mistake.
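
      A rough sketch of the difference between unlinking one delete statement
      and unlinking all of them. The structures and field names below are
      simplified, hypothetical stand-ins for `struct txn_stmt` and
      `struct memtx_story`:

      ```c
      #include <assert.h>
      #include <stddef.h>

      struct story;

      /* Hypothetical stand-in for a transaction statement. */
      struct stmt {
          struct story *del_story;       /* story this statement deletes */
          struct stmt *next_in_del_list; /* next deleter of the same story */
      };

      /* Hypothetical stand-in for struct memtx_story. */
      struct story {
          struct stmt *del_stmt; /* head of the list of delete statements */
      };

      /*
       * Unlink every delete statement from the story, not only the head.
       * Returns the number of unlinked statements.
       */
      static inline int
      story_unlink_all_deleted_by(struct story *story)
      {
          int unlinked = 0;
          while (story->del_stmt != NULL) {
              struct stmt *stmt = story->del_stmt;
              story->del_stmt = stmt->next_in_del_list;
              stmt->next_in_del_list = NULL;
              stmt->del_story = NULL;
              unlinked++;
          }
          return unlinked;
      }
      ```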
      
      Part of #10146
      
      NO_CHANGELOG=later
      NO_DOC=bugfix
      
      (cherry picked from commit 5a31551467308f26b8471a9de233b94e380f23cf)
  2. Sep 16, 2024
    • box: fix crash on rollback on memtx memory OOM and massive index change · e9fc51d0
      Nikolay Shirokovskiy authored
      We cannot tolerate index extent memory allocation failure on rollback.
      At the same time it is not practical to reserve memory because a whole
      index can easily be changed on rollback if read view is created before
      rollback.
      
      So in case of rollback and memtx memory OOM let's allocate outside the
      memtx arena limited by quota.
      
      Now part of the index can reside outside the memtx arena. Normally,
      subsequent index changes will move this part back to the memtx arena,
      until the next such situation occurs.
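
      The fallback can be pictured roughly as follows. This is only a sketch
      under simplifying assumptions: the "arena" here merely counts bytes
      against a quota, and `struct arena`, `struct extent`, `extent_alloc()`
      and `extent_free()` are hypothetical names, not the memtx API:

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdlib.h>

      /* Hypothetical quota-limited arena: it only tracks used bytes. */
      struct arena {
          size_t used;
          size_t quota;
      };

      struct extent {
          void *ptr;
          bool in_arena; /* false if allocated outside of the quota */
      };

      static inline struct extent
      extent_alloc(struct arena *arena, size_t size, bool in_rollback)
      {
          struct extent e = { NULL, false };
          if (size <= arena->quota - arena->used) {
              e.ptr = malloc(size);
              if (e.ptr != NULL) {
                  arena->used += size;
                  e.in_arena = true;
              }
              return e;
          }
          /* The quota is exhausted: only rollback may bypass it. */
          if (in_rollback)
              e.ptr = malloc(size);
          return e;
      }

      static inline void
      extent_free(struct arena *arena, struct extent *e, size_t size)
      {
          if (e->in_arena)
              arena->used -= size;
          free(e->ptr);
          e->ptr = NULL;
      }
      ```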
      
      Closes #10551
      
      NO_DOC=bugfix
      
      (cherry picked from commit 32ea713af0a4f27f9ae37bb767c21722ee8c6742)
    • memtx: free extents on exit · 1fb5a7cc
      Nikolay Shirokovskiy authored
      Part-of #10211
      
      NO_TEST=internal
      NO_CHANGELOG=internal
      NO_DOC=internal
      
      (cherry picked from commit 134a2a4f7f0a3bad15bc42e2dc051708c3583fed)
    • core: add (void *) set definition · e320972a
      Nikolay Shirokovskiy authored
      Part-of #10551
      
      NO_TEST=declarative code
      NO_CHANGELOG=internal
      NO_DOC=internal
      
      (cherry picked from commit 398c7031c915380bd6e93b7aeab9145cf0ebe511)
  3. Sep 13, 2024
    • small: bump version · e60f5fbd
      Nikolay Shirokovskiy authored
      New commits:
      * slab cache: fix slab alignment to 16 bytes
      
      NO_TEST=submodule bump
      NO_CHANGELOG=submodule bump
      NO_DOC=submodule bump
      
      (cherry picked from commit 2300704e8317f2d8a545cde1394f8cbbb7e95741)
    • sptree: don't use variable length arrays · 977ef353
      Vladimir Davydov authored
      This causes warnings if compiled with clang-18. Let's define a sane
      upper limit for the max tree depth and use it for allocating arrays
      on stack. Note that we don't really care about performance because
      sptree is used only in unit tests.
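
      A sketch of the general technique, not the actual sptree code: the
      constant `TREE_DEPTH_MAX`, its value, and the function below are
      hypothetical and only illustrate replacing a variable length array with
      a fixed-size one guarded by an assertion:

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Hypothetical upper limit; the real value in sptree may differ. */
      enum { TREE_DEPTH_MAX = 48 };

      struct node {
          struct node *left, *right;
          int key;
      };

      /*
       * Record the path from the root to the node with the given key.
       * Returns the path length, or -1 if the key is not found.
       */
      static inline int
      tree_find_path(struct node *root, int key, int max_depth,
                     struct node **path_out)
      {
          /* Before: struct node *path[max_depth]; (a VLA) */
          struct node *path[TREE_DEPTH_MAX];
          assert(max_depth <= TREE_DEPTH_MAX);
          int depth = 0;
          for (struct node *n = root; n != NULL && depth < max_depth; ) {
              path[depth++] = n;
              if (key == n->key) {
                  for (int i = 0; i < depth; i++)
                      path_out[i] = path[i];
                  return depth;
              }
              n = key < n->key ? n->left : n->right;
          }
          return -1;
      }
      ```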
      
      Closes #10354
      
      NO_DOC=internal
      NO_TEST=internal
      NO_CHANGELOG=internal
      
      (cherry picked from commit 187d288f0c3b008ed2d281e8bb43159e44c4106e)
    • test: disable flaky testcases in http_client_test · 7d120035
      Sergey Bronnikov authored
      The testcase "http_client.sock_family:\"AF_UNIX\".test_follow_location"
      is flaky in each run of `release_clang_asan` and
      `debug_asan_clang` workflows. Disabling a single testcase does not
      help. The patch disables a group of testcases executed with Unix
      domain socket.
      
      Needed for #9854
      
      NO_CHANGELOG=testing
      NO_DOC=testing
      
      (cherry picked from commit 8fae8004f79ecd555537960c60c6e646b037c4cc)
    • test: fix luacheck warnings · e393ee29
      Sergey Bronnikov authored
      The patch fixes a warning produced by luacheck:
      
      NO_WRAP
      test/app-luatest/http_client_test.lua:27:8: Error prone negation: negation is executed before relational operator.
      test/app-luatest/http_client_test.lua:28:8: Error prone negation: negation is executed before relational operator.
      NO_WRAP
      
      Found by Luacheck 1.2.0.
      
      Closes #10037
      
      NO_CHANGELOG=codehealth
      NO_DOC=codehealth
      NO_TEST=codehealth
      
      (cherry picked from commit 8fd37731b68e1e1d8e258ab919d65907d52ec764)
  4. Sep 09, 2024
    • test: fix flaky #10148 test · adbb726a
      Vladimir Davydov authored
      The test may exceed the default fiber slice (1 second):
      
      ```
      [060] server | 2024-09-09 09:16:16.329 [33093] main/111/main fiber.h:1132 W> fiber has not yielded for more than 0.500 seconds
      [060] server | 2024-09-09 09:16:16.825 [33093] main/111/main/test-run.lib.luatest.luatest.log I> Assert "FiberSliceIsExceeded" equals to "OutOfMemory"
      [060] not ok 1	box-luatest.gh_10148_fix_crash_low_slab_alloc_factor.test_low_slab_alloc_factor
      [060] #   ...uatest/gh_10148_fix_crash_low_slab_alloc_factor_test.lua:36: expected: "OutOfMemory"
      [060] #   actual: "FiberSliceIsExceeded"
      [060] #   stack traceback:
      [060] #   	...uatest/gh_10148_fix_crash_low_slab_alloc_factor_test.lua:30: in function 'box-luatest.gh_10148_fix_crash_low_slab_alloc_factor.test_low_slab_alloc_factor'
      [060] #   	...
      [060] #   	[C]: in function 'xpcall'
      [060] #   artifacts:
      [060] #   	server -> /tmp/t/060_box-luatest/artifacts/server-RulP4Fj6qEoI
      [060] luatest | 2024-09-09 09:16:16.839 [32904] main/104/luatest/test-run.lib.luatest.luatest.log I> End test "box-luatest.gh_10148_fix_crash_low_slab_alloc_factor.test_low_slab_alloc_factor"
      [060] server | 2024-09-09 09:16:16.849 [33093] main/116/iproto.shutdown I> tx_binary: stopped
      [060] # Ran 1 tests in 2.388 seconds, 0 succeeded, 1 failed
      ```
      
      Let's set the fiber slice to a sufficiently big value.
      
      Fixes commit e4ce9e111483 ("test: add test for #10148").
      
      NO_DOC=test fix
      NO_CHANGELOG=test fix
      
      (cherry picked from commit 565cda7f2f0d74b2b726b475d2b7ed0c3344920e)
    • vinyl: fix ERRINJ_VY_DELAY_PK_LOOKUP · 7c1d6841
      Vladimir Davydov authored
      Enabling `ERRINJ_VY_DELAY_PK_LOOKUP` makes Vinyl yield in a place where
      it normally wouldn't. If the transaction is aborted in the meantime,
      we'll get the assertion failure:
      
      ```
      ./src/box/vy_point_lookup.c:219: vy_point_lookup: Assertion 'tx == NULL || tx->state == VINYL_TX_READY' failed.
      ```
      
      To prevent this from happening, let's replace this invalid error
      injection with the new one `ERRINJ_VY_POINT_LOOKUP_DELAY` that injects
      a delay to `vy_point_lookup()` before reading disk. This doesn't have
      exactly the same effect as the old error injection because it also
      delays direct lookups in the primary index. Fortunately, the old error
      injection is used in only one test, where the new one works as expected
      if we make the secondary index created in the test non-unique and enable
      deferred writes (this makes the `s:replace{2, 2}` statement bypass
      a lookup in the primary index).
      
      Also, let's replace `VY_POINT_ITER_WAIT` with the new error injection
      because they have a very similar meaning and `VY_POINT_LOOKUP_DELAY`
      works in the test using it with a very small adjustment (we need to
      clear it explicitly after `box.snapshot()`).
      
      Closes #10517
      
      NO_DOC=errinj fix
      NO_CHANGELOG=errinj fix
      
      (cherry picked from commit 926196359eaa46bbc670d196103730e196c31437)
    • vinyl: use VERBOSE level for logging ranges · a9765933
      Vladimir Davydov authored
      Whenever a range is compacted, split, or coalesced, we log the range
      boundaries. This gets really annoying if there's an index that has
      a lot of key parts or contains binary strings. Let's lower the level
      used for logging these events down to VERBOSE so that they are not
      shown by default but can be enabled if needed.
      
      Closes #10524
      
      NO_DOC=bug fix
      
      (cherry picked from commit 06fa83947b0b63c39732efba4c9d67578f113612)
    • test: add test for #10148 · 6bebc1b5
      Nikolay Shirokovskiy authored
      The fix itself is in the small submodule which is bumped in the previous
      commit.
      
      Closes #10148
      
      NO_DOC=bugfix
      
      (cherry picked from commit e4ce9e111483a24d66e078f4f05679d309fcb94d)
    • small: bump version · e1bb094f
      Nikolay Shirokovskiy authored
      New commits:
      
      * small: small: fix crash with low alloc_factor and high memory pressure
      * test: get rid of debug message
      * test: assign label to tests
      * test: introduce a CMake function create_test
      
      Part of #10148
      
      NO_TEST=submodule bump
      NO_CHANGELOG=submodule bump
      NO_DOC=submodule bump
      
      (cherry picked from commit f3dd6960852f1885ca14587a9c72769fad6b9f55)
    • small: bump version · 387dcbaa
      Serge Petrenko authored
      New commits:
      * test: fix memory leaks reported by LSAN
      * region: fix memleak in ASAN version
      * matras: introduce `matras_needs_touch` and `matras_touch_no_check`
      * lsregion: implement lsregion_reserve for asan build
      
      Prerequisite #10161
      
      NO_CHANGELOG=submodule bump
      NO_TEST=submodule bump
      NO_DOC=submodule bump
      
      (cherry picked from commit c191a1bbe96a67405cbdbb3e421dbf7ea543bf47)
  5. Sep 06, 2024
    • vinyl: handle error loading statement from disk during key lookup · 69450ca7
      Vladimir Davydov authored
      `vy_page_stmt()` may fail (return NULL) if:
       - the statement is corrupted;
       - memory allocation for the statement fails;
       - the statement size exceeds `box.cfg.vinyl_max_tuple_size`.
      
      If this happens `vy_page_find_key()` won't return an error. Instead,
      it'll either point the caller to a wrong statement or claim that there's
      no statement matching the key in this page. This may result in invalid
      index selection results and, later on, a crash caused by inconsistencies
      in the tuple cache. The issue was introduced by commit ac8ce023
      ("vinyl: factor out function to lookup key in page").
      
      All of the three cases are actually very unlikely to happen in
      production:
       - If a statement stored in a run file is corrupted, we'll probably fail
         to load the whole page due to failed checksums and never even get to
         `vy_page_stmt()`.
       - Statements are allocated with `malloc()`, which doesn't normally
         fail (instead the whole process would be terminated by OOM).
       - Users don't tend to lower the tuple size limit after restart.
      
      Still, let's fix the issue by implementing proper error handling for
      `vy_page_find_key()`.
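
      The shape of such error handling can be sketched as below. This is an
      illustration only: the page is modeled as a sorted array of integers
      where a negative value plays the role of a statement that fails to load,
      and `page_stmt()` / `page_find_key()` are hypothetical simplifications
      of `vy_page_stmt()` / `vy_page_find_key()`:

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical page: sorted keys, a negative key means "can't load". */
      struct page {
          const int *keys;
          int count;
      };

      /* Returns NULL if the statement at the position can't be loaded. */
      static inline const int *
      page_stmt(const struct page *page, int pos)
      {
          return page->keys[pos] < 0 ? NULL : &page->keys[pos];
      }

      /*
       * Binary search for the first position whose key is >= the given key.
       * Returns 0 and sets *pos and *equal on success.
       * Returns -1 if a statement failed to load during the search.
       */
      static inline int
      page_find_key(const struct page *page, int key, int *pos, bool *equal)
      {
          int low = 0, high = page->count;
          *equal = false;
          while (low < high) {
              int mid = low + (high - low) / 2;
              const int *stmt = page_stmt(page, mid);
              if (stmt == NULL)
                  return -1; /* propagate the error instead of guessing */
              if (*stmt < key) {
                  low = mid + 1;
              } else {
                  if (*stmt == key)
                      *equal = true;
                  high = mid;
              }
          }
          *pos = low;
          return 0;
      }
      ```

      The point is that the caller can now tell "no match in this page" apart
      from "the lookup failed".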
      
      Closes #10512
      
      NO_DOC=bug fix
      
      (cherry picked from commit 9dbaa6a9bc0d65984b417f8a76aa8373b6125d16)
  6. Aug 30, 2024
    • lua: fix iconv memory leak · 08c80081
      Nikolay Shirokovskiy authored
      `ffi.C.tnt_iconv_open` returns pointer to `struct iconv`. In this case
      `__gc` in metatable is not bound to the object.
      
      Closes #10487
      Part-of #10211
      
      NO_TEST=covered by existing tests
      NO_DOC=bugfix
      
      (cherry picked from commit 105e6188ee6cc8de71ca2ab077f78f51be07559d)
    • vinyl: fix memory leak on dump/compaction failure · b6cd6bbe
      Nikolay Shirokovskiy authored
      The issue is that we increment `page_count` only on page write. If we
      fail for some reason before that, the page info `min_key` is leaked.
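
      One way to picture the leak and a possible cleanup (not necessarily the
      approach the commit takes). `struct run_writer`, `page_started` and the
      helpers below are hypothetical simplifications of the run writer:

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdlib.h>
      #include <string.h>

      enum { PAGE_MAX = 8 };

      struct page_info {
          char *min_key; /* heap-allocated copy of the first key */
      };

      struct run_writer {
          struct page_info pages[PAGE_MAX];
          int page_count;    /* number of pages fully written */
          bool page_started; /* pages[page_count] is set up but not written */
      };

      static inline int
      writer_start_page(struct run_writer *w, const char *min_key)
      {
          assert(!w->page_started && w->page_count < PAGE_MAX);
          size_t len = strlen(min_key) + 1;
          char *copy = malloc(len);
          if (copy == NULL)
              return -1;
          memcpy(copy, min_key, len);
          w->pages[w->page_count].min_key = copy;
          w->page_started = true;
          return 0;
      }

      static inline void
      writer_end_page(struct run_writer *w)
      {
          assert(w->page_started);
          w->page_started = false;
          w->page_count++; /* only now the page is counted */
      }

      /*
       * Free all page infos, including the one that was started but never
       * written. Freeing only page_count entries would leak its min_key.
       * Returns the number of freed page infos.
       */
      static inline int
      writer_abort(struct run_writer *w)
      {
          int total = w->page_count + (w->page_started ? 1 : 0);
          for (int i = 0; i < total; i++) {
              free(w->pages[i].min_key);
              w->pages[i].min_key = NULL;
          }
          w->page_count = 0;
          w->page_started = false;
          return total;
      }
      ```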
      
      LSAN report for 'vinyl/recovery_quota.test.lua':
      
      ```
      2024-07-05 13:30:34.605 [478603] main/103/on_shutdown vy_scheduler.c:1668 E> 512/0: failed to compact range (-inf..inf)
      
      =================================================================
      ==478603==ERROR: LeakSanitizer: detected memory leaks
      
      Direct leak of 4 byte(s) in 1 object(s) allocated from:
          #0 0x5e4ebafcae09 in malloc (/home/shiny/dev/tarantool/build-asan-debug/src/tarantool+0x1244e09) (BuildId: 20c5933d67a3831c4f43f6860379d58d35b81974)
          #1 0x5e4ebb3f9b69 in vy_key_dup /home/shiny/dev/tarantool/src/box/vy_stmt.c:308:14
          #2 0x5e4ebb49b615 in vy_page_info_create /home/shiny/dev/tarantool/src/box/vy_run.c:257:23
          #3 0x5e4ebb48f59f in vy_run_writer_start_page /home/shiny/dev/tarantool/src/box/vy_run.c:2196:6
          #4 0x5e4ebb48c6b6 in vy_run_writer_append_stmt /home/shiny/dev/tarantool/src/box/vy_run.c:2287:6
          #5 0x5e4ebb72877f in vy_task_write_run /home/shiny/dev/tarantool/src/box/vy_scheduler.c:1132:8
          #6 0x5e4ebb73305e in vy_task_compaction_execute /home/shiny/dev/tarantool/src/box/vy_scheduler.c:1485:9
          #7 0x5e4ebb73e152 in vy_task_f /home/shiny/dev/tarantool/src/box/vy_scheduler.c:1795:6
          #8 0x5e4ebb01e0b1 in fiber_cxx_invoke(int (*)(__va_list_tag*), __va_list_tag*) /home/shiny/dev/tarantool/src/lib/core/fiber.h:1331:10
          #9 0x5e4ebc389ee0 in fiber_loop /home/shiny/dev/tarantool/src/lib/core/fiber.c:1182:18
          #10 0x5e4ebd3e9595 in coro_init /home/shiny/dev/tarantool/third_party/coro/coro.c:108:3
      
      SUMMARY: AddressSanitizer: 4 byte(s) leaked in 1 allocation(s).
      ```
      
      Closes #10489
      Part-of #10211
      
      NO_TEST=covered by existing tests
      NO_DOC=bugfix
      
      (cherry picked from commit 84101f60947dc9322b6bb31d2b3c536101c723c7)
    • box: fix memory leak on user DDL when access is denied · 97902542
      Nikolay Shirokovskiy authored
      Besides the mentioned #10485, we also fix a similar memleak (updating a
      user) that was introduced by the same commit 5b32bb7f ("alter: Refactor
      access_check outside constructors").
      
      Closes #10485
      Part-of #10211
      
      NO_TEST=covered by existing tests
      NO_DOC=bugfix
      
      (cherry picked from commit 84f10be00824348844c9e1997bd813b881836928)
    • test: add test_ prefix to a function name · 12246244
      Maksim Tiushev authored
      The test function `g.jit_off_on_macOS_by_default` in `gh_8252` was
      silently ignored by the luatest due to its lack of the required
      `test_` prefix. This commit renames the function to
      `test_jit_off_on_macOS_by_default`, ensuring that it is recognized
      and executed by the luatest.
      
      Closes #10210
      
      NO_DOC=codehealth
      NO_CHANGELOG=codehealth
      
      (cherry picked from commit eca4f17b3588d38a4d61a71af8371f5ed15de248)
  7. Aug 28, 2024
  8. Aug 26, 2024
    • vinyl: do not discard run on dump/compaction abort if index was dropped · 5b5a0568
      Vladimir Davydov authored
      If an index is dropped while a dump or compaction task is in progress
      we must not write any information about it to the vylog when the task
      completes otherwise there's a risk of getting a vylog recovery failure
      in case the garbage collector manages to purge the index from the vylog.
      
      We disabled logging on successful completion of a dump task quite a
      while ago, in commit 29e2931c ("vinyl: fix race between compaction
      and gc of dropped LSM"), and for compaction only recently, in commit
      ae6a02eb ("vinyl: do not log dump if index was dropped"), but the
      issue remains for a dump/compaction failure, when we log a discard
      record for a run file we failed to write. This results in errors like:
      
      ```
      ER_INVALID_VYLOG_FILE: Invalid VYLOG file: Run 6 deleted twice
      ```
      
      or
      
      ```
      ER_INVALID_VYLOG_FILE: Invalid VYLOG file: Run 5768 deleted but not registered
      ```
      
      Let's fix these issues in exactly the same way as we fixed them for
      successful dump/compaction completion - by skipping writing to vylog
      in case the index is marked as dropped.
      
      Closes #10452
      
      NO_DOC=bug fix
      
      (cherry picked from commit de59504c2bdb0369cdd27af892301f8515293fe1)
    • memtx: skip excluded tuples in index count with MVCC enabled · a9b5ae1d
      Andrey Saranchin authored
      Excluded tuples actually have their own history chains in MVCC - such
      chains consist of only one `memtx_story` containing excluded tuple
      itself. Such chains should be skipped when counting invisible tuples
      because they are not inserted to the index - that's what the commit
      does.
      
      Closes #10396
      
      NO_DOC=bugfix
      
      (cherry picked from commit 8947cb04f59423e2944d48b8a1effec2fb11b1db)
  9. Aug 23, 2024
    • memtx: do not pass pagination key to MVCC · 76bd0d99
      Andrey Saranchin authored
      Currently, when starting an iterator in memtx tree on a range request,
      we pass key from `start_data` to memtx MVCC. The problem is `start_data`
      can contain pagination key that is extracted with `cmp_def`, but MVCC
      performs all comparisons with `key_def`. Fortunately, the first parts of
      `cmp_def` are actually the `key_def` of the index, so let's crop `start_data`
      by passing `part_count` not greater than `key_def->part_count` to MVCC.
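
      The cropping boils down to a clamp. A tiny sketch, assuming a
      hypothetical one-field `struct key_def` and a hypothetical helper name:

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Hypothetical reduced key definition: only the part count matters. */
      struct key_def {
          uint32_t part_count;
      };

      /* Number of leading key parts that may be passed to MVCC. */
      static inline uint32_t
      mvcc_key_part_count(const struct key_def *key_def, uint32_t part_count)
      {
          return part_count < key_def->part_count ?
                 part_count : key_def->part_count;
      }
      ```

      The key data itself doesn't need to be copied: only the first
      `mvcc_key_part_count()` parts of `start_data` are considered.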
      
      Closes #10448
      
      NO_DOC=bugfix
      
      (cherry picked from commit 0dca0076c0fdaee142020cdeddb031bc0e2238cb)
    • vinyl: enable exact match optimization for unique secondary indexes · 93a8edbc
      Vladimir Davydov authored
      If the iterator type is EQ/REQ/LE/GE and the search key is exact (that
      is, there may be at most one tuple matching the key in the index),
      there's no need to scan disk levels if we found a statement for this
      key in the memory level. We've had this optimization for ages but it
      worked only for full keys in terms of `cmp_def` (key definition extended
      with primary key parts). Apparently, a lookup in a secondary index
      performed by the user wouldn't match these criteria unless the secondary
      index explicitly included all primary key parts.
      
      This commit improves on that. Now, we enable the optimization if the
      search key is **exact**. We consider a key **exact** if either of the
      following conditions is true:
      
       - The key statement is a tuple (tuple has all key parts).
       - The key statement is a full key in terms of `cmp_def`.
       - The key statement is a full key in terms of `key_def`, it doesn't
         contain nulls, and the index is unique. The check for nulls is
         necessary because even a unique nullable index may contain more than
         one equal key with nulls.
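
      The three conditions above can be written down as a predicate. This is
      a sketch over hypothetical flattened structures (`struct key_info`,
      `struct index_info`), not the actual Vinyl data types:

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical summary of the search key statement. */
      struct key_info {
          bool is_tuple;       /* the key statement is a tuple */
          uint32_t part_count; /* number of key parts supplied */
          bool has_null;       /* at least one supplied part is null */
      };

      /* Hypothetical summary of the index. */
      struct index_info {
          uint32_t key_def_part_count; /* parts in key_def */
          uint32_t cmp_def_part_count; /* key_def plus primary key parts */
          bool is_unique;
      };

      static inline bool
      key_is_exact(const struct key_info *key, const struct index_info *index)
      {
          /* A tuple has all key parts. */
          if (key->is_tuple)
              return true;
          /* A full key in terms of cmp_def. */
          if (key->part_count >= index->cmp_def_part_count)
              return true;
          /* A full key in terms of key_def, no nulls, unique index. */
          return index->is_unique && !key->has_null &&
                 key->part_count >= index->key_def_part_count;
      }
      ```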
      
      Note, this patch slightly refactors the optimization, adding a few
      comments and hopefully making it more understandable. In particular,
      we remove the one-result-tuple optimization for exact EQ/REQ from
      `vy_read_iterator_advance` and put it in `vy_read_iterator_evaluate_src`
      instead. This way the whole optimization resides in one place.
      
      Closes #10442
      
      NO_DOC=bug fix
      
      (cherry picked from commit 850673db5a69df2c7250d174ab15305624b2634a)
  10. Aug 22, 2024
    • test: fix flaky gh-5998-one-tx-for-ddl.test.lua · 3067139b
      Vladimir Davydov authored
      The test expects that any DDL operation aborts **all** concurrent
      transactions, but since commit f5f061d051dc ("vinyl: do not abort
      unrelated transactions on DDL") this isn't exactly true: transactions
      that haven't read/written anything aren't aborted. In the test we expect
      a transaction that hasn't done anything to be aborted by DDL, and it
      **is** aborted most of the time, but for a different reason: it reads
      data that is later modified, because `box.schema.user.create()` first
      reads `box.space._user:max()` to generate an id for the new user. Since
      it reads before writing anything, it has the "read-confirmed" isolation
      level, hence it's aborted by the transaction creating another user
      because the latter updates `box.space._user:max()`. However, sometimes
      both users are created and the test fails. This happens if the first
      transaction manages to commit before the second one reads the `_user`
      system space.
      
      To fix the test and make the transaction creating the second user fail
      due to DDL, let's add a read of the `_user` system space before putting
      it to sleep. Actually, this even makes the test closer to the "original
      test from #5998".
      
      Closes #10444
      
      NO_DOC=test fix
      NO_CHANGELOG=test fix
      
      (cherry picked from commit 62c051e22109369f9079b5adf4de30e0c53f6ca7)
  11. Aug 21, 2024
  12. Aug 20, 2024
  13. Aug 16, 2024
    • engine: introduce stubs for checkpoint FETCH_SNAPSHOT · 23c7899e
      Nikita Zheleztsov authored
      This commit introduces engine stubs that enable a new method
      of fetching snapshots for anonymous replicas. Instead of using
      the traditional read-view join approach, this update allows
      file snapshot fetching. Note that file snapshot fetching
      is only available in Tarantool EE.
      
      Checkpoint fetching is done via IPROTO_IS_CHECKPOINT_JOIN,
      IPROTO_CHECKPOINT_VCLOCK and IPROTO_CHECKPOINT_LSN fields.
      
      If IPROTO_IS_CHECKPOINT_JOIN is set to true, the join is done from
      files (.snap for memtx, .run for vinyl); if false, from a read view.

      Checkpoint join allows the client to continue from the place where it
      stopped in case of a snapshot fetching error, which avoids
      rebootstrapping an anonymous client. This is done by specifying
      CHECKPOINT_VCLOCK, which says from which file the server should continue
      the join; the client gets the vclock at the beginning of the join.
      Specifying CHECKPOINT_LSN allows continuing from some position in the
      checkpoint: the server sends all data >= CHECKPOINT_LSN.

      If CHECKPOINT_VCLOCK is not specified, fetching is done from the latest
      available checkpoint. If CHECKPOINT_LSN is not specified, fetching starts
      from the beginning of the snap. So, specifying only IS_CHECKPOINT_JOIN
      triggers fetching the latest checkpoint from files.
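
      The defaulting rules can be sketched as follows. Everything here is a
      hypothetical simplification: the request is a plain struct rather than
      a decoded IPROTO message, a checkpoint is identified by a single
      signature number instead of a full vclock, and a start LSN of 0 stands
      for "from the beginning":

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical decoded FETCH_SNAPSHOT request. */
      struct fetch_request {
          bool is_checkpoint_join;
          bool has_checkpoint_vclock;
          int64_t checkpoint_signature; /* stands in for CHECKPOINT_VCLOCK */
          bool has_checkpoint_lsn;
          int64_t checkpoint_lsn;
      };

      enum join_source { JOIN_FROM_READ_VIEW, JOIN_FROM_FILES };

      struct join_plan {
          enum join_source source;
          int64_t checkpoint_signature; /* -1 if joining from a read view */
          int64_t start_lsn;            /* 0 means "from the beginning" */
      };

      static inline struct join_plan
      join_plan_make(const struct fetch_request *req, int64_t latest_signature)
      {
          struct join_plan plan = { JOIN_FROM_READ_VIEW, -1, 0 };
          if (!req->is_checkpoint_join)
              return plan;
          plan.source = JOIN_FROM_FILES;
          /* No vclock given: use the latest available checkpoint. */
          plan.checkpoint_signature = req->has_checkpoint_vclock ?
              req->checkpoint_signature : latest_signature;
          /* No LSN given: start from the beginning of the snap. */
          plan.start_lsn = req->has_checkpoint_lsn ? req->checkpoint_lsn : 0;
          return plan;
      }
      ```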
      
      Needed for tarantool/tarantool-ee#741
      
      NO_DOC=ee
      NO_TEST=ee
      NO_CHANGELOG=ee
      
      (cherry picked from commit 2fca5c13)
    • engine: send vclock with 0th component during join · 9434531b
      Nikita Zheleztsov authored
      This commit makes the engine send the vclock without ignoring the 0th
      component during join, which is needed for checkpoint FETCH SNAPSHOT.
      
      Currently engine join functions are invoked only from
      relay_initial_join, which is done during JOIN or FETCH SNAPSHOT.
      They respond with vclock of the read view we're going to send.
      
      The following commit will introduce checkpoint FETCH SNAPSHOT,
      which responds with the vclock of the checkpoint we're going to send.
      Such a vclock may include the 0th component, and it's crucial to send it
      to the client: in case of a connection failure, the client will send us
      the same vclock and we'll have to use its signature to figure out which
      checkpoint the client wants.
      
      So, we have to send and receive 0th component of the vclock during
      FETCH_SNAPSHOT. This commit also introduces decoding vclocks without
      ignoring 0th component, as they'll be used in the following commit too.
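
      Why dropping the 0th component matters can be seen on a simplified
      model. The sketch assumes a hypothetical array-based vclock where a
      zero LSN means "component not set" and where the signature is the sum
      of all components, including the 0th one; the real encoding works on
      MsgPack and is not shown:

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      enum { VCLOCK_MAX = 32 };

      /* Hypothetical simplified vclock: lsn[i] == 0 means "not set". */
      struct vclock {
          int64_t lsn[VCLOCK_MAX];
      };

      struct vclock_entry {
          uint32_t id;
          int64_t lsn;
      };

      /* Collect the components to be sent; returns their number. */
      static inline int
      vclock_collect(const struct vclock *vclock, bool ignore0,
                     struct vclock_entry *out)
      {
          int n = 0;
          for (uint32_t id = ignore0 ? 1 : 0; id < VCLOCK_MAX; id++) {
              if (vclock->lsn[id] == 0)
                  continue;
              out[n].id = id;
              out[n].lsn = vclock->lsn[id];
              n++;
          }
          return n;
      }

      /* Signature of the collected components: the sum of their LSNs. */
      static inline int64_t
      entries_signature(const struct vclock_entry *entries, int n)
      {
          int64_t sum = 0;
          for (int i = 0; i < n; i++)
              sum += entries[i].lsn;
          return sum;
      }
      ```

      If the 0th component is ignored, the signature computed from what the
      client sends back no longer matches the checkpoint's signature.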
      
      Needed for tarantool/tarantool-ee#741
      
      NO_DOC=internal
      NO_TEST=ee
      NO_CHANGELOG=internal
      
      (cherry picked from commit 56058393)
    • xrow: rename xrow_encode_vclock · 4de3d0d6
      Nikita Zheleztsov authored
      This commit renames xrow_encode_vclock to xrow_encode_vclock_ignore0
      since the next commit will introduce encoding vclock without ignoring
      the 0th component, which is needed when sending the response to a fetch
      snapshot request.
      
      This commit also removes an internal field inside the replication_request
      structure, as the following commit will use 'vclock' for
      encoding/decoding vclock without ignoring the 0th component.
      
      Needed for tarantool/tarantool-ee#741
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit 313bd730)
    • relay: refactor relay_initial_join · 854d09ff
      Nikita Zheleztsov authored
      From now on, during initial join the memtx engine prepares the vclock,
      raft and limbo states and also sends them during memtx_engine_join.
      
      This is done in order to simplify the code of initial join, as a
      subsequent commit will introduce checkpoint initial join and we want the
      relay code to handle it the same way as read-view join, without confusing
      conditions.
      
      Needed for tarantool/tarantool-ee#741
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit 72cc2b3e)
    • engine: move raft and limbo states after system data in checkpoint · c107ba11
      Nikita Zheleztsov authored
      Before this commit, raft and limbo states were written at the end of the
      checkpoint, which made it very costly to access them.
      
      Checkpoint join needs to access limbo and raft state in order to send
      them during JOIN_META stage. We cannot use the latest states, like it's
      done for read-view snapshot fetching: states may be far ahead of the
      data, written to the checkpoint, which we're going to send.
      
      This commit moves raft and limbo states after data from the system
      spaces but before user data. We cannot put them right at the beginning
      of the snapshot, because then we would have to patch the recovery
      process, which currently strongly relies on the fact that system spaces
      are at the beginning of the snapshot (this was done in order to apply
      force recovery only to user data). If we patched the recovery process,
      old versions, where it's unpatched, wouldn't be able to recover from
      snapshots made by the newer version, and snapshot compatibility would be
      broken.
      
      The current change is not breaking, old Tarantool versions can restore
      from the snapshot made by the newer one.
      
      Needed for tarantool/tarantool-ee#741
      
      NO_DOC=internal
      NO_CHANGELOG=internal
      
      (cherry picked from commit 3da31b83)