  1. Dec 11, 2024
    • vinyl: fix cache invalidation on rollback of DELETE statement · 76345442
      Vladimir Davydov authored and Dmitry Ivanov committed
      Once a statement is prepared to be committed to WAL, it becomes visible
      (in the 'read-committed' isolation level) so it can be added to the
      tuple cache. That's why if the statement is rolled back due to a WAL
      error, we have to invalidate the cache. The problem is that the function
      invalidating the cache (`vy_cache_on_write`) ignores the statement if
      it's a DELETE, judging that "there was nothing and there is nothing
      now". This reasoning doesn't hold for a rollback. Fix it.
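      Below is a rough sketch of the scenario described above. It assumes a
      debug build where the `ERRINJ_WAL_DELAY` and `ERRINJ_WAL_WRITE_DISK`
      error injections are available, and it is illustrative rather than a
      verified reproducer.
      
      ```lua
      local fiber = require('fiber')
      local s = box.schema.space.create('test', {engine = 'vinyl'})
      s:create_index('pk')
      s:replace({1})
      box.snapshot()
      -- Keep the DELETE in the prepared (not yet written to WAL) state.
      box.error.injection.set('ERRINJ_WAL_DELAY', true)
      local f = fiber.new(function() s:delete({1}) end)
      f:set_joinable(true)
      fiber.sleep(0)     -- let the DELETE reach the delayed WAL write
      s:select({1})      -- a read-committed read may populate the tuple cache
      -- Fail the pending WAL write so that the prepared DELETE is rolled back.
      box.error.injection.set('ERRINJ_WAL_WRITE_DISK', true)
      box.error.injection.set('ERRINJ_WAL_DELAY', false)
      f:join()
      box.error.injection.set('ERRINJ_WAL_WRITE_DISK', false)
      s:select({1})      -- must return {1}; a stale cache chain could hide it
      ```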
      
      Closes #10879
      
      NO_DOC=bug fix
      
      (cherry picked from commit d64e29da2c323a4b4fcc7cf9fddb0300d5dd081f)
    • vinyl: fix handling of duplicate multikey entries in transaction write set · 3d584a69
      Vladimir Davydov authored and Dmitry Ivanov committed
      A multikey index stores a tuple once per each entry of the indexed
      array field, excluding duplicates. For example, if the array field
      equals {1, 3, 2, 3}, the tuple will be stored three times. Currently,
      when a tuple with duplicate multikey entries is inserted into a
      transaction write set, duplicates are overwritten as if they belonged
      to different statements. Actually, this is pointless: we could just as
      well skip them without trying to add them to the write set. Besides,
      this may break the assumptions made by various optimizations, resulting in
      anomalies. Consider the following example:
      
      ```lua
      local s = box.schema.space.create('test', {engine = 'vinyl'})
      s:create_index('primary')
      s:create_index('secondary', {parts = {{'[2][*]', 'unsigned'}}})
      s:replace({1, {10, 10}})
      s:update({1}, {{'=', 2, {10}}})
      ```
      
      It will insert the following entries to the transaction write set
      of the secondary index:
      
       1. REPLACE {10, 1} [overwritten by no.2]
       2. REPLACE {10, 1} [overwritten by no.3]
       3. DELETE {10, 1} [turned into no-op as REPLACE + DELETE]
       4. DELETE {10, 1} [overwritten by no.5]
       5. REPLACE {10, 1} [turned into no-op as DELETE + REPLACE]
      
      (1-2 correspond to `replace()` and 3-5 to `update()`)
      
      As a result, tuple {1, {10}} will be lost forever.
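      One way the anomaly could show up, continuing the example above (hedged,
      the exact output depends on the build):
      
      ```lua
      s.index.secondary:select({10})  -- expected {1, {10}}; may be empty before the fix
      s:select({1})                   -- the primary index still returns {1, {10}}
      ```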
      
      Let's fix this issue by silently skipping duplicate multikey entries
      added to a transaction write set. After the fix, the example above
      will produce the following write set entries:
      
 1. REPLACE {10, 1} [overwritten by no.2]
 2. DELETE {10, 1} [turned into no-op as REPLACE + DELETE]
 3. REPLACE {10, 1} [committed]

      (1 corresponds to `replace()` and 2-3 to `update()`)
      
      Closes #10869
      Closes #10870
      
      NO_DOC=bug fix
      
      (cherry picked from commit 1869dce15d9a797391e45df75507078d91f1651e)
    • vinyl: skip invisible read sources · 8f7bae8c
      Vladimir Davydov authored and Dmitry Ivanov committed
      A Vinyl read iterator scans all read sources (memory and disk levels)
      even if it's executed in a read view from which most of the sources are
      invisible. As a result, a long running scanning request may spend most
      of the time skipping invisible statements. The situation is exacerbated
      if the instance is experiencing a heavy write load because it would pile
      up old statement versions in memory and force the iterator to skip over
      them after each disk read.
      
      Since the replica join procedure in Vinyl uses a read view iterator
      under the hood, the issue is responsible for a severe performance
      degradation of the master instance and the overall join procedure
      slowdown when a new replica is joined to an instance running under
      a heavy write load.
      
      Let's fix this issue by making a read iterator skip read sources that
      aren't visible from its read view.
      
      Closes #10846
      
      NO_DOC=bug fix
      
      (cherry picked from commit 6a214e42e707b502022622866d898123a6f177f1)
    • vinyl: fix handling of overwritten statements in transaction write set · 3344bffc
      Vladimir Davydov authored and Dmitry Ivanov committed
      Statements executed in a transaction are first inserted into the
      transaction write set and only when the transaction is committed, they
      are applied to the LSM trees that store indexed keys in memory. If the
      same key is updated more than once in the same transaction, the old
      version is marked as overwritten in the write set and not applied on
      commit.
      
      Initially, write sets of different indexes of the same space were
      independent: when a transaction was applied, we didn't have a special
      check to skip a secondary index statement if the corresponding primary
      index statement was overwritten because in this case the secondary
      index statement would have to be overwritten as well. This changed when
      deferred DELETEs were introduced in commit a6edd455 ("vinyl:
      eliminate disk read on REPLACE/DELETE"). Because of deferred DELETEs,
      a REPLACE or DELETE overwriting a REPLACE in the primary index write
      set wouldn't generate DELETEs that would overwrite the previous key
      version in write sets of the secondary indexes. If we applied such
      a statement to the secondary indexes, it'd stay there forever because,
      since there's no corresponding REPLACE in the primary index, a DELETE
      wouldn't be generated on primary index compaction. So we added a special
      instruction to skip a secondary index statement if the corresponding
      primary index statement was overwritten, see `vy_tx_prepare()`. Actually,
      this wasn't completely correct because we skipped not only secondary
      index REPLACEs but also DELETEs. Consider the following example:
      
      ```lua
      local s = box.schema.space.create('test', {engine = 'vinyl'})
      s:create_index('primary')
      s:create_index('secondary', {parts = {2, 'unsigned'}})
      
      s:replace{1, 1}
      
      box.begin()
      s:update(1, {{'=', 2, 2}})
      s:update(1, {{'=', 2, 3}})
      box.commit()
      ```
      
      UPDATEs don't defer DELETEs because, since they have to query the old
      value, they can generate DELETEs immediately so here's what we'd have
      in the transaction write set:
      
       1. REPLACE {1, 2} in 'test.primary' [overwritten by no.4]
       2. DELETE {1, 1} from 'test.secondary'
       3. REPLACE {1, 2} in 'test.secondary' [overwritten by no.5]
 4. REPLACE {1, 3} in 'test.primary'
 5. DELETE {1, 2} from 'test.secondary'
 6. REPLACE {1, 3} in 'test.secondary'
      
      Statement no.2 would be skipped and marked as overwritten because of
      the new check, resulting in {1, 1} never being deleted from the
      secondary index. Note that the issue affects spaces both with and
      without deferred DELETEs enabled.
      
      This commit fixes this issue by updating the check to only skip REPLACE
      statements. It should be safe to apply DELETEs in any case.
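      With the fix, the example above leaves the secondary index in a
      consistent state; a quick sanity check (expected results in the
      comments, hedged):

      ```lua
      s.index.secondary:select({})   -- only {1, 3} is returned
      s.index.secondary:select({1})  -- empty: the stale {1, 1} entry was deleted
      ```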
      
      There's another closely related issue that affects only spaces with
      deferred DELETEs enabled. When we generate deferred DELETEs for
      secondary indexes at transaction commit time (we can do it if we find
      the previous version in memory), we assume that there can't be a DELETE
      in a secondary index write set. This isn't true: there can be a DELETE
      generated by UPDATE or UPSERT. If there's a DELETE, we have nothing to
      do unless the DELETE was optimized out (marked as no-op).
      
      Both issues were found by `vinyl-luatest/select_consistency_test.lua`.
      
      Closes #10820
      Closes #10822
      
      NO_DOC=bug fix
      
      (cherry picked from commit 6a87c45deeb49e4e17ae2cc0eeb105cc9ee0f413)
  2. Nov 21, 2024
    • sql: do not use raw index for count · 1b12f241
      Andrey Saranchin authored
      Currently, we use the raw index for the count operation instead of
      `box_index_count`. As a result, we skip the check whether the current
      transaction can continue, and we don't begin a transaction in the engine
      when needed. So, if a count statement is the first statement in a
      transaction, the transaction won't be tracked by MVCC, since MVCC wasn't
      notified about it. The commit fixes the mistake. Also, the commit adds a
      check that the count was successful and covers it with a test.
      
      In order to backport the commit to 2.11, the space name was wrapped in
      quotes, since it is lower case and addressing such spaces in SQL without
      quotes is a Tarantool 3.0 feature. Another unsupported feature is the
      prohibition of data access in transactional triggers - it was used in a
      test case, so the test was rewritten.
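      A hedged sketch of the case this fix is about - a count statement
      opening a transaction (the space and index names are illustrative):

      ```lua
      local s = box.schema.space.create('test')
      s:create_index('pk')
      box.begin()
      -- Before the fix this bypassed box_index_count(), so MVCC was not
      -- notified that the transaction had started.
      box.execute([[SELECT COUNT(*) FROM "test";]])
      s:insert({1})
      box.commit()
      ```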
      
      Closes #10825
      
      NO_DOC=bugfix
      
      (cherry picked from commit 0656a9231149663a0f13c4be7466d4776ccb0e66)
  3. Nov 12, 2024
    • vinyl: fix duplicate multikey stmt accounting with deferred deletes · 26a3c8cf
      Vladimir Davydov authored
      `vy_mem_insert()` and `vy_mem_insert_upsert()` increment the row count
      statistic of `vy_mem` only if no statement is replaced, which is
      correct, while `vy_lsm_commit()` increments the row count of `vy_lsm`
      unconditionally. As a result, `vy_lsm` may report a non-zero statement
      count (via `index.stat()` or `index.len()`) after a dump. This may
      happen only with a non-unique multikey index, when the statement has
      duplicates in the indexed array, and only if the `deferred_deletes`
      option is enabled, because otherwise we drop duplicates when we form
      the transaction write set, see `vy_tx_set()`. With `deferred_deletes`,
      we may create a `txv` for each multikey entry at the time when we
      prepare to commit the transaction, see `vy_tx_handle_deferred_delete()`.
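      A hedged sketch of the configuration in which the miscount could show up
      (the `defer_deletes` space option is assumed to be available; space and
      index names and the data are illustrative; the check simply asserts that
      the two statistics agree):

      ```lua
      local s = box.schema.space.create('test',
                                        {engine = 'vinyl', defer_deletes = true})
      s:create_index('pk')
      s:create_index('sk', {parts = {{'[2][*]', 'unsigned'}}, unique = false})
      s:replace({1, {10, 10}})   -- duplicate multikey entries
      s:replace({1, {20}})       -- old entries are removed via deferred DELETEs
      box.snapshot()             -- trigger a dump
      assert(s.index.sk:len() == s.index.sk:count())
      ```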
      
      Another problem is that `vy_mem_rollback_stmt()` always decrements
      the row count, even if it didn't find the rolled back statement in
      the tree. As a result, if the transaction with duplicate multikey
      entries is rolled back on WAL error, we'll decrement the row count
      of `vy_mem` more times than necessary.
      
      To fix this issue, let's make the `vy_mem` methods update the in-memory
      statistic of `vy_lsm`. This way they should always stay in-sync. Also,
      we make `vy_mem_rollback_stmt()` skip updating the statistics in case
      the rolled back statement isn't present in the tree.
      
      This issue results in `vinyl-luatest/select_consistency_test.lua`
      flakiness when checking `index.len()` after compaction. Let's make
      the test more thorough and also check that `index.len()` equals
      `index.count()`.
      
      Closes #10751
      Part of #10752
      
      NO_DOC=bug fix
      
      (cherry picked from commit e8810c555d4e6ba56e6c798e04216aa11efb5304)
  4. Nov 07, 2024
    • upgrade: fix upgrading from schema 1.6.9 · 4e4d4bc1
      Nikita Zheleztsov authored
      This commit fixes some cases of upgrading schema from 1.6.9:
      
      1. Fix updating the empty password for users. In 1.6, credentials were
         an array in _user; in 1.7.5 they became a map.

      2. Automatically update the format of user spaces. The format of system
         spaces was properly fixed during the upgrade to 1.7.5. However,
         commit 519bc82e ("Parse and validate space formats") introduced
         strict checking of the format field in 1.7.6. So, the format of user
         spaces should also be fixed.
      
      Back in 1.6 days, it was allowed to write anything in space format.
      This commit only fixes valid uses of format:
          {name = 'a', type = 'number'}
          {'a', type = 'number'}
          {'a', 'num'}
          {'a'}
      
      Invalid uses of format (e.g. {{}} or {{5, 'number'}}) will cause an
      error anyway. The user has to fix the format on the old version and only
      after that start the new one.
      
      This commit also introduces a test which checks that we can properly
      upgrade from 1.6.9 to the latest versions, at least in basic cases.
      
      Closes #10180
      
      NO_DOC=bugfix
      
      (cherry picked from commit f69e2ae488b3620e31f1a599d8fb78a66917dbfd)
  5. Nov 01, 2024
    • memtx: fix use-after-free on background index build · a4456c10
      Andrey Saranchin authored
      When building an index in background, we create on_rollback triggers for
      tuples inserted concurrently. The problem here is on_rollback trigger
      has independent from `index` and `memtx_ddl_state` lifetime - it can be
      called after the index was build (and `memtx_ddl_state` is destroyed)
      and even after the index was altered. So, in order to avoid
      use-after-free in on_rollback trigger, let's drop all on_rollback
      triggers when the DDL is over. It's OK because all owners of triggers
      are already prepared, hence, in WAL or replication queue (since we
      build indexes in background only without MVCC so the transactions cannot
      yield), so if they are rolled back, the same will happen to the DDL.
      
      In order to delete the on_rollback triggers, we should collect them into
      a list in `memtx_ddl_state`. On the other hand, when a DML statement is
      over (committed or rolled back), we should delete its trigger from the
      list to prevent a use-after-free. That's why the commit adds an
      on_commit trigger to the background build process.
      
      Closes #10620
      
      NO_DOC=bugfix
      
      (cherry picked from commit d8d82dba4c884c3a7ad825bd3452d35627c7dbf4)
  6. Oct 30, 2024
    • cmake: enable UBSan check signed-integer-overflow · 2f28137b
      Sergey Bronnikov authored
      The patch enables the UBSan check signed-integer-overflow that was
      disabled globally in commit 5115d9f3
      ("cmake: split UB sanitations into separate flags.") and disables
      it inline for several functions.
      
      See also #10703
      See also #10704
      Closes #10228
      
      NO_CHANGELOG=codehealth
      NO_DOC=codehealth
      NO_TEST=codehealth
      
      (cherry picked from commit 60ba7fb4c0038d9d17387f7ce9755eb587ea1da4)
  7. Oct 28, 2024
    • memtx: clarify tuples against given index in memtx_tx_snapshot_cleaner · c5b11266
      Andrey Saranchin authored
      Currently, we create a `memtx_tx_snapshot_cleaner` for each index in a
      read view. However, for some reason we clarify all tuples against the
      primary index in all cleaners. As a result, secondary indexes work
      incorrectly in a read view when MVCC is enabled: we may query a tuple by
      one key, but a tuple with another key will be returned, because the
      tuple is clarified against the primary index and respects its order -
      that's wrong because every index has its own order. Let's clarify tuples
      against the given index to fix this mistake.
      
      Community Edition is not affected at all since it uses read view only
      for making a snapshot - we use only primary indexes there.
      
      Part of tarantool/tarantool-ee#939
      
      NO_TEST=in EE
      NO_CHANGELOG=in EE
      NO_DOC=bugfix
      
      (cherry picked from commit 835fadd)
  8. Oct 23, 2024
    • box: build fix · 312dbf07
      Nikolay Shirokovskiy authored
      I got a compile error for a release build with gcc version 14.2.1 20240910.
      
      ```
      In function ‘char* mp_store_double(char*, double)’,
          inlined from ‘char* mp_encode_double(char*, double)’ at /home/shiny/dev/tarantool-ee/tarantool/src/lib/msgpuck/msgpuck.h:2409:24,
          inlined from ‘uint32_t tuple_hash_field(uint32_t*, uint32_t*, const char**, field_type, coll*)’ at /home/shiny/dev/tarantool-ee/tarantool/src/box/tuple_hash.cc:317:46:
      /home/shiny/dev/tarantool-ee/tarantool/src/lib/msgpuck/msgpuck.h:340:16: error: ‘value’ may be used uninitialized [-Werror=maybe-uninitialized]
        340 |         cast.d = val;
            |         ~~~~~~~^~~~~
      /home/shiny/dev/tarantool-ee/tarantool/src/box/tuple_hash.cc: In function ‘uint32_t tuple_hash_field(uint32_t*, uint32_t*, const char**, field_type, coll*)’:
      /home/shiny/dev/tarantool-ee/tarantool/src/box/tuple_hash.cc:311:24: note: ‘value’ was declared here
        311 |                 double value;
            |
      ```
      
      NO_TEST=build fix
      NO_CHANGELOG=build fix
      NO_DOC=build fix
      
      (cherry picked from commit 1129c758d0e3bd86eec89e5229eac3f99155d8ac)
  9. Oct 18, 2024
    • memtx: always read prepared tuples of system spaces · 7f0b2bee
      Andrey Saranchin authored
      Since we often look up spaces, users, funcs and so on in internal caches
      that have the `read-committed` isolation level (prepared tuples are
      seen), let's always allow reading prepared tuples of system spaces.

      Another advantage of this approach is that we never handle MVCC when
      working with system spaces, so after the commit they will behave in the
      same way - prepared tuples will be seen. The only difference is that
      readers of prepared rows will be aborted if the row is rolled back.

      By the way, the inconsistency between internal caches and system spaces
      could lead to a crash in some sophisticated scenarios - the commit fixes
      this problem as well because system spaces and internal caches are now
      synchronized.
      
      Closes #10262
      Closes tarantool/security#131
      
      NO_DOC=bugfix
      
      (cherry picked from commit b33f17b25de6bcbe3ebc236250976e4a0250e75e)
    • alter: wait for previous alters to commit on DDL · ea1c829f
      Andrey Saranchin authored
      Yielding DDL operations acquire the DDL lock so that the space cannot be
      modified under their feet. However, there is a case when it actually can
      be: if a yielding DDL starts while another DDL is being committed and the
      latter gets rolled back due to a WAL error, the `struct space` created by
      the rolled back DDL is deleted - and it is the space being altered by the
      yielding DDL. In order to fix this problem, let's simply wait for all
      previous alters to be committed.
      
      We could use `wal_sync` to wait for all previous transactions to be
      committed, but it is more complicated - we would need to use `wal_sync`
      for a single instance and `txn_limbo_wait_last_txn` when the limbo queue
      has an owner. Such an approach has more pitfalls and requires more tests
      to cover all cases. When relying on `struct alter_space` directly, all
      situations are handled with the same logic.
      
      Alternative solutions that we have tried:
      1. Throw an error when the user tries to alter a space while there is
         another uncommitted alter. Such an approach breaks the applier since
         it applies rows asynchronously, and making the applier execute
         operations synchronously breaks it even harder.
      2. Do not use the space in the `build_index` and `check_format` methods.
         In this case, there is another problem: rollback order. We would have
         to roll back the previous alters first, and the in-progress one can
         be rolled back only after it's over. This breaks a fundamental memtx
         invariant: rollback order must be the reverse of replace order. We
         could try to use `before_replace` triggers for alter, but the patch
         would be bulky.
      
      Closes #10235
      
      NO_DOC=bugfix
      
      (cherry picked from commit fee8c5dd6b16471739ed8512ba4137ff2e7274aa)
  10. Oct 16, 2024
    • box: fix SIGSEGV on unaligned access to `struct applier` · 8a1f72b6
      Ilya Verbin authored
      All structures with a non-default alignment (set by `alignas()`) must be
      allocated by `aligned_alloc()`, otherwise an access to such a structure
      member will crash, e.g. if compiled with AVX-512 support.
      
      See also commit a60ec82d4f07 ("box: fix SIGSEGV on unaligned access to a
      struct with extended alignment").
      
      Closes #10699
      
      NO_DOC=bugfix
      NO_CHANGELOG=minor
      NO_TEST=tested by debug_asan_clang workflow
      
      (cherry picked from commit bf091358806ed17bf44efd2cf382a43c0ba49fe0)
  11. Oct 15, 2024
    • say: fix NULL pointer dereference in log_syslog_init · f67e047a
      Nikolay Shirokovskiy authored
      If opts.identity is NULL and strdup() fails, we dereference a NULL
      pointer when reporting the error. Let's just panic if strdup() fails.
      While at it, replace another strdup() with xstrdup() in this function.
      Our current approach is to panic on runtime OOM.
      
      Closes tarantool/security#128
      
      NO_TEST=issue is not possible after the fix
      NO_CHANGELOG=not reproducible
      NO_DOC=bugfix
      
      (cherry picked from commit 47b72f44986797466b95b9431a381dbef7dd64fd)
  12. Oct 14, 2024
    • box: fix UBSan error regarding misaligned store in field_map.c · d835c495
      Ilya Verbin authored
      The type cast is unnecessary and causes false-positive errors:
      
      NO_WRAP
      ```
      ./src/box/field_map.c:110:10: runtime error: store to misaligned address 0x507000071082 for type 'uint32_t *' (aka 'unsigned int *'), which requires 4 byte alignment
      0x507000071082: note: pointer points here
       01 00  00 00 be be be be f0 ff  ff ff 02 00 00 00 be be  be be be be be be 00 00  00 00 00 00 00 00
                    ^
      SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ./src/box/field_map.c:110:10
      ```
      NO_WRAP
      
      Closes #10631
      
      NO_DOC=bugfix
      NO_CHANGELOG=minor
      NO_TEST=tested by debug_asan_clang workflow
      
      (cherry picked from commit 5ddbd85cc377a29dc27d01ad06acdc6acc24cc5b)
    • small: bump version · ada3dddf
      Ilya Verbin authored
      New commits:
      * mempool: fix UBSan errors regarding misaligned stores
      
      NO_DOC=submodule bump
      NO_TEST=submodule bump
      NO_CHANGELOG=submodule bump
      
      (cherry picked from commit 9dd56f49be85dc8a1fe874629711a828835f740c)
    • box: fix SIGSEGV on unaligned access to a struct with extended alignment · 9c36990e
      Ilya Verbin authored
      All structures with a non-default alignment (set by `alignas()`) must be
      allocated by `aligned_alloc()`, otherwise an access to such a structure
      member will crash, e.g. if compiled with AVX-512 support.
      
      Closes #10215
      Part of #10631
      
      NO_DOC=bugfix
      NO_TEST=tested by debug_asan_clang workflow
      NO_CHANGELOG=fix is actually not user-visible, because tarantool still
                   doesn't work with enabled AVX-512 (#10671)
      
      (cherry picked from commit a60ec82d4f07720148b0724e5feff31f76291b56)
    • Revert "hotfix: change aligned_alloc to posix_memalign" · 10aecd64
      Ilya Verbin authored
      This reverts commit 3c25c667.
      
      `aligned_alloc()` is supported by macOS since 10.15.
      I believe that we do not support older versions now.
      
      NO_DOC=internal
      NO_TEST=internal
      NO_CHANGELOG=internal
      
      (cherry picked from commit 2f4594f748cff99d15f8f6d603797a308793de86)
  13. Oct 08, 2024
    • box: log error that caused initial checkpoint failure · 607ff4cd
      Vladimir Davydov authored
      Currently, we just panic without providing any additional information
      if we failed to create the initial checkpoint on bootstrap. This
      complicates troubleshooting. Let's replace `panic()` with `say_error()`
      and raise the exception that caused the failure. The exception will be
      caught by `box_cfg()`, which will log it and then panic.
      
      NO_DOC=error logging
      NO_TEST=error logging
      NO_CHANGELOG=error logging
      
      (cherry picked from commit e1b5114d99ed2f224e9e9a17bf29882e50be3653)
  14. Oct 07, 2024
    • upgrade: introduce 2.11.5 schema version · 46cac24c
      Nikita Zheleztsov authored
      We decided to introduce a new schema version, which does nothing, in
      order to distinguish which 2.11 schema we can safely allow persistent
      names on.
      
      Follow up #10549
      
      NO_DOC=internal
      NO_CHANGELOG=internal
      NO_TEST=nothing to test
    • schema: allow _cluster update after join · 2be2e75c
      Vladislav Shpilevoy authored
      The function replica_check_id() is called on any change in
      _cluster: insert, delete, update. It was supposed to check if the
      replica ID is valid - not nil, not out of range (VCLOCK_MAX).
      
      But it was also raising an error when the ID matched this
      instance's ID unless the instance was joining. That happened even
      if a _cluster tuple was updated without changing the ID at all.
      For example, if one simply did
      `_cluster:replace(_cluster:get(box.info.id))`.
      
      It's better to do the check in the only place where the mutation can
      happen - on deletion. Since the replica ID is the primary key in
      _cluster, it can't be updated there, only inserted or deleted.
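      For reference, the no-op replace mentioned above, which used to raise an
      error and is allowed after this change:

      ```lua
      box.space._cluster:replace(box.space._cluster:get(box.info.id))
      ```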
      
      This commit is backported to 2.11, since we want to allow using
      persistent names as early as we can in order to simplify the upgrade
      process. We also bump the schema version in the following commit in
      order to distinguish this version from other 2.11.x versions, where
      persistent names don't work.
      
      Closes #10549
      
      NO_DOC=bugfix and refactoring
      NO_CHANGELOG=cannot happen without touching system spaces
      NO_TEST=too insignificant for an own test
      
      (cherry picked from commit cb8f4715)
    • httpc: replace ibuf_alloc with xibuf_alloc · 5bdda673
      Sergey Bronnikov authored
      There is no NULL check for the value returned by `ibuf_alloc`, so NULL
      will be passed to `memcpy()` if the aforementioned function returns
      NULL. The patch fixes that by replacing `ibuf_alloc` with the
      `xibuf_alloc` macro, which never returns NULL.
      
      Found by Svace.
      
      NO_CHANGELOG=codehealth
      NO_DOC=codehealth
      NO_TEST=codehealth
      
      (cherry picked from commit b4ee146fde6e418aed590ac6054cff75c2a59626)
    • limbo: speed up synchronous transaction queue processing · d615f3f7
      Astronomax authored
      This patch optimizes the process of collecting ACKs from replicas for
      synchronous transactions. Before this patch, collecting confirmations
      was slow in some cases. There was a possible situation where it was
      necessary to go through the entire limbo again every time the next ACK
      was received from the replica. This was especially noticeable in the
      case of a large number of parallel synchronous requests.
      For example, in the 1mops_write bench with parameters --fibers=6000
      --ops=1000000 --transaction=1, performance increases by 13-18 times on
      small clusters of 2-4 nodes and 2 times on large clusters of 31 nodes.
      
      Closes #9917
      
      NO_DOC=performance improvement
      NO_TEST=performance improvement
      
      (cherry picked from commit 4a866f64d64c610a3c8441835fee3d8dda5eca71)
    • vclock: introduce `vclock_nth_element` and `vclock_count_ge` · c2c87816
      Astronomax authored
      Two new vclock methods have been added: `vclock_nth_element` and
      `vclock_count_ge`.
      * `vclock_nth_element` takes n and returns whatever element would occur
      in the nth position if the vclock were sorted. This method is very useful
      for synchronous replication because it can be used to find out the lsn of
      the last confirmed transaction - it's simply the result of calling this
      method with the argument {vclock_size - replication_synchro_quorum}
      (provided that vclock_size >= replication_synchro_quorum, otherwise it is
      obvious that no transaction has yet been confirmed); see the Lua sketch
      below.
      * `vclock_count_ge` takes an lsn and returns the number of components
      whose value is greater than or equal to that lsn. This can be useful to
      understand how many replicas have already received a transaction with a
      given lsn.
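      A hedged Lua model of the two operations (the function names mirror the
      C ones, but the code below is only an illustration, not a Tarantool API):

      ```lua
      -- nth element (zero-based n) of the vclock components sorted ascending
      local function vclock_nth_element(vclock, n)
          local sorted = {}
          for _, lsn in pairs(vclock) do table.insert(sorted, lsn) end
          table.sort(sorted)
          return sorted[n + 1]
      end

      -- number of components whose value is >= lsn
      local function vclock_count_ge(vclock, lsn)
          local count = 0
          for _, v in pairs(vclock) do
              if v >= lsn then count = count + 1 end
          end
          return count
      end

      local vclock = {10, 7, 9}   -- per-replica LSNs of a synchronous transaction
      local quorum = 2
      print(vclock_nth_element(vclock, #vclock - quorum))  -- 9: last LSN confirmed by the quorum
      print(vclock_count_ge(vclock, 9))                    -- 2 replicas have LSN >= 9
      ```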
      
      Part of #9917
      
      NO_CHANGELOG=Will be added in another commit
      NO_DOC=internal
      
      (cherry picked from commit 58f3c93b660499e85f08a4f63373040bcae28732)
  15. Oct 04, 2024
    • memtx: do not pass NULL to memcpy when creating gap item in MVCC · e92f7806
      Andrey Saranchin authored
      According to the C standard, passing `NULL` to `memcpy` is UB, even if
      it copies nothing (the number of bytes to copy is 0). The commit fixes
      such a situation in memtx MVCC.
      
      Closes tarantool/security#129
      
      NO_TEST=fix UB
      NO_CHANGELOG=fix UB
      NO_DOC=fix UB
      
      (cherry picked from commit 24d38cef5adff900bea2484235762678ac1c5234)
  16. Sep 25, 2024
    • vinyl: fix crash when empty PK DDL races with DML · b4304df7
      Vladimir Davydov authored
      Vinyl doesn't support altering the primary index of a non-empty space,
      but the check forbidding this isn't entirely reliable - the DDL function
      may yield to wait for pending WAL writes to finish after ensuring that
      the space doesn't contain any tuples. If a new tuple is inserted into
      the space in the meantime, the DDL operation will proceed to rebuild
      the primary index and trigger a crash because the code is written on
      the assumption that it's rebuilding a secondary index:
      
      ```
      ./src/box/vinyl.c:1572: vy_check_is_unique_secondary_one: Assertion `lsm->index_id > 0' failed.
      ```
      
      Let's fix this by moving the check after syncing on WAL.
      
      Closes #10603
      
      NO_DOC=bug fix
      
      (cherry picked from commit 955537b57c2aade58b7ca42501a9bbe50dd91f26)
  17. Sep 23, 2024
    • box: check fiber slice in generic implementation of index count · 55fffaed
      Vladimir Davydov authored
      `index.count()` may hang for too long in Vinyl if a substantial
      consecutive hunk of the space is stored in memory. Let's add
      a fiber slice check to it to prevent it from blocking the TX thread
      for too long.
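      A hedged illustration of how the slice bounds such a call, assuming the
      `fiber.set_max_slice()` API is available in this release (the space and
      index names are made up; the exact error message may differ):

      ```lua
      local fiber = require('fiber')
      fiber.set_max_slice({warn = 0.5, err = 1.0})  -- seconds
      -- On a very large space this may now fail with a fiber-slice-exceeded
      -- error instead of blocking the TX thread indefinitely.
      local ok, err = pcall(function()
          return box.space.test.index.pk:count()
      end)
      ```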
      
      Closes #10553
      
      NO_DOC=bug fix
      
      (cherry picked from commit e19bca5a74e83d2521fe770f2a93c3e3d3ad4801)
    • vinyl: fix cache corruption on skipping unconfirmed tuple · eef3d7d2
      Vladimir Davydov authored
      The tuple cache doesn't store historical data. It stores only the newest
      tuple versions, including prepared but not yet confirmed (committed but
      not written to WAL) tuples. This means that transactions sent to a read
      view shouldn't add any new chains to the cache because such a chain may
      bypass a tuple invisible from the read view.
      
      A transaction may be sent to a read view in two cases:
      
       1. If some other transactions updates data read by it.
       2. If the transaction is operating in the 'read-confirmed' isolation
          mode and skips an unconfirmed tuple while scanning the memory level.
          This was added in commit 588170a7 ("vinyl: implement transaction
          isolation levels").
      
      The second point should be checked by the read iterator itself, and it
      indeed is for the standard case when we scan the memory level before
      reading the disk. However, there's a second case: if some other tuples
      are inserted into the memory level while the read iterator is waiting
      for a disk read to complete, it rescans the memory level and may skip
      a new unconfirmed tuple that wasn't there the first time we scanned
      the memory level. Currently, if this happens, the transaction won't send
      itself to a read view and may corrupt the cache by inserting a chain
      that skips over the unconfirmed tuple. Fix this by adding the missing
      check.
      
      While we are at it, let's simplify the code a bit by moving the check
      inside `vy_read_iterator_scan_mem()`. It's okay because sending to
      a read view a transaction that's already in the read view is handled
      correctly by `vy_tx_send_to_read_view()`.
      
      Closes #10558
      
      NO_DOC=bug fix
      
      (cherry picked from commit a3feee322e76a1e10ab874e63f17f97b6457b59d)
  18. Sep 20, 2024
    • vinyl: fix compaction crash on disk read error · 45138738
      Vladimir Davydov authored
      `vy_slice_stream_next()` clears the return value on failure. This isn't
      expected by `vy_write_iterator_merge_step()`, which doesn't update
      the source position in the `vy_write_iterator::src_heap` in this case.
      As a result, an attempt to remove `end_of_key_src` from the heap in
      `vy_write_iterator_build_history()` may crash as follows:
      
      ```
       # 1  0x572a2ecc21a6 in crash_collect+256
       # 2  0x572a2ecc2be2 in crash_signal_cb+100
       # 3  0x7cfef6645320 in __sigaction+80
       # 4  0x572a2eab16de in tuple_format+16
       # 5  0x572a2eab1a25 in vy_stmt_is_key+24
       # 6  0x572a2eab1be8 in vy_stmt_compare+89
       # 7  0x572a2eab1e37 in vy_entry_compare+74
       # 8  0x572a2eab2913 in heap_less+88
       # 9  0x572a2eab21e3 in vy_source_heap_sift_up+255
       # 10 0x572a2eab20b9 in vy_source_heap_update_node+54
       # 11 0x572a2eab25c1 in vy_source_heap_delete+249
       # 12 0x572a2eab4134 in vy_write_iterator_build_history+1497
       # 13 0x572a2eab4995 in vy_write_iterator_build_read_views+193
       # 14 0x572a2eab4ce6 in vy_write_iterator_next+380
       # 15 0x572a2eadd20b in vy_task_write_run+1132
       # 16 0x572a2eade6cf in vy_task_compaction_execute+124
       # 17 0x572a2eadfa8d in vy_task_f+445
       # 18 0x572a2e9ea143 in fiber_cxx_invoke(int (*)(__va_list_tag*), __va_list_tag*)+34
       # 19 0x572a2eccee7c in fiber_loop+219
       # 20 0x572a2f0aef18 in coro_init+120
      ```
      
      Normally, a function shouldn't update the return value on failure so
      let's fix `vy_slice_stream_next()`.
      
      Closes #10555
      
      NO_DOC=bug fix
      
      (cherry picked from commit f1144c533b6c52c324ffe1cc4fcaeab1f2f6cd9f)
    • vinyl: use ERROR_INJECT_COUNTDOWN where appropriate · 69ea9ba1
      Vladimir Davydov authored
      ERRINJ_VY_RUN_OPEN and ERRINJ_VY_STMT_ALLOC are countdown injections.
      Let's name them appropriately and use the helper macro. Also, let's
      raise the ER_INJECTION error code for them to make it clear that they
      aren't real errors.
      
      NO_DOC=internal
      NO_CHANGELOG=internal
      
      (cherry picked from commit 21fe14582c948f560720fa285ed3e21483d11dc2)
    • errinj: fix ERROR_INJECT_COUNTDOWN · 70a3a976
      Vladimir Davydov authored
      We shouldn't decrement the counter if it's negative - otherwise it may
      wrap around and mistakenly trigger the error injection.
      
      NO_DOC=internal
      NO_TEST=internal
      NO_CHANGELOG=internal
      
      (cherry picked from commit d11d4576b0d0cbfc03dc1a3570573b7bbf1126b5)
  19. Sep 18, 2024
    • datetime: introduce tz in datetime.parse() · c42b850d
      Sergey Bronnikov authored
      There is a `tz` option in `datetime.parse()`; it was added in
      commit 3c403661 ("datetime, lua: date parsing functions").
      The option is not documented, and that commit message says that
      the `tz` option is "Not yet implemented in this commit".

      The patch adds tests and a doc request for this option.
      The behaviour of the `tz` option is the same as with the
      `tzoffset` option:
      - if the timezone was not set in the parsed string, it is set to
        the value specified by `tz`
      - if the timezone was set in the parsed string, the `tz` option is
        ignored
      
      ```
      tarantool> date.parse("1970-01-01T01:00:00 MSK", { tz = 'Europe/Paris' })
      ---
      - 1970-01-01T01:00:00 MSK
      - 23
      ...
      
      tarantool> date.parse("1970-01-01T01:00:00", { tz = 'Europe/Paris' })
      ---
      - 1970-01-01T01:00:00 Europe/Paris
      - 19
      ...
      ```
      
      Follows up #6731
      Fixes #10420
      
      @TarantoolBot document
      Title: Introduce option `tz` in `datetime.parse()`
      
      The `tz` option is added to the `datetime.parse()` function.
      The option sets the timezone to the passed value if it was not set in
      the parsed string.
      
      (cherry picked from commit c6bab23a6dc4f819167cbc78eb93859847a389ea)
    • datetime: use tzoffset in a parse() with custom format · 45d40d13
      Sergey Bronnikov authored
      The patch fixes the behaviour where `datetime.parse()` ignores the
      `tzoffset` option if a custom format is used.
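      A hedged example of the fixed behaviour (the format string and offset
      are illustrative):

      ```lua
      local date = require('datetime')
      -- Before the fix the tzoffset option was silently ignored when a custom
      -- format was given; now the parsed value carries the +0300 offset.
      date.parse('2024-09-18 12:00:00',
                 {format = '%Y-%m-%d %H:%M:%S', tzoffset = 180})
      ```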
      
      Fixes #8333
      Relates to #10420
      
      NO_DOC=bugfix
      
      (cherry picked from commit 04811e032f29afe0fa6206ef2c7a0f8434861830)
    • refactoring: datetime_parse_full drop parse offset · 45c0f8c1
      Sergey Bronnikov authored
      The patch refactors the function `datetime_parse_full()`: overriding the
      timezone is not a part of datetime string parsing, so this part was
      removed.
      
      Needed for #8333
      
      NO_CHANGELOG=refactoring
      NO_DOC=refactoring
      NO_TEST=refactoring
      
      (cherry picked from commit d7d3063fbd5a74563fde539f2c74852a1e04c1cd)
  20. Sep 17, 2024
    • sql: forbid non-integer values in datetime · e4dc4d11
      Sergey Bronnikov authored
      The patch forbids using non-integer values in datetime's `:set()`
      for `year`, `month`, `day`, `hour`, `min`, `sec`, `usec`, `msec`,
      `nsec` and `tzoffset` keys. `timestamp` can be a double, but only
      integer values are allowed in `timestamp` if `nsec`, `usec`, or `msec`
      is provided. An error is raised when a value of an incorrect type is
      passed.
      
      Fixes #10391
      
      @TarantoolBot document
      Title: Update types of datetime values passed to SQL's `CAST();`
      
      `CAST` can accept only integer values for `year`, `month`, `day`,
      `hour`, `min`, `sec`, `usec`, `msec`, `nsec` and `tzoffset`.
      `timestamp` can be integer or double.
      
      (cherry picked from commit f57be571b5e4cc8d57c7e97c15b52df37ad6f12c)
    • datetime: forbid non-integers in :set() and parse() · 72c3376f
      Sergey Bronnikov authored
      The patch forbids using non-integer values in datetime's `:set()`
      for `year`, `month`, `day`, `hour`, `min`, `sec`, `usec`, `msec`
      and `nsec` keys. The type of `tzoffset` can be integer or string, and
      `timestamp` can be a double, but only integer values are allowed in
      `timestamp` if `nsec`, `usec`, or `msec` is provided. An error is raised
      when a value of an incorrect type is passed.
      
      Part of #10391
      
      @TarantoolBot document
      Title: Update types of values passed to `:set()` and parse()
      
      `:set()` can accept only integer values for `year`, `month`,
      `day`, `hour`, `min`, `sec`, `usec`, `msec` and `nsec`.
      The type of `tzoffset` can be integer or string, `timestamp` can
      be integer or double. `tzoffset` passed to `datetime.parse()`
      can be integer or string.
      
      (cherry picked from commit 6e77907baa3cbeebc79241cc0046a539a09e3f2c)
    • datetime: forbid using non-integer values in .new() · 29be5eb3
      Sergey Bronnikov authored
      The patch forbids using non-integer values in the datetime constructor
      `datetime.new()` for the `year`, `month`, `day`, `hour`, `min`, `sec`,
      `usec`, `msec` and `nsec` keys. The type of `tzoffset` can be integer or
      string, and `timestamp` can be a double, but only integer values are
      allowed in `timestamp` if `nsec`, `usec`, or `msec` is provided. An
      error is raised when a value of an incorrect type is passed.
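      A hedged illustration of the new validation (exact error messages may
      differ):

      ```lua
      local datetime = require('datetime')
      datetime.new({year = 2024, month = 1, day = 1})  -- OK: integer fields
      datetime.new({timestamp = 1700000000.5})         -- OK: timestamp may be a double
      local ok = pcall(datetime.new, {sec = 1.5})      -- fails: 'sec' must be an integer
      ```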
      
      Part of #10391
      
      @TarantoolBot document
      Title: Update types of values passed to `datetime.new()`
      
      `datetime.new()` can accept only integer values for `year`,
      `month`, `day`, `hour`, `min`, `sec`, `usec`, `msec` and `nsec`.
      The type of `tzoffset` can be integer or string, `timestamp` can
      be integer or double.
      
      (cherry picked from commit cc9010a2b11477b2f16f2b2e168a6b9dcca2fb20)
    • memtx: reference tuples when rolling back statements without story · 386fd018
      Andrey Saranchin authored
      During the latest rework of DDL in MVCC, the new helper
      `memtx_tx_history_rollback_empty_stmt` was introduced - it is used for
      statements without stories (such statements can appear, for example,
      when DDL removes all stories). By mistake, we forgot to unreference the
      new tuple and reference the old one there - the commit fixes this
      embarrassing mistake.
      
      Follow-up #10146
      
      NO_CHANGELOG=bugfix for unreleased patch
      NO_DOC=bugfix
      
      (cherry picked from commit 32797f703079664abfe9b7e6112aee1039a52337)
    • test: introduce stress test for memtx MVCC DDL · ffee0d7d
      Andrey Saranchin authored
      DDL with memtx MVCC enabled used to crash a lot until the previous
      commits fixed it. To make sure it's stable now, the commit introduces a
      stress test that executes various DDL and DML operations concurrently.
      The test doesn't check the serializability of transactions; the only
      goal is to make sure that Tarantool does not crash.
      
      Along the way, the commit introduces a new error injection that disables
      yielding while building an index. The problem is that any DML concurrent
      with a DDL that is building an index will abort the DDL if MVCC is
      enabled, so the error injection is needed to make the index build
      succeed during the stress test.
      
      Follow-up #10146
      
      NO_CHANGELOG=test
      NO_DOC=test
      
      (cherry picked from commit 260b10bc3616d9eeeea4f245dc523cab5494f711)