  1. Jul 23, 2024
    • test: adapt tests to the new luatest version · cfd4bf46
      Oleg Chaplashkin authored
      With the new version of Luatest you have to be careful with the server
      log file. We used to get it very simply:
      
          box.cfg.log
      
      Now it is more correct to use the following approach:
      
          rawget(_G, 'box_cfg_log_file') or box.cfg.log
      
      Closes tarantool/test-run#439
      
      NO_DOC=test
      NO_TEST=test
      NO_CHANGELOG=test
    • test: bump test-run to new version · 59ba2131
      Nikita Zheleztsov authored
      Bump test-run to new version with the following improvements:
      
      - luatest: fix ability to run a test several times [1]
      - Enable luatest logging [2]
      - tap13: fix parsing non-utf8 chars [3]
      
      [1] tarantool/test-run@240cdea
      [2] tarantool/test-run@b8b60b4
      [3] tarantool/test-run@7290540
      
      NO_DOC=test
      NO_TEST=test
      NO_CHANGELOG=test
    • box: stop relays when instance is deleted from the cluster · 358b68ff
      Georgiy Lebedev authored
      When an instance is deleted from the cluster, we need to stop its relays,
      otherwise it will continue sending asynchronous transactions applied on it
      to the remaining cluster members.
      
      Closes #10266
      
      NO_DOC=<bugfix>
    • box: drop the `replicaset::is_joining` field as unused · 4e5a1c79
      Georgiy Lebedev authored
      The `replicaset::is_joining` field was previously used to track the
      situation when a node joining a replicaset could get deleted from the
      `_cluster` space. However, now in the scope of #10088 we always apply the
      deletion of the replica from the `_cluster` space on the deleted replica,
      so this field has become redundant.
      
      Follows up #10088
      
      NO_CHANGELOG=<refactoring>
      NO_DOC=<refactoring>
      NO_TEST=<refactoring>
    • box: apply deletion of replica from `_cluster` on the deleted replica · 25aac111
      Georgiy Lebedev authored
      Currently the deletion of a replica from the `_cluster` space fails on the
      deleted replica with `ER_LOCAL_INSTANCE_ID_IS_READ_ONLY`, which forbids
      changing the replica identifier, stopping the corresponding applier. However,
      the deleted replica cannot ack its own deletion. In the scope of #9723, we
      are going to enable synchronous replication for most of the system spaces,
      including the `_cluster` space. There are several problems with this:
      
      1. Deleting a replica from a 2-member cluster without manually changing the
      quorum won't work: it is impossible to commit the deletion into the
      `_cluster` space with only 2 nodes, since the quorum is equal to 2, while
      the deleted node cannot ACK its own deletion.
      
      2. Deleting a replica from a 3-member cluster may fail: the quorum will be
      equal to 2, and the deleted replica cannot ACK its own deletion from the
      `_cluster` space, so if one of the remaining 2 nodes fails, reconfiguration
      will fail.
      
      Generally speaking, it will be impossible to delete a replica from the
      cluster, if a quorum, which excludes the deleted replica (which cannot
      ACK), cannot be gathered.
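
The quorum arithmetic above can be checked with a small Python model. This is only a sketch: the function names are hypothetical, the majority formula `N // 2 + 1` is an assumption chosen because it matches the quorums of 2 quoted for the 2-member and 3-member cases, and `deleted_can_ack` merely models the effect of letting the deleted replica apply its own deletion.

```python
def default_quorum(cluster_size: int) -> int:
    """Majority quorum, assumed to be N // 2 + 1."""
    return cluster_size // 2 + 1


def can_commit_deletion(cluster_size: int, failed_nodes: int = 0,
                        deleted_can_ack: bool = False) -> bool:
    """Return True if enough nodes can ACK the deletion of one replica.

    failed_nodes counts failures among the nodes other than the deleted one.
    """
    quorum = default_quorum(cluster_size)
    ackers = cluster_size - failed_nodes
    if not deleted_can_ack:
        ackers -= 1  # the replica being deleted cannot ACK its own deletion
    return ackers >= quorum


# Case 1: 2-member cluster, quorum 2, only one node can ACK.
print(can_commit_deletion(2))
# Case 2: 3-member cluster with one of the remaining nodes down.
print(can_commit_deletion(3, failed_nodes=1))
```

With `deleted_can_ack=True` both cases gather a quorum in this model, which is the intuition behind applying the deletion on the deleted replica.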
      
      To solve these problems, let's apply the deletion of the replica on the
      deleted replica and manually stop all of the appliers from the `on_commit`
      trigger, effectively stopping replication (if we are not recovering).
      This way we’ll be able to delete a node regardless of the current
      configuration. The deleted replica will need to be rebootstrapped, and if
      it is restarted before rebootstrap, it needs to be started as an anonymous
      replica.
      
      For consistency, let's forbid such a replica from subscribing to nodes that
      see this node as a previously (i.e., before deletion from the cluster)
      non-anonymous replica in their replicaset table.
      
      Closes #10088
      
      @TarantoolBot document
      Title: Deleted replica applies its own deletion from the `_cluster` space
      Product: Tarantool
      Since: 3.2
      
      The deleted replica will now apply its own deletion from the `_cluster`
      space. This makes it possible to delete a node regardless of
      the current configuration in case the `_cluster` space is synchronous. The
      deleted replica will need to be resubscribed, and if it is restarted
      before resubscription, it needs to be started as an anonymous replica.
      
      However, if a replica is down, then it will still be impossible to remove
      it from a 2-member cluster without manually lowering the
      `replication_synchro_quorum` option. In such a scenario the instance should
      not be turned off before it is dropped from the `_cluster` space when
      synchronous replication is enabled for the `_cluster` space.
    • box: refactor the applier client fiber cancelling into `applier_kill` · b3f94fc6
      Georgiy Lebedev authored
      In the scope of #10088 we are going to add another snippet to explicitly
      cancel the applier client fiber without joining it, so let's refactor the
      existing snippet into a separate `applier_kill` function.
      
      Needed for #10088
      
      NO_CHANGELOG=<refactoring>
      NO_DOC=<refactoring>
      NO_TEST=<refactoring>
    • box: move anonymous replica subscription checks to one if-else construct · bdd19a62
      Georgiy Lebedev authored
      In the scope of #10088 we are going to add another check for the case when
      we receive an anonymous subscription request from a replica that is present
      in our replicaset table, so let's move the checks related to anonymous
      replica subscription to one if-else construct.
      
      Needed for #10088
      
      NO_CHANGELOG=<refactoring>
      NO_DOC=<refactoring>
      NO_TEST=<refactoring>
    • vinyl: do not log dump if index was dropped · ae6a02eb
      Vladimir Davydov authored
      An index can be dropped while a memory dump is in progress. If the vinyl
      garbage collector happens to delete the index from the vylog by the time
      the memory dump completes, the dump will log an entry for a deleted
      index, resulting in an error next time we try to recover the vylog,
      like:
      
      ```
      ER_INVALID_VYLOG_FILE: Invalid VYLOG file: Run 2 committed after deletion
      ```
      
      or
      
      ```
      ER_INVALID_VYLOG_FILE: Invalid VYLOG file: Deleted range 9 has run slices
      ```
      
      We already fixed a similar issue with compaction in commit 29e2931c
      ("vinyl: fix race between compaction and gc of dropped LSM"). Let's fix
      this one in exactly the same way: discard the new run without logging it
      to the vylog on a memory dump completion if the index was dropped while
      the dump was in progress.
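
The fix described above can be illustrated with a minimal Python model. It is a sketch, not vinyl code: `Index`, `VyLog` and `complete_dump` are hypothetical names, and the only behaviour taken from the commit message is that a run produced for a dropped index is discarded without being logged.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    index_id: int
    is_dropped: bool = False


@dataclass
class VyLog:
    records: list = field(default_factory=list)

    def log_run(self, index_id, run_id):
        self.records.append(("create_run", index_id, run_id))


def complete_dump(index, run_id, vylog):
    """Finish a memory dump; return True if the new run was logged."""
    if index.is_dropped:
        # The index was dropped while the dump was in progress:
        # discard the run and write nothing to the log.
        return False
    vylog.log_run(index.index_id, run_id)
    return True


vylog = VyLog()
index = Index(index_id=512)
index.is_dropped = True          # dropped while the dump was running
complete_dump(index, run_id=2, vylog=vylog)
print(vylog.records)             # nothing was logged for the dropped index
```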
      
      Closes #10277
      
      NO_DOC=bug fix
    • errinj: log error injection value · 019bacbe
      Vladimir Davydov authored
      Let's log the new value when an error injection is set in order to ease
      debugging in tests.
      
      NO_DOC=logging
      NO_TEST=logging
      NO_CHANGELOG=logging
  2. Jul 22, 2024
  3. Jul 19, 2024
    • perf/lua: add context section to test output · a3ef8fb6
      Sergey Bronnikov authored
      The Google Benchmark output format contains a "context" section that
      describes the test environment.
      
      The Google Benchmark output format has been supported in Lua
      microbenchmarks since commit 3110ef9a
      ("perf: introduce benchmark.lua helper module"). However, the produced
      output contains test results only, and the "context" section is missing.
      The patch adds a "context" section with the following fields:
      date, load average, hostname, Tarantool version, build flags,
      and the name of the build target.
      
      ```
      $ tarantool uri_escape_unescape.lua --output=res.json --output_format=json
      $ jq ".context" res.json
      {
        "build_target": "Linux-x86_64-RelWithDebInfo",
        "host_name": "pony",
        "date": "2024-07-04 19:09:11",
        "tarantool_version": "3.2.0-entrypoint-114-g9e5dca29ad",
        "build_flags": " -fexceptions -funwind-tables -fasynchronous-unwind-tables -fno-common -msse2 -Wformat -Wformat-security -Werror=format-security -fstack-protector-strong -fPIC -fmacro-prefix-map=/home/sergeyb/sources/MRG/tarantool=. -std=c11 -Wall -Wextra -Wno-gnu-alignof-expression -Wno-cast-function-type -O2 -g -DNDEBUG -ggdb -O2 ",
        "load_avg": [
          "0.76",
          "0.74",
          "0.63"
        ]
      }
      ```
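
For illustration, a similar "context" object could be assembled as in the Python sketch below. This is not the code from the patch (which is a Lua helper): `build_context` is a hypothetical name, the Tarantool-specific values are passed in as plain arguments, and only the field names and the date format are taken from the sample output above.

```python
import json
import os
import socket
from datetime import datetime


def build_context(tarantool_version, build_target, build_flags):
    """Collect environment details in the shape of the sample output."""
    return {
        "date": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "host_name": socket.gethostname(),
        # 1, 5 and 15 minute load averages, formatted as strings.
        "load_avg": ["%.2f" % value for value in os.getloadavg()],
        "tarantool_version": tarantool_version,
        "build_target": build_target,
        "build_flags": build_flags,
    }


context = build_context("3.2.0-entrypoint-114-g9e5dca29ad",
                        "Linux-x86_64-RelWithDebInfo", "-O2 -g")
print(json.dumps({"context": context}, indent=2))
```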
      
      NO_CHANGELOG=perf
      NO_DOC=perf
      NO_TEST=perf
    • perf/lua: protect require of column module · 18661ed7
      Sergey Bronnikov authored
      NO_CHANGELOG=perf
      NO_DOC=perf
      NO_TEST=perf
    • ci: enable BENCH_CMD · 3951a88c
      Sergey Bronnikov authored
      The patch enables the `BENCH_CMD` environment variable introduced in
      a previous commit. `taskset` alone pins all the process threads
      to a single (random) isolated CPU; there is a ticket [1] about this
      in the Linux kernel bug tracker. The workaround is to use the
      realtime scheduler for the isolated task via `chrt` [2], e.g.:
      `taskset 0xef chrt 50`.
      
      1. https://bugzilla.kernel.org/show_bug.cgi?id=116701
      2. https://www.man7.org/linux/man-pages/man1/chrt.1.html
      
      NO_CHANGELOG=performance testing
      NO_DOC=performance testing
      NO_TEST=performance testing
    • perf: introduce BENCH_CMD environment variable · 03317b16
      Sergey Bronnikov authored
      The patch introduces BENCH_CMD, an environment variable that
      can be set at the CMake configuration stage to a string with a
      command and its arguments; this string is then used as a pre-command
      for executing performance tests. Examples of such commands are
      `taskset` [1] and `numactl` [2], or any other utilities, see [3].
      
      1. https://man7.org/linux/man-pages/man1/taskset.1.html
      2. https://man7.org/linux/man-pages/man8/numactl.8.html
      3. https://github.com/tarantool/tarantool/wiki/Benchmarking#run-benchmarks
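
The idea of a pre-command can be sketched in Python as follows. The real mechanism lives in the CMake configuration; `build_bench_command` is a hypothetical helper that only shows how a BENCH_CMD string would be split and placed in front of the benchmark command line.

```python
import shlex


def build_bench_command(bench_cmd, test_argv):
    """Prefix the benchmark argv with the BENCH_CMD pre-command, if any."""
    prefix = shlex.split(bench_cmd) if bench_cmd else []
    return prefix + list(test_argv)


argv = build_bench_command("taskset 0xef chrt 50",
                           ["tarantool", "uri_escape_unescape.lua"])
print(argv)
```

An empty or unset value leaves the benchmark command unchanged, so the variable stays optional.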
      
      NO_CHANGELOG=performance infra
      NO_DOC=performance infra
      NO_TEST=performance infra
    • perf: add a script for setting environment · f3ca5c93
      Sergey Bronnikov authored
      "Benchmarking" article [0] in Tarantool's wiki contains a lot of
      recommendations that help to set up the Linux operating system and
      avoid potential reproducibility pitfalls when executing
      performance tests in a Linux-based environment. These
      recommendations are written in plain text with examples of commands
      that could be executed manually. We want to execute benchmarks
      automatically and in continuous mode, therefore we need a way to
      set up the test environment automatically before running
      benchmarks.
      
      There are many guides with benchmarking tips, but unfortunately
      there is no script that will do these steps automatically.
      I found only temci [1] and pyperf (`pyperf system` [3]) projects.
      
      The patch adds a script for setting the environment before running
      performance tests. All settings used in the proposed script are
      described in the article [0]. Note that uncertain settings were
      not implemented.
      
      0. https://github.com/tarantool/tarantool/wiki/Benchmarking
      1. https://github.com/parttimenerd/temci
      2. https://github.com/travisdowns/uarch-bench/blob/master/uarch-bench.sh
      3. https://pyperf.readthedocs.io/en/latest/cli.html#system-cmd
      
      NO_CHANGELOG=performance
      NO_DOC=performance
      NO_TEST=performance
  4. Jul 18, 2024
    • vinyl: wake up waiters after clearing checkpoint_in_progress flag · fc3196dc
      Vladimir Davydov authored
      The function `vy_space_build_index`, which builds a new index on DDL,
      calls `vy_scheduler_dump` on completion. If there's a checkpoint in
      progress, the latter will wait on `vy_scheduler::dump_cond` until
      `vy_scheduler::checkpoint_in_progress` is cleared. The problem is
      `vy_scheduler_end_checkpoint` doesn't broadcast `dump_cond` when it
      clears the flag. Usually, everything works fine because the condition
      variable is broadcast on any dump completion, and vinyl checkpoint
      implies a dump, but under certain conditions this may lead to a fiber
      hang. Let's broadcast `dump_cond` in `vy_scheduler_end_checkpoint`
      to be on the safe side.
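
The missed wake-up can be modelled with a Python condition variable. This is a sketch under stated assumptions: Tarantool uses fibers and its own condition primitive, not threads, and the `Scheduler` class and its method names are hypothetical. Only the fields `dump_cond` and `checkpoint_in_progress` mirror names from the message above.

```python
import threading


class Scheduler:
    def __init__(self):
        self.dump_cond = threading.Condition()
        self.checkpoint_in_progress = False

    def begin_checkpoint(self):
        with self.dump_cond:
            self.checkpoint_in_progress = True

    def end_checkpoint(self):
        with self.dump_cond:
            self.checkpoint_in_progress = False
            # Wake up everyone waiting for the flag to be cleared.
            # Without this broadcast a waiter relies on some other
            # event (such as a dump completion) to wake it up.
            self.dump_cond.notify_all()

    def wait_checkpoint(self, timeout=None):
        """Block until no checkpoint is in progress; False on timeout."""
        with self.dump_cond:
            return self.dump_cond.wait_for(
                lambda: not self.checkpoint_in_progress, timeout)
```

A waiter that calls `wait_checkpoint()` while a checkpoint is running returns as soon as `end_checkpoint()` is called, because the flag change and the broadcast happen under the same lock.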
      
      While we are at it, let's also inject a dump delay to the original
      test to make it more robust.
      
      Closes #10267
      Follow-up #10234
      
      NO_DOC=bug fix
  5. Jul 17, 2024
    • iproto: introduce FETCH_SNAPSHOT_CURSOR feature · 62c49367
      Nikita Zheleztsov authored
      This commit introduces FETCH_SNAPSHOT_CURSOR feature, which is available
      only in EE. The feature is not returned in response to IPROTO_ID and is
      not shown in box.iproto.protocol_features in Community Edition. Its id
      is shown only in box.iproto.feature, which is a list of all available
      features in the current version.
      
      Needed for tarantool/tarantool-ee#741
      
      NO_CHANGELOG=minor
      
      @TarantoolBot document
      Title: Document iproto feature FETCH_SNAPSHOT_CURSOR
      
      Root document: https://www.tarantool.io/en/doc/latest/reference/reference_lua/net_box/#net-box-connect
      
      The FETCH_SNAPSHOT_CURSOR feature requires cursor FETCH_SNAPSHOT on the
      server. Its ID is IPROTO_FEATURE_FETCH_SNAPSHOT_CURSOR. It requires
      IPROTO version 8 or newer and Enterprise Edition.
    • engine: introduce stubs for checkpoint FETCH_SNAPSHOT · 2fca5c13
      Nikita Zheleztsov authored
      This commit introduces engine stubs that enable a new method
      of fetching snapshots for anonymous replicas. Instead of using
      the traditional read-view join approach, this update allows
      file snapshot fetching. Note that file snapshot fetching
      is only available in Tarantool EE.
      
      Checkpoint fetching is done via IPROTO_IS_CHECKPOINT_JOIN,
      IPROTO_CHECKPOINT_VCLOCK and IPROTO_CHECKPOINT_LSN fields.
      
      If IPROTO_IS_CHECKPOINT_JOIN is set to true, the join is done from
      files (.snap for memtx, .run for vinyl); if false, from a read view.
      
      Checkpoint join allows the client to continue from the place where it
      stopped in case of a snapshot fetching error. This avoids a
      rebootstrap of an anonymous client. It is done by specifying
      CHECKPOINT_VCLOCK, which says from which file the server should continue
      the join; the client gets the vclock at the beginning of the join.
      Specifying CHECKPOINT_LSN allows continuing from some position in the
      checkpoint. The server sends all data >= CHECKPOINT_LSN.
      
      If CHECKPOINT_VCLOCK is not specified, fetching is done from the latest
      available checkpoint. If CHECKPOINT_LSN is not specified - start from
      the beginning of the snap. So, specifying only IS_CHECKPOINT_JOIN
      triggers fetching the latest checkpoint from files.
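
The defaults described above can be summarised in a short Python sketch. The function names and the returned dictionary are hypothetical; only the three request fields and their fallback behaviour come from the commit message, and the actual file snapshot fetching exists only in Tarantool EE.

```python
def plan_checkpoint_join(is_checkpoint_join=False,
                         checkpoint_vclock=None,
                         checkpoint_lsn=None):
    """Decide where the join data comes from (illustrative model only)."""
    if not is_checkpoint_join:
        return {"source": "read_view"}
    return {
        "source": "checkpoint_files",
        # No vclock given: use the latest available checkpoint.
        "checkpoint": "latest" if checkpoint_vclock is None else checkpoint_vclock,
        # No LSN given: start from the beginning of the file.
        "start_lsn": 0 if checkpoint_lsn is None else checkpoint_lsn,
    }


def rows_to_send(rows, start_lsn):
    """Keep rows whose LSN is >= start_lsn; rows are (lsn, payload) pairs."""
    return [row for row in rows if row[0] >= start_lsn]


# A client that failed mid-fetch resumes from the same checkpoint at LSN 7.
plan = plan_checkpoint_join(True, checkpoint_vclock={1: 10}, checkpoint_lsn=7)
print(plan)
print(rows_to_send([(5, "a"), (7, "b"), (9, "c")], plan["start_lsn"]))
```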
      
      Needed for tarantool/tarantool-ee#741
      
      NO_DOC=ee
      NO_TEST=ee
      NO_CHANGELOG=ee
    • engine: send vclock with 0th component during join · 56058393
      Nikita Zheleztsov authored
      This commit makes the engine send the vclock without ignoring the 0th
      component during join, which is needed for checkpoint FETCH SNAPSHOT.
      
      Currently engine join functions are invoked only from
      relay_initial_join, which is done during JOIN or FETCH SNAPSHOT.
      They respond with vclock of the read view we're going to send.
      
      In the following commit checkpoint FETCH SNAPSHOT will be introduced,
      which responds with the vclock of the checkpoint we're going to send.
      Such a vclock may include the 0th component, and it's crucial to send it
      to a client: in case of a connection failure, the client will send us the
      same vclock and we'll have to use its signature to figure out which
      checkpoint the client wants.
      
      So, we have to send and receive 0th component of the vclock during
      FETCH_SNAPSHOT. This commit also introduces decoding vclocks without
      ignoring 0th component, as they'll be used in the following commit too.
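
Why the 0th component matters can be shown with a toy Python model of a vclock as a dictionary from replica id to LSN. The function names are hypothetical, and the signature is assumed here to be the sum of all components.

```python
def encode_vclock_ignore0(vclock):
    """Encode a vclock while dropping the 0th component."""
    return {rid: lsn for rid, lsn in vclock.items() if rid != 0}


def encode_vclock(vclock):
    """Encode a vclock with every component, including the 0th."""
    return dict(vclock)


def signature(vclock):
    """Vclock signature, assumed to be the sum of all components."""
    return sum(vclock.values())


checkpoint_vclock = {0: 5, 1: 10, 2: 3}
# The client echoes back whatever it received from the server.
full = encode_vclock(checkpoint_vclock)
lossy = encode_vclock_ignore0(checkpoint_vclock)
print(signature(full) == signature(checkpoint_vclock))
print(signature(lossy) == signature(checkpoint_vclock))
```

In this model a vclock that lost its 0th component no longer has the checkpoint's signature, so the server could not match it to the checkpoint the client wants.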
      
      Needed for tarantool/tarantool-ee#741
      
      NO_DOC=internal
      NO_TEST=ee
      NO_CHANGELOG=internal
    • xrow: rename xrow_encode_vclock · 313bd730
      Nikita Zheleztsov authored
      This commit renames xrow_encode_vclock to xrow_encode_vclock_ignore0
      since the next commit will introduce encoding a vclock without ignoring
      the 0th component, which is needed when sending the response to a fetch
      snapshot request.
      
      This commit also removes an internal field inside the replication_request
      structure, as the following commit will use 'vclock' for
      encoding/decoding a vclock without ignoring the 0th component.
      
      Needed for tarantool/tarantool-ee#741
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
    • relay: refactor relay_initial_join · 72cc2b3e
      Nikita Zheleztsov authored
      From now on, during initial join the memtx engine prepares the vclock, raft
      and limbo states, and it also sends them during memtx_engine_join.
      
      This is done in order to simplify the code of initial join, as in a
      subsequent commit checkpoint initial join will be introduced and we want
      the relay code to handle it the same way as read-view join, without
      confusing conditions.
      
      Needed for tarantool/tarantool-ee#741
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
    • engine: move raft and limbo states after system data in checkpoint · 3da31b83
      Nikita Zheleztsov authored
      Before this commit raft and limbo states were written at the end of the
      checkpoint, which makes it very costly to access them.
      
      Checkpoint join needs to access limbo and raft state in order to send
      them during JOIN_META stage. We cannot use the latest states, like it's
      done for read-view snapshot fetching: states may be far ahead of the
      data, written to the checkpoint, which we're going to send.
      
      This commit moves raft and limbo states after data from the system
      spaces but before user data. We cannot put them right at the beginning
      of the snapshot, because then we'll have to patch recovery process,
      which currently strongly relies on the fact, that system spaces are
      at the beginning of the snapshot (this was done in order to apply force
      recovery only for user data). If we patch recovery process, then old
      versions, where it's unpatched, won't be able to recover from the
      snapshots done by the newer version, compatibility of snapshots will be
      broken.
      
      The current change is not breaking, old Tarantool versions can restore
      from the snapshot made by the newer one.
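
The cost argument can be made concrete with a toy Python model of the row order in a checkpoint. Everything here is hypothetical (the row representation, the function names, the relative order of the raft and limbo entries); the sketch only compares how many rows a sequential reader must pass before it has seen both states in the old and the new layout.

```python
def write_checkpoint(system_rows, user_rows, states, states_after_system=True):
    """Return the rows of a checkpoint in file order."""
    if states_after_system:
        # New layout: system spaces, then raft/limbo states, then user data.
        return system_rows + states + user_rows
    # Old layout: the states are written at the very end.
    return system_rows + user_rows + states


def rows_read_until_states(checkpoint, states):
    """Number of rows a sequential reader scans to reach all the states."""
    return max(checkpoint.index(state) for state in states) + 1


system_rows = [("sys", i) for i in range(2)]
user_rows = [("user", i) for i in range(1000)]
states = [("raft",), ("limbo",)]

old = write_checkpoint(system_rows, user_rows, states, states_after_system=False)
new = write_checkpoint(system_rows, user_rows, states)
print(rows_read_until_states(old, states), rows_read_until_states(new, states))
```

System spaces still come first in both layouts, which is the property the recovery process is said to rely on.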
      
      Needed for tarantool/tarantool-ee#741
      
      NO_DOC=internal
      NO_CHANGELOG=internal
  6. Jul 16, 2024
    • perf: fix warnings in column_scan_module.c · 4cac1677
      Ilya Verbin authored
      Fix the following warnings (with ENABLE_READ_VIEW defined):
      
      ```
      ./perf/lua/column_scan_module.c:59:18: error: unused variable ‘index_id’ [-Werror=unused-variable]
         59 |         uint32_t index_id = luaL_checkinteger(L, 2);
            |                  ^~~~~~~~
      
      ./perf/lua/column_scan_module.c:149:18: error: unused variable ‘index_id’ [-Werror=unused-variable]
        149 |         uint32_t index_id = luaL_checkinteger(L, 2);
            |                  ^~~~~~~~
      ```
      
      NO_DOC=perf test
      NO_TEST=perf test
      NO_CHANGELOG=perf test
    • applier: fix assertion failure after split brain · 5ce010c5
      Nikita Zheleztsov authored
      After receiving an async transaction from an old term, applier_apply_tx
      exits without unlocking the latch. If the same applier then tries to
      subscribe for replication, it fails with an assertion, as the latch is
      already locked.
      
      Let's fix the function that raises the error so that it just sets
      diag and returns -1.
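
The pattern behind the fix can be sketched in Python. This is a model, not the applier code: the `Applier` class, `check_term` and the error text are hypothetical. It shows a check that reports failure through a return value so that the caller keeps control and can release the latch on the error path.

```python
import threading


class Applier:
    def __init__(self, current_term):
        self.latch = threading.Lock()
        self.current_term = current_term
        self.last_error = None  # stands in for the diagnostics area

    def check_term(self, tx_term):
        """Set the error and return -1 instead of raising."""
        if tx_term < self.current_term:
            self.last_error = "async transaction from an old term"
            return -1
        return 0

    def apply_tx(self, tx_term):
        self.latch.acquire()
        if self.check_term(tx_term) != 0:
            # The error path still goes through the unlock.
            self.latch.release()
            return -1
        # ... the transaction would be applied here ...
        self.latch.release()
        return 0


applier = Applier(current_term=2)
print(applier.apply_tx(tx_term=1), applier.latch.locked())
```

Because the latch is released on failure, a later attempt by the same applier can lock it again.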
      
      Closes #10073
      
      NO_DOC=bugfix
      NO_CHANGELOG=no crash on release version
    • perf: add column insert test · e5c4bd63
      Ilya Verbin authored
      The test creates an empty space with 1000 nullable columns storing uint64
      values. Then it initializes a dataset that consists of 10 columns and
      1 million rows (the row count and both column counts are configurable), and
      inserts the dataset into the space.
      
      By default the test uses the serial C API, but one may switch to the Arrow
      API for batch insertion (the feature is exclusive to the Enterprise Edition).
      
      It's also possible to specify the engine and wal_mode to use (the defaults
      are memtx and write).
      
      Needed for tarantool/tarantool-ee#712
      
      NO_DOC=perf test
      NO_TEST=perf test
      NO_CHANGELOG=perf test
    • third_party: initial import of arrow/abi.h · 8cd677da
      Ilya Verbin authored
      Needed for tarantool/tarantool-ee#712
      
      NO_DOC=for enterprise edition
      NO_TEST=for enterprise edition
      NO_CHANGELOG=for enterprise edition
    • lua/utils: export luaL_pushnull and luaL_isnull functions · a6140a3e
      Ilya Verbin authored
      They are useful in C modules.
      
      Needed for tarantool/tarantool-ee#712
      
      @TarantoolBot document
      Title: Update C API reference > Module lua/utils
      Product: Tarantool
      Root documents: https://www.tarantool.io/en/doc/latest/dev_guide/reference_capi/utils/
      
      The following functions are missing from the documentation:
      
       * luaL_iscallable
       * luaL_iscdata
       * luaL_isnull
       * luaL_pushnull
       * luaT_call
       * luaT_checktuple
       * luaT_isdecimal
       * luaT_newdecimal
       * luaT_pushdecimal
       * luaT_toibuf
       * luaT_tolstring
       * luaT_tuple_encode
       * luaT_tuple_new
      
      See also: https://github.com/tarantool/doc/issues/2011
    • mpstream: introduce mpstream_encode_int64() helper · f8be986d
      Ilya Verbin authored
      Needed for tarantool/tarantool-ee#712
      
      NO_TEST=EE
      NO_DOC=internal
      NO_CHANGELOG=internal
    • error: introduce ERRINJ_TUPLE_ALLOC_COUNTDOWN · 52926402
      Ilya Verbin authored
      Needed for tarantool/tarantool-ee#712
      
      NO_DOC=internal
      NO_TEST=internal
      NO_CHANGELOG=internal
    • test: do not test errinj.info() output · dc0fd81c
      Ilya Verbin authored
      There is not much sense in testing it, and it is sensitive to source code
      changes, especially `ERRINJ_*_COUNTDOWN` injections, e.g. see commit
      697123d0 ("box: use maximal space id instead of _schema.max_id").
      
      Needed for tarantool/tarantool-ee#712
      
      NO_DOC=test
      NO_CHANGELOG=test
    • sio: fix error message displaying bind address · a5214bfc
      Lev Kats authored
      Now the `sio_bind` function prints the address into the error message
      directly instead of relying on the `fd` used in the `bind` call that failed.
      
      Previously `sio_bind` used `sio_socketname_to_buffer` for the error message,
      effectively attempting to print the address bound to `fd`, although
      binding that address to that socket had failed in the
      first place.
      
      Fixes #5925
      
      NO_DOC=bugfix
      NO_CHANGELOG=minor
    • test: cover split-brain during promote · 06b87e27
      Nikita Zheleztsov authored
      This test checks that when a PROMOTE from the previous term is
      encountered, we immediately notice the split-brain situation and break
      replication without corrupting data.
      
      Closes #9943
      
      NO_DOC=test
      NO_CHANGELOG=test