  1. Jun 02, 2023
    • test: do not run box.cfg{} on test runner for replication tests · 8b10902d
      Serge Petrenko authored
      Some replication tests (linearizable_test.lua and
      bootstrap_strategy_test.lua) used default test-runner to test box.cfg{}
      calls which are expected to fail.
      
      Since box.cfg{} is going to be prohibited on default test runner, let's
      move such test cases into properly initialized servers.
      
      NO_DOC=test
      NO_CHANGELOG=test
      8b10902d
    • test: ban direct calling of box.cfg() · fc3426d8
      Oleg Chaplashkin authored
      Calling box.cfg() directly to configure the runner instance is now
      prohibited. If you need to test something with a specific configuration,
      use a server instance instead (see the luatest.Server module).
      
      In-scope-of tarantool/luatest#245
      
      NO_DOC=ban calling box.cfg
      NO_TEST=ban calling box.cfg
      NO_CHANGELOG=ban calling box.cfg
      fc3426d8
  2. Jun 01, 2023
    • net.box: resolve IPROTO feature names using box.iproto.feature · 606e50c4
      Vladimir Davydov authored
      Drop the IPROTO_FEATURE_NAMES table and use box.iproto.feature in
      iproto_feature_resolve so that we don't have to update it manually every
      time we add a new feature.
      
      Follow-up #8443
      Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
      automatically")
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      606e50c4
    • iproto: replace iproto_constant with string arrays · 75632133
      Vladimir Davydov authored
      There are no substantial gaps in the remaining IPROTO constant enums so
      there's no need for the iproto_constant struct. Instead we can generate
      string arrays, as we usually do. This is more flexible because it allows
      us to look up a name by code. It's also consistent with iproto_type and
      iproto_key names.
      
      The only tricky part here is the iproto_flag enum because it contains
      bit masks. To generate names for the flags, we add the auxiliary enum
      iproto_flag_bit that contains bit numbers.
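
      The auxiliary-enum trick can be sketched roughly as follows. This is an
      illustration only: the DEMO_FLAGS list, the demo_* names, and the
      lookup helper are hypothetical and are not the actual Tarantool
      definitions. One list of (name, bit number) pairs yields the bit-number
      enum, the mask enum, and a gap-free name table indexed by bit number.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <string.h>

      /* Hypothetical flag list: (name, bit number). */
      #define DEMO_FLAGS(_) \
          _(COMMIT, 0) \
          _(WAIT_SYNC, 1) \
          _(WAIT_ACK, 2)

      /* Auxiliary enum: holds bit numbers, so it has no gaps. */
      #define DEMO_FLAG_BIT_MEMBER(name, bit) DEMO_FLAG_BIT_##name = (bit),
      enum demo_flag_bit {
          DEMO_FLAGS(DEMO_FLAG_BIT_MEMBER)
          demo_flag_bit_MAX
      };

      /* The flag enum itself: bit masks derived from the same list. */
      #define DEMO_FLAG_MEMBER(name, bit) DEMO_FLAG_##name = 1 << (bit),
      enum demo_flag {
          DEMO_FLAGS(DEMO_FLAG_MEMBER)
      };

      /* Name table indexed by bit number. */
      #define DEMO_FLAG_BIT_STR(name, bit) [bit] = #name,
      static const char *demo_flag_bit_strs[demo_flag_bit_MAX] = {
          DEMO_FLAGS(DEMO_FLAG_BIT_STR)
      };

      /* Return the mask for a flag name, or 0 if the name is unknown. */
      static int
      demo_flag_by_name(const char *name)
      {
          for (int bit = 0; bit < demo_flag_bit_MAX; bit++) {
              if (strcmp(demo_flag_bit_strs[bit], name) == 0)
                  return 1 << bit;
          }
          return 0;
      }
      ```

      Indexing the name table by bit number rather than by mask keeps the
      array dense, which is what makes a plain string array usable here.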
      
      Follow-up #8443
      Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
      automatically")
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      75632133
    • iproto: generate iproto_type_strs from IPROTO_TYPES · 42dc000e
      Vladimir Davydov authored
      Currently, we fill iproto_type_strs only for command codes exported to
      box.stat while for the rest of command codes we have a switch-case in
      the iproto_type_name function. This is ugly and error-prone because we
      can easily forget to update iproto_type_name when we add a new command
      code. Let's generate iproto_type_strs automatically just like we
      generate iproto_key_strs.
      
      There are a few things that should be noted here:
       - We don't generate strings for IPROTO_TYPE_ERROR and IPROTO_UNKNOWN
         because the former has a big code while the latter has a negative
         code. The only place where we need the strings is exporting IPROTO
         constants to Lua so now we just export these special codes explicitly
         there.
       - We don't generate strings for IPROTO codes reserved for vinyl because
         they aren't exported to Lua and use a different naming convention.
         As before, we have a switch-case in iproto_type_name for them.
       - We remove IPROTO_RESERVED_TYPE_STAT_MAX because it isn't a reserved
         code. Instead we define IPROTO_TYPE_STAT_MAX explicitly in the
         iproto_type enum as IPROTO_ROLLBACK + 1. This allows us to remove
         the condition that skips "RESERVED" constants from the code that
         exports IPROTO constants to Lua.
       - Before this change iproto_type_strs didn't have names for OK,
         CALL_16, and NOP, because they aren't shown in box.stat. After this
         change the names are present so we have to filter out the stat items
         explicitly in the rmean_foreach callback.
      
      Generating iproto_type_strs makes iproto_type_constants useless so we
      drop it in the scope of this patch and start using iproto_type_strs
      to populate box.iproto.type.
      
      Follow-up #8443
      Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
      automatically")
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      42dc000e
    • iproto: generate strings for vinyl constants · 96599a5b
      Vladimir Davydov authored
      Currently, we fill vy_page_info_key_strs, vy_run_info_key_strs, and
      vy_row_index_key_strs manually, which is inconvenient and error-prone.
      Let's generate them automatically from enum member names, like we do
      for IPROTO keys.
      
      Note, we have to rename VY_RUN_INFO_BLOOM and VY_RUN_INFO_BLOOM_LEGACY
      to VY_RUN_INFO_BLOOM_FILTER and VY_RUN_INFO_BLOOM_FILTER_LEGACY to
      preserve the xlog reader output.
      
      Still, the result isn't exactly the same:
       - An underscore is used instead of a space.
       - Strings are upper case now, not lower case, as they used to be.
       - VY_ROW_INDEX_DATA is now translated to "data", not "row index".
      
      The key names are used for two purposes:
       - For reporting ER_INVALID_INDEX_FILE error in vy_run.c. The changes
         enumerated above don't really matter there.
       - In the xlog reader. We replace spaces with underscores anyway there
         and convert the names to the lower case so the only problem is that
         "row_index" is replaced with "data" in xlog reader output. This
         should be fine though because (a) from the context it's clear that
         the data belong to a row index section, (b) reading vinyl index files
         is only useful for debugging and introspection, and (c) the field is
         a part of vinyl internals and was never documented properly.
      
      After this change we can remove the code replacing spaces with
      underscores from the xlog reader because all IPROTO constant names
      now use underscores.
      
      Follow-up #8443
      Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
      automatically")
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      96599a5b
    • iproto: generate iproto_key_strs from IPROTO_KEYS · 99e8abe2
      Vladimir Davydov authored
      Currently, we fill iproto_key_strs manually, which is inconvenient and
      error-prone. Let's generate it automatically from enum member names.
      
      The result isn't exactly the same:
       - An underscore is used instead of a space.
       - Strings are upper case now, not lower case, as they used to be.
       - IPROTO_REQUEST_TYPE is now translated to "REQUEST_TYPE", not "type".
       - IPROTO_OPS is now translated to "OPS", not "operations".
      
      The key names are used for two purposes:
       - For reporting ER_MISSING_REQUEST_FIELD error while decoding a packet
         in xrow.c. The changes enumerated above don't really matter there.
       - In the xlog reader. Here we do need some workarounds. First, we have
         to convert the names to the lower case. Second, we have to use "type"
         and "operations" instead of generated names for IPROTO_REQUEST_TYPE
         and IPROTO_OPS. Spaces are already translated to underscores so we
         don't need to do anything about it.
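
      The xlog reader workaround described above could look roughly like
      this. It is a sketch under assumptions: the demo_* enum, the three-key
      table, and demo_xlog_key_name are hypothetical stand-ins for the real
      key table and reader code.

      ```c
      #include <assert.h>
      #include <ctype.h>
      #include <stddef.h>
      #include <string.h>

      /* Hypothetical subset of generated key names (upper case). */
      enum demo_key { DEMO_REQUEST_TYPE, DEMO_SPACE_ID, DEMO_OPS, demo_key_MAX };

      static const char *demo_key_strs[demo_key_MAX] = {
          [DEMO_REQUEST_TYPE] = "REQUEST_TYPE",
          [DEMO_SPACE_ID] = "SPACE_ID",
          [DEMO_OPS] = "OPS",
      };

      /*
       * Write the reader-facing name of a key into buf: the legacy names for
       * the two special keys, the lower-cased generated name otherwise.
       * The result is truncated if buf is too small.
       */
      static void
      demo_xlog_key_name(enum demo_key key, char *buf, size_t size)
      {
          assert(key < demo_key_MAX && size > 0);
          const char *src = demo_key_strs[key];
          if (key == DEMO_REQUEST_TYPE)
              src = "type";
          else if (key == DEMO_OPS)
              src = "operations";
          size_t i;
          for (i = 0; i + 1 < size && src[i] != '\0'; i++)
              buf[i] = (char)tolower((unsigned char)src[i]);
          buf[i] = '\0';
      }
      ```

      Keeping the overrides in the reader leaves the generated table free of
      special cases while the reader output stays the same as before.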
      
      Generating iproto_key_strs makes iproto_key_constants useless so we
      drop it in the scope of this patch and start using iproto_key_strs to
      populate box.iproto.key.
      
      Follow-up #8443
      Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
      automatically")
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      99e8abe2
    • iproto: generate iproto_key_type from IPROTO_KEYS · 26b9cc86
      Vladimir Davydov authored
      Let's merge the key value type information into IPROTO_KEYS to keep them
      close together.
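
      As a rough illustration of keeping the value type next to each key,
      assume a hypothetical DEMO_KEYS list with (name, code, value type)
      triples. None of the demo_* names below exist in Tarantool; the point
      is only that one list can produce the enum, the name table, and the
      type table.

      ```c
      #include <assert.h>
      #include <string.h>

      /* Hypothetical value types. */
      enum demo_value_type { DEMO_T_UINT, DEMO_T_STR, DEMO_T_ARRAY };

      /* Hypothetical key list: (name, code, value type) on one line. */
      #define DEMO_KEYS(_) \
          _(REQUEST_TYPE, 0, DEMO_T_UINT) \
          _(SPACE_NAME, 1, DEMO_T_STR) \
          _(TUPLE, 2, DEMO_T_ARRAY)

      #define DEMO_KEY_MEMBER(name, code, type) DEMO_##name = (code),
      enum demo_key {
          DEMO_KEYS(DEMO_KEY_MEMBER)
          demo_key_MAX
      };

      /* Key names, indexed by key code. */
      #define DEMO_KEY_STR(name, code, type) [code] = #name,
      static const char *demo_key_strs[demo_key_MAX] = {
          DEMO_KEYS(DEMO_KEY_STR)
      };

      /* Expected value type of each key, generated from the same list. */
      #define DEMO_KEY_TYPE(name, code, type) [code] = (type),
      static const enum demo_value_type demo_key_type[demo_key_MAX] = {
          DEMO_KEYS(DEMO_KEY_TYPE)
      };
      ```

      Adding a key then touches a single line, so the name and the value
      type cannot drift apart.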
      
      Follow-up #8443
      Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
      automatically")
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      26b9cc86
    • iproto: strip prefixes from generated constant strings · 37dd09af
      Vladimir Davydov authored
      There's no need to add prefixes to generated iproto constant strings
      (like IPROTO_, IPROTO_FEATURE_, etc) because we strip them anyway when
      exporting constants to Lua. Let's drop the prefixes to clean up the code.
      Note that enum constants themselves still have the prefixes to avoid
      name clashes.
      
      Follow-up #8443
      Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
      automatically")
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      37dd09af
  3. May 31, 2023
    • replication: fix crash on access to a not yet ready relay · 0ef5e3b2
      Serge Petrenko authored
      All the code outside of relay.cc judges the relay's liveness by looking
      only at the relay state. When relay->state is RELAY_FOLLOW, the relay is
      considered operational.
      
      This is not always true: for example, both relay_push_raft() and
      relay_trigger_vclock_sync() are only possible after relay thread pairs
      with tx via the cbus. This happens **after** the relay enters
      RELAY_FOLLOW state.
      
      Fix the possible access to uninitialized cpipe by
      relay_trigger_vclock_sync(): make it a nop until the relay is paired
      with tx.
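
      The shape of the guard, reduced to a sketch. The struct, its fields,
      and the demo_* function are hypothetical; the real relay holds a cpipe
      and many more fields. The sketch only shows that a second condition is
      checked besides the state.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      enum demo_relay_state { DEMO_RELAY_OFF, DEMO_RELAY_FOLLOW };

      /* Hypothetical, heavily reduced relay. */
      struct demo_relay {
          enum demo_relay_state state;
          /* Set once the relay thread has paired with tx. */
          bool is_paired;
          /* Counts requests actually pushed to the relay thread. */
          int sync_requests;
      };

      /*
       * Ask the relay to sync its vclock. Returns true if the request was
       * pushed, false if it was skipped because the relay is not ready.
       */
      static bool
      demo_relay_trigger_vclock_sync(struct demo_relay *relay)
      {
          assert(relay != NULL);
          if (relay->state != DEMO_RELAY_FOLLOW || !relay->is_paired)
              return false;
          relay->sync_requests++;
          return true;
      }
      ```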
      
      Closes #7991
      
      NO_DOC=bugfix
      NO_TEST=covered by replication-luatest/linearizable_test.lua
      0ef5e3b2
    • relay: refactor is_raft_enabled flag · b787f328
      Serge Petrenko authored
      Relay had an is_raft_enabled member with mixed meaning: firstly, it was set
      to true only when relay was ready to accept messages via cbus, and
      secondly, it was set to true only for replicas which need raft updates
      (newer than Tarantool 2.6.0 and not anonymous).
      
      Let's better use the flag only as an indication that the relay is ready
      to accept cbus pushes, and check whether the relay needs raft updates
      separately.
      
      The flag will be reused in the following commit, which will make tx
      check that relay is connected prior to sending a message to it.
      
      Prerequisite #7991
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      b787f328
    • box: allow to set *_uuid options to NULL · 3aa029b3
      Mergen Imeev authored
      This patch allows setting replicaset_uuid and instance_uuid to box.NULL.
      This fixes the issue described in #8714, however it introduces another
      change in behavior - we can now set these parameters to NULL even if
      they weren't NULL before. However, since we still cannot set a different
      uuid after setting the parameters to NULL, and we can still set the old
      uuid for them, this behavior is considered acceptable.
      
      Closes #8714
      
      NO_DOC=bugfix
      NO_CHANGELOG=the bug was not released
      3aa029b3
    • test: disable Lua JIT in app-luatest/http_client_test · 53c94bc7
      Vladimir Davydov authored
      We'll enable it when #8718 is fixed.
      
      NO_DOC=test
      NO_CHANGELOG=test
      53c94bc7
  4. May 29, 2023
    • raft: fix spurious split-vote · 2afde5b1
      Serge Petrenko authored
      Due to a typo raft candidate counted a vote for another node as a vote
      for self in its split-vote detector. This could lead to spurious
      split-vote detection in cases when another node wins elections with a bare
      minimum of votes for it (exactly a quorum of votes).
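
      A simplified model of a split-vote check shows why a miscounted vote
      matters. This is not the raft code from the patch: the function, its
      arguments, and the one-vote-per-node assumption are all illustrative.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      #define DEMO_MAX_NODES 8

      /*
       * votes[i] is the number of votes node i has received so far; every
       * node casts at most one vote. The vote is split when even the
       * front-runner cannot reach the quorum with all undecided votes.
       */
      static bool
      demo_is_split_vote(const int *votes, int node_count, int quorum)
      {
          assert(node_count > 0 && node_count <= DEMO_MAX_NODES);
          int cast = 0;
          int max_votes = 0;
          for (int i = 0; i < node_count; i++) {
              cast += votes[i];
              if (votes[i] > max_votes)
                  max_votes = votes[i];
          }
          int undecided = node_count - cast;
          return max_votes + undecided < quorum;
      }
      ```

      With 4 nodes and a quorum of 3, the tally {1, 3, 0, 0} is a win for
      node 1 with exactly a quorum. If the candidate (node 0) books one of
      those votes for itself, it sees {2, 2, 0, 0}, which this check reports
      as a split vote.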
      
      Closes #8698
      
      NO_DOC=bugfix
      2afde5b1
    • raft: make promote bump term and vote at once · 17371215
      Serge Petrenko authored
      box.ctl.promote() was implemented as follows: an instance bumps the
      term and marks itself a candidate, but doesn't vote for self
      immediately. Instead it relies on the machinery which makes a candidate
      vote for self as soon as it persists a new term.
      
      This differs from a normal election start due to leader timeout: there
      term and vote are bumped at once.
      
      Besides, this increases the probability of box.ctl.promote() resulting in
      another node getting elected: if a node first broadcasts a term without a
      vote, it is not considered a candidate, so other candidates might start
      elections and vote for themselves.
      
      Let's bring promote into line with automatic elections.
      
      Closes #8497
      
      NO_DOC=bugfix
      17371215
    • raft: persist vote for self together with term bump · 8a124e50
      Serge Petrenko authored
      Commit c9155ac8 ("raft: persist new term and vote separately") made
      the nodes persist new term and vote separately, using 2 WAL writes.
      Writing the term first is needed to flush all the ongoing transactions,
      so that the node's vclock is updated and can be checked against the
      candidate's vclock. Otherwise it could happen that the node persists a
      vote for some candidate only to find that its vclock would actually
      become incomparable with the candidate's.
      
      Actually, this guard is not needed when checking a vote for self,
      because a node can always vote for self. Besides, splitting term bump
      and vote can lead to increased probability of split-vote. It may happen
      that a candidate bumps and broadcasts the new term without a vote,
      making other nodes vote for self. Let's go back to writing term and vote
      together for self votes.
      
      This change makes raft candidate persist term bump and vote for self in
      one WAL write instead of two, so all the tests which count WAL writes or
      expect 2 separate state updates for term and vote are rewritten.
      
      Prerequisite #8497
      
      NO_DOC=not user-visible
      NO_CHANGELOG=not user-visible
      8a124e50
  5. May 26, 2023
    • changelog: mark changelog entry for gh-7149 as breaking · 3b244fc4
      Vladimir Davydov authored
      Fixes commit 97c2c9a4 ("box: disable DDL with old schema").
      Follow-up #7149
      
      NO_DOC=changelog
      NO_TEST=changelog
      3b244fc4
    • box: disable DDL with old schema · 97c2c9a4
      Vladimir Davydov authored
      ** Implementation details **
      
      We disable DDL by patching the existing on_replace_dd_system_space
      trigger callback installed for each system space so that now it raises
      an error in case the current schema version is less than the most
      recent one known to this build. Since to perform a schema upgrade
      we need to execute DDL, we suppress the error for the fiber that is
      currently running a schema upgrade. To achieve that, the upgrade script
      calls box_schema_upgrade_begin and box_schema_upgrade_end before
      starting and after completing a schema upgrade. The functions keep track
      of the fiber that is currently running a schema upgrade so that we can
      allow all DDL operations for it. We also allow DDL during recovery so
      that we can replay DDL statements written to the WAL.
      
      Since there may be a bug in the `box.schema.upgrade` implementation,
      we export `box.internal.run_schema_upgrade`, which runs the given
      function as a schema upgrade script (allowing DDL). The user may use
      this function to recover after a schema upgrade failure.
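
      The decision made by the patched trigger can be modelled as below. The
      state struct and the demo_* functions are hypothetical simplifications
      (versions are plain integers, a fiber is an opaque pointer); they only
      mirror the three conditions described above.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical reduced state behind the DDL check. */
      struct demo_schema_state {
          unsigned current_version;  /* version stored in the schema */
          unsigned latest_version;   /* newest version known to the build */
          bool is_recovering;        /* WAL replay in progress */
          const void *upgrade_fiber; /* fiber running the upgrade, or NULL */
      };

      static void
      demo_schema_upgrade_begin(struct demo_schema_state *s, const void *fiber)
      {
          assert(s->upgrade_fiber == NULL);
          s->upgrade_fiber = fiber;
      }

      static void
      demo_schema_upgrade_end(struct demo_schema_state *s)
      {
          s->upgrade_fiber = NULL;
      }

      /* Return true if the given fiber may execute DDL right now. */
      static bool
      demo_ddl_is_allowed(const struct demo_schema_state *s, const void *fiber)
      {
          if (s->current_version >= s->latest_version)
              return true;
          if (s->is_recovering)
              return true;
          return s->upgrade_fiber != NULL && s->upgrade_fiber == fiber;
      }
      ```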
      
      ** Note about the tests **
      
      A test server instance started by luatest grants permissions to the
      guest user so that luatest can execute commands on it. It means that if
      a test uses a generated snap file committed to the repository for a test
      server instance, it will fail because granting permissions is a DDL
      operation. To prevent this, we have to regenerate snap files so that
      they contain all required permissions. This works because a test server
      instance grants permissions with the `if_not_exists` flag.
      
      The problem is that it isn't easy to regenerate the snap files for the
      following tests because there's no generator script:
       - `test/box-luatest/gh_6794_recover_nonmatching_xlogs_test.lua`
       - `test/box-luatest/gh_7974_force_recovery_bugs_test.lua`
      
      So we temporarily disable these tests and file tickets to fix them.
      
      Other notes:
       - We drop `test/box-luatest/upgrade/2.9.1` and make the test using it
         use `test/box-luatest/upgrade/2.10.0` instead. We do this because
         2.9.1 was never released and the earliest Tarantool version using the
         2.9.1 schema version is 2.10.0. This shouldn't affect the test
         anyhow.
       - We drop the part of the `user_auth_history_last_modified_upgrade`
         test that checks that creating users/roles with an old schema works
         fine because this is forbidden now.
       - We wrap the code that creates a space with an old schema in the
         downgrade test in `box.internal.run_schema_upgrade`. Even though it's
         unsupported now, we still need to check that space creation works
         after a downgrade.
      
      Closes #7149
      
      @TarantoolBot document
      Title: Document that DDL is disabled with an old system schema
      
      Executing DDL operations with an old (not upgraded) system schema is
      dangerous and might result in unexpected breakages. So we decided to
      explicitly forbid all DDL operations with an old system schema until
      `box.schema.upgrade()` is called. Note, one can still call `box.schema`
      functions with an old schema provided they do nothing, for example, if
      an object is created with the `if_not_exists` flag and an object with
      the same id already exists:
      
      ```lua
      box.schema.create_space('test', {if_not_exists = true})
      ```
      
      Otherwise an attempt to create a space with an old schema will raise
      an error like shown below:
      
      ```yaml
      tarantool> box.schema.space.create('test')
      ---
      - error: Your schema version is 1.6.8 while Tarantool
          3.0.0-entrypoint-262-g3eaba1cef686 requires a more recent
          schema version. Please, consider using box.schema.upgrade().
      ...
      ```
      97c2c9a4
    • box: disallow to drop system spaces · 8ae45007
      Magomed Kostoev authored
      The patch adds a new check to the _space on_replace trigger failing
      on attempt to drop a system table.
      
      Closes #5279
      
      NO_DOC=bugfix
      8ae45007
  6. May 25, 2023
    • changelog: proofread some luajit changelogs · e64568b2
      Yaroslav Lobankov authored
      NO_DOC=changelog update
      NO_TEST=changelog update
      e64568b2
    • metrics: bump to new version · 8bbc73ce
      Yaroslav Lobankov authored
      Bump the metrics submodule to 1.0.0 version.
      
      NO_DOC=submodule bump
      NO_TEST=submodule bump
      NO_CHANGELOG=submodule bump
      8bbc73ce
    • test: bump test-run to new version · 252865ce
      Yaroslav Lobankov authored
      Bump test-run to new version with the following improvements:
      
      - lib: propagate test status 'skip' [1]
      - Show overall progress while running [2]
      - Follow test timeout for luatest [3]
      - Run luatest test by pattern [4]
      - Refactor command to run luatest test [5]
      - Bump luatest to 0.5.7-39-g89da427 [6]
      - consistent mode: fix worker's vardir calculation [7]
      
      [1] tarantool/test-run@6fbb7fd
      [2] tarantool/test-run@c5fa909
      [3] tarantool/test-run@f67d523
      [4] tarantool/test-run@264af05
      [5] tarantool/test-run@e19bb11
      [6] tarantool/test-run@3e74192
      [7] tarantool/test-run@aac77f5
      
      NO_DOC=testing stuff
      NO_TEST=testing stuff
      NO_CHANGELOG=testing stuff
      252865ce
    • box: cleanup on tuple encoding failure · 9f9142d6
      Nikolay Shirokovskiy authored
      Currently on tuple encoding failure we raise a Lua error. In many places
      the error is not handled in Lua C code and we get misc leaks. Let's
      instead pass the error as a return value.
      
      Note that generally speaking the encoding code can raise an error on OOM,
      which will lead to a leak again. Hopefully the application will be killed
      by the OOM killer instead. Other than that we expect no more errors in
      the code. If the code calls a user-defined callback then pcall is used
      (see lua_field_inspect_ucdata for example). So the turn from raising
      errors to returning an error code seems the right direction.
      
      Closes #7939
      
      NO_DOC=bugfix
      9f9142d6
    • small: bump version · 45c9a096
      Nikolay Shirokovskiy authored
      This will bring new ibuf_truncate method.
      
      Part of #7939
      
      NO_TEST=internal
      NO_CHANGELOG=internal
      NO_DOC=internal
      45c9a096
  7. May 24, 2023
    • luajit: bump new version · cde911d0
      Igor Munkin authored
      * Fix IR_RENAME snapshot number. Follow-up fix for a32aeadc.
      * OSX: Disable unreliable assertion for external frame unwinding.
      * Disable unreliable assertion for external frame unwinding.
      * Handle on-trace OOM errors from helper functions.
      * LJ_GC64: Make ASMREF_L references 64 bit.
      * lldb: introduce luajit-lldb
      * x64/LJ_GC64: Fix emit_rma().
      * Limit path length passed to C library loader.
      
      Closes #7745
      Part of #4808
      Part of #8069
      Part of #8516
      
      NO_DOC=LuaJIT submodule bump
      NO_TEST=LuaJIT submodule bump
      cde911d0
    • box: fix unique violation in functional index with nullable parts · 6bcd51f9
      Ilya Verbin authored
      Currently is_nullable property of a functional index part disables the
      unique property of the index. The bug is in func_index_compare(), which
      compares functional keys first, and if they are equal it compares the
      primary keys. This behaviour is correct only when some part of the key
      is NULL (and for non-unique indexes), but for now the primary keys are
      compared unconditionally. Fix this by checking for NULL key parts.
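
      The intended comparison order can be sketched with a toy entry type.
      Everything here is an assumption for illustration: a single integer
      functional key with a NULL flag, an integer primary key, and NULLs
      sorted first. The real func_index_compare() works on multi-part keys.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Hypothetical index entry: one functional key plus the primary key. */
      struct demo_entry {
          bool key_is_null;
          int func_key;
          int primary_key;
      };

      static int
      demo_cmp_int(int a, int b)
      {
          return (a > b) - (a < b);
      }

      static int
      demo_func_index_compare(const struct demo_entry *a,
                              const struct demo_entry *b, bool is_unique)
      {
          /* In this sketch NULL sorts before any value. */
          if (a->key_is_null != b->key_is_null)
              return a->key_is_null ? -1 : 1;
          if (!a->key_is_null) {
              int rc = demo_cmp_int(a->func_key, b->func_key);
              if (rc != 0)
                  return rc;
              /* Equal non-NULL keys are duplicates in a unique index. */
              if (is_unique)
                  return 0;
          }
          /* NULL keys or a non-unique index: primary key breaks the tie. */
          return demo_cmp_int(a->primary_key, b->primary_key);
      }
      ```

      Returning 0 for equal non-NULL keys is what lets a unique index detect
      the duplicate; falling through to the primary key for NULL keys keeps
      several NULL entries distinguishable.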
      
      Closes #8587
      
      NO_DOC=bugfix
      6bcd51f9
  8. May 23, 2023
    • sql: check printf() for failure · 13159230
      Mergen Imeev authored
      This patch adds a check that sqlXPrintf() does not fail in the built-in
      SQL function printf(). There are two possible problems: the result might
      get too large, or there might be an integer overflow because internally
      int values are converted to size_t.
      
      Closes #tarantool/security#122
      
      NO_DOC=bugfix
      13159230
    • sql: assert in xferOptimization() · 039f714d
      Mergen Imeev authored
      This patch fixes problems with INSERT INTO ... SELECT FROM optimization.
      These problems appeared after 6b8acd8f, where the check became redundant,
      but was not updated. Two problems arose:
      1) an assertion or segmentation fault when optimization was used and the
      source space does not have an index;
      2) optimization can be used even if the indexes are incompatible.
      
      The second problem does not result in changes that are user-visible, so
      there is no test.
      
      Closes #8661
      
      NO_DOC=bugfix
      039f714d
  9. May 22, 2023
    • box: add hostname to box.info · adb14c06
      Gleb Kashkin authored
      Hostname is a useful piece of information in state reports. So it was
      decided to add it to box.info.
      
      Hostname is obtained on request and is not cached.
      
      Closes #8605
      
      @TarantoolBot document
      Title: Add hostname to box.info
      
      This patch adds hostname to box.info, it can be useful e.g. to supplement
      various instance state reports. It is not cached and is requested on
      each call.
      adb14c06
    • sql: fix invalid negation · 088b32f3
      Eli Kobrin authored
      The invalid negation occurred because of an incomplete check that did
      not cover the case when the value is equal to INT64_MIN, so negating
      INT64_MIN yielded INT64_MIN itself. This must be fixed, because
      negation of INT64_MIN is undefined behavior.
      
      It is fixed by the explicit check for the value variable.
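
      An explicit check of this kind might look as follows. This is a
      minimal sketch, not the patched SQL code: demo_safe_negate is a
      hypothetical helper that refuses the one value whose negation does not
      fit into int64_t.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>

      /*
       * Negate value into *result. Returns false and leaves *result
       * untouched when the negation is not representable (INT64_MIN).
       */
      static bool
      demo_safe_negate(int64_t value, int64_t *result)
      {
          assert(result != NULL);
          if (value == INT64_MIN)
              return false;
          *result = -value;
          return true;
      }
      ```

      Testing the operand before negating avoids evaluating -INT64_MIN at
      all, so the undefined behavior never occurs.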
      
      NO_DOC=refactoring
      NO_CHANGELOG=refactoring
      NO_TEST=refactoring
      088b32f3
    • box: support space and index names in IPROTO requests · b9550f19
      Georgiy Lebedev authored
      Add support for accepting IPROTO requests with space or index name instead
      of identifier (name is preferred over identifier to disambiguate missing
      identifiers from zero identifiers): mark space identifier request
      key as present upon encountering space name, and delay resolution of
      identifier until request gets to transaction thread.
      
      Add support for sending DML requests from net.box connection objects with
      disabled schema fetching by manually specifying space or index name or
      identifier: when schema fetching is disabled, the space and index tables of
      connections return wrapper tables that store necessary context (space or
      index name or identifier, determined by type, connection object and space
      for indexes) for performing requests. The space and index tables cache the
      wrapper table they return.
      
      Closes #8146
      
      @TarantoolBot document
      Title: Space and index name in IPROTO requests
      
      Refer to design document for details:
      https://www.notion.so/tarantool/Schemafull-IPROTO-cc315ad6bdd641dea66ad854992d8cbf?pvs=4#f4d4b3fa2b3646f1949319866428b6c0
      b9550f19
    • box: add `space_by_name` and `space_index_by_name` for arbitrary strings · bf086dc9
      Georgiy Lebedev authored
      Change original `space_by_name` to `space_by_name0` and
      `space_index_by_name` to `space_index_by_name0`, since they accept
      NULL-terminated names, and add `space_by_name` and `space_index_by_name`
      for arbitrary strings.
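
      The naming split can be illustrated with a toy registry. The
      demo_space struct, the static table, and both demo_* functions are
      hypothetical; the real functions operate on the space cache. The
      variant without the 0 suffix takes a pointer and a length, so the name
      does not have to be NUL-terminated.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <string.h>

      /* Hypothetical space descriptor and registry. */
      struct demo_space {
          const char *name;
          int id;
      };

      static struct demo_space demo_spaces[] = {
          {"users", 512},
          {"users_archive", 513},
      };

      /* Look up a space by a name given as pointer + length. */
      static struct demo_space *
      demo_space_by_name(const char *name, size_t len)
      {
          size_t count = sizeof(demo_spaces) / sizeof(demo_spaces[0]);
          for (size_t i = 0; i < count; i++) {
              const char *cur = demo_spaces[i].name;
              if (strlen(cur) == len && memcmp(cur, name, len) == 0)
                  return &demo_spaces[i];
          }
          return NULL;
      }

      /* Convenience wrapper for NUL-terminated names. */
      static struct demo_space *
      demo_space_by_name0(const char *name)
      {
          return demo_space_by_name(name, strlen(name));
      }
      ```

      A length-based lookup is convenient for names that sit inside a larger
      buffer, such as a decoded request, where no terminating NUL follows
      the string.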
      
      Needed for #8146
      
      NO_CHANGELOG=refactoring
      NO_DOC=refactoring
      NO_TEST=refactoring
      bf086dc9
  10. May 19, 2023
    • replication: allow to re-register with new UUID · 4507c59d
      Vladislav Shpilevoy authored
      Previously it wasn't allowed to change instance UUID in _cluster.
      When needed, it had to be done manually by deleting the instance
      from _cluster and inserting it back with a new UUID. Or not to be
      done at all.
      
      Re-UUID (like re-name) was reported to be used when people didn't
      want to register new replica IDs. They wanted to rejoin lost
      replicas from scratch but keep the numeric ID. With UUID they
      could deal by either setting it explicitly to the old value on a
      new instance, or by doing the manual re-UUID like described above.
      
      This commit is supposed to make things simpler. If a replica has a
      name, then its re-join with another UUID is not an error. Its
      record in _cluster is automatically updated to store the new UUID.
      
      That is only possible if the old-UUID-instance is not connected
      anymore and is not listed in replication cfg.
      
      Closes #5029
      
      @TarantoolBot document
      Title: Instance rebootstrap with new UUID but same ID and name
      If an instance has a non-empty instance name
      (`box.cfg.instance_name`), then at rebootstrap it can keep the
      name and its old numeric ID (space `_cluster['id']` field).
      
      This might be needed if one doesn't want to pollute `_cluster`
      with new rows, and for some reason doesn't want to or can't just drop the
      rows belonging to the dead replicas.
      
      In order for this to work 1) the rebootstrapping replica must keep
      its old non-empty instance name, 2) the other instances should not
      have any alive connections to the old dead replica. Ideally, the
      old replica should be just deleted from `box.cfg.replication`
      everywhere.
      
      When that works, the old row in `_cluster` is automatically
      updated with the new instance UUID.
      4507c59d
    • replication: introduce instance name · 9e2d46f9
      Vladislav Shpilevoy authored
      The instance name is carried with instance UUID everywhere in the
      replication protocols. It is visible in all other instances via
      _cluster and is displayed in monitoring.
      
      Part of #5029
      
      @TarantoolBot document
      Title: `box.cfg.instance_name` and `box.info.name`
      The new option `box.cfg.instance_name` allows to assign the
      instance name to a human-readable text value to be displayed in
      the new info key - `box.info.name`. Instances can see names of
      their peers in `box.info.replication[id].name`.
      
      The name is broadcasted in "box.id" built-in event as
      "instance_name" key. It is string when set and nil when not set.
      
      When set, it has to be unique in the instance's replicaset.
      
      If a name wasn't set on cluster bootstrap (was forgotten or the
      cluster is upgraded from a version < 3.0), then it can be set
      on an already running instance via `box.cfg.instance_name`.
      
      To change or drop an already installed name one has to use
      `box.cfg.force_recovery == true` in all instances of the cluster.
      After the name is updated and all the instances synced, the
      `force_recovery` can be set back to `false`.
      
      The name can be <= 63 symbols long, can consist only of chars
      ['0'-'9'], '-' and 'a'-'z'. It must start with a letter. When
      upper-case letters are used in `box.cfg`, they are automatically
      converted to lower-case. The names are host- and DNS-friendly.
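
      The naming rules above can be expressed as a small validator. This is
      a sketch, not the actual box.cfg check: demo_name_normalize and
      DEMO_NAME_MAX are hypothetical, and treating an empty string as
      invalid is an assumption of the sketch.

      ```c
      #include <assert.h>
      #include <ctype.h>
      #include <stdbool.h>
      #include <stddef.h>
      #include <string.h>

      enum { DEMO_NAME_MAX = 63 };

      /*
       * Lower-case src into dst (which must hold DEMO_NAME_MAX + 1 bytes)
       * and check it against the rules: at most 63 characters, only
       * 'a'-'z', '0'-'9' and '-', starting with a letter.
       * Returns false for invalid names; dst is then unspecified.
       */
      static bool
      demo_name_normalize(const char *src, char *dst)
      {
          size_t len = strlen(src);
          if (len == 0 || len > DEMO_NAME_MAX)
              return false;
          for (size_t i = 0; i < len; i++) {
              char c = (char)tolower((unsigned char)src[i]);
              bool is_letter = c >= 'a' && c <= 'z';
              bool is_digit = c >= '0' && c <= '9';
              if (i == 0 && !is_letter)
                  return false;
              if (!is_letter && !is_digit && c != '-')
                  return false;
              dst[i] = c;
          }
          dst[len] = '\0';
          return true;
      }
      ```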
      9e2d46f9
    • replication: introduce replicaset name · 5bca2295
      Vladislav Shpilevoy authored
      The replicaset name is carried with replicaset UUID wherever any
      sanity validations are needed like whether 2 instances belong to
      the same replicaset.
      
      Part of #5029
      
      @TarantoolBot document
      Title: `box.cfg.replicaset_name` and `box.info.replicaset.name`
      The new option `box.cfg.replicaset_name` assigns the replicaset
      a human-readable name, which is displayed in the new info key
      `box.info.replicaset.name` and validated when the instances in
      the replicaset connect to each other.
      
      The name is broadcast in the "box.id" built-in event under the
      "replicaset_name" key. It is a string when set and nil when not set.
      
      When set, it has to match in all instances of the entire
      replicaset.
      
      If a name wasn't set on cluster bootstrap (was forgotten or the
      cluster is upgraded from a version < 3.0), then it can be set
      on an already running instance via `box.cfg.replicaset_name`.
      
      To change or drop an already installed name, set
      `box.cfg.force_recovery` to `true` on all instances of the
      cluster. After the name is updated and all the instances are
      synced, `force_recovery` can be set back to `false`.
      
      The name can be at most 63 characters long and can consist only
      of '0'-'9', '-', and 'a'-'z'. It must start with a letter. When
      upper-case letters are used in `box.cfg`, they are automatically
      converted to lower-case. The names are host- and DNS-friendly.
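
      A sketch of subscribing to the "box.id" event mentioned above
      with `box.watch`, assuming it runs on an instance that already
      includes this patch:
      ```lua
      -- The callback receives the event key and its current value,
      -- and is invoked again whenever the value changes.
      local watcher = box.watch('box.id', function(key, value)
          -- Each name is a string when set and nil when not set.
          print(key, value.replicaset_name, value.cluster_name)
      end)

      -- Stop receiving notifications when they are no longer needed.
      watcher:unregister()
      ```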
      5bca2295
    • Vladislav Shpilevoy's avatar
      replication: introduce cluster name · cb9307a7
      Vladislav Shpilevoy authored
      The patch adds 2 new entities to replication: the concept of a
      cluster which has multiple replicasets and a name for this
      cluster.
      
      The name so far doesn't participate in any replication protocols.
      It is just stored in _schema and is validated against the config.
      
      The old mentions of 'cluster' (in logs and in some protocol
      keys, such as in the feedback daemon) are now considered
      obsolete and will probably be replaced with 'replicaset'.
      
      Part of #5029
      
      @TarantoolBot document
      Title: `box.cfg.cluster_name` and `box.info.cluster.name`
      The new option `box.cfg.cluster_name` assigns the cluster a
      human-readable name, which is displayed in the new info key
      `box.info.cluster.name` and validated when the instances in the
      cluster connect to each other.
      
      The name is broadcast in the "box.id" built-in event under the
      "cluster_name" key. It is a string when set and nil when not set.
      
      When set, it has to match in all instances of the entire cluster
      in all its replicasets.
      
      If a name wasn't set on cluster bootstrap (was forgotten or the
      cluster is upgraded from a version < 3.0), then it can be set
      on an already running instance via `box.cfg.cluster_name`.
      
      To change or drop an already installed name, set
      `box.cfg.force_recovery` to `true` on all instances of the
      cluster. After the name is updated and all the instances are
      synced, `force_recovery` can be set back to `false`.
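
      The rename procedure above could look roughly like this. The
      new name 'prod-eu' is hypothetical, and the exact order of steps
      and where each is run are assumptions based on the description:
      ```lua
      -- Step 1, on every instance of the cluster:
      box.cfg{force_recovery = true}

      -- Step 2, apply the new name (hypothetical value):
      box.cfg{cluster_name = 'prod-eu'}

      -- Step 3, on every instance, once the name is updated and all
      -- the instances are synced:
      box.cfg{force_recovery = false}
      ```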
      
      The name can be at most 63 characters long and can consist only
      of '0'-'9', '-', and 'a'-'z'. It must start with a letter. When
      upper-case letters are used in `box.cfg`, they are automatically
      converted to lower-case. The names are host- and DNS-friendly.
      cb9307a7
    • Vladislav Shpilevoy's avatar
      box: validate global ids after boot in one func · 7fd0d2a5
      Vladislav Shpilevoy authored
      The new function check_global_ids_integrity() checks that the
      replicaset UUID specified in the config and found in the data
      match. The instance UUID is created at bootstrap and validated at
      the beginning of recovery, not at the end, so it is not checked
      here.
      
      For now this function is not very useful, but soon there will be
      more global IDs stored in WAL which will need validation.
      
      Needed for #5029
      
      NO_DOC=refactoring
      NO_CHANGELOG=refactoring
      NO_TEST=already covered
      7fd0d2a5
    • Vladislav Shpilevoy's avatar
      box: introduce node_name funcs and constants · efbc7762
      Vladislav Shpilevoy authored
      Node name stores a DNS- and host-friendly string name. It will be
      used in the next patches for some new global names: cluster,
      replicaset, and instance.
      
      Part of #5029
      
      NO_DOC=internal
      NO_CHANGELOG=internal
      efbc7762
    • Vladislav Shpilevoy's avatar
      info: rename box.info.cluster -> replicaset · ef86e000
      Vladislav Shpilevoy authored
      It was named 'cluster', but really was just about the replicaset.
      This is going to be even more confusing soon, because an actual
      concept of a cluster as multiple replicasets will be introduced.
      
      The patch renames it to 'replicaset'. `box.info.cluster` now means
      the whole cluster and is empty so far. The next patches will add
      the cluster name to it.
      
      Part of #5029
      
      @TarantoolBot document
      Title: `box.info.cluster` is renamed to `box.info.replicaset`
      
      Done since 3.0.0. The old behaviour can be restored via the
      `compat` option `box_info_cluster_meaning`.
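
      A sketch of toggling the option through the `compat` module,
      assuming Tarantool 3.0+:
      ```lua
      local compat = require('compat')

      -- Restore the pre-3.0 meaning of box.info.cluster.
      compat.box_info_cluster_meaning = 'old'

      -- Switch back to the 3.0 meaning.
      compat.box_info_cluster_meaning = 'new'
      ```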
      
      `box.info.cluster` key is still here, but now means a totally
      different thing - the entire cluster with all its replicasets.
      
      <h2>Compat documentation</h2>
      
      `box.info.cluster` default meaning is the whole cluster with all
      its replicasets. To get info about only the current replicaset
      `box.info.replicaset` should be used.
      
      In old versions (< 3.0.0) `box.info.cluster` meant the current
      replicaset and `box.info.replicaset` didn't exist.
      
      <h3>Old and new behaviour</h3>
      
      New behaviour:
      ```
      tarantool> box.info.cluster
      ---
      - <some cluster keys>
      ...
      
      tarantool> box.info.replicaset
      ---
      - uuid: <replicaset uuid>
      - <... other attributes of the replicaset>
      ...
      ```
      Old behaviour:
      ```
      tarantool> box.info.cluster
      ---
      - uuid: <replicaset uuid>
      - <... other attributes of the replicaset>
      ...
      
      tarantool> box.info.replicaset (= nil on < 3.0.0)
      ---
      - uuid: <replicaset uuid>
      - <... other attributes of the replicaset>
      ...
      ```
      
      <h3>Known compatibility issues</h3>
      
      VShard versions < 0.1.24 do not support the new behaviour.
      
      <h3>Detecting issues in your codebase</h3>
      
      Look for all usages of `box.info.cluster`, `info.cluster`, and
      even just `.cluster`, `['cluster']`, `["cluster"]`. For the new
      behaviour to work all of them have to use 'replicaset' key.
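
      Code that must run on both old and new versions could use a
      small helper like the one below. The name `replicaset_info` is
      hypothetical, and the sketch relies on `box.info.replicaset`
      being nil on versions < 3.0.0, as stated above:
      ```lua
      -- Returns the replicaset section of the given info table.
      -- Prefers the 'replicaset' key and falls back to 'cluster',
      -- which described the replicaset before 3.0.0.
      local function replicaset_info(info)
          return info.replicaset or info.cluster
      end
      ```
      On a running instance it would be called as
      `replicaset_info(box.info).uuid`.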
      ef86e000
    • Vladislav Shpilevoy's avatar
      schema: replace 'cluster' -> 'replicaset_uuid' · aa987a82
      Vladislav Shpilevoy authored
      The replicaset UUID was stored in the _schema['cluster'] tuple.
      This is going to be confusing soon, because an actual concept of
      a cluster as multiple replicasets will be introduced.
      
      The patch renames it to 'replicaset_uuid'.
      
      Part of #5029
      
      @TarantoolBot document
      Title: Update '_schema' with new 'replicaset_uuid' key
      
      Currently _schema system space is documented to have 'cluster' key
      with replicaset UUID value. Now this key is deleted (since 3.0)
      and the UUID is stored in 'replicaset_uuid' key.
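
      A sketch of reading the UUID in a way that works with both key
      names. The function name is hypothetical, and the sketch assumes
      the value is stored in the second field of the tuple:
      ```lua
      -- Looks up the new key first and falls back to the old one.
      local function replicaset_uuid()
          local tuple = box.space._schema:get{'replicaset_uuid'} or
                        box.space._schema:get{'cluster'}
          return tuple and tuple[2]
      end
      ```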
      aa987a82