  1. Feb 08, 2023
  2. Jan 18, 2023
    • replication: change bootstrap and replication configuration behaviour · 2f8e2d98
      Serge Petrenko authored
      See the docbot request for details.
      
      Closes #5272
      
      @TarantoolBot document
      Title: new `bootstrap_strategy` configuration option
      
      Default behaviour of replica set bootstrap, replica recovery when
      connecting to remote nodes and replication reconfiguration is changed.
      The new behaviour is controlled by the option `bootstrap_strategy`,
      which has the default value "auto".
      
      The `replication_connect_quorum` configuration option now has no effect,
      and the effective quorum value for each stage of configuration (quorum
      of established connections, quorum of synced nodes) is determined
      automatically.
      
      On replica set bootstrap, the nodes will refuse to boot unless a
      majority is reached (this is equivalent to replication_connect_quorum = 3
      when #box.cfg.replication is 4 or 5, for example, or
      replication_connect_quorum = 2 when #box.cfg.replication is 2 or 3).
      Moreover, the bootstrap leader will fail to boot unless it sees that
      every connected node chose it as the bootstrap leader.
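
      The majority rule above can be sketched in plain Lua. This only
      illustrates the arithmetic; `majority_quorum` is a hypothetical helper,
      not a Tarantool function.

      ```lua
      -- Hypothetical helper: the smallest number of nodes that forms a strict
      -- majority of a replica set with `n` entries in box.cfg.replication.
      local function majority_quorum(n)
          return math.floor(n / 2) + 1
      end

      -- Print the quorum for the replica set sizes mentioned above.
      for n = 2, 5 do
          print(n, majority_quorum(n))
      end
      ```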
      
      On new replica join to an existing cluster, the replica will fail to
      boot only if it couldn't connect to anyone. As long as at least one
      connection is established, the replica will try to join like before.
      
      Moreover, the replica will check that its box.cfg.replication table
      contains every registered node in the cluster, thus ensuring that it has
      tried to connect to everyone and has chosen the best bootstrap leader
      possible.
      
      On replication reconfiguration on a working instance and recovery from
      local WAL files, the node will try to connect to everyone specified in
      box.cfg.replication. Any number of connections (even no connections)
      will be deemed a success, but the replica will stay in orphan mode until
      it is synced with everyone connected.
      
      If you wish to return to the old behavior, a deprecated setting
      `bootstrap_strategy` = "legacy" is left for now. With
      `bootstrap_strategy` = "legacy", the node behaves exactly like before:
      quorum for both connection and synchronisation is determined by
      `replication_connect_quorum`, and neither bootstrap leader nor joining
      replicas perform any additional checks on bootstrap.
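
      A hedged configuration sketch of the legacy mode described above. The
      URIs and the quorum value are placeholders chosen for illustration, not
      recommendations.

      ```lua
      box.cfg{
          -- Deprecated setting: restores the previous behaviour.
          bootstrap_strategy = 'legacy',
          -- Takes effect only together with the legacy strategy.
          replication_connect_quorum = 2,
          -- Placeholder peer URIs.
          replication = {
              'replicator:password@192.168.0.101:3301',
              'replicator:password@192.168.0.102:3301',
              'replicator:password@192.168.0.103:3301',
          },
      }
      ```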
      2f8e2d98
    • replication: set default replication_sync_timeout to 0 · 67cb4e4e
      Serge Petrenko authored
      The only observable behaviour of non-zero replication_sync_timeout is
      that it delays box.cfg{replication=...} return until either the node is
      synced with others or the timeout passes.
      
      If the timeout passes without reaching sync, box.cfg{} returns and the
      node enters "orphan" state, in which it can't write anything until
      either a reconfiguration happens or the replica set is finally synced.
      
      While the previous box.cfg{} call is running (probably waiting for
      replication_sync_timeout), the user can't issue another box.cfg{} call.
      
      So basically, while giving no guarantees that the node exits box.cfg{}
      in fully synced state, the timeout makes reconfiguration harder: even if
      the user knows that the sync won't be achieved, they have to wait
      until the full timeout passes in order to reconfigure replication.
      
      Let's make the default value of replication_sync_timeout 0 instead of
      300 seconds. The user may still set the timeout to any other value.
      Besides, we have recently introduced box.ctl.on_recovery_state triggers,
      which have a "synced" event, and this is the new recommended way to wait
      until the node is synced with others.
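
      A minimal sketch of the trigger-based approach, assuming the trigger
      receives the state name as its first argument and is installed before
      the first box.cfg{} call. The replication URI is a placeholder.

      ```lua
      local log = require('log')

      -- Install the trigger before box.cfg{} so that no state is missed.
      box.ctl.on_recovery_state(function(state)
          if state == 'synced' then
              log.info('instance is synced with the replica set')
          end
      end)

      box.cfg{
          replication = {'replicator:password@127.0.0.1:3301'}, -- placeholder
          replication_sync_timeout = 0, -- the new default, shown explicitly
      }
      ```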
      
      Part-of #5272
      
      @TarantoolBot document
      Title: Changed default value for `box.cfg.replication_sync_timeout`
      
      The default value for `replication_sync_timeout` configuration option
      was changed from 300 seconds to 0.
      67cb4e4e
    • box: broadcast registered replica uuids in ballot · fd61dc64
      Serge Petrenko authored
      Now the instance appends a list of registered replica set members it
      knows of to its ballot.
      
      Prerequisite #5272
      
      NO_CHANGELOG=not user-visible
      
      @TarantoolBot document
      Title: New fields in instance's ballot.
      
      Instance's ballot (a response to IPROTO_VOTE sent on replica connect)
      receives two new fields:
      1) The uuid of the node this instance considers the bootstrap leader.
         Key: IPROTO_BALLOT_BOOTSTRAP_LEADER_UUID = 0x08
         Value: uuid, encoded as 36-byte string (like
         "bfd2b31c-b740-43e5-bf3c-28538a74c9a6").
      2) An array of the uuids of registered replica set members.
         Key: IPROTO_BALLOT_REGISTERED_REPLICA_UUIDS = 0x09
         Value: a MP_ARRAY of uuids, each uuid encoded as a 36-byte string
         (like in an example above).
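
      The two fields can be pictured as the following Lua table. This is an
      illustrative sketch only: the real ballot is a MsgPack map built by the
      server, and `is_uuid_string` is a hypothetical helper that checks just
      the 36-byte length mentioned above.

      ```lua
      -- Key codes as listed above.
      local IPROTO_BALLOT_BOOTSTRAP_LEADER_UUID = 0x08
      local IPROTO_BALLOT_REGISTERED_REPLICA_UUIDS = 0x09

      -- Example ballot fragment; the uuid values are sample data.
      local ballot = {
          [IPROTO_BALLOT_BOOTSTRAP_LEADER_UUID] =
              'bfd2b31c-b740-43e5-bf3c-28538a74c9a6',
          [IPROTO_BALLOT_REGISTERED_REPLICA_UUIDS] = {
              'bfd2b31c-b740-43e5-bf3c-28538a74c9a6',
          },
      }

      -- Hypothetical helper: a uuid is carried as a 36-byte string.
      local function is_uuid_string(s)
          return type(s) == 'string' and #s == 36
      end

      print(is_uuid_string(ballot[IPROTO_BALLOT_BOOTSTRAP_LEADER_UUID]))
      ```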
      fd61dc64
    • box: broadcast bootstrap leader uuid in ballot · f06825e6
      Serge Petrenko authored
      Note that bootstrap leader uuid is not set when an anonymous replica
      registers, because technically it's not performing a bootstrap.
      
      Prerequisite #5272
      
      NO_DOC=appended to next commit's doc request
      NO_CHANGELOG=not user-visible
      f06825e6
    • box: introduce "internal.ballot" builtin event · e49b9085
      Serge Petrenko authored
      Add a new builtin event carrying instance's ballot information (that is,
      what this instance would normally send in reply to IPROTO_VOTE request).
      
      The event will be watched by connecting replicas to find the bootstrap
      leader.
      
      Prerequisite #5272
      
      NO_DOC=technically user-visible, but not intended for users
      NO_CHANGELOG=see NO_DOC
      e49b9085
    • test: rename cluster to replica_set in gh_6260 test · dc635190
      Serge Petrenko authored
      luatest_helpers/cluster module was recently added to luatest trunk and
      renamed to replica_set.
      Let's update its name everywhere in gh_6260_add_builtin_events_test,
      since this test will be amended in the following commits and the new
      module name will be used.
      
      In-scope-of #5272
      
      NO_DOC=refactoring
      NO_CHANGELOG=refactoring
      dc635190
  3. Dec 16, 2022
    • test: use luatest modules instead of internal ones · 62ffc72c
      Yaroslav Lobankov authored
      Some internal modules have been recently copied to luatest repo [1] and
      now they can be safely removed, and the corresponding functionality from
      luatest can be used instead.
      
      Affected modules:
      
      - test/luatest_helpers/cluster.lua
      
      [1] tarantool/luatest#271
      
      Closes tarantool/luatest#237
      Closes tarantool/luatest#269
      
      NO_DOC=testing stuff
      NO_TEST=testing stuff
      NO_CHANGELOG=testing stuff
      62ffc72c
  4. Dec 05, 2022
    • test: use luatest modules instead of internal ones · 21fc0770
      Yaroslav Lobankov authored
      Some internal modules have been recently copied to luatest repo [1,2]
      and now they can be safely removed, and the corresponding functionality
      from luatest can be used instead.
      
      Affected modules:
      
      - test/luatest_helpers/server.lua
      
      [1] tarantool/luatest#258
      [2] tarantool/luatest#266
      
      Closes tarantool/luatest#239
      
      NO_DOC=testing stuff
      NO_TEST=testing stuff
      NO_CHANGELOG=testing stuff
      21fc0770
  5. Sep 26, 2022
    • test: replace Lua `assert`s with luatest `assert`s in luatest tests · 1d9645f7
      Georgiy Lebedev authored
      Some luatest framework tests use Lua `assert`s, which are incomprehensible
      when failed (the only information provided is 'assertion failed!'),
      making debugging difficult: replace them with luatest `assert`s and their
      context-specific varieties.
      
      NO_CHANGELOG=<code health>
      NO_DOC=<code health>
      1d9645f7
  6. Mar 29, 2022
    • net.box: add predefined system events for pub/sub · e1d2f7f0
      Yan Shtunder authored
      Added predefined system events: box.status, box.id, box.election and
      box.schema.
      
      Closes #6260
      
      @TarantoolBot document
      Title: Built-in events for pub/sub
      
      Built-in events are needed, first of all, to learn who the master is,
      unless that is defined in an application-specific way. Knowing who the
      master is lets a client send changes to the correct instance and, where
      it matters, read the most up-to-date data. More built-in events are
      defined for other mutable properties: leader state change, election
      role and election term, schema version change, and instance state.
      
      Built-in events have a special naming scheme: their names always start
      with `box.`. The prefix is reserved for built-in events, and creating new
      events with this prefix is forbidden. Below is a list of all the events
      with their names and values:
      
      1. box.id
      Description - identification of the instance. Changes are extremely rare.
      Some values never change or change only once. For example, the instance
      UUID never changes after the first box.cfg, but it is not known before
      box.cfg is called. The replicaset UUID is unknown until the instance joins
      a replicaset or boots a new one, but the events are supposed to start
      working before that, right at listen launch. The instance numeric ID is
      known only after registration. On anonymous replicas it is 0 until they
      are registered officially.
      Value - {
          MP_STR "id": MP_UINT box.info.id,
          MP_STR "instance_uuid": MP_UUID box.info.uuid,
          MP_STR "replicaset_uuid": MP_UUID box.info.cluster.uuid,
      }
      
      2. box.status
      Description - generic blob about instance status. It contains the most
      commonly used and infrequently changed config options and box.info fields.
      Value - {
          MP_STR "is_ro": MP_BOOL box.info.ro,
          MP_STR "is_ro_cfg": MP_BOOL box.cfg.read_only,
          MP_STR "status": MP_STR box.info.status,
      }
      
      3. box.election
      Description - all the parts of box.info.election needed to find who
      is the most recent writable leader.
      Value - {
          MP_STR "term": MP_UINT box.info.election.term,
          MP_STR "role": MP_STR box.info.election.state,
          MP_STR "is_ro": MP_BOOL box.info.ro,
          MP_STR "leader": MP_UINT box.info.election.leader,
      }
      
      4. box.schema
      Description - schema-related data. Currently it is only version.
      Value - {
          MP_STR "version": MP_UINT schema_version,
      }
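
      As an illustration of how the box.election value might be used, here is
      a plain Lua sketch that picks the most recent writable leader from
      values collected from several instances. `find_writable_leader` and the
      instance names are hypothetical, and the sample values are made up.

      ```lua
      -- Hypothetical helper: `values` maps an instance name to the
      -- box.election value received from that instance. Returns the name and
      -- value of the writable leader with the highest term, or nil if no
      -- instance reports itself as a writable leader.
      local function find_writable_leader(values)
          local best_name, best
          for name, v in pairs(values) do
              if v.role == 'leader' and v.is_ro == false then
                  if best == nil or v.term > best.term then
                      best_name, best = name, v
                  end
              end
          end
          return best_name, best
      end

      local name = find_writable_leader({
          a = {term = 3, role = 'follower', is_ro = true, leader = 2},
          b = {term = 3, role = 'leader', is_ro = false, leader = 2},
          c = {term = 2, role = 'leader', is_ro = false, leader = 3},
      })
      print(name)
      ```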
      
      Built-in events can't be overridden. That is, users can't call
      box.broadcast('box.id', any_data) and so on.
      
      The events are available from the very beginning as non-MP_NIL values.
      This is necessary to support local subscriptions; otherwise there would
      be no way to detect whether an event is supported at all by this
      Tarantool version. If events are broadcast before box.cfg{}, then the
      following values will be available:
          box.id = {}
          box.schema = {}
          box.status = {}
          box.election = {}
      
      This way users are able to distinguish an event not being supported
      at all from box.cfg{} not having been called yet. Otherwise they would
      need to parse the _TARANTOOL version string locally and peer_version in
      net.box.
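
      The distinction can be sketched as a small check inside a watcher
      callback. `builtin_event_state` and the returned labels are hypothetical;
      the sketch assumes an unsupported key is delivered as nil and an
      unconfigured instance delivers an empty table, as described above.

      ```lua
      -- Hypothetical helper: classify the value delivered for a built-in
      -- event key such as 'box.id'.
      local function builtin_event_state(value)
          if value == nil then
              -- No value for the key: this version does not know the event.
              return 'unsupported'
          elseif next(value) == nil then
              -- Empty map: the event exists, box.cfg{} was not called yet.
              return 'not_configured'
          end
          return 'ready'
      end

      print(builtin_event_state({}))
      ```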
      
      Example usage:
      
       * Client:
         ```lua
         local net_box = require('net.box')
         conn = net_box.connect(URI)
         -- Subscribe to updates of key 'box.id'
         w = conn:watch('box.id', function(key, value)
             assert(key == 'box.id')
             -- do something with value
         end)
         -- or to updates of key 'box.status'
         w = conn:watch('box.status', function(key, value)
             assert(key == 'box.status')
             -- do something with value
         end)
         -- or to updates of key 'box.election'
         w = conn:watch('box.election', function(key, value)
             assert(key == 'box.election')
             -- do something with value
         end)
         -- or to updates of key 'box.schema'
         w = conn:watch('box.schema', function(key, value)
             assert(key == 'box.schema')
             -- do something with value
         end)
         -- Unregister the watcher when it's no longer needed.
         w:unregister()
         ```
      e1d2f7f0