  1. Aug 06, 2021
    • box: allow calling promote on a candidate · b0192cc1
      Serge Petrenko authored
      Part of #6034
    • box: split manual and automatic promotion · 8d9326f6
      Serge Petrenko authored
      There are two use cases for box_promote(): it's either invoked manually
      or after a won election round. In the latter case promote() is much
      simpler.
      
      Part-of #6034
    • box: test the assertion failure after a spurious wakeup in promote · 338e8675
      Serge Petrenko authored
      The failure itself was fixed in 68de8753
      (raft: replace raft_start_candidate with _promote). Let's add a
      regression test now.
      
      Follow-up #3055
    • box: make promote on the current leader a no-op · ed7fd96d
      Serge Petrenko authored
      It was allowed to call promote on any instance, even when it's already
      the limbo owner (in other words, the Raft leader, when elections are
      enabled).

      This doesn't break anything when elections are disabled, but is rather
      strange.

      When elections are enabled, on the contrary, calling promote() should be
      a no-op on the leader. Otherwise it would make the leader read-only until
      it wins the next election round, which is quite inconvenient.
      
      Part-of #6034
    • box: make promote always bump the term · d2996fa5
      Serge Petrenko authored
      When called without elections, promote resulted in multiple
      PROMOTE entries for the same term. This is not correct, because all
      the promotions for the same term except the first one would be ignored
      as already seen.
      
      Part-of #6034
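
The rule described above (a PROMOTE for an already seen term is ignored) can be illustrated with a toy model. This is only a sketch: `struct limbo_model` and `limbo_model_promote()` are hypothetical names, not Tarantool's actual limbo code, and the initial "greatest term" of 0 is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of the limbo state; not Tarantool's struct txn_limbo. */
struct limbo_model {
	uint64_t greatest_term; /* greatest PROMOTE term seen so far */
	uint32_t owner_id;      /* instance that currently owns the limbo */
};

/*
 * Apply a PROMOTE request. Returns true when the request is applied and
 * false when it is ignored because its term has already been seen.
 */
static bool
limbo_model_promote(struct limbo_model *limbo, uint64_t term,
		    uint32_t origin_id)
{
	assert(limbo != NULL);
	if (term <= limbo->greatest_term)
		return false; /* already seen: the request is dropped */
	limbo->greatest_term = term;
	limbo->owner_id = origin_id;
	return true;
}
```

In this model, two promotions issued with the same term leave the first caller as the owner, while bumping the term on every promotion lets each one take effect.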
    • box: split promote() into reasonable parts · 4fe5d5db
      Serge Petrenko authored
      box_promote() is a monster. It does a lot of different things based on
      flags: try_wait and run_elections. The flags themselves depend on the
      node's Raft state and the lunar calendar.
      
      Moreover, there are multiple cancellation points and places where
      external state may have changed and needs a re-check.
      
      Things are going to get even worse with the introduction of box.ctl.demote().
      
      So it's time to split up box_promote() into reasonable parts, each doing
      exactly one thing.
      
      This commit mostly addresses the multiple cancellation points issue,
      so that promote() doesn't look like a huge pile of if(something_changed)
      blocks. Some other functions will look like that instead.
      
      Part of #6034
    • raft: refactor raft_new_term() · 1dd9bd4f
      Serge Petrenko authored
      Make raft_new_term() always bump the current term, even when Raft is
      disabled.
      
      Part-of #6034
    • replication: send current Raft term in join response · e7f3b3b9
      Serge Petrenko authored
      Make Raft nodes send out their latest persisted term to joining
      replicas.
      
      This is needed to avoid the situation when the txn_limbo-managed 'promote
      greatest term' is greater than the current Raft term. Otherwise the
      following may happen: a replica joins off some instance and receives its
      latest limbo state. The state includes "greatest term seen" and makes the
      limbo filter out any data coming from instances with smaller terms.
      Imagine that the master this replica has joined from dies before the
      replica has a chance to subscribe to it. Then the replica doesn't receive
      the master's current Raft term and starts elections at the smallest term
      possible, 2 (when there are no suitable Raft nodes besides the replica).

      Once the elections in a small term number are won, a ton of problems
      arise: starting with filtering out PROMOTE requests for the "old" term and
      nop-ifying any data coming from terms smaller than "greatest term seen".
      
      Prerequisite #6034
    • replication: send latest effective promote in initial join · 05ef687f
      Serge Petrenko authored
      A joining instance may never receive the latest PROMOTE request, which
      is the only source of information about the limbo owner. Send out the
      latest limbo state (i.e. the latest applied PROMOTE request) together
      with the initial join snapshot.
      
      Prerequisite #6034
    • replication: add META stage to JOIN · f1c2127d
      Serge Petrenko authored
      The new META stage is part of server's response to a join request.
      It's marked by IPROTO_JOIN_META and IPROTO_JOIN_SNAPSHOT requests and goes
      before the actual snapshot data.
      
      Prerequisite #6034
      
      @TarantoolBot document
      Title: new protocol stage during JOIN
      
      A new stage is added to the stream of JOIN rows coming from the master.
      The stage is marked with a bodyless row with type
      IPROTO_JOIN_META = 71.
      Once all the rows from the stage are sent out, the JOIN continues as
      before (as a stream of snapshot rows). The end of the META stage is
      marked with a row of type IPROTO_JOIN_SNAPSHOT = 72.
      
      The stage contains the rows that are necessary for instance
      initialization (current Raft term, current state of synchronous
      transaction queue), but do not belong to any system space.
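
The row ordering described above could be tracked on the receiving side roughly as follows. This is a minimal sketch, not the applier code: only the two marker values (71 and 72) come from the commit message, while `enum join_stage`, `struct join_counts` and `join_feed_row()` are hypothetical names. Treating rows that arrive without any META marker as snapshot rows is an assumption (it models a master that predates the META stage).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Marker row types, as given in the commit message. */
enum {
	IPROTO_JOIN_META = 71,
	IPROTO_JOIN_SNAPSHOT = 72,
};

/* Hypothetical receiver-side stage tracker. */
enum join_stage {
	JOIN_STAGE_INITIAL,  /* no marker seen yet */
	JOIN_STAGE_META,     /* between JOIN_META and JOIN_SNAPSHOT */
	JOIN_STAGE_SNAPSHOT, /* after JOIN_SNAPSHOT */
};

struct join_counts {
	size_t meta_rows;
	size_t snapshot_rows;
};

/*
 * Feed the type of one incoming row. Returns false if a marker arrives
 * out of order, true otherwise.
 */
static bool
join_feed_row(enum join_stage *stage, uint32_t row_type,
	      struct join_counts *counts)
{
	assert(stage != NULL && counts != NULL);
	if (row_type == IPROTO_JOIN_META) {
		if (*stage != JOIN_STAGE_INITIAL)
			return false;
		*stage = JOIN_STAGE_META;
		return true;
	}
	if (row_type == IPROTO_JOIN_SNAPSHOT) {
		if (*stage != JOIN_STAGE_META)
			return false;
		*stage = JOIN_STAGE_SNAPSHOT;
		return true;
	}
	/* A data row belongs to whichever stage is currently open. */
	if (*stage == JOIN_STAGE_META)
		counts->meta_rows++;
	else
		counts->snapshot_rows++;
	return true;
}
```

Feeding the sequence "marker 71, two rows, marker 72, three rows" ends in the snapshot stage with two META rows and three snapshot rows counted.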
    • iproto: make iproto_write_error() blocking in debug · 2b3ca42c
      Serge Petrenko authored
      iproto_write_error() used to be blocking until the commit
      4dac37a6 (iproto: remove
      iproto_write_error_blocking()).
      Actually, it should block until the error is written to the socket,
      because some tests (vinyl/errinj.test.lua, for example) rely on that.

      Do not make iproto_write_error() blocking in release builds for safety
      reasons, as stated in the commit above. But make it blocking in debug
      builds for the sake of testing.
      
      Part-of #6034
    • replication: encode version in JOIN request · 528f5438
      Serge Petrenko authored
      The replica's version will be needed once sending limbo and election
      state snapshot is implemented.
      
      Prerequisite #6034
      
      @TarantoolBot document
      Title: New field in JOIN request
      
      JOIN request now comes with a new field: replica's version.
      The field uses IPROTO_SERVER_VERSION key.
    • txn_limbo: persist the latest effective promote in snapshot · fec06d1d
      Serge Petrenko authored
      Previously PROMOTE entries, just like CONFIRM and ROLLBACK, were only
      stored in WALs. This is because snapshots consist solely of confirmed
      transactions, so there's nothing to CONFIRM or ROLLBACK.
      
      PROMOTE has gained additional meaning recently: it pins limbo ownership
      to a specific instance, rendering everyone else read-only. So now
      PROMOTE information must be stored in snapshots as well.
      
      Save the latest limbo state (owner id and latest confirmed lsn) to the
      snapshot as a PROMOTE request.
      
      Prerequisite #6034
    • txn_limbo: fix promote term filtering · 7c16da00
      Serge Petrenko authored
      txn_limbo_process() used to filter out promote requests whose term was
      equal to the greatest term seen. This wasn't correct for PROMOTE entries
      with term 1.
      
      Such entries appear after box.ctl.promote() is issued on an instance
      with disabled elections. Every PROMOTE entry from such an instance has
      term 1, but should still be applied. Fix this in the patch.
      
      Also, when an outdated PROMOTE entry arrived from some replica with a
      term smaller than the one already applied, it wasn't filtered at all.
      Such a situation shouldn't be possible, but fix it as well.
      
      Part-of #6034
    • replication: always send raft state to subscribers · 3bee3fe2
      Serge Petrenko authored
      Tarantool used to send out raft state on subscribe only when raft was
      enabled. This was a safeguard against partially-upgraded clusters, where
      some nodes had no clue about Raft messages and couldn't handle them
      properly.
      
      Actually, Raft state should be sent out always. For example, promote
      will be changed to bump the Raft term even when Raft is disabled, and
      it's important that everyone in the cluster has the same term, at least
      for the sake of promote.

      So, send out Raft state to every subscriber with version >= 2.6.0
      (that's when Raft was introduced).
      Do the same for Raft broadcasts. They should be sent only to replicas
      with version >= 2.6.0.
      
      Closes #5438
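
The version gate described above might look roughly like this. It is a sketch under assumptions: `pack_version()` and `replica_understands_raft()` are hypothetical helpers, and packing each version component into one byte is chosen here only so that versions compare as plain integers. Only the 2.6.0 threshold comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical packing: one byte per component, so that a newer version
 * always compares greater than an older one.
 */
static uint32_t
pack_version(unsigned major, unsigned minor, unsigned patch)
{
	assert(major < 256 && minor < 256 && patch < 256);
	return ((uint32_t)major << 16) | ((uint32_t)minor << 8) |
	       (uint32_t)patch;
}

/* Raft state is sent only to peers at version 2.6.0 or newer. */
static bool
replica_understands_raft(uint32_t replica_version)
{
	return replica_version >= pack_version(2, 6, 0);
}
```

With this packing, 2.10.0 passes the check and 2.5.3 does not, which a naive string comparison of "2.10.0" and "2.6.0" would get wrong.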
  2. Aug 05, 2021
    • tests: fix test in sql-tap/cse.test.lua · d7e8d888
      Mergen Imeev authored
    • box: introduce on_election triggers · 7f5876fa
      Serge Petrenko authored
      On_election triggers are fired asynchronously after any Raft event with
      a broadcast. They are run in a worker fiber, so it's allowed to yield
      inside them, unlike Raft's on_update triggers we already had.
      
      Closes #5819
      
      @TarantoolBot document
      Title: document triggers on election state change
      
      A new function to register triggers is added, `box.ctl.on_election()`.
      Triggers registered via this function are run asynchronously every time
      a visible change appears in the `box.info.election` table.
      No parameters are passed to the trigger. It may check what's changed by
      looking at `box.info.election` and `box.info.synchro`.
    • sql: implicit cast rules for arithmetic operations · ecd8231b
      Mergen Imeev authored
      After this patch, arithmetic operations will only accept numeric values.
      For the "%" operation, the rules have become even stricter: now it
      accepts only INTEGER and UNSIGNED values.
      
      Part of #4470
      Closes #5756
    • sql: bit-wise operations now accepts only UNSIGNED · 3f14bcf1
      Mergen Imeev authored
      After this patch, bitwise operations will only accept UNSIGNED and
      positive INTEGER values as operands. The result of the bitwise operation
      will be UNSIGNED.
      
      Part of #4470
      Closes #5364
    • sql: fix STRING to BOOLEAN explicit cast · 06c94033
      Mergen Imeev authored
      Prior to this patch, if a non-NULL-terminated string was cast to
      BOOLEAN, the conversion always failed. Casting to BOOLEAN is now
      independent of NULL termination.
      
      Part of #4470
    • sql: disallow explicit cast of VARBINARY to number · 8034a076
      Mergen Imeev authored
      This patch removes explicit cast of VARBINARY values to numeric types.
      
      Part of #4470
      Closes #4772
      Closes #5852
    • sql: disallow explicit cast of BOOLEAN to number · 12dadb93
      Mergen Imeev authored
      This patch removes explicit cast of BOOLEAN values to numeric types and
      explicit cast of numeric values to BOOLEAN.
      
      Part of #4470
    • sql: remove OP_Realify · 482eceb9
      Mergen Imeev authored
      This opcode was used to convert INTEGER values to REAL. It is not
      necessary in Tarantool and causes errors.
      
      Due to OP_Realify, two types of errors appeared:
      1) In some cases, INTEGER may be converted to DOUBLE in a trigger.
      For example:
      box.execute("CREATE TABLE t (i NUMBER PRIMARY KEY, n NUMBER);")
      box.execute("CREATE TRIGGER t AFTER INSERT ON t FOR EACH ROW BEGIN UPDATE t SET n = new.n; END;")
      box.execute("INSERT INTO t VALUES (1, 1);")
      box.execute("SELECT i / 2, n / 2 FROM t;")
      
      Result:
      tarantool> box.execute("SELECT i / 2, n / 2 FROM t;")
      ---
      - metadata:
        - name: COLUMN_1
          type: number
        - name: COLUMN_2
          type: number
        rows:
        - [0, 0.5]
      ...
      
      2) If SELECT uses GROUP BY then it may return DOUBLE instead of INTEGER.
      For example:
      box.execute("CREATE TABLE t (i NUMBER PRIMARY KEY, n NUMBER);")
      box.execute("INSERT INTO t VALUES (1,1);")
      box.execute("SELECT i / 2, n / 2 FROM t GROUP BY n;")
      
      Result:
      tarantool> box.execute("SELECT i / 2, n / 2 FROM t GROUP BY n;")
      ---
      - metadata:
        - name: COLUMN_1
          type: number
        - name: COLUMN_2
          type: number
        rows:
        - [0.5, 0.5]
      ...
      
      This patch removes OP_Realify, after which these errors disappear.
      
      Closes #5335
    • sql: fix cast of small negative DOUBLE to INTEGER · 27262bf1
      Mergen Imeev authored
      Prior to this patch, when a DOUBLE value less than 0.0 and greater
      than -1.0 was cast to INTEGER, it was considered to be a negative number
      even though the result was 0. This patch fixes this, so now such a DOUBLE
      value is properly cast to INTEGER and UNSIGNED.
      
      Closes #6225
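
The pitfall described above can be reproduced in isolation. The sketch below is not the SQL engine's code: `cast_sign_from_source()` and `cast_sign_from_result()` are hypothetical names, and the input is assumed to fit into the `int64_t` range. It contrasts taking the sign from the DOUBLE before truncation with taking it from the truncated integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Flawed variant: decides "negative" from the source DOUBLE, so -0.5 is
 * reported as negative although it truncates to 0.
 */
static bool
cast_sign_from_source(double value, int64_t *result)
{
	assert(result != NULL);
	assert(value > -9.0e18 && value < 9.0e18); /* assumed in range */
	*result = (int64_t)value; /* C truncates toward zero */
	return value < 0;
}

/* Corrected variant: decides "negative" from the truncated result. */
static bool
cast_sign_from_result(double value, int64_t *result)
{
	assert(result != NULL);
	assert(value > -9.0e18 && value < 9.0e18); /* assumed in range */
	*result = (int64_t)value;
	return *result < 0;
}
```

For -0.5 both variants store 0, but only the second one reports a non-negative value, which is what allows the result to be used as UNSIGNED as well.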
  3. Aug 04, 2021
  4. Aug 02, 2021
    • decimal: introduce and use lua_pushdecimalstr() · c302360f
      Vladislav Shpilevoy authored
      Decimal conversion to string in Lua used the decimal_str() function.
      The function is not safe to use in a preemptive context like Lua,
      where any attempt to push something onto the Lua stack might
      trigger GC, which in turn might invoke any other code.

      It is not safe because it uses the static buffer, which is global and
      cyclic. Newer allocations can overwrite the old data without any
      warning.
      
      The same problem was fixed for tt_uuid_str() and uuids in
      box.info in one of the previous commits.
      
      The patch adds a new function lua_pushdecimalstr() which does not
      use the static buffer. It is now used to push decimals safely on a
      Lua stack.
      
      Follow up #5632
      Follow up #6050
      Closes #6259
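
The hazard of a cyclic static buffer can be shown with a toy example. This is a sketch only: `num_str()` and `num_to_string()` are hypothetical stand-ins that mirror the `_str()` / `_to_string()` naming, and the two-slot ring is a simplification, not Tarantool's real static allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum { SLOT_COUNT = 2, SLOT_SIZE = 32 };

/*
 * Unsafe variant: formats into a global cyclic buffer. The returned
 * pointer stays valid only until the ring wraps around.
 */
static const char *
num_str(int value)
{
	static char slots[SLOT_COUNT][SLOT_SIZE];
	static unsigned next_slot;
	char *slot = slots[next_slot++ % SLOT_COUNT];
	snprintf(slot, SLOT_SIZE, "%d", value);
	return slot;
}

/* Safe variant: formats into a buffer owned by the caller. */
static int
num_to_string(int value, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "%d", value);
}
```

If a caller keeps the pointer returned by `num_str(1)` and two more calls happen before it is used (for instance from code run by a GC step), the pointer now reads "3". A caller-owned buffer filled by `num_to_string()` is not affected.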
    • decimal: introduce decimal_to_string · 2738669b
      Vladislav Shpilevoy authored
      It saves decimal as a string into an externally passed buffer.
      This will be used by places which can not use the static buffer
      returned by decimal_str().
      
      Part of #6259
    • decimal: rename decimal_to_string to decimal_str · 9534cf55
      Vladislav Shpilevoy authored
      To be consistent with tt_uuid_str() and tt_uuid_to_string().
      _str() returns a string. _to_string() copies it into an externally
      passed buffer.
      
      Part of #6259
    • info: use luaL_pushuuidstr() for box.info uuids · 87a871a3
      Vladislav Shpilevoy authored
      box.info.uuid, box.info.cluster.uuid, and box.info.* replica UUIDs
      used the tt_uuid_str() function. The function is not safe to use in a
      preemptive context like Lua, where any attempt to push something
      onto the Lua stack might trigger GC, which in turn might invoke
      any other code.

      It is not safe because it uses the static buffer, which is global and
      cyclic. Newer allocations can overwrite the old data without any
      warning.
      
      Follow up #5632
      Follow up #6050
      Part of #6259
    • uuid: introduce and use luaL_pushuuidstr() · 51464c7d
      Vladislav Shpilevoy authored
      The function safely pushes tt_uuid as a string on a Lua stack.
      Safety means that it does not use tt_uuid_str() which stores the
      result into the global static buffer and can not be used in Lua
      context.
      
      The static buffer is not safe to use in Lua and Lua C because
      during a static string push onto a Lua stack the GC might be
      started and it can spoil the buffer.
      
      Part of #6259
    • net.box: rewrite response decoder in C · 77da64c3
      Vladimir Davydov authored
      This patch moves method_decoder table from Lua to C. This is a step
      towards rewriting performance-critical parts of net.box in C.
      
      Part of #6241
  5. Jul 30, 2021
    • test: fix stdout check in tarantoolctl test · 231c90af
      Leonid Vasiliev authored
      Before the patch, the return result was checked
      instead of stdout.
      
      Affected test: "check answers in case of call".
      Affected test case: "check 'eval' stdout for 'good_script ok_script.lua'".
    • luajit: bump new version · 72256657
      Igor Munkin authored
      * test: disable interactive mode assertions on BSD
      * test: update lua-Harness to c4451fe
      * test: support tarantool cli in lua-Harness
      * test: backport lua-Harness directory detection
      * test: support Tarantool in lua-Harness
      * test: refactor with _dofile
      * test: refactor with _retrieve_progname
      * test: use CI friendly variables in lua-Harness
      * test: rename lua-Harness tap to test_assertion
      * test: port lua-Harness to Test.Assertion
      
      Closes #5970
      Part of #4473
    • net.box: create mpstream in netbox_encode_method · b0cd298a
      Vladimir Davydov authored
      Currently, an mpstream is initialized with the Lua error handler in
      netbox_prepare_request, which is used by all encoding methods, including
      netbox_encode_auth. The latter will be moved to C, along with iproto
      request handlers, where we will have to use a different error handler.
      Let's create an mpstream in netbox_encode_method and netbox_encode_auth
      instead. For now, they do the same, but once we move the code to C they
      will use different error handlers.
      
      Part of #6241
    • net.box: rewrite request encoder in C · 6d4ce0de
      Vladimir Davydov authored
      This patch moves method_encoder table from Lua to C. This is a step
      towards rewriting performance-critical parts of net.box in C.
      
      Part of #6241
    • net.box: rename request.ctx to request.format · d50b2818
      Vladimir Davydov authored
      Request context only stores tuple format or nil, which is used for
      decoding a response. Rename it appropriately.
      
      Part of #6241
  6. Jul 29, 2021
    • replication: set replica ID before _cluster commit · 969a9ee1
      Vladislav Shpilevoy authored
      Replica registration works via looking for the smallest unoccupied
      ID in _cluster and inserting it into the space.

      This does not work well when MVCC is enabled. In particular, if more
      than 1 replica tries to register at the same time, they might get
      the same replica_id because they don't see each other's changes until
      the registration in _cluster is complete.

      In the end this leads to all replicas but one failing the
      registration with the 'duplicate key' error (the primary index in
      _cluster is the replica ID).

      The patch makes the replicas occupy their ID before they commit it
      into _cluster. The new replica ID search now uses the replica ID
      map instead of the _cluster iterator.

      This way the registration works like before, as if MVCC did not
      exist, which is fine.
      
      Closes #5601
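
The "occupy the ID before the commit" idea can be sketched with an in-memory map. The code below is illustrative only: `struct replica_id_map`, `replica_id_reserve()` and `replica_id_release()` are hypothetical names, and the limit of 32 IDs with 0 reserved as "nil" is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limits for the sketch: ID 0 means "no ID". */
enum { REPLICA_ID_NIL = 0, REPLICA_ID_LIMIT = 32 };

struct replica_id_map {
	bool occupied[REPLICA_ID_LIMIT];
};

/*
 * Find the smallest free ID and mark it occupied right away, before any
 * transaction is committed. Returns REPLICA_ID_NIL when the map is full.
 */
static uint32_t
replica_id_reserve(struct replica_id_map *map)
{
	assert(map != NULL);
	for (uint32_t id = 1; id < REPLICA_ID_LIMIT; id++) {
		if (!map->occupied[id]) {
			map->occupied[id] = true;
			return id;
		}
	}
	return REPLICA_ID_NIL;
}

/* Give the ID back, e.g. when the registration is rolled back. */
static void
replica_id_release(struct replica_id_map *map, uint32_t id)
{
	assert(map != NULL);
	if (id != REPLICA_ID_NIL && id < REPLICA_ID_LIMIT)
		map->occupied[id] = false;
}
```

Because the reservation is visible immediately, two registrations that start before either one commits receive different IDs instead of both picking the same "smallest free" value.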