  1. Aug 17, 2022
    • Serge Petrenko's avatar
      replication: fix downstream lag growing when there's no new transactions · a167a070
      Serge Petrenko authored
      Downstream lag is the difference in time between the moment a
      transaction was written to master's WAL and the moment an ack for it
      arrived.
      
      Its calculation is supported by replicas sending the last applied row
      timestamp. When there are no new transactions to replicate, the last
      applied row timestamp stays the same, so in this case downstream lag
      grows as time passes.
      
      Once an old master is replaced by a new one, it notices changes in peer
      vclocks and tries to update downstream lag unconditionally. This makes
      the lag appear to be growing indefinitely, showing the time since the
      last transaction on the old master:
      
      ```
       downstream:
         status: follow
         idle: 0.018218606001028
         vclock: {1: 3, 2: 2}
         lag: 34.623061401367
      ```
      
      The commit 56571d83 ("raft: make followers notice leader hang")
      made relay exchange information with tx even when there are no new
      transactions, so the issue became even easier to reproduce.
      
      The issue itself was present since downstream lag introduction in commit
      29025bce ("relay: provide information about downstream lag").
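
      One way to keep the lag from growing while nothing new is applied is
      to recompute it only when an ack carries a new transaction timestamp.
      The sketch below illustrates that idea only; `struct lag_state` and
      `lag_state_on_ack()` are hypothetical names, not the relay's actual
      data structures, and the patch itself may solve this differently.

      ```c
      #include <stdbool.h>

      /* Hypothetical per-replica state; not the real relay structures. */
      struct lag_state {
      	/* Timestamp of the last row the replica reported as applied. */
      	double last_acked_txn_time;
      	/* Last computed downstream lag, in seconds. */
      	double lag;
      };

      /*
       * Recompute the lag only when the ack carries a new transaction
       * timestamp. A repeated timestamp means nothing new was applied,
       * so the previous lag value is kept. Returns true on update.
       */
      static bool
      lag_state_on_ack(struct lag_state *st, double acked_txn_time, double now)
      {
      	if (acked_txn_time == st->last_acked_txn_time)
      		return false;
      	st->last_acked_txn_time = acked_txn_time;
      	st->lag = now - acked_txn_time;
      	return true;
      }
      ```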
      
      Closes #7581
      
      NO_DOC=bugfix
      a167a070
    • Cyrill Gorcunov's avatar
      log: free resources while event loop is running · 0c3f9b37
      Cyrill Gorcunov authored
      
      The 'log' module uses fibers internally for log rotation, and
      before we can free the log's resources (on program exit) we need to
      wait until rotation is complete, which implies that the event loop is
      still running. But we break the event loop in the `on_shutdown_f`
      trigger, and calling any event-based functionality later causes
      unexpected results because fibers are no longer valid to use. Thus
      move the `say_logger_free` call into the `on_shutdown_f` body, where
      fibers are still alive.
      
      N.B. Testing the issue is sensitive to timings. Local tests showed
      that a minimal delay of 1ms is enough to trigger it, thus
      ERRINJ_LOG_ROTATE is increased.
      
      Fixes #4450
      
      NO_DOC=bugfix
      
      Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
      0c3f9b37
  2. Aug 16, 2022
    • Ilya Verbin's avatar
      fiber: do not crash on concurrent fiber:join() · 8f4538cb
      Ilya Verbin authored
      If two or more fibers are yielding in fiber_join_timeout(), one of them
      will eventually join and recycle the fiber, while the rest will crash
      on accessing the recycled fiber's struct. Fix this by doing fiber_find()
      again after each waiting attempt in lbox_fiber_join().
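
      The "look the fiber up again after every wait" pattern can be shown
      with a toy model. Everything here is hypothetical: `toy_fiber`, the
      static registry and the `wait_*` callbacks only stand in for real
      fibers and yields, and none of this is Tarantool's fiber API.

      ```c
      #include <stdbool.h>
      #include <stddef.h>

      enum { MAX_FIBERS = 8 };

      struct toy_fiber {
      	int id;		/* 0 means the slot is free (recycled). */
      	bool is_dead;
      };

      static struct toy_fiber registry[MAX_FIBERS];

      /* Look a fiber up by id; returns NULL once it has been recycled. */
      static struct toy_fiber *
      toy_fiber_find(int id)
      {
      	if (id == 0)
      		return NULL;
      	for (size_t i = 0; i < MAX_FIBERS; i++) {
      		if (registry[i].id == id)
      			return &registry[i];
      	}
      	return NULL;
      }

      /* Simulates the target fiber finishing while the joiner waits. */
      static void
      wait_fiber_finished(int id)
      {
      	struct toy_fiber *f = toy_fiber_find(id);
      	if (f != NULL)
      		f->is_dead = true;
      }

      /* Simulates a competing joiner recycling the fiber during the wait. */
      static void
      wait_lost_race(int id)
      {
      	struct toy_fiber *f = toy_fiber_find(id);
      	if (f != NULL)
      		f->id = 0;
      }

      /*
       * Join by id. wait_once() stands in for a yielding wait, during which
       * another joiner may recycle the fiber, so the pointer obtained before
       * the wait is never reused: it is looked up again after every wait.
       * Returns 0 if this caller joined, -1 if the fiber is already gone.
       */
      static int
      toy_fiber_join(int id, void (*wait_once)(int id))
      {
      	struct toy_fiber *f = toy_fiber_find(id);
      	while (f != NULL && !f->is_dead) {
      		wait_once(id);
      		f = toy_fiber_find(id);
      	}
      	if (f == NULL)
      		return -1;
      	f->id = 0;	/* Recycle the slot. */
      	return 0;
      }
      ```

      In this model the losing joiner gets an error return instead of
      touching a recycled structure.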
      
      Closes #7489
      Closes #7531
      
      NO_DOC=bugfix
      8f4538cb
    • Ilya Verbin's avatar
      fiber: introduce fiber_wait_on_deadline() · 73e1059d
      Ilya Verbin authored
      It is separated from fiber_join_timeout(), and will be used
      in lbox_fiber_join() too.
      
      Part of #7489
      Part of #7531
      
      NO_DOC=internal
      NO_CHANGELOG=internal
      73e1059d
  3. Aug 15, 2022
    • Ilya Verbin's avatar
      test: extend the fiber.info() backtraces test · fc11ce06
      Ilya Verbin authored
      Test that an expected Lua function can be found in one of the frames.
      The C function case is already covered by this test.
      
      Closes #7535
      
      NO_DOC=test
      NO_CHANGELOG=test
      fc11ce06
    • Ilya Verbin's avatar
      cmake: normalize ENABLE_BACKTRACE option · 3a6021ea
      Ilya Verbin authored
      CMake accepts the following case-insensitive values as true: 1, ON, YES,
      TRUE, Y, or a non-zero number (including floating point numbers). This
      complicates the parsing of ENABLE_BACKTRACE in `tarantool.build.options`.
      Fix this by defining it to TRUE for any true value.
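
      To see why parsing the raw value is awkward for a consumer, here is a
      rough C approximation of the truthiness rule quoted above. The name
      `cmake_like_is_true()` is hypothetical, and the number check is a
      simplification (plain decimal notation only), not CMake's exact
      grammar.

      ```c
      #include <ctype.h>
      #include <stdbool.h>
      #include <stdlib.h>
      #include <string.h>

      /* Case-insensitive equality for ASCII strings. */
      static bool
      ascii_ieq(const char *a, const char *b)
      {
      	while (*a != '\0' && *b != '\0') {
      		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
      			return false;
      		a++;
      		b++;
      	}
      	return *a == *b;
      }

      /* Accept only characters that may appear in a decimal number. */
      static bool
      looks_decimal(const char *s)
      {
      	for (; *s != '\0'; s++) {
      		if (!isdigit((unsigned char)*s) &&
      		    strchr("+-.eE", *s) == NULL)
      			return false;
      	}
      	return true;
      }

      /* Approximation: 1, ON, YES, TRUE, Y or a non-zero number is true. */
      static bool
      cmake_like_is_true(const char *value)
      {
      	static const char *const words[] = {"1", "ON", "YES", "TRUE", "Y"};
      	for (size_t i = 0; i < sizeof(words) / sizeof(words[0]); i++) {
      		if (ascii_ieq(value, words[i]))
      			return true;
      	}
      	if (!looks_decimal(value))
      		return false;
      	char *end;
      	double num = strtod(value, &end);
      	/* The whole string must be consumed and the value non-zero. */
      	return end != value && *end == '\0' && num != 0.0;
      }
      ```

      Normalizing the option to a single TRUE at configure time spares
      every later reader of the build options from repeating such logic.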
      
      Part of #7535
      
      NO_DOC=internal
      NO_CHANGELOG=internal
      3a6021ea
    • Gleb Kashkin's avatar
      console: remove ERRINJ_STDIN_ISATTY injection · 16d6e9d2
      Gleb Kashkin authored
      Since the underlying problem behind this injection is fixed in #7357,
      the injection can be removed and the `-i` flag can be used as initially
      intended.
      
      Closes #7554
      Requires #7357
      NO_DOC=refactoring
      NO_CHANGELOG=refactoring
      16d6e9d2
  4. Aug 11, 2022
    • Boris Stepanenko's avatar
      raft: add strict fencing · 64ae9a08
      Boris Stepanenko authored
      With the current leader fencing implementation, the old leader doesn't
      resign its leadership before a new leader may be elected. Because of
      this, several "leaders" might coexist in a replicaset for some time.
      
      This commit changes replication_disconnect_timeout so that it is twice
      as short for the current raft leader (2*replication_timeout) if strict
      fencing is enabled. Assuming that replication_timeout is the same for
      every replica in the replicaset, this makes it less probable that a new
      leader can be elected before the old one resigns its leadership.
      
      Old fencing behaviour can be enabled by setting fencing to soft mode.
      This is useful when connection death timeouts shouldn't be affected
      (e.g. different replication_timeouts are set to prioritize some
      replicas as leader over the others).
      
      Closes #7110
      
      @TarantoolBot document
      Title: Strict fencing
      
      In `box.cfg` option `election_fencing_enabled` is deprecated in favor
      of `election_fencing_mode`. `election_fencing_mode` can be set to one
      of the following values:
      'off' - fencing turned off (same as `election_fencing_enabled` set to
      false before).
      Connection death timeout is 4*replication_timeout for all nodes.
      
      'soft' (default) - fencing turned on, but connection death timeout is
      the same for leader and followers in replicaset. This is enough to
      solve cluster being readonly and not being able to elect a new leader
      in some situations because of pre-vote.
      Connection death timeout is 4*replication_timeout for all nodes.
      
      'strict' - fencing turned on. In this mode leader tries its best to
      resign leadership before new leader can be elected. This is achieved
      by halving death timeout on leader.
      Connection death timeout is 4*replication_timeout for followers and
      2*replication_timeout for current leader.
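
      The timeouts listed for the three modes reduce to a small rule. The
      sketch below restates it in C; the enum and function names are
      hypothetical and are not taken from the Tarantool sources.

      ```c
      #include <stdbool.h>

      /* Hypothetical mirror of the three election_fencing_mode values. */
      enum fencing_mode { FENCING_OFF, FENCING_SOFT, FENCING_STRICT };

      /*
       * Connection death timeout as described above: halved for the
       * current leader in strict mode, 4*replication_timeout otherwise.
       */
      static double
      connection_death_timeout(enum fencing_mode mode, bool is_leader,
      			 double replication_timeout)
      {
      	if (mode == FENCING_STRICT && is_leader)
      		return 2 * replication_timeout;
      	return 4 * replication_timeout;
      }
      ```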
      64ae9a08
    • Boris Stepanenko's avatar
      luatest_helpers: add replication proxy · b907e726
      Boris Stepanenko authored
      Before, we used to modify box.cfg.replication to reproduce network
      problems in our tests. This worked fine in most situations, but doesn't
      work in others: when an instance gets disconnected by modifying
      box.cfg.replication, it closes its connection immediately (in terms of
      realtime), and this is noticed almost immediately by its neighbours in
      the replica set (because they receive EOF). This made it impossible to
      test some things that rely on specific timeouts in our code (e.g.
      strict fencing).
      
      This commit adds a helper, which acts as a UNIX socket proxy and can
      block connections transparently for tarantool instances. It makes it
      possible to write some tests that were not possible before. It is also
      possible to inject arbitrary packets between instances that are
      interconnected via the proxy.
      
      Usage:
      
                        +-------------------+
                        |tarantool server 1 |
                        +-------------------+
                                  |
                                  |
                                  |
                         .-----------------.
                        (   /tmp/test-out   )
                         `-----------------'
                                  |
                                  |
                                  |
                        +-------------------+
                        |       proxy       |
                        +-------------------+
                                  |
                                  |
                                  |
                         .-----------------.
                +-------(   /tmp/test-in    )--------+
                |        `-----------------'         |
                |                                    |
                |                                    |
                |                                    |
      +-------------------+                +-------------------+
      |tarantool server 2 |                |tarantool server 3 |
      +-------------------+                +-------------------+
      
      tarantool server 1 init.lua:
      box.cfg{listen = '/tmp/test-out'}
      box.once("schema", function()
          box.schema.user.grant('guest', 'super')
      end)
      
      tarantool server 2 and tarantool server 3 init.lua:
      box.cfg{replication = '/tmp/test-in'}
      
      proxy init.lua:
      -- Import proxy helper
      Proxy = require('test.luatest_helpers.proxy.proxy')
      
      -- Create proxy, which will (when started) listen on client_socket_path
      -- and accept connection when client tries to connect. The accepted
      -- socket connection is then passed to new Connection instance.
      proxy = Proxy:new({
          -- Path to UNIX socket, where proxy will await new connections.
          client_socket_path = '/tmp/test-in',
      
          -- Path to UNIX socket where tarantool server is listening.
          server_socket_path = '/tmp/test-out',
      
          -- Table, describing how to process client socket. Optional.
          -- Defaults used and described:
          process_client = {
              -- function(connection) which, if not nil, will be called once
              -- before client socket processing loop.
              pre = nil,
      
              -- function(connection, data) which, if not nil, will be called
              -- in loop, when new data is received from client socket.
              -- Connection.forward_to_server(connection, data) will:
              -- 1) Connect server socket to server_socket_path, if server
              --    socket is not connected.
              -- 2) Write data to server socket, if connected and writable.
              func = Connection.forward_to_server,
      
              -- function(connection) which, if not nil, will be called once
              -- after client socket processing loop.
              -- Connection.close_client_socket(connection) will shutdown and
              -- close client socket, if it is connected.
              post = Connection.close_client_socket,
          },
      
          -- Table, describing how to process server socket. Optional.
          -- Defaults used and described:
          process_server = {
              -- function(connection) which, if not nil, will be called once
              -- before server socket processing loop.
              pre = nil,
      
              -- function(connection, data) which, if not nil, will be called
              -- in loop, when new data is received from server socket.
              -- Connection.forward_to_client(connection, data) will write data
              -- to client socket, if it is connected and writable
              func = Connection.forward_to_client,
      
              -- function(connection) which, if not nil, will be called once
              -- after server socket processing loop.
              -- Connection.close_server_socket(connection) will shutdown and
              -- close server socket, if it is connected.
              post = Connection.close_server_socket,
          }
      
      })
      
      -- Bind client socket (defined by proxy.client_socket_path) and start
      -- accepting connections on it in a new fiber. If opts.force is set to
      -- true, it will remove proxy.client_socket_path file before binding to
      -- it. After proxy is started it will accept client connections and
      -- create Connection instance for each connection.
      proxy:start({force = false})
      
      -- Stop accepting new connections on client socket and join the fiber,
      -- created by proxy:start(), and close client socket. Also stop all
      -- active connections (see Connection:stop()).
      proxy:stop()
      
      -- Pause accepting new connections and pause all active connections (see
      -- Connection:pause()).
      proxy:pause()
      
      -- Resume accepting new connections and resume all paused connections
      -- (see Connection:resume())
      proxy:resume()
      
      -- Connection class:
      Connection:new({
          {
              -- Socket which is already created (by Proxy class for example).
              -- Optional, may be nil.
              client_socket = '?table',
      
              -- Path to connect server socket to. Will try to connect on
              -- initialization, and in Connection.forward_to_server.
              -- Can connect manually by calling
              -- Connection:connect_server_socket().
              server_socket_path = 'string',
      
              -- See Proxy:new()
              process_client = '?table',
      
              -- See Proxy:new()
              process_server = '?table',
          },
      })
      
      -- Start processing client socket, using functions from
      -- Connection:process_client.
      Connection:start()
      
      -- Connect server socket to Connection.server_socket_path (if not
      -- connected already). Start processing server socket, if successfully
      -- connected (using functions from Connection.process_server).
      Connection:connect_server_socket()
      
      -- Pause processing packets (both incoming from client socket and server
      -- socket).
      Connection:pause()
      
      -- Resume processing packets (both incoming from client socket and
      -- server socket).
      Connection:resume()
      
      -- Close server socket, if open.
      Connection:close_server_socket()
      
      -- Close client socket, if open.
      Connection:close_client_socket()
      
      -- Close client and server sockets, if open, and wait for processing
      -- fibers to die.
      Connection:stop()
      
      NO_DOC=test helpers
      NO_CHANGELOG=test helpers
      b907e726
  5. Aug 09, 2022
    • Gleb Kashkin's avatar
      console: fix -i being overruled by !isatty() · 9965e3fe
      Gleb Kashkin authored
      Previously, the interactive mode was ignored when stdin was not a tty.
      This is no longer the case, so the output of another command can now be
      handled by tarantool.
      Before the patch:
      ```
      $ echo 42 | tarantool -i
      LuajitError: stdin:1: unexpected symbol near '42'
      fatal error, exiting the event loop
      ```
      
      After the patch:
      ```
      $ echo 42 | tarantool -i
      Tarantool 2.5.0-130-ge3cf64a6c
      type 'help' for interactive help
      tarantool> 42
      ---
      - 42
      ...
      
      ```
      
      Closes #5064
      
      NO_DOC=bugfix
      9965e3fe
    • Gleb Kashkin's avatar
      test: fix reading STDIN command on openSUSE · ea07854e
      Gleb Kashkin authored
      Inspired by gh-5064, that breaks the previous version of the test on
      openSUSE. When using `io.popen:write()` on tarantool with `-i` flag, it
      failed to run the command on openSUSE. This happened because before
      gh-5064 patch it used to employ `luaL_loadfile()` that interprets EOF
      as the end of the command, while when it is loaded as a string openSUSE
      expects it to end with '\n'.
      
      Needed for #5064
      NO_DOC=test fix
      NO_TEST=test fix
      NO_CHANGELOG=test fix
      ea07854e
  6. Aug 08, 2022
    • Ilya Verbin's avatar
      util: introduce strlcat utility function · dcd9be4a
      Ilya Verbin authored
      strlcat is a function from BSD, which is designed to be a safer, more
      consistent, and less error-prone replacement for strcat and strncat.
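
      For readers unfamiliar with the BSD semantics, here is a minimal
      sketch of a strlcat-style function. It is named `sketch_strlcat()` to
      make clear that it is an illustration, not the implementation added
      by this commit.

      ```c
      #include <stddef.h>
      #include <string.h>

      /*
       * Append src to dst, where size is the full size of the dst buffer.
       * The result is NUL-terminated whenever dst was terminated within
       * size bytes. Returns the length the concatenation would have had
       * with unlimited room; a return value >= size means truncation.
       */
      static size_t
      sketch_strlcat(char *dst, const char *src, size_t size)
      {
      	size_t src_len = strlen(src);
      	/* Length of dst, never scanning past size bytes. */
      	size_t dst_len = 0;
      	while (dst_len < size && dst[dst_len] != '\0')
      		dst_len++;
      	if (dst_len == size)
      		return size + src_len;	/* No terminator, nothing to do. */
      	size_t room = size - dst_len - 1;
      	size_t n = src_len < room ? src_len : room;
      	memcpy(dst + dst_len, src, n);
      	dst[dst_len + n] = '\0';
      	return dst_len + src_len;
      }
      ```

      Unlike strncat, the size argument is the total buffer size rather
      than the number of bytes to append, and the return value lets the
      caller detect truncation with a single comparison.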
      
      NO_DOC=internal
      NO_CHANGELOG=internal
      
      Part of #7534
      dcd9be4a
  7. Aug 05, 2022
    • Vladimir Davydov's avatar
      memtx: fix dirty data written to snapshot for hash index · 64d87e88
      Vladimir Davydov authored
      The hash index doesn't create a snapshot clarifier, which is used for
      filtering out uncommitted tuples from a snapshot. Fix this. Also fix
      a bug in hash_snapshot_iterator_next, where we passed a wrong argument
      to tuple_data_range. It hasn't fired, because the clarifier didn't work.
      
      Fixes commit ee8ed065 ("txm: clarify all fetched tuples").
      Fixes commit f167c1af ("memtx: decompress tuples in snapshot
      iterator").
      
      Closes #7539
      
      NO_DOC=bug fix
      64d87e88
    • Georgiy Lebedev's avatar
      memtx: fix handling of corner cases gap tracking in transaction manager · 7360281e
      Georgiy Lebedev authored
      
      Gap tracking does not handle gap writes when the key has the same value as
      the gap item: review the whole gap write handling logic, refactor it and
      fix handling of corner cases along the way.
      
      Co-authored-by: Alexander Lyapunov <alyapunov@tarantool.org>
      
      Closes #7375
      
      NO_DOC=bugfix
      7360281e
    • Georgiy Lebedev's avatar
      memtx: fix tree iterator `next` result clarification · 542f9525
      Georgiy Lebedev authored
      The problem is described in #7073. It was fixed only for
      `tree_iterator_start_raw` next method, but other methods used for reverse
      iterators are also subject to this bug: move tuple clarification from the
      wrapper of iterator `next` methods to individual iterator methods.
      
      Closes #7432
      
      NO_DOC=bugfix
      542f9525
    • Alexander Turenko's avatar
      lua/decimal: add Lua value accessors to module API · c75fbce1
      Alexander Turenko authored
      The Rust module (see the issue) needs a getter and a setter for decimal
      values on the Lua stack. Let's make them part of the module API.
      
      Part of #7228
      
      @TarantoolBot document
      Title: Lua/C functions for decimals in the module API
      
      The following functions are added into the module API:
      
      ```c
      /**
       * Allocate a new decimal on the Lua stack and return
       * a pointer to it.
       */
      API_EXPORT box_decimal_t *
      luaT_newdecimal(struct lua_State *L);
      
      /**
       * Allocate a new decimal on the Lua stack with copy of given
       * decimal and return a pointer to it.
       */
      API_EXPORT box_decimal_t *
      luaT_pushdecimal(struct lua_State *L, const box_decimal_t *dec);
      
      /**
       * Check whether a value on the Lua stack is a decimal.
       *
       * Returns a pointer to the decimal on a successful check,
       * NULL otherwise.
       */
      API_EXPORT box_decimal_t *
      luaT_isdecimal(struct lua_State *L, int index);
      ```
      c75fbce1
    • Serge Petrenko's avatar
      raft: make followers notice leader hang · 56571d83
      Serge Petrenko authored
      It's possible to hang an instance by some non-yielding request. The
      simplest example is `while true do end`. A more true to life one would
      be a `select{}` from a large space, or `pairs` iteration over a space
      without yields.
      
      Any such request makes the instance unresponsive - it can serve neither
      reads nor writes. At the same time, the instance appears alive to other
      cluster members: relay thread used to communicate with others is not
      hung and continues to send heartbeats every replication_timeout.
      
      The problem is the most severe with Raft leader elections: followers
      believe the leader is fine and do not start elections despite leader
      being unable to serve reads or writes.
      
      Closes #7512
      
      NO_DOC=bugfix
      56571d83
  8. Aug 04, 2022
    • Vladimir Davydov's avatar
      salad: rework bps tree read view API · 91caa388
      Vladimir Davydov authored
      Currently, there's no notion of a BPS tree read view per se - one can
      create an iterator over a regular tree and then "freeze" it. This works
      just fine for snapshotting and joining replicas, but this spartan API
      doesn't let us implement user read views, because to do that we need to
      do lookups and create iterators over a frozen tree as many times as we
      want, not just once.
      
      So this patch introduces a concept of bps_tree_view, which contains a
      frozen image of a bps_tree and implements a subset of non-modifying
      bps_tree methods:
      
       - bps_tree_view_size
       - bps_tree_view_find
       - bps_tree_view_first
       - bps_tree_view_last
       - bps_tree_view_lower_bound
       - bps_tree_view_lower_bound_elem
       - bps_tree_view_upper_bound
       - bps_tree_view_upper_bound_elem
       - bps_tree_view_iterator_get_elem
       - bps_tree_view_iterator_prev
       - bps_tree_view_iterator_next
       - bps_tree_view_iterator_is_equal
      
      Note, bps_tree and bps_tree_view share bps_tree_iterator, because
      iterator methods (get_elem, next, prev, is_equal) take bps_tree or
      bps_tree_view. The bps_tree_iterator now contains only block index and
      offset.
      
      We could also implement the rest of non-modifying methods, but didn't do
      that, because they are not needed to implement user read views:
      
       - bps_tree_random
       - bps_tree_approximate_count
       - bps_tree_debug_check
       - bps_tree_print
      
      To create a bps_tree_view from a bps_tree, one is supposed to call
      bps_tree_view_create. If a bps_tree_view is no longer needed, it should
      be destroyed with bps_tree_view_destroy.
      
      Old methods used for creating frozen iterators were dropped:
      
       - bps_tree_iterator_freeze
       - bps_tree_iterator_destroy
      
      To avoid code duplication, we factored out the common part of bps_tree
      and bps_tree_view into a new structure, named bps_tree_common.
      Basically, the new structure contains all bps_tree members except
      matras, which is stored in bps_tree. The difference between
      bps_tree_view and bps_tree is that the latter stores matras_view
      instead of matras. The common part contains pointers to matras and
      matras_view, which are used by internal implementation to look up
      bps_tree blocks.
      
      All internal methods now take bps_tree_common instead of bps_tree.
      For all public methods that are implemented both for bps_tree and
      bps_tree_view, we have the common implementation defined in _impl
      suffixed private function, which is called by the corresponding public
      functions.
      
      To ensure that a modifying method isn't called on bps_tree_common object
      corresponding to a bps_tree_view because of a bug in the bps_tree
      implementation, we added !matras_is_read_view_created assertion to
      bps_tree_touch_block.
      
      Closes #7191
      
      NO_DOC=refactoring
      NO_CHANGELOG=refactoring
      91caa388
    • Vladimir Davydov's avatar
      Use bps_tree_size instead of accessing size directly · 5855fd30
      Vladimir Davydov authored
      We have a method for getting the number of elements stored in a BPS
      tree. Let's use it instead of accessing BPS tree internals directly
      so that we can freely refactor BPS tree internals.
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      5855fd30
    • Vladimir Davydov's avatar
      salad: rename a few bps_tree methods · c84990ad
      Vladimir Davydov authored
       - Rename bps_tree_iterator_are_equal to bps_tree_iterator_is_equal for
         consistency with other methods that check two objects for equality
         (for example, tt_uuid_is_equal).
      
       - Rename bps_tree_iterator_first and bps_tree_iterator_last to
         bps_tree_first and bps_tree_last, because these are methods of
         bps_tree, not bps_tree_iterator. Omitting _iterator is also
         consistent with bps_tree_lower_bound and bps_tree_upper_bound
         methods, which also create bps_tree_iterator objects.
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      c84990ad
    • Georgiy Lebedev's avatar
      memtx: fix reverse iterators gap tracking · fda38e66
      Georgiy Lebedev authored
      In case of reverse iterators, due to index limitations, we need to clarify
      the successor tuple early: this implies that the successor's story is not
      always at the top of the history chain, whilst we need to add the gap item
      to the story currently present in index — fix this by reusing the
      iterators' check logic to set the current iterator's tuple (which is
      considered the successor) to a tuple in index.
      
      Closes #7409
      
      NO_DOC=bugfix
      fda38e66
    • Georgiy Lebedev's avatar
      memtx: fix HASH index 'GT' iterator `next` method set incorrectly · 5c0d7117
      Georgiy Lebedev authored
      The `next` method of memtx HASH index 'GT' iterator is initially set to
      'GT' and is supposed to be set to 'GE' after first iteration: it is
      mistakenly set to the 'base' method instead of the full method which also
      does tuple clarification — this allows dirty reads. Move the `next` method
      change on first iteration to `WRAP_ITERATOR_METHOD` for clarity and
      correctness.
      
      Closes #7477
      
      NO_DOC=bugfix
      5c0d7117
    • Vladimir Davydov's avatar
      salad: rework light hash read view API · b595f212
      Vladimir Davydov authored
      Currently, there's no notion of a LIGHT hash table read view per se -
      one can create an iterator over a regular hash table and then "freeze"
      it. This works just fine for snapshotting and joining replicas, but this
      spartan API doesn't let us implement user read views, because to do that
      we need to do lookups and create iterators over a frozen hash table as
      many times as we want, not just once.
      
      So this patch introduces a concept of LIGHT(view), which contains a
      frozen image of a LIGHT(core) and implements a subset of non-modifying
      LIGHT(core) methods:
      
       - LIGHT(view_count)
       - LIGHT(view_find)
       - LIGHT(view_find_key)
       - LIGHT(view_get)
       - LIGHT(view_iterator_begin)
       - LIGHT(view_iterator_key)
       - LIGHT(view_iterator_get_and_next)
      
      Note, LIGHT(core) and LIGHT(view) share LIGHT(iterator), because
      iterator methods (begin, key, get_and_next) take LIGHT(core) or
      LIGHT(view). The LIGHT(iterator) now contains only a hash table slot.
      
      We could also implement the rest of non-modifying methods, but didn't do
      that, because they are not needed to implement user read views:
      
       - LIGHT(random)
       - LIGHT(selfcheck)
      
      To create a LIGHT(view) from a LIGHT(core), one is supposed to call
      LIGHT(view_create). If a LIGHT(view) is no longer needed, it should be
      destroyed with LIGHT(view_destroy).
      
      Old methods used for creating frozen iterators were dropped:
      
       - LIGHT(iterator_freeze)
       - LIGHT(iterator_destroy)
      
      To avoid code duplication, we factored out the common part of
      LIGHT(core) and LIGHT(view) into a new structure, named LIGHT(common).
      Basically, the new structure contains all LIGHT(core) members except
      matras, which is stored in LIGHT(core). The difference between
      LIGHT(view) and LIGHT(core) is that the latter stores matras_view
      instead of matras. The common part contains pointers to matras and
      matras_view, which are used by internal implementation to look up
      LIGHT(record).
      
      All internal methods now take LIGHT(common) instead of LIGHT(core).
      For all public methods that are implemented both for LIGHT(core) and
      LIGHT(view), we have the common implementation defined in _impl suffixed
      private function, which is called by the corresponding public functions.
      
      To ensure that a modifying method isn't called on LIGHT(common) object
      corresponding to a LIGHT(view) because of a bug in the LIGHT code, we
      added !matras_is_read_view_created assertion to LIGHT(touch_record),
      LIGHT(prepare_first_insert), and LIGHT(grow).
      
      Closes #7192
      
      NO_DOC=refactoring
      NO_CHANGELOG=refactoring
      b595f212
    • Vladimir Davydov's avatar
      salad: add LIGHT(count) method · 77b552fe
      Vladimir Davydov authored
      This commit adds a function that retrieves the number of records stored
      in a light hash table and makes light users use it instead of accessing
      the light count directly. This gives us more freedom of refactoring the
      light internals without modifying the code using it.
      
      Needed for #7192
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      77b552fe
    • Vladimir Davydov's avatar
      prbuf: fix prbuf_open for empty buffer · 943ce3ca
      Vladimir Davydov authored
      prbuf_check, which is called by prbuf_open, proceeds to scan the
      buffer even if it's empty. On a debug build, this results in prbuf_open
      reporting that the buffer is corrupted, because we trash the buffer in
      prbuf_create. On a release build, this may lead to a hang, in case the
      buffer is zeroed out. Let's fix this by returning success from
      prbuf_check if the buffer is empty. Note, prbuf_iterator_next doesn't
      call prbuf_first_record if the buffer is empty, either.
      
      Needed for https://github.com/tarantool/tarantool-ee/issues/187
      
      NO_DOC=bug fix
      NO_CHANGELOG=will be added to EE
      943ce3ca
    • Alexander Turenko's avatar
      decimal: add the library into the module API · 5c1bc3da
      Alexander Turenko authored
      The main decision made in this patch is how large the public
      `box_decimal_t` type should be. Let's look at some calculations.
      
      We're interested in the following values.
      
      * How many decimal digits are stored?
      * Size of the internal decimal type (`sizeof(decimal_t)`).
      * Size of a buffer to store a string representation of any valid
        `decimal_t` value.
      * Largest signed integer type fully represented in decimal_t (number of
        bits).
      * Largest unsigned integer type fully represented in decimal_t (number
        of bits).
      
      Currently, `decimal_t` is defined to store 38 decimal digits. This
      yields the following values:
      
      | digits | sizeof | string | int???_t | uint???_t |
      | ------ | ------ | ------ | -------- | --------- |
      | 38     | 36     | 52     | 127      | 126       |
      
      In fact, decNumber (the library we currently use under the hood) allows
      varying the 'decimal digits per unit' parameter, which is 3 by default,
      so we can choose the density of the representation. For example, for
      the given 38 digits the sizeof is 36 by default, but it may vary from
      28 to 47 bytes:
      
      | digits | sizeof     | string | int???_t | uint???_t |
      | ------ | ---------- | ------ | -------- | --------- |
      | 38     | 36 (28-47) | 52     | 127      | 126       |
      
      If we want to store the `int128_t` and `uint128_t` ranges, we'll need
      39 digits:
      
      | digits | sizeof     | string | int???_t | uint???_t |
      | ------ | ---------- | ------ | -------- | --------- |
      | 39     | 36 (29-48) | 53     | 130      | 129       |
      
      If we want to store `int256_t` and `uint256_t` ranges:
      
      | digits | sizeof     | string | int???_t | uint???_t |
      | ------ | ---------- | ------ | -------- | --------- |
      | 78     | 62 (48-87) | 92     | 260      | 259       |
      
      If we want to store `int512_t` and `uint512_t` ranges:
      
      | digits | sizeof       | string | int???_t | uint???_t |
      | ------ | ------------ | ------ | -------- | --------- |
      | 155    | 114 (84-164) | 169    | 515      | 514       |
      
      The decision here is what we consider possible and what unlikely.
      The patch freezes the maximum number of bytes in `decimal_t` at 64. So
      we'll be able to store 256-bit integers and will NOT be able to store
      512-bit integers in the future (at least not without ABI breakage).
      
      The script, which helps to calculate those tables, is at the end of the
      commit message.
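      
      As a rough cross-check of the 64-byte decision, the size formula from
      the script can be ported to C. This is only a sketch, not code from the
      patch: `unit_size_sketch()` and `sizeof_decimal_sketch()` are
      illustrative names, and the assumed layout (two `int32_t` fields, one
      `uint8_t`, padding, then the units) is taken from the script's comments.
      
      ```c
      #include <assert.h>
      #include <stddef.h>
      
      /* Bytes per unit for a given number of decimal digits per unit. */
      size_t
      unit_size_sketch(int digits_per_unit)
      {
          assert(digits_per_unit > 0 && digits_per_unit < 10);
          if (digits_per_unit <= 2)
              return 1;
          if (digits_per_unit <= 4)
              return 2;
          return 4;
      }
      
      /*
       * Estimated size of a decimal holding `digits` digits:
       * int32_t digits + int32_t exponent + uint8_t bits + padding + units.
       */
      size_t
      sizeof_decimal_sketch(int digits, int digits_per_unit)
      {
          assert(digits > 0);
          size_t us = unit_size_sketch(digits_per_unit);
          size_t padding = us - 1;
          /* Integer form of ceil(digits / digits_per_unit). */
          size_t unit_count =
              (size_t)((digits + digits_per_unit - 1) / digits_per_unit);
          return 4 + 4 + 1 + padding + us * unit_count;
      }
      ```
      
      With the default of 3 digits per unit, `sizeof_decimal_sketch(78, 3)`
      gives 62, which fits into 64 bytes, while `sizeof_decimal_sketch(155, 3)`
      gives 114, which does not.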
      
      Next, how else does the `box_decimal_*()` library differ from the
      internal `decimal_*()`?
      
      * Added a structure that may hold any decimal value from any current or
        future tarantool version.
      * Added `box_decimal_copy()`.
      * Left `strtodec()` out of scope -- we can add it later.
      * Left `decimal_str()` out of scope -- it looks dangerous without at
        least a good explanation of when data in the static buffer is
        invalidated. There is `box_decimal_to_string()` that writes to an
        explicitly provided buffer.
      * Added `box_decimal_mp_*()` for encoding to/decoding from msgpack.
        Unlike `mp_decimal.h` functions, here we always have `box_decimal_t`
        as the first parameter.
      * Left `decimal_pack()` out of scope, because a user is unlikely to
        want to serialize a decimal value piece-by-piece.
      * Exposed `decimal_unpack()` as `box_decimal_mp_decode_data()` to keep a
        consistent terminology around msgpack encoding/decoding.
      * More detailed API description, grouping by functionality.
      
      The script, which helps to calculate sizes around `decimal_t`:
      
      ```lua
      -- See notes in decNumber.h.
      
      -- DECOPUN: DECimal Digits Per UNit (decNumber.h names it DECDPUN)
      local function unit_size(DECOPUN)
          assert(DECOPUN > 0 and DECOPUN < 10)
          if DECOPUN <= 2 then
              return 1
          elseif DECOPUN <= 4 then
              return 2
          end
          return 4
      end
      
      function sizeof_decimal_t(digits, DECOPUN)
          -- int32_t digits;
          -- int32_t exponent;
          -- uint8_t bits;
          -- <..padding..>
          -- <..units..>
          local us = unit_size(DECOPUN)
          local padding = us - 1
          local unit_count = math.ceil(digits / DECOPUN)
          return 4 + 4 + 1 + padding + us * unit_count
      end
      
      function string_buffer(digits)
          -- -9.{9...}E+999999999# (# is '\0')
          -- ^ ^      ^^^^^^^^^^^^
          return digits + 14
      end
      
      function binary_signed(digits)
          local x = 1
          while math.log10(2 ^ (x - 1)) < digits do
              x = x + 1
          end
          return x - 1
      end
      
      function binary_unsigned(digits)
          local x = 1
          while math.log10(2 ^ x) < digits do
              x = x + 1
          end
          return x - 1
      end
      
      function digits_for_binary_signed(x)
          return math.ceil(math.log10(2 ^ (x - 1)))
      end
      
      function digits_for_binary_unsigned(x)
          return math.ceil(math.log10(2 ^ x))
      end
      
      function summary(digits)
          print('digits', digits)
          local sizeof_min = math.huge
          local sizeof_max = 0
          local DECOPUN_sizeof_min
          local DECOPUN_sizeof_max
          for DECOPUN = 1, 9 do
              local sizeof = sizeof_decimal_t(digits, DECOPUN)
              print('sizeof', sizeof, 'DECOPUN', DECOPUN)
              if sizeof < sizeof_min then
                  sizeof_min = sizeof
                  DECOPUN_sizeof_min = DECOPUN
              end
              if sizeof > sizeof_max then
                  sizeof_max = sizeof
                  DECOPUN_sizeof_max = DECOPUN
              end
          end
          print('sizeof min', sizeof_min, 'DECOPUN', DECOPUN_sizeof_min)
          print('sizeof max', sizeof_max, 'DECOPUN', DECOPUN_sizeof_max)
          print('string', string_buffer(digits))
          print('int???_t', binary_signed(digits))
          print('uint???_t', binary_unsigned(digits))
      end
      ```
      
      Part of #7228
      
      @TarantoolBot document
      Title: Module API for decimals
      
      See the declarations in `src/box/decimal.h` in tarantool sources.
      5c1bc3da
  9. Aug 02, 2022
    • Mergen Imeev's avatar
      sql: always treat NaN as NULL · 7c5651af
      Mergen Imeev authored
      In most cases, NaN was treated as NULL. But in case NaN was returned as
      a result of a Lua or C user defined function, it was considered a
      double. After this patch, NaN will also be considered NULL in the
      specified cases.
      
      Closes #6374
      Closes #6572
      
      NO_DOC=bugfix
      7c5651af
    • Mergen Imeev's avatar
      sql: fix wrong flag is_res_neg in sql_rem_int() · 1f4bb194
      Mergen Imeev authored
      This patch makes the is_res_neg flag false in the sql_rem_int() function
      if the left value is negative and the result is 0. Prior to this patch,
      the value of the flag was true, which resulted in an assertion during
      encoding 0 as MP_INT.
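      
      The rule can be illustrated with a small sketch. It is a simplified,
      hypothetical helper and does not reproduce the real `sql_rem_int()`
      signature: the sign flag follows the left operand, except that a zero
      result is never marked negative.
      
      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>
      
      /*
       * Hypothetical remainder helper. Returns lhs % rhs and sets
       * *is_res_neg only when the result is a non-zero negative value.
       */
      int64_t
      rem_int_sketch(int64_t lhs, int64_t rhs, bool *is_res_neg)
      {
          assert(rhs != 0);
          assert(is_res_neg != NULL);
          /* INT64_MIN % -1 overflows in C; the remainder is 0 anyway. */
          int64_t res = (rhs == -1) ? 0 : lhs % rhs;
          /* A zero result must not carry the sign of the left operand. */
          *is_res_neg = lhs < 0 && res != 0;
          return res;
      }
      ```
      
      For example, `rem_int_sketch(-4, 2, &neg)` returns 0 with `neg` set to
      false, so the value can be encoded as an unsigned zero.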
      
      Closes #6575
      
      NO_DOC=bugfix
      1f4bb194
    • Mergen Imeev's avatar
      sql: do nothing in ROUND() if precision is too big · 4c216c4c
      Mergen Imeev authored
      The smallest positive double value is 2.225E-307, and the value before
      the exponent has a maximum of 15 digits after the decimal point. This
      means that double values cannot have more than 307 + 15 digits after
      the decimal point.
      
      After this patch, ROUND() will return its first argument unchanged if
      the first argument is DOUBLE and the second argument is INTEGER greater
      than 322.
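      
      A minimal sketch of such a guard is shown below. The threshold comes
      from the message above; the function name is hypothetical and the
      rounding via `snprintf()`/`strtod()` is only a stand-in, not the
      implementation used by the SQL engine.
      
      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <stdlib.h>
      
      /* Threshold from the commit message: 307 + 15 digits. */
      enum { ROUND_MAX_PRECISION_SKETCH = 322 };
      
      double
      round_double_sketch(double value, int64_t precision)
      {
          /* Too many digits requested: nothing to round off. */
          if (precision > ROUND_MAX_PRECISION_SKETCH)
              return value;
          if (precision < 0)
              precision = 0;
          /* Sign, up to 309 integer digits, '.', 322 digits and NUL fit. */
          char buf[1024];
          int len = snprintf(buf, sizeof(buf), "%.*f", (int)precision, value);
          assert(len > 0 && (size_t)len < sizeof(buf));
          return strtod(buf, NULL);
      }
      ```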
      
      Closes #6650
      
      NO_DOC=bugfix
      4c216c4c
  10. Aug 01, 2022
    • Andrey Saranchin's avatar
      fiber: allow to reset fiber slice with SIGURG · 1a3b710d
      Andrey Saranchin authored
      The patch gives the user a way to reset the execution
      slice of the current fiber. This allows limiting
      iteration over a space with SIGURG.
      
      NO_CHANGELOG=see later commits
      NO_DOC=see later commits
      1a3b710d
    • Andrey Saranchin's avatar
      box: allow to limit space iteration with timeout · bc053c55
      Andrey Saranchin authored
      Currently, there is no way to interrupt a long-running
      request (such as s:select(nil)). This patch adds
      such a way.
      
      Box will use the fiber deadline as a timeout for DML requests.
      Thus, when the deadline of the current fiber is reached, all DML
      requests will end with a particular error.
      
      Closes #6085
      
      NO_CHANGELOG=see later commits
      NO_DOC=see later commits
      bc053c55
    • Andrey Saranchin's avatar
      test: adapt tests to iteration limit · b33ea6ea
      Andrey Saranchin authored
      Part of #6085
      
      NO_TEST=no behavior changes
      NO_CHANGELOG=no behavior changes
      NO_DOC=no behavior changes
      b33ea6ea
    • Andrey Saranchin's avatar
      fiber: introduce fiber slice · e9bd2250
      Andrey Saranchin authored
      This patch introduces execution time slice for fiber. Later, we will use
      this mechanism to limit iteration in space.
      
      Part of #6085
      
      NO_CHANGELOG=see later commits
      NO_DOC=see later commits
      e9bd2250
    • Alexander Turenko's avatar
      fiber_channel: add accessor to internal functions · 395c30e8
      Alexander Turenko authored
      The Rust module [1] leans on several internal symbols. They were exposed
      in Tarantool 2.8 (see #2971 and #5932), but were never part of the public
      API. Tarantool 2.10.0 hides the symbols and we need a way to get them
      back to use in the module.
      
      We have the following options:
      
      1. Design and expose a module API for fiber channels.
      2. Export the symbols with a prefix like `tnt_internal_` (to avoid
         polluting the global namespace).
      3. Provide a `dlsym()`-like function to get the address of an internal
         symbol for users who know what they're doing.
      
      I think that the third way offers the best compromise between amount of
      effort, quality of the result and opportunities to extend. In this
      commit I hardcoded the list of functions to make the change as safe as
      possible. Later I'll return here to autogenerate the list.
      
      Exported the following function from the tarantool executable:
      
      ```c
      void *
      tnt_internal_symbol(const char *name);
      ```
      
      I don't add it to the module API headers, because the function
      performs dark magic and we don't recommend it to users.
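      
      The idea of a `dlsym()`-like lookup over a hardcoded list can be
      sketched as follows. All names here are hypothetical placeholders, not
      the real internals; unlike `tnt_internal_symbol()`, which returns
      `void *`, the sketch returns a typed function pointer to stay within
      ISO C.
      
      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <string.h>
      
      /* Placeholder internals standing in for the hidden functions. */
      static int internal_foo(void) { return 1; }
      static int internal_bar(void) { return 2; }
      
      typedef int (*internal_fn)(void);
      
      /* Hardcoded name-to-address list, as described in the message. */
      static const struct {
          const char *name;
          internal_fn fn;
      } internal_symbols[] = {
          {"internal_foo", internal_foo},
          {"internal_bar", internal_bar},
      };
      
      /* Return the function registered under `name`, or NULL if unknown. */
      internal_fn
      internal_symbol_sketch(const char *name)
      {
          assert(name != NULL);
          size_t count = sizeof(internal_symbols) / sizeof(internal_symbols[0]);
          for (size_t i = 0; i < count; i++) {
              if (strcmp(internal_symbols[i].name, name) == 0)
                  return internal_symbols[i].fn;
          }
          return NULL;
      }
      ```
      
      A caller has to check the result for NULL before use, because a name
      missing from the list is not an error by itself.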
      
      While I'm here, added `static` to a couple of fiber channel functions,
      which are only used within the compilation unit.
      
      [1]: https://github.com/picodata/tarantool-module
      
      Part of #7228
      Related to #6372
      
      NO_DOC=don't advertize the dangerous API
      NO_CHANGELOG=don't advertize the dangerous API
      395c30e8
  11. Jul 27, 2022
    • Ilya Verbin's avatar
      box: fix thread_id check in box.stat.net.thread[] · 969b76ac
      Ilya Verbin authored
      The valid range for thread_id is [0, iproto_threads_count - 1].
      
      Closes #7196
      
      NO_DOC=bugfix
      969b76ac
    • Ilya Verbin's avatar
      box: check for foreign keys on space:truncate() · 7ec71b4f
      Ilya Verbin authored
      Add a missing check to on_replace_dd_truncate, similar to
      on_replace_dd_space and on_replace_dd_index.
      
      Closes #7309
      
      NO_DOC=bugfix
      7ec71b4f
    • Andrey Saranchin's avatar
      core: introduce helper tt_sigaction · 9839812c
      Andrey Saranchin authored
      The problem is that even if we block all signals on all
      threads except the main thread, the signals still can be
      delivered to other threads (#7206). Another problem is
      that a user can spawn their own thread without blocking
      signals.
      
      That is why the patch introduces the tt_sigaction function,
      which guarantees that all signals will be handled only by the
      main thread. We use this helper in the clock_lowres module.
      This is supposed to solve the problem described in #7408.
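      
      One way to get such a guarantee is to install a dispatcher that
      forwards a signal to the main thread when it arrives elsewhere. The
      sketch below assumes this approach; every name in it is hypothetical
      and it is not the tt_sigaction implementation.
      
      ```c
      #include <assert.h>
      #include <pthread.h>
      #include <signal.h>
      #include <stddef.h>
      
      /* Assumed upper bound for signal numbers in this sketch. */
      #define SKETCH_NSIG 65
      
      static pthread_t main_thread;
      static void (*user_handlers[SKETCH_NSIG])(int);
      
      /* Runs in whichever thread got the signal. */
      static void
      redirecting_handler(int signo)
      {
          if (!pthread_equal(pthread_self(), main_thread)) {
              /* Not the main thread: forward the signal and do nothing. */
              pthread_kill(main_thread, signo);
              return;
          }
          if (user_handlers[signo] != NULL)
              user_handlers[signo](signo);
      }
      
      /* Must be called from the main thread. Returns 0 or -1. */
      int
      sigaction_main_only_sketch(int signo, void (*handler)(int))
      {
          assert(signo > 0 && signo < SKETCH_NSIG);
          main_thread = pthread_self();
          user_handlers[signo] = handler;
          struct sigaction sa;
          sa.sa_handler = redirecting_handler;
          sa.sa_flags = 0;
          sigemptyset(&sa.sa_mask);
          return sigaction(signo, &sa, NULL);
      }
      
      /* Sample user handler: counts deliveries on the main thread. */
      volatile sig_atomic_t handled_count;
      
      void
      count_signal(int signo)
      {
          (void)signo;
          handled_count++;
      }
      ```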
      
      NO_CHANGELOG=internal
      NO_DOC=internal
      9839812c
  12. Jul 26, 2022
    • Alexander Turenko's avatar
      tuple: add JSON path field accessor to module API · bcca0b2b
      Alexander Turenko authored
      Added a function (see the API in the documentation request below), which
      reflects the `tuple[json_path]` Lua API (see #1285).
      
      Part of #7228
      
      @TarantoolBot document
      Title: tuple: access a field using JSON path via module API
      
      The following function is added into the module API:
      
      ```c
      /**
       * Return a raw tuple field in the MsgPack format pointed by
       * a JSON path.
       *
       * The JSON path includes the outmost field. For example, "c" in
       * ["a", ["b", "c"], "d"] can be accessed using "[2][2]" path (if
       * index_base is 1, as in Lua). If index_base is set to 0, the
       * same field will be pointed by the "[1][1]" path.
       *
       * The first JSON path token may be a field name if the tuple
       * has associated format with named fields. A field of a nested
       * map can be accessed in the same way: "foo.bar" or ".foo.bar".
       *
       * The return value is valid until the tuple is destroyed, see
       * box_tuple_ref().
       *
       * Return NULL if the field does not exist or if the JSON path is
       * malformed or invalid. Multikey JSON path token [*] is treated
       * as invalid in this context.
       *
       * \param tuple a tuple
       * \param path a JSON path
       * \param path_len a length of @a path
       * \param index_base 0 if array element indexes in @a path are
       *        zero-based (like in C) or 1 if they're one-based (like
       *        in Lua)
       * \retval a pointer to a field data if the field exists or NULL
       */
      API_EXPORT const char *
      box_tuple_field_by_path(box_tuple_t *tuple, const char *path,
      			uint32_t path_len, int index_base);
      ```
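      
      The effect of `index_base` on array tokens can be shown with a toy
      parser. It is an illustration only: `path_index_token_sketch()` is a
      hypothetical helper, it handles just `[N]` tokens and has nothing to do
      with the JSON path parser used inside tarantool.
      
      ```c
      #include <assert.h>
      #include <stdlib.h>
      
      /*
       * Parse one "[N]" token at *path and return the zero-based array
       * index, or -1 if the token is malformed or N is below index_base.
       * On success *path is advanced past the token.
       */
      long
      path_index_token_sketch(const char **path, int index_base)
      {
          assert(index_base == 0 || index_base == 1);
          const char *p = *path;
          if (*p != '[')
              return -1;
          p++;
          if (*p < '0' || *p > '9')
              return -1;
          char *end;
          long n = strtol(p, &end, 10);
          if (*end != ']' || n < index_base)
              return -1;
          *path = end + 1;
          return n - index_base;
      }
      ```
      
      Both `"[2][2]"` with `index_base` 1 and `"[1][1]"` with `index_base` 0
      resolve to the zero-based indexes 1 and 1, which matches the example
      from the function description.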
      bcca0b2b
  13. Jul 25, 2022
    • Ilya Verbin's avatar
      box: return 1-based fkey field numbers to Lua · 014f5aa1
      Ilya Verbin authored
      In Lua, field numbers are 1-based; however, currently
      space:format() and space.foreign_key return zero-based foreign key
      fields, which leads to an error on space:format(space:format()).
      
      Closes #7350
      
      NO_DOC=bugfix
      014f5aa1
    • Ilya Verbin's avatar
      box: do not modify format arg by normalize_format · a8b6fd0c
      Ilya Verbin authored
      Currently a foreign_key field in the `format` argument, passed to
      normalize_format, can be changed inside normalize_foreign_key_one.
      Fix this by using a local copy of def.field.
      
      NO_DOC=bugfix
      NO_CHANGELOG=minor bug
      a8b6fd0c