  1. Apr 14, 2021
• build: install libcurl headers · 38d0b0c1
      Roman Khabibov authored
Ship libcurl headers to the system path "${PREFIX}/include/tarantool"
when libcurl is included as a bundled library or in a static
build. This is needed to use the SMTP client with tarantool's
libcurl instead of the system libcurl.
      
      See related issue: https://github.com/tarantool/smtp/issues/24
      
      Closes #4559
• build: enable smtp · 4bde1dbc
      Roman Khabibov authored
Enable the smtp and smtps protocols in the bundled libcurl. This is
needed to use the SMTP client with tarantool's libcurl instead of
the system libcurl.
      
      See related issue: https://github.com/tarantool/smtp/issues/24
      
      Part of #4559
• luajit: bump new version · ef55e488
      Sergey Kaplun authored
      
      LuaJIT submodule is bumped to introduce the following changes:
      * tools: introduce --leak-only memprof parser option
      
This changeset also introduces a new Lua module providing
post-processing routines for parsed memory events:
* memprof/process.lua: post-process the collected events

The changes add an option to show only the heap difference. One can
launch the memory profiler parser with the introduced option via the
following command:
      $ tarantool -e 'require("memprof")(arg)' - --leak-only filename.bin
      
      Closes #5812
      
Reviewed-by: Igor Munkin <imun@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
• box: implement box.lib module · f463b5fa
      Cyrill Gorcunov authored
      
Currently, to run a "C" function from some external module, one
has to register it first in the "_func" system space. This is
a problem if the node is in read-only mode (a replica).

Still, people would like to have a way to run such functions
even in read-only mode. For this sake we implement the "box.lib" Lua module.

Unlike the `box.schema.func` interface, `box.lib` does not defer the
module loading procedure until the first call of a function. Instead,
a module is loaded immediately, and if some error happens (say, the
shared library is corrupted or not found) it pops up early.

The need to use stored C procedures implies that the application is
running under serious load, and most likely there is a modular
structure at the Lua level (i.e. the same shared library is loaded in
different sources), thus we cache the loaded library and reuse it on
subsequent load attempts. To verify that the cached library is up to
date, the module_cache engine tests file attributes (device, inode,
size, modification time) on every load attempt.
      
Since both `box.schema.func` and `box.lib` use caching to minimize
the module loading procedure, a pass-through caching scheme is
implemented:

 - box.lib relies on the module_cache engine for caching;
 - box.schema.func snoops into the box.lib hash table when attempting
   to load a new module: if the module is present in the box.lib hash,
   it is simply referenced from there and added into box.schema.func's
   own hash table; if the module is not present, it is loaded from
   scratch and put into both hashes;
 - the module_reload action in box.schema.func invalidates the
   module_cache or fills it if the entry is not present.
      
      Closes #4642
      
Co-developed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
      
      @TarantoolBot document
      Title: box.lib module
      
      Overview
      ========
      
The `box.lib` module provides a way to create, delete and execute
`C` procedures from shared libraries. Unlike the `box.schema.func`
methods, functions created with the help of `box.lib` are not
persistent and live purely in memory. Once a node is turned off,
they vanish. Their initial purpose is to be executed on nodes
running in read-only mode.
      
      Module functions
      ================
      
      `box.lib.load(path) -> obj | error`
      -----------------------------------
      
Loads a module from `path` and returns an object instance
associated with the module; otherwise an error is thrown.

The `path` should not end with a shared library extension
(such as `.so`); only the file name shall be there.
      
      Possible errors:
      
      - IllegalParams: module path is either not supplied
        or not a string.
      - SystemError: unable to open a module due to a system error.
      - ClientError: a module does not exist.
      - OutOfMemory: unable to allocate a module.
      
      Example:
      
      ``` Lua
-- Without error handling
m = box.lib.load('path/to/library')

-- With error handling
ok, m = pcall(box.lib.load, 'path/to/library')
if not ok then
    print(m) -- on failure, `m` holds the error
end
      ```
      
      `module:unload() -> true | error`
      ---------------------------------
      
Unloads a module. Returns `true` on success, otherwise an error
is thrown. Once the module is unloaded, one can't load new
functions from this module instance.
      
      Possible errors:
      
      - IllegalParams: a module is not supplied.
      - IllegalParams: a module is already unloaded.
      
      Example:
      
      ``` Lua
      m = box.lib.load('path/to/library')
      --
      -- do something with module
      --
      m:unload()
      ```
      
If functions from this module are referenced somewhere else in the
Lua code, they can still be executed, because the module stays in
memory until the last reference to it is closed.

If the module becomes a target of Lua's garbage collector,
unload is called implicitly.
      
      `module:load(name) -> obj | error`
      ----------------------------------
      
Loads a new function with the name `name` from the previously
loaded `module` and returns a callable object instance
associated with the function. On failure an error is thrown.
      
      Possible errors:
       - IllegalParams: function name is either not supplied
         or not a string.
       - IllegalParams: attempt to load a function but module
         has been unloaded already.
       - ClientError: no such function in the module.
       - OutOfMemory: unable to allocate a function.
      
      Example:
      
      ``` Lua
-- Load a module if not loaded yet.
      m = box.lib.load('path/to/library')
      -- Load a function with the `foo` name from the module `m`.
      func = m:load('foo')
      ```
      
If there is no need to load other functions from the same
module, the module might be unloaded immediately:
      
      ``` Lua
      m = box.lib.load('path/to/library')
      func = m:load('foo')
      m:unload()
      ```
      
      `function:unload() -> true | error`
      -----------------------------------
      
      Unloads a function. Returns `true` on success, otherwise
      an error is thrown.
      
      Possible errors:
       - IllegalParams: function name is either not supplied
         or not a string.
 - IllegalParams: the function is already unloaded.
      
      Example:
      
      ``` Lua
      m = box.lib.load('path/to/library')
      func = m:load('foo')
      --
      -- do something with function and cleanup then
      --
      func:unload()
      m:unload()
      ```
      
If the function becomes a target of Lua's garbage collector,
unload is called implicitly.
      
      Executing a loaded function
      ===========================
      
Once a function is loaded, it can be executed as an ordinary Lua
call. Let's consider the following example: we have a `C` function
which takes two numbers and returns their sum.
      
      ``` C
      int
      cfunc_sum(box_function_ctx_t *ctx, const char *args, const char *args_end)
      {
      	uint32_t arg_count = mp_decode_array(&args);
      	if (arg_count != 2) {
      		return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
      				     "invalid argument count");
      	}
      	uint64_t a = mp_decode_uint(&args);
      	uint64_t b = mp_decode_uint(&args);
      
      	char res[16];
      	char *end = mp_encode_uint(res, a + b);
      	box_return_mp(ctx, res, end);
      	return 0;
      }
      ```
      
The name of the function is `cfunc_sum` and the function is built into
the `cfunc.so` shared library.

First we should load it:
      
      ``` Lua
      m = box.lib.load('cfunc')
      cfunc_sum = m:load('cfunc_sum')
      ```
      
Once it is successfully loaded we can execute it. Let's call
`cfunc_sum` with a wrong number of arguments:
      
      ``` Lua
      cfunc_sum()
       | ---
       | - error: invalid argument count
      ```
      
We will see the `"invalid argument count"` message in the output.
The error message has been set by `box_error_set` in the `C`
code above.

On success the sum of the arguments is printed out:
      
      ``` Lua
      cfunc_sum(1, 2)
       | ---
       | - 3
      ```
      
Functions may return multiple results. For example, a trivial
echo function returns the arguments passed in:
      
      ``` Lua
      cfunc_echo(1,2,3)
       | ---
       | - 1
       | - 2
       | - 3
      ```
      
      Module and function caches
      ==========================
      
Loading a module is a relatively slow procedure, because the
operating system needs to read the library, resolve its symbols,
and so on. Thus, to speed this up, when a module is loaded for the
first time we put it into an internal cache. If the module is
already sitting in the cache and a new load request comes in, we
simply reuse the previous copy. If the module has been updated on
the storage device, then on the next load attempt we detect that
the file attributes (such as device number, inode, size,
modification time) have changed and reload the module from scratch.
Note that a newly loaded module does not intersect with previously
loaded modules; they continue operating with the code previously
read from the cache.

Thus, if there is a need to update a module, all module instances
should be unloaded (together with their functions) and loaded again.

A similar caching technique is applied to functions: only the first
function allocation causes symbol resolving; subsequent ones are
simply obtained from the function cache.
• box/func: fix modules functions restore · b9f2bf4e
      Cyrill Gorcunov authored
      
In commit 96938faf ('Add hot function reload for C procedures')
the ability to hot-reload modules was introduced. When a module is
reloaded, its functions are resolved to new symbols, but if
something goes wrong, the old symbols from the old module are
supposed to be restored.

Actually, the current code restores only one function and may
crash if there is a bunch of functions to restore. Let's fix it.
      
      Fixes #5968
      
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
• applier: process synchro rows after WAL write · b259e930
      Vladislav Shpilevoy authored
Applier used to process the synchronous rows CONFIRM and ROLLBACK
right after receipt, before they were written to WAL.
      
That led to a bug: the confirmed data became visible and might be
accessed by user requests; then the node restarted before CONFIRM
finished its WAL write, and the data was not visible again. That
is just as if it had been rolled back, which is not acceptable.
      
      Another case - CONFIRM WAL write could simply fail due to any
      reason (no disk space, OOM), but the transactions would remain
      confirmed anyway.
      
      Also that produced some hacks in the limbo's code to support the
      confirmation and rollback of transactions not yet written to WAL.
      
      The patch makes the synchro rows processed only after they are
      written to WAL. Although the 'rollback' case above might still
      happen if the xlogs were in the kernel caches, and the machine was
      powered off before they were flushed to disk. But that is not
      related to qsync specifically.
      
      To handle the synchro rows after WAL write the patch makes them go
      to WAL in a blocking way (journal_write() instead of
      journal_write_try_async()). Otherwise it could happen that a
      CONFIRM/ROLLBACK is being written to WAL and would clear the limbo
      afterwards, but a new transaction arrives with a different owner,
      and it conflicts with the current limbo owner.
      
      Closes #5213
• update: allow update absent nullable fields · 2bb373b9
      Mary Feofanova authored
      Update operations could not insert with gaps. This patch changes
      the behavior so that the update operation fills the missing fields
      with nulls.
      Part of #3378
      
      @TarantoolBot document
      Title: Allow update absent nullable fields
Update operations could not insert with gaps. The behavior is
changed so that the update operation fills the missing fields with
nulls. For example, we create a space `s = box.schema.create_space('s')`,
then create an index for it `pk = s:create_index('pk')`, and
then insert a tuple into the space: `s:insert{1, 2}`. After all of
this we try to update this tuple: `s:update({1}, {{'!', 5, 6}})`.
In the previous version this operation failed with the
ER_NO_SUCH_FIELD_NO error; now it finishes successfully, and the
tuple [1, 2, null, null, 6] is in the space, as shown below.
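A minimal console sketch of the example above (a hypothetical fresh instance; the resulting tuple is taken from the description):

``` Lua
s = box.schema.create_space('s')
pk = s:create_index('pk')
s:insert{1, 2}
-- Insert into the absent field no. 5: the gap is filled with nulls.
s:update({1}, {{'!', 5, 6}})
-- The tuple is now [1, 2, null, null, 6].
```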
  2. Apr 13, 2021
• iproto: implement ability to run multiple iproto threads · 2ede3be3
      mechanik20051988 authored
There are users with specific workloads where the iproto thread
is the throughput bottleneck: the iproto thread's core is 100% loaded
while the TX thread's core is not. For such cases it would be nice to
have the capability to create several iproto threads.
      
      Closes #5645
      
      @TarantoolBot document
      Title: implement ability to run multiple iproto threads
Implement the ability to run multiple iproto threads, which is useful
for specific workloads where the iproto thread is the throughput
bottleneck. To specify the count of iproto threads, use the
`iproto_threads` option in box.cfg. For example, to start 8 iproto
threads, enter `box.cfg{iproto_threads=8}`. The default iproto
thread count is 1. This option is not dynamic, so it can't be
changed after the first setting until a server restart. Distribution of
connections between threads is managed by the OS kernel.
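A minimal configuration sketch (the thread count is an example value):

``` Lua
-- Static option: takes effect on start and can't be changed
-- without a server restart.
box.cfg{iproto_threads = 8}
```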
• build: fix configuring using cmake3 command · 820d2be6
      Alexander Turenko authored
The `cmake` command was hardcoded for configuring libcurl; however, only
`cmake3` may be installed on a system. Now we use the same cmake command
for configuring libcurl as the one used for configuring tarantool
itself.
      
      The problem exists since 2.6.0-196-g2b0760192 ('build: enable cmake in
      curl build').
      
      Fixes #5955
  3. Apr 12, 2021
• qsync: provide box.info.synchro interface for monitoring · bce3b581
      Cyrill Gorcunov authored
      
In commit 14fa5fd8 ('cfg: support symbolic evaluation of
replication_synchro_quorum') we implemented support for
symbolic evaluation of the `replication_synchro_quorum` parameter,
but there is no easy way to obtain its current run-time value,
i.e. the evaluated numeric value.

Moreover, we would like to fetch the queue length of the transaction
limbo for tests and to extend these statistics in the future. Thus,
let's add them.
      
      Closes #5191
      
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
      
      @TarantoolBot document
      Title: Provide `box.info.synchro` interface
      
The `box.info.synchro` leaf provides information about the details of
synchronous replication.

In particular, `quorum` represents the current value of the
synchronous replication quorum defined by the
`replication_synchro_quorum` configuration parameter; since it can be
set as a dynamic formula such as `N/2+1`, the value depends on the
current number of replicas.

Since synchronous replication does not commit data immediately
but waits for its propagation to replicas, the data sits in a queue
gathering `commit` responses from remote nodes. The current number of
entries waiting in the queue is shown via the `queue.len` member.
      
A typical output is the following:
      
      ``` Lua
      tarantool> box.info.synchro
      ---
      - queue:
          len: 0
        quorum: 1
      ...
      ```
      
The `len` member shows the current number of entries in the queue,
and the `quorum` member shows the evaluated value of the
`replication_synchro_quorum` parameter.
  4. Apr 05, 2021
• recovery: make it transactional · 9311113d
      Vladislav Shpilevoy authored
Recovery used to be performed row by row. It was fine because
all the persisted rows are supposed to be committed and should not
meet any problems during recovery, so a transaction could safely be
applied partially.

But that stopped being true after the introduction of synchronous
replication. Synchronous transactions might be in the log, but
can be followed by a ROLLBACK record which is supposed to delete
them.

During row-by-row recovery, firstly, each synchro row turned
into a sync transaction, which is probably fine. But rows on
non-sync spaces which were part of a sync transaction could be
applied right away, bypassing the limbo, leading to all kinds of
sweet errors like duplicate keys, or inconsistency of a partially
applied transaction.
      
      The patch makes the recovery transactional. Either an entire
      transaction is recovered, or it is rolled back which normally
      happens only for synchro transactions followed by ROLLBACK.
      
      In force recovery of a broken log the consistency is not
      guaranteed though.
      
      Closes #5874
• replication: do not ignore replica vclock on register · f42fee5a
      Serge Petrenko authored
There was a bug in box_process_register: it decoded the replica's
vclock but never used it when sending the registration stream. So the
replica might lose the data in the range (replica_vclock, start_vclock).
      
      Follow-up #5566
• replication: tolerate synchro rollback during final join · 3ec0e87f
      Serge Petrenko authored
Both box_process_register and box_process_join had guards ensuring that
not a single rollback occurred for transactions residing in WAL around
the replica's _cluster registration.
Both functions would error on a rollback and make the replica retry
the final join.

The reason for that was that the replica couldn't process synchronous
transactions correctly during the final join, because it applied the
final join stream row by row.

This path with retrying the final join was a dead end, because even if
the master manages to receive no ROLLBACK messages around the N-th retry
of box.space._cluster:insert{}, the replica would still have to receive
and process all the data dating back to its first _cluster registration
attempt.
In other words, the guard against sending synchronous rows to the
replica didn't work.

Let's remove the guard altogether, since now the replica is capable of
processing synchronous txs in the final join stream and even retrying
the final join in case the _cluster registration was rolled back.
      
      Closes #5566
• applier: fix not releasing the latch on apply_synchro_row() fail · 9ad1bd15
      Serge Petrenko authored
Once apply_synchro_row() failed, applier_apply_tx() would simply raise
an error without unlocking the replica latch. This led to all the
appliers hanging indefinitely on trying to lock the latch for this
replica.
      
      In scope of #5566
• swim: check types in __serialize methods · 1d121c12
      Vladislav Shpilevoy authored
In the swim Lua code none of the __serialize methods checked the
argument type, assuming that nobody would call them directly and
mess with the types. But it happened, and it is not hard to fix, so
the patch does it.
      
      The serialization functions are sanitized for the swim object,
      swim member, and member event.
      
      Closes #5952
• swim: fix crash on bad member_by_uuid() call · fe33a108
      Vladislav Shpilevoy authored
In Lua, the swim object's method member_by_uuid() could crash if called
with no arguments. The UUID was then passed as NULL and dereferenced.

The patch makes member_by_uuid() treat NULL like a nil UUID and
return NULL (member not found). The reason is that
swim_member_by_uuid() can't fail. It can only return a member or
not. It never sets a diag error.
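A short sketch of the fixed behaviour, assuming a configured swim instance `s`:

``` Lua
-- No arguments: the UUID is treated as a nil UUID, so the lookup
-- simply finds no member instead of crashing.
s:member_by_uuid()    -- nil
s:member_by_uuid(nil) -- nil
```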
      
      Closes #5951
• lua: fix tuple leak in <key_def>.compare_with_key · db766c52
      Alexander Turenko authored
The key difference between lbox_encode_tuple_on_gc() and
luaT_tuple_encode() is that the latter never raises a Lua error, but
passes an error using the diagnostics area.

Aside from the tuple leak, the patch fixes the fiber region's memory
'leak' (till fiber_gc()). Before the patch, the memory that is used for
serialization of the key was not freed (region_truncate()) when the
serialization failed. It is verified in the gh-5388-<...> test.

While I'm here, I added a test case that just verifies correct behaviour
in case of a key serialization failure (added into key_def.test.lua).
The case does not verify whether a tuple leaks, and it is successful
before this patch as well as after it. I don't see a simple way to
check the tuple leak within a test. Verified manually using the
reproducer from the linked issue.
      
      Fixes #5388
  5. Apr 02, 2021
• vinyl: skip vylog if it's newer than snap · 149ccce9
      Nikita Pettik authored
With data in different engines, the checkpoint process is handled this way:
       - wait_checkpoint memtx
       - wait_checkpoint vinyl
       - commit_checkpoint memtx
       - commit_checkpoint vinyl
      
In contrast to commit_checkpoint, which does not tolerate failures (if
something goes wrong, e.g. renaming of the snapshot file, the instance
simply crashes), wait_checkpoint may fail. As a part of wait_checkpoint
for the vinyl engine, vy_log rotation takes place: the old vy_log is
closed and a new one is created. At this moment, wait_checkpoint of the
memtx engine has already created a new *inprogress* snapshot featuring
a bumped vclock. While recovering from this configuration, the vclock
of the latest snapshot is used as a reference.

At the initial recovery stage (vinyl_engine_begin_initial_recovery),
we check that the snapshot's vclock matches the vylog's one (they should
be the same, since normally the vylog is rotated along with the
snapshot). On the other hand, in the directory we have the old snapshot
and the new vylog (and the new .inprogress snapshot). In such a
situation, recovery (even in force mode) was aborted. The only way out
of this dead end was for the user to manually delete the last vy_log
file.

Let's proceed with the same resolution when the user runs in
force_recovery mode: delete the last vy_log file and update the vclock
value. If the user uses casual recovery, let's print a verbose message
on how to fix this situation manually.
      
      Closes #5823
• sql: ignore \0 in string passed to Lua-function · 22e2e4ea
      Mergen Imeev authored
Prior to this patch, a string passed to a user-defined Lua function
from SQL was cropped if it contained '\0'. At the same time, it wasn't
cropped if it was passed to the function from BOX. After this patch,
the string won't be cropped when passed from SQL if it contains '\0'.
      
      Closes #5938
  6. Mar 31, 2021
• gc/xlog: delay xlog cleanup until relays are subscribed · 2fd51aea
      Cyrill Gorcunov authored
      
If a replica managed to fall far behind the master node
(so that a number of xlog files are present after the
master's last snapshot), then once the master node is restarted it
may clean up the xlogs needed by the replica to subscribe
in a fast way, and instead the replica will have to rejoin,
reading a large amount of data back.

Let's try to address this by delaying xlog file cleanup
until replicas are subscribed and relays are up
and running. For this sake, we start with the cleanup fiber
spinning in a nop cycle ("paused" mode) and use a delay
counter that relays decrement once they are subscribed.

This implies that if the `_cluster` system space is not empty
upon restart and a registered replica somehow vanished
completely and won't ever come back, then the node
administrator has to drop this replica from `_cluster`
manually.

Note that this delayed cleanup start doesn't prevent
the WAL engine from removing old files if there is no
space left on the storage device. The WAL will simply
drop old data without asking.

We need to take into account that some administrators
might not need this functionality at all; for this
sake we introduce the "wal_cleanup_delay" configuration
option, which allows enabling or disabling the delay.
      
      Closes #5806
      
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
      
      @TarantoolBot document
      Title: Add wal_cleanup_delay configuration parameter
      
The `wal_cleanup_delay` option defines a delay, in seconds,
before write-ahead log files (`*.xlog`) start being pruned
upon a node restart.

This option is ignored if a node is running as
an anonymous replica (`replication_anon = true`). Similarly,
if replication is unused or there are no plans to use
replication at all, then this option should not be considered.
      
The initial problem to solve is the case where a node is operating
so fast that its replicas do not manage to reach the node's state,
and if the node is restarted at this moment (for various
reasons, for example due to a power outage), then `*.xlog` files might
be pruned during the restart. As a result, replicas will not find these
files on the main node and will have to reread all the data back, which
is a very expensive procedure.

Since replicas are tracked via the `_cluster` system space, we use
its content to count subscribed replicas, and when all of them are
up and running, the cleanup procedure is enabled automatically even
if `wal_cleanup_delay` has not expired.

The `wal_cleanup_delay` should be set to:

 - `0` to disable the cleanup delay;
 - `> 0` to wait for the specified number of seconds.

By default it is set to `14400` seconds (i.e. `4` hours).
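A minimal configuration sketch (the one-hour value is an example, not the default):

``` Lua
-- Delay *.xlog pruning for an hour after restart, or until all
-- registered replicas are subscribed, whichever happens first.
box.cfg{wal_cleanup_delay = 3600}

-- Disable the delay entirely.
box.cfg{wal_cleanup_delay = 0}
```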
      
If a registered replica is lost forever and the timeout is set to
infinity, then the preferred way to enable the cleanup procedure is not
to set a small timeout value but rather to delete this replica from the
`_cluster` space manually.

Note that the option does *not* prevent the WAL engine from removing
old `*.xlog` files if there is no space left on a storage device;
the WAL engine can remove them in a forced way.

The current state of the `*.xlog` garbage collector can be found in
the `box.info.gc()` output. For example:
      
      ``` Lua
       tarantool> box.info.gc()
       ---
         ...
         is_paused: false
      ```
      
The `is_paused` field shows whether the cleanup fiber is paused or not.
  7. Mar 29, 2021
  8. Mar 24, 2021
• buffer: remove Lua registers · 911ca60e
      Vladislav Shpilevoy authored
      Lua buffer module used to have a couple of preallocated objects of
      type 'union c_register'. It was a bunch of C scalar and array
      types intended for use instead of ffi.new() where it was needed to
      allocate a temporary object like 'int[1]' just to be able to pass
      'int *' into a C function via FFI.
      
It was a bit faster than ffi.new() even for small sizes. For
instance (when JIT works), getting a register to use it as
'int[1]' costs around 0.2-0.3 ns while ffi.new('int[1]') costs
around 0.4 ns. Also the code looked cleaner.

But Lua registers were global and therefore had the same issue as
IBUF_SHARED and static_alloc() in Lua: no ownership, and sudden
reuse when GC starts right while the register is still in use in some
Lua code. __gc handlers could wipe the register values, making the
original code behave unpredictably.

IBUF_SHARED was fixed by a proper ownership implementation, but that
is not necessary for Lua registers. It could be done with the
buffer.ffi_stash_new() feature, but its performance is about 0.8
ns, which is worse than plain ffi.new() for simple scalar types.
      
      This patch eliminates Lua registers, and uses ffi.new() instead
      everywhere.
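For illustration, a minimal sketch of the pattern that replaces the registers (the C function name is hypothetical):

``` Lua
local ffi = require('ffi')
-- Allocate a temporary 'int[1]' to pass an 'int *' out-parameter
-- into a C function via FFI, instead of using a shared register.
local len = ffi.new('int[1]')
-- some_c_func(buf, len); the result is then read from len[0].
```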
      
      Closes #5632
  9. Mar 22, 2021
  10. Mar 19, 2021
• lua: separate sched and script diag · f4e248c0
      Vladislav Shpilevoy authored
When the Lua main script was launched, the sched fiber passed its own
diag to the script's fiber. When the script finished, it put
its error into that diag. The sched fiber then checked if the diag
was empty to detect an error.
      
      But it wasn't really correct. The error could also happen right in
      the scheduler fiber in a libev callback. For example, in one of
      ev_io callbacks in SWIM. Then the process would end with an error
      even if the script was finished successfully.
      
      These errors were not related to the main fiber executing the
      script.
      
The patch makes it so the scheduler fiber's diag is no longer used as
an indication of an error in the script. Instead, a new diag is
created on the stack of the scheduler's fiber, where the Lua
script saves its error.
      
      Closes #5864
• wal: introduce limits on simultaneous writes · de93b448
      Serge Petrenko authored
Since the introduction of asynchronous commit, which doesn't wait for a
WAL write to succeed, it's quite easy to clog WAL with huge amounts of
write requests. For now, it's only possible from an applier, since it's
the only user of async commit at the moment.

This happens when a replica is syncing with the master and reads new
transactions at a pace higher than it can write them to WAL (see the
docbot request for a detailed explanation).

To ameliorate such behavior, we need to introduce some limit on
not-yet-finished WAL write requests. This is what this commit is trying
to do.
A new counter is added to the wal writer: queue_size (in bytes), together
with a corresponding configuration setting: `wal_queue_max_size`.
The counter is increased on every new submitted request and decreased
once the tx thread receives a confirmation that a specific request was
written.

Actually, the limit is added to an abstract journal queue, but it
currently works only for the wal writer, since it's the only possible
journal when the applier is working.

Once the size reaches its maximum value, the applier is blocked until
some of the write requests are finished.

The size limit isn't strict, i.e. if there's at least one free byte, the
whole write request fits and no blocking is involved.

The feature is ready for `box.commit{is_async=true}`. Once that is
implemented, it should check whether the queue is full and let the user
decide what to do next: either wait or roll the tx back.
      
      Closes #5536
      
      @TarantoolBot document
      Title: new configuration option: 'wal_queue_max_size'
      
`wal_queue_max_size` puts a limit on the amount of concurrent write
requests submitted to WAL.
It is measured in the number of bytes to be written (0 means unlimited,
which was the default behaviour before).
The option only affects replica behaviour at the moment and defaults
to 16 megabytes. It limits the pace at which a replica reads new
transactions from the master.
      
      Here's when the option comes in handy:
      
Before this option was introduced, the following situation was possible:
there are 2 servers, a master and a replica, and the replica is down for
some period of time. While the replica is down, the master serves
requests at a reasonable pace, possibly close to its WAL throughput
limit. Once the replica reconnects, it has to receive all the data the
master has piled up, and there's no limit on the speed at which the
master sends the data to the replica; and, without the option, there was
no limit on the speed at which the replica submitted the corresponding
write requests to WAL.

This led to a situation when the replica's WAL was never in time to
serve the requests, and the amount of pending requests was constantly
growing. There was no limit on the memory WAL write requests take, and
this clogging of the WAL write queue could even lead to the replica
using up all the available memory.

Now, when `wal_queue_max_size` is set, appliers will stop reading new
transactions once the limit is reached. This will let WAL process all the
requests that have piled up and free all the excess memory.
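A minimal configuration sketch (16 megabytes is the default mentioned above):

``` Lua
-- Cap pending WAL write requests at 16 MB; 0 removes the limit.
box.cfg{wal_queue_max_size = 16 * 1024 * 1024}
```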
• Implement on_shutdown API · 3010f024
      mechanik20051988 authored
Implemented the on_shutdown API, which allows registering functions
that will be called when tarantool stops. The functions will
be called in the reverse order of their registration, so a module
developer registers one function that starts the module's termination
and waits for its completion. This function should be fast, or use an
asynchronous waiting mechanism (coio_wait or cord_cojoin, for example).
      
      Closes #5723
      
      @TarantoolBot document
      Title: Implement on_shutdown API
Implemented the on_shutdown API, which allows registering functions
that will be called when tarantool stops. The functions will
be called in the reverse order of their registration, so a module
developer registers one function that starts the module's termination
and waits for its completion. This function should be fast, or use an
asynchronous waiting mechanism (coio_wait or cord_cojoin, for example).
See the sketch below.
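A minimal sketch, assuming `box.ctl.on_shutdown` as the Lua-level registration entry point:

``` Lua
-- Register a trigger that runs when tarantool stops; triggers
-- registered later run earlier (reverse order of registration).
box.ctl.on_shutdown(function()
    -- Start module termination and wait for its completion here.
    print('module shutdown')
end)
```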
• lua: change on_shutdown triggers behaviour · 357f1551
      mechanik20051988 authored
Previously, Lua on_shutdown triggers were started sequentially; now
each trigger starts in a separate fiber. Tarantool waits 3.0
seconds for their completion by default. The user has the option to
change this value using the newly implemented
box.ctl.set_on_shutdown_timeout function.
If the timeout expires, tarantool stops immediately without waiting
for the remaining triggers to complete.

Also, ev_break is moved from the trigger to the on_shutdown_f function,
after calling all on_shutdown Lua triggers, because now all triggers are
started asynchronously in fibers, and we should call ev_break only
after all triggers are finished.
      
      Part of #5723
      
      @TarantoolBot document
      Title: Changed Lua on_shutdown triggers behaviour.
Previously, Lua on_shutdown triggers were started sequentially; now
each trigger starts in a separate fiber. Tarantool waits 3.0
seconds for their completion by default. The user has the option to
change this value using the newly implemented
box.ctl.set_on_shutdown_timeout function, as shown below.
If the timeout expires, tarantool stops immediately without waiting
for the remaining triggers to complete.
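A minimal sketch (the 10-second value is an example):

``` Lua
-- Give shutdown triggers up to 10 seconds instead of the default
-- 3.0 before tarantool stops without waiting for them.
box.ctl.set_on_shutdown_timeout(10)
```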
• box: rename granularity option in box.cfg{} to slab_alloc_granularity · 501da2bf
      mechanik20051988 authored
Renamed the granularity option to slab_alloc_granularity, in line with
the names of the other options for the small allocator.
      
      Follow-up #5518
  11. Mar 15, 2021
  12. Mar 12, 2021
• lua: fix tarantool -e always enters interactive mode · 0787483c
      Artem Starshov authored
The reason why tarantool -e always enters interactive mode is that
the statement after the -e option isn't considered a script.

In the PUC-Rio Lua man page there are different names for an -e
statement (stat) and a script, but they behave the same regarding
interactive mode. (Cases when the interpreter loads stdin also behave
the same.)

NOTE: the test for this fix uses error injections, which should work only
in debug mode, so `release_disabled` was added to suite.ini. But there is
a bug in test-run: `release_disabled` disables tests in every build type.
This problem is partially described in tarantool/test-run#199.
      
      Fixes #5040
  13. Mar 11, 2021
• memtx: add granularity option to box.cfg{} · 53c0e910
      mechanik20051988 authored
Granularity is an option that allows the user to set the
multiplicity of memory allocation in the small allocator.
Granularity must be a power of two and >= 4. By default the
granularity value == sizeof(intptr_t), as it was before,
when this option was not provided.
      
      @TarantoolBot document
      Title: Add 'granularity' option to box.cfg{}
Add the granularity option, which allows the user to set the
multiplicity of memory allocation in the small allocator. Granularity
determines not only the alignment of objects, but also the size of the
objects in the pool. Thus, the greater the granularity, the greater the
memory loss per memory allocation, but tuples with different
sizes are allocated from the same mempool, and we do not lose
memory on the slabs when tuple sizes are highly distributed.
This is somewhat similar to a large alloc factor. The smaller the
granularity, the less memory is lost per allocation; if the user has
many small tuples of approximately the same size, it will be nice
to set granularity == 4 to save memory (see the sketch below).

This option must be set once during start; the default value
== sizeof(intptr_t) (8 on 64-bit platforms), as it was before, when
this option was not provided. Granularity must be a power of two
and >= 4. Together with slab_alloc_factor, this option gives you
full control over the behavior of the small allocator.
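A minimal sketch (note that the commit above renames this option to `slab_alloc_granularity`):

``` Lua
-- Allocate small objects with 4-byte granularity to save memory
-- when there are many small tuples of roughly the same size.
box.cfg{granularity = 4}
```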
      
      Closes #5518
  14. Mar 04, 2021
  15. Mar 02, 2021
  16. Feb 28, 2021
• build: adjust LuaJIT build system · 07c83aab
      Igor Munkin authored
      LuaJIT submodule is bumped to introduce the following changes:
      * test: run luacheck static analysis via CMake
      * test: fix warnings found with luacheck in misclib*
      * test: run LuaJIT tests via CMake
      * build: replace GNU Make with CMake
      * build: preserve the original build system
      
Since the LuaJIT build system is ported to CMake in the scope of the
changesets mentioned above, the module building the LuaJIT bundled in
Tarantool is completely reworked. There is no option to build Tarantool
against another prebuilt LuaJIT due to a91962c0
('Until Bug#962848 is fixed, don't try to compile with external
LuaJIT'), so all redundant options defining the libluajit to be used in
Tarantool are dropped, along with the related auxiliary files.
      
      To run LuaJIT related tests or static analysis for Lua files within
      LuaJIT repository, <LuaJIT-test> and <LuaJIT-luacheck> targets are used
      respectively as a dependency of the corresponding Tarantool targets.
      
      As an additional dependency to run LuaJIT tests, prove[1] utility is
      required, so the necessary binary packages are added to the lists with
      build requirements.
      
[1]: https://metacpan.org/pod/TAP::Harness#prove

Closes #4862
      Closes #5470
      Closes #5631
      
Reviewed-by: Sergey Kaplun <skaplun@tarantool.org>
Reviewed-by: Timur Safin <tsafin@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
  17. Feb 24, 2021
  18. Feb 15, 2021
  19. Feb 12, 2021
• memtx: fix test for gh5304 issue and memtx_space_is_recovering function · 8ac47898
      mechanik20051988 authored
In the previous version of the patch we compared the memtx state with
MEMTX_FINAL_RECOVERY to check that memtx recovery had completed.
This is not quite right: memtx_state == MEMTX_FINAL_RECOVERY
means that recovery from the snapshot is finished, but recovery
from the WALs is not. We need to compare memtx_state with MEMTX_OK
to check that recovery has totally finished.

In the previous test version the on_replace trigger (created on the
_user space) was never called. That's because is_recovery_finished()
always returned false: on_schema_init is invoked BEFORE the
user's data recovery process (so the trigger is not created at all
at that moment).

In the new test version you can see a correct user case:
we create an on_replace trigger on the _index system space,
which replaces/inserts/updates tuples in the temp and loc spaces.
So each time the user creates a new space and an index for it,
the trigger replaces/inserts/updates tuples in the temp and loc spaces.
Because the trigger replaces/inserts/updates a tuple with the same
primary key, we get an error when the insert trigger is called.
      
      Follow-up #5304
• Add changelog for #5764 (de6c76b6) · df99d492
      Nikita Pettik authored