Skip to content
Snippets Groups Projects
  1. Oct 31, 2022
    • Nikolay Shirokovskiy's avatar
      app: fix crash on logrotate and immediate exit · 99de5b74
      Nikolay Shirokovskiy authored
      `say_logrotate` does not rotate logs synchronously. It posts tasks to
      coio which executes them in it's pool thread. On application exit we
      destroy logs calling `log_destroy`. This function waits for rotate task
      to finish because if it will free resources immediately then unfinished
      rotate task can have use-after-free issues. Waiting crashes because at
      this moment event loop is not running which is required for
      `fiber_cond_wait` to work.
      
      Note that if there is no crash then we will hang forever waiting in
      `log_destroy` because event loop is not running and
      `log_rotate_async_cb` will never be executed.
      
      Let's use mutexes and conditions instead. It solves both problems with
      crash and hang. Thanks Vladimir Davydov for idea.
      
      Fixes #4450
      
      NO_DOC=bugfix
      
      (cherry picked from commit eed09192)
      99de5b74
  2. Oct 28, 2022
    • Vladimir Davydov's avatar
      txn: account started, committed, and rolled back transactions · a5fe5c32
      Vladimir Davydov authored
      This commit fixes BEGIN, COMMIT, and ROLLBACK counters in the box.stat()
      output. Before this commit, they always showed 0. Now, they report
      the number of started, committed, and rolled back transactions,
      respectively.
      
      Closes #7583
      
      NO_DOC=bug fix
      
      (cherry picked from commit 70969d1d)
      a5fe5c32
    • Ilya Verbin's avatar
      box: panic if snapshot has no system spaces during recovery · 502bb096
      Ilya Verbin authored
      Currently, if a snapshot contains some correct entries, but doesn't
      include system spaces, Tarantool crashes with segmentation fault, or
      for Debug build: void diag_raise(): Assertion `e != NULL' failed.
      This happens because memtx_engine_recover_snapshot returns -1, while
      diag is not set. Let's panic instead of a crash.
      
      Closes #7800
      
      NO_DOC=bugfix
      
      (cherry picked from commit e0a9aed4)
      502bb096
  3. Oct 26, 2022
    • Vladimir Davydov's avatar
      msgpack: fix crash on decode of 0xc1 · e48a8f4b
      Vladimir Davydov authored
      0xc1 isn't a valid MsgPack header, but it was allowed by mp_check.
      As a result, msgpack.decode crashed while trying to decode it.
      This commit updates the msgpuck library to fix this issue.
      
      Closes #7818
      
      NO_DOC=bug fix
      
      (cherry picked from commit ced405af)
      e48a8f4b
  4. Oct 25, 2022
    • Serge Petrenko's avatar
      security: check size boundaries for getenv() returns · 829b65f8
      Serge Petrenko authored
      getenv() return values cannot be trusted, because an attacker might set
      them. For instance, we shouldn't expect, that getenv() returns a value
      of some sane size.
      
      Another problem is that getenv() returns a pointer to one of
      `char **environ` members, which might change upon next setenv().
      
      Introduce a wrapper, getenv_safe(), which returns the value only when
      it fits in a buffer of a specified size, and copies the value onto the
      buffer. Use this wrapper everywhere in our code.
      
      Below's a slightly decorated output of `grep -rwn getenv ./src --include
      *.c --include *.h --include *.cc --include *.cpp --include *.hpp
      --exclude *.lua.c` as of 2022-10-14.
      `-` marks invalid occurences (comments, for example),
      `*` marks the places that are already guarded before this patch,
      `X` mars the places guarded in this patch, and
      `^` marks places fixed in the next commit:
      
      NO_WRAP
      ```
      * ./src/lib/core/coio_file.c:509:	const char *tmpdir = getenv("TMPDIR");
      X ./src/lib/core/errinj.c:75: const char *env_value = getenv(inj->name);
      - ./src/proc_title.c:202: * that might try to hang onto a getenv() result.)
      - ./src/proc_title.c:241:	* is mandatory to flush internal libc caches on getenv/setenv
      X ./src/systemd.c:54: sd_unix_path = getenv("NOTIFY_SOCKET");
      * ./src/box/module_cache.c:300: const char *tmpdir = getenv("TMPDIR");
      X ./src/box/sql/os_unix.c:1441: azDirs[0] = getenv("SQL_TMPDIR");
      X ./src/box/sql/os_unix.c:1446: azDirs[1] = getenv("TMPDIR");
      * ./src/box/lua/console.c:394: const char *envvar = getenv("TT_CONSOLE_HIDE_SHOW_PROMPT");
      ^ ./src/box/lua/console.lua:771: local home_dir = os.getenv('HOME')
      ^ ./src/box/lua/load_cfg.lua:1007: local raw_value = os.getenv(env_var_name)
      X ./src/lua/init.c:575: const char *path = getenv(envname);
      X ./src/lua/init.c:592: const char *home = getenv("HOME");
      * ./src/find_path.c:77: snprintf(buf, sizeof(buf) - 1, "%s", getenv("_"));
      ```
      NO_WRAP
      
      Part-of #7797
      
      NO_DOC=security
      
      (cherry picked from commit b86395ff)
      829b65f8
    • Mergen Imeev's avatar
      sql: fix another cursor invalidation · 1b357d93
      Mergen Imeev authored
      This patch fixes the issue described in issue #5310 when the tuple
      format has more fields than the space format. This solution is more
      general than the solution in 89057a21.
      
      Follow-up #5310
      Closes #4666
      
      NO_DOC=bugfix
      
      (cherry picked from commit 5a38c5c9)
      1b357d93
  5. Oct 20, 2022
    • Andrey Saranchin's avatar
      box: unify errors about mismatch of password and login during auth · e8705a65
      Andrey Saranchin authored
      If we raise different errors in case of entering an invalid password and
      entering the login of a non-existent user during authorization, it will
      open the door for an unauthorized person to enumerate users.
      So let's unify raised errors in the cases described above.
      
      Closes #tarantool/security#16
      
      NO_DOC=security fix
      
      (cherry picked from commit 5c62f01b)
      e8705a65
  6. Oct 19, 2022
    • Mergen Imeev's avatar
      box: fix format of _vfunc · ee06f892
      Mergen Imeev authored
      The _vfunc system space is the sysview for the _func system space.
      However, the _vfunc format is different from the _func format. This
      patch makes the _vfunc format the same as the _func format.
      
      Closes #7822
      
      NO_DOC=bugfix
      
      (cherry picked from commit 707da125)
      ee06f892
  7. Oct 18, 2022
    • Timur Safin's avatar
      datetime: datetimes subtractions ignored timezone · 8aa13474
      Timur Safin authored
      We used to ignore timezone difference (in `tzoffset`) for
      datetime subtraction operation:
      
      ```
      tarantool> datetime.new{tz='MSK'} - datetime.new{tz='UTC'}
      ---
      - +0 seconds
      ...
      
      tarantool> datetime.new{tz='MSK'}.timestamp -
                 datetime.new{tz='UTC'}.timestamp
      ---
      - -10800
      ...
      ```
      
      Now we accumulate tzoffset difference in the minute component
      of a resultant interval:
      
      ```
      tarantool> datetime.new{tz='MSK'} - datetime.new{tz='UTC'}
      ---
      - -180 minutes
      ...
      ```
      
      Closes #7698
      
      NO_DOC=bugfix
      
      (cherry picked from commit 0daed8d5)
      Unverified
      8aa13474
    • Timur Safin's avatar
      datetime: fix interval arithmetic for DST · 9f8b05ab
      Timur Safin authored
      We did not take into consideration the fact that
      as result of date/time arithmetic we could get
      in a different timezone, if DST boundary has been
      crossed during operation.
      
      ```
      tarantool> datetime.new{year=2008, month=1, day=1,
      			tz='Europe/Moscow'} +
      	   datetime.interval.new{month=6}
      ---
      - 2008-07-01T01:00:00 Europe/Moscow
      ...
      ```
      
      Now we resolve tzoffset at the end of operation if
      tzindex is not 0.
      
      Fixes #7700
      
      NO_DOC=bugfix
      
      (cherry picked from commit 6ca07285)
      Unverified
      9f8b05ab
  8. Oct 14, 2022
  9. Oct 13, 2022
  10. Oct 12, 2022
  11. Oct 11, 2022
    • Mergen Imeev's avatar
      sql: change rules used to determine NULLIF() type · 5c6afe47
      Mergen Imeev authored
      This patch introduces new rules to determine type of NULLIF() built-in
      function.
      
      Closes #6990
      
      @TarantoolBot document
      Title: New rules to determine type of result of NULLIF
      
      The type of the result of NULLIF() function now matches the type of the
      first argument.
      
      (cherry picked from commit 805cbaa7)
      5c6afe47
    • Mergen Imeev's avatar
      sql: change rules used to determine CASE type · 5585825a
      Mergen Imeev authored
      This patch introduces new rules to determine type of CASE operation.
      
      Part of #6990
      
      @TarantoolBot document
      Title: New rules to determine type of result of CASE
      
      New rules are applied to determine the type of the CASE operation. If
      all values are NULL with no type, or if a bind variable exists among
      the possible results, then the type of CASE is ANY. Otherwise, all NULL
      values with no type are ignored, and the type of CASE is determined
      using the following rules:
      1) if all values of the same type, then type of CASE is this type;
      2) otherwise, if any of the possible results is of one of the
      incomparable types, then the type of CASE is ANY;
      3) otherwise, if any of the possible results is of one of the
      non-numeric types, then the type of CASE is SCALAR;
      4) otherwise, if any of the possible results is of type NUMBER, then the
      type of CASE is NUMBER;
      5) otherwise, if any of the possible results is of type DECIMAL, then
      the type of CASE is DECIMAL;
      6) otherwise, if any of the possible results is of type DOUBLE, then the
      type of CASE is DOUBLE;
      7) otherwise the type of CASE is INTEGER.
      
      (cherry picked from commit 90f64460)
      5585825a
  12. Oct 07, 2022
    • Mergen Imeev's avatar
      sql: fix cursor invalidation · 2579dfed
      Mergen Imeev authored
      If the length of the tuple is greater than the number of fields in the
      format, it is possible that the cursor in the VDBE will be overridden
      with zeros.
      
      Closes #5310
      
      NO_DOC=bugfix
      
      (cherry picked from commit 89057a21)
      2579dfed
  13. Oct 06, 2022
  14. Oct 05, 2022
  15. Sep 30, 2022
    • Nikolay Shirokovskiy's avatar
      test: don't call box.cfg() from test instance in gh-7288 · 13a839e8
      Nikolay Shirokovskiy authored
      Such style is not recommented and going to be banned.
      
      Closes #7715
      
      NO_TEST=test refactoring
      NO_DOC=test refactoring
      NO_CHANGELOG= test refactoring
      
      (cherry picked from commit ce784430)
      13a839e8
    • Georgiy Lebedev's avatar
      test: replace Lua `assert`s with luatest `assert`s in luatest tests · c7a65c94
      Georgiy Lebedev authored
      Some luatest framework tests use Lua `assert`s, which are incomprehensible
      when failed (the only information provided is 'assertion failed!'),
      making debugging difficult: replace them with luatest `assert`s and their
      context-specific varieties.
      
      NO_CHANGELOG=<code health>
      NO_DOC=<code health>
      
      (cherry picked from commit 1d9645f7)
      c7a65c94
    • Vladimir Davydov's avatar
      box: fix usage error messages if FFI is disabled for index read ops · 5ebfc581
      Vladimir Davydov authored
      Commit 7f2c7609 ("box: add option to disable Lua FFI for read
      operations") added an internal configuration option that switches all
      index read operations from FFI to Lua C, which was necessary to
      implement space upgrade in the EE repository. Turns out there was a
      minor bug in that commit - when FFI is disabled, method usage errors
      (raised when object.method is called instead of object:method) stop
      working. Fix this bug and add an extensive test that checks that Lua C
      and FFI implementations of index read operations are equivalent.
      
      Fixes https://github.com/tarantool/tarantool-ee/issues/254
      
      NO_DOC=bug fix
      NO_CHANGELOG=bug manifests itself in only in EE
      
      (cherry picked from commit e8d9a7cb)
      5ebfc581
    • Georgiy Lebedev's avatar
      memtx: fix loss of committed tuple in secondary index · 2ec3dfbb
      Georgiy Lebedev authored
      Concurrent transactions can try to insert tuples that intersect only by
      parts of secondary index: in this case when one of them gets prepared, the
      others get conflicted, but the committed story does not get retained
      (because the conflicting statements are not added to the committed story's
      delete statement list as opposed to primary index) and is lost after
      garbage collection: retain stories if there is a newer uncommitted story
      in the secondary indexes' history chain.
      
      Closes #7712
      
      NO_DOC=bugfix
      
      (cherry picked from commit 7b0baa57)
      2ec3dfbb
  16. Sep 29, 2022
    • Serge Petrenko's avatar
      gc: replace vclockset_psearch with _match in wal_collect_garbage_f · d6fc95f6
      Serge Petrenko authored
      When using vclockset_psearch, the resulting vclock may be incomparable
      to the search key. For example, with a vclock set { } (empty vclock),
      {0: 1, 1: 10}, {0: 2, 1:11} vclockset_psearch(set, {0:2, 1: 9}) might
      return {0: 1, 1: 10}, and not { }.
      This is known and avoided in other places, for example
      recover_remaining_wals(), where vclockset_match() is used instead.
      vclockset_match() starts with the same result as vclockset_psearch() and
      then unwinds the result until the first vclock which is less or equal to
      the search key is found.
      
      Having vclockset_psearch in wal_collect_garbage_f could lead to issues
      even before local space changes became written to 0-th vclock component.
      Once replica subscribes, its' gc consumer is set to the vclock, which
      the replica sent in subscribe request. This vclock might be incomparable
      with xlog vclocks of the master, leading to the same issue of
      potentially deleting a needed xlog during gc.
      
      Closes #7584
      
      NO_DOC=bugfix
      
      (cherry picked from commit c63bfb9a)
      d6fc95f6
  17. Sep 28, 2022
    • Georgiy Lebedev's avatar
      memtx: fix transaction manager MVCC invariant violation · 1fac9eef
      Georgiy Lebedev authored
      We hold the following invariant in MVCC: the story at the top of the
      history chain is present in index.
      
      If a story is subject to be deleted from index and there is an older story
      in the history chain, the older story starts to be at the top of the
      history chain and is not present in index, which violates our invariant:
      explicitly check for this case when evaluating whether a story can be
      garbage collected and add an assertion to check the invariant above is not
      violated.
      
      Rollbacked stories need to be handled in a special way: they are
      present at the end of some history chains and completely unlinked from
      others (which also implies they are not present in the corresponding
      indexes).
      
      `memtx_tx_story_full_unlink` is called in two contexts: space deletion, in
      which we delete all stories, and garbage collection step — the former case
      can break the invariant described above, while the latter must preserve it,
      hence add two different functions for the corresponding contexts.
      
      Closes #7490
      
      NO_CHANGELOG=<internal bugfix not user observable>
      NO_DOC=<bugfix>
      
      (cherry picked from commit c8eccfbb)
      1fac9eef
    • Georgiy Lebedev's avatar
      memtx: rework transaction rollback · 61be2c8f
      Georgiy Lebedev authored
      When we rollback a transaction statement, we relink its read trackers
      to a newer story in the history chain, if present (6c990a7b), but we do not
      handle the case when there is no newer story.
      
      If there is an older story in the history chain, we can relink the
      rollbacked story's reader to it, but if the rollbacked story is the
      only one left, we need to retain it, because it stores the reader list
      needed for conflict resolution — such stories are distinguished by the
      rollbacked flag, and there can be no more than one such story located
      strictly at the end of a given history chain (which means a story can be
      fully unlinked from some indexes and present at the end of others).
      
      There are several nuances we need to account for:
      
      Firstly, such rollbacked stories must be impossible to read from an index:
      this is ensured by `memtx_tx_story_is_visible`.
      
      Secondly, rollbacked transactions need to be treated as prepared with
      stories that have `add_psn == del_psn`, so that they are correctly deleted
      during garbage collection.
      
      After this logical change we have the following partially ordered set over
      tuple stories:
      ———————————————————————————————————————————————————————> serialization time
      |- - - - - - - -|— — — — — -|— — — — — |— — — — — — -|— — — — — — — -
      | No more than  | Committed | Prepared | In-progress | One dirty
      | one rollbacked|           |          |             | story in index
      | story         |           |          |             |
      |- - - - - - - -|— — — — — -| — — — — —|— — — — — — -|— — — — — — — —
      
      Closes #7343
      
      NO_DOC=bugfix
      
      (cherry picked from commit 56cf737c)
      61be2c8f
    • Georgiy Lebedev's avatar
      memtx: remove redundant `space` field from `struct memtx_story` · 8ee6a2f2
      Georgiy Lebedev authored
      `struct memtx_story` has a `space` field, which is basically used
      to identify that a tuple is unlinked from the history chain in
      `memtx_tx_index_invisible_count_slow` (though this can be determined by its
      presence in the index) and is used to get the space's index in
      `memtx_tx_story_link_top` (though it can be  retrieved from the older
      story's link field): remove this redundant field.
      
      Needed for #7343
      
      NO_CHANGELOG=<refactoring>
      NO_DOC=<refactoring>
      NO_TEST=<refactoring>
      
      (cherry picked from commit 55e64a8d)
      8ee6a2f2
  18. Sep 26, 2022
    • Vladislav Shpilevoy's avatar
      xrow: fix crash on nested map/array update ops · 3def2916
      Vladislav Shpilevoy authored
      If an update operation tried to insert a new key into a map or an
      array which was created by a previous update operation, then the
      process would fail an assertion.
      
      That was because the first operation was stored as a bar update.
      The second operation tried to branch it assuming that the entire
      bar update's JSON path must exist, but it wasn't so for the newly
      created part of the path.
      
      The solution is to fallback to branching earlier than the entire
      bar path ends, if can see that the next part of the path can't be
      found.
      
      Closes #7705
      
      NO_DOC=bugfix
      
      (cherry picked from commit 8425ebfc)
      3def2916
  19. Sep 23, 2022
    • Georgiy Lebedev's avatar
      memtx: track `index:random` reads and clarify result · 2af84a85
      Georgiy Lebedev authored
      TREE (HASH) index implements `random` method: if the space is empty from
      the transaction's perspective, which means we have to return nothing, add
      gap tracking of whole range (full scan
      tracking), since this result is equivalent to `index:select{}`, otherwise
      repeatedly call `random` and clarify result, until we get a non-empty one.
      We do not care about performance here, since all operations in context of
      transaction management currently have O(number of dirty tuples)
      complexity.
      
      Closes #7670
      
      NO_DOC=bugfix
      
      (cherry picked from commit 1b82beb2)
      2af84a85
    • Georgiy Lebedev's avatar
      memtx: fix TREE index `get` check for part count · b9d62fca
      Georgiy Lebedev authored
      If TREE index `get` result is empty, the key part count is incorrectly
      compared to the tree's `cmp_def->part_count`, though it should be compared
      with `cmp_def->unique_part_count`. But we can actually assume that by the
      time we get to the index's `get` method the part count is equal to the
      unique part count (partial keys are rejected and `get` is not
      supported for non-unique indexes): change check to correct assertion.
      
      Closes #7685
      
      NO_DOC=<bugfix>
      
      (cherry picked from commit bfcd8ca7)
      b9d62fca
  20. Sep 21, 2022
    • Boris Stepanenko's avatar
      limbo: fix assertions in box_issue_de/promote · 65b3bad6
      Boris Stepanenko authored
      Replaced assertions, that no one started new elections/promoted while
      acquiring limbo, with checks that raft term and limbo term didn't
      change. In case they did - don't write DEMOTE/PROMOTE and just release
      limbo, because it's already owned/will soon be by someone else.
      
      Closes #7086
      
      NO_DOC=Bugfix
      
      (cherry picked from commit 8ee0e434)
      65b3bad6
  21. Sep 16, 2022
  22. Sep 15, 2022
    • Georgiy Lebedev's avatar
      memtx: fix 'use after free' of garbage collected MVCC stories · 0daf8382
      Georgiy Lebedev authored
      `directly_replaced` stories can potentially get garbage collected in
      `memtx_tx_handle_gap_write`, which is unexpected and leads to 'use after
      free': in order to fix this, limit garbage collection points only to
      external API calls.
      
      Wrap all possible garbage collection points with explicit warnings (see
      c9981a56).
      
      Closes #7449
      
      NO_DOC=bugfix
      
      (cherry picked from commit 18e042f5)
      0daf8382
  23. Sep 14, 2022
    • Alexander Turenko's avatar
      lua/merger: fix use-after-free during iteration · f9aecfb8
      Alexander Turenko authored
      All merge sources (including the merger itself) share the same
      `<merge source>:pairs()` implementation, which returns `gen, param,
      state` triplet. `gen` is `lbox_merge_source_gen()`, `param` is `nil`,
      `state` in the merge source.
      
      The `lbox_merge_source_gen()` returns `source, tuple`. The returned
      source is supposed to be the same object as a one passed to the function
      (`gen(param, state)`), so the function assumes the object as alive and
      don't increment source's refcounter at entering, don't decrease it at
      exitting.
      
      This logic is perfect, but there was a mistake in the implementation:
      the function returns a new cdata object (which holds the same pointer to
      the merge source structure) instead of the same cdata object.
      
      The new cdata object neither increases the source's refcounter at
      pushing to Lua, nor decreases it at collecting. At result, if we'll loss
      the original merge source object (and the first `state` that is returned
      from `:pairs()`), the source structure may be freed. The pointer in the
      new cdata object will be invalid so.
      
      A sketchy code that illustrates the problem:
      
      ```lua
      gen, param, state0 = source:pairs()
      assert(state0 == source)
      source = nil
      state1, tuple = gen(param, state0)
      state0 = nil
      -- assert(state1 == source) -- would fails
      collectgarbage()
      -- The cdata object that is referenced as `source` and as `state`
      -- is collected. The GC handler is called and dropped the merge
      -- source structure refcounter to zero. The structure is freed.
      -- The call below will crash.
      gen(param, state1)
      ```
      
      In the fixed code `state1 == source`, so the GC handler is not called
      prematurely: we have the merge source object alive till the end of the
      iterator or till the stop of the traversal.
      
      Fixes #7657
      
      NO_DOC=a crash is definitely not what we want to document
      
      (cherry picked from commit 3bc64229)
      Unverified
      f9aecfb8
  24. Sep 13, 2022
    • Yaroslav Lobankov's avatar
      test: slight refactoring of replication-py tests · 852770c2
      Yaroslav Lobankov authored
      - Remove unused imports
      - Remove unnecessary creation of 'replica' instance objects
      - Use `<instance>.iproto.uri` object attribute instead of calling
        `box.cfg.listen` via admin connection
      
      NO_DOC=testing stuff
      NO_TEST=testing stuff
      NO_CHANGELOG=testing stuff
      
      (cherry picked from commit d13b06bd)
      852770c2
    • Yaroslav Lobankov's avatar
      test: bump test-run to new version · 96dfd98b
      Yaroslav Lobankov authored
      Bump test-run to new version with the following improvements:
      
      - Report job summary on GitHub Actions [1]
      - Free port auto resolving for TarantoolServer and AppServer [2]
      
      Also, this patch includes the following changes:
      
      - removing `use_unix_sockets` option from all suite.ini config files
        due to permanent using Unix sockets for admin connection recently
        introduced in test-run
      - switching replication-py tests to Unix sockets for iproto connection
      - fixing replication-py/swap.test.py and swim/swim.test.lua tests
      
      [1] tarantool/test-run#341
      [2] tarantool/test-run#348
      
      NO_DOC=testing stuff
      NO_TEST=testing stuff
      NO_CHANGELOG=testing stuff
      
      (cherry picked from commit 4335b442)
      96dfd98b
    • Georgiy Lebedev's avatar
      memtx: track read story when conflicting full scans due to gap write · 23b7d3cb
      Georgiy Lebedev authored
      When conflicting transactions that made full scans in
      `memtx_tx_handle_gap_write`, we need to also track that the conflicted
      transaction has read the inserted tuple, just like we do in gap tracking
      for ordered indexes — otherwise another transaction can overwrite the
      inserted tuple in which case no gap tracking will be handled.
      
      Closes #7493
      
      NO_DOC=bugfix
      
      (cherry picked from commit 7f52f445)
      23b7d3cb
  25. Sep 12, 2022
    • Vladimir Davydov's avatar
      Use MT-Safe strerror_r instead of strerror · 03ceaafc
      Vladimir Davydov authored
      strerror() is MT-Unsafe, because it uses a static buffer under the hood.
      We should use strerror_r() instead, which takes a user-provided buffer.
      The problem is there are two implementations of strerror_r(): XSI and
      GNU. The first one returns an error code and always writes the message
      to the beginning of the buffer while the second one returns a pointer to
      a location within the buffer where the message starts. Let's introduce a
      macro HAVE_STRERROR_R_GNU set if the GNU version is available and define
      tt_strerror() which writes the message to the static buffer, like
      tt_cstr() or tt_sprintf().
      
      Note, we have to export tt_strerror(), because it is used by Lua via
      FFI. We also need to make it available in the module API header, because
      the say_syserror() macro uses strerror() directly. In order to avoid
      adding tt_strerror() to the module API, we introduce an internal helper
      function _say_strerror(), which calls tt_strerror().
      
      NO_DOC=bug fix
      NO_TEST=code is covered by existing tests
      
      (cherry picked from commit 44f46dc8)
      03ceaafc
  26. Sep 09, 2022
    • Alexander Turenko's avatar
      popen: fix a race between setpgrp() and killpg() · 99040255
      Alexander Turenko authored
      In brief: `vfork()` on Mac OS 12 and newer doesn't suspend the parent
      process, so we should wait for `setpgrp()` to use `killpg()`. See more
      detailed description of the problem in a comment of the
      `popen_wait_group_leadership()` function.
      
      The solution is to spin in a loop and check child's process group. It
      looks as the most simple and direct solution. Other possible solutions
      requires to estimate cons and pros of using extra file descriptor or
      assigning a signal number for the child -> parent communication.
      
      There are the following alternatives and variations:
      
      * Create a pipe and notify the parent from the child about the
        `setpgrp()` call.
      
        It costs extra file descriptor, so I decided to don't do that.
        However if we'll need some channel to deliver information from the
        child to the parent for another task, it'll worth to reimplement this
        function too.
      
        One possible place, where we may need such channel is delivery of
        child's errors to the parent. Now the child writes them directly to
        logger's fd and it requires some tricky code to keep and close the
        descriptor at right points. Also it doesn't allow to catch those
        errors in the parent, but we may need it for #4925.
      * Notify the parent about `setpgrp()` using a signal.
      
        It seems too greedly to assign a specific signal for such local
        problem. It is also unclear how to guarantee that it'll not break any
        user's code: a user can load a dynamic library, which uses some
        signals on its own.
      
        However we can consider using this approach here if we'll design some
        common interprocess notification system.
      * We can use the fiber cond or the `popen_wait_timeout()` function from
        PR #7648 to react to the child termination instantly.
      
        It would complicate the code and anyway wouldn't allow to react
        instantly on `setpgrp()` in the child.
      
        Also it assumes yielding during the wait (see below).
      * Wait until `setpgrp()` in `popen_send_signal()` instead of
        `popen_new()`.
      
        It would add yielding/waiting inside `popen_send_signal()` and likely
        will extend a set of its possible exit situations. It is undesirable:
        this function should have simple and predictable behavior.
      * Finally, we considered yielding in `popen_wait_group_leadership()`
        instead of sleeping the whole tx thread.
      
        `<popen handle>:new()` doesn't yield at the moment and a user's code
        may lean on this fact.
      
        Yielding would allow to achieve better throughtput (amount of parallel
        requests per second), but we don't take much care to performance on
        Mac OS. The primary goal for this platform is to offer the same
        behavior as on Linux to allow development of applications.
      
      I didn't replace `vfork()` with `fork()` on Mac OS, because `vfork()`
      works and I don't know consequences of calling `pthread_atfork()`
      handlers in a child created by popen. See the comment in `popen_new()`
      near to `vfork()` call: it warns about possible mutex double locks. This
      topic will be investigated further in #6674.
      
      Fixes #7658
      
      NO_DOC=fixes incorrect behavior, no need to document the bug
      NO_TEST=already tested by app-tap/popen.test.lua
      
      (cherry picked from commit e2207fdc)
      99040255
Loading