  1. Jan 10, 2024
    • Sergey Bronnikov's avatar
      httpc: fix a race in GC finalizers · 7f3ded43
      Sergey Bronnikov authored
      The `httpc` module has two GC finalizers: the first one for a Lua HTTP
      client (C function `luaT_httpc_cleanup`) and the second one for Lua
      HTTP chunked requests (C function `luaT_httpc_io_cleanup`), introduced in
      commit 417c6cb7 ("httpc: introduce stream input/output interface").
      In the C implementation, HTTP requests depend on the structures of the
      HTTP client, and these GC finalizers are not synchronized when Lua
      objects in the `httpc` module are destroyed. This could lead to at
      least two problems:
      
      There is a race in GC finalization that leads to use-after-free errors
      when the HTTP client is collected before the HTTP request. In a
      stack trace the problem looks as follows:
      
      ```
      0x55ca7d47652e in crash_collect+256
      0x55ca7d476f6a in crash_signal_cb+100
      0x7fb876c42520 in __sigaction+80
      0x55ca7d641e51 in curl_slist_free_all+35
      0x55ca7d441498 in httpc_request_delete+45
      0x55ca7d4653f1 in httpc_io_destroy+27
      0x55ca7d4674bc in luaT_httpc_io_cleanup+36
      0x55ca7d4e00c7 in lj_BC_FUNCC+70
      0x55ca7d4f8364 in gc_call_finalizer+668
      0x55ca7d4f8946 in gc_finalize+1387
      0x55ca7d4f91e2 in gc_onestep+864
      0x55ca7d4f9716 in lj_gc_fullgc+276
      ...
      ```
      
      The Lua object `http.client` could be garbage-collected while a chunked
      HTTP request is still alive. This leads to the error "IllegalParams: io:
      request must be io", because a method is called when the Lua object is
      already `nil`.
      
      ```lua
      local url = 'https://bronevichok.ru/'
      local c = require('http.client').new()
      local r = c:get(url, {chunked = true})
      c = nil
      collectgarbage()
      collectgarbage()
      r:read(1) -- IllegalParams: io: request must be io
      ```
      
      The patch introduces two functions, `httpc_env_finish` and
      `curl_env_finish`, that prepare the httpc and curl environments for
      destruction. The HTTP client's GC finalizer now calls `httpc_env_finish`
      instead of `httpc_env_destroy`, which prevents destroying memory that
      could still be in use by HTTP requests. Additionally, `httpc_env_finish`
      sets the `cleanup` flag. The HTTP environment is destroyed once the
      `cleanup` flag is set and there are no active HTTP requests. The main
      idea of the patch is to synchronize the destructors of the HTTP client
      and HTTP chunked requests. Unfortunately, GC will eventually collect the
      HTTP client object after calling its `__gc`. To prevent this, we put a
      reference to Curl's userdata in the Lua objects of HTTP chunked requests
      and of the default HTTP client.
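      
      The deferred destruction described above can be sketched as follows.
      This is a minimal illustration under stated assumptions, not the code of
      the patch: `struct env`, `env_new`, `env_finish`, `request_start` and
      `request_finish` are hypothetical names, and the `destroyed` pointer
      exists only so that the teardown can be observed.
      
      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdlib.h>
      
      /* Hypothetical stand-in for the environment shared by client and requests. */
      struct env {
      	int active_requests; /* requests that still use the environment */
      	bool cleanup;        /* set once the client's finalizer has run */
      	bool *destroyed;     /* illustration only: records the teardown */
      };
      
      static struct env *
      env_new(bool *destroyed)
      {
      	struct env *env = calloc(1, sizeof(*env));
      	if (env != NULL)
      		env->destroyed = destroyed;
      	return env;
      }
      
      /* Real teardown; runs exactly once, when nobody uses the environment. */
      static void
      env_destroy(struct env *env)
      {
      	assert(env->active_requests == 0);
      	if (env->destroyed != NULL)
      		*env->destroyed = true;
      	free(env);
      }
      
      /* Client finalizer: request the teardown, defer it while requests live. */
      static void
      env_finish(struct env *env)
      {
      	env->cleanup = true;
      	if (env->active_requests == 0)
      		env_destroy(env);
      }
      
      static void
      request_start(struct env *env)
      {
      	env->active_requests++;
      }
      
      /* Request finalizer: the last request destroys a finished environment. */
      static void
      request_finish(struct env *env)
      {
      	assert(env->active_requests > 0);
      	if (--env->active_requests == 0 && env->cleanup)
      		env_destroy(env);
      }
      ```
      
      With this scheme the order in which the two finalizers run no longer
      matters: whichever runs last performs the actual teardown.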
      
      Fixes #9346
      Fixes #9453
      
      NO_DOC=bugfix
      
      (cherry picked from commit 17e9c6ff)
      7f3ded43
    • Sergey Bronnikov's avatar
      httpc: fix a crash triggered by gc · 3c617ca9
      Sergey Bronnikov authored
      Bumping the curl version to 8.4.0 triggers a crash in Tarantool due to
      the commit "h2: testcase and fix for pausing h2 streams" [1]. The
      original reproducer involves etcd and an etcd-client Lua module; running
      etcd-client tests as a part of Tarantool integration testing is planned
      in the scope of [1].
      
      However, the problem can be reproduced with the Lua code below:
      
      ```lua
      local url = 'https://google.com/'
      
      local c = require('http.client').new()
      
      r1 = c:get(url, {chunked = true})
      r1:read(1)
      r2 = c:get(url, {chunked = true})
      r2:read(1)
      r3 = c:get(url, {chunked = true})
      r3:read(1)
      r4 = c:get(url, {chunked = true})
      r4:read(1)
      
      c = nil
      collectgarbage()
      collectgarbage()
      
      r1:read(1)
      r2:read(1)
      r3:read(1)
      r4:read(1)
      
      collectgarbage()
      collectgarbage()
      ```
      
      According to the Curl documentation [2], easy handles must be removed
      from the multi handle and cleaned up before `curl_multi_cleanup` is
      called. The patch adds a cleanup of easy handles to `curl_env_destroy`,
      right before the call to `curl_multi_cleanup`. The patch uses the
      function `curl_multi_get_handles`, introduced in Curl 8.4.0, which
      returns all added easy handles. Therefore the bump to 8.4.0 is required.
      
      1. https://github.com/curl/curl/commit/6b9a591bf7d82031f463373706d7de1cba0adee6
      2. https://curl.se/libcurl/c/curl_multi_cleanup.html
      
      Fixes #9283
      
      1. https://github.com/tarantool/tarantool/issues/9093
      
      NO_DOC=bugfix
      NO_TEST=no simple reproducer, covered by tests in etcd-client
      
      (cherry picked from commit c6e6dd93)
      3c617ca9
    • Sergey Bronnikov's avatar
      httpc: prefer curl headers in submodule by default · dfc46bc0
      Sergey Bronnikov authored
      FreeBSD instances in Tarantool CI have the libcurl package installed (as
      a dependency of the Zabbix monitoring agent). Curl 8.4.0 introduces a
      new function, `curl_multi_get_handles`, that is used in the following
      commit in `src/curl.c`, but the headers of the system libcurl package
      do not declare it. When building on FreeBSD in Tarantool CI, the C
      compiler looks at system headers by default and produces a warning
      about an implicit function declaration; because the CMake option
      `-DENABLE_WERROR=ON` is enabled, the build fails:
      
      ```
      [ 63%] Building C object src/CMakeFiles/server.dir/title.c.o
      /.cache/act/55d136250dd94303/hostexecutor/src/curl.c:266:17: error: implicit declaration of function 'curl_multi_get_handles' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
                      CURL **list = curl_multi_get_handles(env->multi);
                                    ^
      /.cache/act/55d136250dd94303/hostexecutor/src/curl.c:266:17: note: did you mean 'curl_multi_add_handle'?
      /usr/local/include/curl/multi.h:140:23: note: 'curl_multi_add_handle' declared here
      CURL_EXTERN CURLMcode curl_multi_add_handle(CURLM *multi_handle,
                            ^
      /.cache/act/55d136250dd94303/hostexecutor/src/curl.c:266:10: error: incompatible integer to pointer conversion initializing 'CURL **' (aka 'void **') with an expression of type 'int' [-Werror,-Wint-conversion]
                      CURL **list = curl_multi_get_handles(env->multi);
                             ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      2 errors generated.
      ```
      
      The patch fixes that by reordering the header directories passed to the
      compiler, see [1].
      
      1. https://cmake.org/cmake/help/latest/command/include_directories.html
      
      Needed for #9283
      
      NO_CHANGELOG=build
      NO_DOC=build
      NO_TEST=build
      
      (cherry picked from commit 0a3500d3)
      dfc46bc0
    • Sergey Bronnikov's avatar
      tests: suppress message 'Broken pipe exception handling' · 91e53899
      Sergey Bronnikov authored
      The message below is printed every time `httpd.py` shuts down when
      `test/app-luatest/http_client_test.lua` is run by luatest without
      capturing stdout:
      
      ```
      BrokenPipeError: [Errno 32] Broken pipe exception handling
      ```
      
      The patch suppresses this exception by adding a handler for the
      `SIGPIPE` signal.
      
      NO_CHANGELOG=testing
      NO_DOC=testing
      NO_TEST=testing
      
      (cherry picked from commit 8912df25)
      91e53899
    • Sergey Bronnikov's avatar
      httpc: fix typos · 489d89a2
      Sergey Bronnikov authored
      NO_CHANGELOG=fixed typos
      NO_DOC=fixed typos
      NO_TEST=fixed typos
      
      (cherry picked from commit 2aaf0115)
      489d89a2
    • Sergey Bronnikov's avatar
      cmake: propagate debug mode to third party components · 32cc24f6
      Sergey Bronnikov authored
      The patch propagates debug mode to the build of third-party components:
      c-ares, libcurl, libeio, nghttp2, zstd. Other components enable debug
      mode automatically once it is enabled in the Tarantool build.
      
      Curl has two similar options that enable debug mode, but they are
      different: `ENABLE_CURLDEBUG` enables memory debugging, while
      `ENABLE_DEBUG` enables code that is compiled only in debug builds [1].
      
      1. https://everything.curl.dev/internals/memory-debugging
      
      NO_CHANGELOG=build
      NO_DOC=build
      NO_TEST=build
      
      (cherry picked from commit 3dbf19b6)
      32cc24f6
    • Sergey Bronnikov's avatar
      third_party: update libcurl from 8.3.0 to 8.4.0 · e2fcf100
      Sergey Bronnikov authored
      The patch updates the curl module to version 8.4.0 [1], which brings a
      number of functional fixes and a security fix for the SOCKS5 heap buffer
      overflow (CVE-2023-38545); see the description in [2] and commit
      fb4415d8aee6 ("socks: return error if hostname too long for remote
      resolve") in [3].
      
      1. https://curl.se/changes.html#8_4_0
      2. https://curl.se/docs/CVE-2023-38545.html
      3. https://github.com/curl/curl/commit/fb4415d8aee6c1045be932a34fe6107c2f5ed147
      
      NO_DOC=libcurl submodule bump
      NO_TEST=libcurl submodule bump
      
      (cherry picked from commit ee575fef)
      e2fcf100
  2. Jan 09, 2024
    • Alexander Turenko's avatar
      test: allow to quote CLI arguments in justrun · bbbfb7d2
      Alexander Turenko authored
      Sometimes shell quoting is needed in tests to trigger a validation
      error, for example, if the argument is empty or contains whitespace.
      
      The default is left unchanged so that existing tests are not affected.
      
      Part of #8862
      
      NO_DOC=testing helper change
      NO_CHANGELOG=see NO_DOC
      NO_TEST=see NO_DOC
      
      (cherry picked from commit 761273f2)
      bbbfb7d2
    • Gleb Kashkin's avatar
      test: make treegen.clean more durable · 635b5a1d
      Gleb Kashkin authored
      Usually treegen.clean is called after a test by the g.after_all function
      or an equivalent. In some rare cases internal helpers use their own
      treegen and clean up after themselves. In such a case, treegen.clean
      looks up the internal list of all directories and finds nil. This
      causes an error in the ipairs iteration in the internal logic and fails
      the whole test.
      
      This patch adds a minor durability improvement for such a case. Now, if
      the internal list of all directories is nil (e.g. when treegen.clean was
      called beforehand), the function does nothing.
      
      Part of #8967
      
      NO_DOC=test helper update
      NO_CHANGELOG=see NO_DOC
      NO_TEST=see NO_DOC
      
      (cherry picked from commit 9b0896d9)
      635b5a1d
    • Nikolay Shirokovskiy's avatar
      test: reset readline configuration for justrun too · 7e769d99
      Nikolay Shirokovskiy authored
      This fixes the gh_8613_new_cli_behaviour_test run with my custom .inputrc
      (I use vi-cmd-mode-string/vi-ins-mode-string).
      
      We already reset the readline configuration in interactive_tarantool.lua.
      
      Follows up the groundwork done for #7774
      
      NO_DOC=test harness
      NO_TEST=test harness
      NO_CHANGELOG=test harness
      
      (cherry picked from commit 028c65e0)
      7e769d99
    • Alexander Turenko's avatar
      test: adjust treegen for TMPDIR ending with slash · 659a646b
      Alexander Turenko authored
      Our macOS runners have such a TMPDIR value. It breaks
      `config-luatest/basic_test.lua`, because net.box seems unable to connect
      to a Unix domain socket using a URI with a double slash in the middle.
      
      See the comment in the code for details.
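      
      The general technique of joining a directory that may end with a slash
      and a file name can be sketched as follows. This is an illustration
      only, not the treegen helper (which is written in Lua); `join_path` is
      a hypothetical name.
      
      ```c
      #include <assert.h>
      #include <stdio.h>
      #include <string.h>
      
      /*
       * Join a directory and a file name with exactly one '/' between them.
       * Returns the length of the result, or -1 if the buffer is too small.
       */
      static int
      join_path(char *buf, size_t size, const char *dir, const char *name)
      {
      	size_t len = strlen(dir);
      	assert(len > 0);
      	/* Drop trailing slashes, but keep a lone "/" as the root. */
      	while (len > 1 && dir[len - 1] == '/')
      		len--;
      	/* The root already ends with a slash, so no separator is needed. */
      	const char *sep = (len == 1 && dir[0] == '/') ? "" : "/";
      	int rc = snprintf(buf, size, "%.*s%s%s", (int)len, dir, sep, name);
      	return (rc < 0 || (size_t)rc >= size) ? -1 : rc;
      }
      ```
      
      With this normalization, `/tmp/` and `/tmp` produce the same socket
      path, so no double slash ends up in the URI.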
      
      NO_DOC=testing helper change
      NO_CHANGELOG=see NO_DOC
      NO_TEST=see NO_DOC
      
      (cherry picked from commit 8c3b4c08)
      659a646b
    • Alexander Turenko's avatar
      test: return a dir from treegen.prepare_directory() · db86ddaf
      Alexander Turenko authored
      It makes the helper a bit more convenient to use in tests.
      
      NO_DOC=testing helper
      NO_TEST=see NO_DOC
      NO_CHANGELOG=see NO_DOC
      
      (cherry picked from commit f81b1aac)
      db86ddaf
  3. Dec 29, 2023
  4. Dec 28, 2023
    • Alexander Turenko's avatar
      box: support TT_* uri env vars with query params · aadec75d
      Alexander Turenko authored
      `TT_LISTEN` and `TT_REPLICATION` environment variables were interpreted
      by `box.cfg()` in a confusing way if query parameters with values were
      present. For example, `localhost:3301?transport=plain` was interpreted
      as the following map: `{['localhost:3301?transport'] = 'plain'}`. Later,
      `box.cfg()` looked into this map for known URI fields like `login`,
      `password`, `uri`, `host`, `service` and so on. It found nothing and
      did not start a listening socket.
      
      The reason for this behavior is that the environment value is
      interpreted as a mapping in the `key=value,key=value` format, because
      there is `=` in it.
      
      The patch changes this behavior for a `key=value,key=value` environment
      variable that contains `?` in a key: now such a value is not interpreted
      as a mapping.
      
      Note: Everything said above is also applicable to the so called
      multilisten case: when several URIs are defined in the environment
      variable. The following URI list is interpreted correctly now.
      
      NOWRAP
      ```sh
      export TT_LISTEN=localhost:3301?transport=plain,localhost:3302?transport=plain
      ```
      NOWRAP
      
      Note 2: Examples are given with the `plain` transport, which is default,
      but the query parameters are the way to define TLS options. They're
      supported in Tarantool Enterprise Edition, see [1].
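      
      The decision rule described in this message can be sketched as follows.
      This is a simplified illustration, not the actual `box.cfg()` code;
      `env_value_is_mapping` is a hypothetical name.
      
      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <string.h>
      
      /*
       * Decide whether an environment value should be parsed as a
       * `key=value,key=value` mapping. An item whose key part (the text
       * before the first '=') contains '?' is taken to be a URI with
       * query parameters, so the whole value is not a mapping.
       */
      static bool
      env_value_is_mapping(const char *value)
      {
      	assert(value != NULL);
      	if (strchr(value, '=') == NULL)
      		return false;
      	const char *item = value;
      	while (*item != '\0') {
      		size_t item_len = strcspn(item, ",");
      		const char *eq = memchr(item, '=', item_len);
      		size_t key_len = eq != NULL ? (size_t)(eq - item) : item_len;
      		if (memchr(item, '?', key_len) != NULL)
      			return false;
      		item += item_len;
      		if (*item == ',')
      			item++;
      	}
      	return true;
      }
      ```
      
      Under this rule both the single URI and the multilisten list from the
      examples above are left as URIs, while a plain `key=value,key=value`
      string is still treated as a mapping.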
      
      Fixes #9539
      
      NO_DOC=bugfix
      
      [1]: https://www.tarantool.io/en/doc/latest/enterprise/security/#traffic-encryption
      
      (cherry picked from commit dde7342c)
      aadec75d
  5. Dec 22, 2023
    • Alexander Turenko's avatar
      doc: drop 'breaking change' mark from 2.11.2 notes · d0fbb0ca
      Alexander Turenko authored
      The release notes state one change as breaking, but the new behavior
      was enabled on the 3.0 branch, see commit 6cb39116 ("box: set default
      c_func_iproto_multireturn to new"), and not on the 2.11 branch.
      
      Thanks @mons for the notice!
      
      NO_DOC=no code changes
      NO_CHANGELOG=see NO_DOC
      NO_TEST=see NO_DOC
      d0fbb0ca
    • Mergen Imeev's avatar
      sql: properly check result of decimal parsing · 450da026
      Mergen Imeev authored
      This patch fixes a crash that can occur when SQL parses a decimal
      literal that represents a number greater than or equal to 10^38.
      
      Closes #9469
      
      NO_DOC=bugfix
      
      (cherry picked from commit 05551a55)
      450da026
    • Astronomax's avatar
      box: fix failing assertion in box_promote_qsync · d9608c99
      Astronomax authored
      Fixed a bug when the assertion in box_promote_qsync would fail.
      The assertion is that at the moment when box_promote_qsync is
      executed, no other promote is executed. It turned out that this
      assertion is basically incorrect. Now after this patch the newly
      elected leader is trying to repeat box_promote_qsync in
      box_raft_update_synchro_queue until it fails due to the fact
      that some other promotion is currently being executed.
      
      Closes #9263
      
      NO_DOC=bugfix
      
      (cherry picked from commit ebe4cd9b)
      d9608c99
    • Maksim Kokryashkin's avatar
      ci: fix action for submodule bump · f9ccb414
      Maksim Kokryashkin authored
      It turns out, GitHub actions don't allow `env` usage in their
      definition. This patch fixes this issue in submodule bump action
      by moving the environment definition into the executed shell
      script.
      
      NO_DOC=CI
      NO_TEST=CI
      NO_CHANGELOG=CI
      f9ccb414
    • Maksim Kokryashkin's avatar
      ci: add optional submodule bump step · e2f2db07
      Maksim Kokryashkin authored
      Currently, if there is a need to test submodule integration with
      Tarantool and its integration, it is required to create a PR.
      That is inconvenient, so this patch introduces the option to run
      the same jobs that are triggered by the `full-ci` label as
      reusable workflows with the desired submodule revision. This
      allows for integration testing of submodules within their
      designated repositories.
      
      NO_DOC=CI
      NO_TEST=CI
      NO_CHANGELOG=CI
      e2f2db07
  6. Dec 21, 2023
    • Igor Munkin's avatar
      luajit: bump new version · f8fbfa4f
      Igor Munkin authored
      * FFI: Fix dangling reference to CType in carith_checkarg().
      * FFI: Fix dangling reference to CType. Improve checks.
      * FFI: Fix dangling reference to CType.
      * FFI: Ensure returned string is alive in ffi.typeinfo().
      * FFI: Fix missing cts->L initialization in argv2ctype().
      * Abstract out on-demand loading of FFI library.
      * test: fix flaky finalizer error handler tests
      * test: adjust lua-Harness test error assertion
      * Fix snapshot PC when linking to BC_JLOOP that was a BC_RET*.
      * snap: check J->pc is within its proto bytecode
      * Fix HREFK forwarding vs. table.clear().
      * Fix FOLD rule for BUFHDR append.
      * Prevent CSE of a REF_BASE operand across IR_RETF.
      * test: rewrite sysprof test using managed execution
      * test: disable buffering for the C test engine
      
      Part of #9145
      
      NO_DOC=LuaJIT submodule bump
      NO_TEST=LuaJIT submodule bump
      f8fbfa4f
  7. Dec 19, 2023
    • Mergen Imeev's avatar
      sql: fix memleak in sqlSelect · 34335ad0
      Mergen Imeev authored
      This patch fixes a memory leak in sqlSelect that was caused by pWInfo
      not being removed if an error occurred while encoding a GROUP BY
      expression.
      
      Closes #8535
      Closes tarantool/security#125
      
      NO_DOC=memleak
      NO_TEST=memleak
      
      (cherry picked from commit 832fea92)
      34335ad0
  8. Dec 14, 2023
    • Nikolay Shirokovskiy's avatar
      replication: don't rollback qsync limbo wait on fiber cancel · 9ce539d8
      Nikolay Shirokovskiy authored
      During the iproto graceful shutdown, which is WIP, we cancel all iproto
      requests in progress. This causes the election_qsync_stress test to fail.
      
      We shut down the master while it is waiting for a transaction
      confirmation from the quorum (which never exists in this test).
      Currently, on shutdown we roll back a transaction in this state, but
      when the previous master is restarted after a new master is elected, we
      do not expect the rollback on the previous master.
      
      Let's keep the transaction in the limbo if the fiber is cancelled, as our
      direction is to do only quorum rollbacks.
      
      Part of #8423
      Closes #9480
      
      NO_DOC=bugfix
      
      (cherry picked from commit 7a2bc0bb)
      9ce539d8
  9. Dec 13, 2023
  10. Dec 12, 2023
    • Alexander Turenko's avatar
      test: bump test-run with luatest update to 1.0.0-3 · 9775c9e5
      Alexander Turenko authored
      This commit updates test-run and the only change in test-run is a bunch
      of luatest updates. The list of luatest updates can be found in
      tarantool/test-run#415 or below.
      
      - assertions: Improved error message for one assert function [1]
      - TAP output: add missing tabulation to artifacts [2]
      - utils: add `version_current_ge_than()` [3]
      - server: fix unix socket path length check [4]
      - server: accept `new_box_uri` as a table [5]
      
      The list excludes changes that are not related to test-run's usage:
      documentation, testing of luatest itself, packaging of luatest and so
      on.
      
      [1]: tarantool/luatest@2a26c32
      [2]: tarantool/luatest@5e8c3e3
      [3]: tarantool/luatest@7b6f167
      [4]: tarantool/luatest@a8b0389
      [5]: tarantool/luatest@f37b353
      
      NO_DOC=testing framework update
      NO_CHANGELOG=see NO_DOC
      NO_TEST=see NO_DOC
      
      (cherry picked from commit 161ca17b)
      9775c9e5
    • Serge Petrenko's avatar
      Dummy commit · 07391b19
      Serge Petrenko authored
      07391b19
  11. Dec 07, 2023
  12. Dec 05, 2023
    • Sergey Kaplun's avatar
      lua: prevent serialization of error for ucdata · 074fe0bf
      Sergey Kaplun authored
      Without checking the return value of `lua_pcall()` in
      `lua_field_inspect_ucdata()`, the error message itself is returned as a
      serialized result. The result status of `lua_pcall()` is not ignored
      now.
      
      NO_DOC=bugfix
      
      Closes #9396
      
      (cherry picked from commit 98474f70)
      074fe0bf
    • Maxim Kokryashkin's avatar
      build: purge sysprof.collapse module · 2e9d205b
      Maxim Kokryashkin authored
      This module became unused as a result of LuaJIT bump made in the
      commit 88333d13 ("luajit: bump new version"), so it can be
      purged safely from the Tarantool sources.
      
      Part of #8700
      
      NO_DOC=internal
      NO_TEST=internal
      NO_CHANGELOG=added within the aforementioned commit
      
      (cherry picked from commit e2851883)
      2e9d205b
  13. Dec 02, 2023
    • Serge Petrenko's avatar
      replication: fix extraneous split-brain alerting · 718aeb14
      Serge Petrenko authored
      Current split-brain detector implementation raises an error each time a
      CONFIRM or ROLLBACK entry is received from the previous synchronous
      transaction queue owner. It is assumed that the new queue owner must
      have witnessed all the previous CONFIRMs. Besides, according to Raft,
      ROLLBACK should never happen.
      
      Actually there is a case when a CONFIRM from an old term is legal: it's
      possible that during leader transition old leader writes a CONFIRM for
      the same transaction that is confirmed by the new leader's PROMOTE. If
      PROMOTE and CONFIRM lsns match there is nothing bad about such
      situation.
      
      Symmetrically, when an old leader issues a ROLLBACK with the lsn right
      after the new leader's PROMOTE lsn, it is not a split-brain.
      
      Allow such cases by tracking the last confirmed lsn for each synchronous
      transaction queue owner and silently nopifying CONFIRMs with an lsn less
      than the one recorded and ROLLBACKs with lsn greater than that.
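      
      The filtering rule can be sketched as follows. This is a simplified
      illustration, not the code of the patch: the enum and function names are
      hypothetical, and treating a CONFIRM whose lsn equals the recorded one
      as harmless is an assumption based on the "lsns match" case above.
      
      ```c
      #include <assert.h>
      #include <stdint.h>
      
      enum synchro_type { SYNCHRO_CONFIRM, SYNCHRO_ROLLBACK };
      enum synchro_verdict { VERDICT_NOP, VERDICT_SPLIT_BRAIN };
      
      /*
       * Classify a CONFIRM or ROLLBACK received from a previous queue owner.
       * `confirmed_lsn` is the last lsn recorded as confirmed for that owner.
       */
      static enum synchro_verdict
      filter_old_term_request(enum synchro_type type, int64_t lsn,
      			int64_t confirmed_lsn)
      {
      	switch (type) {
      	case SYNCHRO_CONFIRM:
      		/* Confirms nothing beyond what is already recorded. */
      		return lsn <= confirmed_lsn ?
      		       VERDICT_NOP : VERDICT_SPLIT_BRAIN;
      	case SYNCHRO_ROLLBACK:
      		/* Rolls back only what was never confirmed. */
      		return lsn > confirmed_lsn ?
      		       VERDICT_NOP : VERDICT_SPLIT_BRAIN;
      	}
      	assert(0 && "unknown request type");
      	return VERDICT_SPLIT_BRAIN;
      }
      ```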
      
      Closes #9138
      
      NO_DOC=bugfix
      
      (cherry picked from commit ffa6ac15)
      718aeb14
    • Serge Petrenko's avatar
      replication: persist confirmed vclock on replicas · bcbe9232
      Serge Petrenko authored
      Previously the replicas only persisted the confirmed lsn of the current
      synchronous transaction queue owner. As soon as the owner changed, the
      info about which lsn was confirmed by the previous owner was lost.
      
      Actually, this info is needed to correctly filter synchro requests
      coming from the old term, so start tracking confirmed vclock instead of
      the confirmed lsn on replicas.
      
      In-scope of #9138
      
      NO_TEST=covered by the next commit
      NO_CHANGELOG=internal change
      
      @TarantoolBot document
      Title: Document new IPROTO_RAFT_PROMOTE request field
      
      IPROTO_RAFT_PROMOTE and IPROTO_RAFT_DEMOTE requests receive a new key
      value pair:
      
      IPROTO_VCLOCK : MP_MAP
      
      The vclock holds a confirmed vclock of the node sending the request.
      
      (cherry picked from commit c4415d44)
      bcbe9232
    • Serge Petrenko's avatar
      xrow: remove SYNCHRO_BODY_LEN_MAX constant · 8d457af4
      Serge Petrenko authored
      Synchronous requests will receive a new field encoding a full vclock
      soon. Theoretically a vclock may take up to ~ 300-400 bytes (3 bytes for
      a map header + 32 components each taking up 1 byte for replica id and up
      to 9 bytes for lsn). So it makes no sense to increase
      SYNCHRO_BODY_LEN_MAX from 32 to 400-500. It would become almost the same
      as plain BODY_LEN_MAX. Simply reuse the latter everywhere.
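      
      The estimate above can be spelled out as a small calculation. The
      constant and function names below are hypothetical and only restate the
      numbers from this message.
      
      ```c
      #include <assert.h>
      
      enum {
      	VCLOCK_MAP_HEADER_SIZE = 3, /* bytes for the map header */
      	VCLOCK_COMPONENTS_MAX = 32, /* components in a full vclock */
      	REPLICA_ID_SIZE = 1,        /* bytes per replica id */
      	LSN_SIZE_MAX = 9,           /* worst-case bytes per lsn */
      };
      
      /* Worst-case encoded size of a vclock with the given component count. */
      static int
      vclock_encoded_size_max(int components)
      {
      	assert(components >= 0 && components <= VCLOCK_COMPONENTS_MAX);
      	return VCLOCK_MAP_HEADER_SIZE +
      	       components * (REPLICA_ID_SIZE + LSN_SIZE_MAX);
      }
      ```
      
      For 32 components this gives 3 + 32 * (1 + 9) = 323 bytes, roughly ten
      times the old 32-byte limit.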
      
      In-scope-of #9138
      
      NO_DOC=refactoring
      NO_TEST=refactoring
      NO_CHANGELOG=refactoring
      
      (cherry picked from commit 53605779)
      8d457af4
    • Serge Petrenko's avatar
      xrow: fix xrow_decode_synchro rejecting non-int types · 77853bef
      Serge Petrenko authored
      There was an error in xrow_decode_synchro: it compared the expected type
      of the value to the type of the key (MP_UINT) instead of the type of the
      actual value. This went unnoticed because all values in synchro requests
      were integers.
      
      This is going to change soon, when PROMOTE requests will start holding a
      vclock, so fix the wrong type check.
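      
      The kind of mistake described above can be sketched as follows. The
      types and functions are simplified, hypothetical stand-ins and do not
      reproduce the actual decoder.
      
      ```c
      #include <assert.h>
      #include <stdbool.h>
      
      /* Simplified stand-ins for MsgPack type tags. */
      enum kind { KIND_UINT, KIND_MAP };
      
      /* One decoded key-value pair of a request body. */
      struct field {
      	enum kind key_kind;   /* keys are always unsigned integers */
      	enum kind value_kind; /* integer for most fields, map for a vclock */
      };
      
      /* Buggy check: compares the expected type with the type of the key. */
      static bool
      value_type_ok_buggy(const struct field *f, enum kind expected)
      {
      	assert(f != NULL);
      	return f->key_kind == expected;
      }
      
      /* Fixed check: compares the expected type with the type of the value. */
      static bool
      value_type_ok(const struct field *f, enum kind expected)
      {
      	assert(f != NULL);
      	return f->value_kind == expected;
      }
      ```
      
      Both checks agree as long as every value is an integer, which is why
      the bug went unnoticed; they diverge as soon as a map-valued field such
      as a vclock appears.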
      
      In-scope-of #9138
      
      NO_DOC=bugfix
      NO_CHANGELOG=not user-visible
      
      (cherry picked from commit c18410f5)
      77853bef
  14. Nov 28, 2023