- Jan 29, 2025
-
-
Dmitry Ivanov authored
Repro:

```lua
box.cfg {}
box.ctl.wait_rw()
box.execute([[
    WITH q(a) AS (VALUES (1))
    SELECT AVG(a), ? AS c1 FROM q HAVING 1 = AVG(a)
]], {100})
```

Output:

```
Please file a bug at https://github.com/tarantool/tarantool/issues
Attempting backtrace... Note: since the server has already crashed, this may fail as well
0x609a4705960a in luaT_pushdecimal+33530
0x70f7f904c1d0 in __sigaction+80
0x609a47230e15 in ibuf_reserve_slow+383333
0x609a46fd8e37 in box_decimal_mp_decode_data+363175
0x609a46fdcca1 in box_decimal_mp_decode_data+379153
0x609a46f80ce7 in box_decimal_mp_decode_data+2391
0x609a4701f167 in luaT_pushtuple+147927
0x609a4724a283 in ibuf_reserve_slow+486867
0x609a47092437 in lua_pcall+119
0x609a4702b4a3 in tarantool_lua_slab_cache+37907
0x609a4702d2a2 in luaL_setcdatagc+3810
0x609a46e6ee40 in ??+0
0x609a47061117 in fiber_self+1591
0x609a471d9129 in ibuf_reserve_slow+23673
```

NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
- Jan 28, 2025
-
-
Dmitry Ivanov authored
Split-brain detector might be triggered by complicated online cluster upgrades (e.g. "the quorum promote patch"), so we don't want to write inconsistent state to disk. If there's no inconsistent snapshot, the problem may go away once we restart the node with a newer version of Tarantool. NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
Dmitry Ivanov authored
This patch adds a new control knob to let the user temporarily disable Tarantool's ability to create snapshots. This might come in useful when performing an online cluster upgrade which is known to bring nodes into an inconsistent state, e.g. "the quorum promote patch".

Before upgrading, the user is supposed to run

```
box.cfg{checkpoint_enabled = false}
```

to temporarily disable both scheduled and manual snapshots.

NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
ssl-luatest/replication_test.lua used to fail because there was no error set in diag. Actually, the error was set, but in another fiber. The erroneous scenario was as follows:

1. A fatal error occurs in one fiber; this fiber poisons the iostream with the SSL_IOSTREAM_POISON flag and reports the error using diag_set. This operation fails with an error reported.
2. Another fiber starts a new operation and discovers that SSL_IOSTREAM_POISON is set, so it returns IOSTREAM_ERROR early. However, the error was set only in the other fiber, not in this one (errors are fiber-local), so diag_raise fails on the assertion.

This commit resolves the problem by removing the poisoning logic. If a fatal error occurred, further errors will be reported by OpenSSL itself, not because of the flag. It also adds a new flag, SSL_SHUTDOWN_MUST_NOT_BE_CALLED, whose purpose is described in its comment.

Note that the same strategy is used in rust-openssl: https://github.com/sfackler/rust-openssl/issues/2334

Closes picodata#890.

NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
- Jan 21, 2025
-
-
Dmitry Ivanov authored
Previously, we'd preserve WAL directory lock during exec(), meaning that stray child processes would not let us restart the instance:

```
E> ER_ALREADY_RUNNING: Failed to lock WAL directory /data and hot_standby mode is off
F> can't initialize storage: Failed to lock WAL directory /data and hot_standby mode is off
```

This patch fixes that by setting O_CLOEXEC for the fd used in flock().

NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
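The idea behind the O_CLOEXEC fix can be sketched in a few lines of C. This is only an illustration under assumptions: `lock_dir_cloexec` is a hypothetical helper, not the function the patch touches, and the sketch assumes Linux/glibc. An flock() lock belongs to the open file description, so any process that inherits the descriptor across exec() keeps the lock alive; O_CLOEXEC asks the kernel to close the descriptor at exec() time instead.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes O_CLOEXEC and O_DIRECTORY on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not Tarantool's actual code): open `dir` and
 * take an exclusive, non-blocking flock() on it.
 * Returns the locked descriptor, or -1 on failure.
 */
static int
lock_dir_cloexec(const char *dir)
{
	assert(dir != NULL);
	/* O_CLOEXEC: the kernel closes this fd when the process exec()s. */
	int fd = open(dir, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
		/* Somebody else holds the lock (or flock is unsupported). */
		close(fd);
		return -1;
	}
	return fd;
}
```

With the flag set at open() time there is no window between open() and a later fcntl(F_SETFD) in which a concurrent fork()+exec() could leak the descriptor.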
-
- Jan 15, 2025
-
-
Виталий Шунков authored
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
- Jan 10, 2025
-
-
Виталий Шунков authored
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
- Dec 26, 2024
-
-
Виталий Шунков authored
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
- Dec 25, 2024
-
-
Виталий Шунков authored
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
Виталий Шунков authored
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
- Dec 11, 2024
-
-
Dmitry Ivanov authored
The bug was discovered during the development of the quorum promote feature -- basically, it causes qpromote_several_outstanding_promotes_test.lua to fail spontaneously. Furthermore, we suspect that this is the underlying cause of lost heartbeats and subsequent severe replication lags.

Long story short, whenever we want to send a message from the TX thread back to a relay thread, we should first check if they are still connected. Otherwise, we'll see

* An assertion failure in debug, or
* (Presumably) a relay hangup in release due to `if (status_msg->msg.route != NULL) return;` in relay_status_update() -> relay_check_status_needs_update().

The upstream is aware of this issue: https://github.com/tarantool/tarantool/issues/9920

Backtrace:

```
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
0x00007f891f6a5463 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78
0x00007f891f64c120 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
0x00007f891f6334c3 in __GI_abort () at abort.c:79
0x00007f891f6333df in __assert_fail_base (fmt=0x7f891f7c3c20 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x58560077b94b "loop() == pipe->producer", file=file@entry=0x58560077b935 "./src/lib/core/cbus.h", line=line@entry=224, function=function@entry=0x58560077b910 "void cpipe_push_input(cpipe*, cmsg*)") at assert.c:94
0x00007f891f644177 in __assert_fail (assertion=0x58560077b94b "loop() == pipe->producer", file=0x58560077b935 "./src/lib/core/cbus.h", line=224, function=0x58560077b910 "void cpipe_push_input(cpipe*, cmsg*)") at assert.c:103
0x0000585600328782 in cpipe_push_input (pipe=0x58563115dfc8, msg=0x58563115e028) at tarantool/src/lib/core/cbus.h:224
0x0000585600328802 in cpipe_push (pipe=0x58563115dfc8, msg=0x58563115e028) at tarantool/src/lib/core/cbus.h:241
0x000058560032a157 in tx_status_update (msg=0x58563115e028) at tarantool/src/box/relay.cc:629
0x0000585600468d4b in cmsg_deliver (msg=0x58563115e028) at tarantool/src/lib/core/cbus.c:553
0x0000585600469e50 in fiber_pool_f (ap=0x7f891e8129a8) at tarantool/src/lib/core/fiber_pool.c:64
0x00005856001be16a in fiber_cxx_invoke(fiber_func, typedef __va_list_tag __va_list_tag *) (f=0x585600469b82 <fiber_pool_f>, ap=0x7f891e8129a8) at tarantool/src/lib/core/fiber.h:1283
0x000058560045f495 in fiber_loop (data=0x0) at tarantool/src/lib/core/fiber.c:1085
0x0000585600745b8f in coro_init () at tarantool/third_party/coro/coro.c:108
```

Relevant frames:

```
0x0000585600328782 in cpipe_push_input (pipe=0x58563115dfc8, msg=0x58563115e028) at tarantool/src/lib/core/cbus.h:224
224 assert(loop() == pipe->producer);
(gdb)
0x0000585600328802 in cpipe_push (pipe=0x58563115dfc8, msg=0x58563115e028) at tarantool/src/lib/core/cbus.h:241
241 cpipe_push_input(pipe, msg);
(gdb)
0x000058560032a157 in tx_status_update (msg=0x58563115e028) at tarantool/src/box/relay.cc:629
629 cpipe_push(&status->relay->relay_pipe, msg);
(gdb)
0x0000585600468d4b in cmsg_deliver (msg=0x58563115e028) at tarantool/src/lib/core/cbus.c:553
553 msg->hop->f(msg);
(gdb)
0x0000585600469e50 in fiber_pool_f (ap=0x7f891e8129a8) at tarantool/src/lib/core/fiber_pool.c:64
64 cmsg_deliver(msg);
(gdb)
```

NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
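The rule "check that TX and the relay are still connected before replying" can be modelled with a toy C sketch. Everything here is a hypothetical stand-in: `relay_model`, `status_msg_model`, the `tx_connected` flag and `tx_status_reply` are not the names used in relay.cc, and the counter increment merely takes the place of the real cpipe_push() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a relay as seen from the TX thread. */
struct relay_model {
	bool tx_connected;    /* cleared once the relay detaches from TX */
	int replies_received; /* counts replies that were actually sent */
};

/* Hypothetical stand-in for the status message travelling to TX. */
struct status_msg_model {
	struct relay_model *relay;
};

/*
 * TX-side handler: reply only while both ends are still connected.
 * Returns true if a reply was sent, false if it was dropped.
 */
static bool
tx_status_reply(struct status_msg_model *msg)
{
	assert(msg != NULL && msg->relay != NULL);
	if (!msg->relay->tx_connected)
		return false; /* the relay is gone: do not push */
	msg->relay->replies_received++; /* stands in for cpipe_push() */
	return true;
}
```

In this model a dropped reply is harmless because the detached relay no longer waits for it; whether the real patch drops the message or handles it differently is not stated in the commit message.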
-
Dmitry Ivanov authored
Previously, log_destroy would close log->fd even if it's one of the standard streams. This behavior almost never makes sense, unless one's trying to write a unix daemon, which is obviously not the case. Closing stderr has the following side effects:

- it breaks a reinit of the default logger;
- it inhibits ASan's final leak report;
- it causes an EPOLLHUP during a restart via `--entrypoint-fd` (Picodata).

NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
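A minimal sketch of the "never close a standard stream" guard, assuming a logger that stores its descriptor in a plain int. `fd_is_standard` and `log_fd_release` are hypothetical names used for illustration; the real log_destroy may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: true for stdin, stdout and stderr. */
static bool
fd_is_standard(int fd)
{
	return fd == STDIN_FILENO || fd == STDOUT_FILENO ||
	       fd == STDERR_FILENO;
}

/*
 * Sketch of a logger teardown step: close the descriptor only if it
 * is not a standard stream, then forget it either way.
 * Returns true if close() was actually called.
 */
static bool
log_fd_release(int *fd)
{
	assert(fd != NULL);
	bool closed = false;
	if (*fd >= 0 && !fd_is_standard(*fd)) {
		close(*fd);
		closed = true;
	}
	*fd = -1;
	return closed;
}
```

Resetting the stored descriptor to -1 in both branches keeps a second teardown call from touching a descriptor number that may have been reused.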
-
Dmitry Ivanov authored
This stores all of the include directories of box into a file called box-include-args. We use this to generate certain FFI bindings using rust-bindgen. NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
Dmitry Ivanov authored
Apparently, clang 18 is not particularly happy about luajit's static assert implementation:

```
In file included from tarantool/src/box/box.cc:38:
In file included from tarantool/src/lua/utils.h:47:
In file included from tarantool/third_party/luajit/src/lj_state.h:9:
tarantool/third_party/luajit/src/lj_obj.h:488:1: error: variable length arrays in C++ are a Clang extension; did you mean to use 'static_assert'? [-Werror,-Wvla-extension-static-assert]
  488 | LJ_STATIC_ASSERT(offsetof(Node, val) == 0);
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tarantool/third_party/luajit/src/lj_def.h:370:71: note: expanded from macro 'LJ_STATIC_ASSERT'
  370 |   extern void LJ_ASSERT_NAME(__COUNTER__)(int STATIC_ASSERTION_FAILED[(cond)?1:-1])
      |                                                                       ^~~~~~~~~~~
tarantool/third_party/luajit/src/lj_obj.h:488:18: note: cast that performs the conversions of a reinterpret_cast is not allowed in a constant expression
  488 | LJ_STATIC_ASSERT(offsetof(Node, val) == 0);
      | ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
tarantool/src/trivia/util.h:273:33: note: expanded from macro 'offsetof'
  273 | #define offsetof(type, member) ((size_t) &((type *)0)->member)
      |                                 ^
tarantool/third_party/luajit/src/lj_def.h:370:72: note: expanded from macro 'LJ_STATIC_ASSERT'
  370 |   extern void LJ_ASSERT_NAME(__COUNTER__)(int STATIC_ASSERTION_FAILED[(cond)?1:-1])
      |
```

Luckily, we can just mute this.

NO_DOC=<nothing interesting here> NO_TEST=<tested during build time> NO_CHANGELOG=<nothing interesting here>
-
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
Picodata SQL now manages the Tarantool statement cache using a dedicated SQL fiber that handles preparation and unpreparation of statements based on an LRU eviction policy. Prepared statements can be executed across different sessions by SQL clients. Previously, when a client executed a prepared statement, it increased the reference count in the statement cache and linked the statement to the client's session. While this approach seemed fine, it caused issues during eviction, as references to these statements remained in client sessions, preventing proper eviction. This commit addresses the issue by ensuring that if a statement is added to the current session during execution, it is removed and the session state is restored once execution is complete. NO_DOC=internal NO_CHANGELOG=internal
-
Temporary spaces, used for cluster-wide SQL data materialization, were causing unnecessary netbox schema version bumps, leading to schema downloading via netbox, excessive Lua garbage and GC blocks. Since these tables are for internal SQL use, we don't need to inform netbox clients about schema changes. We now maintain separate schema versions: one for netbox clients and one for the internal prepared statement cache. NO_DOC=picodata internal patch NO_CHANGELOG=picodata internal patch
-
Dmitry Ivanov authored
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
* smoke test that verifies that cluster can be successfully bootstrapped
* test diverging limbo owner (litmus test) - doesn't work on fork master, works with promote chains
* test aba leader - two cases where elected leader fails to deliver promote to others
* attempt to model 2+2 quorum lowering case
* two cases with two outstanding promotes
* use new injections for particular xrow types

NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal

Co-authored-by: Dmitry Ivanov <ivadmi5@gmail.com>
-
Dmitry Ivanov authored
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
Dmitry Ivanov authored
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
Dmitry Ivanov authored
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
Dmitry Ivanov authored
Further reading:

* https://git.picodata.io/picodata/tarantool/-/merge_requests/175
* https://github.com/tarantool/tarantool/pull/10334

NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal

Co-authored-by: Dmitry Ivanov <ivadmi5@gmail.com>
-
- If you prepare and execute a statement with parameters in the projection and then unprepare the statement, the byte counter may show a wrong value or even overflow.
- The problem is that when we compile an SQL statement, we set the parameter type to 'any', but when we execute the statement, we set the parameter type to the actual type. This type is then used to calculate the estimated size of the SQL cache entry, so the estimate made during prepare differs from the one made during unprepare after the statement was executed.
- Fix this by resetting the type to 'any' after executing the statement.

NO_DOC=picodata internal patch NO_CHANGELOG=picodata internal patch
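The accounting mismatch described above can be reproduced with a toy model in C. All names and numbers are made up for illustration (`stmt_model`, `stmt_estimate_size`, the 64/72-byte estimates); the only thing the sketch takes from the commit message is the mechanism: the size estimate depends on the parameter type, execution changes that type, and the counter is unsigned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parameter types; the real set is larger. */
enum param_type { PARAM_ANY, PARAM_INTEGER };

struct stmt_model {
	enum param_type param_type;
};

/* Made-up estimate that depends on the bound parameter type. */
static size_t
stmt_estimate_size(const struct stmt_model *stmt)
{
	assert(stmt != NULL);
	return stmt->param_type == PARAM_ANY ? 64 : 72;
}

static size_t cache_bytes; /* models the cache byte counter */

static void
cache_prepare(struct stmt_model *stmt)
{
	stmt->param_type = PARAM_ANY; /* compile-time type is 'any' */
	cache_bytes += stmt_estimate_size(stmt);
}

/* Execution binds an actual type; the fix restores 'any' afterwards. */
static void
cache_execute(struct stmt_model *stmt, bool reset_type)
{
	stmt->param_type = PARAM_INTEGER;
	/* ... the statement would run here ... */
	if (reset_type)
		stmt->param_type = PARAM_ANY;
}

static void
cache_unprepare(struct stmt_model *stmt)
{
	/* size_t is unsigned: subtracting too much wraps around. */
	cache_bytes -= stmt_estimate_size(stmt);
}
```

Without the reset, prepare adds 64 and unprepare subtracts 72, so the counter wraps to a huge value; with the reset, both sides see the same estimate and the counter returns to zero.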
-
BREAKING CHANGE!:

1. add a session id argument to sql_prepare_ext
2. introduce the sql_unprepare_ext function, which removes a prepared statement using the given session id.

In picodata SQL, we may prepare a statement in one session and unprepare it in another session, which does not know in what session the statement was prepared. Now sql_prepare_ext returns not only the statement id, but also a session id. This way the statement can be unprepared from another session using sql_unprepare_ext.

NO_DOC=picodata internal patch NO_CHANGELOG=picodata internal patch
-
Dmitry Ivanov authored
NO_DOC=internal NO_CHANGELOG=internal NO_TEST=internal
-
Replace fiber.sleep with luatest.helpers.retrying to make tests less flaky NO_DOC=internal NO_CHANGELOG=internal
-
WAL extensions allow adding auxiliary information to each write-ahead log record. WAL extensions are configured via the `box.cfg.wal_ext` option. Currently, there is only one builtin extension: `new_old`. The `new_old` extension adds information about new and old tuples for ddl operations. NO_DOC=internal NO_CHANGELOG=internal
-
BREAKING CHANGE!:

1. remove sql_prepare from the export list;
2. introduce sql_prepare_ext.

The sql_prepare symbol previously included the tarantool port as an output parameter. However, this structure was inconvenient for libraries using the C API, as they primarily required just the statement ID. To address this issue, the sql_prepare symbol was replaced with the sql_prepare_ext symbol.

NO_DOC=picodata internal patch NO_CHANGELOG=picodata internal patch
-
Previously, users with multiple connections to tarantool instance couldn't share prepared statements across sessions. They had to manually call prepare in each session before execution. This commit automates this process for the exported version of SQL prepared statement execution (sql_execute_prepared_ext symbol). Original Lua execution keeps the old behavior for backward compatibility. NO_DOC=picodata internal patch NO_CHANGELOG=picodata internal patch NO_TEST=picodata internal patch
-
Previously, sql_prepare_and_execute and sql_execute_prepared functions didn't follow a convention to keep output parameters (the port to be exact) at the end of the argument list. NO_DOC=picodata internal patch NO_CHANGELOG=picodata internal patch NO_TEST=picodata internal patch
-
BREAKING CHANGE!:

1. sql_bind_list_decode - removed
2. sql_execute_prepared_ext - new arguments
3. sql_prepare_and_execute_ext - exported

There were several reasons to refactor the API.

1. sql_bind_list_decode (decodes MessagePack parameters into the internal C bind structure) is very difficult to use without memory leaks (as it allocates results on fiber()->gc).
2. sql_execute_prepared_ext lacked a vdbe step limit parameter and used the default value.
3. Sometimes SQL queries don't fit into the prepared statement cache and the user still wants to execute them via a slow path with full compilation from the query text. That was the reason to export the sql_prepare_and_execute_ext symbol.

NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
Dmitry Ivanov authored
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal

The following commit introduced a tautological if expression:

```gitcommit
sql: introduce structs assembling DDL arguments during parsing (ba56b145fafaa3)
```

Due to the changes in commit

```gitcommit
sql: refactor memory allocation system (cb777136dd7a0c)
```

the allocations in sql expressions became infallible, which means that we may safely fix static analysis warnings by dropping the tautological comparison altogether.

Original patch by Feodor Alexandrov.
-
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
Also remove CI pipeline for fuzz_until, as running it in CI is not planned NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-
- add a CLI flag controlling whether to wait for 2x coverage
- add timestamps to log lines
- remove dictionary passing, as it is not needed when the corpus already exists

NO_DOC=internal NO_TEST=internal NO_CHANGELOG=internal
-