- Mar 25, 2021
-
-
HustonMmmavr authored
* Remove unnecessary `#include "tt_static.h"` from src/ssl_cert_paths_discover.c
* Fix typo at test/app-tap/ssl-cert-paths-discover.test.lua: call `os.exit` instead of `os:exit`

A follow-up on #5615
-
Sergey Ostanevich authored
Resolves #5857
Reviewed-by: Igor Munkin <imun@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
-
- Mar 24, 2021
-
-
Iskander Sagitov authored
If rope_insert() exits with an error, some nodes that were already created are not deleted. This commit fixes the leak and adds a test. The test checks that in this error case the number of allocated nodes and the number of freed nodes are the same. Closes #5788
-
Vladislav Shpilevoy authored
Lua buffer module used to have a couple of preallocated objects of type 'union c_register'. It was a bunch of C scalar and array types intended for use instead of ffi.new() where it was needed to allocate a temporary object like 'int[1]' just to be able to pass 'int *' into a C function via FFI. It was a bit faster than ffi.new() even for small sizes. For instance (when JIT works), getting a register to use it as 'int[1]' cost around 0.2-0.3 ns while ffi.new('int[1]') costs around 0.4 ns. Also the code looked cleaner. But Lua registers were global and therefore had the same issue as IBUF_SHARED and static_alloc() in Lua - no ownership, and sudden reuse when GC starts right while the register is still in use in some Lua code. __gc handlers could wipe the register values, making the original code behave unpredictably. IBUF_SHARED was fixed by a proper ownership implementation, but that is not necessary for Lua registers. It could be done with the buffer.ffi_stash_new() feature, but its performance is about 0.8 ns, which is worse than plain ffi.new() for simple scalar types. This patch eliminates Lua registers and uses ffi.new() instead everywhere. Closes #5632
-
Vladislav Shpilevoy authored
sio_strfaddr() can't be used in the places where a static buffer is not acceptable: in any code which wants to push the value to Lua, or which needs the address string to be long-living. The patch introduces sio_snprintf(), which does the same, but saves the result into a provided buffer with a limited size. In the Lua C code the patch saves the address string on the stack, which makes it safe against Lua GC interruptions. Part of #5632
-
Vladislav Shpilevoy authored
It was 32, and couldn't fit long IPv6 and Unix socket addresses. The patch makes it 200 so now it fits any supported address family used in the code. Having a valid SERVICE_NAME_MAXLEN is necessary to be able to save a complete address string on the stack in the places where the static buffer returned by sio_strfaddr() can't be used safely. For instance, in the code working with Lua, because Lua GC might be invoked at any moment, and a __gc handler could overwrite the static buffer. Needed for #5632
-
Vladislav Shpilevoy authored
The function was overcomplicated, and made it harder to update it in the next patches with functional changes. The main source of the complication was usage of both inet_ntoa() and getnameinfo(). The latter is more universal, it can cover the case of the former. The patch makes it use only getnameinfo() for IP addresses regardless of v4 or v6. Needed for #5632
-
Vladislav Shpilevoy authored
In a few places a formatted string was pushed with 2 calls: tt_sprintf() + lua_pushstring(). It wasn't necessary because the Lua API has lua_pushfstring() with a big enough subset of printf format features. But more importantly, it was a bug. lua_pushstring() is a GC point. Before copying the passed string it tries to invoke Lua GC, which might invoke a __gc handler for some cdata, where static alloc might be used, and it can rewrite the string passed to lua_pushstring() in the beginning of the stack. Part of #5632
-
Vladislav Shpilevoy authored
static_alloc() uses a fixed-size circular BSS memory buffer. It is often used in C when something smaller than the static buffer needs to be allocated temporarily. It was thought that it might also be useful in Lua, backed up by ffi.new() for large allocations. It was useful, and faster than ffi.new() on sizes > 128 and less than the static buffer size, but it wasn't correct to use it, for the same reason why the IBUF_SHARED global variable should not have been used as is: without proper ownership the buffer might be reused in some unexpected way. Just like with IBUF_SHARED, the static buffer could be reused during Lua GC in one of the __gc handlers. Essentially, at any moment on almost any line of a Lua script. IBUF_SHARED was fixed by a proper ownership implementation, but that is not possible with the static buffer, because there is no such thing as a static buffer object which can be owned, and even if there were, the cost of its support wouldn't be much better than for the new cord_ibuf API. That would make the static buffer close to pointless. This patch eliminates static_alloc() from Lua, and uses cord_ibuf instead almost everywhere except a couple of places where ffi.new() is good enough. Part of #5632
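The reuse hazard of an ownerless circular buffer can be shown with a toy model. All names here (`demo_static_alloc()`, `demo_gc_handler()`, the 64-byte size) are hypothetical and are not the Tarantool implementation; the "GC handler" is just an ordinary function standing in for a __gc metamethod that runs at an unexpected moment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_STATIC_SIZE = 64 };

static char demo_static_buf[DEMO_STATIC_SIZE];
static size_t demo_static_pos;

/* Toy circular allocator: hands out slices and wraps to the start when full. */
static char *
demo_static_alloc(size_t size)
{
    assert(size <= DEMO_STATIC_SIZE);
    if (demo_static_pos + size > DEMO_STATIC_SIZE)
        demo_static_pos = 0;
    char *res = &demo_static_buf[demo_static_pos];
    demo_static_pos += size;
    return res;
}

/* Stand-in for a __gc handler which also uses the static buffer. */
static void
demo_gc_handler(void)
{
    char *tmp = demo_static_alloc(DEMO_STATIC_SIZE);
    memset(tmp, 'X', DEMO_STATIC_SIZE);
}

/* Returns 1 if the caller's string survived the "GC", 0 if it was overwritten. */
static int
demo_string_survives_gc(void)
{
    static const char text[] = "address string";
    char *s = demo_static_alloc(sizeof(text));
    memcpy(s, text, sizeof(text));
    /* Imagine Lua GC kicking in here, before the string is consumed. */
    demo_gc_handler();
    return memcmp(s, text, sizeof(text)) == 0;
}
```

In this model `demo_string_survives_gc()` returns 0: nothing marks the first slice as still owned, so the handler's allocation wraps around and overwrites it.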
-
Vladislav Shpilevoy authored
static_alloc() appears not to be safe to use in Lua, because it does not provide any ownership protection for the returned values. The problem appears when something is allocated, then Lua GC starts, and some __gc handlers also use static_alloc(), in Lua or in C. Both lead to the buffer being corrupted in its original usage place. The patch is a part of the effort to get rid of static_alloc() in Lua. It removes it from the uri Lua module and makes the module use the new FFI stash feature, which helps to cache frequently used and heavy-to-allocate FFI values. In one place static_alloc() was used for an actual buffer - it was replaced with cord_ibuf, which is equally fast when preallocated. ffi.new() for a temporary struct uri is not used, because:
* It produces a new GC object;
* ffi.new('struct uri') costs around 20ns while FFI stash costs around 0.8ns. The hack with 'struct uri[1]' does not help because the size of uri is > 128 bytes;
* Without JIT ffi.new() costs about the same as the stash, so it is not better either.

The patch makes uri perf a bit better in the places where static_alloc() was used, because its cost was around 7ns for one allocation.
-
Vladislav Shpilevoy authored
The function converts struct tt_uuid * to a string. The string is allocated on the static buffer, which can't be used in Lua due to unpredictable GC behaviour. It can start working any moment even if tt_uuid_str() has returned, but its result wasn't passed to ffi.string() yet. Then the buffer might be overwritten. Lua uuid now uses tt_uuid_to_string() which does the same but takes the buffer pointer. The buffer is stored in an ffi stash, because it is x4 times faster than ffi.new('char[37]') (where 37 is length of a UUID string + terminating 0) (2.4 ns vs 0.8 ns). After this patch UUID is supposed to be fully compatible with Lua GC handlers. Part of #5632
-
Vladislav Shpilevoy authored
static_alloc() appears not to be safe to use in Lua, because it does not provide any ownership protection for the returned values. The problem appears when something is allocated, then Lua GC starts, and some __gc handlers also use static_alloc(), in Lua or in C. Both lead to the buffer being corrupted in its original usage place. The patch is a part of the effort to get rid of static_alloc() in Lua. It removes it from the uuid Lua module and makes the module use the new FFI stash feature, which helps to cache frequently used and heavy-to-allocate FFI values. ffi.new() is not used, because:
* It produces a new GC object;
* ffi.new('struct tt_uuid') costs around 300ns while FFI stash costs around 0.8ns (although it is magically fixed when ffi.new('struct tt_uuid[1]') is used);
* Without JIT ffi.new() costs about the same as the stash, ~280ns for small objects like tt_uuid.

The patch makes uuid perf a bit better in the places where static_alloc() was used, because its cost was around 7ns for one allocation.
-
Vladislav Shpilevoy authored
Buffer module now exposes ffi_stash_new() function which returns 2 functions take() and put(). FFI stash implements proper ownership of global heavy-to-create objects which can only be created via FFI. Such as structs, pointers, arrays. It should help to fix buffer's registers (buffer.reg1, buffer.reg2, buffer.reg_array), and other global FFI objects such as 'struct port_c' in schema.lua. The issue is that when these objects are global, they might be re-used right during usage in case Lua starts GC and invokes __gc handlers. Just like it happened with IBUF_SHARED and static_alloc(). Part of #5632
-
Vladislav Shpilevoy authored
The global ibuf used for hot Lua and Lua C code didn't have ownership management. As a result, it could be reused in some unexpected ways during Lua GC via __gc handlers, even if it was currently in use in some code lower in the stack. The patch makes cord_ibuf_take() steal the global buffer from its global stash and assign it to the current fiber. cord_ibuf_put() puts it back to the stash and detaches it from the fiber. If a yield happens before cord_ibuf_put(), the buffer is detached automatically. Fiber attach/detach is done via on_yield/on_stop triggers. The buffer is not supposed to survive a yield, so this allows the buffer to be freed or put back to the stash even if the owner didn't do that. For instance, if a Lua exception was raised before cord_ibuf_put() was called. This makes the cord buffer safe to use in any yield-free code, even if Lua GC might be started. And in non-Lua code as well. Part of #5632
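The take/put ownership idea can be sketched without fibers. This is a simplified model under stated assumptions: `demo_buf_take()` and `demo_buf_put()` are hypothetical names, the stash holds at most one cached buffer, and the automatic detach on yield or fiber stop described above is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_buf {
    char *data;
    size_t size;
};

/* At most one cached buffer; NULL while the buffer is owned by somebody. */
static struct demo_buf *demo_stash;

/* Steal the cached buffer, or create a fresh one if it is already taken. */
static struct demo_buf *
demo_buf_take(void)
{
    struct demo_buf *buf = demo_stash;
    if (buf != NULL) {
        demo_stash = NULL;
        return buf;
    }
    buf = calloc(1, sizeof(*buf));
    assert(buf != NULL);
    return buf;
}

/* Return the buffer to the stash, or free it if the stash is occupied. */
static void
demo_buf_put(struct demo_buf *buf)
{
    assert(buf != NULL);
    if (demo_stash == NULL) {
        demo_stash = buf;
        return;
    }
    free(buf->data);
    free(buf);
}
```

A nested take() (for example, from a handler invoked in the middle of the first user's work) gets a different object, so the first owner's contents are never touched.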
-
Vladislav Shpilevoy authored
There was a global ibuf object called tarantool_lua_ibuf. It was used in all the places working with Lua which didn't have yields, and where fiber's region could be potentially slower due to not being able to guarantee the allocated memory is contiguous. Yields during the ibuf usage were prohibited because another fiber would take the same ibuf and override its previous content which was still used by another fiber. But it wasn't taken into account that there is Lua GC. It can be invoked from any Lua function in Lua C code, and almost on any line in the Lua scripts. During GC some deleted objects might have GC handlers installed as __gc metamethods. From the handler they could call Tarantool functions, including the ones using the global ibuf. Therefore ibuf could be overridden not only at yields, but almost in any moment. Because with the Lua GC at hand, the multitasking is not strictly "cooperative" anymore. It is necessary to implement ownership for the global buffer. The patch prepares the API for this: the buffer is moved to its own file, and has methods take(), put(), and drop(). Take() is supposed to make the current fiber own the buffer. Put() makes it available again. Drop() does the same but also clears the buffer (frees its memory). The ownership itself is a subject for the next patches. Here only the API is prepared. The patch "hits" performance a little. Previously the get of buffer.IBUF_SHARED cost around 1 ns. Now cord_ibuf_take() + cord_ibuf_put() cost around 5 ns together. The next patches will make it worse, up to 15 ns until #5871 is done. Part of #5632
-
Vladislav Shpilevoy authored
In Lua, iconv_convert() tried to clear the iconv context when ffi.C.tnt_iconv() with normal arguments failed, by calling the function again with all arguments NULL. Then it looked at errno. But the second call could do anything with errno. For instance, it could also fail, and change errno. The patch saves errno into a variable before calling tnt_iconv() the second time. It still does not give perfect protection, as was discovered in scope of #5632, but it is still better. The patch is mostly motivated by the next patches about #5632, which will add another call to the error path, and that call had better come after the errno save. Needed for #5632
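The errno-saving pattern looks roughly like this in C. The functions `demo_convert()` and `demo_reset()` are hypothetical stand-ins for the failing conversion call and the context-clearing call; they are hard-wired to fail with different error codes so that the clobbering is visible.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the conversion call: always fails with EILSEQ. */
static int
demo_convert(void)
{
    errno = EILSEQ;
    return -1;
}

/* Hypothetical stand-in for the cleanup call: fails too and changes errno. */
static int
demo_reset(void)
{
    errno = EINVAL;
    return -1;
}

/*
 * Run the conversion. On failure, remember errno before the cleanup call,
 * then restore it so the caller sees the error of the original failure.
 */
static int
demo_convert_checked(int *saved_errno)
{
    assert(saved_errno != NULL);
    if (demo_convert() == 0) {
        *saved_errno = 0;
        return 0;
    }
    int e = errno;      /* save first */
    demo_reset();       /* may overwrite errno */
    errno = e;          /* restore the original error */
    *saved_errno = e;
    return -1;
}
```

Without the local variable, the caller would see EINVAL from the cleanup call instead of EILSEQ from the actual failure.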
-
Vladislav Shpilevoy authored
Code in lua/tuple.c used the global tarantool_lua_ibuf in many places, relying on it never being changed or reused by other code until a yield. But it is not so. In fact, as was discovered in #5632, GC may be started in any Lua function. Any GC handler might touch some API which also uses tarantool_lua_ibuf inside. This makes the first usage in lua/tuple.c invalid - the buffer could be reset or reallocated or its wpos/rpos could change during GC. In order to fix this, first of all there should be clear points where the buffer is taken, and where it is not needed anymore. The patch makes code in lua/tuple.c take tarantool_lua_ibuf when it is first needed, not during usage. The same is done for the fiber region for API symmetry. Part of #5632
-
Vladislav Shpilevoy authored
In the msgpack test IBUF_SHARED is used only to check that 'struct ibuf *' can be passed to encode() functions. But soon IBUF_SHARED will be deleted, and its alternative won't be yield-tolerant. This means it can't be used in this test, because there are yields between the buffer usages. In the varbinary test it is used in a way too complicated to be able to put it back normally. And otherwise its usage does not make much sense - without put() it is going to be created from scratch on every non-first usage until a yield. In the module_api test it is used to check if some function works with 'struct ibuf *'. That can be done without IBUF_SHARED. Part of #5632
-
Vladislav Shpilevoy authored
fio:pread() used buffer.IBUF_SHARED, which might be reused after a yield. As a result, if pread() was called from 2 different fibers or in parallel with something else using IBUF_SHARED, it would turn the buffer into garbage for all parallel usages. The same problem existed for read(), and was fixed in c7c24f84 ("fio: Fix race condition in fio.read"). But apparently pread() was missed. What is worse, the original commit's test passed even without the fix from that commit. Because it didn't check the results of read()s called from 2 fibers. The patch fixes pread() and adds a test covering both read() and pread(). The old test from the original commit is dropped. Follow up #3187
-
- Mar 22, 2021
-
-
Alexander Turenko authored
This update fixes a sporadic problem with hanging test-run workers. The reason is an incorrect garbage collector handler. See [1] for details. This is not the last test-run problem which leads to a hanging worker: there is at least one more known problem [2]. [1]: https://github.com/tarantool/test-run/pull/275 [2]: https://github.com/tarantool/test-run/issues/276 Part of tarantool/tarantool-qa#96
-
mechanik20051988 authored
There was an error in the test: when rand() % OSCILLATION_MAX returns 0, no memory allocation is made, so the fail_unless(obuf_capacity(&buf) > 0) check failed. A small refactoring was also done: added slab_arena_destroy for graceful resource release, removed the global seed value, removed an unused value from the enum. Closes #5345
-
Oleg Babin authored
This patch adds previously missing changelog entry. Follow-up #5451
-
- Mar 19, 2021
-
-
Vladislav Shpilevoy authored
When Lua main script was launched, the sched fiber passed its own diag to the script's fiber. When the script was finished, it put its error into the diag. The sched fiber then checked if the diag is empty to detect an error. But it wasn't really correct. The error could also happen right in the scheduler fiber in a libev callback. For example, in one of ev_io callbacks in SWIM. Then the process would end with an error even if the script was finished successfully. These errors were not related to the main fiber executing the script. The patch makes it so that the scheduler fiber's diag is no longer used as an indication of an error in the script. Instead, a new diag is created on the stack of the scheduler's fiber, where the Lua script saves the error. Closes #5864
-
Vladislav Shpilevoy authored
Swim node couldn't talk to broadcast network interfaces because the option SO_BROADCAST wasn't set. It worked fine for localhost broadcast, but failed for all the other IPs. There is no test, because the tests work for the localhost only anyway. It still fails on Mac though in case the swim node was bound to 127.0.0.1. Then for some reason sendto() raises EADDRNOTAVAIL on an attempt to broadcast beyond the local machine. It happens on Linux too, but with the EINVAL error. These errors are ignored because they are not critical. Part of #5864
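Enabling the option is a single setsockopt() call on the UDP socket. The sketch below is not the SWIM code: the function names are hypothetical, and it only creates a socket, sets SO_BROADCAST, and reads the option back.

```c
#include <assert.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Create a UDP socket with SO_BROADCAST enabled. Returns the fd or -1. */
static int
demo_udp_broadcast_socket(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 1 if SO_BROADCAST is set on the socket, 0 if not, -1 on error. */
static int
demo_broadcast_enabled(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);
    assert(fd >= 0);
    if (getsockopt(fd, SOL_SOCKET, SO_BROADCAST, &value, &len) != 0)
        return -1;
    return value != 0;
}
```

Without the option, a sendto() to a broadcast address is normally rejected by the kernel, which matches the failure described in the commit.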
-
Sergey Nikiforov authored
Was caught by base64 test with enabled ASAN. It also caused data corruption - garbage instead of "extra bits" was saved into state->result if there was no space in output buffer. Decode state removed along with helper functions. Added test for "zero-sized output buffer" case. Fixes: #3069 (cherry picked from commit 7214add2c7f2a86265a5e08f2184029a19fc184d)
-
Serge Petrenko authored
Since the introduction of asynchronous commit, which doesn't wait for a WAL write to succeed, it's quite easy to clog WAL with huge amounts of write requests. For now, it's only possible from an applier, since it's the only user of async commit at the moment. This happens when replica is syncing with master and reads new transactions at a pace higher than it can write them to WAL (see docbot request for detailed explanation). To ameliorate such behavior, we need to introduce some limit on not-yet-finished WAL write requests. This is what this commit is trying to do. A new counter is added to wal writer: queue_size (in bytes) together with a corresponding configuration setting: `wal_queue_max_size`. The counter is increased on every new submitted request, and decreased once the tx thread receives a confirmation that a specific request was written. Actually, the limit is added to an abstract journal queue, but currently works only for wal writer, since it's the only possible journal when applier is working. Once size reaches its maximum value, applier is blocked until some of the write requests are finished. The size limit isn't strict, i.e. if there's at least one free byte, the whole write request fits and no blocking is involved. The feature is ready for `box.commit{is_async=true}`. Once it's implemented, it should check whether the queue is full and let the user decide what to do next. Either wait or roll the tx back. Closes #5536 @TarantoolBot document Title: new configuration option: 'wal_queue_max_size' `wal_queue_max_size` puts a limit on the amount of concurrent write requests submitted to WAL. `wal_queue_max_size` is measured in number of bytes to be written (0 means unlimited, which was the default behaviour before). The option only affects replica behaviour at the moment, and defaults to 16 megabytes. The option limits the pace at which replica reads new transactions from master.
Here's when the option comes in handy: before this option was introduced, the following situation was possible: there are 2 servers, a master and a replica, and the replica is down for some period of time. While the replica is down, master serves requests at a reasonable pace, possibly close to its WAL throughput limit. Once the replica reconnects, it has to receive all the data master has piled up and there's no limit in speed at which master sends the data to replica, and, without the option, there was no limit in speed at which replica submitted corresponding write requests to WAL. This led to a situation when replica's WAL was never in time to serve the requests and the amount of pending requests was constantly growing. There was no limit for memory WAL write requests take, and this clogging of WAL write queue could even lead to replica using up all the available memory. Now, when `wal_queue_max_size` is set, appliers will stop reading new transactions once the limit is reached. This will let WAL process all the requests that have piled up and free all the excess memory.
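The non-strict byte limit can be modelled with a simple counter. This is a sketch, not the journal queue code: the struct and function names are hypothetical, and the blocking of the applier is reduced to an `is_full` check that the caller is expected to wait on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_queue {
    /* Bytes of submitted but not yet confirmed write requests. */
    int64_t size;
    /* Limit in bytes; 0 means unlimited. */
    int64_t max_size;
};

/* The queue is full once the counter reaches the limit. */
static bool
demo_queue_is_full(const struct demo_queue *q)
{
    return q->max_size > 0 && q->size >= q->max_size;
}

/*
 * Submit a request. The limit is not strict: if at least one byte is
 * free, the whole request is accepted, even if it overshoots the limit.
 */
static void
demo_queue_submit(struct demo_queue *q, int64_t req_size)
{
    assert(!demo_queue_is_full(q));
    q->size += req_size;
}

/* Called when a confirmation arrives that the request was written. */
static void
demo_queue_complete(struct demo_queue *q, int64_t req_size)
{
    assert(q->size >= req_size);
    q->size -= req_size;
}
```

With a limit of 100 bytes and 99 bytes pending, a 50-byte request is still accepted; the queue then reports full until enough earlier requests are confirmed.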
-
mechanik20051988 authored
Implemented the on_shutdown API, which allows registering functions that will be called when Tarantool is stopped. Functions will be called in the reverse order of their registration. So the module developer registers one function that starts module termination and waits for its completion. This function should be fast or use an asynchronous waiting mechanism (coio_wait or cord_cojoin for example). Closes #5723 @TarantoolBot document Title: Implement on_shutdown API Implemented the on_shutdown API, which allows registering functions that will be called when Tarantool is stopped. Functions will be called in the reverse order of their registration. So the module developer registers one function that starts module termination and waits for its completion. This function should be fast or use an asynchronous waiting mechanism (coio_wait or cord_cojoin for example).
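The reverse-order calling convention can be illustrated with a small handler table. Everything here is hypothetical (`demo_on_shutdown()`, `demo_run_shutdown()`, the fixed-size array); the real API's signature is not given in the commit message, and handlers in this sketch run synchronously.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*demo_shutdown_f)(void *arg);

enum { DEMO_MAX_HANDLERS = 16 };

struct demo_handler {
    demo_shutdown_f fn;
    void *arg;
};

static struct demo_handler demo_handlers[DEMO_MAX_HANDLERS];
static size_t demo_handler_count;

/* Register a handler. Returns 0 on success, -1 if the table is full. */
static int
demo_on_shutdown(demo_shutdown_f fn, void *arg)
{
    assert(fn != NULL);
    if (demo_handler_count == DEMO_MAX_HANDLERS)
        return -1;
    demo_handlers[demo_handler_count].fn = fn;
    demo_handlers[demo_handler_count].arg = arg;
    demo_handler_count++;
    return 0;
}

/* Call handlers from the most recently registered to the oldest one. */
static void
demo_run_shutdown(void)
{
    while (demo_handler_count > 0) {
        struct demo_handler *h = &demo_handlers[--demo_handler_count];
        h->fn(h->arg);
    }
}

/* Sample handler for demonstration: appends one character to a log. */
static char demo_log[DEMO_MAX_HANDLERS + 1];
static size_t demo_log_len;

static void
demo_record(void *arg)
{
    if (demo_log_len < DEMO_MAX_HANDLERS) {
        demo_log[demo_log_len++] = *(const char *)arg;
        demo_log[demo_log_len] = '\0';
    }
}
```

Registering handlers tagged 'a', 'b', 'c' in that order and then running the shutdown sequence leaves "cba" in the log, so a module registered later is stopped before the ones it may depend on.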
-
mechanik20051988 authored
Previously Lua on_shutdown triggers were started sequentially; now each trigger starts in a separate fiber. Tarantool waits up to 3.0 seconds for their completion by default. The user can change this value using the newly implemented box.ctl.set_on_shutdown_timeout function. If the timeout expires, Tarantool stops immediately, without waiting for the other triggers to complete. Also moved ev_break from the trigger to the on_shutdown_f function, after calling all on_shutdown Lua triggers, because now all triggers are started asynchronously in fibers, and we should call ev_break only after all triggers are finished. Part of #5723 @TarantoolBot document Title: Changed Lua on_shutdown triggers behaviour. Previously Lua on_shutdown triggers were started sequentially; now each trigger starts in a separate fiber. Tarantool waits up to 3.0 seconds for their completion by default. The user can change this value using the newly implemented box.ctl.set_on_shutdown_timeout function. If the timeout expires, Tarantool stops immediately, without waiting for the other triggers to complete.
-
mechanik20051988 authored
Since the function for registering on_shutdown triggers for tarantool modules was decided to be named box_on_shutdown, the head of the trigger list with a similar name was renamed. Part of #5723
-
mechanik20051988 authored
Implemented function for starting a chain of triggers in separate fibers, which is required for on_shutdown API implementation. Part of #5723
-
mechanik20051988 authored
Implemented the fiber_join_timeout function, which allows waiting for the completion of a fiber for a specified period of time. The function returns the fiber execution status to the caller, or -1 with diag set if the timeout is exceeded. Needed for further on_shutdown API implementation. Part of #5723
-
mechanik20051988 authored
Renamed granularity option to slab_alloc_granularity, according to the name of the other options for small allocator. Follow-up #5518
-
- Mar 18, 2021
-
-
Alexander Turenko authored
This test-run update offers fixes of two problems:
* Unhandled OSError exception that occurs rarely, under a heavy load (see [1]).
* The 'attempt to compare nil with number' error on test_run:wait_lsn(), when an instance is just bootstrapped (see [2]).

[1]: https://github.com/tarantool/test-run/issues/270
[2]: https://github.com/tarantool/test-run/issues/226
-
- Mar 17, 2021
-
-
Sergey Kaplun authored
LuaJIT submodule is bumped to introduce the following changes:
* test: disable LuaJIT CLI tests in lua-Harness suite
* test: set USERNAME env var for lua-Harness suite
* test: adjust lua-Harness tests that use dofile
* test: adjust lua-Harness suite to CMake machinery
* test: add lua-Harness test suite

Within this changeset lua-Harness suite[1] is added to Tarantool testing. Considering Tarantool specific changes in runtime the suite itself is adjusted in LuaJIT submodule. However, Tarantool provides and unconditionally loads TAP module conflicting with the one used in the new suite. Hence, the Tarantool built-in module is "unloaded" in test/luajit-test-init.lua. Furthermore, Tarantool provides UTF-8 support via another built-in module. Its interfaces differ from the ones implemented in Lua5.3 and moonjit. At the same time our LuaJIT fork provides no UTF-8 support, so lua-Harness UTF-8 detector is simply confused with non-nil utf8 global variable. As a result, utf8 is set to nil in test/luajit-test-init.lua. There are also some tests launching Lua interpreter, so strict needs to be disabled for their child tests too. Hence `strict.off()` is added to `progname` (i.e. arg[-1] considering the way Tarantool parses its CLI arguments) command used in these tests.

[1]: https://framagit.org/fperrad/lua-Harness/tree/a74be27/test_lua
Closes #5844
Part of #4473
Reviewed-by: Sergey Ostanevich <sergos@tarantool.org>
Reviewed-by: Igor Munkin <imun@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
-
Nikita Pettik authored
xdir_say_gc() takes errno and the return code of the unlink() system call. If the return code is negative (meaning that unlink failed), we reset errno to the given value and log the corresponding error message (it is done this way since eio saves errno to an internal structure, so we have to restore it manually). Before this patch, the unlink() call was made "in-place" as an argument. However, the order of argument evaluation is unspecified. So it may turn out that we set errno to its previous value, which is obviously wrong. To fix it, let's first invoke unlink() and then pass the result of the call to xdir_say_gc().
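The evaluation-order pitfall can be reproduced in a few lines. The functions below are hypothetical stand-ins (`demo_say_unlink()` is not xdir_say_gc(), and the extra path argument is an assumption); the sketch only shows why the call has to be sequenced before errno is read.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical logger: takes the saved errno and the return code of
 * unlink(). Returns the reported error code, or 0 if nothing failed.
 */
static int
demo_say_unlink(int saved_errno, int rc, const char *path)
{
    if (rc >= 0)
        return 0;
    errno = saved_errno;
    fprintf(stderr, "unlink(%s) failed: %s\n", path, strerror(saved_errno));
    return saved_errno;
}

static int
demo_remove(const char *path)
{
    assert(path != NULL);
    /*
     * Fragile form: demo_say_unlink(errno, unlink(path), path).
     * C does not specify which argument is evaluated first, so errno
     * could be read before unlink() has had a chance to set it.
     */
    int rc = unlink(path);
    int saved_errno = errno;
    return demo_say_unlink(saved_errno, rc, path);
}
```

Splitting the call into separate statements puts sequence points between unlink() and the read of errno, so the value that gets logged always belongs to this unlink() call.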
-
Alexander V. Tikhonov authored
Added manual and backend triggers to run test workflows. It will give the ability to run missed/needed workflows in Github Actions and to use standalone backend scripts to run test workflows.
-
- Mar 16, 2021
-
-
Cyrill Gorcunov authored
If there is only one snapshot or xlog file, there is no need to call the sorting procedure at all. In-scope-of #5806
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
Currently we use unsigned int for "cleanup" schedule counting; this is safe while this routine is not called too often. Still there is a chance to hit a number wrap on code modification because there is no strict rule on how to use this garbage collector. Let's use wide integers instead: we have only one gc instance, and such an approach eliminates potential problems in the future (actually this should have been done from the beginning, since the current gc code flow was developed without wrapping in mind). In-scope-of #5806
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
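Why a wrapping counter is dangerous for "has the scheduled cleanup completed yet" checks can be shown with a deliberately tiny type. This is an illustration only: the names are hypothetical, and an 8-bit counter is used instead of unsigned int so that the wrap happens after 256 steps rather than billions.

```c
#include <assert.h>
#include <stdint.h>

/* Narrow counter: scheduling cleanup number 256 wraps the counter to 0. */
static uint8_t
demo_schedule_narrow(uint8_t scheduled)
{
    return (uint8_t)(scheduled + 1);
}

/* Wide counter: wrapping is out of reach in practice. */
static uint64_t
demo_schedule_wide(uint64_t scheduled)
{
    assert(scheduled < UINT64_MAX);
    return scheduled + 1;
}

/* "Has every scheduled cleanup completed?" check on a narrow counter. */
static int
demo_cleanup_done_narrow(uint8_t completed, uint8_t scheduled)
{
    return completed >= scheduled;
}

/* The same check on a wide counter. */
static int
demo_cleanup_done_wide(uint64_t completed, uint64_t scheduled)
{
    return completed >= scheduled;
}
```

After the narrow counter wraps, a waiter that has seen only 250 completed cleanups concludes that everything scheduled is done, while the wide counter still reports pending work.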
-
Igor Munkin authored
After porting LuaJIT build system to CMake in commit 07c83aab ('build: adjust LuaJIT build system'), its build options are not fully maintained in Tarantool. E.g. several compile flags, such as -fomit-frame-pointer, are set within LuaJIT CMake machinery and there is no way to tweak them outside. As a result ASAN + LSAN build in Tarantool CI[1] reports new leaks related to LuaJIT runtime, but there is none of them actually (no source code changes are made in scope of the applied patchset). Hence it was decided to consider all LuaJIT related warnings as false positives for now and suppress them until #5878 is resolved.

[1]: https://github.com/tarantool/tarantool/runs/1999839396
Follows up #4862
Relates to #5878
Reviewed-by: Alexander V. Tikhonov <avtikhon@tarantool.org>
Reviewed-by: Kirill Yukhin <kyukhin@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
-
- Mar 15, 2021
-
-
Sergey Bronnikov authored
Closes #5652
-