- Feb 28, 2020
-
-
Alexander Turenko authored
Found another problem with the test: | /builds/DtQXhC5e/0/tarantool/tarantool/test/unit/popen.c:63:6: | error: variable 'rc' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized] | if (handle == NULL) | ^~~~~~~~~~~~~~ Decided to revert it and fix it in the background. This reverts commit 40a51647.
-
Alexander Turenko authored
Found another problem with the test: | /builds/DtQXhC5e/0/tarantool/tarantool/test/unit/popen.c:63:6: | error: variable 'rc' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized] | if (handle == NULL) | ^~~~~~~~~~~~~~ Decided to revert the test and so revert its disabling. This reverts commit bceaf05c.
-
Cyrill Gorcunov authored
This test is buggy and needs a rewrite. Thus, to not block other developers who refer to the master branch, just disable it. The test was added in 40a51647 ('test: unit/popen'). Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by:
Alexander Turenko <alexander.turenko@tarantool.org>
-
Cyrill Gorcunov authored
Fix for commit f58cb606 ('popen: introduce a backend engine'). Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by:
Alexander Turenko <alexander.turenko@tarantool.org>
-
- Feb 27, 2020
-
-
Cyrill Gorcunov authored
Basic tests for the popen engine. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
In this patch we introduce the popen backend engine, which provides a way to execute external programs and communicate with their stdin/stdout/stderr streams. It is possible to run a child process with:
a) completely closed stdX descriptors
b) /dev/null provided as the appropriate stdX descriptors
c) a new transport passed into the child (currently we use pipes for this sake, but it may be extended to tty/sockets)
d) stdX inherited from the parent, in other words do nothing
On tarantool start we create the @popen_pids_map hash which maps PIDs of created processes to the popen_handle structure; this structure keeps everything needed to control and communicate with the children. The hash allows us to find a child process quickly from inside a signal handler. Each handle is linked into the @popen_head list, which is needed to be able to destroy child processes during the exit procedure (i.e. when we exit tarantool and need to clean up the resources used). Every new process is born by a vfork() call - we can't use fork() because of the at_fork() handlers in libeio which cause a deadlock in internal mutex usage. Thus the caller waits until vfork() finishes its work and runs exec (or exits with an error). Because child processes run without any limitations, they can exit on their own or be killed by some third party (say, a user of the hw node), so we need to watch their state, which is done by setting a hook with the ev_child_start() helper. This helper allows us to catch SIGCHLD when a child gets exited/signaled and unregister it from the pool of currently running children. Note that the libev wait() reaps the child zombie by itself. Another interesting detail is that libev catches the signal in an async way, but our SIGCHLD hook is called in a sync way before the child is reaped. This engine provides the following API:
- popen_init to initialize the engine
- popen_free to finalize the engine and free all resources allocated so far
- popen_new to create a new child process and start it
- popen_delete to release the occupied resources and terminate a child process
- popen_stat to fetch statistics about a child process
- popen_command to fetch the command line string used on the popen object creation
- popen_write_timeout to write data into a child's stdin with a timeout
- popen_read_timeout to read data from a child's stdout/stderr with a timeout
- popen_state to fetch the state (alive, exited or killed) and exit code of a child process
- popen_state_str to get the state of a child process in string form, mostly for Lua usage
- popen_send_signal to send a signal to a child process (for example to kill it)
Known issues to fix in the next series:
- environment variables for non-linux systems do not support inheritance for now due to lack of testing on my side;
- for linux based systems we use the pipe2 system call passing the O_CLOEXEC flag, so that two concurrent popen_create calls would not affect each other with pipe inheritance (while currently we don't have a case where concurrent calls could be done, as far as I know, it is still better to be on the safe side from the beginning);
- there are some files (such as xlog) which tarantool opens for its own needs without setting the O_CLOEXEC flag, and they get propagated to the child process; for linux based systems we use the close_inherited_fds helper, which walks over the opened files of a process and closes them. But for other targets like Mach-O or FreeBSD this helper is simply zapped, because I don't have such machines to experiment with; we should investigate this in more detail later once the base code is merged in;
- we need to consider a case where we will be using piping for descriptors (for example we might be writing into a child's stdin from another pipe; for this sake we could use the splice() syscall, which would be way faster than copying data between processes inside the kernel). Still the question is -- do we really need it? Since we use internal flags in the popen handle, it should not be a big problem to extend these interfaces; this particular feature is considered to have a very low priority, but I left it here just to not forget. Part-of #4031 Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
There is no reason to hide these functions. In particular, we will use these helpers in the popen code. Part-of #4031 Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com>
-
Maria authored
The commit e8009f41 ('box: user.grant error should be versatile') did not do proper clean-up: it granted non-default privileges to user 'guest' and did not revoke them at the end. That caused occasional failures of other tests, all with the same error saying that user 'guest' already had access on universe. This case should be handled by test-run in the future, see [1]. [1]: https://github.com/tarantool/test-run/issues/156 Follows up #714
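A minimal Lua sketch of the clean-up pattern such tests are expected to follow (the privilege list here is illustrative, not taken from the affected test):

    -- grant the extra privileges the test needs
    box.schema.user.grant('guest', 'read,write,execute', 'universe')
    -- ... the test workload using the extra privileges ...
    -- revoke them so later tests start from the default state
    box.schema.user.revoke('guest', 'read,write,execute', 'universe')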
-
Alexander Turenko authored
After the #4736 regression fix (in fact it just reverts the new logic in small) it is possible again that a fiber's region may hold memory for a while, but release it eventually. When the used memory exceeds the 128 KiB threshold, fiber_gc() puts 'garbage' slabs back to the slab_cache and subtracts them from the region_used() metric. But until this point those slabs are accounted in region_used() and so in the fiber.info() metrics. This commit fixes flakiness of test cases of the following kind: | fiber.info()[fiber.self().id()].memory.used -- should be zero | <...workload...> | fiber.info()[fiber.self().id()].memory.used -- should be zero The problem is that the first `<...>.memory.used` value may be non-zero. It depends on the previous tests that were executed on this tarantool instance. The obvious way to solve it would be to print the differences between the `<...>.memory.used` values before and after a workload instead of absolute values. This however does not work, because the first slab in a region can be almost fully used at the point where a test case starts, and the next slab will be acquired from the slab_cache. This means that the previous slab becomes 'garbage' and will not be collected until the 128 KiB threshold is exceeded: the latter `<...>.memory.used` check would return a bigger value than the former one. However, if the threshold is reached during the workload, the latter check may show a lesser value than the former one. In short, the test case would be unstable after this change. It is resolved by restarting a tarantool instance before such test cases to ensure that there are no 'garbage' slabs in the current fiber's region. Note: this works only if a test case reserves only one slab at a time: otherwise some memory may be held after the case (and so a memory check after the workload will fail). However it seems that our cases are small enough not to trigger this situation. A call of region_free() would be enough, but we have no Lua API for it. Fixes #4750.
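A minimal Lua sketch of the check pattern described above, as it could look in a console test (the workload itself is illustrative):

    -- restart the instance first so no 'garbage' slabs are left in the fiber's region
    local fiber = require('fiber')
    fiber.info()[fiber.self().id()].memory.used  -- expected to be 0
    -- <...workload that allocates on the fiber region, e.g. a space insert...>
    fiber.info()[fiber.self().id()].memory.used  -- expected to be 0 again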
-
- Feb 25, 2020
-
-
Maria authored
Calling prepare and execute did not update the corresponding request statistics in the box.stat table. This happened because the methods that collect stats were never called where they should have been. Closes #4756
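A minimal Lua sketch of what the fix makes observable, assuming the EXECUTE counter exposed by box.stat():

    local before = box.stat().EXECUTE.total
    box.execute([[SELECT 1;]])
    assert(box.stat().EXECUTE.total == before + 1)  -- the counter now grows per request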
-
Maria authored
The error message on already granted privileges was not flexible and did not distinguish between universal and other privileges, leaving either 'nil' or simply '' at the end. Closes #714
-
- Feb 24, 2020
-
-
Vladislav Shpilevoy authored
The bug was in an attempt to update a record in _space_sequence in-place, to add the field path and number. This was not properly supported by the system space's trigger, and was banned in the previous patch of this series. But delete + tuple update + insert work fine. The patch uses them. To test it, the old disabled and heavily outdated xlog/upgrade.test.lua was replaced with a smaller analogue, which is supposed to be created separately for each upgrade bug, according to the new policy of creating test files. The patch tries to make it easy to add new upgrade tests and snapshots. A new test should consist of a fill.lua script to populate spaces, a snapshot, the needed xlogs, and a .test.lua file. The fill script and binaries should be in a folder with the same name as the test file, located in a version folder. Like this:
xlog/
|
+- <test_name>.test.lua
|
+- upgrade/
   |
   +- <version>/
      |
      +- <test_name>/
         |
         +- fill.lua
         +- *.snap
         +- *.xlog
The version is supposed to say explicitly what version the files in there have. Closes #4771
-
Vladislav Shpilevoy authored
Anyway, this does not work for generated sequences. Proper support of update would complicate the code and would not give anything useful. Part of #4771
-
Vladislav Shpilevoy authored
Before doing anything, box.internal.bootstrap() turns off system space triggers, because it is likely to do some hard changes violating existing rules, and it eliminates data from all system spaces to fill them from scratch. Each time a new space was added, its erasure and the disabling of its triggers had to be done explicitly here. As a result it was sometimes not done, by accident. For example, triggers were not turned off for _sequence_data, _sequence, _space_sequence. Content removal wasn't done for _space_sequence. The patch makes a generic solution which does not require manual patching of trigger manipulation and truncation anymore. The bug was discovered while working on #4771, although it is not related.
-
Cyrill Gorcunov authored
In case we are unable to revert the guard page back to read|write, we should never use such a slab again. Initially I thought of just putting panic here and exiting, but it is too destructive. I think it is better to print an error and continue. If the node admin ignores this message, then at some moment in the future there won't be a slab left for use and creating new fibers will be prohibited. In the future (hopefully a near one) we plan to drop guard pages to prevent VMA fragmentation and use stack marks instead. Reviewed-by:
Alexander Turenko <alexander.turenko@tarantool.org> Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
Both madvise and mprotect calls can fail for various reasons, mostly because of a lack of free memory in the system. We log such cases via the say_x helpers, but this is not enough. In particular, tarantool/memcached relies on the diag error being set to detect an error condition: | expire_fiber = fiber_new(name, memcached_expire_loop); | const box_error_t *err = box_error_last(); | if (err) { | say_error("Can't start the expire fiber"); | say_error("%s", box_error_message(err)); | return -1; | } Thus let's use the diag_set() helper here, and instead of macros use inline functions for better readability. Fixes #4722 Reported-by:
Alexander Turenko <alexander.turenko@tarantool.org> Reviewed-by:
Alexander Turenko <alexander.turenko@tarantool.org> Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com>
-
Chris Sosnin authored
Providing a socket pathname longer than UNIX_PATH_MAX to socket.tcp_server() did not cause any error: lbox_socket_local_resolve would just truncate the name to the limit, causing bad behavior (tarantool would try to access a socket which doesn't exist). Thus, let's verify that the pathname can fit into the buffer. Closes #4634
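A minimal Lua sketch of the case that is now expected to fail loudly instead of silently truncating the path (the path and handler are illustrative):

    local socket = require('socket')
    local too_long = '/tmp/' .. string.rep('x', 200) .. '.sock'  -- longer than UNIX_PATH_MAX
    -- after the fix this should raise an error instead of listening on a truncated path
    local ok, err = pcall(socket.tcp_server, 'unix/', too_long, function(s) s:close() end)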
-
- Feb 21, 2020
-
-
Alexander V. Tikhonov authored
Our S3-based repositories now reflect the packagecloud.io repository structure. It will allow us to migrate from packagecloud.io w/o much complicating the redirection rules on the web server serving download.tarantool.org. Deploy source packages (*.src.rpm) into a separate 'SRPM' repository, like packagecloud.io does. Changed the repository signing key from its subkey to the public one and moved it to the gitlab-ci environment. Follows up #3380
-
- Feb 20, 2020
-
-
Cyrill Gorcunov authored
To look similar to txn_complete. Acked-by:
Konstantin Osipov <kostja.osipov@gmail.com> Acked-by:
Nikita Pettik <korablev@tarantool.org> Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
Instead of the on_done prefix use on_complete, since 'done' is too general while we're trying to complete the write procedure. It is also more consistent with the txn_complete name. Acked-by:
Konstantin Osipov <kostja.osipov@gmail.com> Acked-by:
Nikita Pettik <korablev@tarantool.org> Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
We're returning an int64_t with values 0 or -1 by now; there is no need for such a big return type, a plain integer is enough. Acked-by:
Nikita Pettik <korablev@tarantool.org> Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
Using void explicitly in functions which take no arguments allows the compiler to optimize the code a bit and not to assume that there might be variable arguments. Moreover, in commit e070cc4d we dropped the arguments from txn_begin but didn't update vy_scheduler.c. The compiler didn't complain because it assumed there were varargs. Acked-by:
Konstantin Osipov <kostja.osipov@gmail.com> Acked-by:
Nikita Pettik <korablev@tarantool.org> Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com>
-
Alexander V. Tikhonov authored
Found that on 19.02.2020 the APT repositories with packages for Ubuntu 18.10 Cosmic were removed from the Ubuntu archive:
E: The repository 'http://security.ubuntu.com/ubuntu cosmic-security Release' does not have a Release file.
E: The repository 'http://archive.ubuntu.com/ubuntu cosmic Release' does not have a Release file.
E: The repository 'http://archive.ubuntu.com/ubuntu cosmic-updates Release' does not have a Release file.
E: The repository 'http://archive.ubuntu.com/ubuntu cosmic-backports Release' does not have a Release file.
Also found the half-a-year-old message about the Ubuntu 18.10 Cosmic EOL: https://fridge.ubuntu.com/2019/07/19/ubuntu-18-10-cosmic-cuttlefish-end-of-life-reached-on-july-18-2019/ Removed Ubuntu 18.10 Cosmic from gitlab-ci and travis-ci testing.
-
Vladislav Shpilevoy authored
The server used to crash when any option argument was passed with a value concatenated to it, like this: '-lvalue', '-evalue' instead of '-l value' and '-e value'. However this is a valid way of writing values, and it should not have crashed regardless of its validity. The bug was in the usage of the 'optind' global variable from the getopt() function family. It is not supposed to be used for getting an option's value. It points to the next argv to parse. The next argv != the value of the current argv, as it was with '-lvalue' and '-evalue'. For getting the current value there is the 'optarg' variable. Closes #4775
-
Vladislav Shpilevoy authored
There was a bug that float +/- float could result in infinity even if the result fits into a double. It was fixed by storing a double or a float depending on the result value. But it didn't take the result field type into account. That led to a bug when a double field +/- a value fit into the float range and could be stored as a float, resulting in an error at an attempt to create a tuple. Now if a field type is double in the tuple format, it will always store a double, even if the value fits into the float range. Follow-up #4701
-
Vladislav Shpilevoy authored
Currently xrow_update sizeof + store are used to calculate the result tuple's size, preallocate it as a monolithic memory block, and save the update tree into it. Sizeof was expected to return the exact memory size needed for the tuple. But this is not useful when the result size of a field depends on its format type, and therefore on its position in the tuple. Because in that case sizeof would need to care about the tuple format and walk the format tree just like store does now. Or it would be necessary to save the found json tree nodes into struct xrow_update_op during the sizeof calculation. All of this would make the sizeof code more complex. The patch makes it possible for sizeof to return the maximal needed size. So, for example, a floating point field size now returns the size needed for encoding a double. And then store can encode either a double or a float. Follow-up #4701
-
Vladislav Shpilevoy authored
The tuple format is now passed to the xrow_update routines. It is going to be used for two purposes:
- find the field types of the result tuple fields; it will be used to decide whether a floating point value should be saved with single or double precision;
- in the future, use the format and the tuple offset map to find target fields in O(1), without decoding anything; may be especially useful for JSON updates of indexed fields.
For the types, the format is passed to the *_store() functions. The types can't be calculated earlier, because the '!' and '#' operations change the field order. Even if a type were calculated during operation application for field N, an operation '!' or '#' on a field < N would make this calculation useless. Follow-up #4701
-
Vladislav Shpilevoy authored
Before the patch there were the rules: * float +/- double = double * double +/- double = double * float +/- float = float The rules were applied regardless of values. That led to a problem when float + float exceeding the maximal float value could fit into a double, but was stored as infinity. The patch makes it so that if a floating point arithmetic operation result fits into a float, it is stored as a float; otherwise as a double, regardless of the initial types. Alongside, this saves some memory for cases when doubles can be stored as floats, and therefore take 4 bytes less. Although these cases are rare, because any non-integer value stored in a double may have a long garbage tail in its fraction. Closes #4701
-
- Feb 19, 2020
-
-
Vladislav Shpilevoy authored
os.setenv() and os.environ() are the Lua API for extern char **environ; int setenv(); - The Open Group standardized access points for environment variables. But there is not a word saying that environ never changes. Programs can't rely on that. For example, addition of a new variable may cause a realloc of the whole environ array, and therefore a change of its pointer value. That was exactly the case in os.environ() - it was using the value of the environ array remembered when Tarantool started. And os.setenv() could realloc the array and turn the saved pointer into garbage. Closes #4733
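A minimal Lua sketch of the fixed behavior (the variable name is illustrative):

    os.setenv('MY_TEST_VAR', 'value')     -- may realloc the underlying environ array
    print(os.environ()['MY_TEST_VAR'])    -- 'value': os.environ() no longer reads a stale pointer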
-
Kirill Yukhin authored
Revert "build: introduce LUAJIT_ENABLE_PAIRSMM flag" Related to #4770
-
- Feb 18, 2020
-
-
Alexander V. Tikhonov authored
Enabled Tarantool performance testing on Gitlab-CI for release/master branches and branches named "*-perf". For this purpose the 'perf' and 'cleanup' stages were added into the Gitlab-CI pipeline. Performance testing supports the following benchmarks:
- cbench
- linkbench
- nosqlbench (hash and tree Tarantool run modes)
- sysbench
- tpcc
- ycsb (hash and tree Tarantool run modes)
The benchmarks use scripts from the repository: http://github.com/tarantool/bench-run Performance testing uses docker images built with the docker files from the bench-run repository:
- perf/ubuntu-bionic:perf_master -- the parent image with benchmarks only
- perf_tmp/ubuntu-bionic:perf_<commit_SHA> -- child images used for testing Tarantool sources
@Totktonada: Harness and workloads are to be reviewed.
-
Oleg Babin authored
After 7fd6c809 ('buffer: port static allocator to Lua') uri started to use the static allocator - a cyclic buffer that is also used in several other modules. However, the situation when the uri.format output is a zero-length string was not handled properly, and ffi.string could return data that was previously written into the static buffer, because it uses the first zero byte as the string terminator. To prevent such a situation, let's pass the result length explicitly. Closes #4779
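A minimal sketch of the underlying ffi.string behavior that caused the bug (the buffer content is illustrative):

    local ffi = require('ffi')
    local buf = ffi.new('char[8]', 'stale')   -- leftover data from a previous user of the buffer
    -- without a length ffi.string scans up to the first zero byte:
    ffi.string(buf)        -- 'stale', even if the new result is actually empty
    -- with an explicit length the result is exact:
    ffi.string(buf, 0)     -- ''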
-
- Feb 17, 2020
-
-
Oleg Babin authored
Some of our users want to have a native method to check whether a specified value is a 'decimal' or not. This patch introduces the 'is_decimal' check in the 'decimal' module. Closes #4623 @TarantoolBot document Title: decimal.is_decimal The is_decimal check function returns "true" if the specified value is a decimal and "false" otherwise.
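A minimal usage sketch:

    local decimal = require('decimal')
    decimal.is_decimal(decimal.new('1.5'))  -- true
    decimal.is_decimal(1.5)                 -- false: a plain Lua number is not a decimal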
-
- Feb 15, 2020
-
-
Olga Arkhangelskaia authored
When json.decode is used with 2 arguments, the 2nd argument seeps out into the json configuration of the instance. Moreover, due to the current serializer.cfg implementation, it remains invisible while checking the settings using the json.cfg table. This fixes commit 6508ddb7 ('json: fix stack-use-after-scope in json_decode()'). Closes #4761
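A minimal Lua sketch of the expected behavior, assuming the per-call options that json.decode accepts (decode_max_depth with a default of 128 is used as the example):

    local json = require('json')
    -- the per-call option must affect only this call...
    json.decode('{"a": {"b": 1}}', {decode_max_depth = 5})
    -- ...and must not leak into the module-wide configuration:
    assert(json.cfg.decode_max_depth == 128)  -- still the default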
-
Vladislav Shpilevoy authored
box_process_call/eval() at the end check if there is an active transaction. If there is, it is rolled back, and an error is set. But the rollback is not needed anymore, because at the end of the request the fiber is stopped anyway, and its unfinished transaction is rolled back. Just setting the error is enough. Follow-up #4662
-
Vladislav Shpilevoy authored
Fiber.storage was not deleted when it was created in a fiber started from the thread pool used by IProto requests. The problem was that fiber.storage was created and deleted in Lua land only, assuming that only Lua-born fibers could have it. But in fact any fiber can create a Lua storage, including the ones used to serve IProto requests. The non-deletion of the storage led to a possibility of meeting a non-empty fiber.storage at the beginning of an IProto request, and to the memory taken by the storage not being freed until its explicit nullification. Now the storage destructor works for any fiber which managed to create the storage. The destructor unrefs and nullifies the storage. For the destructor's purposes the fiber.on_stop triggers were reworked. Now they can be called multiple times during a fiber's lifetime: after every request done by that fiber. Closes #4662 Closes #3462 @TarantoolBot document Title: Clarify fiber.storage lifetime Fiber.storage is a Lua table created when it is first accessed. On the site it is said that it is deleted when the fiber is canceled via fiber:cancel(). But that is not the full truth. Fiber.storage is destroyed when the fiber is finished, regardless of how it is finished - via :cancel() or the fiber's function simply returning, it does not matter. Moreover, from that moment the storage is cleaned up even for pooled fibers used to serve IProto requests. Pooled fibers never really die, but nonetheless their storage is cleaned up after each request. That makes it possible to use fiber.storage as a full featured request-local storage. Fiber.storage may be created for a fiber no matter how the fiber itself was created - from C or from Lua. For example, a fiber could be created in C using fiber_new(), then it could insert into a space which has Lua on_replace triggers, and one of the triggers could create fiber.storage. That storage will be deleted when the fiber is stopped. Another place where fiber.storage may be created is the replication applier fiber. The applier has a fiber from which it applies transactions from a remote instance. In case the applier fiber somehow creates a fiber.storage (for example, from a space trigger again), the storage won't be deleted until the applier fiber is stopped.
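A minimal Lua sketch of fiber.storage used as request-local storage (the function is illustrative; it would be invoked via IProto, e.g. through a connector's call()):

    local fiber = require('fiber')

    local function handler()
        local storage = fiber.self().storage
        -- with the fix the storage is wiped after each pooled request,
        -- so this always starts from an empty table
        storage.counter = (storage.counter or 0) + 1
        return storage.counter
    end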
-
- Feb 14, 2020
-
-
Vladislav Shpilevoy authored
Fiber.storage is a table available from anywhere in the fiber. It is destroyed after the fiber function is finished. That provides a reliable fiber-local storage, similar to thread-local in C/C++. But there is a problem that the storage may be created via one struct lua_State, and destroyed via another. Here is an example:
function test_storage() fiber.self().storage.key = 100 end
box.schema.func.create('test_storage')
_ = fiber.create(function() box.func.test_storage:call() end)
There are 3 struct lua_State involved: tarantool_L - the global, always alive state; L1 - the Lua coroutine of the fiber created by fiber.create(); L2 - the Lua coroutine created by that fiber to execute test_storage(). Fiber.storage is created on the stack of L2 and referenced by the global LUA_REGISTRYINDEX. Then it is unreferenced from L1 when the fiber is being destroyed. That is generally ok as long as the storage object is always in LUA_REGISTRYINDEX, which is shared by all Lua states. But soon, during the destruction of fiber.storage, there will be only tarantool_L and the original L2, and the original L2 may be already deleted by the time the storage is being destroyed. So this patch makes the unref of the storage go via the reliable tarantool_L. Needed for #4662
-
Cyrill Gorcunov authored
Every new error introduced into the error engine causes a massive update in the test even if only one key is introduced. To minimize the diff output, better to print them in sorted order. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by:
Vladislav Shpilevoy <v.shpilevoy@tarantool.org> Reviewed-by:
Alexander Turenko <alexander.turenko@tarantool.org>
-
Sergey Kaplun authored
We already have a 12 KB thread-safe static buffer in `lib/small/small/static.h` that can be used instead of the 16 KB BSS buffer in `src/lib/core/backtrace.cc` for the backtrace payload. Closes #4650
-
- Feb 12, 2020
-
-
Nikita Pettik authored
During decoding of a value fetched from a space's field, FP representation was forced in case the type of the field was NUMBER. It was so since NUMBER used to substitute the DOUBLE field type (in fact NUMBER mimicked the DOUBLE type). Since DOUBLE is now a separate field type, there's no such necessity. Hence from now on integers from a NUMBER field are treated as integers. Implemented by Mergen Imeev <imeevma@gmail.com> Closes #4233 @TarantoolBot document Title: NUMBER column type changes From now on NUMBER behaves in the same way as in NoSQL Tarantool. Previously, NUMBER was rather a synonym for what DOUBLE now means: it used to force the floating point representation of values, even if they were integers. A few examples:
1) CAST operation:
Obsolete behaviour:
SELECT CAST(922337206854774800 AS NUMBER), CAST(5 AS NUMBER) / 10;
---
rows:
- [922337206854774784, 0.5]
New behaviour:
SELECT CAST(922337206854774800 AS NUMBER), CAST(5 AS NUMBER) / 10;
---
rows:
- [922337206854774800, 0]
Obsolete behaviour:
SELECT CAST(true AS NUMBER);
---
- null
- 'Type mismatch: can not convert TRUE to number'
...
New behaviour:
SELECT CAST(true AS NUMBER);
---
rows:
- [1]
...
CAST of a boolean to NUMBER is allowed since it is allowed to convert booleans to integers; in turn, NUMBER comprises the integer type.
2) Preserving integer representation:
Obsolete behaviour:
CREATE TABLE t (n NUMBER PRIMARY KEY);
INSERT INTO t VALUES (3), (-4), (5.0);
SELECT n, n/10 FROM t;
---
rows:
- [-4, -0.4]
- [3, 0.3]
- [5, 0.5]
New behaviour:
SELECT n, n/10 FROM t;
---
rows:
- [-4, 0]
- [3, 0]
- [5, 0.5]
-