- Apr 14, 2021
-
-
Cyrill Gorcunov authored
Part-of #4642
Co-developed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
Currently, to run a "C" function from some external module one has to register it first in the "_func" system space. This is a problem if the node is in read-only mode (a replica). Still, people would like to have a way to run such functions even in ro mode. For this sake we implement the "box.lib" Lua module.

Unlike the `box.schema.func` interface, `box.lib` does not defer the module loading procedure until the first call of a function. Instead a module is loaded immediately, and if some error happens (say, the shared library is corrupted or not found) it pops up early.

The need to use stored C procedures implies that the application is running under serious load and most likely there is a modular structure present on the Lua level (i.e. the same shared library is loaded in different sources), thus we cache the loaded library and reuse it on next load attempts. To verify that a cached library is up to date, the module_cache engine tests the file attributes (device, inode, size, modification time) on every load attempt.

Since both `box.schema.func` and `box.lib` use caching to minimize the module loading procedure, a pass-through caching scheme is implemented:

- box.lib relies on the module_cache engine for caching;
- box.schema.func snoops into the box.lib hash table on an attempt to load a new module: if the module is present in the box.lib hash, it is simply referenced from there and added into box.schema.func's own hash table; if the module is not present, it is loaded from scratch and put into both hashes;
- the module_reload action in box.schema.func invalidates module_cache, or fills it if an entry is not present.

Closes #4642
Co-developed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

@TarantoolBot document
Title: box.lib module

Overview
========
The `box.lib` module provides a way to create, delete and execute `C` procedures from shared libraries. Unlike `box.schema.func` methods, the functions created with `box.lib`'s help are not persistent and live purely in memory; once a node is shut down they vanish. An initial purpose for them is to execute them on nodes which are running in read-only mode.

Module functions
================

`box.lib.load(path) -> obj | error`
-----------------------------------
Loads a module from `path` and returns an object instance associated with the module; otherwise an error is thrown. The `path` should not end with the shared library extension (such as `.so`); only the file name shall be there.

Possible errors:
- IllegalParams: module path is either not supplied or not a string.
- SystemError: unable to open a module due to a system error.
- ClientError: a module does not exist.
- OutOfMemory: unable to allocate a module.

Example:
``` Lua
-- Without error handling
m = box.lib.load('path/to/library')

-- With error handling
ok, res = pcall(box.lib.load, 'path/to/library')
if not ok then
    print(res) -- the error
end
```

`module:unload() -> true | error`
---------------------------------
Unloads a module. Returns `true` on success; otherwise an error is thrown. Once the module is unloaded, one can't load new functions from this module instance.

Possible errors:
- IllegalParams: a module is not supplied.
- IllegalParams: a module is already unloaded.

Example:
``` Lua
m = box.lib.load('path/to/library')
--
-- do something with the module
--
m:unload()
```

If functions from this module are referenced somewhere else in Lua code, they can still be executed, because the module continues to sit in memory until the last reference to it is closed. If the module becomes a target of Lua's garbage collector, unload is called implicitly.

`module:load(name) -> obj | error`
----------------------------------
Loads a new function with name `name` from the previously loaded `module` and returns a callable object instance associated with the function. On failure an error is thrown.

Possible errors:
- IllegalParams: function name is either not supplied or not a string.
- IllegalParams: attempt to load a function, but the module has already been unloaded.
- ClientError: no such function in the module.
- OutOfMemory: unable to allocate a function.

Example:
``` Lua
-- Load a module if not loaded yet.
m = box.lib.load('path/to/library')
-- Load a function with the name `foo` from the module `m`.
func = m:load('foo')
```

If there is no need for further loading of other functions from the same module, the module might be unloaded immediately:
``` Lua
m = box.lib.load('path/to/library')
func = m:load('foo')
m:unload()
```

`function:unload() -> true | error`
-----------------------------------
Unloads a function. Returns `true` on success; otherwise an error is thrown.

Possible errors:
- IllegalParams: function name is either not supplied or not a string.
- IllegalParams: the function is already unloaded.

Example:
``` Lua
m = box.lib.load('path/to/library')
func = m:load('foo')
--
-- do something with the function, then clean up
--
func:unload()
m:unload()
```

If the function becomes a target of Lua's garbage collector, unload is called implicitly.

Executing a loaded function
===========================
Once a function is loaded, it can be executed as an ordinary Lua call. Let's consider the following example. We have a `C` function which takes two numbers and returns their sum.

``` C
int
cfunc_sum(box_function_ctx_t *ctx, const char *args, const char *args_end)
{
	uint32_t arg_count = mp_decode_array(&args);
	if (arg_count != 2) {
		return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
				     "invalid argument count");
	}
	uint64_t a = mp_decode_uint(&args);
	uint64_t b = mp_decode_uint(&args);

	char res[16];
	char *end = mp_encode_uint(res, a + b);
	box_return_mp(ctx, res, end);
	return 0;
}
```

The name of the function is `cfunc_sum`, and the function is built into the `cfunc.so` shared library. First we should load it:
``` Lua
m = box.lib.load('cfunc')
cfunc_sum = m:load('cfunc_sum')
```

Once successfully loaded, we can execute it. Let's call `cfunc_sum` with the wrong number of arguments:
``` Lua
cfunc_sum()
 | ---
 | - error: invalid argument count
```

We will see the `"invalid argument count"` message in the output. The error message has been set by `box_error_set` in the `C` code above. On success the sum of the arguments is printed out:
``` Lua
cfunc_sum(1, 2)
 | ---
 | - 3
```

Functions may return multiple results. For example, a trivial echo function which prints the arguments passed in:
``` Lua
cfunc_echo(1, 2, 3)
 | ---
 | - 1
 | - 2
 | - 3
```

Module and function caches
==========================
Loading a module is a relatively slow procedure, because the operating system needs to read the library, resolve its symbols, etc. Thus, to speed this up, when a module is loaded for the first time we put it into an internal cache. If the module is already sitting in the cache and a new request to load it comes in, we simply reuse the previous copy. If the module is updated on a storage device, then on a new load attempt we detect that the file attributes (such as device number, inode, size, modification time) have changed and reload the module from scratch. Note that a newly loaded module does not intersect with previously loaded modules; they continue operating with the code previously read from the cache.

Thus, if there is a need to update a module, all module instances should be unloaded (together with their functions) and loaded again. A similar caching technique is applied to functions: only the first function allocation causes symbol resolution; subsequent ones are simply obtained from a function cache.
-
Cyrill Gorcunov authored
Since we now have a low-level module API, we can switch the box.schema.func code to use it. In particular, we define the schema_module structure as a wrapper over the modules API -- it carries a pointer to the general module structure. Because of the module reload functionality, the schema modules carry their own cache of module instances. Thus, to make the overall code close to the modules API we do:
1) cache operations are renamed to cache_find/put/update/del;
2) C functions are switched to use the module_func low-level API;
3) because the low-level modules API carries its own references, we can take no explicit reference when calling a function.

Part-of #4642
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
The modules subsystem hides some low-level operations under an API. In particular, the modules subsystem is responsible for:
- module lookup in Lua's "package.search" storage;
- module caching to eliminate the expensive load procedure;
- function symbol resolution.

Because this naming intersects with the current module functions sitting in box/func, let's rename the latter to the schema_module prefix. We will use this prefix in the next patches to show that the modules in box.schema.func are just a particular user of the general modules engine.

Part-of #4642
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
The only purpose of the module argument is to notify the caller that the module doesn't exist. Let's simplify the code and drop this argument.

Part-of #4642
Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Cyrill Gorcunov authored
In commit 96938faf (Add hot function reload for C procedures) the ability to hot-reload modules was introduced. When a module is being reloaded, its functions are resolved to new symbols, but if something goes wrong the old symbols from the old module are supposed to be restored. Actually, the current code restores only one function and may crash if there is a bunch of functions to restore. Let's fix it.

Fixes #5968
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
-
Vladislav Shpilevoy authored
Applier used to process the synchronous rows CONFIRM and ROLLBACK right after receipt, before they were written to WAL. That led to a bug: the confirmed data became visible and might be accessed by user requests, then the node restarted before CONFIRM finished its WAL write, and the data was not visible again. That is just as if it was rolled back, which is not acceptable. Another case: the CONFIRM WAL write could simply fail for any reason (no disk space, OOM), but the transactions would remain confirmed anyway. That also produced some hacks in the limbo's code to support confirmation and rollback of transactions not yet written to WAL.

The patch makes the synchro rows processed only after they are written to WAL. The 'rollback' case above might still happen if the xlogs were in the kernel caches and the machine was powered off before they were flushed to disk, but that is not related to qsync specifically.

To handle the synchro rows after the WAL write, the patch makes them go to WAL in a blocking way (journal_write() instead of journal_write_try_async()). Otherwise it could happen that a CONFIRM/ROLLBACK is being written to WAL and would clear the limbo afterwards, but a new transaction arrives with a different owner and conflicts with the current limbo owner.

Closes #5213
-
mechanik20051988 authored
In the previous patch the update(insert) operation for absent nullable fields was allowed. This patch allows the update(delete) operation for absent nullable fields. Closes #3378
-
Mary Feofanova authored
Update operations could not insert with gaps. This patch changes the behavior so that the update operation fills the missing fields with nulls.

Part of #3378

@TarantoolBot document
Title: Allow updating absent nullable fields
Update operations could not insert with gaps. The behavior is changed so that the update operation fills the missing fields with nulls. For example, we create a space `s = box.schema.create_space('s')`, then create an index for this space `pk = s:create_index('pk')`, and then insert a tuple into the space: `s:insert{1, 2}`. After all of this we try to update this tuple: `s:update({1}, {{'!', 5, 6}})`. In the previous version this operation failed with an ER_NO_SUCH_FIELD_NO error; now it finishes successfully and there is a [1, 2, null, null, 6] tuple in the space.
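The gap-filling behavior can be sketched in C on a simplified integer tuple, with `NIL` standing in for Tarantool's null. The names here are illustrative, not the actual update code:

```c
#define NIL (-1) /* stand-in for a null field */

/*
 * Insert `value` as field number `pos` (1-based) into tuple `t` of
 * current length *n and capacity `cap`, padding any gap before it
 * with NIL, as {'!', pos, value} now does. Returns 0 on success.
 */
int
update_insert_with_gaps(int *t, int *n, int cap, int pos, int value)
{
	if (pos < 1 || pos > cap)
		return -1;
	/* Fill missing fields between the tuple end and `pos` with nulls. */
	while (*n < pos - 1)
		t[(*n)++] = NIL;
	if (*n < pos) {
		/* Appending past the end: just place the value. */
		t[(*n)++] = value;
		return 0;
	}
	/* Position already exists: a real insertion shifting the tail. */
	if (*n >= cap)
		return -1;
	for (int i = *n; i > pos - 1; i--)
		t[i] = t[i - 1];
	t[pos - 1] = value;
	(*n)++;
	return 0;
}
```

Starting from the tuple {1, 2}, inserting value 6 at position 5 reproduces the documented result [1, 2, null, null, 6].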
-
Mary Feofanova authored
Prepare this test for upcoming #3378 fix: bad upserts will become good, so we need another way to do them.
-
- Apr 13, 2021
-
-
mechanik20051988 authored
There are users that have specific workloads where the iproto thread is the throughput bottleneck: the iproto thread's core is 100% loaded while the TX thread's core is not. For such cases it would be nice to have the capability to create several iproto threads.

Closes #5645

@TarantoolBot document
Title: implement ability to run multiple iproto threads
Implement the ability to run multiple iproto threads, which is useful in some specific workloads where the iproto thread is the throughput bottleneck. To specify the count of iproto threads, use the iproto_threads option in box.cfg. For example, to start 8 iproto threads, enter `box.cfg{iproto_threads=8}`. The default iproto thread count is 1. This option is not dynamic, so it can't be changed after the first setting until server restart. Distribution of connections between threads is managed by the OS kernel.
-
mechanik20051988 authored
There were two problems with struct rmean:
- For correct access to the rmean struct fields, this struct should be created in the tx thread.
- In the case when rmean_new returned NULL in net_cord_f, tarantool hung and did not terminate in any way except via SIGKILL. Also the net_slabc cache was not destroyed.

Moved allocation and deallocation of the rmean structure to iproto_init/iproto_free respectively. Added slab_cache_destroy for net_slabc for graceful resource release.
-
mechanik20051988 authored
The fields of the rmean structure can be accessed from multiple threads, so we must use atomic operations to get/set the fields of this structure. Also, the comments to the functions now state in which threads they should be called to correctly access the fields of the rmean structure.
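With C11 atomics the pattern looks like this; `rmean_counter` is a simplified stand-in for the real rmean fields, not the actual Tarantool type:

```c
#include <stdatomic.h>
#include <stdint.h>

/* A single statistics slot updated from several threads. */
struct rmean_counter {
	_Atomic uint64_t value;
};

/*
 * May be called from any thread (e.g. an iproto thread) to account
 * an event. Relaxed ordering is enough for a plain statistics
 * counter: we need atomicity of the increment, not synchronization.
 */
static inline void
rmean_collect(struct rmean_counter *c, uint64_t n)
{
	atomic_fetch_add_explicit(&c->value, n, memory_order_relaxed);
}

/* Called from the thread that reports statistics. */
static inline uint64_t
rmean_get(struct rmean_counter *c)
{
	return atomic_load_explicit(&c->value, memory_order_relaxed);
}
```

Without the atomics, concurrent read-modify-write of a plain uint64_t field from the tx and iproto threads would be a data race.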
-
Alexander Turenko authored
The `cmake` command was hardcoded for configuring libcurl; however, only `cmake3` may be installed in a system. Now we use the same cmake command for configuring libcurl as the one used for configuring tarantool itself. The problem exists since 2.6.0-196-g2b0760192 ('build: enable cmake in curl build'). Fixes #5955
-
Iskander Sagitov authored
The ER_TUPLE_FOUND message shows only the space and the index; let's also show the old tuple and the new tuple. This commit changes the error message in the code and in the tests. The sql/checks and sql-tap/alter tests remain the same due to problems in showing their old and new tuples in the error message. Closes #5567
-
Iskander Sagitov authored
Add field name to field mismatch error message. Part of #4707
-
Iskander Sagitov authored
Add got type to field mismatch error message. Part of #4707
-
Iskander Sagitov authored
Previously, tuple_field_u32 and tuple_next_u32 stored a uint64_t value in a uint32_t field. This commit fixes it. Part of #4707
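The fix amounts to range-checking the decoded 64-bit value before narrowing it. A self-contained sketch with a hypothetical function name, not the actual tuple_field_u32 signature:

```c
#include <stdint.h>

/*
 * Narrow a decoded uint64_t into a uint32_t destination, rejecting
 * values that do not fit instead of silently truncating them.
 * Returns 0 on success, -1 if the value is out of range.
 */
int
narrow_to_u32(uint64_t value, uint32_t *out)
{
	if (value > UINT32_MAX)
		return -1;
	*out = (uint32_t)value;
	return 0;
}
```

Without the check, a value such as 2^32 would silently become 0 in the uint32_t field, which is exactly the class of bug the commit addresses.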
-
- Apr 12, 2021
-
-
Serge Petrenko authored
Bump `feedback_version` to 7 and introduce a new field: `feedback.events`. It holds a counter for every event we may choose to register later on. Currently the possible events are "create_space", "drop_space", "create_index", "drop_index". All the registered events and the corresponding counters are sent in a report in the `feedback.events` field. Also, the first registered event triggers sending the report right away, so we may follow events like "first space/index created/dropped". Closes #5750
-
Serge Petrenko authored
The feedback daemon used to generate a report and then wait (for an hour by default) until it's time to send it. Better to actualize the reports and generate them right when it's time to send them. Part of #5750
-
Serge Petrenko authored
Send the first report as soon as instance's initial configuration finishes. Part of #5750
-
Serge Petrenko authored
feedback_daemon.send() will come in handy once we implement triggers to dispatch feedback after some events, for example, right on initial instance configuration. So, it's not a testing method anymore, hence the new name. Part of #5750
-
Serge Petrenko authored
We are going to send feedback right after the initial `box.cfg{}` call, so include server uptime in the report to filter out short-living CI instances. Also, while we're at it, fix a typo in the feedback_daemon test. Prerequisite #5750
-
Cyrill Gorcunov authored
In commit 14fa5fd8 (cfg: support symbolic evaluation of replication_synchro_quorum) we implemented support for symbolic evaluation of the `replication_synchro_quorum` parameter, but there is no easy way to obtain its current run-time value, i.e. the evaluated number. Moreover, we would like to fetch the queue length of the transaction limbo for tests and extend this statistics in the future. Thus let's add them.

Closes #5191
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

@TarantoolBot document
Title: Provide `box.info.synchro` interface
The `box.info.synchro` leaf provides information about the details of synchronous replication. In particular, `quorum` represents the current value of the synchronous replication quorum defined by the `replication_synchro_quorum` configuration parameter: since it can be set as a dynamic formula such as `N/2+1`, the value depends on the current number of replicas.

Since synchronous replication does not commit data immediately but waits for its propagation to replicas, the data sits in a queue gathering `commit` responses from remote nodes. The current number of entries waiting in the queue is shown via the `queue.len` member.

A typical output is the following:
``` Lua
tarantool> box.info.synchro
---
- queue:
    len: 0
  quorum: 1
...
```

The `len` member shows the current number of entries in the queue, and the `quorum` member shows the evaluated value of the `replication_synchro_quorum` parameter.
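For reference, the canonical majority formula `N/2+1` evaluates like this. This is a sketch of the arithmetic only, not the actual expression evaluator behind the configuration parameter:

```c
/*
 * Evaluate the majority formula N/2+1 for a cluster of
 * `replica_count` nodes, as the symbolic value of
 * replication_synchro_quorum would be computed (integer division
 * rounds down, so e.g. 3 replicas need 2 acks, 5 replicas need 3).
 */
int
synchro_quorum_majority(int replica_count)
{
	return replica_count / 2 + 1;
}
```

This is what `box.info.synchro.quorum` exposes at run time: the number, not the formula string.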
-
- Apr 08, 2021
-
-
Alexander V. Tikhonov authored
For self-hosted runners running w/o restart, we may need to kill hung processes that could be left over from previous workflow runs. This patch implements cleanup with kill for such processes in the 'environment' action, which is called for all workflows and runs before the main steps. Cleanup searches for two name patterns of running processes:
- ' tarantool '
- 'test-run.py '
Closes tarantool/tarantool-qa#114
-
Alexander V. Tikhonov authored
Enabling ubuntu-20.04 hosts for packaging workflows revealed that the DEB package GitHub Actions workflows do not need to install the createrepo tool. It was also found that createrepo is not available for ubuntu-20.04, as described in 3a7c2102 ('github-ci: ubuntu-20.04 misses createrepo package'), until ubuntu-21.04, where the new version of this tool is available as the 'createrepo_c' DEB package. To fix it, a 'createrepo_c' build and installation from sources was added, and the 'createrepo' tool was changed to 'createrepo_c' in the update_repo tool. This patch is needed to use these workflows on self-hosted runners, which run under ubuntu-20.04 by default for now.

Also, checking the patch on ubuntu-20.04 hosts hit the following issue:

  Regenerated DEB file: pool/xenial/main/t/tarantool/tarantool-common_2.8.0.185.g4c3e0eb-1_all.deb
  <botocore.awsrequest.AWSRequest object at 0x7f7998a4ca90>
  <botocore.awsrequest.AWSRequest object at 0x7f627d965070>
  make: *** [.gitlab.mk:131: deploy] Error 255
  Error: Process completed with exit code 2.

It was found that an issue about it already exists in GitHub Actions [1]. The provided solution to set the AWS region in the environment helped to work around the issue [2].

Closes tarantool/tarantool-qa#110
Closes tarantool/tarantool-qa#111

[1]: https://github.com/aws/aws-cli/issues/5234
[2]: https://github.com/aws/aws-cli/issues/5234#issuecomment-635459464
-
Alexander V. Tikhonov authored
Created a local composite action 'pack_and_deploy' for Tarantool package creation and deployment, built from the local scripts in the packaging workflows. It lets us replace the common parts of package creation and deployment in each packaging workflow with a call to this action, and consolidates all the instructions on package creation and deployment in a single place.
-
- Apr 07, 2021
-
-
Igor Munkin authored
LuaJIT submodule is bumped to introduce the following changes:
* test: fix dynamic modules loading on MacOS
* test: make utils.selfrun usage easier
* test: remove excess dependency for tests target

Within this changeset SIP issues are worked around and dynamic module loading on MacOS is fixed. As a result, LuaJIT tests can be enabled for the static build target on MacOS.

Closes #5959
Follows up #4862
Reviewed-by: Alexander V. Tikhonov <avtikhon@tarantool.org>
Reviewed-by: Sergey Kaplun <skaplun@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
-
Alexander V. Tikhonov authored
Found that running the package removal command from S3:

  ./tools/update_repo.sh -o=fedora -d=30 -b=s3://tarantool_repo/live/1.10 -r=tarantool-1.10.9.108

it searched to remove:

  s3://tarantool_repo/live/1.10/fedora/30/SRPMS/Packages/tarantool-1.10.9.108-1.fedora30.src.rpm

while it had to be:

  s3://tarantool_repo/live/1.10/fedora/30/SRPMS/Packages/tarantool-1.10.9.108-1.fc30.src.rpm

i.e. 'fc30' had to be used instead of 'fedora30'. This patch fixes it.
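The mapping the fix needs is the standard RPM dist-tag convention: Fedora releases use 'fcNN', not 'fedoraNN'. An illustrative helper (the actual update_repo.sh is shell, and the OS names handled here beyond 'fedora' are assumptions):

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the RPM dist tag for an OS/version pair: fedora 30 -> "fc30",
 * el/centos 7 -> "el7". Returns 0 on success, -1 for an unknown OS
 * or a too-small buffer.
 */
int
rpm_dist_tag(const char *os, int version, char *buf, size_t len)
{
	const char *prefix;
	if (strcmp(os, "fedora") == 0)
		prefix = "fc";
	else if (strcmp(os, "el") == 0 || strcmp(os, "centos") == 0)
		prefix = "el";
	else
		return -1;
	return snprintf(buf, len, "%s%d", prefix, version) < (int)len ? 0 : -1;
}
```

With this mapping the removal search produces '...-1.fc30.src.rpm', matching the real file name in the repository.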
-
Alexander V. Tikhonov authored
For self-hosted runners, where the work path with sources is preserved between different workflow runs, we need to make sure that all permissions are correct for the running user, so that a later sources checkout by the 'actions/checkout' action, run by the same user, won't fail on sources cleanup. Closes tarantool/tarantool-qa#102
-
Alexander V. Tikhonov authored
The 'coverity' workflow produces and uploads results to the 'coverity.com' web site. A message about it can be shown in the PR within which each run was done. This patch adds the ability to send the message when a PR is available; otherwise it is skipped. Part of #5644
-
mechanik20051988 authored
The test checks the possibility of recovery with the force_recovery option. For this purpose, the snapshot is damaged in the test and the possibility of recovery is checked. In the previous version the snapshot size was too small, so sometimes the system data needed for recovery was corrupted in the test. Also moved the common code into functions and deleted one of two identical test cases (in the previous version we wrote a different amount of garbage into the middle of the snapshot; that is meaningless, because the result is almost the same, only the amount of data that can be read from the snapshot differs). Follow-up #5422
-
Alexander Turenko authored
List of changes in test-run: * readme: add memcached to users ([PR #285][1]) * test: add integration tests ([PR #273][2]) * test: enable in continuous integration ([PR #273][2]) * Fix slowdown on Python 3 ([PR #290][3]) Updated .luacheckrc to exclude newly added 'core = tarantool' tests (ones that check test-run itself). [1]: https://github.com/tarantool/test-run/pull/285 [2]: https://github.com/tarantool/test-run/pull/273 [3]: https://github.com/tarantool/test-run/pull/290
-
- Apr 05, 2021
-
-
Vladislav Shpilevoy authored
It was used so as to recover synchronous auto-commit transactions in an async way (not blocking the fiber). But it became unnecessary after #5874 was fixed, because recovery does not use auto-commit transactions anymore. Closes #5194
-
Vladislav Shpilevoy authored
Recovery used to be performed row by row. It was fine because all the persisted rows are supposed to be committed and should not meet any problems during recovery, so a transaction could safely be applied partially. But that stopped being true after the synchronous replication introduction. Synchronous transactions might be in the log but followed by a ROLLBACK record which is supposed to delete them.

During row-by-row recovery, firstly, each synchro row was turned into a sync transaction, which is probably fine. But the rows on non-sync spaces which were part of a sync transaction could be applied right away, bypassing the limbo, leading to all kinds of sweet errors like duplicate keys or inconsistency of a partially applied transaction.

The patch makes the recovery transactional: either an entire transaction is recovered, or it is rolled back, which normally happens only for synchro transactions followed by a ROLLBACK. In force recovery of a broken log the consistency is not guaranteed though.

Closes #5874
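The shape of the change can be sketched with a toy transaction buffer: rows are accumulated per transaction and applied to the space only on commit, or dropped wholesale when a ROLLBACK marker is seen. The names are illustrative and integers stand in for rows:

```c
enum { TX_MAX_ROWS = 16 };

/* Rows of one recovered transaction, buffered instead of applied. */
struct recovery_tx {
	int rows[TX_MAX_ROWS];
	int n;
};

/* Buffer a row; nothing touches the space yet. */
int
recovery_tx_add(struct recovery_tx *tx, int row)
{
	if (tx->n >= TX_MAX_ROWS)
		return -1;
	tx->rows[tx->n++] = row;
	return 0;
}

/* Commit marker seen: apply the whole buffered transaction at once. */
int
recovery_tx_commit(struct recovery_tx *tx, int *space, int *space_n)
{
	for (int i = 0; i < tx->n; i++)
		space[(*space_n)++] = tx->rows[i];
	int applied = tx->n;
	tx->n = 0;
	return applied;
}

/* ROLLBACK marker seen: the rows never reach the space. */
void
recovery_tx_rollback(struct recovery_tx *tx)
{
	tx->n = 0;
}
```

With the old row-by-row scheme the equivalent of recovery_tx_add applied each row to the space immediately, so a later ROLLBACK had nothing left to cancel.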
-
Vladislav Shpilevoy authored
During recovery and xlog replay vinyl skips the statements already stored in runs. Indeed, their re-insertion into the mems would otherwise lead to their second dump. But that results in an issue: the recovery transactions in vinyl don't have a write set, i.e. their tx->log is empty. On the other hand, they are still added to the write set (xm->writers), probably so as not to have too many "skip if in recovery" checks all over the code.

It worked fine with single-statement transactions, but would break on multi-statement transactions, because the decision whether a transaction needs to be added to the write set was made based on the emptiness of the tx's log. The log is always empty, so the transaction could be added to the write set twice and corrupt its list-link member.

The patch makes the decision about being added to the write set based on the emptiness of the list-link member instead of the log, so it works fine both during recovery and during normal operation.

Needed for #5874
-
Serge Petrenko authored
There was a bug in box_process_register: it decoded the replica's vclock but never used it when sending the registration stream. So the replica might lose the data in the range (replica_vclock, start_vclock). Follow-up #5566
-
Serge Petrenko authored
Both box_process_register and box_process_join had guards ensuring that not a single rollback occurred for transactions residing in WAL around the replica's _cluster registration. Both functions would error on a rollback and make the replica retry the final join.

The reason for that was that the replica couldn't process synchronous transactions correctly during final join, because it applied the final join stream row by row. This path of retrying the final join was a dead end: even if the master managed to receive no ROLLBACK messages around the N-th retry of box.space._cluster:insert{}, the replica would still have to receive and process all the data dating back to its first _cluster registration attempt. In other words, the guard against sending synchronous rows to the replica didn't work.

Let's remove the guard altogether, since now the replica is capable of processing synchronous txs in the final join stream and even retrying the final join in case the _cluster registration was rolled back.

Closes #5566
-
Serge Petrenko authored
Now the applier assembles rows into transactions not only at the subscribe stage, but also during final join / register. This is necessary for the correct handling of rolled-back synchronous transactions in the final join stream. Part of #5566
-
Serge Petrenko authored
applier->last_row_time is updated in applier_read_tx_row, which is called at least once per subscribe loop iteration. So there's no need for a separate last_row_time update inside the loop body itself. Part of #5566
-