- May 12, 2020
-
-
Nikita Pettik authored
If the recovery process fails during range restoration, the range itself is deleted and recovery is considered failed (in the case of regular, i.e. non-forced, recovery). During recovery of a particular range, the runs to be restored are refed twice: once when they are created in vy_run_new() and once when they are attached to a slice. This is taken into account after all ranges are recovered: all runs of the LSM tree are unrefed, so that the slices own the run resources (as a result, when a slice is deleted, its runs are unrefed and deleted as well). However, if range recovery fails, the range is dropped alongside its already recovered slices. This unrefs the runs, which is not accounted for. To sum up, below is a brief schema of the recovery process:
foreach range in lsm.ranges {
    vy_lsm_recover_range(range) {
        foreach slice in range.slices {
            // inside recover_slice() each run is refed twice
            if vy_lsm_recover_slice() != 0 {
                // here all already restored slices are deleted and
                // corresponding runs are unrefed, so now they have 1 ref.
                range_delete()
            }
        }
    }
}
foreach run in lsm.runs {
    assert(run->refs > 1)
    vy_run_unref(run)
}
In this case, unrefing such runs one more time would destroy them. On the other hand, iterating over the runs then becomes unsafe, so we should use rlist_foreach_entry_safe(). Moreover, we should explicitly clean up these runs by calling vy_lsm_remove_run(). Reviewed-by:
Vladislav Shpilevoy <vshpilevoi@mail.ru> Closes #4805
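The fix above can be sketched with a standalone refcounted list (all names are illustrative stand-ins for struct vy_run and rlist_foreach_entry_safe(), not the real vinyl code): the final unref pass must cache the next pointer before unrefing, because dropping the last reference unlinks and frees the node.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for struct vy_run: a refcounted node in an
 * intrusive singly linked list. */
struct run {
	int refs;
	struct run *next;
};

static void
run_unref(struct run **head, struct run *r)
{
	if (--r->refs > 0)
		return;
	/* Last reference gone: unlink and free. This is exactly why a
	 * plain iteration over the run list is unsafe during the final
	 * unref pass. */
	for (struct run **p = head; *p != NULL; p = &(*p)->next) {
		if (*p == r) {
			*p = r->next;
			break;
		}
	}
	free(r);
}

/* The final pass after range recovery: each run was refed once at
 * creation and once per slice. After a successful recovery the slices
 * still hold their references, so dropping the creation reference here
 * never frees a run. After a failed range recovery the slice
 * references are already gone, so this unref may free the run - hence
 * the "safe" iteration that caches the next pointer first. */
static void
drop_creation_refs(struct run **head)
{
	struct run *r = *head;
	while (r != NULL) {
		struct run *next = r->next; /* cache before possible free */
		run_unref(head, r);
		r = next;
	}
}
```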
-
Nikita Pettik authored
The new macro ERROR_INJECT_COUNTDOWN makes it possible to delay an error injection by the iparam value: the injection fires only after the path has been executed iparam times. For instance:
void foo(int i)
{
    /* 2 is the delay counter. */
    ERROR_INJECT_COUNTDOWN(ERRINJ_FOO, {
        printf("Error injection on %d cycle!\n", i);
    });
}
void boo(void)
{
    for (int i = 0; i < 10; ++i)
        foo(i);
}
box.error.injection.set('ERRINJ_FOO', 2)
The result is "Error injection on 2 cycle!". This type of error injection can be useful to set an injection in the middle of query processing. Imagine the following scenario:
void foo(void)
{
    int *fds[10];
    for (int i = 0; i < 10; ++i) {
        fds[i] = malloc(sizeof(int));
        if (fds[i] == NULL)
            goto cleanup;
    }
cleanup:
    free(fds[0]);
}
The "cleanup" section obviously contains an error and leads to a memory leak. But with ordinary error injection, without a delay, such a situation can't be detected: OOM can be set only for the first cycle iteration, and in that particular case no leak takes place. Reviewed-by:
Vladislav Shpilevoy <vshpilevoi@mail.ru>
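A standalone model of such a countdown macro (the real ERROR_INJECT_COUNTDOWN lives in Tarantool's errinj machinery; the struct and macro below only mirror the counting behaviour and are illustrative):

```c
#include <assert.h>

/* Illustrative stand-in for an error injection descriptor. */
struct errinj {
	int enabled;
	long countdown;
};

/* Fire `stmt` only once the countdown reaches zero, i.e. after the
 * code path has been hit `countdown` times while the injection is
 * enabled. Disarm after firing once. */
#define ERROR_INJECT_COUNTDOWN(inj, stmt) do {				\
	if ((inj)->enabled) {						\
		if ((inj)->countdown > 0) {				\
			(inj)->countdown--;				\
		} else {						\
			(inj)->enabled = 0;				\
			stmt;						\
		}							\
	}								\
} while (0)
```

With a countdown of 2, the statement fires on the third pass through the path, matching the "Error injection on 2 cycle!" output above.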
-
Nikita Pettik authored
vy_task_write_run() is executed in an auxiliary thread (dump or compaction). The write iterator is created and used inside this function. Meanwhile, creating/destroying tuples in these threads does not change the reference counters of the corresponding tuple formats (see vy_tuple_delete() and vy_stmt_alloc()). Without cleaning up the write iterator right in write_iterator_start() after a failure, this clean-up happens in vy_task_compaction_abort() or vy_task_dump_abort(). These *_abort() functions in turn are executed in the main thread. As a result, a tuple might be allocated in the auxiliary thread and deleted in the main thread, so the format reference counter might decrease when it shouldn't change (otherwise the tuple format will be destroyed before all tuples of this format are gone). Fortunately, the clean-up of the write iterator in another thread was found only on the 1.10 branch; the master branch already contains a fix but lacks a test (2f17c929). So let's introduce a test with the following scenario:
1. run the compaction process;
2. add one or more slice sources in vy_write_iterator_start(): the corresponding slice_stream structures obtain newly created tuples in vy_slice_stream_next();
3. the next call of vy_write_iterator_add_src() fails due to OOM, an invalid run file or whatever;
4. if write_iterator_start() didn't clean up the sources, that would happen in vy_task_dump_abort(), which would be executed in the main thread;
5. now the format reference counter would be less than it was before compaction.
Closes #4864
-
Nikita Pettik authored
vy_write_iterator->read_views[i].history objects are allocated on a region (see vy_write_iterator_push_rv()) while building the history of a given key. However, if vy_write_iterator_build_history() fails, the region is truncated but the pointers to the vy_write_history objects are not nullified. As a result, they may be accessed (for instance while finalizing the write_iterator object in vy_write_iterator_stop()), which in turn may lead to a crash, segfault or disk formatting. The same may happen if vy_read_view_merge() fails while processing the read view array. Let's clean up those objects when an error takes place. Part of #4864
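The essence of the fix can be sketched standalone (illustrative names, not the real vy_write_iterator structures): once the region holding the history objects is truncated on the error path, every pointer into it dangles and must be nullified before anything walks the read views again.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for vy_write_iterator->read_views[i]; the
 * history pointer refers into a region that gets truncated on the
 * error path. */
struct rv_model {
	void *history;
};

/* Error-path clean-up. The region truncation itself is elided here;
 * the point is that every history pointer is nullified, so a later
 * pass (e.g. the stop routine) sees NULL instead of freed memory. */
static void
read_views_cleanup_on_error(struct rv_model *rvs, int count)
{
	for (int i = 0; i < count; i++)
		rvs[i].history = NULL;
}
```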
-
- May 08, 2020
-
-
HustonMmmavr authored
According to the Dockerfile reference, there are two forms of specifying an entrypoint: exec and shell. The exec form is preferable and allows using this image in scripts. Close #4960
-
Alexander V. Tikhonov authored
Added Catalina OSX 10.15 to gitlab-ci testing and removed OSX 10.13, since we decided to support only the 2 latest major releases, which for now are the 10.14 and 10.15 OSX versions. Also changed the commit job for branches from the 10.14 to the 10.15 OSX version. Additional cleanup for 'box_return_mp' and 'box_session_push': added API_EXPORT, which defines nothrow; the compiler warns or errors depending on the build options. Part of #4885 Close #4873
-
Alexander V. Tikhonov authored
Marked flaky tests as fragile to exclude them from parallel runs and avoid flaky fails in regular testing:
box-py/snapshot.test.py ; gh-4514
replication/misc.test.lua ; gh-4940
replication/skip_conflict_row.test.lua ; gh-4958
replication-py/init_storage.test.py ; gh-4949
vinyl/stat.test.lua ; gh-4951
xlog/checkpoint_daemon.test.lua ; gh-4952
Part of #4953
-
Oleg Piskunov authored
The gitlab-ci pipeline is modified to keep performance results in gitlab-ci artifacts. Closes #4920
-
Georgy Kirichenko authored
Here is a summary of how and when rollback works in WAL. A disk write failure can cause a rollback. In that case the failed transaction and all the next transactions sent to WAL should be rolled back. Together. The following transactions should be rolled back too, because they could have made their statements based on what they saw in the failed transaction. Also, a rollback of the failed transaction without a rollback of the next ones could actually rewrite what they committed. So when a rollback is started, *all* pending transactions should be rolled back. However, if they kept coming, the rollback would be infinite. This means that to complete a rollback it is necessary to stop sending new transactions to WAL, then roll back all the already sent ones, and in the end allow new transactions again. Step-by-step:
1) stop accepting all new transactions in the WAL thread, where the rollback is started. New transactions don't even try to go to disk; they are added to the rollback queue immediately after arriving to the WAL thread;
2) tell the TX thread to stop sending new transactions to WAL, so that the rollback queue stops growing;
3) roll back all transactions in reverse order;
4) allow transactions again in the WAL thread and the TX thread.
The algorithm is long, but simple and understandable. However, the implementation wasn't so easy. It was done using a 4-hop cbus route, 2 hops of which were supposed to clear the cbus channel from all other cbus messages. The next two hops implemented steps 3 and 4. The rollback state of the WAL was signaled by checking the internals of a preallocated cbus message. The patch makes it simpler and more straightforward. The rollback state is now signaled by a simple flag, and there is no hack about clearing the cbus channel, no touching attributes of a cbus message. The moment when all transactions are stopped and the last one has returned from WAL is visible explicitly, because the last journal entry sent to WAL is saved. Also there is a single route for commit and rollback cbus messages now, called tx_complete_batch().
This change will come in handy in the scope of synchronous replication, when a WAL write won't be enough for a commit. Therefore 'commit' as a concept should be gradually washed away from WAL's code and migrated solely to the txn module.
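The step-by-step algorithm above can be modeled in a minimal single-threaded sketch (illustrative names, not the actual src/box/wal.c code): while the rollback flag is set, new entries skip the disk entirely and join the rollback set, so the set of transactions to roll back cannot grow behind the rollback's back.

```c
#include <assert.h>
#include <stdbool.h>

enum entry_state { COMMITTED, ROLLED_BACK };

/* Toy WAL: a log of entry states plus the rollback gate flag. */
struct wal_model {
	bool is_in_rollback;
	enum entry_state log[16];
	int n;
};

/* A transaction arrives in the WAL thread; returns 0 on commit. */
static int
wal_write(struct wal_model *w, bool disk_fails)
{
	int id = w->n++;
	if (w->is_in_rollback || disk_fails) {
		/* The first failure raises the flag; the failed entry
		 * and everything sent after it roll back together. */
		w->is_in_rollback = true;
		w->log[id] = ROLLED_BACK;
		return -1;
	}
	w->log[id] = COMMITTED;
	return 0;
}

/* The TX thread has rolled everything back and stopped sending:
 * lower the gate and accept transactions again (step 4). */
static void
wal_rollback_done(struct wal_model *w)
{
	w->is_in_rollback = false;
}
```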
-
Roman Khabibov authored
Add check that on_shutdown() triggers were called before exit, because in case of EOF or Ctrl+D (no signals) they were ignored. Closes #4703
-
- May 07, 2020
-
-
Nikita Pettik authored
If vy_key_from_msgpack() fails in vy_lsm_split_range(), the clean-up procedure is called. However, at this moment struct vy_range *parts[2] is not initialized and hence contains garbage, so accessing this structure may result in a crash, segfault or disk formatting. Let's move the initialization of the mentioned variables before the call to vy_lsm_split_range(). Part of #4864
-
- May 01, 2020
-
-
Alexander V. Tikhonov authored
Found that commit 'travis-ci/gitlab-ci: add Ubuntu Focal 20.04' forgot to add Ubuntu Focal to the list of available Ubuntu distributions in the script for saving built packages to S3. Follow up #4863
-
- Apr 30, 2020
-
-
Alexander V. Tikhonov authored
Closes #4863
-
Sergey Ostanevich authored
API_EXPORT defines nothrow, so compiler warns or errors depending on the build options. Closes #4885
-
Alexander V. Tikhonov authored
Fixed flaky upstream checks in the replication/skip_conflict_row test; also check that lsn is set in the test-run wait condition routine. Errors fixed:
[024] @@ -66,11 +66,11 @@
[024]  ...
[024]  box.info.replication[1].upstream.message
[024]  ---
[024] -- null
[024] +- timed out
[024]  ...
[024]  box.info.replication[1].upstream.status
[024]  ---
[024] -- follow
[024] +- disconnected
[024]  ...
[024]  box.space.test:select()
[024]  ---
[024]
[004] @@ -125,11 +125,11 @@
[004]  ...
[004]  box.info.replication[1].upstream.message
[004]  ---
[004] -- Duplicate key exists in unique index 'primary' in space 'test'
[004] -...
[004] -box.info.replication[1].upstream.status
[004] ----
[004] -- stopped
[004] +- null
[004] +...
[004] +box.info.replication[1].upstream.status
[004] +---
[004] +- follow
[004]  ...
[004]  test_run:cmd("switch default")
[004]  ---
[004]
[038] @@ -174,7 +174,7 @@
[038]  ...
[038]  box.info.replication[1].upstream.status
[038]  ---
[038] -- follow
[038] +- disconnected
[038]  ...
[038]  -- write some conflicting records on slave
[038]  for i = 1, 10 do box.space.test:insert({i, 'r'}) end
Line 201 (often):
[039] @@ -201,7 +201,7 @@
[039]  -- lsn should be incremented
[039]  v1 == box.info.vclock[1] - 10
[039]  ---
[039] -- true
[039] +- false
[039]  ...
[039]  -- and state is follow
[039]  box.info.replication[1].upstream.status
[039]
[030] @@ -201,12 +201,12 @@
[030]  -- lsn should be incremented
[030]  v1 == box.info.vclock[1] - 10
[030]  ---
[030] -- true
[030] +- false
[030]  ...
[030]  -- and state is follow
[030]  box.info.replication[1].upstream.status
[030]  ---
[030] -- follow
[030] +- disconnected
[030]  ...
[030]  -- restart server and check replication continues from nop-ed vclock
[030]  test_run:cmd("switch default")
Line 230 (OSX):
[022] --- replication/skip_conflict_row.result Thu Apr 16 21:54:28 2020
[022] +++ replication/skip_conflict_row.reject Mon Apr 27 00:52:56 2020
[022] @@ -230,7 +230,7 @@
[022]  ...
[022]  box.info.replication[1].upstream.status
[022]  ---
[022] -- follow
[022] +- disconnected
[022]  ...
[022]  box.space.test:select({11}, {iterator = "GE"})
[022]  ---
[022]
Close #4457
-
- Apr 29, 2020
-
-
Alexander Turenko authored
Now we have an S3-based infrastructure for RPM / Deb packages and GitLab CI pipelines which deploy packages to it. We don't plan to add 2.5+ repositories on packagecloud.io, so instead of the usual change of the target bucket from 2_N to 2_(N+1), the deploy stage is removed. Since all distro-specific jobs are duplicated in the GitLab CI pipelines and those Travis-CI jobs are needed just for deployment, it is worth removing them too. Follows up #3380. Part of #4947.
-
- Apr 28, 2020
-
-
Vladislav Shpilevoy authored
A couple of functions were mistakenly declared as 'function' instead of 'local function' in schema.lua. That led to their presence in the global namespace. Closes #4812
-
Vladislav Shpilevoy authored
When index:alter() was called on a non-functional index with specified 'func', it led to accessing a not declared variable in schema.lua.
-
Vladislav Shpilevoy authored
Port_tuple is exactly the same as port_c, but is not able to store raw MessagePack. In theory it sounds like port_tuple should be a bit simpler and therefore faster, but in fact it is not. Microbenchmarks didn't reveal any difference. So port_tuple is no longer needed, all its functionality is covered by port_c. Follow up #4641
-
Vladislav Shpilevoy authored
Closes #4641 @TarantoolBot document Title: box_return_mp() public C function
Stored C functions could return a result only via the `box_return_tuple()` function. That made users create a tuple every time they wanted to return something from a C function. Now the public C API offers another way to return a result - `box_return_mp()`. It allows returning arbitrary MessagePack, not wrapped into a tuple object. This is simpler to use for small results like a number, boolean, or a short string. Besides, `box_return_mp()` is much faster than `box_return_tuple()`, especially for small MessagePack. Note that it is faster only if the alternative is to create a tuple yourself. If an already existing tuple was obtained from an iterator and you want to return it, then of course it is faster to return it via `box_return_tuple()` than via extraction of the tuple data and calling `box_return_mp()`. Here is the function declaration from module.h:
```C
/**
 * Return MessagePack from a stored C procedure. The MessagePack
 * is copied, so it is safe to free/reuse the passed arguments
 * after the call.
 * MessagePack is not validated, for the sake of speed. It is
 * expected to be a single encoded object. An attempt to encode
 * and return multiple objects without wrapping them into an
 * MP_ARRAY or MP_MAP is undefined behaviour.
 *
 * \param ctx An opaque structure passed to the stored C procedure
 *        by Tarantool.
 * \param mp Begin of MessagePack.
 * \param mp_end End of MessagePack.
 * \retval -1 Error.
 * \retval 0 Success.
 */
API_EXPORT int
box_return_mp(box_function_ctx_t *ctx, const char *mp, const char *mp_end);
```
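A minimal sketch of how a stored C procedure could use this API. The function name `my_func` and the hand-rolled fixint encoder are illustrative assumptions; a real module would include module.h and would normally use the msgpuck helpers (e.g. mp_encode_uint()) instead of encoding bytes by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MessagePack positive fixint: values 0..127 are encoded as a single
 * byte equal to the value itself. Hand-rolled here only to keep the
 * sketch self-contained. */
static size_t
encode_uint7(char *buf, uint8_t v)
{
	buf[0] = (char)v;
	return 1;
}

#ifdef TARANTOOL_MODULE
#include "module.h"

/* Illustrative stored procedure: returns the number 42 as bare
 * MessagePack, with no tuple allocation at all. */
int
my_func(box_function_ctx_t *ctx, const char *args, const char *args_end)
{
	(void)args;
	(void)args_end;
	char buf[8];
	size_t len = encode_uint7(buf, 42);
	return box_return_mp(ctx, buf, buf + len);
}
#endif
```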
-
Vladislav Shpilevoy authored
Port_c is a new descendant of struct port. It is now used by public C functions to store their result. Currently they can return only a tuple, but that will change soon: they will be able to return arbitrary MessagePack. Port_tuple is not removed, because it is still used for box_select(), for functional indexes, and in SQL as a base for port_sql, although that may be changed later. Functional indexes really need only a single MessagePack object from their function, while box_select() working via port_tuple or port_c didn't show any significant difference in micro benchmarks. Part of #4641
-
- Apr 27, 2020
-
-
Serge Petrenko authored
Since the introduction of transaction boundaries in the replication protocol, appliers advance replicaset.applier.vclock to the lsn of the first row in an arrived batch. This is enough and doesn't lead to errors when replicating from instances that respect transaction boundaries (instances with version 2.1.2 and up). However, if there's a 1.10 instance in a 2.1.2+ cluster, it sends every single tx row as a separate transaction, breaking the comparison with replicaset.applier.vclock and making the applier apply part of the changes it has already applied when processing a full transaction coming from another 2.x instance. Such behaviour leads to ER_TUPLE_FOUND errors in the scenario described above. To guard against such cases, advance replicaset.applier.vclock to the lsn of the last row in a tx. Closes #4924 Reviewed-by:
Cyrill Gorcunov <gorcunov@gmail.com>
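The difference between tracking the first and the last row's lsn can be shown with a toy applier model (illustrative names; the real logic lives in the applier code). Advancing the known lsn to the last row of the batch lets a later, 1.10-style single-row re-send of the same data be recognized as a duplicate.

```c
#include <assert.h>
#include <stdint.h>

struct row {
	int64_t lsn;
};

/* Apply a batch of rows that form one transaction, returning the new
 * known lsn. *applied counts rows that were actually applied, i.e.
 * not skipped as already-seen duplicates. */
static int64_t
apply_tx(const struct row *rows, int n, int64_t known_lsn, int *applied)
{
	for (int i = 0; i < n; i++) {
		if (rows[i].lsn > known_lsn)
			(*applied)++;
	}
	/* Follow the vclock to the lsn of the *last* row in the tx. */
	int64_t last = rows[n - 1].lsn;
	return last > known_lsn ? last : known_lsn;
}
```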
-
Roman Khabibov authored
The function implementing comparison during the VDBE sorting routine (sqlVdbeCompareMsgpack) did not take values of boolean type into account in some cases. Let's fix it so that booleans always precede numbers when sorted in ascending order. Closes #4697
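The intended ordering can be modeled with a simplified comparator (an illustrative stand-in for sqlVdbeCompareMsgpack, covering only booleans and numbers): in ascending order any boolean precedes any number, and false precedes true.

```c
#include <assert.h>

enum mval_type { MVAL_BOOL, MVAL_NUM };

struct mval {
	enum mval_type type;
	double num;	/* used when type == MVAL_NUM */
	int boolean;	/* used when type == MVAL_BOOL */
};

static int
mval_cmp(const struct mval *a, const struct mval *b)
{
	/* Booleans sort before numbers regardless of the values. */
	if (a->type != b->type)
		return a->type == MVAL_BOOL ? -1 : 1;
	if (a->type == MVAL_BOOL)
		return a->boolean - b->boolean;	/* false < true */
	return (a->num > b->num) - (a->num < b->num);
}
```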
-
- Apr 24, 2020
-
-
Cyrill Gorcunov authored
The notification of a wait variable shall be done with the bound mutex locked. Otherwise the results are not guaranteed (see the pthread manuals). Thus when we create a new endpoint via cbus_endpoint_create() and there is another thread sleeping inside cpipe_create(), we should notify the sleeper under cbus.mutex. Fixes #4806 Reported-by:
Alexander Turenko <alexander.turenko@tarantool.org> Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com>
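The fix follows the classic pthread pattern: change the predicate and signal while holding the same mutex the waiter uses. A standalone sketch (the names, such as endpoint_ready, are illustrative, not the actual cbus fields):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool endpoint_ready = false;

/* The sleeper (cpipe_create's role in the sketch). */
static void *
waiter_f(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	while (!endpoint_ready)	/* guard against spurious wakeups */
		pthread_cond_wait(&cond, &lock);
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* The notifier (cbus_endpoint_create's role). Signaling without the
 * mutex could race with the predicate check and lose the wakeup. */
static void
notify_endpoint_created(void)
{
	pthread_mutex_lock(&lock);	/* the fix: signal under the mutex */
	endpoint_ready = true;
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
}
```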
-
Leonid Vasiliev authored
The cbus hang test uses glibc pthread mutex implementation details. The reason why the mutex implementation details are used: "For the bug to reproduce, the canceled thread must be canceled while processing cpipe_flush_cb. We need to synchronize the main thread and the canceled worker thread for that. So, thread synchronization has been implemented by means of the endpoint mutex's internal field (__data.__lock)." Therefore, the test should not be compiled when another pthread library is used.
-
- Apr 21, 2020
-
-
Olga Arkhangelskaia authored
During refactoring of the say module in commit 5db765a7 (say: fix non-informative error messages for log cfg) the syslog format was broken. Closes #4785
-
- Apr 20, 2020
-
-
Kirill Yukhin authored
-
Nikita Pettik authored
If accessing a previous error doesn't come along with incrementing its reference counter, it may lead to a use-after-free bug. Consider the following scenario:
_, err = foo()     -- foo() returns a diagnostic error stack
preve = err.prev   -- err.prev ref. counter == 1
err:set_prev(nil)  -- err.prev ref. counter == 0, so err.prev is destroyed
preve              -- accessing already freed memory
To avoid that, let's increment the reference counter of the .prev member when error.prev is accessed and set the corresponding gc finalizer (error_unref()). Closes #4887
-
Nikita Pettik authored
There's no overflow check while incrementing an error's reference counter in error_ref(). Meanwhile, stubborn users still may achieve an overflow: each call of box.error.last() increments the reference counter of the error residing in the diagnostic area. As a result, 2^32 calls of box.error.last() in a row will lead to a counter overflow and, hence, to unpredictable results. Let's fix it and introduce a simple check in error_ref().
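A standalone sketch of such a guard (the struct is an illustrative stand-in for struct error; the real patch could equally panic instead of saturating): with the check in place, 2^32 increments can no longer wrap the counter back to zero and trigger a premature destruction.

```c
#include <assert.h>
#include <stdint.h>

struct error_model {
	uint32_t refs;
};

static void
error_ref(struct error_model *e)
{
	/* Guard: never wrap around. Saturate at the maximum here;
	 * an assert/panic would also stop silent corruption. */
	if (e->refs == UINT32_MAX)
		return;
	e->refs++;
}
```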
-
Kirill Yukhin authored
Fix a clash in struct names detected by LTO.
-
Kirill Yukhin authored
Make older compilers happy w.r.t. initialization of locals.
-
Igor Munkin authored
Fixes the regression from 335f80a0 ('test: adjust luajit-tap testing machinery').
-
Leonid Vasiliev authored
Fix a possible overlap of IPROTO_ERROR by IPROTO_ERROR_24. This was possible because the messages are transmitted in a map and the order is not defined: IPROTO_ERROR_24 could be parsed after IPROTO_ERROR and could throw it away.
-
Vladislav Shpilevoy authored
IPROTO_ERROR in fact is not an error. It is an error message. Secondly, this key is deprecated in favor of IPROTO_ERROR_STACK, which contains all attributes of the whole error stack. It uses the MP_ERROR MessagePack extension for that. So IPROTO_ERROR is renamed to IPROTO_ERROR_24 (similar to how the old call was renamed to IPROTO_CALL_16). IPROTO_ERROR_STACK becomes the new IPROTO_ERROR. Follow up #4398
-
Vladislav Shpilevoy authored
After error objects marshaling was implemented in #4398, there were essentially 2 versions of the marshaling - when an error is sent inside a response body, and when it is thrown and is encoded in the iproto fields IPROTO_ERROR and IPROTO_ERROR_STACK. It is not really useful to have 2 implementations of the same feature. This commit drops the old iproto error encoding (its IPROTO_ERROR_STACK part) and makes it reuse the common error encoder. Note that the encoder skips the MP_EXT header. This is because:
* the header is not needed - the error is encoded as the value of the IPROTO_ERROR_STACK key, so it is known this is an error. MP_EXT is needed only when the type is unknown on the decoding side in advance;
* old clients may not expect MP_EXT in iproto fields. That is the case of the netbox connector, at least.
Follow up #4398 @TarantoolBot document Title: Stacked diagnostics binary protocol
Stacked diagnostics is described in detail in https://github.com/tarantool/doc/issues/1224. This commit changes nothing except the binary protocol. The old protocol should not be documented anywhere. `IPROTO_ERROR_STACK` is still 0x52, but the format of its value is different now. It looks exactly like an `MP_ERROR` object, without the `MP_EXT` header.
```
IPROTO_ERROR_STACK: <MP_MAP> {
    MP_ERROR_STACK: <MP_ARRAY> [
        <MP_MAP> {
            ... <all the other fields of MP_ERROR> ...
        },
        ...
    ]
}
```
It is easy to see that the key `IPROTO_ERROR_STACK` is called 'stack', and `MP_ERROR_STACK` is also 'stack'. So it may be good to rename the former key in the documentation. For example, the old `IPROTO_ERROR` can be renamed to `IPROTO_ERROR_24` and `IPROTO_ERROR_STACK` can be renamed to just `IPROTO_ERROR`.
-
Vladislav Shpilevoy authored
C struct error objects can be created directly only in C. The C side increments their reference counter when it pushes them onto the Lua stack. That is not going to be so convenient soon. The error_unpack() function will be used in netbox to decode error objects via Lua FFI. Such an error object will have 0 refs and no Lua GC callback established, because it won't be pushed onto the Lua stack naturally, from Lua C. To keep such errors alive, their reference counter will be incremented and error_unref() will be set as the GC callback. Follow up for #4398
-
Leonid Vasiliev authored
Co-authored-by:
Vladislav Shpilevoy <v.shpilevoy@tarantool.org> Closes #4398 @TarantoolBot document Title: Error objects encoding in MessagePack
Until now an error sent over IProto, or serialized into MessagePack, was turned into a string consisting of the error message. As a result, all other error object attributes were lost, including the type of the object. On the client side, seeing a string, it was not possible to tell whether the string is a real string or a serialized error. To deal with that, the error objects encoding is reworked from scratch. Now, when the session setting `error_marshaling_enabled` is true, all fibers of that session will encode error objects as a new MP_EXT type - MP_ERROR (0x03).
```
+--------+----------+========+
| MP_EXT | MP_ERROR | MP_MAP |
+--------+----------+========+

MP_ERROR: <MP_MAP> {
    MP_ERROR_STACK: <MP_ARRAY> [
        <MP_MAP> {
            MP_ERROR_TYPE: <MP_STR>,
            MP_ERROR_FILE: <MP_STR>,
            MP_ERROR_LINE: <MP_UINT>,
            MP_ERROR_MESSAGE: <MP_STR>,
            MP_ERROR_ERRNO: <MP_UINT>,
            MP_ERROR_CODE: <MP_UINT>,
            MP_ERROR_FIELDS: <MP_MAP> {
                <MP_STR>: ...,
                <MP_STR>: ...,
                ...
            },
            ...
        },
        ...
    ]
}
```
On the top level there is a single key: `MP_ERROR_STACK = 0x00`. More keys can be added in the future, and a client should ignore all unknown keys to keep compatibility with new versions. Every error in the stack is a map with the following keys:
* `MP_ERROR_TYPE = 0x00` - error type. This is what is visible in the `<error_object>.base_type` field;
* `MP_ERROR_FILE = 0x01` - file name from `<error_object>.trace`;
* `MP_ERROR_LINE = 0x02` - line from `<error_object>.trace`;
* `MP_ERROR_MESSAGE = 0x03` - error message from `<error_object>.message`;
* `MP_ERROR_ERRNO = 0x04` - errno saved at the moment of the error creation. Visible in `<error_object>.errno`;
* `MP_ERROR_CODE = 0x05` - error code. Visible in `<error_object>.code` and in the C function `box_error_code()`;
* `MP_ERROR_FIELDS = 0x06` - additional fields depending on the error type. For example, the AccessDenied error type stores here the fields `access_type`, `object_type`, `object_name`. Connector code should ignore unknown keys met here and be ready that for some existing errors new fields can be added and old ones dropped.
-
Vladislav Shpilevoy authored
Lua C module 'msgpack' supports registration of custom extension decoders for MP_EXT values. That is needed to make 'msgpack' not depend on any modules which use it. So far the only box-related extension was tuples - struct tuple cdata needed to be encoded as an array. That is going to change in the next commits, where struct error cdata appears, also depending on box. So the decoder can't be located in src/box/lua/tuple.c. It is moved to a more common place - src/box/lua/init.c. Needed for #4398
-
Leonid Vasiliev authored
We want to have transparent marshalling of errors through net.box. To do this, we need to recreate the error on the client side with the same parameters as on the server. For convenience, we update the AccessDeniedError constructor, which takes pointers to static strings, and add an XlogGapError constructor that does not require a vclock. Needed for #4398
-
Leonid Vasiliev authored
Errors are encoded as a string when serialized to MessagePack to be sent over IProto, or when just saved into a buffer via the Lua modules msgpackffi and msgpack. That is not very useful on the client side, because most of the error metadata is lost: code, type, trace - everything except the message. The next commits are going to dedicate a new MP_EXT type to error objects, so that everything can be encoded and the client side can restore the types. But this is a breaking change in case some users use old connectors while working with newer Tarantool instances. So to smooth the upgrade there is a new session setting - 'error_marshaling_enabled'. By default it is false. When it is true, all fibers of the given session will serialize error objects as MP_EXT. Co-authored-by:
Vladislav Shpilevoy <v.shpilevoy@tarantool.org> Needed for #4398
-