Commits · e12930d06a5b85f9b22482db9534f317b2fc5082 · core / tarantool

Dec 05, 2019

luajit: bump a new version · e12930d0

Kirill Yukhin authored 5 years ago

List of changes
     - fold: keep type of emitted CONV in sync with its mode
     - test: adjust the test name related to PAIRSMM flag

e12930d0

build: enables LUAJIT_ENABLE_PAIRSMM by default · 2110213c

Olga Arkhangelskaia authored 5 years ago

Turns on LUAJIT_ENABLE_PAIRSMM flag for tarantool build.
Now __pairs/__ipairs metamethods are available.

Closes #4650

2110213c

build: add Fedora 31 into CI / CD · 9e09b07c

Alexander V. Tikhonov authored 5 years ago

Added build + test jobs in GitLab-CI and build + test + deploy jobs on
Travis-CI for Fedora 31.

Updated testing dependencies in the RPM spec to follow the new Python 2
package naming scheme that was introduced in Fedora 31: it uses the
'python2-' prefix rather than 'python-'.

Fedora 31 does not provide python2-gevent and python2-greenlet packages,
so they were pushed to https://packagecloud.io/packpack/backports
repository. This repository is enabled in our build image
(packpack/packpack:fedora-31) by default. Those dependencies are build-time,
so nothing was changed for a user. The source RPM packages were gathered
from https://rpms.remirepo.net/rpmphp/.

Closes #4612

Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>

9e09b07c

test: update test-run · 5fccf003

Alexander Turenko authored 5 years ago

Strengthen test_run:cmd() against temporary connection failures (#193).

We recently added the 'replication/box_set_replication_stress' test, which
may exceed the file descriptor limit. When test_run:cmd() executes a
command ('switch master' in this case), it tries to create a new socket
and connect it to test-run's inspector, but this may fail because of the
file descriptor limit.

The sockets that the test produces are closed in the background, so if we
keep trying to create and connect a socket, we will eventually succeed.
This is exactly what the test-run patch does: test_run:cmd() now fails
only if a socket cannot be connected within 100 seconds.

I guess the reason why sockets are not closed immediately is that a
relay waits until the replica closes its side of a socket and only then
closes its own side. I didn't investigate it deeper, to be honest.

5fccf003

Dec 03, 2019

json: fix stack-use-after-scope in json_decode() · 6508ddb7

Maria authored 5 years ago

Inside json_decode() struct luaL_serializer is allocated on stack, but
json context stores pointer to it:

    static int json_decode(lua_State *l)
    {
    ...
        if (lua_gettop(l) == 2) {
            struct luaL_serializer user_cfg = *luaL_checkserializer(l);
            luaL_serializer_parse_options(l, &user_cfg);
            lua_pop(l, 1);
            json.cfg = &user_cfg;
        }

Later (for instance, in json_decode_descend()) the pointer can be
dereferenced, which results in a stack-use-after-scope: the object turns
into garbage as soon as its scope ends. To fix this, let's simply avoid
allocating and copying luaL_serializer on the stack and use a pointer
to it instead.
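The lifetime problem can be reduced to a small C sketch. All names here (`struct serializer_cfg`, `struct parse_ctx`, `ctx_select_cfg`) are hypothetical stand-ins, not Tarantool's code, and the fixed shape is only one possible way to keep the pointer valid, not necessarily what the patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for luaL_serializer and the JSON parse context. */
struct serializer_cfg {
	int max_depth;
};

struct parse_ctx {
	const struct serializer_cfg *cfg;
};

/*
 * Buggy shape (do not use): the copy lives only inside the braces, so
 * ctx->cfg dangles as soon as the block ends.
 *
 *	if (user != NULL) {
 *		struct serializer_cfg copy = *user;
 *		ctx->cfg = &copy;
 *	}
 *
 * Safe shape: store a pointer to an object that outlives the context.
 */
static void
ctx_select_cfg(struct parse_ctx *ctx, const struct serializer_cfg *defaults,
	       const struct serializer_cfg *user)
{
	assert(ctx != NULL && defaults != NULL);
	ctx->cfg = (user != NULL) ? user : defaults;
}
```

Both candidate configs are owned by the caller here, so the stored pointer stays valid for as long as the caller keeps them alive.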

The bug was found by ASAN: the test app-tap/json.test.lua fails when
ASAN is enabled. With this fix, all tests pass.

Thanks to @Korablev77 for the initial investigation.

Closes #4637

6508ddb7

Dec 02, 2019

Revert "test: update test-run" · 4acdeeda

Alexander Turenko authored 5 years ago

This reverts commit a0b196dd.

This commit was pushed accidentally and points to a draft commit in
the test-run repository. See also
https://github.com/tarantool/test-run/issues/195

4acdeeda

test: stabilize quorum test conditions · f6775e86

Ilya Kosarev authored 5 years ago

There were some pass conditions in quorum test which could take some
time to be satisfied. Now they are wrapped using test_run:wait_cond to
make the test stable.

Closes #4586

f6775e86

replication: make anon replicas iteration safe · 6f038f4b

Ilya Kosarev authored 5 years ago

In replicaset_follow we iterate over the anon replicas list: the list of
replicas that haven't received a UUID yet. On a successful connect the
replica link is removed from the anon list. If that happens immediately,
without a yield in the applier, the iteration breaks. This is fixed by
using rlist_foreach_entry_safe instead of the plain rlist_foreach_entry.
A relevant test case is added.
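Why the "safe" variant matters can be shown with a minimal circular list. This is a hypothetical simplified list written for illustration, not Tarantool's rlist; the only point is that the successor is saved before the loop body may unlink the current node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal circular list, loosely modelled on an intrusive list. */
struct node {
	struct node *prev, *next;
	int connected;	/* non-zero: the node must leave the "anon" list */
};

static void
list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void
list_add_tail(struct node *head, struct node *item)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Unlink the item and make it point at itself, like a fresh empty list. */
static void
list_del(struct node *item)
{
	assert(item->prev != NULL && item->next != NULL);
	item->prev->next = item->next;
	item->next->prev = item->prev;
	item->prev = item->next = item;
}

/*
 * Unlink every connected node. The successor is saved in tmp before the
 * body runs, so unlinking cur cannot derail the walk. With a plain
 * "cur = cur->next" step the loop would spin on the unlinked node.
 * Returns how many nodes were removed.
 */
static int
list_remove_connected(struct node *head)
{
	int removed = 0;
	struct node *cur, *tmp;
	for (cur = head->next, tmp = cur->next; cur != head;
	     cur = tmp, tmp = cur->next) {
		if (cur->connected) {
			list_del(cur);
			removed++;
		}
	}
	return removed;
}
```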

Part of #4586
Closes #4576
Closes #4440

6f038f4b

replication: fix appliers pruning · 36ff3c89

Ilya Kosarev authored 5 years ago

During pruning of appliers some anon replicas might connect
from replicaset_follow called in another fiber. Therefore we need to
prune the appliers of anon replicas first and, moreover, prune them one
by one instead of iterating over them, since any of them might connect
while we are stopping another one, which would break the iteration.
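A sketch of the "one by one" idea, under the assumption that stopping one entry may unlink other entries from the same list. The types and the `stop_drops_neighbour` hook are hypothetical; the loop re-reads the list head on every round instead of caching a next pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical applier placeholder kept in a singly linked list. */
struct applier_stub {
	struct applier_stub *next;
	int stopped;
};

/* Hypothetical hook: stopping one entry may unlink other entries. */
typedef void (*stop_cb)(struct applier_stub **list, struct applier_stub *a);

/* Pop and stop entries until the list is empty; returns the number stopped. */
static int
prune_all(struct applier_stub **list, stop_cb stop)
{
	int pruned = 0;
	assert(list != NULL && stop != NULL);
	/* Re-read the head every round: no iterator survives a stop() call. */
	while (*list != NULL) {
		struct applier_stub *a = *list;
		*list = a->next;
		a->next = NULL;
		stop(list, a);
		a->stopped = 1;
		pruned++;
	}
	return pruned;
}

/* Example hook: while one entry is stopped, the next one "connects" and leaves. */
static void
stop_drops_neighbour(struct applier_stub **list, struct applier_stub *a)
{
	(void)a;
	if (*list != NULL) {
		struct applier_stub *gone = *list;
		*list = gone->next;
		gone->next = NULL;
	}
}
```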

Part of #4586
Closes #4643

36ff3c89

test: update test-run · a0b196dd

Ilya Kosarev authored 5 years ago

Stabilize tcp_connect in test_run:cmd() (tarantool/test-run#193)

a0b196dd

Nov 27, 2019

refactoring: remove try..catch wrapper around trigger->run · 43f2c359

Ilya Kosarev authored 5 years ago

Triggers don't throw exceptions any more. Now they have
return codes to report errors.

Closes #4247

43f2c359

refactoring: remove redundant line in txn_alter_trigger_new · 0a4079fc

Ilya Kosarev authored 5 years ago

Since refactoring: clear privilege managing triggers from exceptions
(977fca29) we zero the trigger struct with memset in
txn_alter_trigger_new. This means we no longer need to set any field of
this struct to NULL explicitly.

Part of #4247

0a4079fc

refactoring: update comment for index_def_check_tuple · 7b23b827

Ilya Kosarev authored 5 years ago

Originally the index_def_check_tuple comment said that it throws a nice
error. Since refactoring: remove exceptions from
index_def_new_from_tuple (90ac0037)
it returns an error instead. Now this is clearly stated in the comment.

Part of #4247

7b23b827

refactoring: clear triggers from fresh exceptions · b4ddb4a0

Ilya Kosarev authored 5 years ago

Clear triggers from freshly occurred exceptions. Trivial replacements:
`diag_raise` by `return -1`, _xc functions by their non-_xc versions.
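The return-code convention can be sketched in plain C as follows. The diagnostics area and all function names (`diag_set_msg`, `find_space_id`, `on_replace_stub`) are hypothetical simplifications, not Tarantool's API: a callee records the error and returns -1, and the caller propagates the code instead of throwing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified diagnostics area holding the last error message. */
static char last_error[128];

static void
diag_set_msg(const char *msg)
{
	snprintf(last_error, sizeof(last_error), "%s", msg);
}

/* Non-throwing lookup: 0 on success, -1 with the diagnostics set on failure. */
static int
find_space_id(const char *name, int *id_out)
{
	if (strcmp(name, "users") == 0) {
		*id_out = 512;
		return 0;
	}
	diag_set_msg("space not found");
	return -1;
}

/* A trigger-like caller propagates the code instead of raising. */
static int
on_replace_stub(const char *space_name)
{
	int id;
	if (find_space_id(space_name, &id) != 0)
		return -1;
	assert(id > 0);
	return 0;
}
```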

Part of #4247

b4ddb4a0

refactoring: set diagnostics if sequence_by_id fails · cc6f68d2

Ilya Kosarev authored 5 years ago

In refactoring: use non _xc version of functions in triggers
(b75d5f85) sequence_cache_find was
replaced by sequence_by_id. It led to the loss of diagnostics in case
of sequence_by_id failure. Now it is fixed.

Part of #4247

cc6f68d2

refactoring: recombine error conditions in triggers · 8d66e638

Ilya Kosarev authored 5 years ago

Some error conditions in triggers and underlying functions were
combined to look better. On the other hand, in
on_replace_dd_fk_constraint we now return an error immediately if the
child space was not found, instead of searching for both child and
parent spaces before inspecting the search results.

Part of #4247

8d66e638

refactoring: specify struct name in allocation diagnostics · bfe2a287

Ilya Kosarev authored 5 years ago

In case of allocation problems in region alloc we were setting
diagnostics using the "new slab" stub. Now we specify the concrete name
of the struct that was going to be allocated.

Part of #4247

bfe2a287

refactoring: wrap new operator calls in triggers · aa2e0987

Ilya Kosarev authored 5 years ago

The std operator new might throw, so we need to wrap it in triggers to
keep the triggers non-throwing. It also means alter_space_move_indexes
returns an error code now. Its usages are updated.

Part of #4247

aa2e0987

sql: fix decode of boolean binding value · abc08ca7

Nikita Pettik authored 5 years ago

Some time ago, when there was no support for the boolean type in SQL,
boolean values passed as parameters to be bound were converted to integer
values 0 and 1. This takes place in lua_sql_bind_decode(). However, now
we can avoid this conversion and store booleans as booleans. Note that
the patch does not include a test case, since the type of the value is
preserved correctly, so when a binding is extracted from struct sql_bind
it will be assigned the right value.

abc08ca7

Nov 26, 2019

iproto: don't destroy a session during disconnect · 6da9d395

Vladislav Shpilevoy authored 5 years ago

A yield in a binary session's disconnect trigger could lead to a use
after free of the session object. That happened because the iproto
thread sent two requests to the TX thread at disconnect:

    - Close the session and run its on disconnect triggers;

    - If all requests are handled, destroy the session.

When a connection is idle, all requests are handled, so both these
requests are sent. If the first one yielded in TX thread, the
second one arrived and destroyed the session right under the feet
of the first one.

This can be solved in two ways - in TX thread, and in iproto
thread.

Iproto thread solution (which is chosen in the patch): just don't
send destroy request until disconnect returns back to iproto
thread.

TX thread solution (alternative): add a flag which says whether
disconnect is processed by TX. When destroy request arrives, it
checks the flag. If disconnect is not done, the destroy request
waits on a condition variable until it is.

The iproto solution is a bit trickier to implement, but it looks more
correct.
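The chosen ordering can be modelled as a tiny state machine on the iproto side. The struct, its flags and the three event handlers are hypothetical and only illustrate the rule "no destroy request while a disconnect is still being processed by TX"; the real patch is not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical iproto-side connection state; field names are illustrative. */
struct conn_stub {
	bool disconnect_in_flight;	/* disconnect sent to TX, not yet returned */
	bool destroy_wanted;		/* all requests handled, destroy is due */
	bool destroy_sent;		/* destroy request handed to TX */
};

/* The connection is closed: ask TX to run the on-disconnect triggers. */
static void
conn_on_close(struct conn_stub *c)
{
	c->disconnect_in_flight = true;
}

/* All requests are handled: destroy now, or defer until disconnect is back. */
static void
conn_on_idle(struct conn_stub *c)
{
	c->destroy_wanted = true;
	if (!c->disconnect_in_flight)
		c->destroy_sent = true;
}

/* TX has finished the disconnect (triggers included) and replied to iproto. */
static void
conn_on_disconnect_done(struct conn_stub *c)
{
	assert(c->disconnect_in_flight);
	c->disconnect_in_flight = false;
	if (c->destroy_wanted)
		c->destroy_sent = true;
}
```

For an idle connection the sequence close, idle, disconnect-done sends the destroy request only at the last step, so a yielding trigger never sees a destroyed session.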

Closes #4627

6da9d395

Nov 22, 2019

luajit: bump a new version · 5d2105bf

Kirill Yukhin authored 5 years ago

Add LUAJIT_ENABLE_PAIRSMM flag as a build option for luajit.
If the flag is set, pairs/ipairs metamethods are available in
Lua 5.1.
For Tarantool this option is enabled by default.

5d2105bf

Nov 21, 2019

build: fix warning re comparison of enum and uint · 2afbe263

Vladislav Shpilevoy authored 5 years ago


The warning is observed when tarantool is compiled by GCC 9.1.0.

Warnings are treated as errors during a debug build or when the
-DENABLE_WERROR=ON option is passed to cmake, which is usual for our
testing jobs in CI.

The commit that introduces the problem is
3a8adccf ('access: fix invalid error
type for not found user').

Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>

2afbe263

replication: use empty password by default · 6c01ca48

Vladislav Shpilevoy authored 5 years ago

Replication's applier encoded an auth request with exactly the
same parameters as extracted by the URI parser. I.e. when no
password was specified, the parser returned it as NULL, and it was
not encoded. The relay, having received such an auth request,
complained that the IPROTO_TUPLE field (the password) is not specified.

Such an error is confusing: the user didn't do anything illegal, they
just used a URI like 'login@host:port', without a password after the
login.

The patch makes the applier use an empty string as a default
password.

An alternative was to force a user to always set a password, even if
it is an empty string, like this: 'login:@host:port'. And if a
password was not found in an auth request, then reject it with a
password mismatch error. But in that case a URI of the kind
'login@host:port' becomes useless: it can never pass. In
addition, netbox already uses an empty string as a default
password. So the only way to make it consistent without breaking
anything is to repeat the netbox logic for replication URIs.
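The default itself is a one-line decision. A minimal sketch, assuming a hypothetical `struct parsed_uri` in which a missing password is reported as NULL:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of parsing 'login[:password]@host:port'. */
struct parsed_uri {
	const char *login;
	const char *password;	/* NULL when the URI has no ':password' part */
};

/* Password to encode into the auth request: never NULL once a login is set. */
static const char *
auth_password(const struct parsed_uri *uri)
{
	assert(uri != NULL && uri->login != NULL);
	return uri->password != NULL ? uri->password : "";
}
```

With this, 'login@host:port' and 'login:@host:port' produce the same auth request.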

Closes #4605

Conflicts:
	test/replication/suite.cfg

6c01ca48

replication: show errno in replication info · 691715b5

Vladislav Shpilevoy authored 5 years ago

box.info.replication shows the latest error message of an applier/relay.
But it didn't include the errno description for system errors, even
though it was included in the logs. Now box.info shows the errno
description as well, when possible.

Closes #4402

Conflicts:
	test/replication/suite.cfg

691715b5

error: move errno into an error object · 22bbb34f

Vladislav Shpilevoy authored 5 years ago

The only error type having an errno as a part of it was
SystemError (and its descendants SocketError, TimedOut, OOM, ...).
That was used in logs (the SystemError::log() method), and exposed to
Lua (if the type was SystemError, an error object had an 'errno' field).

But actually errno might be useful not only there. For example,
box.info.replication exposes the latest error message of
applier/relay as 'message' field of 'upstream/downstream' fields,
lacking errno description.

Before the patch it was impossible to obtain an errno code from C,
because it was necessary to check whether an error has SystemError
type, cast to SystemError class, and call SystemError::get_errno()
method.

Now errno is available as a part of struct error object (available
from C), and is not 0 for system errors.
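A rough C illustration of what storing errno in the error object enables. `struct error_stub`, its `saved_errno` field and `error_format` are hypothetical names chosen for this sketch; only `strerror` and `snprintf` are real library calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, trimmed-down error object with errno stored inline. */
struct error_stub {
	int saved_errno;	/* 0 for non-system errors */
	char message[128];
};

/* Append the errno description to the message only for system errors. */
static void
error_format(const struct error_stub *e, char *buf, size_t size)
{
	assert(size > 0);
	if (e->saved_errno != 0)
		snprintf(buf, size, "%s: %s", e->message,
			 strerror(e->saved_errno));
	else
		snprintf(buf, size, "%s", e->message);
}
```

No type check or cast is needed: any C code that holds the error object can read the code directly.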

Part of #4402

22bbb34f

access: fix invalid error type for not found user · 3a8adccf

Vladislav Shpilevoy authored 5 years ago

box.session.su() raised 'SystemError' when a user was not found
due to a too long user name. That was obviously wrong, because
SystemError is always something related to libraries (standard,
curl, etc), and it has an errno code.

Now a ClientError is raised.

3a8adccf

func: fix use after free on function unload · fa2893ea

Vladislav Shpilevoy authored 5 years ago

Functions are stored in lists inside module objects. Module
objects are stored in a hash table, where the key is a package name.
But the key was a pointer into one of the module's function definition
objects. Therefore, when that function was deleted, its freed
package name memory was still used as the hash key, and could be
accessed when another function was deleted.

Now a module does not use the memory of its functions and keeps a copy
of the package name.
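The ownership change can be sketched like this. `struct module_stub`, `module_new` and `module_delete` are hypothetical names for illustration; the point is that the module allocates and frees its own copy of the key.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical module object that owns its key instead of borrowing it. */
struct module_stub {
	char *package;	/* module's own copy, usable as the hash key */
};

/* Create a module with a private copy of the package name; NULL on OOM. */
static struct module_stub *
module_new(const char *package)
{
	assert(package != NULL);
	struct module_stub *m = malloc(sizeof(*m));
	if (m == NULL)
		return NULL;
	size_t len = strlen(package) + 1;
	m->package = malloc(len);
	if (m->package == NULL) {
		free(m);
		return NULL;
	}
	memcpy(m->package, package, len);
	return m;
}

/* Free the module together with its copy of the name. */
static void
module_delete(struct module_stub *m)
{
	free(m->package);
	free(m);
}
```

The caller's string can be freed right after `module_new` returns, which is exactly the situation that used to leave a dangling hash key.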

fa2893ea

app/fiber: wait till a full event loop iteration ends · 7990d1fa

Serge Petrenko authored 5 years ago

fiber.top() fills in statistics every event loop iteration,
so if it was just enabled, fiber.top() returns zero in fiber cpu
usage statistics because the total time consumed by the main thread was
not yet accounted for.
The same holds for viewing top() results for a freshly created fiber:
its metrics will be zero since it hasn't lived a full ev loop iteration
yet.
Fix this by delaying the test till top() results are meaningful and add
minor refactoring.

Follow-up #2694

7990d1fa

fiber.top(): alter exponential moving average calculation · e5a3c090

Serge Petrenko authored 5 years ago

When fiber EMA is 0 and first non-zero observation is added to it, we assumed
that EMA should be equal to this observation (i.e. average value should
be the same as the observed one). This breaks the following invariant:
sum of clock EMAs of all fibers equals clock EMA of the thread.
If one of the fibers is just spawned and has a big clock delta, it
will assign this delta to its EMA, while the thread will calculate the
new EMA as 15 * EMA / 16 + delta / 16, which may lead to a situation
when fiber EMA is greater than cord EMA.
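The invariant violation is easy to reproduce with integer arithmetic. The sketch below uses the update formula quoted above; the function names are hypothetical, and it assumes that after the fix a fiber uses the same formula as the thread, which the message implies but does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Weighting from the commit text: new = 15 * old / 16 + delta / 16. */
static uint64_t
ema_update(uint64_t ema, uint64_t delta)
{
	return 15 * ema / 16 + delta / 16;
}

/* Old fiber-side rule described above: a zero EMA jumps straight to delta. */
static uint64_t
ema_update_old(uint64_t ema, uint64_t delta)
{
	return ema == 0 ? delta : ema_update(ema, delta);
}

/* Does a single fresh fiber stay within the thread's EMA after one tick? */
static int
fiber_within_thread(uint64_t fiber_ema, uint64_t thread_ema)
{
	assert(thread_ema > 0);
	return fiber_ema <= thread_ema;
}
```

With a first delta of 1600 clocks the thread EMA becomes 100, while the old rule gives the new fiber 1600, sixteen times more than the whole thread.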

This caused occasional test failures:
```
[001] Test failed! Result content mismatch:
[001] --- app/fiber.result	Mon Nov 18 17:00:48 2019
[001] +++ app/fiber.reject	Mon Nov 18 17:33:10 2019
[001] @@ -1511,7 +1511,7 @@
[001]  -- not exact due to accumulated integer division errors
[001]  sum_avg > 99 and sum_avg < 101 or sum_avg
[001]  ---
[001] -- true
[001] +- 187.59585601717
[001]  ...
[001]  tbl = nil
[001]  ---

```

Follow-up #2694

e5a3c090

fiber.top() refactor clock and cpu time calculation · 1743d0a4

Serge Petrenko authored 5 years ago

Unify all the members related to fiber's clock statistics into struct
clock_stat and all the members related to cord's knowledge of cpu state
and clocks to struct cpu_stat.
Reset stats of all alive fibers on fiber.top_enable().

Follow-up #2694

1743d0a4

Nov 15, 2019

test: ensure instances are stopped in tctl test · 8d363c43

Alexander Turenko authored 5 years ago


The problem appeared after 6c627af3
('test: tarantoolctl: verify delayed box.cfg()'), where the test case
was changed so that it no longer expects an error at the instance start.
So we need to stop the instance to prevent instances from staying alive
after `make test`.

Fixes #4600.
Reviewed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>

8d363c43

Nov 14, 2019

app/argparse: expect no value for a boolean option · e47f2c91

Alexander Turenko authored 5 years ago


Before commit 03f85d4c ('app: fix
boolean handling in argparse module') the module did not expect a value
after a 'boolean' argument. However, there was a problem: a 'boolean'
argument could be passed only at the end of an argument list, otherwise
it wrongly consumed the next argument and gave a confusing error message.

The mentioned commit fixed this behaviour in the following way: it still
allows passing a 'boolean' argument at the end of the list w/o a value,
but requires a value ('true', 'false', '1', '0'), provided using
{'--foo=true'} or {'--foo', 'true'} syntax, if a 'boolean' argument is
not at the end.

Here this behaviour is changed: a 'boolean' argument does not expect an
explicitly passed value regardless of its position in an argument list.
If a 'boolean' argument appears in the list, then argparse.parse() returns
`true` for its value (a list of `true` values in case of a 'boolean+'
argument), otherwise it will not be added to the result.
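The rule "a boolean option never consumes the next argument" can be illustrated with a small C parser. This is a hypothetical sketch (`parse_bool_flag` is not part of the argparse module, which is written in Lua) that handles a single boolean flag and leaves everything else as positional arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical parser for a single boolean option. The option is true if it
 * appears anywhere in argv; the argument after it is never consumed.
 * All other arguments are copied to 'rest', and their count is returned.
 */
static int
parse_bool_flag(int argc, const char **argv, const char *flag,
		bool *value, const char **rest)
{
	int nrest = 0;
	assert(argv != NULL && rest != NULL && value != NULL);
	*value = false;
	for (int i = 0; i < argc; i++) {
		if (strcmp(argv[i], flag) == 0)
			*value = true;
		else
			rest[nrest++] = argv[i];
	}
	return nrest;
}
```

For the argument list {'cat', '--show-system', '00000000000000000000.snap'} the flag is set and the file name stays a positional argument.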

This change also makes the behaviour of long (--foo) and short (-f)
'boolean' options consistent.

The motivation of the change is simple: it is easier and more natural to
type, say, `tarantoolctl cat --show-system 00000000000000000000.snap`
than `tarantoolctl cat --show-system true 00000000000000000000.snap`.

This commit adds several new test cases, but it does not mean that we
guarantee that the module behaviour will not be changed around some
corner cases, say, handling of 'boolean+' arguments. This is an internal
module.

Follows up #4076.
Reviewed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>

e47f2c91

refactoring: remove exceptions from ck_constraint_def_new_from_tuple · e7c64d41

Ilya Kosarev authored 5 years ago

ck_constraint_def_new_from_tuple is used in
on_replace_dd_ck_constraint, therefore it has to be cleared from
exceptions. Now it doesn't throw any more. Its usages are updated.
Some _xc functions that are not needed any more are removed.

Part of #4247
Conflicts:
	src/box/alter.cc

e7c64d41

refactoring: remove exceptions from fk_constraint_check_dup_links · 6642a7f7

Ilya Kosarev authored 5 years ago

fk_constraint_check_dup_links is used in
on_replace_dd_fk_constraint, therefore it has to be cleared from
exceptions. Now it doesn't throw any more. Its usages are updated.

Part of #4247

6642a7f7

fix: don't request absent tuple field · 25aedb01

Ilya Kosarev authored 5 years ago

During the replacement of tuple_field_bool_xc with its non-xc version it
turned out that it might be called even if there are not enough fields
in the processed tuple. Now it is fixed.

Part of #4247

25aedb01

Nov 13, 2019

refactoring: remove exceptions from fk_constraint_def_new_from_tuple · 470ccf6c

Ilya Kosarev authored 5 years ago

fk_constraint_def_new_from_tuple is used in
on_replace_dd_fk_constraint, therefore it has to be cleared from
exceptions. Now it doesn't throw any more. It means we also need
to clear its subsidiary function, decode_fk_links, from exceptions.
Their usages are updated. Some _xc functions that are not needed any
more are removed.

Part of #4247

470ccf6c

refactoring: remove exceptions from coll_id_def_new_from_tuple · 716a9af7

Ilya Kosarev authored 5 years ago

coll_id_def_new_from_tuple is used in on_replace_dd_collation,
therefore it has to be cleared from exceptions. Now it doesn't
throw any more. Its usages are updated.

Part of #4247

716a9af7

refactoring: remove exceptions from user_def_new_from_tuple · 180163c0

Ilya Kosarev authored 5 years ago

user_def_new_from_tuple is used in on_replace_dd_user and
user_cache_alter_user, therefore it has to be cleared from
exceptions. Now it doesn't throw any more. It means we also need
to clear its subsidiary function, user_def_fill_auth_data, from
exceptions. Their usages are updated.

Part of #4247

180163c0

refactoring: remove exceptions from alter_space_new · 92b38e73

Ilya Kosarev authored 5 years ago

alter_space_new doesn't throw any more. Its usages are updated.

Part of #4247

92b38e73

refactoring: remove exceptions from user_has_data · f4c8c08a

Ilya Kosarev authored 5 years ago

user_has_data is used in on_replace_dd_user, therefore it has to be
cleared from exceptions. Now it doesn't throw any more. It means
we also need to clear its subsidiary function, space_has_data, from
exceptions. Their usages are updated.

Part of #4247

f4c8c08a