Commits · 01657bfbb9b34997f20d27405226a9affdeeb520 · core / tarantool

Apr 20, 2020

popen: always free resources in popen_delete() · 01657bfb

The function still sets a diagnostic and returns -1 when sending a
signal fails, but this result is purely informational: for logging or
so. This mirrors the notes about dealing with failures in Linux's
`man 2 close`:

 | Note, however, that a failure return should be used only for
 | diagnostic purposes <...> or remedial purposes <...>.
 |
 | <...> Linux  kernel always releases the file descriptor early in the
 | close operation, freeing it for reuse; the steps that may return an
 | error <...> occur only later in the close operation.
 |
 | Many other implementations similarly always close the file descriptor
 | <...> even if they subsequently report an error on return from
 | close(). POSIX.1 is currently silent on this point, but there are
 | plans to mandate this behavior in the next major release of the
 | standard.

When kill or killpg returns EPERM, a caller is usually unable to
overcome it: retrying is not sufficient here. So there is no need to
keep the handle: a caller gives up the handle and does not want to
perform any other operation on it. The popen engine does its best to
kill a child process or a process group, but when that is not possible,
it just sets a diagnostic and frees the handle resources anyway.
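
A minimal C sketch of the "report the failure, but free anyway" contract
described above. The `struct child_handle` type and the
`child_handle_delete()` function are hypothetical stand-ins, not
tarantool's actual popen code:

```c
#define _POSIX_C_SOURCE 200809L /* for kill() under strict ISO C modes */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for a popen handle. */
struct child_handle {
	pid_t pid;     /* pid of the spawned child */
	char *command; /* heap-allocated resource owned by the handle */
};

/*
 * Try to kill the child, then free the handle unconditionally.
 * Returns 0 on success and -1 when the signal could not be sent. In the
 * latter case errno keeps the reason (EPERM, ESRCH, ...), but the handle
 * is gone either way and must not be used again.
 */
static int
child_handle_delete(struct child_handle *handle)
{
	assert(handle != NULL);
	int rc = 0;
	int saved_errno = 0;
	if (kill(handle->pid, SIGKILL) != 0) {
		saved_errno = errno;
		rc = -1;
	}
	/* Resources are released regardless of the kill() result. */
	free(handle->command);
	free(handle);
	if (rc != 0)
		errno = saved_errno;
	return rc;
}
```

A caller of such a function would treat -1 only as a hint for logging
and would never retry on the same handle.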

Left comments about the observed Mac OS behaviour when killing a
process group where all processes are zombies (or maybe just when the
process group leader is a zombie, I'm not sure): killpg() gives EPERM
instead of ESRCH. This result should not surprise a user, so it should
be documented. See [1] for another description of the problem (I did
not find any official information about this).

[1]: https://bugzilla.mozilla.org/show_bug.cgi?id=1329528



Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
(cherry picked from commit 56a8c346ecb0581300a63c6e677d8a4672ff1f95)

01657bfb

net.box: fix fetching of schema of an old version · 06edcbe1

Alexander Turenko authored 4 years ago

After 2.2.0-633-gaa0964ae1 ('net.box: fix schema fetching from 1.10/2.1
servers') net.box expects the _vcollation system view to exist on a
tarantool server of version 2.2.1+. This is not always so, however: a
server may run a new version of tarantool but still work on a schema of
an old version.

A schema that is not the latest one is usual for a replication cluster
in the process of upgrading: first, all instances are started on the new
version of tarantool (tarantool performs no auto-upgrade in a cluster).
Then box.schema.upgrade() should be called, but the instances should be
operable even before the call.

Before this commit net.box was unable to connect to a server running on
a schema without the _vcollation system view (say, 2.1.3) when the
server executable is of version 2.2.1 or newer.

Note: I trimmed the tests from the commit to polish them a bit more,
but included the fix itself in the 2.4.1 release.

Follows up #4307
Fixes #4691

06edcbe1

test: adjust luajit-tap testing machinery · 335f80a0

Igor Munkin authored 4 years ago


This changeset makes it possible to run luajit-tap tests requiring
libraries implemented in C:
* a symlink to the luajit tests is created at the configuration phase
  instead of the build phase.
* a CMake function is introduced for building shared libraries required
  for luajit tests.

Furthermore this commit enables CMake build for the following luajit-tap
tests:
* gh-4427-ffi-sandwich
* lj-flush-on-trace

Reviewed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Reviewed-by: Sergey Ostanevich <sergos@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>

335f80a0

luajit: bump new version · a1594091

Kirill Yukhin authored 4 years ago


- jit: abort trace execution on JIT mode change
- jit: abort trace recording and execution for C API

Reviewed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>

a1594091

Apr 18, 2020

build: fix build with CMake 2 · 930c7463

Alexander V. Tikhonov authored 4 years ago

Found that some package builds failed because of a mistake in the
CMakeLists.txt file. The failed package and test builds were:
- CentOS 6
- CentOS 7
- Ubuntu 14.04
as well as the static build based on Dockerfile.

The core of the issue is that CMake 2 does not support line continuation
with backslash.

The commit fixes the regression from
7b443650 ('feedback: add cmake option to
disable the daemon').

Follow up #3308

930c7463

Apr 17, 2020

feedback: add cmake option to disable the daemon · 7b443650

Vladislav Shpilevoy authored 4 years ago

There is a complaint that the feedback daemon is a 'spying' tool
and because of that can't be used on Gentoo. Disabling it by
default is not acceptable either: the daemon should be eliminated
completely.

The patch introduces the cmake option ENABLE_FEEDBACK_DAEMON. It
is ON by default. When it is set to OFF, none of the feedback
daemon's code is included into the binary, and its configuration
options disappear.

Closes #3308

7b443650

box: yield after initial box_cfg() is finished · 70695ecb

Vladislav Shpilevoy authored 4 years ago

box.cfg() works in two stages when called for the first time: it
boots the instance using the box_cfg() C++ function and then
configures it. During booting all non-dynamic parameters are read.
Dynamic ones are mostly configured afterwards.

Normally there should be a yield between box_cfg() C++ call and
dynamic parameters configuration. It is used by box.ctl.wait_ro()
and box.ctl.wait_rw() Lua calls to catch the instance in read-only
state always before read-write state.

In theory a user should be able to call box.ctl.wait_ro() and
box.ctl.wait_rw() in one fiber, box.cfg() in another, and these
waits would be unblocked one after another.

It works fine now, but only because of, surprisingly, the feedback
daemon. The daemon creates a yield after C++ box_cfg() is
finished, but dynamic parameters are still being applied in
load_cfg.lua. That gives time to catch box.ctl.wait_ro() event.

The thing is that dynamic parameters configuration includes the
daemon's options too. When the 'feedback_enabled' option is set
to true, the daemon is started using fiber.create(). That creates
a yield, and gives time to box.ctl.wait_ro() fibers to handle the
event.

When the daemon is disabled or removed, like it is going to happen
in #3308, this trick does not work, and box.ctl.wait_ro() started
before box.cfg() is never triggered.

It could be tested on app-tap/cfg.test.lua with

    box.cfg{}

changed to

    box.cfg{feedback_enabled = false}

Then the test would hang. A test is not patched here, because the
feedback is going to be optionally removed in a next commit, and
the test would become flaky depending on build options.

Needed for #3308

70695ecb

feedback: move feedback code to the single file · e9e9b540

Vladislav Shpilevoy authored 4 years ago

The feedback daemon's code was located in two files:
box/lua/feedback_daemon.lua and box/lua/schema.lua. That made
it harder to eliminate the daemon at cmake configuration time.

Now all its code is in one place, in feedback_daemon.lua. Disabling
the daemon's code is now a matter of excluding the Lua file
from the sources.

Part of #3308

e9e9b540

box: improve built-in module load panic message · a05ff5d3

Vladislav Shpilevoy authored 4 years ago

Box built-in modules, such as session, tuple, schema, etc, were
loaded using luaL_loadbuffer() + lua_call(). An error of the former
call was handled properly with a panic message describing the
problem.

But if lua_call() failed, it resulted in 'unknown exception' in
main.cc. Not very helpful. Now it is lua_pcall(), and the error
message is included into the panic() message. That helps with
debugging when something is being changed in the box modules.

a05ff5d3

popen: fix popen_write_timeout retval type · c66a7d84

Alexander Turenko authored 4 years ago


On Linux x86_64 `ssize_t` is 64 bit, while `int` is 32 bit wide (at
least typically). Let's return `ssize_t` from popen_write_timeout() to
prevent data loss.
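
A short C sketch of the narrowing problem, using two hypothetical
helpers (`fits_in_int()` and `narrow_checked()`) that are not part of
tarantool. It assumes a platform where `ssize_t` may be wider than
`int`:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Report whether a byte count of type ssize_t can be stored in an int
 * without changing its value. Where ssize_t is 64 bit and int is
 * 32 bit, any count above INT_MAX fails this check.
 */
static bool
fits_in_int(ssize_t nbytes)
{
	return nbytes >= INT_MIN && nbytes <= INT_MAX;
}

/* Narrow to int, but only after asserting that nothing is lost. */
static int
narrow_checked(ssize_t nbytes)
{
	assert(fits_in_int(nbytes));
	return (int)nbytes;
}
```

Returning `ssize_t` directly, as this commit does, avoids the need for
such a check at all.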

Part of #4031

Reported-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

c66a7d84

popen: add caution comment for popen_may_io() · e164d95e

Alexander Turenko authored 4 years ago


It was easy to misinterpret popen_may_io() contract. In fact, I made
this mistake recently and want to clarify how the function should be
called.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

e164d95e

popen: allow to close parent's end of std* fds · 5efd028d

Alexander Turenko authored 4 years ago


The function popen_shutdown() checks whether std{in,out,err} was piped
and closes the parent's end. A user should be able to send EOF to the
child's stdin for stream programs like `grep`. It is better to have a
function that encapsulates the proper checks, error messages and the
actual actions.
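
The EOF mechanics can be sketched in plain C within a single process.
The helper `bytes_until_eof()` is hypothetical and only illustrates that
a reader sees end of file once the write end of a pipe is closed; it is
not how popen_shutdown() is implemented:

```c
#define _POSIX_C_SOURCE 200809L /* for pipe() under strict ISO C modes */
#include <assert.h>
#include <string.h>
#include <unistd.h>

/*
 * Write `data` into a fresh pipe, close the write end, and count the
 * bytes the reader receives before read() returns 0 (end of file).
 * Returns the byte count, or -1 on error. `data` must be shorter than
 * the pipe buffer, because nothing drains the pipe while we write.
 */
static ssize_t
bytes_until_eof(const char *data)
{
	assert(data != NULL);
	int fds[2];
	if (pipe(fds) != 0)
		return -1;
	size_t len = strlen(data);
	if (write(fds[1], data, len) != (ssize_t)len) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	/* Closing the writer's end is what delivers EOF to the reader. */
	close(fds[1]);

	ssize_t total = 0;
	char buf[64];
	for (;;) {
		ssize_t n = read(fds[0], buf, sizeof(buf));
		if (n < 0) {
			close(fds[0]);
			return -1;
		}
		if (n == 0)
			break; /* EOF: every write end is closed */
		total += n;
	}
	close(fds[0]);
	return total;
}
```

Without the `close(fds[1])` call the final read() would block forever,
which is what happens to a child like `grep` when the parent never
closes its end of stdin.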

In particular, this commit reverts
1ef95b99 ('popen: remove redundant fd
check before perform IO'), because now the check is meaningful: an fd
may be closed before the whole popen handle is deleted.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

5efd028d

popen: refine popen_{read,write}_timeout errors · 55cb9cbe

Alexander Turenko authored 4 years ago


Popen backend errors should be meaningful for a user of the popen Lua
API, because otherwise we'll need to map backend errors into Lua API
errors. Those particular failures can't appear when the functions are
called from the Lua API, but it is good to keep all error messages in
one style.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

55cb9cbe

popen: clarify group signaling details · 062e55eb

Alexander Turenko authored 4 years ago


Even when ..._SETSID and ..._GROUP_SIGNAL are set, we are unable to
safely kill a process group after the child process we spawned has
died. So we don't do that.

The behaviour seems to be an inherent part of the Unix process group
design. The best we can do here is to describe those details in the
documentation comment.

NB: It seems that pid namespaces allow overcoming this problem, but
they are a Linux-specific feature, so we are unlikely to use them.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

062e55eb

popen: clarify popen_{signal,delete} contract · c4cf1454

Alexander Turenko authored 4 years ago


It is convenient to have a formal description of an API during
development and when writing documentation. I plan to use those
contracts when I write an API description for the future popen Lua
module.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

c4cf1454

popen: fix close-on-exec flag setting · e9abaef0

Alexander Turenko authored 4 years ago


fcntl(2) lists flags that can be set using F_SETFL: O_CLOEXEC is not
included there. F_SETFD should be used to set close-on-exec.

The parent's ends of pipes are closed explicitly in a child process
anyway. However, this change fixes closing of the copy of a logger fd.
See commit
07a07b3c ('popen: decouple logger fd
from stderr') for more information on why this file descriptor was
introduced.
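
For illustration, setting close-on-exec through F_SETFD can look like
the following C sketch. The helper name `set_cloexec()` is an
assumption made for this example and is not taken from the patch:

```c
#define _POSIX_C_SOURCE 200809L /* for pipe() under strict ISO C modes */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Mark a descriptor close-on-exec. FD_CLOEXEC is a file descriptor
 * flag, so it is read and written with F_GETFD / F_SETFD. F_SETFL
 * changes file status flags such as O_NONBLOCK instead.
 * Returns 0 on success, -1 on failure (errno is set by fcntl()).
 */
static int
set_cloexec(int fd)
{
	assert(fd >= 0);
	int flags = fcntl(fd, F_GETFD);
	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

Reading the current flags first keeps any other descriptor flags
intact.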

Part of #4031.

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

e9abaef0

popen: add logging of duplicated logger fd · a8bc553b
Alexander Turenko authored 4 years ago
```
For debugging purposes.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
```
a8bc553b

popen: quote multiword command arguments · 0ed48764

Alexander Turenko authored 4 years ago


Of course it is still not proper shell-style quoting: at least we
should also escape quotes inside arguments. But it gives correct output
for most typical commands and has a straightforward implementation.
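
A rough C sketch of this kind of quoting. The function
`format_command()` is hypothetical, and the choice of single quotes and
of a space as the only trigger for quoting are assumptions made for the
example:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Join a NULL-terminated argument vector into `out` for display,
 * wrapping every argument that contains a space in single quotes.
 * Quotes inside arguments are NOT escaped, so this is not full
 * shell-style quoting.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int
format_command(char *const args[], char *out, size_t out_size)
{
	assert(args != NULL && out != NULL && out_size > 0);
	size_t used = 0;
	out[0] = '\0';
	for (size_t i = 0; args[i] != NULL; i++) {
		const char *sep = i == 0 ? "" : " ";
		const char *quote = strchr(args[i], ' ') != NULL ? "'" : "";
		int n = snprintf(out + used, out_size - used, "%s%s%s%s",
				 sep, quote, args[i], quote);
		if (n < 0 || (size_t)n >= out_size - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

For the arguments `sh`, `-c`, `echo hello` this produces
`sh -c 'echo hello'`.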

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

0ed48764

popen: remove retval from popen_stat() · c66617e3

Alexander Turenko authored 4 years ago


The change 'popen: require popen handle to be non-NULL' makes
popen_stat() function always successful. There is no reason to return a
success / failure result.

See the previous similar patch: 'popen: remove retval from
popen_state()'.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

c66617e3

popen: add missed diag_set() in popen_new() · 2b4ca339

Alexander Turenko authored 4 years ago


See the previous similar commits:

* popen: add missed diag_set() in popen IO functions
* popen: add missed diag_set in popen_signal/delete

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

2b4ca339

popen: log a reason of close inherited fds failure · 667930de

Alexander Turenko authored 4 years ago


This information may be useful for debugging.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

667930de

popen: add ability to keep child on deletion · 78867931

Cyrill Gorcunov authored 4 years ago


Currently popen_delete kills all child processes.
Moreover, we use popen_delete on tarantool exit.

Alexander pointed out that keeping children running
even after tarantool has exited is still needed.

Part of #4031

Reported-by: Alexander Turenko <alexander.turenko@tarantool.org>
Acked-by: Alexander Turenko <alexander.turenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

78867931

popen: allow to kill process group · 393f0bbf

Cyrill Gorcunov authored 4 years ago


As Alexander pointed out, this might be useful
for running a pipeline of programs inside a shell
(i.e. popen.shell('foo | bar | baz', 'r')).

Part of #4031

Reported-by: Alexander Turenko <alexander.turenko@tarantool.org>
Acked-by: Alexander Turenko <alexander.turenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

393f0bbf

Apr 16, 2020

sql: do not change order of inserted values · 2cc7e608

Mergen Imeev authored 4 years ago

Before this patch, if an ephemeral space was used during INSERT or
REPLACE, the inserted values were sorted by the first column,
since this was the first part of the index. This can lead to an
error when using the AUTOINCREMENT feature, since changing the
order of the inserted values can change the value inserted instead
of NULL. To avoid this, the patch makes the rowid of the inserted
row in the ephemeral space the only part of the ephemeral space
index.
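
One way the reordering can matter is sketched below in C under a
deliberately simplified model of autoincrement: a counter starts at 1,
jumps past every explicit value it meets, and fills each NULL with its
next value. The model, the `struct row` type and both functions are
assumptions for illustration and do not reproduce tarantool's sequence
implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* One inserted row, reduced to its first column. */
struct row {
	bool is_null; /* true when the column is NULL */
	long value;   /* explicit value, meaningful when !is_null */
};

/* qsort() comparator: NULL sorts first, then ascending by value. */
static int
row_cmp_first_column(const void *a, const void *b)
{
	const struct row *x = a, *y = b;
	if (x->is_null != y->is_null)
		return x->is_null ? -1 : 1;
	if (x->is_null)
		return 0;
	return (x->value > y->value) - (x->value < y->value);
}

/* Replace every NULL with the next counter value, in array order. */
static void
assign_autoincrement(struct row *rows, size_t count)
{
	assert(rows != NULL || count == 0);
	long next = 1;
	for (size_t i = 0; i < count; i++) {
		if (rows[i].is_null) {
			rows[i].is_null = false;
			rows[i].value = next++;
		} else if (rows[i].value >= next) {
			next = rows[i].value + 1;
		}
	}
}
```

In this model, inserting the rows (2), (NULL) in their original order
stores 2 and 3, while sorting them by the first column beforehand stores
1 and 2.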

Closes #4256

2cc7e608

sql: specify field types in ephemeral space format · 2103f587

Mergen Imeev authored 5 years ago

This patch specifies field types in ephemeral space format in SQL.
Prior to this patch, all fields had a SCALAR field type.

This patch allows us to not use the primary index to obtain field
types, since now the ephemeral space has field types in the
format. This allows us to change the structure of the primary
index, which helps to solve the issue #4256. In addition, since we
can now set the field types of the ephemeral space, we can use
this feature to set the field types according to the left value of
the IN operator. This will fix issue #4692.

Needed for #4256
Needed for #4692
Closes #3841

2103f587

box: extend ephemeral space format · 032de39f

Mergen Imeev authored 4 years ago

This patch allows setting field types and names in ephemeral space
formats.

Needed for #4256
Needed for #4692
Part of #3841

032de39f

relay: move relay_schedule_pending_gc before status update · e7ffddce

Serge Petrenko authored 4 years ago

relay_schedule_pending_gc() is executed after the relay status update,
which made perfect sense before we introduced the local spaces rework
that made local space operations use a special instance id: 0.
The relay status update is performed only when the remote instance has
reported a bigger vclock than its previous one. However, we may have an
entire WAL file filled with local space changes, in which case the
changes won't be transmitted to replica, and it will report the same
vclock as before, postponing the scheduled gc until a non-local row is
created on master.

Fix this by reordering relay_schedule_pending_gc() and relay status
update. In case nothing new is added to pending_gc queue and replica
clock is not updated, relay_schedule_pending_gc() will exit on the first
loop iteration, so it doesn't add an overhead.

Also make relay_schedule_pending_gc() use vclock_compare_ignore0() instead
of plain vclock_compare().
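
The effect of ignoring component 0 can be shown on a toy vector clock
in C. Everything here (`toy_vclock_compare()`, the fixed size, the
return codes) is a simplified assumption and not the real
vclock_compare_ignore0() from tarantool:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_VCLOCK_SIZE = 4, TOY_ORDER_UNDEFINED = 2 };

/*
 * Compare two toy vector clocks component by component.
 * Returns 0 if they are equal, -1 if a <= b, 1 if a >= b, and
 * TOY_ORDER_UNDEFINED if they are incomparable.
 * With ignore0 set, component 0 (local changes) is skipped.
 */
static int
toy_vclock_compare(const int64_t *a, const int64_t *b, bool ignore0)
{
	assert(a != NULL && b != NULL);
	bool less = false;
	bool greater = false;
	for (size_t i = ignore0 ? 1 : 0; i < TOY_VCLOCK_SIZE; i++) {
		if (a[i] < b[i])
			less = true;
		else if (a[i] > b[i])
			greater = true;
	}
	if (less && greater)
		return TOY_ORDER_UNDEFINED;
	if (less)
		return -1;
	if (greater)
		return 1;
	return 0;
}
```

If the master has only written local rows, its clock differs from the
replica's in component 0 alone: a plain comparison says the replica is
behind, while the comparison that ignores component 0 says the clocks
are equal.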

Follow-up #4114

e7ffddce

Apr 15, 2020

sql: add '\0' to the BLOB when it is cast to INTEGER · a39e6a01

Mergen Imeev authored 4 years ago

Prior to this patch, due to the absence of the '\0' character at
the end of the BLOB, it was possible to get an error or incorrect
result when using CAST() from BLOB to INTEGER or UNSIGNED. This
has now been fixed, but the maximum length of a BLOB that could be
cast to INTEGER or UNSIGNED was limited to 12287 bytes.

Examples of wrong CAST() from BLOB to INTEGER:

CREATE TABLE t (i INT PRIMARY KEY, a VARBINARY, b INT, c INT);
INSERT INTO t VALUES (1, X'33', 0x33, 0x00), (2, X'34', 0x41, 0);

Example of wrong result:

SELECT CAST(a AS INTEGER) FROM t WHERE i = 1;

Result: 33

Example of error during CAST():

SELECT CAST(a AS INTEGER) FROM t WHERE i = 2;

Result: 'Type mismatch: can not convert varbinary to integer'
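
The underlying issue can be reproduced in plain C: strtoll() reads
until it meets a non-digit, so a length-delimited blob has to be
NUL-terminated before parsing. The sketch below uses a hypothetical
`blob_to_int64()` helper with an arbitrary 32-byte buffer; it is not the
code from the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a length-delimited byte string as a decimal int64.
 * The bytes are copied into a local buffer and NUL-terminated first,
 * so that bytes lying after the blob in memory are never parsed.
 * Returns 0 on success, -1 on failure.
 */
static int
blob_to_int64(const char *data, size_t len, int64_t *out)
{
	assert(data != NULL && out != NULL);
	char buf[32];
	if (len == 0 || len >= sizeof(buf))
		return -1;
	memcpy(buf, data, len);
	buf[len] = '\0';
	char *end;
	errno = 0;
	long long value = strtoll(buf, &end, 10);
	/* Fail on overflow or when not every byte of the blob was used. */
	if (errno != 0 || end != buf + len)
		return -1;
	*out = value;
	return 0;
}
```

With this approach the one-byte blob X'33' (the character '3') parses
as 3 no matter which bytes follow it in memory.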

Closes #4766

a39e6a01

sql: fix implicit cast from STRING to INTEGER · 6e6de43c

Mergen Imeev authored 4 years ago

Prior to this patch, STRING, which contains the DOUBLE value,
could be implicitly cast to INTEGER. This was done by converting
STRING to DOUBLE and then converting this DOUBLE value to INTEGER.
This may affect the accuracy of CAST(), so it was forbidden. It
is worth noting that these changes will not affect the comparison,
since the implicit cast in this case has different mechanics.

Example:
box.execute("CREATE TABLE t(i INT PRIMARY KEY);")

Before patch:
box.execute("INSERT INTO t VALUES ('111.1');")
box.execute("SELECT * FROM t;")
Result: 111

After patch:
box.execute("INSERT INTO t VALUES ('1.1');")
Result: 'Type mismatch: can not convert 1.1 to integer'

box.execute("INSERT INTO t VALUES ('1.0');")
Result: 'Type mismatch: can not convert 1.0 to integer'

box.execute("INSERT INTO t VALUES ('1.');")
Result: 'Type mismatch: can not convert 1. to integer'

@TarantoolBot document
Title: disallow cast from STRING contains DOUBLE to INTEGER

After the last two patches, explicit and implicit casting from the
string containing DOUBLE to INTEGER directly will be prohibited.
The user must use the explicit cast to DOUBLE before the explicit
or implicit cast to INTEGER. The reason for this is that before
these patches, such STRINGs were implicitly cast to DOUBLE, and
then this DOUBLE was implicitly or explicitly cast to INTEGER.
Because of this, the result of such a cast may differ from what
the user expects, and the user may not know why.

It is worth noting that these changes will not affect the
comparison, since the implicit cast in this case has different
mechanics.

Example for implicit cast:

box.execute("CREATE TABLE t(i INT PRIMARY KEY);")
-- Does not work anymore:
box.execute("INSERT INTO t VALUES ('1.1');")
-- Right way:
box.execute("INSERT INTO t VALUES (CAST('1.1' AS DOUBLE));")

Example for explicit cast:

-- Does not work anymore:
box.execute("SELECT CAST('1.1' AS INTEGER);")
-- Right way:
box.execute("SELECT CAST(CAST('1.1' AS DOUBLE) AS INTEGER);")

6e6de43c

sql: fix CAST() from STRING to INTEGER · 11352a32

Mergen Imeev authored 5 years ago

Prior to this patch, STRING, which contains the DOUBLE value,
could be cast to INTEGER. This was done by converting STRING to
DOUBLE and then converting this DOUBLE value to INTEGER. This may
affect the accuracy of CAST(), so it was forbidden.

Before patch:
box.execute("SELECT CAST('111.1' as INTEGER);")
Result: 111

After patch:
box.execute("SELECT CAST('1.1' as INTEGER);")
Result: 'Type mismatch: can not convert 1.1 to integer'

box.execute("SELECT CAST('1.0' as INTEGER);")
Result: 'Type mismatch: can not convert 1.0 to integer'

box.execute("SELECT CAST('1.' as INTEGER);")
Result: 'Type mismatch: can not convert 1. to integer'
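
The same strictness can be sketched in C with strtoll() and an
end-pointer check. The helper `str_to_int64_strict()` is hypothetical
and only illustrates the rule; it is not tarantool's conversion
routine:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Convert a string to int64 only when the whole string is an integer
 * literal. "1.1", "1.0" and "1." all leave characters after the
 * digits, so they are rejected instead of being truncated.
 * Returns 0 on success, -1 on failure.
 */
static int
str_to_int64_strict(const char *str, int64_t *out)
{
	assert(str != NULL && out != NULL);
	char *end;
	errno = 0;
	long long value = strtoll(str, &end, 10);
	if (errno != 0 || end == str || *end != '\0')
		return -1;
	*out = value;
	return 0;
}
```

A caller that really wants truncation has to go through an explicit
floating-point conversion first, which mirrors the CAST to DOUBLE that
the user is now expected to write.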

11352a32

gitlab-ci: move sources tarball creation to gitlab · 34f87bc6

Alexander V. Tikhonov authored 4 years ago

Moved source tarball creation from travis-ci to gitlab-ci, together
with its jobs for packing and deploying the sources.

Close #4895

34f87bc6

Divide box/ddl.test.lua test · 4a8d1ebd

Alexander V. Tikhonov authored 5 years ago

Divided into tests:
- box/ddl_alter.test.lua
- box/ddl_collation.test.lua
- box/ddl_collation_types.test.lua
- box/ddl_collation_wrong_id.test.lua
- box/ddl_no_collation.test.lua
- box/ddl_parallel.test.lua
- box/ddl_tuple.test.lua
- box/gh-2336-ddl_call_twice.test.lua
- box/gh-2783-ddl_lock.test.lua
- box/gh-2839-ddl_custom_fields.test.lua
- box/gh-2937-ddl_collation_field_def.test.lua
- box/gh-3290-ddl_collation_deleted.test.lua
- box/gh-928-ddl_truncate.test.lua

4a8d1ebd

gitlab-ci: remove Ubuntu 19.04 Disco · 05df6b31
Alexander V. Tikhonov authored 4 years ago
```
Removed Ubuntu 19.04 Disco, which is EOL, from testing.

Close #4896
```
05df6b31

Added ability to remove packages from S3 · d6c50af1

Alexander V. Tikhonov authored 4 years ago

Added ability to remove a package given in the options from S3. To
remove a package, set the '-r=<package name with version>' option,
like:
  ./tools/update_repo.sh -o=<OS> -d=<DIST> -b=<S3 repo> \
    -r=tarantool-2.2.2.0
It removes all matching source and binary packages found in the given
S3 repository and corrects the meta files there.

Close #4839

d6c50af1

Add help instruction on 'product' option · cccc989c
Alexander V. Tikhonov authored 4 years ago
```
Added instructions on 'product' option with examples.

Part of #4839
```
cccc989c

Enable script for saving packages in S3 for modules · 4527a4da

Alexander V. Tikhonov authored 4 years ago

Found that modules may have only binary packages w/o source
packages. The script was changed to be able to work with only binary
or only source packages.

Part of #4839

4527a4da

Add metafiles cleanup routines at S3 pack script · ed491409

Alexander V. Tikhonov authored 5 years ago

Added cleanup functionality for the meta files.
The script may face the following situations:

 - package files were removed from S3, but are still registered:
   The script stores and registers the new packages at S3 and
   removes all the other registered blocks for the same
   files in the meta files.

 - package files already exist at S3 with the same hashes:
   The script skips them with a warning message.

 - package files already exist at S3 with the old hashes:
   The script fails w/o the force flag, otherwise it stores and
   registers the new packages at S3 and removes all the other
   registered blocks for the same files in the meta files.

Added the '-s|skip_errors' option flag to skip errors on changed
packages so that the script does not exit on them.

Part of #4839

ed491409

gitlab-ci: set static docker build release testing · 6f618d62

Alexander V. Tikhonov authored 4 years ago

Returned the static build based on Dockerfile to gitlab-ci testing of
release branches after the issues with the missing openssl version were
fixed in PR #4831.

Follow up #4831

(cherry picked from commit b09f44b856e91f1006bd5b3e226a7be0b65b7859)

6f618d62

build: disable cache for static build Dockerfile · c5d27312

Alexander V. Tikhonov authored 4 years ago

Found that the static build based on Dockerfile used an external
link and did not notice that it had been removed, as happened in
#4830. To avoid such issues, the cache for building the Dockerfile
was disabled with the '--no-cache' option of the docker build command.

Follow up #4830

(cherry picked from commit 1207821e4fc18312a9916d81a55a8eacd75a67b3)

c5d27312

Enable branch coverage in lcov · 5842157c

Sergey Bronnikov authored 4 years ago

By default lcov collects line coverage only. It would be useful to
collect function and branch coverage too.

Closes #4888

5842157c