Commits · de2906a8dc037db6f6b1d95a0488e6599f4989eb · core / tarantool

May 02, 2019

swim: explicitly stop old ev_io input/output on rebind · de2906a8

Vladislav Shpilevoy authored 5 years ago

When an input/output file descriptor was changed, SWIM did this:

    swim_ev_io_set(io, new_fd)
    swim_ev_io_start(io)

It worked in an assumption that libev allows to change fd on fly,
and the tests were passing because fake event loop was used, not
real libev.

But it didn't work. Libev requires explicit ev_io_stop() called
on the old descriptor. This patch makes this:

    swim_ev_io_stop(io)
    //do bind ...

    swim_ev_io_set(io, new_fd)
    swim_ev_io_start(io)

Part of #3234

de2906a8

swim: allow to omit host in URI · 5320d8f0

Vladislav Shpilevoy authored 5 years ago

Before the patch swim.cfg() was returning an error on an attempt
to use URI like 'port', without a host. But it would be useful,
easy, and short to allow that, and use '127.0.0.1' host by
default. Compare:

    swim:cfg({uri = 1234})
            vs
    swim:cfg({uri = '127.0.0.1:1234'})

It is remarkable that box.cfg{listen} also allows to omit host.

Note, that use '0.0.0.0' (INADDR_ANY) is forbidden. Otherwise

    1) Different instances interacting with this one via not the
       same interfaces would see different source IP addresses.
       It would mess member tables;

    2) This instance would not be able to encode its IP address
       in the meta section, because it has no a fixed IP. At the
       same time omission of it and usage of UDP header source
       address is not possible as well, because UDP header is not
       encrypted and therefore is not safe to look at.

Part of #3234

5320d8f0

swim: introduce member reference API · 42bdc367

Vladislav Shpilevoy authored 5 years ago

Struct swim_member pointer is used to learn member's status,
payload, incarnation etc. To obtain a pointer, a one should
either lookup the member by UUID, or obtain from iterators API.
The former is long, the latter is useless when a point lookup is
needed.

On the other hand, it was not safe to keep struct swim_member
pointer for a long time, because it could be deleted at any
moment.

This patch allows to reference a member and be sure that it will
not be deleted until dereferenced explicitly. The member still
can be dropped from the member table, but its memory will be
valid. To detect that a member is dropped, a user can use
swim_member_is_dropped() function.

Part of #3234

42bdc367

swim: do not use ev_timer_start · fd9c6c9d

Vladislav Shpilevoy authored 5 years ago

Appeared that libev changes 'ev_timer.at' field to a remaining
time value, and it can't be used as a storage for a timeout. By
the same reason ev_timer_start() can't be used to reuse a timer.

On the contrary, 'ev_timer.repeat' is not touched by libev, and
ev_timer_again() allows to reuse a timer.

This patch replaces 'at' with 'repeat' and ev_timer_start() with
ev_timer_again().

The bug was not detected by unit tests, because they implement
their own event loop and do not change ev_timer.at. Now they do
to prevent a regression.

Part of #3234

fd9c6c9d

test: update test-run · 71f7ecf1

Alexander Turenko authored 5 years ago

Added the signal option into 'stop server' command.

How to use:

 | test_run:cmd('stop server foo with signal=KILL')

The 'stop server foo' command without the option sends SIGTERM as
before.

This feature is intended to be used in a fix of #4162 ('test:
gc.test.lua test fails on *.xlog files cleanup').

71f7ecf1

Apr 30, 2019

Add functions to ease using Lua iterators from C · fa6bb2fe
Alexander Turenko authored 6 years ago
```
Needed for #3276.
```
fa6bb2fe
Add luaL_iscallable with support of cdata metatype · 82dc8c71
Alexander Turenko authored 6 years ago
```
Needed for #3276.
```
82dc8c71

test: use yaml.safe_load() instead of yaml.load() · d5fdc533

Alexander Turenko authored 5 years ago

The primary reason why this change is needed is that yaml.load() w/o an
explicit loader was banned in Gentoo Linux for recent pyyaml versions;
see [1].

We don't use the pyyaml feature that allows to construct a Python object
based on a yaml tag, so safe_load() fit our needs.

See also related changes in test-run and tarantool-python ([2], [3], [4]).

[1]: https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=79ba924d94cb0cf8559565178414c2a1d687b90c
[2]: https://github.com/tarantool/test-run/commit/38400e91c600677fb661154d00459d660fa9880d
[3]: https://github.com/tarantool/test-run/commit/89808d60eb3b5130e227fc1a7866f2ad5a197bea
[4]: https://github.com/tarantool/tarantool-python/commit/350771d240a18eec188a53e8c696028b41baa13f

d5fdc533

test: update test-run · 91dd3cfd

Alexander Turenko authored 5 years ago

* Better handle keyboard interrupt (PR #160).
* Fail testing when test-run fails internally (PR #161).
* Catch non-default server start fail in app server (#115, PR #162).
* Update tarantool-python submodule (PR #165).

91dd3cfd

Apr 29, 2019

test: drop invalid assert from swim test transport · 906fb352

Vladislav Shpilevoy authored 5 years ago

The assertion was checking that a next event object is not the
same as the previous, but

    1) the previous was deleted already to this moment;
    2) comparison was done by pointer

The first problem would be enough to drop it. The second is
already curious - looks like after the old event was deleted,
the next event was allocated right on the same memory. This is
why their pointers are equal and the assertion fails.

For example, swim_timer_event_process() - it deletes the event
object and calls ev_invoke() which can generate a new event on
the just freed memory.

906fb352

test: vinyl/errinj test fails under highload · 8c3bd9d8

avtikhon authored 6 years ago

Test "check that all dump/compaction tasks that are in progress at
the time when the server stops are aborted immediately.", but in
real the awaiting time of 1 second is not enough due to runs in
parallel and it fails, like:

[009] --- vinyl/errinj.result Tue Apr 16 16:43:36 2019
[009] +++ vinyl/errinj.reject Wed Apr 17 09:42:36 2019
[009] @@ -530,7 +530,7 @@
[009] ...
[009] t2 - t1 < 1
[009] ---
[009] -- true
[009] +- false
[009] ...
[009] test_run:cmd("cleanup server test")
[009] ---
[009]

in 100 parallel runs the failed delays were found:

[002] +- 1.4104716777802
[022] +- 1.3933029174805
[044] +- 1.4296517372131
[033] +- 1.6380662918091
[001] +- 1.9799520969391
[027] +- 1.7067711353302
[043] +- 1.3778221607208
[034] +- 1.3820221424103
[032] +- 1.3820221424103
[020] +- 1.6275615692139
[050] +- 1.6275615692139
[048] +- 1.1880359649658

Decided to avoid of use the time check at all and change the
ERRINJ_VY_RUN_WRITE_STMT_TIMEOUT to ERRINJ_VY_DUMP_DELAY injection.
In this way the time checks were completely removed.

Next issue met was the error:

vy_quota.c:298 !> SystemError Failed to allocate 2097240 bytes
    in lsregion for vinyl transaction: Cannot allocate memory

That is why the merged 2 subtests were divided into 2 standalone
subtests to be able to set the memory limit of the 2nd subtest to
2097240 value.

Close #4169

8c3bd9d8

vinyl: be pessimistic about write rate when setting dump watermark · b9b8e8af

Vladimir Davydov authored 5 years ago

We set the dump watermark using the following formula

    limit - watermark     watermark
    ---------------- = --------------
       write_rate      dump_bandwidth

This ensures that by the time we run out of memory quota, memory
dump will have been completed and we'll be able to proceed. Here
the write_rate is the expected rate at which the workload will
write to the database while the dump is in progress. Once the dump
is started, we throttle the workload in case it exceeds this rate.

Currently, we estimate the write rate as a moving average observed
for the last 5 seconds. This performs poorly unless the workload
write rate is perfectly stable: if the 5 second average turns out to
be even slightly less than the max rate, the workload may experience
long stalls during memory dump.

To avoid that let's use the max write rate multiplied by 1.5 instead
of the average when setting the watermark. This means that we will
start dump earlier than we probably could, but at the same time this
will tolerate write rate fluctuations thus minimizing the probability
of stalls.

Closes #4166

b9b8e8af

httpc: fix zero timeout handling · 47bd51b5

Alexander Turenko authored 5 years ago

When libcurl is built with --enable-threaded-resolver (which is default)
and the version of the library is 7.60 or above, libcurl calls a timer
callback with exponentially increasing timeout_ms value during DNS
resolving.

This behaviour was introduced in curl-7_59_0-36-g67636222f (see [1],
[2]). During first ten milliseconds the library sets a timer to a passed
time divided by three (see Curl_resolver_getsock()). It is possible that
passed time is zero during at least several thousands of iterations.

Before this commit we didn't set a libev timer in curl_multi_timer_cb()
when a timeout_ms value is zero, but call curl_multi_process()
immediately. Libcurl however can call curl_multi_timer_cb() again and
here we're going into a recursion that stops only when timeous_ms
becomes positive. Often we generate several thousands of stack frames
within this recursion and exceed 512KiB of a fiber stack size.

The fix is easy: set a libev timer to call curl_multi_process() even
when a timeout_ms value is zero.

The reason why we did the call to curl_multi_process() immediately is
the unclear wording in the CURLMOPT_TIMERFUNCTION option documentation.
This documentation page was fixed in curl-7_64_0-88-g47e540df8 (see [3],
[4], [5]).

There is also the related change in curl-7_60_0-121-g3ef67c686 (see [6],
[7]): after this commit libcurl calls a timer callback with zero
timeout_ms during a first three milliseconds of asynchronous DNS
resolving.

Fixes #4179.

[1]: https://github.com/curl/curl/pull/2419
[2]: https://github.com/curl/curl/commit/67636222f42b7db146b963deb577a981b4fcdfa2
[3]: https://github.com/curl/curl/issues/3537
[4]: https://github.com/curl/curl/pull/3601
[5]: https://github.com/curl/curl/commit/47e540df8f32c8f7298ab1bc96b0087b5738c257
[6]: https://github.com/curl/curl/pull/2685
[7]: https://github.com/curl/curl/commit/3ef67c6861c9d6236a4339d3446a444767598a58

47bd51b5

travis-ci: set right flags in release testing jobs · c308f35d

Alexander Turenko authored 5 years ago

It is important to have testing jobs that build the project with both
-Werror and -O2 to keep the code clean. -O2 is needed, because some
compiler warnings are available only after extra analyzing passes that
are disabled with lesser optimization levels.

The first attempt to add -Werror for release testing jobs was made in
da505ee7 ('Add -Werror for CI (1.10
part)'), but it mistakely doesn't enable -O2 for RelWithDebInfoWError
build. It is possible to fix it in this way:

 | --- a/cmake/compiler.cmake
 | +++ b/cmake/compiler.cmake
 | @@ -113,10 +113,14 @@ set (CMAKE_C_FLAGS_DEBUG
 |      "${CMAKE_C_FLAGS_DEBUG} ${CC_DEBUG_OPT} -O0")
 |  set (CMAKE_C_FLAGS_RELWITHDEBINFO
 |      "${CMAKE_C_FLAGS_RELWITHDEBINFO} ${CC_DEBUG_OPT} -O2")
 | +set (CMAKE_C_FLAGS_RELWITHDEBINFOWERROR
 | +    "${CMAKE_C_FLAGS_RELWITHDEBINFOWERROR} ${CC_DEBUG_OPT} -O2")
 |  set (CMAKE_CXX_FLAGS_DEBUG
 |      "${CMAKE_CXX_FLAGS_DEBUG} ${CC_DEBUG_OPT} -O0")
 |  set (CMAKE_CXX_FLAGS_RELWITHDEBINFO
 |      "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} ${CC_DEBUG_OPT} -O2")
 | +set (CMAKE_CXX_FLAGS_RELWITHDEBINFOWERROR
 | +    "${CMAKE_CXX_FLAGS_RELWITHDEBINFOWERROR} ${CC_DEBUG_OPT} -O2")
 |
 |  unset(CC_DEBUG_OPT)

However I think that a build type (and so `tarantool --version`) should
not show whether -Werror was passed or not. So I have added
ENABLE_WERROR CMake option for that. It can be set like so:

 | cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_WERROR=ON

Enabled the option in testing Travis-CI jobs with the RelWithDebInfo
build type. Deploy jobs don't include it as before.

Fixed all -Wmaybe-uninitialized and -Wunused-result warnings. A few
notes about the fixes:

* net.box does not validate received data in general, so I don't add a
  check for autoincrement IDs too. Set the ID to INT64_MIN, because this
  value is less probably will appear here in a normal case and so is the
  best one to signal a user that something probably going wrongly.
* xrow_decode_*() functions could read uninitialized data from
  row->body[0].iov_base in xrow_on_decode_err() when printing a hex code
  for a row. It could be possible when the received msgpack was empty
  (row->bodycnt == 0), but there were expected keys (key_map != 0).
* getcwd() is marked with __attribute__((__warn_unused_result__)) in
  glibc, but the buffer filled by this call is not used anywhere and so
  just removed.
* Vinyl -Wmaybe-uninitialized warnings are false positive ones.

Added comments and quotes into .travis.yml to ease reading. Removed
"test" word from the CentOS 6 job name, because we don't run tests on
this distro (disabled in the RPM spec).

Fixes #4178.

c308f35d

core/coio_file: Use eio_sendfile_sync instead of a chunk mode · 04bf646f

Cyrill Gorcunov authored 5 years ago

eio library provides a portable version of sendfile syscall
which works a way more efficient than explicit copying file
by 4K chunks.

04bf646f

Apr 26, 2019

swim: optimize struct swim memory layout · 8b350d42

Vladislav Shpilevoy authored 5 years ago

The same problem that occured with struct swim_member, has
happened with struct swim - it contains a huge structure right in
the middle, struct swim_task. It consumes 1.5Kb and obviously
splits the most accessible struct swim attributes into multiple
cache lines.

This patch moves struct swim_task to the bottom as well as other
members, related to dissemination component.

8b350d42

sql: fix boolean variables binding · 15c990cb
Kirill Shcherbatov authored 5 years ago
```
The bindings mechanism was not updated in scope of BOOLEAN static
type patch. Fixed.
```
15c990cb

Apr 25, 2019

swim: optimize memory layout of struct swim_member · e1c94e90

Vladislav Shpilevoy authored 5 years ago

Struct swim_member describes attributes of one remote member of
a SWIM cluster. It is relatively often accessed. And it has two
huge structures - struct swim_packet ping/ack_task. Each is 1500
bytes. When these tasks are in the middle of the structure, they
split and spoil cache lines.

This patch moves the whole failure detection attribute family to
the bottom of the structure, and moves these two tasks to the
end of layout.

e1c94e90

swim: introduce suspicion · 8f903360

Vladislav Shpilevoy authored 6 years ago

Suspicion component is a way how SWIM protects from
false-positive failure detections. When the network is slow, or
a SWIM node does not manage to process messages in time because
of being overloaded, other nodes will not receive ACKs in time,
but it is too soon to declare the member dead.

The nodes will mark the member as suspected, and will ping it
indirectly, via other members. It 1) gives the suspected member
more time to respond on ACKs, 2) protects from the case when it
is a network problem on particular channels.

Part of #3234

8f903360

swim: introduce routing · 8c9b88d2

Vladislav Shpilevoy authored 6 years ago

Before the patch SWIM packets were being sent quite
straightforward from one instance to another with transparent
routing on Internet Level of TCP/IP. But the SWIM paper
describes last yet not implemented component - suspicion
mechanism.

So as not to overload this message with suspicion details it is
enough to say that it makes possible sending a packet through an
intermediate SWIM instance, not directly.

This commit extends the SWIM protocol with a new transport-level
section named 'routing'. It allows to send indirect SWIM messages
transparently via packet forwarding implemented fully inside
transportation component, in swim_io.c.

Part of #3234

8c9b88d2

swim: store sender UUID in swim io tasks · 776397ba

Vladislav Shpilevoy authored 5 years ago

Struct swim_task is an asynchronous task generated by the SWIM
core and scheduled to be sent when a next EV_WRITE event appears.

It has a callback 'complete' called when the task finally sent
its packet into the network. In this callback a next SWIM round
step can be scheduled, or set a deadline for a ping. Usually it
requires to know to which member the packet was sent. For this
UUID is required, but swim_task operates by inet addresses only.

At this moment UUID necessity can be bypassed via container_of
or via some queues, but it is not so once suspicion component is
introduced.

The patch adds sender's UUID to struct swim_task.

Part of #3234

776397ba

swim: wrap sio_strfaddr() · caeadc15

Vladislav Shpilevoy authored 5 years ago

SIO provides a function sio_strfaddr() to obtain string
representation of an arbitrary struct sockaddr. Call of this
function usually looks bulky because it requires explicit cast to
const struct sockaddr *, and expects the address size in the
second paremeter.

SWIM uses only AF_INET addresses and always casts them to
const struct sockaddr * + passes sizeof(struct sockaddr_in) to
each invocation of sio_strfaddr(). This patch wraps
sio_strfaddr() with a function making these preparations.

Part of #3234

caeadc15

sio: make sio_strfaddr() using tt_static_buf · 84638518

Vladislav Shpilevoy authored 5 years ago

Sometimes it is wanted to format multiple addresses and call
sio_strfaddr() multiple times, but it uses one static thread
local buffer, and next calls overwrite results of previous ones.

On the contrary, tt_static_buf() is not a single buffer per
thread - it is 4 buffers. Now sio_strfaddr() uses it and it is
possible to call it 4 times without rewriting old results.

Also, this update makes .bss section a bit smaller - -1 static
buffer of size 1025, and +16 bytes for tt_static buffers. Total:
-1009 bytes.

84638518

swim: drop swim_uuid_str() function · 8770fd5c
Vladislav Shpilevoy authored 5 years ago
```
It appeared  that tt_uuid lib provides the same function:
tt_uuid_str().
```
8770fd5c

test: introduce swim packet filter by destination address · 7251efa5

Vladislav Shpilevoy authored 5 years ago

The filter is going to be used to test the SWIM suspicion
component. The destination filter will break certain network
channels, and the suspicion component shall withstand that.

Part of #3234

7251efa5

test: remove swim packet filter destructors · adb63189

Vladislav Shpilevoy authored 5 years ago

Swim test packet filters are supposed to filter out packets
matching certain criteria or with a probability. They were
implemented as a filter-function and userdata passed into the
former on each invocation. Usually it was allocated on heap and
needed deletion. But it appeared that much simpler is to store
the filters inside struct swim_node, pass it as userdata, and get
rid of userdata destructors and dynamic allocations.

The patch is motivated by necessity to add one new filter, which
anyway will need struct swim_node as userdata.

adb63189

test: rename swim_cluster_node to swim_cluster_member · 1bef21a9

Vladislav Shpilevoy authored 5 years ago

There are two different structures - public struct swim_member
exposed by SWIM API, and struct swim_node defined and used inside
tests. Before this patch swim_cluster_node() was returning struct
swim_member, just historically. But more and more places appear
where it is wanted to safely take struct swim_node, not
swim_member, with an appropriate assertion on an invalid index.

This patch renames swim_cluster_node() to swim_cluster_member(),
and introduces new swim_cluster_node() returning swim_node.

1bef21a9

travis-ci: add Debian Buster and Ubuntu Disco · d2d85f8a
Alexander Turenko authored 5 years ago
```
Fixes #4174.
```
d2d85f8a

sql: invalid integer type in arithmetic operations · ff846255

Kirill Shcherbatov authored 5 years ago

Tarantool SQL used to return 'number' type in request metadata
for arithmetic operations even when only 'integer's were used.

This also fixes query planner optimisation bug:
SELECT a FROM t1 WHERE a+0 IN (SELECT a FROM t1);
used to open a new ephemeral table with OpenTEphemeral when
it is not required (introduced by 2b22b913).

Closes #4103

ff846255

sql: check access rights of table in VIEW · c0ae52e8

Kirill Shcherbatov authored 5 years ago

When access is performed using VIEW, access rights should be
checked against table[s] which it is referencing, not against
VIEW itself. Added a test case to verify this behaviour.

Closes #4104

c0ae52e8

tap: fix is_deeply box.NULL corner cases · 8e337858

Kirill Shcherbatov authored 5 years ago

The tap:is_deeply call used to return inconsistent result
processing an empty tuple and tuple that has no values:
is_deeply({a = box.NULL}, {})  == true
is_deeply({}, {a = box.NULL}) == false
Fixed to return true by default in such case. You may also
set tap.strict = true to change this behaviour.

@TarantoolBot document
Title: tap test new flag 'strict'
In some scenarios it is convenient to distinguish box.NULL and
nil in tap:is(), tap:isnt(), tap:is_deeply() tests.
For example, {result='Success'} and {result='Success', error=None}
are different HTTP responses. You may set t.strict = true to
behave such way.
Example:
t = require('tap').test('123')
t.strict = true
t:is_deeply({a = box.NULL}, {}) -- false

Closes #4125

8e337858

sql: update ptr to VDBE after its creation in sql_txn · d03c337a

Nikita Pettik authored 6 years ago

VDBE object is used in struct sql_txn to add new autoincrement ids in
sequence_next(). List of these ids is returned later as a query
execution result. sql_txn is created once SQL statement is executed
inside transaction and exists till commit or rollback. After its
creation it contains pointer to current VDBE. Each VDBE is freed after
statement is executed. Hence, after first SQL statement within
transaction is executed, sql_txn will point to freed memory (dangling
pointer). This leads to crash in the next processed statement. Fix to
this bug is simple: we must re-assign pointer to VDBE in sql_txn before
VDBE execution.

Closes #4157

d03c337a

sql: make <search condition> accept only boolean · 085ba61e

Nikita Pettik authored 6 years ago

<search condition> is a predicate used as a part of WHERE and JOIN
clauses. ANSI SQL states that <search condition> must accept only
boolean arguments. In our SQL it is implemented as bytecode instruction
OP_If which in turn carries out logic of conditional jump. Hence this
patch makes this opcode accept only boolean values.

Closes #3723

085ba61e

sql: make LIKE predicate return boolean result · 101f87f5

Nikita Pettik authored 6 years ago

According to ANSI, LIKE predicate should return boolean result.
This patch changes type of return value of LIKE predicate.

Part of #3723

101f87f5

sql: make predicates accept and return boolean · cb237e7d

Nikita Pettik authored 6 years ago

This patch make following predicates accept and return only values
of type boolean: IN, EXISTS, OR, AND, NOT, BETWEEN, IS (NULL).
In terms of approach, it is enough to patch opcodes implementing these
predicates.

Part of #3723

cb237e7d

sql: make comparison predicate return boolean · a57ae24e

Nikita Pettik authored 6 years ago

According to ANSI SQL result of comparison predicates must be BOOLEAN.
Before introduction of BOOLEAN type they returned 0 and 1. Now we can
change those values to false and true respectively.

Part of #3723

a57ae24e

sql: improve type determination for column meta · 8dcea62a

Nikita Pettik authored 6 years ago

In most cases we don't assign and store type of node of expression AST
(except for constant literals).  To determine type of node  we use
sql_expr_type() function, which implements logic of quite simple
recursive tree traversal. Before this patch we set type of node after
code generation in sqlExprCodeTarget() without any traversal. This
approach is way worse even then sql_expr_type(). So, to improve accuracy
of type determination, let's always call that method and remove type
assignment in sqlExprCodeTarget().

Closes #4126

8dcea62a

sql: introduce type boolean · 0a8b1791

Nikita Pettik authored 6 years ago

This patch introduces basic facilities to operate on boolean type:
boolean literals "true" and "false" where true > false; alias to null -
unknown; column type "BOOLEAN" and shortcut "BOOL"; opportunity to
insert and select boolean values from table; OR and AND predicates
accept boolean arguments; CAST operation involving boolean type;
comparison between boolean values (including VDBE sorter routines).

Part of #3648

0a8b1791

sql: remove sql_bind_value() function · 0f6d6419

Nikita Pettik authored 5 years ago

It is not used (since we dispatch call of sql_bind_*type*() in
sql_bind_column()), so can be removed.

0f6d6419

sql: use msgpack types instead of custom ones · 56e7b657

Nikita Pettik authored 6 years ago

This patch provides straightforward refactoring replacing enum sql_type
with enum mp_type. Note that we use msgpack format instead of field_type
since it is more suitable when dealing with NULLs.

56e7b657