Commits · f4b80bcf4908442c98f9d16538b7d792f92bd40e · core / tarantool

Apr 10, 2019

test: fix vinyl/errinj_stat failure · f4b80bcf

The patch fixes the following test failures:

 | [022] --- vinyl/errinj_stat.result	Tue Mar 19 17:52:48 2019
 | [022] +++ vinyl/errinj_stat.reject	Wed Mar 20 08:08:41 2019
 | [022] @@ -229,7 +229,7 @@
 | [022] ...
 | [022] stat.tasks_inprogress == 0
 | [022] ---
 | [022] -- true
 | [022] +- false
 | [022] ...
 | [022] stat.tasks_completed == 1
 | [022] ---

 | [013] --- vinyl/errinj_stat.result	Tue Mar 19 17:52:48 2019
 | [013] +++ vinyl/errinj_stat.reject	Wed Mar 20 08:11:15 2019
 | [013] @@ -168,7 +168,7 @@
 | [013] ...
 | [013] stat.tasks_inprogress > 0
 | [013] ---
 | [013] -- true
 | [013] +- false
 | [013] ...
 | [013] stat.tasks_completed == 0
 | [013] ---
 | [013] @@ -183,7 +183,7 @@
 | [013] ...
 | [013] box.stat.vinyl().scheduler.tasks_inprogress > 0
 | [013] ---
 | [013] -- true
 | [013] +- false
 | [013] ...
 | [013] errinj.set('ERRINJ_VY_RUN_WRITE_DELAY', false)
 | [013] ---

The problem occurred because the test didn't make sure that an
asynchronous dump/compaction task had actually started/completed.
In fact, even box.snapshot() doesn't guarantee that a dump task is
complete. This patch adds wait_cond calls to guarantee that the test
never fails like that again.

Closes #4059
Closes #4060

f4b80bcf

test: wait for xlog/snap/log file changes · def75c88

Alexander Tikhonov authored 6 years ago

When a system is under heavy load (say, when tests are run in parallel),
disk writes may stall for some time. This can cause a check performed by
a test to fail, so now we retry such checks for up to 60 seconds until
the condition is met.

This change targets the replication test suite.
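
The retry approach described above can be sketched as a generic polling helper. This is only an illustration in Python, not the wait_cond() shipped with test-run; the function name, the `delay` parameter and its value are assumptions, and only the 60 second limit comes from the message.

```python
import time


def wait_cond(cond, timeout=60.0, delay=0.01):
    """Poll cond() until it returns a truthy value or the timeout expires.

    Hypothetical helper: the name and the `delay` parameter are
    illustrative, they are not taken from test-run.
    """
    deadline = time.monotonic() + timeout
    while True:
        if cond():
            return True
        if time.monotonic() >= deadline:
            return False
        # Give a slow disk or a busy scheduler time to catch up.
        time.sleep(delay)
```

A test would pass a closure that checks for the expected file change and assert that the helper returned True, instead of checking the file once and failing on a stalled write.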

def75c88

test: increase timeouts in replication/errinj · e257eb27

Alexander Turenko authored 6 years ago

Needed for parallel running of the test suite.

Use the default replication_connect_timeout (30 seconds) instead of 0.5
seconds. This doesn't change the meaning of the test cases.

Increase replication_timeout from 0.01 to 0.1.

These changes allow running the test 100 times in 50 parallel jobs
successfully.

e257eb27

test: increase timeouts in replication/misc · 697caa6b

Alexander Turenko authored 6 years ago

All changes are needed to eliminate sporadic failures when testing is run
with, say, 30 parallel jobs.

First, replication_connect_timeout is increased to 30 seconds. This
parameter doesn't change the meaning of the test cases.

Second, increase replication_timeout from 0.01 to 0.03. We usually set
it to 0.1 in tests, but the duration of the gh-3160 test case ('Send
heartbeats if there are changes from a remote master only') is around
100 * replication_timeout seconds and we don't want to make this test
much longer. Runs of the test case (w/o the other ones that are in
replication/misc.test.lua) in 30 parallel jobs show that 0.03 is enough
for the gh-3160 case to pass stably and hopefully enough for the
following test cases too.

697caa6b

test: allow to run replication/misc multiple times · 7a2c31d3

Alexander Turenko authored 6 years ago

It allows running `./test-run.py -j 1 replication/misc <...>
replication/misc`, which can be useful when debugging a flaky problem.

This ability was broken after 7474c14e ('test: enable cleaning of
a test environment'), because test-run started to clean package.loaded
between runs, so each time the test is run it calls ffi.cdef() under
require('rlimit'). This ffi.cdef() call defines a structure, so the second
and subsequent calls to ffi.cdef() raise a Lua error.

This commit does not change anything in regular testing, because each
test runs once (unless stated otherwise in a configuration list).

7a2c31d3

swim: make UUID update smoother and faster · d98f3d52

Vladislav Shpilevoy authored 6 years ago

Before this patch a UUID update was the same as introducing a
new member and waiting until the 'old' one is dropped as 'dead' by
the failure detection component. It could take 2.5 minutes with
the default ack timeout. What is more, with GC turned off the old
UUID would never be deleted.

On a UUID update, the patch marks the old UUID as a 'left' member.
In the best and most common case it guarantees that the old UUID will
be dropped no later than after 2 complete rounds, and marked as
'left' everywhere within log(cluster_size) round steps, even with GC
turned off.

Part of #3234

d98f3d52

swim: introduce quit message · 3702686e

Vladislav Shpilevoy authored 6 years ago

Quit allows an instance to leave a cluster gracefully. Other members will
not consider the instance that quit as dead, and will drop it much
earlier than they would via failure detection.

Quit works as follows: a special message is sent to each member.
Members that receive the message mark the source as 'left' and
keep and disseminate that change for one round. In the best
case, after one round the left member is marked as such in
the whole cluster. A 'left' member will not be added back, because
adding new 'left' members is explicitly prohibited.
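
The rule that a 'left' member is never added back can be sketched with a toy member table. This is a simplified Python illustration, not the SWIM implementation: the table is a plain dict from UUID to status, incarnations and round bookkeeping are omitted, and both function names are hypothetical.

```python
def handle_quit(members, uuid):
    """Mark the sender of a quit message as 'left' if it is known."""
    if uuid in members:
        members[uuid] = "left"


def upsert_member(members, uuid, status):
    """Apply a status received from another instance.

    Returns False when the update is rejected: an unknown member
    that is already 'left' must not be added to the table.
    """
    if uuid not in members and status == "left":
        return False
    members[uuid] = status
    return True
```

With this rule, once an instance has dropped a 'left' member after keeping it for a round, a late message that still mentions that member as 'left' cannot resurrect it.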

Part of #3234

3702686e

test: process IO swim test events before protocol's ones · 21705c08

Vladislav Shpilevoy authored 6 years ago

Before this patch the swim test event loop worked like this: pop
a new event, set the global watch to its deadline, process the
event, repeat until the deadlines are the same. These events
usually generate IO events, which are processed next. But once
swim_quit() is introduced, it becomes possible to insert new IO
events before protocol events like round steps and ack checks.

Because of that it would be impossible to process only the new IO
events, with timeout = 0, or with timeout > 0 but without changing
the global clock.

For example, a typical test would try to call swim_quit() on a
swim instance, and expect that it has sent all the quit messages
without delays immediately. But before this patch it would be
necessary to run at least one swim round to get to the IO
processing.

The patch splits protocol's events and IO events processing logic
into two functions and calls them explicitly in
swim_wait_timeout() - the main function to check something in the
swim tests.

Part of #3234

21705c08

test: on close of swim fake fd send its packets, not drop · 3f0a4e25

Vladislav Shpilevoy authored 6 years ago

The originator of the packets has already got an OK status and expects
these messages to be sent even if the originator is closed right after
that. This commit follows the TCP way and sends all the pending
messages before actually closing the fake fd.

Part of #3234

3f0a4e25

test: allow to remove swim nodes from the cluster · cc646f44

Vladislav Shpilevoy authored 6 years ago

Until now it was impossible in swim tests to drop a SWIM instance
from the cluster. It could be either restarted or
blocked, but a real drop led to an assertion on any attempt to use
one of the methods like swim_wait_timeout(). This was due to the
inability to get an instance's UUID without the instance itself, even
if it was stored in the membership tables of other instances.

This patch makes swim_cluster store swim instances and UUIDs
separately. This is going to be used to test the swim_quit() API.
Also, some cfg parameters are saved as well, like ack timeout and gc
mode. They are used to restart a node with exactly the same cfg as
it had before the restart, even if the original struct swim * is no
longer valid.

Part of #3234

cc646f44

Apr 09, 2019

sql: make FOR EACH ROW clause mandatory in trigger definition · 7ce52399

Konstantin Osipov authored 6 years ago

Before this patch, it was possible to create a trigger without FOR EACH
ROW clause, for example:

CREATE TRIGGER trg AFTER DELETE ON tbl BEGIN ; END;

In ANSI SQL, if trigger-timing-clause is not specified, FOR EACH
STATEMENT is used. Tarantool, however, did not support FOR EACH
STATEMENT and assumed FOR EACH ROW. This could break future
applications, once FOR EACH STATEMENT is added.

Thus, make FOR EACH ROW clause mandatory. Update tests.

No docs ticket since there is no docs for this feature yet :/ -
will document the fixed behaviour right away.

7ce52399

swim: dissemination review fixes · dc5a6745

Konstantin Osipov authored 6 years ago

Rename event_queue -> dissemination_queue

dc5a6745

swim: update comments, use singular in method names. · 5139ff14

Konstantin Osipov authored 6 years ago

5139ff14

swim: introduce dissemination component · ecef10c3

Vladislav Shpilevoy authored 6 years ago

The dissemination component broadcasts events about member status
updates. When any member attribute is updated (incarnation,
status, UUID, address), the member is put into an event queue.
Members from the queue are encoded into each round step message
with a higher priority and before the anti-entropy section.

It means that even if a cluster consists of hundreds of members
and one of them was updated on one of the instances, this update will
be disseminated regardless of whether this member is encoded
into the anti-entropy section or not. It drastically speeds up event
dissemination, according to the SWIM paper, and this is noticeable in
the tests.
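
The priority of events over anti-entropy can be sketched as follows. This is a rough Python illustration under simplifying assumptions: the message budget is counted in entries rather than bytes, members are plain strings, and the function name and dictionary keys are hypothetical.

```python
def build_round_message(events, members, max_entries):
    """Fill a round step message: queued events first, then anti-entropy.

    Assumption: the size limit is expressed as a number of entries,
    while a real packet is limited by its size in bytes.
    """
    dissemination = events[:max_entries]
    remaining = max_entries - len(dissemination)
    # Anti-entropy only gets the space that events did not use.
    anti_entropy = members[:remaining]
    return {"dissemination": dissemination, "anti_entropy": anti_entropy}
```

With 100 members and room for 10 entries, an update of a member that would not fit into the anti-entropy part still travels in every message while it stays in the queue.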

Part of #3234

ecef10c3

test: set packet drop rate instead of flag in swim tests · 86af0bd3

Vladislav Shpilevoy authored 6 years ago

Before the dissemination component it was enough in the tests to
either drop all packets to/from a certain member or not drop any
at all. But after dissemination it will be time to test a more
granular packet loss table: not 0/100, but 5/10/20/50/.../100
packet loss rate.
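
A drop rate in place of a drop flag can be sketched like this. The Python snippet is only an illustration of the idea; the function name is hypothetical and the rate is assumed to be a percentage from 0 to 100.

```python
import random


def should_drop(drop_rate, rng):
    """Decide whether to drop one packet.

    drop_rate is a percentage: 0 never drops, 100 always drops.
    rng is a random.Random instance, passed in so that a test can
    seed it and get reproducible results.
    """
    return rng.random() * 100 < drop_rate
```

The old behaviour is the two extreme cases: a rate of 0 is the cleared flag and a rate of 100 is the set flag.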

Part of #3234

86af0bd3

test: speed up swim big cluster failure detection · f30d9ed2

Vladislav Shpilevoy authored 6 years ago

The test checks that if a member has failed in a big cluster, it
is eventually deleted from all instances. But it takes too much
real time despite usage of virtual time.

This is because total deletion of a member takes
O(N + ack_timeout * 5) time: N to wait until every member
has pinged the failed one at least once, + 3 * ack_timeout to learn
that it is dead, and + 2 * ack_timeout to drop it. Of course, it
is an upper bound, and usually it is faster, but not by much. For
example, on a cluster of size 50 it easily takes 55 virtual
seconds.

On the contrary, just learning that a member is dead on every
instance takes O(log(N)) according to the SWIM paper. In the
same test with a 50-instance cluster it takes ~15 virtual seconds
to disseminate the 'dead' status of the failed member to every
instance, and that is even without the dissemination component, with
anti-entropy only.

Looking ahead, the subsequent patches show in tests that with
the dissemination component it takes only ~6 virtual seconds.

In summary, without losing test coverage it is much faster to
turn off SWIM GC and wait until the failed member looks dead on
all instances.
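
The upper bound for total deletion can be checked with a small calculation. The Python sketch below assumes a round step of 1 virtual second and an ack timeout of 1 virtual second; neither value is stated in the message, they are picked because they reproduce the 55 seconds quoted for 50 instances. The function name is hypothetical.

```python
def full_deletion_bound(cluster_size, ack_timeout, round_step=1):
    """Upper bound, in virtual seconds, for dropping a failed member."""
    # N round steps so that every member pings the failed one once.
    ping_everyone = cluster_size * round_step
    # 3 ack timeouts to declare it dead, 2 more to drop it.
    learn_dead = 3 * ack_timeout
    drop = 2 * ack_timeout
    return ping_everyone + learn_dead + drop


print(full_deletion_bound(50, ack_timeout=1))  # 55
```

The linear term dominates as the cluster grows, which is why waiting only for the 'dead' status is so much cheaper in the test.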

Part of #3234

f30d9ed2

swim: make members array decoder be a separate function · 190e201a

Vladislav Shpilevoy authored 6 years ago

At the moment the SWIM protocol stores an array of members in only one
place: inside the anti-entropy component. Its decoding is a
simple loop taking the member definitions one by one and
upserting them into the member table.

But dissemination also has something similar to a members
array: an array of events. The trick is that an event is
basically the same as a member +/- a couple of optional fields.
Events are also decoded into the member definition structure. It
means that the anti-entropy decoder can be easily reused.

Part of #3234

190e201a

swim: encapsulate member bin info into a 'passport' · afbc8504

Vladislav Shpilevoy authored 6 years ago

Each member stored in the dissemination and anti-entropy components
should carry a unique identifier, a status, and an address. Those
attributes are UUID, IP, Port, enum swim_member_status,
incarnation.

Now they are sent only in the scope of anti-entropy, but the forthcoming
dissemination component will also need these attributes
for each event.

This commit makes the vital attributes and their code more
reusable by encapsulation of them into a binary passport
structure.

Part of #3234

afbc8504

Apr 08, 2019

vinyl: rename vy_set and vy_set_with_colmask · 7b56f1fe

Konstantin Osipov authored 6 years ago

Rename vy_set() and vy_set_with_colmask() to vy_tx_set() and
vy_tx_set_with_colmask()

These methods really belong to vy_tx module, so move them there.

7b56f1fe

lua: add type of operation to space trigger parameters · 5ab0763b

Serge Petrenko authored 6 years ago

Add the type of operation which is being executed to before_replace and
on_replace triggers.

Closes #4099

@TarantoolBot document
Title: new parameter for space before_replace and on_replace triggers
Now before_replace and on_replace triggers accept an additional
parameter: the type of operation that is being executed.
(INSERT/REPLACE/DELETE/UPDATE/UPSERT)
For example, a trigger function may now look like this:
```
function before_replace_trig(old, new, space_name, op_type)
    if op_type == 'INSERT' then
        return old
    else
        return new
    end
end
```
This trigger restricts all INSERTs, but allows REPLACEs, UPSERTs, DELETEs and
UPDATEs.

5ab0763b

Add idle to downstream status in box.info · a4a7744c

Roman Tokarev authored 6 years ago

a4a7744c

Apr 07, 2019

test: update test-run · 879ec075

Alexander Turenko authored 6 years ago

Add more logging into wait_fullmesh() and return immediately with false
when 'stopped' status is observed.

The purpose of the change is to provide more information in case of a
master-master replication bootstrap failure.

879ec075

vinyl: incorporate tuple comparison hints into vinyl data structures · dafd3926

Vladimir Davydov authored 6 years ago

Apart from speeding up statement comparisons and hence index lookups,
this is also a prerequisite for multikey indexes, which will reuse tuple
comparison hints as offsets in indexed arrays.

Albeit huge, this patch is pretty straightforward - all it does is
replace struct tuple with struct vy_entry (which is tuple + hint pair)
practically everywhere in the code. Now statements are stored and
compared without hints only in a few places, primarily at the very top
level. Hints are also computed at the top level so it should be pretty
easy to replace them with multikey offsets when the time comes.

dafd3926

vinyl: prepare for storing hints in vinyl data structures · a075fb97

Vladimir Davydov authored 6 years ago

This patch adds a helper struct vy_entry, which unites a statement with
a hint. We will use this struct to store hinted statements in vinyl data
structures, such as cache or memory tree.

Note, it's defined in a separate file to minimize dependencies.
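
The idea of pairing a statement with a comparison hint can be sketched as below. This Python snippet is an illustration only: the hint is assumed to be the first key field taken as an integer, which is a made-up rule and not how hint_t is computed, and the names Entry, make_entry and compare_entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    stmt: tuple  # the statement itself
    hint: int    # precomputed value that orders like the statement


def make_entry(stmt):
    # Hypothetical hint: the first field, assumed to be an integer.
    return Entry(stmt=stmt, hint=stmt[0])


def compare_entries(a, b):
    """Return -1, 0 or 1, using hints to skip the full comparison."""
    if a.hint != b.hint:
        return -1 if a.hint < b.hint else 1
    # Equal hints decide nothing: fall back to comparing statements.
    return (a.stmt > b.stmt) - (a.stmt < b.stmt)
```

The hint is computed once when the entry is created, so a container that stores entries pays for the full comparison only when two hints are equal.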

a075fb97

vinyl: add wrapper around vy_tx_set · 43e79618

Vladimir Davydov authored 6 years ago

This patch adds vy_set and vy_set_with_colmask functions. For now they
simply forward all arguments to vy_tx_set, but once comparison hints are
introduced, they will also compute a hint for the inserted statement.
Later, with the appearance of multikey indexes, they will also extract
multikey offsets.

43e79618

vinyl: zap vy_mem_iterator_curr_stmt helper · 6affa359

Vladimir Davydov authored 6 years ago

It's a trivial one-line function, which can be folded without hurting
readability, i.e. it only obfuscates the code. Let's kill it.

6affa359

vinyl: rename tree_mem_key to vy_mem_tree_key · 6e793970

Vladimir Davydov authored 6 years ago

For aesthetic purposes. No functional changes.

6e793970

vinyl: rename vy_cache_entry to vy_cache_node · 4465a62e

Vladimir Davydov authored 6 years ago

In the next patch I'm planning to introduce the concept of vy_entry,
which will encapsulate a statement stored in a container. Let's rename
vy_cache_entry to vy_cache_node so as not to mix the two concepts.

4465a62e

Move hint_t definition to tuple_compare.h · 3475eead

Vladimir Davydov authored 6 years ago

So as not to include heavy key_def.h when we only need hint_t.

3475eead

lib: update msgpuck library · 51855796

Kirill Shcherbatov authored 6 years ago

The msgpuck dependency has been updated because the new version
introduces the new method mp_stack_top for the mp_stack class
which we will use to store a pointer for a multikey frame to
initialize a field_map in the case of a multikey index.

As the library API has changed, the code has been modified
correspondingly.

@locker: add missing frame update in vy_stmt_new_surrogate_delete.

Needed for #1012

51855796

xrow: improve corrupted header logging on an error · 9d9b4188

Serge Petrenko authored 6 years ago

Improve row printing to log. Since say only has a 16k buffer, there is no
point in printing the whole packet, which can have arbitrary length, in one
go.
So, print the header row by row, 16 bytes in a row, and format the output to
match `xxd` output:
```
[001] 2019-04-05 18:22:46.679 [11859] iproto V> Got a corrupted row:
[001] 2019-04-05 18:22:46.679 [11859] iproto V> 00000000: A3 02 D6 5A E4 D9 E7 68 A1 53 8D 53 60 5F 20 3F
[001] 2019-04-05 18:22:46.679 [11859] iproto V> 00000010: D8 E2 D6 E2 A3 02 D6 5A E4 D9 E7 68 A1 53 8D 53
```
Now we can get rid of malloc, and use a preallocated tt_static_buf
instead.
Also, replace a big macro with a small macro and a helper function.
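
The row format shown in the log sample above can be reproduced with a short sketch. This Python snippet only illustrates the layout (an 8-digit hex offset followed by 16 space-separated bytes); it is not the C code of the patch, and the function name is hypothetical.

```python
def hex_rows(data, width=16):
    """Format bytes as rows of `width` bytes prefixed by a hex offset."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_bytes = " ".join(f"{b:02X}" for b in chunk)
        rows.append(f"{offset:08X}: {hex_bytes}")
    return rows
```

Each row is short and has a bounded length, so it can be logged separately no matter how long the whole packet is.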

Followup to f645119f

9d9b4188

Revert "Revert "Add more tests for DDL outside autocommit mode."" · 9bf66953

Vladimir Davydov authored 6 years ago

This reverts commit 8be593ce.

Now that the use-after-free bug in the space_truncate() implementation has
been fixed, we can enable this test again.

Follow-up #4093

9bf66953

box: fix use-after-free in space_truncate · b76542c4

Vladimir Davydov authored 6 years ago

space_truncate allocates a statement on the stack which is grossly
incorrect as the stack may be purged once the function returns while
box_process_rw expects the statement to be valid until the end of
the transaction. By happy accident, it worked fine until commit
1f7b0d65 ("Require for single statement not autocommit in case of
ddl"), which made it possible to run this function from a transaction
and hence increased the probability of hitting the use-after-free bug.
The fix is trivial: allocate a truncation statement on the region.

Fixes commit 353bcdc5 ("Rework space truncation").

Closes #4093

b76542c4

test: enable cleaning of a test environment · 7474c14e

Alexander Turenko authored 6 years ago

This commit enables the pretest_clean test-run option on 'core = tarantool'
test suites with Lua tests and on 'core = app' test suites. See #4094
for an example of a problem that is eliminated by this option.

For 'core = tarantool': this option drops non-system spaces, resets data
in system spaces and global variables to the initial state, and unloads
packages except built-in ones.

For 'core = app': this option deletes xlog and snap files before running a
test.

test-run doesn't remove global variables that are listed in the
'protected_globals' global variable. Use it for, say, functions that are
defined in an instance file and called from tests.

See test-run/README.md for details on how exactly the option
works.

Removed unused cfg_filter() function from test/engine/box.lua.

Fixes #4094.

7474c14e

Apr 05, 2019

Revert "Add more tests for DDL outside autocommit mode." · 8be593ce

Alexander Turenko authored 6 years ago

This reverts commit 14a87bb7.

The test cases generate corrupted xlog files (see #4093) and don't allow
other tests to proceed successfully, so we need to temporarily disable
these cases. They should be re-enabled in the scope of #4093.

8be593ce

test: update test-run · 4ee8910b

Alexander Turenko authored 6 years ago

* Added default timeout for wait_cond() (60 sec).
* Updated pyyaml version in requirements.txt.
* Fixed reporting of non-default server fail at start.
* Stop 'proxy' when a new non-default instance fails.
* Added user-defined protected globals for pretest_clean.

4ee8910b

sql: fix extra type calculation before bytecode generation · 2f6f3bbd

Nikita Pettik authored 6 years ago

In SQL the type of a constant literal (e.g. 1, 2.5, 'abc') is assigned right
after parsing and saving it into struct Expr. Occasionally, the type is
re-assigned before emitting opcodes to store the literal into VDBE memory.
What is more, for a floating point number the type is changed to "integer".
This patch fixes this obvious misbehaviour.

2f6f3bbd

Drop const qualifier of struct tuple · 077671fe

Vladimir Davydov authored 6 years ago

Using the const qualifier for complex structures like tuple is bad.
We already have to cast it to drop the const qualifier now and then,
e.g. to increment/decrement the reference counter.

We are planning to wrap struct tuple in a helper struct (aka entry) to
store it in vinyl containers along with a comparison hint (cache, memory
tree, etc). We will be passing this struct by value so we won't be able
to retain const qualifier, because in contrast to a const pointer, one
must initialize a const struct upon definition.

That said, it's time to drop const qualifier of struct tuple everywhere,
like we have already done in case of struct key_def and tuple_format.

077671fe

box: remove _sql_stat1 and _sql_stat4 system tables · ec93b4a5

Mergen Imeev authored 6 years ago

These tables won't be used anymore and should be deleted.

Note, this patch breaks backward compatibility between 2.1.1 and
2.1.2, but that's okay as 2.1.1 was beta and we didn't recommend
anyone to use it.

Part of #2843
Follow up #4069

ec93b4a5

sql: allocate memory for index_id in VDBE · fd6e4b94

Mergen Imeev authored 6 years ago

Currently, the memory for index_id is not allocated in VDBE code
in the sql_code_drop_table() and sql_drop_index() functions. This
may lead to a segmentation fault.

Needed for #2843

fd6e4b94