Commits · ccd451eb9652b27821dba0ff74e6443c296ae246 · core / tarantool

Nov 16, 2017

Always touch snapshot in checkpoint_daemon · ccd451eb

Ilya authored 7 years ago

* Remove check on checkpoint signature and always touch snapshot
  even if there were no transactions since the previous checkpoint
* Fix timeout check

Fixup #2780

ccd451eb

Update vinyl/info.result · f6ec32e8
Roman Tsisyk authored 7 years ago

f6ec32e8
Merge remote-tracking branch 'origin/1.7' into vy-read-iterator-cleanup · 25324cde
Roman Tsisyk authored 7 years ago

25324cde
vinyl: run iterator: do not pin slice · a850321d
Vladimir Davydov authored 7 years ago

It must be pinned by the caller (vy_point_iterator, vy_read_iterator).

a850321d

vinyl: read iterator: simplify version checking · 0b4e467d

Vladimir Davydov authored 7 years ago

 - Instead of returning the mysterious -2 error code, restart the
   iterator right in vy_read_iterator_next_key().
 - Pin slices while fetching data from disk to avoid checking range
   version after each disk read.

0b4e467d

vinyl: read iterator: do not reopen all sources when range is changed · 5e414a73

Vladimir Davydov authored 7 years ago

It is not necessary to reopen all sources when the iterator transgresses
the current range's boundaries. It's enough to reopen only disk sources,
because txw, cache, and mem do not belong to ranges.

5e414a73

vinyl: read iterator: do not fetch statements found in the cache · b1d73e0c

Vladimir Davydov authored 7 years ago

When the read iterator stops reading a chain of statements from the
cache it advances all other sources by calling next_key() until the
last_stmt is reached. This effectively cancels the benefit of using
the cache, because all statements skipped due to the cache are fetched
from in-memory trees or, even worse, on-disk runs.

To fix this, let's introduce and use skip() method which makes the
source iterator jump to the first statement following a particular key.
Its implementation is similar to and reuses the code from start and
restore procedures.

With this new method, we don't need to mangle iterator_type/key when
reopening source iterators during restoration so that they start
iteration from last_stmt: instead we can advance them with skip() on
the first iteration. Let's do this too, because the iterator can
benefit from knowing the real iterator type (e.g. cache can stop
ITER_EQ iteration even if there's no chain in the cache, by looking at
vy_cache_entry::left_boundary_level,right_boundary_level).
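
The idea behind skip() can be illustrated with a minimal sketch, assuming a source that is just a sorted array of integer keys. The `sketch_source` struct and its functions are hypothetical and are not the vinyl iterator API; they only show how a source can jump past a key in one step instead of being advanced by repeated next_key() calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a source iterator over sorted integer keys. */
struct sketch_source {
	const int *keys;	/* sorted in ascending order */
	size_t count;		/* number of keys */
	size_t pos;		/* current position; equals count at EOF */
};

/*
 * Position the source at the first key strictly greater than
 * last_key, using binary search rather than stepping key by key.
 */
void
sketch_source_skip(struct sketch_source *src, int last_key)
{
	assert(src != NULL);
	size_t lo = 0, hi = src->count;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (src->keys[mid] <= last_key)
			lo = mid + 1;
		else
			hi = mid;
	}
	src->pos = lo;
}

/* Store the current key and return 1, or return 0 at EOF. */
int
sketch_source_get(const struct sketch_source *src, int *key)
{
	assert(src != NULL && key != NULL);
	if (src->pos >= src->count)
		return 0;
	*key = src->keys[src->pos];
	return 1;
}
```

In this sketch, a source positioned with `sketch_source_skip(&src, last_key)` never visits the keys that the cache has already covered.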

b1d73e0c

vinyl: split vy_stmt_iterator.h header · 89ef2c76

Vladimir Davydov authored 7 years ago

There is no such thing as vy_stmt_iterator anymore, so split the header
into vy_stmt_stream.h and vy_read_view.h.

89ef2c76

vinyl: zap vy_stmt_iterator interface · d1ab55cf

Vladimir Davydov authored 7 years ago

vy_read_iterator was the only user of this interface. As now it handles
sources of different types differently, the interface is not needed any
more.

d1ab55cf

vinyl: decompose read iterator merge procedure · 023ab35f

Vladimir Davydov authored 7 years ago

The generic approach trying to build the merge procedure around the
vy_stmt_iterator interface didn't pan out, because sources are way too
different: in contrast to other sources, the cache stores intervals;
run iterators may yield; txw does not preserve statement history.

Let's rewrite vy_read_iterator_next_{key,lsn} in such a way that they
do not use this generic interface. This results in quite a bit of code
being duplicated, because loops over sources are unrolled, but this is
intentional - hopefully it makes the code easier to follow. The patch
isn't supposed to change the merge algorithm or remove any optimization
implemented in it.

023ab35f

Partially revert "vinyl: reference iterator keys" · 705ca361

Vladimir Davydov authored 7 years ago

This reverts commit be8ee29a.

Taking a reference to the search key in source iterators is pointless -
it can't go away while we are using them.

The only part of this patch that makes sense is removing the const
specifier from vy_point_iterator->key.

705ca361

Add test for vinyl statistics · 1456af17
Vladimir Davydov authored 7 years ago

Closes #2558

1456af17
Fix input wrapping when terminal size changed · 8ff671d4
Ivan Kosenko authored 7 years ago

8ff671d4

replication: enable ACKs only after SUBSCRIBE · e4bcc2ba

Georgy Kirichenko authored 7 years ago

Start the applier->writer fiber only after SUBSCRIBE.
Otherwise the writer will send an ACK during FINAL JOIN and break
the replication protocol.

Fixes #2726

e4bcc2ba

Nov 15, 2017

Fix replication/gc test · 473b85a3

Vladimir Davydov authored 7 years ago

Make sure the master receives an ack from the replica and performs
garbage collection before checking the checkpoint count.

473b85a3

relay: don't delete xlog files until replica confirms receipt · 2d86127b

Vladimir Davydov authored 7 years ago

We remove old xlog files as soon as we have sent them to all replicas.
However, the fact that we have successfully sent something to a replica
doesn't necessarily mean the replica will have received it. If a replica
fails to apply a row (for instance, it is out of memory), replication
will stop, but the data files have already been deleted on the master so
that when the replica is back online, the master won't find appropriate
xlog to feed to the replica and replication will stop again.

The user visible effect is the following error message in the log and in
the replica status:

  Missing .xlog file between LSN 306 {1: 306} and 311 {1: 311}

There is no way to recover from this but to re-bootstrap the replica
from scratch.

The issue was introduced by commit ba09475f ("replica: advance gc
state only when xlog is closed"), which aimed to make the status
update procedure as lightweight and fast as possible and so moved
gc_consumer_advance() from tx_status_update() to a special gc message.
A gc message is created and sent to TX as soon as an xlog is relayed.
Let's rework this so that gc messages are appended to a special queue
first and scheduled only when the relay receives the receipt
confirmation from the replica.
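
The queueing scheme described above can be sketched as follows. This is a minimal illustration, not the relay code: the `sketch_gc_queue` type and its functions are hypothetical, and a single scalar LSN stands in for the vclock that the real messages carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_GC_QUEUE_MAX 16

/* Hypothetical queue of xlog signatures relayed but not yet acked. */
struct sketch_gc_queue {
	int64_t pending[SKETCH_GC_QUEUE_MAX];	/* ring buffer, ascending */
	size_t head;				/* index of the oldest entry */
	size_t count;				/* number of queued entries */
	int64_t advanced_to;			/* last signature given to gc */
};

void
sketch_gc_queue_init(struct sketch_gc_queue *q)
{
	assert(q != NULL);
	q->head = 0;
	q->count = 0;
	q->advanced_to = -1;	/* nothing has been handed to gc yet */
}

/* Called when an xlog has been fully relayed. Returns -1 if full. */
int
sketch_gc_queue_push(struct sketch_gc_queue *q, int64_t signature)
{
	assert(q != NULL);
	if (q->count == SKETCH_GC_QUEUE_MAX)
		return -1;
	size_t tail = (q->head + q->count) % SKETCH_GC_QUEUE_MAX;
	q->pending[tail] = signature;
	q->count++;
	return 0;
}

/*
 * Called when the replica confirms receipt up to acked_lsn.
 * Only entries covered by the ack are released to gc.
 */
void
sketch_gc_queue_on_ack(struct sketch_gc_queue *q, int64_t acked_lsn)
{
	assert(q != NULL);
	while (q->count > 0 && q->pending[q->head] <= acked_lsn) {
		q->advanced_to = q->pending[q->head];
		q->head = (q->head + 1) % SKETCH_GC_QUEUE_MAX;
		q->count--;
	}
}
```

With this ordering, an xlog that was sent but never acknowledged stays in the queue, so the garbage collector is not told it may delete that file.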

Closes #2825

2d86127b

vinyl: log reads that take too long · af63fcbe

Vladimir Davydov authored 7 years ago

If reading a single statement from vinyl takes longer than the value of
box.cfg.too_long_threshold, the request will be logged:

  512/1: select([1], EQ) => REPLACE([100001, 1], lsn=200006) took too long: 0.626 sec

This is useful for debugging.

While we are at it, let's also remove 'timeout' from the vinyl engine
constructor arguments and set it with box_set_vinyl_timeout() on box
initialization instead, similarly to vinyl_max_tuple_size.
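
The threshold check can be sketched as below. This is only an illustration under stated assumptions: `sketch_report_slow_read()` is a hypothetical helper, not a function from the vinyl sources, and it formats only the tail of the log line shown above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a "took too long" line into buf if elapsed exceeds
 * threshold (both in seconds). Returns 1 if a line was produced
 * and 0 otherwise, in which case buf is set to an empty string.
 */
int
sketch_report_slow_read(char *buf, size_t size, const char *request,
			double elapsed, double threshold)
{
	assert(buf != NULL && size > 0 && request != NULL);
	if (elapsed <= threshold) {
		buf[0] = '\0';
		return 0;
	}
	snprintf(buf, size, "%s took too long: %.3f sec", request, elapsed);
	return 1;
}
```

A caller would measure the time spent reading one statement and pass it together with the configured threshold; reads at or below the threshold produce no output.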

Closes #2871

af63fcbe

vinyl: zap vy_key_snprint() and vy_key_str() · 41e08261
Vladimir Davydov authored 7 years ago

Use generic tuple_snprint() and tuple_str() instead.

41e08261

vinyl: merge vy_cursor and vinyl_iterator · 54f9d3b7

Vladimir Davydov authored 7 years ago

The vinyl_iterator struct was introduced as a C++ wrapper around
vy_cursor. Since there's no C++ code left in the engine, and both
structures are defined in the same file, we can merge them now.

54f9d3b7

vinyl: remove engine wrapper functions · f5fca22f

Vladimir Davydov authored 7 years ago

The engine infrastructure was initially implemented in C++ so we
needed the wrappers to provide C++ API to Vinyl. Now everything is
in C so we don't need them any more. Let's fold them in vinyl.c.

Note, this patch does not touch vinyl_engine, vinyl_index, and
vinyl_iterator structures, they are still there, it just gets rid
of the intermediate layer of wrapper functions, which is not needed
any more.

f5fca22f

vinyl: pass force_recovery on engine initialization · 85fd9907
Vladimir Davydov authored 7 years ago

Accessing configuration from inside an engine implementation
violates encapsulation.

85fd9907

Fix race in garbage collection · b8717738

Vladimir Davydov authored 7 years ago

Engine callbacks that perform garbage collection may sleep, because they
use coio for removing files to avoid blocking the TX thread. If garbage
collection is called concurrently from different fibers (e.g. from relay
fibers), we may attempt to delete the same file multiple times. What is
worse, xdir_collect_garbage(), used by engine callbacks to remove files,
isn't safe against concurrent execution - it first unlinks a file via
coio, which involves a yield, and only then removes the corresponding
vclock from the directory index. This opens a race window for another
fiber to read the same vclock and yield; in the interim the vclock can be
freed by the first fiber:

  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1  0x00007f105ceda3fa in __GI_abort () at abort.c:89
  #2  0x000055e4c03f4a3d in sig_fatal_cb (signo=11) at main.cc:184
  #3  <signal handler called>
  #4  0x000055e4c066907a in vclockset_remove (rbtree=0x55e4c1010e58, node=0x55e4c1023d20) at box/vclock.c:215
  #5  0x000055e4c06256af in xdir_collect_garbage (dir=0x55e4c1010e28, signature=342, use_coio=true) at box/xlog.c:620
  #6  0x000055e4c0417dcc in memtx_engine_collect_garbage (engine=0x55e4c1010df0, lsn=342) at box/memtx_engine.c:784
  #7  0x000055e4c0414dbf in engine_collect_garbage (lsn=342) at box/engine.c:155
  #8  0x000055e4c04a36c7 in gc_run () at box/gc.c:192
  #9  0x000055e4c04a38f2 in gc_consumer_advance (consumer=0x55e4c1021360, signature=342) at box/gc.c:262
  #10 0x000055e4c04b4da8 in tx_gc_advance (msg=0x7f1028000aa0) at box/relay.cc:250
  #11 0x000055e4c04eb854 in cmsg_deliver (msg=0x7f1028000aa0) at cbus.c:353
  #12 0x000055e4c04ec871 in fiber_pool_f (ap=0x7f1056800ec0) at fiber_pool.c:64
  #13 0x000055e4c03f4784 in fiber_cxx_invoke(fiber_func, typedef __va_list_tag __va_list_tag *) (f=0x55e4c04ec6d4 <fiber_pool_f>, ap=0x7f1056800ec0) at fiber.h:665
  #14 0x000055e4c04e6816 in fiber_loop (data=0x0) at fiber.c:631
  #15 0x000055e4c0687dab in coro_init () at /home/vlad/src/tarantool/third_party/coro/coro.c:110

Fix this by serializing concurrent execution of garbage collection
callbacks with a latch.

b8717738

box: disable schema auto upgrade for replication · 582a85d4

Vladimir Davydov authored 7 years ago

Currently, box.schema.upgrade() is called automatically after box.cfg()
if the upgrade is considered safe (currently, only upgrade to 1.7.5 is
"safe"). However, no upgrade is safe in case replication is configured,
because it can easily result in replication conflicts. Let's disable
auto upgrade if the 'replication' configuration option is set.

Closes #2886

582a85d4

Travis CI: deploy development branches to PackageCloud · 3f8e6de2
Roman Tsisyk authored 7 years ago

3f8e6de2

Nov 13, 2017

Fix recovery from 1.7.5 xlogs containing space truncate+drop · 50a15e3f

Vladimir Davydov authored 7 years ago

Before commit 29d00dca ("alter: forbid to drop space with truncate
record") a space record was removed before the corresponding record in
the _truncate system space so we should disable the check that the space
being dropped doesn't have a record in _truncate in case we are
recovering data generated by tarantool < 1.7.6.

Closes #2909

50a15e3f

iobuf: remove dead code · 05506102

Konstantin Osipov authored 7 years ago

Remove iobuf_is_idle(). It uses obuf.wpos, which itself
needs to be removed from obuf.

05506102

Nov 10, 2017

box: do not pass obuf to box_process_auth(). · 970aabb3

Konstantin Osipov authored 7 years ago

Spare box from iproto I/O to simplify transition of control of output
buffer from iproto to tx thread. In scope of gh-946.

Tag: 1.7.7

970aabb3

Nov 06, 2017

Fix flaky vinyl/ddl.test.lua · 7b2945d6
Roman Tsisyk authored 7 years ago

Tag: 1.7.6

7b2945d6
Fix box/tree_misc.test.lua on 32-bit · 16151314
Roman Tsisyk authored 7 years ago

16151314
Filter out bloom_filter in vinyl/layout.test.lua · 0a37ccad
Roman Tsisyk authored 7 years ago

Bloom filter depends on hash function, which depends on ICU
version, which may vary.

0a37ccad
Fix "field type is deprecated" error message · 67e3a34d
Roman Tsisyk authored 7 years ago

67e3a34d
Add test cases for 1.7.5 -> 1.7.6 upgrade · cd5a7edf
Roman Tsisyk authored 7 years ago

cd5a7edf
box: rename unicode_s1 to unicode_ci · 8bd310c3
Roman Tsisyk authored 7 years ago

+ Don't use id=0 for collations

Follow up #2649

8bd310c3

Fix tuple hash computation for scalar and nullable string fields · 69e30163

Vladimir Davydov authored 7 years ago

Fix tuple_hash_field() to handle the following cases properly:

 - Nullable string field (crash in vinyl on dump).
 - Scalar field with collation enabled (crash in memtx hash index).

Add corresponding test cases.

69e30163

memtx: restore stable order for nullable indexes · 926a32f0

Vladimir Davydov authored 7 years ago

First, unique but nullable indexes are not rebuilt when the primary key
is altered although they should be, because they can contain multiple
NULLs. Second, when rebuilding such indexes we use a wrong key def
(index_def->key_def instead of cmp_def), which results in lost stable
order after recovery. Fix both these issues and add a test case.
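
Why the choice of key definition matters can be shown with a small sketch. The types and functions below are hypothetical and greatly simplified. The assumption is that the key definition compares only the secondary key parts, while cmp_def also compares the primary key parts, which is what makes tuples with NULL keys totally ordered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tuple with a primary key and a nullable secondary key. */
struct sketch_tuple {
	int pk;			/* primary key, never NULL */
	bool sk_is_null;	/* true if the secondary key field is NULL */
	int sk;			/* secondary key, valid if !sk_is_null */
};

/* Compare by the secondary key only: all NULLs compare equal. */
int
sketch_cmp_key_def(const struct sketch_tuple *a, const struct sketch_tuple *b)
{
	assert(a != NULL && b != NULL);
	if (a->sk_is_null || b->sk_is_null)
		return (int)b->sk_is_null - (int)a->sk_is_null;
	return (a->sk > b->sk) - (a->sk < b->sk);
}

/* Compare by the secondary key, then break ties by the primary key. */
int
sketch_cmp_cmp_def(const struct sketch_tuple *a, const struct sketch_tuple *b)
{
	int rc = sketch_cmp_key_def(a, b);
	if (rc != 0)
		return rc;
	return (a->pk > b->pk) - (a->pk < b->pk);
}
```

With the first comparator, two tuples whose secondary key is NULL are indistinguishable, so their relative order depends on insertion order. The second comparator orders them by primary key regardless of how the index was built.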

926a32f0

Test replication in case primary index uses collation · 5d009876

Vladimir Davydov authored 7 years ago

Needed to check if the key definition loaded from vylog to send initial
data to a replica has the collation properly recovered.

5d009876

vinyl: store nullability key def property in vylog · 4b0cc22a

Vladimir Davydov authored 7 years ago

It isn't stored currently, but this doesn't break anything, because the
primary key, which is the only key whose definition is used after having
been loaded from vylog, can't be nullable. Let's store it there just in
case. Update the vinyl/layout test to check that.

4b0cc22a

vinyl: enable collations · 610ae25a

Vladimir Davydov authored 7 years ago

Collations were disabled in vinyl by commit 2097908f ("Fix
collation test on some platforms and disable collation in vinyl"),
because a key_def referencing a collation could not be loaded from
vylog on recovery (collation objects are created after vylog is
recovered). Now, it isn't a problem anymore, because the decoding
procedure, key_def_decode_parts(), deals with struct key_part_def,
which references a collation by id and hence doesn't need a collation
object to be created. So we can enable collations in vinyl.

This patch partially reverts the aforementioned commit (it can't
do full revert, because that commit also fixed some tests along
the way).

Closes #2822

610ae25a

key_def: do not lookup collation when decoding parts · 7a0f2898

Vladimir Davydov authored 7 years ago

We can't use key_def_decode_parts() when recovering vylog if key_def has
a collation, because vylog is recovered before the snapshot, i.e. when
collation objects haven't been created yet, while key_def_decode_parts()
tries to look up the collation by id. As a result, we can't enable
collations for vinyl indexes.

To fix this, let's rework the decoding procedure so that it works with
struct key_part_def instead of key_part.  The only difference between
the two structures is that the former references the collation by id
while the latter by pointer.
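
The difference between the two structures can be sketched as follows. All names here are hypothetical simplifications, not the real key_def declarations, and the collation id used in the usage note is an assumption. The point is that the "def" form can be decoded before any collation object exists, and the pointer is resolved later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical collation object, created only after the snapshot is read. */
struct sketch_coll {
	uint32_t id;
	const char *name;
};

/* Decoded form: refers to the collation by id, needs no lookup. */
struct sketch_key_part_def {
	uint32_t fieldno;
	uint32_t coll_id;
};

/* Runtime form: refers to the collation by pointer. */
struct sketch_key_part {
	uint32_t fieldno;
	const struct sketch_coll *coll;
};

/* Linear lookup in a registry that may still be empty during recovery. */
const struct sketch_coll *
sketch_coll_by_id(const struct sketch_coll *registry, size_t count,
		  uint32_t id)
{
	for (size_t i = 0; i < count; i++) {
		if (registry[i].id == id)
			return &registry[i];
	}
	return NULL;
}

/* Resolve a def into a runtime part. Returns -1 if the id is unknown. */
int
sketch_key_part_resolve(struct sketch_key_part *part,
			const struct sketch_key_part_def *def,
			const struct sketch_coll *registry, size_t count)
{
	assert(part != NULL && def != NULL);
	const struct sketch_coll *coll =
		sketch_coll_by_id(registry, count, def->coll_id);
	if (coll == NULL)
		return -1;
	part->fieldno = def->fieldno;
	part->coll = coll;
	return 0;
}
```

In this sketch a `sketch_key_part_def` with `coll_id = 1` can be filled in while the registry is empty. Resolution fails at that point and succeeds once a collation with that id has been registered.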

Needed for #2822

7a0f2898

replication: stop applier writer fiber before reconnect · 0a0731f4

Georgy Kirichenko authored 7 years ago

The writer fiber should be stopped before reconnect to avoid
sending unwanted IPROTO_OK replication acknowledgements.

Fixes #2726

0a0731f4