- Nov 16, 2017
-
-
Ilya authored
* Remove check on checkpoint signature and always touch snapshot even if there were no transactions since the previous checkpoint
* Fix timeout check

Fixup #2780
-
Roman Tsisyk authored
-
Roman Tsisyk authored
-
Vladimir Davydov authored
It must be pinned by the caller (vy_point_iterator, vy_read_iterator).
-
Vladimir Davydov authored
- Instead of returning the mysterious -2 error code, restart the iterator right in vy_read_iterator_next_key().
- Pin slices while fetching data from disk to avoid checking range version after each disk read.
-
Vladimir Davydov authored
It is not necessary to reopen all sources when the iterator transgresses the current range's boundaries. It's enough to reopen only disk sources, because txw, cache, and mem do not belong to ranges.
-
Vladimir Davydov authored
When the read iterator stops reading a chain of statements from the cache, it advances all other sources by calling next_key() until last_stmt is reached. This effectively cancels the benefit of using the cache, because all statements skipped due to the cache are fetched from in-memory trees or, even worse, on-disk runs.

To fix this, let's introduce and use a skip() method, which makes the source iterator jump to the first statement following a particular key. Its implementation is similar to and reuses the code from the start and restore procedures.

With this new method, we don't need to mangle iterator_type/key when reopening source iterators during restoration so that they start iteration from last_stmt: instead we can advance them with skip() on the first iteration. Let's do this too, because the iterator can benefit from knowing the real iterator type (e.g. cache can stop ITER_EQ iteration even if there's no chain in the cache, by looking at vy_cache_entry::left_boundary_level,right_boundary_level).
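The difference between stepping with next_key() and jumping with skip() can be sketched as follows. This is a toy model in Python, not the vinyl code: it assumes a source is a sorted list of integer keys iterated in ascending order, and `SortedSource`, `advance_with_next_key`, and the `steps` counter are hypothetical names introduced only for illustration.

```python
import bisect


class SortedSource:
    """Toy stand-in for a read-iterator source: a sorted list of keys."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.pos = 0      # index of the next key to return
        self.steps = 0    # number of operations performed, for comparison

    def next_key(self):
        """Advance by one key; return it, or None at the end."""
        if self.pos >= len(self.keys):
            return None
        key = self.keys[self.pos]
        self.pos += 1
        self.steps += 1
        return key

    def skip(self, last_key):
        """Jump to the first key strictly greater than last_key."""
        # Binary search instead of visiting every key in between.
        self.pos = max(self.pos, bisect.bisect_right(self.keys, last_key))
        self.steps += 1
        return self.next_key()


def advance_with_next_key(source, last_key):
    """Step key by key until last_key has been passed."""
    key = source.next_key()
    while key is not None and key <= last_key:
        key = source.next_key()
    return key
```

Both approaches land on the same key, but the step-by-step version touches every key the cache already covered, while skip() does a single positioned lookup.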
-
Vladimir Davydov authored
There is no such thing as vy_stmt_iterator anymore, so split the header into vy_stmt_stream.h and vy_read_view.h.
-
Vladimir Davydov authored
vy_read_iterator was the only user of this interface. As now it handles sources of different types differently, the interface is not needed any more.
-
Vladimir Davydov authored
The generic approach trying to build the merge procedure around the vy_stmt_iterator interface didn't pan out, because sources are way too different: in contrast to other sources, the cache stores intervals; run iterators may yield; txw does not preserve statement history. Let's rewrite vy_read_iterator_next_{key,lsn} in such a way that they do not use this generic interface. This results in quite a bit of code being duplicated, because loops over sources are unrolled, but this is intentional - hopefully it makes the code easier to follow. The patch isn't supposed to change the merge algorithm or remove any optimization implemented in it.
-
Vladimir Davydov authored
This reverts commit be8ee29a. Taking a reference to the search key in source iterators is pointless - it can't go away while we are using them. The only part of this patch that makes sense is removing the const specifier from vy_point_iterator->key.
-
Vladimir Davydov authored
Closes #2558
-
Ivan Kosenko authored
-
Georgy Kirichenko authored
Start applier->writer fiber only after SUBSCRIBE. Otherwise the writer will send ACK during FINAL JOIN and break the replication protocol. Fixes #2726
-
- Nov 15, 2017
-
-
Vladimir Davydov authored
Make sure the master receives an ack from the replica and performs garbage collection before checking the checkpoint count.
-
Vladimir Davydov authored
We remove old xlog files as soon as we have sent them to all replicas. However, the fact that we have successfully sent something to a replica doesn't necessarily mean the replica has received it. If a replica fails to apply a row (for instance, it is out of memory), replication will stop, but the data files have already been deleted on the master, so when the replica is back online, the master won't find the appropriate xlog to feed to the replica and replication will stop again. The user-visible effect is the following error message in the log and in the replica status:

  Missing .xlog file between LSN 306 {1: 306} and 311 {1: 311}

There is no way to recover from this but to re-bootstrap the replica from scratch.

The issue was introduced by commit ba09475f ("replica: advance gc state only when xlog is closed"), which aimed to make the status update procedure as lightweight and fast as possible and so moved gc_consumer_advance() from tx_status_update() to a special gc message. A gc message is created and sent to TX as soon as an xlog is relayed. Let's rework this so that gc messages are appended to a special queue first and scheduled only when the relay receives the receipt confirmation from the replica.

Closes #2825
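The queueing scheme described here can be sketched roughly as below. This is a simplified Python model, not the relay code: it assumes vclock signatures can be treated as plain integers, and `RelayGcQueue`, `on_xlog_relayed`, and `on_replica_ack` are hypothetical names.

```python
from collections import deque


class RelayGcQueue:
    """Hold gc requests for relayed xlogs until the replica confirms them."""

    def __init__(self):
        self.pending = deque()   # signatures of relayed xlogs, oldest first
        self.collected = []      # signatures already handed over to gc

    def on_xlog_relayed(self, signature):
        # Sending is not proof of receipt, so only remember the request.
        self.pending.append(signature)

    def on_replica_ack(self, acked_signature):
        # Schedule gc only for xlogs fully covered by the acknowledgement.
        while self.pending and self.pending[0] <= acked_signature:
            self.collected.append(self.pending.popleft())
```

With this shape, an xlog that was sent but never acknowledged stays in `pending`, so it remains available when the replica reconnects.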
-
Vladimir Davydov authored
If reading a single statement from vinyl takes longer than the value of box.cfg.too_long_threshold, the request will be logged:

  512/1: select([1], EQ) => REPLACE([100001, 1], lsn=200006) took too long: 0.626 sec

This is useful for debugging.

While we are at it, let's also remove 'timeout' from the vinyl engine constructor arguments and set it with box_set_vinyl_timeout() on box initialization instead, similarly to vinyl_max_tuple_size.

Closes #2871
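The threshold check amounts to timing the read and logging only when it runs long. A minimal Python sketch, under the assumption that the log is a plain list and the clock is injectable; `timed_read` and its parameters are hypothetical and the message format is only loosely modelled on the line above:

```python
import time


def timed_read(read_fn, too_long_threshold, log, clock=time.monotonic):
    """Run read_fn() and append a log line if it exceeded the threshold."""
    start = clock()
    result = read_fn()
    elapsed = clock() - start
    if elapsed > too_long_threshold:
        log.append("read took too long: %.3f sec" % elapsed)
    return result
```

Passing the clock as a parameter keeps the check deterministic under test; in normal use the default monotonic clock is enough.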
-
Vladimir Davydov authored
Use generic tuple_snprint() and tuple_str() instead.
-
Vladimir Davydov authored
The vinyl_iterator struct was introduced as a C++ wrapper around vy_cursor. Since there's no C++ code left in the engine, and both structures are defined in the same file, we can merge them now.
-
Vladimir Davydov authored
The engine infrastructure was initially implemented in C++ so we needed the wrappers to provide C++ API to Vinyl. Now everything is in C so we don't need them any more. Let's fold them in vinyl.c. Note, this patch does not touch vinyl_engine, vinyl_index, and vinyl_iterator structures, they are still there, it just gets rid of the intermediate layer of wrapper functions, which is not needed any more.
-
Vladimir Davydov authored
Accessing configuration from inside an engine implementation violates encapsulation.
-
Vladimir Davydov authored
Engine callbacks that perform garbage collection may sleep, because they use coio for removing files to avoid blocking the TX thread. If garbage collection is called concurrently from different fibers (e.g. from relay fibers), we may attempt to delete the same file multiple times. What is worse, xdir_collect_garbage(), used by engine callbacks to remove files, isn't safe against concurrent execution - it first unlinks a file via coio, which involves a yield, and only then removes the corresponding vclock from the directory index. This opens a race window for another fiber to read the same clock and yield; in the interim the vclock can be freed by the first fiber:

  #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1 0x00007f105ceda3fa in __GI_abort () at abort.c:89
  #2 0x000055e4c03f4a3d in sig_fatal_cb (signo=11) at main.cc:184
  #3 <signal handler called>
  #4 0x000055e4c066907a in vclockset_remove (rbtree=0x55e4c1010e58, node=0x55e4c1023d20) at box/vclock.c:215
  #5 0x000055e4c06256af in xdir_collect_garbage (dir=0x55e4c1010e28, signature=342, use_coio=true) at box/xlog.c:620
  #6 0x000055e4c0417dcc in memtx_engine_collect_garbage (engine=0x55e4c1010df0, lsn=342) at box/memtx_engine.c:784
  #7 0x000055e4c0414dbf in engine_collect_garbage (lsn=342) at box/engine.c:155
  #8 0x000055e4c04a36c7 in gc_run () at box/gc.c:192
  #9 0x000055e4c04a38f2 in gc_consumer_advance (consumer=0x55e4c1021360, signature=342) at box/gc.c:262
  #10 0x000055e4c04b4da8 in tx_gc_advance (msg=0x7f1028000aa0) at box/relay.cc:250
  #11 0x000055e4c04eb854 in cmsg_deliver (msg=0x7f1028000aa0) at cbus.c:353
  #12 0x000055e4c04ec871 in fiber_pool_f (ap=0x7f1056800ec0) at fiber_pool.c:64
  #13 0x000055e4c03f4784 in fiber_cxx_invoke(fiber_func, typedef __va_list_tag __va_list_tag *) (f=0x55e4c04ec6d4 <fiber_pool_f>, ap=0x7f1056800ec0) at fiber.h:665
  #14 0x000055e4c04e6816 in fiber_loop (data=0x0) at fiber.c:631
  #15 0x000055e4c0687dab in coro_init () at /home/vlad/src/tarantool/third_party/coro/coro.c:110

Fix this by serializing concurrent execution of garbage collection callbacks with a latch.
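The effect of the latch can be illustrated with cooperative tasks. This is a Python sketch, not the engine code: asyncio tasks stand in for fibers, `asyncio.Lock` for the latch, `asyncio.sleep(0)` for the yield inside the coio unlink, and the file list and function names are hypothetical.

```python
import asyncio


async def collect_garbage(files, removed, latch):
    """Remove every file in `files`; the latch serializes whole runs."""
    async with latch:
        for name in list(files):
            await asyncio.sleep(0)   # stands in for the yield during unlink
            files.remove(name)       # index is updated only after the yield
            removed.append(name)


async def run_two_collectors():
    files = ["1.xlog", "2.xlog", "3.xlog"]
    removed = []
    latch = asyncio.Lock()
    # Two concurrent callers, as with two relay fibers advancing gc.
    await asyncio.gather(
        collect_garbage(files, removed, latch),
        collect_garbage(files, removed, latch),
    )
    return files, removed
```

Without the lock, the second task would snapshot the same file list during the first task's yield and then fail on `files.remove()`, which is the analogue of touching an already freed vclock.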
-
Vladimir Davydov authored
Currently, box.schema.upgrade() is called automatically after box.cfg() if the upgrade is considered safe (currently, only upgrade to 1.7.5 is "safe"). However, no upgrade is safe in case replication is configured, because it can easily result in replication conflicts. Let's disable auto upgrade if the 'replication' configuration option is set. Closes #2886
-
Roman Tsisyk authored
-
- Nov 13, 2017
-
-
Vladimir Davydov authored
Before commit 29d00dca ("alter: forbid to drop space with truncate record"), a space record was removed before the corresponding record in the _truncate system space. So when recovering data generated by tarantool < 1.7.6, we should disable the check that the space being dropped doesn't have a record in _truncate. Closes #2909
-
Konstantin Osipov authored
Remove iobuf_is_idle(). It uses obuf.wpos, which itself needs to be removed from obuf.
-
- Nov 10, 2017
-
-
Konstantin Osipov authored
Spare box from iproto I/O to simplify transition of control of output buffer from iproto to tx thread. In scope of gh-946.
-
- Nov 06, 2017
-
-
Roman Tsisyk authored
-
Roman Tsisyk authored
Bloom filter depends on hash function, which depends on ICU version, which may vary.
-
Roman Tsisyk authored
-
Roman Tsisyk authored
-
Roman Tsisyk authored
+ Don't use id=0 for collations

Follow up #2649
-
Vladimir Davydov authored
Fix tuple_hash_field() to handle the following cases properly:
- Nullable string field (crash in vinyl on dump).
- Scalar field with collation enabled (crash in memtx hash index).

Add corresponding test cases.
-
Vladimir Davydov authored
First, unique but nullable indexes are not rebuilt when the primary key is altered although they should be, because they can contain multiple NULLs. Second, when rebuilding such indexes we use a wrong key def (index_def->key_def instead of cmp_def), which results in lost stable order after recovery. Fix both these issues and add a test case.
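Why the choice of key definition affects stable order can be shown with a small sketch. It rests on an assumption not stated above: that cmp_def is the index key parts extended with the primary key parts, whereas key_def holds the index key parts alone. The tuples and the `by_key_def` / `by_cmp_def` functions are hypothetical.

```python
def by_key_def(row):
    """Order by the nullable secondary field only (field 1), NULLs first."""
    value = row[1]
    return (value is not None, value if value is not None else 0)


def by_cmp_def(row):
    """Secondary field extended with the primary key (field 0)."""
    return by_key_def(row) + (row[0],)


# The same rows arriving in two different orders, e.g. before and
# after recovery. Both have two NULLs in the secondary field.
rows_a = [(1, None), (2, None), (3, 7)]
rows_b = [(2, None), (1, None), (3, 7)]

unstable = sorted(rows_a, key=by_key_def) != sorted(rows_b, key=by_key_def)
stable = sorted(rows_a, key=by_cmp_def) == sorted(rows_b, key=by_cmp_def)
```

With the secondary field alone, rows sharing a NULL key keep whatever order they arrived in; adding the primary key makes the order total and therefore reproducible.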
-
Vladimir Davydov authored
Needed to check if the key definition loaded from vylog to send initial data to a replica has the collation properly recovered.
-
Vladimir Davydov authored
It isn't stored currently, but this doesn't break anything, because the primary key, which is the only key whose definition is used after having been loaded from vylog, can't be nullable. Let's store it there just in case. Update the vinyl/layout test to check that.
-
Vladimir Davydov authored
Collations were disabled in vinyl by commit 2097908f ("Fix collation test on some platforms and disable collation in vinyl"), because a key_def referencing a collation could not be loaded from vylog on recovery (collation objects are created after vylog is recovered). Now, it isn't a problem anymore, because the decoding procedure, key_def_decode_parts(), deals with struct key_part_def, which references a collation by id and hence doesn't need a collation object to be created. So we can enable collations in vinyl. This patch partially reverts the aforementioned commit (it can't do a full revert, because that commit also fixed some tests along the way). Closes #2822
-
Vladimir Davydov authored
We can't use key_def_decode_parts() when recovering vylog if key_def has a collation, because vylog is recovered before the snapshot, i.e. when collation objects haven't been created yet, while key_def_decode_parts() tries to look up the collation by id. As a result, we can't enable collations for vinyl indexes. To fix this, let's rework the decoding procedure so that it works with struct key_part_def instead of key_part. The only difference between the two structures is that the former references the collation by id while the latter by pointer. Needed for #2822
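The by-id versus by-pointer distinction can be sketched like this. It is a Python illustration only: `KeyPartDef`, `decode_parts`, `resolve`, the `COLLATIONS` registry, and the dictionary layout of the raw parts are all hypothetical and do not mirror the actual C structures or the vylog encoding.

```python
from dataclasses import dataclass
from typing import Optional

# id -> collation object; in this model it is filled only after the
# snapshot has been recovered, i.e. later than vylog decoding.
COLLATIONS = {}


@dataclass
class KeyPartDef:
    """Decoded key part that refers to its collation by id only."""
    fieldno: int
    type: str
    coll_id: Optional[int] = None


def decode_parts(raw_parts):
    """Decode without touching COLLATIONS, so it works before recovery."""
    return [
        KeyPartDef(p["field"], p["type"], p.get("collation"))
        for p in raw_parts
    ]


def resolve(part_def):
    """Look the collation up by id once collation objects exist."""
    coll = None
    if part_def.coll_id is not None:
        coll = COLLATIONS[part_def.coll_id]
    return (part_def.fieldno, part_def.type, coll)
```

Decoding succeeds while the registry is still empty; only `resolve()` needs the collation object, and it can run after the snapshot is loaded.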
-
Georgy Kirichenko authored
The writer fiber should be stopped before reconnecting to avoid sending unwanted IPROTO_OK replication acknowledgements. Fixes #2726