- Nov 17, 2017
-
-
Vladimir Davydov authored
The write iterator never discards DELETE statements referenced by a read view unless it is a major compaction. However, a DELETE is useless if it is preceded by another DELETE for the same key. Let's skip such tautological DELETEs. It is not only a useful optimization on its own - it will also help us annihilate INSERT+DELETE pairs on compaction. Needed for #2875
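
The rule described above can be illustrated with a small sketch. This is not the vinyl write iterator: the `Stmt` tuple and `skip_tautological_deletes()` are hypothetical names, and the input is assumed to be sorted by key and then by LSN, oldest first. Read views and major compaction are ignored.

```python
from typing import List, NamedTuple, Optional


class Stmt(NamedTuple):
    """Hypothetical stand-in for a vinyl statement."""
    key: int
    lsn: int
    type: str  # "REPLACE", "INSERT" or "DELETE"


def skip_tautological_deletes(stmts: List[Stmt]) -> List[Stmt]:
    """Drop a DELETE whose immediate predecessor for the same key is a DELETE.

    Assumes `stmts` is sorted by (key, lsn) ascending, i.e. oldest first.
    """
    out: List[Stmt] = []
    prev: Optional[Stmt] = None
    for stmt in stmts:
        if (stmt.type == "DELETE" and prev is not None
                and prev.key == stmt.key and prev.type == "DELETE"):
            # The key is already deleted at this point; the newer DELETE
            # adds no information, so it is skipped.
            continue
        out.append(stmt)
        prev = stmt
    return out


history = [
    Stmt(1, 1, "REPLACE"),
    Stmt(1, 2, "DELETE"),
    Stmt(1, 3, "DELETE"),   # tautological: key 1 is already deleted
    Stmt(1, 4, "REPLACE"),
    Stmt(2, 5, "DELETE"),
    Stmt(2, 6, "DELETE"),   # tautological: key 2 is already deleted
]
compacted = skip_tautological_deletes(history)
```

The first DELETE of each run is kept, since it may still be needed to shadow older data on disk.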
-
- Nov 16, 2017
-
-
Konstantin Osipov authored
This file used to be smaller so not having extra forward declarations seemed to be reasonable. Now it's time to split the code of iproto_msg and iproto_connection. No semantic changes.
-
ivankosenko authored
-
Konstantin Osipov authored
-
Konstantin Osipov authored
A small refactoring in scope of gh-946.
-
Konstantin Osipov authored
-
Vladimir Davydov authored
If the index was dropped during dump, we abort the dump task so as not to log information about zombie indexes to the vylog. However, we should still notify the scheduler that its memory can be released by calling vy_scheduler_complete_dump(). If we don't, lsregion won't be truncated until another index is dumped. If there are no other vinyl indexes out there or all other indexes have already been dumped, memory will never be freed. On debug builds, it will result in the following assertion failure:

    vy_scheduler.c:1319: vy_scheduler_peek_dump: Assertion `scheduler->dump_task_count > 0' failed.
-
Ilya authored
* Add golang-like approach to handle errors
* Refactor some issues after review #2751

Closes #2757
-
Ilya authored
* Add listdir function based on dirent:readdir
* Add mktree, rmtree, copyfile, copytree to fio module

Closes #2751
-
Ilya authored
* Remove check on checkpoint signature and always touch snapshot even if there were no transactions since the previous checkpoint
* Fix timeout check

Fixup #2780
-
Roman Tsisyk authored
-
Roman Tsisyk authored
-
Vladimir Davydov authored
It must be pinned by the caller (vy_point_iterator, vy_read_iterator).
-
Vladimir Davydov authored
- Instead of returning the mysterious -2 error code, restart the iterator right in vy_read_iterator_next_key().
- Pin slices while fetching data from disk to avoid checking range version after each disk read.
-
Vladimir Davydov authored
It is not necessary to reopen all sources when the iterator transgresses the current range's boundaries. It's enough to reopen only disk sources, because txw, cache, and mem do not belong to ranges.
-
Vladimir Davydov authored
When the read iterator stops reading a chain of statements from the cache, it advances all other sources by calling next_key() until the last_stmt is reached. This effectively cancels the benefit of using the cache, because all statements skipped due to the cache are fetched from in-memory trees or, even worse, on-disk runs.

To fix this, let's introduce and use a skip() method, which makes the source iterator jump to the first statement following a particular key. Its implementation is similar to and reuses the code from the start and restore procedures.

With this new method, we don't need to mangle iterator_type/key when reopening source iterators during restoration so that they start iteration from last_stmt: instead we can advance them with skip() on the first iteration. Let's do this too, because the iterator can benefit from knowing the real iterator type (e.g. cache can stop ITER_EQ iteration even if there's no chain in the cache, by looking at vy_cache_entry::left_boundary_level,right_boundary_level).
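
To make the difference between stepping with next_key() and jumping with skip() concrete, here is a minimal sketch. It assumes a forward-only source over a sorted in-memory list; `SourceIterator` and its `steps` counter are hypothetical and only mimic the idea, not the actual vinyl source iterators.

```python
import bisect


class SourceIterator:
    """Hypothetical forward-only source over sorted keys."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.pos = 0
        self.steps = 0  # number of iterator calls, for comparison only

    def next_key(self):
        """Return the key at the current position and advance by one."""
        self.steps += 1
        if self.pos >= len(self.keys):
            return None
        key = self.keys[self.pos]
        self.pos += 1
        return key

    def skip(self, last_key):
        """Position the iterator on the first key greater than last_key."""
        self.steps += 1
        self.pos = bisect.bisect_right(self.keys, last_key, lo=self.pos)


# Stepping: every key up to last_key is visited one by one.
stepping = SourceIterator(range(100))
key = stepping.next_key()
while key is not None and key <= 49:
    key = stepping.next_key()

# Skipping: one jump, then a single read.
skipping = SourceIterator(range(100))
skipping.skip(49)
first_after = skipping.next_key()
```

Both iterators end up on key 50, but the stepping one touches every intermediate key, which is the cost the commit message attributes to in-memory trees and on-disk runs.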
-
Vladimir Davydov authored
There is no such thing as vy_stmt_iterator anymore, so split the header into vy_stmt_stream.h and vy_read_view.h.
-
Vladimir Davydov authored
vy_read_iterator was the only user of this interface. As now it handles sources of different types differently, the interface is not needed any more.
-
Vladimir Davydov authored
The generic approach trying to build the merge procedure around the vy_stmt_iterator interface didn't pan out, because sources are way too different: in contrast to other sources, the cache stores intervals; run iterators may yield; txw does not preserve statement history. Let's rewrite vy_read_iterator_next_{key,lsn} in such a way that they do not use this generic interface. This results in quite a bit of code being duplicated, because loops over sources are unrolled, but this is intentional - hopefully it makes the code easier to follow. The patch isn't supposed to change the merge algorithm or remove any optimization implemented in it.
-
Vladimir Davydov authored
This reverts commit be8ee29a. Taking a reference to the search key in source iterators is pointless - it can't go away while we are using them. The only part of this patch that makes sense is removing the const specifier from vy_point_iterator->key.
-
Vladimir Davydov authored
Closes #2558
-
Ivan Kosenko authored
-
Georgy Kirichenko authored
Start the applier->writer fiber only after SUBSCRIBE. Otherwise, the writer will send an ACK during FINAL JOIN and break the replication protocol. Fixes #2726
-
- Nov 15, 2017
-
-
Vladimir Davydov authored
Make sure the master receives an ack from the replica and performs garbage collection before checking the checkpoint count.
-
Vladimir Davydov authored
We remove old xlog files as soon as we have sent them to all replicas. However, the fact that we have successfully sent something to a replica doesn't necessarily mean the replica will have received it. If a replica fails to apply a row (for instance, it is out of memory), replication will stop, but the data files have already been deleted on the master, so when the replica is back online, the master won't find an appropriate xlog to feed to the replica and replication will stop again. The user-visible effect is the following error message in the log and in the replica status:

    Missing .xlog file between LSN 306 {1: 306} and 311 {1: 311}

There is no way to recover from this but to re-bootstrap the replica from scratch.

The issue was introduced by commit ba09475f ("replica: advance gc state only when xlog is closed"), which aimed at making the status update procedure as lightweight and fast as possible and so moved gc_consumer_advance() from tx_status_update() to a special gc message. A gc message is created and sent to TX as soon as an xlog is relayed. Let's rework this so that gc messages are appended to a special queue first and scheduled only when the relay receives the receipt confirmation from the replica.

Closes #2825
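
The queueing scheme described above might look roughly like the following sketch. The `Relay` class, its method names, and the use of a single integer signature in place of a full vclock are all simplifying assumptions, not the actual relay code.

```python
from collections import deque


class Relay:
    """Hypothetical relay that defers gc until the replica confirms receipt."""

    def __init__(self, advance_gc):
        self.pending = deque()        # signatures of xlogs already relayed
        self.advance_gc = advance_gc  # callback standing in for the gc message

    def on_xlog_relayed(self, signature):
        # Do not advance gc yet: sending is not the same as receiving.
        self.pending.append(signature)

    def on_ack(self, acked_signature):
        # Schedule gc only for xlogs the replica has confirmed.
        while self.pending and self.pending[0] <= acked_signature:
            self.advance_gc(self.pending.popleft())


advanced = []
relay = Relay(advanced.append)
relay.on_xlog_relayed(306)
relay.on_xlog_relayed(311)
after_send = list(advanced)   # nothing is collected before an ack arrives
relay.on_ack(306)
after_first_ack = list(advanced)
relay.on_ack(400)
after_second_ack = list(advanced)
```

With this ordering, an xlog that was sent but never acknowledged stays on the master, so a replica that stopped on an error can resume from it.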
-
Vladimir Davydov authored
If read of a single statement from vinyl takes more than the value of box.cfg.too_long_threshold, the request will be logged:

    512/1: select([1], EQ) => REPLACE([100001, 1], lsn=200006) took too long: 0.626 sec

This is useful for debugging.

While we are at it, let's also remove 'timeout' from the vinyl engine constructor arguments and set it with box_set_vinyl_timeout() on box initialization instead, similarly to vinyl_max_tuple_size.

Closes #2871
-
Vladimir Davydov authored
Use generic tuple_snprint() and tuple_str() instead.
-
Vladimir Davydov authored
The vinyl_iterator struct was introduced as a C++ wrapper around vy_cursor. Since there's no C++ code left in the engine, and both structures are defined in the same file, we can merge them now.
-
Vladimir Davydov authored
The engine infrastructure was initially implemented in C++ so we needed the wrappers to provide C++ API to Vinyl. Now everything is in C so we don't need them any more. Let's fold them in vinyl.c. Note, this patch does not touch vinyl_engine, vinyl_index, and vinyl_iterator structures, they are still there, it just gets rid of the intermediate layer of wrapper functions, which is not needed any more.
-
Vladimir Davydov authored
Accessing configuration from inside an engine implementation violates encapsulation.
-
Vladimir Davydov authored
Engine callbacks that perform garbage collection may sleep, because they use coio for removing files to avoid blocking the TX thread. If garbage collection is called concurrently from different fibers (e.g. from relay fibers), we may attempt to delete the same file multiple times. What is worse, xdir_collect_garbage(), used by engine callbacks to remove files, isn't safe against concurrent execution - it first unlinks a file via coio, which involves a yield, and only then removes the corresponding vclock from the directory index. This opens a race window for another fiber to read the same vclock and yield; in the interim, the vclock can be freed by the first fiber:

    #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
    #1 0x00007f105ceda3fa in __GI_abort () at abort.c:89
    #2 0x000055e4c03f4a3d in sig_fatal_cb (signo=11) at main.cc:184
    #3 <signal handler called>
    #4 0x000055e4c066907a in vclockset_remove (rbtree=0x55e4c1010e58, node=0x55e4c1023d20) at box/vclock.c:215
    #5 0x000055e4c06256af in xdir_collect_garbage (dir=0x55e4c1010e28, signature=342, use_coio=true) at box/xlog.c:620
    #6 0x000055e4c0417dcc in memtx_engine_collect_garbage (engine=0x55e4c1010df0, lsn=342) at box/memtx_engine.c:784
    #7 0x000055e4c0414dbf in engine_collect_garbage (lsn=342) at box/engine.c:155
    #8 0x000055e4c04a36c7 in gc_run () at box/gc.c:192
    #9 0x000055e4c04a38f2 in gc_consumer_advance (consumer=0x55e4c1021360, signature=342) at box/gc.c:262
    #10 0x000055e4c04b4da8 in tx_gc_advance (msg=0x7f1028000aa0) at box/relay.cc:250
    #11 0x000055e4c04eb854 in cmsg_deliver (msg=0x7f1028000aa0) at cbus.c:353
    #12 0x000055e4c04ec871 in fiber_pool_f (ap=0x7f1056800ec0) at fiber_pool.c:64
    #13 0x000055e4c03f4784 in fiber_cxx_invoke(fiber_func, typedef __va_list_tag __va_list_tag *) (f=0x55e4c04ec6d4 <fiber_pool_f>, ap=0x7f1056800ec0) at fiber.h:665
    #14 0x000055e4c04e6816 in fiber_loop (data=0x0) at fiber.c:631
    #15 0x000055e4c0687dab in coro_init () at /home/vlad/src/tarantool/third_party/coro/coro.c:110

Fix this by serializing concurrent execution of garbage collection callbacks with a latch.
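
The effect of the latch can be sketched with cooperative coroutines, which yield in a way loosely comparable to fibers. Everything here is an assumption made for illustration: `XDir` is a hypothetical stand-in for the directory index, `asyncio.Lock` plays the role of the latch, and `asyncio.sleep(0)` stands in for the yielding unlink via coio.

```python
import asyncio


class XDir:
    """Hypothetical directory index holding file signatures."""

    def __init__(self, signatures):
        self.index = sorted(signatures)
        self.unlinked = []           # record of "unlink" calls
        self.latch = asyncio.Lock()  # serializes garbage collection

    async def collect_garbage(self, signature):
        # Only one caller at a time may walk and modify the index.
        async with self.latch:
            while self.index and self.index[0] < signature:
                victim = self.index[0]
                await asyncio.sleep(0)  # yield, as the real unlink would
                self.unlinked.append(victim)
                self.index.pop(0)       # index entry removed after the yield


async def demo():
    xdir = XDir([100, 200, 300, 400])
    # Two concurrent callers, as with two relay fibers.
    await asyncio.gather(xdir.collect_garbage(342),
                         xdir.collect_garbage(342))
    return xdir


xdir = asyncio.run(demo())
```

Because the second caller waits on the latch until the first one has finished, it never observes an index entry whose file is in the middle of being removed, and each file is unlinked once.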
-
Vladimir Davydov authored
Currently, box.schema.upgrade() is called automatically after box.cfg() if the upgrade is considered safe (currently, only upgrade to 1.7.5 is "safe"). However, no upgrade is safe in case replication is configured, because it can easily result in replication conflicts. Let's disable auto upgrade if the 'replication' configuration option is set. Closes #2886
-
Roman Tsisyk authored
-
- Nov 13, 2017
-
-
Vladimir Davydov authored
Before commit 29d00dca ("alter: forbid to drop space with truncate record"), a space record was removed before the corresponding record in the _truncate system space, so we should disable the check that the space being dropped doesn't have a record in _truncate when recovering data generated by tarantool < 1.7.6. Closes #2909
-
Konstantin Osipov authored
Remove iobuf_is_idle(). It uses obuf.wpos, which itself needs to be removed from obuf.
-
- Nov 10, 2017
-
-
Konstantin Osipov authored
Spare box from iproto I/O to simplify transition of control of output buffer from iproto to tx thread. In scope of gh-946.
-
- Nov 06, 2017
-
-
Roman Tsisyk authored
-
Roman Tsisyk authored
Bloom filter depends on hash function, which depends on ICU version, which may vary.
-
Roman Tsisyk authored