- Sep 26, 2017
-
-
Konstantin Osipov authored
No need to allocate a tuple buffer for every sequence value.
-
Vladimir Davydov authored
Currently, _sequence_data is written to the snapshot the same way as any other memtx space - the snapshot iterator simply walks over the space tuples and dumps them one-by-one. This is OK now, because _sequence_data is always up-to-date. However, after the next patch, its contents may become stale - due to technical reasons we won't update _sequence_data if a sequence is used for auto increment in a space. To make sure we still snapshot sequence values properly, let's introduce a special iterator for this space that walks over the sequence cache instead of the space contents. Needed for #389
-
Vladimir Davydov authored
Currently, Index::createSnapshotIterator() returns a pointer to struct iterator, so the iterator implementation has to deal with struct tuple, which might not always be an option. Since the iterator is only used for writing a snapshot, struct tuple isn't really necessary: raw msgpack would be enough. So let's introduce snapshot_iterator, which returns raw msgpack instead of struct tuple. It is required by the next patch to make a snapshot of the _sequence_data space. Needed for #389
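As a rough illustration of the idea, a msgpack-returning snapshot iterator can be reduced to a tiny virtual interface like the sketch below; the field and callback names are assumptions made for this example, not the exact definitions from the patch.

    #include <stdint.h>

    /*
     * Sketch of a snapshot iterator that yields raw msgpack instead
     * of struct tuple. Names and signatures are illustrative only.
     */
    struct snapshot_iterator {
        /*
         * Advance the iterator and return the next record as raw
         * msgpack in *data/*size, or set *data = NULL at the end.
         * Returns 0 on success, -1 on error.
         */
        int (*next)(struct snapshot_iterator *it,
                    const char **data, uint32_t *size);
        /* Release all resources held by the iterator. */
        void (*free)(struct snapshot_iterator *it);
    };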
-
Vladimir Davydov authored
-
Vladimir Davydov authored
Currently, the read iterator adds only final statements (i.e. statements returned to the user) to the conflict manager while intermediate DELETEs are silently ignored. This can result in the following select() inconsistency. Suppose there are REPLACE{10}, DELETE{20} and REPLACE{30} (it does not matter which source owns which statement in this example) and select({}) is called. The read iterator returns REPLACE{10} and is called again. It scans the sources, finds DELETE{20} and silently skips to the next statement. If its fiber yields now, and REPLACE{15} is inserted, the read iterator will ignore it, because its curr_stmt is {20}, and so it will return {30}, i.e. the select() will return [{10}, {30}]. The same select() called from the same transaction afterwards will, however, return [{10}, {20}, {30}], which is inconsistent with the previous result. This happens because interval [{10}, {20}] was not added to the conflict manager before the yield, and so the transaction was not sent to read view. To avoid this, we must track not only final statements, but also intermediate DELETEs.

There's a nuance to this patch regarding the REQ case that needs to be noted explicitly. Since source iterators can't handle REQ, in contrast to EQ, we change vy_read_iterator->iterator_type from REQ to LE and set the need_check_eq flag when the iterator is opened. The need_check_eq flag makes the read iterator check if the resulting statement equals the search key before returning it to the user. This check must be performed before vy_tx_track() is called, otherwise we might add a larger interval to the conflict manager than necessary, resulting in unnecessary conflicts (the vinyl/tx_gap_lock test will surely catch that). So this patch moves the call to vy_tx_track() along with the need_check_eq check immediately after vy_read_iterator_next_key(), irrespective of the statement type. The problem is that vy_cache_add() still has to be called on the final statement:

* WAS:
    vy_cache_add(result)
    need_check_eq check; may set result to NULL
    vy_tx_track(result)
* NOW:
    need_check_eq check; may set result to NULL
    vy_tx_track(result)
    vy_cache_add(result)

Besides the result, vy_cache_add() is passed the iterator type, so it should handle NULL properly, and it sure does in case of EQ, but not in case of REQ, because REQ was changed to LE. That is, when vy_cache_add() is passed NULL (search end marker) in case of REQ, it thinks it saw the leftmost statement across all keys, not the leftmost statement equal to the search key. I decided that the best way to overcome this is to make the read iterator handle REQ separately, just like EQ, and only replace it with LE right before passing it down to a source iterator. This gives vy_cache_add() enough info to handle the REQ case properly and makes the code look more symmetrical IMO.
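A minimal, self-contained sketch of the new call order; every type and helper below is invented for illustration, and only the ordering itself (eq-check, then conflict tracking, then cache insertion) reflects the patch.

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    struct stmt_sketch { const char *key; };

    /* Stand-ins for vy_tx_track() and vy_cache_add(). */
    static void track_read(const struct stmt_sketch *stmt) { (void)stmt; }
    static void cache_add(const struct stmt_sketch *stmt) { (void)stmt; }

    static const struct stmt_sketch *
    finish_step(const struct stmt_sketch *stmt, const char *search_key,
                bool need_check_eq)
    {
        /* 1. EQ/REQ check first, so that the interval registered in
         *    the conflict manager is never wider than necessary. */
        if (need_check_eq && stmt != NULL &&
            strcmp(stmt->key, search_key) != 0)
            stmt = NULL;
        /* 2. Register the read in the conflict manager. */
        track_read(stmt);
        /* 3. Only then add the final statement to the tuple cache. */
        cache_add(stmt);
        return stmt;
    }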
-
Vladimir Davydov authored
Suppose a source iterator was positioned at the minimal statement across all sources before yielding to read a disk source. Then it has front_id equal to the front_id of the read iterator, which indicates that the source iterator needs to be advanced on the next iteration. However, if the statement was removed during the yield, the iterator gets repositioned to the next statement by the restoration procedure (see vy_read_iterator_next_key()). In this case it isn't positioned at the minimal statement any more and hence must not be advanced on the next iteration, but its front_id remains the same, which means it will be advanced, possibly skipping a statement. The situation described above can happen only in the case of the tuple cache, because other sources are append-only. Obviously, to avoid this kind of trouble, we must reset the front_id if a source iterator is restored to a statement that is greater than the minimal statement.
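A toy sketch of the fix under the stated assumptions; the struct layout and the helper name below are hypothetical, not the actual vinyl code.

    #include <stdint.h>
    #include <stdbool.h>

    struct read_iterator_sketch { uint32_t front_id; };
    struct read_source_sketch  { uint32_t front_id; };

    /*
     * Called after a source was restored following a yield.
     * 'moved_past_min' is true if restoration repositioned the source
     * to a statement greater than the minimal one seen so far.
     */
    static void
    source_restored(struct read_iterator_sketch *itr,
                    struct read_source_sketch *src, bool moved_past_min)
    {
        if (moved_past_min) {
            /*
             * The source no longer points at the minimal statement,
             * so it must not be advanced on the next iteration:
             * reset its front_id so it differs from the iterator's.
             */
            src->front_id = 0;
        } else {
            /* Still at the front: keep it scheduled for advancement. */
            src->front_id = itr->front_id;
        }
    }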
-
Vladimir Davydov authored
We drop a reference to itr->curr_stmt before restoring the iterator position so we must not drop it while iterating over the cache tree. This mistake results in funky crashes in various places, e.g. in vy_stmt_compare(). Shame on me: I missed that during self review. Fixes: 073c8d07 ("vinyl: rework cache iterator restore")
-
Vladimir Davydov authored
Initially, the merge sort implementation was supposed to be shared between the read and write iterators, hence we got vy_merge_iterator. However, over time such a design decision proved to be inadequate, because it turned out that the two iterators can barely share a few lines - so different are the tasks they solve. The write iterator has already been rewritten without the use of vy_merge_iterator, so we can now fold the latter into vy_read_iterator. While we are at it, spruce up some comments a bit.
-
Vladimir Davydov authored
So as not to confuse it with vy_merge_iterator->curr_stmt.
-
Vladimir Davydov authored
- search_started is not needed, (curr_stmt != NULL) can be used instead.
- unique_optimization and one_value can be checked in-place.
-
Vladimir Davydov authored
Range iterator is trivial and only used by the read iterator, so it doesn't deserve a layer of abstraction. Fold it in.
-
Vladimir Davydov authored
Historically, we have this 'belong_range/range_ended' optimization in the read iterator: we move to the next range iff all sources that are marked as 'belong_range' are done. It is supposed to spare us a comparison with a range boundary on each iteration. However, with the introduction of the tuple cache and the unified memory level, it became more of a pain in the ass than a valuable asset. Thanks to it, we have to read all runs that belong to the current range before we can jump to the next range, even if all statements spanned by the current range are in the cache. Also, when all runs in a range have been read, we immediately jump to the next range and read its runs, even if the user intends to stop the iterator after reading a memory statement that is within the current range. So we effectively trade disk accesses for simple in-memory comparison operations, which is silly. Let's get rid of this "optimization" and make iteration over ranges straightforward: after the merge iterator returns a statement, compare it to the current range boundary and jump to the next range if it is outside.
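The straightforward scheme boils down to one comparison after each returned statement. A minimal sketch, with integer keys standing in for real statements and all names assumed for illustration:

    #include <stddef.h>

    /* Hypothetical range with an exclusive upper boundary. */
    struct range_sketch {
        int end;
        struct range_sketch *next;
    };

    /*
     * After the merge iterator returns a statement with key 'key',
     * jump to the next range if the key falls outside the current one.
     */
    static struct range_sketch *
    advance_range(struct range_sketch *curr, int key)
    {
        while (curr != NULL && key >= curr->end)
            curr = curr->next;
        return curr;
    }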
-
Vladislav Shpilevoy authored
* no-wrap: do not wrap the result onto a new line every 72 symbols;
* no-pad: do not add padding at the end of the result;
* urlsafe: no-wrap + no-pad + replace '+' -> '-' and '/' -> '_'.

Closes #2777
Closes #2478
Closes #2479
-
Alexander Turenko authored
The option is '-DENABLE_BUNDLED_ZSTD' and defaults to ON. There are two goals of making this conditional, both related to building Tarantool on Gentoo Linux from an ebuild:
* Avoid the bundled ZStd build issue without having to investigate it.
* Allow the user to choose between the system and the bundled library.
-
- Sep 25, 2017
-
-
Vladimir Davydov authored
A value stored in a light hash must be convertible to int and have such a size that sizeof(LIGHT(record)) is a power of two. Let's remove these limitations.
-
Vladimir Davydov authored
- Use 'static inline' instead of 'inline'
- Move GROW_INCREMENT enum out of LIGHT(record) struct
-
Vladimir Davydov authored
So as not to confuse the user, let's start reader threads when recovery completes only if there are Vinyl indexes. Otherwise, start them on the first Vinyl index creation. See #2664
-
Vladimir Davydov authored
Vinyl starts its worker threads on checkpoint, even if there are no Vinyl spaces to dump out there. This is a mess. Let's wake up the scheduler only if we actually have something to dump. BTW, this patch removes that annoying 'vinyl checkpoint done' message in case nothing was dumped. See #2664
-
Vladimir Davydov authored
Introduce the following functions:

  vy_scheduler_begin_checkpoint()
  vy_scheduler_wait_checkpoint()
  vy_scheduler_end_checkpoint()

and make the Vinyl API methods use them instead of playing with the scheduler directly. Rationale:
- It will help move the scheduler to a separate file in the future.
- We want to avoid waking the scheduler on checkpoint if Vinyl is not used, because the scheduler starts worker threads on the first wakeup. At the same time, we have to call vy_log_rotate() on checkpoint no matter if Vinyl is used or not (it needs to update its internal housekeeping). Factoring out the scheduler functions will allow us to achieve this goal neatly.
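A sketch of how an engine-level checkpoint could drive these three entry points; only the function names come from this commit, while the prototypes and the wrapper below are assumptions for illustration.

    struct vy_scheduler;

    /* Assumed prototypes; the real signatures may differ. */
    int  vy_scheduler_begin_checkpoint(struct vy_scheduler *scheduler);
    int  vy_scheduler_wait_checkpoint(struct vy_scheduler *scheduler);
    void vy_scheduler_end_checkpoint(struct vy_scheduler *scheduler);

    static int
    vinyl_checkpoint_sketch(struct vy_scheduler *scheduler)
    {
        /* Schedule a dump of all in-memory data. */
        if (vy_scheduler_begin_checkpoint(scheduler) != 0)
            return -1;
        /* Block until every dump scheduled above has completed. */
        int rc = vy_scheduler_wait_checkpoint(scheduler);
        /* Let the scheduler resume its normal dump/compaction work. */
        vy_scheduler_end_checkpoint(scheduler);
        return rc;
    }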
-
Vladimir Davydov authored
There's no point in maintaining the LSN of the last checkpoint in Vinyl: we have checkpoint_last() for retrieving it. Make this helper a little bit more friendly by allowing the vclock argument to be omitted, and use it wherever appropriate. Here's why I need to do that now. I'm planning to omit calling scheduler checkpoint functions altogether if there's nothing to dump (so as not to start worker threads and print the 'vinyl checkpoint done' message in case Vinyl is not used), but one of those functions updates last_checkpoint, which other pieces of code (gc primarily) rely on. I could factor last_checkpoint out of vy_scheduler, but why bother if I can simply use checkpoint_last() instead.
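A tiny usage sketch under the assumption that checkpoint_last() returns the LSN and accepts a NULL vclock after this patch; the exact types are assumed.

    #include <stdint.h>

    struct vclock;
    int64_t checkpoint_last(struct vclock *vclock);  /* assumed prototype */

    static int64_t
    last_checkpoint_lsn(void)
    {
        /* Only the LSN is needed, so the vclock out-argument is omitted. */
        return checkpoint_last(NULL);
    }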
-
Vladimir Davydov authored
Currently, box_set_checkpoint_count() not only changes the corresponding configuration option, but also invokes garbage collection. The problem is that it is called at the beginning of box_cfg(), before recovering engine data. At that time engines are not ready to invoke garbage collection (e.g. Vinyl hasn't scanned the vylog directory yet). Besides, the server might be started in the hot standby mode, in which case calling garbage collection is obviously wrong. To fix this, we could somehow detect if box_set_checkpoint_count() is called from box_cfg() and avoid invoking the garbage collector if so, but I think it isn't worth it. Instead, let's just not invoke garbage collection when this option changes. It's not a big deal - old files will be removed by the next snapshot anyway.
-
Vladimir Davydov authored
It's inappropriate to fail startup if the vinyl data directory does not exist, because vinyl might not even be in use. Move the check to the time when a vinyl index is opened. See #2664
-
Vladimir Davydov authored
If an index is dropped while a dump/compaction task is in progress, the task will be aborted, but the VY_LOG_DROP_RUN record won't be written to the metadata log, so the run will remain incomplete (VY_LOG_PREPARE_RUN) and pin the dropped index in the log until the server is restarted and all incomplete runs are purged. This results in occasional failures of the vinyl/gc test, which checks that vylog files are removed after all vinyl spaces are dropped. There's actually no reason not to log deletion of an incomplete run that belongs to a dropped index, so let's do that to fix the problem. Closes #2486
-
Vladimir Davydov authored
So that BPS tree can be redefined in files that include memtx_tree.h
-
- Sep 22, 2017
-
-
Roman Tsisyk authored
Fixes #2761
-
Roman Tsisyk authored
Apple's version of libcurl is outdated and buggy. Use libcurl from homebrew by default. Closes #2772
-
Eugene Leonovich authored
queue.start() is not needed anymore.
-
Roman Tsisyk authored
A follow-up to the previous commit.
-
Vladimir Davydov authored
The cache iterator restoration function is muddy and suboptimal:
- If last_stmt equals NULL, it doesn't attempt to change the iterator position although it should restart the iterator. Moreover, it doesn't even check EQ condition in this case, i.e. it may return a statement that doesn't satisfy the search criteria.
- It may restore iterator position to last_stmt, but the restoration semantics mandates that the returned statement must be strictly greater than last_stmt if the iterator was restored.
- If the iterator was invalidated, it first positions the iterator to the last known position (curr_stmt) and then steps back to the statement closest to last_stmt although it could jump to last_stmt immediately.
Let's rewrite it.
-
Vladimir Davydov authored
vy_merge_iterator_squash_upserts() returns an UPSERT, which is then converted to a REPLACE by vy_read_iterator_next(). Fold the latter into the former to make the workflow easier to follow. Also, remove useless suppress_error argument.
-
Vladimir Davydov authored
Not only the restore() callback can restore the iterator position after the source changed - next_key() and next_lsn() can do that too, which brings additional complexity. Do we really need it? Not really:
- We can avoid handling restoration in next_key() by simply calling restore() before it. This is cheap, as restore() is a no-op if the source was not changed.
- Regarding next_lsn(): when it comes to calling it, we iterate over sources from the most recent one to the oldest, which means that the only sources that may yield (on-disk runs) are guaranteed to be iterated last, after all mutable sources (txw, cache, mems), so none of the mutable sources needs to be restored in this case.
Simplify the read iterator as per the above.
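A sketch of the simplified control flow, with a hypothetical callback table standing in for the real source iterators; only the idea of calling restore() right before next_key() comes from the commit.

    struct source_sketch;

    struct source_iface_sketch {
        /* Re-position the source if it changed under our feet;
         * a cheap no-op when nothing changed. */
        int (*restore)(struct source_sketch *src);
        /* Advance the source to the next key. */
        int (*next_key)(struct source_sketch *src);
    };

    static int
    advance_source(struct source_sketch *src,
                   const struct source_iface_sketch *iface)
    {
        /* Restore first, so next_key() never has to handle it itself. */
        if (iface->restore(src) != 0)
            return -1;
        return iface->next_key(src);
    }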
-
Vladimir Davydov authored
Currently, ->next_lsn() may be (and actually is) called right after opening the iterator, which means that apart from advancing the iterator to the next version of the same statement it has to handle the case of starting iteration. This complicates the callback semantics so let's require ->next_key() to be called first.
-
- Sep 21, 2017
-
-
Roman Tsisyk authored
Closes #2774
-
- Sep 20, 2017
-
-
Vladimir Davydov authored
We generally do not allow dropping objects that have dependents.
-
Vladimir Davydov authored
In contrast to PostgreSQL, this method doesn't make sense in Tarantool, because we don't have sessions.
-
- Sep 19, 2017
-
-
Vladimir Davydov authored
This patch implements a new object type, persistent sequences. Sequences are created with the function box.schema.sequence.create(name, options). Options include min/max, start value, increment and cache size, just like in PostgreSQL, although 'cache' is ignored for now. All sequences can be accessed via box.sequence.<name>, similarly to spaces. To generate a sequence value, use the seq:next() method. To retrieve the last generated value, use seq:get(). A sequence value can also be reset to the start value or set to any other value using the seq:reset() and seq:set() methods. Needed for #389
-
Vladimir Davydov authored
Currently, it is done by space_cache_replace/delete, which violates encapsulation. Let's introduce a trigger that is fired after a change in a space definition is committed and use it to propagate changes to Lua. Patch by @kostja.
-
lenkis authored
-
Vladimir Davydov authored
-
- Sep 18, 2017
-
-
Konstantin Osipov authored
Implement a helper function to swap space triggers. Patch by @locker.
-