- Sep 11, 2017
-
-
Georgy Kirichenko authored
Fixes #2707
-
Georgy Kirichenko authored
Set applier reconnect delay and ack interval (heartbeat interval) via the box.cfg replication_timeout parameter. The relay timeout (the time interval without heartbeat messages) is four times replication_timeout, so up to three heartbeat messages can be skipped before the connection is closed. Fixed #2708
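
The timing relation described in this message can be written out as a small Python sketch. It is only an arithmetic model, not Tarantool code: the function names are hypothetical, and the only fact taken from the message is the factor of four between the relay timeout and replication_timeout.

```python
# Factor taken from the commit message; everything else here is illustrative.
RELAY_TIMEOUT_FACTOR = 4


def relay_timeout(replication_timeout):
    """Time without heartbeats after which the relay gives up."""
    return RELAY_TIMEOUT_FACTOR * replication_timeout


def heartbeats_before_timeout(replication_timeout):
    """Count heartbeats scheduled strictly before the relay timeout expires."""
    timeout = relay_timeout(replication_timeout)
    count = 0
    while (count + 1) * replication_timeout < timeout:
        count += 1
    return count


if __name__ == "__main__":
    # With replication_timeout = 1 the relay waits 4 seconds,
    # so the heartbeats due at 1, 2 and 3 seconds may all be lost.
    print(relay_timeout(1), heartbeats_before_timeout(1))
```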
-
Vladimir Davydov authored
After the read iterator selects the minimal key across all available sources, it checks mutable sources for new statements using ->restore() callback. If there is a new statement in a source, it uses it as the min key provided it is *strictly* less than the current min key. If they are equal, the min key isn't changed, but this is wrong, because the new statement may be newer than the statement selected previously. If we don't select it, we might end up with stale data in the cache. Fix this.
-
Vladimir Davydov authored
Since ->restore() is not used by the read iterator to start iteration any more, we can remove the corresponding code from the cache iterator ->restore() callback. Although it might be tempting to simplify it even more by doing a full lookup every time the cache version changes, as we already do in case of memory and txw iterators, it doesn't seem to be a sound idea, because the read iterator itself can change the cache version on each iteration by inserting new elements into the cache, even if there were no disk accesses.
-
Vladimir Davydov authored
We don't need to handle iterator restart in the ->restore() callback, so we can remove the corresponding code. Also, let's reuse the start iteration function for restoration, because the two cases are in fact equivalent.
-
Vladimir Davydov authored
After the recent changes in the read iterator, the ->restore() callback does not need to handle the case of iterator restart any more. Taking this into account and keeping in mind that on-disk runs are immutable, we can turn the run iterator ->restore() callback into no-op.
-
Vladimir Davydov authored
To avoid a lookup in the memory tree, the memory iterator ->restore() callback tries to walk from the current iterator position to the first statement matching the restoration criteria. Such an optimization complicates the restoration procedure beyond comprehension and makes it extremely error prone. Ironically, all this complexity seems to be pointless, because a change in the memory tree means either a disk access, which is orders of magnitude more expensive than a memory lookup, or an insertion of a new statement into the tree, which has exactly the same complexity as a lookup. That said, let's rewrite the restoration procedure so that it always does a full lookup in case the version of the memory tree has changed. Also, remove handling of iterator restart and the corresponding test case, as a ->restore() callback does not need to handle them any more.
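
The "full lookup when the version changed" idea can be illustrated with a toy Python model. This is a sketch under simplifying assumptions, not the Vinyl implementation: the tree is a sorted list, iteration is ascending only, LSNs and read views are ignored, and the class and method names are hypothetical.

```python
import bisect


class MemTree:
    """Sorted key container with a version bumped on every change."""

    def __init__(self):
        self.keys = []
        self.version = 0

    def insert(self, key):
        bisect.insort(self.keys, key)
        self.version += 1


class MemIterator:
    """Ascending iterator that remembers the tree version it last saw."""

    def __init__(self, tree, start_key):
        self.tree = tree
        self._lookup(start_key, inclusive=True)

    def _lookup(self, key, inclusive):
        # Full lookup: binary search from scratch, no walking from the old position.
        find = bisect.bisect_left if inclusive else bisect.bisect_right
        self.pos = find(self.tree.keys, key)
        self.version = self.tree.version

    def current(self):
        keys = self.tree.keys
        return keys[self.pos] if self.pos < len(keys) else None

    def restore(self, last_key):
        """Reposition right after last_key if the tree changed; report whether it did."""
        if self.version == self.tree.version:
            return False  # nothing changed, the position is still valid
        self._lookup(last_key, inclusive=False)
        return True
```

A restore costs one binary search, the same order of work as the insertion that triggered it, which is the trade-off the message argues for.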
-
Vladimir Davydov authored
Apart from restoring the iterator position in case the source changed, the vy_stmt_iterator_iface->restore callback is also used for starting iteration in vy_merge_iterator_next_key() even though next_key() can be used instead. Let's rewrite the function so that it uses next_key() instead of restore() where appropriate. This will allow us to simplify restore() by making it handle nothing but iterator restoration.
-
- Sep 10, 2017
-
-
Vladimir Davydov authored
The read iterator has to restart (i.e. reopen all its sources) from the position last returned to the caller when the current range or the whole range tree changes as a result of dump or compaction. To reposition the iterator, we use vy_stmt_iterator_iface->restore callback, which was initially designed to restore an individual merge source (txw, mem, or cache) after a statement is added to or removed from it. Abusing the callback like that complicates its implementation as well as the read iterator itself. We can avoid that by simply reopening merge sources with the proper key when we need to restart the read iterator.
-
Vladimir Davydov authored
The 'cleanup' callback is always called together with 'close'. The two callbacks were separated a long time ago, when vy_merge_iterator was used for writing runs. There is no point in keeping them apart any more.
-
- Sep 08, 2017
-
-
Vladislav Shpilevoy authored
Closes #2746
-
- Sep 07, 2017
-
-
Vladimir Davydov authored
The statement generated by the following piece of code ({1, 1, 2}) isn't dumped to the secondary index:

    s = box.schema.space.create('test', {engine = 'vinyl'})
    s:create_index('i1', {parts = {1, 'unsigned'}})
    s:create_index('i2', {parts = {2, 'unsigned'}})
    box.begin()
    s:insert{1, 1, 1}
    s:update(1, {{'+', 3, 1}})
    box.commit()

This happens because UPDATE is replaced with DELETE + REPLACE in the transaction log, both of which have column_mask = 0x04 (field #3 is updated). These statements overwrite the original INSERT in the memory index on commit, but they are not dumped, because their column_mask does not intersect with the column mask of the secondary index (0x02). To avoid that, the new statement (UPDATE = DELETE + REPLACE in this case) must inherit the column mask of the overwritten statement (REPLACE).

Fixes #2745
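
The mask arithmetic behind this bug can be checked with a short Python sketch. It models only the bit operations, not Vinyl itself. The values 0x04 and 0x02 come from the message; the function names and the "all fields" mask of the overwritten statement are assumptions made for illustration.

```python
def needs_dump(stmt_mask, index_mask):
    """A statement is dumped to an index only if the two masks intersect."""
    return (stmt_mask & index_mask) != 0


def inherit_mask(new_mask, overwritten_mask):
    """Sketch of the fix: the new statement also carries the old statement's mask."""
    return new_mask | overwritten_mask


UPDATE_MASK = 0x04  # from the message: field #3 is updated
I2_MASK = 0x02      # from the message: mask of the secondary index on field #2
# Assumption: the overwritten INSERT touches every field (8 bits for brevity).
INSERT_MASK = 0xFF

before_fix = needs_dump(UPDATE_MASK, I2_MASK)
after_fix = needs_dump(inherit_mask(UPDATE_MASK, INSERT_MASK), I2_MASK)

if __name__ == "__main__":
    print(before_fix, after_fix)
```

Before the fix the intersection is empty, so the statement is skipped for i2; once the mask is inherited, the intersection is non-empty and the statement is dumped.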
-
- Sep 06, 2017
-
-
Roman Tsisyk authored
Emulate the http://w3.impa.br/~diego/software/luasocket/tcp.html API. Needed for MobDebug. Closes #2727
-
Roman Tsisyk authored
Closes #598
-
Roman Tsisyk authored
No semantic changes. In context of #2727
-
Roman Tsisyk authored
No semantic changes. Needed for #2727
-
- Sep 05, 2017
-
-
Konstantin Osipov authored
* update error messages
* rename variables
* add a few comments
-
Vladislav Shpilevoy authored
A savepoint allows a transaction to be partially rolled back. After creating a savepoint, the transaction owner can roll back all changes applied after the savepoint without rolling back the entire transaction. Multiple savepoints can be created in each transaction. Rollback to a savepoint cancels changes made after the savepoint and deletes all newer savepoints. It is impossible to roll back to a savepoint from a sub-statement level different from the savepoint's own. For example, a transaction cannot roll back to a savepoint created outside of a trigger from within a trigger body. Closes #2025
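
The savepoint semantics listed in this message can be modelled in a few lines of Python. This is a toy model, not the Tarantool API: the class names, the `level` attribute standing in for the sub-statement level, and the use of ValueError are all assumptions.

```python
class Savepoint:
    def __init__(self, position, level):
        self.position = position  # number of changes made before the savepoint
        self.level = level        # sub-statement level at creation time


class Transaction:
    def __init__(self):
        self.changes = []     # applied changes, oldest first
        self.savepoints = []  # live savepoints, oldest first
        self.level = 0        # 0 = top level, > 0 = e.g. inside a trigger body

    def apply(self, change):
        self.changes.append(change)

    def savepoint(self):
        sp = Savepoint(len(self.changes), self.level)
        self.savepoints.append(sp)
        return sp

    def rollback_to(self, sp):
        if sp not in self.savepoints:
            raise ValueError("no such savepoint")
        if sp.level != self.level:
            raise ValueError("savepoint belongs to another sub-statement level")
        # Cancel the changes made after the savepoint ...
        del self.changes[sp.position:]
        # ... and delete all newer savepoints; sp itself stays usable.
        del self.savepoints[self.savepoints.index(sp) + 1:]


txn = Transaction()
txn.apply("insert 1")
sp1 = txn.savepoint()
txn.apply("insert 2")
sp2 = txn.savepoint()
txn.apply("insert 3")
txn.rollback_to(sp1)  # keeps "insert 1", drops the rest and forgets sp2
```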
-
Vladislav Shpilevoy authored
Vinyl cannot calculate bsize during transaction execution because of DELETE and UPSERT in vinyl spaces with a single index. Move space bsize into MemtxSpace, because Vinyl cannot calculate it now. In the future, Vinyl bsize can be calculated after dumps and compactions, but never during transaction execution.
-
Vladimir Davydov authored
Address issues spotted by Alex Lyapunov:
  - Fix key part count computation in vy_read_interval_cmp[lr]() and vy_read_interval_should_merge() and add the corresponding test case.
  - Simplify comparison in vy_read_interval_cmp[lr]().
  - Improve comment to vy_tx_track().

See #2671
-
Roman Tsisyk authored
Fix misleading "C atomics not supported" when git submodules are missing. Closes #2088
-
- Sep 04, 2017
-
-
Roman Tsisyk authored
Since #1265 tarantool is fully compatible with lua5.1. Install /usr/bin/tarantool as /usr/bin/lua alternative. Closes #2730
-
Vladimir Davydov authored
The check was accidentally broken by commit eb5cd536 ("vinyl: do not track partial reads in tx manager"). Add a test case to avoid similar screw-ups in future. See #2716
-
Vladimir Davydov authored
There are two cases in the hermitage test that check gap locks - PMP (predicate with many preceders) and G4 (anti-dependency cycles). As we didn't have gap locks, we used get() to put a non-existent value to the conflict set. Now we can use select(*) instead.
-
Vladimir Davydov authored
-
Vladimir Davydov authored
Currently, the conflict manager only tracks keys returned by the read iterator, so Vinyl isn't really serializable as select() can return phantom records, e.g.

    space: {10}, {20}, {30}, {40}, {50}

    Transaction 1                          Transaction 2
    -------------                          -------------
    box.begin()
    space:select({30}, {iterator='GE'})
    -- returns {30}, {40}, {50}
                                           box.begin()
                                           box.insert{35}
                                           box.insert{45}
                                           box.insert{55}
                                           box.commit()
    space:select({30}, {iterator='GE'})
    -- returns {30}, {35}, {40}, {45}, {50}, {55};
    -- were it serializable, the transaction would
    -- be sent to read view so that this select()
    -- would return the same set of values as the
    -- previous one
    box.commit()

Besides, tracking individual keys read by a transaction can be very expensive from the memory consumption point of view: think of calling select(*) on a big space.

So this patch makes the conflict manager track intervals instead of individual keys. To achieve that it splits tx_manager->read_set in two:

  - vy_tx->read_set. Contains intervals read by a transaction. Needed to efficiently search intervals that should be merged with a new one. Intervals in this tree cannot intersect.
  - vy_index->read_set. Contains intervals read by all transactions from an index. Needed to efficiently search transactions that conflict with a write. Intervals can intersect.

When vy_tx_track() is called, it first looks up all intervals intersecting with the new interval in vy_tx->read_set, removes them, and extends the new interval to span them. Then it inserts the new interval into both vy_index->read_set and vy_tx->read_set.

The vy_index->read_set is used on commit to send all transactions that read intervals modified by the committed statement to read view.

Note, now we don't differentiate 'gaps', i.e. non-existent keys read by a transaction. Gaps were used to avoid aborting a transaction if a non-existent key read by it is deleted. We can't track gaps without bloating the read set on select(*).

Closes #2671
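
The merge step attributed to vy_tx_track() and the resulting conflict check can be sketched in Python. This is a simplified model, not the Vinyl code: intervals are closed pairs of numbers kept in a plain list instead of a tree, an open-ended GE read is represented with infinity, and the function names are hypothetical.

```python
INF = float("inf")


def track(read_set, lo, hi):
    """Return a new sorted, non-intersecting read set with [lo, hi] merged in."""
    kept = []
    for a, b in read_set:
        if b < lo or a > hi:
            kept.append((a, b))              # no intersection: keep as is
        else:
            lo, hi = min(lo, a), max(hi, b)  # intersection: span it and drop it
    kept.append((lo, hi))
    kept.sort()
    return kept


def conflicts(read_set, key):
    """True if a write to `key` falls into an interval the transaction has read."""
    return any(a <= key <= b for a, b in read_set)


# Old behaviour: only the returned keys {30}, {40}, {50} are tracked,
# so inserting 35 goes unnoticed (a phantom).
keys_read = {30, 40, 50}
phantom_missed = 35 not in keys_read

# New behaviour: select({30}, {iterator='GE'}) tracks the whole interval
# from 30 upwards.
tx_read_set = track([], 30, INF)
phantom_detected = conflicts(tx_read_set, 35)
```

Merging keeps the per-transaction set small: `track([(10, 20), (40, 50)], 15, 45)` collapses to the single interval `(10, 50)`.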
-
Vladimir Davydov authored
Currently, this is done in each plain iterator (run, mem, txw, cache). To handle the empty search key the same way as non-empty keys when setting a gap lock, this needs to be handled in vy_read_iterator. Needed for #2671
-
Vladimir Davydov authored
To set a gap lock properly, the read iterator needs to discern ITER_REQ from ITER_LE, which is used by vy_cursor instead of ITER_REQ. Needed for #2671
-
Vladimir Davydov authored
-
- Sep 01, 2017
-
-
Vladislav Shpilevoy authored
The trigger is called when the flush callback sends messages to the consumer pipe (in cpipe_flush_cb, if the message queue is not empty). Needed for #946 to send buffers from tx to iproto.
-
Vladimir Davydov authored
To rebuild an index when its key def changes, we effectively drop it and create a new index instead. Skipping Index::commitDrop and commitCreate stages at this point deprives Vinyl of an opportunity to log the change in the metadata log and replace the index in the scheduler, which leads to a crash. This patch adds the commit stage to RebuildIndex which calls the above-mentioned commitDrop and commitCreate for the old and the new indexes, respectively. There is a nuance here. Memtx piggybacks Index::commitDrop to drop space tuples when the primary index is dropped. This is actually wrong, because tuples belong to a space, not to an index. Besides, it prevents us from just calling Index::commitDrop() from RebuildIndex::commit() as is, because RebuildIndex does not modify space data, it just moves space tuples to a new index. To circumvent this, let us remove the commitDrop() method from MemtxIndex and drop space tuples directly from MemtxSpace's commitTruncateSpace() and commitAlterSpace().
-
- Aug 30, 2017
-
-
Vladimir Davydov authored
space:select(key, {limit = N}) limits the output to N keys, but it still fetches the (N+1)-th key from the engine. This is pointless. Besides, this can result in a conflict in Vinyl, as Vinyl adds all keys returned by the iterator to the conflict manager.
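
The difference between fetching N+1 and N keys comes down to where the limit is checked. The Python sketch below illustrates this with a counting iterator. It is not the Tarantool select() code: both loop shapes and all names are assumptions chosen to reproduce the behaviour the message describes.

```python
class CountingIterator:
    """Wraps an iterable and counts how many items were actually fetched."""

    def __init__(self, items):
        self._it = iter(items)
        self.fetched = 0

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._it)  # StopIteration propagates at the end
        self.fetched += 1
        return item


def select_fetch_then_check(it, limit):
    """Checks the limit after fetching, so it reads one key too many."""
    out = []
    for item in it:
        if len(out) >= limit:
            break
        out.append(item)
    return out


_END = object()  # sentinel marking an exhausted iterator


def select_check_then_fetch(it, limit):
    """Checks the limit first, so it never touches the (N+1)-th key."""
    out = []
    while len(out) < limit:
        item = next(it, _END)
        if item is _END:
            break
        out.append(item)
    return out
```

In an engine that records every key returned by the iterator, the extra fetch in the first variant would also register a key the caller never saw.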
-
Vladislav Shpilevoy authored
-
- Aug 28, 2017
-
-
Roman Tsisyk authored
Patch aa549401 "Split key_def.h/.cc" accidentally added FIELD_TYPE_MAP member to `enum field_type`. Currently this enum is only used to define index parts. We don't support 'map' indexed field type at least in 1.7.x. See #2652
-
Vladislav Shpilevoy authored
Part of #2652
-
- Aug 24, 2017
-
-
Vladislav Shpilevoy authored
In the next patches, field_def will be parsed from space:format. Field_def will contain char *name, which is limited by BOX_NAME_MAX = 65000, so neither opt_type OPT_STR nor OPT_STRPTR can be used to parse this name from space:format. Besides, field_def contains enum field_type, which cannot be parsed using any opt_type. Also, field_def will contain default_value, which can store values of many types. The proposal is to use opt_create_from_field not for the entire field_def, but only for several fields using opts_parse_key, and to parse the other options manually.
-
Vladislav Shpilevoy authored
Needed for #2652
-
Konstantin Osipov authored
replication/cluster.test.py would fail at server exit, because at_exit() handler tries to destroy a cbus while its mutex is locked. args.test.py would fail when run with 'make test'
-
- Aug 22, 2017
-
-
Vladimir Davydov authored
fiber_time() reports real time, which shouldn't be used for calculating timeouts as it is affected by system time changes. Add fiber_clock() based on ev_monotonic_now(), export it to Lua, and use it instead. Needed for #2527
-
Vladimir Davydov authored
We should use ev_monotonic_now()/ev_monotonic_time() instead of ev_now()/ev_time() for calculating timeouts, because the latter are affected by system time changes so that using them for timeouts can lead to unexpected hangs in case system time changes. Needed for #2527
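
The same rule applies outside libev. As an analogy in Python, where `time.time()` is the wall clock and `time.monotonic()` is unaffected by system time changes, a timeout loop can be sketched as follows. The helper name and its parameters are hypothetical, and this is not the Tarantool fiber API.

```python
import time


def wait_for(predicate, timeout, clock=time.monotonic, sleep=time.sleep, poll=0.01):
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    The deadline is computed from a monotonic clock, so moving the system
    time backwards or forwards cannot stretch or shorten the wait.
    """
    deadline = clock() + timeout
    while not predicate():
        if clock() >= deadline:
            return False  # timed out
        sleep(poll)
    return True


if __name__ == "__main__":
    print(wait_for(lambda: True, timeout=0.1))   # condition already holds
    print(wait_for(lambda: False, timeout=0.05)) # gives up after the deadline
```

Had the deadline been derived from the wall clock, setting the system time back by an hour during the wait would delay the timeout by an hour, which is the kind of unexpected hang the message refers to.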
-