- Nov 03, 2017
-
-
Vladimir Davydov authored
We have a timer for updating the watermark every second. Let's reuse it for quota use rate calculation. This will allow us to get rid of legacy vinyl statistics. Also, let's use EWMA for calculating the average. It is a more efficient and common method, and it makes it easy to tune the period over which the value is averaged.
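As an illustration only, the EWMA idea can be sketched as follows. This is not the vinyl code: the class name, the `period_s` parameter, and the choice of smoothing weight are assumptions made for the example.

```python
import math


class UseRateEwma:
    """Hypothetical model: EWMA of a use rate, updated on a fixed timer tick."""

    def __init__(self, period_s, tick_s=1.0):
        # Weight of the newest sample. A longer averaging period gives a
        # smaller weight, i.e. a smoother average (assumed formula).
        self.alpha = 1.0 - math.exp(-tick_s / period_s)
        self.tick_s = tick_s
        self.rate = 0.0
        self._used_since_tick = 0

    def account(self, nbytes):
        # Called whenever quota is consumed between two timer ticks.
        self._used_since_tick += nbytes

    def on_timer(self):
        # Called once per tick: fold the last interval into the average.
        sample = self._used_since_tick / self.tick_s
        self.rate += self.alpha * (sample - self.rate)
        self._used_since_tick = 0
        return self.rate
```

Only one number is kept between ticks, so no window of past samples has to be stored, and changing `period_s` is enough to retune the averaging interval.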
-
Roman Tsisyk authored
Follow up #1557
-
- Nov 02, 2017
-
-
Alexandr Lyapunov authored
Collation was simply ignored for non-string parts, which could confuse a potential user. Generate a readable error in this case. Fix #2862 part 2
-
Alexandr Lyapunov authored
Currently, collation is silently ignored for type='scalar' parts. Use collation for string scalar fields. Fix #2862 part 1
-
Alexandr Lyapunov authored
Show collation name (if present) in space.index.name.parts[no]. Fix #2862 part 4
-
Alexandr Lyapunov authored
test:create_index('unicode_s1', {parts = {{1, 'STR', collation = 'UNICODE'}}}) will work now. Fix #2862 part 3
-
Vladislav Shpilevoy authored
If a field is not indexed and there are no more indexed or non-nullable fields after it, then allow it to be skipped on insertion. Such a field value looks like MP_NIL, but MP_NIL is not explicitly stored. Named access to this field in Lua returns nil.
Example:
    format = {{'field1'}, {'field2'}, {'field3', is_nullable = true}, {'field4', is_nullable = true}}
    t = space:insert{1, 2} -- ok.
    t.field1 == 1, t.field2 == 2, t.field3 == nil, t.field4 == nil
Closes #2880
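The rule can be modelled in a few lines. This is a sketch under assumptions, not the tuple format code: the helper names are hypothetical, and the format is represented as a list of dicts.

```python
def min_field_count(fmt, indexed_positions):
    """Hypothetical helper: the smallest tuple length the format accepts.

    A trailing field may be omitted only if it is nullable, not indexed,
    and every field after it can be omitted as well.
    """
    required = 0
    for pos, field in enumerate(fmt, start=1):
        if pos in indexed_positions or not field.get("is_nullable", False):
            required = pos
    return required


def named_access(fmt, tup, name):
    """Hypothetical helper: a field that was not stored reads as None (nil)."""
    for pos, field in enumerate(fmt):
        if field["name"] == name:
            return tup[pos] if pos < len(tup) else None
    raise KeyError(name)
```

With the four-field format from the example and an index on the first field, `min_field_count` returns 2, so a two-field tuple is accepted and `field3` and `field4` read as `None`.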
-
Vladislav Shpilevoy authored
Some users store custom keys in format fields, but the current opts parser does not allow storing unknown keys. Let's allow it.
Example:
    format = {}
    format[1] = {name = 'field1', type = 'unsigned', custom_field = 'custom_value'}
    s = box.schema.create_space('test', {format = format})
    s:format()[1].custom_field == 'custom_value'
Closes #2839
-
Vladimir Davydov authored
Using DML/DDL on a Vinyl index with wal_mode = 'none' is likely to result in unrecoverable errors like:
    F> can't initialize storage: Invalid VYLOG file: Index 512/0 created twice
To avoid data corruption in case the user tries to use an existing Vinyl database in conjunction with wal_mode = 'none', let's explicitly forbid it until we figure out how to fix it. Workaround #2278
-
Vladimir Davydov authored
During initial join, a replica receives all data accumulated on the master for its whole lifetime, which may be quite a lot. If the network connection is fast enough, the replica might fail to keep up with dumps, in which case replication fails with ER_VY_QUOTA_TIMEOUT. To avoid that, let's ignore quota timeout until bootstrap is complete. Note, replication may still fail during the 'subscribe' stage for the same reason, but it's unlikely, because the rate at which the master sends data is limited by the number of requests served by the master per unit of time, and it should become nearly impossible once throttling is introduced (See #1862). Closes #2873
-
Vladimir Davydov authored
If the user sets snap_dir to an empty directory by mistake while leaving vinyl_dir the same, tarantool will still bootstrap, but there are likely to be errors like:
    vinyl.c:835 E> 512/0: dump failed: file './512/0/00000000000000000001.run' already exists
    vy_log.c:1095 E> failed to rotate metadata log: file './00000000000000000005.vylog' already exists
Even worse, it may eventually fail to restart with:
    vy_log.c:886 E> ER_MISSING_SNAPSHOT: Can't find snapshot
To avoid that, let's check the vinyl_dir on bootstrap and abort if it contains vylog files left from previous setups. Closes #2872
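A minimal sketch of such a bootstrap check, assuming vylog files can be recognized by their `.vylog` suffix as in the log lines above. The function names and the error text are hypothetical.

```python
import os


def find_leftover_vylogs(vinyl_dir):
    """Return the names of vylog files already present in vinyl_dir."""
    return sorted(
        name for name in os.listdir(vinyl_dir) if name.endswith(".vylog")
    )


def check_vinyl_dir_on_bootstrap(vinyl_dir):
    """Abort bootstrap if vinyl_dir holds metadata logs from an earlier setup."""
    leftovers = find_leftover_vylogs(vinyl_dir)
    if leftovers:
        raise RuntimeError(
            "vinyl_dir contains vylog files from a previous setup: "
            + ", ".join(leftovers)
        )
```

Failing early at bootstrap is cheaper than discovering the conflict later, when a dump or a vylog rotation hits an already existing file.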
-
Vladimir Davydov authored
The only reason why it was allocated is that struct vy_scheduler was defined after struct vy_env, which is not a problem any more. Embedding it allows us to drop the extra argument to vy_scheduler_need_dump_f().
-
Vladimir Davydov authored
It's a big independent entity, let's isolate its code in a separate file. While we are at it, add missing comments to vy_scheduler struct members.
-
Vladimir Davydov authored
Instead of storing a pointer to vy_env in vy_scheduler, let's:
  - Add pointers to tx_manager::read_views and vy_env::run_env to vy_scheduler struct. They are needed to create a write iterator for a dump/compaction task.
  - Add a callback to struct vy_scheduler that is called upon dump completion to free memory. This allows us to eliminate accesses to vy_env::quota and vy_env::allocator from vy_scheduler code.
  - Move the assert that ensures that the scheduler isn't started during local recovery from vy_scheduler_f() to the vy_env_quota_exceeded_cb() callback so that we don't need to access vy_env::status from the scheduler code. Note, after this change we have to set vy_env::status to VINYL_ONLINE before calling vy_quota_set_limit(), because the latter might schedule a dump.
  - Check if we have anything to dump from vy_begin_checkpoint() instead of vy_scheduler_begin_checkpoint().
This will allow us to isolate the scheduler code in a separate file.
-
Vladimir Davydov authored
Currently, dump is triggered (by bumping the memory generation) by the scheduler fiber while quota consumers just wake it up. As a result, the scheduler depends on the quota: it has to access the quota to check if it needs to trigger dump. In order to move the scheduler to a separate source file, we need to get rid of this dependency. Let's rework this code as follows:
  - Remove vy_scheduler_trigger_dump() from vy_scheduler_peek_dump(). The scheduler fiber now just dumps all indexes eligible for dump and completes dump by bumping dump_generation. It doesn't trigger dump by bumping generation anymore. As a result, it doesn't need to access the quota.
  - Make quota consumers call vy_scheduler_trigger_dump() instead of just waking up the scheduler. This function will become a public one once the scheduler is moved out of vinyl.c. The function logic is changed a bit. First, besides bumping generation, it now also wakes up the scheduler fiber. Second, it does nothing if dump is already in progress or can't be scheduled because of a concurrent checkpoint. In the latter case, though, it sets a special flag that will force the scheduler to trigger dump upon checkpoint completion.
  - vy_scheduler_begin_checkpoint() can't use vy_scheduler_trigger_dump() anymore due to the additional checks added to the function, so it bumps the generation directly. This looks fine.
  - Such a design has a subtlety regarding how quota consumers notify the scheduler and how they are notified back about available quota. In extreme cases, quota released by a dump may not be enough to satisfy all consumers, in which case we need to reschedule dump. Since the scheduler doesn't check the quota anymore and doesn't reschedule dump, it has to be done by the remaining consumers. So consumers have to call the quota_exceeded_cb callback (which triggers a dump now) every time they are woken up and see there's not enough quota. vy_quota_use() is reworked accordingly.
Also, since the quota usage may exceed the limit (because of vy_quota_force_use()), the quota usage may remain higher than the limit after a dump completion, in which case vy_quota_release() doesn't wake up consumers and again there's no one to trigger another dump. So we must wake up all consumers every time vy_quota_release() is called.
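The wake-up protocol described above can be modelled without fibers. The following is a simplified, single-threaded sketch and not the vinyl implementation: the class and its waiter list are assumptions, and only the names quota_exceeded_cb, use, force_use, and release mirror the functions mentioned in the message.

```python
class QuotaModel:
    """Hypothetical model of consumers re-triggering dump on every wake-up."""

    def __init__(self, limit, quota_exceeded_cb):
        self.limit = limit
        self.used = 0
        self.quota_exceeded_cb = quota_exceeded_cb
        self.waiters = []  # sizes of requests blocked on the quota

    def use(self, size):
        # A consumer that sees insufficient quota always calls the callback,
        # both on the first attempt and after each wake-up.
        if self.used + size <= self.limit:
            self.used += size
            return True
        self.quota_exceeded_cb()
        self.waiters.append(size)
        return False

    def force_use(self, size):
        # May push usage above the limit.
        self.used += size

    def release(self, size):
        self.used -= size
        # Wake up all consumers unconditionally, even if usage is still
        # above the limit, so that someone is left to trigger another dump.
        woken, self.waiters = self.waiters, []
        for pending in woken:
            self.use(pending)
```

In this model a release that frees too little memory still ends with at least one callback invocation, which is the property the message relies on to get another dump scheduled.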
-
Vladimir Davydov authored
quota_cond, which is used for throttling quota consumers, doesn't really belong to vy_scheduler. It would fit much better in vy_quota. Let's move it there. This also allows us to remove the two callbacks from vy_quota struct, quota_throttled_cb and quota_released_cb, and make the code more straightforward. While we are at it, let's also rename vy_scheduler_quota_exceeded_cb() to vy_env_quota_exceeded_cb().
-
- Nov 01, 2017
-
-
Vladimir Davydov authored
If xlog_flush() fails, box.snapshot() will still succeed, but recovery from such an incomplete snapshot will fail. Fix it and add the corresponding test case.
-
- Oct 31, 2017
- Oct 30, 2017
-
-
Vladimir Davydov authored
In 1.7 the join procedure consists of two phases: initial, during which we send the last snapshot, and final, when we send xlogs written after the snapshot. Between the two phases, the replica uuid is added to the cluster table on the master, so by the time join is finished, the replica should have received its id. However, on 1.6 there's no final join phase; instead, the master expects the replica to receive xlogs upon subscription. As a result, the replica doesn't receive its id until it sends the subscribe request. This is not expected by 1.7 clients: they fail with ER_UNKNOWN_REPLICA. Fix this problem by making 1.7 replicas proceed to subscription and wait until the id is received before completing bootstrap from a 1.6 master. Closes #2702
-
- Oct 27, 2017
-
-
Vladimir Davydov authored
There was a bug in small garbage collection that resulted in a tuple leak in case box.snapshot() races with DML. The leak was indicated by constantly growing box.slab.info().items_used. Update the small library to fix it. Closes #2842
-
Roman Tsisyk authored
This reverts commit 8b6cefd0. This feature is too good to be pushed into 1.7. Sorry.
-
- Oct 26, 2017
-
-
Roman Tsisyk authored
Improve usability.
-
Roman Tsisyk authored
Follow up 9297ec36 "chmod and chown control socket"
-
Roman Tsisyk authored
-
Alexander Turenko authored
Fixes #2852. Fixes #2849.
-
Ilya authored
* Fix parsing in case of unexpected headers
* Fix duplicating headers in case of retransmitting responses
A workaround for #2836
-
Vladislav Shpilevoy authored
Tomap() creates a Lua table with both name and number indexes. Each named field is stored both by its name and by its index in the tuple. For example, if a tuple is {'a', 'b', 'c'} and its format is {'field1', 'field2'}, then t.field1 is the same as t[1], and t.field2 is the same as t[2]. Unnamed fields can be accessed only by their indexes. In the example above, 'c' can be accessed only as t[3]. Closes #2821
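The resulting mapping can be modelled as follows. This is a Python sketch of the behaviour described above, with a dict standing in for the Lua table; the function signature is an assumption.

```python
def tomap(tup, field_names):
    """Model of the described result: 1-based indexes plus field names."""
    # Every field is reachable by its 1-based position.
    result = {pos: value for pos, value in enumerate(tup, start=1)}
    # Named fields are additionally reachable by name.
    for pos, name in enumerate(field_names, start=1):
        if pos <= len(tup):
            result[name] = tup[pos - 1]
    return result
```

For the tuple `('a', 'b', 'c')` with names `['field1', 'field2']`, the result has five keys: `1`, `2`, `3`, `'field1'`, and `'field2'`.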
-
Ilya authored
* Call box.cfg() instead of raising an error on the first access to box.XXX Fixes #2559
-
Ilya authored
Fix typos in type checks in the string module. Closes #2775
-
Georgy Kirichenko authored
-
- Oct 25, 2017
-
-
Vladislav Shpilevoy authored
Check request fields for NULL before printing.
-
Konstantin Osipov authored
-
- Oct 24, 2017
-
-
Vladislav Shpilevoy authored
Needed to remove monitoring fibers from shard and use only the netbox API to track whether a connection is closed or reopened. Closes #2858
-
Vladislav Shpilevoy authored
Closes #2779
-
- Oct 19, 2017
-
-
Konstantin Nazarov authored
luarocks make <rockspec> allows one to build a rock from a local directory. In addition to the "rocks make" argument, one more option is needed in tarantoolctl: --chdir. This is because we need to build inside the rock directory, but output the result to <project_root>/.rocks. Implements #2846
-
Vladimir Davydov authored
> src/box/txn.c:454:40: error: '_Alignof' applied to an expression is a GNU extension [-Werror,-Wgnu-alignof-expression]
> diag_set(OutOfMemory, sizeof(*svp) + alignof(*svp) - 1,
> ^
Do not try to be smart and guess allocation size using alignof.
> src/box/memtx_tree.c:391:11: error: comparison of unsigned enum expression < 0 is always false [-Werror,-Wtautological-compare]
> if (type < 0 || type > ITER_GT) { /* Unsupported type */
> ~~~~ ^ ~
> src/box/vinyl_index.c:184:29: error: comparison of unsigned enum expression < 0 is always false [-Werror,-Wtautological-compare]
> if (type > ITER_GT || type < 0) {
> ~~~~ ^ ~
Move the check for illegal params (i.e. 'type < 0') to the box API. In index callbacks, only check that the iterator type is supported by the index.
-
Vladimir Davydov authored
Since we already have the index_create_iterator() method to create an iterator, the API basically consists of two functions: iterator_next() and iterator_delete(). While iterator_delete() is just a trivial wrapper around the iterator::free callback, iterator_next() is more than that: it also checks the schema version and invalidates the iterator in case there was a DDL that affected the index. Previously, this was done only by the box API, but the overhead of this check seems to be negligible, so it makes sense to do it in the internal API so that an internal API user doesn't need to care about DDL once they have opened an iterator. Needed for #2776
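A rough model of the check in iterator_next() is shown below. It is a sketch under a simplifying assumption: any schema version change invalidates the iterator, whereas the message says only a DDL that affected the index does. The class names are hypothetical.

```python
class Schema:
    """Hypothetical stand-in for the global schema state."""

    def __init__(self):
        self.version = 0

    def ddl(self):
        # Any DDL bumps the schema version.
        self.version += 1


class Iterator:
    """Iterator that remembers the schema version it was opened under."""

    def __init__(self, schema, tuples):
        self._schema = schema
        self._schema_version = schema.version
        self._tuples = iter(tuples)
        self.invalidated = False

    def next(self):
        if self.invalidated:
            return None
        if self._schema.version != self._schema_version:
            # Simplification: treat every version change as affecting us.
            self.invalidated = True
            return None
        return next(self._tuples, None)
```

The check costs one integer comparison per call, which is consistent with the overhead being described as negligible.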
-
Vladimir Davydov authored
This virtual method was added to make use of the 'position' optimization implemented in memtx. Since the optimization was removed recently, we don't need it anymore.
-
Vladimir Davydov authored
The primary reason for these methods to be implemented differently for memtx and vinyl was the 'position' optimization exploited by the memtx engine: since selects from memtx do not yield, we could use a preallocated iterator there. Now, as the 'position' optimization became redundant and was removed due to the switch to memory pools for iterator allocations, the only idiosyncrasy left in the memtx implementation is the count() optimization: count() falls back on size() for ITER_ALL. Since this optimization consists of just a few lines of code, we don't really need memtx_index_count() co-used by all memtx index implementations: we can implement it in each memtx index separately. That being said, let us:
  - implement generic versions of min(), max(), and count();
  - make vinyl, memtx, and sysview engines use generic versions of the above-mentioned methods if appropriate;
  - remove memtx_index.[hc].
As a side effect, this patch enables min(), max(), and count() in the sysview engine, but that is not bad considering that this engine implements a general-purpose iterator for its indexes.
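The split between generic methods and the memtx shortcut might look like this in a toy model. Everything here is illustrative: the index classes are hypothetical, only three iterator types are modelled, and the real generic versions operate on the index iterator API rather than on Python lists.

```python
ITER_ALL, ITER_EQ, ITER_GE = "ALL", "EQ", "GE"


class SortedIndex:
    """Toy index that exposes only size() and an iterator."""

    def __init__(self, keys):
        self.keys = sorted(keys)

    def size(self):
        return len(self.keys)

    def iterate(self, iter_type, key=None):
        for k in self.keys:
            if (iter_type == ITER_ALL
                    or (iter_type == ITER_EQ and k == key)
                    or (iter_type == ITER_GE and k >= key)):
                yield k


def generic_count(index, iter_type, key=None):
    # Generic version: walk the iterator and count what it yields.
    return sum(1 for _ in index.iterate(iter_type, key))


def generic_min(index):
    return next(index.iterate(ITER_ALL), None)


def generic_max(index):
    last = None
    for k in index.iterate(ITER_ALL):
        last = k
    return last


class MemtxLikeIndex(SortedIndex):
    def count(self, iter_type, key=None):
        # The few-line optimization: ITER_ALL needs no iteration at all.
        if iter_type == ITER_ALL:
            return self.size()
        return generic_count(self, iter_type, key)
```

Any engine that provides an iterator gets min(), max(), and count() from the generic functions, which matches the side effect noted for sysview.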
-