- May 03, 2017
-
-
Vladimir Davydov authored
Closes #2394
-
Roman Tsisyk authored
Closes #2386
-
Roman Tsisyk authored
Rename `remote_check` to `check_remote_arg` to follow conventions in schema.lua
-
Roman Tsisyk authored
Change the conn:call() and conn:eval() API to accept a Lua table instead of varargs for function/expression arguments:

```
conn:call(func_name, arg1, arg2, ...) => conn:call(func_name, {arg1, arg2, ...}, opts)
conn:eval(expr, arg1, arg2, ...)      => conn:eval(expr, {arg1, arg2, ...}, opts)
```

This breaking change is needed to extend the call() and eval() API with per-request options, like `timeout` and `buffer` (see #2195):

```
c:call("echo", {1, 2, 3}, {timeout = 0.2})
c:call("echo", {1, 2, 3}, {buffer = ibuf})
ibuf.rpos, result = msgpack.ibuf_decode(ibuf.rpos)
result
```

Tarantool 1.6.x behaviour can be turned on by the `call_16` per-connection option:

```
c = net.connect(box.cfg.listen, {call_16 = true})
c:call('echo', 1, 2, 3)
```

This is a breaking change for 1.7.x. Needed for #2285. Closes #2195
-
Konstantin Nazarov authored
Getting the space format should be safe, as it is tied to schema_id, and net.box makes sure that schema_id stays consistent. It means that when you receive a tuple from net.box, you may be sure that its space format is consistent with the remote. Fixes #2402
-
Roman Tsisyk authored
Fixes #2391
-
Konstantin Nazarov authored
Previously, the format in space:format() wasn't allowed to be nil. In the context of #2391.
-
- May 02, 2017
-
-
Vladimir Davydov authored
- In-memory trees are now created per index, not per range as before.
- Dump is scheduled per index and writes the whole in-memory tree to a single run file. Upon completion it creates a slice for each range of the index.
- Compaction is scheduled per range as before, but now it doesn't include in-memory trees, only on-disk runs (via slices). Compaction and dump of the same index can happen simultaneously.
- Range split, just like coalescing, is done immediately by creating new slices and doesn't require long-term operations involving disk writes.
-
Vladimir Davydov authored
With the single in-memory tree per index, read iterator will reopen memory iterator per each range, as it already does in case of txw and cache iterators, so we need to teach memory iterator to skip to the statement following the last key returned by read iterator. So this patch adds a new parameter to memory iterator, before_first, which, if not NULL, will make it start iteration from the first statement following the key of before_first.
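The skip-ahead behaviour described above can be sketched in Python; the names and the bisect-based positioning are illustrative only, not Tarantool's actual vy_mem iterator:

```python
import bisect

def open_memory_iterator(sorted_keys, before_first=None):
    """Yield keys from a sorted in-memory tree. If before_first is given,
    start from the first key strictly following it, so the read iterator
    can resume past the last key it already returned."""
    start = 0
    if before_first is not None:
        # bisect_right lands just past all entries equal to before_first
        start = bisect.bisect_right(sorted_keys, before_first)
    yield from sorted_keys[start:]
```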
-
Vladimir Davydov authored
The key is created in the main cord so there's absolutely no point in deleting it in a worker thread. Moving key unref from cleanup to delete will simplify some of the workflows of the single memory level patch.
-
Vladimir Davydov authored
This parameter was needed for replication before it was redesigned. Currently, it is always false.
-
Vladimir Davydov authored
To ease recovery, vy_recovery_iterate() iterates over slices of the same range in the chronological order. It is easy to do, because we always log slices of the same range in the chronological order, as there can't be concurrent dump and compaction of the same range. However, this will not hold when the single memory level is introduced: a dump, which adds new slices to all ranges, may occur while compaction is in progress so that when compaction is finished a record corresponding to the slice created by compaction will appear after the slice created by dump, although the latter is newer. To prevent this from breaking the assumption made by iterators that newer slices are closer to the head of vy_range->slices list, let's sort the list on recovery/join.
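The recovery-time reordering can be sketched as below, under the assumption (made for this sketch, not stated in the commit) that slice IDs grow monotonically with creation time:

```python
def sort_slices_newest_first(slices):
    """Restore the invariant that newer slices sit closer to the head of a
    range's slice list. Assumes IDs encode creation order."""
    return sorted(slices, key=lambda s: s["id"], reverse=True)
```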
-
Vladimir Davydov authored
Currently, on recovery we create and load a new vy_run per each slice, so if there's more than one slice created for a run, we will have the same run duplicated in memory. To avoid that, maintain the hash of all runs loaded during recovery of the current index, and look up the run there when a slice is created instead of creating a new run. Note, we don't need to do anything like this on initial join, as we delete the run right after sending it to the replica, so we can just create a new run each time we make a slice.
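A minimal Python sketch of the dedup idea: a hash keyed by run ID ensures each run is loaded at most once, with `load_run` standing in for the expensive disk load (names are illustrative):

```python
def get_or_load_run(run_cache, run_id, load_run):
    """Return a shared run object for run_id, loading it at most once.
    run_cache maps run_id -> run; slices for the same run get the same
    in-memory object instead of a duplicate."""
    run = run_cache.get(run_id)
    if run is None:
        run = load_run(run_id)
        run_cache[run_id] = run
    return run
```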
-
Vladimir Davydov authored
In order to recover run slices, we need to store info about them in the metadata log, so this patch introduces two new records:
- VY_LOG_INSERT_SLICE: takes the IDs of the slice, of the range to insert the slice into, and of the run the slice is for. It also takes the slice boundaries, because after coalescing two ranges a slice inserted into the resulting range may be narrower than the range.
- VY_LOG_DELETE_SLICE: takes the ID of the slice to delete.

Also, it renames VY_LOG_INSERT_RUN and VY_LOG_DELETE_RUN to VY_LOG_CREATE_RUN and VY_LOG_DROP_RUN. Note, we no longer need to keep deleted ranges (and slices) in the log until garbage collection wipes them away, because they are not needed by deleted run records, which are what garbage collection targets.
-
Vladimir Davydov authored
The same keys will be used to specify slice boundaries, so let's call them in a neutral way. No functional changes.
-
Vladimir Davydov authored
Currently, there can't be more than one slice per run, but this will change once the single memory level is introduced. Then we will have to count the number of slices per each run so as not to unaccount the same run more than once on slice deletion. Unfortunately, we can't use vy_run->refs to count the number of slices created per each run, because, although vy_run->refs is only incremented per each slice allocated for the run, this includes slices that were removed from ranges and stay allocated only because they are pinned by open iterators. So we add one more counter to vy_run, slice_count, and introduce new helpers to be used for slice creation/destruction, vy_run_make_slice() and vy_run_destroy_slice(), which inc/dec the counter.
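The counting scheme might look like the following Python sketch (toy objects, not the real vy_run; the boolean return stands in for "unaccount the run now"):

```python
class Run:
    """Toy run object tracking how many slices currently reference it."""
    def __init__(self):
        self.slice_count = 0

def run_make_slice(run, begin=None, end=None):
    """Create a slice over run, clamped to [begin, end), and count it."""
    run.slice_count += 1
    return {"run": run, "begin": begin, "end": end}

def run_destroy_slice(slice_):
    """Drop a slice; report True only when the run's last slice is gone,
    so the run is unaccounted exactly once."""
    run = slice_["run"]
    run.slice_count -= 1
    return run.slice_count == 0
```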
-
Vladimir Davydov authored
There's a sanity check in vy_range_needs_split() that assures the resulting ranges are not going to be empty: it checks the split key against the oldest run's min key. The check is not enough for the slice concept, because even if the split key is > min key, it still can be < the beginning of the slice.
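The strengthened check can be sketched like this (a hypothetical simplification of vy_range_needs_split with comparable keys; None means an unbounded slice begin):

```python
def range_needs_split(split_key, oldest_run_min_key, slice_begin):
    """Both halves of a split must be non-empty: the split key has to lie
    strictly above the oldest run's min key AND strictly above the
    slice's own begin bound."""
    if split_key <= oldest_run_min_key:
        return False
    if slice_begin is not None and split_key <= slice_begin:
        return False
    return True
```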
-
Vladimir Davydov authored
We use run->info.keys to estimate the size of a new run's bloom filter. We use run->info.size to trigger range split/coalescing. If a range contains a slice that spans only a part of a run, we can't use the run->info stats, so this patch introduces the following slice stats: number of keys (for the bloom filter) and size on disk (for split/coalesce). These two counters are not accurate, they are only estimates, because calculating exact numbers would require disk reads. Instead, we simply take the corresponding run's stat and multiply it by slice page count / run page count.
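The estimate described above is just a proportional scaling; a minimal sketch (illustrative names, not the actual vy_slice stat fields):

```python
def estimate_slice_stats(run_keys, run_size, run_page_count, slice_page_count):
    """Scale the run's exact counters by the fraction of pages the slice
    covers. Cheap estimate: no disk reads needed."""
    frac = slice_page_count / run_page_count
    return {"keys": int(run_keys * frac), "size": int(run_size * frac)}
```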
-
Vladimir Davydov authored
There will be more than one slice per run, i.e. the same run will be used jointly by multiple ranges. To make sure that a run isn't accounted twice, separate run accounting from range accounting.
-
Vladimir Davydov authored
Make sure that we start iteration within the given slice and end it as soon as the current position leaves the slice boundaries. Note, the overhead caused by extra comparisons is only incurred if the slice has non-NULL boundaries, which is only the case if the run is shared among ranges.
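The clamping logic can be sketched as a Python generator, assuming half-open [begin, end) bounds (an assumption of this sketch) and a linear scan in place of the real seek:

```python
def iterate_slice(run_keys, begin=None, end=None):
    """Iterate a sorted run, skipping keys below `begin` and stopping as
    soon as a key reaches `end`. The extra comparisons only happen when
    a bound is actually set."""
    for key in run_keys:
        if begin is not None and key < begin:
            continue
        if end is not None and key >= end:
            break
        yield key
```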
-
Vladimir Davydov authored
When we finally move to single memory tree per index (currently we maintain one per range), dump will result in creation of a single run. To add such runs to ranges and be able to iterate over statements that are within a particular range, we introduce a new concept, a run slice. A run slice is a simple object that references a run and contains begin and end keys inherited from the range it belongs to. Runs are now not referenced directly by ranges, instead we use slices as an intermediary. For now the concept is oversimplified: there may only be one slice per run, but the following patches will remove this limitation.
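A hedged Python sketch of the slice object itself (the real vy_slice is a C struct; field names here are illustrative, and None encodes an unbounded side):

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class RunSlice:
    """A slice references a whole on-disk run but is clamped to the
    begin/end keys inherited from the range it belongs to. Ranges hold
    slices, never runs directly."""
    run: Any
    begin: Optional[Any] = None
    end: Optional[Any] = None

    def contains(self, key) -> bool:
        if self.begin is not None and key < self.begin:
            return False
        if self.end is not None and key >= self.end:
            return False
        return True
```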
-
Vladimir Davydov authored
Maintaining range->{begin,end} as tuples is useful for the concept of run slices, which is introduced by the following patches. A slice may inherit its begin and end from a range, so basically we have two alternatives: either copy keys or take references. The latter seems to be more straightforward.
-
Vladimir Davydov authored
The vinyl/recover test was written a long time ago. Back then, recovery was a convoluted procedure based on scanning the data directory, so it was crucial to generate a number of stale files to check its validity. Nowadays there's no point in that, thanks to the metadata log. Moreover, when the single memory level is introduced, ERRINJ_VY_RANGE_SPLIT won't make sense any more, as range splitting will not involve a worker thread, so we can remove the errinj from this test. Another problem with this test is that it doesn't take data compression into account (all tuples generated by it compress perfectly). Handle this by generating random padding strings. Also, when we snapshot vinyl stats before restart, there may still be compaction in progress that modifies the stats at the last moment, resulting in a sporadic failure. Address that by checking stats separately, for another space with compaction disabled.
-
Vladimir Davydov authored
This reverts commit f3764063. With the recently proposed concept of run slices, this commit is not needed to implement the single memory level any more.

Conflicts:
    src/box/vinyl.c
    src/box/vy_log.c
    src/box/vy_log.h
-
Vladimir Davydov authored
This reverts commit f3ecce75. With the recently proposed concept of run slices, this commit is not needed to implement the single memory level any more.

Conflicts:
    src/box/vinyl.c
-
Vladimir Davydov authored
This reverts commit 55685eaf. With the recently proposed concept of run slices, this commit is not needed to implement the single memory level any more.

Conflicts:
    src/box/vinyl.c
-
Alexandr Lyapunov authored
Patch fb9c8b32 introduced generation of reproduce code and dumping it to the log. The problem is that the code is initially built up in one big Lua string using repeated concatenation in a loop; since Lua strings are immutable, such repeated concatenation performs very poorly. Avoid repeated Lua string concatenation in tx_serial.test.
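The general fix is the collect-then-join idiom; the actual change is to Lua test code (table.insert plus table.concat), but the same pattern can be sketched in Python:

```python
def build_repro_fast(lines):
    """Accumulate pieces in a list and join once at the end, instead of
    repeated string concatenation, which is quadratic for immutable
    strings (the Lua analogue is table.concat)."""
    parts = []
    for line in lines:
        parts.append(line)
    return "\n".join(parts)
```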
-
Vladimir Davydov authored
This reverts commit 44367f5c. With the recently proposed concept of run slices, this commit is not needed to implement the single memory level any more. Conflicts: src/box/vinyl.c
-
Vladimir Davydov authored
This reverts commit a839af29. With the recently proposed concept of run slices, this commit is not needed to implement the single memory level any more.
-
Vladimir Davydov authored
This reverts commit 1360f193. The idea behind this patch makes sense for memory iterator, but the implementation is incorrect: it's not enough to patch ->restore, because memory iterator can start right from ->next_key, bypassing ->restore. Since other hunks of this patch are not needed in the scope of the run slice paradigm, and what is done by this patch needs to be rewritten from scratch anyway, revert the whole patch.
-
Vladimir Davydov authored
Fixes commit 976b31cb ("vinyl: fix order error in txv_iterator_start"). If search key is empty, txw iterator starts from the first or last entry in the write set depending on the iterator direction. This is incorrect, because the write set is grouped by index so if the first/last entry happens to be for another index, txw iterator will stop immediately even if there are statements for the given index in the write set. Instead we must take the first/last statement for the given index. We can use psearch/nsearch for it - it will position to the first/last element in the tree equal to the search key; since the search key is equal to any statement of the given index in case the given key is empty, it will do the job. While we are at it, also remove handling 'key == NULL' case from the write_set_key_cmp() as it is not used anywhere.
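The psearch-style positioning can be sketched with bisect over a write set modeled as a sorted list of (index_id, key) tuples (a simplification of the real tree for this sketch):

```python
import bisect

def first_stmt_of_index(write_set, index_id):
    """The write set is sorted by (index_id, key). For an empty search
    key, position at the first statement of the given index rather than
    at the first element of the whole set."""
    # (index_id,) sorts before every (index_id, key) tuple
    i = bisect.bisect_left(write_set, (index_id,))
    if i < len(write_set) and write_set[i][0] == index_id:
        return write_set[i]
    return None
```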
-
- Apr 27, 2017
-
-
Georgy Kirichenko authored
Update the snapshot timestamp if a snapshot already exists, and do not produce an error in that case. Fixes gh-2045.
-
Konstantin Osipov authored
Use one function to initialize a global read view, pass read view lsn in.
-
Alexandr Lyapunov authored
The upsert squash process must not touch or take into consideration non-committed statements, because they are owned by the TX manager and thus might be changed or removed at its will. Add a committed read view to the TX manager (in addition to the global read view) that allows seeing only committed statements, and use it in the squash process. Move upsert count calculation, simple in-place upsert squash, and upsert process invocation from the preparation stage to the commit stage of a TX; without this change, after a squash the prepared statements would not get a chance to use the simple in-place upsert squash. Fixes gh-2382.
-
Alexandr Lyapunov authored
For historical reasons, vy_upsert hides some fatal errors in the upsert statement and returns the original base statement with the original, lower LSN. In that case the squash process replaced the wrong statement with the wrong LSN, doing useless work. Fix it.
-
Alexandr Lyapunov authored
There might be a case when a non-upsert statement is inserted just after a squash process of the same key has started. Before this patch, the squash process copied that non-upsert statement and reinserted it into the index for some reason. Make the squash process detect this case and quit immediately.
-
Roman Tsisyk authored
We are not ready to freeze `box.info().vinyl()` and `index:info()` output right now. Let's keep this API for internal use only. Please note that the output might change in the future. See #1662
-
Roman Tsisyk authored
* Rename box.info.server.id to box.info.id
* Rename box.info.server.uuid to box.info.uuid
* Rename box.info.server.lsn to box.info.lsn
* Rename box.info.cluster.signature to box.info.signature
* Drop box.info.server section
* Return `nil` instead of `0` for box.info.id during bootstrap
* Return `nil` instead of `-1` for box.info.lsn during bootstrap

Sample output:

```
tarantool> box.info
---
- version: 1.7.3-538-g7ee75dee4
  id: 1
  ro: false
  vclock: {}
  uptime: 3
  lsn: 0
  vinyl: []
  pid: 20714
  status: running
  uuid: e6d913b9-a8b8-4873-ba94-14cf4357fec6
  signature: 0
  replication:
    1:
      id: 1
      uuid: e6d913b9-a8b8-4873-ba94-14cf4357fec6
      lsn: 0
  cluster:
    uuid: 3cfa8749-0fba-4b13-b8eb-db90f79d1485
```

Closes #723
-
Roman Tsisyk authored
Prepare for box.info.server removal. See #723
-
- Apr 25, 2017
-
-
Roman Tsisyk authored
-