- Oct 26, 2018
-
-
Vladimir Davydov authored
-
Vladimir Davydov authored
-
Georgy Kirichenko authored
In some cases LuaJIT does not collect cdata objects that were transformed with ffi.cast, as tuple_bless does. As a consequence, the internal table with gc callbacks overflows and then Lua crashes. There might be an internal LuaJIT issue, because it fires only for jitted code, but assigning a gc callback before the transformation fixes the problem. Closes #3751
-
Vladimir Davydov authored
Back when bloom filters were introduced, neither box.info.memory() nor box.stat.vinyl().memory existed, so bloom filters were accounted to box.runtime.info().used for lack of a better place. Now there's no point in accounting them there. In fact, it's confusing, because bloom filters are allocated with malloc(), not from the runtime arena, so let's drop it.
-
Vladimir Davydov authored
If a tuple read from a run by a slice stream happens to be out of the slice bounds, it will never be freed. Fix it. The leak was introduced by commit c174c985 ("vinyl: implement new simple write iterator").
-
AKhatskevich authored
After the behavior of the `IS` operator was changed (#b3a3ddb5), `SET NULL` was rewritten to use `EQ` instead, which doesn't respect NULLs. This commit fixes the NULL-related behavior by emitting a logical construction that is, for this case, equivalent to the old `IS`. The new expression works differently from the old `IS` for NULLs, but the difference doesn't change anything, because matched rows are then searched in the child table with an `EQ` expression, which does not match NULLs.

Before: `oldval` old_is `newval`
Now: `oldval` is_null or (`newval` is_not_null and `oldval` eq `newval`)

Closes #3645
-
- Oct 25, 2018
-
-
Alexander Turenko authored
-
Alexander Turenko authored
-
Alexander Turenko authored
-
Alexander Turenko authored
Upload tarballs of alpha and beta tarantool versions (*.0 and *.1 branches) into 2x (3x, 4x...) buckets. See more details about the release process in the documentation: [1]. [1]: https://tarantool.io/en/doc/2.0/dev_guide/release_management/
-
Viktor Oreshkin authored
-
Serge Petrenko authored
This patch adds logging of the number of rows received by the applier during the join stage, the same way recovery does. Closes #3165
-
Kirill Yukhin authored
Remove the function that deletes from the cache, making replace more general: it can be used for insertions, deletions, and replaces. Also, add an assert to the replace routine that the space pointer found in the cache is equal to the old one.
-
Alexander Turenko authored
The idea behind this change is to have a 2x (and maybe later 3x, 4x, ...) bucket for alpha and beta releases. See more details about the release process in the documentation: [1]. [1]: https://tarantool.io/en/doc/2.0/dev_guide/release_management/
-
Vladimir Davydov authored
Now, if the WAL thread fails to preallocate disk space needed to commit a transaction, it will delete old WAL files until it succeeds or until it has deleted all files that are not needed for local recovery from the oldest checkpoint. After it deletes a file, it notifies the garbage collector via the WAL watcher interface. The latter then deactivates consumers that would need the deleted files. The user doesn't see an ENOSPC error if the WAL thread successfully allocates disk space after deleting old files. Here's what's printed to the log when this happens:

wal/101/main C> ran out of disk space, try to delete old WAL files
wal/101/main I> removed /home/vlad/src/tarantool/test/var/001_replication/master/00000000000000000005.xlog
wal/101/main I> removed /home/vlad/src/tarantool/test/var/001_replication/master/00000000000000000006.xlog
wal/101/main I> removed /home/vlad/src/tarantool/test/var/001_replication/master/00000000000000000007.xlog
main/105/main C> deactivated WAL consumer replica 82d0fa3f-6881-4bc5-a2c0-a0f5dcf80120 at {1: 5}
main/105/main C> deactivated WAL consumer replica 98dce0a8-1213-4824-b31e-c7e3c4eaf437 at {1: 7}

Closes #3397
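As an illustration of the policy described above, here is a minimal C sketch of the retry loop; the helper names are hypothetical stand-ins, not the actual wal.c code:

```c
/* Hypothetical sketch, not the real Tarantool code: the three helpers
 * below stand in for WAL preallocation, old-file deletion and the GC
 * notification sent through the WAL watcher interface. */
#include <errno.h>
#include <stdbool.h>

/* Try to preallocate space; returns 0 on success, -1 with errno set. */
static int try_fallocate(long len) { (void)len; errno = ENOSPC; return -1; }
/* Delete the oldest WAL not needed for recovery; false if none is left. */
static bool delete_oldest_wal(void) { return false; }
/* Tell the garbage collector so it deactivates consumers of deleted files. */
static void notify_gc(void) {}

static int
fallocate_with_retry(long len)
{
	while (try_fallocate(len) != 0) {
		if (errno != ENOSPC)
			return -1;        /* a real error, report it */
		if (!delete_oldest_wal())
			return -1;        /* nothing left to delete: ENOSPC */
		notify_gc();              /* stale consumers get deactivated */
	}
	return 0;                         /* the user never sees ENOSPC */
}
```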
-
Vladimir Davydov authored
In order to implement WAL auto-deletion, we need a notification channel through which the WAL thread could notify TX that a WAL file was deleted so that the latter can shoot off stale replicas. We will reuse existing wal_watcher API for this. Currently, wal_watcher invokes the registered callback on each WAL write so using it as is would be inefficient. To avoid that, let's allow the caller to specify events of interest when registering a wal_watcher. Needed for #3397
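A rough C sketch of the "events of interest" idea, with made-up names (the real wal_watcher API differs):

```c
/* Illustrative only: a watcher subscribes with an event mask and its
 * callback is invoked only for matching events, so a GC watcher is not
 * woken up on every WAL write. */
#include <stdio.h>

enum wal_event {
	WAL_EVENT_WRITE = 1 << 0,   /* a row was written to the WAL */
	WAL_EVENT_GC    = 1 << 1,   /* an old WAL file was deleted */
};

struct my_watcher {
	unsigned event_mask;                        /* events of interest */
	void (*cb)(struct my_watcher *, unsigned);
};

/* Invoke the callback only if the watcher subscribed to the event. */
static void
watcher_notify(struct my_watcher *w, unsigned events)
{
	events &= w->event_mask;
	if (events != 0)
		w->cb(w, events);
}

static void
gc_cb(struct my_watcher *w, unsigned events)
{
	(void)w;
	if (events & WAL_EVENT_GC)
		printf("a WAL file was deleted, deactivate stale consumers\n");
}

int
main(void)
{
	struct my_watcher gc_watcher = {WAL_EVENT_GC, gc_cb};
	watcher_notify(&gc_watcher, WAL_EVENT_WRITE); /* filtered out */
	watcher_notify(&gc_watcher, WAL_EVENT_GC);    /* delivered */
	return 0;
}
```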
-
Vladimir Davydov authored
We will add another event bitmap to wal_watcher. To avoid confusion between them, let's rename wal_watcher->events.
-
Vladimir Davydov authored
This should make it easier to pass some extra information along with the event mask. For example, we will use it to pass the vclock of the oldest stored WAL, which is needed for WAL auto-deletion. Needed for #3397
-
AKhatskevich authored
-
AKhatskevich authored
Added a -DENABLE_LTO=ON/OFF cmake option, OFF by default. LTO speeds up cpu-intensive workloads by up to 20% (see [1] and [2]). Requirements to enable LTO:
- cmake >= 3.9;
- Linux: ld.bfd / ld.gold from binutils >= 2.31 (or later 2.30) (gold >= 1.15);
- Mac OS: xcode >= 8 (earlier versions are not tested).
The requirement for a recent ld version is due to a bug with exporting symbols from a dynamic list when LTO is enabled, see [3]. Note: -Wlto-type-mismatch on GCC (enabled by default with -flto) gives many warnings; filed [4] to investigate it. Note: LuaJIT will be compiled w/o LTO despite the option being set, filed [5].
[1]: https://github.com/tarantool/tarantool/wiki/performance-research
[2]: https://gist.github.com/Khatskevich/31a2da6ab46ce903120e7a03d65966db
[3]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84901
[4]: https://github.com/tarantool/tarantool/issues/3742
[5]: https://github.com/tarantool/tarantool/issues/3743
Closes #3117
-
AKhatskevich authored
With very aggressive optimizations the compiler can optimize the guard-breaker function away, and the `unit/guard` test would fail.
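For illustration, one generic way to keep such a function from being eliminated (this may not be the exact fix applied to unit/guard):

```c
/* Illustrative only: do the write through a volatile pointer in a
 * noinline function (GCC/Clang attribute), so the compiler cannot
 * prove the store dead and remove the whole function. */
static void __attribute__((noinline))
break_stack_guard(char *guard_addr)
{
	volatile char *p = guard_addr;
	*p = 0;    /* the store must really happen at run time */
}
```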
-
Kirill Yukhin authored
2.0 was renamed to 2.1.
-
Alexander Turenko authored
Updated the test case for #2780 to check the last snapshot file modification time instead of searching log messages. The test was flaky because of small timeouts on Linux, but now we spin on a condition check to achieve both stable results and fast execution. Follows up #2780. Fixes #3684.
-
Alexander Turenko authored
* added more details about hung tests (#107);
* added show_reproduce_content option (#113);
* fixed inspector error reporting for a failed app test;
* expanded the action of the use_unix_socket option to non-default servers;
* updated tarantool-python submodule (#126);
* added test_run:wait_cond() and test_run:wait_log().

Updated the box-py/call.test.py result file, because tarantool-python now uses the CALL 1.7 convention by default and slightly changed yaml output formatting. See [1] and [3] for more information. Updated replication-py/cluster.test.py because of changed tarantool-python internals, see commit [2]. Updated box-py/iproto.test.py because it uses tarantool-python internals that were rewritten in [2]. Updated its result file according to the CALL 1.7 response format that was made the default in [1] and the yaml output formatting changed within [3]. Updated the replication-py/swap.test.py result file because of yaml output formatting that was slightly changed within [3].

[1]: https://github.com/tarantool/tarantool-python/issues/82
[2]: https://github.com/tarantool/tarantool-python/commit/4639d9ae1c48f1608bd599c6d93ed6bfca48fbf9
[3]: https://github.com/tarantool/tarantool-python/issues/90
-
Vladimir Davydov authored
This patch introduces a new xlog method, xlog_fallocate(), that makes sure the requested amount of disk space is available at the current write position. It does that with posix_fallocate(). The new method is called before writing anything to WAL, see wal_fallocate(). In order not to invoke the system call too often, wal_fallocate() allocates disk space in big chunks (1 MB). The reason why I'm doing this is that I want to have a single and clearly defined point in the code to handle ENOSPC errors, where I could delete old WALs and retry (this is what #3397 is about). Needed for #3397
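A minimal sketch of the chunked preallocation idea (hypothetical names and a simplified state model; the real xlog_fallocate() keeps its state inside struct xlog):

```c
/* Illustrative sketch only: ensure `len` more bytes are available at the
 * current write position, extending the file in big chunks so that the
 * syscall is not issued on every WAL write. */
#define _POSIX_C_SOURCE 200112L
#include <sys/types.h>
#include <fcntl.h>

enum { FALLOC_CHUNK = 1024 * 1024 };    /* 1 MB, as in the commit message */

static int
ensure_space(int fd, off_t written, off_t *allocated, off_t len)
{
	if (written + len <= *allocated)
		return 0;                /* still within the preallocated area */
	off_t chunk = len > FALLOC_CHUNK ? len : FALLOC_CHUNK;
	int rc = posix_fallocate(fd, written, chunk);
	if (rc != 0)
		return rc;               /* e.g. ENOSPC: caller may delete old WALs and retry */
	*allocated = written + chunk;
	return 0;
}
```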
-
Vladimir Davydov authored
Memory allocated for vy_write_iterator::src_heap is never freed. Fix it. The leak was introduced by commit c174c985 ("vinyl: implement new simple write iterator").
-
- Oct 24, 2018
-
-
Vladimir Davydov authored
So that we can add more flags.
-
Vladimir Davydov authored
This patch adds a new entry to per-index statistics reported by index.stat():

disk.statement
    inserts
    replaces
    deletes
    upserts

It shows the number of statements of each type stored in run files. The new statistics are persisted in index files. We will need this information so that we can force major compaction when there are too many DELETE statements accumulated in run files. Needed for #3225
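Purely as an illustration of the counters involved (hypothetical names, not the real vinyl statistics structs):

```c
/* Illustrative only: a counter set mirroring the new disk.statement
 * entry (inserts/replaces/deletes/upserts), bumped once per statement
 * written to a run file. */
#include <stdint.h>

enum stmt_type { STMT_INSERT, STMT_REPLACE, STMT_DELETE, STMT_UPSERT };

struct stmt_stat {
	int64_t inserts;
	int64_t replaces;
	int64_t deletes;
	int64_t upserts;
};

/* Called for every statement written to a run file. */
static void
stmt_stat_account(struct stmt_stat *stat, enum stmt_type type)
{
	switch (type) {
	case STMT_INSERT:  stat->inserts++;  break;
	case STMT_REPLACE: stat->replaces++; break;
	case STMT_DELETE:  stat->deletes++;  break;
	case STMT_UPSERT:  stat->upserts++;  break;
	}
}
```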
-
Vladimir Davydov authored
Local variable total_size equals total_stmt_count.bytes_compressed so we don't really need it.
-
Vladimir Davydov authored
tuple_extra() allows storing arbitrary metadata inside tuples. To use it, one should set extra_size when creating a tuple_format. It was introduced for storing an UPSERT counter or a column mask inside vinyl statements. It turned out that it wasn't really needed, as the UPSERT counter can be stored on lsregion while the column mask doesn't need to be stored at all. Actually, the whole idea of tuple_extra() is rather crooked: why would we need it if we can inherit struct tuple instead, as we do in the case of memtx_tuple and vy_stmt? Accessing an inherited struct is much more convenient than using tuple_extra(). So this patch gets rid of tuple_extra(). To do that, it partially reverts the following commits:

6c0842e0 vinyl: refactor vy_stmt_alloc()
74ff46d8 vinyl: add special format for tuples with column mask
11eb7816 Add extra size to tuple_format->field_map_size
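A simplified C sketch of the inheritance pattern the message refers to (field layouts here are made up, not the real struct tuple / vy_stmt definitions):

```c
/* Illustrative only: engine-specific data lives in a derived struct that
 * embeds the common header as its first member, instead of in an opaque
 * tuple_extra() side area. */
#include <stdint.h>

struct tuple {                  /* simplified common header */
	uint32_t refs;
	uint16_t format_id;
	uint32_t bsize;
};

struct vy_stmt {                /* "inherits" struct tuple */
	struct tuple base;      /* must be the first member */
	int64_t lsn;            /* engine-specific metadata lives here */
	uint8_t type;
	uint8_t flags;
};

/* Upcast is a no-op; downcast is a plain cast because base comes first. */
static inline struct vy_stmt *
vy_stmt(struct tuple *tuple)
{
	return (struct vy_stmt *)tuple;
}
```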
-
Vladimir Davydov authored
This function was only used for creating a format for tuples with column mask in vinyl. It is not needed anymore and can be removed. Anyway, it doesn't make much sense to duplicate a tuple format, because it can be referenced instead. Besides, once JSON indexes are introduced, duplicating a tuple format will be really painful. One more reason to drop it now.
-
Vladimir Davydov authored
Finally, these atrocities are not used anywhere and can be removed.
-
Vladimir Davydov authored
This patch is a preparation for removing vy_stmt_column_mask.
-
Vladimir Davydov authored
This patch is a preparation for removing vy_stmt_column_mask.
-
Vladimir Davydov authored
If a REPLACE statement was generated by an UPDATE operation that updated a column indexed by a secondary key, we can turn it into INSERT when the secondary index is dumped, because there can't be an older statement with the same key other than DELETE. Currently, we use the statement column mask to detect such REPLACEs in the write iterator, but I'm planning to get rid of vy_stmt_column_mask so let's instead introduce a new statement flag to mark such REPLACEs.
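A hypothetical illustration of the flag-based approach (names and layout invented for the example):

```c
/* Illustrative only: a bit set on a REPLACE produced by an UPDATE that
 * touched a column indexed by a secondary key, checked at dump time to
 * turn the statement into an INSERT. */
#include <stdint.h>

enum { STMT_UPDATE_REPLACE = 1 << 0 };   /* REPLACE generated by UPDATE */
enum { TYPE_INSERT = 1, TYPE_REPLACE = 2 };

struct stmt {
	uint8_t type;        /* e.g. TYPE_REPLACE or TYPE_INSERT */
	uint8_t flags;
};

/* On secondary index dump: such a REPLACE can't hide an older statement
 * with the same key (other than DELETE), so it may be emitted as INSERT. */
static void
maybe_optimize_on_dump(struct stmt *s)
{
	if (s->type == TYPE_REPLACE && (s->flags & STMT_UPDATE_REPLACE))
		s->type = TYPE_INSERT;
}
```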
-
Vladimir Davydov authored
This patch introduces a helper function vy_perform_update() that performs operations common for UPDATE and UPSERT, namely replaces a tuple in a transaction write set.
-
Vladimir Davydov authored
An UPDATE operation is written as DELETE + REPLACE to secondary indexes. We write those statements to the memory level even if the UPDATE doesn't actually update columns indexed by a secondary key. We filter them out in the write iterator when the memory level is dumped. That's what we use vy_stmt_column_mask for. Actually, there's no point in keeping those statements until dump - we could as well filter them out when the transaction is committed. This would even save some memory. This wouldn't hurt read operations, because point lookup doesn't work for secondary indexes by design, so we have to read all sources, including disk, on every read from a secondary index. With that in mind, let's move the update optimization from the write iterator to vy_tx_commit. This is a step towards removing vy_stmt_column_mask.
-
- Oct 23, 2018
-
-
Alexander Turenko authored
The behaviour change was introduced in cda3cb55: the sync_is_async option was not updated from xdir; sync_interval was missed too, but was restored in 1900c58b. The commit fixes a performance regression of around 6-14% in average RPS on the default nosqlbench workload with 30 seconds duration. Additional information about the benchmarking can be found in #3747. Thanks to Vladimir Davydov (@locker) for the investigation of the cda3cb55 changes. Closes #3747 (cherry picked from commit cd9cc4c5)
-
Alexander Turenko authored
The behaviour change was introduced in cda3cb55: the sync_is_async option was not updated from xdir; sync_interval was missed too, but was restored in 1900c58b. The commit fixes a performance regression of around 6-14% in average RPS on the default nosqlbench workload with 30 seconds duration. Additional information about the benchmarking can be found in #3747. Thanks to Vladimir Davydov (@locker) for the investigation of the cda3cb55 changes. Closes #3747
-
- Oct 19, 2018
-
-
Kirill Yukhin authored
-