Commits · 4ed1c41d0f68721c496cbb12d18007fc15cd88d4 · core / tarantool

Dec 09, 2018

Vladislav Shpilevoy authored 6 years ago

The next patches remove exceptions from sio and convert it
to C. Unused functions are deleted first so that the
conversion does not have to deal with them.

4ed1c41d

wal: trigger checkpoint if there are too many WALs · db8c7aa3

Vladimir Davydov authored 6 years ago

Closes #1082

@TarantoolBot document
Title: Document box.cfg.checkpoint_wal_threshold

Tarantool makes checkpoints every box.cfg.checkpoint_interval seconds
and keeps last box.cfg.checkpoint_count checkpoints. It also keeps all
intermediate WAL files. Currently, it isn't possible to set a checkpoint
trigger based on the sum size of WAL files, which makes it difficult to
estimate the minimal amount of disk space that needs to be allotted to a
Tarantool instance for storing WALs to eliminate the possibility of
ENOSPC errors. For example, under normal conditions a Tarantool instance
may write 1 GB of WAL files every box.cfg.checkpoint_interval seconds
and so one may assume that 1 GB times box.cfg.checkpoint_count should be
enough for the WAL partition, but there's no guarantee it won't write 10
GB between checkpoints when the load is extreme.

So we've agreed that we must provide users with one more configuration
option that could be used to impose the limit on the sum size of WAL
files. The new option is called box.cfg.checkpoint_wal_threshold. Once
the configured threshold is exceeded, the WAL thread notifies the
checkpoint daemon that it's time to make a new checkpoint and delete
old WAL files. Note that the new option only limits the size of WAL files
created since the last checkpoint, because backup WAL files are not
needed for recovery and can be deleted in case of emergency ENOSPC. For
more details, see tarantool/tarantool#1082, tarantool/tarantool#3397,
tarantool/tarantool#3822.

The default value of the new option is 1 exabyte (10^18 bytes), which
effectively means that the feature is disabled.
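
The threshold check described above can be pictured with a minimal sketch
in C. It is not the code from this patch: the struct and function names
are hypothetical, and the only behaviour assumed is what the message
states (count bytes written since the last checkpoint, ask for a
checkpoint once the threshold is exceeded, reset after the checkpoint).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter kept by the WAL writer; not Tarantool's real struct. */
struct wal_size_tracker {
	/* Bytes written to WAL files since the last checkpoint. */
	int64_t size_since_checkpoint;
	/* Configured limit, playing the role of checkpoint_wal_threshold. */
	int64_t threshold;
	/* Set once a checkpoint was requested, to avoid repeated requests. */
	bool checkpoint_requested;
};

/*
 * Account a write of `bytes` and report whether the caller should ask
 * for a checkpoint now. Returns true at most once per checkpoint cycle.
 */
bool
wal_size_tracker_account(struct wal_size_tracker *t, int64_t bytes)
{
	assert(bytes >= 0);
	t->size_since_checkpoint += bytes;
	if (t->checkpoint_requested ||
	    t->size_since_checkpoint <= t->threshold)
		return false;
	t->checkpoint_requested = true;
	return true;
}

/* Reset the counter after a checkpoint has been made. */
void
wal_size_tracker_reset(struct wal_size_tracker *t)
{
	t->size_since_checkpoint = 0;
	t->checkpoint_requested = false;
}
```

With a very large threshold, such as the 10^18 default, the check in this
sketch would practically never fire, which matches the "disabled by
default" behaviour described above.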

db8c7aa3

wal: pass struct instead of vclock to checkpoint methods · f33fbbf8

Vladimir Davydov authored 6 years ago

Currently, we only need to pass a vclock between TX and WAL during
checkpointing. However, in order to implement auto-checkpointing
triggered when WAL size exceeds a certain threshold, we will need to
pass some extra info so that we can properly reset the counter
accounting the WAL size in the WAL thread. To make it possible, let's
move wal_checkpoint struct, which is used internally by WAL to pass a
checkpoint vclock, to the header and require the caller to pass it to
wal_begin/commit_checkpoint instead of just a vclock.

f33fbbf8

Rewrite checkpoint daemon in C · 4c04808a

Vladimir Davydov authored 6 years ago

A long time ago, when the checkpoint daemon was added to Tarantool, it was
responsible not only for making periodic checkpoints, but also for
maintaining the configured number of checkpoints and removing old snap
and xlog files, so it was much easier to implement it in Lua than in C.
However, over time, all its responsibilities have been reimplemented in
C and moved to the server code so that now it just calls box.snapshot()
periodically. Let's rewrite this simple procedure in C as well - this
will allow us to easily add more complex logic there, e.g. triggering
checkpoint when WAL files exceed a configured threshold.

Note, this patch removes a few cases from xlog/checkpoint_daemon test
that tested the internal state of the checkpoint daemon, which isn't
available in Lua anymore. This is OK as those cases are covered by
unit/checkpoint_schedule test.

4c04808a

Introduce checkpoint schedule module · 382568b1

Vladimir Davydov authored 6 years ago

This is a very simple module that incorporates the logic for calculating
the time of the next scheduled checkpoint given the configured interval
between checkpoints. It doesn't have any dependencies, which allows us to
cover it with a unit test. It will be used by the checkpoint daemon once
we rewrite it in C. Rationale: in the future we might want to introduce more
complex rules for scheduling checkpoints (cron-like, maybe) and it will
be really nice to have this logic neatly separated and tested.
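
As an illustration of the kind of calculation such a module performs,
here is a minimal sketch in C. The struct layout and function name are
assumptions made for this example and may not match the module added by
the patch; it only shows how the time until the next checkpoint could be
derived from a start time and a fixed interval.

```c
#include <assert.h>

/* Hypothetical schedule state; field names are illustrative only. */
struct schedule {
	/* Time of the first scheduled checkpoint, in seconds. */
	double start_time;
	/* Interval between checkpoints, in seconds; 0 disables it. */
	double interval;
};

/*
 * Return the number of seconds from `now` until the next scheduled
 * checkpoint, or 0 if the schedule is disabled.
 */
double
schedule_timeout(const struct schedule *s, double now)
{
	assert(s->interval >= 0);
	if (s->interval == 0)
		return 0;
	if (now < s->start_time)
		return s->start_time - now;
	double elapsed = now - s->start_time;
	/* Number of whole intervals that have already passed. */
	long long periods = (long long)(elapsed / s->interval);
	return s->interval - (elapsed - periods * s->interval);
}
```

Because the function depends only on its arguments, it can be exercised
by a unit test without starting a server, which is the property the
message emphasizes.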

382568b1

gc: some renames · bdb6825b

Vladimir Davydov authored 6 years ago

GC module is responsible not only for garbage collection, but also for
tracking consumers and making checkpoints. Soon it will also incorporate
the checkpoint daemon. Let's prefix all members related to the cleanup
procedure accordingly to avoid confusion.

bdb6825b

box: move checkpointing to gc module · 4ef765d5

Vladimir Davydov authored 6 years ago

Garbage collection module seems to be the best way to accommodate the
checkpoint daemon, but to move it there, we first need to move the code
performing checkpoints there to avoid cross-dependency between box.cc
and gc.c.

4ef765d5

box: don't use box_checkpoint_is_in_progress outside box.cc · b4650492

Vladimir Davydov authored 6 years ago

We only use box_checkpoint_is_in_progress in the SIGUSR1 signal handler to
print a warning in case checkpointing cannot be started because it's
already being done by another fiber. Actually, it's not necessary to use it
there - instead we can simply log the error returned by box_checkpoint,
which will be the self-explanatory ER_CHECKPOINT_IN_PROGRESS in this case.
So let's make box_checkpoint_is_in_progress private to box.cc - this
will simplify moving the checkpoint daemon to the gc module.

While we are at it, remove the unused snapshot_version declaration.

b4650492

box: fix certain cfg options initialized twice on recovery · a6f22d19

Vladimir Davydov authored 6 years ago

Certain dynamic configuration options are initialized right in box.cc,
because they are needed for recovery. All such options are supposed to
be present in dynamic_cfg_skip_at_load table so that load_cfg.lua won't
try to set them again upon recovery completion. However, not all of them
happen to be there - sometimes we simply forgot to patch this table after
introducing a new configuration option. This patch adds all the
missing ones except checkpoint_count - there's no point in initializing
checkpoint_count in box.cc, so the patch removes it from box.cc instead.

a6f22d19

wal: simplify watcher API · f2db4075

Vladimir Davydov authored 6 years ago

This patch reverts changes done in order to make WAL watcher API
suitable for notifying TX about WAL garbage collection triggered on
ENOSPC, namely:

 b073b017 wal: add event_mask to wal_watcher
 7077341e wal: pass wal_watcher_msg to wal_watcher callback

We don't need them anymore, because now we piggyback the notification
on the WAL request message that triggered ENOSPC.

f2db4075

gc: do not use WAL watcher API for deactivating stale consumers · d32166a6

Vladimir Davydov authored 6 years ago

The WAL thread may delete old WAL files if it gets ENOSPC error.
Currently, we use WAL watcher API to notify the TX thread about it so
that it can shoot off stale replicas. This looks ugly, because WAL
watcher API was initially designed to propagate WAL changes to relay
threads and the new event WAL_EVENT_GC, which was introduced for
notifying about ENOSPC-driven garbage collection, isn't used anywhere
else. Besides, there's already a pipe from WAL to TX - we could reuse it
instead of opening another one.

If we followed down that path, then in order to trigger a checkpoint
from the WAL thread (see #1082), we would have to introduce yet another
esoteric WAL watcher event, making the whole design look even uglier.
That said, let's rewrite the garbage collection notification procedure
using a plain callback instead of abusing WAL watcher API.

d32166a6

netbox: fix wait_connected ignorance · a539fdc2

Vladislav Shpilevoy authored 6 years ago

After patch d2468dac it became possible to
wrap an existing connection into netbox API. The regular
netbox.connect function was refactored to reuse the
connection establishment code.

But a connection should be established in a worker
fiber, not in the caller's one. Otherwise it is
impossible not to wait for the connect result.

The patch just moves connection establishment into a
worker fiber, without any functional changes.

Closes #3856

a539fdc2

Fix premature cdata collecting in luaT_pusherror() · 88de7c34

Alexander Turenko authored 6 years ago

This is a follow-up to 28c7e667 that fixes
luaT_pusherror() itself, not only luaT_error().

Fixes #1955 (again).

88de7c34

Dec 06, 2018

test: replication parallel mode on · f5c8b825

Sergei Voronezhskii authored 6 years ago

Part of #2436, #3232

f5c8b825

test: use wait_cond to check follow status · f41548b7

Sergei Voronezhskii authored 6 years ago

After setting timeouts in `box.cfg` and before making a `replace`, the test
needs to wait for the replicas to reach the `follow` status. If `wait_follow()`
finds a status other than `follow`, it returns true, which immediately causes
an error.

Fixes #3734
Part of #2436, #3232

f41548b7

test: put require in proper places · d2f28afa

Sergei Voronezhskii authored 6 years ago

* put `require('fiber')` after each switch server command, because
  the test sometimes failed with a 'fiber' not defined error
* use `require('fio')` after `require('test_run').new()`, because
  the test sometimes failed with a 'fio' not defined error

Part of #2436, #3232

d2f28afa

test: errinj for pause relay_send · 1c34c91f

Sergei Voronezhskii authored 6 years ago

Instead of using a timeout, we just need to pause `relay_send`. We can't
rely on a timeout because the system load varies in parallel mode. Add a
new errinj that checks a boolean in a loop and, as long as it is `True`,
does not let `relay_send` proceed to the next statement.

To check the read-only mode, the test needs to modify a tuple. Calling
the `replace` method is enough; there is no need to call `delete` and
then uselessly verify with `get` that the tuple has not been deleted.

Also, look up the xlog files in a loop with a short sleep until the file
count is as expected.
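
The "check in a loop with a short sleep" approach can be sketched in C as
follows. The tests themselves are written in Lua, so this is only an
illustration of the polling idea; `wait_cond`, `cond_f` and the
`countdown` predicate are hypothetical names made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <time.h>

/* Predicate type: returns true once the awaited condition holds. */
typedef bool (*cond_f)(void *arg);

/*
 * Call `cond` up to `max_attempts` times, sleeping `step_ms`
 * milliseconds after each failed attempt. Returns true as soon as the
 * condition holds and false if it never did.
 */
bool
wait_cond(cond_f cond, void *arg, int max_attempts, long step_ms)
{
	struct timespec step = {
		.tv_sec = step_ms / 1000,
		.tv_nsec = (step_ms % 1000) * 1000000L,
	};
	for (int i = 0; i < max_attempts; i++) {
		if (cond(arg))
			return true;
		nanosleep(&step, NULL);
	}
	return false;
}

/* Example predicate: succeeds once it has been called `needed` times. */
struct countdown {
	int calls;
	int needed;
};

bool
countdown_done(void *arg)
{
	struct countdown *c = arg;
	return ++c->calls >= c->needed;
}
```

Unlike a single fixed timeout, the loop returns as soon as the condition
is met and only gives up after the attempt budget is spent, which makes
it less sensitive to system load.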

Update box/errinj.result because new errinj was added.

Part of #2436, #3232

1c34c91f

test: cleanup replication tests · 848a0b03

Sergei Voronezhskii authored 6 years ago

- at the end of tests that create any replication config, call:
  * `test_run:cmd('delete server ...')`, which removes the server object
    from the `TestState.servers` list; this behaviour was taken
    from the `test_run:drop_cluster()` function
  * `test_run:cleanup_cluster()`, which clears `box.space._cluster`
- switch on `use_unix_sockets` because of 'Address already in use'
  problems
- the `once` test needs to clean up the `once*` schemas

Part of #2436, #3232

848a0b03

sql: fix tarantoolSqlite3TupleColumnFast · 2bfe8ac5

Kirill Shcherbatov authored 6 years ago

The tarantoolSqlite3TupleColumnFast routine used to look up
offset_slot in unallocated memory in some cases.
The assert on exact_field_count, like the motivation given in
7a8de281 for replacing the old, correct assert on field_count,
is wrong:
assert(format->exact_field_count == 0 ||
       fieldno < format->exact_field_count);
The tarantoolSqlite3TupleColumnFast routine requires an offset_slot
that has been allocated during the tuple_format_create call. This
value is stored in an indexed field whose index is limited by
index_field_count, which is <= field_count. See
tuple_format_alloc for more details.

The format in the cursor that triggers the valid assertion has
this structure because the first 4 tuples in _space (257, 272,
276 and 280) have an old _space format with only one field
(format->field_count == 1).
This happens because these 4 tuples are recovered no later than
the tuple with id 280, which stores the actual format of _space.
After tuple 280 is recovered, the actual format is set in struct
space of _space and all subsequent tuples have full-featured
formats.

So for these 4 tuples tarantoolSqlite3TupleColumnFast can fail
even if a field exists, is indexed and has a name. Those
features are just described in a newer format.
(thanks to Gerold103 for explaining the problem)

Closes #3772

2bfe8ac5

sql: fix parser.parse_only mode for triggers · ac73e345

Kirill Shcherbatov authored 6 years ago

Because the parse_only flag did not work correctly for SQL triggers,
sql_trigger_compile leaked Vdbe memory.

Closes #3838

ac73e345

box: fix checkpoint_delete · 3c8330ea

Kirill Shcherbatov authored 6 years ago

The rlist_foreach_entry iterator was used for freeing resources.
As a result, freed memory was accessed on the next step of the
for-loop.
Replaced it with rlist_foreach_entry_safe, which is valid for destructors.
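
The difference between the two iteration styles can be shown with a
minimal sketch in C. It uses a hand-written singly linked list rather
than Tarantool's rlist, so the type and function names are assumptions;
the point is only that the successor must be read before the current
node is freed.

```c
#include <stdlib.h>

/* Minimal singly linked list standing in for a real intrusive list. */
struct node {
	struct node *next;
	int value;
};

/*
 * Prepend a new node. Returns the new head, or NULL if allocation
 * fails (the old list is left untouched in that case).
 */
struct node *
list_push(struct node *head, int value)
{
	struct node *n = malloc(sizeof(*n));
	if (n == NULL)
		return NULL;
	n->value = value;
	n->next = head;
	return n;
}

/*
 * Free every node. The successor is saved before free() so the loop
 * never reads from memory that has already been released.
 * Returns the number of nodes freed.
 */
int
list_destroy(struct node *head)
{
	int freed = 0;
	struct node *next;
	for (struct node *n = head; n != NULL; n = next) {
		next = n->next; /* read before the node is freed */
		free(n);
		freed++;
	}
	return freed;
}
```

A loop written as `n = n->next` in the increment expression would read
the `next` field of a node that was just freed, which is the kind of
dirty access the message describes.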

Closes #3858

3c8330ea

Dec 04, 2018

info: remove false comments from src/info.h · c6e5bf48

Vladislav Shpilevoy authored 6 years ago

c6e5bf48

box: move info_handler interface into src/info · f1a114ca

Vladislav Shpilevoy authored 6 years ago

Box/info.h defines the info_handler interface with a set
of virtual functions. It allows hiding Lua from code
that does not depend on this language, and is used in
things like index:info() and box.info() to build a Lua
table with some info. But it does not depend on box/,
so move it to src/.

Also, this API is needed for the forthcoming SWIM
module which is going to be placed into src/lib and
needs info to dump its state to Lua from C without
strict Lua dependency.

@locker:
 - remove pointless _GNU_SOURCE definition from
   box/lua/info.c
 - remove luaT_info_handler_create declaration from
   box/lua/info.h

Needed for #3234

f1a114ca

Dec 03, 2018

lua: getpwall/getgrall error handling - follow-up fixes · a1606e91

Vladimir Davydov authored 6 years ago

 - Add the forgotten errno(0) to getgrall.
 - Throw errors from getgrall/getpwall instead of returning nil in
   case the underlying system function fails.
 - Fix the error message in getgr.
 - Remove pointless and confusing asterisk sign from error messages.
 - Do not hide a stack frame on error.

Follow-up efccac69 ("lua: fix error handling in getpwall and
getgrall").

a1606e91

lua: fix error handling in getpwall and getgrall · efccac69

Alexander Turenko authored 6 years ago

This commit fixes app-tap/pwd.test.lua test. It seems that the problem
appears after updating to glibc-2.28.

It seems that the usual way to handle errors in Unix is to check errno only
when a return value indicates the possibility of an error.
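
The idiom can be illustrated with a minimal sketch in C. It deliberately
uses close() instead of getpwent()/getgrent(), because close() fails
deterministically on an invalid descriptor; `close_checked` is a
hypothetical helper written for this example and is not part of the fix.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Close `fd` and return 0 on success or the errno value on failure.
 * errno is consulted only after close() has reported an error via -1,
 * so a stale errno left by an earlier call is never mistaken for a
 * failure of this one.
 */
int
close_checked(int fd)
{
	if (close(fd) == -1)
		return errno;
	return 0;
}
```

For functions such as getpwent(), where a NULL return means either "no
more entries" or "error", the same idea is usually combined with setting
errno to 0 before the call, so that a non-zero errno after a NULL return
can be attributed to that call.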

Related to #3766.

efccac69

Remove deprecated getaddrinfo() flags · b601d0be

Alexander Turenko authored 6 years ago

AI_IDN_ALLOW_UNASSIGNED and AI_IDN_USE_STD3_ASCII_RULES flags are
deprecated by glibc-2.28 and the deprecation warnings caused the
Debug build to fail because of -Werror.

Fixes #3766.

b601d0be

box: move port to src/ · 1730b39a

Vladislav Shpilevoy authored 6 years ago

Basic port structure does not depend on anything but
standard types. It just gives an interface and calls
virtual functions.

Its location in box/ was ok since it was not used
anywhere in src/. But next commits will add a new
method to mpstream to dump a port. Mpstream is
implemented in src/, so let's move port over here.

Needed for #3505

1730b39a

test: fix app/fiber.test.lua flaky fails · 0e19478c

Alexander Turenko authored 6 years ago

Fixes #3852.

0e19478c

test: fix hardcoded port in box/net.box.test.lua · f36568c0

Alexander Turenko authored 6 years ago

This allows running the test many times in parallel to investigate flaky
test failures and decreases the probability that the test fails because
the port is already used by, say, some other test.

f36568c0

test: fix http_client.test.lua with curl-7.62 · 10518cc1

Alexander Turenko authored 6 years ago

curl-7.61.1

```
tarantool> require('http.client').new():get('http://localhost:0')
---
- status: 595
  reason: Couldn't connect to server
```

curl-7.62

```
tarantool> require('http.client').new():get('http://localhost:0')
---
- error: 'curl: URL using bad/illegal format or missing URL'
...
```

curl-7.62 returns CURLE_URL_MALFORMAT in case of a zero port and tarantool
raises an error in that case. I think this behaviour is valid, so I fixed
the test.

10518cc1

Nov 29, 2018

gc: run garbage collection in background · 07191842

Vladimir Davydov authored 6 years ago

Currently, garbage collection is executed synchronously by functions
that may trigger it, such as gc_consumer_advance or gc_add_checkpoint.
As a result, one has to be very cautious when using those functions as
they may yield at their will. For example, we can't shoot off stale
consumers right in tx_prio handler - we have to use rather clumsy WAL
watcher interface instead. Besides, in future, when the garbage
collector state is persisted, we will need to call those functions from
on_commit trigger callback, where yielding is not normally allowed.

Actually, there's no reason to remove old files synchronously - we could
as well do it in the background. So this patch introduces a background
garbage collection fiber that executes gc_run when woken up. Now all
functions that might trigger garbage collection wake up this fiber
instead of executing gc_run directly.

07191842

recovery: restore garbage collector vclock after restart · baf28a59

Vladimir Davydov authored 6 years ago

After restart the garbage collector vclock is reset to the vclock of the
oldest preserved checkpoint, which is incorrect - it may be less in case
there is a replica that lagged behind, and it may be greater as well in
case the WAL thread hit ENOSPC and had to remove some WAL files to
continue. Fix it.

A note about xlog/panic_on_wal_error test. To check that replication
stops if some xlogs are missing, the test first removes xlogs on the
master, then restarts the master, then tries to start the replica
expecting that replication should fail. Well, it shouldn't - the replica
should rebootstrap instead. It didn't rebootstrap before this patch
though, because the master reported wrong garbage collector vclock (as
it didn't recover it on restart). After this patch the replica would
rebootstrap and the test would hang. Fix this by restarting the master
before removing xlog files.

baf28a59

wal: remove files needed for recovery from backup checkpoints on ENOSPC · bd7f7116

Vladimir Davydov authored 6 years ago

Tarantool always keeps box.cfg.checkpoint_count latest checkpoints. It
also never deletes WAL files needed for recovery from any of them for
the sake of redundancy, even if it gets ENOSPC while trying to write to
WAL. This patch changes that behavior: now the WAL thread is allowed to
delete backup WAL files in case of emergency ENOSPC - after all it's
better than stopping operation.

Closes #3822

bd7f7116

wal: separate checkpoint and flush paths · 74d8db74

Vladimir Davydov authored 6 years ago

Currently, wal_checkpoint() is used for two purposes. First, to make a
checkpoint (rotate = true). Second, to flush all pending WAL requests
(rotate = false). Since checkpointing has to fail if cascading rollback
is in progress, so does flushing. This is confusing. Let's separate the
two paths.

While we are at it, let's also rewrite WAL checkpointing using cbus_call
instead of cpipe_push as it's a more convenient way of exchanging simple
two-hop messages between two threads.

74d8db74

json: some renames · b56103f5

Kirill Shcherbatov authored 6 years ago

We are planning to link json_path_node objects in a tree and attach some
extra information to them so that they could be used to describe a json
document structure. Let's rename it to json_token as it sounds more
appropriate for the purpose.

Also, rename json_path_parser to json_lexer as it isn't a parser,
really, it's rather a tokenizer or lexer. Besides, the new name is
shorter.

Needed for #1012

b56103f5

test: fix vinyl/errinj spurious failure · 8e13153b

Vladimir Davydov authored 6 years ago

The failing test case checks that modifications done to the space during
the final dump of a newly built index are recovered properly. It assumes
that a series of operations will complete in 0.1 seconds, but it may not
happen if the disk is slow (like on Travis CI). This results in spurious
failures. To fix this issue, let's replace ERRINJ_VY_RUN_WRITE_TIMEOUT
used by the test with ERRINJ_VY_RUN_WRITE_DELAY, which blocks index
creation until it is disabled instead of injecting a time delay as its
predecessor did.

Closes #3756

8e13153b

Don't repeat SQL stress tests with vinyl engine. · 6e07131d

Konstantin Osipov authored 6 years ago

These tests stress some of the parser/vdbe features, so there is no
point in replaying them against vinyl. They could just as well run in
wal_mode="none".

6e07131d

Disable gh-3332-tuple-format-leak.test, gh-3083-ephemeral-unref-tuples.test · 52a212f3

Konstantin Osipov authored 6 years ago

Disable these tests in regular suite until they are sped up in scope
of gh-3845

52a212f3

lua: moving lua error functions to separate file · 27a04953

Ilya Markov authored 6 years ago

Refactoring. Move lua error functions to a separate file.

A prerequisite for #677

27a04953

test: skip test backtrace if no libunwind support · 2aa25ba5

Sergei Voronezhskii authored 6 years ago

Closes #3824

2aa25ba5