- Jun 02, 2018
-
Konstantin Osipov authored
Fix a compiler warning with clang 6
-
- Jun 01, 2018
-
Vladimir Davydov authored
The callback invoked upon compaction completion uses checkpoint_last() to determine whether compacted runs may be deleted: if the max LSN stored in a compacted run (run->dump_lsn) is greater than the LSN of the last checkpoint (gc_lsn), then the run doesn't belong to the last checkpoint and hence is safe to delete, see commit 35db70fa ("vinyl: remove runs not referenced by any checkpoint immediately").

The problem is that checkpoint_last() isn't synced with vylog rotation: it returns the signature of the last successfully created memtx snapshot and is updated in memtx_engine_commit_checkpoint() after vylog is rotated. If a compaction task completes after vylog is rotated but before the snap file is renamed, it will assume that the compacted runs do not belong to the last checkpoint, although they do (as they have been appended to the rotated vylog), and delete them.

To eliminate this race, let's use the vylog signature instead of the snap signature in vy_task_compact_complete(). Closes #3437
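A minimal sketch of the corrected check, assuming a vy_log_signature() accessor for the rotated vylog's signature (the accessor name and stub value are illustrative, not the exact tarantool API):

    #include <stdbool.h>
    #include <stdint.h>

    /* Assumed accessor: signature of the last rotated vylog (stubbed). */
    static int64_t vy_log_signature(void) { return 100; }

    /* A compacted run may be deleted immediately only if it was
     * created after the checkpoint the vylog was rotated for. */
    static bool
    compacted_run_is_garbage(int64_t run_dump_lsn)
    {
        int64_t gc_lsn = vy_log_signature();
        return run_dump_lsn > gc_lsn;
    }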
-
- May 31, 2018
-
Vladimir Davydov authored
latch_destroy() and fiber_cond_destroy() are basically no-ops. All they do is check that the latch/cond is not in use. When a global latch or cond object is destroyed at exit, it may still have users, and this is OK as we don't stop fibers at exit. In vinyl this results in the following false-positive assertion failures:

src/latch.h:81: latch_destroy: Assertion `l->owner == NULL' failed.
src/fiber_cond.c:49: fiber_cond_destroy: Assertion `rlist_empty(&c->waiters)' failed.

Remove "destruction" of vy_log::latch to suppress the first one. Wake up all fibers waiting on vy_quota::cond before destruction to suppress the second one. Add some test cases. Closes #3412
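A minimal sketch of the second fix, draining waiters with tarantool's fiber_cond_broadcast() before the destroy; the surrounding shutdown function is illustrative and the cond API is only declared here (it links against tarantool's fiber library):

    struct fiber_cond;
    void fiber_cond_broadcast(struct fiber_cond *cond);
    void fiber_cond_destroy(struct fiber_cond *cond);

    static void
    vy_quota_shutdown(struct fiber_cond *cond)
    {
        /* Wake every fiber still blocked on the quota so that
         * rlist_empty(&c->waiters) holds inside the destroy. */
        fiber_cond_broadcast(cond);
        fiber_cond_destroy(cond);
    }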
-
- May 29, 2018
-
Georgy Kirichenko authored
Handle the case when instance_uuid and replicaset_uuid are present in box.cfg and have the same values as those already set. Fixes #3421
-
- May 25, 2018
-
Konstantin Osipov authored
replication: make replication_connect_timeout dynamic
-
Konstantin Osipov authored
-
Vladimir Davydov authored
replicaset_sync() returns not only when the instance has synchronized with the connected replicas, but also when some replicas have disconnected and the quorum can't be formed any more. Nevertheless, it always prints that sync has been completed. Fix it. See #3422
-
Vladimir Davydov authored
If a replica disconnects while sync is in progress, box.cfg{} may stop syncing, leaving the instance in 'orphan' mode. This will happen if not enough replicas are connected to form a quorum. This makes sense e.g. on a network error, but not when a replica is loading, because in the latter case it should be up and running quite soon. Let's take into account replicas that disconnected because they haven't completed initial configuration yet, and continue syncing if connected + loading > quorum. Closes #3422
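A minimal sketch of the adjusted decision, with illustrative state fields; the comparison below follows the wording of the message above, and the exact operator in the real patch may differ:

    #include <stdbool.h>

    struct sync_state {
        int connected; /* replicas we are still syncing with */
        int loading;   /* replicas lost because they are bootstrapping */
        int quorum;    /* box.cfg.replication_connect_quorum */
    };

    static bool
    keep_syncing(const struct sync_state *s)
    {
        /* A loading replica will be back soon, so it still counts
         * toward the quorum; one lost to a network error does not. */
        return s->connected + s->loading > s->quorum;
    }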
-
Konstantin Belyavskiy authored
Small refactoring: remove 'enum replica_state', since we reuse a subset of the applier state machine ('enum applier_state') to check if we have achieved replication quorum and hence can leave read-only mode.
-
Konstantin Osipov authored
The default of 4 seconds is too low to bootstrap a large cluster.
-
- May 24, 2018
-
Georgy Kirichenko authored
In some cases, when an applier yielded during processing, another applier might start a conflicting operation and break replication and database consistency. Now an applier locks a per-server-id latch before processing a transaction. This guarantees that at any given moment there is only one in-progress applier request per server. The problem was very rare until full-mesh topologies in vinyl became commonplace. Fixes gh-3339
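A minimal sketch of the per-server-id serialization, with a stand-in latch type (tarantool's real latch_lock()/latch_unlock() block the calling fiber rather than setting a flag):

    #include <assert.h>

    enum { VCLOCK_MAX = 32 };

    struct latch { int locked; };                 /* stand-in type */
    static void latch_lock(struct latch *l)   { l->locked = 1; }
    static void latch_unlock(struct latch *l) { l->locked = 0; }

    static struct latch order_latch[VCLOCK_MAX];  /* one per server id */

    static void
    applier_apply_tx(unsigned replica_id /*, rows ... */)
    {
        assert(replica_id < VCLOCK_MAX);
        struct latch *l = &order_latch[replica_id];
        latch_lock(l);   /* at most one in-flight tx per server id */
        /* ... apply the transaction; this may yield ... */
        latch_unlock(l);
    }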
-
- May 22, 2018
-
Konstantin Osipov authored
Avoid goto, a follow-up on gh-3257.
-
Konstantin Belyavskiy authored
Another broken case: adding a new replica to the cluster. With the check

+ if (replica->applier->remote_is_ro &&
+     replica->applier->vclock.signature == 0)

we may get ER_READONLY, since the signature is not 0. So leader election now has two phases:
1. Select among read-write replicas.
2. If none are found, fall back to the old algorithm for backward compatibility (the case when all replicas are present in the cluster table).
Closes #3257
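A minimal sketch of the two-phase selection, with illustrative replica fields; the real election also compares vclock signatures, while phase 1 here simply takes the first read-write candidate:

    #include <stdbool.h>
    #include <stddef.h>

    struct replica_info {
        bool read_only;
        long long signature;
    };

    static const struct replica_info *
    choose_bootstrap_leader(const struct replica_info *r, size_t n)
    {
        /* Phase 1: prefer a read-write replica. */
        for (size_t i = 0; i < n; i++)
            if (!r[i].read_only)
                return &r[i];
        /* Phase 2: no read-write replica; fall back to the old
         * algorithm for backward compatibility (simplified here). */
        return n > 0 ? &r[0] : NULL;
    }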
-
Konstantin Osipov authored
-
- May 17, 2018
-
Vladimir Davydov authored
If a compacted run was created after the last checkpoint, it is not needed for recovery from any checkpoint and hence can be deleted right away to save disk space. Closes #3407
-
- May 15, 2018
-
Vladimir Davydov authored
Improve the test by decreasing range_size so that it creates a lot of ranges for the test indexes, not just one. This helped find the bugs causing the crash described in #3393. Follow-up #3393
-
Vladimir Davydov authored
Although the bug in vy_task_dump_complete() due to which a tuple could be lost during dump was fixed, there may still be affected deployments, as the bug's effects were persisted on disk. To avoid occasional crashes on such deployments, let's make vinyl_iterator_secondary_next() skip tuples that are present in a secondary index but missing from the primary one. Closes #3393
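A minimal sketch of the skip, with the vinyl iterator plumbing reduced to two illustrative callbacks:

    #include <stddef.h>

    struct tuple;
    /* Illustrative callbacks standing in for the vinyl iterators. */
    struct tuple *secondary_index_next(void);
    struct tuple *primary_index_lookup(struct tuple *key);

    static struct tuple *
    iterator_secondary_next(void)
    {
        struct tuple *stmt;
        while ((stmt = secondary_index_next()) != NULL) {
            struct tuple *tuple = primary_index_lookup(stmt);
            if (tuple == NULL)
                continue;  /* persisted ghost: skip, don't crash */
            return tuple;
        }
        return NULL;  /* EOF */
    }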
-
Vladimir Davydov authored
vy_task_dump_complete() creates a slice for each range overlapping with the newly written run. It uses vy_range_tree_psearch(min_key) to find the first overlapping range and nsearch(max_key) to find the range immediately following the last overlapping one. This is incorrect, as the nsearch rb-tree method returns the element matching the search key if it is present in the tree. That is, if the max key written to a run turns out to be equal to the beginning of a range, no slice will be created for that range and it will be silently and persistently lost. The issue manifests itself as a crash in vinyl_iterator_secondary_next(), when we fail to find the tuple in the primary index corresponding to a statement found in a secondary index. Part of #3393
-
Vladimir Davydov authored
vy_run_iterator_seek() is supposed to check that the resulting statement matches the search key in the case of ITER_EQ, but if the search key lies at the beginning of the slice, it doesn't. As a result, vy_point_lookup() may fail to find an existing tuple, as demonstrated below. Suppose we are looking for key {10} in the primary index, which consists of an empty mem and two runs:

run 1: DELETE{15}
run 2: INSERT{10}

vy_run_iterator_next() returns DELETE{15} for run 1 because of the missing EQ check, so vy_point_lookup() stops at run 1 (since a terminal statement is found) and mistakenly returns NULL. The issue manifests itself as a crash in vinyl_iterator_secondary_next(), when we fail to find the tuple in the primary index corresponding to a statement found in a secondary index. Part of #3393
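A minimal sketch of the missing post-condition, with an illustrative comparator (tarantool's real code compares statements via the key definition):

    #include <stddef.h>

    enum iterator_type { ITER_EQ, ITER_GE, ITER_LE };
    struct vy_stmt;

    /* Illustrative comparator: returns 0 when key and stmt match. */
    int stmt_compare(const struct vy_stmt *key, const struct vy_stmt *stmt);

    static const struct vy_stmt *
    seek_check_eq(enum iterator_type type, const struct vy_stmt *key,
                  const struct vy_stmt *found)
    {
        /* Without this, the seek returns DELETE{15} for key {10}
         * and the point lookup stops one run too early. */
        if (found != NULL && type == ITER_EQ &&
            stmt_compare(key, found) != 0)
            return NULL;
        return found;
    }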
-
Alexander Turenko authored
Follows up #3396.
-
- May 14, 2018
-
Alexander Turenko authored
It prevents the result from being overwritten by another thread after coio_call() but before lua_pushlstring(). Such a case is possible because libeio uses a thread pool internally, and the static __thread storage can be reused before lua_pushlstring() if many parallel digest.pbkdf2() calls are in flight. Fixes #3396.
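A minimal sketch of the per-call output buffer pattern with tarantool's coio_call(); the pbkdf2 argument structure and callback name are illustrative:

    #include <stdarg.h>
    #include <sys/types.h>

    ssize_t coio_call(ssize_t (*func)(va_list ap), ...);

    enum { DIGEST_LEN = 32 };

    struct pbkdf2_args {
        /* in: password, salt, iterations ... */
        unsigned char digest[DIGEST_LEN]; /* out: owned by this call */
    };

    static ssize_t
    pbkdf2_cb(va_list ap)
    {
        struct pbkdf2_args *args = va_arg(ap, struct pbkdf2_args *);
        /* Compute into args->digest instead of a shared static
         * __thread buffer, so a parallel call can't clobber it
         * before lua_pushlstring() copies it out. */
        (void)args;
        return 0;
    }

    /* usage: coio_call(pbkdf2_cb, &args);
     *        lua_pushlstring(L, (char *)args.digest, DIGEST_LEN); */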
-
- May 08, 2018
-
Ilya Markov authored
In a sequential launch of app-tap/console.test, the tests failed with "User exists" and binding errors. Make the socket paths relative and add user cleanup. Relates to #3168
-
- May 07, 2018
-
Georgy Kirichenko authored
Any DDL is prohibited in a multi-statement transaction, so there is no reason to try to lock the DDL latch in this case. Locking an already locked latch causes a yield and a silent transaction rollback, which will crash the tarantool server or trigger an assertion. Fixes #2783
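A minimal sketch of the guard, using tarantool's in_txn(); the latch name and the error-raising part are illustrative:

    struct txn;
    struct txn *in_txn(void);  /* current fiber's transaction, if any */

    struct latch;
    void latch_lock(struct latch *l);

    extern struct latch schema_lock;  /* assumed DDL latch */

    static int
    ddl_begin(void)
    {
        if (in_txn() != NULL) {
            /* Fail fast: taking the latch here could yield and
             * silently roll the transaction back. */
            return -1;  /* diag_set(ClientError, ...) */
        }
        latch_lock(&schema_lock);
        return 0;
    }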
-
- May 05, 2018
-
Vladislav Shpilevoy authored
Spurious wakeups are possible in the console; they make readline think that there is data on stdin. A woken-up readline returns garbage instead of a string, which crashes the server on an assertion in Lua. Closes #3343
-
- May 03, 2018
-
Vladislav Shpilevoy authored
Any base64 option leads to URL-safe encoding. This is wrong and is caused by incorrect flag checking. Fix it. Closes #3358
-
Konstantin Osipov authored
* rename request_limit.test.lua to net_msg_max.test.lua
* make net_msg_max.test.lua stable (courtesy of @Gerold103)
* exclude disconnect messages from the iproto_msg_max limit
* add a separate warning for throttling based on readahead buffer overflow
-
Vladislav Shpilevoy authored
Starting with 1.9, a CALL request which yields releases the input buffer in the net thread before the CALL is complete. A release trigger is fired when the CALL fiber yields. The problem is that by default the input socket is not included in the poll() list of the event loop: thanks to an optimization by @kostja for the strict request/response scenario, the socket is included in the poll() list only after the response is sent to the client. Thus, the following could happen:
* a client sends a long-polling request
* the request yields and maybe never finishes
* the socket is not being read until the long-polling request is finished
The patch is to explicitly feed an EV_READ event to the event loop on the client socket whenever we release the input buffer for a long-polling request. We may remove iproto_resume() from net_discard_input() along with this patch, since iproto_resume() will be called by iproto_connection_on_input().
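A minimal sketch of the fix with libev's stock ev_feed_event(); the connection layout is illustrative:

    #include <ev.h>

    struct iproto_connection {
        struct ev_loop *loop;
        struct ev_io input;   /* read watcher on the client socket */
    };

    static void
    net_discard_input(struct iproto_connection *con)
    {
        /* ... release the input buffer occupied by the request ... */

        /* Pretend the socket became readable: the loop will invoke
         * the input callback even though the watcher is not in the
         * poll set until the response is sent. */
        ev_feed_event(con->loop, &con->input, EV_READ);
    }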
-
- Apr 18, 2018
-
Ilya Markov authored
When a tuple in an insert/replace request has a NULL value in the field incremented by a sequence, the request body is changed: the NULL is replaced by a value taken from the sequence. But the request header is not updated. So the redo log, which takes the body from the header if the header exists, writes the old version of the request to the WAL. Fix this by updating the header value after handling the sequence. Closes #3247
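A minimal sketch of re-pointing the header at the rewritten body; the xrow_header shape here is reduced to the fields that matter for the illustration:

    #include <stddef.h>
    #include <sys/uio.h>

    struct xrow_header {
        struct iovec body[1];  /* reduced: the real struct has more */
        int bodycnt;
    };

    static void
    request_sync_header(struct xrow_header *h,
                        void *new_body, size_t new_body_len)
    {
        /* Point the header at the body that now carries the value
         * taken from the sequence, so the WAL sees the new version. */
        h->body[0].iov_base = new_body;
        h->body[0].iov_len = new_body_len;
        h->bodycnt = 1;
    }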
-
Konstantin Belyavskiy authored
This commit is related to 6d81fa99. With replication_connect_quorum=0 set, the previous commit broke replication, since the applier_resume() and applier_start() parts were skipped. Fix it and add more test cases. Closes #3278
-
Konstantin Belyavskiy authored
When bootstrapping a new cluster, each replica from the replica set can be chosen as a leader, but if it is 'read-only', bootstrap will fail with an error. Fix this by excluding read-only replicas from voting: add access rights information to the IPROTO_REQUEST_VOTE reply. Closes #3257
-
- Apr 11, 2018
-
Georgy Kirichenko authored
There was an invalid refactoring with a forgotten renaming.
-
Arseny Antonov authored
The reason for the failures is the TLSv1.0/TLSv1.1 brownout on the PyPI side, see [1] for more information.

[1]: pypa/packaging-problems#130
-
- Apr 10, 2018
-
Arseny Antonov authored
* Added new coveralls options to sync repo
* Pass travis job to coverage docker
-
- Apr 09, 2018
-
Konstantin Belyavskiy authored
If 'box.cfg.read_only' is false and 'replication' defines at least one replica (other than itself), but they are not available at the time of box.cfg execution and replication_connect_quorum is set to zero, the master displays 'orphan' status instead of 'running', since the logic which changes this state is executed only after a successful connection. Closes #3278
-
- Apr 07, 2018
-
Vladimir Davydov authored
If a fiber waiting for a read task to complete is cancelled, it will leave the read iterator immediately, leaving the read task pending. If the index is dropped before the read task is complete, the task will attempt to dereference a deleted run upon completion:

0  0x560b4007dbbc in print_backtrace+9
1  0x560b3ff80a1d in _ZL12sig_fatal_cbiP9siginfo_tPv+1e7
2  0x7f52b09190c0 in __restore_rt+0
3  0x7f52af6ea30a in bzero+5a
4  0x560b3ffc7a99 in mempool_free+2a
5  0x560b3ffcaeb7 in vy_page_read_cb_free+47
6  0x560b400806a2 in cbus_call_done+3f
7  0x560b400805ea in cmsg_deliver+30
8  0x560b40080e4b in cbus_process+51
9  0x560b4003046b in _ZL10tx_prio_cbP7ev_loopP10ev_watcheri+2b
10 0x560b4023d86e in ev_invoke_pending+ca
11 0x560b4023e772 in ev_run+5a0
12 0x560b3ff822dc in main+5ed
13 0x7f52af6862b1 in __libc_start_main+f1
14 0x560b3ff801da in _start+2a
15 (nil) in +2a

Fix this by elevating the run reference counter per each read task. Note, currently we use vy_run::refs not only as a reference counter, but also as a counter of slices created for the run - see how we compare it to vy_run::compacted_slice_count in vy_task_compact_complete(). This isn't going to work anymore, obviously. Now we need to count the slices created per run in a separate counter, vy_run::slice_count. Anyway, it was a rather dubious hack to abuse the reference counter for counting slices, and it's good to finally get rid of it.
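A minimal sketch of pinning the run for the lifetime of an async read task; the task struct and ref/unref helpers are illustrative stand-ins for tarantool's counters:

    struct vy_run {
        int refs;         /* reference counter */
        int slice_count;  /* slices now counted separately */
    };

    static void vy_run_ref(struct vy_run *run)   { run->refs++; }
    static void vy_run_unref(struct vy_run *run) { run->refs--; /* free at 0 */ }

    struct vy_page_read_task { struct vy_run *run; };

    static void
    read_task_start(struct vy_page_read_task *task, struct vy_run *run)
    {
        vy_run_ref(run);  /* keep the run alive even if the index is
                             dropped while the task is in flight */
        task->run = run;
    }

    static void
    read_task_complete(struct vy_page_read_task *task)
    {
        vy_run_unref(task->run);  /* may be the last reference */
    }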
-
Vladimir Davydov authored
We use ERRINJ_DOUBLE for all other timeout injections. This makes them more flexible as we can inject an arbitrary timeout in tests, not just enable some hard-coded timeout. Besides, it makes tests easier to follow. So let's use ERRINJ_DOUBLE for ERRINJ_VY_READ_PAGE_TIMEOUT too.
-
Vladimir Davydov authored
If a space has no indexes, index_find() will return NULL, which will be happily dereferenced by on_replace_dd_sequence(). It looks like this bug goes back to the time when we made index_find() exception-free and introduced the index_find_xc() wrapper. Fix it and add a test case.
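A minimal sketch of the exception-free/throwing split mentioned above; the diag machinery is reduced to a comment:

    #include <stddef.h>

    struct space;
    struct index;

    /* Exception-free: returns NULL if the space has no such index. */
    struct index *index_find(struct space *space, unsigned index_id);

    static struct index *
    index_find_xc(struct space *space, unsigned index_id)
    {
        struct index *index = index_find(space, index_id);
        if (index == NULL) {
            /* diag_raise(); -- throwing wrapper; callers that can't
             * handle NULL must use this variant. */
        }
        return index;
    }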
-
- Apr 05, 2018
-
Ilya Markov authored
* Remove rewriting the format of the default logger in the case of the syslog option.
* Add facility option parsing and use the parsed results in the formatted message according to RFC 3164. The possible values and the default value of the syslog facility are taken from nginx (https://nginx.ru/en/docs/syslog.html).
* Move the initialization of the logger type and format function before the initialization of the descriptor in log_XXX_init, so that we can test the format function of the syslog logger.
Closes gh-3244.
-
- Apr 04, 2018
-
Alexander Turenko authored
Filed gh-3311 to remove this export soon. Fixes #3310.
-
- Apr 03, 2018
-
Vladimir Davydov authored
If the size of a transaction is greater than the configured memory limit (box.cfg.vinyl_memory), the transaction will hang on commit for 60 seconds (box.cfg.vinyl_timeout) and then fail with the following error message:

Timed out waiting for Vinyl memory quota

This is confusing. Let's fail such transactions immediately with an OutOfMemory error. Closes #3291
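A minimal sketch of the early rejection, with an illustrative quota struct:

    #include <stdbool.h>
    #include <stddef.h>

    struct vy_quota {
        size_t limit;  /* box.cfg.vinyl_memory */
        size_t used;
    };

    static bool
    vy_quota_try_use(struct vy_quota *q, size_t size)
    {
        if (size > q->limit) {
            /* Can never fit: fail immediately with OutOfMemory
             * instead of waiting out vinyl_timeout. */
            return false;
        }
        /* Otherwise wait until used + size <= limit, then take it. */
        return true;
    }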
-