Commits · 25382617b95722da7a57ed58bbef3ce528177ab8 · core / tarantool

Jul 06, 2020

replication: append NOP as the last tx row · 25382617

Serge Petrenko authored 4 years ago


Since we stopped sending local space operations in replication, the last
tx row has to be global in order to preserve tx boundaries on replica.
If the last row happens to be a local one, replica will never receive
the tx end marker, yielding the following errors:
`ER_UNSUPPORTED: replication does not support interleaving
transactions`.

In order to fix the problem append a global NOP row at the tx end if
it happens to end on a local row.
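The rule above can be sketched in C as follows. This is only an illustration under stated assumptions: `struct row`, `row_is_local()` and `tx_append_nop_if_needed()` are hypothetical names, not Tarantool's real structures, and the way the NOP row gets its LSN (passed in by the caller here) is assumed as well. Replica id 0 is used as the marker for local rows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row layout, not Tarantool's real row header. */
enum row_type { ROW_DML, ROW_NOP };

struct row {
    uint32_t replica_id; /* 0 marks a local-space row (assumption) */
    int64_t lsn;
    enum row_type type;
};

bool
row_is_local(const struct row *r)
{
    return r->replica_id == 0;
}

/*
 * If the transaction ends on a local row, append a global NOP row so
 * that the last row of the transaction is always a global one.
 * Returns the new row count. The caller provides room for one extra row.
 */
size_t
tx_append_nop_if_needed(struct row *rows, size_t count, size_t capacity,
                        uint32_t instance_id, int64_t next_global_lsn)
{
    assert(rows != NULL && count > 0 && instance_id != 0);
    if (!row_is_local(&rows[count - 1]))
        return count; /* already ends on a global row */
    assert(count < capacity);
    rows[count] = (struct row){instance_id, next_global_lsn, ROW_NOP};
    return count + 1;
}
```

A transaction ending on a global row is returned unchanged, so the extra row is only paid for in the mixed case.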

Follow-up #4114
Closes #4928

Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>

25382617

wal: fix tx boundaries · f41d1ddd

Serge Petrenko authored 4 years ago


In order to preserve transaction boundaries in replication protocol, wal
assigns each tx row a transaction sequence number (tsn). Tsn is equal to
the lsn of the first transaction row.

Starting with commit 7eb4650e, local
space requests are assigned a special replica id, 0, and have their own
lsns. These operations are not replicated.

If a transaction starting with a local space operation ends up in the
WAL, it gets a tsn equal to the lsn of the local space request. Then,
during replication, when such a transaction is replicated, the local
space request is omitted, and replica receives a global part of the
transaction with a seemingly random tsn, yielding an ER_PROTOCOL error:
"Transaction id must be equal to LSN of the first row in the transaction".

Assign tsn as equal to the lsn of the first global row in the
transaction to fix the problem, and assign tsn as before for fully local
transactions.
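A minimal sketch of this assignment rule, assuming a hypothetical `struct row` with a replica id (0 for local rows), an lsn and a tsn field. The names are illustrative and are not taken from the WAL code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row layout used only for this illustration. */
struct row {
    uint32_t replica_id; /* 0 marks a local-space row */
    int64_t lsn;
    int64_t tsn;
};

/*
 * Assign the same tsn to every row of a transaction:
 * the lsn of the first global row if there is one,
 * otherwise (fully local transaction) the lsn of the first row.
 */
void
tx_assign_tsn(struct row *rows, size_t count)
{
    assert(rows != NULL || count == 0);
    if (count == 0)
        return;
    int64_t tsn = rows[0].lsn; /* fallback for fully local transactions */
    for (size_t i = 0; i < count; i++) {
        if (rows[i].replica_id != 0) {
            tsn = rows[i].lsn;
            break;
        }
    }
    for (size_t i = 0; i < count; i++)
        rows[i].tsn = tsn;
}
```

With rows `{local, lsn 7}, {global, lsn 100}, {global, lsn 101}` every row gets tsn 100, so the tsn still matches the first row a replica sees once the local row is filtered out.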

Follow-up #4114
Part-of #4928

Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>

f41d1ddd

applier: fix tx boundary check for half-applied txns · 9fcbbb3e

Serge Petrenko authored 4 years ago

In case there are 2 "new" instances, running tarantool 2.2+,
master and replica, and one "old" instance, running an earlier tarantool
version, in a full-mesh cluster, it may happen that the "new" replica
receives part of a tx from an "old" instance, and the remaining part
from a "new" instance.

Since "new" instances preserve tx boundaries, the "new" replica would skip
the rest of the tx, assuming it had already applied the full tx once it
had applied the first tx row. This leads to gaps in the "new" replica's WAL
and to skipping the remaining part of the tx forever.

Fix this behaviour: in mixed clusters, apply the full tx even if its
beginning is already applied.

Closes #5125

9fcbbb3e

Jul 03, 2020

test: fix flaky box/net.box_readahead_gh-3958 test · 4c7d8281

Alexander V. Tikhonov authored 4 years ago

Issue:

[014] --- box/net.box_readahead_gh-3958.result Mon Jun 15 15:33:23 2020
[014] +++ box/net.box_readahead_gh-3958.reject Tue Jun 16 02:24:04 2020
[014] @@ -46,6 +46,7 @@
[014]  ...
[014]  test_run:wait_log('default', 'readahead limit is reached', 1024, 0.1)
[014]  ---
[014] +- readahead limit is reached
[014]  ...
[014]  s:drop()
[014]  ---
[014]
[014] Last 15 lines of Tarantool Log file [Instance "box"][/tarantool/test/var/014_box/box.log]:
[014] 2020-06-16 02:24:03.792 [5585] main/121/console/unix/: I> set 'read_only' configuration option to false
[014] 2020-06-16 02:24:03.834 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.835 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.835 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.836 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.836 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.836 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.836 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.837 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.837 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.837 [5585] iproto iproto.cc:606 W> stopping input on connection fd 26, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
[014] 2020-06-16 02:24:03.951 [5585] main/121/console/unix/: space.h:336 E> ER_NO_SUCH_INDEX_ID: No index #1 is defined in space '_space'
[014] 2020-06-16 02:24:04.180 [5585] main/121/console/unix/: I> set 'readahead' configuration option to 128
[014] 2020-06-16 02:24:04.183 [5585] main/121/console/unix/: I> set 'readahead' configuration option to 102400
[014] 2020-06-16 02:24:04.189 [5585] main/453/console/unix/: I> set 'readahead' configuration option to 16320

Found that the root cause of the issue was the previously run test
'box/net.box_call_blocks_gh-946.test.lua' on the same worker. In this
case the wait_log/grep_log test_run function mistakenly checks the log
output of the previous test and finds the searched string there. To
avoid this, the tests can be swapped in the worker's run queue, in
which case both tests pass. The log output with the tests swapped:

2020-06-17 10:57:39.881 [69372] main C> entering the event loop
2020-06-17 10:57:39.896 [69372] main/119/console/unix/: I> set 'readahead' configuration option to 128
2020-06-17 10:57:39.898 [69372] main/119/console/unix/: I> set 'readahead' configuration option to 102400
2020-06-17 10:57:40.003 [69372] main/156/console/unix/: I> set 'readahead' configuration option to 16320
2020-06-17 10:57:40.053 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.056 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.056 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.058 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.058 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.061 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.061 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.062 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.062 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.063 [69372] iproto iproto.cc:606 W> stopping input on connection fd 33, aka unix/:(socket), peer of unix/:(socket), readahead limit is reached
2020-06-17 10:57:40.067 [69372] main C> got signal 15 - Terminated

Also found that the 'readahead' warning from the first test blocks
further printing of it to the log file because it gets suppressed. To
fix this, the default server must be restarted at the very start of
the test.

Closes #5082

4c7d8281

Correct cleanup gitlab-ci for perf jobs · 27ee9382

Alexander V. Tikhonov authored 4 years ago

Found that some perf jobs were not updated with the local cleanup
routine, as was done for the other jobs in commit:

  892a188b "Correct cleanup gitlab-ci"

Follows up #5036

27ee9382

build: static build needs more cleanup in sources · b74a4623

Alexander V. Tikhonov authored 4 years ago

Building Tarantool sources with make may fail with:

  [ 10%] make[2]: *** [test/small] Error 1
  [ 10%] make[1]: *** [test/CMakeFiles/symlink_small_tests.dir/all] Error 2
  make[1]: *** Waiting for unfinished jobs....

The root cause of the issue is that Dockerfile.staticbuild
uses a local copy of the sources:

  COPY . /tarantool

Which may have broken links in tests, like:

  $ ls -al test
  ...
  luajit-tap -> /<wrong path>/third_party/luajit/test
  small -> /<wrong path>/src/lib/small/test/
  ...

To fix the issue these links should be removed from
the docker local copy of the sources before the build, like:

  rm -rf test/small test/luajit-tap

Closes #5025

b74a4623

Jul 02, 2020

decimal: introduce decimal_is_int · 275b4fb0

Chris Sosnin authored 4 years ago

This function will be used to determine whether we can safely
convert to an integer without information loss.

Needed for #4415

275b4fb0

decimal: introduce strtodec function · 2709573e

Chris Sosnin authored 4 years ago

The behavior is similar to other strto* functions: parse the valid
beginning of the string and optionally store the pointer to the first
invalid character.
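The strto* convention referred to here can be illustrated with the standard `strtol()`. This sketch does not use strtodec itself, whose signature is not shown in the commit message; `parse_prefix()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse the valid numeric beginning of a string with strtol() and report
 * how many characters were consumed. The end pointer filled in by strtol()
 * points at the first character that is not part of the number.
 */
long
parse_prefix(const char *str, size_t *consumed)
{
    assert(str != NULL && consumed != NULL);
    char *end;
    long value = strtol(str, &end, 10);
    *consumed = (size_t)(end - str);
    return value;
}
```

For the input "123abc" the value is 123 and three characters are consumed. For "abc" nothing is consumed, which is how a caller can tell "no number at all" apart from a parsed zero.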

Needed for tarantool/tarantool#4415

2709573e

decNumber: bump new version · 4e65cfe6
Chris Sosnin authored 4 years ago

4e65cfe6

test: app-tap/logger -- test json in boottime logger · d1b3fbe9

Cyrill Gorcunov authored 4 years ago


Make sure we're allowed to set up the json formatter before the
box.cfg() call, i.e. in the so-called boot-time logger.

Part-of #5121

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

d1b3fbe9

lua/log: allow to use json formatter early · 953f2b72

Cyrill Gorcunov authored 4 years ago


There is no reason to not allow for json formatter
on early logging stage.

We add verification that

	box.cfg{log="syslog:", log_format="json"}
or
	require('log').cfg{log="syslog:", format="json"}

triggers an error, since syslog output requires a
predefined structure and can't use json.

Fixes #5121

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

953f2b72

core/say: allow to use json in boot logger · 2b893041

Cyrill Gorcunov authored 4 years ago


For some reason, in commit 09832455 we disabled
the use of the json format in the boot-time logger.
There is no reason to do so.

Only the syslog output format is predefined and must not
be changed. The json format, in turn, is just a decoration
over the output stream, so we can use it whenever requested.

Part-of #5121

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

2b893041

Jun 30, 2020

Fix assignment operation in assertions · bb1e7d39

Nikita Pettik authored 4 years ago

An assignment was accidentally used in assertions instead of a
comparison. Let's fix this mistake and use a comparison.
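The class of mistake can be illustrated with a small hypothetical example. The commit message does not show which assertions were affected, so `struct conn` and `conn_can_send()` below are invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum conn_state { CONN_CLOSED = 0, CONN_OPEN = 1 };

/* Hypothetical structure, used only for this illustration. */
struct conn {
    enum conn_state state;
};

bool
conn_can_send(const struct conn *c)
{
    assert(c != NULL);
    /*
     * The buggy form would read: assert(c->state = CONN_OPEN);
     * With a non-const pointer it compiles, always passes (the assigned
     * value 1 is non-zero) and overwrites the state in debug builds only,
     * because the expression disappears when NDEBUG is defined.
     * The intended check is a comparison:
     */
    assert(c->state == CONN_OPEN || c->state == CONN_CLOSED);
    return c->state == CONN_OPEN;
}
```

Taking the argument as a pointer to const, as done here, turns an accidental assignment inside the assertion into a compile error.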

bb1e7d39

Jun 29, 2020

journal: fix typo in journal_no_write_async() · 94bf7b48
Serge Petrenko authored 4 years ago

94bf7b48

journal: drop unused destroy method · b691db87

Cyrill Gorcunov authored 4 years ago


We never use this method so no need to waste space.

In-scope-of #4842

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

b691db87

iproto: drop unused iproto_type_is_request · e5eef3cf

Cyrill Gorcunov authored 4 years ago


Introduced in 157beda5
and never used since.

In-scope-of #4842

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

e5eef3cf

iproto: drop unused iproto_type_is_select · f9644e00

Cyrill Gorcunov authored 4 years ago


Last time used in 1d979029

In-scope-of #4842

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

f9644e00

iproto: drop unused iproto_type_is_sync · 62edef2e

Cyrill Gorcunov authored 4 years ago


Introduced in 157beda5
but never used since.

In-scope-of #4842

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

62edef2e

Jun 26, 2020

test: flaky box/net.box_wait_connected_gh-3856 · d51be6f4

Alexander V. Tikhonov authored 4 years ago


Found issue running test on FreeBSD VBox host:

 [011] --- box/net.box_wait_connected_gh-3856.result	Mon Jun 15 09:39:49 2020
 [011] +++ box/net.box_wait_connected_gh-3856.reject	Fri May  8 08:23:30 2020
 [011] @@ -12,7 +12,8 @@
 [011]  - opts:
 [011]      wait_connected: false
 [011]    host: 8.8.8.8
 [011] -  state: initial
 [011] +  state: error
 [011] +  error: Invalid argument
 [011]    port: '123456'
 [011]  ...
 [011]  c:close()

A. Turenko investigated this in depth and found that the failure was
caused by getaddrinfo() returning EAI_SERVICE for an incorrect TCP/IP
port on FreeBSD, whereas Linux/glibc truncates the port modulo 65536.
Checked with his local script './getaddrinfo':

  (Linux/glibc) $ ./getaddrinfo 8.8.8.8 123456
  ----
  family: AF_INET
  socktype: SOCK_STREAM
  protocol: IPPROTO_TCP
  host: 8.8.8.8
  serv: 57920

  (FreeBSD) $ ./getaddrinfo 8.8.8.8 123456
  getaddrinfo: Service was not recognized for socket type

So the obvious fix is to change 123456 to something less than or equal
to 65535. Say, 1234.
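The arithmetic behind the Linux/glibc output can be checked with a short sketch: 123456 modulo 65536 is 57920, which matches the `serv` value printed above. The helper names `port_parse()` and `port_truncate()` are hypothetical and only illustrate a strict range check next to the 16-bit truncation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Strictly validate a decimal port string: digits only, 0..65535. */
bool
port_parse(const char *str, uint16_t *port)
{
    assert(str != NULL && port != NULL);
    char *end;
    errno = 0;
    unsigned long value = strtoul(str, &end, 10);
    if (errno != 0 || end == str || *end != '\0' || value > 65535)
        return false;
    *port = (uint16_t)value;
    return true;
}

/* What truncation to 16 bits does to an out-of-range port number. */
uint16_t
port_truncate(unsigned long value)
{
    return (uint16_t)(value % 65536);
}
```

Here `port_truncate(123456)` yields 57920, while `port_parse("123456", ...)` rejects the value and `port_parse("1234", ...)` accepts it.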

The test depended on the order in which fibers were scheduled
(net_box.connect() creates a separate fiber for connecting in background
using fiber.create(), which yields). It is unlikely that our fiber would
not get execution time during the connection attempt, so this was more
of a formality.

But we can decrease the probability of this situation even more if we
grab all connection fields right when net_box.connect() returns, not
after a yield in the console (which happens while waiting for the next
command from test-run).

Closes #5083

Co-authored-by: Alexander Turenko <alexander.turenko@tarantool.org>
Co-authored-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>

d51be6f4

test: fix flaky replication/wal_rw_stress.test.lua · 06eda0f7

Alexander V. Tikhonov authored 4 years ago

Found issue (reproduced on VBox FreeBSD machine):

 [016] --- replication/wal_rw_stress.result	Fri Feb 21 11:53:21 2020
 [016] +++ replication/wal_rw_stress.reject	Fri May  8 08:23:56 2020
 [016] @@ -73,7 +73,42 @@
 [016]  ...
 [016]  box.info.replication[1].downstream.status ~= 'stopped' or box.info
 [016]  ---
 [016] -- true
 [016] +- version: 2.5.0-27-g32f59756a
 [016] +  id: 2
 [016] +  ro: false
 [016] +  uuid: 41cbebcc-9105-11ea-96ab-08002739cbd6
 [016] +  package: Tarantool
 [016] +  cluster:
 [016] +    uuid: 397c196f-9105-11ea-96ab-08002739cbd6
 [016] +  listen: unix/:/home/vagrant/tarantool/test/var/016_replication/replica.socket-iproto
 [016] +  replication:
 [016] +    1:
 [016] +      id: 1
 [016] +      uuid: 397a1886-9105-11ea-96ab-08002739cbd6
 [016] +      lsn: 10005
 [016] +      upstream:
 [016] +        status: follow
 [016] +        idle: 0.46353673400017
 [016] +        peer: unix/:/home/vagrant/tarantool/test/var/016_replication/master.socket-iproto
 [016] +        lag: -0.45732522010803
 [016] +      downstream:
 [016] +        status: stopped
 [016] +        message: writev(1), called on fd 24, aka unix/:/home/vagrant/tarantool/test/var/016_replicati
 [016] +        system_message: Broken pipe
 [016] +    2:
 [016] +      id: 2
 [016] +      uuid: 41cbebcc-9105-11ea-96ab-08002739cbd6
 [016] +      lsn: 0
 [016] +  signature: 10005
 [016] +  status: running
 [016] +  vinyl: []
 [016] +  uptime: 2
 [016] +  lsn: 0
 [016] +  sql: []
 [016] +  gc: []
 [016] +  pid: 41231
 [016] +  memory: []
 [016] +  vclock: {1: 10005}
 [016]  ...
 [016]  test_run:cmd("switch default")
 [016]  ---

To check the downstream status and its message, the test needs to wait
until a downstream appears. This prevents an attempt to index a nil value
when one of those functions is called before a record about a peer appears
in box.info.replication. It was observed on test:
  replication/show_error_on_disconnect
after commit
  c6bea65f ('replication: recfg with 0 quorum returns immediately').

Checked that the test still catches the error for which it was created
in the b9db91e1 ('xlog: fix fallocate vs read race') patch, and
successfully got the needed error "tx checksum mismatch":

[153] --- replication/wal_rw_stress.result      Fri Jun 19 15:01:49 2020
[153] +++ replication/wal_rw_stress.reject      Fri Jun 19 15:04:02 2020
[153] @@ -73,7 +73,43 @@
[153]  ...
[153]  test_run:wait_cond(function() return box.info.replication[1].downstream.status ~= 'stopped' end) or box.info
...
[153] +      downstream:
[153] +        status: stopped
[153] +        message: tx checksum mismatch

Note that wait_cond() makes it possible to ride out transient network
connectivity errors, while 'tx checksum mismatch' is a persistent
one and will still be caught.

Closes #4977

06eda0f7

test: fix flaky replication/wal_off.test.lua · 3e904475

Alexander V. Tikhonov authored 4 years ago

Found issue:

[003] --- replication/wal_off.result	Thu Apr 25 13:10:18 2019
[003] +++ replication/wal_off.reject	Tue Jul 16 17:10:31 2019
[003] @@ -95,6 +95,8 @@
[003]  ...
[003]  while string.find(box.info.replication[wal_off_id].upstream.message, check) == nil do fiber.sleep(0.01) end
[003]  ---
[003] +- error: '[string "while string.find(box.info.replication[wal_of..."]:1: bad argument
[003] +    #1 to ''find'' (string expected, got nil)'
[003]  ...
[003]  box.cfg { replication = "" }
[003]  ---

To check the upstream status and its message, the test needs to wait
until an upstream appears. This prevents an attempt to index a nil value
when one of those functions is called before a record about a peer appears
in box.info.replication. It was observed on test:
  replication/show_error_on_disconnect
after commit
  c6bea65f ('replication: recfg with 0 quorum returns immediately').

Closes #4355

3e904475

box: reduce box_process_lua Lua GC memory usage · e88c0d21

Igor Munkin authored 4 years ago


<box_process_lua> function created a new GCfunc object for a handler
having no upvalues depending on the request context on each call.

The change introduces the following mapping:
| <handler id> -> <handler GCfunc object>
Initializing this mapping on Tarantool startup is aimed to reduce Lua GC
memory usage.

Reviewed-by: Sergey Ostanevich <sergos@tarantool.org>
Reviewed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>

e88c0d21

test: disable JIT for Lua Fun chain iterator · 5fa7ded2

Igor Munkin authored 4 years ago


JIT compiler can generate an invalid trace for <fun.chain> iterator
(i.e. chain_gen_r1) breaking its semantics (see LuaJIT/LuaJIT#584).
Since interpreter works fine and produces the right results, disabling
JIT for this function stops execution failures.

As a result box-tap/key_def.test.lua is removed from box-tap suite
fragile tests list.

Relates to LuaJIT/LuaJIT#584
Fixes #4252

Reviewed-by: Alexander V. Tikhonov <avtikhon@tarantool.org>
Reviewed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>

5fa7ded2

Jun 23, 2020

vinyl: restart read iterator in case L0 is changed · 83462a5c

Nikita Pettik authored 4 years ago

A data read in vinyl is known to yield in case of disk access, which opens
a window for modifications of the in-memory level. Imagine the following
scenario: right before data selection, a tuple is inserted into the space.
It passes the first stage of the commit procedure, i.e. it is prepared to
be committed but has not yet reached the WAL. Meanwhile, an iterator starts
to read the same key. At this moment the prepared statement is already
inserted into the in-memory tree and is therefore visible to the read
iterator. So the read iterator fetches this statement and proceeds to the
disk scan. In turn, the disk scan yields, and at this moment the WAL fails
to write the statement to disk. Next, two cases are possible:
1. WAL thread has enough time to complete rollback procedure.
2. WAL thread fails to finish rollback in this time gap.

In the first case the read iterator should skip the statement: the version
of the in-memory tree has diverged from the iterator's one, so we fall back
to the iterator restoration procedure. The mem iterator might become
invalid, so the only choice is to restart the whole 'advance' routine.
Let's not try to restore it and always restart the iteration cycle if
the L0 level has changed during the yield.

In the second case nothing has changed for the read iterator, so it simply
returns the prepared statement (and this is considered to be OK).

Closes #3395

83462a5c

vinyl: fix passing uninitialized parameter to vy_page_find_key() · f84cb1aa

Nikita Pettik authored 4 years ago

vy_page_find_key() assumes that the equal_key parameter is initialized,
since it is used unconditionally. Originally, the function was designed
with the assumption that the parameter is initialized by the caller. Since
then it has been used in several other places, but some callers don't
initialize this parameter to 'false'. Let's fix it by setting this output
parameter to false by default inside vy_page_find_key().
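The pattern can be sketched with a hypothetical lower-bound search over a sorted integer array. `find_key()` below is a stand-in for illustration and does not reproduce the real vy_page_find_key() signature or logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the index of the first element that is >= key.
 * *equal_key tells the caller whether an exact match was found.
 */
size_t
find_key(const int *keys, size_t count, int key, bool *equal_key)
{
    assert(equal_key != NULL);
    /* Set the output first, so every return path leaves it defined. */
    *equal_key = false;
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (keys[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo < count && keys[lo] == key)
        *equal_key = true;
    return lo;
}
```

Because the function resets the flag itself, a caller that passes an uninitialized or stale `bool` still gets a well-defined result.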

Closes #5078

f84cb1aa

Jun 22, 2020

box: always reconfigure box at non-first box.cfg() · 4c0d4a0c

Maria authored 5 years ago


Calling box.cfg{} more than once does not normally cause any errors
(even though it might not have any effect). In contrast, assigning
it to some variable and then using it after the box was configured
caused an error since the method was overwritten by the initial call
of <load_cfg>.

The patch fixes this issue making box.cfg behave consistently in both
scenarios.

Follow-up #4231

Co-developed-by: Alexander Turenko <alexander.turenko@tarantool.org>

4c0d4a0c

box: always wait box loading in box.execute() · e8d5515a

Alexander Turenko authored 4 years ago

<box_load_and_execute> checks whether box is configured with appropriate
locking and configures it when necessary. However it is not so for
<lbox_execute>. We should replace the former with the latter only when
box is fully loaded.

Follow-up #4231

e8d5515a

box: check whether box is loaded in box.execute() · 859df3a7

Maria authored 5 years ago


box.execute() initializes box if it is not initialized. For this sake,
box.execute() is another function (so called <box_load_and_execute>)
when box is not loaded: it loads box and calls 'real' box.execute().
However it is not enough: <box_load_and_execute> may be saved by a user
before box initialization and called after box loading or during box
loading from a separate fiber.

Note: calling <box_load_and_execute> during box loading is safe now, but
calling of box.execute() is not: the 'real' box.execute() does not
verify whether box is configured. It will be fixed in a further commit.

This commit changes <box_load_and_execute> to verify whether box is
initialized and to load box only when it is not loaded. Also it adds
appropriate locking around load_cfg() invocation from
<box_load_and_execute>.

While we're here, clarified contracts of functions that set box
configuration options.

Closes #4231

Co-developed-by: Alexander Turenko <alexander.turenko@tarantool.org>

859df3a7

small: bump new version · ebfd2d1d
Kirill Yukhin authored 4 years ago
```
- test: don't use not aligned size for mempool
```
ebfd2d1d

xlog: xdir_format_filename -- use PATH_MAX · 7520b32f

Cyrill Gorcunov authored 4 years ago


No need for +1 byte here, PATH_MAX already implies
end of string.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

7520b32f

xlog: xlog_cursor -- use sizeof with snprintf for safety · 0ca3acd9

Cyrill Gorcunov authored 4 years ago


This is more consistent than relying on the array size
remaining PATH_MAX forever.
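A minimal sketch of the idiom, assuming a hypothetical `struct cursor_name` with an embedded character array. It is not the real xlog_cursor layout.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

#ifndef PATH_MAX
#define PATH_MAX 4096 /* fallback for platforms that do not define it */
#endif

/* Hypothetical structure, used only for this illustration. */
struct cursor_name {
    char name[PATH_MAX];
};

/* Copy a file name into the cursor; return 0 on success, -1 on truncation. */
int
cursor_set_name(struct cursor_name *c, const char *name)
{
    assert(c != NULL && name != NULL);
    /*
     * sizeof(c->name) follows the declaration of the array, so the bound
     * stays correct even if the array is later resized.
     */
    int rc = snprintf(c->name, sizeof(c->name), "%s", name);
    if (rc < 0 || (size_t)rc >= sizeof(c->name))
        return -1;
    return 0;
}
```

Note that `sizeof` only gives the buffer size for a real array. If the field were ever changed to a pointer, the expression would silently become the pointer size.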

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

0ca3acd9

xlog: use PATH_MAX for filename · 6fcd40ce

Cyrill Gorcunov authored 4 years ago


Similar to dirname, there is no need for +1 byte.
At the same time, make sure xlog_open never ends up
without a trailing zero.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

6fcd40ce

xlog: xdir -- use PATH_MAX for dirname · 175d0b89

Cyrill Gorcunov authored 4 years ago


The PATH_MAX is the longest path including end
of string, no need for +1 byte.

At the same time, use sizeof(dirname) so as not to
depend on how exactly dirname is declared.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

175d0b89

xlog: xlog_cursor -- eliminate redundant pad in the structure · 811252bb

Cyrill Gorcunov authored 4 years ago


This makes the structure smaller and eliminates useless
padding (both enum and fd are integers 4 bytes long).

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

811252bb

box: drop inline from box_cfg_xc · a4391057

Cyrill Gorcunov authored 4 years ago


There is serious "inline disease" in the code:
it spread left and right without a serious reason.

The box_cfg_xc function is a pretty big one and
doesn't require being inlined anyhow.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

a4391057

Jun 19, 2020

sql: raise an error on attempt to use HASH index in SQL · e7a70be4

Kirill Yukhin authored 4 years ago

Since the query planner is currently unable to use HASH indexes
and an attempt to use one will likely lead to a SEGFAULT, this
patch raises an error on an attempt to open a VDBE cursor
against a HASH index.

@TarantoolBot document
Title: Document allowed index type for SQL
Before the change, the Tarantool query planner segfaulted on an
attempt to use a non-tree index. This is blocked now with an appropriate
error message. The behaviour needs to be documented.
It should be noted that this restriction might be relaxed in the future.

Closes #4659

e7a70be4

Jun 17, 2020

fio/coio: handle partial writes · a9276dae

Cyrill Gorcunov authored 4 years ago


Writing fewer bytes than requested is perfectly fine. It turns out,
however, that the fio.write/pwrite api simply returns 'true' even if
only some part of a buffer has been written.

Thus make coio_write and coio_pwrite write the whole data in
a loop. Note that in most situations there will be only one pass,
partial writes are really rare cases.

Note that we're not handling nonblocking writes here (which
could return EAGAIN) simply because we would need another api
which would accept timeouts.
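The general technique of writing a whole buffer in a loop looks roughly like the sketch below, built on plain POSIX `write()`. It is not the coio implementation. The name `write_all()`, the EINTR retry and the treatment of a zero-length write as an error are assumptions of this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write exactly count bytes to a blocking descriptor.
 * Returns count on success and -1 on error (errno is set).
 */
ssize_t
write_all(int fd, const void *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    const char *pos = buf;
    size_t left = count;
    while (left > 0) {
        ssize_t n = write(fd, pos, left);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted before any data was written */
            return -1;
        }
        if (n == 0) {
            /* No progress; bail out instead of spinning forever. */
            errno = EIO;
            return -1;
        }
        pos += n;           /* skip the part that has been written */
        left -= (size_t)n;  /* and retry with the remainder */
    }
    return (ssize_t)count;
}
```

As in the commit message, a nonblocking descriptor returning EAGAIN is not handled here. In this sketch it is simply reported as an error.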

Fixes #4651

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

a9276dae

Jun 16, 2020

memtx: fix tuples references on concurrent replaces · 8c53942e

Ilya Kosarev authored 4 years ago

Since 527b02a2 (memtx: add yields
during index build) memtx_build_on_replace was introduced to handle
concurrent updates. The problem here was that the tuples being handled
with this trigger did not get reference counter promotion, leading to a
number of wrong behavior cases. Now this problem is solved.
This problem was found through primary index altering with updates in
background fiber. Corresponding test is introduced.

Closes #4973

8c53942e

cmake: split UB sanitations into separate flags. · 5115d9f3

Vladislav Shpilevoy authored 4 years ago

Clang undefined behaviour sanitizer was turned on using the
-fsanitize=undefined flag, which is supposed to turn on all the
sanitizations, except a few ones. Unneeded sanitizations were
turned off explicitly, using -fno-sanitize=<type> flags. However,
it turned out that this does not work with some flags. For example,
nullability sanitizations can't be turned off when
-fsanitize=undefined is used.

Nullability sanitizations lead to lots of false-positive failures,
such as typeof(*obj) where obj is NULL, or memcpy() with a NULL
destination but 0 size.

The patch splits -fsanitize=undefined into separate flags and
never turns on nullability checks.

Part of #4609

5115d9f3

sql: don't build sql as a separate library · 35473d5d

Vladislav Shpilevoy authored 4 years ago

SQL heavily depends on box, and box on SQL. So they can't be
separate libraries. The build started failing with undefined box
symbols in SQL, when code of the latter has slightly changed in
one of the recent commits.

The build failed only with UB sanitizer enabled, but
'VERBOSE=1 make' showed that both with UB and without UB the build
command was the same (not counting -fsanitize flags). So the
sanitizer has nothing to do with it.

The patch makes SQL sources build as a part of the box library.

Closes #5067

35473d5d