Commits · 85eea4a8da23328a40fe2cface6bf663e7a88d3c · core / tarantool

Apr 10, 2020

coio: add *_noxc read / write functions · 85eea4a8


The popen implementation is written in C and uses coio read / write
functions. If an exception occurs, it will pass through the C code. It
must be caught for the code to proceed correctly.

We usually have foo() and foo_xc() (exception) functions when both
variants are necessary. Here I added non-conventional *_noxc() functions
as a temporary solution to postpone refactoring of the code and all
its usages.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

85eea4a8

coio: fix obsoleted comment in coio_write_timeout · b4360fec

Alexander Turenko authored 4 years ago


The comment was added in 52765de6, but has been outdated since
1.6.6-21-gc74abc786 ('Implement special TimedOut exception type and use
it in coio and latch.')

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

b4360fec

popen: add missed diag_set() in popen IO functions · 4498d4f2

Alexander Turenko authored 4 years ago


Our usual convention for C code is to return a negative value on failure
and to set an entry in the diagnostics area.

When code follows this convention consistently, failures are much easier
to handle: you always know where to find the error type and message and
how to pass the error to a C or Lua caller.
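
A minimal sketch of the convention described above, not the actual popen
code. The names toy_diag, toy_diag_set() and toy_write_all() are
hypothetical stand-ins; the real diagnostics area is richer than a single
code and message.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the diagnostics area: one error slot. */
struct toy_diag {
	int code;          /* errno-like code, 0 when empty */
	char message[128]; /* human-readable description */
};

static struct toy_diag toy_diag_area;

/* Record an error; a simplified analogue of diag_set(). */
static void
toy_diag_set(int code, const char *message)
{
	toy_diag_area.code = code;
	snprintf(toy_diag_area.message, sizeof(toy_diag_area.message),
		 "%s", message);
}

/*
 * Write the whole buffer to fd. Returns 0 on success. On failure
 * returns -1 and always leaves an entry in the diagnostics area.
 */
static int
toy_write_all(int fd, const char *buf, size_t len)
{
	assert(buf != NULL);
	while (len > 0) {
		ssize_t rc = write(fd, buf, len);
		if (rc < 0) {
			if (errno == EINTR)
				continue;
			toy_diag_set(errno, strerror(errno));
			return -1;
		}
		buf += rc;
		len -= (size_t)rc;
	}
	return 0;
}
```

A caller only has to check for a negative return value and can then read
the error type and message from one well-known place.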

See also the previous commit ('popen: add missed diag_set in
popen_signal/delete').

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

4498d4f2

popen: remove redundant fd check before perform IO · 1ef95b99

Alexander Turenko authored 4 years ago


The function already checks flags to find out whether the file
descriptor should be available for reading / writing. When it is, the
corresponding fd is greater than or equal to zero.

Further commits will add missing diagnostics for IO functions, and it
is hard to write a meaningful error message for a situation that is not
possible. Moreover, we would be obligated to document the error as one
of the possible failures in the function contract (while it can't occur).

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

1ef95b99

popen: unblock popen_read_timeout at a first byte · 631f5f37

Alexander Turenko authored 4 years ago


Before this change popen_read_timeout() waited until the passed buffer
was completely filled (or until EOF / timeout / IO error occurred). Now
it waits for any amount of data (but at least one byte).

This makes it possible to communicate with an interactive child program:
write, read, repeat.
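
The difference between the two behaviours can be sketched with plain
blocking read(). This is an illustration only: toy_read_full() and
toy_read_some() are hypothetical names, and timeouts and the coio layer
used by the real function are left out.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Old behaviour: loop until the buffer is full, EOF, or an error. */
static ssize_t
toy_read_full(int fd, char *buf, size_t count)
{
	assert(buf != NULL);
	size_t total = 0;
	while (total < count) {
		ssize_t rc = read(fd, buf + total, count - total);
		if (rc < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (rc == 0)
			break; /* EOF */
		total += (size_t)rc;
	}
	return (ssize_t)total;
}

/* New behaviour: return as soon as at least one byte has arrived. */
static ssize_t
toy_read_some(int fd, char *buf, size_t count)
{
	assert(buf != NULL);
	ssize_t rc;
	do {
		rc = read(fd, buf, count);
	} while (rc < 0 && errno == EINTR);
	return rc;
}
```

With the second variant a parent can write a request to an interactive
child and get the reply back even when the reply is shorter than the
buffer and the child keeps its end of the pipe open.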

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

631f5f37

popen: add const qualifier to popen_write_timeout · 04b0432d

Alexander Turenko authored 4 years ago


The buffer is only read from; we do not intend to change it.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

04b0432d

popen: decouple logger fd from stderr · 07a07b3c

Alexander Turenko authored 4 years ago


The default logger configuration writes logs to stderr.

The popen implementation holds a logger fd until execve() to be able to
write debug entries or information about a failure from the child.
However, when popen flags require closing stderr in the child, the
logger fd is closed too and logging fails.

Another problem appears when a user wants to capture stderr and
tarantool's log level is set to debug (7). Since the logger uses stderr
and it is fed to the parent through a pipe, the logger output is not
shown on the 'real' stderr, but is captured together with the child
program's debugging output.

This commit duplicates the logger file descriptor, which allows closing
or redirecting the child's stderr without the described side effects.
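
A rough sketch of the idea using dup(). The helper name
toy_dup_logger_fd() is hypothetical, and setting FD_CLOEXEC on the copy
is an assumption of this sketch rather than something stated in the
commit message.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Return a private copy of log_fd, so that closing or redirecting
 * the original descriptor does not affect logging.
 * Returns -1 on failure.
 */
static int
toy_dup_logger_fd(int log_fd)
{
	assert(log_fd >= 0);
	int copy = dup(log_fd);
	if (copy < 0)
		return -1;
	/* Assumption of this sketch: drop the copy at execve(). */
	if (fcntl(copy, F_SETFD, FD_CLOEXEC) != 0) {
		close(copy);
		return -1;
	}
	return copy;
}
```

After the call the original descriptor can be closed and writes to the
copy still reach the same destination.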

See also 86ec3a5c ('popen: add logging
in child process').

Areas for improvements:

* Copy the logger fd at module initialization time instead of copying it
  on each popen call.

Alternatives:

* Extend the logger to allow accumulating log entries in a buffer. Flush
  the buffer from the parent process. (This is possible since vfork does
  not split the virtual memory space.)

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

07a07b3c

say: allow to set a logger file descriptor · 67c6a6e6

Alexander Turenko authored 4 years ago


It is necessary to decouple stderr from a logger file descriptor in the
popen implementation.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

67c6a6e6

popen: add logging of fds closed in a child · 16c83356

Alexander Turenko authored 5 years ago


It is useful for debugging popen behaviour around file descriptors.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

16c83356

popen: add missed diag_set in popen_signal/delete · 96a25ee0

Alexander Turenko authored 4 years ago


Lua API will use content of the diagnostics area to report an error to a
caller, so it is critical to always have proper diagnostics at failure.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

96a25ee0

popen: remove retval from popen_state() · e1579978

Alexander Turenko authored 4 years ago


After the previous commit ('popen: require popen handle to be non-NULL')
it turns out that popen_state() function always succeeds. There is no
reason to return a success / failure value from it.

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

e1579978

popen: require popen handle to be non-NULL · 922cef65

Alexander Turenko authored 4 years ago


Further commits will add proper entries into the diagnostics area for
failures inside popen functions. We should either report handle == NULL
case via the diagnostics area or ensure that the NULL handle case is not
possible.

The latter approach is implemented in this commit. There are two
reasons for this:

* This approach simplifies function contracts (one less kind of failure).
* The popen Lua module (to be implemented in further commits) will not
  construct any logic using NULL as a handle. When a 'NULL handle' error
  is not possible in the C API, it is easier to verify that this
  failure is not possible in the Lua API.

A user of the C API must take care not to call those functions with a
NULL handle.
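
A small sketch of what such a contract can look like, and why a state
getter no longer needs a return value once NULL is excluded. The struct
toy_handle and toy_handle_state() are hypothetical and much simpler than
the real popen handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much simplified handle; the real struct differs. */
struct toy_handle {
	int state;
	int exit_code;
};

/*
 * With the NULL case excluded by contract, the function cannot
 * fail, so it returns nothing and only fills the out parameters.
 */
static void
toy_handle_state(const struct toy_handle *handle, int *state,
		 int *exit_code)
{
	assert(handle != NULL);
	*state = handle->state;
	*exit_code = handle->exit_code;
}
```

The assertion documents the precondition for C callers, while a Lua
binding built on top never has a NULL handle to pass.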

Part of #4031

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

922cef65

Apr 08, 2020

test: add replication/gh-4730-applier-rollback · a82ec304

Cyrill Gorcunov authored 5 years ago


Test that diag_raise doesn't happen if async transaction
fails inside replication procedure.

Side note: I don't like merging tests with patches in
general and I hate doing so for big tests with a passion
because it hides the patch code itself. So here is a
separate patch on top of the fix.

Test-of #4730

Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

a82ec304

applier: prevent nil dereference on applier rollback · 73b97984

Cyrill Gorcunov authored 4 years ago


Currently, when a transaction rollback happens, we just drop the existing
error by moving the ClientError to replicaset.applier.diag. This leaves
the current fiber with diag=nil, which in turn leads to SIGSEGV once
diag_raise() is called right after applier_apply_tx():

 | applier_f
 |   try {
 |   applier_subscribe
 |     applier_apply_tx
 |       // error happens
 |       txn_rollback
 |         diag_set(ClientError, ER_WAL_IO)
 |         diag_move(&fiber()->diag, &replicaset.applier.diag)
 |         // fiber->diag = nil
 |       applier_on_rollback
 |         diag_add_error(&applier->diag, diag_last_error(&replicaset.applier.diag))
 |         fiber_cancel(applier->reader);
 |     diag_raise() -> NULL dereference
 |   } catch { ... }

Thus:
 - use diag_set_error() instead of diag_move() so that the error is not
   dropped from the current fiber(), preventing a nil dereference;
 - put a FIXME mark into the code: we need to rework it in a more
   sensible way.
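
The difference between moving and sharing an error can be sketched with
a toy reference-counted error. All toy_* names are hypothetical, and
destroying the error when the counter reaches zero is omitted for
brevity.

```c
#include <assert.h>
#include <stddef.h>

struct toy_error {
	int refs;
	int code;
};

struct toy_diag {
	struct toy_error *last;
};

/* Drop whatever the diag currently holds. */
static void
toy_diag_clear(struct toy_diag *diag)
{
	if (diag->last != NULL) {
		diag->last->refs--;
		diag->last = NULL;
	}
}

/* Share an error: dst takes its own reference, the source keeps its. */
static void
toy_diag_set_error(struct toy_diag *dst, struct toy_error *e)
{
	assert(e != NULL);
	e->refs++;
	toy_diag_clear(dst);
	dst->last = e;
}

/* Move: dst takes over the reference held by src, src becomes empty. */
static void
toy_diag_move(struct toy_diag *src, struct toy_diag *dst)
{
	toy_diag_clear(dst);
	dst->last = src->last;
	src->last = NULL;
}
```

After a move the source diag is empty, which is exactly the state that
makes a later raise from the fiber dereference NULL. After a set both
diags point at the same error.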

Fixes #4730

Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

73b97984

applier: reduce applier_txn_rollback_cb code density · 069901c8

Cyrill Gorcunov authored 5 years ago


To make it a bit more readable.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

069901c8

replication: merge replica_by_id into replicaset · 7d904b2a

Cyrill Gorcunov authored 5 years ago


For some reason the replica_by_id member (which is an
array of pointers) is allocated dynamically. Moreover,
VCLOCK_MAX = 32 for now, and extending it to some new
limit will require far more effort than just increasing
the number.

Thus reserve memory for replica_by_id inside replicaset
statically. This simplifies the code a bit and lets us
drop the calloc/free calls.
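
A sketch of the layout change with hypothetical toy_* types; only the
value 32 is taken from the message above.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_VCLOCK_MAX = 32 }; /* mirrors VCLOCK_MAX = 32 */

struct toy_replica {
	unsigned id;
};

struct toy_replicaset {
	/* Embedded array: no calloc() at init, no free() at teardown. */
	struct toy_replica *replica_by_id[TOY_VCLOCK_MAX];
};

/* Look a replica up by id; out-of-range ids yield NULL. */
static struct toy_replica *
toy_replica_by_id(struct toy_replicaset *set, unsigned id)
{
	assert(set != NULL);
	return id < TOY_VCLOCK_MAX ? set->replica_by_id[id] : NULL;
}
```

The array lives and dies together with the enclosing structure, so there
is no separate allocation that could fail or leak.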

The former code comes from edd76a2a without any
explanation why the dynamic member is needed.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

7d904b2a

applier: add missing diag_set on region_alloc failure · 26645974

Cyrill Gorcunov authored 5 years ago


If we hit the memory limit while allocating triggers,
we should set a diag error to prevent a nil dereference
in the diag_raise call (for example from applier_apply_tx).

Note that there are region_alloc_xc helpers which
throw errors, but as far as I understand we need the
rollback action to be processed first instead of an immediate
throw/catch, thus we use diag_set.

Acked-by: Sergey Ostanevich <sergos@tarantool.org>
Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

26645974

request: add missing OutOfMemory diag_set · ae92bf93

Cyrill Gorcunov authored 5 years ago


In request_create_from_tuple and request_handle_sequence
we may be unable to allocate memory for tuples. Don't
forget to set a diag error, otherwise diag_raise will
lead to a nil dereference.

Acked-by: Sergey Ostanevich <sergos@tarantool.org>
Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

ae92bf93

alter: shrink txn_alter_trigger_new code · 73e81976

Cyrill Gorcunov authored 4 years ago


Instead of calling memset which is useless here
just use trigger_create helper.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

73e81976

box: fix bootstrap comment · 78f58bfb

Cyrill Gorcunov authored 5 years ago


We're not starting a new master node but
a new instance instead. The comment is simply
a leftover from older modifications.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

78f58bfb

Apr 07, 2020

sql: reset values to be bound after execution · df03a7e8

Nikita Pettik authored 4 years ago

Before this patch prepared statements didn't reset bound values after
their execution. As a result, if not all parameters were provided during
the next execution cycle, cached values would appear. For instance:

prep = box.prepare('select :a, :b, :c')
prep:execute({{[':a'] = 1}, {[':b'] = 2}, {[':c'] = 3}})
-- [1, 2, 3]
prep:execute({{[':a'] = 1}, {[':b'] = 2}})
-- [1, 2, 3]

However, expected result for the last query should be [1, 2, NULL].
Let's fix it and always reset all binding values before next execution.
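
The effect of the fix can be modelled with a toy statement object. This
is a hypothetical sketch (toy_stmt, toy_value and both functions are
invented for illustration) and says nothing about how the SQL engine
stores bindings internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_PARAM_COUNT = 3 }; /* ':a', ':b', ':c' from the example */

struct toy_value {
	bool is_null;
	int num;
};

struct toy_stmt {
	struct toy_value params[TOY_PARAM_COUNT];
};

/* Set every parameter slot back to NULL. */
static void
toy_stmt_reset_bindings(struct toy_stmt *stmt)
{
	for (size_t i = 0; i < TOY_PARAM_COUNT; i++) {
		stmt->params[i].is_null = true;
		stmt->params[i].num = 0;
	}
}

/* Bind the first `count` values, "execute" by copying slots to row. */
static void
toy_stmt_execute(struct toy_stmt *stmt, const int *values, size_t count,
		 struct toy_value *row)
{
	assert(count <= TOY_PARAM_COUNT);
	for (size_t i = 0; i < count; i++) {
		stmt->params[i].is_null = false;
		stmt->params[i].num = values[i];
	}
	for (size_t i = 0; i < TOY_PARAM_COUNT; i++)
		row[i] = stmt->params[i];
	/* The fix: forget bound values once execution is done. */
	toy_stmt_reset_bindings(stmt);
}
```

Without the final reset, a second execution with only two values would
still see the third value from the previous run.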

Closes #4825

df03a7e8

iproto: support error stacked diagnostic area · 4c465312

Nikita Pettik authored 5 years ago

This patch introduces support of stacked errors in IProto protocol and
in net.box module.

Closes #1148

@TarantoolBot document
Title: Stacked error diagnostic area

From now on errors can be organized into lists. To achieve this, the
Lua table representing an error object is extended with a .prev field and
an e:set_prev(err) method. The .prev field returns the previous error, if
any. The e:set_prev(err) method expects err to be an error object or nil
and sets err as the previous error of e. For instance:
```
e1 = box.error.new({code = 111, reason = "cause"})
e2 = box.error.new({code = 111, reason = "cause of cause"})

e1:set_prev(e2)
assert(e1.prev == e2) -- true
```
Cycles are not allowed for error lists:
```
e2:set_prev(e1)
- error: 'builtin/error.lua: Cycles are not allowed'
```
Nil is valid input to :set_prev() method:
```
e1:set_prev(nil)
assert(e1.prev == nil) -- true
```
Note that an error can be 'previous' to only one error at a time:
```
e1:set_prev(e2)
e3:set_prev(e2)
assert(e1.prev == nil) -- true
assert(e3.prev == e2) -- true
```
Setting previous error does not erase its own previous members:
```
-- e1 -> e2 -> e3 -> e4
e1:set_prev(e2)
e2:set_prev(e3)
e3:set_prev(e4)
e2:set_prev(e5)
-- Now there are two lists: e1->e2->e5 and e3->e4
assert(e1.prev == e2) -- true
assert(e2.prev == e5) -- true
assert(e3.prev == e4) -- true
```
Alternatively:
```
e1:set_prev(e2)
e2:set_prev(e3)
e3:set_prev(e4)
e5:set_prev(e3)
-- Now there are two lists: e1->e2 and e5->e3->e4
assert(e1.prev == e2) -- true
assert(e2.prev == nil) -- true
assert(e5.prev == e3) -- true
assert(e3.prev == e4) -- true
```
Stacked diagnostics are also supported by the IProto protocol. Now responses
containing errors always (even if there's only one error to be returned)
include a new IProto key: IPROTO_ERROR_STACK (0x52). So, the body corresponding
to an error response now looks like:
```
MAP{IPROTO_ERROR : string, IPROTO_ERROR_STACK : ARRAY[MAP{ERROR_CODE : uint, ERROR_MESSAGE : string}, MAP{...}, ...]}
```
where IPROTO_ERROR is 0x31 key, IPROTO_ERROR_STACK is 0x52, ERROR_CODE
is 0x01 and ERROR_MESSAGE is 0x02.
Instances of older versions (without support of stacked errors in
protocol) simply ignore unknown keys and still rely only on IPROTO_ERROR
key.
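
The body layout above can be illustrated with a hand-rolled MsgPack
encoder. This is a sketch under stated assumptions: all toy_* names are
hypothetical, strings are limited to fixstr (under 32 bytes), the stack
is limited to fixarray (under 16 entries), the first stack entry is
assumed to supply the IPROTO_ERROR message, and the caller is assumed to
provide a large enough buffer. The real code uses the mpstream helpers
instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
	TOY_IPROTO_ERROR = 0x31,
	TOY_IPROTO_ERROR_STACK = 0x52,
	TOY_ERROR_CODE = 0x01,
	TOY_ERROR_MESSAGE = 0x02,
};

struct toy_error {
	uint32_t code;
	const char *message;
};

/* Append a short string (under 32 bytes) as a MsgPack fixstr. */
static uint8_t *
toy_put_fixstr(uint8_t *p, const char *s)
{
	size_t len = strlen(s);
	assert(len < 32);
	*p++ = (uint8_t)(0xa0 | len);
	memcpy(p, s, len);
	return p + len;
}

/* Append an unsigned integer as positive fixint or uint 32. */
static uint8_t *
toy_put_uint(uint8_t *p, uint32_t v)
{
	if (v <= 0x7f) {
		*p++ = (uint8_t)v;
		return p;
	}
	*p++ = 0xce; /* uint 32, big-endian payload follows */
	*p++ = (uint8_t)(v >> 24);
	*p++ = (uint8_t)(v >> 16);
	*p++ = (uint8_t)(v >> 8);
	*p++ = (uint8_t)v;
	return p;
}

/* Encode the error response body; returns the number of bytes. */
static size_t
toy_encode_error_body(uint8_t *buf, const struct toy_error *stack,
		      size_t count)
{
	assert(count > 0 && count < 16);
	uint8_t *p = buf;
	*p++ = 0x82;                    /* fixmap with 2 pairs */
	*p++ = TOY_IPROTO_ERROR;        /* key as positive fixint */
	p = toy_put_fixstr(p, stack[0].message);
	*p++ = TOY_IPROTO_ERROR_STACK;
	*p++ = (uint8_t)(0x90 | count); /* fixarray */
	for (size_t i = 0; i < count; i++) {
		*p++ = 0x82;
		*p++ = TOY_ERROR_CODE;
		p = toy_put_uint(p, stack[i].code);
		*p++ = TOY_ERROR_MESSAGE;
		p = toy_put_fixstr(p, stack[i].message);
	}
	return (size_t)(p - buf);
}
```

An older peer that only knows IPROTO_ERROR reads the first pair and
skips the unknown 0x52 key.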

4c465312

box: always promote error created via box.error() to diag · 4bcaf15e

Nikita Pettik authored 4 years ago

This patch makes box.error() always promote the error to the diagnostic
area regardless of the passed arguments.

Closes #4829

@TarantoolBot document
Title: always promote error created via box.error() to diag

box.error() accepts two types of argument: either a pair of code
and reason (box.error{code = 555, reason = 'Arbitrary message'}) or an error
object (box.error(err)). In the first case the error is promoted to the
diagnostic area, while in the latter it is not:
```
e1 = box.error.new({code = 111, reason = "cause"})
box.error({code = 111, reason = "err"})
- error: err
box.error.last()
- err
box.error(e1)
- error: cause
box.error.last()
- err
```
From now on box.error(e1) sets the error to the diagnostic area as well:
```
box.error(e1)
- error: cause
box.error.last()
- cause
```

4bcaf15e

iproto: refactor error encoding with mpstream · ba7304fb

Kirill Shcherbatov authored 5 years ago

Refactor iproto_reply_error and iproto_write_error with a new
mpstream-based helper mpstream_iproto_encode_error that encodes
an error object for the iproto protocol on a given stream object.
Previously each routine implemented its own error encoding, but
as following patches make the encode operation more complex,
we need a uniform way to do it.

The iproto_write_error routine starts using region allocation
in order to use a region-based mpstream. This is not a problem in itself,
because error reporting is not a performance-critical path.

Needed for #1148

ba7304fb

box/error: clarify purpose of reference counting in struct error · 1447693d

Nikita Pettik authored 5 years ago

1447693d

box: use stacked diagnostic area for functional indexes · c15cef54

Nikita Pettik authored 5 years ago

Since we've introduced stacked diagnostic in previous commit, let's use
it in the code implementing functional indexes.

Part of #1148

c15cef54

box: introduce stacked diagnostic area · 3b887d04

Nikita Pettik authored 5 years ago

In terms of implementation, struct error objects can now be organized
into doubly linked lists. To achieve this, pointers to the next and
previous elements (cause and effect respectively) have been added to
struct error. It is worth mentioning that the already existing rlist and
stailq list implementations are not suitable: rlist is a circular list,
so it is impossible to start iterating from an arbitrary list entry and
finish at the logical end of the list; stailq is a singly linked list,
which leaves no way to remove elements from the middle of the list.

As a part of C interface, box_error_add() has been introduced. In
contrast to box_error_set() it does not replace last raised error, but
instead it adds error to the list of diagnostic errors having already
been set. If error is to be deleted (its reference counter hits 0 value)
it is unlinked from the list it belongs to and destroyed. Meanwhile,
error destruction leads to decrement of reference counter of its
previous error and so on.

To organize errors into lists in Lua, the table representing an error
object now has a .prev field (corresponding to the 'previous' error) and
a :set_prev(e) method. The latter accepts an error object (i.e. one created
via box.error.new() or box.error.last()) or nil. Both the .prev field and
the :set_prev() method are implemented as ffi functions. Also note that
cycles are not allowed while organizing errors into lists:
e1 -> e2 -> e3; e3:set_prev(e1) -- would lead to error.
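
The list rules described in this series (cycle rejection, an error being
'previous' to at most one other error, nil as a valid argument) can be
sketched as follows. toy_error and toy_error_set_prev() are hypothetical
names, and reference counting is omitted, so this is not the actual
struct error code.

```c
#include <assert.h>
#include <stddef.h>

struct toy_error {
	struct toy_error *cause;  /* the 'previous' error, or NULL */
	struct toy_error *effect; /* the error whose cause we are */
};

/*
 * Make prev the previous error of e. prev may be NULL.
 * Returns 0 on success, -1 if the link would create a cycle.
 */
static int
toy_error_set_prev(struct toy_error *e, struct toy_error *prev)
{
	assert(e != NULL);
	/* Walking the causes of prev must never reach e. */
	for (struct toy_error *it = prev; it != NULL; it = it->cause) {
		if (it == e)
			return -1;
	}
	/* Detach the old cause of e, keeping its own tail intact. */
	if (e->cause != NULL) {
		e->cause->effect = NULL;
		e->cause = NULL;
	}
	if (prev != NULL) {
		/* prev can be 'previous' to only one error at a time. */
		if (prev->effect != NULL)
			prev->effect->cause = NULL;
		prev->effect = e;
		e->cause = prev;
	}
	return 0;
}
```

Keeping both pointers is what makes unlinking from the middle of a list
a constant-time operation.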

Part of #1148

3b887d04

Apr 06, 2020

error: remove an unused global variable from diag.c · 8e7a2e11

Vladislav Shpilevoy authored 4 years ago

8e7a2e11

Mar 27, 2020

box/error: don't set error created via box.error.new to diag · eaa86088

Nikita Pettik authored 5 years ago

To achieve this, let's refactor luaT_error_create() to return the error
object instead of setting it via box_error_set().
luaT_error_create() is used to handle both box.error() and
box.error.new() invocations, and box.error() is still expected to set
the error to the diagnostic area. So luaT_error_call(), which implements
box.error() processing, calls diag_set_error() at the end.
It is worth mentioning that the net.box module relied on the fact that
box.error.new() set the error to the diagnostic area: otherwise request
errors would not get to the diagnostic area on the client side.

Needed for #1148
Closes #4778

@TarantoolBot document
Title: Don't promote error created via box.error.new to diagnostic area

Now box.error.new() only creates error object, but doesn't set it to
Tarantool's diagnostic area:
```
box.error.clear()
e = box.error.new({code = 111, reason = "cause"})
assert(box.error.last() == nil)
---
- true
...
```
To set an error in the diagnostic area explicitly, box.error.set() has been
introduced. It accepts an error object, which is set as the last system error
(i.e. becomes available via box.error.last()).
Finally, box.error.new() no longer accepts an error object as an
argument (this was an undocumented feature).
Note that the patch does not affect box.error(), which still pushes the error
to the diagnostic area. This fact is reflected in the docs:
'''
Emulate a request error, with text based on one of the pre-defined
Tarantool errors...
'''

eaa86088

Mar 26, 2020

fio: on close() don't lose fd if internal close fails · 177679a6

Vladislav Shpilevoy authored 5 years ago

The file descriptor was set to -1 regardless of whether
the object was closed properly. As a result, in case
of an error the descriptor would leak.

The GC finalizer of a descriptor is left intact so as not to
overcomplicate it.
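
The pattern can be sketched with an injectable close routine, so that a
failure of the 'internal close' can be simulated. Everything here
(toy_file, toy_close_f and the two stubs) is hypothetical and is not the
fio module's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical internal close routine: 0 on success, -1 on failure. */
typedef int (*toy_close_f)(int fd);

struct toy_file {
	int fd;
	toy_close_f do_close;
};

/* Stub that fails before the descriptor is released. */
static int
toy_close_always_fails(int fd)
{
	(void)fd;
	return -1;
}

/* Stub that pretends the descriptor was released. */
static int
toy_close_ok(int fd)
{
	(void)fd;
	return 0;
}

/* Forget the descriptor only after the internal close succeeded. */
static int
toy_file_close(struct toy_file *f)
{
	assert(f != NULL && f->do_close != NULL);
	if (f->fd < 0)
		return -1; /* already closed */
	if (f->do_close(f->fd) != 0)
		return -1; /* keep f->fd, so the caller can retry */
	f->fd = -1;
	return 0;
}
```

Since the number is kept on failure, the owner still knows which
descriptor is open and can attempt to close it again.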

177679a6

swim: use fiber._internal.schedule_task() for GC · f073834b

Vladislav Shpilevoy authored 5 years ago

swim object created a new fiber in its GC function, because C
function swim_delete() yields, and can't be called from an ffi.gc
hook.

It is not needed since the fiber module has a single worker
exactly for such cases. The patch uses it.

Follow up #4727

f073834b

fio: close unused descriptors automatically · 3d5b4daa

Vladislav Shpilevoy authored 5 years ago

fio.open() returned a file descriptor which was not closed
automatically after all references to it were dropped. In other words,
GC didn't close the descriptor.

This was inconvenient, because an exception may be raised after
fio.open(), and the user had to work around this to call
fio_object:close() manually. It was also inconsistent with io.open().

Now the object returned by fio.open() closes the descriptor
automatically when GCed.

Closes #4727

@TarantoolBot document
Title: fio descriptor is closed automatically by GC

fio.open() returns a descriptor which can be closed manually by
calling the :close() method; otherwise it is closed automatically when
it has no references left and GC deletes it.

The :close() method has always existed; automatic GC was added just now.

Keep in mind that the number of file descriptors is limited, and
they can run out before GC is triggered to collect unused
descriptors. It is always better to close them manually as
soon as possible.

3d5b4daa

fiber: introduce schedule_task() internal function · 8443bd93

Vladislav Shpilevoy authored 5 years ago

fiber._internal.schedule_task() is an API for a singleton fiber
worker object. It serves for non-urgent delayed execution of
functions. Its main purpose is to schedule execution of a function
that is going to yield from a context where a yield is not allowed,
such as an FFI object's GC callback.

It will be used by SWIM and by fio, whose destruction yields, but
which need to use a GC finalizer, where a yield is not allowed.
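
The deferral idea can be modelled with a plain task queue. This is a
sketch with hypothetical toy_* names and a fixed-size array. The real
worker is a fiber, which this single-threaded model does not reproduce.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_task_f)(void *arg);

enum { TOY_QUEUE_CAP = 16 };

struct toy_worker {
	toy_task_f funcs[TOY_QUEUE_CAP];
	void *args[TOY_QUEUE_CAP];
	size_t count;
};

/*
 * Called from a context where yielding is not allowed: only record
 * the task, do not run it. Returns -1 when the queue is full.
 */
static int
toy_schedule_task(struct toy_worker *w, toy_task_f f, void *arg)
{
	assert(w != NULL && f != NULL);
	if (w->count == TOY_QUEUE_CAP)
		return -1;
	w->funcs[w->count] = f;
	w->args[w->count] = arg;
	w->count++;
	return 0;
}

/* Called later from the worker's own context: run tasks in order. */
static size_t
toy_worker_run(struct toy_worker *w)
{
	size_t done = 0;
	for (; done < w->count; done++)
		w->funcs[done](w->args[done]);
	w->count = 0;
	return done;
}

/* Example task: increment the integer passed as an argument. */
static void
toy_increment(void *arg)
{
	++*(int *)arg;
}
```

A finalizer would call only the scheduling function, leaving the
yielding destructor to run when the worker gets control.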

Part of #4727

8443bd93

box/error: introduce box.error.set() method · f00945a1

Nikita Pettik authored 5 years ago

box.error.set(err) sets err to the instance's diagnostics area. The
argument err is supposed to be an instance of an error object. This method
is required since we are going to stop adding errors created via
box.error.new() to Tarantool's diagnostic area.

Needed for #1148
Part of #4778

f00945a1

box: rename diag_add_error to diag_set_error · 55a39946

Kirill Shcherbatov authored 5 years ago

Let's rename diag_add_error() to diag_set_error() because it actually
replaces the error object in the diagnostic area with a new one, so the
old name is not representative. Moreover, we are going to introduce a new
diag_add_error() which will place an error at the top of the stacked
diagnostic area.

Needed for #1148

55a39946

popen: do not require space for shell args · c6b8ed77

Cyrill Gorcunov authored 5 years ago


In case of direct execution without a shell there
is no need to require the caller to allocate redundant
space; let's pass the executable name in the first argument.

Since this is still a testing API, we're allowed to change it
without breaking anything.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

c6b8ed77

Mar 23, 2020

evio: workaround for wsl1 so_linger assertion · 734bcafc

Timur Safin authored 5 years ago


SO_LINGER does not make much sense for unix sockets, and Microsoft WSL
returns EINVAL if setsockopt is called for SO_LINGER on unix
sockets:

  [004] 2020-03-11 18:42:29.592 [29182] main/102/app sio.c:169 !> SystemError setsockopt(SO_LINGER), called on fd 16, aka
  [004] 2020-03-11 18:42:29.592 [29182] main/102/app F> can't initialize storage: setsockopt(SO_LINGER), called on fd 16,
  [004] 2020-03-11 18:42:29.592 [29182] main/102/app F> can't initialize storage: setsockopt(SO_LINGER), called on fd 16,

And it's sort of correct here, but the problem is that Linux simply
ignores it silently, which lets the tests pass.

After much debate we decided to work around this case via a CMake
define.

NB! In the future (April/May 2020), when WSL2 with a full Linux kernel
is released, we should disable this check.
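
For illustration, here is a run-time variant of such a workaround. Note
that it is a different technique from the compile-time CMake define
chosen in this commit, and toy_set_linger_off() is a hypothetical helper.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/*
 * Disable lingering on a socket. For unix sockets an EINVAL reply
 * is tolerated, since the option is meaningless there anyway.
 * Returns 0 on success, -1 on any other failure.
 */
static int
toy_set_linger_off(int fd, int is_unix_socket)
{
	assert(fd >= 0);
	struct linger linger = { .l_onoff = 0, .l_linger = 0 };
	if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &linger,
		       sizeof(linger)) == 0)
		return 0;
	if (is_unix_socket && errno == EINVAL)
		return 0;
	return -1;
}
```

The compile-time approach avoids the extra system call altogether, while
the run-time check keeps a single binary usable on both platforms.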

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

734bcafc

Mar 20, 2020

vinyl: update mem ptr in vy_build_insert_tuple() after yield · 17f6af7d

Nikita Pettik authored 5 years ago

vy_build_insert_tuple() processes insertion into secondary indexes being
created. It contains yield points during which the in-memory level of the
LSM tree may change (for example, rotate owing to a triggered dump). So
after a yield point the pointer to mem must be fetched from the LSM struct
again to operate on valid metadata. This patch updates the pointer to mem
after the mentioned yield point.

Closes #4810

17f6af7d

box/journal: redesign journal operations · 77ba0e35

Cyrill Gorcunov authored 5 years ago


Currently the journal provides only one method -- write,
which implies a callback to trigger upon write completion
(in contrast with the 1.10 series, where all commits were
processed synchronously).

Let's make the difference between sync and async writes more
notable: provide a journal::write_async method which runs
a completion function once the entry is written, while
journal::write handles the transaction synchronously.

Redesign notes:

1) The callback for async write is set once at journal
   creation. There is no need to carry the callback in
   every journal entry. This allows us to save some
   memory;

2) txn_commit and txn_commit_async call txn_rollback
   where appropriate;

3) no need to call journal_entry_complete on sync
   writes anymore;

4) wal_write_in_wal_mode_none is too long, renamed
   to wal_write_none;

5) the wal engine uses async writes internally but it is
   transparent to callers.
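
The resulting shape of the interface can be sketched as a small virtual
table. All toy_* names are hypothetical. The async variant here
completes immediately, whereas a real journal would invoke the callback
later, once the entry is actually written.

```c
#include <assert.h>
#include <stddef.h>

struct toy_entry {
	int res; /* write result: 0 on success */
};

typedef void (*toy_on_write_f)(struct toy_entry *entry);

struct toy_journal {
	/* Completion callback: set once at creation, not per entry. */
	toy_on_write_f on_async_write;
	int (*write)(struct toy_journal *j, struct toy_entry *e);
	int (*write_async)(struct toy_journal *j, struct toy_entry *e);
};

/* Sync write: the result is known on return, no callback is run. */
static int
toy_write_sync(struct toy_journal *j, struct toy_entry *e)
{
	(void)j;
	e->res = 0;
	return 0;
}

/* Async write: the completion callback reports the result. */
static int
toy_write_async(struct toy_journal *j, struct toy_entry *e)
{
	e->res = 0;
	j->on_async_write(e);
	return 0;
}

static void
toy_journal_create(struct toy_journal *j, toy_on_write_f on_async_write)
{
	assert(j != NULL && on_async_write != NULL);
	j->on_async_write = on_async_write;
	j->write = toy_write_sync;
	j->write_async = toy_write_async;
}

/* Example completion callback: count completed async entries. */
static int toy_completed;

static void
toy_count_completion(struct toy_entry *e)
{
	(void)e;
	toy_completed++;
}
```

Because the callback lives in the journal, an entry needs no function
pointer of its own, which is the memory saving mentioned in note 1.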

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

77ba0e35

box/txn: move journal allocation into separate routine · 5d6e71e3

Cyrill Gorcunov authored 5 years ago


This makes the code easier to read and allows reusing
txn allocation in sync/async writes.

Acked-by: Konstantin Osipov <kostja.osipov@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

5d6e71e3