Commits · d3a7dd17d494770b57bbac4e462f6423c82ea22b · core / tarantool

May 08, 2020

static build: dockerfile entrypoint set to exec form · d3a7dd17

HustonMmmavr authored 4 years ago

According to the Dockerfile reference, there are two forms of specifying
an entrypoint: exec and shell. The exec form is preferable and allows
using this image in scripts.

Close #4960

d3a7dd17

gitlab-ci: add Catalina OSX 10.15 · 76157ef6

Alexander V. Tikhonov authored 5 years ago

Added Catalina OSX 10.15 to gitlab-ci testing and removed OSX 10.13,
because it was decided to keep only the last 2 major releases; for now
these are OSX 10.14 and 10.15. Also switched the commit job for branches
from OSX 10.14 to 10.15.

Additional cleanup for 'box_return_mp' and 'box_session_push':
added API_EXPORT, which defines nothrow, so the compiler warns or errors
depending on the build options.

Part of #4885
Close #4873

76157ef6

test: mark tests as fragile in a test's configs · faf7e482

Alexander V. Tikhonov authored 4 years ago

Marked flaky tests from parallel runs as fragile to avoid
flaky failures in regular testing:

  box-py/snapshot.test.py                ; gh-4514
  replication/misc.test.lua              ; gh-4940
  replication/skip_conflict_row.test.lua ; gh-4958
  replication-py/init_storage.test.py    ; gh-4949
  vinyl/stat.test.lua                    ; gh-4951
  xlog/checkpoint_daemon.test.lua        ; gh-4952

Part of #4953

faf7e482

gitlab-ci: keep perf results as gitlab-ci artifacts · eeb501ec

Oleg Piskunov authored 4 years ago

Modified the gitlab-ci pipeline to keep
performance results as gitlab-ci artifacts.

Closes #4920

eeb501ec

wal: simplify rollback · a4f4adeb

Georgy Kirichenko authored 5 years ago

Here is a summary on how and when rollback works in WAL.

A disk write failure can cause a rollback. In that case the failed
transaction and all subsequent transactions already sent to WAL
should be rolled back together. The subsequent transactions should
be rolled back too, because they could make their statements based
on what they saw in the failed transaction. Also, rolling back the
failed transaction without rolling back the next ones can actually
rewrite what they committed.

So when a rollback is started, *all* pending transactions should be
rolled back. However, if they kept coming, the rollback would
never end. This means that to complete a rollback it is necessary to
stop sending new transactions to WAL, then roll back all those
already sent, and in the end allow new transactions again.

Step-by-step:

1) stop accepting new transactions in the WAL thread, where the
rollback is started. New transactions don't even try to go to
disk. They are added to the rollback queue immediately after
arriving at the WAL thread.

2) tell the TX thread to stop sending new transactions to WAL, so
that the rollback queue stops growing.

3) roll back all transactions in reverse order.

4) allow transactions again in WAL thread and TX thread.
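
A minimal single-threaded sketch of the four steps above. It is an
illustration only: `wal_sketch`, `wal_submit()` and
`wal_finish_rollback()` are hypothetical names, not the actual WAL
code, and the two threads are collapsed into plain function calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { QUEUE_MAX = 16 };

/* Hypothetical, simplified WAL state; not Tarantool's real structures. */
struct wal_sketch {
	/* Steps 1-2: while set, new transactions are refused. */
	bool in_rollback;
	/* Transactions waiting to be rolled back, in arrival order. */
	int rollback_queue[QUEUE_MAX];
	size_t queue_len;
};

/*
 * A transaction arrives at the WAL. Returns true if it was written.
 * A failed write starts the rollback; later arrivals skip the disk
 * and go straight to the rollback queue.
 */
static bool
wal_submit(struct wal_sketch *wal, int txn_id, bool disk_ok)
{
	assert(wal != NULL && wal->queue_len < QUEUE_MAX);
	if (!wal->in_rollback && disk_ok)
		return true;
	wal->in_rollback = true;
	wal->rollback_queue[wal->queue_len++] = txn_id;
	return false;
}

/*
 * Steps 3-4: undo queued transactions in reverse order, then accept
 * transactions again. Stores the undone ids in 'out' and returns
 * their number.
 */
static size_t
wal_finish_rollback(struct wal_sketch *wal, int *out)
{
	assert(wal != NULL && out != NULL);
	size_t n = 0;
	while (wal->queue_len > 0)
		out[n++] = wal->rollback_queue[--wal->queue_len];
	wal->in_rollback = false;
	return n;
}
```

Note that a transaction submitted after the failure is rolled back
even if the disk would have accepted it.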

The algorithm is long, but simple and understandable. However, the
implementation wasn't so easy. It was done using a 4-hop cbus
route, 2 hops of which were supposed to clear the cbus channel of
all other cbus messages. The next two hops implemented steps 3 and 4.
The rollback state of the WAL was signaled by checking internals of a
preallocated cbus message.

The patch makes it simpler and more straightforward. The rollback
state is now signaled by a simple flag, and there is no hack
about clearing the cbus channel and no touching of attributes of a
cbus message. The moment when all transactions are stopped and the
last one has returned from WAL is visible explicitly, because the
last journal entry sent to WAL is saved.

Also there is a single route for commit and rollback cbus
messages now, called tx_complete_batch(). This change will come
in handy in the scope of synchronous replication, when a WAL write
won't be enough for a commit. Therefore 'commit' as a concept should
be gradually washed away from the WAL code and migrate solely to the
txn module.

a4f4adeb

console: check on_shutdown() before exit · c7341a3d

Roman Khabibov authored 5 years ago

Add check that on_shutdown() triggers were called before exit,
because in case of EOF or Ctrl+D (no signals) they were ignored.

Closes #4703

c7341a3d

May 07, 2020

vinyl: init all vars before cleanup in vy_lsm_split_range() · 4dcba1b5

Nikita Pettik authored 5 years ago

If vy_key_from_msgpack() fails in vy_lsm_split_range(), the clean-up
procedure is called. However, at this moment struct vy_range *parts[2]
is not initialized, ergo contains garbage, and access to this structure
may result in crash, segfault or disk formatting. Let's move
initialization of the mentioned variables before the call of
vy_key_from_msgpack() in vy_lsm_split_range().
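
The pattern can be sketched as follows. This is a hypothetical
reduction, not the vinyl code: `range_sketch` and
`split_range_sketch()` are made-up names, and `fail_early` merely
emulates a failure that happens before any part has been allocated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vy_range. */
struct range_sketch {
	int id;
};

/* Returns 0 on success, -1 on failure. */
static int
split_range_sketch(int fail_early)
{
	/* Initialized before the first 'goto out', so cleanup is safe. */
	struct range_sketch *parts[2] = {NULL, NULL};
	int rc = -1;

	if (fail_early)
		goto out;
	for (int i = 0; i < 2; i++) {
		parts[i] = malloc(sizeof(*parts[i]));
		if (parts[i] == NULL)
			goto out;
		parts[i]->id = i;
	}
	assert(parts[0] != NULL && parts[1] != NULL);
	rc = 0;
out:
	/* free(NULL) is a no-op, so untouched slots are harmless. */
	for (int i = 0; i < 2; i++)
		free(parts[i]);
	return rc;
}
```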

Part of #4864

4dcba1b5

May 01, 2020

gitlab-ci: add Ubuntu Focal to S3 list · 9f281978

Alexander V. Tikhonov authored 4 years ago

Commit 'travis-ci/gitlab-ci: add Ubuntu Focal 20.04' forgot to add
Ubuntu Focal to the list of the available Ubuntu distributions in
the script for saving built packages at S3.

Follow up #4863

Unverified

9f281978

Apr 30, 2020

travis-ci/gitlab-ci: add Ubuntu Focal 20.04 · 765f338e

Alexander V. Tikhonov authored 4 years ago

Closes #4863

Unverified

765f338e

Code cleanup: sync declarations and definitions · b136a61e

Sergey Ostanevich authored 5 years ago

API_EXPORT defines nothrow, so compiler warns or errors depending on the
build options.

Closes #4885

b136a61e

test: fix flaky replication/skip_conflict_row test · f81dae2d

Alexander V. Tikhonov authored 4 years ago

Fixed flaky upstream checks in the replication/skip_conflict_row test.
Also, the check on lsn is now done in a test-run wait condition routine.

Errors fixed:

[024] @@ -66,11 +66,11 @@
[024]  ...
[024]  box.info.replication[1].upstream.message
[024]  ---
[024] -- null
[024] +- timed out
[024]  ...
[024]  box.info.replication[1].upstream.status
[024]  ---
[024] -- follow
[024] +- disconnected
[024]  ...
[024]  box.space.test:select()
[024]  ---
[024]

[004] @@ -125,11 +125,11 @@
[004]  ...
[004]  box.info.replication[1].upstream.message
[004]  ---
[004] -- Duplicate key exists in unique index 'primary' in space 'test'
[004] -...
[004] -box.info.replication[1].upstream.status
[004] ----
[004] -- stopped
[004] +- null
[004] +...
[004] +box.info.replication[1].upstream.status
[004] +---
[004] +- follow
[004]  ...
[004]  test_run:cmd("switch default")
[004]  ---
[004]

[038] @@ -174,7 +174,7 @@
[038]  ...
[038]  box.info.replication[1].upstream.status
[038]  ---
[038] -- follow
[038] +- disconnected
[038]  ...
[038]  -- write some conflicting records on slave
[038]  for i = 1, 10 do box.space.test:insert({i, 'r'}) end

Line 201 (often):

[039] @@ -201,7 +201,7 @@
[039]  -- lsn should be incremented
[039]  v1 == box.info.vclock[1] - 10
[039]  ---
[039] -- true
[039] +- false
[039]  ...
[039]  -- and state is follow
[039]  box.info.replication[1].upstream.status
[039]

[030] @@ -201,12 +201,12 @@
[030]  -- lsn should be incremented
[030]  v1 == box.info.vclock[1] - 10
[030]  ---
[030] -- true
[030] +- false
[030]  ...
[030]  -- and state is follow
[030]  box.info.replication[1].upstream.status
[030]  ---
[030] -- follow
[030] +- disconnected
[030]  ...
[030]  -- restart server and check replication continues from nop-ed vclock
[030]  test_run:cmd("switch default")

Line 230 (OSX):

[022] --- replication/skip_conflict_row.result	Thu Apr 16 21:54:28 2020
[022] +++ replication/skip_conflict_row.reject	Mon Apr 27 00:52:56 2020
[022] @@ -230,7 +230,7 @@
[022]  ...
[022]  box.info.replication[1].upstream.status
[022]  ---
[022] -- follow
[022] +- disconnected
[022]  ...
[022]  box.space.test:select({11}, {iterator = "GE"})
[022]  ---
[022]

Close #4457

f81dae2d

Apr 29, 2020

travis-ci: don't deploy 2.5+ pkgs to packagecloud · ba206b48

Alexander Turenko authored 4 years ago

Now we have S3 based infrastructure for RPM / Deb packages and GitLab CI
pipelines, which deploy packages to it.

We don't plan to add 2.5+ repositories on packagecloud.io, so instead of
the usual change of the target bucket from 2_N to 2_(N+1), the deploy
stage is removed.

Since all distro specific jobs are duplicated in GitLab CI pipelines and
those Travis-CI jobs are needed just for deployment, it is worth removing
them too.

Follows up #3380.
Part of #4947.

Unverified

ba206b48

Apr 28, 2020

schema: fix internal symbols dangling in _G · b56484d6

Vladislav Shpilevoy authored 5 years ago

A couple of functions were mistakenly declared as 'function'
instead of 'local function' in schema.lua. That led to their
presence in the global namespace.

Closes #4812

b56484d6

schema: fix index promotion to functional index · fcce05a4

Vladislav Shpilevoy authored 5 years ago

When index:alter() was called on a non-functional index with
'func' specified, it led to accessing an undeclared variable in
schema.lua.

fcce05a4

box: replace port_tuple with port_c everywhere · 4d82478f

Vladislav Shpilevoy authored 4 years ago

Port_tuple is exactly the same as port_c, but is not able to store
raw MessagePack. In theory it sounds like port_tuple should be a
bit simpler and therefore faster, but in fact it is not.
Microbenchmarks didn't reveal any difference. So port_tuple is no
longer needed, all its functionality is covered by port_c.

Follow up #4641

4d82478f

box: introduce box_return_mp() public C function · dd36c610

Vladislav Shpilevoy authored 5 years ago

Closes #4641

@TarantoolBot document
Title: box_return_mp() public C function

Stored C functions could return a result only via
`box_return_tuple()` function. That made users create a tuple
every time they wanted to return something from a C function.

Now the public C API offers another way to return - `box_return_mp()`.
It allows returning arbitrary MessagePack, not wrapped into a
tuple object. This is simpler to use for small results like a
number, boolean, or a short string. Besides, `box_return_mp()` is
much faster than `box_return_tuple()`, especially for small
MessagePack.

Note that it is faster only if the alternative is to create a
tuple by yourself. If an already existing tuple was obtained from
an iterator, and you want to return it, then of course it is
faster to return it via `box_return_tuple()` than to extract the
tuple data and call `box_return_mp()`.

Here is the function declaration from module.h:
```C
/**
 * Return MessagePack from a stored C procedure. The MessagePack
 * is copied, so it is safe to free/reuse the passed arguments
 * after the call.
 * MessagePack is not validated, for the sake of speed. It is
 * expected to be a single encoded object. An attempt to encode
 * and return multiple objects without wrapping them into an
 * MP_ARRAY or MP_MAP is undefined behaviour.
 *
 * \param ctx An opaque structure passed to the stored C procedure
 *        by Tarantool.
 * \param mp Begin of MessagePack.
 * \param mp_end End of MessagePack.
 * \retval -1 Error.
 * \retval 0 Success.
 */
API_EXPORT int
box_return_mp(box_function_ctx_t *ctx, const char *mp, const char *mp_end);
```
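
As an illustration of the "single encoded object" requirement, here
is a sketch that packs several small integers into one MessagePack
array. `encode_small_uint_array()` is a hypothetical hand-rolled
helper written for this example; it relies only on the MessagePack
fixarray (0x90 | count) and positive fixint (0x00..0x7f) formats.
Real code would more likely use an existing encoder library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode up to 15 integers in the range 0..127 as ONE MessagePack
 * object: a fixarray header followed by positive fixints.
 * Returns the end of the encoded data.
 * Hypothetical helper, for illustration only.
 */
static char *
encode_small_uint_array(char *buf, const uint8_t *values, size_t count)
{
	assert(buf != NULL && count <= 15);
	*buf++ = (char)(0x90 | count);
	for (size_t i = 0; i < count; i++) {
		assert(values[i] <= 0x7f);
		*buf++ = (char)values[i];
	}
	return buf;
}
```

In a stored C procedure, the buffer start and the returned end
pointer would be passed as the `mp` and `mp_end` arguments of
`box_return_mp()`.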

dd36c610

box: introduce port_c · 4c3c9bda

Vladislav Shpilevoy authored 5 years ago

Port_c is a new descendant of struct port. It is used now for
public C functions to store their result. Currently they can
return only a tuple, but it will change soon, they will be able to
return arbitrary MessagePack.

Port_tuple is not removed, because it is still used for box_select(),
for functional indexes, and in SQL as a base for port_sql,
although that may be changed later. Functional indexes really need
only a single MessagePack object from their function, while
box_select() working via port_tuple or port_c didn't show any
significant difference during micro benchmarks.

Part of #4641

4c3c9bda

Apr 27, 2020

applier: follow vclock to the last tx row · 0edb4d97

Serge Petrenko authored 4 years ago


Since the introduction of transaction boundaries in replication
protocol, appliers follow replicaset.applier.vclock to the lsn of the
first row in an arrived batch. This is enough and doesn't lead to errors
when replicating from other instances, respecting transaction boundaries
(instances with version 2.1.2 and up).

However, if there's a 1.10 instance in a 2.1.2+ cluster, it sends every
single tx row as a separate transaction, breaking the comparison with
replicaset.applier.vclock and making the applier apply part of the
changes it has already applied when processing a full transaction
coming from another 2.x instance. Such behaviour leads to
ER_TUPLE_FOUND errors in the scenario described above.

In order to guard from such cases, follow replicaset.applier.vclock to
the lsn of the last row in tx.
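
A simplified sketch of the idea, assuming a single origin instance
so that the vclock collapses to one LSN. The names
`applier_clock_sketch` and `applier_accept_tx()` are hypothetical
and do not correspond to the applier code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical single-component "vclock": the highest LSN the
 * applier has agreed to apply for one origin instance.
 */
struct applier_clock_sketch {
	int64_t lsn;
};

/*
 * Decide whether a batch of rows (LSNs in ascending order) must be
 * applied. When it is accepted, the clock follows the LSN of the
 * LAST row, so a later batch that repeats any row of this
 * transaction is recognized as already applied.
 */
static bool
applier_accept_tx(struct applier_clock_sketch *clock,
		  const int64_t *row_lsns, size_t row_count)
{
	assert(clock != NULL && row_lsns != NULL && row_count > 0);
	if (row_lsns[0] <= clock->lsn)
		return false;
	clock->lsn = row_lsns[row_count - 1];
	return true;
}
```

If the clock followed only the first row (LSN 1 of a transaction
with LSNs 1..3), a separately delivered row with LSN 2 would pass
the check and be applied a second time.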

Closes #4924

Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>

0edb4d97

sql: fix sorting rules for values of SCALAR type · 72ce442c

Roman Khabibov authored 5 years ago

The function implementing comparison during the VDBE sorting routine
(sqlVdbeCompareMsgpack) did not account for values of boolean type in
some cases. Let's fix it so that booleans always precede numbers if they
are sorted in ascending order.
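
The intended ordering can be sketched with a qsort() comparator.
This is not sqlVdbeCompareMsgpack(): `scalar_sketch` and
`scalar_cmp()` are hypothetical, only booleans and numbers are
modeled, and placing false before true is an assumption of the
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical scalar value: either a boolean or a number. */
struct scalar_sketch {
	bool is_bool;
	double num;	/* Used when is_bool is false. */
	bool flag;	/* Used when is_bool is true. */
};

/* qsort() comparator: booleans first, then numbers by value. */
static int
scalar_cmp(const void *a, const void *b)
{
	const struct scalar_sketch *l = a, *r = b;
	assert(l != NULL && r != NULL);
	if (l->is_bool != r->is_bool)
		return l->is_bool ? -1 : 1;
	if (l->is_bool)
		return (int)l->flag - (int)r->flag;
	return (l->num > r->num) - (l->num < r->num);
}
```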

Closes #4697

72ce442c

Apr 24, 2020

cbus: fix inconsistency in endpoint creation · d6d69c9f

Cyrill Gorcunov authored 4 years ago


A condition variable shall be signaled with its bound mutex
locked. Otherwise the results are not guaranteed (see pthread
manuals).

Thus when we create a new endpoint via cbus_endpoint_create
and there is another thread which sleeps inside cpipe_create,
we should notify the sleeper under cbus.mutex.
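
A minimal sketch of the rule with plain pthreads. The structure
`bus_sketch` and the three functions are hypothetical and much
simpler than cbus; the point is only that the predicate change and
the notification happen while the mutex is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the cbus registry. */
struct bus_sketch {
	pthread_mutex_t mutex;
	pthread_cond_t cond;
	bool endpoint_ready;
};

static void
bus_init_sketch(struct bus_sketch *bus)
{
	assert(bus != NULL);
	pthread_mutex_init(&bus->mutex, NULL);
	pthread_cond_init(&bus->cond, NULL);
	bus->endpoint_ready = false;
}

/* Creator side: change the predicate and notify under the mutex. */
static void
endpoint_create_sketch(struct bus_sketch *bus)
{
	assert(bus != NULL);
	pthread_mutex_lock(&bus->mutex);
	bus->endpoint_ready = true;
	pthread_cond_broadcast(&bus->cond);
	pthread_mutex_unlock(&bus->mutex);
}

/* Sleeper side: re-check the predicate in a loop under the same mutex. */
static void
wait_endpoint_sketch(struct bus_sketch *bus)
{
	assert(bus != NULL);
	pthread_mutex_lock(&bus->mutex);
	while (!bus->endpoint_ready)
		pthread_cond_wait(&bus->cond, &bus->mutex);
	pthread_mutex_unlock(&bus->mutex);
}
```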

Fixes #4806

Reported-by: Alexander Turenko <alexander.turenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

d6d69c9f

build: fix compilation on Alpine 3.5 · d7fa6d34

Leonid Vasiliev authored 4 years ago

The cbus hang test uses glibc pthread mutex implementation details.
The reason why mutex implementation details are used:
"For the bug reproducing the canceled thread must be canceled
during processing cpipe_flush_cb. We need to synchronize
the main thread and the canceled worker thread for that.
So, thread synchronization has been realized by means of
endpoint's mutex internal field(__data.__lock)."
Therefore, the test should not be compiled when another C library
is used.

d7fa6d34

Apr 21, 2020

say: fix syslog format · 09832455

Olga Arkhangelskaia authored 5 years ago

During the refactoring of the say module in commit
5db765a7 (say: fix non-informative error
messages for log cfg), the syslog format was broken.

Closes #4785

Unverified

09832455

Apr 20, 2020

Dummy commit · ad13b6d5

Kirill Yukhin authored 4 years ago

ad13b6d5

box/error: ref error.prev while accessing it · fef6505c

Nikita Pettik authored 5 years ago

If accessing the previous error is not accompanied by incrementing
its reference counter, it may lead to a use-after-free bug.
Consider the following scenario:

_, err = foo() -- foo() returns diagnostic error stack
preve = err.prev -- err.prev ref. counter == 1
err:set_prev(nil) -- err.prev ref. counter == 0 so err.prev is destroyed
preve -- accessing already freed memory

To avoid that let's increment reference counter of .prev member while
calling error.prev and set corresponding gc finalizer (error_unref()).
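
The ownership rule can be sketched in C as below. The type
`err_sketch` and its helpers are hypothetical and far simpler than
struct error; the accessor `err_get_prev()` plays the role of
`error.prev` and hands out its own reference, which the caller
later drops with `err_unref()` (the job of the GC finalizer).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, minimal reference-counted error; not struct error. */
struct err_sketch {
	int refs;
	struct err_sketch *prev;
};

static struct err_sketch *
err_new(void)
{
	struct err_sketch *e = calloc(1, sizeof(*e));
	assert(e != NULL);
	return e;
}

static void
err_ref(struct err_sketch *e)
{
	e->refs++;
}

static void
err_unref(struct err_sketch *e)
{
	assert(e->refs > 0);
	if (--e->refs == 0) {
		if (e->prev != NULL)
			err_unref(e->prev);
		free(e);
	}
}

/* Link 'prev' behind 'e'; the link itself owns one reference. */
static void
err_set_prev(struct err_sketch *e, struct err_sketch *prev)
{
	if (prev != NULL)
		err_ref(prev);
	if (e->prev != NULL)
		err_unref(e->prev);
	e->prev = prev;
}

/* Accessor that takes a reference on behalf of the caller. */
static struct err_sketch *
err_get_prev(struct err_sketch *e)
{
	if (e->prev != NULL)
		err_ref(e->prev);
	return e->prev;
}
```

With this accessor, unlinking the previous error leaves the
caller's pointer valid until the caller releases it.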

Closes #4887

fef6505c

box/error: don't allow overflow of error ref counter · ca79e1cf

Nikita Pettik authored 5 years ago

There's no overflow check while incrementing error's reference counter
in error_ref(). Meanwhile, stubborn users still may achieve overflow:
each call of box.error.last() increments the reference counter of the
error residing in the diagnostic area. As a result, 2^32 calls of
box.error.last() in a row will lead to counter overflow and, ergo, to
unpredictable results. Let's fix it and introduce a dummy check in
error_ref().

ca79e1cf

test: fix LTO build for popen unit test · c44ed3c0

Kirill Yukhin authored 4 years ago

Fix a clash in struct names detected by LTO.

c44ed3c0

Fix build after popen patchset · a46611a3

Kirill Yukhin authored 4 years ago

Make older compilers happy w.r.t.
initialization of locals.

a46611a3

build: temporary fix for luajit-tap tests cmake · dfba0512

Igor Munkin authored 4 years ago

Fixes the regression from 335f80a0
('test: adjust luajit-tap testing machinery').

dfba0512

error: fix iproto error stack overlapped by old error · a219e258

Leonid Vasiliev authored 4 years ago

Fix possible overlap of IPROTO_ERROR by IPROTO_ERROR_24.
This was possible because the keys are transmitted in a map and
their order is not defined. IPROTO_ERROR_24 could be parsed after
the IPROTO_ERROR, and could throw it away.
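
An order-independent decoder can be sketched like this. Everything
here is hypothetical (`key_sketch`, `response_sketch`,
`on_error_key()`); it only illustrates that the legacy message must
never replace an already decoded full error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical key ids standing for the two iproto error keys. */
enum key_sketch {
	KEY_ERROR_24,	/* Legacy: plain error message. */
	KEY_ERROR,	/* New: full error stack. */
};

struct response_sketch {
	const char *error;	/* Chosen error description. */
	bool has_full_error;	/* Set once KEY_ERROR was seen. */
};

/* Handle one map entry; entries may arrive in any order. */
static void
on_error_key(struct response_sketch *r, enum key_sketch key,
	     const char *val)
{
	assert(r != NULL && val != NULL);
	if (key == KEY_ERROR) {
		r->error = val;
		r->has_full_error = true;
	} else if (!r->has_full_error) {
		/* The legacy message is only a fallback. */
		r->error = val;
	}
}
```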

a219e258

iproto: rename IPROTO_ERROR and IPROTO_ERROR_STACK · 2b4263d3

Vladislav Shpilevoy authored 4 years ago

IPROTO_ERROR in fact is not an error. It is an error message.
Secondly, this key is deprecated in favor of IPROTO_ERROR_STACK,
which contains all attributes of the whole error stack. It uses
MP_ERROR MessagePack extension for that.

So IPROTO_ERROR is renamed to IPROTO_ERROR_24 (similar to how old
call was renamed to IPROTO_CALL_16). IPROTO_ERROR_STACK becomes
new IPROTO_ERROR.

Follow up #4398

2b4263d3

error: make iproto errors reuse mp_error module · 712af455

Vladislav Shpilevoy authored 5 years ago

After error objects marshaling was implemented in #4398, there
were essentially 2 versions of the marshaling - when an error is
sent inside response body, and when it is thrown and is encoded
in iproto fields IPROTO_ERROR and IPROTO_ERROR_STACK. It is not
really useful to have 2 implementations of the same feature. This
commit drops the old iproto error encoding (its IPROTO_ERROR_STACK
part), and makes it reuse the common error encoder.

Note, the encoder skips MP_EXT header. This is because

* The header is not needed - error is encoded as a value of
  IPROTO_ERROR_STACK key, so it is known this is an error. MP_EXT
  is needed only when type is unknown on decoding side in advance;

* Old clients may not expect MP_EXT in iproto fields. That is the
  case of netbox connector, at least.

Follow up #4398

@TarantoolBot document
Title: Stacked diagnostics binary protocol
Stacked diagnostics is described in detail in
https://github.com/tarantool/doc/issues/1224. This commit
changes nothing except binary protocol. The old protocol should
not be documented anywhere.

`IPROTO_ERROR_STACK` is still 0x52, but format of its value is
different now. It looks exactly like `MP_ERROR` object, without
`MP_EXT` header.

```
IPROTO_ERROR_STACK: <MP_MAP> {
    MP_ERROR_STACK: <MP_ARRAY> [
        <MP_MAP> {
            ... <all the other fields of MP_ERROR> ...
        },
        ...
    ]
}
```

It is easy to see that key `IPROTO_ERROR_STACK` is called
'stack', and `MP_ERROR_STACK` is also 'stack'. So it may be good
to rename the former key in the documentation. For example, the
old `IPROTO_ERROR` can be renamed to `IPROTO_ERROR_24` and
`IPROTO_ERROR_STACK` can be renamed to just `IPROTO_ERROR`.

712af455

error: export error_unref() function · ed217292

Vladislav Shpilevoy authored 5 years ago

C struct error objects can be created directly only in C.
The C side increments their reference counter when it pushes them
to the Lua stack.

It is not going to be so convenient soon. The error_unpack() function
will be used in netbox to decode an error object via Lua FFI.

Such an error object will have 0 refs and no Lua GC callback
established, because it won't be pushed to the Lua stack naturally,
from Lua C. To keep such errors alive their reference counter
will be incremented and error_unref() will be set as the GC callback.

Follow up for #4398

ed217292

error: add error MsgPack encoding · 345877df

Leonid Vasiliev authored 5 years ago


Co-authored-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>

Closes #4398

@TarantoolBot document
Title: Error objects encoding in MessagePack

Until now an error sent over IProto, or serialized into
MessagePack was turned into a string consisting of the error
message. As a result, all other error object attributes were lost,
including type of the object. On client side seeing a string it
was not possible to tell whether the string is a real string, or
it is a serialized error.

To deal with that, the error objects encoding is reworked from
scratch. Now, when session setting `error_marshaling_enabled` is
true, all fibers of that session will encode error objects as a
new MP_EXT type - MP_ERROR (0x03).

```
    +--------+----------+========+
    | MP_EXT | MP_ERROR | MP_MAP |
    +--------+----------+========+

    MP_ERROR: <MP_MAP> {
        MP_ERROR_STACK: <MP_ARRAY> [
            <MP_MAP> {
                MP_ERROR_TYPE: <MP_STR>,
                MP_ERROR_FILE: <MP_STR>,
                MP_ERROR_LINE: <MP_UINT>,
                MP_ERROR_MESSAGE: <MP_STR>,
                MP_ERROR_ERRNO: <MP_UINT>,
                MP_ERROR_CODE: <MP_UINT>,
                MP_ERROR_FIELDS: <MP_MAP> {
                    <MP_STR>: ...,
                    <MP_STR>: ...,
                    ...
                },
                ...
            },
            ...
        ]
    }
```

On the top level there is a single key: `MP_ERROR_STACK = 0x00`.
More keys can be added in future, and a client should ignore all
unknown keys to keep compatibility with new versions.

Every error in the stack is a map with the following keys:
* `MP_ERROR_TYPE = 0x00` - error type. This is what is visible in
  `<error_object>.base_type` field;
* `MP_ERROR_FILE = 0x01` - file name from `<error_object>.trace`;
* `MP_ERROR_LINE = 0x02` - line from `<error_object>.trace`;
* `MP_ERROR_MESSAGE = 0x03` - error message from
  `<error_object>.message`;
* `MP_ERROR_ERRNO = 0x04` - errno saved at the moment of the error
  creation. Visible in `<error_object>.errno`;
* `MP_ERROR_CODE = 0x05` - error code. Visible in
  `<error_object>.code` and in C function `box_error_code()`.
* `MP_ERROR_FIELDS = 0x06` - additional fields depending on error
  type. For example, AccessDenied error type stores here fields
  `access_type`, `object_type`, `object_name`. Connector's code
  should ignore unknown keys met here, and be ready, that for some
  existing errors new fields can be added, old can be dropped.

345877df

box: move Lua MP_EXT decoder from tuple.c · d3a4dc68

Vladislav Shpilevoy authored 5 years ago

Lua C module 'msgpack' supports registration of custom extension
decoders for MP_EXT values. That is needed to make 'msgpack' not
depending on any modules which use it.

So far the only box-related extension was tuples - struct tuple
cdata needed to be encoded as an array.

That is going to change in next commits, where struct error cdata
appears, also depending on box. So the decoder can't be located
in src/box/lua/tuple.c. It is moved to a more common place -
src/box/lua/init.c.

Needed for #4398

d3a4dc68

error: update constructors of some errors · 6d0078d0

Leonid Vasiliev authored 5 years ago

We want to have a transparent marshalling through net.box
for errors. To do this, we need to recreate the error
on the client side with the same parameters as on the server.
For convenience, we update AccessDeniedError constructor
which has pointers to static strings and add the XlogGapError
constructor that does not require vclock.

Needed for #4398

6d0078d0

error: add session setting for error type marshaling · c7a7f1cb

Leonid Vasiliev authored 5 years ago


Errors are encoded as a string when serialized to MessagePack to
be sent over IProto or when just saved into a buffer via Lua
modules msgpackffi and msgpack.

That is not very useful on client-side, because most of the error
metadata is lost: code, type, trace - everything except the
message.

Next commits are going to dedicate a new MP_EXT type to error
objects so as everything could be encoded, and on client side it
would be possible to restore types.

But this is a breaking change in case some users use old
connectors when working with newer Tarantool instances. So to smooth
the upgrade there is a new session setting -
'error_marshaling_enabled'.

By default it is false. When it is true, all fibers of the given
session will serialize error objects as MP_EXT.

Co-authored-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>

Needed for #4398

c7a7f1cb

session: add offset to SQL session settings array · b6c6c536

Vladislav Shpilevoy authored 5 years ago

Session settings are stored in a monolithic array. Submodules
can define a range of settings in there. For example, SQL. It
occupies settings from 0 to 8. There is a second array of only
SQL settings in build.c, of the same size, and it uses the same
indexes.

But if something is added before SQL settings, so that they
no longer start from 0, it will break the equal indexes assumption.
SQL should normalize all setting identifiers by
SESSION_SETTING_SQL_BEGIN.
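
The normalization can be sketched as follows. The enum layout is
hypothetical: `SESSION_SETTING_SOMETHING_ELSE` and
`SESSION_SETTING_SQL_END` are invented for the example, as are
`sql_values` and `sql_setting_slot()`. Only
`SESSION_SETTING_SQL_BEGIN` and the count of 9 SQL settings come
from the text above.

```c
#include <assert.h>

/* Hypothetical layout: one non-SQL setting precedes the SQL range. */
enum {
	SESSION_SETTING_SOMETHING_ELSE = 0,
	SESSION_SETTING_SQL_BEGIN = 1,
	SESSION_SETTING_SQL_END = SESSION_SETTING_SQL_BEGIN + 9,
};

/* SQL-private array, indexed from 0 regardless of the global offset. */
static int sql_values[SESSION_SETTING_SQL_END - SESSION_SETTING_SQL_BEGIN];

/* Translate a global setting id into a slot of the SQL-private array. */
static int *
sql_setting_slot(int global_id)
{
	assert(global_id >= SESSION_SETTING_SQL_BEGIN &&
	       global_id < SESSION_SETTING_SQL_END);
	return &sql_values[global_id - SESSION_SETTING_SQL_BEGIN];
}
```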

b6c6c536

error: add custom error type · b728e7af

Leonid Vasiliev authored 5 years ago


Co-authored-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>

Part of #4398

@TarantoolBot document
Title: Custom error types for Lua errors

Errors can be created in 2 ways: `box.error.new()` and `box.error()`.

Both used to take either `code, reason, <reason string args>` or
`{code = code, reason = reason, ...}` arguments.

Now in the first option instead of code a user can specify a
string as its own error type. In the second option a user can
specify both code and type. For example:

```Lua
box.error('MyErrorType', 'Message')
box.error({type = 'MyErrorType', code = 1024, reason = 'Message'})
```
Or no-throw version:
```Lua
box.error.new('MyErrorType', 'Message')
box.error.new({type = 'MyErrorType', code = 1024, reason = 'Message'})
```
When a custom type is specified, it is shown in `err.type`
attribute. When it is not specified, `err.type` shows one of
built-in errors such as 'ClientError', 'OutOfMemory', etc.

The name length limit for a custom type is 63 bytes. Anything
longer is truncated.

Original error type can be checked using `err.base_type` member,
although normally it should not be used. For user-defined types
base type is 'CustomError'.

For example:
```
tarantool> e = box.error.new({type = 'MyErrorType', code = 1024, reason = 'Message'})
---
...

tarantool> e:unpack()
---
- code: 1024
  trace:
  - file: '[string "e = box.error.new({type = ''MyErrorType'', code..."]'
    line: 1
  type: MyErrorType
  custom_type: MyErrorType
  message: Message
  base_type: CustomError
...
```

b728e7af

libev: backport select()'s limit workaround · 8acc011a

Alexander Turenko authored 4 years ago

As stated in the 'OS/X AND DARWIN BUGS' section of the libev
documentation [1], kqueue() and poll() have known problems on Mac OS, so
the library uses select() on Mac OS (it is the build time default). The
library however uses the trick to overcome 1024 fds limit: libev sets
the undocumented macro _DARWIN_UNLIMITED_SELECT, which enables linking
against select() implementation without the limit.

The magic macro stopped working at some point around Mac OS 10.10 (see
[2]), because it was defined after <sys/time.h> inclusion.  For recent
Mac OS versions the macro has effect only when it is defined before
<sys/time.h> inclusion.

The macro definition was [moved][3] in libev 4.25. Excerpt from the
changelog [4]:

 | 4.25 Fri Dec 21 07:49:20 CET 2018
 | <...>
 | - move the darwin select workaround higher in ev.c, as newer versions of
 |   darwin managed to break their broken select even more.

A more proper fix would be to update libev to a newer version, however I
would postpone it until we have time to properly test everything with
a new version of the library.

[1]: http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#OS_X_AND_DARWIN_BUGS
[2]: http://lists.schmorp.de/pipermail/libev/2018q2/002788.html
[3]: http://cvs.schmorp.de/libev/ev.c?r1=1.482&r2=1.483
[4]: http://cvs.schmorp.de/libev/Changes?view=markup



Fixes #3867
Fixes #4673

Investigated-by: Maria Khaydich <maria.khaydich@tarantool.org>
Co-authored-by: Maria Khaydich <maria.khaydich@tarantool.org>

8acc011a

popen: add popen Lua module · 6d1c5ff5

Alexander Turenko authored 5 years ago


Fixes #4031

Co-developed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Co-developed-by: Igor Munkin <imun@tarantool.org>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Igor Munkin <imun@tarantool.org>

@TarantoolBot document
Title: popen module

```
Overview
========

Tarantool supports execution of external programs similarly to well
known Python's `subprocess` or Ruby's `Open3`. Note though the `popen`
module does not match one to one to the helpers these languages provide
and provides only basic functions. The popen object creation is
implemented via `vfork()` system call which means the caller thread is
blocked until execution of a child process begins.

Module functions
================

The `popen` module provides two functions to create a popen
object: `popen.shell`, which is similar to the libc `popen` function, and
`popen.new`, which creates a popen object with more specific options.

`popen.shell(command[, mode]) -> handle, err`
---------------------------------------------

Execute a shell command.

@param command  a command to run, mandatory
@param mode     communication mode, optional
                'w'    to use ph:write()
                'r'    to use ph:read()
                'R'    to use ph:read({stderr = true})
                nil    inherit parent's std* file descriptors

Several mode characters can be set together: 'rw', 'rRw', etc.

This function is just a shortcut for popen.new({command}, opts)
with opts.{shell,setsid,group_signal} set to `true` and
opts.{stdin,stdout,stderr} set based on the `mode` parameter.

All std* streams are inherited from parent by default if it is
not changed using mode: 'r' for stdout, 'R' for stderr, 'w' for
stdin.

Raise an error on incorrect parameters:

- IllegalParams: incorrect type or value of a parameter.

Return a popen handle on success.

Return `nil, err` on a failure.
@see popen.new() for possible reasons.

Example:

 | local popen = require('popen')
 |
 | -- Run the program and save its handle.
 | local ph = popen.shell('date', 'r')
 |
 | -- Read program's output, strip trailing newline.
 | local date = ph:read():rstrip()
 |
 | -- Free resources. The process is killed (but 'date'
 | -- exits itself anyway).
 | ph:close()
 |
 | print(date)

Execute 'sh -c date' command, read the output and close the
popen object.

Unix defines a text file as a sequence of lines, each ends
with the newline symbol. The same convention is usually
applied for a text output of a command (so when it is
redirected to a file, the file will be correct).

However internally an application usually operates on
strings, which are NOT newline terminated (e.g. literals
for error messages). The newline is usually added right
before a string is written to the outside world (stdout,
console or log). :rstrip() in the example above is shown
for this sake.

`popen.new(argv[, opts]) -> handle, err`
----------------------------------------

Execute a child program in a new process.

@param argv  an array of a program to run with
             command line options, mandatory;
             absolute path to the program is required
             when @a opts.shell is false (default)

@param opts  table of options

@param opts.stdin   action on STDIN_FILENO
@param opts.stdout  action on STDOUT_FILENO
@param opts.stderr  action on STDERR_FILENO

File descriptor actions:

    popen.opts.INHERIT  (== 'inherit') [default]
                        inherit the fd from the parent
    popen.opts.DEVNULL  (== 'devnull')
                        open /dev/null on the fd
    popen.opts.CLOSE    (== 'close')
                        close the fd
    popen.opts.PIPE     (== 'pipe')
                        feed data from/to the fd to parent
                        using a pipe

@param opts.env  a table of environment variables to
                 be used inside a process; key is a
                 variable name, value is a variable
                 value.
                 - when not set, the current
                   environment is inherited;
                 - if set to an empty table, the
                   environment is dropped;
                 - if set to a non-empty table, the
                   environment is replaced

@param opts.shell            (boolean, default: false)
       true                  run a child process via
                             'sh -c "${opts.argv}"'
       false                 call the executable directly

@param opts.setsid           (boolean, default: false)
       true                  run the program in a new
                             session
       false                 run the program in the
                             tarantool instance's
                             session and process group

@param opts.close_fds        (boolean, default: true)
       true                  close all inherited fds from a
                             parent
       false                 don't do that

@param opts.restore_signals  (boolean, default: true)
       true                  reset all signal actions
                             modified in parent's process
       false                 inherit changed actions

@param opts.group_signal     (boolean, default: false)
       true                  send signal to a child process
                             group (only when opts.setsid is
                             enabled)
       false                 send signal to a child process
                             only

@param opts.keep_child       (boolean, default: false)
       true                  don't send SIGKILL to a child
                             process at freeing (by :close()
                             or Lua GC)
       false                 send SIGKILL to a child process
                             (or a process group if
                             opts.group_signal is enabled) at
                             :close() or collecting of the
                             handle by Lua GC

The returned handle provides :close() method to explicitly
release all occupied resources (including the child process
itself if @a opts.keep_child is not set). However if the
method is not called for a handle during its lifetime, the
same freeing actions will be triggered by Lua GC.

It is recommended to use opts.setsid + opts.group_signal
if a child process may spawn its own children and they all
should be killed together.

Note: A signal will not be sent if the child process is
already dead: otherwise we might kill another process that
occupies the same PID later. This means that if the child
process dies before its own children, the function will not
send a signal to the process group even when opts.setsid and
opts.group_signal are set.

Use os.environ() to pass a copy of the current environment
with several replacements (see example 2 below).

Raise an error on incorrect parameters:

- IllegalParams: incorrect type or value of a parameter.
- IllegalParams: group signal is set, while setsid is not.

Return a popen handle on success.

Return `nil, err` on a failure. Possible reasons:

- SystemError: dup(), fcntl(), pipe(), vfork() or close()
               fails in the parent process.
- SystemError: (temporary restriction) the parent process
               has closed stdin, stdout or stderr.
- OutOfMemory: unable to allocate the handle or a temporary
               buffer.

Example 1:

 | local popen = require('popen')
 |
 | local ph = popen.new({'/bin/date'}, {
 |     stdout = popen.opts.PIPE,
 | })
 | local date = ph:read():rstrip()
 | ph:close()
 | print(date) -- Thu 16 Apr 2020 01:40:56 AM MSK

Execute 'date' command, read the result and close the
popen object.

Example 2:

 | local popen = require('popen')
 |
 | local env = os.environ()
 | env['FOO'] = 'bar'
 |
 | local ph = popen.new({'echo "${FOO}"'}, {
 |     stdout = popen.opts.PIPE,
 |     shell = true,
 |     env = env,
 | })
 | local res = ph:read():rstrip()
 | ph:close()
 | print(res) -- bar

It is quite similar to the previous one, but sets the
environment variable and uses shell builtin 'echo' to
show it.

Example 3:

 | local popen = require('popen')
 |
 | local ph = popen.new({'echo hello >&2'}, { -- !!
 |     stderr = popen.opts.PIPE,              -- !!
 |     shell = true,
 | })
 | local res = ph:read({stderr = true}):rstrip()
 | ph:close()
 | print(res) -- hello

This example demonstrates how to capture child's stderr.

Example 4:

 | local popen = require('popen')
 |
 | local function call_jq(input, filter)
 |     -- Start jq process, connect to stdin, stdout and stderr.
 |     local jq_argv = {'/usr/bin/jq', '-M', '--unbuffered', filter}
 |     local ph, err = popen.new(jq_argv, {
 |         stdin = popen.opts.PIPE,
 |         stdout = popen.opts.PIPE,
 |         stderr = popen.opts.PIPE,
 |     })
 |     if ph == nil then return nil, err end
 |
 |     -- Write input data to child's stdin and send EOF.
 |     local ok, err = ph:write(input)
 |     if not ok then
 |         ph:close()
 |         return nil, err
 |     end
 |     ph:shutdown({stdin = true})
 |
 |     -- Read everything until EOF.
 |     local chunks = {}
 |     while true do
 |         local chunk, err = ph:read()
 |         if chunk == nil then
 |             ph:close()
 |             return nil, err
 |         end
 |         if chunk == '' then break end -- EOF
 |         table.insert(chunks, chunk)
 |     end
 |
 |     -- Read diagnostics from stderr if any.
 |     local err = ph:read({stderr = true})
 |     if err ~= '' then
 |         ph:close()
 |         return nil, err
 |     end
 |
 |     -- Glue all chunks, strip trailing newline, free the handle.
 |     local res = table.concat(chunks):rstrip()
 |     ph:close()
 |     return res
 | end

Demonstrates how to run a stream program (like `grep`, `sed`
and so on), write to its stdin and read from its stdout.

The example assumes that the input data is small enough to
fit into a pipe buffer (typically 64 KiB, but it depends on
the platform and its configuration). It will get stuck in
:write() for large data. To handle this case, call :read()
in a loop in another fiber (start it before the first
:write()).

If a process writes a large text to stderr, it may fill up
the stderr pipe buffer and get stuck in write(2, ...). So
stderr has to be read in a separate fiber to handle this
case.

Handle methods
==============

`popen_handle:read([opts]) -> str, err`
---------------------------------------

Read data from a child peer.

@param handle        handle of a child process
@param opts          an options table
@param opts.stdout   whether to read from stdout, boolean
                     (default: true)
@param opts.stderr   whether to read from stderr, boolean
                     (default: false)
@param opts.timeout  time quota in seconds
                     (default: 100 years)

Read data from stdout or stderr streams with @a timeout.
By default it reads from stdout. Set @a opts.stderr to
`true` to read from stderr.

It is not possible to read from stdout and stderr both in
one call. Set either @a opts.stdout or @a opts.stderr.

Raise an error on incorrect parameters or when the fiber is
cancelled:

- IllegalParams:    incorrect type or value of a parameter.
- IllegalParams:    called on a closed handle.
- IllegalParams:    opts.stdout and opts.stderr are both set.
- IllegalParams:    a requested IO operation is not supported
                    by the handle (stdout / stderr is not
                    piped).
- IllegalParams:    attempt to operate on a closed file
                    descriptor.
- FiberIsCancelled: cancelled by an outside code.

Return a string on success, an empty string at EOF.

Return `nil, err` on a failure. Possible reasons:

- SocketError: an IO error occurs at read().
- TimedOut:    @a timeout quota is exceeded.
- OutOfMemory: no memory space for a buffer to read into.
- LuajitError: ("not enough memory"): no memory space for
               the Lua string.

`popen_handle:write(str[, opts]) -> ok, err`
--------------------------------------------

Write data to a child peer.

@param handle        a handle of a child process
@param str           a string to write
@param opts          table of options
@param opts.timeout  time quota in seconds
                     (default: 100 years)

Write string @a str to stdin stream of a child process.

The function may yield forever if a child process does
not read data from stdin and a pipe buffer becomes full.
Size of this buffer depends on a platform. Use
@a opts.timeout when unsure.

When @a opts.timeout is not set, the function blocks
(yields the fiber) until all data is written or an error
occurs.

Raise an error on incorrect parameters or when the fiber is
cancelled:

- IllegalParams:    incorrect type or value of a parameter.
- IllegalParams:    called on a closed handle.
- IllegalParams:    string length is greater than SSIZE_MAX.
- IllegalParams:    a requested IO operation is not supported
                    by the handle (stdin is not piped).
- IllegalParams:    attempt to operate on a closed file
                    descriptor.
- FiberIsCancelled: cancelled by an outside code.

Return `true` on success.

Return `nil, err` on a failure. Possible reasons:

- SocketError: an IO error occurs at write().
- TimedOut:    @a timeout quota is exceeded.

`popen_handle:shutdown(opts) -> true`
-------------------------------------

Close parent's ends of std* fds.

@param handle        handle of a child process
@param opts          an options table
@param opts.stdin    close parent's end of stdin, boolean
@param opts.stdout   close parent's end of stdout, boolean
@param opts.stderr   close parent's end of stderr, boolean

The main reason to use this function is to send EOF to
child's stdin. However parent's end of stdout / stderr
may be closed too.

The function does not fail on already closed fds (idempotence).
However, it fails on an attempt to close the end of a pipe that
never existed. In other words, only those std* options that
were set to popen.opts.PIPE at a handle creation may be used
here (for popen.shell: 'r' corresponds to stdout, 'R' to stderr
and 'w' to stdin).

The function does not close any fds on a failure: either all
requested fds are closed or none of them.

Example:

 | local popen = require('popen')
 |
 | local ph = popen.shell('sed s/foo/bar/', 'rw')
 | ph:write('lorem foo ipsum')
 | ph:shutdown({stdin = true})
 | local res = ph:read()
 | ph:close()
 | print(res) -- lorem bar ipsum

Raise an error on incorrect parameters:

- IllegalParams:  an incorrect handle parameter.
- IllegalParams:  called on a closed handle.
- IllegalParams:  neither stdin, stdout nor stderr is chosen.
- IllegalParams:  a requested IO operation is not supported
                  by the handle (one of std* is not piped).

Return `true` on success.

`popen_handle:terminate() -> ok, err`
-------------------------------------

Send SIGTERM signal to a child process.

@param handle  a handle of the child process to terminate

The function only sends SIGTERM signal and does NOT
free any resources (popen handle memory and file
descriptors).

@see popen_handle:signal() for errors and return values.

`popen_handle:kill() -> ok, err`
--------------------------------

Send SIGKILL signal to a child process.

@param handle  a handle of the child process to kill

The function only sends SIGKILL signal and does NOT
free any resources (popen handle memory and file
descriptors).

@see popen_handle:signal() for errors and return values.

`popen_handle:signal(signo) -> ok, err`
---------------------------------------

Send signal to a child process.

@param handle  a handle of the child process to be signaled
@param signo   signal number to send

When opts.setsid and opts.group_signal are set on the handle
the signal is sent to the process group rather than to the
process. @see popen.new() for details about group
signaling.

Note: The module offers popen.signal.SIG* constants, because
some signals have different numbers on different platforms.

Raise an error on incorrect parameters:

- IllegalParams:    an incorrect handle parameter.
- IllegalParams:    called on a closed handle.

Return `true` if signal is sent.

Return `nil, err` on a failure. Possible reasons:

- SystemError: a process does not exist anymore

               Aside from a non-existent process, it is
               also returned for a zombie process or when
               all processes in a group are zombies (but
               see the note regarding Mac OS below).

- SystemError: invalid signal number

- SystemError: no permission to send a signal to
               a process or a process group

               It is returned on Mac OS when a signal is
               sent to a process group, where a group leader
               is a zombie (or, possibly, when all processes
               in it are zombies; this is not confirmed).

               Whether it may appear due to other
               reasons is unclear.

`popen_handle:info() -> res`
----------------------------

Return information about popen handle.

@param handle  a handle of a child process

Raise an error on incorrect parameters:

- IllegalParams: an incorrect handle parameter.
- IllegalParams: called on a closed handle.

Return information about the handle in the following
format:

    {
        pid = <number> or <nil>,
        command = <string>,
        opts = <table>,
        status = <table>,
        stdin = one-of(
            popen.stream.OPEN   (== 'open'),
            popen.stream.CLOSED (== 'closed'),
            nil,
        ),
        stdout = one-of(
            popen.stream.OPEN   (== 'open'),
            popen.stream.CLOSED (== 'closed'),
            nil,
        ),
        stderr = one-of(
            popen.stream.OPEN   (== 'open'),
            popen.stream.CLOSED (== 'closed'),
            nil,
        ),
    }

`pid` is a process id of the process when it is alive,
otherwise `pid` is nil.

`command` is a concatenation of space separated arguments
that were passed to execve(). Multiword arguments are quoted.
Quotes inside arguments are not escaped.

`opts` is a table of handle options in the format of
popen.new() `opts` parameter. `opts.env` is not shown here,
because the environment variables map is not stored in a
handle.

`status` is a table that represents a process status in the
following format:

    {
        state = one-of(
            popen.state.ALIVE    (== 'alive'),
            popen.state.EXITED   (== 'exited'),
            popen.state.SIGNALED (== 'signaled'),
        )

        -- Present when `state` is 'exited'.
        exit_code = <number>,

        -- Present when `state` is 'signaled'.
        signo = <number>,
        signame = <string>,
    }

`stdin`, `stdout`, `stderr` reflect status of parent's end
of a piped stream. When a stream is not piped the field is
not present (`nil`). When it is piped, the status may be
one of the following:

- popen.stream.OPEN    (== 'open')
- popen.stream.CLOSED  (== 'closed')

The status may be changed from 'open' to 'closed'
by :shutdown({std... = true}) call.

Example 1 (tarantool console):

 | tarantool> require('popen').new({'/usr/bin/touch', '/tmp/foo'})
 | ---
 | - command: /usr/bin/touch /tmp/foo
 |   status:
 |     state: alive
 |   opts:
 |     stdout: inherit
 |     stdin: inherit
 |     group_signal: false
 |     keep_child: false
 |     close_fds: true
 |     restore_signals: true
 |     shell: false
 |     setsid: false
 |     stderr: inherit
 |   pid: 9499
 | ...

Example 2 (tarantool console):

 | tarantool> require('popen').shell('grep foo', 'wrR')
 | ---
 | - stdout: open
 |   command: sh -c 'grep foo'
 |   stderr: open
 |   status:
 |     state: alive
 |   stdin: open
 |   opts:
 |     stdout: pipe
 |     stdin: pipe
 |     group_signal: true
 |     keep_child: false
 |     close_fds: true
 |     restore_signals: true
 |     shell: true
 |     setsid: true
 |     stderr: pipe
 |   pid: 10497
 | ...

`popen_handle:wait() -> res`
----------------------------

Wait until a child process exits or is signaled.

@param handle  a handle of the process to wait for

Raise an error on incorrect parameters or when the fiber is
cancelled:

- IllegalParams:    an incorrect handle parameter.
- IllegalParams:    called on a closed handle.
- FiberIsCancelled: cancelled by an outside code.

Return a process status table (the same as ph.status and
ph:info().status). @see popen_handle:info() for the format
of the table.

`popen_handle:close() -> ok, err`
---------------------------------

Close a popen handle.

@param handle  a handle to close

Basically it kills a process using SIGKILL and releases all
resources associated with the popen handle.

Details about signaling:

- The signal is sent only when opts.keep_child is not set.
- The signal is sent only when a process is alive according
  to the information available on the current event loop
  iteration. (There is a gap here: a zombie may be signaled;
  it is harmless.)
- The signal is sent to a process or a process group depending
  on opts.group_signal. (@see popen.new() for details of
  group signaling).

Resources are released regardless of whether sending the
signal succeeds: fds are closed, memory is released,
the handle is marked as closed.

No operation is possible on a closed handle except
:close(), which is always successful on a closed handle
(idempotence).

Raise an error on incorrect parameters:

- IllegalParams: an incorrect handle parameter.

The function may return `true` or `nil, err`, but it always
frees the handle resources. So any return value usually
means success for a caller. The return values are purely
informational: they are meant for logging or similar reporting.

Possible diagnostics (don't consider them as errors):

- SystemError: no permission to send a signal to
               a process or a process group

               This diagnostic may appear due to
               Mac OS behaviour on zombies when
               opts.group_signal is set,
               @see popen_handle:signal().

               Whether it may appear due to other
               reasons is unclear.

Always return `true` when a process is known to be dead (say,
after ph:wait()): no signal will be sent, so no 'failure'
may appear.

Handle fields
=============

- popen_handle.pid
- popen_handle.command
- popen_handle.opts
- popen_handle.status
- popen_handle.stdin
- popen_handle.stdout
- popen_handle.stderr

See popen_handle:info() for description of those fields.

Module constants
================

- popen.opts
  - INHERIT (== 'inherit')
  - DEVNULL (== 'devnull')
  - CLOSE   (== 'close')
  - PIPE    (== 'pipe')

- popen.signal
  - SIGTERM (== 15)
  - SIGKILL (== 9)
  - ...

- popen.state
  - ALIVE    (== 'alive')
  - EXITED   (== 'exited')
  - SIGNALED (== 'signaled')

- popen.stream
  - OPEN    (== 'open')
  - CLOSED  (== 'closed')
```
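
The notes after Example 4 of popen.new() suggest reading the child's
output in a separate fiber when the input may exceed the pipe buffer.
A minimal sketch of that approach follows. It assumes Tarantool's
built-in `fiber` module; `filter_large` is a hypothetical helper name,
not part of the popen module. stderr is left inherited here, so there
is no stderr pipe to fill up.

```lua
local fiber = require('fiber')
local popen = require('popen')

-- Hypothetical helper (not part of the popen module): feed `input`
-- to the program given by `argv` and return all of its stdout.
local function filter_large(argv, input)
    local ph, err = popen.new(argv, {
        stdin = popen.opts.PIPE,
        stdout = popen.opts.PIPE,
    })
    if ph == nil then return nil, err end

    -- Drain stdout in a separate fiber, created before the first
    -- :write(), so the child never blocks on a full stdout pipe.
    local chunks = {}
    local read_err
    local reader = fiber.new(function()
        while true do
            local chunk, err = ph:read()
            if chunk == nil then read_err = err; break end
            if chunk == '' then break end -- EOF
            table.insert(chunks, chunk)
        end
    end)
    reader:set_joinable(true)

    -- :write() yields while the pipe is full; the reader runs then.
    local ok, write_err = ph:write(input)
    if ok then
        ph:shutdown({stdin = true}) -- send EOF to the child
    else
        ph:kill() -- make the child exit, so the reader sees EOF
    end

    -- Wait for the reader to finish before freeing the handle.
    reader:join()
    ph:close()

    if not ok then return nil, write_err end
    if read_err ~= nil then return nil, read_err end
    return table.concat(chunks)
end

-- Usage: push 1 MiB through 'cat', far more than a typical
-- pipe buffer holds.
local out = assert(filter_large({'/bin/cat'}, ('x'):rep(1024 * 1024)))
print(#out)
```

The reader is joined before :close() so that the handle is not freed
while another fiber is still inside :read().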

(cherry picked from commit 34c2789b48eadd36da742f7f761a998889e70544)

6d1c5ff5