Commits · 2821872e7c44a30d83b9040b9bc6d31699acc077 · core / tarantool

Jun 27, 2018

Merge branch '1.9' into 1.10 · 2821872e
Konstantin Osipov authored 6 years ago

net.box: update a test case after cherry-pick · e04b5b23
Konstantin Osipov authored 6 years ago
```
schema_version must be passed to perform_request in 1.9
```

iproto: on input discard do nothing for closed con · a60c8dff

Vladislav Shpilevoy authored 6 years ago

When a connection is closed, some long-poll requests may still
be in the TX thread with non-discarded input. If a connection is
closed and an input is then discarded, the connection must not
try to read new data.

The bug was introduced here:
f4d66dae by me.

Closes #3400

iproto: fix IPROTO_SERVER_IS_RO key code · e5e8ef4f

Vladimir Davydov authored 6 years ago

IPROTO_SERVER_IS_RO currently has code 0x07 and is defined in the header
key section, which is wrong, because this key is only used in request
body. Let's move it to the body section, where it belongs, and set its
code to 0x29. This shouldn't break anything even if 0x07 is reused in
future, because the two codes belong to different sections and hence are
never parsed in the same function. Worst that can happen is we fail to
bootstrap a node in the cluster if it is running a newer tarantool
version.

While we are at it, let's also add the key name and change its type from
MP_UINT to MP_BOOL.

Fixes commit a8ecd1e1 ("replication: fix bug with read-only replica
as a bootstrap leader").

xrow: fix ret code on decode failure · 56c6f533

Vladimir Davydov authored 6 years ago

Throughout the code, we return -1 on error, but decode methods return 1
for some reason, although according to comments they are supposed to
return -1. This doesn't result in any errors, because we use != 0 to
check for errors. Nevertheless, let's fix it to avoid confusion.
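
The effect of the mismatch can be sketched in a few lines of Python. This is only an illustration of the return-code convention: `decode_old`, `decode_new` and both caller helpers are hypothetical names, not functions from xrow.

```python
def decode_old(data: bytes) -> int:
    """Old behaviour: report a decode failure with 1."""
    return 0 if data else 1


def decode_new(data: bytes) -> int:
    """Fixed behaviour: report a decode failure with -1, like other code."""
    return 0 if data else -1


def failed_nonzero(rc: int) -> bool:
    """The check described in the message: anything but 0 is an error."""
    return rc != 0


def failed_negative(rc: int) -> bool:
    """A hypothetical stricter check that expects the documented -1."""
    return rc < 0
```

With the `!= 0` check both variants are reported as failures, which is why the inconsistency caused no visible errors; only a caller written against the documented `-1` would have missed the old value.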

txn: remove unused C++ wrappers · 1a6432d3
Vladimir Davydov authored 6 years ago

xlog: erase eof marker when reopening existing file for writing · ce44a9e0

Vladimir Davydov authored 6 years ago

When reopening an existing xlog file (as in case of vylog), we do not
erase the eof marker immediately. Instead we reposition file offset
to (file_size - sizeof eof_marker), assuming the eof marker will be
overwritten on the first write.

However, it isn't enough if we want to reuse this function for reopening
WAL files, because when scanning the WAL directory we close a file if we
read eof marker and never reopen it again, see recover_remaining_wals().
So to avoid skipping rows written to a once closed WAL, we have to erase
the eof marker when reopening an xlog file. Let's do it with truncate().
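
A minimal Python sketch of the idea, assuming a log file that ends with a fixed-size marker. The marker bytes and the `reopen_for_append` helper are made up for illustration and are not the xlog implementation.

```python
import os
import tempfile

# Hypothetical 4-byte end-of-file marker, used only in this sketch.
EOF_MARKER = b"\xd5\x10\xad\xed"


def reopen_for_append(path: str):
    """Reopen a finished log file for writing, erasing its EOF marker first."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(max(size - len(EOF_MARKER), 0))
        has_marker = f.read() == EOF_MARKER
    if has_marker:
        # Drop the marker right away, so a reader scanning the file cannot
        # hit it and stop before rows that are appended later.
        os.truncate(path, size - len(EOF_MARKER))
    return open(path, "ab")


def demo() -> bytes:
    """Write one row plus marker, reopen, append a row, return the content."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"row1" + EOF_MARKER)
        with reopen_for_append(path) as f:
            f.write(b"row2")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)
```

Unlike the reposition-and-overwrite approach, the marker is gone as soon as the file is reopened, even if no row has been written yet.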

Jun 26, 2018

vinyl: fix read iterator skips source after reading cache · 13f4355c

Vladimir Davydov authored 6 years ago

If a source is used on a read iteration (i.e. the key at which it is
positioned is the next best match or, in terms of the read iterator
implementation, its front_id matches the read iterator front_id), its
history is cleaned up, see vy_read_iterator_apply_history(). This breaks
the logic behind vy_read_src_is_behind(), which assumes that the history
always points to the last used key. As a result, a source may be
mistakenly skipped, as illustrated below:

  Fiber 1                               Fiber 2
  -------                               -------
  1. Opens read iterator.
  2. Advances it to the next key.
     The returned key was read from
     a mem or run (not from cache).
     The source's history is emptied.
                                        Adds a chain containing
                                        the key read by fiber 1
                                        to the cache.
  3. Continues iteration, reads
     next few keys from the cache
     until the chain ends. The source
     used at step 2 is skipped.
  4. Calls vy_read_src_is_behind()
     on the source used at step 2 and
     skipped at step 3. It returns
     false, because its history is
     empty, thus skipping keys stored
     in it.

Fix this bug by moving the code that checks whether a source iterator
needs to be advanced from vy_read_src_is_behind() to source iterator
'skip' method, because there we always know the last key returned by
the iterator.

Basically, this returns the code we had before commit b4d57284
("vinyl: consolidate skip optimization checks in read iterator").

Closes #3477

Introduce privileges for object groups · af35de96

Georgy Kirichenko authored 6 years ago

Allow defining access privileges for all spaces, functions and sequences.
Read and write privileges are supported for spaces, execute privilege
for sequences. Privileges can be granted and revoked through the old API
without object identification:
  box.schema.user.grant("guest", "read", "space")

Prerequisite #945

schema: misc changes to improve code readability · 4aaefbe4
Konstantin Osipov authored 6 years ago
```
Use constants, rename methods, invert acl-object
compatibility matrix.

In scope of gh-945.
```

security: add limits on object_type-privilege pair · 983c194e

imarkov authored 7 years ago

Introduce constraints on object_type-privilege pairs.
These constraints forbid senseless grants/revokes, i.e.:
* sequence: execute and all space-related privileges (insert, delete,
  update);
* function: alter and all space-related privileges;
* role: all privileges except create, drop, alter, execute.
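
The constraints listed above can be written down as a small lookup table. The Python sketch below only mirrors that list; the names `SENSELESS`, `ROLE_ALLOWED` and `is_grant_allowed` are hypothetical and say nothing about how the check is implemented in Tarantool.

```python
# Privileges that only make sense for spaces, per the list above.
SPACE_ONLY = {"insert", "delete", "update"}

# Object type -> privileges that are senseless for it.
SENSELESS = {
    "sequence": {"execute"} | SPACE_ONLY,
    "function": {"alter"} | SPACE_ONLY,
}

# For roles the list gives an allow-list instead of a deny-list.
ROLE_ALLOWED = {"create", "drop", "alter", "execute"}


def is_grant_allowed(object_type: str, privilege: str) -> bool:
    """Return False for object_type-privilege pairs the list calls senseless."""
    if object_type == "role":
        return privilege in ROLE_ALLOWED
    return privilege not in SENSELESS.get(object_type, set())
```

Object types that are not mentioned in the list, such as spaces, are left unrestricted in this sketch.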

Prerequisite #945

Jun 25, 2018

Merge branch '1.9' into 1.10 · 3ec12041
Konstantin Osipov authored 6 years ago

socket: fix race between unix tcp server stop and start · 80d379ee

Vladimir Davydov authored 6 years ago

If called on a unix socket, bind(2) creates a new file, see unix(7).
When we stop a unix tcp server, we should remove that file. Currently,
we do it from the tcp server fiber, after the server loop is broken,
which happens when the socket is closed, see tcp_server_loop(). This
opens a time window for another tcp server to reuse the same path:

    main fiber                  tcp server loop
    ----------                  ---------------

    -- Start a tcp server.
    s = socket.tcp_server('unix/', sock_path, ...)
    -- Stop the server.
    s:close()

                                socket_readable? => no, break loop

    -- Start a new tcp server. Use the same path as before.
    -- This function succeeds, because the socket is closed
    -- so tcp_server_bind_addr() will clean up by itself.
    s = socket.tcp_server('unix/', sock_path, ...)

     tcp_server_bind
      tcp_server_bind_addr
       socket_bind => EADDRINUSE
       tcp_connect => ECONNREFUSED
       -- Remove dead unix socket.
       fio.unlink(addr.port)
       socket_bind => success

                                -- Deletes unix socket used
                                -- by the new server.
                                fio.unlink(addr.port)

In particular, the race results in sporadic failures of app-tap/console
test, which restarts a tcp server using the same file path.

To fix this issue, let's close the socket after removing the socket
file. This is absolutely legit on any UNIX system, and this eliminates
the race shown above, because a new server that tries to bind on the
same path as the one already used by a dying server will not receive
ECONNREFUSED until the socket fd is closed and hence the file is
removed.
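
The ordering can be tried out with plain UNIX sockets. The sketch below uses Python's standard `socket` module rather than Tarantool's socket library, and `stop_unix_server` is a hypothetical helper; it only demonstrates that unlinking the file before closing the descriptor is valid and leaves the path free for the next server.

```python
import os
import socket
import tempfile


def stop_unix_server(sock: socket.socket, path: str) -> None:
    """Remove the socket file first, then close the descriptor."""
    os.unlink(path)  # the path disappears while the socket is still open
    sock.close()     # only now does the old server stop listening


def demo() -> bool:
    """Stop a server and start a new one on the same path."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")
    first = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    first.bind(path)
    first.listen(1)
    stop_unix_server(first, path)

    # No stale file is left behind, so bind() succeeds without any cleanup.
    second = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        second.bind(path)
        second.listen(1)
        return os.path.exists(path)
    finally:
        second.close()
        os.unlink(path)
        os.rmdir(tmpdir)
```

Because the old server never unlinks anything after its descriptor is closed, it cannot delete a file that belongs to the new server.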

A note about the app-tap/console test. After this patch is applied,
socket.close() takes a little longer for unix tcp server, because it
yields twice, once for removing the socket file and once for closing the
socket file descriptor. As a result, on_disconnect() trigger left from
the previous test case has time to run after session.type() check.
Actually, those triggers have already been tested and we should have
cleared them before proceeding to the next test case. So instead of
adding two new on_disconnect checks to the test plan, let's clear the
triggers before session.type() test case and remove 3 on_connect and 5
on_auth checks from the test plan.

Closes #3168

Merge remote-tracking branch 'origin/1.9' into 1.10 · fc4829bc
Konstantin Osipov authored 6 years ago

iproto: protect from false-correct size in msg header · c6951c92

Vladislav Shpilevoy authored 6 years ago

Consider this packet:

    msgpack = require('msgpack')
    data = msgpack.encode(18400000000000000000)..'aaaaaaa'

Tarantool interprets 18400000000000000000 as the size of an incoming
iproto request and tries, without any checks, to allocate a buffer
of that size. It calculates the needed capacity like this:

    capacity = start_value;
    while (capacity < size)
        capacity *= 2;

Here it is possible that on the i-th iteration 'capacity' < 'size',
but 'capacity * 2' overflows 64 bits and becomes < 'size' again,
so the loop never ends and occupies 100% CPU.

Strictly speaking overflow has undefined behavior. On the
original system it led to nullifying 'capacity'.

Such a size is improbable for a real packet, but it can appear
as a result of parsing an invalid packet whose first bytes
accidentally happen to be a valid MessagePack uint. This is
how the bug emerged on a real system.

Let's restrict the maximum packet size to 2GB.
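
The endless loop can be reproduced in Python by emulating 64-bit arithmetic with a mask. This is a sketch under assumptions: the start value 16384 is arbitrary, wraparound is modelled as the zeroing of 'capacity' reported above, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1         # emulate 64-bit unsigned arithmetic
MAX_PACKET_SIZE = 2 * 1024**3  # the 2GB limit named in the message


def capacity_unchecked(start: int, size: int, max_steps: int = 1000):
    """The doubling loop from the message; None means it would never end."""
    capacity = start
    for _ in range(max_steps):
        if capacity >= size:
            return capacity
        capacity = (capacity * 2) & MASK64  # wraps like a 64-bit integer
    return None  # the original loop would spin here forever


def capacity_checked(start: int, size: int) -> int:
    """Reject oversized packets before computing the capacity."""
    if size > MAX_PACKET_SIZE:
        raise ValueError("packet is too big")
    capacity = start
    while capacity < size:
        capacity *= 2
    return capacity
```

For the size from the packet above, 'capacity' climbs to 2^63, which is still smaller than the size, wraps to 0 on the next doubling and stays there. With the size check in front, the loop is only entered for sizes it can actually reach.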

Closes #3464

Jun 24, 2018

box: create bigrefs for tuples · 3768d4bb

Mergen Imeev authored 6 years ago

A tuple reference counter was limited to 65535, and it was
possible to hit this limit. This patch increases the capacity
of reference counters to 4 billion.
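
One way to picture such a scheme is a small in-object counter that is promoted to a wide side counter once it fills up. The Python sketch below is a guess at the general technique, not the actual bigref code: the side table, the 32-bit width and all names are assumptions.

```python
SMALL_MAX = 0xFFFF      # capacity of the small counter, per the message
BIG_MAX = 0xFFFFFFFF    # assumed 32-bit wide counter ("4 billion")


class Tuple:
    def __init__(self):
        self.refs = 0           # small counter stored in the tuple
        self.is_bigref = False  # True once the count lives in big_refs


big_refs = {}  # hypothetical side table: id(tuple) -> wide counter


def tuple_ref(t: Tuple) -> None:
    if t.is_bigref:
        if big_refs[id(t)] == BIG_MAX:
            raise OverflowError("too many references")
        big_refs[id(t)] += 1
    elif t.refs == SMALL_MAX:
        # The small counter is full: move the count to the wide counter.
        t.is_bigref = True
        big_refs[id(t)] = t.refs + 1
    else:
        t.refs += 1


def tuple_unref(t: Tuple) -> None:
    if t.is_bigref:
        big_refs[id(t)] -= 1
        if big_refs[id(t)] <= SMALL_MAX:
            # The count fits the small counter again: move it back.
            t.refs = big_refs.pop(id(t))
            t.is_bigref = False
    else:
        t.refs -= 1


def ref_count(t: Tuple) -> int:
    return big_refs[id(t)] if t.is_bigref else t.refs
```

Tuples with few references never touch the side table, so the common case stays as cheap as before.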

Closes #3224

Jun 14, 2018

session: fix box.session.sync() · 6cc31e04

Vladislav Shpilevoy authored 6 years ago

Before the patch, box.session.sync() is global for the session and
is updated on each new iproto request. When the connection is
multiplexed, box.session.sync() can change before the current
request finishes if a new one arrives.

The patch makes box.session.sync() local to the request,
protecting it from the multiplexing mess. After the patch,
box.session.sync() can be safely used inside a request.
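
The difference between a session-wide and a request-local sync can be modelled with Python generators standing in for request handlers that yield. Everything here (`Session`, `handler_global`, `handler_local`, `run_interleaved`) is a hypothetical toy model, not Tarantool code.

```python
class Session:
    def __init__(self):
        self.sync = None  # old behaviour: one value for the whole session


def handler_global(session, sync):
    """Old behaviour: the value is read from the session after a yield."""
    session.sync = sync
    yield  # another request on the same connection may start here
    return session.sync


def handler_local(session, sync):
    """New behaviour: the value is kept together with the request."""
    request_sync = sync
    yield
    return request_sync


def run_interleaved(handler):
    """Start two requests on one session, then let both finish."""
    session = Session()
    requests = [handler(session, 1), handler(session, 2)]
    for request in requests:
        next(request)  # run each request up to its yield
    results = []
    for request in requests:
        try:
            next(request)
        except StopIteration as stop:
            results.append(stop.value)
    return results
```

With the session-wide value, the first request sees the sync of the second one after it resumes; with the request-local value, each request sees its own.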

Closes #3450

@TarantoolBot document
Title: box.session.sync() became request local
Box.session.sync() was global for a session, so it was unusable
when the connection behind the session is multiplexed. Now
box.session.sync() is request local and can be safely used inside
the request processor.

fiber: remove fiber local storage · 766feac2

Vladislav Shpilevoy authored 6 years ago

Replace it with more specific structures and pointers in order to
prepare for adding `net` storage.

This makes the code that works with fiber storage simpler,
removes useless wrappers and casts, and allows the next patch to
remove the broken session.sync and add fiber sync.

Note that under no circumstances is fiber.h allowed to include
application-specific headers like session.h or txn.h. One is only
allowed to declare a struct and add an opaque pointer to it.

Merge branch '1.9' into 1.10 · 57ea7669
Vladimir Davydov authored 6 years ago

memtx: don't delay deletion of temporary tuples during snapshot · f9299c43

Vladimir Davydov authored 6 years ago

Since tuples stored in temporary spaces are never written to disk, we
can always delete them immediately, even when a snapshot is in progress.

Closes #3432

Remove unused space_noop · 93ed36ea
Vladimir Davydov authored 6 years ago

test: fix vinyl/upgrade/fill.lua script · ec84f36b

Vladimir Davydov authored 6 years ago

Since commit 8f63d5d9 ("vinyl: fail transaction immediately if it does
not fit in memory"), vinyl won't trigger memory dump if the size of
memory needed by a transaction is greater than the memory limit, instead
it will fail the transaction immediately. This broke the aforementioned
script, which relied on this to trigger system-wide memory dump. Fix it
by reworking the dump trigger logic used by the script: now it tries to
insert two tuples, box.cfg.vinyl_memory / 2 size each, instead of one.

Closes #3449

Jun 09, 2018
Merge branch '1.10' of github.com:tarantool/tarantool into 1.10 · 935f7586
Konstantin Osipov authored 6 years ago

Jun 08, 2018

Fix build · c8b95be6
Vladislav Shpilevoy authored 6 years ago

Merge remote-tracking branch 'origin/1.9' into 1.10 · d2260891
Alexander Turenko authored 6 years ago

debian: don't install systemd service file twice · e38d2762

Alexander Turenko authored 6 years ago

It fixes the following errors during tarantool installation from
packages on debian / ubuntu:

```
Unpacking tarantool (1.9.1.23.gacbd91c-1) ...
dpkg: error processing archive /var/cache/apt/archives/tarantool_1.9.1.23.gacbd91c-1_amd64.deb (--unpack):
 trying to overwrite '/lib/systemd/system/tarantool.service', which is also in package tarantool-common 1.9.1.23.gacbd91c-1
```

The problem is that the tarantool.service file was shipped in both the
tarantool-common and tarantool packages. It is a regression introduced by
8925b862.

The way to avoid installing / enabling the service file within the tarantool
package is to pass the `--name` option to dh_systemd_enable, but not to pass
the service file name. In that case dh_systemd_enable does not find the
service file and does not enforce the existence of the file.

Hopefully there is a less hacky way to do so, but I haven't found one at the
moment.

box: refactor hot standby recovery · 4b818e99

Vladimir Davydov authored 6 years ago

Currently, we start a hot standby fiber even if not in hot standby mode
(see recovery_follow_local). And we scan the wal directory twice - first
time in recovery_follow_local(), second time in recovery_finalize().
Let's factor out recover_remaining_wals() from those functions and call
it explicitly. And let's call follow_local() and stop_local() only if in
hot standby mode.

Needed for #461

box: retrieve instance uuid before starting local recovery · b22f8e80

Vladimir Davydov authored 6 years ago

In order to find out if the current instance fell too much behind its
peers in the cluster and so needs to be rebootstrapped, we need to
connect it to remote peers before proceeding to local recovery. The
problem is box.cfg.replication may have an entry corresponding to the
instance itself so before connecting we have to start listening to
incoming connections. Since an instance is supposed to send its uuid in
the greeting message, we also have to initialize INSTANCE_UUID early,
before we start local recovery. So this patch makes memtx engine
constructor not only scan the snapshot directory, but also read the
header of the most recent snapshot to initialize INSTANCE_UUID.

Needed for #461

Merge remote-tracking branch 'origin/1.9' into 1.10 · d2cddf46
Konstantin Osipov authored 6 years ago

Fix libunwind segfault · 5c3b3001

Georgy Kirichenko authored 6 years ago

Use the volatile asm modifier to prevent unwanted and awkward optimizations
that cause a segfault while backtracing.

netbox: introduce iterable future objects · 39709775

Vladislav Shpilevoy authored 6 years ago

Netbox has two major ways to execute a request: sync and async.
During execution of either, a server can send multiple responses via
IPROTO_CHUNK. The two ways differ in how they handle the
chunks (called messages or pushes).

For a sync request, one can specify an on_push callback and its
on_push_ctx argument; the callback is called on each message.

When a request is async, a user has only a future object and can
not specify any callbacks. To get the pushed messages, one must
iterate over the future object like this:

    for i, message in future:pairs(one_iteration_timeout) do
        ...
    end

Or ignore the messages by just calling future:wait_result(). Either
way, messages are not deleted, so one can iterate over the future
object again and again.
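
The "messages are not deleted" property can be shown with a toy future object. This Python class is only a stand-in for illustration: it is not the net.box API, it has no timeouts or networking, and `on_chunk` and `on_response` are hypothetical hooks.

```python
class Future:
    """Toy stand-in for an async request handle."""

    def __init__(self):
        self._messages = []  # pushed messages are kept, never consumed
        self._result = None

    def on_chunk(self, message):
        """Hypothetical hook: a pushed message has arrived."""
        self._messages.append(message)

    def on_response(self, result):
        """Hypothetical hook: the final response has arrived."""
        self._result = result

    def pairs(self):
        """Iterate over (index, message); indices start at 1 as in Lua."""
        return enumerate(list(self._messages), start=1)

    def wait_result(self):
        """Return the final result, ignoring the pushed messages."""
        return self._result
```

Since `pairs()` reads from a stored list instead of draining a queue, a second iteration sees the same messages as the first.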

Follow up #2677

session: introduce binary box.session.push · 2b1143a7

Vladislav Shpilevoy authored 6 years ago

Box.session.push() allows sending a message to a client without
finishing the main request. After this patch, Tarantool supports
pushes over the binary protocol.

IProto message is encoded using a new header code - IPROTO_CHUNK.
Push works as follows: a user calls box.session.push(message).
The message is encoded into currently active obuf in TX thread,
and then Kharon notifies IProto thread about new data.

Originally Kharon is the ferryman of Hades who carries souls of
the newly deceased across the rivers Styx and Acheron that
divided the world of the living from the world of the dead. In
Tarantool Kharon is a message and does similar work. It
notifies IProto thread about new data in an output buffer
carrying pushed messages to IProto. Styx here is cpipe, and the
boat is cbus message.

One connection has a single Kharon for all pushes. But Kharon can
not be in two places at the same time. So once he has left TX for
IProto, new messages can not send Kharon. They just set a special
flag. When Kharon is back in TX and sees that the flag is set, he
immediately takes the road back to IProto.

Thus a user is not blocked from writing to obuf while Kharon is
busy. The user just updates obuf and sets the flag if it is not set.
There is no waiting for Kharon to arrive back.
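
The single-messenger-plus-flag scheme can be modelled without threads. In the Python sketch below the two threads are reduced to method calls on one object, and every name (`Connection`, `push`, `iproto_process`, `kharon_returns`) is hypothetical; it shows the bookkeeping only, not the cbus machinery.

```python
class Connection:
    def __init__(self):
        self.obuf = []             # messages written on the TX side
        self.kharon_in_tx = True   # is the single messenger at home?
        self.has_new_data = False  # flag set while Kharon is away
        self.sent_upto = 0         # buffer position Kharon carries
        self.trips = 0             # how many times Kharon left for IProto
        self.flushed = []          # what the IProto side has seen so far

    def push(self, message):
        """TX side: never blocks, only appends and notifies."""
        self.obuf.append(message)
        if self.kharon_in_tx:
            self._send_kharon()
        else:
            self.has_new_data = True

    def _send_kharon(self):
        self.kharon_in_tx = False
        self.sent_upto = len(self.obuf)
        self.trips += 1

    def iproto_process(self):
        """IProto side: take everything up to the carried position."""
        self.flushed = self.obuf[:self.sent_upto]

    def kharon_returns(self):
        """Back in TX: leave again at once if data arrived meanwhile."""
        self.kharon_in_tx = True
        if self.has_new_data:
            self.has_new_data = False
            self._send_kharon()
```

Any number of pushes made while Kharon is away costs exactly one extra trip, because they all collapse into the same flag.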

Closes #2677

box: Add privileges constants to lua · 8c9a6e99

imarkov authored 7 years ago

Add lua bindings of PRIV_XXX constants.

This patch helps to avoid using numeric privilege constants
in schema.lua code.

Relates #945

xrow: add helper function for encoding vclock · a013f84b
Vladimir Davydov authored 6 years ago
```
So as not to duplicate the same code over and over again.
```

applier: remove extra new line in log message printed on connect · eaa2d482

Vladimir Davydov authored 6 years ago

An extra new line looks ugly in the log:

2018-06-06 15:22:22.682 [9807] main/101/interactive C> Tarantool 1.10.1-58-gd2272132
2018-06-06 15:22:22.682 [9807] main/101/interactive C> log level 5
2018-06-06 15:22:22.682 [9807] main/101/interactive I> mapping 268435456 bytes for memtx tuple arena...
2018-06-06 15:22:22.683 [9807] main/101/interactive I> mapping 134217728 bytes for vinyl tuple arena...
2018-06-06 15:22:22.692 [9807] main/101/interactive I> recovery start
2018-06-06 15:22:22.692 [9807] main/101/interactive I> recovering from `./00000000000000000006.snap'
2018-06-06 15:22:22.721 [9807] main/106/applier/ I> remote master is 1.10.1 at 0.0.0.0:44441

2018-06-06 15:22:22.723 [9807] main/106/applier/ C> leaving orphan mode
2018-06-06 15:22:22.723 [9807] main/101/interactive C> replica set sync complete, quorum of 1 replicas formed
2018-06-06 15:22:22.723 [9807] main/101/interactive I> ready to accept requests

recovery: constify vclock argument · 5fba685c
Vladimir Davydov authored 6 years ago
```
Neither recovery_new() nor recover_remaining_wals() need to modify it.
```
recovery: drop unused recovery_exit · 1b0e5052
Vladimir Davydov authored 6 years ago

security: Use system views instead of system spaces · 01455946

Ilya Markov authored 7 years ago

System views are used instead of direct reads of the corresponding system
spaces to explore all accessible objects such as spaces, functions, users,
etc. An operation with an inaccessible object produces a 'not found'
error even if the object exists.

In scope of #3250

Includes up fixes from Georgy

Jun 07, 2018

Merge remote-tracking branch 'origin/1.9' into 1.10 · cfdb47e1
Alexander Turenko authored 6 years ago

test: update test-run · acbd91cc

Alexander Turenko authored 6 years ago

* added --verbose to show output of successful TAP13 test (#73)
* allow to call create_cluster(), drop_cluster() multiple times (#83)
* support configurations (*.cfg files) in core = app tests
* added return_listen_uri = <boolean> option for create_cluster()
* save the tarantool log and print it on failure for core = app tests (#87)
