Commits · 1314b95b6ec85154b80ddb24a4e6b9bf37fe09e1 · core / tarantool

Aug 20, 2019

space: get rid of apply_initial_join_row method · 1314b95b

There's no reason to use a special method instead of the generic
space_execute_dml for applying rows received from a master during the
initial join stage. Moreover, using the special method results in not
running space.before_replace trigger, which makes it impossible to, for
example, update space engine on a replica, see the on_schema_init test
of the replication test suite.

So this patch removes the special method altogether and makes the code
that used it switch to space_execute_dml.

Closes #4417

1314b95b

memtx: enter small delayed free mode from snapshot iterator · 6136f84f

Vladimir Davydov authored 5 years ago

We must enable SMALL_DELAYED_FREE_MODE to safely use a memtx snapshot
iterator. Currently, we do that in checkpoint related callbacks, but if
we want to reuse snapshot iterators for other purposes, e.g. feeding
a read view to a newly joined replica, we had better hide this code
behind snapshot iterator constructors.

6136f84f

memtx: use ref counting to pin indexes for snapshot · 82331221

Vladimir Davydov authored 5 years ago

Currently, to prevent an index from going away while it is being
written to a snapshot, we postpone memtx_gc_task's free() invocation
until checkpointing is complete, see commit 94de0a08 ("Don't take
schema lock for checkpointing"). This works fine, but makes it rather
difficult to reuse snapshot iterators for other purposes, e.g. feeding
a consistent read view to a newly joined replica.

Let's instead use index reference counting for pinning indexes for
checkpointing. A reference is taken in a snapshot iterator constructor
and released when the snapshot iterator is destroyed.
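
The pinning scheme described above can be sketched as follows. This is a
minimal illustration in Python, not Tarantool's C code; the `Index` and
`SnapshotIterator` classes and all of their methods are hypothetical names
introduced only for this sketch.

```python
class Index:
    """Hypothetical stand-in for an index object with a reference counter."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = list(rows)
        self.refs = 1            # reference held by the owning space
        self.destroyed = False

    def ref(self):
        self.refs += 1

    def unref(self):
        self.refs -= 1
        if self.refs == 0:
            # The last reference is gone: only now is the data released.
            self.rows = None
            self.destroyed = True


class SnapshotIterator:
    """Pins the index in its constructor and unpins it in close()."""

    def __init__(self, index):
        index.ref()
        self._index = index
        self._pos = 0

    def next(self):
        if self._pos >= len(self._index.rows):
            return None
        row = self._index.rows[self._pos]
        self._pos += 1
        return row

    def close(self):
        self._index.unref()
        self._index = None


index = Index("pk", [(1, "a"), (2, "b")])
it = SnapshotIterator(index)
index.unref()                    # the space drops the index mid-snapshot
survived_drop = not index.destroyed
rows = [it.next(), it.next(), it.next()]
it.close()                       # the iterator releases the last reference
```

The index outlives the drop for as long as the iterator exists, so no
deferred free() bookkeeping is needed on the checkpoint side.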

82331221

vinyl: get rid of vy_env::join_lsn · 96a7ae06

Vladimir Davydov authored 5 years ago

This fake LSN counter, which is used for assigning LSNs to Vinyl
statements during the initial join stage, was introduced a long time
ago, when LSNs were used as identifiers for lsregion allocations and
hence were supposed to grow strictly monotonically with each new
transaction. Later on, they were reused for assigning unique LSNs to
identify indexes in vylog.

These days, we don't need initial join LSNs to be unique, as we switched
to generations for lsregion allocations while in vylog we now use LSNs
only as an incarnation counter, not as a unique identifier. That said,
let's zap vy_env::join_lsn and simply assign 0 to all statements
received during the initial join stage.

To achieve that, we just need to relax an assertion in vy_tx_commit()
and remove the assumption that an LSN can't be zero in the write
iterator implementation.

96a7ae06

vinyl: don't pin index for iterator lifetime · 02da82ea

Vladimir Davydov authored 5 years ago

vinyl_iterator keeps a reference to the LSM tree it was created for
until it is destroyed, which may take indefinitely long in case the
iterator is used in Lua. Actually, we don't need to keep a reference to
the index for the whole iterator lifetime, because iterator_next()
wrapper guarantees that iterator->next won't be called for a dropped
index. What we need to do is keep a reference while we are yielding on
disk read, similarly to vinyl_index_get().

Currently, pinning an index indefinitely is harmless, because an LSM
tree is exempted from dump/compaction as soon as it is dropped, so we
just pin some memory, that's all. However, the following patches are
going to enable dump/compaction for dropped but pinned indexes in order
to implement a snapshot iterator, so we had better relax the dependency
of an iterator on an index now.

While we are at it, let's remove env and lsm members of vinyl_iterator
struct: lsm can be accessed via vy_read_iterator embedded in the struct
while env is only needed to access iterator_pool so we better store a
pointer to the pool in vinyl_iterator instead.

02da82ea

sql: remove SQL_FUNC_SLOCHNG flag · c33f0804

Kirill Shcherbatov authored 5 years ago

The SQL_FUNC_SLOCHNG flag was useful for datetime functions,
which are currently not supported, so it can be removed.

Needed for #2200, #4113, #2233

c33f0804

sql: rename OP_Function to OP_BuiltinFunction · 2437b059

Kirill Shcherbatov authored 5 years ago

Renamed the OP_Function opcode to OP_BuiltinFunction to make room
for a new OP_Function operation with a new meaning: the new
OP_Function will call a Tarantool function through the new
port-based API, while the legacy OP_BuiltinFunction remains an
efficient implementation of SQL built-in functions.

Needed for #2200, #4113, #2233

2437b059

sql: rework SQL_FUNC_COUNT flag semantics · 1d69f568

Kirill Shcherbatov authored 5 years ago

Tarantool's SQL engine generates different VDBE bytecode
for ..COUNT(*).. and ..COUNT(fieldname).. operations:
the first one produces a lightweight OP_Count operation that uses
a native mechanism to report the number of records in an index,
while the second one pessimistically opens a space read iterator
and uses the Count aggregate function.

The helper routine is_simple_count decides whether such an
optimisation is correct. It used to rely on the SQL_FUNC_COUNT flag,
which marked a dummy (non-functional) function entry with 0 arguments.
This patch changes the SQL_FUNC_COUNT semantics: now it marks
any COUNT function, while is_simple_count relies on the number
of arguments to distinguish aggregate and non-aggregate
functions.

Needed for #2200, #4113, #2233

1d69f568

sql: wrap all trim functions in dispatcher · 77e4c842

Kirill Shcherbatov authored 5 years ago

A new dispatcher function trim_func calls the corresponding trim_
function implementation depending on argc, the number of
arguments.

This is an important step towards getting rid of function name
overloading, a prerequisite for replacing the FuncDef cache with
Tarantool's function cache.
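
Dispatching on the argument count can be sketched as below. This is a
Python illustration of the idea only: the three `trim_*` helpers, their
argument order and the error message are assumptions made for the
sketch, not the actual C implementations in the patch.

```python
def trim_one_arg(s):
    """TRIM(s): strip spaces from both sides (assumed behaviour)."""
    return s.strip(" ")


def trim_two_args(chars, s):
    """TRIM(chars FROM s): strip the given characters from both sides."""
    return s.strip(chars)


def trim_three_args(side, chars, s):
    """TRIM(side chars FROM s): strip on the requested side only."""
    if side == "leading":
        return s.lstrip(chars)
    if side == "trailing":
        return s.rstrip(chars)
    return s.strip(chars)


# One public name, one implementation per argument count.
_TRIM_BY_ARGC = {1: trim_one_arg, 2: trim_two_args, 3: trim_three_args}


def trim_func(*args):
    """Single entry point that picks an implementation by len(args)."""
    impl = _TRIM_BY_ARGC.get(len(args))
    if impl is None:
        raise TypeError(f"trim expects 1 to 3 arguments, got {len(args)}")
    return impl(*args)
```

With a single dispatcher, the function registry needs only one entry
per name, which is what makes overloading by name unnecessary.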

Needed for #2200, #4113, #2233

77e4c842

sql: GREATEST, LEAST instead of MIN/MAX overload · a46b5200

Kirill Shcherbatov authored 5 years ago

This patch does two things: renames the existing scalar min/max
functions and reserves names for them in the NoSQL cache.

Moreover, it is an important step towards getting rid of function
name overloading, a prerequisite for replacing the FuncDef cache
with Tarantool's function cache.

Closes #4405
Needed for #2200, #4113, #2233

@TarantoolBot document
Title: Scalar functions MIN/MAX are renamed to LEAST/GREATEST

The MIN/MAX functions are typically used only as aggregate
functions in other RDBMS (MSSQL, PostgreSQL, MySQL, Oracle), while
Tarantool's legacy SQLite code also used them as the scalar
LEAST/GREATEST functions. This is now fixed.

a46b5200

sql: remove SQL_PreferBuiltin flag · 79f3bf4b

Kirill Shcherbatov authored 5 years ago

The SQL_PreferBuiltin flag is redundant (because builtin names
are forbidden for UDFs), so we can remove it.

Needed for #4113, #2200, #2233

79f3bf4b

sql: improve vdbe_field_ref fetcher · 72279c1c

Kirill Shcherbatov authored 5 years ago

Vdbe field ref is a dynamic index over tuple fields that stores
the offset of each field and fills the offset array on demand.
It is heavily used in SQL, which relies on fast and
repeated access to any field, not only indexed ones.

There is an optimisation for the case when a requested field
fieldno is indexed and the tuple itself stores the offset of the
field in its own small field map, used by indexes. vdbe_field_ref
then uses that map to retrieve the offset value without decoding
anything. But when SQL requests any field > fieldno,
vdbe_field_ref decodes the tuple from the beginning in the worst
case, even if fieldno has been accessed previously, although it
could start decoding from fieldno.

The updated vdbe_field_ref fetcher uses a bitmask of
initialized slots to reuse pre-calculated offsets when possible.
This speeds up SQL in some corner cases and doesn't hurt
performance in general scenarios.
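
The bitmask idea can be illustrated with a small Python sketch. It is
not the patch's C code: the `FieldRef` class, the one-byte length-prefix
tuple layout, and the `decoded` counter are all assumptions chosen to
keep the example short. The point it shows is that a lookup resumes
from the closest already-known offset instead of from the start.

```python
class FieldRef:
    """Lazy offset cache over a tuple of length-prefixed fields."""

    def __init__(self, data, field_count):
        self.data = data
        self.offsets = [0] * field_count
        self.slot_bitmask = 1     # bit i set: offsets[i] is valid; field 0 is at 0
        self.decoded = 0          # fields skipped while decoding (demo only)

    def _offset(self, fieldno):
        if self.slot_bitmask & (1 << fieldno):
            return self.offsets[fieldno]
        # Highest initialized slot below fieldno.
        known = self.slot_bitmask & ((1 << fieldno) - 1)
        start = known.bit_length() - 1
        pos = self.offsets[start]
        for i in range(start, fieldno):
            pos += 1 + self.data[pos]          # skip length byte and payload
            self.offsets[i + 1] = pos
            self.slot_bitmask |= 1 << (i + 1)
            self.decoded += 1
        return pos

    def fetch(self, fieldno):
        pos = self._offset(fieldno)
        length = self.data[pos]
        return self.data[pos + 1:pos + 1 + length]


def encode(fields):
    """Hypothetical tuple format: each field is <1-byte length><payload>."""
    return b"".join(bytes([len(f)]) + f for f in fields)


data = encode([b"id", b"name", b"city", b"zip", b"note"])
ref = FieldRef(data, 5)
third = ref.fetch(2)                  # walks over fields 0 and 1
steps_first = ref.decoded
fifth = ref.fetch(4)                  # resumes from field 2, not from 0
steps_second = ref.decoded - steps_first
```

In this sketch an offset taken from the tuple's own field map would
simply be another bit set in `slot_bitmask` before the first fetch.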

Closes #4267

72279c1c

Aug 19, 2019
- log: fix segfault on _say without filename · d0e38d59
  Mons Anderson authored 5 years ago
  
  d0e38d59
- relay: set `last_row_time' to `now' in `relay_new' and `relay_start'. (#4431) · 15f12c0b
  rtokarev authored 5 years ago
  
  (cherry picked from commit 507f3721)
  15f12c0b
Aug 16, 2019

gc: randomize the next checkpoint time also after a manual box.snapshot(). · 6277f48a

Konstantin Osipov authored 5 years ago

Before this patch, the snapshot time was chosen randomly within the
checkpoint_interval period. However, after box.snapshot(), the next
snapshot was scheduled exactly checkpoint_interval from the current time.
Many orchestration scripts snapshot the entire cluster right after
deployment to take a backup. This kills randomness, since all instances
begin to count the next checkpoint time from the current time.

Randomize the next checkpoint time after a manual snapshot as well.
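
A minimal Python sketch of the idea, assuming a uniform random offset
within one interval (the exact formula used by the patch is not stated
in this message, and `next_checkpoint_time` is a hypothetical name):

```python
import random


def next_checkpoint_time(now, checkpoint_interval, rng):
    """Pick the next checkpoint at a random point within one interval."""
    return now + rng.uniform(0, checkpoint_interval)


now = 1000.0
interval = 3600.0
# Three instances that all ran box.snapshot() at the same moment; each
# has its own random generator, seeded differently for the demo.
schedule = [
    next_checkpoint_time(now, interval, random.Random(seed))
    for seed in (1, 2, 3)
]
```

Because every instance draws its own offset, a cluster-wide manual
snapshot no longer lines up the following automatic checkpoints.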

Fixes gh-4432

6277f48a

test: update test-run · 05fb6faa

Alexander Turenko authored 5 years ago

pretest_clean: preserve GREATEST and LEAST built-in functions.

Needed for #4405.

05fb6faa

sql: remove mask from struct Keyword · 13ed1126

Roman Khabibov authored 5 years ago

Originally, mask in struct Keyword served to reduce the set of reserved
keywords for build-dependent features. For instance, it was allowed to
disable triggers as a compilation option, and in this case TRIGGER
wouldn't be a reserved word. Nowadays, our build always comprises all
features, so there's no need for this option anymore. Hence, we can
remove mask along with its related macros.

Closes #4155

13ed1126

Hotfix for · 0894bec2

Nikita Pettik authored 5 years ago

The previous patch forgot to update the result file of
sql/bind.test.lua. Let's fix that and refresh sql/bind.result with
up-to-date results.

0894bec2

Aug 15, 2019
- sql: fix type in meta for unsigned binding · b7d595ac
  Nikita Pettik authored 5 years ago
  
  It was decided that for all integer literals we would return "INTEGER" type, not "UNSIGNED". Accidentally, after substitution of an unsigned binding value, the type was set to "UNSIGNED". Let's fix that and set the "INTEGER" type.
  b7d595ac
- luajit: Bump luajit version · 03a39c3d
  Kirill Yukhin authored 5 years ago
  
  03a39c3d
- luajit: Bump luajit version · a634bd7d
  Kirill Yukhin authored 5 years ago
  
  a634bd7d
- test: new test for LuaJIT fold machinery · 26303604
  Sergey Ostanevich authored 5 years ago
  
  https://github.com/LuaJIT/LuaJIT/issues/505
  26303604
Aug 14, 2019

wal: make wal_sync fail on write error · 2d5e56ff

Vladimir Davydov authored 5 years ago

wal_sync() simply flushes the tx<->wal request queue, it doesn't
guarantee that all pending writes are successfully committed to disk.
This works for now, but in order to implement replica join off the
current read view, we need to make sure that all pending writes have
been persisted and won't be rolled back before we can use memtx
snapshot iterators. So this patch adds a return code to wal_sync():
from now on it returns -1 if rollback is in progress and hence
some in-memory changes are going to be rolled back. We will use
this method after opening memtx snapshot iterators used for feeding
a consistent read view to a newly joined replica so as to ensure that
changes frozen by the iterators have made it to the disk.

2d5e56ff

test: app/socket flaky fails at 1118 line · 952d8d1d

Alexander V. Tikhonov authored 5 years ago

On heavily loaded hosts the test fails intermittently at:

[004] --- app/socket.result	Mon Jul 15 07:18:57 2019
[004] +++ app/socket.reject	Tue Jul 16 16:37:35 2019
[004] @@ -1118,7 +1118,7 @@
[004]  ...
[004]  ch:get(1)
[004]  ---
[004] -- true
[004] +- null
[004]  ...
[004]  s:error()
[004]  ---

The test was originally written to check the channel get() function
timeout and the error that occurred on it, but later the checked
error changed to:
"builtin/socket.lua: attempt to use closed socket" and the
test became incorrect. Currently it passes when the socket read
function runs before the socket is closed, but in that case the
read call doesn't wait. On the other hand, on heavily loaded hosts
the close call may occur before the read call; in that case the
read call halts and the get call returns 'null'. Neither way is
a correct check of the error, so the subtest is removed.

Check commit ba7a4fee ("Add tests for socket:close closes #360")

Fixes #4354

952d8d1d

xrow: factor out helper for setting REPLACE request body · e687cacd

Vladimir Davydov authored 5 years ago

We will reuse it to relay a snapshot to a newly joined replica.

e687cacd

memtx: allow snapshot iterator to fail · aef84078

Vladimir Davydov authored 5 years ago

Memtx iterators never fail, that's why the snapshot iterator interface
doesn't support failures. However, once we introduce snapshot iterator
support for vinyl, we will need a way to handle errors in the API.

aef84078

memtx: don't store pointers to index internals in iterator · a22f2219

Vladimir Davydov authored 5 years ago

It's pointless as we can always access the index via iterator->index.

a22f2219

vinyl: move reference counting from vy_lsm to index · 7e11dd4f

Vladimir Davydov authored 5 years ago

Now, as vy_lsm and index are basically the same object, we can implement
reference counting right in struct index. This will allow us to prevent
an index from being destroyed when the space object it belongs to is
freed anywhere in the code, not just in vinyl.

7e11dd4f

vinyl: embed index in vy_lsm · 576611eb

Vladimir Davydov authored 5 years ago

There's no point in having vinyl_engine and vinyl_index wrapper structs
to bind vy_env and vy_lsm to struct engine and index. Instead we can
simply embed engine and index in vy_env and vy_lsm. This will simplify
further development, e.g. this will allow us to move reference counting
from vy_lsm up to struct index so that it can be used in the generic
code.

576611eb

vinyl: embed engine in vy_env · a771c2e5

Vladimir Davydov authored 5 years ago

There's no point in having vinyl_engine and vinyl_index wrapper structs
to bind vy_env and vy_lsm to struct engine and index. Instead we can
simply embed engine and index in vy_env and vy_lsm. This will simplify
further development, e.g. this will allow us to move reference counting
from vy_lsm up to struct index so that it can be used in the generic
code.

a771c2e5

decimal: add modulo operator · 8ea7106a

Serge Petrenko authored 5 years ago

Part of #4403

@TarantoolBot document
Title: Document decimal modulo operator

There is now a modulo operator for decimal numbers:
```
decimal = require('decimal')
a = decimal.new(172.51)
a % 1
---
- '0.51'
...
a % 0.3
---
- '0.01'
...
a % 13.27
---
- '0.00'
...
a % 173
---
- '172.51'
...
a % 72
---
- '28.51'
...
720 % a
---
- '29.96'
...
```
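
The arithmetic in the transcript can be cross-checked outside
Tarantool. The sketch below uses Python's standard `decimal` module,
which is a separate implementation from Tarantool's decimal library;
it only confirms the values for these positive operands and says
nothing about how Tarantool computes them.

```python
from decimal import Decimal

a = Decimal("172.51")   # built from a string to avoid binary float noise

# Same expressions as in the console session, in the same order.
checks = [
    str(a % 1),
    str(a % Decimal("0.3")),
    str(a % Decimal("13.27")),
    str(a % 173),
    str(a % 72),
    str(720 % a),
]
```

Note that 13.27 * 13 equals 172.51 exactly, which is why that
remainder is zero.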

8ea7106a

Aug 13, 2019

json: detect a new invalid json path case · ef64ee51

Vladislav Shpilevoy authored 5 years ago

JSON paths have no strict standard, but there is definitely no
implementation that allows omitting '.' after [] if the next token
is a key. For example:

    [1]key

is invalid. It should be written like this:

    [1].key

Strangely, we even had tests for the invalid case.
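
The rule can be illustrated with a small Python checker. It assumes a
deliberately simplified path grammar (`[N]`, `["key"]`, `.key`, plus an
optional bare first key) and is not Tarantool's JSON path lexer;
`is_valid_path` is a hypothetical name.

```python
import re

# A follow-up token is either a bracketed index/quoted key or ".key".
_TOKEN = re.compile(r'\[(?:\d+|"[^"]*")\]|\.[A-Za-z_]\w*')
# Only the very first key may appear without a leading dot.
_FIRST_KEY = re.compile(r"[A-Za-z_]\w*")


def is_valid_path(path):
    """Return True if every token after the first is '[...]' or '.key'."""
    pos = 0
    first = _FIRST_KEY.match(path)
    if first:
        pos = first.end()
    while pos < len(path):
        token = _TOKEN.match(path, pos)
        if token is None:
            return False      # e.g. a bare key right after ']'
        pos = token.end()
    return True
```

Under this grammar `[1].key` and `key[1]` are accepted, while `[1]key`
is rejected because nothing but `.` or `[` may follow a closing bracket.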

Closes #4419

ef64ee51

Aug 11, 2019

test: update test-run · 940a673e

Alexander Turenko authored 5 years ago

Disable a check whether yaml responses are well-formed for 'core =
tarantool' tests in test-run. The check was unable to handle complex
(dictionary / list) keys in a dictionary, because pyyaml does not
support them.

See also https://github.com/yaml/pyyaml/issues/88

Fixes #4421.

940a673e

Aug 09, 2019

test: fix flaky swim/errinj.test.lua · 4b893910

Vladislav Shpilevoy authored 5 years ago

In one place that test sends a packet and expects that it has
arrived two lines below. Under high load it may take more time.
The patch makes the test explicitly wait for the packet arrival.
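
The general pattern, polling for a condition instead of assuming it
already holds, can be sketched in Python as follows. The real test is
written in Lua; `wait_until` and the timer-based "packet" here are
hypothetical stand-ins.

```python
import threading
import time


def wait_until(predicate, timeout=5.0, interval=0.01):
    """Poll predicate() until it is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()


received = []
# Simulate a packet that arrives a little later than the test expects.
sender = threading.Timer(0.05, received.append, args=("ping",))
sender.start()
arrived = wait_until(lambda: len(received) == 1)
sender.join()
```

The test outcome then depends on the packet arriving within the
timeout, not on how many lines the interpreter executed in between.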

Closes #4392

4b893910

Aug 08, 2019

Expose tarantool_package value to lua api (#4412) · 2e97c607

Yaroslav Dynnikov authored 5 years ago

There is a compile-time option PACKAGE in cmake that defines the
distribution info of the current build. By default it's
"Tarantool" for the community version and "Tarantool Enterprise"
for the enterprise version.

It's displayed in console greeting and in `box.info().package`,
but, unfortunately, it can't be accessed from Lua before `box.cfg`.

This patch exposes `require('tarantool').package`.

Close #4408

@TarantoolBot document
Title: Extend module "tarantool" with the field "package"

Besides build info and version, module "tarantool" now provides
a "package" field. By default it equals the string "Tarantool", but
it can differ for other distributions, like "Tarantool Enterprise".

Example:

```console
tarantool> require('tarantool')
---
- version: 2.3.0-3-g302bb3241
  build:
    target: Linux-x86_64-RelWithDebInfo
    options: cmake . -DCMAKE_INSTALL_PREFIX=/opt/tarantool-install
-DENABLE_BACKTRACE=ON
    mod_format: so
    flags: ' -fexceptions -funwind-tables -fno-omit-frame-pointer
-fno-stack-protector
      -fno-common -fopenmp -msse2 -std=c11 -Wall -Wextra
-Wno-strict-aliasing -Wno-char-subscripts
      -Wno-format-truncation -fno-gnu89-inline -Wno-cast-function-type'
    compiler: /usr/bin/cc /usr/bin/c++
  pid: 'function: 0x40016cd0'
  package: Tarantool
  uptime: 'function: 0x40016cb0'
...

```

2e97c607

Aug 06, 2019

box: make functional index creation transactional · 302bb324

Kirill Shcherbatov authored 5 years ago

The _func_index space trigger used to reject an insertion of a
tuple that defines an invalid functional index.
As the insertion into the _index space had been completed before,
garbage was left in the _index space in such a case.

We need to do something with the yielding _func_index trigger (which
rebuilds an index) to wrap the whole index creation operation in a DDL
transaction in further patches (because only the first DDL
operation may yield now).

This problem can be trivially solved by initializing the
index_def function pointer in advance: the memtx_tree
index construction then does all the required work.
Therefore the following index rebuild in the _func_index trigger
becomes redundant and should be omitted.

In other words, a trivial prefetch operation makes
transactional index creation (with the space:create_index operation)
possible.

As for index construction during recovery (the lack of a function
cache during recovery was the main motivation for introducing the
_func_index space), its workflow is kept unchanged.

Follow up #1250
Needed for #4348
Closes #4401

302bb324

rfc: vylog ups and downs · 27436b40

Vladimir Davydov authored 5 years ago

As per request by Kostja, commit an RFC document with a brief history of
the vinyl metadata log infrastructure, issues it was intended to solve,
problems we are facing now, and possible ways to solve them.

27436b40

Aug 02, 2019

relay: stop relay on subscribe error · 35ef3320

Vladimir Davydov authored 5 years ago

In case an error occurs between relay_start() and cord_costart() in
relay_subscribe(), the relay status won't be reset to STOPPED. As a
result, any further attempt to re-subscribe will fail with ER_CFG:
duplicate connection with the same replica UUID. This may happen, for
example, if the WAL directory happens to be temporarily inaccessible on
the master.
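
The failure mode and the fix can be modelled with a toy state machine
in Python. The `Relay` class, its state names and the `open_wal`
callback are assumptions made for this sketch; they only mirror the
behaviour described above, not the actual relay code.

```python
class Relay:
    """Toy relay that refuses a second subscription while one is active."""

    def __init__(self):
        self.state = "OFF"

    def subscribe(self, open_wal):
        if self.state == "FOLLOW":
            raise RuntimeError(
                "duplicate connection with the same replica UUID")
        self.state = "FOLLOW"          # corresponds to starting the relay
        try:
            open_wal()                 # may fail before the relay thread runs
        except Exception:
            # The fix: reset the status so that a retry is not rejected.
            self.state = "STOPPED"
            raise


def broken_wal():
    raise OSError("WAL directory is inaccessible")


relay = Relay()
try:
    relay.subscribe(broken_wal)
except OSError:
    pass
state_after_failure = relay.state
relay.subscribe(lambda: None)          # the retry is accepted
```

Without the reset in the `except` branch, the second `subscribe` call
would hit the duplicate-connection check even though no relay is
running.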

Closes #4399

35ef3320

Update repository for packages · ce0f2ef8
Kirill Yukhin authored 5 years ago

Tag: 2.3.0

ce0f2ef8

box/console: Don't allow arguments in get_default_output · 4138645e

Cyrill Gorcunov authored 5 years ago

The function

 | require('console').get_default_output()

requires no arguments. Make it explicit and print
an error otherwise.

Part-of #3834

4138645e