Commits · 41477adac6d3d1ff6cddf8bd8ff48b36f16945dc · core / tarantool

Jul 24, 2019

sql: introduce extended range for INTEGER type · 41477ada

Nikita Pettik authored 5 years ago

This patch allows operating on integer values in the range
[2^63, 2^64 - 1]. It means that:
 - One can use literals from 9223372036854775808 to 18446744073709551615
 - One can pass values from the mentioned range to bindings
 - One can insert and select values from the mentioned range

Support of built-in functions and operators has been introduced in
previous patches.

Closes #3810
Part of #4015

sql: make built-in functions operate on unsigned values · cf43cd8c

Nikita Pettik authored 5 years ago

As a part of introducing the unsigned type in SQL, let's patch all
built-in functions to make them accept and operate on unsigned values,
i.e. values which come with the MEM_UInt VDBE memory type.

Part of #3810
Part of #4015

sql: refactor arithmetic operations to support unsigned ints · be389cb0

Nikita Pettik authored 5 years ago

Let's patch internal VDBE routines which add, subtract, multiply, divide
and calculate the remainder of division to allow them to take operands
of unsigned type. To this end, each operator now accepts the signs of
both operands and returns the sign of the result.
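
A minimal sketch of what "accepts the signs of both operands and
returns the sign of the result" can look like for addition. The name
add_int() and its signature are hypothetical and are not taken from the
patch; the sketch only assumes that a negative operand is stored as a
two's-complement bit pattern and a non-negative one as a plain unsigned
value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the actual VDBE routine. Each operand is
 * a 64-bit pattern plus a flag: when the flag is true the bits hold
 * a two's-complement negative value, otherwise an unsigned value in
 * [0, 2^64 - 1]. Returns 0 on success and -1 on overflow.
 */
static int
add_int(uint64_t lhs, bool is_lhs_neg, uint64_t rhs, bool is_rhs_neg,
	uint64_t *res, bool *is_res_neg)
{
	assert(res != NULL && is_res_neg != NULL);
	if (is_lhs_neg && is_rhs_neg) {
		/* Both negative: the sum must stay >= INT64_MIN. */
		int64_t l = (int64_t)lhs, r = (int64_t)rhs;
		if (l < INT64_MIN - r)
			return -1;
		*res = (uint64_t)(l + r);
		*is_res_neg = true;
		return 0;
	}
	if (!is_lhs_neg && !is_rhs_neg) {
		/* Both non-negative: the sum must stay <= UINT64_MAX. */
		if (lhs > UINT64_MAX - rhs)
			return -1;
		*res = lhs + rhs;
		*is_res_neg = false;
		return 0;
	}
	/*
	 * Mixed signs never overflow. Unsigned wrap-around yields the
	 * right bit pattern; only the sign has to be worked out.
	 */
	uint64_t neg = is_lhs_neg ? lhs : rhs;
	uint64_t pos = is_lhs_neg ? rhs : lhs;
	uint64_t magnitude = 0 - neg; /* |neg|, in [1, 2^63] */
	*is_res_neg = pos < magnitude;
	*res = pos + neg; /* modulo 2^64 */
	return 0;
}
```

For example, adding -5 and 2^63 under these assumptions gives the
unsigned result 2^63 - 5 with the sign flag cleared, which would not fit
into a signed 64-bit integer pair without the extra flag.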

Part of #3810
Part of #4015

sql: separate VDBE memory holding positive and negative ints · 5d6c09b3

Nikita Pettik authored 5 years ago

As it was stated in the previous commit message, we are going to support
operations on unsigned values. Since unsigned and signed integers have
different memory representations, to provide correct results of
arithmetic operations we should be able to tell whether a value is
signed or not.
This patch introduces a new type of value placed in a VDBE memory cell:
MEM_UInt. This flag means that the value is a non-negative integer,
hence it fits in the range [0, 2^64 - 1]. Such an approach would make
further replacing of MEM_* flags with MP_ format types quite easy:
during decoding and encoding msgpack we assume that negative integers
have the MP_INT type and non-negative ones have MP_UINT. We also add and
refactor several auxiliary helpers to operate on integers. Note that the
current changes don't add the ability to operate on unsigned integers:
it is still unavailable.

Needed for #3810
Needed for #4015

sql: remove sqlColumnDefault() function · e92802fc

Nikita Pettik authored 5 years ago

This routine implements an obsolete mechanism to retrieve the default
column value. Since it is not used anymore, let's remove it. Note that
the related functions valueFromExpr()/valueFromFunction() are not erased
since they look pretty useful and may be involved later.

sql: refactor VDBE opcode OP_OffsetLimit · c98f971e

Nikita Pettik authored 5 years ago

The OP_OffsetLimit instruction calculates the sum of the OFFSET and
LIMIT values when they are present. This sum serves as a counter of
entries to be inserted into a temporary space during VDBE execution.
Consider a query like:

SELECT * FROM t ORDER BY x LIMIT 5 OFFSET 2;

To perform ordering along with applying the limit and offset
restrictions, the first 7 (5 + 2) entries are inserted into a temporary
space. They are sorted and then the first two tuples are skipped
according to the offset clause. The remaining tuples from the temporary
space go to the result set.

When the sum of the LIMIT and OFFSET values is big enough to cause
integer overflow, we can't apply this approach. Previously, the counter
was simply set to -1, which means that all entries from the base table
will be transferred to the ephemeral space. As a result, the LIMIT
clause was ignored and the result of the query would be incorrect. The
motivation for this obviously wrong step was that performing a query
with such huge limit and offset values requires too much time (like
years). What is more, ephemeral spaces support auto-generated IDs only
in the range up to 2^32, so there's no opportunity to process such
queries even in theory. Nevertheless, to make the code cleaner let's fix
this tricky place and just raise an overflow error if the result of the
addition exceeds the integer range.

This patch also fixes obsolete comments saying that in case of overflow
execution won't stop; now the limit and offset counters are always >= 0,
so the redundant branching is removed.
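
The overflow check described above can be sketched roughly as follows.
The function name offset_limit_counter() is made up for illustration,
and the sketch assumes that both values are already known to be
non-negative and that the counter is a signed 64-bit integer; the real
opcode may differ in both respects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the number of entries to collect for
 * "LIMIT limit OFFSET offset". Returns 0 on success, -1 when the sum
 * does not fit into int64_t (the caller would raise an error then).
 */
static int
offset_limit_counter(int64_t limit, int64_t offset, int64_t *counter)
{
	assert(counter != NULL);
	/* Assumption: negative values were rejected earlier. */
	assert(limit >= 0 && offset >= 0);
	/* Test before adding: signed overflow is undefined in C. */
	if (limit > INT64_MAX - offset)
		return -1;
	*counter = limit + offset;
	return 0;
}
```

With LIMIT 5 OFFSET 2 the counter is 7; with a limit of INT64_MAX and
an offset of 1 the helper reports an overflow instead of falling back
to a "take everything" counter.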

sql: refactor sql_atoi64() · 8c6edcf5

Nikita Pettik authored 5 years ago

We are going to allow using unsigned values in SQL and extend the range
of the INTEGER type. Hence, we should be able to parse and operate on
integers in the range [2^63, 2^64 - 1]. The current mechanism, which
involves the sql_atoi64() function, doesn't allow this.

Let's refactor this function: firstly, get rid of manual parsing and use
the strtoll() and strtoull() functions from the standard library. Then,
let's return the sign of the parsed literal. Now the function returns 0
in case of success and -1 otherwise.

This patch also inlines sql_dec_or_hex_to_i64() at the place of its only
usage: it makes the code cleaner and more straightforward.
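
A rough sketch of the parsing approach, assuming decimal literals only.
The name parse_int_literal() and its signature are hypothetical; this
is not the body of sql_atoi64(), just an illustration of combining
strtoll() and strtoull() while reporting the sign separately.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal literal from the range
 * [-2^63, 2^64 - 1]. Negative values are stored in *val as a
 * two's-complement bit pattern with *is_neg set to true.
 * Returns 0 on success, -1 otherwise.
 */
static int
parse_int_literal(const char *str, uint64_t *val, bool *is_neg)
{
	assert(str != NULL && val != NULL && is_neg != NULL);
	bool has_minus = str[0] == '-';
	const char *digits = has_minus ? str + 1 : str;
	/* Reject empty input, spaces and '+', which strto*() accept. */
	if (digits[0] < '0' || digits[0] > '9')
		return -1;
	char *end = NULL;
	errno = 0;
	if (has_minus) {
		long long v = strtoll(str, &end, 10);
		/* ERANGE means the literal is below -2^63. */
		if (errno != 0 || *end != '\0')
			return -1;
		*val = (uint64_t)v;
		*is_neg = v < 0;
	} else {
		unsigned long long v = strtoull(str, &end, 10);
		/* ERANGE means the literal is above 2^64 - 1. */
		if (errno != 0 || *end != '\0')
			return -1;
		*val = v;
		*is_neg = false;
	}
	return 0;
}
```

Choosing the conversion function by the leading minus matters because
strtoull() silently negates "-1" into 2^64 - 1 instead of failing.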

Needed for #3810
Needed for #4015

decimal: fix decimal.round() when scale == 0 · b0e5fa65

Serge Petrenko authored 5 years ago

Fixes decimal.round() with zero scale, fixes an error with
decimal.round() when rounding leads to a number with the same
precision, for example, decimal.round(9.9, 0) -> 10.

Follow-up #692

decimal: add methods trim and rescale · 8221cced

Serge Petrenko authored 5 years ago

This patch adds 2 methods to decimal library and lua module:
trim will remove all trailing fractional zeros and rescale will
either perform rounding or append excess fractional zeros.

Closes #4372

@TarantoolBot document
Title: document 2 new functions in decimal Lua module

2 new functions are added to the decimal module:
`decimal.trim()` and `decimal.rescale()`

`decimal.trim()` removes any trailing fractional zeros from the
number:
```
tarantool> a = dec.new('123.45570000')
---
...
tarantool> dec.trim(a)
---
- '123.4557'
...
```
`decimal.rescale()` will round the number to a given scale, if it is
less than the number scale. Otherwise it will append trailing fractional
zeros, so that the resulting number scale will be the same as the given
one.
```
tarantool> a = dec.new(123.45)
---
...
tarantool> dec.rescale(a,1)
---
- '123.5'
...
tarantool> dec.rescale(a, 4)
---
- '123.4500'
...
```

Jul 19, 2019

Bump libsmall version · 59cc9033

Kirill Yukhin authored 5 years ago

Extend range of printable unicode characters · cdf37876

Ivan Koptelov authored 5 years ago

Before the patch, unicode characters encoded with 4 bytes
were always treated as non-printable and displayed as byte
sequences (with 'binary' tag).
With the patch, the range of printable characters is extended and
includes characters encoded with 4 bytes.
Currently it is: (old printable range) U (icu printable range).
Corresponding changes are also made in tarantool/libyaml.

Closes: #4090

Jul 18, 2019

box: refactor key_validate_parts to return key_end · e43eeb53

Kirill Shcherbatov authored 5 years ago

The key_validate_parts helper is refactored to return a pointer
to the end of a given key argument in case of success.
This is required to effectively validate a sequence of keys in
scope of functional multikey indexes.

Needed for #1260

box: introduce key_def->is_multikey flag · 2c7c3e44

Kirill Shcherbatov authored 5 years ago

Previously only key definitions that have JSON paths were able
to define a multikey index. We used to test multikey_path != NULL
to determine whether a given key definition is multikey.
In further patches with functional indexes this rule becomes
outdated. A key definition extracted by a functional index may
be multikey but have no JSON paths.
So an explicit is_multikey flag was introduced.

Needed for #1260

sql: use common registers instead of temp. for constraints data · 0fbfeed9

Mergen Imeev authored 5 years ago

Prior to this patch, data needed to form a tuple to be inserted into
the _fk_constraint and _ck_constraint system spaces (to create
corresponding constraints) was stored in a range of temporary
registers. After insertion, temporary registers are released. On
the other hand, this data is required for providing clean-up in
case of creation failure (i.e. removing already created constraints
within one CREATE TABLE statement). Hence, instead of using
temporary registers let's use ordinary ones.

Closes #4183

sql: clean-up in case constraint creation failed · c06a3d55

Mergen Imeev authored 5 years ago

This patch makes VDBE perform a clean-up if the creation of a
constraint fails because two or more constraints of the same type
with the same name are created in the same CREATE TABLE statement.

For example:
CREATE TABLE t1(
	id INT PRIMARY KEY,
	CONSTRAINT ck1 CHECK(id > 1),
	CONSTRAINT ck1 CHECK(id < 10)
);

Part of #4183

sql: add OP_SetDiag opcode in VDBE · 852bb4d4

Mergen Imeev authored 5 years ago

To separate the error setting and execution halting, a new opcode
OP_SetDiag was created. The only functionality of the opcode is
the execution of diag_set(). It is important to note that
OP_SetDiag does not set is_aborted to true, so we can continue
working with other opcodes, if necessary. This function allows us
to perform cleanup in some special cases, for example, when
creating a constraint failed because of the creation of two or
more constraints with the same name in the same CREATE TABLE
statement.

Since now diag_set() is executed in OP_SetDiag, this functionality
has been removed from OP_Halt.

Needed for #4183

Jul 17, 2019

auth: fix empty password authentication · c185a387

Vladimir Davydov authored 5 years ago

We are supposed to authenticate guest user without a password. This
used to work before commit 076a8420 ("Permit empty passwords in
net.box"), when guest didn't have any password. Now it has an empty
password and the check in authenticate turns out to be broken, which
breaks assumptions made by certain connectors. This patch fixes the
check.

Closes #4327

Jul 16, 2019

sql: allow <COLLATE> only for string-like args · 2ac7f44f

Roman Khabibov authored 5 years ago

Before this patch, user could use COLLATE with non-string-like
literals, columns or subquery results. Disallow such usage.

Closes #3804

Jul 15, 2019

ddl: allow to execute non-yielding DDL statements in transactions · f266559b

Vladimir Davydov authored 5 years ago

The patch is pretty straightforward - all it does is move checks for
single statement transactions from alter.cc to txn_enable_yield_for_ddl
so that now any DDL request may be executed in a transaction unless it
builds an index or checks the format of a non-empty space (those are the
only two operations that may yield).

There are two things that must be noted explicitly. The first is removal
of an assertion from priv_grant. The assertion ensured that a revoked
privilege was in the cache. The problem is the cache is built from the
contents of the space, see user_reload_privs. On rollback, we first
revert the content of the space to the original state, and only then
start invoking rollback triggers, which call priv_grant. As a result, we
will revert the cache to the original state right after the first
trigger is invoked and the following triggers will have no effect on it.
Thus we have to remove this assertion.

The second subtlety lies in vinyl_index_commit_modify. Before the commit
we assumed that if statement lsn is <= vy_lsm::commit_lsn, then it must
be local recovery from WAL. Now it's not true, because there may be
several operations for the same index in a transaction, and they all
will receive the same signature in on_commit trigger. We could, of
course, try to assign different signatures to them, but that would look
cumbersome - better simply allow lsn <= vy_lsm::commit_lsn after local
recovery, there's actually nothing wrong about that.

Closes #4083

@TarantoolBot document
Title: Transactional DDL

Now it's possible to group non-yielding DDL statements into
transactions, e.g.

```Lua
box.begin()
box.schema.space.create('my_space')
box.space.my_space:create_index('primary')
box.commit() -- or box.rollback()
```

Most DDL statements don't yield and hence can be run from transactions.
There are just two exceptions: creation of a new index and changing the
format of a non-empty space. Those are long operations that may yield
so as not to block the event loop for too long. Those statements can't
be executed from transactions (to be more exact, such a statement must
go first in any transaction).

Also, just like in case of DML transactions in memtx, it's forbidden to
explicitly yield in a DDL transaction by calling fiber.sleep or any
other yielding function. If this happens, the transaction will be
aborted and an attempt to commit it will fail.

ddl: don't use space_index from AlterSpaceOp::commit,rollback · 626c5fd0

Vladimir Davydov authored 5 years ago

If there are multiple DDL operations in the same transaction, which is
impossible now, but will be implemented soon, AlterSpaceOp::commit and
rollback methods must not access the space index map. To understand
that, consider the following example:

  - on_replace: AlterSpaceOp1 creates index I1 for space S1
  - on_replace: AlterSpaceOp2 moves index I1 from space S1 to space S2
  - on_commit:  AlterSpaceOp1 commits creation of index I1

AlterSpaceOp1 can't look up I1 in S1 by id, because the index was moved
from S1 to S2 by AlterSpaceOp2. If AlterSpaceOp1 attempts to look it up,
it will access a wrong index.

Fix that by caching pointers to old and new indexes in AlterSpaceOp on
construct/prepare instead of using space_index() on commit/rollback to
access them.

memtx: fix txn_on_yield for DDL transactions · 0ae5a2d7

Vladimir Davydov authored 5 years ago

Memtx engine doesn't allow yielding inside a transaction. To achieve
that, it installs fiber->on_yield trigger that aborts the current
transaction (rolls it back, but leaves it be so that commit fails).

There's an exception though - DDL statements are allowed to yield.
This is required so as not to block the event loop while a new index
is built or a space format is checked. Currently, we handle this
exception by checking space id and omitting installation of the
trigger for system spaces. This isn't entirely correct, because we
may yield after a DDL statement is complete, in which case the
transaction won't be aborted though it should:

  box.begin()
  box.space.my_space:create_index('my_index')
  fiber.sleep(0) -- doesn't abort the transaction!

This patch fixes the problem by making the memtx engine install the
on_yield trigger unconditionally, for all kinds of transactions, and
instead explicitly disabling the trigger for yielding DDL operations.

In order not to spread the yield-in-transaction logic between memtx
and txn code, let's move all fiber_on_yield related stuff to txn,
export a method to disable yields, and use the method in memtx.

Jul 13, 2019

sql: introduce ADD CONSTRAINT CHECK statement · 260f6328

Nikita Pettik authored 5 years ago

This patch extends the parser's grammar to allow creating CHECK
constraints on already existing tables via SQL facilities.

Closes #3097

@TarantoolBot document
Title: Document ADD CONSTRAINT CHECK statement

Now it is possible to add CHECK constraints to an already existing table
via SQL means. To achieve this one must use the following syntax:

ALTER TABLE <table> ADD CONSTRAINT <name> CHECK (<expr>);

Fix broken build · 307133d3

Kirill Yukhin authored 5 years ago

The argument in func_c_new() is used in Debug mode only.
Mark it w/ MAYBE_UNUSED.

Jul 12, 2019

func: consolidate func_def checks in func_def_check · 212c5742

Konstantin Osipov authored 5 years ago

A follow up on #4182

box: introduce Lua persistent functions · 200a492a

Kirill Shcherbatov authored 5 years ago

Closes #4182
Closes #4219
Needed for #1260

@TarantoolBot document
Title: Persistent Lua functions

Now Tarantool supports 'persistent' Lua functions.
Such functions are stored in snapshot and are available after
restart.
To create a persistent Lua function, specify a function body
in box.schema.func.create call:
e.g. body = "function(a, b) return a + b end"

A Lua persistent function may be 'sandboxed'. The 'sandboxed'
function is executed in an isolated environment:
  a. only a limited set of Lua functions and modules is available:
    -assert -error -pairs -ipairs -next -pcall -xpcall -type
    -print -select -string -tonumber -tostring -unpack -math -utf8;
  b. global variables are forbidden.

Finally, the new 'is_deterministic' flag allows marking a
registered function as deterministic, i.e. a function that
can produce only one result for a given list of parameters.

The new box.schema.func.create interface is:
box.schema.func.create('funcname', <setuid = true|FALSE>,
	<if_not_exists = true|FALSE>, <language = LUA|c>,
	<body = string ('')>, <is_deterministic = true|FALSE>,
	<is_sandboxed = true|FALSE>, <comment = string ('')>)

This schema change also reserves names for sql builtin
functions:
    TRIM, TYPEOF, PRINTF, UNICODE, CHAR, HEX, VERSION,
    QUOTE, REPLACE, SUBSTR, GROUP_CONCAT, JULIANDAY, DATE,
    TIME, DATETIME, STRFTIME, CURRENT_TIME, CURRENT_TIMESTAMP,
    CURRENT_DATE, LENGTH, POSITION, ROUND, UPPER, LOWER,
    IFNULL, RANDOM, CEIL, CEILING, CHARACTER_LENGTH,
    CHAR_LENGTH, FLOOR, MOD, OCTET_LENGTH, ROW_COUNT, COUNT,
    LIKE, ABS, EXP, LN, POWER, SQRT, SUM, TOTAL, AVG,
    RANDOMBLOB, NULLIF, ZEROBLOB, MIN, MAX, COALESCE, EVERY,
    EXISTS, EXTRACT, SOME, GREATER, LESSER, SOUNDEX,
    LIKELIHOOD, LIKELY, UNLIKELY,
    _sql_stat_get, _sql_stat_push, _sql_stat_init, LUA

A new Lua persistent function LUA is introduced to evaluate
Lua strings from SQL in the future.

These names cannot be used for user-defined functions.

Example:
lua_code = [[function(a, b) return a + b end]]
box.schema.func.create('summarize', {body = lua_code,
		is_deterministic = true, is_sandboxed = true})
box.func.summarize
---
- aggregate: none
  returns: any
  exports:
    lua: true
    sql: false
  id: 60
  is_sandboxed: true
  setuid: false
  is_deterministic: true
  body: function(a, b) return a + b end
  name: summarize
  language: LUA
...
box.func.summarize:call({1, 3})
---
- 4
...

@kostja: fix style, remove unnecessary module dependencies,
add comments

sql: move LIKE UConverter object to collation library · f9551660

Kirill Shcherbatov authored 5 years ago

Moved UConverter object to collation library. This is required
to get rid of sqlRegisterBuiltinFunctions function in further
patches.

Needed for #4113, #2200, #2233

sql: replace bool is_derived_coll marker with flag · 4e523fa0

Kirill Shcherbatov authored 5 years ago

Introduce a new flag SQL_FUNC_DERIVEDCOLL for function that may
require collation to be applied on its result instead of separate
boolean variable. This is required to get rid of FuncDef in
further patches.

Needed for #4113, #2200, #2233

sql: put analyze helpers to FuncDef cache · e1a7cb7f

Kirill Shcherbatov authored 5 years ago

Previously analyze functions referred to a statically defined
service FuncDef context. We need to change this approach because
we are going to rework the builtin functions machinery in the
following patches.

Needed for #4113, #2200, #2233

Jul 11, 2019

sql: enable SOUNDEX sql builtin · 873a3b30

Kirill Shcherbatov authored 5 years ago

Needed for #4182

@TarantoolBot document
Title: Introduce SOUNDEX sql function

The SOUNDEX function returns a 4-character code that represents
the sound of the words in the argument. The result can be
compared to the results of the SOUNDEX function of other strings.

The current SOUNDEX function supports only Latin strings.
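
For reference, here is a simplified sketch of the classic Soundex
encoding for ASCII input. It is modelled on the variant commonly found
in SQLite-derived code and is only an assumption about how the builtin
behaves: the helper names are made up, the "?000" result for input
without letters is borrowed from SQLite, and H and W are treated like
vowels, which differs from the strict census rules for a few names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Uppercase an ASCII letter; return 0 for any other byte. */
static char
ascii_upper(char c)
{
	if (c >= 'a' && c <= 'z')
		return (char)(c - 'a' + 'A');
	return (c >= 'A' && c <= 'Z') ? c : 0;
}

/*
 * Hypothetical helper: write the 4-character code plus a trailing
 * '\0' into out.
 */
static void
soundex_sketch(const char *in, char out[5])
{
	/* Digit class for 'A'..'Z'; '0' marks vowels and H, W, Y. */
	static const char codes[] = "01230120022455012623010202";
	assert(in != NULL && out != NULL);
	size_t i = 0;
	/* Skip everything before the first letter. */
	while (in[i] != '\0' && ascii_upper(in[i]) == 0)
		i++;
	if (in[i] == '\0') {
		memcpy(out, "?000", 5);
		return;
	}
	out[0] = ascii_upper(in[i]);
	char prev = codes[out[0] - 'A'];
	int len = 1;
	for (i++; in[i] != '\0' && len < 4; i++) {
		char up = ascii_upper(in[i]);
		char code = up != 0 ? codes[up - 'A'] : '0';
		/* Emit a digit unless it repeats the previous class. */
		if (code != '0' && code != prev)
			out[len++] = code;
		prev = code;
	}
	/* Pad short codes with zeros. */
	while (len < 4)
		out[len++] = '0';
	out[4] = '\0';
}
```

Under this sketch both "Robert" and "Rupert" map to R163, which is the
kind of match the comparison of SOUNDEX results is meant to catch.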

@kostja: fix test tap count; remove optional invocation; remove
trailing spaces; fix alignment.

box/memtx: Skip tuple memory from coredump by default · 9d077bb4

Cyrill Gorcunov authored 5 years ago

Quoting feature request

 | Tarantool is Database and Application Server in one box.
 |
 | Appserver development process contains a lot of
 | lua/luajit-ffi/lua-c-extension code.
 |
 | Coredump is very useful in case when some part of appserver crashed.
 | If the reason is input - data from database is not necessary. If the reason
 | is output - data from database is already in snap/xlog files.
 |
 | Therefore consider core dumps without data enabled by default.

For info: the strip_core feature has been introduced in
549140b3

Closes #4337

@TarantoolBot document
Title: Document box.cfg.strip_core

When Tarantool runs under a heavy load, the memory allocated
for tuples may be very large. To exclude this memory from the
`coredump` file, the `box.cfg.strip_core` parameter should be
set to `true`.

The default value is `true`.

sql: get rid of MATCH sql builtin · 4d936cb2

Kirill Shcherbatov authored 5 years ago

In relation to the FuncDef cache rework we need to clean up the
builtins list. The MATCH function is a stub that raises an error
and it could be dropped.

Needed for #4182

@kostja: make the patch actually pass the tests, remove
tap count change in e_expr.test.lua, since it's disabled and was
not run

Add distribution info to box.info · 366466eb

Denis Ignatenko authored 5 years ago

There is a compile-time option PACKAGE in cmake to define the
current build distribution info. For the community edition
it is Tarantool by default. For enterprise it is
Tarantool Enterprise.

There was no option to check the distribution name at runtime.
This change adds box.info.package output for CE and TE.

tarantoolctl: always initialize notify_socket (#4342) · 9f76bd86

Yaroslav Dynnikov authored 5 years ago

Notify socket used to be initialized during `box.cfg()`.
There is no apparent reason for that, because we can write tarantool
apps that don't use box api at all, but still leverage the event loop
and async operations.

This patch makes initialization of notify socket independent.
Instance can notify entering event loop even if box.cfg wasn't called.

Closes #4305

sql: fix passing FP values to integer iterator · 8fac6972

Nikita Pettik authored 5 years ago

Before this patch it was impossible to compare indexed field of integer
type and floating point value. For instance:

CREATE TABLE t1(id INT PRIMARY KEY, a INT UNIQUE);
INSERT INTO t1 VALUES (1, 1);
SELECT * FROM t1 WHERE a = 1.5;
---
- error: 'Failed to execute SQL statement: Supplied key type of part 0 does not match
    index part type: expected integer'
...

That happened due to the fact that the type casting mechanism
(OP_ApplyType) doesn't affect an FP value when it is converted to
integer. Hence, the FP value was passed to the iterator over an integer
field, which resulted in an error. Meanwhile, comparison of integer and
FP values is legal in SQL. To cope with this problem, for each equality
comparison involving an integer field we emit OP_MustBeInt, which checks
whether the value to be compared is an integer or not. If it is not, we
assume that the result of the comparison is always false and continue
processing the query. For inequality constraints we pass an auxiliary
flag to the OP_Seek** opcodes to notify them that one of the key fields
must be truncated to integer (in case of an FP value) along with
changing the iterator's type: a > 1.5 -> a >= 2.
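
The rewriting of a comparison against an FP key can be sketched as
follows. The enum, the function name fp_key_to_int() and the assumption
that the key fits into int64_t are all illustrative; the actual opcodes
work on VDBE registers and iterator types rather than on this
interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum cmp_op { CMP_EQ, CMP_GT, CMP_GE, CMP_LT, CMP_LE };

/*
 * Hypothetical helper: turn "field <op> key" with a floating point
 * key into an equivalent comparison against an integer key. Returns
 * false when no integer can satisfy the condition, i.e. an equality
 * with a fractional key.
 */
static bool
fp_key_to_int(double key, enum cmp_op *op, int64_t *int_key)
{
	/* Assumption: the key is within the int64_t range. */
	assert(key >= -9223372036854775808.0 &&
	       key < 9223372036854775808.0);
	int64_t whole = (int64_t)key; /* rounds toward zero */
	if ((double)whole == key) {
		/* No fractional part: the operator stays as is. */
		*int_key = whole;
		return true;
	}
	int64_t floor_key = key < 0 ? whole - 1 : whole;
	int64_t ceil_key = floor_key + 1;
	switch (*op) {
	case CMP_EQ:
		return false;
	case CMP_GT:
	case CMP_GE:
		/* a > 1.5 and a >= 1.5 both mean a >= 2. */
		*op = CMP_GE;
		*int_key = ceil_key;
		break;
	case CMP_LT:
	case CMP_LE:
		/* a < 1.5 and a <= 1.5 both mean a <= 1. */
		*op = CMP_LE;
		*int_key = floor_key;
		break;
	}
	return true;
}
```

Note that the rounding direction depends on the operator, so a plain
truncation of the key is only correct together with the matching change
of the comparison.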

Closes #4187

sql: remove redundant type derivation from QP · 0d5e757d

Nikita Pettik authored 5 years ago

Before value to be scanned in index search is passed to the iterator, it
is subjected to implicit type casting (which is implemented by
OP_ApplyType). If value can't be converted to required type,
user-friendly message is raised. Without this cast, type of iterator may
not match with type of key which in turn results in unexpected error.
However, array of types which is used to provide type conversions is
different from types of indexed fields: it is modified depending on
types of comparison's operands. For instance, when boolean field is
compared with blob value, resulting type is assumed to be scalar. In
turn, conversion to scalar is no-op. As a result, value with MP_BIN
format is passed to the iterator over boolean field. To fix that let's
remove this transformation of types. Moreover, it seems to be redundant.

Part of #4187

sql: remove redundant check of space format from QP · 3570f366

Nikita Pettik authored 5 years ago

In SQL we are able to execute queries involving spaces only with formats.
Otherwise, at the very beginning of query compilation error is raised.
So, after that any checks of format existence are redundant.

sql: fix antisymmetric boolean comparison in VDBE · db12efaf

Nikita Pettik authored 5 years ago

There are a few situations when booleans can be compared with values of
other types. To process them, we assume that booleans are always less
than numbers, which in turn are less than strings. On the other hand,
the function which implements internal comparison of values,
sqlMemCompare(), always returns the 'less' result if one of the values
is boolean and the other one is not, ignoring the order of the values.
For instance:

... max (false, 'abc') -> 'abc'
... max ('abc', false) -> false

This patch fixes this misbehaviour making boolean values always less
than values of other types.
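
A small sketch of an order-independent comparison between value
classes. The struct and the function are hypothetical stand-ins for
VDBE memory cells and sqlMemCompare(); the only thing taken from the
description above is the ordering boolean < number < string.

```c
#include <stdbool.h>
#include <string.h>

/* The enum order encodes the ordering between value classes. */
enum val_type { VAL_BOOL = 0, VAL_NUMBER = 1, VAL_STRING = 2 };

/* Hypothetical tagged value; only the field matching type is used. */
struct val {
	enum val_type type;
	bool b;
	double num;
	const char *str;
};

/* Returns -1, 0 or 1; swapping the arguments flips the sign. */
static int
val_compare(const struct val *lhs, const struct val *rhs)
{
	/* Different classes: the class rank alone decides. */
	if (lhs->type != rhs->type)
		return lhs->type < rhs->type ? -1 : 1;
	int rc = 0;
	switch (lhs->type) {
	case VAL_BOOL:
		rc = (int)lhs->b - (int)rhs->b;
		break;
	case VAL_NUMBER:
		rc = (lhs->num > rhs->num) - (lhs->num < rhs->num);
		break;
	case VAL_STRING:
		rc = strcmp(lhs->str, rhs->str);
		break;
	}
	/* Normalize to -1, 0 or 1. */
	return (rc > 0) - (rc < 0);
}
```

With such a comparator both max(false, 'abc') and max('abc', false)
would pick 'abc', since the result no longer depends on which argument
happens to be the boolean.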

sql: ANSI aliases for LENGTH() · 2885cf84

Mergen Imeev authored 5 years ago

This patch creates aliases CHARACTER_LENGTH() and CHAR_LENGTH()
for LENGTH(). These functions are added because they are described
in ANSI.

Closes #3929

@TarantoolBot document
Title: SQL functions CHAR_LENGTH() and CHARACTER_LENGTH()

The SQL functions CHAR_LENGTH() and CHARACTER_LENGTH() work the
same as the LENGTH() function. They take exactly one argument. If
the argument is of type TEXT or can be cast to a TEXT value using
internal casting rules, these functions return the length of the
TEXT value that represents the argument. They throw an error if
the argument cannot be cast to a TEXT value.

Jul 09, 2019

swim: optimize struct swim_task layout · 31a26448

Vladislav Shpilevoy authored 5 years ago

Before the patch the structure was split in two parts by a 1.5KB
packet, and its constructor was nullifying the whole volume. Obviously,
these were mistakes. The first problem breaks cache locality,
the second one flushes the cache.

swim: pool IO tasks · 837e114e

Vladislav Shpilevoy authored 5 years ago

Before the patch each SWIM member had two preallocated task
objects, 3KB in total. It was a waste of memory, because network
load per member in SWIM is ~2 messages per round step regardless
of cluster size.

This patch moves the tasks to a pool, where they can be reused,
even by different SWIM instances running on the same node.
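
The pooling idea can be illustrated with a plain free list. All names
below are hypothetical and the sketch is not the SWIM implementation:
it only shows tasks being returned to a shared pool and handed out
again instead of being preallocated per member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical task: small header first, bulky packet last. */
struct io_task {
	struct io_task *next_free;
	char packet[1500];
};

struct task_pool {
	struct io_task *free_list;
};

/* Take a task from the pool, allocating only when it is empty. */
static struct io_task *
task_pool_get(struct task_pool *pool)
{
	assert(pool != NULL);
	struct io_task *task = pool->free_list;
	if (task != NULL)
		pool->free_list = task->next_free;
	else
		task = malloc(sizeof(*task));
	/* Reset the header only; the packet is left untouched. */
	if (task != NULL)
		task->next_free = NULL;
	return task;
}

/* Return a finished task so that anyone can reuse it. */
static void
task_pool_put(struct task_pool *pool, struct io_task *task)
{
	assert(pool != NULL && task != NULL);
	task->next_free = pool->free_list;
	pool->free_list = task;
}

/* Free all cached tasks, e.g. on shutdown. */
static void
task_pool_destroy(struct task_pool *pool)
{
	while (pool->free_list != NULL) {
		struct io_task *task = pool->free_list;
		pool->free_list = task->next_free;
		free(task);
	}
}
```

Since only a couple of messages are in flight per round step, such a
pool tends to hold a handful of tasks regardless of how many members
the cluster has.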