  1. Aug 10, 2018
    • Vladimir Davydov's avatar
      vinyl: fix appearance of phantom tuple in secondary index after update · e72867cb
      Vladimir Davydov authored
      index.update() looks up the old tuple in the primary index, applies
      update operations to it, then writes a DELETE statement to secondary
      indexes to delete the old tuple and a REPLACE statement to all indexes
      to insert the new tuple. It also sets a column mask for both DELETE and
      REPLACE statements. The column mask is a bit mask which has a bit set if
      the corresponding field is updated by update operations. It is used by
      the write iterator for two purposes. First, the write iterator skips
      REPLACE statements that don't update key fields. Second, the write
      iterator turns a REPLACE that has a column mask that intersects with key
      fields into an INSERT (so that it can get annihilated with a DELETE when
      the time comes). The latter is correct, because if an update() does
      update secondary key fields, then it must have deleted the old tuple and
      hence the new tuple is unique in terms of extended key (merged primary
      and secondary key parts, i.e. cmp_def).
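      The two decisions above can be sketched in Python (an illustrative model; the function and mask names are hypothetical, not the actual C symbols):

```python
# Model of how the write iterator consults a statement's column mask.
# key_mask has a bit set for every field that belongs to the secondary
# key; stmt_mask has a bit set for every field the update() touched.

def write_iterator_action(stmt_mask, key_mask):
    """Decide what to do with a REPLACE carrying stmt_mask."""
    if stmt_mask & key_mask == 0:
        # The update didn't touch key fields: the REPLACE is
        # redundant in this index and can be skipped on dump.
        return "skip"
    # The update changed key fields, so the old tuple was deleted and
    # the new one is unique in terms of cmp_def: the REPLACE can be
    # safely turned into an INSERT.
    return "turn into INSERT"

# Fields map to bits 0, 1, 2, ...; let the secondary key be field 2.
KEY_MASK = 1 << 1
print(write_iterator_action(1 << 2, KEY_MASK))  # -> skip
print(write_iterator_action(1 << 1, KEY_MASK))  # -> turn into INSERT
```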
      
      The problem is that a bit may be set in the column mask even if the
      corresponding field does not actually get updated. Consider the
      following example.
      
        s = box.schema.space.create('test', {engine = 'vinyl'})
        s:create_index('pk')
        s:create_index('sk', {parts = {2, 'unsigned'}})
        s:insert{1, 10}
        box.snapshot()
        s:update(1, {{'=', 2, 10}})
      
      The update() doesn't modify the secondary key field, so it only writes
      REPLACE{1, 10} to the secondary index (actually it writes DELETE{1, 10}
      too, but that gets overwritten by the REPLACE). However, the REPLACE
      has a column mask that says the update() does modify the key field,
      because the column mask is generated solely from the update operations,
      before applying them. As a result, the write iterator will not skip
      this REPLACE on dump. That alone has no serious consequences, because
      the skip is a mere optimization. What is worse, the write iterator will
      also turn the REPLACE into an INSERT, which is absolutely wrong, as the
      REPLACE is preceded by INSERT{1, 10}. If the tuple then gets deleted,
      the DELETE statement and the INSERT created by the write iterator from
      the REPLACE will annihilate each other, leaving the old INSERT{1, 10}
      visible.
      
      The issue may result in invalid select() output, as demonstrated in
      the issue description. It may also result in crashes, because the
      tuple cache is very sensitive to invalid select() output.
      
      To fix this issue, let's clear key bits in the column mask if we
      detect that an update() doesn't actually update secondary key fields
      even though the column mask says it does.
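      The fix can be modeled in Python (a hedged sketch, not the actual C code; adjust_column_mask is a hypothetical name and field numbers are assumed 1-based, as in Tarantool):

```python
def adjust_column_mask(old_tuple, new_tuple, column_mask, key_fields):
    """Clear key bits from column_mask when the update left the
    corresponding fields unchanged (field numbers are 1-based)."""
    for field_no in key_fields:
        bit = 1 << (field_no - 1)
        if column_mask & bit and old_tuple[field_no - 1] == new_tuple[field_no - 1]:
            column_mask &= ~bit
    return column_mask

# update(1, {{'=', 2, 10}}) on {1, 10}: the mask claims field 2 changed,
# but its value is the same, so the key bit gets cleared.
mask = adjust_column_mask((1, 10), (1, 10), 0b10, key_fields=[2])
print(mask)  # -> 0
```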
      
      Closes #3607
      e72867cb
  2. Aug 08, 2018
    • Olga Arkhangelskaia's avatar
      say: configurable syslog destination · 6854ea19
      Olga Arkhangelskaia authored
      Added the server option to the syslog configuration.
      The server option specifies the log destination. At the moment
      there are two ways to use it: server=unix:/path/to/socket or
      server=ipv4:port. If the port is not set, the default UDP port 514
      is used. If logging to syslog is enabled but no server option is
      given, the default location is used: /dev/log on Linux and
      /var/run/syslog on Mac.
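      The destination parsing described above might look like the following Python sketch (parse_syslog_server is a hypothetical name; the real implementation is in C):

```python
import sys

def parse_syslog_server(server=None):
    """Return (transport, address, port) for the syslog destination.
    With no server option, fall back to the platform default socket."""
    if server is None:
        # Default location: /dev/log on Linux, /var/run/syslog on Mac.
        path = "/var/run/syslog" if sys.platform == "darwin" else "/dev/log"
        return ("unix", path, None)
    if server.startswith("unix:"):
        return ("unix", server[len("unix:"):], None)
    # Otherwise an IPv4 address, optionally followed by a port;
    # the default syslog UDP port is 514.
    host, _, port = server.partition(":")
    return ("udp", host, int(port) if port else 514)

print(parse_syslog_server("unix:/path/to/socket"))  # ('unix', '/path/to/socket', None)
print(parse_syslog_server("127.0.0.1"))             # ('udp', '127.0.0.1', 514)
```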
      
      Closes #3487
      6854ea19
  3. Aug 07, 2018
    • Kirill Shcherbatov's avatar
      sql: fix segfault with check referencing new table · 16c3989c
      Kirill Shcherbatov authored
      Starting from 9a543202, on tuple insertion into _space we
      execute sql_checks_resolve_space_def_reference for checks,
      if any, while executing the on_replace_dd_space trigger. The
      routine box_space_id_by_name, which looks at the _space space,
      returns a non-null value, while at the same time the space
      object is not yet present in the space cache and can't be
      found by space_by_id.
      Before the 1.10 patch 0ecabde8 was merged into 2.0 as part of
      13df2b1f, box_space_id_by_name used to return BOX_ID_NIL
      due to a "multi-engine transaction error", which is not raised
      in the same situation now.
      
      Closes #3611.
      16c3989c
    • Nikita Pettik's avatar
      sql: remove SQLITE_OMIT_FOREIGN_KEY define guard · 502fb2b4
      Nikita Pettik authored
      We always compile with foreign key constraints enabled. They can still
      be turned off at runtime via <pragma foreign_keys = false>.
      
      Follow-up #3271
      502fb2b4
    • Nikita Pettik's avatar
      sql: display error on FK creation and drop failure · deea0693
      Nikita Pettik authored
      Before insertion into _fk_constraint we must be sure that there is no
      entry with the given <name, child id>. Otherwise, the insertion will
      fail and a 'duplicate key' error will be shown. Such an error message
      doesn't seem informative enough, so let's verify before insertion into
      _fk_constraint that it doesn't already contain an entry with the given
      name.
      The same applies to dropping a constraint, but vice versa: we check
      that _fk_constraint does contain an entry with the given name and
      child id.
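      The pre-checks can be sketched in Python (a hypothetical model of the _fk_constraint lookups, not the actual implementation):

```python
class FkConstraintError(Exception):
    pass

def create_fk(fk_space, name, child_id):
    """_fk_constraint is keyed by <name, child id>; verify the entry is
    absent before insertion so we can report a meaningful error."""
    if (name, child_id) in fk_space:
        raise FkConstraintError(
            "constraint %s already exists for space %d" % (name, child_id))
    fk_space[(name, child_id)] = {}

def drop_fk(fk_space, name, child_id):
    """The inverse check: the entry must exist to be dropped."""
    if (name, child_id) not in fk_space:
        raise FkConstraintError(
            "constraint %s does not exist for space %d" % (name, child_id))
    del fk_space[(name, child_id)]
```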
      
      It is worth mentioning that during CREATE TABLE processing the schema
      id changes and the check in the OP_OpenRead opcode fails (which in
      turn shows that a pointer to a space may expire). On the other hand,
      the _fk_constraint space itself remains immutable, so as a temporary
      workaround let's use a flag indicating that a pointer to a system
      space is passed to OP_OpenRead. It makes it possible to use the
      pointer to the space even if the schema has changed.
      
      Closes #3271
      deea0693
    • Nikita Pettik's avatar
      sql: introduce ADD CONSTRAINT statement · 4fe0b812
      Nikita Pettik authored
      After introducing a separate space for persisting foreign key
      constraints, nothing prevents us from adding ALTER TABLE statements to
      add or drop named constraints. The ANSI syntax is as follows:
      
      ALTER TABLE <referencing table> ADD CONSTRAINT
        <referential constraint name> FOREIGN KEY
        <left paren> <referencing columns> <right paren> REFERENCES
        <referenced table> [ <referenced columns> ] [ MATCH <match type> ]
        [ <referential triggered action> ] [ <constraint check time> ]
      
      ALTER TABLE <referencing table> DROP CONSTRAINT <constraint name>
      
      In our terms it looks like:
      
      ALTER TABLE t1 ADD CONSTRAINT f1 FOREIGN KEY(id, a)
          REFERENCES t2 (id, b) MATCH FULL;
      ALTER TABLE t1 DROP CONSTRAINT f1;
      
      FK constraints that come with a CREATE TABLE statement are also
      persisted, with an auto-generated name. They are coded after the space
      and its indexes.
      
      Moreover, we don't use the original SQLite foreign keys anymore: those
      obsolete structs have been removed alongside the FK hash. Now FK
      constraints are stored only in the space.
      
      Since types of referencing and referenced fields must match, and now in
      SQL only PK is allowed to feature INT (other fields are always SCALAR),
      some tests have been corrected to obey this rule.
      
      Part of #3271
      4fe0b812
    • Nikita Pettik's avatar
      schema: add new system space for FK constraints · 78fef3d0
      Nikita Pettik authored
      This patch introduces a new system space to persist foreign key
      constraints. The format of the space:
      
      _fk_constraint (space id = 358)
      
      [<constraint name> STR, <parent id> UINT, <child id> UINT,
       <is deferred> BOOL, <match> STR, <on delete action> STR,
       <on update action> STR, <child cols> ARRAY<UINT>,
       <parent cols> ARRAY<UINT>]
      
      An FK constraint is local to a space, so every pair <FK name, child id>
      is unique (and it is the PK in the _fk_constraint space).

      After insertion into this space, a new instance describing the FK
      constraint is created. FK constraints are held in the data dictionary
      as two lists (for child and for parent constraints) in struct space.
      
      There is a list of FK restrictions:
       - At the time of FK creation parent and child spaces must exist;
       - VIEWs can't be involved into FK processing;
       - Child space must be empty;
       - Types of referencing and referenced fields must be comparable;
       - Collations of referencing and referenced fields must match;
       - Referenced fields must compose unique index;
       - Referenced fields can not contain duplicates.
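      The restrictions above can be sketched in Python (an illustrative model using plain dicts for spaces; the type and collation checks are omitted for brevity, and all names are assumptions):

```python
def check_fk_restrictions(child, parent, child_cols, parent_cols):
    """Return an error message if any FK restriction is violated,
    else None. Spaces are modeled as plain dicts."""
    if child is None or parent is None:
        return "parent and child spaces must exist"
    if child.get("is_view") or parent.get("is_view"):
        return "views can't be involved in FK processing"
    if child.get("row_count", 0) != 0:
        return "child space must be empty"
    if len(set(parent_cols)) != len(parent_cols):
        return "referenced fields can not contain duplicates"
    # Referenced fields must compose a unique index of the parent.
    unique = {tuple(idx) for idx in parent.get("unique_indexes", [])}
    if tuple(parent_cols) not in unique:
        return "referenced fields must compose a unique index"
    return None
```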
      
      While a (child) space has FK constraints, it isn't allowed to be
      dropped. The implicitly referenced index can't be dropped either
      (and that is why the parent space can't be dropped). However, the
      :drop() method of a child space first deletes all FK constraints
      (the same as SQL triggers, indexes, etc.) and then removes the entry
      from _space.
      
      Part of #3271
      
      78fef3d0
    • Nikita Pettik's avatar
      sql: prohibit creation of FK on non-existent tables · 30287377
      Nikita Pettik authored
      Originally, SQLite allows creating a table with a foreign key
      constraint which refers to a not-yet-created parent table. For
      instance:
      
      CREATE TABLE child(id INT PRIMARY KEY REFERENCES parent);
      CREATE TABLE parent(id INT PRIMARY KEY);
      
      This patch bans such an ability, since it contradicts ANSI SQL.
      Moreover, SQLite allows dropping a parent table if the deletion of all
      its rows wouldn't result in FK constraint violations. This feature has
      been removed, since in such a situation the child table would become
      inconsistent.

      Finally, within the current patch the ability to create FK constraints
      on VIEWs is banned as well.
      
      Part of #3271
      30287377
    • Vladimir Davydov's avatar
      box: serialize calls to box.cfg · 1d3a6cb0
      Vladimir Davydov authored
      It is dangerous to call box.cfg() concurrently from different fibers.
      For example, replication configuration uses static variables and
      yields, so calling it concurrently can result in a crash. To make sure
      that never happens, let's protect box.cfg() with a lock.
      
      Closes #3606
      1d3a6cb0
  4. Aug 03, 2018
  5. Aug 02, 2018
  6. Aug 01, 2018
    • Vladimir Davydov's avatar
      txn: add helper to detect transaction boundaries · 13acfe47
      Vladimir Davydov authored
      Add txn_is_first_statement() function, which returns true if this is the
      first statement of the transaction. The function is supposed to be used
      from on_replace trigger to detect transaction boundaries.
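      In spirit, the helper behaves like this Python analogue (the real function operates on struct txn in C; this model is purely illustrative):

```python
class Txn:
    """Toy model of a transaction accumulating statements."""
    def __init__(self):
        self.stmts = []

    def add_statement(self, stmt):
        self.stmts.append(stmt)

    def is_first_statement(self):
        """True if the statement currently being processed is the first
        one of the transaction, i.e. an on_replace trigger observing it
        is seeing the transaction's opening statement."""
        return len(self.stmts) == 1
```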
      
      Needed for #2129
      13acfe47
    • Vladimir Davydov's avatar
      vinyl: rename vy_task::status to is_failed · 21eed04c
      Vladimir Davydov authored
      vy_task::status stores the return code of the ->execute method. There
      are only two codes in use: 0 - success and -1 - failure. So let's
      change this to a boolean flag.
      21eed04c
    • Vladimir Davydov's avatar
      vinyl: zap vy_scheduler::is_worker_pool_running · d77b4dc9
      Vladimir Davydov authored
      This flag is set iff worker_pool != NULL, hence it is pointless.
      d77b4dc9
    • Vladimir Davydov's avatar
      vinyl: use cbus for communication between scheduler and worker threads · f4625e64
      Vladimir Davydov authored
      We need cbus for forwarding deferred DELETE statements generated in a
      worker thread during primary index compaction to the tx thread where
      they can be inserted into secondary indexes. Since pthread mutex/cond
      and cbus are incompatible by their nature, let's rework communication
      channel between the tx and worker threads using cbus.
      
      Needed for #2129
      f4625e64
    • Vladimir Davydov's avatar
      vinyl: rename some members of vy_scheduler and vy_task struct · 46f50aad
      Vladimir Davydov authored
      I'm planning to add some new members and remove some old members from
      those structs. For this to play nicely, let's do some renames:
      
        vy_scheduler::workers_available => idle_worker_count
        vy_scheduler::input_queue       => pending_tasks
        vy_scheduler::output_queue      => processed_tasks
        vy_task::link                   => in_pending, in_processed
      46f50aad
    • Vladimir Davydov's avatar
      vinyl: store pointer to scheduler in struct vy_task · 1331d232
      Vladimir Davydov authored
      Currently, we don't really need it, but once we switch communication
      channel between the scheduler and workers from pthread mutex/cond to
      cbus (needed for #2129), tasks won't be completed on behalf of the
      scheduler fiber and hence we will need a back pointer from vy_task to
      vy_scheduler.
      
      Needed for #2129
      1331d232
    • Vladimir Davydov's avatar
      vinyl: do not free pending tasks on shutdown · 15c28b75
      Vladimir Davydov authored
      This is a prerequisite for switching scheduler-worker communication from
      pthread mutex/cond to cbus, which in turn is needed to generate and send
      deferred DELETEs from workers back to tx (#2129).
      
      After this patch, pending tasks will be leaked on shutdown. This is OK,
      as we leak a lot of objects on shutdown anyway. The proper way of fixing
      this leak would be to rework shutdown without atexit() so that we can
      use cbus till the very end.
      
      Needed for #2129
      15c28b75
    • Vladimir Davydov's avatar
      vinyl: store full tuples in secondary index cache · 0c5e6cc8
      Vladimir Davydov authored
      Currently, both vy_read_iterator_next() and vy_point_lookup() add the
      returned tuple to the tuple cache. As a result, we store partial tuples
      in a secondary index tuple cache although we could store full tuples
      (we have to retrieve them anyway when reading a secondary index). This
      means wasting memory. Besides, when #2129 gets implemented, there
      will be tuples in a secondary index that have to be skipped as they have
      been overwritten in the primary index. Caching them would be inefficient
      and error prone. So let's call vy_cache_add() from the upper level and
      add only full tuples to the cache.
      
      Closes #3478
      Needed for #2129
      0c5e6cc8
  7. Jul 31, 2018
    • Vladimir Davydov's avatar
      vinyl: refactor unique check · 85608344
      Vladimir Davydov authored
      For the sake of further patches, let's do some refactoring:
       - Rename vy_check_is_unique to vy_check_is_unique_primary and use it
         only for checking the unique constraint of primary indexes. Also,
         make it return immediately if the primary index doesn't need
         uniqueness check, like vy_check_is_unique_secondary does.
       - Open-code uniqueness check in vy_check_is_unique_secondary instead of
         using vy_check_is_unique.
       - Reduce indentation level of vy_check_is_unique_secondary by inverting
         the if statement.
      85608344
    • Vladimir Davydov's avatar
      vinyl: fold vy_delete_impl · f88a0bd1
      Vladimir Davydov authored
      vy_delete_impl helper is only used once in vy_delete and it is rather
      small so inlining it definitely won't hurt. On the contrary, it will
      consolidate DELETE logic in one place, making the code easier to follow.
      f88a0bd1
    • Vladimir Davydov's avatar
      vinyl: fold vy_replace_one and vy_replace_impl · 1dfeb601
      Vladimir Davydov authored
      There's no point in separating REPLACE path between the cases when
      the space has secondary indexes and when it only has the primary
      index, because they are quite similar. Let's fold vy_replace_one
      and vy_replace_impl into vy_replace to remove code duplication.
      1dfeb601
    • Vladimir Davydov's avatar
      vinyl: always get full tuple from pk after reading from secondary index · 5ceca76c
      Vladimir Davydov authored
      Currently, we don't always need a full tuple. Sometimes (e.g. for
      checking uniqueness constraint), a partial tuple read from a secondary
      index is enough. So we have vy_lsm_get() which reads a partial tuple
      from an index. However, once the optimization described in #2129 is
      implemented, it might happen that a tuple read from a secondary index
      was overwritten or deleted in the primary index, but DELETE statement
      hasn't been propagated to the secondary index yet, i.e. we will have to
      read the primary index anyway, even if we don't need a full tuple.
      
      That said, let us:
      
       - Make vy_lsm_get() always fetch a full tuple, even for secondary
         indexes, and rename it to vy_get().
      
       - Rewrite vy_lsm_full_by_key() as a wrapper around vy_get() and rename
         it to vy_get_by_raw_key().
      
       - Introduce vy_get_by_secondary_tuple() which gets a full tuple given a
         tuple read from a secondary index. For now, it's basically a call to
         vy_point_lookup(), but it'll become a bit more complex once #2129 is
         implemented.
      
       - Prepare vy_get() for the fact that a tuple read from a secondary
         index may be absent in the primary index, in which case it should
         try the next matching one.
      
      Needed for #2129
      5ceca76c
    • Vladimir Davydov's avatar
      vinyl: simplify vy_squash_process · 128503ea
      Vladimir Davydov authored
      Since vy_point_lookup() now guarantees that it returns the newest
      tuple version, we can remove the code that squashes UPSERTs from
      vy_squash_process().
      128503ea
    • Vladimir Davydov's avatar
      vinyl: make point lookup always return the latest tuple version · 6d85c35c
      Vladimir Davydov authored
      Currently, vy_point_lookup(), in contrast to vy_read_iterator, doesn't
      rescan the memory level after reading disk, so if the caller doesn't
      track the key before calling this function, the caller won't be sent to
      a read view in case the key gets updated during yield and hence will
      be returned a stale tuple. This is OK now, because we always track the
      key before calling vy_point_lookup(), either in the primary or in a
      secondary index. However, for #2129 we need it to always return the
      latest tuple version, no matter if the key is tracked or not.
      
      The point is, in the scope of #2129 we won't write DELETE statements
      to secondary indexes corresponding to a tuple replaced in the primary
      index. Instead, after reading a tuple from a secondary index, we will
      check whether it matches the corresponding tuple in the primary
      index: if it does not, the tuple read from the secondary index was
      overwritten and should be skipped. E.g. suppose we have the
      primary index over the first field and a secondary index over the second
      field and the following statements in the space:
      
        REPLACE{1, 10}
        REPLACE{1, 20}
      
      Then reading {10} from the secondary index will return REPLACE{1, 10}, but
      lookup of {1} in the primary index will return REPLACE{1, 20} which
      doesn't match REPLACE{1, 10} read from the secondary index hence the
      latter was overwritten and should be skipped.
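      The planned match check can be sketched in Python (an illustrative model of the lookup logic; the dict-based "indexes" are assumptions, not the actual data structures):

```python
def read_secondary(secondary, primary, key):
    """Read via a secondary index without deferred DELETEs: a tuple
    found in the secondary index is trusted only if a point lookup in
    the primary index still returns the same tuple."""
    for stmt in secondary.get(key, []):
        full = primary.get(stmt["pk"])   # point lookup in the primary index
        if full == stmt["tuple"]:
            return full                  # still current
        # Overwritten in the primary index: skip, try the next match.
    return None

# REPLACE{1, 10} followed by REPLACE{1, 20}: the secondary index still
# holds the stale {10} entry, which must be skipped.
primary = {1: (1, 20)}
secondary = {10: [{"pk": 1, "tuple": (1, 10)}],
             20: [{"pk": 1, "tuple": (1, 20)}]}
print(read_secondary(secondary, primary, 10))  # -> None (overwritten)
print(read_secondary(secondary, primary, 20))  # -> (1, 20)
```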
      
      The problem is in the example above we don't want to track key {1} in
      the primary index before lookup, because we don't actually read its
      value. So for the check to work correctly, we need the point lookup to
      guarantee that the returned tuple is always the newest one. It's fairly
      easy to do - we just need to rescan the memory level after yielding on
      disk if its version changed.
      
      Needed for #2129
      6d85c35c
    • Kirill Yukhin's avatar
      sql: remove pragma schema_version · f2debd8b
      Kirill Yukhin authored
      This pragma is dead and produces nothing but a segfault.
      Along with this pragma, remove the now-dead opcodes that set/read
      the schema_version and all related routines.
      Also, improve the opcode generation script.
      
      Part of #3541
      f2debd8b
    • Kirill Shcherbatov's avatar
      sql: prevent executing cross-engine sql · 7324a1cf
      Kirill Shcherbatov authored
      Some SQL requests are complex and may read/write multiple
      spaces. As we have no ability to make such changes
      transactionally, we have to disallow such requests.
      Now that iterators in SQL start a transaction, we can prevent
      such vicious and dangerous things.
      
      Closes #3551
      7324a1cf
    • Mergen Imeev's avatar
      sql: UPPER and LOWER support COLLATE · 0ecdb0f5
      Mergen Imeev authored
      The SQL functions UPPER and LOWER now work
      with COLLATE as they should according to
      the ANSI standard.
      
      Closes #3052.
      0ecdb0f5
  8. Jul 30, 2018
    • Vladimir Davydov's avatar
      vinyl: implement rebootstrap support · 06658416
      Vladimir Davydov authored
      If vy_log_bootstrap() finds a vylog file in the vinyl directory, it
      assumes it has to be rebootstrapped and calls vy_log_rebootstrap().
      The latter scans the old vylog file to find the max vinyl object id,
      from which it will start numbering objects created during rebootstrap to
      avoid conflicts with old objects, then it writes VY_LOG_REBOOTSTRAP
      record to the old vylog to denote the beginning of a rebootstrap
      section. After that initial join proceeds as usual, writing information
      about new objects to the old vylog file after VY_LOG_REBOOTSTRAP marker.
      Upon successful rebootstrap completion, checkpoint, which is always
      called right after bootstrap, rotates the old vylog and marks all
      objects created before the VY_LOG_REBOOTSTRAP marker as dropped in the
      new vylog. The old objects will be purged by the garbage collector as
      usual.
      
      In case rebootstrap fails and checkpoint never happens, local recovery
      writes VY_LOG_ABORT_REBOOTSTRAP record to the vylog. This marker
      indicates that the rebootstrap attempt failed and all objects created
      during rebootstrap should be discarded. They will be purged by the
      garbage collector on checkpoint. Thus even if rebootstrap fails, it is
      possible to recover the database to the state that existed right before
      a failed rebootstrap attempt.
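      The recovery-time decision can be modeled in Python (an illustrative sketch; only the record names come from the commit, the code structure is assumed):

```python
def rebootstrap_object_ids(vylog_records):
    """Return the ids of objects that survive recovery: objects created
    after VY_LOG_REBOOTSTRAP are kept only if the rebootstrap section
    was not terminated by VY_LOG_ABORT_REBOOTSTRAP."""
    old, new = set(), set()
    in_rebootstrap, aborted = False, False
    for rec in vylog_records:
        if rec == "VY_LOG_REBOOTSTRAP":
            in_rebootstrap, aborted = True, False
        elif rec == "VY_LOG_ABORT_REBOOTSTRAP":
            aborted = True
        else:  # rec models a created object id
            (new if in_rebootstrap else old).add(rec)
    if in_rebootstrap and not aborted:
        return new   # rebootstrap succeeded: pre-rebootstrap objects drop
    return old       # rebootstrap failed or never happened

log = [1, 2, "VY_LOG_REBOOTSTRAP", 3, 4]
print(rebootstrap_object_ids(log))                                 # -> {3, 4}
print(rebootstrap_object_ids(log + ["VY_LOG_ABORT_REBOOTSTRAP"]))  # -> {1, 2}
```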
      
      Closes #461
      06658416
    • Vladimir Davydov's avatar
      vinyl: simplify vylog recovery from backup · 8e710090
      Vladimir Davydov authored
      Since we don't create snapshot files for vylog, but instead append
      records written after checkpoint to the same file, we have to use the
      previous vylog file for backup (see vy_log_backup_path()). So when
      recovering from a backup we need to rotate the last vylog to keep vylog
      and checkpoint signatures in sync. Currently, we do it on recovery
      completion and we use vy_log_create() instead of vy_log_rotate() for it.
      This is done so that we can reuse the context that was used for recovery
      instead of rereading vylog for rotation. Actually, there's no point in
      this micro-optimization, because we rotate vylog only when recovering
      from a backup. Let's remove it and use vy_log_rotate() for this.
      
      Needed for #461
      8e710090
    • Vladimir Davydov's avatar
      replication: print master uuid when (re)bootstrapping · 71cec841
      Vladimir Davydov authored
      Currently only the remote address is printed. Let's also print the UUID,
      because replicas are identified by UUID everywhere in tarantool, not by
      the address. An example of the output is below:
      
        I> can't follow eb81a67e-99ee-40bb-8601-99b03fa20124 at [::1]:58083: required {1: 8} available {1: 12}
        C> replica is too old, initiating rebootstrap
        I> bootstrapping replica from eb81a67e-99ee-40bb-8601-99b03fa20124 at [::1]:58083
      
        I> can't follow eb81a67e-99ee-40bb-8601-99b03fa20124 at [::1]:58083: required {1: 17, 2: 1} available {1: 20}
        I> can't rebootstrap from eb81a67e-99ee-40bb-8601-99b03fa20124 at [::1]:58083: replica has local rows: local {1: 17, 2: 1} remote {1: 23}
        I> recovery start
      
      Suggested by @kostja.
      
      Follow-up ea69a0cd ("replication: rebootstrap instance on startup
      if it fell behind").
      71cec841
    • Vladimir Davydov's avatar
      vinyl: zap tx_manager_vlsn · 5a772639
      Vladimir Davydov authored
      This function has not been used anywhere since commit a1e005d8
      ("vinyl: write_iterator merges vlsns subsequnces").
      5a772639
  9. Jul 27, 2018
    • Kirill Yukhin's avatar
      sql: remove preupdate hook · 5d03ba58
      Kirill Yukhin authored
      SQLITE_ENABLE_PREUPDATE_HOOK is a dead macro.
      Remove it.
      
      Part of #2356
      5d03ba58
    • Kirill Shcherbatov's avatar
      sql: introduce TRUNCATE TABLE operation · dbe38b0d
      Kirill Shcherbatov authored
      To implement the new TRUNCATE operation, we have introduced a
      new P2 argument for the OP_Clear opcode that calls box_truncate
      instead of tarantoolSqlite3ClearTable.
      This operation should work faster than DELETE FROM, but it has
      a few restrictions.
      
      Closes #2201.
      
      @TarantoolBot document
      Title: New TRUNCATE operation
      TRUNCATE is a DDL operation.
      It removes all rows from a table or specified partitions of a table,
      without logging the individual row deletions.
      TRUNCATE TABLE is similar to the DELETE statement with no WHERE
      clause; however, TRUNCATE TABLE is faster and uses fewer system
      resources.
      It can't be used with system tables or with tables having FKs.
      It also can't be called inside a transaction.
      Triggers on the table are ignored.
      Example:
      TRUNCATE TABLE t1;
      dbe38b0d
  10. Jul 26, 2018
  11. Jul 24, 2018