- Aug 10, 2018
-
-
Vladimir Davydov authored
index.update() looks up the old tuple in the primary index, applies the update operations to it, then writes a DELETE statement to the secondary indexes to delete the old tuple and a REPLACE statement to all indexes to insert the new tuple. It also sets a column mask for both the DELETE and REPLACE statements. The column mask is a bit mask which has a bit set if the corresponding field is updated by the update operations. It is used by the write iterator for two purposes. First, the write iterator skips REPLACE statements that don't update key fields. Second, the write iterator turns a REPLACE whose column mask intersects with the key fields into an INSERT (so that it can be annihilated with a DELETE when the time comes). The latter is correct, because if an update() does update secondary key fields, then it must have deleted the old tuple and hence the new tuple is unique in terms of the extended key (merged primary and secondary key parts, i.e. cmp_def).

The problem is that a bit may be set in a column mask even if the corresponding field does not actually get updated. Consider the following example:

s = box.schema.space.create('test', {engine = 'vinyl'})
s:create_index('pk')
s:create_index('sk', {parts = {2, 'unsigned'}})
s:insert{1, 10}
box.snapshot()
s:update(1, {{'=', 2, 10}})

The update() doesn't modify the secondary key field, so it only writes REPLACE{1, 10} to the secondary index (actually it writes DELETE{1, 10} too, but it gets overwritten by the REPLACE). However, the REPLACE has a column mask saying that the update() does modify the key field, because a column mask is generated solely from the update operations, before applying them. As a result, the write iterator will not skip this REPLACE on dump. That alone wouldn't have any serious consequences, because the skipping is a mere optimization. What is worse, the write iterator will also turn the REPLACE into an INSERT, which is absolutely wrong as the REPLACE is preceded by INSERT{1, 10}.
If the tuple then gets deleted, the DELETE statement and the INSERT created by the write iterator from the REPLACE will annihilate each other, leaving the old INSERT{1, 10} visible. The issue may result in invalid select() output as demonstrated in the issue description. It may also result in crashes, because the tuple cache is very sensitive to invalid select() output. To fix this issue, let's clear key bits in the column mask if we detect that an update() doesn't actually update secondary key fields although the column mask says it does. Closes #3607
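The fix can be illustrated with a small Python sketch (function and variable names are invented; the actual implementation is C code inside vinyl): a mask computed from the update operations alone may overestimate which fields change, so key bits are cleared when the old and new key field values turn out to be equal.

```python
def column_mask_from_ops(ops):
    """Set a bit for every (1-based) field an update operation touches."""
    mask = 0
    for _op, field, _value in ops:
        mask |= 1 << (field - 1)
    return mask

def clear_unchanged_key_bits(mask, old_tuple, new_tuple, key_fields):
    """Clear key bits whose field value did not actually change."""
    for field in key_fields:
        bit = 1 << (field - 1)
        if mask & bit and old_tuple[field - 1] == new_tuple[field - 1]:
            mask &= ~bit
    return mask

ops = [('=', 2, 10)]            # s:update(1, {{'=', 2, 10}})
old, new = (1, 10), (1, 10)     # the tuple is not really modified
mask = column_mask_from_ops(ops)
print(bin(mask))  # 0b10: field 2 looks updated, judging by ops alone
mask = clear_unchanged_key_bits(mask, old, new, key_fields=[2])
print(mask)       # 0: the secondary key field did not change
```

With the key bit cleared, the write iterator neither keeps the redundant REPLACE nor turns it into an INSERT.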
-
- Aug 08, 2018
-
-
Olga Arkhangelskaia authored
Added the server option to the syslog configuration. The server option is responsible for the log destination. At the moment there are two ways to use it: server=unix:/path/to/socket or server=ipv4[:port]. If the port is not set, the default UDP port 514 is used. If logging to syslog is enabled but there is no server option, the default location is used: /dev/log on Linux and /var/run/syslog on Mac. Closes #3487
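A minimal Python sketch of how the server option could be resolved under the rules above (the parser and its names are hypothetical, not Tarantool's actual code):

```python
import sys

def syslog_destination(server=None):
    """Resolve a syslog destination per the rules described above."""
    if server is None:
        # no server option: platform default location
        return ('unix', '/var/run/syslog' if sys.platform == 'darwin' else '/dev/log')
    if server.startswith('unix:'):
        return ('unix', server[len('unix:'):])
    host, _, port = server.partition(':')
    return ('udp', host, int(port) if port else 514)  # UDP port 514 by default

print(syslog_destination('unix:/dev/log'))   # ('unix', '/dev/log')
print(syslog_destination('127.0.0.1'))       # ('udp', '127.0.0.1', 514)
print(syslog_destination('127.0.0.1:1514'))  # ('udp', '127.0.0.1', 1514)
```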
-
- Aug 07, 2018
-
-
Kirill Shcherbatov authored
Starting from 9a543202, on tuple insertion into _space we execute sql_checks_resolve_space_def_reference for checks (if any) while running the on_replace_dd_space trigger. The routine box_space_id_by_name, which looks up the _space space, returns a non-null value, while at the same time the space object is not yet present in the space cache and can't be found by space_by_id. Before the 1.10 patch 0ecabde8 was merged into 2.0 as part of 13df2b1f, box_space_id_by_name used to return BOX_ID_NIL due to a "multi-engine transaction error" that is no longer raised in this situation. Closes #3611.
-
Nikita Pettik authored
We always compile with foreign key constraints enabled. They can still be turned off at runtime via <pragma foreign_keys = false>. Follow-up #3271
-
Nikita Pettik authored
Before insertion into _fk_constraint we must be sure that there is no entry with the given <name, child id>. Otherwise, the insertion will fail and a 'duplicate key' error will be shown. Such an error message doesn't seem informative enough, so let's verify before insertion into _fk_constraint that it doesn't already contain an entry with the given name. The same applies to dropping a constraint, but vice versa: we test that _fk_constraint contains an entry with the given name and child id. It is worth mentioning that during CREATE TABLE processing the schema id changes and the check in the OP_OpenRead opcode fails (which in turn shows that a pointer to a space may expire). On the other hand, the _fk_constraint space itself remains immutable, so as a temporary workaround let's use a flag indicating that a pointer to a system space is passed to OP_OpenRead. It makes it possible to use the pointer to the space even if the schema has changed. Closes #3271
-
Nikita Pettik authored
After introducing a separate space for persisting foreign key constraints, nothing prevents us from adding ALTER TABLE statements to add or drop named constraints. The ANSI syntax is as follows:

ALTER TABLE <referencing table> ADD CONSTRAINT <referential constraint name>
    FOREIGN KEY <left paren> <referencing columns> <right paren>
    REFERENCES <referenced table> [ <referenced columns> ]
    [ MATCH <match type> ]
    [ <referential triggered action> ]
    [ <constraint check time> ]

ALTER TABLE <referencing table> DROP CONSTRAINT <constraint name>

In our terms it looks like:

ALTER TABLE t1 ADD CONSTRAINT f1 FOREIGN KEY(id, a) REFERENCES t2 (id, b) MATCH FULL;
ALTER TABLE t1 DROP CONSTRAINT f1;

FK constraints which come with a CREATE TABLE statement are also persisted, with an auto-generated name. They are coded after the space and its indexes. Moreover, we don't use the original SQLite foreign keys anymore: those obsolete structs have been removed alongside the FK hash. Now FK constraints are stored only in the space. Since the types of referencing and referenced fields must match, and now in SQL only the PK is allowed to feature INT (other fields are always SCALAR), some tests have been corrected to obey this rule. Part of #3271
-
Nikita Pettik authored
This patch introduces a new system space to persist foreign key constraints. Format of the space:

_fk_constraint (space id = 358)
[<constraint name> STR, <parent id> UINT, <child id> UINT,
 <is deferred> BOOL, <match> STR, <on delete action> STR,
 <on update action> STR, <child cols> ARRAY<UINT>, <parent cols> ARRAY<UINT>]

An FK constraint is local to a space, so every pair <FK name, child id> is unique (and it is the PK in the _fk_constraint space). After insertion into this space, a new instance describing the FK constraint is created. FK constraints are held in the data dictionary as two lists (for child and parent constraints) in struct space. The list of FK restrictions:
- At the time of FK creation, the parent and child spaces must exist;
- VIEWs can't be involved in FK processing;
- The child space must be empty;
- Types of referencing and referenced fields must be comparable;
- Collations of referencing and referenced fields must match;
- Referenced fields must compose a unique index;
- Referenced fields can not contain duplicates.

While a (child) space features FK constraints, it isn't allowed to be dropped. The implicitly referenced index can't be dropped either (and that is why the parent space can't be dropped). But the :drop() method of a child space first deletes all FK constraints (the same as SQL triggers, indexes, etc.) and then removes the entry from _space. Part of #3271 Review fixes
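For illustration, here is how a row of _fk_constraint could look when modeled as a plain Python tuple (all values below are invented; only the field order follows the format above):

```python
# a hypothetical row of _fk_constraint; values are made up for illustration
fk_tuple = (
    'fk_child_parent',  # <constraint name> STR
    512,                # <parent id> UINT
    513,                # <child id> UINT
    False,              # <is deferred> BOOL
    'full',             # <match> STR
    'no_action',        # <on delete action> STR
    'no_action',        # <on update action> STR
    [1],                # <child cols> ARRAY<UINT>
    [0],                # <parent cols> ARRAY<UINT>
)

# the PK of _fk_constraint is the pair <FK name, child id>
pk = (fk_tuple[0], fk_tuple[2])
print(pk)  # ('fk_child_parent', 513)
```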
-
Nikita Pettik authored
Originally, SQLite allows creating a table with a foreign key constraint which refers to a not-yet-created parent table. For instance:

CREATE TABLE child(id INT PRIMARY KEY REFERENCES parent);
CREATE TABLE parent(id INT PRIMARY KEY);

This patch bans such an ability since it contradicts ANSI SQL. Moreover, SQLite allows dropping a parent table if deletion of all its rows wouldn't result in FK constraint violations. This feature has been removed, since in such a situation the child table would become inconsistent. Finally, within the current patch the ability to create FK constraints on VIEWs is banned as well. Part of #3271
-
Vladimir Davydov authored
It is dangerous to call box.cfg() concurrently from different fibers. For example, replication configuration uses static variables and yields so calling it concurrently can result in a crash. To make sure it never happens, let's protect box.cfg() with a lock. Closes #3606
-
- Aug 03, 2018
-
-
Alexander Turenko authored
Fixes #3489.
-
- Aug 02, 2018
-
-
Nikita Pettik authored
Now we operate on only one database, so prefixes like 'first_db.table1' are not applicable to our SQL implementation (at least for now).
-
- Aug 01, 2018
-
-
Vladimir Davydov authored
Add txn_is_first_statement() function, which returns true if this is the first statement of the transaction. The function is supposed to be used from on_replace trigger to detect transaction boundaries. Needed for #2129
-
Vladimir Davydov authored
vy_task::status stores the return code of the ->execute method. There are only two codes in use: 0 for success and -1 for failure. So let's change this to a boolean flag.
-
Vladimir Davydov authored
This flag is set iff worker_pool != NULL hence it is pointless.
-
Vladimir Davydov authored
We need cbus for forwarding deferred DELETE statements generated in a worker thread during primary index compaction to the tx thread where they can be inserted into secondary indexes. Since pthread mutex/cond and cbus are incompatible by their nature, let's rework communication channel between the tx and worker threads using cbus. Needed for #2129
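A toy Python sketch of the idea (queue.Queue stands in for cbus pipes; this is not the real cbus API): each thread only posts messages to the other side's queue, so no shared mutex/cond is needed, and deferred DELETEs generated in the worker can be forwarded to tx as ordinary messages.

```python
import queue
import threading

tx_pipe = queue.Queue()      # messages for the tx thread
worker_pipe = queue.Queue()  # tasks for the worker thread

def worker():
    while True:
        task = worker_pipe.get()
        if task is None:  # shutdown marker
            break
        # compaction would generate deferred DELETEs here; instead of
        # applying them in the worker, forward them to the tx thread
        tx_pipe.put(('deferred_delete', task))

t = threading.Thread(target=worker)
t.start()
worker_pipe.put('compact range [10, 20)')
worker_pipe.put(None)
t.join()
msg = tx_pipe.get()
print(msg)  # ('deferred_delete', 'compact range [10, 20)')
```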
-
Vladimir Davydov authored
I'm planning to add some new members and remove some old members from those structs. For this to play nicely, let's do some renames:

vy_scheduler::workers_available => idle_worker_count
vy_scheduler::input_queue => pending_tasks
vy_scheduler::output_queue => processed_tasks
vy_task::link => in_pending, in_processed
-
Vladimir Davydov authored
Currently, we don't really need it, but once we switch communication channel between the scheduler and workers from pthread mutex/cond to cbus (needed for #2129), tasks won't be completed on behalf of the scheduler fiber and hence we will need a back pointer from vy_task to vy_scheduler. Needed for #2129
-
Vladimir Davydov authored
This is a prerequisite for switching scheduler-worker communication from pthread mutex/cond to cbus, which in turn is needed to generate and send deferred DELETEs from workers back to tx (#2129). After this patch, pending tasks will be leaked on shutdown. This is OK, as we leak a lot of objects on shutdown anyway. The proper way of fixing this leak would be to rework shutdown without atexit() so that we can use cbus till the very end. Needed for #2129
-
Vladimir Davydov authored
Currently, both vy_read_iterator_next() and vy_point_lookup() add the returned tuple to the tuple cache. As a result, we store partial tuples in a secondary index tuple cache although we could store full tuples (we have to retrieve them anyway when reading a secondary index). This means wasting memory. Besides, when #2129 gets implemented, there will be tuples in a secondary index that have to be skipped as they have been overwritten in the primary index. Caching them would be inefficient and error prone. So let's call vy_cache_add() from the upper level and add only full tuples to the cache. Closes #3478 Needed for #2129
-
- Jul 31, 2018
-
-
Vladimir Davydov authored
For the sake of further patches, let's do some refactoring:
- Rename vy_check_is_unique to vy_check_is_unique_primary and use it only for checking the unique constraint of primary indexes. Also, make it return immediately if the primary index doesn't need a uniqueness check, like vy_check_is_unique_secondary does.
- Open-code the uniqueness check in vy_check_is_unique_secondary instead of using vy_check_is_unique.
- Reduce the indentation level of vy_check_is_unique_secondary by inverting the if statement.
-
Vladimir Davydov authored
vy_delete_impl helper is only used once in vy_delete and it is rather small so inlining it definitely won't hurt. On the contrary, it will consolidate DELETE logic in one place, making the code easier to follow.
-
Vladimir Davydov authored
There's no point in separating REPLACE path between the cases when the space has secondary indexes and when it only has the primary index, because they are quite similar. Let's fold vy_replace_one and vy_replace_impl into vy_replace to remove code duplication.
-
Vladimir Davydov authored
Currently, we don't always need a full tuple. Sometimes (e.g. for checking the uniqueness constraint), a partial tuple read from a secondary index is enough. So we have vy_lsm_get() which reads a partial tuple from an index. However, once the optimization described in #2129 is implemented, it might happen that a tuple read from a secondary index was overwritten or deleted in the primary index, but the DELETE statement hasn't been propagated to the secondary index yet, i.e. we will have to read the primary index anyway, even if we don't need a full tuple. That said, let us:
- Make vy_lsm_get() always fetch a full tuple, even for secondary indexes, and rename it to vy_get().
- Rewrite vy_lsm_full_by_key() as a wrapper around vy_get() and rename it to vy_get_by_raw_key().
- Introduce vy_get_by_secondary_tuple() which gets a full tuple given a tuple read from a secondary index. For now, it's basically a call to vy_point_lookup(), but it'll become a bit more complex once #2129 is implemented.
- Prepare vy_get() for the fact that a tuple read from a secondary index may be absent in the primary index, in which case it should try the next matching one.
Needed for #2129
-
Vladimir Davydov authored
Since vy_point_lookup() now guarantees that it returns the newest tuple version, we can remove the code that squashes UPSERTs from vy_squash_process().
-
Vladimir Davydov authored
Currently, vy_point_lookup(), in contrast to vy_read_iterator, doesn't rescan the memory level after reading disk, so if the caller doesn't track the key before calling this function, the caller won't be sent to a read view in case the key gets updated during yield and hence will be returned a stale tuple. This is OK now, because we always track the key before calling vy_point_lookup(), either in the primary or in a secondary index. However, for #2129 we need it to always return the latest tuple version, no matter if the key is tracked or not.

The point is, in the scope of #2129 we won't write DELETE statements to secondary indexes corresponding to a tuple replaced in the primary index. Instead, after reading a tuple from a secondary index we will check whether it matches the tuple corresponding to it in the primary index: if it does not, it means that the tuple read from the secondary index was overwritten and should be skipped. E.g. suppose we have the primary index over the first field and a secondary index over the second field and the following statements in the space:

REPLACE{1, 10}
REPLACE{1, 20}

Then reading {10} from the secondary index will return REPLACE{1, 10}, but lookup of {1} in the primary index will return REPLACE{1, 20}, which doesn't match REPLACE{1, 10} read from the secondary index, hence the latter was overwritten and should be skipped.

The problem is, in the example above we don't want to track key {1} in the primary index before the lookup, because we don't actually read its value. So for the check to work correctly, we need the point lookup to guarantee that the returned tuple is always the newest one. It's fairly easy to do: we just need to rescan the memory level after yielding on disk if its version has changed. Needed for #2129
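The match-and-skip check described above can be sketched in Python (dicts stand in for the indexes; names are invented, and the real check is part of vinyl's C code):

```python
# no DELETE was written to the secondary index when
# REPLACE{1, 20} overwrote REPLACE{1, 10}
primary = {1: (1, 20)}                   # pk field -> newest full tuple
secondary = {10: (1, 10), 20: (1, 20)}   # sk field -> tuple as written

def lookup_by_secondary(sk_key):
    partial = secondary.get(sk_key)
    if partial is None:
        return None
    full = primary.get(partial[0])  # point lookup in the primary index
    # a mismatch means the secondary entry was overwritten: skip it
    return full if full == partial else None

print(lookup_by_secondary(10))  # None: REPLACE{1, 10} was overwritten
print(lookup_by_secondary(20))  # (1, 20)
```

The sketch shows why the point lookup must return the newest version: the comparison against the primary tuple is only sound if that tuple cannot be stale.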
-
Kirill Yukhin authored
This pragma is dead and produces nothing else but segfault. Along w/ this pragma, remove now dead opcodes which set/read schema_version and all related routines. Also, improve opcode generation script. Part of #3541
-
Kirill Shcherbatov authored
Some SQL requests are complex and could contain reads and writes involving multiple spaces. As we have no ability to make such changes transactionally, we have to disallow such requests. Now that iterators in SQL start a transaction, we can prevent such vicious and dangerous things. Closes #3551
-
Mergen Imeev authored
The SQL functions UPPER and LOWER now work with COLLATE as they should according to the ANSI standard. Closes #3052.
-
- Jul 30, 2018
-
-
Vladimir Davydov authored
If vy_log_bootstrap() finds a vylog file in the vinyl directory, it assumes it has to be rebootstrapped and calls vy_log_rebootstrap(). The latter scans the old vylog file to find the max vinyl object id, from which it will start numbering objects created during rebootstrap to avoid conflicts with old objects, then it writes a VY_LOG_REBOOTSTRAP record to the old vylog to denote the beginning of a rebootstrap section.

After that, initial join proceeds as usual, writing information about new objects to the old vylog file after the VY_LOG_REBOOTSTRAP marker. Upon successful rebootstrap completion, checkpoint, which is always called right after bootstrap, rotates the old vylog and marks all objects created before the VY_LOG_REBOOTSTRAP marker as dropped in the new vylog. The old objects will be purged by the garbage collector as usual.

In case rebootstrap fails and checkpoint never happens, local recovery writes a VY_LOG_ABORT_REBOOTSTRAP record to the vylog. This marker indicates that the rebootstrap attempt failed and all objects created during rebootstrap should be discarded. They will be purged by the garbage collector on checkpoint. Thus even if rebootstrap fails, it is possible to recover the database to the state that existed right before the failed rebootstrap attempt. Closes #461
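A rough Python sketch of the replay logic implied by the markers (record and function names are illustrative, not vy_log's actual code): objects logged after VY_LOG_REBOOTSTRAP survive only if the rebootstrap was not aborted.

```python
def surviving_objects(records):
    """Return the object ids that remain after replaying vylog records."""
    old, new = set(), set()
    rebooting, aborted = False, False
    for rec in records:
        if rec == 'VY_LOG_REBOOTSTRAP':
            rebooting = True
        elif rec == 'VY_LOG_ABORT_REBOOTSTRAP':
            aborted = True
        elif rebooting:
            new.add(rec)   # created during the rebootstrap attempt
        else:
            old.add(rec)   # created before rebootstrap started
    if not rebooting:
        return old
    # an aborted attempt discards the new objects; a successful one
    # marks the old objects as dropped on the next checkpoint
    return old if aborted else new

print(surviving_objects(['obj1', 'VY_LOG_REBOOTSTRAP', 'obj2',
                         'VY_LOG_ABORT_REBOOTSTRAP']))  # {'obj1'}
print(surviving_objects(['obj1', 'VY_LOG_REBOOTSTRAP', 'obj2']))  # {'obj2'}
```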
-
Vladimir Davydov authored
Since we don't create snapshot files for vylog, but instead append records written after checkpoint to the same file, we have to use the previous vylog file for backup (see vy_log_backup_path()). So when recovering from a backup we need to rotate the last vylog to keep vylog and checkpoint signatures in sync. Currently, we do it on recovery completion and we use vy_log_create() instead of vy_log_rotate() for it. This is done so that we can reuse the context that was used for recovery instead of rereading vylog for rotation. Actually, there's no point in this micro-optimization, because we rotate vylog only when recovering from a backup. Let's remove it and use vy_log_rotate() for this. Needed for #461
-
Vladimir Davydov authored
Currently only the remote address is printed. Let's also print the UUID, because replicas are identified by UUID everywhere in tarantool, not by the address. An example of the output is below:

I> can't follow eb81a67e-99ee-40bb-8601-99b03fa20124 at [::1]:58083: required {1: 8} available {1: 12}
C> replica is too old, initiating rebootstrap
I> bootstrapping replica from eb81a67e-99ee-40bb-8601-99b03fa20124 at [::1]:58083
I> can't follow eb81a67e-99ee-40bb-8601-99b03fa20124 at [::1]:58083: required {1: 17, 2: 1} available {1: 20}
I> can't rebootstrap from eb81a67e-99ee-40bb-8601-99b03fa20124 at [::1]:58083: replica has local rows: local {1: 17, 2: 1} remote {1: 23}
I> recovery start

Suggested by @kostja. Follow-up ea69a0cd ("replication: rebootstrap instance on startup if it fell behind").
-
Vladimir Davydov authored
This function has not been used anywhere since commit a1e005d8 ("vinyl: write_iterator merges vlsns subsequnces").
-
- Jul 27, 2018
-
-
Kirill Yukhin authored
SQLITE_ENABLE_PREUPDATE_HOOK is a dead macro. Remove it. Part of #2356
-
Kirill Shcherbatov authored
To implement the new TRUNCATE operation, we have introduced a new P2 argument for the OP_Clear opcode that calls box_truncate instead of tarantoolSqlite3ClearTable. This operation should work faster than DELETE FROM, but has a few restrictions. Closes #2201.

@TarantoolBot document
Title: New TRUNCATE operation
TRUNCATE is a DDL operation. It removes all rows from a table without logging the individual row deletions. TRUNCATE TABLE is similar to a DELETE statement with no WHERE clause; however, TRUNCATE TABLE is faster and uses fewer system resources. It can't be used with system tables or with tables having FKs. It also can't be called in a transaction. Triggers on the table are ignored.
Example:
TRUNCATE TABLE t1;
-
- Jul 26, 2018
-
-
Konstantin Belyavskiy authored
Fix 'fio.rmtree' to remove non-empty directories, and update the test. Closes #3258
-
Konstantin Belyavskiy authored
Fix 'fio.rmtree' to remove non-empty directories, and update the test. Closes #3258
-
Serge Petrenko authored
Function access_check_ddl checked only for universal access, thus granting entity or single object access to a user would have no effect in the scope of this function. Fix this by adding entity access checks. Also, attaching an existing sequence to a space checked for the create privilege on both the space and the sequence (instead of read + write on the sequence). Fixed it and changed the tests accordingly. Closes #3516
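The broadened check can be sketched in Python (the grant layout and names are invented for illustration; the real check operates on tarantool's C access structures): a privilege is accepted if it is granted on the specific object, on the whole entity, or universally.

```python
def has_access(grants, priv, obj_type, obj_id):
    """Accept the privilege if granted on the object, its entity, or universally."""
    return (priv in grants.get(('object', obj_type, obj_id), set())
            or priv in grants.get(('entity', obj_type), set())
            or priv in grants.get(('universal',), set()))

# entity-level grant on all spaces; no universal or per-object grants
grants = {('entity', 'space'): {'create'}}
print(has_access(grants, 'create', 'space', 512))  # True, via entity access
print(has_access(grants, 'drop', 'space', 512))    # False
```

Before the fix, only the last (universal) branch was consulted, so the entity-level grant above would have been ignored.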
-
- Jul 24, 2018
-
-
Olga Arkhangelskaia authored
This check happens twice. The patch simply removes the duplicate check.
-
Vladimir Davydov authored
Blackhole doesn't need transaction control as it doesn't actually store anything so we can mark it with ENGINE_BYPASS_TX.
-
Kirill Shcherbatov authored
Got rid of is_primkey in the Column structure as it became redundant. Moved the last member, coll (the collation pointer), to the field_def structure. Finally, dropped Column.
-