Commits · 910d2bb034fe5d0811f1bf188add5c77a5409f7f · core / tarantool

Dec 11, 2024

Georgy Moshkin authored 1 year ago and

Dmitry Ivanov committed 3 months ago

Introduce fully temporary spaces: same as data-temporary space but with
temporary metadata. Basically temporary spaces now do not exist on
restart and do not exist on replicas. They can also be created, altered
and deleted when box.cfg.read_only = true.

To avoid conflicts with spaces created on replicas, the temporary
space ids by default start in a special range starting at
BOX_SPACE_ID_TEMPORARY_MIN.

Temporary spaces currently do not support several features e.g.
foreign key references (to and from), functional indexes, sql sequences,
sql triggers, etc. This may change in the future.

Implementing temporary spaces requires temporary tuples to be
inserted into system spaces: tuples which are neither replicated nor
persisted. This is mostly done in on_replace_dd_* triggers by dropping the
txn->stmt->row.

Closes #8323

@TarantoolBot document
Title: Introduce fully temporary spaces with temporary metadata

Temporary spaces are now data-temporary spaces with temporary metadata.
Created by specifying { type = "temporary" } in the options.
Temporary spaces will not exist upon server restart and will not
exist on replicas. They can also be created in read-only mode.

910d2bb0

core: rename temporary spaces to data-temporary · c11e7a10

Georgy Moshkin authored 1 year ago and

Dmitry Ivanov committed 3 months ago

Everywhere where we refer to temporary spaces we now say data-temporary.
This is because temporary spaces were never truly temporary because
their definitions would still be persisted and replicated and they
couldn't be created on read-only replicas. In a following commit we will
introduce a new fully temporary type of spaces, which will be just
called 'temporary', so this commit signifies this terminology change.

NO_DOC=renaming
NO_CHANGELOG=renaming
NO_TEST=renaming

c11e7a10

sql: refactor update_view_references a bit · f6c88e6b

Aleksandr Lyapunov authored 1 year ago and

Dmitry Ivanov committed 3 months ago

The function update_view_references is called when an SQL view
is created or dropped. The goal of this function is to modify
(increment or decrement) view_ref_count member of spaces that
the view references.

There were several issues that deserved refactoring:
* By design, in case of error it left the job partially done, so
  some space references were modified while others were not.
  Although there was no bug since special steps were made in case
  of error, this pattern is inconvenient and should be avoided.
* In case of error the failing space name was returned via a special
  argument, which is not flexible and even requires allocation.
* Another argument - suppress_error - has actually never
  suppressed any error because the only case when an error could
  occur is creation of a view, which used suppress_error = false.
* Failure of that function was not actually covered with tests.

So this commit:
* Makes the function do all or nothing.
* Forces the function to set diag by itself in case of error.
* Removes the suppress_error argument while adding several asserts.
* Adds a small test that completes the coverage.

NO_DOC=refactoring
NO_CHANGELOG=refactoring

f6c88e6b

box: support default field values in the space format · b2f58221

Ilya Verbin authored 2 years ago and

Dmitry Ivanov committed 3 months ago

Now a field can be assigned a default value in the space format. When a new
tuple is inserted into a space, and some of the fields contain null values,
those fields will be filled with their respective default values.

Closes #8157

@TarantoolBot document
Title: Document default field values
Product: Tarantool
Since: 3.0
Root document: https://www.tarantool.io/en/doc/latest/reference/reference_lua/box_space/format/

The format clause contains, for each field, a definition within braces:
`{name='...',type='...'[,is_nullable=...][,default=...]}`, where:

* the optional `default` value contains a default value for the field.
  Its type must be compatible with the field type. If a default value is set,
  it is applied regardless of whether `is_nullable` is true or false.

Example:

```lua
tarantool> box.space.tester:format{
         > {name = 'id', type = 'unsigned'},
         > {name = 'name', type = 'string', default = 'Noname'},
         > {name = 'pass', type = 'string'},
         > {name = 'shell', type = 'string', default = '/bin/sh'}}
---
...

tarantool> box.space.tester:insert{1000, nil, 'qwerty'}
---
- [1000, 'Noname', 'qwerty', '/bin/sh']
...
```

b2f58221

box: introduce tuple_builder class · a9a20579

Ilya Verbin authored 1 year ago and

Dmitry Ivanov committed 3 months ago

It encapsulates the logic that helps to build a new MsgPack array by
concatenating tuple fields from various locations. The idea is to
postpone memory allocation and copying until the finalization.
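
A minimal Python sketch of the postponed-copy idea, for illustration only.
It is not the C `tuple_builder` API: the class and method names are
hypothetical, and the only MsgPack detail it relies on is the standard
array header encoding.

```python
import struct


def array_header(n):
    """Encode a MsgPack array header for n elements."""
    if n < 16:
        return bytes([0x90 | n])               # fixarray
    if n < 1 << 16:
        return b"\xdc" + struct.pack(">H", n)  # array 16
    return b"\xdd" + struct.pack(">I", n)      # array 32


class TupleBuilderSketch:
    """Hypothetical model: collect references first, copy once at the end."""

    def __init__(self):
        self.chunks = []      # views over already encoded fields, no copying
        self.field_count = 0
        self.size = 0

    def add(self, data, field_count=1):
        view = memoryview(data)
        self.chunks.append(view)
        self.field_count += field_count
        self.size += len(view)

    def finalize(self):
        header = array_header(self.field_count)
        buf = bytearray(len(header) + self.size)  # the only allocation
        buf[:len(header)] = header
        pos = len(header)
        for chunk in self.chunks:
            buf[pos:pos + len(chunk)] = chunk
            pos += len(chunk)
        return bytes(buf)
```

Here `add()` only records a reference to the encoded bytes; the single
buffer is allocated in `finalize()`, once the total size is known.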

Needed for #8157

NO_DOC=internal
NO_CHANGELOG=internal

a9a20579

fix: wrong argument type in box_auth_data_prepare() · e89b633d

Denis Smirnov authored 1 year ago and

Dmitry Ivanov committed 3 months ago

The box_auth_data_prepare() method was declared to return a tuple, while in
reality it returned a region-allocated MsgPack string. Fixed.

NO_DOC=picodata internal patch
NO_CHANGELOG=picodata internal patch
NO_TEST=picodata internal patch

e89b633d

feat: extend C box API with a new auth method · 70dc777d

Denis Smirnov authored 1 year ago and

Dmitry Ivanov committed 3 months ago

1. Current commit introduces 'box_auth_data_prepare()' to prepare
   a data string for any supported authentication methods.
2. The user name argument is refactored in the auth methods: the
   null-terminated string is replaced with an address range approach.
   Now Rust users don't need to re-allocate username with CString.
3. Password length type was set to uint32_t (previously it was size_t,
   int, uint32_t for different functions). Tarantool uses murmur3a,
   so the length of every hashed string must fit into 32 bits.

NO_DOC=picodata internal patch
NO_CHANGELOG=picodata internal patch
NO_TEST=picodata internal patch

70dc777d

feat: extend C box API with new user methods · b201f939

Denis Smirnov authored 1 year ago and

Dmitry Ivanov committed 3 months ago

Introduce new methods:

1. box_user_id_by_name - get the user identifier by name;
2. box_effective_user_id - get current effective user
   identifier;
3. box_session_user_id - get current session user identifier;
4. box_session_su - change current session user.

NO_DOC=picodata internal patch
NO_CHANGELOG=picodata internal patch
NO_TEST=picodata internal patch

b201f939

feat: Implement LDAP authentication · 1fcdd15f

Dmitry Ivanov authored 1 year ago

This authentication method doesn't store any secrets; instead,
we delegate the whole auth to a pre-configured LDAP server. In
the method's implementation, we connect to the LDAP server and
perform a BIND operation which checks user's credentials.

Usage example:

```lua
-- Set the default auth method to LDAP and create a new user.
-- NOTE that we still have to provide a dummy password; otherwise
-- box.schema.user.create will set up empty auth data.
box.cfg({auth_type = 'ldap'})
box.schema.user.create('demo', { password = '' })

-- Configure LDAP server connection URL and DN format string.
os = require('os')
os.setenv('TT_LDAP_URL', 'ldap://localhost:1389')
os.setenv('TT_LDAP_DN_FMT', 'cn=$USER,ou=users,dc=example,dc=org')

-- Authenticate using the LDAP authentication method via net.box.
conn = require('net.box').connect(uri, {
    user = 'demo',
    password = 'password',
    auth_type = 'ldap',
})
```

NO_DOC=internal
NO_TEST=internal
NO_CHANGELOG=internal

1fcdd15f

fix: box.schema.user.passwd doesn't change the password · bfd298d8

Maksim Kaitmazian authored 1 year ago and

Dmitry Ivanov committed 3 months ago

box.schema.user.passwd doesn't change the password for the current
user because the new password is passed instead of the user name.

NO_CHANGELOG=fix an unreleased bug
NO_DOC=fix an unreleased bug

bfd298d8

fix: allow empty password and username in md5 · 21f065b2

Maksim Kaitmazian authored 1 year ago and

Dmitry Ivanov committed 3 months ago

It fixes the following assertion
```bash
tarantool: ./src/lib/core/crypt.c:84: md5_encrypt:
Assertion `password_len + salt_len > 0' failed.
```
caused by the following code
```lua
box.cfg{auth_type='md5'}
box.schema.user.password("")
```

NO_CHANGELOG=fix an unreleased feature
NO_DOC=fix an unreleased feature

21f065b2

feat: make user name argument optional · 160dee2d
Maksim Kaitmazian authored 1 year ago and Dmitry Ivanov committed 3 months ago
```
part of picodata/tarantool#21

NO_CHANGELOG=refactoring
NO_DOC=refactoring
```
160dee2d

feat: implement md5 authentication · 2e10bef3

Maksim Kaitmazian authored 1 year ago and

Dmitry Ivanov committed 3 months ago

It prevents password sniffing and avoids storing passwords on the
server in plain text but provides no protection if an attacker
manages to steal the password hash from the server.

Usage example:
```lua
-- Enable the md5 authentication method for all new users.
box.cfg({auth_type = 'md5'})

-- Reset existing user passwords to use the md5 authentication method.
box.schema.user.passwd('alice', 'topsecret')

-- Authenticate using the md5 authentication method via net.box.
conn = require('net.box').connect(uri, {
    user = 'alice',
    password = 'topsecret',
    -- Specifying the authentication method isn't strictly necessary:
    -- by default the client will use the method set in the remote
    -- server config (box.cfg.auth_type)
    auth_type = 'md5',
})
```

part of picodata/picodata/sbroad!377

@TarantoolBot document
Title: md5 authentication method

See the commit message.

2e10bef3

feat: add user name argument to `auth_method` api · ef4f31ef

Maksim Kaitmazian authored 1 year ago and

Dmitry Ivanov committed 3 months ago

The user name is commonly used as a salt for the user's password.
For instance, PostgreSQL md5 authentication stores passwords as
md5("password", "user"), so that identical passwords of different users
are represented by different hashes.
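
A small Python sketch of the PostgreSQL-style scheme, for illustration.
It shows only the salting idea; the layout of the auth data that Tarantool
itself stores may differ, and the function name is hypothetical.

```python
import hashlib


def salted_md5(password: str, user: str) -> str:
    """PostgreSQL-style stored value: "md5" + md5(password || user)."""
    digest = hashlib.md5((password + user).encode("utf-8")).hexdigest()
    return "md5" + digest


# The same password yields different stored values for different users.
alice = salted_md5("topsecret", "alice")
bob = salted_md5("topsecret", "bob")
```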

part of picodata/picodata/sbroad!377

@TarantoolBot document
Title: Document updated `box.schema.user.password` declaration.

Since auth methods can use user name for hashing, user name is
added to argument list of `box.schema.user.password`.

NO_TEST=there are no methods that use user name

ef4f31ef

fiber: introduce fiber_set_name_n function · 0ba5b9cb
Georgy Moshkin authored 1 year ago and Dmitry Ivanov committed 3 months ago
```
NO_DOC=picodata internal patch
NO_CHANGELOG=picodata internal patch
NO_TEST=picodata internal patch
```
0ba5b9cb

feat: add limit for max executed vdbe opcodes · a51dd994

Arseniy Volynets authored 1 year ago and

Dmitry Ivanov committed 3 months ago

- Add a configurable non-negative session parameter
"sql_vdbe_max_steps": the maximum number of opcodes that Vdbe
is allowed to execute for an SQL query.

- The default value can be specified in box.cfg. If not set via
box.cfg, the default value is 45000. Value 0 means that no checks
for the number of executed Vdbe opcodes will be made.

- Add a third argument to the box.execute function that allows
specifying options for query execution. The only option
supported is sql_vdbe_max_steps. Usage example:

```
box.execute([[select * from t]], {}, {{sql_vdbe_max_steps = 1000}})
```
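
The limit semantics can be sketched with a toy interpreter loop in Python.
This is an assumption-laden model, not the Vdbe code: the names are
hypothetical, and whether the real check fires at exactly `max_steps`
executed opcodes or one later is not stated in this message.

```python
class StepLimitExceeded(Exception):
    """Raised when a program runs more opcodes than allowed."""


def execute(opcodes, max_steps=45000):
    """Run callables one by one; max_steps == 0 disables the check."""
    executed = 0
    for op in opcodes:
        # Assumption: the program may execute at most max_steps opcodes.
        if max_steps != 0 and executed >= max_steps:
            raise StepLimitExceeded(
                "executed %d opcodes, limit is %d" % (executed, max_steps))
        op()
        executed += 1
    return executed


def noop():
    pass
```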

part of picodata/picodata/sbroad!461

NO_DOC=picodata internal patch
NO_CHANGELOG=picodata internal patch

a51dd994

feat: expose tuple hash calculation method · aeb4e83d

Denis Smirnov authored 1 year ago and

Dmitry Ivanov committed 3 months ago

Picodata supports cluster-wide SQL and needs some predictable
method to calculate tuple hashes for the bucket ids. Method
should be available for Lua, C and Rust users. It was decided
to expose a murmur3 hash calculation method of the key_def module.

NO_DOC=picodata internal patch
NO_CHANGELOG=picodata internal patch

fix: tuple hash calculation tests

Tuple hash calculation tests for the C API were incorrect. Thanks
to the full pipeline with DEBUG build we detected the problem and
fixed it.

NO_DOC=picodata internal patch
NO_CHANGELOG=picodata internal patch

aeb4e83d

cbus: introduce lcpipe - light cpipe · fb00c8ca

godzie44 authored 1 year ago and

Dmitry Ivanov committed 3 months ago

Introduced a new type of cbus pipe - lcpipe. The current pipe in the
cbus - cpipe - has a number of limitations. First of all, the cpipe
cannot be used from 3rd party threads: it only works as a channel
between two cords. That is why lcpipe is needed. Its main responsibility
is to create a channel between any thread and a tarantool cord.

Internally lcpipe is a cpipe, but:
- on-flush triggers are removed, because triggers use a thread-local
mem-pool, which is not possible on a third party thread
- the producer event loop is removed, because there is no libev event
loop in a third party thread

Also, lcpipe interface is exported to the outside world.

fix: use-after-free in `cbus_endpoint_delete`

Calling a `TRASH` macro after calling the `free`
function dereferences the pointer to the already
freed memory.

NO_DOC=picodata internal patch
NO_CHANGELOG=picodata internal patch
NO_TEST=picodata internal patch

fb00c8ca

feat(json): add option to encode decimals as string · c39f1fee

Дмитрий Кольцов authored 2 years ago and

Dmitry Ivanov committed 3 months ago

Due to inconsistency of Tarantool type casting when using strict
data types such as "double" or "unsigned", the "number" data type
has to be used in a whole bunch of cases.
However, "number" may contain "decimal", which is serialized into
a string by the builtin JSON module.

This commit adds the "encode_decimal_as_number" parameter to json.cfg{}.
It forces `decimal` to be encoded as a JSON number to keep types
consistent in JSON output.
Use with caution - most JSON parsers assume that a number is restricted
to float64.
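
The float64 caveat can be seen on the consumer side with a short Python
sketch. It uses only the standard `json` and `decimal` modules and says
nothing about how Tarantool encodes the value; the sample document is made up.

```python
import json
from decimal import Decimal

# A JSON number with more digits than float64 can represent exactly.
text = '{"price": 0.12345678901234567890123}'

# Default parsing maps JSON numbers with a fraction to float (float64).
as_float = json.loads(text)["price"]

# A parser that is told to keep decimals preserves every digit.
as_decimal = json.loads(text, parse_float=Decimal)["price"]

# True when the float64 value no longer equals the original decimal.
precision_lost = Decimal(repr(as_float)) != as_decimal
```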

NO_DOC=we do not host doc

c39f1fee

sql: fix string dequoting · 7ab5f493

Denis Smirnov authored 2 years ago and

Dmitry Ivanov committed 3 months ago


Previously,

select "t1"."a" from (select "a" from "t") as "t1";

returned a result column name `t1` instead of `t1.a` because the
dequoting function worked incorrectly: previously the sqlDequote()
function finished its work when it found the first closing quote.

The old logic worked for simple selects where the column name doesn't
contain an explicit scan name ("a" -> a).
But for sub-query results sqlDequote() finished its work right
after the scan name ("t1"."a" -> t1). Now the function continues
dequoting until it reaches the null terminator at the end of the string.
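
The difference between the two behaviours can be modelled in Python.
This is a simplified sketch that handles only double quotes; it is not
the sqlDequote() implementation, and both function names are hypothetical.

```python
def dequote_first_only(s):
    """Old behaviour (model): stop at the first closing quote."""
    if not s.startswith('"'):
        return s
    out = []
    i = 1
    while i < len(s):
        if s[i] == '"':
            if i + 1 < len(s) and s[i + 1] == '"':
                out.append('"')   # doubled quote is an escaped quote
                i += 2
                continue
            break                 # first closing quote: give up here
        out.append(s[i])
        i += 1
    return "".join(out)


def dequote_whole(s):
    """New behaviour (model): keep going until the end of the string."""
    out = []
    in_quotes = False
    i = 0
    while i < len(s):
        if s[i] == '"':
            if in_quotes and i + 1 < len(s) and s[i + 1] == '"':
                out.append('"')
                i += 2
                continue
            in_quotes = not in_quotes
        else:
            out.append(s[i])
        i += 1
    return "".join(out)
```

With this model `"t1"."a"` becomes `t1` under the old logic and `t1.a`
under the new one, while a simple `"a"` becomes `a` in both.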

Closes #7063

NO_DOC=don't change any public API, only a bug fix

Co-authored-by: Mergen Imeev <imeevma@gmail.com>

7ab5f493

sql: recompile expired prepared statements · 5f6a5c3c

Denis Smirnov authored 2 years ago and

Dmitry Ivanov committed 3 months ago

Actually there is no reason to throw an error and make a user
manually recreate a prepared statement when it expires. A much more
user-friendly way is to recreate it under the hood when the statement's
schema version differs from the box one.
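
The approach can be sketched in Python as a statement object that
remembers the schema version it was compiled against. All names here are
hypothetical stand-ins and the "compiler" is a stub; this is not the
actual SQL prepared statement code.

```python
class PreparedStatementSketch:
    """Recompile transparently when the schema version has changed."""

    def __init__(self, sql, compile_fn, schema_version):
        self.sql = sql
        self._compile = compile_fn
        self.schema_version = schema_version
        self.program = compile_fn(sql)

    def execute(self, current_schema_version):
        if self.schema_version != current_schema_version:
            # Expired: rebuild instead of raising an error to the user.
            self.program = self._compile(self.sql)
            self.schema_version = current_schema_version
        return self.program


compile_calls = []


def fake_compile(sql):
    """Stub compiler that only records how often it was invoked."""
    compile_calls.append(sql)
    return ("program", sql, len(compile_calls))
```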

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

5f6a5c3c

fix: default result parameter type · 170e7d91

Denis Smirnov authored 2 years ago and

Dmitry Ivanov committed 3 months ago

Problem description.

When we prepare a statement with parameters in the result columns
(for example box.prepare('select ?')), Tarantool has no information
about the type of the output column and sets it to the default, boolean.
Then, in the execution phase, the type is recalculated during
parameter binding.

Tarantool expects that a parameter can appear in the result tuple
only if it is mentioned explicitly in the final projection.
But this is incorrect - we can easily propagate a parameter from the inner
part of a join. For example

box.prepare([[select COLUMN_1 from t1 join (values (?)) as t2 on true]])

In this case column COLUMN_1 in the final projection is not a
parameter, but a "reference" to it, and its type depends on the
parameter from the inner part of the join. But as Tarantool
recalculates only bound parameters in the result projection,
it doesn't change the default boolean metadata type of COLUMN_1,
and the query fails on comparison with the actual type of the tuple.

Solution.
As we don't want to patch Vdbe to make COLUMN_1 refer to the inner
parameter, it was decided to make a simple workaround: change the default
column type from BOOLEAN to ANY for parameters. It fixes the
comparison with the actual tuple type (we do not fail), but in some
cases we get an ANY column in the results where we would like to have
an explicitly defined type. NULL parameters would also have the ANY
type, though Tarantool prefers to have BOOLEAN in this case.

Closes https://github.com/tarantool/tarantool/issues/7283

NO_DOC=bug fix

170e7d91

cmake: disable feedback daemon by default · 32fa3a59
Дмитрий Кольцов authored 1 year ago and Dmitry Ivanov committed 3 months ago
```
NO_DOC=disable feedback
NO_TEST=disable feedback
```
32fa3a59

vinyl: disable tautological DELETE optimization for deferred DELETEs · 050dcf4d

Vladimir Davydov authored 3 months ago and

Dmitry Ivanov committed 3 months ago

If the write iterator sees that one DELETE statement follows another,
which isn't discarded because it's referenced by a read view, it drops
the newer DELETE, see commit a6f45d87 ("vinyl: discard tautological
DELETEs on compaction"). This is incorrect if the older DELETE is a
deferred DELETE statement (marked as SKIP READ) because such statements
are dumped out of order, i.e. there may be a statement with the LSN
lying between the two DELETEs in an older source not included into this
compaction task. If we discarded the newer DELETE, we wouldn't overwrite
this statement on major compaction, leaving garbage. Fix this issue by
disabling this optimization for deferred DELETEs.

Closes #10895

NO_DOC=bug fix

(cherry picked from commit 2945a8c9fde6df9f6cbc714f9cf8677f0fded57a)

050dcf4d

vinyl: fix cache invalidation on rollback of DELETE statement · 76345442

Vladimir Davydov authored 3 months ago and

Dmitry Ivanov committed 3 months ago

Once a statement is prepared to be committed to WAL, it becomes visible
(in the 'read-committed' isolation level) so it can be added to the
tuple cache. That's why if the statement is rolled back due to a WAL
error, we have to invalidate the cache. The problem is that the function
invalidating the cache (`vy_cache_on_write`) ignores the statement if
it's a DELETE judging that "there was nothing and there is nothing now".
This is apparently wrong for rollback. Fix it.

Closes #10879

NO_DOC=bug fix

(cherry picked from commit d64e29da2c323a4b4fcc7cf9fddb0300d5dd081f)

76345442

vinyl: fix handling of duplicate multikey entries in transaction write set · 3d584a69

Vladimir Davydov authored 3 months ago and

Dmitry Ivanov committed 3 months ago

A multikey index stores a tuple once per each entry of the indexed
array field, excluding duplicates. For example, if the array field
equals {1, 3, 2, 3}, the tuple will be stored three times. Currently,
when a tuple with duplicate multikey entries is inserted into a
transaction write set, duplicates are overwritten as if they belonged
to different statements. Actually, this is pointless: we could just as
well skip them without trying to add to the write set. Besides, this
may break the assumptions taken by various optimizations, resulting in
anomalies. Consider the following example:

```lua
local s = box.schema.space.create('test', {engine = 'vinyl'})
s:create_index('primary')
s:create_index('secondary', {parts = {{'[2][*]', 'unsigned'}}})
s:replace({1, {10, 10}})
s:update({1}, {{'=', 2, {10}}})
```

It will insert the following entries to the transaction write set
of the secondary index:

 1. REPLACE {10, 1} [overwritten by no.2]
 2. REPLACE {10, 1} [overwritten by no.3]
 3. DELETE {10, 1} [turned into no-op as REPLACE + DELETE]
 4. DELETE {10, 1} [overwritten by no.5]
 5. REPLACE {10, 1} [turned into no-op as DELETE + REPLACE]

(1-2 correspond to `replace()` and 3-5 to `update()`)

As a result, tuple {1, {10}} will be lost forever.

Let's fix this issue by silently skipping duplicate multikey entries
added to a transaction write set. After the fix, the example above
will produce the following write set entries:

 1. REPLACE {10, 1} [overwritten by no.2]
 2. DELETE {10, 1} [turned into no-op as REPLACE + DELETE]
 3. REPLACE {10, 1} [committed]

(1 corresponds to `replace()` and 2-3 to `update()`)
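
The deduplication step itself can be sketched in a few lines of Python.
This models only "one entry per distinct array element"; the function
name and the `(key, primary_key)` tuple layout are illustrative
assumptions, not the Vinyl data structures.

```python
def multikey_entries(array_field, primary_key):
    """Return one (key, primary_key) entry per distinct array element."""
    seen = set()
    entries = []
    for key in array_field:
        if key in seen:
            continue  # duplicate multikey entry: silently skipped
        seen.add(key)
        entries.append((key, primary_key))
    return entries


# The array {1, 3, 2, 3} produces three entries, not four.
entries = multikey_entries([1, 3, 2, 3], primary_key=1)
```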

Closes #10869
Closes #10870

NO_DOC=bug fix

(cherry picked from commit 1869dce15d9a797391e45df75507078d91f1651e)

3d584a69

vinyl: skip invisible read sources · 8f7bae8c

Vladimir Davydov authored 4 months ago and

Dmitry Ivanov committed 3 months ago

A Vinyl read iterator scans all read sources (memory and disk levels)
even if it's executed in a read view from which most of the sources are
invisible. As a result, a long running scanning request may spend most
of the time skipping invisible statements. The situation is exacerbated
if the instance is experiencing a heavy write load because it would pile
up old statement versions in memory and force the iterator to skip over
them after each disk read.

Since the replica join procedure in Vinyl uses a read view iterator
under the hood, the issue is responsible for a severe performance
degradation of the master instance and the overall join procedure
slowdown when a new replica is joined to an instance running under
a heavy write load.

Let's fix this issue by making a read iterator skip read sources that
aren't visible from its read view.

Closes #10846

NO_DOC=bug fix

(cherry picked from commit 6a214e42e707b502022622866d898123a6f177f1)

8f7bae8c

vinyl: fix handling of overwritten statements in transaction write set · 3344bffc

Vladimir Davydov authored 4 months ago and

Dmitry Ivanov committed 3 months ago

Statements executed in a transaction are first inserted into the
transaction write set and only when the transaction is committed, they
are applied to the LSM trees that store indexed keys in memory. If the
same key is updated more than once in the same transaction, the old
version is marked as overwritten in the write set and not applied on
commit.

Initially, write sets of different indexes of the same space were
independent: when a transaction was applied, we didn't have a special
check to skip a secondary index statement if the corresponding primary
index statement was overwritten because in this case the secondary
index statement would have to be overwritten as well. This changed when
deferred DELETEs were introduced in commit a6edd455 ("vinyl:
eliminate disk read on REPLACE/DELETE"). Because of deferred DELETEs,
a REPLACE or DELETE overwriting a REPLACE in the primary index write
set wouldn't generate DELETEs that would overwrite the previous key
version in write sets of the secondary indexes. If we applied such
a statement to the secondary indexes, it'd stay there forever because,
since there's no corresponding REPLACE in the primary index, a DELETE
wouldn't be generated on primary index compaction. So we added a special
instruction to skip a secondary index statement if the corresponding
primary index was overwritten, see `vy_tx_prepare()`. Actually, this
wasn't completely correct because we skipped not only secondary index
REPLACE but also DELETE. Consider the following example:

```lua
local s = box.schema.space.create('test', {engine = 'vinyl'})
s:create_index('primary')
s:create_index('secondary', {parts = {2, 'unsigned'}})

s:replace{1, 1}

box.begin()
s:update(1, {{'=', 2, 2}})
s:update(1, {{'=', 2, 3}})
box.commit()
```

UPDATEs don't defer DELETEs because, since they have to query the old
value, they can generate DELETEs immediately so here's what we'd have
in the transaction write set:

 1. REPLACE {1, 2} in 'test.primary' [overwritten by no.4]
 2. DELETE {1, 1} from 'test.secondary'
 3. REPLACE {1, 2} in 'test.secondary' [overwritten by no.5]
 4. REPLACE {1, 3} in 'test.primary'
 5. DELETE {1, 2} from 'test.secondary'
 6. REPLACE {1, 3} in 'test.secondary'

Statement no.2 would be skipped and marked as overwritten because of
the new check, resulting in {1, 1} never deleted from the secondary
index. Note, the issue affects spaces both with and without enabled
deferred DELETEs.

This commit fixes this issue by updating the check to only skip REPLACE
statements. It should be safe to apply DELETEs in any case.
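
The effect of the two variants of the check can be replayed on the write
set from the example with a toy Python model. It is a sketch under
simplifying assumptions: the secondary index is a plain set of tuples, and
the `Stmt` fields and function name are hypothetical.

```python
from collections import namedtuple

# primary_no links a secondary index statement to its primary statement.
Stmt = namedtuple("Stmt", "no op tuple index primary_no overwritten")

WRITE_SET = [
    Stmt(1, "REPLACE", (1, 2), "primary", 1, True),
    Stmt(2, "DELETE", (1, 1), "secondary", 1, False),
    Stmt(3, "REPLACE", (1, 2), "secondary", 1, True),
    Stmt(4, "REPLACE", (1, 3), "primary", 4, False),
    Stmt(5, "DELETE", (1, 2), "secondary", 4, False),
    Stmt(6, "REPLACE", (1, 3), "secondary", 4, False),
]


def apply_secondary(write_set, skip_deletes_too):
    """Apply secondary index statements to a set that models the index."""
    primary_overwritten = {
        s.no: s.overwritten for s in write_set if s.index == "primary"
    }
    index = {(1, 1)}  # secondary index content after s:replace{1, 1}
    for s in write_set:
        if s.index != "secondary" or s.overwritten:
            continue
        if primary_overwritten[s.primary_no]:
            # Old check skipped everything; new check skips only REPLACE.
            if s.op == "REPLACE" or skip_deletes_too:
                continue
        if s.op == "DELETE":
            index.discard(s.tuple)
        else:
            index.add(s.tuple)
    return index


old_result = apply_secondary(WRITE_SET, skip_deletes_too=True)
new_result = apply_secondary(WRITE_SET, skip_deletes_too=False)
```

In this model the old check leaves the stale tuple (1, 1) in the index
next to (1, 3), while the updated check leaves only (1, 3).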

There's another closely related issue that affects only spaces with
enabled deferred DELETEs. When we generate deferred DELETEs for
secondary index when a transaction is committed (we can do it if we find
the previous version in memory), we assume that there can't be a DELETE
in a secondary index write set. This isn't true: there can be a DELETE
generated by UPDATE or UPSERT. If there's a DELETE, we have nothing to
do unless the DELETE was optimized out (marked as no-op).

Both issues were found by `vinyl-luatest/select_consistency_test.lua`.

Closes #10820
Closes #10822

NO_DOC=bug fix

(cherry picked from commit 6a87c45deeb49e4e17ae2cc0eeb105cc9ee0f413)

3344bffc

Nov 21, 2024

sql: do not use raw index for count · 1b12f241

Andrey Saranchin authored 4 months ago

Currently, we use raw index for count operation instead of
`box_index_count`. As a result, we skip a check if current transaction
can continue and we don't begin transaction in engine if needed. So, if
count statement is the first in a transaction, it won't be tracked by
MVCC since it wasn't notified about the transaction. The commit fixes
the mistake. Also, the commit adds a check if count was successful and
covers it with a test.

In order to backport the commit to 2.11, the space name was wrapped in
quotes since it is in lower case and addressing such spaces in SQL
without quotes is a Tarantool 3.0 feature. Another unsupported feature is
the prohibition of data access in transactional triggers - it was used in a
test case, so the test case was rewritten.

Closes #10825

NO_DOC=bugfix

(cherry picked from commit 0656a9231149663a0f13c4be7466d4776ccb0e66)

1b12f241

Nov 12, 2024

test: reduce dump count in vinyl-luatest/select_consistency_test · a2cc2b00

Vladimir Davydov authored 4 months ago

The test expects at least 10 dumps to be created in 60 seconds. It
usually works but sometimes, when the host is heavily loaded, Vinyl
doesn't produce enough dumps in time and fails the test. On CI the test
usually fails with 7-9 dumps. To avoid flaky failures, let's reduce the
expected dump count down to 5.

Closes #10752

NO_DOC=test fix
NO_CHANGELOG=test fix

(cherry picked from commit 5325abd3441ecb4b589799c32ec181d88724b8a8)

a2cc2b00

vinyl: fix duplicate multikey stmt accounting with deferred deletes · 26a3c8cf

Vladimir Davydov authored 4 months ago

`vy_mem_insert()` and `vy_mem_insert_upsert()` increment the row count
statistic of `vy_mem` only if no statement is replaced, which is
correct, while `vy_lsm_commit()` increments the row count of `vy_lsm`
unconditionally. As a result, `vy_lsm` may report a non-zero statement
count (via `index.stat()` or `index.len()`) after a dump. This may
happen only with a non-unique multikey index, when the statement has
duplicates in the indexed array, and only if the `deferred_deletes`
option is enabled, because otherwise we drop duplicates when we form
the transaction write set, see `vy_tx_set()`. With `deferred_deletes`,
we may create a `txv` for each multikey entry at the time when we
prepare to commit the transaction, see `vy_tx_handle_deferred_delete()`.

Another problem is that `vy_mem_rollback_stmt()` always decrements
the row count, even if it didn't find the rolled back statement in
the tree. As a result, if the transaction with duplicate multikey
entries is rolled back on WAL error, we'll decrement the row count
of `vy_mem` more times than necessary.

To fix this issue, let's make the `vy_mem` methods update the in-memory
statistic of `vy_lsm`. This way they should always stay in-sync. Also,
we make `vy_mem_rollback_stmt()` skip updating the statistics in case
the rolled back statement isn't present in the tree.

This issue results in `vinyl-luatest/select_consistency_test.lua`
flakiness when checking `index.len()` after compaction. Let's make
the test more thorough and also check that `index.len()` equals
`index.count()`.

Closes #10751
Part of #10752

NO_DOC=bug fix

(cherry picked from commit e8810c555d4e6ba56e6c798e04216aa11efb5304)

26a3c8cf

Nov 07, 2024

upgrade: fix upgrading from schema 1.6.9 · 4e4d4bc1

Nikita Zheleztsov authored 4 months ago

This commit fixes some cases of upgrading schema from 1.6.9:

1. Fix updating empty password for users. In 1.6 credentials were an array
   in _user, in 1.7.5 they became a map.

2. Automatically update the format of user spaces. The format of system
   spaces has been properly fixed during upgrade to 1.7.5. However,
   commit 519bc82e ("Parse and validate space formats") introduced
   strict checking of the format field in 1.7.6. So, the format of user
   spaces should also be fixed.

Back in 1.6 days, it was allowed to write anything in space format.
This commit only fixes valid uses of format:
    {name = 'a', type = 'number'}
    {'a', type = 'number'}
    {'a', 'num'}
    {'a'}

Invalid use of format (e.g. {{}} or {{5, 'number'}}) will cause an error
anyway. The user has to fix the format on the old version and only after
that start a new one.

This commit also introduces a test that checks that we can
properly upgrade from 1.6.9 to the latest versions, at least in basic
cases.

Closes #10180

NO_DOC=bugfix

(cherry picked from commit f69e2ae488b3620e31f1a599d8fb78a66917dbfd)

4e4d4bc1

Nov 01, 2024

memtx: fix use-after-free on background index build · a4456c10

Andrey Saranchin authored 5 months ago

When building an index in background, we create on_rollback triggers for
tuples inserted concurrently. The problem here is that an on_rollback trigger
has a lifetime independent of `index` and `memtx_ddl_state` - it can be
called after the index was built (and `memtx_ddl_state` is destroyed)
and even after the index was altered. So, in order to avoid
use-after-free in on_rollback trigger, let's drop all on_rollback
triggers when the DDL is over. It's OK because all owners of triggers
are already prepared, hence, in WAL or replication queue (since we
build indexes in background only without MVCC so the transactions cannot
yield), so if they are rolled back, the same will happen to the DDL.

In order to delete on_rollback triggers, we should collect them into a
list in `memtx_ddl_state`. On the other hand, when the DML statement is
over (committed or rolled back), we should delete its trigger from the
list to prevent use-after-free. That's why the commit adds the on_commit
trigger to background build process.

Closes #10620

NO_DOC=bugfix

(cherry picked from commit d8d82dba4c884c3a7ad825bd3452d35627c7dbf4)

a4456c10

test: use `justrun` module from luatest · f0bcd78b

Yaroslav Lobankov authored 7 months ago

NO_DOC=test
NO_TEST=test
NO_CHANGELOG=test

(cherry picked from commit 90d197ded13d49dfc405ff80bbd183b2e260dc56)

f0bcd78b

test: use `treegen` module from luatest · e8b0dcf6

Yaroslav Lobankov authored 7 months ago

Also, adapt tests and helpers in accordance with the module interface.

NO_DOC=test
NO_TEST=test
NO_CHANGELOG=test

(cherry picked from commit bd27df009c403e89c003d5b66763c0f0bbf08440)

e8b0dcf6

test: fix several more tests after the luatest bump · 9dfd78d8

Serge Petrenko authored 8 months ago

The test gh_10088 was committed in parallel with the luatest bump and
thus was missed by the post-bump tests fixup in commit cfd4bf46
("test: adapt tests to the new luatest version"). Fix it now.

Also, tests gh_6539 and gh_7231 queried `box.cfg.log` incorrectly. This
didn't make them fail; they just stopped testing what they were supposed
to. Fix them as well.

NO_CHANGELOG=test
NO_TEST=test
NO_DOC=test

(cherry picked from commit 2a18de391895d8ec7a39e3d3dcee659fe79f7bc9)

9dfd78d8

test: adapt tests to the new luatest version · 3d80ea0e

Oleg Chaplashkin authored 8 months ago

With the new version of Luatest you have to be careful with the server
log file. We used to get it very simply:

    box.cfg.log

Now it is more correct to use the following approach:

    rawget(_G, 'box_cfg_log_file') or box.cfg.log

Closes tarantool/test-run#439

NO_DOC=test
NO_TEST=test
NO_CHANGELOG=test

(cherry picked from commit cfd4bf46)

3d80ea0e

Oct 31, 2024

luafun: bump submodule · 58f3e1df

Andrey Saranchin authored 7 months ago

The commit bumps luafun to the new version with a bunch of bugfixes:
* Now `chain` works correctly with iterators without `param`.
* Now `drop_while` supports stateful iterators.
* The missing `maximum_by` alias of `max_by` is added to the module.
* Now `nth` and `length` work correctly with other luafun iterators.

Since our index iterators are stateful (can return different values
with the same `state` passed), the old `drop_while` implementation
didn't work well with them - it was skipping an extra element.
The bump resolves this issue.
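
One plausible way such a skip arises can be shown with a toy model of the
`gen(param, state)` iterator protocol. This is an illustrative sketch, not
luafun's code: all names below are hypothetical, and the "rewind" variant
is only an assumption about how re-reading from a saved `state` goes wrong
with a stateful iterator.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of the Lua iterator protocol: gen(param, state) yields a value
 * and returns the next state, or -1 when the sequence is exhausted.
 */
typedef int (*gen_f)(void *param, int state, int *value);

/* All values are positive, so -1 can double as "nothing left". */
static const int DATA[] = {1, 2, 3, 4, 5};
enum { DATA_LEN = 5 };

/* Stateless: the result depends only on `state`. */
static int
stateless_gen(void *param, int state, int *value)
{
    (void)param;
    if (state >= DATA_LEN)
        return -1;
    *value = DATA[state];
    return state + 1;
}

/* Stateful: a hidden cursor in `param` advances on every call. */
static int
stateful_gen(void *param, int state, int *value)
{
    int *cursor = param;
    (void)state; /* ignored, as with an index iterator */
    if (*cursor >= DATA_LEN)
        return -1;
    *value = DATA[(*cursor)++];
    return 0;
}

static int
less_than_three(int v)
{
    return v < 3;
}

/*
 * Rewinding variant: remember the state before the first rejected element
 * and call gen() with it again. Returns the first value after the dropped
 * prefix, or -1 if nothing is left.
 */
static int
first_after_drop_rewind(gen_f gen, void *param, int (*pred)(int))
{
    int state = 0, prev, value;
    do {
        prev = state;
        state = gen(param, state, &value);
        if (state < 0)
            return -1;
    } while (pred(value));
    /* Re-read from the saved state: only valid for a stateless gen(). */
    if (gen(param, prev, &value) < 0)
        return -1;
    return value;
}

/* Stateful-safe variant: keep the value that was already fetched. */
static int
first_after_drop_keep(gen_f gen, void *param, int (*pred)(int))
{
    int state = 0, value;
    do {
        state = gen(param, state, &value);
        if (state < 0)
            return -1;
    } while (pred(value));
    return value;
}
```

With the stateless generator both variants return 3. With the stateful one
the rewinding variant returns 4, because the second `gen()` call advances
the cursor past the element that was already read.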

Note that there are still methods that don't work correctly with
`index:pairs` - `cycle`, `head` and `is_null`.

Closes #6403

NO_DOC=bugfix

(cherry picked from commit ec758869f8364624efaff58bdd4ebc7c133ede0a)

58f3e1df

Oct 18, 2024

memtx: always read prepared tuples of system spaces · 7f0b2bee

Andrey Saranchin authored 6 months ago

Since we often search spaces, users, funcs and so on in internal caches
that have the `read-committed` isolation level (prepared tuples are
seen), let's always allow reading prepared tuples of system spaces.

Another advantage of this approach is that we never handle MVCC when
working with system spaces, so after the commit they will behave in the
same way: prepared tuples will be seen. The only difference is that
readers of prepared rows will be aborted if the row is rolled back.

By the way, the inconsistency between internal caches and system spaces
could lead to a crash in some sophisticated scenarios. The commit fixes
this problem as well because now system spaces and internal caches are
synchronized.

Closes #10262
Closes tarantool/security#131

NO_DOC=bugfix

(cherry picked from commit b33f17b25de6bcbe3ebc236250976e4a0250e75e)

7f0b2bee

alter: wait for previous alters to commit on DDL · ea1c829f

Andrey Saranchin authored 5 months ago

Yielding DDL operations acquire the DDL lock so that the space cannot be
modified under their feet. However, there is a case when it actually can:
if a yielding DDL starts while another DDL is being committed and the
latter gets rolled back due to a WAL error, the `struct space` created by
the rolled-back DDL is deleted, and it's the space being altered by the
yielding DDL. In order to fix this problem, let's simply wait for all
previous alters to be committed.

We could use `wal_sync` to wait for all previous transactions to be
committed, but it is more complicated: we need to use `wal_sync` for a
single instance and `txn_limbo_wait_last_txn` when the limbo queue has
an owner. This approach has more pitfalls and requires more tests to
cover all cases. When relying on `struct alter_space` directly, all
situations are handled with the same logic.

Alternative solutions that we have tried:
1. Throw an error when the user tries to alter a space while there is
   another non-committed alter. This approach breaks the applier since
   it applies rows asynchronously. Making the applier execute operations
   synchronously breaks it even more.
2. Do not use the space in the `build_index` and `check_format` methods.
   In this case, there is another problem: rollback order. We have to
   roll back previous alters first, and the in-progress one can be rolled
   back only after it's over. It breaks a fundamental memtx invariant:
   rollback order must be the reverse of replace order. We could try to
   use `before_replace` triggers for alter, but the patch would be bulky.

Closes #10235

NO_DOC=bugfix

(cherry picked from commit fee8c5dd6b16471739ed8512ba4137ff2e7274aa)

ea1c829f