Commits · 8b10902dac66cca6350f170276ffe24dac9a7dc4 · core / tarantool

Jun 02, 2023

test: do not run box.cfg{} on test runner for replication tests · 8b10902d

Serge Petrenko authored 1 year ago

Some replication tests (linearizable_test.lua and
bootstrap_strategy_test.lua) used default test-runner to test box.cfg{}
calls which are expected to fail.

Since box.cfg{} is going to be prohibited on default test runner, let's
move such test cases into properly initialized servers.

NO_DOC=test
NO_CHANGELOG=test

8b10902d

test: ban direct calling of box.cfg() · fc3426d8

Oleg Chaplashkin authored 1 year ago

Direct call and configuration of the runner instance is prohibited. Now
if you need to test something with specific configuration use a server
instance please (see luatest.Server module).

In-scope-of tarantool/luatest#245

NO_DOC=ban calling box.cfg
NO_TEST=ban calling box.cfg
NO_CHANGELOG=ban calling box.cfg

fc3426d8

Jun 01, 2023

net.box: resolve IPROTO feature names using box.iproto.feature · 606e50c4

Vladimir Davydov authored 1 year ago

Drop the IPROTO_FEATURE_NAMES table and use box.iproto.feature in
iproto_feature_resolve so that we don't have to update it manually every
time we add a new feature.

Follow-up #8443
Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
automatically")

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

606e50c4

iproto: replace iproto_constant with string arrays · 75632133

Vladimir Davydov authored 1 year ago

There are no substantial gaps in the remaining IPROTO constant enums so
there's no need for the iproto_constant struct. Instead we can generate
string arrays, as we usually do. This is more flexible because it allows
us to look up a name by code. It's also consistent with iproto_type and
iproto_key names.

The only tricky part here is the iproto_flag enum because it contains
bit masks. To generate names for the flags, we add the auxiliary enum
iproto_flag_bit that contains bit numbers.
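
The flag trick above can be sketched with an X-macro. This is only an
illustration under assumptions: the `DEMO_*` names, the flag list and the
`demo_flag_name()` helper are hypothetical and are not taken from the
Tarantool sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag list: each entry is (name, bit number). */
#define DEMO_FLAGS(_) \
	_(READ, 0) \
	_(WRITE, 1) \
	_(SYNC, 2)

/* Auxiliary enum with dense bit numbers, usable as array indexes. */
#define DEMO_FLAG_BIT_MEMBER(name, bit) DEMO_FLAG_BIT_##name = (bit),
enum demo_flag_bit {
	DEMO_FLAGS(DEMO_FLAG_BIT_MEMBER)
	demo_flag_bit_MAX
};

/* Main enum: bit masks derived from the same list. */
#define DEMO_FLAG_MEMBER(name, bit) DEMO_FLAG_##name = 1 << (bit),
enum demo_flag {
	DEMO_FLAGS(DEMO_FLAG_MEMBER)
};

static_assert(demo_flag_bit_MAX <= 32, "flags must fit into unsigned int");

/* Generated name table, indexed by bit number. */
#define DEMO_FLAG_STR(name, bit) [bit] = #name,
static const char *const demo_flag_bit_strs[demo_flag_bit_MAX] = {
	DEMO_FLAGS(DEMO_FLAG_STR)
};

/* Return the name of the lowest known flag set in mask, or NULL. */
const char *
demo_flag_name(unsigned mask)
{
	for (int bit = 0; bit < demo_flag_bit_MAX; bit++) {
		if (mask & (1u << bit))
			return demo_flag_bit_strs[bit];
	}
	return NULL;
}
```

Adding a flag to the list updates both enums and the name table at once,
which is the point of generating them from a single definition.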

Follow-up #8443
Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
automatically")

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

75632133

iproto: generate iproto_type_strs from IPROTO_TYPES · 42dc000e

Vladimir Davydov authored 1 year ago

Currently, we fill iproto_type_strs only for command codes exported to
box.stat while for the rest of command codes we have a switch-case in
the iproto_type_name function. This is ugly and error-prone because we
can easily forget to update iproto_type_name when we add a new command
code. Let's generate iproto_type_strs automatically just like we
generate iproto_key_strs.

There are a few things that should be noted here:
 - We don't generate strings for IPROTO_TYPE_ERROR and IPROTO_UNKNOWN
   because the former has a big code while the latter has a negative
   code. The only place where we need the strings is exporting IPROTO
   constants to Lua so now we just export these special codes explicitly
   there.
 - We don't generate strings for IPROTO codes reserved for vinyl because
   they aren't exported to Lua and use a different naming convention.
   As before, we have a switch-case in iproto_type_name for them.
 - We remove IPROTO_RESERVED_TYPE_STAT_MAX because it isn't a reserved
   code. Instead we define IPROTO_TYPE_STAT_MAX explicitly in the
   iproto_type enum as IPROTO_ROLLBACK + 1. This allows us to remove
   the condition that skips "RESERVED" constants from the code that
   exports IPROTO constants to Lua.
 - Before this change iproto_type_strs didn't have names for OK,
   CALL_16, and NOP, because they aren't shown in box.stat. After this
   change the names are present so we have to filter out the stat items
   explicitly in the rmean_foreach callback.
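
A minimal sketch of generating a name table from a single list of command
codes. The `DEMO_*` identifiers and the codes are hypothetical. Handling the
two special codes inside the lookup function is a simplification for this
example, not a claim about how the patch does it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command list: (name, code). Small gaps are allowed. */
#define DEMO_TYPES(_) \
	_(OK, 0) \
	_(SELECT, 1) \
	_(INSERT, 2) \
	_(PING, 5)

#define DEMO_TYPE_MEMBER(name, code) DEMO_TYPE_##name = (code),
enum demo_type {
	DEMO_TYPES(DEMO_TYPE_MEMBER)
	demo_type_MAX,
	/* Special codes are kept out of the generated table. */
	DEMO_TYPE_ERROR = 1 << 15,
	DEMO_TYPE_UNKNOWN = -1
};

static_assert(demo_type_MAX < DEMO_TYPE_ERROR,
	      "special codes must stay outside the table");

/* Generated table; entries for gaps stay NULL. */
#define DEMO_TYPE_STR(name, code) [code] = #name,
static const char *const demo_type_strs[demo_type_MAX] = {
	DEMO_TYPES(DEMO_TYPE_STR)
};

/* Name by code, or NULL if the code has no name. */
const char *
demo_type_name(int type)
{
	if (type == DEMO_TYPE_ERROR)
		return "TYPE_ERROR";
	if (type == DEMO_TYPE_UNKNOWN)
		return "UNKNOWN";
	if (type < 0 || type >= demo_type_MAX)
		return NULL;
	return demo_type_strs[type];
}

/* Code by name, or DEMO_TYPE_UNKNOWN if the name is not in the table. */
int
demo_type_by_name(const char *name)
{
	for (int code = 0; code < demo_type_MAX; code++) {
		if (demo_type_strs[code] != NULL &&
		    strcmp(demo_type_strs[code], name) == 0)
			return code;
	}
	return DEMO_TYPE_UNKNOWN;
}
```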

Generating iproto_type_strs makes iproto_type_constants useless so we
drop it in the scope of this patch and start using iproto_type_strs
to populate box.iproto.type.

Follow-up #8443
Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
automatically")

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

42dc000e

iproto: generate strings for vinyl constants · 96599a5b

Vladimir Davydov authored 1 year ago

Currently, we fill vy_page_info_key_strs, vy_run_info_key_strs, and
vy_row_index_key_strs manually, which is inconvenient and error-prone.
Let's generate them automatically from enum member names, like we do
for IPROTO keys.

Note, we have to rename VY_RUN_INFO_BLOOM and VY_RUN_INFO_BLOOM_LEGACY
to VY_RUN_INFO_BLOOM_FILTER and VY_RUN_INFO_BLOOM_FILTER_LEGACY to
preserve the xlog reader output.

Still, the result isn't exactly the same:
 - An underscore is used instead of a space.
 - Strings are upper case now, not lower case, as they used to be.
 - VY_ROW_INDEX_DATA is now translated to "data", not "row index".

The key names are used for two purposes:
 - For reporting ER_INVALID_INDEX_FILE error in vy_run.c. The changes
   enumerated above don't really matter there.
 - In the xlog reader. We replace spaces with underscores anyway there
   and convert the names to the lower case so the only problem is that
   "row_index" is replaced with "data" in xlog reader output. This
   should be fine though because (a) from the context it's clear that
   the data belong to a row index section, (b) reading vinyl index files
   is only useful for debugging and introspection, and (c) the field is
   a part of vinyl internals and was never documented properly.

After this change we can remove the code replacing spaces with
underscores from the xlog reader because all IPROTO constant names
now use underscores.

Follow-up #8443
Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
automatically")

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

96599a5b

iproto: generate iproto_key_strs from IPROTO_KEYS · 99e8abe2

Vladimir Davydov authored 1 year ago

Currently, we fill iproto_key_strs manually, which is inconvenient and
error-prone. Let's generate it automatically from enum member names.

The result isn't exactly the same:
 - An underscore is used instead of a space.
 - Strings are upper case now, not lower case, as they used to be.
 - IPROTO_REQUEST_TYPE is now translated to "REQUEST_TYPE", not "type".
 - IPROTO_OPS is now translated to "OPS", not "operations".

The key names are used for two purposes:
 - For reporting ER_MISSING_REQUEST_FIELD error while decoding a packet
   in xrow.c. The changes enumerated above don't really matter there.
 - In the xlog reader. Here we do need some workarounds. First, we have
   to convert the names to the lower case. Second, we have to use "type"
   and "operations" instead of generated names for IPROTO_REQUEST_TYPE
   and IPROTO_OPS. Spaces are already translated to underscores so we
   don't need to do anything about it.
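
The xlog reader workaround described above could look roughly like the
sketch below. The function name `demo_xlog_key_name` and its signature are
assumptions made for this example; only the two overrides and the lower-case
conversion come from the commit message.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Write the xlog-style name for a generated key name into buf.
 * Returns 0 on success, -1 if buf is too small.
 */
int
demo_xlog_key_name(const char *generated, char *buf, size_t size)
{
	assert(generated != NULL && buf != NULL);
	/* Two keys keep their historical names in the reader output. */
	if (strcmp(generated, "REQUEST_TYPE") == 0)
		generated = "type";
	else if (strcmp(generated, "OPS") == 0)
		generated = "operations";
	size_t len = strlen(generated);
	if (len + 1 > size)
		return -1;
	/* Generated names are upper case, the reader prints lower case. */
	for (size_t i = 0; i < len; i++)
		buf[i] = (char)tolower((unsigned char)generated[i]);
	buf[len] = '\0';
	return 0;
}
```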

Generating iproto_key_strs makes iproto_key_constants useless so we
drop it in the scope of this patch and start using iproto_key_strs to
populate box.iproto.key.

Follow-up #8443
Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
automatically")

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

99e8abe2

iproto: generate iproto_key_type from IPROTO_KEYS · 26b9cc86

Vladimir Davydov authored 1 year ago

Let's merge the key value type information into IPROTO_KEYS to keep them
close together.

Follow-up #8443
Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
automatically")

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

26b9cc86

iproto: strip prefixes from generated constant strings · 37dd09af

Vladimir Davydov authored 1 year ago

There's no need to add prefixes to generated iproto constant strings
(like IPROTO_, IPROTO_FEATURE_, etc) because we strip them anyway when
exporting constants to Lua. Let's drop the prefixes to cleanup the code.
Note that enum constants themselves still have the prefixes to avoid
name clashes.

Follow-up #8443
Follow-up commit b3fb883b ("iproto: export IPROTO constants to Lua
automatically")

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

37dd09af

May 31, 2023

replication: fix crash on access to a not yet ready relay · 0ef5e3b2

Serge Petrenko authored 1 year ago

All the code outside of relay.cc judges whether a relay is alive by
looking only at the relay state. When relay->state is RELAY_FOLLOW, the
relay is considered operational.

This is not always true: for example, both relay_push_raft() and
relay_trigger_vclock_sync() are only possible after relay thread pairs
with tx via the cbus. This happens **after** the relay enters
RELAY_FOLLOW state.

Fix the possible access to uninitialized cpipe by
relay_trigger_vclock_sync(): make it a nop until the relay is paired
with tx.
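
A toy model of the guard, assuming a hypothetical `demo_relay` structure
with an explicit "paired with tx" flag. The real relay uses cbus and a
cpipe, which are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_relay_state {
	DEMO_RELAY_OFF,
	DEMO_RELAY_FOLLOW,
	DEMO_RELAY_STOPPED
};

struct demo_relay {
	enum demo_relay_state state;
	/* Becomes true only after the relay thread has paired with tx. */
	bool is_paired_with_tx;
	/* Counts messages pushed through the imaginary pipe. */
	int pushed_count;
};

/* Returns true if a sync request was pushed, false if it was skipped. */
bool
demo_relay_trigger_vclock_sync(struct demo_relay *relay)
{
	assert(relay != NULL);
	/* RELAY_FOLLOW alone is not enough: the pipe may not exist yet. */
	if (relay->state != DEMO_RELAY_FOLLOW || !relay->is_paired_with_tx)
		return false;
	relay->pushed_count++;
	return true;
}
```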

Closes #7991

NO_DOC=bugfix
NO_TEST=covered by replication-luatest/linearizable_test.lua

0ef5e3b2

relay: refactor is_raft_enabled flag · b787f328

Serge Petrenko authored 1 year ago

Relay had an is_raft_enabled member with mixed meaning: firstly, it was set
to true only when relay was ready to accept messages via cbus, and
secondly, it was set to true only for replicas which need raft updates
(newer than Tarantool 2.6.0 and not anonymous).

Let's better use the flag only as an indication that the relay is ready
to accept cbus pushes, and check whether the relay needs raft updates
separately.

The flag will be reused in the following commit, which will make tx
check that relay is connected prior to sending a message to it.

Prerequisite #7991

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

b787f328

box: allow to set *_uuid options to NULL · 3aa029b3

Mergen Imeev authored 1 year ago

This patch allows to set replicaset_uuid and instance_uuid to box.NULL.
This fixes the issue described in #8714, however it introduces another
change in behavior - we can now set these parameters to NULL even if
they weren't NULL before. However, since we still cannot set a different
uuid after setting the parameters to NULL, and we can still set the old
uuid for them, this behavior is considered acceptable.

Closes #8714

NO_DOC=bugfix
NO_CHANGELOG=the bug was not released

3aa029b3

test: disable Lua JIT in app-luatest/http_client_test · 53c94bc7

Vladimir Davydov authored 1 year ago

We'll enable it when #8718 is fixed.

NO_DOC=test
NO_CHANGELOG=test

53c94bc7

May 29, 2023

raft: fix spurious split-vote · 2afde5b1

Serge Petrenko authored 1 year ago

Due to a typo, a raft candidate counted a vote for another node as a vote
for self in its split-vote detector. This could lead to spurious
split-vote detection in cases when another node wins elections with a bare
minimum of votes for it (exactly a quorum of votes).
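
To see why attributing a single vote to the wrong node matters, here is a
simplified model of a split-vote check. It assumes the vote is split when
even the best candidate cannot reach the quorum with all remaining votes.
The function and its parameters are hypothetical and are not the actual
raft code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * votes[i] is the number of votes known for candidate i.
 * Returns true if no candidate can reach the quorum anymore.
 */
bool
demo_is_split_vote(const int *votes, int candidate_count,
		   int cluster_size, int quorum)
{
	assert(votes != NULL && quorum > 0);
	int voted = 0;
	int max_votes = 0;
	for (int i = 0; i < candidate_count; i++) {
		voted += votes[i];
		if (votes[i] > max_votes)
			max_votes = votes[i];
	}
	/* Nodes that have not voted yet could still join the leader. */
	int unvoted = cluster_size - voted;
	return max_votes + unvoted < quorum;
}
```

With 5 nodes and a quorum of 3, the counts {1, 3, 1} mean the second
candidate has won. If one of its votes is credited to the first candidate,
the counts become {2, 2, 1} and the same check reports a split vote.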

Closes #8698

NO_DOC=bugfix

2afde5b1

raft: make promote bump term and vote at once · 17371215

Serge Petrenko authored 1 year ago

box.ctl.promote() was implemented as follows: an instance bumps the
term and marks itself a candidate, but doesn't vote for self
immediately. Instead it relies on the machinery which makes a candidate
vote for self as soon as it persists a new term.

This differs from a normal election start due to leader timeout: there
term and vote are bumped at once.

Besides, this increases the probability of box.ctl.promote() resulting in
another node getting elected: if a node first broadcasts a term without a
vote, it is not considered a candidate, so other candidates might start
elections and vote for themselves.

Let's bring promote into line with automatic elections.

Closes #8497

NO_DOC=bugfix

17371215

raft: persist vote for self together with term bump · 8a124e50

Serge Petrenko authored 1 year ago

Commit c9155ac8 ("raft: persist new term and vote separately") made
the nodes persist new term and vote separately, using 2 WAL writes.
Writing the term first is needed to flush all the ongoing transactions,
so that the node's vclock is updated and can be checked against the
candidate's vclock. Otherwise it could happen that the node persists a
vote for some candidate only to find that its vclock would actually
become incomparable with the candidate's.

Actually, this guard is not needed when checking a vote for self,
because a node can always vote for self. Besides, splitting term bump
and vote can lead to increased probability of split-vote. It may happen
that a candidate bumps and broadcasts the new term without a vote,
making other nodes vote for self. Let's go back to writing term and vote
together for self votes.

This change makes raft candidate persist term bump and vote for self in
one WAL write instead of two, so all the tests which count WAL writes or
expect 2 separate state updates for term and vote are rewritten.

Prerequisite #8497

NO_DOC=not user-visible
NO_CHANGELOG=not user-visible

8a124e50

May 26, 2023

changelog: mark changelog entry for gh-7149 as breaking · 3b244fc4

Vladimir Davydov authored 1 year ago

Fixes commit 97c2c9a4 ("box: disable DDL with old schema").
Follow-up #7149

NO_DOC=changelog
NO_TEST=changelog

3b244fc4

box: disable DDL with old schema · 97c2c9a4

Vladimir Davydov authored 1 year ago

** Implementation details **

We disable DDL by patching the existing on_replace_dd_system_space
trigger callback installed for each system space so that now it raises
an error in case the current schema version is less than the most
recent one known to this build. Since to perform a schema upgrade
we need to execute DDL, we suppress the error for the fiber that is
currently running a schema upgrade. To achieve that, the upgrade script
calls box_schema_upgrade_begin and box_schema_upgrade_end before
starting and after completing a schema upgrade. The functions keep track
of the fiber that is currently running a schema upgrade so that we can
allow all DDL operations for it. We also allow DDL during recovery so
that we can replay DDL statements written to the WAL.
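
The rule above can be modelled in a few lines. The structures and the
`demo_*` functions below are hypothetical stand-ins for the trigger and for
box_schema_upgrade_begin/box_schema_upgrade_end. Schema versions are reduced
to plain integers for the sake of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_fiber {
	int id;
};

struct demo_schema_guard {
	int current_version;
	int latest_version;
	bool is_recovering;
	/* Fiber that is running a schema upgrade, NULL if none. */
	const struct demo_fiber *upgrade_fiber;
};

void
demo_schema_upgrade_begin(struct demo_schema_guard *guard,
			  const struct demo_fiber *fiber)
{
	assert(guard->upgrade_fiber == NULL);
	guard->upgrade_fiber = fiber;
}

void
demo_schema_upgrade_end(struct demo_schema_guard *guard)
{
	guard->upgrade_fiber = NULL;
}

/* Check performed before a DDL operation on a system space. */
bool
demo_ddl_is_allowed(const struct demo_schema_guard *guard,
		    const struct demo_fiber *caller)
{
	if (guard->current_version >= guard->latest_version)
		return true;
	if (guard->is_recovering)
		return true; /* replaying DDL from the WAL */
	return guard->upgrade_fiber != NULL &&
	       guard->upgrade_fiber == caller;
}
```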

Since there may be a bug in the `box.schema.upgrade` implementation,
we export `box.internal.run_schema_upgrade`, which runs the given
function as a schema upgrade script (allowing DDL). The user may use
this function to recover after a schema upgrade failure.

** Note about the tests **

A test server instance started by luatest grants permissions to the
guest user so that luatest can execute commands on it. It means that if
a test uses a generated snap file committed to the repository for a test
server instance, it will fail because granting permissions is a DDL
operation. To prevent this, we have to regenerate snap files so that
they contain all required permissions. This works because a test server
instance grants permissions with the `if_not_exists` flag.

The problem is that it isn't easy to regenerate the snap files for the
following tests because there's no generator script:
 - `test/box-luatest/gh_6794_recover_nonmatching_xlogs_test.lua`
 - `test/box-luatest/gh_7974_force_recovery_bugs_test.lua`

So we temporarily disable these tests and file tickets to fix them.

Other notes:
 - We drop `test/box-luatest/upgrade/2.9.1` and make the test using it
   use `test/box-luatest/upgrade/2.10.0` instead. We do this because
   2.9.1 was never released and the earliest Tarantool version using the
   2.9.1 schema version is 2.10.0. This shouldn't affect the test
   anyhow.
 - We drop the part of the `user_auth_history_last_modified_upgrade`
   test that checks that creating users/roles with an old schema works
   fine because this is forbidden now.
 - We wrap the code that creates a space with an old schema in the
   downgrade test in `box.internal.run_schema_upgrade`. Even though it's
   unsupported now, we still need to check that space creation works
   after a downgrade.

Closes #7149

@TarantoolBot document
Title: Document that DDL is disabled with an old system schema

Executing DDL operations with an old (not upgraded) system schema is
dangerous and might result in unexpected breakages. So we decided to
explicitly forbid all DDL operations with an old system schema until
`box.schema.upgrade()` is called.  Note, one can still call `box.schema`
functions with an old schema provided they do nothing, for example, if
an object is created with the `if_not_exists` flag and the object with
same id already exists:

```lua
box.schema.create_space('test', {if_not_exists = true})
```

Otherwise an attempt to create a space with an old schema will raise
an error like shown below:

```yaml
tarantool> box.schema.space.create('test')
---
- error: Your schema version is 1.6.8 while Tarantool
    3.0.0-entrypoint-262-g3eaba1cef686 requires a more recent
    schema version. Please, consider using box.schema.upgrade().
...
```

97c2c9a4

box: disallow to drop system spaces · 8ae45007

Magomed Kostoev authored 1 year ago

The patch adds a new check to the _space on_replace trigger failing
on attempt to drop a system table.

Closes #5279

NO_DOC=bugfix

8ae45007

May 25, 2023

changelog: proofread some luajit changelogs · e64568b2

Yaroslav Lobankov authored 1 year ago

NO_DOC=changelog update
NO_TEST=changelog update

e64568b2

metrics: bump to new version · 8bbc73ce

Yaroslav Lobankov authored 1 year ago

Bump the metrics submodule to 1.0.0 version.

NO_DOC=submodule bump
NO_TEST=submodule bump
NO_CHANGELOG=submodule bump

8bbc73ce

test: bump test-run to new version · 252865ce

Yaroslav Lobankov authored 1 year ago

Bump test-run to new version with the following improvements:

- lib: propagate test status 'skip' [1]
- Show overall progress while running [2]
- Follow test timeout for luatest [3]
- Run luatest test by pattern [4]
- Refactor command to run luatest test [5]
- Bump luatest to 0.5.7-39-g89da427 [6]
- consistent mode: fix worker's vardir calculation [7]

[1] tarantool/test-run@6fbb7fd
[2] tarantool/test-run@c5fa909
[3] tarantool/test-run@f67d523
[4] tarantool/test-run@264af05
[5] tarantool/test-run@e19bb11
[6] tarantool/test-run@3e74192
[7] tarantool/test-run@aac77f5

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff

252865ce

box: cleanup on tuple encoding failure · 9f9142d6

Nikolay Shirokovskiy authored 1 year ago

Currently on tuple encoding failure we raise a Lua error. In many places
the error is not handled in Lua C code and we get miscellaneous leaks.
Let's instead pass the error as a return value.

Note that, generally speaking, encoding code can raise an error on OOM,
which will lead to a leak again. Hopefully the application will be killed
by the OOM killer instead. Other than that we expect no more errors in the
code. If the code calls a user-defined callback then pcall is used (see
lua_field_inspect_ucdata for example). So the turn from raising errors
to returning an error code seems the right direction.

Closes #7939

NO_DOC=bugfix

9f9142d6

small: bump version · 45c9a096

Nikolay Shirokovskiy authored 1 year ago

This will bring new ibuf_truncate method.

Part of #7939

NO_TEST=internal
NO_CHANGELOG=internal
NO_DOC=internal

45c9a096

May 24, 2023

luajit: bump new version · cde911d0

Igor Munkin authored 1 year ago

* Fix IR_RENAME snapshot number. Follow-up fix for a32aeadc.
* OSX: Disable unreliable assertion for external frame unwinding.
* Disable unreliable assertion for external frame unwinding.
* Handle on-trace OOM errors from helper functions.
* LJ_GC64: Make ASMREF_L references 64 bit.
* lldb: introduce luajit-lldb
* x64/LJ_GC64: Fix emit_rma().
* Limit path length passed to C library loader.

Closes #7745
Part of #4808
Part of #8069
Part of #8516

NO_DOC=LuaJIT submodule bump
NO_TEST=LuaJIT submodule bump

cde911d0

box: fix unique violation in functional index with nullable parts · 6bcd51f9

Ilya Verbin authored 1 year ago

Currently is_nullable property of a functional index part disables the
unique property of the index. The bug is in func_index_compare(), which
compares functional keys first, and if they are equal it compares the
primary keys. This behaviour is correct only when some part of the key
is NULL (and for non-unique indexes), but for now the primary keys are
compared unconditionally. Fix this by checking for NULL key parts.
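
A single-part model of the corrected comparison, assuming hypothetical
`demo_entry` records with an integer functional key and an integer primary
key. The real func_index_compare() works on multi-part keys, so this only
illustrates the rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_entry {
	bool key_is_null;
	int key;
	int primary_key;
};

/* Three-way comparison of two index entries. */
int
demo_func_index_compare(const struct demo_entry *a,
			const struct demo_entry *b, bool is_unique)
{
	assert(a != NULL && b != NULL);
	/* In this model NULL keys sort before any other key. */
	if (a->key_is_null != b->key_is_null)
		return a->key_is_null ? -1 : 1;
	if (!a->key_is_null && a->key != b->key)
		return a->key < b->key ? -1 : 1;
	/*
	 * The functional keys are equal. In a unique index two equal
	 * non-NULL keys must be reported as duplicates.
	 */
	if (is_unique && !a->key_is_null)
		return 0;
	/* Non-unique index or NULL key: order by the primary key. */
	return (a->primary_key > b->primary_key) -
	       (a->primary_key < b->primary_key);
}
```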

Closes #8587

NO_DOC=bugfix

6bcd51f9

May 23, 2023

sql: check printf() for failure · 13159230

Mergen Imeev authored 1 year ago

This patch adds a check that sqlXPrintf() does not fail in the built-in
SQL function printf(). There are two possible problems: the result might
get too large, or there might be an integer overflow because internally
int values are converted to size_t.

Closes #tarantool/security#122

NO_DOC=bugfix

13159230

sql: assert in xferOptimization() · 039f714d

Mergen Imeev authored 1 year ago

This patch fixes problems with INSERT INTO ... SELECT FROM optimization.
These problems appeared after 6b8acd8f, where the check became redundant,
but was not updated. Two problems arose:
1) an assertion or segmentation fault when optimization was used and the
source space does not have an index;
2) optimization can be used even if the indexes are incompatible.

The second problem does not result in changes that are user-visible, so
there is no test.

Closes #8661

NO_DOC=bugfix

039f714d

May 22, 2023

box: add hostname to box.info · adb14c06

Gleb Kashkin authored 1 year ago

Hostname is a useful piece of information in state reports. So it was
decided to add it to box.info.

Hostname is obtained on request and is not cached.

Closes #8605

@TarantoolBot document
Title: Add hostname to box.info

This patch adds hostname to box.info, it can be useful e.g. to supplement
various instance state reports. It is not cached and is requested on
each call.

adb14c06

sql: fix invalid negation · 088b32f3

Eli Kobrin authored 1 year ago

The invalid negation occurred because the check did not cover the case
when the value is equal to INT64_MIN, so INT64_MIN was negated, which
yielded INT64_MIN itself. This must be fixed because negating INT64_MIN
is undefined behavior.

It is fixed by the explicit check for the value variable.
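
A minimal sketch of such an explicit check. The helper name
`demo_int64_negate` and its error convention are assumptions made for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store -value in *result.
 * Returns 0 on success, -1 if the result does not fit into int64_t.
 */
int
demo_int64_negate(int64_t value, int64_t *result)
{
	assert(result != NULL);
	/* -INT64_MIN is not representable: refuse instead of negating. */
	if (value == INT64_MIN)
		return -1;
	*result = -value;
	return 0;
}
```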

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=refactoring

088b32f3

box: support space and index names in IPROTO requests · b9550f19

Georgiy Lebedev authored 1 year ago

Add support for accepting IPROTO requests with space or index name instead
of identifier (name is preferred over identifier to disambiguate missing
identifiers from zero identifiers): mark space identifier request
key as present upon encountering space name, and delay resolution of
identifier until request gets to transaction thread.

Add support for sending DML requests from net.box connection objects with
disabled schema fetching by manually specifying space or index name or
identifier: when schema fetching is disabled, the space and index tables of
connections return wrapper tables that store necessary context (space or
index name or identifier, determined by type, connection object and space
for indexes) for performing requests. The space and index tables cache the
wrapper table they return.

Closes #8146

@TarantoolBot document
Title: Space and index name in IPROTO requests

Refer to design document for details:
https://www.notion.so/tarantool/Schemafull-IPROTO-cc315ad6bdd641dea66ad854992d8cbf?pvs=4#f4d4b3fa2b3646f1949319866428b6c0

b9550f19

box: add `space_by_name` and `space_index_by_name` for arbitrary strings · bf086dc9

Georgiy Lebedev authored 1 year ago

Change original `space_by_name` to `space_by_name0` and
`space_index_by_name` to `space_index_by_name0`, since they accept
NULL-terminated names, and add `space_by_name` and `space_index_by_name`
for arbitrary strings.
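
The split between the two flavours can be pictured as follows. The
`demo_space` table and both functions are hypothetical. The sketch only
shows a length-based lookup with a thin wrapper for NUL-terminated names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_space {
	const char *name;
	int id;
};

static struct demo_space demo_spaces[] = {
	{"users", 512},
	{"orders", 513},
};

/* Look up by pointer and length; the name need not be NUL-terminated. */
struct demo_space *
demo_space_by_name(const char *name, size_t len)
{
	assert(name != NULL);
	size_t count = sizeof(demo_spaces) / sizeof(demo_spaces[0]);
	for (size_t i = 0; i < count; i++) {
		const char *candidate = demo_spaces[i].name;
		if (strlen(candidate) == len &&
		    memcmp(candidate, name, len) == 0)
			return &demo_spaces[i];
	}
	return NULL;
}

/* Convenience wrapper for NUL-terminated names. */
struct demo_space *
demo_space_by_name0(const char *name)
{
	return demo_space_by_name(name, strlen(name));
}
```

The length-based variant is convenient when the name is a slice of a
larger buffer, for example a string decoded from a network packet.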

Needed for #8146

NO_CHANGELOG=refactoring
NO_DOC=refactoring
NO_TEST=refactoring

bf086dc9

May 19, 2023

replication: allow to re-register with new UUID · 4507c59d

Vladislav Shpilevoy authored 2 years ago

Previously it wasn't allowed to change instance UUID in _cluster.
When needed, it had to be done manually by deleting the instance
from _cluster and inserting it back with a new UUID. Or not to be
done at all.

Re-UUID (like re-name) was reported to be used when people didn't
want to register new replica IDs. They wanted to rejoin lost
replicas from scratch but keep the numeric ID. With UUID they
could deal by either setting it explicitly to the old value on a
new instance, or by doing the manual re-UUID like described above.

This commit is supposed to make things simpler. If a replica has a
name, then its re-join with another UUID is not an error. Its
record in _cluster is automatically updated to store the new UUID.

That is only possible if the old-UUID-instance is not connected
anymore and is not listed in replication cfg.

Closes #5029

@TarantoolBot document
Title: Instance rebootstrap with new UUID but same ID and name
If an instance has a non-empty instance name
(`box.cfg.instance_name`), then at rebootstrap it can keep the
name and its old numeric ID (space `_cluster['id']` field).

This might be needed if one doesn't want to pollute `_cluster`
with new rows, and for some reason doesn't want to or can't just
drop the rows belonging to the dead replicas.

In order for this to work 1) the rebootstrapping replica must keep
its old non-empty instance name, 2) the other instances should not
have any alive connections to the old dead replica. Ideally, the
old replica should be just deleted from `box.cfg.replication`
everywhere.

When that works, the old row in `_cluster` is automatically
updated with the new instance UUID.

4507c59d

replication: introduce instance name · 9e2d46f9

Vladislav Shpilevoy authored 2 years ago

The instance name is carried with instance UUID everywhere in the
replication protocols. It is visible in all other instances via
_cluster and is displayed in monitoring.

Part of #5029

@TarantoolBot document
Title: `box.cfg.instance_name` and `box.info.name`
The new option `box.cfg.instance_name` allows to assign the
instance name to a human-readable text value to be displayed in
the new info key - `box.info.name`. Instances can see names of
their peers in `box.info.replication[id].name`.

The name is broadcast in the "box.id" built-in event as the
"instance_name" key. It is a string when set and nil when not set.

When set, it has to be unique in the instance's replicaset.

If a name wasn't set on cluster bootstrap (was forgotten or the
cluster is upgraded from a version < 3.0), then it can be set
on an already running instance via `box.cfg.instance_name`.

To change or drop an already installed name one has to use
`box.cfg.force_recovery == true` in all instances of the cluster.
After the name is updated and all the instances synced, the
`force_recovery` can be set back to `false`.

The name can be <= 63 symbols long, can consist only of chars
'0'-'9', '-' and 'a'-'z'. It must start with a letter. When
upper-case letters are used in `box.cfg`, they are automatically
converted to lower-case. The names are host- and DNS-friendly.
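
The naming rules above can be expressed as a small validator. This is a
sketch: `demo_node_name_normalize` is a hypothetical helper, and rejecting
an empty string is an assumption of the example (an unset name is nil).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_NODE_NAME_LEN_MAX = 63 };

/*
 * Copy the lower-cased name into out, which must have room for
 * DEMO_NODE_NAME_LEN_MAX + 1 bytes.
 * Returns 0 on success, -1 if the name breaks the rules.
 */
int
demo_node_name_normalize(const char *name, char *out)
{
	assert(name != NULL && out != NULL);
	size_t len = strlen(name);
	if (len == 0 || len > DEMO_NODE_NAME_LEN_MAX)
		return -1;
	for (size_t i = 0; i < len; i++) {
		char c = name[i];
		/* Upper-case letters are converted, not rejected. */
		if (c >= 'A' && c <= 'Z')
			c = (char)(c - 'A' + 'a');
		bool is_letter = c >= 'a' && c <= 'z';
		bool is_digit_or_dash = (c >= '0' && c <= '9') || c == '-';
		/* The first symbol must be a letter. */
		if (!is_letter && !(is_digit_or_dash && i > 0))
			return -1;
		out[i] = c;
	}
	out[len] = '\0';
	return 0;
}
```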

9e2d46f9

replication: introduce replicaset name · 5bca2295

Vladislav Shpilevoy authored 2 years ago

The replicaset name is carried with replicaset UUID wherever any
sanity validations are needed like whether 2 instances belong to
the same replicaset.

Part of #5029

@TarantoolBot document
Title: `box.cfg.replicaset_name` and `box.info.replicaset.name`
The new option `box.cfg.replicaset_name` allows to assign the
replicaset name to a human-readable text value to be displayed in
the new info key - `box.info.replicaset.name` - and to be
validated when the instances in the replicaset connect to each
other.

The name is broadcast in the "box.id" built-in event as the
"replicaset_name" key. It is a string when set and nil when not set.

When set, it has to match in all instances of the entire
replicaset.

If a name wasn't set on cluster bootstrap (was forgotten or the
cluster is upgraded from a version < 3.0), then it can be set
on an already running instance via `box.cfg.replicaset_name`.

To change or drop an already installed name one has to use
`box.cfg.force_recovery == true` in all instances of the cluster.
After the name is updated and all the instances synced, the
`force_recovery` can be set back to `false`.

The name can be <= 63 symbols long, can consist only of chars
'0'-'9', '-' and 'a'-'z'. It must start with a letter. When
upper-case letters are used in `box.cfg`, they are automatically
converted to lower-case. The names are host- and DNS-friendly.

5bca2295

replication: introduce cluster name · cb9307a7

Vladislav Shpilevoy authored 2 years ago

The patch adds 2 new entities to replication: the concept of a
cluster which has multiple replicasets and a name for this
cluster.

The name so far doesn't participate in any replication protocols.
It is just stored in _schema and is validated against the config.

The old mentions of 'cluster' (in logs, in some protocol keys like
in the feedback daemon) everywhere are now considered obsolete and
probably will be eventually replaced with 'replicaset'.

Part of #5029

@TarantoolBot document
Title: `box.cfg.cluster_name` and `box.info.cluster.name`
The new option `box.cfg.cluster_name` allows to assign the cluster
name to a human-readable text value to be displayed in the new
info key - `box.info.cluster.name` - and to be validated when the
instances in the cluster connect to each other.

The name is broadcast in the "box.id" built-in event as the
"cluster_name" key. It is a string when set and nil when not set.

When set, it has to match in all instances of the entire cluster
in all its replicasets.

If a name wasn't set on cluster bootstrap (was forgotten or the
cluster is upgraded from a version < 3.0), then it can be set
on an already running instance via `box.cfg.cluster_name`.

To change or drop an already installed name one has to use
`box.cfg.force_recovery == true` in all instances of the cluster.
After the name is updated and all the instances synced, the
`force_recovery` can be set back to `false`.

The name can be <= 63 symbols long, can consist only of chars
'0'-'9', '-' and 'a'-'z'. It must start with a letter. When
upper-case letters are used in `box.cfg`, they are automatically
converted to lower-case. The names are host- and DNS-friendly.

cb9307a7

box: validate global ids after boot in one func · 7fd0d2a5

Vladislav Shpilevoy authored 2 years ago

The new function check_global_ids_integrity() checks that the
replicaset UUID specified in the config and found in the data
match. Instance UUID is created at bootstrap and validated at the
beginning of recovery, not in the end. Hence not checked here.

For now this function is not very useful, but soon there will be
more global IDs stored in WAL which will need validation.

Needed for #5029

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=already covered

7fd0d2a5

box: introduce node_name funcs and constants · efbc7762

Vladislav Shpilevoy authored 1 year ago

Node name stores a DNS- and host-friendly string name. It will be
used in the next patches for some new global names: cluster,
replicaset, and instance.

Part of #5029

NO_DOC=internal
NO_CHANGELOG=internal

efbc7762

info: rename box.info.cluster -> replicaset · ef86e000

Vladislav Shpilevoy authored 2 years ago

It was named 'cluster', but really was just about the replicaset.
This is going to be even more confusing soon, because an actual
concept of a cluster as multiple replicasets will be introduced.

The patch renames it to 'replicaset'. `box.info.cluster` now means
the whole cluster and is empty so far. The next patches will add
the cluster name here.

Part of #5029

@TarantoolBot document
Title: `box.info.cluster` is renamed to `box.info.replicaset`

Done since 3.0.0. The old behaviour can be reverted back via the
`compat` option `box_info_cluster_meaning`.

`box.info.cluster` key is still here, but now means a totally
different thing - the entire cluster with all its replicasets.

<h2>Compat documentation</h2>

`box.info.cluster` default meaning is the whole cluster with all
its replicasets. To get info about only the current replicaset
`box.info.replicaset` should be used.

In old versions (< 3.0.0) `box.info.cluster` meant the current
replicaset and `box.info.replicaset` didn't exist.

<h3>Old and new behaviour</h3>

New behaviour:
```
tarantool> box.info.cluster
---
- <some cluster keys>
...

tarantool> box.info.replicaset
---
- uuid: <replicaset uuid>
- <... other attributes of the replicaset>
...
```
Old behaviour:
```
tarantool> box.info.cluster
---
- uuid: <replicaset uuid>
- <... other attributes of the replicaset>
...

tarantool> box.info.replicaset (= nil on < 3.0.0)
---
- uuid: <replicaset uuid>
- <... other attributes of the replicaset>
...
```

<h3>Known compatibility issues</h3>

VShard versions < 0.1.24 do not support the new behaviour.

<h3>Detecting issues in your codebase</h3>

Look for all usages of `box.info.cluster`, `info.cluster`, and
even just `.cluster`, `['cluster']`, `["cluster"]`. For the new
behaviour to work all of them have to use 'replicaset' key.

ef86e000

schema: replace 'cluster' -> 'replicaset_uuid' · aa987a82

Vladislav Shpilevoy authored 2 years ago

Replicaset UUID was stored in _schema['cluster'] tuple. This is
going to be confusing soon, because an actual concept of a cluster
as multiple replicasets will be introduced.

The patch renames it to 'replicaset_uuid'.

Part of #5029

@TarantoolBot document
Title: Update '_schema' with new 'replicaset_uuid' key

Currently _schema system space is documented to have 'cluster' key
with replicaset UUID value. Now this key is deleted (since 3.0)
and the UUID is stored in 'replicaset_uuid' key.

aa987a82