Commits · 30547157db74a2869b7da4247e8c97c9ec39a6a9 · core / tarantool

Jul 15, 2024

vinyl: use broadcast instead of signal to notify about dump completion · 30547157

Vladimir Davydov authored 8 months ago

There may be more than one fiber waiting on `vy_scheduler::dump_cond`:

```
box.snapshot
  vinyl_engine_wait_checkpoint
    vy_scheduler_wait_checkpoint

space.create_index
  vinyl_space_build_index
    vy_scheduler_dump
```

To avoid a hang, we should use `fiber_cond_broadcast`.
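
The difference between the two wake-up calls can be sketched with a toy model. This is not Tarantool's `fiber_cond` implementation: `struct toy_cond` and its functions are hypothetical and only count sleeping waiters, which is enough to show why a single signal leaves the second waiter asleep.

```c
#include <assert.h>
#include <stddef.h>

/* Toy condition variable (hypothetical): only counts sleeping waiters. */
struct toy_cond {
	size_t waiting;
};

/* Register one more waiter on the condition. */
void
toy_cond_wait(struct toy_cond *c)
{
	assert(c != NULL);
	c->waiting++;
}

/* Wake at most one waiter; returns the number of waiters woken. */
size_t
toy_cond_signal(struct toy_cond *c)
{
	assert(c != NULL);
	if (c->waiting == 0)
		return 0;
	c->waiting--;
	return 1;
}

/* Wake every waiter; returns the number of waiters woken. */
size_t
toy_cond_broadcast(struct toy_cond *c)
{
	assert(c != NULL);
	size_t woken = c->waiting;
	c->waiting = 0;
	return woken;
}
```

With two waiters registered (the checkpoint and the index build in the call chains above), `toy_cond_signal` leaves `waiting == 1`, while `toy_cond_broadcast` leaves no waiter behind.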

Closes #10233

NO_DOC=bug fix

30547157

small: bump new version with UBSan fixes · 3e183044

Lev Kats authored 8 months ago

This patch bumps small to a new version that does not trigger
UBSan in the `*_entry*` macros and should support the new oss-fuzz builder.

New commits:

* rlist: make its methods accept const arguments
* lsregion: introduce lsregion_to_iovec method
* rlist: make foreach_entry_* macros not to use UB

Fixes: #10143

NO_DOC=small submodule bump
NO_TEST=small submodule bump
NO_CHANGELOG=small submodule bump

3e183044

trivia: use __builtin* for offsetof macro · 27e94824

Lev Kats authored 8 months ago

Changed the default tarantool `offsetof` macro implementation so that it
doesn't access members of a null pointer in typeof, which triggers UBSan.
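
A minimal sketch of the two approaches, assuming a GCC or Clang compiler. The macro names and `struct sample` are hypothetical, and the null-pointer form is the classic textbook definition, shown only for contrast rather than as the exact code that was replaced.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Classic form: builds a member access through a null pointer.
 * UBSan reports this as a member access within a null pointer.
 */
#define offsetof_via_null(type, member) ((size_t)&((type *)0)->member)

/* Builtin form (GCC and Clang): the compiler computes the offset itself. */
#define offsetof_via_builtin(type, member) __builtin_offsetof(type, member)

/* Hypothetical structure used only to exercise the macros. */
struct sample {
	char tag;
	long value;
};

/* Offset of `value` computed without touching any pointer. */
size_t
sample_value_offset(void)
{
	size_t off = offsetof_via_builtin(struct sample, value);
	assert(off >= sizeof(char));
	return off;
}
```

The builtin form yields the same value as the standard `offsetof` from `<stddef.h>`, so callers do not need to change.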

Needed for #10143

NO_DOC=bugfix
NO_CHANGELOG=minor
NO_TEST=tested manually with fuzzer

27e94824

Jul 09, 2024

uuid: relax UUID value validation · b0b32bff

Igor Munkin authored 1 year ago

This patch completely relaxes UUID checks and accepts an arbitrary
128-bit sequence as a UUID for binary data. String representations
should still match the grammars in RFC 4122, Section 3 [1] and RFC 9562,
Section 4 [2].

[1]: https://datatracker.ietf.org/doc/html/rfc4122#section-3
[2]: https://datatracker.ietf.org/doc/html/rfc9562#name-uuid-format
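
The string grammar referenced above (five groups of 8, 4, 4, 4, and 12 hexadecimal digits separated by dashes) can be checked as in the sketch below. The function name is hypothetical and this is not the code from the patch; note that it deliberately does not inspect the version or variant bits, in line with the relaxed validation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

enum { UUID_TEXT_LEN = 36 };

/*
 * Return true if `s` has the textual layout 8-4-4-4-12 of hex digits.
 * Version and variant bits are not checked.
 */
bool
uuid_text_is_well_formed(const char *s)
{
	assert(s != NULL);
	if (strlen(s) != UUID_TEXT_LEN)
		return false;
	for (size_t i = 0; i < UUID_TEXT_LEN; i++) {
		bool dash_expected = (i == 8 || i == 13 || i == 18 || i == 23);
		if (dash_expected) {
			if (s[i] != '-')
				return false;
		} else if (!isxdigit((unsigned char)s[i])) {
			return false;
		}
	}
	return true;
}
```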

Closes #5444

@TarantoolBot document
Title: uuid: relaxed UUID validation

[The UUID module documentation][1] mentions that Tarantool generates
UUIDs following the rules for RFC 4122, [version 4, variant 1][2]. It is
worth mentioning that the user can store an arbitrary 128-bit sequence
as a UUID for binary data. String representations should still match
the grammars in RFC 4122, [Section 3][3], and RFC 9562, [Section 4][4].

[1]: https://www.tarantool.io/en/doc/latest/reference/reference_lua/uuid/
[2]: https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)
[3]: https://datatracker.ietf.org/doc/html/rfc4122#section-3
[4]: https://datatracker.ietf.org/doc/html/rfc9562#name-uuid-format

b0b32bff

test: rewrite engine/uuid.test.lua with luatest · 462b20dc

Igor Munkin authored 1 year ago

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

462b20dc

test: rewrite app/uuid.test.lua with luatest · 5e83fb1b

Igor Munkin authored 1 year ago

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

5e83fb1b

Jul 08, 2024

luajit: bump new version · bf01fb20

Sergey Kaplun authored 8 months ago

* Correct fix for stack check when recording BC_VARG.
* test: remove inline suppressions of _TARANTOOL
* FFI: Fix ffi.alignof() for reference types.
* FFI: Fix sizeof expression in C parser for reference types.
* FFI: Allow ffi.metatype() for typedefs with attributes.
* FFI: Fix ffi.metatype() for non-raw types.
* Maintain chain invariant in DCE.
* build: introduce option LUAJIT_ENABLE_TABLE_BUMP
* ci: add tablebump flavor for exotic builds
* test: allow `jit.parse` to return aborted traces
* Handle all types of errors during trace stitching.
* Use generic trace error for OOM during trace stitching.
* Check for IR_HREF vs. IR_HREFK aliasing in non-nil store check.
* cmake: set cmake_minimum_required only once
* cmake: fix warning about minimum required version
* ci: add a workflow for testing with AVX512 enabled
* test: introduce a helper read_file
* OSX/iOS/ARM64: Fix generation of Mach-O object files.
* OSX/iOS/ARM64: Fix bytecode embedding in Mach-O object file.
* build: introduce LUAJIT_USE_UBSAN option
* ci: enable UBSan for sanitizers testing workflow
* cmake: add the build directory to the .gitignore
* Prevent sanitizer warning in snap_restoredata().
* Avoid negation of signed integers in C that may hold INT*_MIN.
* Show name of NYI bytecode in -jv and -jdump.

Closes #9924
Closes #8473

NO_DOC=LuaJIT submodule bump
NO_TEST=LuaJIT submodule bump

bf01fb20

fiber: prohibit fiber self join · 1e1bf36d

Nikolay Shirokovskiy authored 8 months ago

In this case join will just hang. Instead let's raise an error in case
of Lua API and panic in case of C API.

Closes #10196

NO_DOC=minor

1e1bf36d

Jul 04, 2024

fiber: fix leak on dead joinable fiber search · 7db4de75

Nikolay Shirokovskiy authored 8 months ago

When a fiber is accessed from Lua, we create a userdata object and keep
the reference for future accesses. The reference is cleared when the
fiber is stopped. But if the fiber is joinable, it can still be found
with `fiber.find`. In this case we create the userdata object again.
Unfortunately, as the fiber is already stopped, we fail to clear the
reference. The memory of the trigger that clears the reference is also
leaked, as well as the fiber storage if it is accessed after the fiber
is stopped.

Let's add `on_destroy` trigger to fiber and clear the references there.

Note that with current set of LSAN suppressions the trigger memory leak
of the issue is not reported.

Closes #10187

NO_DOC=bugfix

7db4de75

Jul 03, 2024

config: expose experimental.config.utils.schema · ef716a3e

Alexander Turenko authored 8 months ago

The module is renamed from `internal.config.utils.schema` to
`experimental.config.utils.schema` without changes.

It is useful for validation of configuration data in roles and
applications.

Also, it provides a couple of methods that aim to simplify usual tasks
around processing of hierarchical configuration data. For example,

* get/set a nested value
* apply defaults from the schema
* filter data based on annotations from the schema
* transform a hierarchical data using a function
* merge two hierarchical values
* parse environment variable according to its type in the schema

See https://github.com/tarantool/doc/issues/4279 for an in-depth
description.

Fixes #10117

NO_DOC=https://github.com/tarantool/doc/issues/4279

ef716a3e

Jun 26, 2024

box: fix memleak on functional index drop · 319357d5

Nikolay Shirokovskiy authored 8 months ago

Currently, we don't free functional index keys when a functional index
is dropped. Let's handle key deletion as in the case of a primary index
drop, i.e. drop these keys in the background.

We should set `use_hint` to `true` in case of MEMTX_TREE_VTAB_DISABLED
tree index methods because `memtx_tree_disabled_index_vtab` uses
`memtx_tree_index_destroy<true>`. Otherwise, destroying a stub
functional index reads the newly introduced `is_func` field outside of
the index structure (which is reported by ASAN).

Closes #10163

NO_DOC=bugfix

319357d5

Jun 25, 2024

third_party: update libcurl from 8.7.0 to 8.8.0+patches · 7192bf66

Sergey Bronnikov authored 9 months ago

The patch updates curl module to the version 8.8.0 [1] plus
a number of commits in the range curl-8_8_0..30de937bda0f because
this range includes a fix for a regression [2] caught on the previous
bump. The new version brings a number of functional fixes.

Previous changelog entry has been removed because duplicate
entries about bumps in release changelog confuse end users.

Closes #9612

1. https://curl.se/changes.html#8_8_0
2. https://github.com/curl/curl/issues/13740

NO_DOC=libcurl submodule bump
NO_TEST=libcurl submodule bump

7192bf66

third_party: update libcurl from 8.6.0 to 8.7.1 · 63cb2bf6

Sergey Bronnikov authored 11 months ago

The patch updates curl module to the version 8.7.1 [1][2] that
brings a number of functional and security fixes, and updates
CMake module for building curl library.

Security fixes:

- CVE-2024-2004: Usage of disabled protocol. (low)
- CVE-2024-2398: HTTP/2 push headers memory-leak. (medium)
- CVE-2024-2379: QUIC certificate check bypass with wolfSSL. (low)
- CVE-2024-2466: TLS certificate check bypass with mbedTLS. (medium)

Changes in CMake module:

- Option `USE_OPENSSL_QUIC` was added and disabled by default [3]

Previous changelog entry has been removed because duplicate
entries about bumps in release changelog confuse end users.

The bump was blocked by a regression in libcurl [4][5].

1. https://curl.se/changes.html#8_7_1
2. https://github.com/curl/curl/compare/curl-8_6_0...curl-8_7_1
3. https://github.com/curl/curl/commit/8e741644a229c3791963b4f5cae1dcfccba842dd
4. https://curl.se/mail/lib-2024-03/0059.html
5. https://github.com/curl/curl/issues/13260

NO_DOC=libcurl submodule bump
NO_TEST=libcurl submodule bump

63cb2bf6

third_party: update libcurl from 8.5.0+patch to 8.6.0 · 00cfc959

Sergey Bronnikov authored 1 year ago

The patch updates curl module to the version 8.6.0 [1][2] that
brings a number of functional fixes, and updates CMake module for
building curl library.

Changes in CMake module:

- Option `ENABLE_CURL_MANUAL` was added and disabled by default [3]
- Option `BUILD_LIBCURL_DOCS` was added and disabled by default [3]

The patch follows up commit 9bdf2bab ("httpc: fix reading data
in a chunked request") where curl submodule was updated to
a version based on 8.5.0 release with applied patch with fix [4].

Previous changelog entry has been removed because duplicate
entries about bumps in release changelog confuse end users.

This bump was blocked by a regression in libcurl [5].

1. https://curl.se/changes.html#8_6_0
2. https://github.com/curl/curl/compare/curl-8_5_0...curl-8_6_0
3. https://github.com/curl/curl/commit/a808aab06851d4364ab1773c664df3d906a497a9
4. https://github.com/curl/curl/commit/cdd905a9854305657ebbe645095e1189dcda28c7
5. https://github.com/curl/curl/commit/b8c003832d730bb2f4b9de4204675ca5d9f7a903

NO_DOC=libcurl submodule bump
NO_TEST=libcurl submodule bump

00cfc959

Jun 24, 2024

config: add missing ssl.ssl_cert for etcd · 13249eb3

Georgy Moiseev authored 9 months ago

The etcd configuration section allows connecting to a TLS-encrypted etcd
cluster by providing a way to pass `ssl.ssl_key`. But this is not enough
when the etcd server has client cert auth enabled and has a CA file,
since it requires an `ssl_cert` as well. In fact, propagating `ssl_cert`
is already part of the EE connect code [1]; we are just missing the
top-level config option.

Fixes https://github.com/tarantool/tarantool-ee/issues/827

1. https://github.com/tarantool/tarantool-ee/blame/1138443c46e7a6e1bb855277bc6cb3333240131c/src/box/lua/config/source/etcd.lua#L103

@TarantoolBot document
Title: config: add missing ssl.ssl_cert for etcd

The etcd configuration section already allows setting `ssl.ssl_key`. Now
it also allows passing `ssl.ssl_cert`.

13249eb3

Jun 21, 2024

sio: use kern.ipc.somaxconn for listen() on Mac · 7e9a872f

Vladislav Shpilevoy authored 9 months ago

listen() on Mac used to take SOMAXCONN as the backlog size. It is
just 128, which is too small when connections are incoming too
fast. They get rejected.

Increase of the queue size wasn't possible, because the limit was
hardcoded. But now sio takes the runtime limit from
kern.ipc.somaxconn sysctl setting.

One weird thing is that when set too high, it seems to have no
effect, as if nothing was changed. Specifically, values above
32767 are not doing anything, even though they stay visible in
kern.ipc.somaxconn.

It seems listen() on Mac internally might be using 'short' or
int16_t to store the queue size and it gets broken when anything
above INT16_MAX is used. The code truncates the queue size to this
value if the given one is too high.
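
The approach described above can be sketched as follows. This is not the actual sio code: the function names `backlog_clamp` and `listen_backlog` are hypothetical, and the fallback to the compile-time `SOMAXCONN` on other platforms or on a failed sysctl read is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Truncate the backlog to INT16_MAX; larger values seem to be ignored on Mac. */
int
backlog_clamp(int backlog)
{
	assert(backlog >= 0);
	return backlog > INT16_MAX ? INT16_MAX : backlog;
}

/*
 * Pick the backlog for listen(): the runtime kern.ipc.somaxconn limit on
 * Mac, the compile-time SOMAXCONN elsewhere or if the sysctl read fails.
 */
int
listen_backlog(void)
{
	int limit = SOMAXCONN;
#if defined(__APPLE__)
	int value = 0;
	size_t len = sizeof(value);
	if (sysctlbyname("kern.ipc.somaxconn", &value, &len, NULL, 0) == 0 &&
	    value > 0)
		limit = value;
#endif
	return backlog_clamp(limit);
}
```

A caller would then pass the result to `listen(fd, listen_backlog())`.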

Closes #8130

NO_DOC=bugfix
NO_TEST=requires root privileges for testing

7e9a872f

ci: avoid sending perf stat to Influx from forks · d2240bf7

Sergey Kaplun authored 9 months ago

This patch is the follow-up for the commit
49946a72 ("ci: send perf statistics to InfluxDB").
Since secrets are unavailable for fork repositories, the sending step
fails due to a missing InfluxDB URL and token. This patch runs this
step only on push events or PRs from the main repository itself.

NO_DOC=CI
NO_CHANGELOG=CI
NO_TEST=CI

d2240bf7

Jun 20, 2024

ci: add workflow to check downgrade versions · 6d856347

Nikolay Shirokovskiy authored 9 months ago

Tarantool has a hardcoded list of versions it can downgrade to. This list
should consist of all the released versions older than the current
Tarantool version. This workflow helps to make sure we update the list
before a release.

It is run when a release tag is pushed to the repo, checks the list, and
fails if it misses some released version older than the current one. In
this case we are supposed to update the downgrade list (with the required
downgrade code) and update the release tag.

Closes #8319

NO_TEST=ci
NO_CHANGELOG=ci
NO_DOC=ci

6d856347

Jun 18, 2024

ci: send perf statistics to InfluxDB · 49946a72

Sergey Kaplun authored 9 months ago

This patch adds additional steps to the <perf_micro.yml> workflow to
aggregate the results and send the aggregated data to InfluxDB via curl.

Also, this patch adds the corresponding environment variables to be used
during workflow to preserve the original commit hash and branch name.

NO_DOC=CI
NO_CHANGELOG=CI
NO_TEST=CI

49946a72

perf: add aggregator helper for bench statistics · 95257919

Sergey Kaplun authored 9 months ago

This patch adds a helper script to aggregate the benchmark results from
JSON files to the format parsable by the InfluxDB line protocol [1].

All JSON files in the <perf/output> directory are treated as benchmark
results and are aggregated into the <perf/output/summary.txt> file, which
can be posted to InfluxDB. The results are aggregated via the new target
test-perf-aggregate, which is run only if some JSON files with results
are missed.

[1]: https://docs.influxdata.com/influxdb/v2/reference/syntax/line-protocol/
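
For illustration, one line of the InfluxDB line protocol has the shape `<measurement>,<tag>=<value> <field>=<value>`. The sketch below formats such a line in C; it is not the aggregator script from this patch, and the function name, the `name` tag key, and the `items_per_second` field layout are assumptions. Commas, spaces, and equals signs act as separators in the protocol, so the sketch rejects them in names instead of escaping them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one point as `<measurement>,name=<bench> items_per_second=<value>`.
 * Returns the snprintf() result, or -1 if a name contains a separator
 * character (this sketch does no escaping).
 */
int
format_point(char *buf, size_t size, const char *measurement,
	     const char *bench, double items_per_second)
{
	assert(buf != NULL && size > 0);
	if (strpbrk(measurement, ", ") != NULL ||
	    strpbrk(bench, ",= ") != NULL)
		return -1;
	return snprintf(buf, size, "%s,name=%s items_per_second=%.2f",
			measurement, bench, items_per_second);
}
```

For example, `format_point(buf, sizeof(buf), "perf", "box_select", 1500.5)` writes `perf,name=box_select items_per_second=1500.50`.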

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

95257919

perf: move compare.lua to the tools directory · 02690a24

Sergey Kaplun authored 9 months ago

This file can be used to compare the results of Lua benchmarks. Since it
has a general purpose, it is moved to the <perf/tools> directory.

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

02690a24

perf: save perf results for the benchmarks · 2c60a941

Sergey Kaplun authored 10 months ago

This patch saves the output of the performance tests in the JSON format
to be processed later. The corresponding directory is added to the
<.gitignore>.

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

2c60a941

perf: provide items_per_second metric in bps_tree · cd2f4838

Sergey Kaplun authored 9 months ago

This patch considers the number of iterations as the number of items
processed by the corresponding benchmark, so it may be used for the
`items_per_second` counter.
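
The arithmetic behind the counter is a plain rate. The helper below is a hypothetical sketch of it, not the benchmark's actual code, which relies on its benchmarking framework to do the division.

```c
#include <assert.h>

/* Items processed per second, with one item counted per iteration. */
double
items_per_second(unsigned long iterations, double elapsed_seconds)
{
	assert(elapsed_seconds > 0.0);
	return (double)iterations / elapsed_seconds;
}
```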

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

cd2f4838

perf: standardize gh-7089-vclock-copy benchmark · ccb7a649

Sergey Kaplun authored 9 months ago

The output now contains items per second without the mean time in
seconds. The number of iterations is reduced to 40 to avoid running the
test too long. The `wal_mode` option (default is "none") is set via
command line flags, as well as the number of nodes (default is 10). Also,
the master nodes are set up via the `popen()` command without using any
Makefile.

Also, two new options are introduced:
* The `--output` option allows you to specify the output file.
* The `--output_format` option means the format for the printed output.
  The default is "console". It prints items processed per second to the
  stdout. The "json" format contains all the information about the
  benchmark in a format similar to Google Benchmark's.

Usually, these options should be used together to dump machine-readable
results for the benchmarks.

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

ccb7a649

perf: rename subtests in column_scan · 72dd9324

Sergey Kaplun authored 9 months ago

This patch renames subtests in column scan to avoid the usage of `,`
(the separator) in the tag name for the InfluxDB report.

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

72dd9324

perf: standardize column_scan benchmark · cd54f50a

Sergey Kaplun authored 9 months ago

The output now contains items per second instead of time in seconds.

Also, two new options are introduced:
* The `--output` option allows you to specify the output file.
* The `--output_format` option means the format for the printed output.
  The default is "console". It prints rows processed per second to the
  stdout. The "json" format contains all the information about the
  benchmark in a format similar to Google Benchmark's.

Usually, these options should be used together to dump machine-readable
results for the benchmarks.

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

cd54f50a

perf: standardize box_select benchmark · 7c51884d

Sergey Kaplun authored 9 months ago

The output now contains items per second instead of time in nanoseconds.

Also, two new options are introduced:
* The `--output` option allows you to specify the output file.
* The `--output_format` option means the format for the printed output.
  The default is "console". It just prints the number of iterations
  processed per second to the stdout. The "json" format contains all the
  information about the benchmark in a format similar to Google
  Benchmark's.

Usually, these options should be used together to dump machine-readable
results for the benchmarks.

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

7c51884d

perf: standardize uri_escape_unescape benchmark · 0b1ecf66

Sergey Kaplun authored 9 months ago

Two new options are introduced:
* The `--output` option allows you to specify the output file.
* The `--output_format` option means the format for the printed output.
  The default is "console". It just prints the number of iterations
  processed per second to the stdout. The "json" format contains all the
  information about the benchmark in a format similar to Google
  Benchmark's.

Usually, these options should be used together to dump machine-readable
results for the benchmarks.

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

0b1ecf66

perf: clarify comments in the uri_escape_unescape · b337ff1e

Sergey Kaplun authored 9 months ago

This patch rewrites comments regarding JIT compiler options to avoid
confusion.

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

b337ff1e

perf: standardize 1mops_write benchmark · a29becf4

Sergey Kaplun authored 10 months ago

The (not commented) output now contains average items per second instead
of peak speed.

Also, two new options are introduced:
* The `--output` option allows you to specify the output file.
* The `--output_format` option means the format for the printed output.
  The default is "console". It just prints inserts per second value to
  the stdout. The "json" format contains all the information about the
  benchmark in a format similar to Google Benchmark's.

Usually, these options should be used together to dump machine-readable
results for the benchmarks.

The test can still be run standalone to be usable alongside Tarantool's
repository.

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

a29becf4

perf: introduce benchmark.lua helper module · 3110ef9a

Sergey Kaplun authored 9 months ago

This module helps to aggregate various subbenchmark runs and dump them
either to stdout or to the specified file. Also, it allows you to output
results in the JSON format.

Usually, these options should be used together to dump machine-readable
results for the benchmarks.

Also, set the `LUA_PATH` environment variable for the Lua benchmarks to
make the introduced module requirable.

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

3110ef9a

gitignore: add perf/lua/Makefile · 435906db

Sergey Kaplun authored 9 months ago

This is a follow-up for the commit
49d9a874 ("perf: add targets for running
Lua performance tests"), where the corresponding perf test target is
introduced.

NO_DOC=perf test
NO_CHANGELOG=perf test
NO_TEST=perf test

435906db

config: use vshard-ee if available · 29519c71

Alexander Turenko authored 9 months ago

The new closed-source `vshard-ee` module was recently introduced. It is
based on the open-source `vshard` module and, as far as I know, provides
the same API.

Let's configure `vshard-ee` in the same way as `vshard` if sharding is
enabled in tarantool's configuration.

Prefer `vshard-ee` if both are available.

Fixes https://github.com/tarantool/tarantool-ee/issues/815

@TarantoolBot document
Title: config: vshard-ee is now supported

The declarative configuration supports `vshard-ee` in addition to
`vshard` since Tarantool 3.1.1 and 3.2+.

`vshard` is mentioned in the documentation at least [here][1]. All such
places should be updated to mention both `vshard-ee` and `vshard`.

[1]: https://www.tarantool.io/en/doc/latest/reference/configuration/configuration_reference/#sharding

29519c71

loaders: add require_first function (internal) · 3abd4f96

Alexander Turenko authored 9 months ago

Usage example:

```lua
local loaders = require('internal.loaders')
local vshard = loaders.require_first('vshard-ee', 'vshard')
```

The function is for internal use.

It would be nice to have something of this kind in the public API, but
I'm not going to solve it within this patch.

Needed for https://github.com/tarantool/tarantool-ee/issues/815

NO_DOC=no public API changes
NO_CHANGELOG=see NO_DOC

3abd4f96

Jun 17, 2024

box: feature `tuple:format` to get a format of a tuple · 6d5f1db5

DerekBum authored 9 months ago

This patch adds `tuple:format()` method to get a format
of a tuple.

Closes #10005

@TarantoolBot document
Title: New `format` method for `box.tuple`
Product: Tarantool
Since: 3.2

The `tuple:format` method returns a format of a tuple.

6d5f1db5

test/fuzz: speedup string serialization · 3d97334f

Sergey Bronnikov authored 9 months ago

- clamp before cleaning string because cleaning is not cheap
  (O(n), where max n is equal to kMaxStrLength)
- call cleaning for identifiers only, there is no sense in
  cleaning string literals
- replace symbols disallowed by Lua grammar in identifier's
  names with '_'
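
The clamp-then-clean order from the list above can be sketched in C as follows. The fuzzer itself is not written this way: `clean_identifier` and `MAX_IDENT_LENGTH` (standing in for `kMaxStrLength`) are hypothetical, and the check covers only the character rules of Lua identifiers, not reserved words.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit standing in for kMaxStrLength. */
enum { MAX_IDENT_LENGTH = 20 };

/*
 * Clamp `name` (of length `len`) first, so that the O(n) cleaning pass
 * touches at most MAX_IDENT_LENGTH bytes, then replace every character
 * that is not allowed in a Lua identifier with '_'.
 * Returns the resulting length.
 */
size_t
clean_identifier(char *name, size_t len)
{
	assert(name != NULL);
	if (len > MAX_IDENT_LENGTH) {
		len = MAX_IDENT_LENGTH;
		name[len] = '\0';
	}
	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)name[i];
		/* Letters and '_' anywhere, digits everywhere but first. */
		bool allowed = isalpha(c) || c == '_' ||
			       (i > 0 && isdigit(c));
		if (!allowed)
			name[i] = '_';
	}
	return len;
}
```

For example, the input `1foo-bar` becomes `_foo_bar`.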

The patch saves 16 sec on 145k samples (401 sec before the patch
and 385 sec after the patch). It is actually not so much, but it
is about 2.5 min per hour.

NO_CHANGELOG=testing
NO_DOC=testing

3d97334f

Jun 14, 2024

lsan: add another FP leak suppression · c5b3e594

Nikolay Shirokovskiy authored 9 months ago

See #8890

NO_TEST=internal
NO_CHANGELOG=internal
NO_DOC=internal

c5b3e594

Jun 13, 2024

ci: add a workflow to check for entrypoint tags · c06d0d14

Nikolay Shirokovskiy authored 1 year ago

See the comment in check-entrypoint.sh for an explanation of what an
entrypoint tag is. The workflow fails if the current branch does not have
the most recent entrypoint tag that it should have.

Part of #8319

NO_TEST=ci
NO_CHANGELOG=ci
NO_DOC=ci

c06d0d14

vinyl: fix gc vs vylog race leading to duplicate record · 9d3859b2

Vladimir Davydov authored 9 months ago

Vinyl run files aren't always deleted immediately after compaction,
because we need to keep run files corresponding to checkpoints for
backups. Such run files are deleted by the garbage collection procedure,
which performs the following steps:

 1. Loads information about all run files from the last vylog file.
 2. For each loaded run record that is marked as dropped:
    a. Tries to remove the run files.
    b. On success, writes a "forget" record for the dropped run,
       which will make vylog purge the run record on the next
       vylog rotation (checkpoint).

(see `vinyl_engine_collect_garbage()`)

The garbage collection procedure writes the "forget" records
asynchronously using `vy_log_tx_try_commit()`, see `vy_gc_run()`.
This procedure can be successfully executed during vylog rotation,
because it doesn't take the vylog latch. It simply appends records
to a memory buffer which is flushed either on the next synchronous
vylog write or vylog recovery.

The problem is that the garbage collection doesn't necessarily load
the latest vylog file because the vylog file may be rotated between
its calls to `vy_log_signature()` and `vy_recovery_new()`. This may
result in a "forget" record written twice to the same vylog file
for the same run file, as follows:

  1. GC loads last vylog N.
  2. GC starts removing dropped run files.
  3. CHECKPOINT starts vylog rotation.
  4. CHECKPOINT loads vylog N.
  5. GC writes a "forget" record for run A to the buffer.
  6. GC is completed.
  7. GC is restarted.
  8. GC finds that the last vylog is N and blocks on the vylog latch
     trying to load it.
  9. CHECKPOINT saves vylog M (M > N).
 10. GC loads vylog N. This triggers flushing the forget record for
     run A to vylog M (not to vylog N), because vylog M is the last
     vylog at this point of time.
 11. GC starts removing dropped run files.
 12. GC writes a "forget" record for run A to the buffer again,
     because in vylog N it's still marked as dropped and not forgotten.
     (The previous "forget" record was written to vylog M).
 13. Now we have two "forget" records for run A in vylog M.

Such duplicate run records aren't tolerated by the vylog recovery
procedure, resulting in a permanent error on the next checkpoint:

```
ER_INVALID_VYLOG_FILE: Invalid VYLOG file: Run XXXX forgotten but not registered
```

To fix this issue, we move `vy_log_signature()` under the vylog latch
to `vy_recovery_new()`. This makes sure that GC will see vylog records
that it wrote during the previous execution.

Catching this race in a functional test would require a bunch of ugly
error injections so let's assume that it'll be tested by fuzzing.

Closes #10128

NO_DOC=bug fix
NO_TEST=tested manually with fuzzer

9d3859b2

tuple: don't use offset_slot_cache in vinyl threads · 19d1f1cc

Vladimir Davydov authored 9 months ago

`key_part::offset_slot_cache` and `key_part::format_epoch` are used for
speeding up tuple field lookup in `tuple_field_raw_by_part()`. These
structure members are accessed and updated without any locks, assuming
this code is executed exclusively in the tx thread. However, this isn't
necessarily true because we also perform tuple field lookups in vinyl
read threads. Apparently, this can result in unexpected races and bugs,
for example:

```
  #1  0x590be9f7eb6d in crash_collect+256
  #2  0x590be9f7f5a9 in crash_signal_cb+100
  #3  0x72b111642520 in __sigaction+80
  #4  0x590bea385e3c in load_u32+35
  #5  0x590bea231eba in field_map_get_offset+46
  #6  0x590bea23242a in tuple_field_raw_by_path+417
  #7  0x590bea23282b in tuple_field_raw_by_part+203
  #8  0x590bea23288c in tuple_field_by_part+91
  #9  0x590bea24cd2d in unsigned long tuple_hint<(field_type)5, false, false>(tuple*, key_def*)+103
  #10 0x590be9d4fba3 in tuple_hint+40
  #11 0x590be9d50acf in vy_stmt_hint+178
  #12 0x590be9d53531 in vy_page_stmt+168
  #13 0x590be9d535ea in vy_page_find_key+142
  #14 0x590be9d545e6 in vy_page_read_cb+210
  #15 0x590be9f94ef0 in cbus_call_perform+44
  #16 0x590be9f94eae in cmsg_deliver+52
  #17 0x590be9f9583e in cbus_process+100
  #18 0x590be9f958a5 in cbus_loop+28
  #19 0x590be9d512da in vy_run_reader_f+381
  #20 0x590be9cb4147 in fiber_cxx_invoke(int (*)(__va_list_tag*), __va_list_tag*)+34
  #21 0x590be9f8b697 in fiber_loop+219
  #22 0x590bea374bb6 in coro_init+120
```

Fix this by skipping this optimization for threads other than tx.
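
The idea of the fix can be sketched as below. None of this is the actual Tarantool code: the thread-local flag, `struct part_cache`, and the stand-in `slow_lookup` are hypothetical. The point is only that a thread that does not own the cache neither reads nor writes it and always takes the uncached path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread flag: true only in the thread owning the cache. */
static _Thread_local bool is_owner_thread = false;

void
mark_owner_thread(bool value)
{
	is_owner_thread = value;
}

/* Hypothetical lock-free cache, valid for a single format epoch. */
struct part_cache {
	uint64_t epoch;
	int32_t slot;
};

/* Stand-in for the uncached lookup; the real one walks the tuple format. */
int32_t
slow_lookup(uint64_t epoch)
{
	return (int32_t)(epoch % 7);
}

/* Use the cache only in the owner thread; other threads bypass it. */
int32_t
lookup_slot(struct part_cache *cache, uint64_t epoch)
{
	assert(cache != NULL);
	if (!is_owner_thread)
		return slow_lookup(epoch);
	if (cache->epoch != epoch) {
		cache->slot = slow_lookup(epoch);
		cache->epoch = epoch;
	}
	return cache->slot;
}
```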

No test is added because reproducing this race is tricky. Ideally, bugs
like this one should be caught by fuzzing tests or thread sanitizers.

Closes #10123

NO_DOC=bug fix
NO_TEST=tested manually with fuzzer

19d1f1cc