Commits · fae9c5bbe6c5c83dc5a8cda153a3a3ff91991c5d · core / tarantool

Oct 03, 2023

Performance tests added to the perf directory are not automated, and
currently we run them manually from time to time. On the other hand,
source code that is rarely used is prone to software rot [1].

The patch adds the CMake target "test-perf" and a GitHub workflow that
runs these tests in CI. The workflow is based on release.yml; it builds
the performance tests and runs them.

1. https://en.wikipedia.org/wiki/Software_rot

NO_CHANGELOG=testing
NO_DOC=testing
NO_TEST=testing

(cherry picked from commit 5edcb712)

29211065

perf: add targets for running C performance tests · 4771c1b8

Sergey Bronnikov authored 1 year ago

The patch adds a target for each C performance test in the perf/
directory and a separate target "test-c-perf" that runs all C
performance tests at once.

NO_CHANGELOG=testing
NO_DOC=testing
NO_TEST=test infrastructure

(cherry picked from commit 68623381)

4771c1b8

perf: add targets for running Lua performance tests · 81b624fb

Sergey Bronnikov authored 1 year ago

The patch adds a target for each Lua performance test in the perf/lua/
directory (1mops_write_perftest, box_select_perftest,
uri_escape_unescape_perftest) and a separate target "test-lua-perf" that
runs all Lua performance tests at once.

NO_CHANGELOG=testing
NO_DOC=testing
NO_TEST=test infrastructure

(cherry picked from commit 49d9a874)

81b624fb

perf: initial version of 1M operations test · 77bd900b

Sergey Ostanevich authored 1 year ago

The test can be used for regression testing. It is advisable to tune
the machine: check the NUMA configuration and fix the pstate or a similar
CPU autotuning mechanism. That said, running the test a dozen times gives
a more or less stable result for the peak performance, which should be
enough to identify regressions.
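
As a rough illustration of the "run a dozen times and take the peak"
approach, here is a minimal C++ sketch. The names peak_of and
measure_peak are hypothetical and are not taken from the test itself,
which may aggregate its runs differently.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: the peak (highest) rate among several runs.
double peak_of(const std::vector<double> &rates)
{
	assert(!rates.empty());
	return *std::max_element(rates.begin(), rates.end());
}

// Hypothetical helper: run `workload` `runs` times and return the peak
// rate in operations per second. `workload` returns the number of
// operations it performed.
double measure_peak(const std::function<std::size_t()> &workload, int runs)
{
	assert(runs > 0);
	std::vector<double> rates;
	for (int i = 0; i < runs; i++) {
		auto start = std::chrono::steady_clock::now();
		std::size_t ops = workload();
		std::chrono::duration<double> elapsed =
			std::chrono::steady_clock::now() - start;
		// Guard against a zero interval on coarse clocks.
		double seconds = std::max(elapsed.count(), 1e-9);
		rates.push_back(static_cast<double>(ops) / seconds);
	}
	return peak_of(rates);
}
```

For example, measure_peak(workload, 12) would report the best of twelve
runs, where workload is any callable that returns its operation count.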

NO_DOC=adding an internal test
NO_CHANGELOG=ditto
NO_TEST=ditto

(cherry picked from commit 10870343)

77bd900b

perf: add test for box select · 520d6049

Vladimir Davydov authored 1 year ago

The test runs get, select, pairs space methods with various arguments in
a loop and prints the average method run time in nanoseconds (lower is
better).

Usage:

  tarantool box_select.lua

Output format:

  <test-case> <run-time>

Example:

  $ tarantool box_select.lua --pattern 'get|select_%d$'
  get_0 155
  get_1 240
  select_0 223
  select_1 335
  select_5 2321

Options:

  --pattern <string>  run only tests matching the pattern; use '|'
                      to specify more than one pattern, for example,
                      'get|select'
  --read_view         use a read view (EE only)

Apart from the test, this patch also adds a script that compares test
results:

  $ tarantool box_select.lua --pattern get > base
  $ tarantool box_select.lua --pattern get > patched1
  $ tarantool box_select.lua --pattern get > patched2
  $ tarantool compare.lua base patched1 patched2
         base          patched1          patched2
  get_0   149       303 (+103%)       147 (-  1%)
  get_1   239       418 (+ 74%)       238 (-  0%)
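
The percentages in the comparison above can be reproduced with a small
C++ sketch. It is not the compare.lua script: the function names are
hypothetical, and truncation toward zero is an assumption inferred from
the example (418 against 239 is shown as +74%, not +75%). Printing '+'
for equal values is also an assumption.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Percentage change of `patched` relative to `base`.
// Integer division truncates toward zero (assumed from the example).
int percent_change(long base, long patched)
{
	assert(base > 0);
	return static_cast<int>((patched - base) * 100 / base);
}

// Format the change the way the example output shows it, e.g. "(+ 74%)".
std::string format_change(long base, long patched)
{
	char buf[32];
	char sign = patched < base ? '-' : '+';
	int pct = percent_change(base, patched);
	std::snprintf(buf, sizeof(buf), "(%c%3d%%)", sign, pct < 0 ? -pct : pct);
	return buf;
}
```

With the numbers from the example, format_change(149, 303) yields
"(+103%)" and format_change(239, 238) yields "(-  0%)".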

NO_DOC=perf test
NO_TEST=perf test
NO_CHANGELOG=perf test

(cherry picked from commit 114d09f5)

520d6049

Jan 24, 2023

perf/cmake: add a function for generating perf test targets · ca58d6c9

Sergey Bronnikov authored 2 years ago

Commit 2be74a65 ("test/cmake: add a function for generating unit
test targets") added a function for generating unit test targets in
CMake. This function makes code simpler and less error-prone.

The proposed patch adds a similar function for generating performance
test targets in CMake.

NO_CHANGELOG=build infrastructure updated
NO_DOC=build infrastructure updated
NO_TEST=build infrastructure updated

ca58d6c9

Dec 27, 2022

perf: add uri.escape/unescape test · 3cc0b3cf

Sergey Bronnikov authored 2 years ago

Added a simple benchmark for URI escape/unescape.

Part of #3682

NO_DOC=documentation is not required for performance test
NO_CHANGELOG=performance test
NO_TEST=performance test

3cc0b3cf

Aug 26, 2022

perf: introduce Light benchmark · 9818bba4

Nikita Pettik authored 2 years ago

The benchmark is implemented using the Google Benchmark library. Here
are the benchmark settings:
 - values: we use a structure (tuple) containing a pointer to heap memory
           and a size (all payloads are of the same size: 32 bytes);
 - keys: unsigned char (first byte in the tuple memory);
 - hash function: FNV-1a;
 - value comparator: std::memcmp();
 - value count: 10k - 100k - 1M.

Before each test we prepare a vector of tuples storing truly random
values.
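
For reference, the FNV-1a hash named in the settings can be sketched as
follows. This is the standard 32-bit variant with its published offset
basis and prime. Which width the benchmark uses and which bytes it feeds
to the hash are not stated here, so treat both as assumptions; the
function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Standard 32-bit FNV-1a: XOR each byte into the hash, then multiply
// by the FNV prime. Arithmetic wraps modulo 2^32.
uint32_t fnv1a_32(const void *data, std::size_t len)
{
	const unsigned char *bytes = static_cast<const unsigned char *>(data);
	uint32_t hash = 2166136261u;		// FNV offset basis
	for (std::size_t i = 0; i < len; i++) {
		hash ^= bytes[i];
		hash *= 16777619u;		// FNV prime
	}
	return hash;
}
```

An empty input returns the offset basis unchanged, which is a quick
sanity check for any implementation.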

Here's the list of results obtained on my PC (i7-8700 12 X 4600 MHz):

Insertions: ~20-12M per second;
Find (no misses): ~58-16M* per second (find by key gives the same result);
Find (many misses): ~84-30M per second;
Iteration with dereference: ~450M per second;
Insertions after erase: ~50-17M* per second;
Find after erase: ~52-17M* per second (the same as without erase);
Delete: ~32-8M* per second.

* The first value is for 10k values in the hash table; the second is for 1M.

As a baseline, here are the results of a quite similar benchmark for
std::unordered_map (it is also included in the source file):

Insertions: ~26-8M per second;
Find (no misses): ~44-11M per second;
Iteration with dereference: ~265-56M per second;
Find after erase: ~37-13M per second.

Part of #7338

NO_TEST=<Benchmark>
NO_DOC=<Benchmark>
NO_CHANGELOG=<Benchmark>

9818bba4

perf: use C++ 14 standard · e48835fd

Nikita Pettik authored 2 years ago

The C++14 standard introduced a lot of nice features,
so let's use it.

NO_DOC=<Build change>
NO_TEST=<Build change>
NO_CHANGELOG=<Build change>

e48835fd

perf: move debug warning to a separate header · 0a7764a7

Nikita Pettik authored 2 years ago

It's useful and can be used in all performance tests, so let's move it
to a separate header.

NO_TEST=<Refactoring>
NO_DOC=<Refactoring>
NO_CHANGELOG=<Refactoring>

0a7764a7

Jun 28, 2022

tuple: refactor flags · 9da70207

Nikita Pettik authored 2 years ago

Before this patch, struct tuple had two boolean bit fields: is_dirty and
has_uploaded_refs. It is worth mentioning that sizeof(bool) is
implementation-dependent. However, the code assumes it to be 1 byte
(there is a static assertion restricting the size of the whole struct
tuple to 10 bytes). So, strictly speaking, this may lead to a compilation
error on some unconventional system. Secondly, bit fields consume at
least one unit of their underlying type anyway (i.e. there is no space
benefit in using two uint8_t bit fields: together they still occupy
1 byte). There are several known pitfalls concerning bit fields:
 - The memory layout of a bit field is implementation-dependent;
 - sizeof() can't be applied to such members;
 - The compiler may introduce unexpected side effects
   (https://lwn.net/Articles/478657/).

Finally, in our code base as a rule we use explicit masks:
txn flags, vy stmt flags, sql flags, fiber flags.

So, let's replace the bit fields in struct tuple with a single member
called `flags` and several enum values corresponding to masks (to be
more precise, bit positions in the tuple flags).
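
A minimal sketch of that layout is shown below. The enum, struct and
helper names are hypothetical and the struct is reduced to the single
`flags` member; only the idea (enum values as bit positions in one
byte-sized member) comes from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names: each enum value is a bit position in `flags`.
enum tuple_flag {
	TUPLE_IS_DIRTY = 0,
	TUPLE_HAS_UPLOADED_REFS = 1,
	tuple_flag_MAX,
};

// Reduced stand-in for struct tuple: only the flags member is shown.
struct flagged_tuple {
	uint8_t flags;
};

static_assert(tuple_flag_MAX <= 8, "all flags must fit into one byte");
static_assert(sizeof(flagged_tuple) == 1, "size is explicit, unlike bit fields");

inline void tuple_set_flag(flagged_tuple *t, tuple_flag flag)
{
	t->flags = static_cast<uint8_t>(t->flags | (1u << flag));
}

inline void tuple_clear_flag(flagged_tuple *t, tuple_flag flag)
{
	t->flags = static_cast<uint8_t>(t->flags & ~(1u << flag));
}

inline bool tuple_has_flag(const flagged_tuple *t, tuple_flag flag)
{
	return (t->flags & (1u << flag)) != 0;
}
```

Unlike a bit field, the `flags` member has a well-defined size, so
sizeof() and static assertions on the struct size work as expected.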

NO_DOC=<Refactoring>
NO_CHANGELOG=<Refactoring>
NO_TEST=<Refactoring>

9da70207

May 18, 2022

replication: fix race in accessing vclock by applier and tx threads · ddec704e

Serge Petrenko authored 2 years ago

When the applier ack writer was moved to the applier thread, it was
overlooked that it would start sharing replicaset.vclock between two
threads.

This could lead to the following replication errors on master:

 relay//102/main:reader V> Got a corrupted row:
 relay//102/main:reader V> 00000000: 81 00 00 81 26 81 01 09 02 01

Such a row has an incorrectly encoded vclock: `81 01 09 02 01`.
When the writer fiber encoded the vclock length (`81`), there was only
one vclock component: {1: 9}, but while it was iterating over the
components, another WAL write was reported to the TX thread, which
bumped the second vclock component: {1: 9, 2: 1}.

Let's fix the race by delivering a copy of the current replicaset vclock
to the applier thread.
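
The effect of working on a copy can be sketched in C++ as follows. This
is not the applier code: the type and function names are hypothetical,
the encoder only handles small MsgPack fixmaps with single-byte keys and
values, and the way the copy is delivered between threads is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a vclock: replica id -> LSN.
using vclock_sketch = std::map<uint32_t, int64_t>;

// Encode a vclock as a MsgPack fixmap. Simplification: fewer than 16
// components, ids and LSNs below 128 (one byte each).
std::vector<uint8_t> encode_vclock(const vclock_sketch &copy)
{
	assert(copy.size() < 16);
	std::vector<uint8_t> out;
	// The header carries the component count...
	out.push_back(static_cast<uint8_t>(0x80 | copy.size()));
	// ...and the body must contain exactly that many pairs. This holds
	// only if nobody changes the vclock between the two steps, which is
	// guaranteed when `copy` is private to the encoding thread.
	for (const auto &component : copy) {
		assert(component.first < 128);
		assert(component.second >= 0 && component.second < 128);
		out.push_back(static_cast<uint8_t>(component.first));
		out.push_back(static_cast<uint8_t>(component.second));
	}
	return out;
}
```

Encoding a copy taken while the vclock was {1: 9} always yields
`81 01 09`, even if the original is bumped to {1: 9, 2: 1} afterwards.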

Also add a perf test to the perf/ directory.

Closes #7089
Part-of tarantool/tarantool-qa#166

NO_DOC=internal fix
NO_TEST=hard to test

ddec704e

Mar 24, 2022

box: introduce a pair of tuple_format_new helpers · 4b8dc6b7

Aleksandr Lyapunov authored 3 years ago

tuple_format_new has lots of arguments, all of which are indeed
necessary. But a small analysis showed that there are almost always
only two kinds of usage of that function: with lots of zeros as
arguments and with lots of values taken from space_def.

Make two versions of tuple_format_new:
simple_tuple_format_new, with all those zeros omitted, and
space_tuple_format_new, which takes space_def as an argument.

NO_DOC=refactoring
NO_CHANGELOG=refactoring

4b8dc6b7

Mar 23, 2022

Fix undefined reference to `set_sigint_cb` function · 4c48af26

mechanik20051988 authored 2 years ago

We should link box_test_utils to the tuple perf test to
prevent this error.

Follow up #2717

NO_CHANGELOG=build fix
NO_DOC=build fix
NO_TEST=build fix

4c48af26

Mar 03, 2022

alter: implement ability to set compression for tuple fields · a51313a4

mechanik20051988 authored 3 years ago

Implement ability to set compression for tuple fields. Compression type
for tuple fields is set in the space format, and can be set during space
creation or during setting of a new space format.
```lua
format = {{name = 'x', type = 'unsigned', compression = 'none'}}
space = box.schema.space.create('memtx_space', {format = format})
space:drop()
space = box.schema.space.create('memtx_space')
space:format(format)
```
In the open-source build, only one compression type ('none') is
supported. This type means no compression, so it has no effect.

Part of #2695

NO_CHANGELOG=stubs for enterprise version
NO_DOC=stubs for enterprise version

a51313a4

Feb 03, 2022

test: fix incorrect resource release · 438ce64e

mechanik20051988 authored 3 years ago

There were two problems with resource release in the performance test:
- because `box_tuple_last` was zeroed manually, the tuple_format
  structure was not deleted. `box_tuple_last` should be zeroed in the
  `tuple_free` function.
- an invalid loop for resource release in one of the test cases.
This patch fixes both problems.

NO_CHANGELOG=test fix
NO_DOC=test fix

438ce64e

Dec 09, 2021

cmake: align folders dependencies · d8097325

Sergey Ostanevich authored 3 years ago

Using the PROJECT_ prefix makes it possible to build the project as a
submodule of other projects.

d8097325

Aug 18, 2021

memtx: introduce memtx_set_tuple_format_vtab() · 7417aed7

Nikita Pettik authored 3 years ago

This is a helper that sets the proper tuple_format vtable depending on
the allocator's symbolic name.

Follow-up #5419

7417aed7

memtx: introduce allocator_settings structure · 96793697

Nikita Pettik authored 3 years ago

It is intended to accumulate all allocation settings across all
allocators in order to unify the Allocator::create() interface.

Follow-up #5419

96793697

memtx: implement template tuple allocation · 94e137bc

mechanik20051988 authored 3 years ago

This patch prepares the ability to select a memory allocator.
Tuple allocation functions are changed to templates
parameterized by the memory allocator type.
Part of #5419
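
The idea of allocation functions parameterized by an allocator type can
be sketched as below. All names here (raw_tuple, MallocAllocator,
CountingAllocator, tuple_new, tuple_delete) are hypothetical and do not
reflect the actual memtx interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical tuple header followed by the payload.
struct raw_tuple {
	std::size_t size;
};

// Two hypothetical allocator policies with the same static interface.
struct MallocAllocator {
	static void *alloc(std::size_t size) { return std::malloc(size); }
	static void free(void *ptr) { std::free(ptr); }
};

struct CountingAllocator {
	static std::size_t live;	// number of live allocations
	static void *alloc(std::size_t size) { ++live; return std::malloc(size); }
	static void free(void *ptr) { --live; std::free(ptr); }
};
std::size_t CountingAllocator::live = 0;

// The allocator is a template parameter, so the choice is resolved at
// compile time and each instantiation calls its allocator directly.
template <class Allocator>
raw_tuple *tuple_new(std::size_t payload_size)
{
	void *mem = Allocator::alloc(sizeof(raw_tuple) + payload_size);
	if (mem == nullptr)
		return nullptr;
	raw_tuple *tuple = static_cast<raw_tuple *>(mem);
	tuple->size = payload_size;
	return tuple;
}

template <class Allocator>
void tuple_delete(raw_tuple *tuple)
{
	Allocator::free(tuple);
}
```

Switching allocators then amounts to instantiating the same functions
with a different type, e.g. tuple_new<MallocAllocator>(32).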

94e137bc

memtx: replace direct function calls to calls via pointers from vtab · 801c906d

mechanik20051988 authored 3 years ago

Previously, memtx space called memtx_tuple_new/memtx_tuple_delete
directly. Meanwhile, pointers to the functions used to allocate/free
memory for memtx tuples are stored in tuple_format_vtab. Replace the
direct memtx_tuple_new and memtx_tuple_delete calls in memtx_space with
calls via pointers from the vtab.
Part of #5419
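
Calling through a table of function pointers instead of naming the
functions directly looks roughly like this. The sketch uses hypothetical
types (demo_tuple, format_vtab_sketch, format_sketch) with simplified
signatures; it does not reproduce tuple_format_vtab itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct demo_tuple {
	std::size_t size;
};

// Hypothetical vtab: pointers to the functions that create and destroy
// a tuple.
struct format_vtab_sketch {
	demo_tuple *(*tuple_new)(std::size_t size);
	void (*tuple_delete)(demo_tuple *tuple);
};

static demo_tuple *malloc_tuple_new(std::size_t size)
{
	void *mem = std::malloc(sizeof(demo_tuple) + size);
	if (mem == nullptr)
		return nullptr;
	demo_tuple *tuple = static_cast<demo_tuple *>(mem);
	tuple->size = size;
	return tuple;
}

static void malloc_tuple_delete(demo_tuple *tuple)
{
	std::free(tuple);
}

static const format_vtab_sketch malloc_vtab = {
	malloc_tuple_new,
	malloc_tuple_delete,
};

struct format_sketch {
	const format_vtab_sketch *vtab;
};

// The caller goes through the vtab, so swapping the table swaps the
// allocation functions without touching this code.
demo_tuple *space_new_tuple(const format_sketch *format, std::size_t size)
{
	return format->vtab->tuple_new(size);
}

void space_delete_tuple(const format_sketch *format, demo_tuple *tuple)
{
	format->vtab->tuple_delete(tuple);
}
```

The indirection costs one pointer dereference per call, in exchange for
being able to choose the allocation functions at run time.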

801c906d

Aug 12, 2021

perf: introduce tuple perf test · 3dea259c

Aleksandr Lyapunov authored 3 years ago

Part of #5385

3dea259c