- Oct 03, 2023
-
-
Sergey Bronnikov authored
Performance tests in the perf directory are not automated; currently we run them manually from time to time. On the other hand, source code that is rarely used is prone to software rot [1]. The patch adds a CMake target "test-perf" and a GitHub workflow that runs these tests in CI. The workflow is based on release.yml; it builds the performance tests and runs them. 1. https://en.wikipedia.org/wiki/Software_rot NO_CHANGELOG=testing NO_DOC=testing NO_TEST=testing (cherry picked from commit 5edcb712)
-
Sergey Bronnikov authored
The patch adds a target for each C performance test in the perf/ directory and a separate target "test-c-perf" that runs all C performance tests at once. NO_CHANGELOG=testing NO_DOC=testing NO_TEST=test infrastructure (cherry picked from commit 68623381)
-
Sergey Bronnikov authored
The patch adds a target for each Lua performance test in the perf/lua/ directory (1mops_write_perftest, box_select_perftest, uri_escape_unescape_perftest) and a separate target "test-lua-perf" that runs all Lua performance tests at once. NO_CHANGELOG=testing NO_DOC=testing NO_TEST=test infrastructure (cherry picked from commit 49d9a874)
-
Sergey Ostanevich authored
The test can be used for regression testing. It is advisable to tune the machine first: check the NUMA configuration and fix the P-state or any similar CPU autotuning. That said, running the test a dozen times gives a more or less stable peak-performance result, which should be enough to identify regressions. NO_DOC=adding an internal test NO_CHANGELOG=ditto NO_TEST=ditto (cherry picked from commit 10870343)
-
Vladimir Davydov authored
The test runs the get, select, and pairs space methods with various arguments in a loop and prints the average method run time in nanoseconds (lower is better).

Usage:

    tarantool box_select.lua

Output format:

    <test-case> <run-time>

Example:

    $ tarantool box_select.lua --pattern 'get|select_%d$'
    get_0 155
    get_1 240
    select_0 223
    select_1 335
    select_5 2321

Options:

    --pattern <string>  run only tests matching the pattern; use '|' to
                        specify more than one pattern, for example,
                        'get|select'
    --read_view         use a read view (EE only)

Apart from the test, this patch also adds a script that compares test results:

    $ tarantool box_select.lua --pattern get > base
    $ tarantool box_select.lua --pattern get > patched1
    $ tarantool box_select.lua --pattern get > patched2
    $ tarantool compare.lua base patched1 patched2
           base  patched1     patched2
    get_0  149   303 (+103%)  147 (-  1%)
    get_1  239   418 (+ 74%)  238 (-  0%)

NO_DOC=perf test NO_TEST=perf test NO_CHANGELOG=perf test (cherry picked from commit 114d09f5)
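The percentages in the comparison output look like the relative change of each patched run against the base column. A minimal sketch of that arithmetic follows; the function name is hypothetical, and truncation toward zero is an assumption that merely agrees with the sample numbers above. The actual compare.lua may compute or round differently.

```c
/*
 * Relative change of `patched` against `base`, in whole percent.
 * Hypothetical helper, not taken from compare.lua.
 * C integer division truncates toward zero, so 74.9 becomes 74
 * and -1.3 becomes -1. `base` must be nonzero.
 */
static long
example_percent_change(long base, long patched)
{
	return (patched - base) * 100 / base;
}
```

For the sample rows, `example_percent_change(149, 303)` yields 103 and `example_percent_change(239, 418)` yields 74.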
-
- Jan 24, 2023
-
-
Sergey Bronnikov authored
Commit 2be74a65 ("test/cmake: add a function for generating unit test targets") added a function for generating unit test targets in CMake. This function makes the code simpler and less error-prone. This patch adds a similar function for generating performance test targets in CMake. NO_CHANGELOG=build infrastructure updated NO_DOC=build infrastructure updated NO_TEST=build infrastructure updated
-
- Dec 27, 2022
-
-
Sergey Bronnikov authored
Added a simple benchmark for URI escape/unescape. Part of #3682 NO_DOC=documentation is not required for performance test NO_CHANGELOG=performance test NO_TEST=performance test
-
- Aug 26, 2022
-
-
Nikita Pettik authored
The benchmark is implemented using the Google Benchmark library. Benchmark settings:

- values: a structure (tuple) containing a pointer to heap memory and a size (all payloads are of the same size, 32 bytes);
- keys: unsigned char (the first byte in the tuple memory);
- hash function: FNV-1a;
- value comparator: std::memcmp();
- value count: 10k - 100k - 1M

Before each test we prepare a vector of tuples storing truly random values. Here are the results obtained on my PC (i7-8700 12 X 4600 MHz):

- Insertions: ~20-12M per second;
- Find (no misses): ~58-16M* per second (find by key gives the same result);
- Find (many misses): ~84-30M per second;
- Iteration with dereference: ~450M per second;
- Insertions after erase: ~50-17M* per second;
- Find after erase: ~52-17M* per second (the same as without erase);
- Delete: ~32-8M* per second.

* The first value is for 10k values in the hash table; the second is for 1M.

As a baseline, here are the results of a similar benchmark for std::unordered_map (it is also included in the source file):

- Insertions: ~26-8M per second;
- Find (no misses): ~44-11M per second;
- Iteration with dereference: ~265-56M per second;
- Find after erase: ~37-13M per second.

Part of #7338 NO_TEST=<Benchmark> NO_DOC=<Benchmark> NO_CHANGELOG=<Benchmark>
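For reference, the FNV-1a hash named in the settings can be sketched as below. This is the standard 32-bit variant with the published offset basis and prime; the function name is hypothetical, and the width actually used by the benchmark is an assumption, since the message does not state it.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * 32-bit FNV-1a: for every input byte, XOR it into the hash first,
 * then multiply by the FNV prime. Hypothetical helper name; the
 * benchmark's own implementation may differ (e.g. a 64-bit variant).
 */
static uint32_t
example_fnv1a_32(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t hash = 2166136261u;	/* FNV-1a 32-bit offset basis */
	for (size_t i = 0; i < len; i++) {
		hash ^= p[i];
		hash *= 16777619u;	/* FNV 32-bit prime */
	}
	return hash;
}
```

With single-byte keys, as in the benchmark, the call would be `example_fnv1a_32(&key, 1)`.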
-
Nikita Pettik authored
The C++14 standard introduced a lot of nice things, so let's use it. NO_DOC=<Build change> NO_TEST=<Build change> NO_CHANGELOG=<Build change>
-
Nikita Pettik authored
It's useful and can be used in all performance tests, so let's move it to a separate header. NO_TEST=<Refactoring> NO_DOC=<Refactoring> NO_CHANGELOG=<Refactoring>
-
- Jun 28, 2022
-
-
Nikita Pettik authored
Before this patch struct tuple had two boolean bit fields: is_dirty and has_uploaded_refs. It is worth mentioning that sizeof(bool) is implementation-dependent. However, the code assumes it to be 1 byte (there's a static assertion restricting the whole struct tuple size to 10 bytes). So, strictly speaking, it may lead to a compilation error on some unconventional system. Secondly, bit fields consume at least one unit of their underlying type anyway (i.e. there's no space benefit in using two uint8_t bit fields: they still occupy 1 byte in total).

There are several known pitfalls concerning bit fields:

- A bit field's memory layout is implementation-dependent;
- sizeof() can't be applied to such members;
- The compiler may introduce unexpected side effects (https://lwn.net/Articles/478657/).

Finally, as a rule, our code base uses explicit masks: txn flags, vy stmt flags, sql flags, fiber flags. So, let's replace the bit fields in struct tuple with a single member called `flags` and several enum values corresponding to masks (to be more precise, to bit positions in the tuple flags).

NO_DOC=<Refactoring> NO_CHANGELOG=<Refactoring> NO_TEST=<Refactoring>
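The approach described above can be sketched as follows. All names here are hypothetical and only illustrate the pattern of a single `flags` byte addressed through enum bit positions; they are not the actual struct tuple definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names: enum values are bit positions, not masks. */
enum example_tuple_flag {
	EXAMPLE_TUPLE_IS_DIRTY = 0,
	EXAMPLE_TUPLE_HAS_UPLOADED_REFS = 1,
	example_tuple_flag_MAX,
};

/* Stand-in for the real structure: one byte holds all flags. */
struct example_tuple {
	uint8_t flags;
};

static inline void
example_tuple_set_flag(struct example_tuple *t, enum example_tuple_flag f)
{
	t->flags |= (uint8_t)(1u << f);
}

static inline void
example_tuple_clear_flag(struct example_tuple *t, enum example_tuple_flag f)
{
	t->flags &= (uint8_t)~(1u << f);
}

static inline bool
example_tuple_has_flag(const struct example_tuple *t,
		       enum example_tuple_flag f)
{
	return (t->flags & (1u << f)) != 0;
}
```

Unlike bit fields, the layout here is fully explicit, and sizeof() applies to the `flags` member as usual.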
-
- May 18, 2022
-
-
Serge Petrenko authored
When the applier ack writer was moved to the applier thread, it was overlooked that it would start sharing replicaset.vclock between two threads. This could lead to the following replication errors on master:

    relay//102/main:reader V> Got a corrupted row:
    relay//102/main:reader V> 00000000: 81 00 00 81 26 81 01 09 02 01

Such a row has an incorrectly encoded vclock: `81 01 09 02 01`. When the writer fiber encoded the vclock length (`81`), there was only one vclock component: {1: 9}, but by the time it iterated over the components, another WAL write had been reported to the TX thread, which bumped the second vclock component: {1: 9, 2: 1}.

Let's fix the race by delivering a copy of the current replicaset vclock to the applier thread.

Also add a perf test to the perf/ directory.

Closes #7089 Part-of tarantool/tarantool-qa#166 NO_DOC=internal fix NO_TEST=hard to test
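The idea of the fix can be illustrated with a deliberately simplified sketch. The structure and encoder below are hypothetical stand-ins, not Tarantool's struct vclock or its MsgPack helpers; the encoder only handles up to 15 components with values below 128, so that each fits a single MsgPack byte. The point is that the length byte and the components are both taken from one private snapshot, so a concurrent update of the shared vclock cannot make them disagree.

```c
#include <stddef.h>
#include <stdint.h>

enum { EXAMPLE_VCLOCK_MAX = 8 };

/* Hypothetical simplified vclock: lsn[i] != 0 means component i is set. */
struct example_vclock {
	uint8_t lsn[EXAMPLE_VCLOCK_MAX];
};

/*
 * Encode as a MsgPack fixmap: header byte 0x80 | count, followed by
 * id/lsn pairs as positive fixints (values must stay below 128).
 * `out` must have room for 1 + 2 * EXAMPLE_VCLOCK_MAX bytes.
 * Returns the number of bytes written.
 */
static size_t
example_vclock_encode(const struct example_vclock *v, uint8_t *out)
{
	size_t count = 0;
	for (size_t i = 0; i < EXAMPLE_VCLOCK_MAX; i++)
		count += v->lsn[i] != 0;
	size_t pos = 0;
	out[pos++] = (uint8_t)(0x80 | count);
	for (size_t i = 0; i < EXAMPLE_VCLOCK_MAX; i++) {
		if (v->lsn[i] == 0)
			continue;
		out[pos++] = (uint8_t)i;
		out[pos++] = v->lsn[i];
	}
	return pos;
}
```

The owning thread would take a snapshot by plain struct assignment (`struct example_vclock copy = shared;`) and hand `copy` to the writer, which then encodes `81 01 09` for {1: 9} even if the shared vclock has meanwhile become {1: 9, 2: 1}.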
-
- Mar 24, 2022
-
-
Aleksandr Lyapunov authored
tuple_format_new has lots of arguments, all of them necessary indeed. But a small analysis showed that there are almost always only two kinds of usage of that function: with lots of zeros as arguments, and with lots of values taken from space_def. Make two versions of tuple_format_new: simple_tuple_format_new, with all those zeros omitted, and space_tuple_format_new, which takes space_def as an argument. NO_DOC=refactoring NO_CHANGELOG=refactoring
-
- Mar 23, 2022
-
-
mechanik20051988 authored
We should link box_test_utils to tuple perf test to prevent this error. Follow up #2717 NO_CHANGELOG=build fix NO_DOC=build fix NO_TEST=build fix
-
- Mar 03, 2022
-
-
mechanik20051988 authored
Implement the ability to set compression for tuple fields. The compression type for tuple fields is set in the space format, and can be set during space creation or when setting a new space format.

```lua
format = {{name = 'x', type = 'unsigned', compression = 'none'}}
space = box.schema.space.create('memtx_space', {format = format})
space:drop()
space = box.schema.space.create('memtx_space')
space:format(format)
```

In the open-source build only one compression type ('none') is supported. This type means the absence of compression, so it doesn't affect anything.

Part of #2695 NO_CHANGELOG=stubs for enterprise version NO_DOC=stubs for enterprise version
-
- Feb 03, 2022
-
-
mechanik20051988 authored
There were two problems with resource release in the performance test:

- because `box_tuple_last` was zeroed manually, the tuple_format structure was not deleted. `box_tuple_last` should be zeroed in the `tuple_free` function.
- an invalid loop for resource release in one of the test cases.

This patch fixes both problems.

NO_CHANGELOG=test fix NO_DOC=test fix
-
- Dec 09, 2021
-
-
Sergey Ostanevich authored
Using the PROJECT_ prefix makes it possible to build the project as a submodule of other projects.
-
- Aug 18, 2021
-
-
Nikita Pettik authored
This is a helper to set the proper tuple_format vtable depending on the allocator's symbolic name. Follow-up #5419
-
Nikita Pettik authored
It is intended to accumulate all allocation settings across all allocators in order to unify the Allocator::create() interface. Follow-up #5419
-
mechanik20051988 authored
This patch prepares the ability to select a memory allocator. Tuple allocation functions are changed to templates parameterized by the memory allocator type. Part of #5419
-
mechanik20051988 authored
Previously, memtx space used direct memtx_tuple_new/memtx_tuple_delete function calls. At the same time, pointers to the functions used to allocate and free memory for memtx tuples are stored in tuple_format_vtab. Replaced the direct memtx_tuple_new and memtx_tuple_delete calls in memtx_space with calls via pointers from the vtab. Part of #5419
-
- Aug 12, 2021
-
-
Aleksandr Lyapunov authored
Part of #5385
-