  1. Nov 08, 2023
  2. Oct 19, 2023
    • box: call tuple_free from box_free · 05751e6c
      Vladimir Davydov authored
      There are four problems we have to address to make this possible:
      
       1. memtx_engine_shutdown may delete the tuple referenced by
          box_tuple_last so that tuple_free, which is called later by
          box_free, will crash trying to free it. Fix this by clearing
          box_tuple_last in memtx_engine_shutdown.
      
       2. tuple_format_destroy and tuple_field_delete, called by it, expect
          all constraints to be detached. Let's destroy the constraints if
          this isn't the case. This effectively reverts commit 7a87b9a5
          ("box: do not call constraint[i].destroy() in
          tuple_field_delete()").
      
       3. tuple_field_delete, called by tuple_format_destroy, expects the
          default value function to be unpinned. Let's unpin it if this isn't
          the case. To avoid linking dependencies between the tuple and box
          libraries, we have to introduce a virtual destructor for
          field_default_func.
      
       4. The tuple_format unit test calls tuple_free after box_free. If
          box_free calls tuple_free by itself, this leads to a crash. Fix this
          by removing tuple_free and tuple_init calls from the test.
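
A heavily simplified sketch of fix 1 above — clearing the cached tuple reference during engine shutdown so the later `tuple_free` call from `box_free` cannot touch freed memory. All types and function bodies here are hypothetical stand-ins, not the actual Tarantool code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified model of the shutdown ordering
 * described above; the real Tarantool types and functions differ. */

struct tuple { int refs; };

/* Cache of the last tuple returned through the box API. */
static struct tuple *box_tuple_last = NULL;

static void
memtx_engine_shutdown(void)
{
	/*
	 * The engine may delete the tuple referenced by box_tuple_last,
	 * so clear it here; otherwise tuple_free() would later try to
	 * free a dangling pointer.
	 */
	if (box_tuple_last != NULL) {
		free(box_tuple_last);
		box_tuple_last = NULL;
	}
}

static void
tuple_free(void)
{
	/* Safe: the stale reference was cleared during shutdown. */
	if (box_tuple_last != NULL) {
		free(box_tuple_last);
		box_tuple_last = NULL;
	}
}

static void
box_free(void)
{
	memtx_engine_shutdown();
	tuple_free(); /* now called from box_free itself */
}
```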
      
      Closes #9174
      
      NO_DOC=code health
      NO_CHANGELOG=code health
      NO_TEST=checked by existing tests
  3. Oct 09, 2023
    • perf: add memtx benchmark · 2b7d9027
      Georgiy Lebedev authored
      This first version is quite basic and only benchmarks random `get`s of
      existing keys and `select`s of all keys for a tree index (these benchmarks
      are needed for #6964) — its main goal is to provide a foundation (i.e., all
      the necessary initialization logic) for benchmarking memtx. Extending this
      benchmark using the provided memtx singleton and fixture should be fairly
      simple.
      
      The results of running this benchmark compiled with clang-16 on my Intel
      MacBook Pro (13-inch, 2020) laptop [1]:
      
      NO_WRAP
      georgiy.lebedev@georgiy-lebedev perf % ./memtx.perftest --benchmark_min_warmup_time=10 --benchmark_repetitions=10 --benchmark_report_aggregates_only=true --benchmark_display_aggregates_only=true
      2023-10-02T12:59:36+03:00
      Running ./memtx.perftest
      Run on (8 X 2000 MHz CPU s)
      CPU Caches:
        L1 Data 48 KiB
        L1 Instruction 32 KiB
        L2 Unified 512 KiB (x4)
        L3 Unified 6144 KiB
      Load Average: 5.67, 10.05, 7.89
      mapping 4398046511104 bytes for memtx tuple arena...
      Actual slab_alloc_factor calculated on the basis of desired slab_alloc_factor = 1.090508
      fiber has not yielded for more than 0.500 seconds
      --------------------------------------------------------------------------------------------------------
      Benchmark                                              Time             CPU   Iterations UserCounters...
      --------------------------------------------------------------------------------------------------------
      MemtxFixture/TreeGetRandomExistingKeys_mean          682 ns          667 ns           10 items_per_second=1.51504M/s
      MemtxFixture/TreeGetRandomExistingKeys_median        704 ns          693 ns           10 items_per_second=1.44387M/s
      MemtxFixture/TreeGetRandomExistingKeys_stddev       81.7 ns         72.7 ns           10 items_per_second=169.696k/s
      MemtxFixture/TreeGetRandomExistingKeys_cv          11.97 %         10.90 %            10 items_per_second=11.20%
      MemtxFixture/TreeGet1RandomExistingKey_mean          253 ns          241 ns           10 items_per_second=4.20104M/s
      MemtxFixture/TreeGet1RandomExistingKey_median        233 ns          229 ns           10 items_per_second=4.36911M/s
      MemtxFixture/TreeGet1RandomExistingKey_stddev       46.7 ns         29.7 ns           10 items_per_second=464.187k/s
      MemtxFixture/TreeGet1RandomExistingKey_cv          18.43 %         12.34 %            10 items_per_second=11.05%
      MemtxFixture/TreeSelectAll_mean                  4766728 ns      4705622 ns           10 items_per_second=27.978M/s
      MemtxFixture/TreeSelectAll_median                4605936 ns      4580478 ns           10 items_per_second=28.6184M/s
      MemtxFixture/TreeSelectAll_stddev                 447495 ns       349499 ns           10 items_per_second=1.85573M/s
      MemtxFixture/TreeSelectAll_cv                       9.39 %          7.43 %            10 items_per_second=6.63%
      NO_WRAP
      
      [1]: https://support.apple.com/kb/SP819?locale=en_US
      
      Needed for #6964
      
      NO_CHANGELOG=benchmark
      NO_DOC=benchmark
      NO_TEST=benchmark
  4. Oct 03, 2023
    • ci: run performance tests · 5edcb712
      Sergey Bronnikov authored
       Performance tests added to the perf/ directory are not automated;
       currently we run them manually from time to time. On the other
       hand, rarely used source code can lead to software rot [1].
       
       The patch adds a CMake target "test-perf" and a GitHub workflow
       that runs these tests in CI. The workflow is based on release.yml:
       it builds the performance tests and runs them.
      
      1. https://en.wikipedia.org/wiki/Software_rot
      
      NO_CHANGELOG=testing
      NO_DOC=testing
      NO_TEST=testing
    • perf: add targets for running C performance tests · 68623381
      Sergey Bronnikov authored
       The patch adds a target for each C performance test in the perf/
       directory and a separate target "test-c-perf" that runs all C
       performance tests at once.
      
      NO_CHANGELOG=testing
      NO_DOC=testing
      NO_TEST=test infrastructure
    • perf: add targets for running Lua performance tests · 49d9a874
      Sergey Bronnikov authored
       The patch adds a target for each Lua performance test in the
       perf/lua/ directory (1mops_write_perftest, box_select_perftest,
       uri_escape_unescape_perftest) and a separate target
       "test-lua-perf" that runs all Lua performance tests at once.
      
      NO_CHANGELOG=testing
      NO_DOC=testing
      NO_TEST=test infrastructure
  5. Sep 29, 2023
    • box: implement sort_order in indexes · b1990b21
      Magomed Kostoev authored
       The `sort_order` parameter was introduced earlier but had no effect
       until now. Now it allows specifying a sort (iteration) order for
       each key part.
      
      The parameter is only applicable to ordered indexes, so any value
      except 'undef' for the `sort_order` is disallowed for all indexes
      except TREE. The 'undef' value of the `sort_order` field of the
      `key_part_def` is translated to 'asc' on `key_part` creation.
      
       In order to make the key def aware of whether its index is
       unordered, the signature of `key_def_new` has been changed: the
       `for_func_index` parameter has been moved into the new `flags`
       parameter, and an `is_unordered` flag has been introduced.
       
       Alternative iterator names have been introduced (aliases for the
       regular iterators): box.index.FORWARD_[INCLUSIVE/EXCLUSIVE],
       box.index.REVERSE_[INCLUSIVE/EXCLUSIVE].
       
       Along the way, fixed the `key_hint_stub` overload name, which was
       supposed to be `tuple_hint_stub`.
       
       The `tuple_hint` and `key_hint` template declarations have been
       changed because of the checkpatch diagnostics.
      
      Closes #5529
      
      @TarantoolBot document
       Title: Now it's possible to specify the sort order of each index part.
      
       Sort order specifies the way indexes iterate over tuples with
       different fields in the same part. It can be either ascending
       (the default) or descending.
       
       Tuples with different ascending parts are sorted in indexes from
       lesser to greater, whereas tuples with different descending parts
       are sorted in the opposite order: from greater to lesser.
      
       Consider this example:
      
      ```lua
      box.cfg{}
      
      s = box.schema.create_space('tester')
      pk = s:create_index('pk', {parts = {
        {1, 'unsigned', sort_order = 'desc'},
        {2, 'unsigned', sort_order = 'asc'},
        {3, 'unsigned', sort_order = 'desc'},
      }})
      
      s:insert({1, 1, 1})
      s:insert({1, 1, 2})
      s:insert({1, 2, 1})
      s:insert({1, 2, 2})
      s:insert({2, 1, 1})
      s:insert({2, 1, 2})
      s:insert({2, 2, 1})
      s:insert({2, 2, 2})
      s:insert({3, 1, 1})
      s:insert({3, 1, 2})
      s:insert({3, 2, 1})
      s:insert({3, 2, 2})
      ```
      
       In this case fields 1 and 3 are descending, whereas field 2 is
       ascending, so `s:select()` will return this result:
      
      ```yaml
      ---
      - [3, 1, 2]
      - [3, 1, 1]
      - [3, 2, 2]
      - [3, 2, 1]
      - [2, 1, 2]
      - [2, 1, 1]
      - [2, 2, 2]
      - [2, 2, 1]
      - [1, 1, 2]
      - [1, 1, 1]
      - [1, 2, 2]
      - [1, 2, 1]
      ...
      ```
      
       Beware that when using a sort order other than 'asc' for any
       field, the 'GE', 'GT', 'LE' and 'LT' iterators lose their usual
       meaning and instead specify the 'forward inclusive', 'forward
       exclusive', 'reverse inclusive' and 'reverse exclusive' iteration
       directions respectively. Given the example above,
       `s:select({2}, {iterator = 'GT'})` will return this:
      
      ```yaml
      ---
      - [1, 1, 2]
      - [1, 1, 1]
      - [1, 2, 2]
      - [1, 2, 1]
      ...
      ```
      
      And `s:select({1}, {iterator = 'LT'})` will give us:
      
      ```yaml
      ---
      - [2, 2, 1]
      - [2, 2, 2]
      - [2, 1, 1]
      - [2, 1, 2]
      - [3, 2, 1]
      - [3, 2, 2]
      - [3, 1, 1]
      - [3, 1, 2]
      ...
      ```
      
       To be more explicit, the alternative iterator aliases can be
       used: 'FORWARD_INCLUSIVE', 'FORWARD_EXCLUSIVE',
       'REVERSE_INCLUSIVE', 'REVERSE_EXCLUSIVE':
      
      ```
      > s:select({1}, {iterator = 'REVERSE_EXCLUSIVE'})
      ---
      - [2, 2, 1]
      - [2, 2, 2]
      - [2, 1, 1]
      - [2, 1, 2]
      - [3, 2, 1]
      - [3, 2, 2]
      - [3, 1, 1]
      - [3, 1, 2]
      ...
      ```
  6. Aug 08, 2023
    • perf: initial version of 1M operations test · 10870343
      Sergey Ostanevich authored
       The test can be used for regression testing. It is advisable to
       tune the machine: check the NUMA configuration and pin the
       P-state or similar CPU autotuning. Still, running the test a
       dozen times gives a more or less stable peak-performance result,
       which should be enough for regression identification.
      
      NO_DOC=adding an internal test
      NO_CHANGELOG=ditto
      NO_TEST=ditto
  7. Jul 25, 2023
    • perf: add test for box select · 114d09f5
      Vladimir Davydov authored
       The test runs the get, select, and pairs space methods with
       various arguments in a loop and prints the average method run
       time in nanoseconds (lower is better).
      
      Usage:
      
        tarantool box_select.lua
      
      Output format:
      
        <test-case> <run-time>
      
      Example:
      
        $ tarantool box_select.lua --pattern 'get|select_%d$'
        get_0 155
        get_1 240
        select_0 223
        select_1 335
        select_5 2321
      
      Options:
      
        --pattern <string>  run only tests matching the pattern; use '|'
                            to specify more than one pattern, for example,
                            'get|select'
        --read_view         use a read view (EE only)
      
      Apart from the test, this patch also adds a script that compares test
      results:
      
        $ tarantool box_select.lua --pattern get > base
        $ tarantool box_select.lua --pattern get > patched1
        $ tarantool box_select.lua --pattern get > patched2
        $ tarantool compare.lua base patched1 patched2
               base          patched1          patched2
        get_0   149       303 (+103%)       147 (-  1%)
        get_1   239       418 (+ 74%)       238 (-  0%)
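
The percentage column appears to be the run-time delta relative to the base column, truncated toward zero (note "418 (+ 74%)" where rounding would give 75%). A minimal sketch of that computation — an assumption about the formula, not the actual compare.lua code:

```c
#include <assert.h>

/*
 * Percent difference of a patched run time versus the base run
 * time, truncated toward zero, as in "303 (+103%)". This is a
 * guess at the formula compare.lua uses, not its actual code.
 */
static int
percent_delta(long base, long patched)
{
	return (int)((patched - base) * 100 / base);
}
```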
      
      NO_DOC=perf test
      NO_TEST=perf test
      NO_CHANGELOG=perf test
  8. Mar 23, 2023
  9. Jan 24, 2023
    • perf/cmake: add a function for generating perf test targets · ca58d6c9
      Sergey Bronnikov authored
       Commit 2be74a65 ("test/cmake: add a function for generating unit
       test targets") added a function for generating unit test targets
       in CMake. This function makes the code simpler and less
       error-prone.
       
       The proposed patch adds a similar function for generating
       performance test targets in CMake.
      
      NO_CHANGELOG=build infrastructure updated
      NO_DOC=build infrastructure updated
      NO_TEST=build infrastructure updated
  10. Dec 27, 2022
    • perf: add uri.escape/unescape test · 3cc0b3cf
      Sergey Bronnikov authored
      Added a simple benchmark for URI escape/unescape.
      
      Part of #3682
      
      NO_DOC=documentation is not required for performance test
      NO_CHANGELOG=performance test
      NO_TEST=performance test
  11. Aug 26, 2022
    • perf: introduce Light benchmark · 9818bba4
      Nikita Pettik authored
       The benchmark is implemented using the Google Benchmark library.
       Here are the benchmark settings:
        - values: we use a structure (tuple) containing a pointer to
                  heap memory and a size (all payloads are of the same
                  size: 32 bytes);
        - keys: unsigned char (the first byte of the tuple memory);
        - hash function: FNV-1a;
        - value comparator: std::memcmp();
        - value count: 10k, 100k, 1M.
       
       Before each test we prepare a vector of tuples storing truly
       random values.
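
For reference, the FNV-1a hash named in the settings can be sketched in a few lines. This is the standard 32-bit variant with the usual offset basis and prime; the benchmark's actual implementation may differ in width or interface:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over an arbitrary byte buffer. */
static uint32_t
fnv1a(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t h = 2166136261u;       /* FNV offset basis */
	for (size_t i = 0; i < len; i++) {
		h ^= p[i];              /* xor in the next byte... */
		h *= 16777619u;         /* ...then multiply by the FNV prime */
	}
	return h;
}
```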
      
      Here's the list of results obtained on my PC (i7-8700 12 X 4600 MHz):
      
      Insertions: ~20-12M per second;
      Find (no misses): ~58-16M* per second (find by key gives the same result);
      Find (many misses): ~84-30M per second;
      Iteration with dereference: ~450M per second;
      Insertions after erase: ~50-17M* per second;
      Find after erase: ~52-17M* per second (the same as without erase);
      Delete: ~32-8M* per second.
      
      * The first value is for 10k values in hash table; second - is for 1M.
      
       Just to have a baseline, here are the results of a quite similar
       benchmark for std::unordered_map (it is also included in the
       source file):
      
      Insertions: ~26-8M per second;
      Find (no misses): ~44-11M per second;
      Iteration with dereference: ~265-56M per second;
      Find after erase: ~37-13M per second.
      
      Part of #7338
      
      NO_TEST=<Benchmark>
      NO_DOC=<Benchmark>
      NO_CHANGELOG=<Benchmark>
    • perf: use C++ 14 standard · e48835fd
      Nikita Pettik authored
       There are a lot of nice things introduced in the C++14 standard,
       so let's use it.
      
      NO_DOC=<Build change>
      NO_TEST=<Build change>
      NO_CHANGELOG=<Build change>
    • perf: move debug warning to a separate header · 0a7764a7
      Nikita Pettik authored
       It's useful in all performance tests, so let's move it to a
       separate header.
      
      NO_TEST=<Refactoring>
      NO_DOC=<Refactoring>
      NO_CHANGELOG=<Refactoring>
  12. Jun 28, 2022
    • tuple: refactor flags · 9da70207
      Nikita Pettik authored
       Before this patch struct tuple had two boolean bit fields:
       is_dirty and has_uploaded_refs. It is worth mentioning that
       sizeof(bool) is implementation-dependent. However, the code
       assumes it to be 1 byte (there's a static assertion restricting
       the whole struct tuple size to 10 bytes). So, strictly speaking,
       this may lead to a compilation error on some non-conventional
       system. Secondly, bit fields consume at least one size of their
       type anyway (i.e., there's no space benefit in using two uint8_t
       bit fields: they occupy 1 byte in total either way). There are
       several known pitfalls concerning bit fields:
        - a bit field's memory layout is implementation-dependent;
        - sizeof() can't be applied to such members;
        - the compiler may introduce unexpected side effects
          (https://lwn.net/Articles/478657/).
       
       Finally, our code base as a rule uses explicit masks:
       txn flags, vy stmt flags, sql flags, fiber flags.
       
       So, let's replace the bit fields in struct tuple with a single
       member called `flags` and several enum values corresponding to
       masks (to be more precise, bit positions in the tuple flags).
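
The resulting pattern can be sketched like this. The mask names and helpers are illustrative, not the actual Tarantool identifiers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions packed into a single explicit flags byte, replacing
 * the two implementation-defined boolean bit fields. Names here are
 * illustrative, not the real enum values from the tuple code. */
enum tuple_flag {
	TUPLE_IS_DIRTY = 1 << 0,
	TUPLE_HAS_UPLOADED_REFS = 1 << 1,
};

struct tuple {
	uint8_t flags;
	/* ... other fields ... */
};

static inline void
tuple_set_flag(struct tuple *t, enum tuple_flag f)
{
	t->flags |= f;
}

static inline void
tuple_clear_flag(struct tuple *t, enum tuple_flag f)
{
	t->flags &= ~f;
}

static inline bool
tuple_has_flag(const struct tuple *t, enum tuple_flag f)
{
	return (t->flags & f) != 0;
}
```

Unlike bit fields, the mask layout is fully defined by the enum values, so the 10-byte static assertion on struct tuple holds on any conforming compiler.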
      
      NO_DOC=<Refactoring>
      NO_CHANGELOG=<Refactoring>
      NO_TEST=<Refactoring>
  13. May 18, 2022
    • replication: fix race in accessing vclock by applier and tx threads · ddec704e
      Serge Petrenko authored
       When the applier ack writer was moved to the applier thread, it
       was overlooked that it would start sharing replicaset.vclock
       between two threads.
       
       This could lead to the following replication errors on the master:
       
        relay//102/main:reader V> Got a corrupted row:
        relay//102/main:reader V> 00000000: 81 00 00 81 26 81 01 09 02 01
       
       Such a row has an incorrectly encoded vclock: `81 01 09 02 01`.
       When the writer fiber encoded the vclock length (`81`), there was
       only one vclock component: {1: 9}, but by the time it iterated
       over the components, another WAL write had been reported to the
       TX thread, which bumped the vclock to a second component:
       {1: 9, 2: 1}.
      
       Let's fix the race by delivering a copy of the current replicaset
       vclock to the applier thread.
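
The essence of the fix — handing the applier thread a snapshot rather than the live vclock — can be sketched as follows. The struct layout and `vclock_copy` here are simplified stand-ins for the real Tarantool types:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A tiny stand-in for the real vclock: one LSN per replica id.
 * The actual struct vclock and vclock_copy() in Tarantool differ. */
#define VCLOCK_MAX 32

struct vclock {
	int64_t lsn[VCLOCK_MAX];
};

/*
 * Take a snapshot before handing the vclock to another thread, so a
 * concurrent bump in the TX thread cannot change the component set
 * while the applier thread is encoding it.
 */
static void
vclock_copy(struct vclock *dst, const struct vclock *src)
{
	memcpy(dst, src, sizeof(*src));
}
```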
      
      Also add a perf test to the perf/ directory.
      
      Closes #7089
      Part-of tarantool/tarantool-qa#166
      
      NO_DOC=internal fix
      NO_TEST=hard to test
  14. Mar 24, 2022
    • box: introduce a pair of tuple_format_new helpers · 4b8dc6b7
      Aleksandr Lyapunov authored
       tuple_format_new has lots of arguments, all of them indeed
       necessary. But a small analysis showed that there are almost
       always only two kinds of usage of that function: with lots of
       zeros as arguments, or with lots of values taken from space_def.
       
       Make two versions of tuple_format_new:
       simple_tuple_format_new, with all those zeros omitted, and
       space_tuple_format_new, which takes space_def as an argument.
      
      NO_DOC=refactoring
      NO_CHANGELOG=refactoring
  15. Mar 23, 2022
  16. Mar 03, 2022
    • alter: implement ability to set compression for tuple fields · a51313a4
      mechanik20051988 authored
       Implement the ability to set compression for tuple fields. The
       compression type for tuple fields is set in the space format and
       can be set during space creation or when setting a new space
       format.
      ```lua
      format = {{name = 'x', type = 'unsigned', compression = 'none'}}
      space = box.schema.space.create('memtx_space', {format = format})
      space:drop()
      space = box.schema.space.create('memtx_space')
      space:format(format)
      ```
       For the open-source build, only one compression type ('none') is
       supported. This type means the absence of compression, so it
       doesn't affect anything.
      
      Part of #2695
      
      NO_CHANGELOG=stubs for enterprise version
      NO_DOC=stubs for enterprise version
  17. Feb 03, 2022
    • test: fix incorrect resource release · 438ce64e
      mechanik20051988 authored
       There were two problems with resource release in the performance
       test:
       - because of the manual zeroing of `box_tuple_last`, the
         tuple_format structure was not deleted; `box_tuple_last` should
         be zeroed in the `tuple_free` function;
       - an invalid loop for resource release in one of the test cases.
       This patch fixes both problems.
      
      NO_CHANGELOG=test fix
      NO_DOC=test fix
  18. Dec 09, 2021
  19. Aug 18, 2021
  20. Aug 12, 2021