  1. May 07, 2019
    • box: introduce multikey indexes in memtx · f1d9f257
      Kirill Shcherbatov authored
      - In the case of a multikey index an ambiguity arises: which
        key should be used in a comparison. The previously introduced
        comparison hints act as a non-negative numeric index of the
        key to use,
      - Memtx B+ tree replace and build_next methods have been
        patched to insert the same tuple multiple times under
        different logical indexes of the key in the array,
      - Field maps have been extended with service areas ("extents")
        that contain the offsets of multikey index keys, addressed by
        an additional logical index.
      
      Part of #1257
      
      @TarantoolBot document
      Title: introduce multikey indexes in memtx
      Any JSON index in which at least one part contains the array
      index placeholder sign "[*]" is called "multikey".
      Such indexes allow you to automatically index a set of documents
      having the same document structure.
      
      The multikey index design has a number of restrictions that must
      be taken into account:
       - it cannot be primary because of the ambiguity arising from
         its definition (a primary index requires one unique key
         that identifies a tuple),
       - if some node in the JSON tree of all defined indexes contains
         an array index placeholder [*], no other JSON path can use an
         explicit JSON index on its nested fields,
       - it supports "unique" semantics, but its uniqueness is a
         little different from conventional indexes: you may insert a
         tuple in which the same key occurs multiple times into a
         unique multikey index, but you cannot insert a tuple when any
         of its keys is in some other tuple stored in the space,
       - a "duplicate" conflict in a unique multikey index occurs when
         the sets of extracted keys have a non-empty logical
         intersection,
       - to identify the different keys by which a given data tuple is
         indexed, each key is assigned a logical sequence number in
         the array defined with the array index placeholder [*] in the
         index (such an array is called the multikey index root),
       - no index part can contain more than one array index
         placeholder sign [*] in its JSON path,
       - all parts containing JSON paths with the array index
         placeholder [*] must have the same (in terms of JSON tokens)
         prefix before the placeholder sign.
      
      Example 1:
      s = box.schema.space.create('clients')
      s:format({{name='name', type='string'}, {name='phone', type='array'}})
      name_idx = s:create_index('name_idx', {parts = {{'name', 'string'}}})
      phone_idx = s:create_index('phone_idx', {parts = {{'phone[*]', 'string'}}})
      s:insert({"Jorge", {"911", "89457609234"}})
      s:insert({"Bob", {"81239876543"}})
      
      phone_idx:get("911")
      ---
      - ['Jorge', ['911', '89457609234']]
      ...
      
      Example 2:
      s = box.schema.space.create('withdata')
      pk = s:create_index('pk')
      parts = {
          {2, 'str', path = 'data[*].name'},
          {2, 'str', path = 'data[*].extra.phone'}
      }
      idx = s:create_index('idx', {parts = parts})
      s:insert({1, {data = {{name="A", extra={phone="111"}},
                            {name="B", extra={phone="111"}}},
                    garbage = 1}})
      idx:get({'A', '111'})
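
      Example 3 (a sketch of the unique semantics described above;
      this schema is illustrative, not from the patch):
      c = box.schema.space.create('clients_unique')
      c:format({{name='name', type='string'}, {name='phone', type='array'}})
      c:create_index('pk', {parts = {{'name', 'string'}}})
      c:create_index('uniq', {unique = true, parts = {{'phone[*]', 'string'}}})
      -- OK: the same key may occur several times within one tuple.
      c:insert({"Jorge", {"911", "911"}})
      -- Error: the key "911" already occurs in another stored tuple.
      c:insert({"Bob", {"911"}})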
    • Use mempool to alloc wal messages · 771c00d4
      Georgy Kirichenko authored
      Don't use the fiber gc region to alloc wal messages. This
      reduces the friction between the fiber life cycle and
      transaction processing.
      
      Prerequisites: #1254
      (cherry picked from commit bedc2e06521c2f7a4a6d04510c8f72fa57a44f96)
    • Alloc journal entry on a txn memory region · 6f6af986
      Georgy Kirichenko authored
      Use txn memory to allocate a journal entry structure.
      This relaxes the dependency between a journal entry and a fiber.
      
      Prerequisites: #1254
      (cherry picked from commit 92e68deb21ab17aacf43d8ca409f587b9da86c07)
    • Introduce a txn memory region · 06c73f1d
      Georgy Kirichenko authored
      Attach a separate memory region to each txn structure in order
      to store all txn internal data until the transaction is
      finished. This patch is a preparation for detaching a txn from a
      fiber and the fiber gc storage.
      
      Prerequisites: #1254
      (cherry picked from commit c1486242445ebf82b8644c21ac7434d89ddeb3b1)
  2. May 06, 2019
    • lua: add key_def lua module · 22db9c26
      Kirill Shcherbatov authored
      Needed for #3276.
      Fixes #3398.
      Fixes #4025.
      
      @TarantoolBot document
      Title: lua: key_def module
      
      It is convenient to have a tuple compare function in Lua land
      for the following cases:
       - exporting a key from a tuple to iterate over a secondary
         non-unique index and delete tuples from a space
       - comparing a tuple with a key / another tuple, and merging
         key_defs from Lua
       - factoring the key parts parsing code out of the tuples merger
      
      A key_def instance has the following methods:
       - :extract_key(tuple)           -> key (as tuple)
          Receives a tuple or a Lua table. Returns a tuple
          representing the extracted key.
       - :compare(tuple_a, tuple_b)    -> number
          Receives tuples or Lua tables.
          Returns:
          - a value > 0 when tuple_a > tuple_b,
          - a value == 0 when tuple_a == tuple_b,
          - a value < 0 otherwise.
       - :compare_with_key(tuple, key) -> number
          Returns:
          - a value > 0 when key(tuple) > key,
          - a value == 0 when key(tuple) == key,
          - a value < 0 otherwise.
       - :merge(another_key_def)       -> new key_def instance
          Constructs a new key definition with a set union of the key
          parts from the first and second key defs.
       - :totable()                    -> table
          Dumps the key_def object as a Lua table (needed to support
          the __serialize method).
      
      The root key_def library exports all instance methods directly.
      
      The format of the `parts` parameter in the `key_def.new(parts)`
      call is compatible with the following structures:
      * box.space[...].index[...].parts;
      * net_box_conn.space[...].index[...].parts.
      
      Example for extract_key():
      
      ```lua
      -- Remove values fetched from a secondary non-unique index.
      local key_def_lib = require('key_def')
      local s = box.schema.space.create('test')
      local pk = s:create_index('pk')
      local sk = s:create_index('test', {unique = false, parts = {
          {2, 'number', path = 'a'}, {2, 'number', path = 'b'}}})
      s:insert{1, {a = 1, b = 1}}
      s:insert{2, {a = 1, b = 2}}
      local key_def = key_def_lib.new(pk.parts)
      for _, tuple in sk:pairs({1}) do
          local key = key_def:extract_key(tuple)
          pk:delete(key)
      end
      ```
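
      A short sketch of compare() and merge() (the key_defs and values
      here are hypothetical, not from the patch):

      ```lua
      local key_def_lib = require('key_def')
      -- Compare two tuples by the first (unsigned) field.
      local kd = key_def_lib.new({{fieldno = 1, type = 'unsigned'}})
      print(kd:compare({1, 'a'}, {2, 'b'})) -- < 0, since 1 < 2
      -- Merge: compare by field 1 first, then by the second (string) field.
      local kd2 = key_def_lib.new({{fieldno = 2, type = 'string'}})
      local merged = kd:merge(kd2)
      print(merged:compare({1, 'a'}, {1, 'b'})) -- < 0, since 'a' < 'b'
      print(require('json').encode(merged:totable()))
      ```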
    • test: errinj* tests need fixes for stress runs · 20681ebd
      avtikhon authored
      Tests failed with the following issues:
      
      [010] --- vinyl/errinj.result Tue Mar 19 17:52:48 2019
      [010] +++ vinyl/errinj.reject Tue Mar 19 19:07:58 2019
      [010] @@ -81,7 +81,7 @@
      [010] -- fails due to scheduler timeout
      [010] box.snapshot();
      [010] ---
      [010] -- error: Error injection 'vinyl dump'
      [010] +- ok
      [010] ...
      [010] fiber.sleep(0.06);
      [010] ---
      [010]
      
      Decided to remove the check that box.snapshot() fails with a
      scheduler timeout.
      
      [035] --- vinyl/errinj_stat.result	Mon May  6 13:11:27 2019
      [035] +++ vinyl/errinj_stat.reject	Mon May  6 17:58:48 2019
      [035] @@ -250,7 +250,7 @@
      [035]  ...
      [035]  box.snapshot()
      [035]  ---
      [035] -- ok
      [035] +- error: Error injection 'vinyl dump'
      [035]  ...
      [035]  i:compact()
      [035]  ---
      [035]
      
      Decided to wait on a condition that checks the scheduler's
      tasks_completed counter instead of using a fiber delay, as
      sketched below.
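
      A sketch of the waiting pattern (the statistic path
      box.stat.vinyl().scheduler.tasks_completed is assumed here, not
      copied from the test):

       | tasks_before = box.stat.vinyl().scheduler.tasks_completed
       | -- ... trigger a dump ...
       | test_run:wait_cond(function()
       |     return box.stat.vinyl().scheduler.tasks_completed > tasks_before
       | end)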
      
      Close #4058
      
      (cherry picked from commit d64e4c95f1b72d1f32f9689a33319545506e1c58)
    • salad: introduce bps_tree_delete_identical routine · 63c2ade6
      Kirill Shcherbatov authored
      A new routine bps_tree_delete_identical performs an element
      deletion if and only if the found element is identical
      to the routine argument.
      
      Needed for #1257
    • box: introduce field_map_builder class · 4d567b12
      Kirill Shcherbatov authored
      The new field_map_builder class encapsulates the logic associated
      with field_map allocation and initialization. In the future it
      will be extended to allocate field maps that have extensions.
      
      Needed for #1257
    • box: introduce tuple_format_iterator class · 203f1a07
      Kirill Shcherbatov authored
      The similar code in tuple_field_map_create and
      vy_stmt_new_surrogate_delete_raw that decodes a tuple with a
      tuple_format has been refactored into a reusable
      tuple_format_iterator class.
      
      Being thus encapsulated, this code will be uniformly managed and
      extended in further patches in the scope of multikey indexes.
      
      Extended the engine/json test with a
      vy_stmt_new_surrogate_delete_raw corner case. There was no
      problem before this patch, but a small bug that appeared during
      the tuple_format_iterator_next implementation was not covered.
      
      Needed for #1257
    • test: app/fio -- Fixup modes for open · 69a6365e
      Cyrill Gorcunov authored
      There were typos -- we should use octal base, otherwise the
      numbers are treated as decimal.
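
      For reference, a minimal sketch of the corrected pattern (the
      file name is illustrative; Lua has no octal literals, so a mode
      such as 0644 has to be spelled explicitly):

       | fio = require('fio')
       | -- tonumber('644', 8) == 420 == 0644; a bare 644 would be decimal.
       | fh = fio.open('/tmp/a.txt', {'O_CREAT', 'O_RDWR'}, tonumber('644', 8))
       | fh:close()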
    • test: enable parallel testing for vinyl suite · 09b8a604
      avtikhon authored
      Need to switch on the is_parallel flag for the vinyl suite
      to be able to run subtests in parallel using:
      ./test_run.py -j'<threads>'
      
      The vinyl/throttle.test.lua test is temporarily disabled,
      because it tests performance, which fails in parallel runs on
      highly loaded hardware.
      
      Fix #4158
    • core/coio_file: copyfile -- Make it behave as regular cp · 7b378bc6
      Cyrill Gorcunov authored
      The traditional cp utility opens the destination with the
      O_TRUNC flag, i.e. it drops the old content of the target file
      if it exists.
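
      At the Lua level the change is observable through fio.copyfile()
      (a sketch; the paths are illustrative):

       | fio = require('fio')
       | -- If /tmp/dst exists and is longer than /tmp/src, it is now
       | -- truncated first, like with cp(1), instead of keeping a
       | -- tail of its old content.
       | fio.copyfile('/tmp/src', '/tmp/dst')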
      
      Fixes #4181
    • travis-ci: set jobs not to stop on failed tests · 5f87a3a3
      Alexander V. Tikhonov authored
      Added the --force flag to travis-ci jobs so that they don't stop
      on failed tests. Any failed test breaks the testing and thus
      masks the other failures, which is bad in the following ways:
      - a flaky test masks a real problem
      - release testing needs an overall result to fix things fast
      - parallel testing may produce flaky tests
      
      Close: #4131
  3. May 02, 2019
    • test: fix replication/gc flaky failures · 35b5095a
      avtikhon authored
      Two problems are fixed here. The first one is about correctness of the
      test case. The second is about flaky failures.
      
      About correctness. The test case contains the following lines:
      
       | test_run:cmd("switch replica")
       | -- Unblock the replica and break replication.
       | box.error.injection.set("ERRINJ_WAL_DELAY", false)
       | box.cfg{replication = {}}
      
      Usually rows are applied and the new vclock is sent to the
      master before replication is disabled. So the master removes the
      old xlog before the replica restarts, and the next case tests
      nothing.
      
      This commit uses the new test-run ability to stop a tarantool
      instance with a custom signal and stops the replica with SIGKILL
      w/o dropping ERRINJ_WAL_DELAY, as sketched below. This change
      fixes the race between applying rows and disabling replication
      and so makes the test case correct.
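
      Presumably the stop sequence now looks like this (a sketch based
      on the description above; the exact test code may differ):

       | test_run:cmd('stop server replica with signal=KILL')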
      
      About flaky failures. They looked like this:
      
       | [029] --- replication/gc.result Mon Apr 15 14:58:09 2019
       | [029] +++ replication/gc.reject Tue Apr 16 09:17:47 2019
       | [029] @@ -290,7 +290,12 @@
       | [029] ...
       | [029] wait_xlog(1) or fio.listdir('./master')
       | [029] ---
       | [029] -- true
       | [029] +- - 00000000000000000305.vylog
       | [029] + - 00000000000000000305.xlog
       | [029] + - '512'
       | [029] + - 00000000000000000310.xlog
       | [029] + - 00000000000000000310.vylog
       | [029] + - 00000000000000000310.snap
       | [029] ...
       | [029] -- Stop the replica.
       | [029] test_run:cmd("stop server replica")
       | <...next cases could have induced mismatches too...>
      
      The reason for the failure is that the replica applied all rows
      from the old xlog, but didn't send an ACK with the new vclock to
      the master, because replication was disabled before that. The
      master stops the relay and keeps the old xlog. When the replica
      starts again, it subscribes with a vclock value that instructs
      the relay to open the new xlog.
      
      Tarantool can remove an old xlog just after a replica's ACK when
      it observes that the xlog was fully read by all replicas. But
      tarantool does not remove xlogs when a replica subscribes. This
      is not a big problem, because such a 'stuck' xlog file will be
      removed with the next xlog removal.
      
      There was an attempt to fix this behaviour and remove old xlogs
      at subscribe; see the following commits:
      
      * b5b4809c ('replication: update replica
        gc state on subscribe');
      * 766cd3e1 ('Revert "replication: update
        replica gc state on subscribe"').
      
      Anyway, this commit fixes these flaky failures, because it stops
      the replica before the rows from the old xlog are applied. So
      when the replica starts, it continues reading from the old xlog,
      and the xlog file will be removed once it is fully read.
      
      Closes #4162
    • swim: explicitly stop old ev_io input/output on rebind · de2906a8
      Vladislav Shpilevoy authored
      When an input/output file descriptor was changed, SWIM did this:
      
          swim_ev_io_set(io, new_fd)
          swim_ev_io_start(io)
      
      It worked on the assumption that libev allows changing the fd on
      the fly, and the tests were passing because a fake event loop
      was used, not real libev.
      
      But it didn't work. Libev requires an explicit ev_io_stop() call
      on the old descriptor. This patch changes the sequence to:
      
          swim_ev_io_stop(io)
          //do bind ...
      
          swim_ev_io_set(io, new_fd)
          swim_ev_io_start(io)
      
      Part of #3234
    • swim: allow to omit host in URI · 5320d8f0
      Vladislav Shpilevoy authored
      Before the patch swim.cfg() returned an error on an attempt to
      use a URI like 'port', without a host. But it would be useful,
      easy, and short to allow that, and to use the '127.0.0.1' host
      by default. Compare:
      
          swim:cfg({uri = 1234})
                  vs
          swim:cfg({uri = '127.0.0.1:1234'})
      
      It is remarkable that box.cfg{listen} also allows omitting the
      host.
      
      Note that using '0.0.0.0' (INADDR_ANY) is forbidden. Otherwise:
      
          1) Different instances interacting with this one via
             different interfaces would see different source IP
             addresses. It would mess up member tables;
      
          2) This instance would not be able to encode its IP address
             in the meta section, because it has no fixed IP. At the
             same time, omitting it and using the UDP header source
             address is not possible either, because the UDP header is
             not encrypted and therefore is not safe to look at.
      
      Part of #3234
    • swim: introduce member reference API · 42bdc367
      Vladislav Shpilevoy authored
      A struct swim_member pointer is used to learn a member's status,
      payload, incarnation, etc. To obtain a pointer, one should
      either look up the member by UUID, or obtain it from the
      iterators API. The former is long, the latter is useless when a
      point lookup is needed.
      
      On the other hand, it was not safe to keep a struct swim_member
      pointer for a long time, because it could be deleted at any
      moment.
      
      This patch allows referencing a member and being sure that it
      will not be deleted until dereferenced explicitly. The member
      still can be dropped from the member table, but its memory will
      stay valid. To detect that a member is dropped, a user can use
      the swim_member_is_dropped() function.
      
      Part of #3234
    • swim: do not use ev_timer_start · fd9c6c9d
      Vladislav Shpilevoy authored
      It appeared that libev changes the 'ev_timer.at' field to the
      remaining time value, so it can't be used as a storage for a
      timeout. For the same reason ev_timer_start() can't be used to
      reuse a timer.
      
      On the contrary, 'ev_timer.repeat' is not touched by libev, and
      ev_timer_again() allows reusing a timer.
      
      This patch replaces 'at' with 'repeat' and ev_timer_start() with
      ev_timer_again().
      
      The bug was not detected by unit tests, because they implement
      their own event loop and did not change ev_timer.at. Now they
      do, to prevent a regression.
      
      Part of #3234
    • test: update test-run · 71f7ecf1
      Alexander Turenko authored
      Added a signal option to the 'stop server' command.
      
      How to use:
      
       | test_run:cmd('stop server foo with signal=KILL')
      
      The 'stop server foo' command without the option sends SIGTERM as
      before.
      
      This feature is intended to be used in a fix of #4162 ('test:
      gc.test.lua test fails on *.xlog files cleanup').
  4. Apr 30, 2019
  5. Apr 29, 2019
    • test: drop invalid assert from swim test transport · 906fb352
      Vladislav Shpilevoy authored
      The assertion was checking that the next event object is not the
      same as the previous one, but
      
          1) the previous one was already deleted by this moment;
          2) the comparison was done by pointer.
      
      The first problem alone would be enough to drop the assertion.
      The second is more curious: it looks like after the old event
      was deleted, the next event was allocated at the same memory
      address. This is why the pointers are equal and the assertion
      fails.
      
      For example, swim_timer_event_process() deletes the event object
      and calls ev_invoke(), which can generate a new event in the
      just-freed memory.
    • test: vinyl/errinj test fails under highload · 8c3bd9d8
      avtikhon authored
      Test "check that all dump/compaction tasks that are in progress at
      the time when the server stops are aborted immediately.", but in
      real the awaiting time of 1 second is not enough due to runs in
      parallel and it fails, like:
      
      [009] --- vinyl/errinj.result Tue Apr 16 16:43:36 2019
      [009] +++ vinyl/errinj.reject Wed Apr 17 09:42:36 2019
      [009] @@ -530,7 +530,7 @@
      [009] ...
      [009] t2 - t1 < 1
      [009] ---
      [009] -- true
      [009] +- false
      [009] ...
      [009] test_run:cmd("cleanup server test")
      [009] ---
      [009]
      
      In 100 parallel runs the following failed delays were found:
      
      [002] +- 1.4104716777802
      [022] +- 1.3933029174805
      [044] +- 1.4296517372131
      [033] +- 1.6380662918091
      [001] +- 1.9799520969391
      [027] +- 1.7067711353302
      [043] +- 1.3778221607208
      [034] +- 1.3820221424103
      [032] +- 1.3820221424103
      [020] +- 1.6275615692139
      [050] +- 1.6275615692139
      [048] +- 1.1880359649658
      
      Decided to avoid using the time check at all and to change the
      ERRINJ_VY_RUN_WRITE_STMT_TIMEOUT injection to
      ERRINJ_VY_DUMP_DELAY. In this way the time checks were
      completely removed.
      
      The next issue met was the error:
      
      vy_quota.c:298 !> SystemError Failed to allocate 2097240 bytes
          in lsregion for vinyl transaction: Cannot allocate memory
      
      That is why the 2 merged subtests were divided into 2 standalone
      subtests to be able to set the memory limit of the 2nd subtest
      to 2097240.
      
      Close #4169
    • vinyl: be pessimistic about write rate when setting dump watermark · b9b8e8af
      Vladimir Davydov authored
      We set the dump watermark using the following formula
      
          limit - watermark     watermark
          ---------------- = --------------
             write_rate      dump_bandwidth
      
      Solving the proportion for the watermark gives: watermark =
      limit * dump_bandwidth / (write_rate + dump_bandwidth); e.g. if
      the write rate is half the dump bandwidth, the watermark is set
      to two thirds of the limit. This ensures that by the time we run
      out of memory quota, the memory dump will have been completed
      and we'll be able to proceed. Here write_rate is the expected
      rate at which the workload will write to the database while the
      dump is in progress. Once the dump is started, we throttle the
      workload in case it exceeds this rate.
      
      Currently, we estimate the write rate as a moving average observed
      for the last 5 seconds. This performs poorly unless the workload
      write rate is perfectly stable: if the 5 second average turns out to
      be even slightly less than the max rate, the workload may experience
      long stalls during memory dump.
      
      To avoid that let's use the max write rate multiplied by 1.5 instead
      of the average when setting the watermark. This means that we will
      start dump earlier than we probably could, but at the same time this
      will tolerate write rate fluctuations thus minimizing the probability
      of stalls.
      
      Closes #4166
    • httpc: fix zero timeout handling · 47bd51b5
      Alexander Turenko authored
      When libcurl is built with --enable-threaded-resolver (which is
      the default) and the version of the library is 7.60 or above,
      libcurl calls a timer callback with an exponentially increasing
      timeout_ms value during DNS resolution.
      
      This behaviour was introduced in curl-7_59_0-36-g67636222f (see
      [1], [2]). During the first ten milliseconds the library sets
      the timer to the passed time divided by three (see
      Curl_resolver_getsock()). It is possible that the passed time is
      zero during at least several thousand iterations.
      
      Before this commit we didn't set a libev timer in
      curl_multi_timer_cb() when the timeout_ms value was zero, but
      called curl_multi_process() immediately. Libcurl, however, can
      call curl_multi_timer_cb() again, and here we went into a
      recursion that stops only when timeout_ms becomes positive.
      Often we generated several thousand stack frames within this
      recursion and exceeded the 512KiB fiber stack size.
      
      The fix is easy: set a libev timer to call curl_multi_process()
      even when the timeout_ms value is zero.
      
      The reason why we did the call to curl_multi_process()
      immediately is the unclear wording in the CURLMOPT_TIMERFUNCTION
      option documentation. This documentation page was fixed in
      curl-7_64_0-88-g47e540df8 (see [3], [4], [5]).
      
      There is also a related change in curl-7_60_0-121-g3ef67c686
      (see [6], [7]): after this commit libcurl calls a timer callback
      with zero timeout_ms during the first three milliseconds of
      asynchronous DNS resolving.
      
      Fixes #4179.
      
      [1]: https://github.com/curl/curl/pull/2419
      [2]: https://github.com/curl/curl/commit/67636222f42b7db146b963deb577a981b4fcdfa2
      [3]: https://github.com/curl/curl/issues/3537
      [4]: https://github.com/curl/curl/pull/3601
      [5]: https://github.com/curl/curl/commit/47e540df8f32c8f7298ab1bc96b0087b5738c257
      [6]: https://github.com/curl/curl/pull/2685
      [7]: https://github.com/curl/curl/commit/3ef67c6861c9d6236a4339d3446a444767598a58
    • travis-ci: set right flags in release testing jobs · c308f35d
      Alexander Turenko authored
      It is important to have testing jobs that build the project with
      both -Werror and -O2 to keep the code clean. -O2 is needed,
      because some compiler warnings are available only after extra
      analysis passes that are disabled at lower optimization levels.
      
      The first attempt to add -Werror for release testing jobs was
      made in da505ee7 ('Add -Werror for CI (1.10 part)'), but it
      mistakenly didn't enable -O2 for the RelWithDebInfoWError build.
      It is possible to fix it in this way:
      
       | --- a/cmake/compiler.cmake
       | +++ b/cmake/compiler.cmake
       | @@ -113,10 +113,14 @@ set (CMAKE_C_FLAGS_DEBUG
       |      "${CMAKE_C_FLAGS_DEBUG} ${CC_DEBUG_OPT} -O0")
       |  set (CMAKE_C_FLAGS_RELWITHDEBINFO
       |      "${CMAKE_C_FLAGS_RELWITHDEBINFO} ${CC_DEBUG_OPT} -O2")
       | +set (CMAKE_C_FLAGS_RELWITHDEBINFOWERROR
       | +    "${CMAKE_C_FLAGS_RELWITHDEBINFOWERROR} ${CC_DEBUG_OPT} -O2")
       |  set (CMAKE_CXX_FLAGS_DEBUG
       |      "${CMAKE_CXX_FLAGS_DEBUG} ${CC_DEBUG_OPT} -O0")
       |  set (CMAKE_CXX_FLAGS_RELWITHDEBINFO
       |      "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} ${CC_DEBUG_OPT} -O2")
       | +set (CMAKE_CXX_FLAGS_RELWITHDEBINFOWERROR
       | +    "${CMAKE_CXX_FLAGS_RELWITHDEBINFOWERROR} ${CC_DEBUG_OPT} -O2")
       |
       |  unset(CC_DEBUG_OPT)
      
      However I think that a build type (and so `tarantool --version`)
      should not show whether -Werror was passed or not. So I have
      added an ENABLE_WERROR CMake option for that. It can be set like
      so:
      
       | cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_WERROR=ON
      
      Enabled the option in the testing Travis-CI jobs with the
      RelWithDebInfo build type. Deploy jobs don't include it, as
      before.
      
      Fixed all -Wmaybe-uninitialized and -Wunused-result warnings. A few
      notes about the fixes:
      
      * net.box does not validate received data in general, so I don't
        add a check for autoincrement IDs either. The ID is set to
        INT64_MIN, because this value is the least likely to appear
        here in a normal case and so is the best one to signal a user
        that something is probably going wrong.
      * xrow_decode_*() functions could read uninitialized data from
        row->body[0].iov_base in xrow_on_decode_err() when printing a
        hex code for a row. This could happen when the received
        msgpack was empty (row->bodycnt == 0), but there were expected
        keys (key_map != 0).
      * getcwd() is marked with __attribute__((__warn_unused_result__))
        in glibc, but the buffer filled by this call was not used
        anywhere, so the call was just removed.
      * The vinyl -Wmaybe-uninitialized warnings are false positives.
      
      Added comments and quotes into .travis.yml to ease reading.
      Removed the word "test" from the CentOS 6 job name, because we
      don't run tests on this distro (they are disabled in the RPM
      spec).
      
      Fixes #4178.
    • core/coio_file: Use eio_sendfile_sync instead of a chunk mode · 04bf646f
      Cyrill Gorcunov authored
      The eio library provides a portable version of the sendfile
      syscall, which works way more efficiently than explicitly
      copying the file in 4K chunks.
  6. Apr 26, 2019
    • swim: optimize struct swim memory layout · 8b350d42
      Vladislav Shpilevoy authored
      The same problem that occurred with struct swim_member has
      happened with struct swim: it contains a huge structure right in
      the middle, struct swim_task. It consumes 1.5KB and obviously
      splits the most frequently accessed struct swim attributes
      across multiple cache lines.
      
      This patch moves struct swim_task to the bottom, as well as the
      other members related to the dissemination component.
    • sql: fix boolean variables binding · 15c990cb
      Kirill Shcherbatov authored
      The bindings mechanism was not updated in the scope of the
      BOOLEAN static type patch. Fixed.
  7. Apr 25, 2019
    • swim: optimize memory layout of struct swim_member · e1c94e90
      Vladislav Shpilevoy authored
      Struct swim_member describes the attributes of one remote member
      of a SWIM cluster. It is accessed relatively often. And it has
      two huge structures - struct swim_packet ping/ack_task. Each is
      1500 bytes. When these tasks are in the middle of the structure,
      they split and spoil cache lines.
      
      This patch moves the whole failure detection attribute family to
      the bottom of the structure, and moves these two tasks to the
      end of the layout.
    • swim: introduce suspicion · 8f903360
      Vladislav Shpilevoy authored
      The suspicion component is how SWIM protects against
      false-positive failure detection. When the network is slow, or a
      SWIM node does not manage to process messages in time because of
      being overloaded, other nodes will not receive ACKs in time, but
      it is too soon to declare the member dead.
      
      The nodes will mark the member as suspected, and will ping it
      indirectly, via other members. This 1) gives the suspected
      member more time to respond with ACKs, 2) protects against the
      case when it is a network problem on particular channels.
      
      Part of #3234
    • swim: introduce routing · 8c9b88d2
      Vladislav Shpilevoy authored
      Before the patch SWIM packets were sent quite straightforwardly
      from one instance to another, with transparent routing on the
      Internet layer of TCP/IP. But the SWIM paper describes one last,
      not yet implemented component: the suspicion mechanism.
      
      So as not to overload this message with suspicion details, it is
      enough to say that it makes it possible to send a packet through
      an intermediate SWIM instance, not directly.
      
      This commit extends the SWIM protocol with a new transport-level
      section named 'routing'. It allows sending indirect SWIM
      messages transparently via packet forwarding implemented fully
      inside the transportation component, in swim_io.c.
      
      Part of #3234
    • swim: store sender UUID in swim io tasks · 776397ba
      Vladislav Shpilevoy authored
      Struct swim_task is an asynchronous task generated by the SWIM
      core and scheduled to be sent when the next EV_WRITE event
      appears.
      
      It has a callback 'complete' called when the task has finally
      sent its packet into the network. In this callback the next SWIM
      round step can be scheduled, or a deadline for a ping can be
      set. Usually it is necessary to know to which member the packet
      was sent. For this the UUID is required, but swim_task operates
      on inet addresses only.
      
      At this moment the UUID necessity can be bypassed via
      container_of or via some queues, but that is no longer possible
      once the suspicion component is introduced.
      
      The patch adds the sender's UUID to struct swim_task.
      
      Part of #3234
    • swim: wrap sio_strfaddr() · caeadc15
      Vladislav Shpilevoy authored
      SIO provides a function sio_strfaddr() to obtain a string
      representation of an arbitrary struct sockaddr. A call of this
      function usually looks bulky, because it requires an explicit
      cast to const struct sockaddr * and expects the address size in
      the second parameter.
      
      SWIM uses only AF_INET addresses and always casts them to
      const struct sockaddr * and passes sizeof(struct sockaddr_in) to
      each invocation of sio_strfaddr(). This patch wraps
      sio_strfaddr() with a function making these preparations.
      
      Part of #3234
    • sio: make sio_strfaddr() using tt_static_buf · 84638518
      Vladislav Shpilevoy authored
      Sometimes it is desirable to format multiple addresses and call
      sio_strfaddr() multiple times, but it uses one static
      thread-local buffer, so subsequent calls overwrite the results
      of previous ones.
      
      On the contrary, tt_static_buf() is not a single buffer per
      thread - it is 4 buffers. Now sio_strfaddr() uses it, and it is
      possible to call it 4 times without rewriting old results.
      
      Also, this update makes the .bss section a bit smaller: -1
      static buffer of size 1025, and +16 bytes for tt_static buffers.
      Total: -1009 bytes.
    • swim: drop swim_uuid_str() function · 8770fd5c
      Vladislav Shpilevoy authored
      It appeared that the tt_uuid lib provides the same function:
      tt_uuid_str().
    • test: introduce swim packet filter by destination address · 7251efa5
      Vladislav Shpilevoy authored
      The filter is going to be used to test the SWIM suspicion
      component. The destination filter will break certain network
      channels, and the suspicion component shall withstand that.
      
      Part of #3234
    • test: remove swim packet filter destructors · adb63189
      Vladislav Shpilevoy authored
      Swim test packet filters are supposed to filter out packets
      matching certain criteria, or with a probability. They were
      implemented as a filter-function and userdata passed into the
      former on each invocation. Usually the userdata was allocated on
      the heap and needed deletion. But it appeared that it is much
      simpler to store the filters inside struct swim_node, pass it as
      userdata, and get rid of userdata destructors and dynamic
      allocations.
      
      The patch is motivated by the necessity to add one new filter,
      which will need struct swim_node as userdata anyway.