  1. Nov 07, 2018
  2. Nov 05, 2018
  3. Nov 03, 2018
  4. Nov 02, 2018
    • Mergen Imeev's avatar
      box: wrong is_nullable for multiple indexes · 52b84d2e
      Mergen Imeev authored
      If a field isn't defined by the space format, then in case of multiple
      indexes the field option is_nullable was the same as it was for the
      last index that defines it. This is wrong, as it should be 'true' only
      if it is 'true' for all indexes that define it.
      
      Closes #3744.
      52b84d2e
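The rule stated in 52b84d2e (nullable only if every defining index agrees) can be sketched as a logical AND over index parts. This is a minimal illustration, not the actual box code: `struct part_def` and `field_is_nullable` are hypothetical names, and returning false for a field that no index covers is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of one index part. */
struct part_def {
	unsigned fieldno;  /* field covered by this index part */
	bool is_nullable;  /* nullability declared by the index */
};

/*
 * A field is nullable only if every index part that covers it says so.
 * Returns false if no part covers the field (an assumption of this sketch).
 */
static bool
field_is_nullable(const struct part_def *parts, size_t count, unsigned fieldno)
{
	assert(parts != NULL || count == 0);
	bool seen = false;
	bool nullable = true;
	for (size_t i = 0; i < count; i++) {
		if (parts[i].fieldno != fieldno)
			continue;
		seen = true;
		/* One non-nullable definition is enough to clear the flag. */
		nullable = nullable && parts[i].is_nullable;
	}
	return seen && nullable;
}
```

With this shape the result no longer depends on which index happens to be processed last.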
  5. Nov 01, 2018
    • Vladimir Davydov's avatar
      httpc: fix compilation with libcurl >= 7.62.0 · 02da15f7
      Vladimir Davydov authored
      Starting from libcurl 7.62.0, CURL_SSL_CACERT is defined as a macro
      alias to CURLE_PEER_FAILED_VERIFICATION, see
      
        https://github.com/curl/curl/commit/3f3b26d6feb0667714902e836af608094235fca2
      
      This breaks compilation:
      
        httpc.c:337:7: error: duplicate case value 'CURLE_PEER_FAILED_VERIFICATION'
                case CURLE_PEER_FAILED_VERIFICATION:
                     ^
        httpc.c:336:7: note: previous case defined here
                case CURLE_SSL_CACERT:
                     ^
        curl.h:589:26: note: expanded from macro 'CURLE_SSL_CACERT'
        #define CURLE_SSL_CACERT CURLE_PEER_FAILED_VERIFICATION
                                 ^
      
      Fix this by using CURLE_SSL_CACERT only if libcurl version is less
      than 7.62.0.
      
      Note, we can't use CURL_AT_LEAST_VERSION to check libcurl version,
      because it isn't available in libcurl shipped with CentOS 6.
      02da15f7
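The guard described in 02da15f7 might look roughly like the sketch below. To keep it compilable without curl.h, it uses stand-ins: `DEMO_VERSION_NUM` plays the role of LIBCURL_VERSION_NUM, the `DEMO_*` codes play the role of the libcurl error codes, and their numeric values are arbitrary. Only the 0x073e00 threshold (7.62.0 in the 0xXXYYZZ encoding) comes from the version named in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for LIBCURL_VERSION_NUM (0xXXYYZZ); 0x073e00 is 7.62.0. */
#ifndef DEMO_VERSION_NUM
#define DEMO_VERSION_NUM 0x073e00
#endif

/* Stand-ins for the libcurl error codes; the values are arbitrary. */
enum demo_code {
	DEMO_OK = 0,
	DEMO_PEER_FAILED_VERIFICATION = 1,
#if DEMO_VERSION_NUM < 0x073e00
	DEMO_SSL_CACERT = 2,
#endif
};

#if DEMO_VERSION_NUM >= 0x073e00
/* Newer headers turn the old name into an alias, as in the error log. */
#define DEMO_SSL_CACERT DEMO_PEER_FAILED_VERIFICATION
#endif

static bool
demo_is_ssl_error(int code)
{
	assert(code >= 0);
	switch (code) {
#if DEMO_VERSION_NUM < 0x073e00
	case DEMO_SSL_CACERT: /* a distinct value only on old versions */
#endif
	case DEMO_PEER_FAILED_VERIFICATION:
		return true;
	default:
		return false;
	}
}
```

Because the old label is compiled only when it is a distinct value, the switch never contains two identical case values.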
    • Vladimir Davydov's avatar
      httpc: fix curl version check in httpc_set_keepalive · 74e20755
      Vladimir Davydov authored
      Obviously, the version check as it is now won't work once libcurl 8.0.0
      is released. Use LIBCURL_VERSION_NUM to correctly check libcurl version.
      
      Note, we can't use CURL_AT_LEAST_VERSION to check libcurl version,
      because it isn't available in libcurl shipped with CentOS 6.
      
      Fixes commit 7e62ac79 ("Add HTTP client based on libcurl").
      74e20755
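To illustrate why a check on the packed version number survives a major version bump, here is a small sketch. The helper names are hypothetical, the 7.25 threshold is only an example, and the "naive" variant is an assumed shape of a fragile check, not a quote of the old code. The 0xXXYYZZ packing is the encoding libcurl documents for LIBCURL_VERSION_NUM.

```c
#include <assert.h>
#include <stdbool.h>

/* Pack a version the way LIBCURL_VERSION_NUM does: 0xXXYYZZ. */
static unsigned
version_num(unsigned major, unsigned minor, unsigned patch)
{
	assert(major <= 0xff && minor <= 0xff && patch <= 0xff);
	return (major << 16) | (minor << 8) | patch;
}

/* Hypothetical fragile check: compares major and minor independently. */
static bool
naive_at_least_7_25(unsigned major, unsigned minor)
{
	return major >= 7 && minor >= 25;
}

/* Check on the packed number: a single ordered comparison. */
static bool
packed_at_least_7_25(unsigned major, unsigned minor, unsigned patch)
{
	return version_num(major, minor, patch) >= 0x071900;
}
```

For version 8.0.0 the naive variant reports "too old" because the minor number is 0, while the packed comparison gives the expected answer.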
    • Vladimir Davydov's avatar
      test: fix engine/ddl sporadic hang · 38845d6e
      Vladimir Davydov authored
      In Vinyl, DDL aborts all affected writers before modifying a space, so
      we must use pcall() to avoid a hang.
      
      Closes #3786
      38845d6e
  6. Oct 29, 2018
    • Vladimir Davydov's avatar
      replication: keep header when request is modified by before_replace · 480c55b6
      Vladimir Davydov authored
      When space.before_replace trigger modifies the result of a remote
      operation, we clear the request header so that it gets rebuilt on
      commit. This is incorrect, because as a result we don't bump the
      master's component of the replica's vclock, which leads to the request
      being applied again when the replica reconnects. The issue manifests
      itself in sporadic replication/before_replace test failures.
      
      Fix it by updating the request header rather than clearing it so that
      replica id and lsn get preserved.
      
      Closes #3722
      480c55b6
    • Serge Petrenko's avatar
      hot_standby: reflect amount of recovered rows in box.info · 85299d97
      Serge Petrenko authored
      To be able to switch to a hot_standby instance with minimal downtime, we
      need to know how far behind the primary instance it is, i.e. up to what
      vclock we have recovered. Previously this was impossible because
      box.info.vclock always referenced replicaset.vclock, which isn't updated
      during hot_standby.
      
      Introduce a pointer to relevant vclock: either recovery vclock (during
      local recovery) or replicaset.vclock (at all other times) and use it in
      box.info.vclock, box.info.lsn and box.info.signature.
      
      @locker: renamed last_row_vclock to box_vclock and constified it.
      
      Closes #3002
      85299d97
    • Alexander Turenko's avatar
      test: fix unix socket conflict in socket.test.lua · f950eb31
      Alexander Turenko authored
      Increased socket.readable / socket.wait timeouts.
      
      Rewrote port selection: repeat bind+listen until it succeeds, and
      exclude the invalid port 65536 from the range.
      
      All these changes are needed to run the test in parallel on several
      test-run workers to investigate flaky failures of the test / of the
      suite. Some of these changes can also eliminate possible flaky failures.
      f950eb31
  7. Oct 26, 2018
    • Georgy Kirichenko's avatar
      lua: fix tuple cdata collecting · 022a3c50
      Georgy Kirichenko authored
      In some cases luajit does not collect cdata objects which were
      transformed with ffi.cast as tuple_bless does. In consequence, internal
      table with gc callback overflows and then lua crashes. There might be an
      internal luajit issue because it fires only for jitted code. But assigning
      a gc callback before transformation fixes the problem.
      
      Closes #3751
      022a3c50
    • Vladimir Davydov's avatar
      vinyl: do not account bloom filters to runtime quota · e4338cc5
      Vladimir Davydov authored
      Back when bloom filters were introduced, neither box.info.memory() nor
      box.stat.vinyl().memory existed, so bloom filters were accounted to
      box.runtime.info().used for lack of a better place. Now there's no
      point in accounting them there. In fact, it's confusing, because bloom
      filters are allocated with malloc(), not from the runtime arena, so
      let's drop it.
      e4338cc5
    • Vladimir Davydov's avatar
      vinyl: fix memory leak in slice stream · 0066457c
      Vladimir Davydov authored
      If a tuple read from a run by a slice stream happens to be out of the
      slice bounds, it will never be freed. Fix it.
      
      The leak was introduced by commit c174c985 ("vinyl: implement new
      simple write iterator").
      0066457c
  8. Oct 25, 2018
  9. Oct 23, 2018
    • Alexander Turenko's avatar
      xlog: fix sync_is_async xlog option · 55dcde00
      Alexander Turenko authored
      The behaviour change was introduced in cda3cb55: the sync_is_async
      option was no longer updated from xdir; sync_interval was dropped as
      well, but was restored in 1900c58b.
      
      This commit fixes a performance regression of around 6-14% in average
      RPS on the default nosqlbench workload with a 30-second duration.
      Additional information about the benchmarking can be found in #3747.
      
      Thanks to Vladimir Davydov (@locker) for the investigation of the
      cda3cb55 changes.
      
      Closes #3747
      
      (cherry picked from commit cd9cc4c5)
      55dcde00
  10. Oct 13, 2018
    • Vladimir Davydov's avatar
      replication: fix rebootstrap crash in case master has replica's rows · d4ce7447
      Vladimir Davydov authored
      During SUBSCRIBE the master sends only those rows originating from the
      subscribed replica that aren't present on the replica. Such rows may
      appear after a sudden power loss in case the replica doesn't issue
      fdatasync() after each WAL write, which is the default behavior. This
      means that a replica can write some rows to WAL, relay them to another
      replica, then stop without syncing WAL file. If this happens we expect
      the replica to read its own rows from other members of the cluster upon
      restart. For more details see commit eae84efb ("replication: recover
      missing local data from replica").
      
      Obviously, this feature only makes sense for SUBSCRIBE. During JOIN
      we must relay all rows. This is how it initially worked, but commit
      adc28591 ("replication: do not delete relay on applier disconnect"),
      witlessly removed the corresponding check from relay_send_row() so that
      now we don't send any rows originating from the joined replica:
      
        @@ -595,8 +630,7 @@ relay_send_row(struct xstream *stream, struct xrow_header *packet)
                 * it). In the latter case packet's LSN is less than or equal to
                 * local master's LSN at the moment it received 'SUBSCRIBE' request.
                 */
        -       if (relay->replica == NULL ||
        -           packet->replica_id != relay->replica->id ||
        +       if (packet->replica_id != relay->replica->id ||
                    packet->lsn <= vclock_get(&relay->local_vclock_at_subscribe,
                                              packet->replica_id)) {
                        relay_send(relay, packet);
      
      (relay->local_vclock_at_subscribe is initialized to 0 on JOIN)
      
      This only affects the case of rebootstrap, automatic or manual, because
      when a new replica joins a cluster there can't be any rows on the master
      originating from it. On manual rebootstrap, i.e. when the replica files
      are deleted by the user and the replica is restarted from an empty
      directory with the same UUID (set via box.cfg.instance_uuid), this isn't
      critical - the replica will still receive those rows it should have
      received during JOIN once it subscribes. However, in case of automatic
      rebootstrap this can result in broken order of xlog/snap files, because
      the replica directory still contains old xlog/snap files created before
      rebootstrap. The rebootstrap logic expects them to have strictly smaller
      vclocks than new files, but if JOIN stops prematurely, this condition
      may not hold, leading to a crash when the vclock of a new xlog/snap is
      inserted into the corresponding xdir.
      
      This patch fixes this issue by restoring pre eae84efb behavior: now
      we create a new relay for FINAL JOIN instead of reusing the one attached
      to the joined replica so that relay_send_row() can detect JOIN phase and
      relay all rows in this case. It also adds a comment so that we don't
      make such a mistake in future.
      
      Apart from fixing the issue, this patch also fixes a relay leak in
      relay_initial_join() in case engine_join_xc() fails, which was also
      introduced by the above mentioned commit.
      
      A note about xlog/panic_on_broken_lsn test. Now the relay status isn't
      reported by box.info.replication if FINAL JOIN failed and the replica
      never subscribed (this is how it worked before commit eae84efb) so
      we need to tweak the test a bit to handle this.
      
      Closes #3740
      d4ce7447
  11. Oct 12, 2018
    • Kirill Yukhin's avatar
      Add compile_commands.json to git ignore · e0017ad6
      Kirill Yukhin authored
      e0017ad6
    • Vladimir Davydov's avatar
      vinyl: implement basic transaction throttling · c0d8063b
      Vladimir Davydov authored
      If the rate at which transactions are ready to write to the database is
      greater than the dump bandwidth, memory will get depleted before the
      previously scheduled dump is complete and all newer transactions will
      have to wait, which may take seconds or even minutes:
      
        W> waited for 555 bytes of vinyl memory quota for too long: 15.750 sec
      
      This patch set implements basic transaction throttling that is supposed
      to help avoid unpredictably long stalls. Now the transaction write rate
      is always capped by the observed dump bandwidth, because it doesn't make
      sense to consume memory at a greater rate than it can be freed. On top
      of that, when a dump begins, we estimate the amount of time it is going
      to take and limit the transaction write rate accordingly.
      
      Note, this patch doesn't take into account compaction when setting the
      rate limit so compaction threads may still fail to keep up with dumps,
      increasing the read amplification. It will be addressed later.
      
      Closes #1862
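One way to picture the two limits from c0d8063b is the sketch below. The formula is an assumption made for illustration and is not taken from the patch: the rate is capped by the dump bandwidth, and while a dump is running it is additionally capped so that the remaining memory is not exhausted before the estimated dump time elapses. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns a write rate limit in bytes per second.
 * mem_limit and mem_used are in bytes, dump_bandwidth in bytes per second.
 */
static double
write_rate_limit(size_t mem_limit, size_t mem_used, double dump_bandwidth,
		 bool dump_in_progress)
{
	assert(dump_bandwidth > 0 && mem_used <= mem_limit);
	/* Never consume memory faster than it can be freed. */
	double limit = dump_bandwidth;
	if (dump_in_progress && mem_used > 0) {
		/* Estimated time needed to dump what is in memory now. */
		double dump_time = (double)mem_used / dump_bandwidth;
		/* Rate at which the free memory lasts until the dump ends. */
		double rate = (double)(mem_limit - mem_used) / dump_time;
		if (rate < limit)
			limit = rate;
	}
	return limit;
}
```

For example, with a limit of 1000 bytes, 800 bytes in use and a bandwidth of 100 bytes per second, the dump is estimated at 8 seconds, so the 200 free bytes allow a write rate of 25 bytes per second.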
    • Vladimir Davydov's avatar
      vinyl: fix memory dump trigger · 45d61b66
      Vladimir Davydov authored
      vy_quota_signal() doesn't wake up a consumer if it won't be able to
      proceed because of the memory limit. This is OK, but it doesn't attempt
      to trigger memory dump in this case either. As a result, it may occur
      that dump isn't triggered and all waiting consumers are aborted by
      timeout.  E.g. this happens if memory dump releases no memory, which is
      possible because memory is allocated and freed in 16 MB chunks. This
      results in occasional vinyl/quota_timeout test failures.
      
      Fix this by moving the dump trigger right in vy_quota_may_use() so that
      it's called whenever we consider a consumer for wakeup.
      45d61b66
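The fix in 45d61b66 moves the dump trigger into the admission check. A minimal sketch of that shape follows, with hypothetical names and a counter callback that exists only so the behaviour can be observed. The real vy_quota_may_use() may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct quota_sketch {
	size_t limit; /* hard memory limit */
	size_t used;  /* memory currently accounted */
	void (*trigger_dump)(struct quota_sketch *); /* asks for a dump */
	int dump_requests; /* demo only: how often a dump was requested */
};

/* Demo callback: just counts the requests. */
static void
count_dump_request(struct quota_sketch *q)
{
	q->dump_requests++;
}

/* Called whenever a consumer is considered for wakeup. */
static bool
quota_may_use(struct quota_sketch *q, size_t size)
{
	assert(q != NULL && q->trigger_dump != NULL);
	if (q->used + size > q->limit) {
		/* Cannot proceed: ask for a dump before reporting failure. */
		q->trigger_dump(q);
		return false;
	}
	return true;
}
```

Since the trigger sits on the same path as the check, a consumer can no longer be left waiting without a dump having been requested.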
    • Vladimir Davydov's avatar
      vinyl: do not account small dumps for bandwidth estimation · e351b3e6
      Vladimir Davydov authored
      Small dumps (e.g. triggered by box.snapshot) have too high overhead
      associated with file creation so taking them into account for bandwidth
      estimation may result in erroneous transaction throttling. Let's ignore
      dumps of size less than 1 MB.
      
      Needed for #1862
      e351b3e6
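The filtering rule from e351b3e6 could be sketched like this. The 1 MB threshold comes from the message. The estimator itself (a plain bytes-over-seconds average) and all names are assumptions chosen to keep the example short, and the real estimator may use a different statistic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MIN_DUMP_SIZE = 1024 * 1024 }; /* 1 MB threshold from the message */

struct bandwidth_est {
	double total_bytes;
	double total_seconds;
};

/* Account one finished dump; returns true if the sample was used. */
static bool
bandwidth_account_dump(struct bandwidth_est *est, size_t size, double seconds)
{
	assert(est != NULL && seconds > 0);
	/* Small dumps are dominated by file creation overhead: skip them. */
	if (size < MIN_DUMP_SIZE)
		return false;
	est->total_bytes += (double)size;
	est->total_seconds += seconds;
	return true;
}

/* Estimated bandwidth in bytes per second, 0 if there are no samples. */
static double
bandwidth_get(const struct bandwidth_est *est)
{
	assert(est != NULL);
	if (est->total_seconds <= 0)
		return 0;
	return est->total_bytes / est->total_seconds;
}
```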
    • Vladimir Davydov's avatar
      vinyl: do not try to trigger dump in regulator if already in progress · b80f437f
      Vladimir Davydov authored
      This is pointless since trigger_dump_cb callback will return right away
      in such a case. Let's wrap trigger_dump_cb in vy_regulator_trigger_dump
      method, which will actually invoke the callback only if the previous
      dump has already completed (i.e. vy_regulator_dump_complete was called).
      
      This also gives us a definite place in code where we can adjust the rate
      limit so as to guarantee that a triggered memory dump will finish before
      we hit the hard memory limit (this will be done later).
      
      Needed for #1862
      b80f437f
    • Vladimir Davydov's avatar
      vinyl: bypass format validation for statements loaded from disk · 3846d9b2
      Vladimir Davydov authored
      When the format of a space is altered, we walk over all tuples stored in
      the primary index and check them against the new format. This doesn't
      guarantee that all *statements* stored in the primary index conform to
      the new format though, because the check isn't performed for deleted or
      overwritten statements, e.g.
      
        s = box.schema.space.create('test', {engine = 'vinyl'})
        s:create_index('primary')
        s:insert{1}
        box.snapshot()
        s:delete{1}
      
        -- The following command will succeed, because the space is empty,
        -- however one of the runs contains REPLACE{1}, which doesn't conform
        -- to the new format.
        s:create_index('secondary', {parts = {2, 'unsigned'}})
      
      This is OK as we will never return such overwritten statements to the
      user, however we may still need to read them. Currently, this leads
      either to an assertion failure or to a read error in
      
        vy_stmt_decode
         vy_stmt_new_with_ops
          tuple_init_field_map
      
      We could probably force major compaction of the primary index to purge
      such statements, but it is complicated as there may be a read view
      preventing the write iterator from squashing such a statement, and
      currently there's no way to force destruction of a read view.
      
      So this patch simply disables format validation for all tuples loaded
      from disk (actually we already skip format validation for all secondary
      index statements and for DELETE statements in primary indexes so this
      isn't as bad as it may seem). To do that, it adds a boolean parameter to
      tuple_init_field_map() that controls format validation, and then makes
      vy_stmt_new_with_ops(), which is used for constructing vinyl statements,
      pass false for it. This is OK as all statements inserted into a vinyl
      space are validated explicitly with tuple_validate() anyway.
      
      This is rather a workaround for the lack of a better solution.
      
      Closes #3540
      3846d9b2
    • Vladimir Davydov's avatar
      test: fix spurious box/sql test failure · 23e71c6e
      Vladimir Davydov authored
      For some reason this test uses 555 for space id, which may be taken by
      a previously created space:
      
      Test failed! Result content mismatch:
      --- box/sql.result        Fri Oct  5 17:23:25 2018
      +++ box/sql.reject        Fri Oct 12 19:38:51 2018
      @@ -12,12 +12,14 @@
       ...
       _ = box.schema.space.create('test1', { id = 555 })
       ---
      +- error: Duplicate key exists in unique index 'primary' in space '_space'
       ...
      
      Reproduce file:
      
      ---
      - [box/rtree_point.test.lua, null]
      - [box/transaction.test.lua, null]
      - [box/tree_pk.test.lua, null]
      - [box/access.test.lua, null]
      - [box/cfg.test.lua, null]
      - [box/admin.test.lua, null]
      - [box/lua.test.lua, null]
      - [box/bitset.test.lua, null]
      - [box/role.test.lua, null]
      - [box/sql.test.lua, null]
      ...
      
      Remove { id = 555 } to make sure it never happens.
      23e71c6e
    • Alexander Turenko's avatar
      Add Linux/clang CI target · 181bb3e7
      Alexander Turenko authored
      Replaced targets generation using a matrix expansion + exclusion list
      with the explicit targets list. Gave meaningful names to targets.
      
      Fixes #3673.
      181bb3e7
    • Vladimir Davydov's avatar
      xlog: fix filename in error messages · 4a464f8a
      Vladimir Davydov authored
       - xlog_rename() doesn't strip xlog->filename of inprogress suffix so
         write errors will mistakenly report the filename as inprogress.
       - xlog_create() uses a name without inprogress suffix for error
         reporting while it actually creates an inprogress file.
      4a464f8a
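For the first item in 4a464f8a, stripping the suffix from a stored filename after a rename might look like the sketch below. The ".inprogress" suffix string and the helper name are assumptions of the example, and this is not the xlog code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Suffix assumed for files that are still being written. */
static const char inprogress_suffix[] = ".inprogress";

/* Strip the suffix in place; returns 1 if it was removed, 0 otherwise. */
static int
strip_inprogress(char *filename)
{
	assert(filename != NULL);
	size_t len = strlen(filename);
	size_t suffix_len = sizeof(inprogress_suffix) - 1;
	if (len < suffix_len ||
	    strcmp(filename + len - suffix_len, inprogress_suffix) != 0)
		return 0;
	filename[len - suffix_len] = '\0';
	return 1;
}
```

Calling such a helper right after the rename keeps later error messages in line with the name the file actually has on disk.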
  12. Oct 10, 2018
    • Georgy Kirichenko's avatar
      socket: fix polling in case of spurious wakeup · e6bd7748
      Georgy Kirichenko authored
      socket_writable/socket_readable now handle spurious socket.iowait
      wakeups: they keep waiting until the event happens or the timeout
      expires.
      
      Closes #3344
      e6bd7748
    • Vladimir Davydov's avatar
      vinyl: fix for deferred DELETE overwriting newer statement · 63912c30
      Vladimir Davydov authored
      A deferred DELETE may be generated after a newer statement for the same
      key was inserted into a secondary index and hence land in a newer run.
      Since the read iterator assumes that newer sources always contain newer
      statements for the same key, we mark all deferred DELETE statements with
      VY_STMT_SKIP_READ flag, which makes run/mem iterators ignore them. The
      flag must be persisted when a statement is written to disk, but it is
      not. Fix this.
      
      Fixes commit 504bc805 ("vinyl: do not store meta in secondary index
      runs").
      63912c30
    • Alexander Turenko's avatar
      test: disable feedback daemon test on Mac OS in CI · ab868a6b
      Alexander Turenko authored
      The failure is known and should not have any influence on our CI results.
      
      The test should be re-enabled after #3558 is fixed.
      ab868a6b
  13. Oct 08, 2018
    • Vladimir Davydov's avatar
      cmake: fix sync_file_range detection · e1aa1a3d
      Vladimir Davydov authored
      sync_file_range is declared only if _GNU_SOURCE macro is defined.
      Also, in order to be used in a source file, HAVE_SYNC_FILE_RANGE
      must be present in config.h.cmake.
      
      Fixes commit caae99e5 ("Refactor xlog writer").
      e1aa1a3d
  14. Oct 06, 2018
  15. Oct 05, 2018
    • Vladimir Davydov's avatar
      replication: ref checkpoint needed to join replica · bae6f037
      Vladimir Davydov authored
      Before joining a new replica we register a gc_consumer to prevent
      garbage collection of files needed for join and following subscribe.
      Before commit 9c5d851d ("replication: remove old snapshot files not
      needed by replicas") a consumer would pin both checkpoints and WALs so
      that would work as expected. However, the above mentioned commit
      introduced consumer types and marked a consumer registered on replica
      join as WAL-only so if the garbage collector was invoked during join, it
      could delete files corresponding to the relayed checkpoint resulting in
      replica join failure. Fix this issue by pinning the checkpoint used for
      joining a replica with gc_ref_checkpoint and unpinning once join is
      complete.
      
      The issue can only be reproduced if there are vinyl spaces, because
      deletion of an open snap file doesn't prevent the relay from reading it.
      The existing replication/gc test would catch the issue if it triggered
      compaction on the master so we simply tweak it accordingly instead of
      adding a new test case.
      
      Closes #3708
      bae6f037
    • Vladimir Davydov's avatar
      gc: call gc_run unconditionally when consumer is advanced · a3542586
      Vladimir Davydov authored
      gc_consumer_unregister and gc_consumer_advance don't call gc_run in case
      the consumer in question isn't leftmost. This code was written back when
      gc_run was kinda heavy and would call engine/wal callbacks even if it
      wouldn't really need to. Today gc_run will bail out shortly, without
      making any complex computation, let alone invoking garbage collection
      callbacks, in case it has nothing to do so those optimizations are
      pointless. Let's remove them.
      a3542586