  1. Jun 02, 2018
  2. Jun 01, 2018
    • vinyl: fix compaction vs checkpoint race resulting in invalid gc · b25e3168
      Vladimir Davydov authored
      The callback invoked upon compaction completion uses checkpoint_last()
      to determine whether compacted runs may be deleted: if the max LSN
      stored in a compacted run (run->dump_lsn) is greater than the LSN of the
      last checkpoint (gc_lsn) then the run doesn't belong to the last
      checkpoint and hence is safe to delete, see commit 35db70fa ("vinyl:
      remove runs not referenced by any checkpoint immediately").
      
      The problem is checkpoint_last() isn't synced with vylog rotation - it
      returns the signature of the last successfully created memtx snapshot
      and is updated in memtx_engine_commit_checkpoint() after vylog is
      rotated. If a compaction task completes after vylog is rotated but
      before snap file is renamed, it will assume that compacted runs do not
      belong to the last checkpoint, although they do (as they have been
      appended to the rotated vylog), and delete them.
      
      To eliminate this race, let's use vylog signature instead of snap
      signature in vy_task_compact_complete().
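
The check and the race window described above can be sketched as follows. This is a minimal illustration, not the Tarantool source: the names `run_sketch`, `run_is_deletable` and `race_deletes_run` are hypothetical, and only the `dump_lsn > gc_lsn` comparison is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a vinyl run; only dump_lsn matters here. */
struct run_sketch {
	int64_t dump_lsn; /* max LSN stored in the run */
};

/*
 * A compacted run may be deleted right away only if it is newer than
 * the last checkpoint, i.e. no checkpoint references it.
 */
static bool
run_is_deletable(const struct run_sketch *run, int64_t gc_lsn)
{
	assert(run != NULL);
	return run->dump_lsn > gc_lsn;
}

/*
 * Models the race window: vylog has already been rotated at
 * new_vylog_lsn, but the snap signature still reports old_snap_lsn.
 * Returns true if the snap-based check would delete a run that the
 * vylog-based check would keep.
 */
static bool
race_deletes_run(int64_t run_lsn, int64_t old_snap_lsn,
		 int64_t new_vylog_lsn)
{
	struct run_sketch run = { run_lsn };
	return run_is_deletable(&run, old_snap_lsn) &&
	       !run_is_deletable(&run, new_vylog_lsn);
}
```

With a run at LSN 15, a stale snap signature of 10 and a rotated vylog at 20, the snap-based check deletes a run that the new checkpoint still references; using the vylog signature avoids this.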
      
      Closes #3437
  3. May 31, 2018
    • vinyl: fix false-positive assertion at exit · ff02157f
      Vladimir Davydov authored
      latch_destroy() and fiber_cond_destroy() are basically no-ops. All they
      do is check that latch/cond is not used. When a global latch or cond
      object is destroyed at exit, it may still have users and this is OK as
      we don't stop fibers at exit. In vinyl this results in the following
      false-positive assertion failures:
      
        src/latch.h:81: latch_destroy: Assertion `l->owner == NULL' failed.
      
        src/fiber_cond.c:49: fiber_cond_destroy: Assertion `rlist_empty(&c->waiters)' failed.
      
      Remove "destruction" of vy_log::latch to suppress the first one. Wake up
      all fibers waiting on vy_quota::cond before destruction to suppress the
      second one. Add some test cases.
      
      Closes #3412
  4. May 29, 2018
  5. May 25, 2018
  6. May 24, 2018
    • replication: add strict ordering for appliers operating in a full mesh · edd76a2a
      Georgy Kirichenko authored
      In some cases, when an applier yielded while processing a request,
      another applier might start a conflicting operation and break
      replication and database consistency.
      Now an applier locks a per-server-id latch before processing a
      transaction. This guarantees that only one applier request per server
      is in progress at any given moment.
      
      The problem was very rare until full mesh topologies in vinyl
      became commonplace.
      
      Fixes gh-3339
  7. May 22, 2018
  8. May 17, 2018
  9. May 15, 2018
    • test: improve vinyl/select_consistency · 47fe6ced
      Vladimir Davydov authored
      Improve the test by decreasing range_size so that it creates a lot of
      ranges for test indexes, not just one. This helped find bugs causing
      the crash described in #3393.
      
      Follow-up #3393
    • vinyl: do not panic if secondary index is inconsistent with primary · 1558c538
      Vladimir Davydov authored
      Although the bug in vy_task_dump_complete() due to which a tuple could
      be lost during dump was fixed, there still may be affected deployments
      as the bug was persisted on disk. To avoid occasional crashes on such
      deployments, let's make vinyl_iterator_secondary_next() skip tuples that
      are present in a secondary index but missing in the primary.
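
The skip-instead-of-panic behaviour can be sketched as below. This is an illustration under simplifying assumptions: indexes are modelled as plain integer arrays, and `primary_has` and `secondary_scan` are hypothetical helpers, not functions from the vinyl code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Linear lookup standing in for a point lookup in the primary index. */
static bool
primary_has(const int *primary, size_t n, int key)
{
	for (size_t i = 0; i < n; i++) {
		if (primary[i] == key)
			return true;
	}
	return false;
}

/*
 * Copy to out every secondary entry that has a matching primary
 * tuple. Orphaned entries are skipped instead of being treated as
 * fatal. Returns the number of tuples written to out.
 */
static size_t
secondary_scan(const int *secondary, size_t sn,
	       const int *primary, size_t pn, int *out)
{
	assert(out != NULL);
	size_t count = 0;
	for (size_t i = 0; i < sn; i++) {
		if (!primary_has(primary, pn, secondary[i]))
			continue; /* inconsistent entry: skip, do not panic */
		out[count++] = secondary[i];
	}
	return count;
}
```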
      
      Closes #3393
    • vinyl: fix lost key on dump completion · 1f0023ad
      Vladimir Davydov authored
      vy_task_dump_complete() creates a slice for each range overlapping with
      the newly written run. It uses vy_range_tree_psearch(min_key) to find
      the first overlapping range and nsearch(max_key) to find the range
      immediately following the last overlapping range. This is incorrect as
      nsearch rb tree method returns the element matching the search key if it
      is present in the tree. That is, if the max key written to a run turns
      out to be equal to the beginning of a range, the slice won't be created for
      it and it will be silently and persistently lost.
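
The boundary error can be reproduced on a sorted array of range start keys. The sketch below is an assumption-laden model, not the vinyl code: ranges are represented only by their integer start keys, and `psearch_sketch`, `nsearch_sketch`, `upper_sketch` and `slices_created` are hypothetical names that mimic the search semantics described above.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the first element >= key (nsearch-like), or n if none. */
static size_t
nsearch_sketch(const int *begins, size_t n, int key)
{
	size_t i = 0;
	while (i < n && begins[i] < key)
		i++;
	return i;
}

/* Index of the first element > key, or n if none. */
static size_t
upper_sketch(const int *begins, size_t n, int key)
{
	size_t i = 0;
	while (i < n && begins[i] <= key)
		i++;
	return i;
}

/* Index of the last element <= key (psearch-like). */
static size_t
psearch_sketch(const int *begins, size_t n, int key)
{
	assert(n > 0 && begins[0] <= key);
	return upper_sketch(begins, n, key) - 1;
}

/* Count ranges that get a slice for a run spanning [min_key, max_key]. */
static size_t
slices_created(const int *begins, size_t n, int min_key, int max_key,
	       int use_buggy_end)
{
	size_t first = psearch_sketch(begins, n, min_key);
	/* The end index is exclusive: ranges [first, end) get a slice. */
	size_t end = use_buggy_end ? nsearch_sketch(begins, n, max_key)
				   : upper_sketch(begins, n, max_key);
	return end - first;
}
```

For ranges starting at 0, 100 and 200 and a run covering keys 50 to 200, the nsearch-based end stops at the range starting at 200, so that range gets no slice even though the run contains its first key.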
      
      The issue manifests itself as a crash in vinyl_iterator_secondary_next(),
      when we fail to find the tuple in the primary index corresponding to a
      statement found in a secondary index.
      
      Part of #3393
    • vinyl: fix EQ check in run iterator · 7ee79a0a
      Vladimir Davydov authored
      vy_run_iterator_seek() is supposed to check that the resulting statement
      matches the search key in case of ITER_EQ, but if the search key lies at
      the beginning of the slice, it doesn't. As a result, vy_point_lookup()
      may fail to find an existing tuple as demonstrated below.
      
      Suppose we are looking for key {10} in the primary index which consists
      of an empty mem and two runs:
      
          run 1: DELETE{15}
          run 2: INSERT{10}
      
      vy_run_iterator_next() returns DELETE{15} for run 1 because of the
      missing EQ check and vy_point_lookup() stops at run 1 (since the
      terminal statement is found) and mistakenly returns NULL.
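
The scenario above can be modelled in a few lines. This is a simplified sketch rather than the real iterator: runs are arrays of integer-keyed statements, and `run_seek_eq` and `point_lookup` are hypothetical stand-ins whose `check_eq` parameter toggles the missing EQ check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum stmt_type { STMT_INSERT, STMT_DELETE };

struct stmt {
	int key;
	enum stmt_type type;
};

struct run {
	const struct stmt *stmts; /* sorted by key */
	size_t count;
};

/*
 * Position at the first statement with key >= the search key. With
 * check_eq set, return NULL unless that statement matches exactly.
 */
static const struct stmt *
run_seek_eq(const struct run *run, int key, bool check_eq)
{
	assert(run != NULL);
	for (size_t i = 0; i < run->count; i++) {
		if (run->stmts[i].key < key)
			continue;
		if (check_eq && run->stmts[i].key != key)
			return NULL;
		return &run->stmts[i];
	}
	return NULL;
}

/*
 * Scan runs from newest to oldest and stop at the first statement
 * found. Returns true if the key resolves to a live tuple.
 */
static bool
point_lookup(const struct run *runs, size_t run_count, int key,
	     bool check_eq)
{
	for (size_t i = 0; i < run_count; i++) {
		const struct stmt *s = run_seek_eq(&runs[i], key, check_eq);
		if (s != NULL)
			return s->type == STMT_INSERT;
	}
	return false;
}
```

Without the check, the lookup for key 10 stops at DELETE{15} in the first run and reports the tuple as missing; with the check, it moves on to the second run and finds INSERT{10}.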
      
      The issue manifests itself as a crash in vinyl_iterator_secondary_next(),
      when we fail to find the tuple in the primary index corresponding to a
      statement found in a secondary index.
      
      Part of #3393
    • Add test case for fiber safety of digest.pbkdf2 · ec9ec946
      Alexander Turenko authored
      Follows up #3396.
  10. May 14, 2018
  11. May 08, 2018
    • socket: Fix socket test · 2b973c05
      Ilya Markov authored
      On sequential launches of app-tap/console.test, tests failed with
      "User exists" and binding errors.
      
      Make socket paths relative.
      Add user cleanup.
      
      Relates #3168
  12. May 07, 2018
    • Don't try to lock a ddl latch in a multistatement tx · c7012534
      Georgy Kirichenko authored
      Any DDL is prohibited in a multistatement transaction, so there is no
      reason to try to lock the DDL latch in this case. Trying to lock an
      already locked latch causes a yield and a silent transaction rollback,
      which crashes the tarantool server or triggers an assertion failure.
      
      Fixes #2783
  13. May 05, 2018
  14. May 03, 2018
    • digest: fix error in base64 encode options · 6e1ac12e
      Vladislav Shpilevoy authored
      Any base64 option leads to urlsafe encoding. This is wrong and is
      caused by incorrect flag checking. Fix it.
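
One common way such a flag-checking bug arises is testing a composite flag with a plain bitwise AND. The sketch below illustrates that pattern only; whether this is the exact defect fixed here is an assumption, and the option names and values (`OPT_NOPAD`, `OPT_NOWRAP`, `OPT_URLSAFE`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical option bits; OPT_URLSAFE is a composite of several bits. */
enum opt_sketch {
	OPT_NOPAD = 1,
	OPT_NOWRAP = 2,
	OPT_URLSAFE = 7,
};

/* Buggy: true for any option that shares a bit with OPT_URLSAFE. */
static bool
is_urlsafe_buggy(int options)
{
	return (options & OPT_URLSAFE) != 0;
}

/* Correct: true only when every bit of OPT_URLSAFE is set. */
static bool
is_urlsafe(int options)
{
	assert(options >= 0);
	return (options & OPT_URLSAFE) == OPT_URLSAFE;
}
```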
      
      Closes #3358
    • iproto: follow up patch for the fix for blocked connection · 1dcdc98e
      Konstantin Osipov authored
      * rename request_limit.test.lua to net_msg_max.test.lua
      * make net_msg_max.test.lua stable (courtesy of @Gerold103)
      * exclude disconnect messages from iproto_msg_max limit
      * add a separate warning for throttling based on readahead buffer overflow
    • iproto: connection could block forever after a CALL request · f4d66dae
      Vladislav Shpilevoy authored
      Starting with 1.9, a CALL request which yields releases
      the input buffer in the net thread before the CALL is complete.
      A release trigger is fired when the CALL fiber yields.
      
      The problem is that by default the input socket is not
      included into poll() list of the event loop: thanks to an
      optimization by @kostja for strict request/response scenario,
      the socket is included into poll() list only after the response
      is sent to the client. Thus, the following could happen:
      
      * a client sends a long-polling request
      * the request yields and maybe never finishes
      * the socket is not being read until the long-polling request
        is finished
      
      The patch is to explicitly feed EV_READ event to the event
      loop on the client socket whenever we release the input buffer
      for a long-polling request.
      
      We may remove iproto_resume() from net_discard_input() along
      with this patch since iproto_resume() will be called by
      iproto_connection_on_input().
  15. Apr 18, 2018
    • wal: Update request header after sequence update · 41589229
      Ilya Markov authored
      When a tuple in an insert/replace request has a NULL value
      in the field incremented by a sequence,
      the request body is changed: NULL is replaced by the value taken from
      the sequence.
      But the request header is not updated.
      So the redo log, which takes the body from the header if the header exists,
      writes the old version of the request to the WAL.
      
      Fix this by updating the header value after handling the sequence.
      
      Closes #3247
    • replication: fix broken cases with quorum=0 · 01b8ebc3
      Konstantin Belyavskiy authored
      This commit is related to 6d81fa99.
      With replication_connect_quorum=0 set, the previous commit broke
      replication since it skips the applier_resume() and applier_start() parts.
      Fix it and add more test cases.
      
      Closes #3278
    • replication: fix bug with read-only replica as a bootstrap leader · a8ecd1e1
      Konstantin Belyavskiy authored
      When bootstrapping a new cluster, any replica from the replica set can
      be chosen as a leader, but if it is read-only, bootstrap will
      fail with an error.
      Fix this by excluding read-only replicas from voting: add
      access rights information to the IPROTO_REQUEST_VOTE reply.
      
      Closes #3257
  16. Apr 11, 2018
  17. Apr 10, 2018
  18. Apr 09, 2018
    • replication: fix bug with zero replication_connect_quorum · 6d81fa99
      Konstantin Belyavskiy authored
      If 'box.cfg.read_only' is false, 'replication' defines at least one
      replica (other than the instance itself), but none of them is available
      at the time of box.cfg execution, and replication_connect_quorum is set
      to zero, the master displays 'orphan' status instead of 'running', since
      the logic which changes this state is executed only after a successful
      connection.
      
      Closes #3278
  19. Apr 07, 2018
    • vinyl: fix crash if index is dropped while read task is in progress · 2a7cf7f5
      Vladimir Davydov authored
      If a fiber waiting for a read task to complete is cancelled, it will
      leave the read iterator immediately, leaving the read task pending.
      If the index is dropped before the read task is complete, the task
      will attempt to dereference a deleted run upon completion:
      
          0  0x560b4007dbbc in print_backtrace+9
          1  0x560b3ff80a1d in _ZL12sig_fatal_cbiP9siginfo_tPv+1e7
          2  0x7f52b09190c0 in __restore_rt+0
          3  0x7f52af6ea30a in bzero+5a
          4  0x560b3ffc7a99 in mempool_free+2a
          5  0x560b3ffcaeb7 in vy_page_read_cb_free+47
          6  0x560b400806a2 in cbus_call_done+3f
          7  0x560b400805ea in cmsg_deliver+30
          8  0x560b40080e4b in cbus_process+51
          9  0x560b4003046b in _ZL10tx_prio_cbP7ev_loopP10ev_watcheri+2b
          10 0x560b4023d86e in ev_invoke_pending+ca
          11 0x560b4023e772 in ev_run+5a0
          12 0x560b3ff822dc in main+5ed
          13 0x7f52af6862b1 in __libc_start_main+f1
          14 0x560b3ff801da in _start+2a
          15 (nil) in +2a
      
      Fix this by taking a reference to the run for each read task.
      
      Note, currently we use vy_run::refs not only as a reference counter, but
      also as a counter of slices created for the run - see how we compare it
      to vy_run::compacted_slice_count in vy_task_compact_complete(). This
      isn't going to work anymore, obviously. Now we need to count slices
      created for each run in a separate counter, vy_run::slice_count. Anyway,
      it was a rather dubious hack to abuse reference counter for counting
      slices and it's good to finally get rid of it.
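
The fix amounts to pinning the run for the lifetime of each read task. A minimal sketch, assuming a simplified run object: `run_sketch`, `run_ref`, `run_unref`, `read_task_start` and `read_task_complete` are hypothetical names, and a `freed` flag stands in for actually releasing memory.

```c
#include <assert.h>
#include <stdbool.h>

struct run_sketch {
	int refs;   /* number of holders keeping the run alive */
	bool freed; /* set once the last reference is dropped */
};

static void
run_ref(struct run_sketch *run)
{
	assert(!run->freed);
	run->refs++;
}

static void
run_unref(struct run_sketch *run)
{
	assert(run->refs > 0);
	if (--run->refs == 0)
		run->freed = true; /* a real implementation frees memory here */
}

/* The read task pins the run for as long as it is in flight. */
static void
read_task_start(struct run_sketch *run)
{
	run_ref(run);
}

/* Completion callback: the run is guaranteed to still be alive. */
static void
read_task_complete(struct run_sketch *run)
{
	assert(!run->freed);
	run_unref(run);
}
```

If the index drops its own reference while a task is pending, the run survives until `read_task_complete()` releases the last one.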
    • vinyl: use ERRINJ_DOUBLE for ERRINJ_VY_READ_PAGE_TIMEOUT · 8dc9895f
      Vladimir Davydov authored
      We use ERRINJ_DOUBLE for all other timeout injections. This makes them
      more flexible as we can inject an arbitrary timeout in tests, not just
      enable some hard-coded timeout. Besides, it makes tests easier to
      follow. So let's use ERRINJ_DOUBLE for ERRINJ_VY_READ_PAGE_TIMEOUT too.
    • alter: do not crash if sequence is created for space with no indexes · 95aefec3
      Vladimir Davydov authored
      If a space has no indexes, index_find() will return NULL, which will be
      happily dereferenced by on_replace_dd_sequence(). Looks like this bug
      goes back to the time when we made index_find() exception-free and
      introduced index_find_xc() wrapper. Fix it and add a test case.
  20. Apr 05, 2018
    • log: Fix syslog logger · 7c7a2fa1
      Ilya Markov authored
      * Do not rewrite the format of the default logger when the syslog
        option is set.
      * Add facility option parsing and use the parsed result when formatting
        the message according to RFC3164. Possible values and the default
        value of the syslog facility are taken from nginx
        (https://nginx.ru/en/docs/syslog.html)
      * Move initialization of the logger type and format function before
        initialization of the descriptor in log_XXX_init, so that we can test
        the format function of the syslog logger.
      
      Closes gh-3244.
  21. Apr 04, 2018
  22. Apr 03, 2018
    • vinyl: fail transaction immediately if it does not fit in memory · 8f63d5d9
      Vladimir Davydov authored
      If the size of a transaction is greater than the configured memory
      limit (box.cfg.vinyl_memory), the transaction will hang on commit
      for 60 seconds (box.cfg.vinyl_timeout) and then fail with the
      following error message:
      
        Timed out waiting for Vinyl memory quota
      
      This is confusing. Let's fail such transactions immediately with
      OutOfMemory error.
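
The intended behaviour can be sketched as a three-way decision. This is a hypothetical model, not the vinyl quota code: `quota_sketch`, `quota_check` and the result names are invented for illustration, and the sketch assumes usage never exceeds the limit.

```c
#include <assert.h>
#include <stddef.h>

enum quota_result { QUOTA_OK, QUOTA_WAIT, QUOTA_TOO_BIG };

/* Hypothetical quota state: configured limit and current usage. */
struct quota_sketch {
	size_t limit;
	size_t used;
};

static enum quota_result
quota_check(const struct quota_sketch *q, size_t size)
{
	assert(q->used <= q->limit);
	if (size > q->limit)
		return QUOTA_TOO_BIG; /* can never fit: fail immediately */
	if (size > q->limit - q->used)
		return QUOTA_WAIT; /* may fit once memory is freed */
	return QUOTA_OK;
}
```

Only the `QUOTA_WAIT` case is worth a timed wait; a transaction larger than the whole limit cannot succeed no matter how long it waits.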
      
      Closes #3291