  1. Nov 17, 2017
      vinyl: discard tautological DELETEs on compaction · a6f45d87
      Vladimir Davydov authored
      The write iterator never discards DELETE statements referenced by a read
      view unless it is a major compaction. However, a DELETE is useless if
      it is preceded by another DELETE for the same key. Let's skip such
      tautological DELETEs. This is not only a useful optimization on its own;
      it will also help us annihilate INSERT+DELETE pairs on compaction.
      
      Needed for #2875
      a6f45d87
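The rule described in this commit can be illustrated with a minimal sketch. It is not the vinyl write iterator: `struct stmt`, `enum stmt_type`, and `skip_tautological_deletes()` are hypothetical names, and the sketch assumes statements arrive sorted by key and, within a key, from oldest to newest LSN. Read views are not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical statement model; not the real vinyl definitions. */
enum stmt_type { STMT_REPLACE, STMT_DELETE };

struct stmt {
	int64_t key;
	int64_t lsn;
	enum stmt_type type;
};

/*
 * Compact `stmts` in place and return the new count.
 *
 * Assumption: the input is sorted by key and, within one key, by
 * ascending LSN (oldest first). A DELETE is dropped when the previous
 * surviving statement is a DELETE for the same key, because it cannot
 * change what any reader observes.
 */
size_t
skip_tautological_deletes(struct stmt *stmts, size_t count)
{
	assert(stmts != NULL || count == 0);
	size_t out = 0;
	for (size_t i = 0; i < count; i++) {
		if (out > 0 &&
		    stmts[i].type == STMT_DELETE &&
		    stmts[out - 1].type == STMT_DELETE &&
		    stmts[out - 1].key == stmts[i].key)
			continue; /* tautological DELETE: skip it */
		stmts[out++] = stmts[i];
	}
	return out;
}
```

With the history REPLACE, DELETE, DELETE for one key, only the second DELETE is dropped; a DELETE that follows a REPLACE, or that belongs to a different key, is always kept.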
  2. Nov 16, 2017
  3. Nov 15, 2017
      Fix replication/gc test · 473b85a3
      Vladimir Davydov authored
      Make sure the master receives an ack from the replica and performs
      garbage collection before checking the checkpoint count.
      473b85a3
      relay: don't delete xlog files until replica confirms receipt · 2d86127b
      Vladimir Davydov authored
      We remove old xlog files as soon as we have sent them to all replicas.
      However, the fact that we have successfully sent something to a replica
      doesn't necessarily mean the replica has received it. If a replica
      fails to apply a row (for instance, because it is out of memory),
      replication stops, but the xlog files have already been deleted on the
      master. When the replica comes back online, the master cannot find the
      xlog it needs to feed to the replica, and replication stops again.
      
      The user visible effect is the following error message in the log and in
      the replica status:
      
        Missing .xlog file between LSN 306 {1: 306} and 311 {1: 311}
      
      There is no way to recover from this but to re-bootstrap the replica
      from scratch.
      
      The issue was introduced by commit ba09475f ("replica: advance gc
      state only when xlog is closed"), which aimed to make the status
      update procedure as lightweight and fast as possible and therefore
      moved gc_consumer_advance() from tx_status_update() to a special gc
      message. A gc message is created and sent to TX as soon as an xlog is
      relayed. Let's rework this so that gc messages are first appended to a
      special queue and are scheduled only when the relay receives the
      receipt confirmation from the replica.
      
      Closes #2825
      2d86127b
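The queueing idea can be sketched as follows. This is an illustration under assumptions, not the relay code: `struct gc_queue`, `gc_queue_push()`, and `gc_queue_on_ack()` are hypothetical names, each xlog is reduced to a single integer signature, and "scheduling" a gc message is modelled by updating `gc_signature`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GC_QUEUE_MAX 16 /* arbitrary capacity for the sketch */

/* Hypothetical queue of gc messages waiting for a replica ack. */
struct gc_queue {
	int64_t pending[GC_QUEUE_MAX]; /* xlog signatures, ascending */
	size_t count;
	int64_t gc_signature; /* last signature released to gc */
};

/*
 * Called when an xlog has been fully relayed. The message is only
 * remembered here; nothing is garbage collected yet.
 * Returns 0 on success, -1 if the queue is full.
 */
int
gc_queue_push(struct gc_queue *q, int64_t xlog_signature)
{
	if (q->count == GC_QUEUE_MAX)
		return -1;
	assert(q->count == 0 ||
	       q->pending[q->count - 1] <= xlog_signature);
	q->pending[q->count++] = xlog_signature;
	return 0;
}

/*
 * Called when the replica confirms receipt of everything up to
 * acked_signature. Only messages covered by the ack are scheduled.
 * Returns the number of messages scheduled.
 */
size_t
gc_queue_on_ack(struct gc_queue *q, int64_t acked_signature)
{
	size_t done = 0;
	while (done < q->count && q->pending[done] <= acked_signature) {
		q->gc_signature = q->pending[done];
		done++;
	}
	/* Shift the messages that are still unconfirmed to the front. */
	for (size_t i = done; i < q->count; i++)
		q->pending[i - done] = q->pending[i];
	q->count -= done;
	return done;
}
```

In this model, relaying xlogs 306 and 311 releases nothing by itself; an ack for 308 releases only 306, so the file the replica still needs stays on disk.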
      vinyl: log reads that take too long · af63fcbe
      Vladimir Davydov authored
      If reading a single statement from vinyl takes longer than
      box.cfg.too_long_threshold, the request is logged:
      
        512/1: select([1], EQ) => REPLACE([100001, 1], lsn=200006) took too long: 0.626 sec
      
      This is useful for debugging.
      
      While we are at it, let's also remove 'timeout' from the vinyl engine
      constructor arguments and set it with box_set_vinyl_timeout() on box
      initialization instead, similarly to vinyl_max_tuple_size.
      
      Closes #2871
      af63fcbe
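A minimal sketch of the threshold check behind such a log line, assuming the request and the result have already been rendered as strings. `format_slow_read()` is a hypothetical helper, not a function from the vinyl sources, and the fiber prefix (`512/1:` in the sample above) is left to the logger.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write a "took too long" message into buf if
 * `elapsed` (seconds) exceeds `too_long_threshold` (seconds).
 * Returns the snprintf() result for a slow read, or 0 (with an empty
 * buf) when the read was fast enough and nothing should be logged.
 */
int
format_slow_read(char *buf, size_t size, double elapsed,
		 double too_long_threshold,
		 const char *request, const char *result)
{
	assert(buf != NULL && size > 0);
	if (elapsed <= too_long_threshold) {
		buf[0] = '\0';
		return 0;
	}
	return snprintf(buf, size, "%s => %s took too long: %.3f sec",
			request, result, elapsed);
}
```

A caller would measure the elapsed time around the read and pass the formatted message to its logger only when the return value is positive.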
      vinyl: zap vy_key_snprint() and vy_key_str() · 41e08261
      Vladimir Davydov authored
      Use generic tuple_snprint() and tuple_str() instead.
      41e08261
      vinyl: merge vy_cursor and vinyl_iterator · 54f9d3b7
      Vladimir Davydov authored
      The vinyl_iterator struct was introduced as a C++ wrapper around
      vy_cursor. Since there's no C++ code left in the engine, and both
      structures are defined in the same file, we can merge them now.
      54f9d3b7
      vinyl: remove engine wrapper functions · f5fca22f
      Vladimir Davydov authored
      The engine infrastructure was initially implemented in C++, so we
      needed the wrappers to provide a C++ API to Vinyl. Now everything is
      in C, so we don't need them any more. Let's fold them into vinyl.c.
      
      Note that this patch does not touch the vinyl_engine, vinyl_index, and
      vinyl_iterator structures; they are still there. It only gets rid
      of the intermediate layer of wrapper functions, which is no longer
      needed.
      f5fca22f
      vinyl: pass force_recovery on engine initialization · 85fd9907
      Vladimir Davydov authored
      Accessing configuration from inside an engine implementation
      violates encapsulation.
      85fd9907
      Fix race in garbage collection · b8717738
      Vladimir Davydov authored
      Engine callbacks that perform garbage collection may sleep, because they
      use coio for removing files to avoid blocking the TX thread. If garbage
      collection is called concurrently from different fibers (e.g. from relay
      fibers), we may attempt to delete the same file multiple times. What is
      worse, xdir_collect_garbage(), used by engine callbacks to remove files,
      isn't safe against concurrent execution: it first unlinks a file via
      coio, which involves a yield, and only then removes the corresponding
      vclock from the directory index. This opens a race window in which
      another fiber can read the same vclock and yield; in the interim, the
      vclock can be freed by the first fiber:
      
        #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
        #1  0x00007f105ceda3fa in __GI_abort () at abort.c:89
        #2  0x000055e4c03f4a3d in sig_fatal_cb (signo=11) at main.cc:184
        #3  <signal handler called>
        #4  0x000055e4c066907a in vclockset_remove (rbtree=0x55e4c1010e58, node=0x55e4c1023d20) at box/vclock.c:215
        #5  0x000055e4c06256af in xdir_collect_garbage (dir=0x55e4c1010e28, signature=342, use_coio=true) at box/xlog.c:620
        #6  0x000055e4c0417dcc in memtx_engine_collect_garbage (engine=0x55e4c1010df0, lsn=342) at box/memtx_engine.c:784
        #7  0x000055e4c0414dbf in engine_collect_garbage (lsn=342) at box/engine.c:155
        #8  0x000055e4c04a36c7 in gc_run () at box/gc.c:192
        #9  0x000055e4c04a38f2 in gc_consumer_advance (consumer=0x55e4c1021360, signature=342) at box/gc.c:262
        #10 0x000055e4c04b4da8 in tx_gc_advance (msg=0x7f1028000aa0) at box/relay.cc:250
        #11 0x000055e4c04eb854 in cmsg_deliver (msg=0x7f1028000aa0) at cbus.c:353
        #12 0x000055e4c04ec871 in fiber_pool_f (ap=0x7f1056800ec0) at fiber_pool.c:64
        #13 0x000055e4c03f4784 in fiber_cxx_invoke(fiber_func, typedef __va_list_tag __va_list_tag *) (f=0x55e4c04ec6d4 <fiber_pool_f>, ap=0x7f1056800ec0) at fiber.h:665
        #14 0x000055e4c04e6816 in fiber_loop (data=0x0) at fiber.c:631
        #15 0x000055e4c0687dab in coro_init () at /home/vlad/src/tarantool/third_party/coro/coro.c:110
      
      Fix this by serializing concurrent execution of garbage collection
      callbacks with a latch.
      b8717738
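The serialization can be sketched as below. This is an analogy under stated assumptions, not the actual fix: Tarantool's latch is a cooperative fiber-level lock, and a `pthread_mutex_t` stands in for it here. `gc_run_serialized()` and `engine_collect_garbage_stub()` are hypothetical names, and the re-check of the signature under the lock is an addition made for the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Stand-in for the latch; the real code uses a fiber latch. */
static pthread_mutex_t gc_latch = PTHREAD_MUTEX_INITIALIZER;
/* Highest signature already collected (sketch-only state). */
static int64_t gc_collected_signature = -1;
/* Counts how many times the engine callback actually ran. */
static int gc_unlink_calls;

/*
 * Hypothetical engine callback. In the real system this may yield
 * while files are unlinked, which is what opens the race window.
 */
static void
engine_collect_garbage_stub(int64_t signature)
{
	(void)signature;
	gc_unlink_calls++;
}

/*
 * Run garbage collection with the callbacks serialized: only one
 * caller at a time may be inside the engine callback.
 */
void
gc_run_serialized(int64_t signature)
{
	int rc = pthread_mutex_lock(&gc_latch);
	assert(rc == 0);
	(void)rc;
	/* Assumption of the sketch: skip work another caller already did. */
	if (signature > gc_collected_signature) {
		engine_collect_garbage_stub(signature);
		gc_collected_signature = signature;
	}
	rc = pthread_mutex_unlock(&gc_latch);
	assert(rc == 0);
	(void)rc;
}
```

Because the callback only runs while the lock is held, a second caller cannot observe directory state that the first caller is still in the middle of modifying.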
      box: disable schema auto upgrade for replication · 582a85d4
      Vladimir Davydov authored
      Currently, box.schema.upgrade() is called automatically after box.cfg()
      if the upgrade is considered safe (at present, only the upgrade to 1.7.5
      is "safe"). However, no upgrade is safe if replication is configured,
      because it can easily result in replication conflicts. Let's disable
      auto upgrade if the 'replication' configuration option is set.
      
      Closes #2886
      582a85d4
  4. Nov 13, 2017
  5. Nov 10, 2017
  6. Nov 06, 2017