  1. Sep 27, 2016
    • Vladimir Davydov's avatar
      vinyl: patch holes in index on recovery · 9532f5be
      Vladimir Davydov authored
      Successful compaction may create a range without tuples, and currently we
      don't store empty range files on disk. As a result, such a range won't
      be loaded on recovery, which breaks the index tree invariant (prev->end
      equals next->begin). Hence we must silently create a new range for each
      gap found on recovery.
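
A minimal sketch of the gap-patching idea, not the actual vinyl code. It assumes a simplified, hypothetical `struct range` whose keys are plain ints (vinyl uses tuples), that the recovered ranges are already sorted by lower bound, and that only holes between neighbours need patching. The name `patch_holes` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified range: int keys instead of vinyl tuples. */
struct range {
	const int *begin; /* inclusive lower bound, NULL if leftmost */
	const int *end;   /* exclusive upper bound, NULL if rightmost */
};

/*
 * Copy the sorted ranges from `in` to `out`, inserting an empty range
 * [prev->end, next->begin) wherever two neighbours do not touch.
 * Returns the number of ranges written to `out`.
 */
static size_t
patch_holes(const struct range *in, size_t n_in,
	    struct range *out, size_t cap)
{
	size_t n_out = 0;
	for (size_t i = 0; i < n_in; i++) {
		if (i > 0) {
			const int *prev_end = in[i - 1].end;
			const int *next_begin = in[i].begin;
			/* Interior bounds are never open-ended. */
			assert(prev_end != NULL && next_begin != NULL);
			if (*prev_end != *next_begin) {
				assert(n_out < cap);
				out[n_out].begin = prev_end;
				out[n_out].end = next_begin;
				n_out++;
			}
		}
		assert(n_out < cap);
		out[n_out++] = in[i];
	}
	return n_out;
}
```

After patching, every pair of adjacent ranges in `out` satisfies prev->end == next->begin again.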
      9532f5be
    • Vladimir Davydov's avatar
      vinyl: do not remove empty range on split · 1ff57e79
      Vladimir Davydov authored
      Successful merge+split may result in ranges without tuples. Before commit
      559102a7 ("vinyl: store range lower and upper bounds on disk") it was OK
      to delete such ranges, because there was no range->end. To illustrate
      this, suppose compaction splits a range as follows:
      
        [A, C) => [A, B) + [B, C)
      
      If range [B, C) turns out to be empty, then we could simply drop it as
      all inserts would go to range [A, B) and as B was not stored anywhere,
      it would effectively become range [A, C).
      
      After the above-mentioned commit, however, removal of range [B, C)
      breaks the index invariant that for each two adjacent ranges prev->end
      always equals next->begin. This, in turn, can result in data loss in
      case [A, B) gets split again, as B will be used to break the run write
      loop (see vy_range_compact_execute()) which is premature since there may
      be tuples >= B.
      
      That being said, let's remove this small optimization altogether.
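
The invariant described above can be sketched as a simple check. This is an illustration only: `struct range` here is a hypothetical simplification with int keys, and `index_invariant_holds` is not a vinyl function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified range: int keys instead of vinyl tuples. */
struct range {
	const int *begin; /* inclusive lower bound, NULL if leftmost */
	const int *end;   /* exclusive upper bound, NULL if rightmost */
};

/*
 * Return true if for each two adjacent ranges prev->end equals
 * next->begin. The array is assumed to be sorted by lower bound.
 */
static bool
index_invariant_holds(const struct range *ranges, size_t n)
{
	for (size_t i = 0; i + 1 < n; i++) {
		const int *end = ranges[i].end;
		const int *begin = ranges[i + 1].begin;
		if (end == NULL || begin == NULL || *end != *begin)
			return false;
	}
	return true;
}
```

With ranges [A, B), [B, C), [C, D) the check passes. Dropping the empty [B, C) leaves [A, B) next to [C, D), and the check fails, which is the situation this commit avoids.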
      1ff57e79
    • Vladimir Davydov's avatar
      vinyl: warn about stale and partial ranges on recovery · ca0445de
      Vladimir Davydov authored
      A "stale" range is an old range file left after compaction. A "partial"
      range is a range file left after failed split. Although we can handle
      such files, they should not normally exist in the index directory, so
      let's warn about their presence on recovery.
      ca0445de
    • Vladimir Davydov's avatar
      Fix vinyl/compact test · 524789c0
      Vladimir Davydov authored
      The vinyl/compact test issues two snapshots and expects that there will be
      two runs after it, then it keeps waiting until compaction merges them.
      There is a race condition intrinsic to this test - compaction might have
      finished before the test checks that there are exactly two runs. To
      eliminate it, let's set the test index's compact_wm parameter to 3 and
      add one more snapshot to trigger compaction.
      
      Since it is currently impossible to modify vinyl options dynamically,
      this patch moves compact_wm to index options where page_size and
      range_size reside.
      
      Closes #1758
      524789c0
    • Konstantin Osipov's avatar
      3b7e4b46
    • Nick Zavaritsky's avatar
      899bc6d8
  2. Sep 26, 2016
    • Nick Zavaritsky's avatar
      Fix gh-1772: broken tarantoolctl eval · 33fc47e5
      Nick Zavaritsky authored
      33fc47e5
    • Vladimir Davydov's avatar
      vinyl: remove old range file on compaction asap · 3977c18f
      Vladimir Davydov authored
      Currently, we postpone old range file removal until checkpoint, but we
      can do it right after successful compaction - this will save us some
      disk space.
      3977c18f
    • Vladimir Davydov's avatar
      test: vinyl: test recovery after incomplete splits · b094b089
      Vladimir Davydov authored
      The idea behind the test is simple - create several invalid range files,
      i.e. those left from previous dumps and incomplete splits, then restart
      the server and check that the content of the space was not corrupted.
      
      To make it possible, we need to (1) prevent the garbage collector from
      removing unused range files and (2) make the split procedure fail after
      successfully writing the first range. We use error injection to achieve
      that.
      
      The test runs as follows:
      
       1. Disable garbage collection with the aid of error injection.
      
       2. Add a number of tuples to the test space that would make it split.
          Rewrite them several times with different values so that different
          generations of ranges on disk would have different contents.
      
       3. Inject an error into the split procedure.
      
       4. Rewrite the tuples another couple of rounds. This should trigger
          split which is going to fail leaving invalid range files with newer
          ids on the disk.
      
       5. Restart the server and check that the test space content was not
          corrupted.
      b094b089
    • Vladimir Davydov's avatar
      vinyl: zap range index · c344dfff
      Vladimir Davydov authored
      Currently, we store all range ids in an .index file after each range
      tree modification. On recovery, we open the latest .index file, get the
      list of all ranges, and load them. This .index file introduces extra
      complexity to the compaction task: as we can get a consistent list of
      all range ids only in the tx thread, we must either write .index file
      from the tx thread (which we do now), or introduce a special task for
      it, which would be scheduled on compaction completion. The former way
      degrades performance of the tx thread, while the latter complicates the
      code.
      
      Actually, we can do range recovery without having to maintain .index
      files: as newer ranges always have greater ids, we can just recover
      ranges starting from the greatest id and disregard ranges that are
      already spanned by the index tree. For instance, suppose range A was
      split into ranges B and C. Then we recover ranges B and C first (they do
      not intersect, so everything's fine), then we get to A and see that it
      is already spanned (by B and C), so we just throw it away. If on split,
      B (or C) was not created for some reason, then A will not be fully
      spanned by the index, and we replace the range that was created, C (or
      B), with A, still getting a consistent index view.
      
      This patch implements the recovery process as per above and removes the
      .index file. Note, to avoid loading stale index data after drop-create,
      we have to name range files not only by id, but also by index lsn (just
      like the .index files). As before, old range file removal is postponed
      until checkpoint.
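
The recovery order described above could look roughly like the sketch below. It is written under stated assumptions and is not the patch itself: keys are plain ints, every range has finite bounds with begin < end, and `struct range_file`, `is_spanned` and `recover_ranges` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical on-disk range descriptor with int keys. */
struct range_file {
	int id;    /* newer ranges always have greater ids */
	int begin; /* inclusive lower bound */
	int end;   /* exclusive upper bound */
};

/* qsort comparator: greatest id first. */
static int
cmp_id_desc(const void *a, const void *b)
{
	const struct range_file *x = a, *y = b;
	return (x->id < y->id) - (x->id > y->id);
}

/* True if [r->begin, r->end) is fully covered by ranges in `tree`. */
static bool
is_spanned(const struct range_file *r,
	   const struct range_file *tree, size_t n)
{
	int cursor = r->begin;
	while (cursor < r->end) {
		bool advanced = false;
		for (size_t i = 0; i < n; i++) {
			if (tree[i].begin <= cursor && cursor < tree[i].end) {
				cursor = tree[i].end;
				advanced = true;
				break;
			}
		}
		if (!advanced)
			return false;
	}
	return true;
}

/*
 * Recover ranges starting from the greatest id. A range already
 * spanned by the tree is discarded. Otherwise it replaces every
 * newer range it overlaps. `tree` must have room for n_files items.
 * Returns the number of ranges left in `tree`.
 */
static size_t
recover_ranges(struct range_file *files, size_t n_files,
	       struct range_file *tree)
{
	size_t n = 0;
	qsort(files, n_files, sizeof(*files), cmp_id_desc);
	for (size_t i = 0; i < n_files; i++) {
		const struct range_file *r = &files[i];
		if (is_spanned(r, tree, n))
			continue;
		size_t kept = 0;
		for (size_t j = 0; j < n; j++) {
			bool overlaps = tree[j].begin < r->end &&
					r->begin < tree[j].end;
			if (!overlaps)
				tree[kept++] = tree[j];
		}
		n = kept;
		tree[n++] = *r;
	}
	return n;
}
```

In the A, B, C example, a complete split leaves B and C in the tree and drops A, while a split that produced only B ends with A alone.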
      c344dfff
    • Vladimir Davydov's avatar
      vinyl: store range lower and upper bounds on disk · 559102a7
      Vladimir Davydov authored
      Rename range->min_key to range->begin, as it actually denotes not the
      minimal key across all entries in the range, but the lower bound of the
      range, and introduce range->end for the upper bound of the range.
      For adjacent ranges left->end == right->begin. If a range is leftmost,
      then range->begin == NULL. If a range is rightmost, then range->end ==
      NULL. Store range->{begin,end} in range file on checkpoint and load them
      on recovery.
      
      This is required by the following patch to check that ranges do not
      intersect.
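
To illustrate the bounds convention, here is a small sketch assuming int keys in place of vinyl tuples. The struct and both helpers are hypothetical, and treating begin as inclusive and end as exclusive follows the [A, B) notation used elsewhere in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified range: int keys instead of vinyl tuples. */
struct range {
	const int *begin; /* lower bound, NULL if the range is leftmost */
	const int *end;   /* upper bound, NULL if the range is rightmost */
};

/* True if `key` falls into [begin, end), where NULL means unbounded. */
static bool
range_contains(const struct range *r, int key)
{
	if (r->begin != NULL && key < *r->begin)
		return false;
	if (r->end != NULL && key >= *r->end)
		return false;
	return true;
}

/* True if `right` starts exactly where `left` ends. */
static bool
ranges_are_adjacent(const struct range *left, const struct range *right)
{
	if (left->end == NULL || right->begin == NULL)
		return false;
	return *left->end == *right->begin;
}
```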
      559102a7
    • Vladimir Davydov's avatar
      vinyl: init range->path in vy_range_new() · ccea6f32
      Vladimir Davydov authored
      All we need to initialize range->path is range->id and index->path. Both
      are known at the time of range allocation and never change. So let's do
      range->path initialization right in vy_range_new() instead of postponing
      it until range recovery/write.
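
Since the path depends only on the index path and the range id, it can be built once at allocation time. The sketch below shows the idea. The helper name and the `%016llx.range` file name format are assumptions made for illustration, not the naming scheme vinyl uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a range file path from the index directory and the range id.
 * Hypothetical helper: the file name format is an assumption.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
range_format_path(char *buf, size_t size, const char *index_path,
		  unsigned long long id)
{
	int rc = snprintf(buf, size, "%s/%016llx.range", index_path, id);
	if (rc < 0 || (size_t)rc >= size)
		return -1;
	return 0;
}
```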
      ccea6f32
    • Vladimir Davydov's avatar
      vinyl: don't print path when reporting temp file creation failure · 6ab11491
      Vladimir Davydov authored
      The path is uninitialized in case of error injection.
      6ab11491
    • Vladimir Davydov's avatar
      072a739d
    • Vladislav Shpilevoy's avatar
      6c515231
    • Vladislav Shpilevoy's avatar
      Remove read_iterator_get · 45dc4887
      Vladislav Shpilevoy authored
      45dc4887
    • Vladislav Shpilevoy's avatar
      Add comments for write_iterator · 5dc19cdf
      Vladislav Shpilevoy authored
      5dc19cdf
    • Vladislav Shpilevoy's avatar
      616c039a
    • Vladislav Shpilevoy's avatar
      Update vy_write_iterator_next() and remove get() · 4bbb8570
      Vladislav Shpilevoy authored
      vy_write_iterator_next() is now used for getting the next
      tuple.
      vy_write_iterator->curr_tuple was removed.
      vy_write_iterator->keeping_tuple is used for keeping
      the tuple that is needed between two invocations of next() but
      must be deleted afterwards.
      Fixed the memory management in vy_write_iterator_next().
      Optimized purging in write_iterator.
      4bbb8570
    • Georgy Kirichenko's avatar
      Fix zstd decompression. Fixed #1789 · e90498d7
      Georgy Kirichenko authored
      e90498d7
    • Vladimir Davydov's avatar
      vinyl: zap scheduler->indexes array · acb35b0c
      Vladimir Davydov authored
      We have env->indexes list. No need to store all indexes in an array in
      addition to that. Note that this patch moves the rlist_add() call that
      adds a new index to the env->indexes list from vy_index_new() to
      vy_index_open(), so that the index only becomes visible to the scheduler
      after having been successfully loaded. This change does not make any
      difference apart from that.
      acb35b0c
    • Vladimir Davydov's avatar
      vinyl: zap index->ref_lock · 471a785f
      Vladimir Davydov authored
      All manipulations on index->refs, which ref_lock is supposed to protect,
      are done from the tx thread, so the lock is not needed.
      471a785f
    • Alexandr Lyapunov's avatar
      edfde593
  3. Sep 23, 2016
  4. Sep 22, 2016