  1. May 02, 2017
    • vinyl: don't recover the same run for each its slice · 10a739b5
      Vladimir Davydov authored
      Currently, on recovery we create and load a new vy_run for each slice,
      so if there's more than one slice created for a run, we will have the
      same run duplicated in memory. To avoid that, maintain the hash of all
      runs loaded during recovery of the current index, and look up the run
      there when a slice is created instead of creating a new run.
      
      Note, we don't need to do anything like this on initial join, as we
      delete the run right after sending it to the replica, so we can just
      create a new run each time we make a slice.
      10a739b5
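The look-up-before-create idea from this commit can be sketched as follows. This is a minimal illustration, not the vinyl code: `sketch_run`, `run_table`, and `run_table_get_or_create()` are hypothetical names, and a small fixed-size open-addressing table stands in for whatever hash the real implementation uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a run loaded from disk. */
struct sketch_run {
	int64_t id;
};

#define RUN_TABLE_SIZE 64 /* power of two; fixed size keeps the sketch short */

/* Hypothetical per-index table of runs loaded so far during recovery. */
struct run_table {
	struct sketch_run *slots[RUN_TABLE_SIZE];
	int count;
};

/*
 * Return the run with the given ID, allocating it only on first use.
 * Returns NULL on allocation failure or if the table is full.
 */
static struct sketch_run *
run_table_get_or_create(struct run_table *t, int64_t run_id)
{
	uint64_t pos = (uint64_t)run_id & (RUN_TABLE_SIZE - 1);
	for (int probe = 0; probe < RUN_TABLE_SIZE; probe++) {
		struct sketch_run *run = t->slots[pos];
		if (run == NULL) {
			/* Not seen yet: this is where a real run would be loaded. */
			run = calloc(1, sizeof(*run));
			if (run == NULL)
				return NULL;
			run->id = run_id;
			t->slots[pos] = run;
			t->count++;
			return run;
		}
		if (run->id == run_id)
			return run; /* already loaded for an earlier slice */
		pos = (pos + 1) & (RUN_TABLE_SIZE - 1); /* linear probing */
	}
	return NULL;
}
```

With such a table, two slices that name the same run ID end up pointing at one in-memory run object instead of two copies.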
    • vinyl: store run slices in metadata log · f18dbce6
      Vladimir Davydov authored
      In order to recover run slices, we need to store info about them in the
      metadata log, so this patch introduces two new records:
       - VY_LOG_INSERT_SLICE: takes IDs of the slice, the range to insert the
         slice into, and the run the slice is for. Also, it takes the slice
         boundaries as after coalescing two ranges a slice inserted into the
         resulting range may be narrower than the range.
       - VY_LOG_DELETE_SLICE: takes ID of the slice to delete.
      
      Also, it renames VY_LOG_INSERT_RUN and VY_LOG_DELETE_RUN to
      VY_LOG_CREATE_RUN and VY_LOG_DROP_RUN.
      
      Note, we no longer need to keep deleted ranges (and slices) in the log
      until garbage collection wipes them away, because they are not needed
      by deleted run records, which are what garbage collection targets.
      f18dbce6
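To make the two slice records concrete, here is a rough sketch of how replaying them could rebuild the set of live slices. The record layout, the `SKETCH_*` names, and the integer keys are assumptions made for illustration; the actual vy_log encoding is not shown in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the two record kinds named above; numeric values are made up. */
enum sketch_log_type {
	SKETCH_LOG_INSERT_SLICE,
	SKETCH_LOG_DELETE_SLICE,
};

/* Hypothetical record layout; integer keys stand in for tuple boundaries. */
struct sketch_log_record {
	enum sketch_log_type type;
	int64_t slice_id;
	int64_t range_id; /* INSERT_SLICE only */
	int64_t run_id;   /* INSERT_SLICE only */
	const int *begin; /* INSERT_SLICE only; NULL means unbounded */
	const int *end;   /* INSERT_SLICE only; NULL means unbounded */
};

#define MAX_SLICES 16

/* Live slices reconstructed so far. */
struct sketch_recovery {
	struct sketch_log_record slices[MAX_SLICES];
	size_t count;
};

/* Replay one record. Returns 0 on success, -1 if the log is inconsistent. */
static int
sketch_recovery_apply(struct sketch_recovery *r,
		      const struct sketch_log_record *rec)
{
	size_t i;
	for (i = 0; i < r->count; i++) {
		if (r->slices[i].slice_id == rec->slice_id)
			break;
	}
	switch (rec->type) {
	case SKETCH_LOG_INSERT_SLICE:
		if (i < r->count || r->count == MAX_SLICES)
			return -1; /* duplicate slice ID or table full */
		r->slices[r->count++] = *rec;
		return 0;
	case SKETCH_LOG_DELETE_SLICE:
		if (i == r->count)
			return -1; /* deleting an unknown slice */
		r->slices[i] = r->slices[--r->count]; /* swap-remove */
		return 0;
	}
	return -1;
}
```

The insert record carries its own boundaries because, as the message notes, a slice may be narrower than the range it is inserted into.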
    • vinyl: rename range_{begin,end} keys to {begin,end} in vy_log · b80a2cf8
      Vladimir Davydov authored
      The same keys will be used to specify slice boundaries, so let's call
      them in a neutral way. No functional changes.
      b80a2cf8
    • vinyl: count number of slices per run · d716f54f
      Vladimir Davydov authored
      Currently, there can't be more than one slice per run, but this will
      change once the single memory level is introduced. Then we will have to
      count the number of slices per each run so as not to unaccount the same
      run more than once on each slice deletion. Unfortunately, we can't use
      vy_run->refs to count the number of slices created per each run,
      because, although vy_run->refs is only incremented per each slice
      allocated for the run, this includes slices that were removed from
      ranges and stay allocated only because of being pinned by open
      iterators. So we add one more counter to vy_run, slice_count, and
      introduce new helpers to be used for slice creation/destruction,
      vy_run_make_slice() and vy_run_destroy_slice(), which inc/dec the
      counter.
      d716f54f
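The difference between the two counters can be illustrated with a small model. Everything here is a sketch under assumptions: the `sketch_*` types and functions are hypothetical, and a pin count stands in for slices held by open iterators. Only the split between `refs` and `slice_count` follows the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct sketch_run {
	int refs;        /* slices still allocated, including pinned ones */
	int slice_count; /* slices currently inserted into ranges */
};

struct sketch_slice {
	struct sketch_run *run;
	int pin_count;   /* hypothetical: open iterators reading the slice */
	bool in_range;
};

/* Create a slice for the run and count it in both counters. */
static struct sketch_slice *
sketch_run_make_slice(struct sketch_run *run)
{
	struct sketch_slice *slice = calloc(1, sizeof(*slice));
	if (slice == NULL)
		return NULL;
	slice->run = run;
	slice->in_range = true;
	run->refs++;
	run->slice_count++;
	return slice;
}

/* Free the slice once it is both out of its range and unpinned. */
static void
sketch_slice_try_free(struct sketch_slice *slice)
{
	if (slice->in_range || slice->pin_count > 0)
		return;
	slice->run->refs--;
	free(slice);
}

static void
sketch_slice_pin(struct sketch_slice *slice)
{
	slice->pin_count++;
}

static void
sketch_slice_unpin(struct sketch_slice *slice)
{
	assert(slice->pin_count > 0);
	slice->pin_count--;
	sketch_slice_try_free(slice);
}

/*
 * Remove the slice from its range. Returns true only when this was the
 * run's last slice, i.e. the one moment the run should be unaccounted.
 */
static bool
sketch_run_destroy_slice(struct sketch_slice *slice)
{
	struct sketch_run *run = slice->run;
	assert(slice->in_range);
	slice->in_range = false;
	run->slice_count--;
	sketch_slice_try_free(slice);
	return run->slice_count == 0;
}
```

In this model a pinned slice keeps `refs` above zero after it has left its range, which is why `refs` alone cannot tell when the run stops being used by ranges.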
    • vinyl: make check for empty range on split more thorough · 09d56944
      Vladimir Davydov authored
      There's a sanity check in vy_range_needs_split() that ensures the
      resulting ranges are not going to be empty: it checks the split key
      against the oldest run's min key. The check is not enough for the slice
      concept, because even if the split key is > min key, it still can be <
      the beginning of the slice.
      09d56944
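One way to picture the stricter check is to compare the split key against the later of the run's min key and the slice's begin. This is only a sketch of that idea with integer keys and hypothetical helper names; the commit message does not show how vy_range_needs_split() actually implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Lower bound on the keys visible through a slice: the later of the run's
 * min key and the slice's begin (a NULL begin means the slice is unbounded).
 */
static int
sketch_slice_lower_bound(int run_min_key, const int *slice_begin)
{
	if (slice_begin != NULL && *slice_begin > run_min_key)
		return *slice_begin;
	return run_min_key;
}

/* True if something can remain to the left of split_key after the split. */
static bool
sketch_split_key_is_sane(int split_key, int run_min_key,
			 const int *slice_begin)
{
	return split_key > sketch_slice_lower_bound(run_min_key, slice_begin);
}
```

For a run whose min key is 10 and a slice that begins at 50, a split key of 30 passes the old check (30 > 10) but fails this one, since no statement of the slice would fall to the left of it.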
    • vinyl: add slice size estimate · 3740ff9d
      Vladimir Davydov authored
      We use run->info.keys to estimate the size of a new run's bloom filter.
      We use run->info.size to trigger range split/coalescing. If a range
      contains a slice that spans only a part of a run, we can't use run->info
      stats, so this patch introduces the following slice stats: number of
      keys (for the bloom filter) and the size on disk (for split/coalesce).
      These two counters are not accurate; they are only estimates, because
      calculating exact numbers would require disk reads. Instead we simply
      take the corresponding run's stat and multiply it by
      
          slice page count / run page count
      3740ff9d
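The scaling rule above fits in a few lines of C. The function name and parameter types are assumptions for the sketch; only the formula (run stat times slice page count over run page count) comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scale a whole-run statistic (key count or size on disk) by the fraction
 * of the run's pages that the slice covers. The result is an estimate.
 */
static uint64_t
sketch_slice_estimate(uint64_t run_stat, uint32_t slice_page_count,
		      uint32_t run_page_count)
{
	if (run_page_count == 0)
		return 0; /* empty run: nothing to scale */
	assert(slice_page_count <= run_page_count);
	/* Floating point avoids overflow of run_stat * slice_page_count. */
	return (uint64_t)((double)run_stat * slice_page_count /
			  run_page_count);
}
```

For example, a slice covering 25 of a run's 100 pages would be credited with about a quarter of the run's keys and bytes, without reading anything from disk.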
    • vinyl: separate accounting of ranges and runs · fccaa3f1
      Vladimir Davydov authored
      There will be more than one slice per run, i.e. the same run will be
      used jointly by multiple ranges. To make sure that a run isn't accounted
      twice, separate run accounting from range accounting.
      fccaa3f1
    • vinyl: teach run iterator to respect slice boundaries · c0bb544d
      Vladimir Davydov authored
      Make sure that we start iteration within the given slice and end it as
      soon as the current position leaves the slice boundaries. Note, the
      overhead caused by extra comparisons is only incurred if the slice has
      non-NULL boundaries, which is only the case if the run is shared among
      ranges.
      c0bb544d
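A simplified model of boundary-aware iteration is shown below. It assumes a run is a sorted array of integer keys, that the slice is the half-open interval [begin, end), and that iteration is forward only; the `sketch_slice_iterator` type is hypothetical. The point it illustrates is the one from the message: boundary comparisons happen only when a boundary is non-NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_slice_iterator {
	const int *keys;  /* sorted keys of the run */
	size_t count;
	size_t pos;
	const int *begin; /* NULL: no lower boundary */
	const int *end;   /* NULL: no upper boundary */
};

/* Position the iterator at the first key inside the slice. */
static void
sketch_slice_iterator_open(struct sketch_slice_iterator *it,
			   const int *keys, size_t count,
			   const int *begin, const int *end)
{
	it->keys = keys;
	it->count = count;
	it->begin = begin;
	it->end = end;
	it->pos = 0;
	if (begin != NULL) {
		/* Binary search for the first key >= *begin. */
		size_t lo = 0, hi = count;
		while (lo < hi) {
			size_t mid = lo + (hi - lo) / 2;
			if (keys[mid] < *begin)
				lo = mid + 1;
			else
				hi = mid;
		}
		it->pos = lo;
	}
}

/* Fetch the next key; stop as soon as the position leaves the slice. */
static bool
sketch_slice_iterator_next(struct sketch_slice_iterator *it, int *key)
{
	if (it->pos >= it->count)
		return false;
	/* The extra comparison is paid only for a bounded slice. */
	if (it->end != NULL && it->keys[it->pos] >= *it->end) {
		it->pos = it->count;
		return false;
	}
	*key = it->keys[it->pos++];
	return true;
}
```

With both boundaries NULL the iterator walks the whole run exactly as it would without slices.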
    • vinyl: introduce the concept of run slices · b3078fec
      Vladimir Davydov authored
      When we finally move to a single memory tree per index (currently we
      maintain one per range), a dump will result in the creation of a single run.
      To add such runs to ranges and be able to iterate over statements that
      are within a particular range, we introduce a new concept, a run slice.
      A run slice is a simple object that references a run and contains begin
      and end keys inherited from the range it belongs to. Runs are no longer
      referenced directly by ranges; instead we use slices as an intermediary.
      For now the concept is oversimplified: there may only be one slice per
      run, but the following patches will remove this limitation.
      b3078fec
    • vinyl: use tuple instead of raw msgpack for range begin and end · d285d0ce
      Vladimir Davydov authored
      Maintaining range->{begin,end} as tuples is useful for the concept of
      run slices, which is introduced by the following patches. A slice may
      inherit its begin and end from a range, so basically we have two
      alternatives: either copy keys or take references. The latter seems to
      be more straightforward.
      d285d0ce
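The "take references" alternative can be sketched with a reference-counted key that a range and its slice share. `sketch_key` is a hypothetical stand-in for a tuple, and the range and slice structs are reduced to their boundaries; none of this is the actual vinyl layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted key, standing in for a tuple. */
struct sketch_key {
	int refs;
	size_t size;
	char data[];
};

static struct sketch_key *
sketch_key_new(const char *data, size_t size)
{
	struct sketch_key *key = malloc(sizeof(*key) + size);
	if (key == NULL)
		return NULL;
	key->refs = 1;
	key->size = size;
	memcpy(key->data, data, size);
	return key;
}

/* Take a reference; NULL (an unbounded side) is passed through. */
static struct sketch_key *
sketch_key_ref(struct sketch_key *key)
{
	if (key != NULL)
		key->refs++;
	return key;
}

static void
sketch_key_unref(struct sketch_key *key)
{
	if (key != NULL && --key->refs == 0)
		free(key);
}

struct sketch_range { struct sketch_key *begin, *end; };
struct sketch_slice { struct sketch_key *begin, *end; };

/* A slice inherits the range boundaries by reference, without copying. */
static void
sketch_slice_inherit(struct sketch_slice *slice,
		     const struct sketch_range *range)
{
	slice->begin = sketch_key_ref(range->begin);
	slice->end = sketch_key_ref(range->end);
}

static void
sketch_slice_release(struct sketch_slice *slice)
{
	sketch_key_unref(slice->begin);
	sketch_key_unref(slice->end);
	slice->begin = slice->end = NULL;
}
```

Sharing by reference means a slice can outlive its range without the key data ever being duplicated.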
    • Rework vinyl/recover test · 0e00b714
      Vladimir Davydov authored
      The vinyl/recover test was written a long time ago. Back then recovery was
      a convoluted procedure based on scanning the data directory, so it
      was crucial to generate a number of stale files to check its validity.
      Nowadays, there's no point in it thanks to the metadata log. Moreover,
      when the single memory level is introduced, ERRINJ_VY_RANGE_SPLIT won't
      make sense any more as range splitting will not involve a worker thread.
      So we can remove errinj from this test.
      
      Another problem with this test is that it doesn't take into account data
      compression (all tuples generated by it are compressed perfectly).
      Handle this by generating random padding strings.
      
      Also, when we snapshot vinyl stats before restart, there still may be
      compaction in progress that will modify the stats at the last moment,
      resulting in a sporadic failure. Address that by checking stats
      separately, for another space with compaction disabled.
      0e00b714
    • Revert "vinyl: introduce vy_range.is_level_zero flag" · 7eed9ecd
      Vladimir Davydov authored
      This reverts commit f3764063.
      
      With the recently proposed concept of run slices, this commit is not
      needed to implement the single memory level any more.
      
      Conflicts:
      	src/box/vinyl.c
      	src/box/vy_log.c
      	src/box/vy_log.h
      7eed9ecd
    • Revert "vinyl: split vy_range_get_write_iterator into two functions" · 748a6a13
      Vladimir Davydov authored
      This reverts commit f3ecce75.
      
      With the recently proposed concept of run slices, this commit is not
      needed to implement the single memory level any more.
      
      Conflicts:
      	src/box/vinyl.c
      748a6a13
    • Revert "vinyl: reset max_dump_size in compact_new()" · 8805441c
      Vladimir Davydov authored
      This reverts commit 55685eaf.
      
      With the recently proposed concept of run slices, this commit is not
      needed to implement the single memory level any more.
      
      Conflicts:
      	src/box/vinyl.c
      8805441c
    • vinyl: tx_serial.test: do not append very long strings in lua. · 0e77440b
      Alexandr Lyapunov authored
      Patch fb9c8b32 introduced generation of reproducer code and dumping
      it to the log. The problem is that the code is first built as one
      big Lua string by repeated concatenation in a loop, which performs
      poorly on long strings.
      Avoid repeated concatenation of Lua strings in tx_serial.test.
      0e77440b
    • Revert "vinyl: make coalesce as separate task" · 72d81059
      Vladimir Davydov authored
      This reverts commit 44367f5c.
      
      With the recently proposed concept of run slices, this commit is not
      needed to implement the single memory level any more.
      
      Conflicts:
      	src/box/vinyl.c
      72d81059
    • Revert "vinyl: restict the run_iterator return statements" · add6cdc2
      Vladimir Davydov authored
      This reverts commit a839af29.
      
      With the recently proposed concept of run slices, this commit is not
      needed to implement the single memory level any more.
      add6cdc2
    • Revert "vinyl: implement deferred restore of runs and mem" · a1b3c5fd
      Vladimir Davydov authored
      This reverts commit 1360f193.
      
      The idea behind this patch makes sense for memory iterator, but the
      implementation is incorrect: it's not enough to patch ->restore, because
      memory iterator can start right from ->next_key, bypassing ->restore.
      Since other hunks of this patch are not needed in the scope of the run
      slice paradigm, and what is done by this patch needs to be rewritten
      from scratch anyway, revert the whole patch.
      a1b3c5fd
    • vinyl: fix txw iterator for empty key · 3ea07c3b
      Vladimir Davydov authored
      Fixes commit 976b31cb ("vinyl: fix order error in txv_iterator_start").
      
      If search key is empty, txw iterator starts from the first or last entry
      in the write set depending on the iterator direction. This is incorrect,
      because the write set is grouped by index so if the first/last entry
      happens to be for another index, txw iterator will stop immediately even
      if there are statements for the given index in the write set. Instead we
      must take the first/last statement for the given index. We can use
      psearch/nsearch for it - it will position to the first/last element in
      the tree equal to the search key; since the search key is equal to any
      statement of the given index in case the given key is empty, it will do
      the job.
      
      While we are at it, also remove handling 'key == NULL' case from the
      write_set_key_cmp() as it is not used anywhere.
      3ea07c3b
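The positioning trick for an empty key can be modeled on a sorted array. In this sketch the write set is an array ordered by (index_id, key), a part count of 0 denotes the empty search key, and `sketch_first_equal()` / `sketch_last_equal()` are hypothetical array-based stand-ins for the tree searches mentioned above. Because an empty key compares equal to every statement of its index, the first and last "equal" entries are exactly the first and last statements of that index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical write set entry; the set is sorted by (index_id, key). */
struct sketch_txv {
	int index_id;
	int key;
};

/* part_count == 0 means an empty search key: only the index is compared. */
static int
sketch_txv_cmp(const struct sketch_txv *v, int index_id,
	       int part_count, int key)
{
	if (v->index_id != index_id)
		return v->index_id < index_id ? -1 : 1;
	if (part_count == 0)
		return 0; /* empty key matches any statement of the index */
	return (v->key > key) - (v->key < key);
}

/* Position of the first entry equal to the search key, or -1. */
static ptrdiff_t
sketch_first_equal(const struct sketch_txv *set, size_t count,
		   int index_id, int part_count, int key)
{
	size_t lo = 0, hi = count;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (sketch_txv_cmp(&set[mid], index_id, part_count, key) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo < count &&
	    sketch_txv_cmp(&set[lo], index_id, part_count, key) == 0)
		return (ptrdiff_t)lo;
	return -1;
}

/* Position of the last entry equal to the search key, or -1. */
static ptrdiff_t
sketch_last_equal(const struct sketch_txv *set, size_t count,
		  int index_id, int part_count, int key)
{
	size_t lo = 0, hi = count;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (sketch_txv_cmp(&set[mid], index_id, part_count, key) <= 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo > 0 &&
	    sketch_txv_cmp(&set[lo - 1], index_id, part_count, key) == 0)
		return (ptrdiff_t)(lo - 1);
	return -1;
}
```

Starting from the very first or last entry of the whole set, as the old code did, would land on another index whenever the given index is not the first or last group.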
  2. Apr 27, 2017
    • gh-2045: Update snapshot timestamp if snapshot already exists. · c10874f4
      Georgy Kirichenko authored
      Update snapshot timestamp if a snapshot already exists.
      Do not produce error if the snapshot already exists.
      Fixes gh-2045.
      c10874f4
    • vinyl: remove a redundant copy-paste · fe94c66f
      Konstantin Osipov authored
      Use one function to initialize a global read view, pass read
      view lsn in.
      fe94c66f
    • vinyl: add a special read view for upsert squash process. · 3dfd6a05
      Alexandr Lyapunov authored
      The upsert squash process must not touch or take into consideration
      non-committed statements, because they are owned by the TX manager and
      thus might be changed or removed at its will.
      Add a committed read view to the TX manager (in addition to the global
      read view) that makes only committed statements visible.
      Use the read view in the squash process.
      Move upsert count calculation, simple inplace upsert
      squash and upsert process invocation from the preparation stage to
      the commit stage of TX; without this change, after a squash process
      prepared statements will not get a chance to use simple inplace
      upsert squash.
      
      fix gh-2382.
      3dfd6a05
    • vinyl: fix squashing of invalid upserts. · 9c5b3515
      Alexandr Lyapunov authored
      For historical reasons, vy_upsert hides some fatal errors
      in the upsert statement and returns the original base statement
      with its original, lower lsn. In that case the squash process replaced
      the wrong statement with the wrong lsn, doing useless work. Fix it.
      9c5b3515
    • vinyl: fix a minor issue in upsert squash process. · 3a41148d
      Alexandr Lyapunov authored
      There might be a case when a non-upsert statement is inserted just
      after a squash process for the same key has started. Before this patch
      the squash process copied that non-upsert statement and reinserted
      it into the index for no good reason.
      Make the squash process detect this case and quit immediately.
      3a41148d
    • vinyl: remove box.info.vinyl().metric and .global · 11c729f6
      Roman Tsisyk authored
      We are not ready to freeze `box.info().vinyl()` and `index:info()` output
      right now. Let's keep this API for internal use only. Please note that
      the output might change in the future.
      
      See #1662
      11c729f6
    • Refactor box.info output · 56462bca
      Roman Tsisyk authored
      * Rename box.info.server.id to box.info.id
      * Rename box.info.server.uuid to box.info.uuid
      * Rename box.info.server.lsn to box.info.lsn
      * Rename box.info.cluster.signature to box.info.signature
      * Drop box.info.server section
      * Return `nil` instead of `0` for box.info.id during bootstrap
      * Return `nil` instead of `-1` for box.info.lsn during bootstrap
      
      Sample output:
      
      ```
      tarantool> box.info
      ---
      - version: 1.7.3-538-g7ee75dee4
        id: 1
        ro: false
        vclock: {}
        uptime: 3
        lsn: 0
        vinyl: []
        pid: 20714
        status: running
        uuid: e6d913b9-a8b8-4873-ba94-14cf4357fec6
        signature: 0
        replication:
          1:
            id: 1
            uuid: e6d913b9-a8b8-4873-ba94-14cf4357fec6
            lsn: 0
        cluster:
          uuid: 3cfa8749-0fba-4b13-b8eb-db90f79d1485
      ```
      
      Closes #723
      56462bca
    • Update test-run · 4f30f959
      Roman Tsisyk authored
      Prepare for box.info.server removal.
      
      See #723
      4f30f959
  3. Apr 25, 2017
  4. Apr 24, 2017
  5. Apr 21, 2017