Skip to content
Snippets Groups Projects
  1. Sep 26, 2017
    • Konstantin Osipov's avatar
      sequence: speed up allocation of sequence data tuple during checkpoint · c0a30d17
      Konstantin Osipov authored
      No need to allocate a tuple buffer for every sequence value.
      c0a30d17
    • Vladimir Davydov's avatar
      box: write up-to-date sequence values to snapshot · 775c02f7
      Vladimir Davydov authored
      Currently, _sequence_data is written to the snapshot the same way as any
      other memtx space - the snapshot iterator simply walks over the space
      tuples and dumps them one-by-one. This is OK now, because _sequence_data
      is always up-to-date. However, after the next patch, its contents may
      become stale - due to technical reasons we won't update _sequence_data
      if a sequence is used for auto increment in a space. To make sure we
      still snapshot sequence values properly, let's introduce a special
      iterator for this space that walks over the sequence cache instead of
      the space contents.
      
      Needed for #389
      775c02f7
    • Vladimir Davydov's avatar
      index: use snapshot_iterator for writing snapshot · 7675c28b
      Vladimir Davydov authored
      Currently, Index::createSnapshotIterator() returns a pointer to struct
      iterator, so the iterator implementation has to deal with struct tuple,
      which might not always be an option. Since the iterator is only used for
      writing a snapshot, struct tuple isn't really necessary, raw msgpack
      would be enough. So let's introduce snapshot_iterator which returns raw
      msgpack instead of struct tuple. It is required by the next patch to
      make a snapshot of _sequence_data space.
      
      Needed for #389
      7675c28b
    • Vladimir Davydov's avatar
      test: vinyl: check select consistency · 738c4a18
      Vladimir Davydov authored
      738c4a18
    • Vladimir Davydov's avatar
      vinyl: track DELETEs skipped by read iterator in conflict manager · bcb1bbec
      Vladimir Davydov authored
      Currently, read iterator adds only final statements (i.e. statements
      returned to the user) to the conflict manager while intermediate DELETEs
      are silently ignored. This can result to the following select()
      inconsistency.
      
      Suppose there are REPLACE{10}, DELETE{20} and REPLACE{30} (it does not
      matter which source owns which statement in this example) and select({})
      is called. The read iterator returns REPLACE{10} and is called again. It
      scans the sources, finds DELETE{20} and silently skips to the next
      statement. If its fiber yields now, and REPLACE{15} is inserted, the
      read iterator will ignore it, because its curr_stmt is {20} and so it
      will return {30}, i.e. the select() will return [{10}, {30}]. The same
      select() called from the same transaction afterwards will, however,
      return [{10}, {20}, {30}], which is inconsistent with the previous
      result. This happens, because interval [{10}, {20}] was not added to the
      conflict manager before the yield and so the transaction was not sent to
      read view.
      
      To avoid this, we must track not only final statements, but also
      intermediate DELETEs.
      
      There's a nuance to this patch regarding REQ case that needs to be noted
      explicitly. Since source iterators can't handle REQ, in contrast to EQ,
      we change vy_read_iterator->iterator_type from REQ to LE and set the
      need_check_eq flag when the iterator is opened. The need_check_eq flag
      makes read iterator check if the resulting statement equals the search
      key before returning it to the user. This check must be performed before
      vy_tx_track() is called, otherwise we might add a larger interval to the
      conflict manager than necessary, resulting in unnecessary conflicts
      (vinyl/tx_gap_lock test will surely catch that). So this patch moves
      call to vy_tx_track() along with the need_check_eq check immediately
      after vy_read_iterator_next_key() irrespective of the statement type.
      The problem is vy_cache_add() still has to be called on the final
      statement:
      
       * WAS:
         vy_cache_add(result)
         need_check_eq check; may set result to NULL
         vy_tx_track(result)
      
       * NOW:
         need_check_eq check; may set result to NULL
         vy_tx_track(result)
         vy_cache_add(result)
      
      Beside the result, vy_cache_add() is passed the iterator type so it
      should handle NULL properly and it sure does in case of EQ, but not in
      case of REQ, because REQ was changed to LE. That is, when vy_cache_add()
      is passed NULL (search end marker) in case of REQ, it thinks it saw the
      leftmost statement across all keys, not the leftmost statement equal to
      the search key.
      
      I decided that the best way to overcome this is to make read iterator
      handle REQ separately, just like EQ, and only replace it with LE right
      before passing it down to a source iterator. This gives vy_cache_add()
      enough info to handle the REQ case properly and makes the code look more
      symmetrical IMO.
      bcb1bbec
    • Vladimir Davydov's avatar
      vinyl: fix read source restore after statement removal · 330d53f0
      Vladimir Davydov authored
      Suppose, a source iterator was positioned at the minimal statement
      across all sources before yielding to read a disk source. Then it has
      front_id equal to the front_id of the read iterator, which indicates
      that the source iterator needs to be advanced on the next iteration.
      However, if the statement was removed during the yield, the iterator
      gets repositioned on the next statement by the restoration procedure
      (see vy_read_iterator_next_key()). In this case it isn't positioned on
      the minimal statement any more and hence must not be advanced on the
      next iteration, but its front_id remains the same, which means it will
      be advanced, possibly skipping a statement.
      
      The situation described above can happen only in case of the tuple
      cache, because other sources are append-only. Obviously, to avoid this
      kind of trouble, we must reset the front_id if a source iterator is
      restored to a statement that is greater than the minimal statement.
      330d53f0
    • Vladimir Davydov's avatar
      vinyl: fix double free in cache iterator restore · a2a82803
      Vladimir Davydov authored
      We drop a reference to itr->curr_stmt before restoring the iterator
      position so we must not drop it while iterating over the cache tree.
      This mistake results in funky crashes in various places, e.g. in
      vy_stmt_compare(). Shame on me: I missed that during self review.
      
      Fixes: 073c8d07 ("vinyl: rework cache iterator restore")
      a2a82803
    • Vladimir Davydov's avatar
      vinyl: fold vy_merge_iterator in vy_read_iterator · 4fed8854
      Vladimir Davydov authored
      Initially merge sort implementation was supposed to be shared between
      read and write iterators hence we got vy_merge_iterator. However, over
      time such a design decision proved to be inadequate, because it turned
      out that the two iterators can barely share a few lines - so different
      are the tasks they solve. The write iterator has already been rewritten
      without the use of the vy_merge_iterator so we can now fold the latter
      in vy_read_iterator. While we are at it, spruce up some comments a bit.
      4fed8854
    • Vladimir Davydov's avatar
      vinyl: rename vy_read_iterator->curr_stmt to last_stmt · 6fc49730
      Vladimir Davydov authored
      So as not to confuse it with vy_merge_iterator->curr_stmt.
      6fc49730
    • Vladimir Davydov's avatar
      vinyl: remove redundant merge iterator flags · 4af159ec
      Vladimir Davydov authored
       - search_started is not needed, (curr_stmt != NULL) can be used
         instead.
       - unique_optimization and one_value can be checked in-place.
      4af159ec
    • Vladimir Davydov's avatar
      vinyl: fold range iterator in read iterator · 6c8d9a56
      Vladimir Davydov authored
      Range iterator is trivial and only used by the read iterator, so it
      doesn't deserve a layer of abstraction. Fold it in.
      6c8d9a56
    • Vladimir Davydov's avatar
      vinyl: simplify iteration over ranges in read iterator · 9da5698e
      Vladimir Davydov authored
      Historically, we have this 'belong_range/range_ended' optimization in
      the read iterator: we move to the next range iff all sources that
      are marked as 'belong_range' are done. It is supposed to spare us a
      comparison with a range boundary on each iteration. However, with the
      introduction of the tuple cache and the unified memory level, it became
      more of a pain in the ass rather than a valuable asset. Thanks to it, we
      have to read all runs that belong to the current range before we can
      jump to the next range, even if all statements spanned by the current
      range are in the cache. Also, when all runs in a range have been read,
      we immediately jump to the next range and read its runs, even if the
      user intends to stop the iterator after reading a memory statement that
      is within the current range. That said, we effectively trade disk
      accesses for simple in-memory comparison operations, which is silly.
      
      Let's get rid of this "optimization" and make iteration over ranges
      straightforward: after the merge iterator returns a statement, compare
      it to the current range boundary and jump to the next range if it is
      outside.
      9da5698e
    • Vladislav Shpilevoy's avatar
      Introduce base64_encode options · cd4e26d4
      Vladislav Shpilevoy authored
      * no-wrap: do not wrap each 72 result symbols on a new line;
      * no-pad:  do not add padding at the end of a result;
      * urlsafe: no-wrap + no-pad + replace '+' -> '-' and '/' -> '_'.
      
      Closes #2777
      Closes #2478
      Closes #2479
      cd4e26d4
    • Alexander Turenko's avatar
      Add an option to allow using system ZStd · 664fc51e
      Alexander Turenko authored
      The option is '-DENABLE_BUNDLED_ZSTD' and defaults to ON.
      
      There are two goals of making this conditional, both related to building
      Tarantool on Gentoo Linux from an ebuild:
      
      * Avoid bundled ZStd building issue w/o pay to investivage it.
      * Allow user to choose between system and bundled library.
      664fc51e
  2. Sep 25, 2017
    • Vladimir Davydov's avatar
      salad: light: allow to store values of arbitrary size · 04835943
      Vladimir Davydov authored
      A value stored in a light hash must be convertible to int and have such
      a size that sizeof(LIGHT(record)) is a power of two. Let's remove these
      limitations.
      04835943
    • Vladimir Davydov's avatar
      salad: light: make C compatible · abc9d055
      Vladimir Davydov authored
       - Use 'static inline' instead of 'inline'
       - Move GROW_INCREMENT enum out of LIGHT(record) struct
      abc9d055
    • Vladimir Davydov's avatar
      vinyl: do not start reader threads if vinyl is unused · 57e8ff91
      Vladimir Davydov authored
      So as not to confuse the user, let's start reader threads when recovery
      completes only if there are Vinyl indexes. Otherwise, start them on the
      first Vinyl index creation.
      
      See #2664
      57e8ff91
    • Vladimir Davydov's avatar
      vinyl: do not start worker threads on checkpoint if vinyl is unused · b98a38ba
      Vladimir Davydov authored
      Vinyl starts its worker threads on checkpoint, even if there is no Vinyl
      spaces to dump out there. This is a mess. Let's wake up the scheduler
      only if we actually have something to dump. BTW, this patch removes that
      annoying 'vinyl checkpoint done' message in case nothing was dumped.
      
      See #2664
      b98a38ba
    • Vladimir Davydov's avatar
      vinyl: factor out scheduler checkpoint functions · 0dc8d031
      Vladimir Davydov authored
      Introduce the following functions:
      
        vy_scheduler_begin_checkpoint()
        vy_scheduler_wait_checkpoint()
        vy_scheduler_end_checkpoint()
      
      and make the Vinyl API methods use them instead of playing with the
      scheduler directly.
      
      Rationale:
       - It will help move the scheduler to a separate file in future.
       - We want to avoid waking the scheduler on checkpoint if Vinyl is not
         used, because the scheduler starts worker threads on the first
         wakeup. At the same time, we have to call vy_log_rotate() on
         checkpoint no matter if Vinyl is used or not (it needs to update its
         internal housekeeping). Factoring out the scheduler functions will
         allow to achieve this goal neatly.
      0dc8d031
    • Vladimir Davydov's avatar
      vinyl: use checkpoint_last() instead of vy_scheduler->last_checkpoint · d6d70836
      Vladimir Davydov authored
      There's no point in maintaining the LSN of the last checkpoint in Vinyl,
      we have checkpoint_last() for retrieving it. Make this helper a little
      bit more friendly by allowing to omit vclock argument and use it
      wherever appropriate.
      
      Here's why I need to do that now. I'm planning to omit calling scheduler
      checkpoint functions altogether if there's nothing to dump (so as not to
      start worker threads and print 'vinyl checkpoint done' message in case
      Vinyl is not used), but one of those functions updates last_checkpoint,
      which other pieces of code (gc primarily) relies on. I could factor
      last_checkpoint out of vy_scheduler, but why if I can simply use
      checkpoint_last() instead.
      d6d70836
    • Vladimir Davydov's avatar
      box: do not invoke garbage collection on checkpoint_count change · 884abc0b
      Vladimir Davydov authored
      Currently, box_set_checkpoint_count() not only changes the corresponding
      configuration option, but also invokes garbage collection. The problem
      is it is called in the beginning of box_cfg(), before recovering engine
      data. At that time engines are not ready to invoke garbage collection
      (e.g. Vinyl hasn't scanned vylog directory yet). Besides, the server
      might be started in the hot standby mode, in which case calling garbage
      collection is obviously wrong.
      
      To fix this, we could somehow detect if box_set_checkpoint_count() is
      called from box_cfg() and avoid invoking the garbage collector if so,
      but I think it isn't worth it. Instead let's just not invoke garbage
      collection when this option changes. It's not a big deal - old files
      will be removed by the next snapshot anyway.
      884abc0b
    • Vladimir Davydov's avatar
      vinyl: check existence of vinyl data directory on demand · 114f0403
      Vladimir Davydov authored
      It's inappropriate to fail startup if the vinyl data directory does not
      exist, because vinyl might not even be in use. Move the check to the
      time when a vinyl index is opened.
      
      See #2664
      114f0403
    • Vladimir Davydov's avatar
      vinyl: fix gc of incomplete runs left from dropped indexes · e8e7ed18
      Vladimir Davydov authored
      If an index is dropped while a dump/compaction task is in progress, the
      task will be aborted, but VY_LOG_DROP_RUN record won't be written to the
      metadata log so the run will remain incomplete (VY_LOG_PREPARE_RUN) and
      pin the dropped index in the log until the server is restarted and all
      incomplete runs are purged. This results in occasional failures of the
      vinyl/gc test which checks that vylog files are removed after all vinyl
      spaces are dropped. There's actually no reason not to log deletion of an
      incomplete run that belongs to a dropped index so let's do this to fix
      this problem.
      
      Closes #2486
      e8e7ed18
    • Vladimir Davydov's avatar
      memtx: undef BPS tree macros used to define memtx_tree · 622c4bcf
      Vladimir Davydov authored
      So that BPS tree can be redefined in files that include memtx_tree.h
      622c4bcf
  3. Sep 22, 2017
    • Roman Tsisyk's avatar
      tarantoolctl: fix "attempt to index local 'cfg'" · 873632f2
      Roman Tsisyk authored
      Fixes #2761
      873632f2
    • Roman Tsisyk's avatar
      curl: prefer version from homebrew on macOS · 947b378d
      Roman Tsisyk authored
      Apple's version of libcurl is outdated and buggy.
      Use libcurl from homebrew by default.
      
      Closes #2772
      947b378d
    • Eugene Leonovich's avatar
      doc: tweak extra/dist/example.lua · 563fcafc
      Eugene Leonovich authored
      queue.start() is not needed anymore.
      563fcafc
    • Roman Tsisyk's avatar
      vinyl: fix error injection in vy_cache_iterator_restore() · e80c3ecb
      Roman Tsisyk authored
      Follow up the previous commit.
      e80c3ecb
    • Vladimir Davydov's avatar
      vinyl: rework cache iterator restore · 073c8d07
      Vladimir Davydov authored
      The cache iterator restoration function is muddy and suboptimal:
      
       - If last_stmt equals NULL, it doesn't attempt to change the iterator
         position although it should restart the iterator. Moreover, it
         doesn't even check EQ condition in this case, i.e. it may return a
         statement that doesn't satisfy the search criteria.
      
       - It may restore iterator position to last_stmt, but the restoration
         semantics mandates that the returned statement must be strictly
         greater than last_stmt if the iterator was restored.
      
       - If the iterator was invalidated, it first positions the iterator to
         the last known position (curr_stmt) and then steps back to the
         statement closest to last_stmt although it could jump to last_stmt
         immediately.
      
      Let's rewrite it.
      073c8d07
    • Vladimir Davydov's avatar
      vinyl: clean up upsert squashing procedure of read iterator · 98ac5be8
      Vladimir Davydov authored
      vy_merge_iterator_squash_upserts() returns an UPSERT, which is then
      converted to a REPLACE by vy_read_iterator_next(). Fold the latter into
      the former to make the workflow easier to follow. Also, remove useless
      suppress_error argument.
      98ac5be8
    • Vladimir Davydov's avatar
      vinyl: do not handle restore in next_key/next_lsn callbacks · 6fb87432
      Vladimir Davydov authored
      Not only restore() callback can restore the iterator position after the
      source changed - next_key() and next_lsn() can do that too, which brings
      additional complexity. Do we really need it? Not really:
      
       - We can avoid handling restoration in next_key() by simply calling
         restore() before it. This is cheap as restore() is a no-op if the
         source was not changed.
      
       - Regarding next_lsn() - when it comes to calling it, we iterate over
         sources from the most recent one to the oldest, which means that the
         only sources out there that may yield, on-disk runs, are guaranteed
         to be iterated last, after all mutable sources (txw, cache, mems), so
         none of the mutable sources needs to be restored in this case.
      
      Simplify the read iterator as per above.
      6fb87432
    • Vladimir Davydov's avatar
      vinyl: make sure next_lsn callback is only called after next_key · e5229867
      Vladimir Davydov authored
      Currently, ->next_lsn() may be (and actually is) called right after
      opening the iterator, which means that apart from advancing the iterator
      to the next version of the same statement it has to handle the case of
      starting iteration. This complicates the callback semantics so let's
      require ->next_key() to be called first.
      e5229867
  4. Sep 21, 2017
  5. Sep 20, 2017
  6. Sep 19, 2017
    • Vladimir Davydov's avatar
      box: implement persistent sequences · f797eec4
      Vladimir Davydov authored
      This patch implements a new object type, persistent sequences. Sequences
      are created with function box.schema.sequence.create(name, options).
      Options include min/max, start value, increment, cache size, just like
      in Postgresql, although 'cache' is ignored for now. All sequences can be
      accessed via box.sequence.<name>, similarly to spaces. To generate a
      sequence value, use seq:next() method. To retrieve the last generated
      value, use seq:get(). A sequence value can also be reset to the start
      value or to any other value using seq:reset() and seq:set() methods.
      
      Needed for #389
      f797eec4
    • Vladimir Davydov's avatar
      box: use trigger to push space data to Lua · 1f736583
      Vladimir Davydov authored
      Currently, it is done by space_cache_replace/delete which violates
      incapsulation. Let's introduce a trigger that is fired after a change in
      a space definition is committed and use it to propagate changes to Lua.
      Patch by @kostja.
      1f736583
    • lenkis's avatar
      Update man file for tarantool · fca3d686
      lenkis authored
      fca3d686
    • Vladimir Davydov's avatar
      box: add boolean field type · 702bcac2
      Vladimir Davydov authored
      702bcac2
  7. Sep 18, 2017
Loading