  1. Jun 16, 2017
    • Vladimir Davydov's avatar
      vinyl: do not call vy_scheduler_complete_dump on index deletion · 824ceb32
      Vladimir Davydov authored
      Currently, vy_scheduler_remove_mem() calls vy_scheduler_complete_dump()
      if vy_scheduler_dump_in_progress() returns false, but the latter doesn't
      necessarily mean that the dump has just been completed. The point is
      that vy_scheduler_remove_mem() is called not only for a memory tree that
      has just been dumped to disk, but also for all memory trees of a dropped
      index, i.e. dropping an index when there's no dump in progress results
      in vy_scheduler_complete_dump() invocation. This does no harm
      now, but looks ugly. Besides, I'm planning to account for dump
      bandwidth in vy_scheduler_complete_dump(), which must only be
      done on actual dump completion.
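
      Below is a minimal sketch of the intended behavior. The struct
      fields and the is_dumped flag are assumptions for illustration,
      not the actual vinyl data structures.

        #include <stdbool.h>

        /* Simplified stand-ins for the real vinyl structures. */
        struct vy_mem { bool is_dumped; };
        struct vy_scheduler { int dump_task_count; };

        static bool
        vy_scheduler_dump_in_progress(struct vy_scheduler *s)
        {
                return s->dump_task_count > 0;
        }

        static void
        vy_scheduler_complete_dump(struct vy_scheduler *s)
        {
                (void)s; /* account dump bandwidth, bump generation, ... */
        }

        /*
         * Complete the dump only for a tree that was actually dumped,
         * not for trees discarded because their index was dropped.
         */
        static void
        vy_scheduler_remove_mem(struct vy_scheduler *s, struct vy_mem *mem)
        {
                if (mem->is_dumped && !vy_scheduler_dump_in_progress(s))
                        vy_scheduler_complete_dump(s);
        }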
      824ceb32
    • Vladimir Davydov's avatar
      vinyl: factor out functions for memory dump start and completion · 9719b28a
      Vladimir Davydov authored
      Following patches will add more logic to them, so it's better to factor
      them out now to keep the code clean. No functional changes.
      9719b28a
    • Vladimir Davydov's avatar
      vinyl: fix crash if snapshot is called while dump is in progress · dbfd515f
      Vladimir Davydov authored
      Currently, to force dumping all in-memory trees, box.snapshot()
      increments scheduler->generation directly. Suppose a dump is in
      progress and there's a space with more than one index whose
      secondary indexes have all been dumped by the time box.snapshot()
      is called, while its primary index is still being dumped. Then
      incrementing the generation will force the scheduler to start
      dumping the secondary indexes of this space again (provided, of
      course, the space has fresh data). Creating a dump task for a
      secondary index will attempt to pin the primary index - see
      vy_task_dump_new() => vy_scheduler_pin_index() - which will
      crash, because the primary index is being dumped and hence can't
      be removed from the scheduler by vy_scheduler_pin_index():
      
        Segmentation fault
        #0  0x40c3a4 in sig_fatal_cb(int)+214
        #1  0x7f6ac7981890 in ?
        #2  0x4610bd in vy_scheduler_remove_index+46
        #3  0x4610fe in vy_scheduler_pin_index+49
        #4  0x45f93e in vy_task_dump_new+1478
        #5  0x46137e in vy_scheduler_peek_dump+282
        #6  0x461467 in vy_schedule+47
        #7  0x461bf8 in vy_scheduler_f+1143
      
      To fix that, let's trigger dump (by bumping the generation) only
      from the scheduler fiber, in vy_scheduler_peek_dump(). The
      checkpoint will force the scheduler to schedule a dump by setting
      the checkpoint_in_progress flag and checkpoint_generation.
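
      A hedged sketch of the idea follows; the field and helper names
      are illustrative, not the actual source.

        #include <stdbool.h>
        #include <stdint.h>

        /* Simplified scheduler state (names are assumptions). */
        struct vy_scheduler {
                int64_t generation;            /* current dump generation */
                int64_t checkpoint_generation; /* requested by box.snapshot() */
                bool checkpoint_in_progress;
                bool dump_in_progress;
        };

        /* box.snapshot() only records its intent... */
        static void
        vy_begin_checkpoint(struct vy_scheduler *s)
        {
                s->checkpoint_in_progress = true;
                s->checkpoint_generation = s->generation + 1;
        }

        /*
         * ...and the scheduler fiber bumps the generation itself, so a
         * new dump round is never started behind the back of a dump
         * that is still in progress.
         */
        static void
        vy_scheduler_peek_dump(struct vy_scheduler *s)
        {
                if (s->checkpoint_in_progress && !s->dump_in_progress &&
                    s->generation < s->checkpoint_generation)
                        s->generation = s->checkpoint_generation;
        }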
      
      Closes #2508
      dbfd515f
    • Konstantin Osipov's avatar
      40d86fe6
    • Vladimir Davydov's avatar
      alter: init space truncate_count after recovering snapshot · 99d6a4f4
      Vladimir Davydov authored
      The replace trigger of the _truncate system space
      (on_replace_dd_truncate) does nothing on insertion into or
      deletion from the space - it only updates space truncate_count
      when a tuple gets updated. As a result, space truncate_count
      isn't initialized properly after recovering a snapshot. This does
      no harm to memtx, because it doesn't use space truncate_count at
      all, but it breaks the assumption made by vinyl that if space
      truncate_count is less than index truncate_count (which is loaded
      from vylog), the space will be truncated during WAL recovery and
      hence there's no point in applying statements to the space (see
      vy_is_committed_one). As a result, all statements inserted into a
      vinyl space after a snapshot that follows truncation of the space
      are ignored on WAL recovery. To fix that, we must initialize
      space truncate_count when a tuple is inserted into the _truncate
      system space.
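
      In other words, the trigger should pick up truncate_count on
      insertion as well (snapshot recovery replays inserts), not only
      on update. A hedged sketch, with the space lookup and tuple field
      helpers as illustrative assumptions:

        #include <stdint.h>
        #include <stddef.h>

        struct tuple;
        struct space { uint64_t truncate_count; };

        /* Assumed helpers: find a space by id, read a uint field. */
        struct space *space_lookup(uint32_t space_id);
        uint64_t tuple_field_uint(struct tuple *tuple, uint32_t fieldno);

        /*
         * on_replace trigger of the _truncate system space: initialize
         * truncate_count both on insertion (e.g. while recovering a
         * snapshot) and on update; only deletion is ignored.
         */
        static void
        on_replace_dd_truncate(struct tuple *old_tuple,
                               struct tuple *new_tuple)
        {
                (void)old_tuple;
                if (new_tuple == NULL)
                        return; /* the space is being dropped */
                uint32_t id = (uint32_t)tuple_field_uint(new_tuple, 0);
                struct space *space = space_lookup(id);
                if (space != NULL)
                        space->truncate_count =
                                tuple_field_uint(new_tuple, 1);
        }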
      
      Closes #2521
      99d6a4f4
    • Roman Tsisyk's avatar
      72be507f
    • Ilya's avatar
      Add HTTP client based on libcurl · 7e62ac79
      Ilya authored
      Inspired by the tarantool/curl module by Vasiliy Soshnikov.
      Reviewed and refactored by Roman Tsisyk.
      
      Closes #2083
      7e62ac79
    • Roman Tsisyk's avatar
      Fix name clash in reflection.h · 414635ed
      Roman Tsisyk authored
      Rename `struct type` to `struct type_info` and `struct method` to
      `struct method_info` to fix a name clash with curl/curl.h.
      414635ed
  2. Jun 15, 2017
  3. Jun 14, 2017
    • Vladislav Shpilevoy's avatar
      vinyl: decrease usage of vy_mem_older_lsn · 9b8062d5
      Vladislav Shpilevoy authored
      Do not call vy_mem_older_lsn on each UPSERT commit. The older-lsn
      statement is used to squash a large number of upserts and to turn
      an UPSERT into a REPLACE if the older statement turns out not to
      be an UPSERT.
      But n_upserts can be calculated almost for free during the
      prepare phase, because the bps tree has the bps_insert_get_iterator
      method, which returns an iterator pointing at the inserted
      statement. We can advance this iterator to the older lsn without
      searching the tree and update n_upserts.

      On the commit phase we can take the n_upserts calculated during
      the prepare phase and call vy_mem_older_lsn only if it makes
      sense to optimize the UPSERT.
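
      A rough sketch of the approach; the tree iterator API and the
      statement fields below are assumptions for illustration only.

        #include <stdint.h>
        #include <stdbool.h>

        enum { VY_REPLACE, VY_UPSERT };

        /* Assumed minimal view of a statement and the tree iterator. */
        struct vy_stmt { int type; uint8_t n_upserts; };
        struct tree_iterator;
        /* iterator positioned at the statement we have just inserted */
        struct tree_iterator *bps_insert_get_iterator(void);
        bool tree_iterator_next(struct tree_iterator *it);
        struct vy_stmt *tree_iterator_get(struct tree_iterator *it);
        bool vy_stmt_same_key(struct vy_stmt *a, struct vy_stmt *b);

        /*
         * Prepare phase: peek at the next (older lsn) statement right
         * from the insert iterator, without an extra tree lookup, and
         * derive n_upserts from it. On commit, vy_mem_older_lsn() is
         * called only if n_upserts says squashing is worthwhile.
         */
        static void
        vy_count_upserts(struct vy_stmt *new_stmt)
        {
                if (new_stmt->type != VY_UPSERT)
                        return;
                struct tree_iterator *it = bps_insert_get_iterator();
                if (!tree_iterator_next(it))
                        return; /* no older statement at all */
                struct vy_stmt *older = tree_iterator_get(it);
                if (vy_stmt_same_key(older, new_stmt) &&
                    older->type == VY_UPSERT)
                        new_stmt->n_upserts = older->n_upserts + 1;
        }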
      
      Closes #1988
      9b8062d5
    • Vladislav Shpilevoy's avatar
      vinyl: rename vy_tx_prepare.replace to vy_tx_prepare.repsert · 95e9a101
      Vladislav Shpilevoy authored
      According to the code, the 'replace' tuple can also have the
      UPSERT type. Let's name it 'repsert' = 'replace' + 'upsert'.
      95e9a101
  4. Jun 13, 2017
  5. Jun 12, 2017
  6. Jun 10, 2017
  7. Jun 09, 2017
    • Vladimir Davydov's avatar
      Improve output of vinyl/gc test · 90901285
      Vladimir Davydov authored
      In case of failure, print files that were not deleted and
      the output of box.internal.gc.info().
      
      Needed for #2486
      90901285
    • Vladimir Davydov's avatar
      Rename box.cfg.vinyl_threads to vinyl_write_threads · c568a658
      Vladimir Davydov authored
      To match box.cfg.vinyl_read_threads introduced by the previous patch.
      c568a658
    • Vladimir Davydov's avatar
      vinyl: use cbus instead of coeio for reading run pages · de885dbf
      Vladimir Davydov authored
      vy_run_iterator_load_page() uses coeio, which is extremely
      inefficient for our use case:
      
       - it locks/unlocks mutexes every time when a task is queued, scheduled,
         or finished
       - it invokes ev_async_send(), which writes to eventfd and wakes up TX
         loop every time on every task completion
       - it blocks tasks until a free worker is available, which leads to
         unpredictable delays
      
      This patch replaces coeio with cbus, similarly to how we handle
      TX <-> WAL interaction. The number of reader threads is set by a
      new configuration option, vinyl_read_threads, which defaults to 1.

      Note that this patch doesn't bother adjusting the cbus queue
      length, i.e. it is left at the default of INT_MAX. While this is
      OK when there are a lot of concurrent read requests, it might be
      suboptimal for low-bandwidth workloads, resulting in higher
      latencies. We should probably update the queue length dynamically
      depending on how many clients are out there.
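
      A rough sketch of handing a page read to a reader thread over
      cbus; the hop/route usage mirrors Tarantool's cbus from memory
      and the message layout is an assumption, so treat it as
      illustrative rather than the actual code.

        #include <stddef.h>
        #include "cbus.h" /* struct cmsg, cmsg_hop, cpipe */

        static struct cpipe tx_pipe; /* pipe back to the TX thread */

        /* A page-read request travelling TX -> reader -> TX. */
        struct vy_page_read_msg {
                struct cmsg base; /* cbus routing header, must be first */
                /* run id, page no, output buffer, return code, ... */
        };

        static void
        vy_page_read_do(struct cmsg *msg) /* runs in a reader thread */
        {
                (void)msg; /* read and decompress the page from disk */
        }

        static void
        vy_page_read_done(struct cmsg *msg) /* runs back in TX */
        {
                (void)msg; /* wake up the fiber waiting for the page */
        }

        /* Execute in the reader thread, then hop back to TX. */
        static const struct cmsg_hop vy_page_read_route[] = {
                { vy_page_read_do, &tx_pipe },
                { vy_page_read_done, NULL },
        };

        static void
        vy_page_read_submit(struct cpipe *reader_pipe,
                            struct vy_page_read_msg *msg)
        {
                cmsg_init(&msg->base, vy_page_read_route);
                cpipe_push(reader_pipe, &msg->base);
        }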
      
      Closes #2493
      de885dbf
  8. Jun 08, 2017
    • bigbes's avatar
      Fix for a couple of build problems · 2ba51ab2
      bigbes authored
      2ba51ab2
    • Vladimir Davydov's avatar
      Add engine/truncate test · e9fc8d48
      Vladimir Davydov authored
      e9fc8d48
    • Vladimir Davydov's avatar
      Rework space truncation · 353bcdc5
      Vladimir Davydov authored
      Space truncation as we have it now is not atomic: we recreate all
      indexes of the truncated space one by one. This can result in
      nasty failures if a tuple insertion races with the space
      truncation and sees some indexes truncated and others not.
      
      This patch redesigns space truncation as follows:
      
       - Truncate is now triggered by bumping a counter in a new system space
         called _truncate. As before, space truncation is implemented by
         recreating all of its indexes, but now this is done internally in one
         go, inside the space alter trigger. This makes the operation atomic.
      
       - New indexes are created with Handler::createIndex method, old indexes
         are deleted with Index::~Index. Neither Index::commitCreate nor
         Index::commitDrop are called in case of truncation, in contrast to
         space alter. Since memtx needs to release tuples referenced by old
         indexes, and vinyl needs to log space truncation in the metadata log,
         new Handler methods are introduced, prepareTruncateSpace and
         commitTruncateSpace, which are passed the old and new spaces.
         They are called before and after the truncate record is
         written to WAL, respectively.
      
       - Since Handler::commitTruncateSpace must not fail while a vylog
         write obviously may, we reuse the technique used by the
         commitCreate and commitDrop methods of VinylIndex, namely
         leave the record we failed to write in the vylog buffer to be
         either flushed along with the next write or replayed on WAL
         recovery. To be able to detect whether truncation was logged
         while recovering WAL, we introduce a new vylog record type,
         VY_LOG_TRUNCATE_INDEX, which takes truncate_count as a key: if
         on WAL recovery index truncate_count happens to be <= space
         truncate_count, it means that truncation was not logged and we
         need to log it again (see the sketch after this list).
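
      A hedged sketch of that recovery-time check; the structures and
      the vylog helper below are illustrative, not the actual source.

        #include <stdint.h>

        struct vy_index { int64_t truncate_count; /* from vylog */ };
        struct space { int64_t truncate_count; /* from _truncate */ };

        /* Assumed helper: append a VY_LOG_TRUNCATE_INDEX record. */
        void vy_log_truncate_index(struct vy_index *index,
                                   int64_t truncate_count);

        /*
         * On WAL recovery: if vylog never recorded this truncation
         * (its counter is behind the space's), log it again; otherwise
         * it was already flushed or sits in the vylog buffer.
         */
        static void
        vy_recover_truncate(struct space *space, struct vy_index *index)
        {
                if (index->truncate_count <= space->truncate_count)
                        vy_log_truncate_index(index,
                                              space->truncate_count);
        }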
      
      Closes #618
      Closes #2060
      353bcdc5
    • Vladimir Davydov's avatar
      vinyl: convert vy_index->tree to pointer · 801f32c7
      Vladimir Davydov authored
      The space truncate rework done by the next patch requires the
      ability to swap data stored on disk between two indexes on
      recovery, so as not to reload all runs every time a space gets
      truncated. Since we can't swap the contents of two rb trees (due
      to rbt_nil), convert vy_index->tree to a pointer.
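
      With the tree behind a pointer, exchanging the on-disk content of
      two indexes reduces to a pointer swap; a trivial illustration
      (the vy_index layout is simplified):

        /* The rb tree that references the index's data on disk. */
        struct vy_index_tree;
        struct vy_index { struct vy_index_tree *tree; };

        static void
        vy_index_swap_trees(struct vy_index *a, struct vy_index *b)
        {
                struct vy_index_tree *tmp = a->tree;
                a->tree = b->tree;
                b->tree = tmp;
        }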
      801f32c7
    • Roman Tsisyk's avatar
      Fix -Wunused on on Clang · 85009195
      Roman Tsisyk authored
      85009195
    • Georgy Kirichenko's avatar
      Lock schema for space and index alteration · 5a200cb3
      Georgy Kirichenko authored
      Lock the schema before any changes to the space and index
      dictionary and unlock it only after commit or rollback. This
      allows many parallel data definition statements. Issue #2075
      5a200cb3
    • Georgy Kirichenko's avatar
      Add before statement trigger for spaces · c60fa224
      Georgy Kirichenko authored
      We need to lock the box schema while editing a ddl space. The
      lock should be taken before any changes are made to the ddl
      space, and a before-statement trigger is a good place to take it.
      See #2075
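
      A hedged sketch of the pattern; the latch declarations and the
      trigger wiring are simplified assumptions.

        /* Fiber-aware latch guarding the schema (simplified). */
        struct latch;
        void latch_lock(struct latch *l);
        void latch_unlock(struct latch *l);

        extern struct latch schema_lock;

        /* before-replace trigger on a ddl space (_space, _index, ...):
         * serialize DDL by taking the schema lock before any change
         * is applied. */
        static void
        before_replace_dd_lock(void)
        {
                latch_lock(&schema_lock);
        }

        /* on_commit / on_rollback trigger: release the lock only when
         * the DDL statement is finished, so other DDL can proceed. */
        static void
        after_dd_unlock(void)
        {
                latch_unlock(&schema_lock);
        }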
      c60fa224
    • Vladimir Davydov's avatar
      box: require box.cfg.checkpoint_count to be >= 1 · 27b86b1d
      Vladimir Davydov authored
      We must store at least one snapshot, otherwise we wouldn't be
      able to recover after restart, so if checkpoint_count is set to
      0, garbage collection is disabled. This contravenes the
      convention followed everywhere else in tarantool: if we want an
      option value (timeout, checkpoint count, etc.) to be infinite, we
      should set it to a very big number, not to 0. Make
      checkpoint_count comply.
      27b86b1d
    • Vladimir Davydov's avatar
      box: rework internal garbage collection API · 2c547c26
      Vladimir Davydov authored
      The current gc implementation has a number of flaws:
      
       - It tracks checkpoints, not consumers, which makes it impossible to
         identify the reason why gc isn't invoked. All we can see is the
         number of users of each particular checkpoint (reference counter),
         while it would be good to know what references it (replica or
         backup).
      
       - While tracking checkpoints suits backup and initial join well,
         it doesn't look good when used for subscribe, because a
         replica is supposed to track a vclock, not a checkpoint.
      
       - Tracking checkpoints from box/gc also violates encapsulation:
         checkpoints are, in fact, memtx snapshots, so they should be tracked
         by memtx engine, not by gc, as they are now. This results in
         atrocities, like having two snap xdirs - one in memtx, another in gc.
      
       - Garbage collection is invoked by a special internal function,
         box.internal.gc.run(), which is passed the signature of the
         oldest checkpoint to save. This function is then used by the
         snapshot daemon to maintain the configured number of
         checkpoints. This brings unjustified complexity to the
         snapshot daemon implementation: instead of just calling
         box.snapshot() periodically, it has to take on the
         responsibility of invoking the garbage collector with the
         right signature. It also means that garbage collection is
         disabled unless the snapshot daemon is configured to be
         running, which is confusing, as the snapshot daemon is
         disabled by default.
      
      So this patch reworks box/gc as follows:
      
       - Checkpoints are now tracked by memtx engine and can be accessed via a
         new module box/src/checkpoint.[hc], which provides simple wrappers
         around corresponding MemtxEngine methods.
      
       - box/gc.[hc] now tracks not checkpoints, but individual
         consumers that can be registered, unregistered, and advanced
         (a rough sketch of this interface follows the list below).
         Each consumer has a human-readable name displayed by
         box.internal.gc.info():
      
         tarantool> box.internal.gc.info()
         ---
         - consumers:
           - name: backup
             signature: 8
           - name: replica 885a81a9-a286-4f06-9cb1-ed665d7f5566
             signature: 12
           - name: replica 5d3e314f-bc03-49bf-a12b-5ce709540c87
             signature: 12
           checkpoints:
           - signature: 8
           - signature: 11
           - signature: 12
         ...
      
       - box.internal.gc.run() is removed. Garbage collection is now invoked
         automatically by box.snapshot() and doesn't require the snapshot
         daemon to be up and running.
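
      The consumer-based interface mentioned above, sketched with
      illustrative names that follow the description rather than the
      exact source:

        #include <stdint.h>

        struct gc_consumer;

        /* Register a consumer that still needs every file newer than
         * or equal to 'signature' (the vclock signature it holds on). */
        struct gc_consumer *
        gc_consumer_register(const char *name, int64_t signature);

        /* The consumer has moved on: files older than 'signature' are
         * no longer needed on its behalf. */
        void
        gc_consumer_advance(struct gc_consumer *consumer, int64_t signature);

        /* The consumer is gone (replica unsubscribed, backup done). */
        void
        gc_consumer_unregister(struct gc_consumer *consumer);

        /* Invoked from box.snapshot(): remove everything that neither
         * a registered consumer nor a kept checkpoint still needs. */
        void
        gc_run(void);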
      2c547c26