  1. Feb 20, 2018
    • Vladislav Shpilevoy's avatar
      Rename field_type_is_compatible to field_type1_contains_type2 · 5d2614d8
      Vladislav Shpilevoy authored
      Compatibility must be commutative, but this function is not
      commutative. It checks that one type can store values of another
      type, but not the converse.
      5d2614d8
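The asymmetry described above can be made concrete with a small lookup table. This is only a sketch: the `demo_` names and the reduced set of types are hypothetical, and the table contents are an illustrative assumption, not a copy of the real function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of field types (illustration only). */
enum demo_field_type {
	DEMO_TYPE_ANY,
	DEMO_TYPE_UNSIGNED,
	DEMO_TYPE_INTEGER,
	DEMO_TYPE_NUMBER,
	DEMO_TYPE_STRING,
	DEMO_TYPE_SCALAR,
	demo_type_MAX
};

/*
 * demo_contains[t1][t2] is true if a field of type t1 can store
 * every value of type t2. The table is deliberately not symmetric.
 */
static const bool demo_contains[demo_type_MAX][demo_type_MAX] = {
	/*              ANY    UNSIGNED INTEGER NUMBER STRING SCALAR */
	/* ANY      */ {true,  true,    true,   true,  true,  true},
	/* UNSIGNED */ {false, true,    false,  false, false, false},
	/* INTEGER  */ {false, true,    true,   false, false, false},
	/* NUMBER   */ {false, true,    true,   true,  false, false},
	/* STRING   */ {false, false,   false,  false, true,  false},
	/* SCALAR   */ {false, true,    true,   true,  true,  true},
};

/* Returns true if type t1 can store all values of type t2. */
bool
demo_type1_contains_type2(enum demo_field_type t1, enum demo_field_type t2)
{
	assert(t1 < demo_type_MAX && t2 < demo_type_MAX);
	return demo_contains[t1][t2];
}
```

In this sketch a number field contains unsigned values, while an unsigned field does not contain numbers, which is why "compatible" would be a misleading name.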
    • Vladislav Shpilevoy's avatar
      06aa0266
    • Vladimir Davydov's avatar
      vinyl: allow to disable bloom filter for index · 68a5e6df
      Vladimir Davydov authored
      Not all workloads need bloom filters enabled for all indexes. Let's
      allow disabling them on a per-index basis by setting bloom_fpr to 1.
      This saves some memory if bloom filters are unused.
      
      Closes #3138
      68a5e6df
    • Vladimir Davydov's avatar
      Check vinyl index options on box cfg and space alter · 632592bf
      Vladimir Davydov authored
      Currently, one can set insane values for most vinyl index options, which
      will most certainly result in a crash (e.g. bloom_fpr = 100). Add some
      sanity checks.
      632592bf
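A minimal sketch of what a sanity check for bloom_fpr might look like. The function names are hypothetical, and the accepted range (greater than 0, at most 1) is an assumption based on bloom_fpr being a false positive rate, with 1 meaning "disabled" and 100 being an insane value.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumed valid range for a false positive rate: 0 < fpr <= 1.
 * NaN fails the first comparison and is rejected as well.
 */
bool
demo_bloom_fpr_is_valid(double fpr)
{
	return fpr > 0 && fpr <= 1;
}

/*
 * A rate of 1 means the filter may answer "maybe" for every key,
 * so building it would only waste memory.
 */
bool
demo_bloom_is_enabled(double fpr)
{
	assert(demo_bloom_fpr_is_valid(fpr));
	return fpr < 1;
}
```

Validation would run when the option is set (box cfg or space alter), while the "enabled" test would be consulted when a run is written.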
    • Vladimir Davydov's avatar
      replication: fix rebootstrap race that results in broken subscription · 72ed72a2
      Vladimir Davydov authored
      While a node of the cluster is re-bootstrapping (joining again),
      other nodes may try to re-subscribe to it. They will fail, because
      the rebootstrapped node hasn't tried to subscribe hence hasn't been
      added to the _cluster table yet and so is not present in the hash
      at the subscriber's side for replica_on_applier_reconnect() to look
      it up.
      
      Fix this by making a subscriber create an id-less (REPLICA_ID_NIL)
      struct replica in this case and reattach the applier to it. It will
      be assigned an id when it finally subscribes and is registered in
      _cluster.
      
      Fixes 71b33405 replication: reconnect applier on master rebootstrap
      72ed72a2
    • Vladimir Davydov's avatar
      replication: stop syncing if quorum cannot be formed · 042c07dc
      Vladimir Davydov authored
      If box.cfg() successfully connects to a number of replicas sufficient to
      form a quorum (>= box.cfg.replication_connect_quorum), it won't return
      until it syncs with all of them (lag <= box.cfg.replication_sync_lag).
      If one of the replicas forming a quorum disconnects permanently while
      sync is in progress, box.cfg() will hang forever.
      
      Such a behavior is rather unreasonable. After all, syncing a quorum is
      best-effort. It would be much more sensible to return from box.cfg()
      leaving the instance in the 'orphan' mode in this case. This patch does
      exactly that: now if we detect that not enough replicas are connected to
      form a quorum while we are syncing we stop syncing immediately.
      042c07dc
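The decision described above can be sketched as a small state function. All names here are hypothetical, and the real code tracks replicas individually rather than by counters.

```c
#include <assert.h>

enum demo_sync_state {
	DEMO_SYNC_IN_PROGRESS,	/* keep waiting inside box.cfg() */
	DEMO_SYNC_DONE,		/* synced with all connected replicas */
	DEMO_SYNC_ORPHAN	/* quorum lost, leave in 'orphan' mode */
};

/*
 * connected: replicas currently connected
 * synced:    connected replicas whose lag is within the limit
 * quorum:    value of replication_connect_quorum
 */
enum demo_sync_state
demo_sync_step(int connected, int synced, int quorum)
{
	assert(synced >= 0 && synced <= connected);
	if (connected < quorum)
		return DEMO_SYNC_ORPHAN;
	if (synced == connected)
		return DEMO_SYNC_DONE;
	return DEMO_SYNC_IN_PROGRESS;
}
```

The quorum check comes first, so a replica that disconnects mid-sync ends the wait immediately instead of leaving box.cfg() blocked.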
    • Vladimir Davydov's avatar
      Introduce replication_connect_timeout configuration option · 0e9b87c7
      Vladimir Davydov authored
      Currently, the max time box.cfg() may wait for connection to replicas to
      be established is hardcoded to box.cfg.replication_timeout times 4. As
      a result, users can't revert to pre replication_connect_quorum behavior,
      when box.cfg() blocks until it connects to all replicas. To fix that,
      let's introduce a new configuration option, replication_connect_timeout,
      which determines the replication configuration timeout. By default the
      option is set to 4 seconds.
      
      Closes #3151
      0e9b87c7
    • Vladimir Davydov's avatar
      vinyl: skip needless checks for duplicates on INSERT · 36838c7c
      Vladimir Davydov authored
      If a unique index includes all parts of another unique index, we can
      skip the check for duplicates for it on INSERT. Let's mark all such
      indexes with a special flag on CREATE/ALTER and optimize out the check
      if the flag is set. If there are two indexes that index the same set
      of fields, check uniqueness for the one with a lower id, because it
      is likelier to have a "warmer" cache.
      
      Closes #3154
      36838c7c
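A rough sketch of how the flag could be computed on CREATE/ALTER. It assumes key parts can be reduced to a 64-bit mask of field numbers and ignores part types and collations; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_index {
	uint32_t id;
	bool is_unique;
	uint64_t field_mask;	/* bit n set => field n is a key part */
	bool check_is_optional;	/* set => skip duplicate check on INSERT */
};

/*
 * Mark every unique index whose parts include all parts of another
 * unique index. For indexes over the same set of fields, only the
 * one with the lowest id keeps its check.
 */
void
demo_mark_optional_checks(struct demo_index *idx, size_t count)
{
	assert(idx != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		idx[i].check_is_optional = false;
		if (!idx[i].is_unique)
			continue;
		for (size_t j = 0; j < count; j++) {
			if (i == j || !idx[j].is_unique)
				continue;
			bool is_subset =
				(idx[j].field_mask & ~idx[i].field_mask) == 0;
			bool is_same =
				idx[j].field_mask == idx[i].field_mask;
			if (is_subset && (!is_same || idx[j].id < idx[i].id)) {
				idx[i].check_is_optional = true;
				break;
			}
		}
	}
}
```

The reasoning is that if index j is unique on a subset of the fields of index i, any duplicate in i would also be a duplicate in j, so checking j alone is enough.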
    • Vladimir Davydov's avatar
      vinyl: delete runs compacted during join immediately · 9c446a20
      Vladimir Davydov authored
      We keep run files corresponding to (at least) the last snapshot, because
      we need them for backups and replication. Deletion of compacted run
      files is postponed until the next snapshot. As a consequence, we don't
      delete run files created on a replica during the join stage. However, in
      contrast to run files created during normal operation, these are pure
      garbage and should be deleted right away. Not deleting them can result
      in depletion of disk space, because vinyl has quite high write
      amplification by design.
      
      We can't write a functional test for this, because there's no way to
      guarantee that compaction started during join will finish before join
      completion - if it doesn't, compacted runs won't be removed, because
      they will be assigned to the snapshot created by join.
      
      Closes #3162
      9c446a20
    • Vladislav Shpilevoy's avatar
      vinyl: forbid vinyl index key definition alter · c31dd19a
      Vladislav Shpilevoy authored
      Vinyl index key definition is stored in vylog even if an index is
      empty, and we do not yet have a method to update it. So vinyl
      index key definition alter is forbidden even on an empty space.
      
      Closes #3169
      c31dd19a
  2. Feb 19, 2018
  3. Feb 17, 2018
    • Georgy Kirichenko's avatar
      Don't exit from ddl until all other ddls are flushed · 350b645a
      Georgy Kirichenko authored
      If any ddl operation is in progress, then all other ddls wait
      on the schema latch. But once the first ddl is done, any other
      request may be issued right after it, and the commit order will be
      broken in case of multimaster replication. To prevent this
      behavior, any ddl operation should wait until all queued ddls are
      done.
      
      Fixes #2951
      350b645a
    • Georgy Kirichenko's avatar
      Enhance latch behavior · 03fe7a2f
      Georgy Kirichenko authored
      Prevent the latch lock from being intercepted by another already
      scheduled or active fiber when there is only one fiber waiting.
      This is needed for strict latch ordering.
      03fe7a2f
  4. Feb 16, 2018
  5. Feb 15, 2018
    • Konstantin Osipov's avatar
      Revert "replication: disconnect applier on timeout" · a7871247
      Konstantin Osipov authored
      This reverts commit 99c7a971.
      a7871247
    • Vladimir Davydov's avatar
      vinyl: warn when transaction waits for quota for too long · d6a61904
      Vladimir Davydov authored
      If a vinyl transaction stalls waiting for quota for more than
      box.cfg.too_long_threshold seconds, emit a warning to the log:
      
        W> waited for 699089 bytes of vinyl memory quota for too long: 0.504 sec
      
      This will help us understand whether our users experience lags
      due to absence of throttling in vinyl (see #1862).
      
      Closes #3096
      d6a61904
    • imarkov's avatar
      schema: improve arguments check in grant or revoke on universe · 7e652c22
      imarkov authored
      The name of the universe is optional, so we don't check it. If a user
      wants to specify extra options in the grant, such as if_not_exists, and
      mistakes the object name argument for the options argument, the options
      are silently ignored:
      
        box.schema.user.grant('tnt', 'read,write,execute', 'universe', {if_not_exists = true})
      
      Fix this by adding Lua code that ensures that universe name is a scalar
      (string or nil).
      
      Closes #3146
      7e652c22
  6. Feb 13, 2018
  7. Feb 11, 2018
    • Vladimir Davydov's avatar
      vinyl: implement space.bsize, index.bsize, and index.len · f3ca517d
      Vladimir Davydov authored
       - space.bsize returns the size of user data stored in the space.
         It is the sum of memory.bytes and disk.bytes as reported by
         the primary index.
      
       - index.bsize returns the size of memory used for indexing data.
         It is the sum of memory.index_size, disk.index_size, and
         disk.bloom_size as reported by index.info. For secondary indexes
         we also add the size of binary data stored on disk (disk.bytes),
         because it is only needed to build the index.
      
       - index.len returns the total number of rows stored in the index.
         It is the sum of memory.rows and disk.rows as reported by
         index.info. Note, it may be greater than the number of tuples
         stored in the space, because it includes DELETE and UPDATE
         statements.
      
      Closes #2863
      Closes #3056
      f3ca517d
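The three sums listed above can be written out as follows. The struct is a hypothetical flattening of the index.info fields named in the message, not the real statistics layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat view of the index.info counters used below. */
struct demo_index_info {
	int64_t memory_rows;
	int64_t memory_bytes;
	int64_t memory_index_size;
	int64_t disk_rows;
	int64_t disk_bytes;
	int64_t disk_index_size;
	int64_t disk_bloom_size;
};

/* space.bsize: user data size as reported by the primary index. */
int64_t
demo_space_bsize(const struct demo_index_info *pk)
{
	assert(pk != NULL);
	return pk->memory_bytes + pk->disk_bytes;
}

/* index.bsize: memory used for indexing data. */
int64_t
demo_index_bsize(const struct demo_index_info *info, bool is_secondary)
{
	assert(info != NULL);
	int64_t size = info->memory_index_size + info->disk_index_size +
		       info->disk_bloom_size;
	/* Secondary index data on disk exists only to build the index. */
	if (is_secondary)
		size += info->disk_bytes;
	return size;
}

/* index.len: total rows, including DELETE and UPDATE statements. */
int64_t
demo_index_len(const struct demo_index_info *info)
{
	assert(info != NULL);
	return info->memory_rows + info->disk_rows;
}
```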
    • Vladimir Davydov's avatar
      vinyl: report size of memory used for indexing data in index.info · eea5967b
      Vladimir Davydov authored
      This patch adds the following statistics to index.info:
      
       - memory.index_size - size of memory tree extents
       - cache.index_size - size of cache tree extents
       - disk.index_size - size of page index
       - disk.bloom_size - size of bloom filters
      eea5967b
    • Konstantin Belyavskiy's avatar
      httpc: allow to use unix socket as connection endpoint · 04e75f2c
      Konstantin Belyavskiy authored
      This patch adds a new connection option to http client, 'unix_socket'.
      The option specifies the path to the unix socket to use as connection
      endpoint instead of TCP:
      
        httpc = require('http.client')
        httpc.request('GET', 'http://localhost/index.html', nil,
                      {unix_socket = '/var/run/docker.sock'})
      
      The option is supported only if tarantool was built with libcurl 7.40.0
      or newer. For older versions, an attempt to use the option will result
      in a Lua exception.
      
      Suggested and first implemented by @rosik.
      The test was refactored by @locker.
      
      Closes #3040
      04e75f2c
  8. Feb 10, 2018
    • Vladimir Davydov's avatar
      Make box.once() wait until instance enters rw mode · 33980fc5
      Vladimir Davydov authored
      It will help resolve box.once() conflicts in case master is rw
      and replica is ro.
      
      Closes #2537
      33980fc5
    • Vladimir Davydov's avatar
      Add Lua helpers to wait for server to switch to/from ro mode · 1d45d7b4
      Vladimir Davydov authored
      This patch adds two new Lua functions, box.ctl.wait_ro() and
      box.ctl.wait_rw(), that block the current fiber until the
      server switches to read-only or read-write mode, respectively.
      Both functions take the timeout as an optional argument.
      
      Needed for #2537
      1d45d7b4
    • Vladimir Davydov's avatar
      Fix compilation with ENABLE_BACKTRACE=OFF · ddb6f0b5
      Vladimir Davydov authored
        src/lua/init.c: In function ‘tarantool_panic_handler’:
        src/lua/init.c:321:2: error: implicit declaration of function ‘print_backtrace’ [-Werror=implicit-function-declaration]
          print_backtrace();
          ^~~~~~~~~~~~~~~
      
        src/lua/fiber.c:244:1: error: ‘lbox_fiber_statof_bt’ defined but not used [-Werror=unused-function]
         lbox_fiber_statof_bt(struct fiber *f, void *cb_ctx)
         ^~~~~~~~~~~~~~~~~~~~
      ddb6f0b5
  9. Feb 08, 2018
    • Vladimir Davydov's avatar
      txn: fix rollback in sub statement · 6b49134d
      Vladimir Davydov authored
      There are two issues in the rollback code:
      
       - txn_rollback_stmt() rolls back the current autocommit transaction even
         if it is called from a sub-statement. As a result, if a sub-statement
         (i.e. a statement called from a before_replace or on_replace trigger)
         fails (e.g. due to a conflict), it will trash the current transaction
         leading to a bad memory access upon returning from the trigger.
      
       - txn_begin_stmt() calls txn_rollback_stmt() on failure even if it did
         not instantiate the statement. So if it is called from a trigger and
         fails (e.g. due to nesting limit), it may trash the parent statement,
         again leading to a crash.
      
      Fix them both and add some tests.
      
      Closes #3127
      6b49134d
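The two rules behind the fix can be illustrated with a toy model. Everything here is hypothetical: the struct, the nesting limit of 3, and the function names are stand-ins chosen for the sketch, not the real txn code.

```c
#include <assert.h>
#include <stdbool.h>

enum { DEMO_SUB_STMT_MAX = 3 };	/* assumed nesting limit */

struct demo_txn {
	bool is_autocommit;
	int stmt_depth;		/* number of currently open statements */
	bool is_alive;
};

/*
 * Returns 0 on success, -1 if the nesting limit is hit. On failure
 * no statement was instantiated, so the caller's statement must be
 * left untouched: there is nothing to roll back here.
 */
int
demo_txn_begin_stmt(struct demo_txn *txn)
{
	if (txn->stmt_depth >= DEMO_SUB_STMT_MAX)
		return -1;
	txn->stmt_depth++;
	return 0;
}

/*
 * Roll back the innermost statement. Only the outermost statement
 * of an autocommit transaction takes the whole transaction down.
 */
void
demo_txn_rollback_stmt(struct demo_txn *txn)
{
	assert(txn->is_alive && txn->stmt_depth > 0);
	txn->stmt_depth--;
	if (txn->is_autocommit && txn->stmt_depth == 0)
		txn->is_alive = false;
}
```

In this model a failing statement issued from a trigger unwinds only itself, so the parent statement still has a live transaction to return to.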
    • Vladimir Davydov's avatar
      alter: do not require index rebuild to clear uniqueness · 7528303c
      Vladimir Davydov authored
      Obviously, there's no point in rebuilding an index if all we do is
      relax the uniqueness property. This will also allow us to clear
      the uniqueness flag for vinyl indexes, which do not support rebuild.
      
      Note, a memtx tree index stores a pointer to either cmp_def or key_def
      depending on whether the index is unique. Hence to clear the uniqueness
      flag without rebuilding the index, we need to update this pointer. To do
      that, we add a new index virtual method, update_def.
      
      Closes #2449
      7528303c
    • Vladimir Davydov's avatar
      index: remove unused C++ wrappers · dea88836
      Vladimir Davydov authored
      dea88836
  10. Feb 06, 2018
    • Vladimir Davydov's avatar
      replication: allow to rebootstrap replica from read-only master · 8b08ec59
      Vladimir Davydov authored
      If an instance is read-only, an attempt to join a new replica to it will
      fail with ER_READONLY, because joining a replica to a cluster implies
      registration in the _cluster system space. However, if the replica is
      already registered, which is the case if it is being rebootstrapped with
      the same uuid (see box.cfg.instance_uuid), the record corresponding to
      the replica is already present in the _cluster space and hence no write
      operation is required. Still, rebootstrap fails with the same error.
      
      Let's rearrange the access checks to make it possible to rebootstrap a
      replica from a read-only master provided it has the same uuid.
      
      Closes #3111
      8b08ec59
    • Vladimir Davydov's avatar
      vinyl: don't check key uniqueness if indexed fields are not updated · ab726031
      Vladimir Davydov authored
      We can save a lookup in a secondary index on update if indexed fields
      are not modified. The extra check comes for free as we have a bit mask
      of all updated fields.
      
      Closes #2980
      ab726031
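The bit mask test mentioned above amounts to a single AND. The sketch below assumes field numbers below 64 and uses hypothetical helper names; how the real mask treats higher field numbers is not covered here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit for a single field number; valid only for fieldno < 64. */
static inline uint64_t
demo_field_bit(unsigned fieldno)
{
	assert(fieldno < 64);
	return UINT64_C(1) << fieldno;
}

/*
 * updated_mask: fields modified by the UPDATE
 * index_mask:   fields that are key parts of the secondary index
 *
 * If the masks do not intersect, the indexed key is unchanged and
 * the uniqueness lookup in that index can be skipped.
 */
bool
demo_need_unique_check(uint64_t updated_mask, uint64_t index_mask)
{
	return (updated_mask & index_mask) != 0;
}
```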
    • Vladimir Davydov's avatar
      replication: fix cluster node rebootstrap · 4e62423e
      Vladimir Davydov authored
      When a tarantool instance starts for the first time (the local directory
      is empty), it chooses the peer with the lowest UUID as the bootstrap
      master. As a result, one cannot reliably rebootstrap a cluster node
      (delete all local files and restart): if the node happens to have the
      lowest UUID in the cluster after restart, it will assume that it is the
      leader of a new cluster and bootstrap locally, splitting the cluster in
      two.
      
      To fix this problem, let's always give preference to peers with a higher
      vclock when choosing a bootstrap master and only fall back on selection
      by UUID if two or more peers have the same vclock. To achieve that, we
      need to introduce a new iproto request type for fetching the current
      vclock of a tarantool instance (we cannot squeeze the vclock in the
      greeting, because the latter is already packed). The new request type is
      called IPROTO_REQUEST_VOTE so that in future it can be reused for a more
      sophisticated leader election algorithm. It has no body and does not
      require authentication. In reply to such a request, a tarantool instance
      will send IPROTO_OK and its current vclock. If the version of the master
      is >= 1.7.7, an applier will send IPROTO_REQUEST_VOTE to fetch the
      master's vclock before trying to authenticate. The vclock will then be
      used to determine the node to bootstrap from.
      
      Closes #3108
      4e62423e
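A simplified sketch of the selection rule: prefer the higher vclock, fall back on the lowest UUID. It assumes a vclock can be reduced to a single number and that UUIDs are compared as text; both are simplifications made for the example, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct demo_peer {
	char uuid[37];			/* canonical text form plus NUL */
	int64_t vclock_signature;	/* assumption: vclock as one number */
};

/* Pick the peer to bootstrap from; count must be at least 1. */
const struct demo_peer *
demo_pick_bootstrap_master(const struct demo_peer *peers, size_t count)
{
	assert(peers != NULL && count > 0);
	const struct demo_peer *best = &peers[0];
	for (size_t i = 1; i < count; i++) {
		const struct demo_peer *p = &peers[i];
		if (p->vclock_signature > best->vclock_signature ||
		    (p->vclock_signature == best->vclock_signature &&
		     strcmp(p->uuid, best->uuid) < 0))
			best = p;
	}
	return best;
}
```

A freshly wiped node reports an empty vclock, so under this rule it no longer wins the selection just because its UUID happens to be the lowest.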
    • Vladimir Davydov's avatar
      Cleanup xrow.h · e1d0946b
      Vladimir Davydov authored
      No functional changes, just a trivial cleanup:
      
       - Move all C functions inside extern "C" section.
       - Rename xrow_decode_join to xrow_decode_join_xc.
       - Make XXX_xc wrappers around XXX functions.
      e1d0946b
  11. Feb 05, 2018
    • Vladimir Davydov's avatar
      applier: do not print 'authenticated' message if connecting as guest · 674c1058
      Vladimir Davydov authored
      Before commit 2788dc1b ("Add APPLIER_READY state") we only printed
      the 'authenticated' message to the log in case credentials were set in
      the replication URI. The commit changed that: now we print the message
      even in case of guest connections, when applier does not send the AUTH
      command to the master at all. As a result if guest connections are not
      permitted by the master, the applier will keep printing 'authenticated'
      after every unsuccessful attempt to subscribe. This is misleading. Let
      us revert to the behavior we had before commit 2788dc1b.
      
      Closes #3113
      674c1058
  12. Feb 02, 2018