  1. Oct 25, 2018
  2. Sep 20, 2018
  3. Sep 15, 2018
    • Fix Debug build on GCC 8 · 8c538963
      Alexander Turenko authored
      Fixed false positive -Wimplicit-fallthrough in http_parser.c by adding a
      break. The code jumps anyway, so the execution flow is not changed.
      
      Fixed false positive -Wparentheses in reflection.h by removing the
      parentheses. The argument 'method' of the macro 'type_foreach_method' is
      just the name of the loop variable and is passed to the macro for
      readability reasons.
      
      Fixed false positive -Wcast-function-type triggered by reflection.h by
      adding -Wno-cast-function-type for sources and unit tests. We cast a
      pointer to a member function to another pointer to member function to
      store it in a structure, but we cast it back before making a call. It is
      legal and does not lead to undefined behaviour.
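
      The store-then-cast-back pattern can be illustrated in plain C with
      ordinary function pointers. This is only a sketch of the idea, not the
      code from reflection.h (which deals with C++ member function pointers);
      the names method_entry, generic_fn and binary_fn are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Generic function pointer type used only for storage (hypothetical name). */
typedef void (*generic_fn)(void);

/* The real signature of the function that gets stored. */
typedef int (*binary_fn)(int, int);

/* Hypothetical registry entry that keeps a type-erased function pointer. */
struct method_entry {
	const char *name;
	generic_fn fn;
};

static int
add_ints(int a, int b)
{
	return a + b;
}

/* Erase the function type so it fits into the structure. */
static struct method_entry
method_entry_make(const char *name, binary_fn fn)
{
	struct method_entry entry = { name, (generic_fn)fn };
	return entry;
}

/*
 * Cast back to the original type before the call. Calling through
 * generic_fn directly would be undefined behaviour; the round trip
 * through another function pointer type is not.
 */
static int
method_entry_call(const struct method_entry *entry, int a, int b)
{
	assert(entry != NULL && entry->fn != NULL);
	binary_fn fn = (binary_fn)entry->fn;
	return fn(a, b);
}
```

      Calling method_entry_call() on an entry built from add_ints with the
      arguments 2 and 3 goes through the restored pointer and yields 5.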
      
      Fixes #3685.
  4. Sep 04, 2018
    • box: sync on replication configuration update · 113ade24
      Vladimir Davydov authored
      Now box.cfg() doesn't return until 'quorum' appliers are in sync not
      only on initial configuration, but also on replication configuration
      update. If it fails to synchronize within replication_sync_timeout,
      box.cfg() returns without an error, but the instance enters 'orphan'
      state, which is basically read-only mode. In the meantime, appliers
      will keep trying to synchronize in the background, and the instance
      will leave 'orphan' state as soon as enough appliers are in sync.
      
      Note, this patch also changes logging a bit:
       - 'ready to accept request' is printed on startup before syncing
         with the replica set, because although the instance is read-only
         at that time, it can indeed accept all sorts of ro requests.
       - For 'connecting', 'connected', 'synchronizing' messages, we now
         use 'info' logging level, not 'verbose' as they used to be, because
         those messages are important as they give the admin an idea of
         what's going on with the instance, and they can't flood logs.
       - 'sync complete' message is also printed as 'info', not 'crit',
         because there's nothing critical about it (it's not an error).
      
      Also note that we only enter 'orphan' state if we fail to synchronize.
      In particular, if the instance manages to synchronize with all replicas
      within a timeout, it will jump from 'loading' straight into 'running'
      bypassing 'orphan' state. This is done for the sake of consistency
      between initial configuration and reconfiguration.
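
      The state transitions described above can be modelled roughly as
      follows. This is a simplified sketch, not the actual box code: the enum
      and both function names are hypothetical, and the timeout is reduced to
      "how many appliers were in sync when the wait ended".

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified instance states taken from the commit message. */
enum instance_state { STATE_LOADING, STATE_ORPHAN, STATE_RUNNING };

/*
 * State chosen when box.cfg() stops waiting, either because enough
 * appliers are in sync or because the sync timeout has expired.
 * 'synced' is the number of appliers in sync at that moment.
 */
static enum instance_state
state_after_sync_wait(int synced, int quorum)
{
	assert(synced >= 0 && quorum >= 0);
	return synced >= quorum ? STATE_RUNNING : STATE_ORPHAN;
}

/*
 * Background re-evaluation: an orphan instance leaves the state as
 * soon as enough appliers have synchronized. Other states are kept.
 */
static enum instance_state
state_on_applier_sync(enum instance_state current, int synced, int quorum)
{
	if (current == STATE_ORPHAN && synced >= quorum)
		return STATE_RUNNING;
	return current;
}
```

      In this model an instance that syncs in time never passes through
      STATE_ORPHAN, which matches the 'loading' to 'running' jump described
      in the message.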
      
      Closes #3427
      
      @TarantoolBot document
      Title: Sync on replication configuration update
      The behavior of box.cfg() on replication configuration update is
      now consistent with initial configuration, that is box.cfg() will
      not return until it synchronizes with as many masters as specified
      by the replication_connect_quorum configuration option or until the
      timeout specified by replication_sync_timeout expires. On timeout, it will
      return without an error, but the instance will enter 'orphan' state.
      It will leave 'orphan' state as soon as enough appliers have synced.
    • box: add replication_sync_timeout configuration option · ca9fc33a
      Olga Arkhangelskaia authored
      In the scope of #3427 we need a timeout in case an instance waits for
      synchronization for too long, or even forever. Default value is 300
      seconds.
      
      Closes #3674
      
      @locker: moved dynamic config check to box/cfg.test.lua; code cleanup
      
      @TarantoolBot document
      Title: Introduce new configuration option replication_sync_timeout
      After initial bootstrap or after a replication configuration change,
      an instance needs to sync up with the replication quorum. Sometimes the
      sync takes too long, or replication_sync_lag is smaller than the network
      latency, and the replica gets stuck in a sync loop that can't be
      cancelled. To avoid such situations, replication_sync_timeout can be
      used. When the time set in replication_sync_timeout has passed, the
      replica enters the orphan state.
      Can be set dynamically. Default value is 300 seconds.
    • box: make replication_sync_lag option dynamic · 5eb5c181
      Olga Arkhangelskaia authored
      In #3427 replication_sync_lag should be taken into account during
      replication reconfiguration. In order to configure replication properly
      this parameter is made dynamic and can be changed on demand.
      
      @locker: moved dynamic config check to box/cfg.test.lua
      
      @TarantoolBot document
      Title: replication_sync_lag option can be set dynamically
      box.cfg.replication_sync_lag can now be set at any time.
  5. Aug 30, 2018
  6. Aug 29, 2018
    • iproto: don't throw exception in replication handler · 2e87902e
      Georgy Kirichenko authored
      It is an error to throw an error out of a cbus message handler because
      it breaks cbus message delivery. In the case of replication, throwing an
      error prevents iproto from closing the replication socket.
      
      Closes #3642
    • Update test-run · 3e382d6e
      Vladimir Davydov authored
      So that instances started by test-run don't inherit file descriptors
      corresponding to logs and sockets of all running instances.
      
      Needed for testing #3642
  7. Aug 24, 2018
    • socket: prevent recvfrom from returning garbage · 87f9be4d
      Alexander Turenko authored
      In C, the recvfrom function sets the addrlen parameter to zero when
      called on a TCP socket (at least on Linux). The src_addr parameter can
      contain garbage in that case, so we should not dereference it.
      
      Before this commit socket:recvfrom() could return a 'from' table with
      only the family field (not sure why, but addr->sa_family often contained
      the PF_INET value in my case) or return nil, depending on the garbage at
      the address. Now it always returns nil.
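
      A minimal C sketch of the check described above, assuming a POSIX
      system: the source address is treated as valid only when recvfrom()
      reports a non-zero address length. The struct recv_result and both
      function names are hypothetical and are not taken from the Tarantool
      socket module.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical result holder: 'from' is meaningful only if has_from is set. */
struct recv_result {
	ssize_t size;
	bool has_from;
	struct sockaddr_storage from;
	socklen_t from_len;
};

/* The address buffer may be read only if a non-zero length was reported. */
static bool
source_addr_is_set(socklen_t addrlen)
{
	return addrlen > 0;
}

/* Receive data and record whether a source address was actually filled in. */
static struct recv_result
recv_with_source(int fd, void *buf, size_t len)
{
	struct recv_result res;
	assert(buf != NULL);
	/* Zero the result so 'from' never holds stack garbage. */
	memset(&res, 0, sizeof(res));
	res.from_len = sizeof(res.from);
	res.size = recvfrom(fd, buf, len, 0,
			    (struct sockaddr *)&res.from, &res.from_len);
	res.has_from = res.size >= 0 && source_addr_is_set(res.from_len);
	return res;
}
```

      A caller would look at res.from only when res.has_from is true and
      report "no address" otherwise, instead of interpreting whatever bytes
      happen to be in the buffer.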
    • replication: fix exit with ER_NO_SUCH_USER during bootstrap · 33950162
      Serge Petrenko authored
      When replication is configured via a user created in a box.once()
      function and box.once() takes more than replication_timeout seconds
      to execute, appliers receive an ER_NO_SUCH_USER error, which they don't
      handle. This leads to occasional test failures in the replication suite.
      Fix this by handling the aforementioned case in applier_f() and add a
      test case.
      
      Closes #3637
    • lua: wrong 'tomap' work with nullable fields · 50a0f1e8
      Mergen Imeev authored
      The tuple method 'tomap' in some cases worked improperly if the tuple
      was shorter than it should be according to the space format. Fixed
      in this patch.
      
      Closes #3631
  8. Aug 21, 2018
    • lua: fix for option pid_file overwritten by tarantoolctl · a1d685f3
      Konstantin Belyavskiy authored
      During startup tarantoolctl ignores the 'pid_file' option and sets it
      to the default value.
      This causes a fault if the user tries to execute a config with the
      option set. When started with tarantoolctl, shadow this option with
      an additional wrapper around box.cfg.
      
      Closes #3214
    • Add FindICONV and iconv wrapper · dcac64af
      Konstantin Belyavskiy authored
      Fix the build under FreeBSD:
      Undefined symbol "iconv_open"
      Add a compile-time check with FindICONV.cmake
      and a wrapper that imports the relevant symbol names via an include file.
      
      Closes #3441
    • box: fix long uri output in box.info() · aa7831c2
      Serge Petrenko authored
      lua_pushapplier() had an inexplicably small buffer for uri representation.
      Enlarged the buffer. Also lua_pushapplier() didn't take into account
      that uri_format() could return a value larger than buffer size. Fixed.
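
      The truncation problem is the usual one with snprintf-style
      formatters: the return value is the length the full output would have,
      which may exceed the buffer. A hedged sketch of the check, assuming
      snprintf-like return semantics; format_endpoint() is a hypothetical
      helper, not lua_pushapplier() or uri_format().

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: format "host:port" into buf and return the
 * number of characters actually stored, excluding the terminating NUL.
 */
static size_t
format_endpoint(char *buf, size_t size, const char *host, int port)
{
	assert(buf != NULL && size > 0 && host != NULL);
	int needed = snprintf(buf, size, "%s:%d", host, port);
	if (needed < 0) {
		/* Encoding error: leave an empty string behind. */
		buf[0] = '\0';
		return 0;
	}
	/*
	 * snprintf() reports the untruncated length. Using it as the
	 * stored length without this check would read past the buffer.
	 */
	if ((size_t)needed >= size)
		return size - 1;
	return (size_t)needed;
}
```

      With an 8-byte buffer and the input "localhost", 3301 the helper
      stores the 7 characters that fit and returns 7, although snprintf()
      itself reports 14.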
      
      Closes #3630
  9. Aug 14, 2018
    • replication: do not ignore replication_connect_quorum · c1a16b26
      Serge Petrenko authored
      On bootstrap and after initial configuration replication_connect_quorum
      was ignored. The instance tried to connect to every replica listed in
      replication parameter, and failed if it wasn't possible.
      
      The patch alters this behaviour. An instance still tries to connect to
      every node listed in box.cfg.replication, but does not raise an error if
      it was able to connect to at least replication_connect_quorum instances.
      
      Closes #3428
      
      @TarantoolBot document
      Title: replication_connect_quorum is not ignored
      Now on replica set bootstrap and in case of replication reconfiguration
      (e.g. calling box.cfg{replication=...} for the second time) tarantool
      doesn't fail if it couldn't connect to every replica, but could
      connect to replication_connect_quorum replicas. If after
      replication_connect_timeout seconds the instance is not connected to at
      least replication_connect_quorum other instances, we throw an error.
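
      The connection rule can be summarized by a small decision function.
      This is a sketch under the assumptions stated in the message, not the
      real implementation; the enum and connect_check() are hypothetical
      names.

```c
#include <assert.h>
#include <stdbool.h>

/* Possible outcomes of the connection phase in this simplified model. */
enum connect_result { CONNECT_PENDING, CONNECT_OK, CONNECT_ERROR };

/*
 * connected:  number of replicas we are connected to right now
 * configured: number of entries in box.cfg.replication
 * quorum:     value of replication_connect_quorum
 * timed_out:  true once replication_connect_timeout has expired
 */
static enum connect_result
connect_check(int connected, int configured, int quorum, bool timed_out)
{
	assert(connected >= 0 && connected <= configured);
	/* Every listed node is connected: nothing left to wait for. */
	if (connected == configured)
		return CONNECT_OK;
	/* Keep trying to reach every node until the timeout expires. */
	if (!timed_out)
		return CONNECT_PENDING;
	/* After the timeout a quorum is enough, anything less is an error. */
	return connected >= quorum ? CONNECT_OK : CONNECT_ERROR;
}
```

      For three configured replicas and a quorum of two, being connected to
      two of them when the timeout fires is accepted, while being connected
      to only one is reported as an error.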
    • test: add arguments to replication instances · 438a4e65
      Serge Petrenko authored
      Add start arguments to replication test instances to control
      replication_timeout and replication_connect_timeout settings
      between restarts.
      
      Needed for #3428
    • Update test-run · f702beeb
      Serge Petrenko authored
      Allows passing arguments to servers started with create_cluster().
  10. Aug 13, 2018
  11. Aug 11, 2018
    • test: fix box/bitset test failure · 444355dd
      Vladimir Davydov authored
      Reproduce file:
      
      - [box/access.test.lua, null]
      - [box/iterator.test.lua, null]
      - [box/bitset.test.lua, null]
      
      The issue happens, because box/bitset.lua:dump() uses iterate(), which
      gets cleared by box/iterator test. Fix this by using utils.iterate()
      instead.
  12. Aug 10, 2018
    • vinyl: fix appearance of phantom tuple in secondary index after update · e72867cb
      Vladimir Davydov authored
      index.update() looks up the old tuple in the primary index, applies
      update operations to it, then writes a DELETE statement to secondary
      indexes to delete the old tuple and a REPLACE statement to all indexes
      to insert the new tuple. It also sets a column mask for both DELETE and
      REPLACE statements. The column mask is a bit mask which has a bit set if
      the corresponding field is updated by update operations. It is used by
      the write iterator for two purposes. First, the write iterator skips
      REPLACE statements that don't update key fields. Second, the write
      iterator turns a REPLACE that has a column mask that intersects with key
      fields into an INSERT (so that it can get annihilated with a DELETE when
      the time comes). The latter is correct, because if an update() does
      update secondary key fields, then it must have deleted the old tuple and
      hence the new tuple is unique in terms of extended key (merged primary
      and secondary key parts, i.e. cmp_def).
      
      The problem is that a bit may be set in a column mask even if the
      corresponding field does not actually get updated. Consider the
      following example.
      
        s = box.schema.space.create('test', {engine = 'vinyl'})
        s:create_index('pk')
        s:create_index('sk', {parts = {2, 'unsigned'}})
        s:insert{1, 10}
        box.snapshot()
        s:update(1, {{'=', 2, 10}})
      
      The update() doesn't modify the secondary key field so it only writes
      REPLACE{1, 10} to the secondary index (actually it writes DELETE{1, 10}
      too, but it gets overwritten by the REPLACE). However, the REPLACE has a
      column mask that says that update() does modify the key field, because a
      column mask is generated solely from update operations, before applying
      them. As a result, the write iterator will not skip this REPLACE on
      dump. This won't have any serious consequences, because this is a mere
      optimization. What is worse, the write iterator will also turn the
      REPLACE into an INSERT, which is absolutely wrong as the REPLACE is
      preceded by INSERT{1, 10}. If the tuple gets deleted, the DELETE
      statement and the INSERT created by the write iterator from the REPLACE
      will get annihilated, leaving the old INSERT{1, 10} visible.
      
      The issue may result in invalid select() output as demonstrated in the
      issue description. It may also result in crashes, because the tuple
      cache is very sensitive to invalid select() output.
      
      To fix this issue let's clear key bits in the column mask if we detect
      that an update() doesn't actually update secondary key fields although
      the column mask says it does.
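
      The column mask logic and the proposed fix can be sketched in C as
      follows. This is an illustrative model only: tuples are reduced to
      arrays of integers, field numbers are 0-based, and all function names
      are hypothetical rather than taken from the vinyl sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a mask with bit i set if an update operation targets field i. */
static uint64_t
column_mask_from_ops(const int *fieldnos, int op_count)
{
	uint64_t mask = 0;
	for (int i = 0; i < op_count; i++) {
		assert(fieldnos[i] >= 0 && fieldnos[i] < 64);
		mask |= UINT64_C(1) << fieldnos[i];
	}
	return mask;
}

/* True if the mask claims that at least one key field is updated. */
static bool
key_update_claimed(uint64_t column_mask, uint64_t key_mask)
{
	return (column_mask & key_mask) != 0;
}

/*
 * The fix in miniature: compare the key fields of the old and the new
 * tuple and clear the key bits if none of them really changed.
 */
static uint64_t
column_mask_fix(uint64_t column_mask, uint64_t key_mask,
		const int *old_tuple, const int *new_tuple, int field_count)
{
	for (int i = 0; i < field_count && i < 64; i++) {
		bool is_key = (key_mask >> i) & 1;
		if (is_key && old_tuple[i] != new_tuple[i])
			return column_mask; /* a key field really changed */
	}
	return column_mask & ~key_mask;
}
```

      For the example above, the operation {'=', 2, 10} targets 0-based
      field 1, so the raw mask intersects the secondary key mask even though
      the tuple stays {1, 10}. After column_mask_fix() the key bits are
      cleared and the REPLACE is no longer treated as a key update.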
      
      Closes #3607
  13. Aug 08, 2018
    • test: fix box/errinj.test.lua sporadic failure · 8c06a069
      Mergen Imeev authored
      In some cases the box.snapshot() operation takes longer than expected.
      This leads to situations when the previous error is reported instead
      of the new one. Now these errors are completely separated.
      
      Closes #3599
  14. Aug 07, 2018
  15. Aug 03, 2018
  16. Aug 02, 2018
  17. Jul 26, 2018
  18. Jul 22, 2018
    • replication: unregister replica with gc if deleted from cluster · ea28a925
      Vladimir Davydov authored
      When a replica is removed from the cluster table, the corresponding
      replica struct isn't destroyed unless both the relay and the applier
      attached to it are stopped, see replica_clear_id(). Since replica struct
      is a holder of the garbage collection state, this means that in case an
      evicted replica has an applier or a relay that fails to exit for some
      reason, garbage collection will hang.
      
      A relay thread stops as soon as the replica it was started for receives
      a row that tries to delete it from the cluster table (because this isn't
      allowed by the cluster space trigger, see on_replace_dd_cluster()).
      If a replica isn't running, the corresponding relay can't run as well,
      because writing to a closed socket isn't allowed. That said, a relay
      can't block garbage collection.
      
      An applier, however, is deleted only when replication is reconfigured.
      So if a replica that was evicted from the cluster was configured as a
      master, its replica struct will hang around blocking garbage collection
      for as long as the replica remains in box.cfg.replication. This is what
      happens in #3546.
      
      Fix this issue by forcefully unregistering a replica with the garbage
      collector when it is deleted from the cluster table. This is OK as it
      won't be able to resubscribe and so we don't need to keep WALs for it
      any longer. Note, the relay thread may still be running when a replica
      is deleted from the cluster table, in which case we can't unregister it
      with the garbage collector right away, because the relay may need to
      access the garbage collection state. In such a case, leave the job to
      replica_clear_relay, which is called as soon as the relay thread exits.
      
      Closes #3546
  19. Jul 19, 2018
  20. Jul 17, 2018
    • net.box: fix invalid index:count() with iterator · 25b9f0f0
      Kirill Shcherbatov authored
      net.box didn't pass options containing an iterator to the
      server side.
      There were also invalid results for two :count tests in the
      net.box.result file.
      
      Thanks @ademenev for reporting the problem and helping to
      locate it.
      
      Closes #3262.
  21. Jul 16, 2018
  22. Jul 13, 2018
  23. Jul 12, 2018
    • third-party: update libyaml submodule · aeabe633
      Kirill Shcherbatov authored
      The tests need to be updated because of a fixup in the upstream
      commit baf636a74b4b6d055d93e2d01366d6097eb82d90
      Author: Tina Müller <cpan2@tinita.de>
      Date:   Thu Jun 14 19:27:04 2018 +0200
      
      The closing single quote needs to be indented...
      if it's on its own line.
      
      Closes #3275.