  1. Dec 06, 2018
    • Sergei Voronezhskii's avatar
      test: replication parallel mode on · f5c8b825
      Sergei Voronezhskii authored
      Part of #2436, #3232
      f5c8b825
    • Sergei Voronezhskii's avatar
      test: use wait_cond to check follow status · f41548b7
      Sergei Voronezhskii authored
      After setting timeouts in `box.cfg` and before making a `replace`, the
      test needs to wait until the replicas reach `follow` status. Previously,
      if `wait_follow()` found a status other than `follow`, it returned true,
      which immediately caused an error.
      
      Fixes #3734
      Part of #2436, #3232
      f41548b7
    • Sergei Voronezhskii's avatar
      test: put require in proper places · d2f28afa
      Sergei Voronezhskii authored
      * put `require('fiber')` after each switch server command, because
        the test sometimes failed with a 'fiber' not defined error
      * use `require('fio')` after `require('test_run').new()`, because
        the test sometimes failed with a 'fio' not defined error
      
      Part of #2436, #3232
      d2f28afa
    • Sergei Voronezhskii's avatar
      test: errinj for pause relay_send · 1c34c91f
      Sergei Voronezhskii authored
      Instead of using a timeout, we just need to pause `relay_send`. We can't
      rely on a timeout because system load varies in parallel mode. Add a new
      errinj that checks a boolean in a loop and does not let `relay_send`
      proceed to the next statement while the boolean is `True`.

      To check read-only mode, the test needs to modify a tuple. Calling the
      `replace` method is enough; there is no need to call `delete` and then
      use `get` to verify that the tuple was not deleted.

      Also look up the xlog files in a loop with a short sleep until the file
      count is as expected.
      
      Update box/errinj.result because new errinj was added.
      
      Part of #2436, #3232
      1c34c91f
    • Sergei Voronezhskii's avatar
      test: cleanup replication tests · 848a0b03
      Sergei Voronezhskii authored
      - at the end of tests that create any replication config, call:
        * `test_run:cmd('delete server ...')`, which removes the server object
          from the `TestState.servers` list; this behaviour was taken
          from the `test_run:drop_cluster()` function
        * `test_run:cleanup_cluster()`, which clears `box.space._cluster`
      - switch on `use_unix_sockets` because of 'Address already in use'
        problems
      - the `once` test needs to clean up `once*` schemas
      
      Part of #2436, #3232
      848a0b03
    • Kirill Shcherbatov's avatar
      sql: fix tarantoolSqlite3TupleColumnFast · 2bfe8ac5
      Kirill Shcherbatov authored
      The tarantoolSqlite3TupleColumnFast routine could look up
      offset_slot in unallocated memory in some cases.
      The assert on exact_field_count, like the motivation given in
      7a8de281 for replacing the old, correct assert on field_count,
      is not correct:
      assert(format->exact_field_count == 0 ||
             fieldno < format->exact_field_count);
      The tarantoolSqlite3TupleColumnFast routine requires an offset_slot
      that was allocated during the tuple_format_create call. This
      value is stored in an indexed field whose index is limited by
      index_field_count, which is <= field_count. See
      tuple_format_alloc for more details.
      
      The format in the cursor that triggers the valid assertion has this
      structure because the first 4 tuples in _space (257, 272, 276 and
      280) have the old _space format with only one field
      (format->field_count == 1).
      This happens because these 4 tuples are recovered no later than the
      tuple with id 280, which stores the actual format of _space. After
      tuple 280 is recovered, the actual format is set in the struct space
      of _space, and all subsequent tuples have full-featured formats.
      
      So for these 4 tuples tarantoolSqlite3TupleColumnFast can fail
      even if a field exists, is indexed and has a name. Those
      features are only described in the newer format.
      (Thanks to Gerold103 for explaining the problem.)
      
      Closes #3772
      2bfe8ac5
    • Kirill Shcherbatov's avatar
      sql: fix parser.parse_only mode for triggers · ac73e345
      Kirill Shcherbatov authored
      Because the parse_only flag did not work correctly for SQL triggers,
      sql_trigger_compile leaked Vdbe memory.
      
      Closes #3838
      ac73e345
    • Kirill Shcherbatov's avatar
      box: fix checkpoint_delete · 3c8330ea
      Kirill Shcherbatov authored
      The rlist_foreach_entry iterator was used for freeing resources.
      As a result, the next step of the for-loop accessed memory that
      had already been freed.
      Replace it with rlist_foreach_entry_safe, which is valid for
      destructors.
      
      Closes #3858
      3c8330ea
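      The failure mode described above can be sketched in plain C. This is
      a minimal illustration, not Tarantool's rlist code: `struct node`,
      `node_push`, `node_list_destroy` and `node_list_demo` are hypothetical
      names, and a hand-written singly linked list stands in for the
      checkpoint list. The point is that the link to the next entry is read
      before the current entry is freed.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdlib.h>

      /* Hypothetical list entry; stands in for a checkpoint object. */
      struct node {
          struct node *next;
          int id;
      };

      /* Prepend a new entry and return the new head; abort on OOM. */
      static struct node *
      node_push(struct node *head, int id)
      {
          struct node *n = malloc(sizeof(*n));
          if (n == NULL)
              abort();
          n->next = head;
          n->id = id;
          return n;
      }

      /* Free every entry and return how many were freed. */
      static size_t
      node_list_destroy(struct node *head)
      {
          size_t freed = 0;
          struct node *next;
          for (struct node *cur = head; cur != NULL; cur = next) {
              /* Save the link first: cur must not be touched after free(). */
              next = cur->next;
              free(cur);
              freed++;
          }
          return freed;
      }

      /* Usage: build a three-entry list and destroy it. */
      static void
      node_list_demo(void)
      {
          struct node *head = NULL;
          for (int id = 1; id <= 3; id++)
              head = node_push(head, id);
          assert(node_list_destroy(head) == 3);
      }
      ```

      A loop that advanced with `cur = cur->next` after `free(cur)` would
      read freed memory, which is the kind of access the commit removes.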
  2. Dec 04, 2018
    • Vladislav Shpilevoy's avatar
      c6e5bf48
    • Vladislav Shpilevoy's avatar
      box: move info_handler interface into src/info · f1a114ca
      Vladislav Shpilevoy authored
      Box/info.h defines the info_handler interface with a set
      of virtual functions. It hides Lua from code that does
      not depend on this language, and is used in things
      like index:info() and box.info() to build a Lua table with
      some info. But it does not depend on box/, so move it
      to src/.
      
      Also, this API is needed for the forthcoming SWIM
      module which is going to be placed into src/lib and
      needs info to dump its state to Lua from C without
      strict Lua dependency.
      
      @locker:
       - remove pointless _GNU_SOURCE definition from
         box/lua/info.c
       - remove luaT_info_handler_create declaration from
         box/lua/info.h
      
      Needed for #3234
      f1a114ca
  3. Dec 03, 2018
    • Vladimir Davydov's avatar
      lua: getpwall/getgrall error handling - follow-up fixes · a1606e91
      Vladimir Davydov authored
       - Add the forgotten errno(0) to getgrall.
       - Throw errors from getgrall/getpwall instead of returning nil in
         case the underlying system function fails.
       - Fix the error message in getgr.
       - Remove pointless and confusing asterisk sign from error messages.
       - Do not hide a stack frame on error.
      
      Follow-up efccac69 ("lua: fix error handling in getpwall and
      getgrall").
      a1606e91
    • Alexander Turenko's avatar
      lua: fix error handling in getpwall and getgrall · efccac69
      Alexander Turenko authored
      This commit fixes app-tap/pwd.test.lua test. It seems that the problem
      appears after updating to glibc-2.28.
      
      The usual way to handle errors in Unix seems to be to check errno only
      when a return value indicates that an error may have occurred.
      
      Related to #3766.
      efccac69
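      The errno convention mentioned above can be shown with a small,
      deterministic sketch. It deliberately uses `strtol` rather than
      `getpwent`/`getgrent`, so it is not the code from this commit;
      `parse_long` is a hypothetical helper. errno is cleared before the
      call and consulted only when the return value signals a possible
      overflow.

      ```c
      #include <assert.h>
      #include <errno.h>
      #include <limits.h>
      #include <stdlib.h>

      /*
       * Parse a base-10 long. Returns 0 and stores the value in *out on
       * success, -1 on malformed input or overflow.
       */
      static int
      parse_long(const char *s, long *out)
      {
          char *end;
          assert(s != NULL && out != NULL);
          errno = 0; /* clear stale values left by earlier calls */
          long v = strtol(s, &end, 10);
          if (end == s || *end != '\0')
              return -1; /* no digits, or trailing garbage */
          /* LONG_MAX/LONG_MIN are valid results too, so only now ask errno. */
          if ((v == LONG_MAX || v == LONG_MIN) && errno == ERANGE)
              return -1;
          *out = v;
          return 0;
      }
      ```

      Checking errno unconditionally after a successful call would be
      fragile, because a library function may leave errno set even when
      it succeeds.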
    • Alexander Turenko's avatar
      Remove deprecated getaddrinfo() flags · b601d0be
      Alexander Turenko authored
      The AI_IDN_ALLOW_UNASSIGNED and AI_IDN_USE_STD3_ASCII_RULES flags are
      deprecated by glibc-2.28, and the deprecation warnings caused the
      Debug build to fail because of -Werror.
      
      Fixes #3766.
      b601d0be
    • Vladislav Shpilevoy's avatar
      box: move port to src/ · 1730b39a
      Vladislav Shpilevoy authored
      Basic port structure does not depend on anything but
      standard types. It just gives an interface and calls
      virtual functions.
      
      Its location in box/ was ok since it was not used
      anywhere in src/. But the next commits will add a new
      method to mpstream to dump a port. Mpstream is
      implemented in src/, so let's move port over here.
      
      Needed for #3505
      1730b39a
    • Alexander Turenko's avatar
      test: fix app/fiber.test.lua flaky fails · 0e19478c
      Alexander Turenko authored
      Fixes #3852.
      0e19478c
    • Alexander Turenko's avatar
      test: fix hardcoded port in box/net.box.test.lua · f36568c0
      Alexander Turenko authored
      This makes it possible to run the test many times in parallel to
      investigate flaky failures, and reduces the probability that the test
      fails because the port is already in use by, say, some other test.
      f36568c0
    • Alexander Turenko's avatar
      test: fix http_client.test.lua with curl-7.62 · 10518cc1
      Alexander Turenko authored
      curl-7.61.1
      
      ```
      tarantool> require('http.client').new():get('http://localhost:0')
      ---
      - status: 595
        reason: Couldn't connect to server
      ```
      
      curl-7.62
      
      ```
      tarantool> require('http.client').new():get('http://localhost:0')
      ---
      - error: 'curl: URL using bad/illegal format or missing URL'
      ...
      ```
      
      curl-7.62 returns CURLE_URL_MALFORMAT in case of a zero port, and
      tarantool raises an error in that case. I think this behaviour is valid,
      so I fixed the test.
      10518cc1
  4. Nov 29, 2018
    • Vladimir Davydov's avatar
      gc: run garbage collection in background · 07191842
      Vladimir Davydov authored
      Currently, garbage collection is executed synchronously by functions
      that may trigger it, such as gc_consumer_advance or gc_add_checkpoint.
      As a result, one has to be very cautious when using those functions as
      they may yield at their will. For example, we can't shoot off stale
      consumers right in tx_prio handler - we have to use rather clumsy WAL
      watcher interface instead. Besides, in future, when the garbage
      collector state is persisted, we will need to call those functions from
      on_commit trigger callback, where yielding is not normally allowed.
      
      Actually, there's no reason to remove old files synchronously - we could
      as well do it in the background. So this patch introduces a background
      garbage collection fiber that executes gc_run when woken up. Now all
      functions that might trigger garbage collection wake up this fiber
      instead of executing gc_run directly.
      07191842
    • Vladimir Davydov's avatar
      recovery: restore garbage collector vclock after restart · baf28a59
      Vladimir Davydov authored
      After restart the garbage collector vclock is reset to the vclock of the
      oldest preserved checkpoint, which is incorrect - it may be less in case
      there is a replica that lagged behind, and it may be greater as well in
      case the WAL thread hit ENOSPC and had to remove some WAL files to
      continue. Fix it.
      
      A note about xlog/panic_on_wal_error test. To check that replication
      stops if some xlogs are missing, the test first removes xlogs on the
      master, then restarts the master, then tries to start the replica
      expecting that replication should fail. Well, it shouldn't - the replica
      should rebootstrap instead. It didn't rebootstrap before this patch
      though, because the master reported wrong garbage collector vclock (as
      it didn't recover it on restart). After this patch the replica would
      rebootstrap and the test would hang. Fix this by restarting the master
      before removing xlog files.
      baf28a59
    • Vladimir Davydov's avatar
      wal: remove files needed for recovery from backup checkpoints on ENOSPC · bd7f7116
      Vladimir Davydov authored
      Tarantool always keeps box.cfg.checkpoint_count latest checkpoints. It
      also never deletes WAL files needed for recovery from any of them for
      the sake of redundancy, even if it gets ENOSPC while trying to write to
      WAL. This patch changes that behavior: now the WAL thread is allowed to
      delete backup WAL files in case of emergency ENOSPC - after all it's
      better than stopping operation.
      
      Closes #3822
      bd7f7116
    • Vladimir Davydov's avatar
      wal: separate checkpoint and flush paths · 74d8db74
      Vladimir Davydov authored
      Currently, wal_checkpoint() is used for two purposes. First, to make a
      checkpoint (rotate = true). Second, to flush all pending WAL requests
      (rotate = false). Since checkpointing has to fail if a cascading rollback
      is in progress, so does flushing. This is confusing. Let's separate the
      two paths.
      
      While we are at it, let's also rewrite WAL checkpointing using cbus_call
      instead of cpipe_push as it's a more convenient way of exchanging simple
      two-hop messages between two threads.
      74d8db74
    • Kirill Shcherbatov's avatar
      json: some renames · b56103f5
      Kirill Shcherbatov authored
      We are planning to link json_path_node objects in a tree and attach some
      extra information to them so that they could be used to describe a json
      document structure. Let's rename it to json_token as it sounds more
      appropriate for the purpose.
      
      Also, rename json_path_parser to json_lexer as it isn't a parser,
      really, it's rather a tokenizer or lexer. Besides, the new name is
      shorter.
      
      Needed for #1012
      b56103f5
    • Vladimir Davydov's avatar
      test: fix vinyl/errinj spurious failure · 8e13153b
      Vladimir Davydov authored
      The failing test case checks that modifications done to the space during
      the final dump of a newly built index are recovered properly. It assumes
      that a series of operations will complete in 0.1 seconds, but it may not
      happen if the disk is slow (like on Travis CI). This results in spurious
      failures. To fix this issue, let's replace ERRINJ_VY_RUN_WRITE_TIMEOUT
      used by the test with ERRINJ_VY_RUN_WRITE_DELAY, which blocks index
      creation until it is disabled instead of injecting a time delay as its
      predecessor did.
      
      Closes #3756
      8e13153b
    • Konstantin Osipov's avatar
      Don't repeat SQL stress tests with vinyl engine. · 6e07131d
      Konstantin Osipov authored
      These tests stress some of the parser/vdbe features; there is no point
      in replaying them against vinyl. They could just as well run in
      wal_mode="none"
      6e07131d
    • Konstantin Osipov's avatar
      Disable gh-3332-tuple-format-leak.test, gh-3083-ephemeral-unref-tuples.test · 52a212f3
      Konstantin Osipov authored
      Disable these tests in regular suite until they are sped up in scope
      of gh-3845
      52a212f3
    • Ilya Markov's avatar
      lua: moving lua error functions to separate file · 27a04953
      Ilya Markov authored
      Refactoring. Move lua error functions to a separate file.
      
      A prerequisite for #677
      27a04953
    • Sergei Voronezhskii's avatar
      test: skip test backtrace if no libunwind support · 2aa25ba5
      Sergei Voronezhskii authored
      Closes #3824
      2aa25ba5
    • Mergen Imeev's avatar
      iproto: remove iproto functions from execute.c · 474bdf36
      Mergen Imeev authored
      To make functions in execute.h more universal we should reduce
      their dependence on IPROTO. This patch removes IPROTO functions
      from execute.c.
      
      Needed for #3505
      474bdf36
    • Mergen Imeev's avatar
      box: add method dump_lua to port · 6ecd7ee1
      Mergen Imeev authored
      The new dump_lua method dumps the tuples saved in a port to the Lua
      stack. It will allow us to call this method without any other
      interaction with the port.
      
      Needed for #3505
      6ecd7ee1
    • Kirill Shcherbatov's avatar
      box: store sql text and length in sql_request · bc9e41e9
      Kirill Shcherbatov authored
      Refactored the sql_request structure to store a pointer to the sql
      string data and its length instead of a pointer to the msgpack
      representation.
      This is required to use this structure in sql.c, where the query
      has different semantics and can be obtained from the stack as a C
      string.
      
      Needed for #3505.
      bc9e41e9
  5. Nov 28, 2018
  6. Nov 27, 2018
    • Sergei Voronezhskii's avatar
      test: enable parallel mode for wal_off tests · d837c94b
      Sergei Voronezhskii authored
      - Box configuration parameter `memtx_memory` is increased, because the
        test `lua` after `tuple` failed with the error:
        `Failed to allocate 368569 bytes in slab allocator for memtx_tuple`
        despite `collectgarbage('collect')` calls after cases with huge/many
        tuples.
        The statistics before the allocation failure give the following values:
        ```
        box.slab.info()
        ---
        - items_size: 72786472
          items_used_ratio: 4.43%
          quota_size: 107374592
          quota_used_ratio: 93.75%
          arena_used_ratio: 6.1%
          items_used: 3222376
          quota_used: 100663296
          arena_size: 100663296
          arena_used: 6105960
        ```
        The failure seems to be caused by slab memory fragmentation. It is
        not clear for now whether we should consider this a tarantool
        issue.
      
      - Test `snapshot_stress` counts the snapshot files present in the
        working directory and can reach the default 'checkpoint_count' value
        `2` if a previous test has already written its snapshots.
      
      - Restarting the default server w/o cleaning a working directory
        can leave a snapshot that holds a state saved at the middle of a test,
        before dropping of the space 'tweedledum' (because WAL is disabled),
        that can cause the error `Space 'tweedledum' already exists` for a
        following test.
      
      - Use unix sockets because of errors `Address already in use`.
      
      Part of #2436
      d837c94b
    • Mergen Imeev's avatar
      sql: remove fiber_gc() from sqlite3VdbeHalt() · e3d931e0
      Mergen Imeev authored
      Too many autogenerated ids led to a SEGFAULT. This problem
      appeared because region was cleaned twice: once in
      sqlite3VdbeHalt() and once in sqlite3VdbeDelete() which was
      executed during sqlite3_finalize(). Autogenerated ids that were
      saved there, were fetched after sqlite3VdbeHalt() and before
      sqlite3_finalize(). In this patch region cleaning in
      sqlite3VdbeHalt() has been removed.
      
      Follow up #2618
      Follow up #3199
      e3d931e0
    • Mergen Imeev's avatar
      sql: decode ARRAY and MAP types after SELECT · 135de5b5
      Mergen Imeev authored
      Before this patch MSGPACK received using SELECT statement through
      net.box was unpacked. Fixed in this patch.
      135de5b5
    • Serge Petrenko's avatar
      sql: fix error handling in sql_analysis_load() · fee95cf7
      Serge Petrenko authored
      Previously, if an error occurred in box_index_len() called from
      sql_analysis_load(), the return code (-1 on error) was cast to uint32_t
      and used later as the size of memory to allocate. This led to assertion
      failures in slab_order() since the allocation size was too big. This was
      discovered during investigation of #3779.
      Fix error handling and add some error logging.
      
      Follow-up #3779
      fee95cf7
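      The cast problem described above can be reproduced in isolation. The
      sketch below is an assumption-laden illustration, not the patched
      code: `index_len_stub` is a hypothetical stand-in for a call that
      returns a length or -1, and `sample_count` is a hypothetical caller.
      It shows that -1 becomes UINT32_MAX when cast, and that checking the
      sign first avoids using it as a size.

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <sys/types.h> /* ssize_t */

      /* Hypothetical stand-in for a call that returns a length or -1. */
      static ssize_t
      index_len_stub(int fail)
      {
          return fail ? -1 : 3;
      }

      /*
       * Store the number of entries to allocate in *count.
       * Returns 0 on success, -1 if the length could not be obtained;
       * *count is left untouched on error.
       */
      static int
      sample_count(int fail, uint32_t *count)
      {
          assert(count != NULL);
          ssize_t len = index_len_stub(fail);
          if (len < 0)
              return -1; /* propagate the error before any cast */
          *count = (uint32_t)len;
          return 0;
      }
      ```

      Without the `len < 0` check, the error value would be converted to
      the largest 32-bit size and passed on to the allocator.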
    • Vladimir Davydov's avatar
      box: use replicaset.vclock in replica join/subscribe · f50f0b29
      Vladimir Davydov authored
      Again, this is something that was introduced by commit f2bccc18
      ("Use WAL vclock instead of TX vclock in most places") without any
      justification.
      
      TX has its own copy of the current vclock - there's absolutely no need
      to inquire it from the WAL thread. Actually, we already use TX local
      vclock in box_process_vote(). No reason to treat join/subscribe any
      different. Moreover, it's even harmful - there may be a gap at the end
      of a WAL file, in which case WAL vclock will be slightly ahead of TX
      vclock so that should a replica try to subscribe it would never finish
      syncing, see #3830.
      
      Closes #3830
      f50f0b29
    • Vladimir Davydov's avatar
      box: do not rotate WAL when replica subscribes · 7439529d
      Vladimir Davydov authored
      Because this is pointless and confusing. This "feature" was silently
      introduced by commit f2bccc18 ("Use WAL vclock instead of TX vclock
      in most places"). Let's revert this change. This will allow us to
      clearly separate WAL checkpointing from WAL flushing, which will in turn
      facilitate implementation of the checkpoint-on-WAL-threshold feature.
      
      There are two problems here, however. First, not rotating the log breaks
      expectations of replication/gc test: an xlog file doesn't get deleted in
      time as a consequence. This happens, because we don't delete xlogs
      relayed to a replica after join stage is complete - we only do it during
      subscribe stage - and if we don't rotate WAL on subscribe the garbage
      collector won't be invoked. This is actually a bug - we should advance
      the WAL consumer associated with a replica once join stage is complete.
      This patch fixes it, but it unveils another problem - this time in the
      WAL garbage collection procedure.
      
      Turns out, when passed a vclock, the WAL garbage collection procedure
      removes all WAL files that were created before the vclock. Apparently,
      this isn't quite correct - if a consumer is in the middle of a WAL file,
      we must not delete the WAL file, but we do. This works as long as
      consumers never track vclocks inside WAL files - currently they are
      advanced only when a WAL file is closed and naturally they are advanced
      to the beginning of the next WAL file. However, if we want to advance
      the consumer associated with a replica when join stage ends (this is
      what the previous paragraph is about), it might occur that we will
      advance it to the middle of a WAL file. If that happens the WAL garbage
      collector might remove a file which is actually in use by a replica.
      Fix this as well.
      7439529d
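      The garbage collection rule discussed above can be sketched with
      scalar positions instead of vclocks. This is a simplification under
      stated assumptions, not the WAL code: `wal_files_removable` is a
      hypothetical name, and each file is identified only by the position
      at which it starts. A file is dropped only when the next file starts
      at or before the consumer's position, so a consumer in the middle of
      a file keeps that file.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>

      /*
       * starts[i] is the position at which WAL file i begins; the array is
       * sorted in ascending order. File i holds everything between
       * starts[i] and starts[i + 1]; the last file is still open.
       * Returns how many of the oldest files a consumer at position pos
       * no longer needs.
       */
      static size_t
      wal_files_removable(const int64_t *starts, size_t count, int64_t pos)
      {
          assert(starts != NULL || count == 0);
          size_t n = 0;
          /* File n can go only if the next file starts at or before pos. */
          while (n + 1 < count && starts[n + 1] <= pos)
              n++;
          return n;
      }
      ```

      With files starting at 0, 10 and 20 and a consumer at 15, only the
      first file is removable. A rule that removed every file created
      before position 15 would also delete the file starting at 10, which
      the consumer is still reading.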
    • Vladimir Davydov's avatar
      engine: pass vclock instead of lsn to collect_garbage callback · ca1eb666
      Vladimir Davydov authored
      First, this is consistent with other engine callbacks, such as
      checkpoint or backup.
      
      Second, a vclock can be used as a search key in a vclock set,
      which in turn can make code more straightforward, e.g. look how
      this patch simplifies vy_log_prev_checkpoint().
      ca1eb666
    • Vladimir Davydov's avatar
      Update small submodule · 6bc47d90
      Vladimir Davydov authored
      In the updated version rb_proto/rb_gen use the const qualifier for the
      key argument, which allows passing pointers to const objects to search
      methods.
      6bc47d90
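      The effect of a const-qualified key can be illustrated without the
      rb_proto/rb_gen macros. The sketch below is not code from the small
      library: `find_int` is a hypothetical linear search whose key
      parameter is a pointer to const, so a caller can pass the address of
      a const object without a cast.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Hypothetical search routine; the key is only read, so it is const. */
      static const int *
      find_int(const int *items, size_t count, const int *key)
      {
          assert(key != NULL);
          for (size_t i = 0; i < count; i++) {
              if (items[i] == *key)
                  return &items[i];
          }
          return NULL;
      }
      ```

      If the key parameter were declared as `int *`, passing `&key` for a
      `const int key` would discard the qualifier and draw a compiler
      diagnostic.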