Skip to content
Snippets Groups Projects
  1. Jun 28, 2018
    • Ilya Markov's avatar
      http: Fix parse long headers names · 3d121dd4
      Ilya Markov authored
      Bug: During parsing http headers, long headers names are truncated
      to zero length, but values are not ignored.
      
      Fix this with adding parameter  max_header_name_length to http request.
      If header name is bigger than this value, header name is truncated to
      this length. Default value of max_header_name_length is 32.
      
      Do some refactoring with renaming long names in http_parser.
      
      Closes #3451
      3d121dd4
    • Ilya Markov's avatar
      http: Remove parsed status line from headers · 139aa814
      Ilya Markov authored
      Bug: Header parser validates http status line and besides saving http
      status, saves valid characters to header name, which is wrong.
      
      Fix this with skipping status line after validation without saving it as
      a header.
      
      In scope of #3451
      139aa814
    • Vladimir Davydov's avatar
      xdir: remove inprogress files after restart · f41aac61
      Vladimir Davydov authored
      If tarantool is stopped while writing a snapshot or a vinyl run file,
      inprogress files will never be removed. Fix this by collecting those
      files on recovery completion.
      
      Original patch by @IlyaMarkovMipt. Reworked by @locker.
      
      Closes #3406
      f41aac61
    • Ilya Markov's avatar
      xdir: change log messages in gc functions · 93a50580
      Ilya Markov authored
      In order to log only about files that were actually removed change log
      messages from "removing <name of file>" to "removed <name of file>" in
      vy_run_remove_files and xdir_collect_garbage functions.
      
      Needed for #3406
      93a50580
    • LapaevPavel's avatar
      cdc454b8
    • Konstantin Osipov's avatar
      test: update test results · aaa9bdbe
      Konstantin Osipov authored
      A minor follow up on the fix for gh-3452 (http.client timeout bug)
      aaa9bdbe
    • Ilya Markov's avatar
      http.client: Fix waiting after received result · 7dcc8b42
      Ilya Markov authored
      Current implementation of http.client relies on fiber_cond which is set
      after the request was registered and doesn't consider the fact that
      response may be handled before the set of fiber_cond.
      
      So we may have the following situation:
      1. Register request in libcurl(curl_multi_add_handle in curl_execute).
      2. Receive and process response, fiber_cond_signal on cond_var which no
      one waits.
      3. fiber_cond_wait on cond which is already signaled. Wait until timeout
      is fired.
      
      In this case user have to wait timeout, though data was received
      earlier.
      
      Fix this with adding extra flag in_progress to curl_request struct.
      Set this flag true before registering request in libcurl and set it
      false when request is finished before fiber_cond_signal.
      When in_progress flag is false, don't wait on cond variable.
      
      Add 1 error injection.
      
      Closes #3452
      7dcc8b42
  2. Jun 27, 2018
  3. Jun 25, 2018
    • Vladimir Davydov's avatar
      socket: fix race between unix tcp server stop and start · 80d379ee
      Vladimir Davydov authored
      If called on a unix socket, bind(2) creates a new file, see unix(7).
      When we stop a unix tcp server, we should remove that file. Currently,
      we do it from the tcp server fiber, after the server loop is broken,
      which happens when the socket is closed, see tcp_server_loop(). This
      opens a time window for another tcp server to reuse the same path:
      
          main fiber                  tcp server loop
          ----------                  ---------------
      
          -- Start a tcp server.
          s = socket.tcp_server('unix/', sock_path, ...)
          -- Stop the server.
          s:close()
      
                                      socket_readable? => no, break loop
      
          -- Start a new tcp server. Use the same path as before.
          -- This function succeeds, because the socket is closed
          -- so tcp_server_bind_addr() will clean up by itself.
          s = socket.tcp_server('unix/', sock_path, ...)
      
           tcp_server_bind
            tcp_server_bind_addr
             socket_bind => EADDRINUSE
             tcp_connect => ECONNREFUSED
             -- Remove dead unix socket.
             fio.unlink(addr.port)
             socket_bind => success
      
                                      -- Deletes unix socket used
                                      -- by the new server.
                                      fio.unlink(addr.port)
      
      In particular, the race results in sporadic failures of app-tap/console
      test, which restarts a tcp server using the same file path.
      
      To fix this issue, let's close the socket after removing the socket
      file. This is absolutely legit on any UNIX system, and this eliminates
      the race shown above, because a new server that tries to bind on the
      same path as the one already used by a dying server will not receive
      ECONNREFUSED until the socket fd is closed and hence the file is
      removed.
      
      A note about the app-tap/console test. After this patch is applied,
      socket.close() takes a little longer for unix tcp server, because it
      yields twice, once for removing the socket file and once for closing the
      socket file descriptor. As a result, on_disconnect() trigger left from
      the previous test case has time to run after session.type() check.
      Actually, those triggers have already been tested and we should have
      cleared them before proceeding to the next test case. So instead of
      adding two new on_disconnect checks to the test plan, let's clear the
      triggers before session.type() test case and remove 3 on_connect and 5
      on_auth checks from the test plan.
      
      Closes #3168
      80d379ee
    • Vladislav Shpilevoy's avatar
      iproto: protect from false-correct size in msg header · c6951c92
      Vladislav Shpilevoy authored
      Consider this packet:
      
          msgpack = require('msgpack')
          data = msgpack.encode(18400000000000000000)..'aaaaaaa'
      
      Tarantool interprets 18400000000000000000 as size of a coming
      iproto request, and tries with no any checks to allocate buffer
      of such size. It calculates needed capacity like this:
      
          capacity = start_value;
          while (capacity < size)
              capacity *= 2;
      
      Here it is possible that on i-th iteration 'capacity' < 'size',
      but 'capacity * 2' overflows 64 bits and becomes < 'size' again,
      so this loop never ends and occupies 100% CPU.
      
      Strictly speaking overflow has undefined behavior. On the
      original system it led to nullifying 'capacity'.
      
      Such size is improbable as a real packet gabarits, but can appear
      as a result of parsing of some invalid packet, first bytes of
      which accidentally appears to be valid MessagePack uint. This is
      how the bug emerged on the real system.
      
      Lets restrict the maximal packet size to 2GB.
      
      Closes #3464
      c6951c92
  4. Jun 14, 2018
  5. Jun 08, 2018
    • Alexander Turenko's avatar
      debian: don't install systemd service file twice · e38d2762
      Alexander Turenko authored
      It fixes the following errors during tarantool installation from
      packages on debian / ubuntu:
      
      ```
      Unpacking tarantool (1.9.1.23.gacbd91c-1) ...
      dpkg: error processing archive /var/cache/apt/archives/tarantool_1.9.1.23.gacbd91c-1_amd64.deb (--unpack):
       trying to overwrite '/lib/systemd/system/tarantool.service', which is also in package tarantool-common 1.9.1.23.gacbd91c-1
      ```
      
      The problem is that tarantool.service file was shipped with
      tarantool-common and tarantool packages both. It is the regression after
      8925b862.
      
      The way to avoid installing / enabling the service file within tarantool
      package is to pass `--name` option to dh_systemd_enable, but do not pass
      the service file name. In that case dh_systemd_enable does not found the
      service file and does not enforce existence of the file.
      
      Hope there is less hacky way to do so, but I don't found one at the
      moment.
      Unverified
      e38d2762
    • Georgy Kirichenko's avatar
      Fix libunwind segfault · 5c3b3001
      Georgy Kirichenko authored
      Use volatile asm modifier to prevent unwanted and awkward optimizations
      causing segfault while backtracing
      5c3b3001
  6. Jun 07, 2018
  7. Jun 02, 2018
  8. Jun 01, 2018
    • Vladimir Davydov's avatar
      vinyl: fix compaction vs checkpoint race resulting in invalid gc · b25e3168
      Vladimir Davydov authored
      The callback invoked upon compaction completion uses checkpoint_last()
      to determine whether compacted runs may be deleted: if the max LSN
      stored in a compacted run (run->dump_lsn) is greater than the LSN of the
      last checkpoint (gc_lsn) then the run doesn't belong to the last
      checkpoint and hence is safe to delete, see commit 35db70fa ("vinyl:
      remove runs not referenced by any checkpoint immediately").
      
      The problem is checkpoint_last() isn't synced with vylog rotation - it
      returns the signature of the last successfully created memtx snapshot
      and is updated in memtx_engine_commit_checkpoint() after vylog is
      rotated. If a compaction task completes after vylog is rotated but
      before snap file is renamed, it will assume that compacted runs do not
      belong to the last checkpoint, although they do (as they have been
      appended to the rotated vylog), and delete them.
      
      To eliminate this race, let's use vylog signature instead of snap
      signature in vy_task_compact_complete().
      
      Closes #3437
      b25e3168
  9. May 31, 2018
    • Vladimir Davydov's avatar
      vinyl: fix false-positive assertion at exit · ff02157f
      Vladimir Davydov authored
      latch_destroy() and fiber_cond_destroy() are basically no-op. All they
      do is check that latch/cond is not used. When a global latch or cond
      object is destroyed at exit, it may still have users and this is OK as
      we don't stop fibers at exit. In vinyl this results in the following
      false-positive assertion failures:
      
        src/latch.h:81: latch_destroy: Assertion `l->owner == NULL' failed.
      
        src/fiber_cond.c:49: fiber_cond_destroy: Assertion `rlist_empty(&c->waiters)' failed.
      
      Remove "destruction" of vy_log::latch to suppress the first one. Wake up
      all fibers waiting on vy_quota::cond before destruction to suppress the
      second one. Add some test cases.
      
      Closes #3412
      ff02157f
  10. May 29, 2018
  11. May 25, 2018
  12. May 24, 2018
    • Georgy Kirichenko's avatar
      replication: add strict ordering for appliers operating in a full mesh · edd76a2a
      Georgy Kirichenko authored
      In some cases when an applier processing yielded, other applier might
      start some conflicting operation and break replication and database
      consistency.
      Now applier locks a per-server-id latch before processing a transaction.
      This guarantees that there is only one applier request for each server
      in progress at each given moment.
      
      The problem was very rare until full mesh topologies in vinyl
      became a commonplace.
      
      Fixes gh-3339
      edd76a2a
  13. May 22, 2018
  14. May 17, 2018
  15. May 15, 2018
    • Vladimir Davydov's avatar
      test: improve vinyl/select_consistency · 47fe6ced
      Vladimir Davydov authored
      Improve the test by decreasing range_size so that it creates a lot of
      ranges for test indexes, not just one. This helped find bugs causing
      the crash described in #3393.
      
      Follow-up #3393
      47fe6ced
    • Vladimir Davydov's avatar
      vinyl: do not panic if secondary index is inconsistent with primary · 1558c538
      Vladimir Davydov authored
      Although the bug in vy_task_dump_complete() due to which a tuple could
      be lost during dump was fixed, there still may be affected deployments
      as the bug was persisted on disk. To avoid occasional crashes on such
      deployments, let's make vinyl_iterator_secondary_next() skip tuples that
      are present in a secondary index but missing in the primary.
      
      Closes #3393
      1558c538
    • Vladimir Davydov's avatar
      vinyl: fix lost key on dump completion · 1f0023ad
      Vladimir Davydov authored
      vy_task_dump_complete() creates a slice per each range overlapping with
      the newly written run. It uses vy_range_tree_psearch(min_key) to find
      the first overlapping range and nsearch(max_key) to find the range
      immediately following the last overlapping range. This is incorrect as
      nsearch rb tree method returns the element matching the search key if it
      is present in the tree. That is, if the max key written to a run turns
      out to be equal the beginning of a range, the slice won't be created for
      it and it will be silently and persistently lost.
      
      The issue manifests itself as crash in vinyl_iterator_secondary_next(),
      when we fail to find the tuple in the primary index corresponding to a
      statement found in a secondary index.
      
      Part of #3393
      1f0023ad
    • Vladimir Davydov's avatar
      vinyl: fix EQ check in run iterator · 7ee79a0a
      Vladimir Davydov authored
      vy_run_iterator_seek() is supposed to check that the resulting statement
      matches the search key in case of ITER_EQ, but if the search key lies at
      the beginning of the slice, it doesn't. As a result, vy_point_lookup()
      may fail to find an existing tuple as demonstrated below.
      
      Suppose we are looking for key {10} in the primary index which consists
      of an empty mem and two runs:
      
          run 1: DELETE{15}
          run 2: INSERT{10}
      
      vy_run_iterator_next() returns DELETE{15} for run 1 because of the
      missing EQ check and vy_point_lookup() stops at run 1 (since the
      terminal statement is found) and mistakenly returns NULL.
      
      The issue manifests itself as crash in vinyl_iterator_secondary_next(),
      when we fail to find the tuple in the primary index corresponding to a
      statement found in a secondary index.
      
      Part of #3393
      7ee79a0a
    • Alexander Turenko's avatar
      Add test case for fiber safety of digest.pbkdf2 · ec9ec946
      Alexander Turenko authored
      Follows up #3396.
      Unverified
      ec9ec946
  16. May 14, 2018
  17. May 08, 2018
    • Ilya Markov's avatar
      socket: Fix socket test · 2b973c05
      Ilya Markov authored
      In sequential launch of app-tap/console.test, tests failed with "User
      exists" and binding errors.
      
      Make sockets path relative.
      Add users cleanup.
      
      Relates #3168
      2b973c05
Loading