  1. May 30, 2019
• vinyl: lookup key in reader thread · b9072317
      Vladimir Davydov authored
If a key isn't found in the tuple cache, we fetch it from a run file.
In this case the disk read and page decompression are done by a reader
thread; however, the key lookup in the fetched page is still performed
by the tx thread. Since pages are immutable, the lookup could just as
well be done by the reader thread, which would allow us to save some
precious CPU cycles for tx.
      
      Close #4257
      
      (cherry picked from commit 04b19ac1)
• vinyl: do not allow to cancel a fiber reading a page · 1205c92e
      Vladimir Davydov authored
To handle fiber cancellation during page read we need to pin all
objects referenced by vy_page_read_task. Currently, there's only one
such object, vy_run. It has reference counting, so pinning it is
trivial. However, to move page lookup to a reader thread, we also need
to reference the key def, tuple format, and key. The format and key
have reference counting, but the key def doesn't - we typically copy
it, and copying it in this case is too heavy.
      
Actually, cancelling a fiber manually or on timeout while it's reading
from disk doesn't make much sense with PCIe-attached flash drives. It
used to be reasonable with rotating disks, since a rotating disk
controller could retry reading a block indefinitely on read failure,
and it is still relevant to Network Attached Storage. On the other
hand, NAS has never been tested, and what isn't tested can and should
be removed. For complex SQL queries we'll be forced to rethink timeout
handling anyway.
      
      That being said, let's simply drop this functionality.
      
      (cherry picked from commit bab04b25)
• vinyl: encapsulate reader thread selection logic in a helper function · 8d7bded3
      Vladimir Davydov authored
      Page reading code is intermixed with the reader thread selection in the
      same function, which makes it difficult to extend the former. So let's
      introduce a helper function encapsulating a call on behalf of a reader
      thread.
      
      (cherry picked from commit d8a95a2a)
• vinyl: pass page info by reference to reader thread · 27fecbfd
      Vladimir Davydov authored
      Since a page read task references the source run file, we don't need to
      pass page info by value.
      
      (cherry picked from commit 67d36ccc)
• vinyl: factor out function to lookup key in page · 776e3c05
      Vladimir Davydov authored
      This function is a part of the run iterator API so we can't use it in
      a reader thread. Let's make it an independent helper. As a good side
      effect, we can now reuse it in the slice stream implementation.
      
      (cherry picked from commit ac8ce023)
2. May 27, 2019
• vinyl: fix deferred DELETE statement lost on commit · aa4b1021
      Vladimir Davydov authored
      Even if a statement isn't marked as VY_STMT_DEFERRED_DELETE, e.g. it's
      a REPLACE produced by an UPDATE request, it may overwrite a statement in
      the transaction write set that is marked so, for instance:
      
        s = box.schema.space.create('test', {engine = 'vinyl'})
        pk = s:create_index('pk')
        sk = s:create_index('sk', {parts = {2, 'unsigned'}})
      
        s:insert{1, 1}
      
        box.begin()
        s:replace{1, 2}
        s:update(1, {{'=', 2, 3}})
        box.commit()
      
If we don't mark REPLACE{3,1} produced by the update operation with
the VY_STMT_DEFERRED_DELETE flag, we will never generate a DELETE
statement
      for INSERT{1,1}. That is, we must inherit the flag from the overwritten
      statement when we insert a new one into a write set.
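
With the flag inherited, the deferred DELETE for INSERT{1,1} is
generated and the secondary index stays consistent; a hypothetical
check of the example above (not part of the original message):

  sk:select{1} -- ==> empty: the old key was purged
  sk:select{3} -- ==> {1, 3}: the key written by the update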
      
      Closes #4248
      
      (cherry picked from commit b54433d9)
• vinyl: don't produce deferred DELETE on commit if key isn't updated · 61cf6b25
      Vladimir Davydov authored
      Consider the following example:
      
        s = box.schema.space.create('test', {engine = 'vinyl'})
        s:create_index('primary')
        s:create_index('secondary', {parts = {2, 'unsigned'}})
      
        s:insert{1, 1, 1}
        s:replace{1, 1, 2}
      
      When REPLACE{1,1} is committed to the secondary index, the overwritten
      tuple, i.e. INSERT{1,1}, is found in the primary index memory, and so
      deferred DELETE{1,1} is generated right away and committed along with
      REPLACE{1,1}. However, there's no need to commit anything to the
      secondary index in this case, because its key isn't updated. Apart from
eating memory and loading the disk, this also breaks index stats, as
the vy_tx implementation doesn't expect two statements committed for
the same key in a single transaction.
      
Fix this by checking if there's a statement in the log for the deleted
key and, if there is, skipping them both, as we do in the regular case;
see the comment in vy_tx_set.
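
With the fix, the example above commits nothing to the secondary index
at all, while reads keep working; a hypothetical check:

  s.index.secondary:select{1} -- ==> {1, 1, 2}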
      
      Closes #3693
      
      (cherry picked from commit e2f5e1bc)
• vinyl: fix secondary index divergence on update · 0e37af31
      Vladimir Davydov authored
      If an UPDATE request doesn't touch key parts of a secondary index, we
      don't need to re-index it in the in-memory secondary index, as this
      would only increase IO load. Historically, we use column mask set by the
      UPDATE operation to skip secondary indexes that are not affected by the
      operation on commit. However, there's a problem here: the column mask
      isn't precise - it may have a bit set even if the corresponding column
      value isn't changed by the update operation, e.g. consider {'+', 2, 0}.
Not taking this into account may result in the appearance of phantom tuples
      on disk as the write iterator assumes that statements that have no
      effect aren't written to secondary indexes (this is needed to apply
      INSERT+DELETE "annihilation" optimization). We fixed that by clearing
      column mask bits in vy_tx_set in case we detect that the key isn't
      changed, for more details see #3607 and commit e72867cb ("vinyl: fix
      appearance of phantom tuple in secondary index after update"). It was
      rather an ugly hack, but it worked.
      
      However, it turned out that apart from looking hackish this code has
      a nasty bug that may lead to tuples missing from secondary indexes.
      Consider the following example:
      
        s = box.schema.space.create('test', {engine = 'vinyl'})
        s:create_index('pk')
        s:create_index('sk', {parts = {2, 'unsigned'}})
        s:insert{1, 1, 1}
      
        box.begin()
        s:update(1, {{'=', 2, 2}})
        s:update(1, {{'=', 3, 2}})
        box.commit()
      
      The first update operation writes DELETE{1,1} and REPLACE{2,1} to the
      secondary index write set. The second update replaces REPLACE{2,1} with
      DELETE{2,1} and then with REPLACE{2,1}. When replacing DELETE{2,1} with
      REPLACE{2,1} in the write set, we assume that the update doesn't modify
      secondary index key parts and clear the column mask so as not to commit
      a pointless request, see vy_tx_set. As a result, we skip the first
      update too and get key {2,1} missing from the secondary index.
      
      Actually, it was a dumb idea to use column mask to skip statements in
      the first place, as there's a much easier way to filter out statements
that have no effect on secondary indexes. The thing is, every DELETE
statement inserted into a secondary index write set acts as a "single
DELETE", i.e. there's exactly one older statement it is supposed to
purge. This is because, in contrast to the primary index, we don't
write DELETE statements blindly - we always look up the overwritten
tuple in the primary index first. This means that REPLACE+DELETE for
the same key is basically a no-op and can be safely skipped. Moreover,
DELETE+REPLACE can be treated as a no-op, too, because secondary
indexes don't store full tuples, hence all REPLACE statements for the
same key are equivalent.
      By marking both statements as no-op in vy_tx_set, we guarantee that
      no-op statements don't make it to secondary index memory or disk levels.
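
For the example above this means the intermediate DELETE{2,1} and
REPLACE{2,1} pair is recognized as a no-op while the net effect of the
transaction survives; a hypothetical check:

  s.index.sk:select{2} -- ==> {1, 2, 2}
  s.index.sk:select{1} -- ==> empty: the old key was deleted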
      
      Closes #4242
      
      (cherry picked from commit 69aee6fc)
3. May 21, 2019
• crypto: fix assertion on cipher reinitialization · e408fe50
      Vladislav Shpilevoy authored
Crypto provides an API to create stream objects. These streams
consume plain data and return encrypted data. Steps:
      
          1 c = cipher.new([key, iv])
          2 c:init(key, iv)
          3 c:update(input)
          4 c:result()
      
Step 2 is optional if key and iv are specified in new(), but if
new() is called without key or iv, then the result() method crashes.
      
This commit allows filling key and iv gradually, over several init()
calls, remembering the values set previously.
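
A minimal sketch of the now-possible gradual initialization (the key
and iv values are illustrative; an aes128 cipher expects 16-byte ones):

  crypto = require('crypto')
  c = crypto.cipher.aes128.cbc.encrypt.new() -- neither key nor iv yet
  c:init('1234567890123456', nil)            -- set only the key
  c:init(nil, 'abcdefghijklmnop')            -- set the iv separately
  c:update('plain data')
  enc = c:result()                           -- no longer crashes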
      
      Closes #4223
      
      (cherry picked from commit 26333580)
• vinyl: fix assertion while recovering dumped statement · d5afe7bd
      Vladimir Davydov authored
Certain kinds of DML requests don't update secondary indexes, e.g. an
UPDATE that doesn't touch secondary index parts or a DELETE for which
generation of secondary index statements is deferred. For such a
request vy_is_committed(env, space) may return false on recovery even
if it has actually been dumped: since such a statement is not dumped
for secondary indexes, a secondary index's vy_lsm::dump_lsn may be
less than the statement's signature, which makes vy_is_committed()
assume that the statement hasn't been dumped. Further in the code we
have checks ensuring that if we execute a request on recovery, it must
not have been dumped for the primary index (as the primary index is
always dumped after secondary indexes for the sake of recovery), and
this check fires in this case.
      
To fix that, let's refactor the code based on the following two facts:
       - Primary index is always updated by a DML request.
       - Primary index may only be dumped after secondary indexes.
      
      Closes #4222
      
      (cherry picked from commit 9566f14c)
• schema: fix error while altering index with sequence · 285ca40f
      Vladimir Davydov authored
      A check was missing in index.alter. This resulted in an attempt to drop
      the sequence attached to the altered index even if the sequence was not
      modified.
      
      Closes #4214
      
      (cherry picked from commit 7d778de6)
4. May 20, 2019
• travis-ci: fix LTO and clang · 5baf1c11
      Alexander V. Tikhonov authored
      Made fixes:
      
- Added the CMAKE_EXTRA_PARAMS environment variable to docker
  container runs to enable the -DENABLE_LTO=ON/OFF cmake option.
      
- Added CC/CXX environment variables to docker container runs to set
  clang for cmake. The additional {CC,CXX}_FOR_BUILD environment
  variables were postponed, because we don't run cross-compilation at
  the moment; for more info check:
      
          https://docs.travis-ci.com/user/languages/cpp/#choosing-compilers-to-test-against
      
- Changed the LTO docker image to 'debian-buster', because LTO needs
  newer versions of packages; for more information check commit:
      
          f9e28ce4 ('Add LTO support')
      
- Fixed the sources to avoid build failures with GCC and LTO:
      
      1)  src/box/memtx_rtree.c: In function ‘mp_decode_rect’:
          src/box/memtx_rtree.c:86:24: error: ‘c’ may be used uninitialized
            in this function [-Werror=maybe-uninitialized]
              rect->coords[i * 2] = c;
                                  ^
          src/box/memtx_rtree.c:74:10: note: ‘c’ was declared here
            coord_t c;
                    ^
      
      2)  src/box/sql/func.c: In function ‘quoteFunc’:
          src/box/sql/func.c:1103:3: error: ‘b’ may be used uninitialized
            in this function [-Werror=maybe-uninitialized]
             sql_result_text(context, sql_value_boolean(argv[0]) ?
             ^
          src/box/sql/vdbeapi.c:217:7: note: ‘b’ was declared here
            bool b;
                 ^
      
      3)  src/box/tuple_update.c: In function ‘update_read_ops’:
          src/box/tuple_update.c:1022:4: error: ‘field_no’ may be used
            uninitialized in this function [-Werror=maybe-uninitialized]
              diag_set(ClientError, ER_NO_SUCH_FIELD_NO, field_no);
              ^
          src/box/tuple_update.c:1014:11: note: ‘field_no’ was declared here
             int32_t field_no;
                     ^
      
      4)  src/httpc.c: In function ‘httpc_set_verbose’:
          src/httpc.c:267:2: error: call to ‘_curl_easy_setopt_err_long’
            declared with attribute warning: curl_easy_setopt expects a long
            argument for this option [-Werror]
            curl_easy_setopt(req->curl_request.easy, CURLOPT_VERBOSE, curl_verbose);
            ^
      
      5)  src/lua/httpc.c: In function ‘luaT_httpc_request’:
          src/lua/httpc.c:128:64: error: ‘MEM[(int *)&parser + 20B]’ may be used
            uninitialized in this function [-Werror=maybe-uninitialized]
            lua_pushinteger(L, (parser.http_minor > 0) ? parser.http_minor: 0);
                                                                          ^
          src/lua/httpc.c:67:21: note: ‘MEM[(int *)&parser + 20B]’ was declared here
            struct http_parser parser;
                               ^
          src/lua/httpc.c:124:64: error: ‘MEM[(int *)&parser + 16B]’ may be used
            uninitialized in this function [-Werror=maybe-uninitialized]
            lua_pushinteger(L, (parser.http_major > 0) ? parser.http_major: 0);
                                                                          ^
          src/lua/httpc.c:67:21: note: ‘MEM[(int *)&parser + 16B]’ was declared here
            struct http_parser parser;
                               ^
      
      Close #4215
      
      (cherry picked from commit e55396c8)
• travis-ci: set right flags in release testing jobs · e0054086
      Alexander Turenko authored
      It is important to have testing jobs that build the project with both
      -Werror and -O2 to keep the code clean. -O2 is needed, because some
      compiler warnings are available only after extra analyzing passes that
      are disabled with lesser optimization levels.
      
The first attempt to add -Werror for release testing jobs was made in
da505ee7 ('Add -Werror for CI (1.10
part)'), but it mistakenly didn't enable -O2 for the
RelWithDebInfoWError build. It would be possible to fix it in this way:
      
       | --- a/cmake/compiler.cmake
       | +++ b/cmake/compiler.cmake
       | @@ -113,10 +113,14 @@ set (CMAKE_C_FLAGS_DEBUG
       |      "${CMAKE_C_FLAGS_DEBUG} ${CC_DEBUG_OPT} -O0")
       |  set (CMAKE_C_FLAGS_RELWITHDEBINFO
       |      "${CMAKE_C_FLAGS_RELWITHDEBINFO} ${CC_DEBUG_OPT} -O2")
       | +set (CMAKE_C_FLAGS_RELWITHDEBINFOWERROR
       | +    "${CMAKE_C_FLAGS_RELWITHDEBINFOWERROR} ${CC_DEBUG_OPT} -O2")
       |  set (CMAKE_CXX_FLAGS_DEBUG
       |      "${CMAKE_CXX_FLAGS_DEBUG} ${CC_DEBUG_OPT} -O0")
       |  set (CMAKE_CXX_FLAGS_RELWITHDEBINFO
       |      "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} ${CC_DEBUG_OPT} -O2")
       | +set (CMAKE_CXX_FLAGS_RELWITHDEBINFOWERROR
       | +    "${CMAKE_CXX_FLAGS_RELWITHDEBINFOWERROR} ${CC_DEBUG_OPT} -O2")
       |
       |  unset(CC_DEBUG_OPT)
      
However, I think that a build type (and so `tarantool --version`)
should not show whether -Werror was passed or not. So I have added an
ENABLE_WERROR CMake option for that. It can be set like so:
      
       | cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_WERROR=ON
      
      Enabled the option in testing Travis-CI jobs with the RelWithDebInfo
      build type. Deploy jobs don't include it as before.
      
      Fixed all -Wmaybe-uninitialized and -Wunused-result warnings. A few
      notes about the fixes:
      
* net.box does not validate received data in general, so I don't add a
  check for autoincrement IDs either. The ID is set to INT64_MIN,
  because this value is the least likely to appear here in a normal
  case and so is the best one to signal a user that something has
  probably gone wrong.
      * xrow_decode_*() functions could read uninitialized data from
        row->body[0].iov_base in xrow_on_decode_err() when printing a hex code
        for a row. It could be possible when the received msgpack was empty
        (row->bodycnt == 0), but there were expected keys (key_map != 0).
* getcwd() is marked with __attribute__((__warn_unused_result__)) in
  glibc, but the buffer filled by this call is not used anywhere, so
  the call is simply removed.
      * Vinyl -Wmaybe-uninitialized warnings are false positive ones.
      
      Added comments and quotes into .travis.yml to ease reading. Removed
      "test" word from the CentOS 6 job name, because we don't run tests on
      this distro (disabled in the RPM spec).
      
      Fixes #4178.
      
      (cherry picked from commit c308f35d)
• coio: fix getaddrinfo assertion on 0 timeout · 8b7344dd
      Vladislav Shpilevoy authored
      Background. Coio provides a way to schedule arbitrary tasks
      execution in worker threads. A task consists of a function to
      execute, and a custom destructor.
      
To push a task, the function coio_task_post(task, timeout) was used.
When the function returns 0, a caller can obtain the result and should
free the task manually. But the trick is that if the timeout was 0,
the task was posted in a detached state. A detached task frees its
memory automatically regardless of the coio_task_post() result, and
does not even yield. Such a task object can't be accessed, much less
freed manually.
      
coio_getaddrinfo() used coio_task_post() and freed the task when the
latter returned 0. This led to a double free when the timeout was set
to 0. The bug was introduced in commit
800cec73 in an attempt to avoid
yielding in say_logrotate, which is not fiber-safe.
      
      Now there are two functions: coio_task_execute(task, timeout),
      which never detaches a task completed successfully, and
      coio_task_post(task), which posts a task in a detached state.
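
At the Lua level the bug could be hit through socket.getaddrinfo()
with a zero timeout (a sketch; the exact return value depends on the
resolver):

  socket = require('socket')
  -- With timeout = 0 the underlying coio task used to be posted in a
  -- detached state and then freed again by the caller - a double free.
  -- Now a zero timeout is handled safely:
  socket.getaddrinfo('localhost', 'http', 0)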
      
      Closes #4209
      
      (cherry picked from commit b6466ac7)
• coio: make hints in coio_getaddrinfo optional · cd732a11
      Vladislav Shpilevoy authored
According to the Open Group standard, the getaddrinfo() hints
argument is optional - it can be NULL. When it is NULL, hints is
assumed to have 0 in ai_flags, ai_socktype, and ai_protocol, and
AF_UNSPEC in ai_family.
      
      See The Open Group Base Specifications.
      
      (cherry picked from commit 9c4f1c8a)
• msgpack: validate msgpack.decode() cdata size argument · 10d5a315
      Vladislav Shpilevoy authored
A negative size led to an assertion failure. This commit adds a check
that the size is not negative.
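
For example, a sketch using the standard msgpack/buffer cdata API:

  buffer = require('buffer')
  msgpack = require('msgpack')
  buf = buffer.ibuf()
  msgpack.encode({1, 2, 3}, buf)
  msgpack.decode(buf.rpos, buf:size()) -- ok
  msgpack.decode(buf.rpos, -1)         -- now raises a proper error
                                       -- instead of an assertion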
      
      Closes #4224
      
      (cherry picked from commit 10873f16)
• iproto: init coio watcher before join/subscribe · 12a97545
      Alexander Turenko authored
box_process_join() and box_process_subscribe() use coio_write_xrow(),
which calls coio_writev_timeout() under the hood. If a socket blocks
at write(), the function calls ev_io_start() to wake the fiber up when
the socket becomes ready for writing. This code assumes that the
watcher (struct ev_io) is initialized as a coio watcher, i.e.
coio_create() has been called.

The reason why the code worked before is that coio_write_xrow() in
box_process_{join,subscribe}() writes a small piece of data, so the
situation when a socket write buffer has less free space than needed
is rare.
      
      Fixes #4110.
      
      (cherry picked from commit 539aee3d)
5. May 16, 2019
• travis-ci: set jobs not to stop on failed tests · 58acee67
      Alexander V. Tikhonov authored
Added the --force flag to travis-ci jobs so that testing does not stop
on failed tests. A failed test would otherwise abort the run and mask
the other failures, which is bad for the following reasons:
- a flaky test masks a real problem
- release testing needs the overall result to fix things fast
- parallel testing may produce flaky tests
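
The flag is a test-run option, so the job's test invocation becomes
roughly the following (a sketch; the exact make target differs per
job):

 | ./test-run.py --force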
      
      Close: #4131
      (cherry picked from commit 5f87a3a3)
• test: update test-run · 406f51d1
      Alexander Turenko authored
      - Added test_run:wait_upstream() and test_run:wait_downstream()
        functions to wait for certain box.info.replication values (#158).
      - Fix killing of servers at crash (PR #167).
- Show logs for a non-default server that failed at start (#159, PR #168).
      - Fix TAP13 hung test reporting (#155, PR #169).
      - Fix false positive internal error detection (PR #170).
- Support more than 60 parallel jobs (#82, PR #171).
• vinyl: reset dump watermark after updating memory limit · fed18733
      Vladimir Davydov authored
The watermark is updated every second anyway; however, not updating it
when the limit is reset results in a vinyl/quota test failure:
      
       | --- vinyl/quota.result  Thu Mar 14 16:03:54 2019
       | +++ vinyl/quota.reject  Fri Mar 15 16:32:44 2019
       | @@ -146,7 +146,7 @@
       |  for i = 1, count do s:replace{i, pad} end -- does not trigger dump
       |  ---
       |  ...
       | -box.stat.vinyl().memory.level0 > count * pad:len()
       | +box.stat.vinyl().memory.level0 > count * pad:len() or box.stat.vinyl()
       |  ---
       |  - true
       |  ...
      
      Closes #3864
      
      (cherry picked from commit b15773fa)
6. May 14, 2019
• httpc: add MAX_TOTAL_CONNECTIONS option binding · bb53975a
      Ilya Konyukhov authored
Right now there is only one configurable option for the http client:
CURLMOPT_MAXCONNECTS. It can be set up like this:
      
      > httpc = require('http.client').new({max_connections = 16})
      
Basically, this option tells curl to maintain this many connections in
the cache during the client instance's lifetime. Caching connections is
very useful when the user mostly requests the same hosts.
      
When the connection cache is full, all connections are waiting for a
response, and a new request comes in, curl creates a new connection,
starts the request, and then drops the first available connection to
keep the cache size right.
      
There is one side effect: when a tcp connection is closed, the system
actually puts it into the TIME_WAIT state, and for some time (usually
60 seconds) the resources for this socket can't be reused.
      
When the user wants to do lots of requests simultaneously (to the same
host), curl ends up creating and dropping lots of connections, which is
not very efficient. When the load is high enough, sockets won't recover
from TIME_WAIT in time and the system may run out of available sockets,
which reduces performance. And the user currently cannot control or
limit this behaviour.
      
The solution is to add a new binding for the
CURLMOPT_MAX_TOTAL_CONNECTIONS option. This option tells curl to hold
a new request until a connection is available (i.e. a previous request
has finished). Only after that will curl either drop an old connection
and create a new one, or reuse an existing one.
      
This patch passes this option through to the curl instance. It
defaults to -1, which means there is no limit. To create a client with
this option set, the user needs to set the max_total_connections
option like this:
      
      > httpc = require('http.client').new({max_connections = 8,
                                            max_total_connections = 8})
      
In general, this option is useful when doing requests mostly to the
same hosts; otherwise, the defaults should be enough.
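
A usage sketch with such a client (the URL is illustrative):

  r = httpc:get('https://example.org/')
  r.status -- e.g. 200 once the request completes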
      
The CURLMOPT_MAX_TOTAL_CONNECTIONS option was added in curl 7.30.0, so
if the curl version is below 7.30.0, this option is simply ignored.
      https://curl.haxx.se/changes.html#7_30_0
      
Also, this patch adjusts the default for the CURLMOPT_MAXCONNECTS
option to 0, which means that for every new easy handle curl will
enlarge its max cache size by 4. See the option docs for more:
      https://curl.haxx.se/libcurl/c/CURLMOPT_MAXCONNECTS.html
      
      Fixes #3945
      
      (cherry picked from commit d11b552e)
7. May 02, 2019
• test: update test-run · 0923a6a5
      Alexander Turenko authored
Added the signal option to the 'stop server' command.
      
      How to use:
      
       | test_run:cmd('stop server foo with signal=KILL')
      
      The 'stop server foo' command without the option sends SIGTERM as
      before.
      
      This feature is intended to be used in a fix of #4162 ('test:
      gc.test.lua test fails on *.xlog files cleanup').
      
      (cherry picked from commit 71f7ecf1)
8. Apr 29, 2019
• vinyl: be pessimistic about write rate when setting dump watermark · f14619d9
      Vladimir Davydov authored
      We set the dump watermark using the following formula
      
          limit - watermark     watermark
          ---------------- = --------------
             write_rate      dump_bandwidth
      
      This ensures that by the time we run out of memory quota, memory
      dump will have been completed and we'll be able to proceed. Here
      the write_rate is the expected rate at which the workload will
      write to the database while the dump is in progress. Once the dump
      is started, we throttle the workload in case it exceeds this rate.
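
Solving the proportion for the watermark gives

    watermark = limit * dump_bandwidth / (write_rate + dump_bandwidth)

so, for example, with limit = 1024 MiB, write_rate = 100 MiB/s and
dump_bandwidth = 300 MiB/s (illustrative numbers) the watermark is set
at 1024 * 300 / 400 = 768 MiB.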
      
      Currently, we estimate the write rate as a moving average observed
      for the last 5 seconds. This performs poorly unless the workload
      write rate is perfectly stable: if the 5 second average turns out to
      be even slightly less than the max rate, the workload may experience
      long stalls during memory dump.
      
      To avoid that let's use the max write rate multiplied by 1.5 instead
      of the average when setting the watermark. This means that we will
      start dump earlier than we probably could, but at the same time this
      will tolerate write rate fluctuations thus minimizing the probability
      of stalls.
      
      Closes #4166
      
      (cherry picked from commit b9b8e8af)
• httpc: fix zero timeout handling · 1a7476e6
      Alexander Turenko authored
      When libcurl is built with --enable-threaded-resolver (which is default)
      and the version of the library is 7.60 or above, libcurl calls a timer
      callback with exponentially increasing timeout_ms value during DNS
      resolving.
      
This behaviour was introduced in curl-7_59_0-36-g67636222f (see [1],
[2]). During the first ten milliseconds the library sets a timer to
the passed time divided by three (see Curl_resolver_getsock()). It is
possible that the passed time is zero for at least several thousand
iterations.
      
Before this commit we didn't set a libev timer in curl_multi_timer_cb()
when the timeout_ms value was zero, but called curl_multi_process()
immediately. Libcurl, however, can call curl_multi_timer_cb() again,
and here we go into a recursion that stops only when timeout_ms becomes
positive. Often we generate several thousand stack frames within this
recursion and exceed the 512 KiB fiber stack size.
      
The fix is easy: set a libev timer to call curl_multi_process() even
when the timeout_ms value is zero.
      
The reason why we called curl_multi_process() immediately is the
unclear wording in the CURLMOPT_TIMERFUNCTION option documentation.
This documentation page was fixed in curl-7_64_0-88-g47e540df8 (see
[3], [4], [5]).
      
There is also a related change in curl-7_60_0-121-g3ef67c686 (see [6],
[7]): after this commit libcurl calls a timer callback with zero
timeout_ms during the first three milliseconds of asynchronous DNS
resolving.
      
      Fixes #4179.
      
      [1]: https://github.com/curl/curl/pull/2419
      [2]: https://github.com/curl/curl/commit/67636222f42b7db146b963deb577a981b4fcdfa2
      [3]: https://github.com/curl/curl/issues/3537
      [4]: https://github.com/curl/curl/pull/3601
      [5]: https://github.com/curl/curl/commit/47e540df8f32c8f7298ab1bc96b0087b5738c257
      [6]: https://github.com/curl/curl/pull/2685
      [7]: https://github.com/curl/curl/commit/3ef67c6861c9d6236a4339d3446a444767598a58
      
      (cherry picked from commit 47bd51b5)
• Fix build · d656862c
      Vladimir Davydov authored
      checkpoint_delete isn't available in 1.10, use checkpoint_destroy
      instead.
      
       | src/box/memtx_engine.c:638:2: error: implicit declaration of function 'checkpoint_delete' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
       |         checkpoint_delete(ckpt);
       |         ^
      
      Fixes commit 8fd63f37 ("memtx: cancel checkpoint thread at exit").
• memtx: cancel checkpoint thread at exit · 8fd63f37
      Vladimir Davydov authored
If a tarantool instance exits while checkpointing is in progress, the
memtx checkpoint thread, which writes the snap file, can access already
freed data, resulting in a crash. Let's fix this the same way we did
for the relay and vinyl threads - simply cancel the thread forcefully
and wait for it to terminate.
      
      Closes #4170
      
      (cherry picked from commit d95608e4)