- Feb 20, 2018
-
-
Vladislav Shpilevoy authored
Compatibility must be commutative, but this function is not commutative. It checks that one type can store values of another type, but not the other way around.
-
Vladislav Shpilevoy authored
Closes #2973
-
Vladimir Davydov authored
Not all workloads need bloom filters enabled for all indexes. Let's allow disabling them on a per-index basis by setting bloom_fpr to 1. This will save some memory if bloom filters are unused. Closes #3138
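A minimal sketch of the new knob (the space and index names are hypothetical):

```lua
local s = box.schema.space.create('test', {engine = 'vinyl'})
-- bloom_fpr = 1 means "any false positive rate is acceptable", so no
-- bloom filter is built for this index and its memory is saved.
s:create_index('primary', {bloom_fpr = 1})
```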
-
Vladimir Davydov authored
Currently, one can set insane values for most vinyl index options, which will most certainly result in a crash (e.g. bloom_fpr = 100). Add some sanity checks.
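For example, after this patch an out-of-range value should be rejected up front rather than crash later (the exact error text is an assumption):

```lua
local s = box.schema.space.create('test', {engine = 'vinyl'})
-- bloom_fpr must lie in (0, 1]; a value like 100 is now rejected
-- at index creation time.
local ok, err = pcall(s.create_index, s, 'primary', {bloom_fpr = 100})
-- ok is false; err describes the invalid option value
```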
-
Vladimir Davydov authored
While a node of the cluster is re-bootstrapping (joining again), other nodes may try to re-subscribe to it. They will fail, because the rebootstrapped node hasn't tried to subscribe yet, hence hasn't been added to the _cluster table, and so is not present in the hash at the subscriber's side for replica_on_applier_reconnect() to look it up. Fix this by making a subscriber create an id-less (REPLICA_ID_NIL) struct replica in this case and reattach the applier to it. It will be assigned an id when it finally subscribes and is registered in _cluster. Fixes commit 71b33405 ("replication: reconnect applier on master rebootstrap").
-
Vladimir Davydov authored
If box.cfg() successfully connects to a number of replicas sufficient to form a quorum (>= box.cfg.replication_connect_quorum), it won't return until it syncs with all of them (lag <= box.cfg.replication_sync_lag). If one of the replicas forming a quorum disconnects permanently while sync is in progress, box.cfg() will hang forever. Such behavior is rather unreasonable. After all, syncing a quorum is best-effort. It would be much more sensible to return from box.cfg() leaving the instance in the 'orphan' mode in this case. This patch does exactly that: now if we detect that not enough replicas are connected to form a quorum while we are syncing, we stop syncing immediately.
-
Vladimir Davydov authored
Currently, the max time box.cfg() may wait for connections to replicas to be established is hardcoded to box.cfg.replication_timeout times 4. As a result, users can't revert to the pre-replication_connect_quorum behavior, when box.cfg() would block until it connected to all replicas. To fix that, let's introduce a new configuration option, replication_connect_timeout, which determines the replication configuration timeout. By default the option is set to 4 seconds. Closes #3151
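For example (the URIs are placeholders):

```lua
box.cfg{
    replication = {'replica1:3301', 'replica2:3301'},
    -- wait up to 10 seconds for connections to be established
    -- instead of the default 4
    replication_connect_timeout = 10,
}
```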
-
Vladimir Davydov authored
If a unique index includes all parts of another unique index, we can skip the check for duplicates for it on INSERT. Let's mark all such indexes with a special flag on CREATE/ALTER and optimize out the check if the flag is set. If there are two indexes that index the same set of fields, check uniqueness for the one with a lower id, because it is more likely to have a "warmer" cache. Closes #3154
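A sketch of a pair of indexes this optimization applies to (names and field types are hypothetical): the secondary unique index covers all parts of the primary one, so its duplicate check on INSERT is implied and can be skipped.

```lua
local s = box.schema.space.create('test')
s:create_index('primary', {parts = {1, 'unsigned'}})
-- Includes all parts of 'primary', so uniqueness follows from it.
s:create_index('secondary', {unique = true,
                             parts = {1, 'unsigned', 2, 'string'}})
```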
-
Vladimir Davydov authored
We keep run files corresponding to (at least) the last snapshot, because we need them for backups and replication. Deletion of compacted run files is postponed until the next snapshot. As a consequence, we don't delete run files created on a replica during the join stage. However, in contrast to run files created during normal operation, these are pure garbage and should be deleted right away. Not deleting them can result in depletion of disk space, because vinyl has quite high write amplification by design. We can't write a functional test for this, because there's no way to guarantee that compaction started during join will finish before join completion - if it doesn't, compacted runs won't be removed, because they will be assigned to the snapshot created by join. Closes #3162
-
Vladislav Shpilevoy authored
A vinyl index key definition is stored in vylog even if the index is empty, and we do not yet have a method to update it there. So altering a vinyl index key definition is forbidden even on an empty space. Closes #3169
-
- Feb 19, 2018
-
-
Vladimir Davydov authored
Closes #3173
-
- Feb 17, 2018
-
-
Georgy Kirichenko authored
If a DDL operation is in progress, all other DDL operations wait on the schema latch. But once the first DDL is done, any other request may be issued right after it, and commit order will be broken in the case of multi-master replication. To prevent this behavior, any DDL operation should wait until all queued DDL operations are done. Fixes #2951
-
Georgy Kirichenko authored
Prevent the latch lock from being intercepted by another already scheduled or active fiber when there is only one waiter. This is needed for strict latch ordering.
-
- Feb 16, 2018
-
-
Konstantin Belyavskiy authored
An incoming ACK leads to a race condition and prevents heartbeat messages, which ends up in a disconnect on timeout. This fix is based on @locker's proposal to send vclock only in reply to the master (since the master itself sends heartbeat messages). Closes #3160
-
Vladimir Davydov authored
This reverts commit a7871247.
-
- Feb 15, 2018
-
-
Konstantin Osipov authored
This reverts commit 99c7a971.
-
Vladimir Davydov authored
If a vinyl transaction stalls waiting for quota for more than box.cfg.too_long_threshold seconds, emit a warning to the log:

W> waited for 699089 bytes of vinyl memory quota for too long: 0.504 sec

This will help us understand whether our users experience lags due to absence of throttling in vinyl (see #1862). Closes #3096
-
imarkov authored
The name of the universe is optional, so we don't check it. If a user wants to specify extra options in the grant, such as if_not_exists, and confuses the object name argument with the options argument, the options are silently ignored:

box.schema.user.grant('tnt', 'read,write,execute', 'universe', {if_not_exists = true})

Fix this by adding Lua code that ensures that the universe name is a scalar (string or nil). Closes #3146
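For comparison, the correct call passes nil in the object name slot, with the options table last:

```lua
-- Grant on the whole universe: the object name is nil, options come last.
box.schema.user.grant('tnt', 'read,write,execute', 'universe', nil,
                      {if_not_exists = true})
```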
-
- Feb 13, 2018
-
-
Konstantin Belyavskiy authored
In a replica set, if one of the instances is powered off, the others do not detect it and the connection hangs. Alive machines show the 'follow' state. Add a timeout to solve this issue. It's safe since both applier and relay send messages every replication_timeout, so we can assume that if we read nothing, we have a problem with the connection. Use replication_disconnect_timeout, which is replication_timeout * 4, for now. The test fixed and comments improved by @locker. Closes #3025
-
Vladimir Davydov authored
If an instance is 'orphan', it is read-only hence box.ctl.wait_rw() should block until the instance syncs, but currently it doesn't. Fix it.
-
Vladimir Davydov authored
vy_log_rotate() releases the log latch between reading the last vylog file and writing the new vylog file. This works as long as the latch implementation guarantees that latch_lock() called immediately after latch_unlock() on the same lock doesn't yield. Although this is true now, we shouldn't rely on that, because this may change any time.
-
Konstantin Osipov authored
-
Vladislav Shpilevoy authored
Closes #2789
-
imarkov authored
* Create constant SUPER - id of the super role
* Forward the constant to box.schema
* Add checks on dropping the super role

Closes #3084
-
- Feb 11, 2018
-
-
Vladimir Davydov authored
- space.bsize returns the size of user data stored in the space. It is the sum of memory.bytes and disk.bytes as reported by the primary index.
- index.bsize returns the size of memory used for indexing data. It is the sum of memory.index_size, disk.index_size, and disk.bloom_size as reported by index.info. For secondary indexes we also add the size of binary data stored on disk (disk.bytes), because it is only needed to build the index.
- index.len returns the total number of rows stored in the index. It is the sum of memory.rows and disk.rows as reported by index.info. Note, it may be greater than the number of tuples stored in the space, because it includes DELETE and UPDATE statements.

Closes #2863
Closes #3056
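A quick illustration of reading these from the console (the space and index names are hypothetical):

```lua
local s = box.space.test          -- a hypothetical vinyl space
s:bsize()                         -- user data stored in the space, bytes
s.index.primary:bsize()           -- memory used for indexing data, bytes
s.index.primary:len()             -- rows in the index, incl. DELETE/UPDATE
```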
-
Vladimir Davydov authored
This patch adds the following statistics to index.info:
- memory.index_size - size of memory tree extents
- cache.index_size - size of cache tree extents
- disk.index_size - size of page index
- disk.bloom_size - size of bloom filters
-
Konstantin Belyavskiy authored
This patch adds a new connection option to the http client, 'unix_socket'. The option specifies the path to a unix socket to use as the connection endpoint instead of TCP:

httpc = require('http.client')
httpc.request('GET', 'http://localhost/index.html', nil,
              {unix_socket = '/var/run/docker.sock'})

The option is supported only if tarantool was built with libcurl 7.40.0 or newer. For older versions, an attempt to use the option will result in a Lua exception. Suggested and first implemented by @rosik. The test was refactored by @locker. Closes #3040
-
- Feb 10, 2018
-
-
Vladimir Davydov authored
It will help resolve box.once() conflicts in case master is rw and replica is ro. Closes #2537
-
Vladimir Davydov authored
This patch adds two new Lua functions, box.ctl.wait_ro() and box.ctl.wait_rw(), that block the current fiber until the server switches to read-only or read-write mode, respectively. Both functions take a timeout as an optional argument. Needed for #2537
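A minimal usage sketch (the 5-second timeout is an arbitrary example; without an argument the call blocks until the mode changes):

```lua
-- Wait until the instance becomes writable, at most 5 seconds.
box.ctl.wait_rw(5)
-- Symmetrically, wait for the instance to become read-only:
-- box.ctl.wait_ro(5)
```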
-
Vladimir Davydov authored
Fix build errors:

src/lua/init.c: In function ‘tarantool_panic_handler’:
src/lua/init.c:321:2: error: implicit declaration of function ‘print_backtrace’ [-Werror=implicit-function-declaration]
  print_backtrace();
  ^~~~~~~~~~~~~~~
src/lua/fiber.c:244:1: error: ‘lbox_fiber_statof_bt’ defined but not used [-Werror=unused-function]
 lbox_fiber_statof_bt(struct fiber *f, void *cb_ctx)
 ^~~~~~~~~~~~~~~~~~~~
-
- Feb 08, 2018
-
-
Vladimir Davydov authored
There are two issues in the rollback code:
- txn_rollback_stmt() rolls back the current autocommit transaction even if it is called from a sub-statement. As a result, if a sub-statement (i.e. a statement called from a before_replace or on_replace trigger) fails (e.g. due to a conflict), it will trash the current transaction, leading to a bad memory access upon returning from the trigger.
- txn_begin_stmt() calls txn_rollback_stmt() on failure even if it did not instantiate the statement. So if it is called from a trigger and fails (e.g. due to the nesting limit), it may trash the parent statement, again leading to a crash.

Fix them both and add some tests. Closes #3127
-
Vladimir Davydov authored
Obviously, there's no point in rebuilding an index if all we do is relax the uniqueness property. This will also allow us to clear the uniqueness flag for vinyl indexes, which do not support rebuild. Note, a memtx tree index stores a pointer to either cmp_def or key_def depending on whether the index is unique. Hence to clear the uniqueness flag without rebuilding the index, we need to update this pointer. To do that, we add a new index virtual method, update_def. Closes #2449
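For illustration, dropping uniqueness on a hypothetical secondary index is now a metadata-only change:

```lua
-- No index rebuild happens: only the uniqueness flag (and, for memtx
-- tree indexes, the cmp_def/key_def pointer) is updated in place.
box.space.test.index.secondary:alter{unique = false}
```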
-
Vladimir Davydov authored
-
- Feb 06, 2018
-
-
Vladimir Davydov authored
If an instance is read-only, an attempt to join a new replica to it will fail with ER_READONLY, because joining a replica to a cluster implies registration in the _cluster system space. However, if the replica is already registered, which is the case if it is being rebootstrapped with the same uuid (see box.cfg.instance_uuid), the record corresponding to the replica is already present in the _cluster space and hence no write operation is required. Still, rebootstrap fails with the same error. Let's rearrange the access checks to make it possible to rebootstrap a replica from a read-only master provided it has the same uuid. Closes #3111
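For illustration, a rebootstrap that reuses the replica's old UUID might look like this (the UUID and URI are hypothetical placeholders):

```lua
-- After wiping the local directory, restart with the same
-- instance_uuid: the replica is already registered in _cluster,
-- so no write to the read-only master is needed.
box.cfg{
    instance_uuid = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee',
    replication = 'master:3301',
}
```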
-
Vladimir Davydov authored
We can save a lookup in a secondary index on update if indexed fields are not modified. The extra check comes for free as we have a bit mask of all updated fields. Closes #2980
-
Vladimir Davydov authored
When a tarantool instance starts for the first time (the local directory is empty), it chooses the peer with the lowest UUID as the bootstrap master. As a result, one cannot reliably rebootstrap a cluster node (delete all local files and restart): if the node happens to have the lowest UUID in the cluster after restart, it will assume that it is the leader of a new cluster and bootstrap locally, splitting the cluster in two. To fix this problem, let's always give preference to peers with a higher vclock when choosing a bootstrap master and only fall back on selection by UUID if two or more peers have the same vclock. To achieve that, we need to introduce a new iproto request type for fetching the current vclock of a tarantool instance (we cannot squeeze the vclock into the greeting, because the latter is already packed). The new request type is called IPROTO_REQUEST_VOTE so that in future it can be reused for a more sophisticated leader election algorithm. It has no body and does not require authentication. In reply to such a request, a tarantool instance will send IPROTO_OK and its current vclock. If the version of the master is >= 1.7.7, an applier will send IPROTO_REQUEST_VOTE to fetch the master's vclock before trying to authenticate. The vclock will then be used to determine the node to bootstrap from. Closes #3108
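The selection rule can be sketched roughly as follows (a simplification, not the actual C implementation: real vclocks are compared component-wise, while the scalar vclock_signature here stands in for the sum of all components):

```lua
-- Pick the bootstrap master among the connected peers: prefer the
-- peer with the greater vclock; fall back on the lowest UUID on a tie.
local function choose_bootstrap_master(peers)
    local best
    for _, p in ipairs(peers) do
        if best == nil
           or p.vclock_signature > best.vclock_signature
           or (p.vclock_signature == best.vclock_signature
               and p.uuid < best.uuid) then
            best = p
        end
    end
    return best
end
```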
-
Vladimir Davydov authored
No functional changes, just a trivial cleanup:
- Move all C functions inside the extern "C" section.
- Rename xrow_decode_join to xrow_decode_join_xc.
- Make XXX_xc wrappers around XXX functions.
-
- Feb 05, 2018
-
-
Vladimir Davydov authored
Before commit 2788dc1b ("Add APPLIER_READY state") we only printed the 'authenticated' message to the log in case credentials were set in the replication URI. The commit changed that: now we print the message even in case of guest connections, when applier does not send the AUTH command to the master at all. As a result if guest connections are not permitted by the master, the applier will keep printing 'authenticated' after every unsuccessful attempt to subscribe. This is misleading. Let us revert back to the behavior we had before commit 2788dc1b. Closes #3113
-
- Feb 02, 2018
-
-
Konstantin Nazarov authored
As there is now support for Alpine Linux in packpack, there is no longer any need for a custom Dockerfile builder.
-
Konstantin Nazarov authored
This patch is to get in line with the Alpine support in packpack:
- don't rely on git, and use a source package instead
- add subpackages with debug symbols, documentation and headers
- don't build tarantool 3 times in a row
-