  1. Feb 06, 2018
    • Vladimir Davydov's avatar
      replication: fix cluster node rebootstrap · 4e62423e
      Vladimir Davydov authored
      When a tarantool instance starts for the first time (the local directory
      is empty), it chooses the peer with the lowest UUID as the bootstrap
      master. As a result, one cannot reliably rebootstrap a cluster node
      (delete all local files and restart): if the node happens to have the
      lowest UUID in the cluster after restart, it will assume that it is the
      leader of a new cluster and bootstrap locally, splitting the cluster in
      two.
      
      To fix this problem, let's always give preference to peers with a higher
      vclock when choosing a bootstrap master and only fall back on selection
      by UUID if two or more peers have the same vclock. To achieve that, we
      need to introduce a new iproto request type for fetching the current
      vclock of a tarantool instance (we cannot squeeze the vclock in the
      greeting, because the latter is already packed). The new request type is
      called IPROTO_REQUEST_VOTE so that in future it can be reused for a more
      sophisticated leader election algorithm. It has no body and does not
      require authentication. In reply to such a request, a tarantool instance
      will send IPROTO_OK and its current vclock. If the version of the master
      is >= 1.7.7, an applier will send IPROTO_REQUEST_VOTE to fetch the
      master's vclock before trying to authenticate. The vclock will then be
      used to determine the node to bootstrap from.
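
      The selection rule described above (prefer the peer with the higher
      vclock, fall back on the lowest UUID on a tie) can be sketched as
      follows. This is only an illustrative model, not the tarantool code:
      the Peer type, the function names, and in particular ranking vclocks by
      the sum of their components are assumptions made for the example.

```python
import uuid
from typing import Dict, List, NamedTuple


class Peer(NamedTuple):
    """Hypothetical view of a remote instance as seen by an applier."""
    instance_uuid: uuid.UUID
    vclock: Dict[int, int]  # replica id -> LSN


def vclock_rank(vclock: Dict[int, int]) -> int:
    # Assumption: rank a vclock by the sum of its LSNs. An instance started
    # from an empty directory has an empty vclock and therefore rank 0.
    return sum(vclock.values())


def choose_bootstrap_master(peers: List[Peer]) -> Peer:
    """Prefer the highest vclock; break ties by the lowest UUID."""
    if not peers:
        raise ValueError("no peers to bootstrap from")
    return min(peers, key=lambda p: (-vclock_rank(p.vclock), p.instance_uuid))
```

      With this rule a node restarted from an empty directory is never picked
      over a peer that already has data, even if its UUID is the lowest.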
      
      Closes #3108
      4e62423e
    • Vladimir Davydov's avatar
      Cleanup xrow.h · e1d0946b
      Vladimir Davydov authored
      No functional changes, just a trivial cleanup:
      
       - Move all C functions inside extern "C" section.
       - Rename xrow_decode_join to xrow_decode_join_xc.
       - Make XXX_xc wrappers around XXX functions.
      e1d0946b
  2. Feb 05, 2018
    • Vladimir Davydov's avatar
      applier: do not print 'authenticated' message if connecting as guest · 674c1058
      Vladimir Davydov authored
      Before commit 2788dc1b ("Add APPLIER_READY state") we only printed
      the 'authenticated' message to the log in case credentials were set in
      the replication URI. The commit changed that: now we print the message
      even in case of guest connections, when applier does not send the AUTH
      command to the master at all. As a result if guest connections are not
      permitted by the master, the applier will keep printing 'authenticated'
      after every unsuccessful attempt to subscribe. This is misleading. Let
      us revert back to the behavior we had before commit 2788dc1b.
      
      Closes #3113
      674c1058
  3. Feb 02, 2018
    • Konstantin Nazarov's avatar
      Get rid of README and Dockerfile for Alpine Linux · 99ca8d1c
      Konstantin Nazarov authored
      As there is now support for Alpine Linux in packpack, there is no
      longer any need for a custom Dockerfile builder.
      99ca8d1c
    • Konstantin Nazarov's avatar
      Add -dev, -doc and -dbg packages for Alpine Linux · 8d5cbe66
      Konstantin Nazarov authored
      This patch is to get in line with the Alpine support in packpack:
      
      - don't rely on git, and use a source package instead
      - add subpackages with debug symbols, documentation and headers
      - don't build tarantool 3 times in a row
      8d5cbe66
    • Vladimir Davydov's avatar
      replication: reconnect applier on master rebootstrap · 71b33405
      Vladimir Davydov authored
      If one node of a cluster is rebootstrapped (i.e. restarted from an
      empty directory with the same configuration), other replicas will
      never try to reconnect to it - the appliers will simply stop with
      the ER_REPLICASET_UUID_MISMATCH error. The only way to fix this is to
      reconfigure replication on all other nodes.
      
      Let's fix this problem by reassigning an applier to a new replica
      in case its UUID mismatches the UUID of the replica it is currently
      assigned to.
      
      Cannot write a test, because rebootstrap is unreliable - see #3108.
      
      Closes #3112
      71b33405
    • Vladimir Davydov's avatar
      applier: stop sending ACKs if master closed socket · dfb48d4d
      Vladimir Davydov authored
      If the master closes its end of the socket when there are still unread
      rows available for the replica to apply, we will get tons of EPIPE error
      messages at the replica's side, emitted every time it attempts to send
      an ACK back to the master (i.e. one per each row left in the socket):
      
        main/107/applierw/ sio.cc:303 !> SystemError writev(2), called on fd 12, aka 127.0.0.1:50852: Broken pipe
      
      To avoid that, let's make the applier writer fiber (the one that sends
      ACKs) exit immediately if it receives an EPIPE error while trying to
      send an ACK.
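
      The idea can be illustrated with a small model of the writer loop. It is
      a sketch under assumptions, not the applier code: send_acks and the
      sock_send callback are hypothetical names, and the real writer is a
      fiber rather than a plain loop.

```python
import errno
from typing import Callable, Iterable


def send_acks(sock_send: Callable[[int], None], lsns: Iterable[int]) -> int:
    """Send one ACK per applied row; stop for good on the first EPIPE.

    Returns the number of ACKs that were sent successfully.
    """
    sent = 0
    for lsn in lsns:
        try:
            sock_send(lsn)
        except OSError as exc:
            if exc.errno == errno.EPIPE:
                # The master closed its end of the socket: further ACKs
                # would only produce more "Broken pipe" errors, so exit.
                break
            raise
        sent += 1
    return sent
```

      Only one failed write is attempted, no matter how many rows are still
      waiting to be acknowledged.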
      
      Closes #2945
      dfb48d4d
    • Konstantin Belyavskiy's avatar
      Fix force_recovery on empty xlog · be558f20
      Konstantin Belyavskiy authored
      * Fix force_recovery behaviour on empty xlog files and ones with corrupted
        header.
      * Add a test
      * Update xlog-py/empty.test.py, since corrupted xlog no longer leads
        to a broken startup.
      
      Closes #3026, #3076
      be558f20
    • Konstantin Osipov's avatar
      access: revert part of Ilya's patch for create,drop ACL · 28bf71bc
      Konstantin Osipov authored
      For backward compatibility, automatically grant CREATE, DROP
      ACL to all users who have READ and WRITE access.
      
      Our automatic upgrade script automatically grants CREATE and
      ALTER to users with READ/WRITE access on universe, but this is
      insufficient, since new users could be created after upgrade.
      
      Follow up on gh-945 and gh-3089.
      28bf71bc
    • IlyaMarkovMipt's avatar
      security: Add create, drop, alter privilege support · 82123356
      IlyaMarkovMipt authored
      * Add privileges Create, Drop, Alter on universe support.
      * Fix super role behavior, allowing users with
        this role to drop any objects.
      
      Relates #945
      Closes #3089
      82123356
    • IlyaMarkovMipt's avatar
      fio: Read with empty len parameter · df2387b3
      IlyaMarkovMipt authored
      * Make it possible to call file:read without the len parameter.
      In this case, the whole file is read.
      
      Closes #2925
      df2387b3
    • Vladimir Davydov's avatar
      iproto: change IPROTO_NOP code from 11 to 12 · 48601fa1
      Vladimir Davydov authored
      11 was initially used for SQL EXECUTE in 1.8, but 1.7 commit
      b73030f2 ("iproto: add IPROTO_NOP request type") reassigned
      it to NOP so after the merge SQL EXECUTE landed at 12, which
      broke connectors. Let's shift NOP to 12 and move EXECUTE back
      to 11. This is OK as 1.7.7 which introduced the new iproto type
      hasn't been officially released yet.
      48601fa1
  4. Feb 01, 2018
  5. Jan 31, 2018
    • Vladimir Davydov's avatar
      replication: introduce orphan mode · dfd3071f
      Vladimir Davydov authored
      This patch modifies the replication configuration procedure so as to
      fully conform to the specification presented in #2958. In a nutshell,
      now box.cfg() tries to synchronize all connected replicas before
      returning. If it fails to connect enough replicas to form a quorum, it
      leaves the server in a degraded 'orphan' mode, which is basically
      read-only. More details below.
      
      First of all, it's worth mentioning that we already have 'orphan' status
      in Tarantool (between 'loading' and 'hot_standby'), but it has nothing
      to do with replication. Actually, it's unclear why it was introduced in
      the first place so we agreed to silently drop it.
      
      We assume that a replica is synchronized if its lag is not greater than
      the value of new configuration option box.cfg.replication_sync_lag.
      Otherwise a replica is considered to be syncing and has "sync" status.
      If replication_sync_lag is unset (nil) or set to TIMEOUT_INFINITY, then
      a replica skips the "sync" state and switches to "follow" immediately.
      The default value of replication_sync_lag is 10 seconds, but it is
      ignored (assumed to be inf) in case the master is running tarantool
      older than 1.7.7, which does not send heartbeat messages.
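
      As a rough illustration of the rule above, the status of a replica could
      be derived from its lag like this. The function name, the tuple-based
      version check and the use of math.inf for TIMEOUT_INFINITY are
      assumptions made for the sketch.

```python
import math
from typing import Optional, Tuple

TIMEOUT_INFINITY = math.inf  # stand-in for tarantool's TIMEOUT_INFINITY


def applier_status(lag: float,
                   sync_lag: Optional[float] = 10.0,
                   master_version: Tuple[int, int, int] = (1, 7, 7)) -> str:
    """Return "sync" while the replica lags too far behind, else "follow"."""
    if sync_lag is None or master_version < (1, 7, 7):
        # Unset option, or a master that does not send heartbeats:
        # the threshold is treated as infinite.
        sync_lag = TIMEOUT_INFINITY
    if sync_lag == TIMEOUT_INFINITY:
        return "follow"  # the "sync" state is skipped entirely
    return "follow" if lag <= sync_lag else "sync"
```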
      
      If box.cfg() is called for the very first time (bootstrap) for a given
      instance, then
      
       1. It tries to connect to all configured replicas for as long as it
          takes (replication_timeout isn't taken into account). If it fails to
          connect to at least one replica, bootstrap is aborted.
      
       2. If this is a cluster bootstrap and the current instance turns out to
          be the new cluster leader, then it performs local bootstrap and
          switches to 'running' state and leaves box.cfg() immediately.
      
       3. Otherwise (i.e. if this is bootstrap of a slave replica), then it
          bootstraps from a remote master and then stays in 'orphan' state
          until it synchronizes with all replicas before switching to
          'running' state and leaving box.cfg().
      
      If box.cfg() is called after bootstrap, in order to recover from the
      local storage, then
      
       1. It recovers the last snapshot and xlogs stored in the local
          directory.
      
       2. Then it switches to 'orphan' mode and tries to connect to at least
          as many replicas as specified by box.cfg.replication_connect_quorum
          for a time period which is a multiple of box.cfg.replication_timeout
          (4x). If it fails, it doesn't abort, but leaves box.cfg() in
          'orphan' mode. The state will switch to 'running' asynchronously as
          soon as the instance has synced with 'replication_connect_quorum'
          replicas.
      
       3. If it managed to connect to enough replicas to form a quorum at step
          2, it synchronizes with them: box.cfg() doesn't return until at
          least 'replication_connect_quorum' replicas have been synchronized.
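
      The recovery case can be condensed into a toy model that answers one
      question: in which state does box.cfg() return? The helper below is
      hypothetical and deliberately simplified. It takes the time at which
      each replica became reachable (None if never) and assumes that
      synchronization with connected replicas eventually succeeds.

```python
from typing import Iterable, Optional

CONNECT_TIMEOUT_FACTOR = 4  # the "4x" multiple of replication_timeout


def status_on_return(connect_times: Iterable[Optional[float]],
                     replication_timeout: float,
                     quorum: int) -> str:
    """Model the state in which box.cfg() returns after local recovery."""
    deadline = CONNECT_TIMEOUT_FACTOR * replication_timeout
    connected = sum(1 for t in connect_times
                    if t is not None and t <= deadline)
    if connected < quorum:
        # Step 2 failed: box.cfg() returns anyway and the instance stays
        # read-only until it catches up asynchronously.
        return "orphan"
    # Step 3: box.cfg() blocks until the quorum is synchronized.
    return "running"
```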
      
      If box.cfg() is called after recovery to reconfigure replication, then
      it tries to connect to all specified replicas within a time period which
      is a multiple of box.cfg.replication_timeout (4x). The value of
      box.cfg.replication_connect_quorum isn't taken into account, neither is
      the value of box.cfg.replication_sync_lag - box.cfg() returns as soon as
      all configured replicas have been connected.
      
      Just like any other status, the new one is reflected by box.info.status.
      
      Suggested by @kostja
      
      Follow-up #2958
      Closes #999
      dfd3071f
    • Kirill Yukhin's avatar
      Fix order of function attributes for RB tree in replication · 4e0729cb
      Kirill Yukhin authored
      rb_gen used an incorrect order of function attributes: static
      MAYBE_UNUSED, which caused compilation failures with Clang.
      Change the order of the mentioned attributes.
      4e0729cb
    • Roman Tsisyk's avatar
      cce7548c
    • Roman Tsisyk's avatar
      Travis CI: use Ubuntu Xenial for tests · e67342c3
      Roman Tsisyk authored
      e67342c3
    • Roman Tsisyk's avatar
      Travis CI: update distributions · d4a73c92
      Roman Tsisyk authored
      - Remove old versions of Fedora and Ubuntu
      - Add Fedora 26 and Fedora 27
      d4a73c92
  6. Jan 30, 2018
    • IlyaMarkovMipt's avatar
      security: Change checks on usage access · 9e30f895
      IlyaMarkovMipt authored
      * Add the following behavior:
      The owner of an object can't use her own objects if she does not have
      usage access.
      * Change access checks of space, sequence and function objects.
      Similar checks of other objects are performed in alter.cc.
      
      Closes gh-3089
      9e30f895
    • IlyaMarkovMipt's avatar
      net.box: Fix typo · a0681659
      IlyaMarkovMipt authored
      * Fix typo in net_box.lua in rare error case
      a0681659
    • imarkov's avatar
      fix: Broken compilation with gcc 4.6 · 459d63fc
      imarkov authored
      * Delete constructor delegation in ClientError
      * Move code body from one constructor to another
      459d63fc
    • Vladimir Davydov's avatar
      relay: send heartbeat on subscribe if replica is uptodate · f0892e5e
      Vladimir Davydov authored
      Currently, a relay sends a heartbeat message to the replica only if
      there were no WAL events for 'replication_timeout' seconds. As a result,
      a replica that happens to be up to date on subscribe will not update the
      lag until the timeout passes, which may delay configuration. Let's make
      the relay send a heartbeat message right after subscribe in case the
      replica is up to date.
      f0892e5e
    • Vladimir Davydov's avatar
      replication: add helpers to set and clear replica applier · 85310417
      Vladimir Davydov authored
      These operations are going to become more complicated than just setting
      a pointer so let's introduce helpers for them.
      85310417
    • Vladimir Davydov's avatar
      replication: gather all replicaset variables in struct · 580f1505
      Vladimir Davydov authored
      There is already a handful of global variables describing the replica
      set state and there is going to be more so let's consolidate them in
      a singleton struct:
      
        replicaset => replicaset.hash
        replica_pool => replicaset.pool
        anon_replicas => replicaset.anon
        replicaset_vclock => replicaset.vclock
      
      While we are at it, let's also move INSTANCE_UUID definition from
      xrow.c to replication.cc, where it truly belongs. The only reason
      I see for it to be defined in xrow.c is to compile vinyl unit tests
      without linking replication.o, but we can easily circumvent this by
      defining INSTANCE_UUID in vy_iterators_helpers.c.
      
      Suggested by @kostja
      580f1505
    • Vladimir Davydov's avatar
      replication: get rid of replica->pause_on_connect flag · 3adb9789
      Vladimir Davydov authored
      replicaset_connect() leaves appliers that failed to connect within the
      specified time period running. To prevent them from entering 'subscribe'
      stage prematurely (i.e. before replicaset_follow() is called), we set
      replica->pause_on_connect flag, which will force them to freeze upon
      successful connection. We clear this flag in replicaset_follow(). This
      juggling with flags looks ugly. Instead, let's stop failed appliers in
      replicaset_connect() and restart them in replicaset_follow().
      
      Follow-up #2958
      3adb9789
    • IlyaMarkovMipt's avatar
      fio: Add new methods · 70baba01
      IlyaMarkovMipt authored
      Introduce in fio new methods taken from Python os.path:
      
      * fio.path.exists()
      * fio.path.lexists()
      * fio.path.is_file()
      * fio.path.is_dir()
      * fio.path.is_mount()
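
      For reference, the Python functions these methods appear to be modelled
      on can be exercised as below. The mapping from fio names to os.path
      names (is_file to isfile, and so on) is inferred from the names only,
      and describe() is a helper made up for this example.

```python
import os
import tempfile


def describe(path: str) -> dict:
    """Collect the os.path checks that the new fio.path methods mirror."""
    return {
        "exists": os.path.exists(path),    # follows symlinks
        "lexists": os.path.lexists(path),  # true for broken symlinks too
        "is_file": os.path.isfile(path),
        "is_dir": os.path.isdir(path),
        "is_mount": os.path.ismount(path),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(describe(tmp))
```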
      70baba01
    • Konstantin Osipov's avatar
      4cdafae2
    • Ilya Konyukhov's avatar
      Setup apkbuild spec file · d1a6e58d
      Ilya Konyukhov authored
      It builds the last stable tarantool release at the moment (1.7.6).
      It clones tarantool from github repo, then updates submodules, then
      compiles tarantool, small and msgpuck.
      
      Then it packs everything into a package and cleans up after itself.
      d1a6e58d
    • Vladimir Davydov's avatar
      vinyl: ignore quota timeout on replication · 4d648f3b
      Vladimir Davydov authored
      If vinyl fails to do memory dumps in time on a replica (e.g. it ran
      out of disk space), replication will stop forever with an error, and
      the admin will have to call box.cfg() to restart replication. Since
      replication is asynchronous anyway, we shouldn't stop it on vinyl
      timeout - it isn't critical as the replica will recover as soon as
      the admin fixes the problem (e.g. frees up some disk space). Let's
      ignore vinyl timeout altogether for applier fibers (currently, we
      ignore it only on join) - the admin can monitor how badly a replica
      lags behind the master via box.info.replication lag/idle.
      
      Closes #3087
      4d648f3b
    • Vladimir Davydov's avatar
      Introduce BEFORE trigger · 5a27b737
      Vladimir Davydov authored
      To register a BEFORE trigger for a space, call space:before_replace()
      function. Similarly to space:on_replace(), this function takes a new
      trigger callback as the first argument and a function to remove from
      the registered trigger list as the second optional argument.
      
      Trigger callbacks are executed from space_execute_dml(), right before
      passing down a request to the engine implementation, but after resolving
      the space sequence. Just like on_replace, a before_replace callback is
      passed old and new tuples, but it can also return a tuple or nil, which
      will affect the current statement as follows:
      
       - If a callback function returns the old tuple, the statement is
         ignored and IPROTO_NOP is written to xlog to bump LSN.
      
       - If a callback function returns the new tuple or doesn't return
         anything, the statement is executed as is.
      
       - If a callback function returns nil, the statement is turned into
         DELETE.
      
       - If a callback function returns a tuple, the statement is turned
         into REPLACE for this tuple.
      
      Other return values result in ER_BEFORE_REPLACE_RET error.
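
      The return-value rules above can be summarized with a small model.
      It is not the trigger implementation: resolve_statement and the
      NO_RETURN sentinel are invented for the example, tuples are compared by
      identity, the order of the checks is an assumption, and ValueError
      stands in for ER_BEFORE_REPLACE_RET.

```python
NO_RETURN = object()  # models a callback that returns nothing at all


def resolve_statement(old, new, result=NO_RETURN):
    """Map a before_replace callback result to the resulting statement.

    old/new are the old and new tuples (None models nil).
    Returns a pair (statement kind, tuple to apply).
    """
    if result is NO_RETURN:
        return ("AS_IS", new)       # nothing returned: execute as is
    if result is old:
        return ("NOP", None)        # ignored, only the LSN is bumped
    if result is new:
        return ("AS_IS", new)
    if result is None:
        return ("DELETE", None)
    if isinstance(result, tuple):
        return ("REPLACE", result)  # some other tuple replaces the new one
    raise ValueError("ER_BEFORE_REPLACE_RET")
```

      For example, a callback that always returns the old tuple turns every
      statement on the space into a no-op.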
      
      Note, the trigger must not change the primary key of the old tuple,
      because that would require splitting the resulting statement into two -
      DELETE and REPLACE.
      
      The new trigger can be used to resolve asynchronous replication
      conflicts as illustrated by replication/before_replace test.
      
      Closes #2993
      5a27b737
  7. Jan 29, 2018
    • Vladimir Davydov's avatar
      iproto: add IPROTO_NOP request type · b73030f2
      Vladimir Davydov authored
      To implement space:before_replace trigger, we need to introduce a new
      request type for bumping LSN, because the new trigger may turn any DML
      operation into a no-op. Let's call it IPROTO_NOP. It is treated as DML
      (passed to apply_row, etc), but it is ignored by space_execute_dml() and
      so doesn't actually modify anything, only bumps LSN on the server. The
      new request type has name "NOP" (for xlog reader), however it isn't
      reported via box.stat().
      
      Needed for #2993
      b73030f2
    • Vladimir Davydov's avatar
      Move helpers for updating request from space.c to request.c · 05f9d568
      Vladimir Davydov authored
      This patch moves helpers used to fix requests after certain DML
      operations to a separate source file. Currently, there are only
      two of them, but there are going to be more so it seems to be a
      good idea to isolate them. No functional changes.
      
      Suggested by @kostja
      05f9d568
    • Vladimir Davydov's avatar
      space: introduce space_execute_dml helper · 9a85ae76
      Vladimir Davydov authored
      I'm planning to call BEFORE triggers in space.c. Since a BEFORE trigger
      can change the request type, we can't call it from functions handling
      particular kinds of requests (space_execute_replace() and others).
      So let's move the switch-case statement that executes different space
      callbacks depending on the request type from process_rw() to a new
      function, space_execute_dml(), defined in space.c. We will execute
      BEFORE triggers from this new function, right before dispatching the
      request by its type.
      
      Needed for #2993
      9a85ae76
  8. Jan 27, 2018