  1. May 31, 2018
    • session: move salt into iproto connection · 860d6b3f
      Vladislav Shpilevoy authored
      Session salt is 32 random bytes that are used to encode the
      password during user authentication. The salt is not used in
      non-binary sessions, so it can be moved to the iproto connection.
    • yaml: introduce yaml.decode tag_only option · 567769d0
      Vladislav Shpilevoy authored
      The yaml.decode tag_only option allows decoding only the tag of a
      YAML document. For #2677 it is needed to detect different push
      types in the text console: print pushes via console.print, and
      actual pushes via box.session.push.
      
      To distinguish them, YAML tags will be used. A client console will
      try to find a tag in each message. If a tag is absent, the message
      is a plain response to a request.
      
      If the tag is !print!, then the document consists of a single
      string that must be printed. Such a document must be decoded to
      get the printed string, so the call sequence is
      yaml.decode(tag_only) + yaml.decode. The reason a print message
      must be decoded is that the result of print() on the server side
      may not be well-formed YAML, so it has to be encoded into YAML to
      be sent correctly. For example, suppose the server side does
      something like this:
      
      console.print('very bad YAML string')
      
      The result of print() is not a YAML document, so it must be
      encoded into YAML on the server side before being sent.
      
      If the tag is !push!, then the document was sent via
      box.session.push and must not be decoded. It can simply be printed
      or ignored.
      
      Needed for #2677
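
      A minimal sketch of the client-side dispatch described above. The
      exact tag_only calling convention and return shape are assumptions
      here, not the final API:

        local yaml = require('yaml')

        -- Hypothetical handler for one incoming console message.
        local function handle(msg)
            -- First pass: extract only the tag, without decoding the body.
            local tag = yaml.decode(msg, {tag_only = true})
            if tag == nil then
                return msg                  -- plain response to a request
            elseif tag:find('print') then
                print(yaml.decode(msg))     -- second pass: decode the string
            elseif tag:find('push') then
                print(msg)                  -- push: show as-is, do not decode
            end
        end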
    • lua: merge encode_tagged into encode options · b2da28f8
      Vladislav Shpilevoy authored
      Encode_tagged is a workaround that makes it possible to pass
      options to yaml.encode().
      
      Before the patch, yaml.encode() in fact had this signature:
      yaml.encode(...). So it was impossible to add any options to this
      function - all of them would be treated as parameters. But the
      documentation says that the function has the signature
      yaml.encode(value): https://tarantool.io/en/doc/1.9/reference/reference_lua/yaml.html?highlight=yaml#lua-function.yaml.encode
      
      I hope that anyone who uses yaml.encode() does it according to the
      documentation. Then I can add the {tag_prefix, tag_handle} options
      to yaml.encode() and remove the yaml.encode_tagged() workaround.
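
      A short usage sketch of the merged option form. The option names
      follow the commit message; the values shown are illustrative
      assumptions:

        local yaml = require('yaml')

        -- Encode a document with a global tag, e.g. for a console push.
        -- The handle and prefix values here are examples, not mandated
        -- by this patch.
        local doc = yaml.encode({'hello'}, {
            tag_handle = '!push!',
            tag_prefix = 'tag:tarantool.io/push,2018',
        })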
  2. May 30, 2018
    • yaml: introduce yaml.encode_tagged · ddcd95a0
      Vladislav Shpilevoy authored
      Encode_tagged allows defining one global YAML tag for a document.
      Tagged YAML documents are going to be used for console text pushes
      to distinguish actual box.session.push() from console.print(). The
      first will have the tag !push, the second - !print.
    • engine: constify vclock argument · 088e3e24
      Vladimir Davydov authored
      None of engine_wait_checkpoint, engine_commit_checkpoint,
      engine_join, or engine_backup needs to modify the vclock argument.
    • Allow to increase box.cfg.vinyl_memory and memtx_memory at runtime · 30492862
      Vladimir Davydov authored
      The slab arena can grow dynamically, so all we need to do is
      increase the quota limit. Decreasing the limits is still
      explicitly prohibited, because the slab arena never unmaps slabs.
      
      Closes #2634
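
      For illustration, the runtime reconfiguration this enables looks
      like the following (sizes are arbitrary examples):

        -- Initial configuration.
        box.cfg{memtx_memory = 256 * 1024 * 1024,
                vinyl_memory = 128 * 1024 * 1024}

        -- Raising the limits at runtime now succeeds...
        box.cfg{memtx_memory = 512 * 1024 * 1024}

        -- ...while lowering them is still rejected, because the slab
        -- arena never unmaps slabs.
        box.cfg{memtx_memory = 128 * 1024 * 1024}  -- raises an error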
    • vinyl: update recovery context with records written during recovery · d135f39c
      Vladimir Davydov authored
      During recovery, we may write VY_LOG_CREATE_LSM and VY_LOG_DROP_LSM
      records we failed to write before restart (because those records are
      written after WAL and hence may not make it to vylog). Right after
      recovery we invoke garbage collection to drop incomplete runs. Once
      the VY_LOG_PREPARE_LSM record is introduced, we will also collect
      incomplete LSM trees there (those we failed to build). However,
      there may be LSM trees we managed to build but failed to write
      VY_LOG_CREATE_LSM for. This is OK, as we will retry the vylog
      write, but currently it isn't reflected in the recovery context
      used for garbage collection. To avoid purging such LSM trees, let's
      update the recovery context with records written during recovery.
      
      Needed for #1653
  3. May 29, 2018
  4. May 25, 2018
    • replication: display downstream status at upstream · 3db1dee9
      Konstantin Belyavskiy authored
      This fix improves the 'box.info.replication' output. If the
      downstream fails and thus disconnects from the upstream, improve
      logging by printing 'status: disconnected' and the error message
      on both sides (master and replica).
      
      Closes #3365
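
      A sketch of how the improved output can be inspected on the
      master; the downstream field names are assumed from the usual
      box.info.replication shape:

        -- Check the downstream state of every replica.
        for id, replica in pairs(box.info.replication) do
            local down = replica.downstream
            if down ~= nil and down.status == 'disconnected' then
                print(('replica %d: %s'):format(id, down.message or '?'))
            end
        end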
    • replication: do not delete relay on applier disconnect · adc28591
      Konstantin Belyavskiy authored
      This is part of a more complex task aiming to improve logging.
      Do not destroy the relay, since it stores the last error, which
      can be useful for diagnostics.
      Now the relay is created together with the replica and always
      exists, so several NULL checks are removed as well.
      Add relay_state { OFF, FOLLOW, STOPPED } to track replica
      presence: once connected, the relay is either FOLLOW or STOPPED
      until the master is reset.
      Updated with @kostja's proposal.
      
      Used for #3365.
    • vinyl: purge dropped indexes from vylog on garbage collection · a2d1d2a2
      Vladimir Davydov authored
      Currently, when an index is dropped, we remove all ranges/slices
      associated with it and mark all its runs as dropped in vylog
      immediately. To find the ranges/slices/runs, we use the vy_lsm
      struct, see vy_log_lsm_prune.

      The problem is that the vy_lsm struct may be inconsistent with the
      state stored in vylog if an index drop races with compaction,
      because we first write the changes done by a compaction task to
      vylog and only then update the vy_lsm struct, see
      vy_task_compact_complete. Since a write to vylog yields, this
      opens a time window during which the index can be dropped. If this
      happens, objects that were created by compaction but haven't been
      logged yet (such as new runs, slices, and ranges) will be deleted
      from vylog by the index drop, and this will permanently break
      vylog, making recovery impossible.
      
      To fix this issue, let's rework garbage collection of the objects
      associated with dropped indexes as follows. Now, when an index is
      dropped, we write a single record to vylog, VY_LOG_DROP_LSM, i.e.
      just mark the index as dropped without deleting the associated
      objects. The actual index cleanup takes place in the garbage
      collection procedure, see vy_gc, which purges all ranges/slices
      linked to marked indexes from vylog and marks all their runs as
      dropped. When all runs have actually been deleted from disk and
      "forgotten" in vylog, we remove the index record from vylog by
      writing a VY_LOG_FORGET_LSM record. Since the garbage collection
      procedure uses vylog itself instead of the vy_lsm struct for
      iterating over vinyl objects, no race between index drop and
      dump/compaction can lead to a broken vylog anymore.
      
      Closes #3416
    • vinyl: store lsn of index drop record in vylog · 264f7e3f
      Vladimir Davydov authored
      This is required to rework garbage collection in vinyl.
    • alter: pass lsn of index drop record to engine · 1af04afe
      Vladimir Davydov authored
      We pass the lsn of index alter/create records; let's pass the lsn
      of the drop record as well, for consistency. This is also needed
      by vinyl to store it in vylog (see the next patch).
    • vinyl: do not reuse lsm objects during recovery from vylog · 31ab8e03
      Vladimir Davydov authored
      If an index was dropped and then recreated, then while replaying
      vylog we reuse the vy_lsm_recovery_info object corresponding to
      it. There's no reason to do that instead of simply allocating a
      new object - the amount of memory saved is negligible, while the
      code looks more complex. Let's simplify the code: whenever we see
      VY_LOG_CREATE_LSM, create a new vy_lsm_recovery_info object and
      replace the old incarnation, if any, in the hash map.
    • test: update replication_connect_timeout in tests to a lower value · e9bf00fc
      Konstantin Osipov authored
      replication: make replication_connect_timeout dynamic
    • test: rework test case for memtx async garbage collection · c5f98b91
      Vladimir Davydov authored
      Do not use errinj, as it is unreliable. Check that:
       - No memory is freed immediately after a space drop (WAL is off).
       - All memory is freed asynchronously after a yield.
    • replication: fix log message in case of sync failure · 6c35bf9b
      Vladimir Davydov authored
      replicaset_sync() returns not only when the instance has
      synchronized to the connected replicas, but also when some
      replicas have disconnected and the quorum can't be formed any
      more. Nevertheless, it always prints that sync has been completed.
      Fix it.
      
      See #3422
    • replication: do not stop syncing if replicas are loading · 1785e79c
      Vladimir Davydov authored
      If a replica disconnects while sync is in progress, box.cfg{} may
      stop syncing, leaving the instance in 'orphan' mode. This happens
      if not enough replicas are connected to form a quorum. That makes
      sense e.g. on a network error, but not when a replica is loading,
      because in the latter case it should be up and running quite soon.
      Let's account for replicas that disconnected because they haven't
      completed initial configuration yet, and continue syncing while
      connected + loading > quorum.
      
      Closes #3422
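
      A Lua-style pseudocode sketch of the decision described above; the
      counters are hypothetical, the real logic lives in the C sync loop:

        -- connected: replicas we have synced to; loading: replicas that
        -- dropped the connection because they are still bootstrapping.
        local function keep_syncing(connected, loading, quorum)
            -- A loading replica should be back soon, so it still counts
            -- toward the decision to keep waiting for quorum.
            return connected + loading > quorum
        end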
    • replication: use applier_state to check quorum · ca53ab91
      Konstantin Belyavskiy authored
      Small refactoring: remove 'enum replica_state' and instead reuse a
      subset of the applier state machine ('enum applier_state') to
      check if we have achieved replication quorum and hence can leave
      read-only mode.
    • replication: change default replication_connect_timeout to 30 seconds · 06a63686
      Konstantin Osipov authored
      The default of 4 seconds is too low to bootstrap a large cluster.
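
      The new default can still be overridden per instance, for example:

        -- Restore a short timeout on a small cluster where bootstrap
        -- is fast anyway (4 is just an example value).
        box.cfg{replication_connect_timeout = 4}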
    • iproto: 'iproto_msg_max' -> 'net_msg_max' in message · 020fb77f
      Vladislav Shpilevoy authored
      Closes #3425
  5. May 24, 2018
    • replication: add strict ordering for appliers operating in a full mesh · edd76a2a
      Georgy Kirichenko authored
      In some cases, when applier processing yielded, another applier
      might start a conflicting operation and break replication and
      database consistency.
      Now an applier locks a per-server-id latch before processing a
      transaction. This guarantees that at any given moment there is
      only one applier request in progress for each server.

      The problem was very rare until full mesh topologies in vinyl
      became commonplace.
      
      Fixes gh-3339
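
      A Lua-style pseudocode sketch of the serialization scheme; the
      latch table and apply function are hypothetical, the actual change
      is in the C applier:

        -- One latch per server id; taking it serializes all appliers
        -- that carry transactions originating from the same server.
        local latches = {}  -- server id -> latch

        local function apply_tx(server_id, tx)
            latches[server_id]:lock()
            -- Even if applying yields here, no other applier can start
            -- a conflicting operation for the same server id.
            process(tx)
            latches[server_id]:unlock()
        end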
    • memtx: run garbage collection on demand · 39c8b526
      Vladimir Davydov authored
      When a memtx space is dropped or truncated, we delegate freeing
      the tuples stored in it to a background fiber, so as not to block
      the caller (and the tx thread) for too long. It turns out this
      doesn't work well for ephemeral spaces, which share the
      destruction code with normal spaces: the problem is that the user
      might issue a lot of complex SQL SELECT statements that create
      many ephemeral spaces and do not yield, and hence don't give the
      garbage collection fiber a chance to clean up. There's a test that
      emulates this, 2.0:test/sql-tap/gh-3083-ephemeral-unref-tuples.test.lua.
      For this test to pass, let's run the garbage collection procedure
      on demand, i.e. whenever a memtx allocation function fails to
      allocate memory.
      
      Follow-up #3408
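
      A Lua-style pseudocode sketch of the on-demand fallback; the
      function names are hypothetical, the real code is in the memtx
      allocator:

        local function memtx_alloc(size)
            local ptr = try_alloc(size)
            -- On failure, free dropped-space garbage in batches and
            -- retry until the allocation succeeds or nothing is left.
            while ptr == nil and memtx_gc_step() do
                ptr = try_alloc(size)
            end
            return ptr
        end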
    • memtx: rework background garbage collection procedure · cc0e5b4c
      Vladimir Davydov authored
      Currently, the engine has no control over the yields issued during
      asynchronous index destruction. As a result, it can't force gc
      when there's not enough memory. To fix that, let's make the gc
      callback stateful: it is now supposed to free some objects and
      return true if there are still more objects to free, or false
      otherwise. Yields are now done by the memtx engine itself after
      each gc callback invocation.
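
      The background loop then reduces to the following shape (a sketch
      with a hypothetical gc_step callback, not the actual C code):

        local fiber = require('fiber')

        -- gc_step() frees a batch of objects and reports whether more
        -- work remains; the engine, not the callback, decides when to
        -- yield between batches.
        while gc_step() do
            fiber.yield()
        end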
  6. May 22, 2018
  7. May 21, 2018
    • Remove unused FDGuard · f57fd113
      Vladislav Shpilevoy authored
    • memtx: free tuples asynchronously when primary index is dropped · 2a1482f3
      Vladimir Davydov authored
      When a memtx space is dropped or truncated, we have to unreference
      all the tuples stored in it. Currently, we do it synchronously,
      thus blocking the tx thread. If a space is big, the tx thread may
      remain blocked for several seconds, which is unacceptable. This
      patch makes drop/truncate hand the actual work over to a
      background fiber.

      Before this patch, dropping a space with 10M 64-byte records took
      more than 0.5 seconds. After this patch, it takes less than
      1 millisecond.
      
      Closes #3408
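
      A quick way to observe the improvement from the Lua console
      (a smaller space than the 10M-record one quoted above):

        local fiber = require('fiber')

        local s = box.schema.space.create('big')
        s:create_index('pk')
        for i = 1, 1e6 do s:replace{i, i} end

        local t = fiber.time()
        s:drop()  -- returns almost immediately; tuples are freed
                  -- by a background fiber
        print(('drop took %f sec'):format(fiber.time() - t))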