  1. Jun 20, 2022
  2. Jun 17, 2022
  3. Jun 15, 2022
  4. Jun 06, 2022
    • fix(discovery): don't fail if raft node is ready but leader_id is not · 31bf1bc2
      Georgy Moshkin authored
      If proc_discover is invoked after raft node was initialized but before
      raft leader was elected, it would return an error before this commit.
      Because of that it was impossible to restart the whole cluster at once.
      
      This commit changes proc_discover so that, if leader_id is not yet
      known, the normal discovery algorithm takes place.
      
      Closes #93
  5. Jun 01, 2022
  6. May 31, 2022
    • fix(discovery): fix hanging if some peers don't respond · 4d3116b0
      Georgy Moshkin authored
      Previously the discovery algorithm tried to reach each known peer
      sequentially, requiring each request to succeed before the next one
      could be attempted. This did not work in some cases (see the test in
      the previous commit).
      
      So the new algorithm instead makes a single attempt to reach each peer
      within a round, and if some failed they're retried in the next round of
      requests. This allows overall discovery to succeed in cases when some
      of the initial peers never respond.
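
A minimal sketch of the round-based retry idea, not picodata's actual implementation. The `Reply` type, the `discover` function, the `request` callback and the `max_rounds` limit are hypothetical names introduced for illustration. Real discovery also has to settle on a leader, which is left out here.

```rust
use std::collections::BTreeSet;

/// Hypothetical outcome of a single discovery request.
#[derive(Debug, Clone, PartialEq)]
pub enum Reply {
    /// The peer answered and reported the peers it knows about.
    Peers(Vec<String>),
    /// The peer didn't answer (timeout, connection refused, ...).
    NoResponse,
}

/// Runs discovery in rounds. Within a round every pending peer gets exactly
/// one attempt; peers that failed stay pending and are retried next round.
/// Returns the peers that answered within `max_rounds` (an assumed limit).
pub fn discover<F>(initial: &[&str], max_rounds: usize, mut request: F) -> BTreeSet<String>
where
    F: FnMut(&str) -> Reply,
{
    let mut known: BTreeSet<String> = initial.iter().map(|s| s.to_string()).collect();
    let mut responded: BTreeSet<String> = BTreeSet::new();

    for _ in 0..max_rounds {
        // Everyone we know about who hasn't answered yet.
        let pending: Vec<String> = known.difference(&responded).cloned().collect();
        if pending.is_empty() {
            break;
        }
        for peer in pending {
            // A failure here doesn't block the remaining peers of this round.
            if let Reply::Peers(more) = request(peer.as_str()) {
                responded.insert(peer);
                known.extend(more);
            }
        }
    }
    responded
}
```

With initial peers `a` and `b`, where `a` never answers and `b` reports a third peer `c`, the sketch still reaches `b` and `c` instead of hanging on `a`.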
      
      Closes #54
    • fix: remove unique index on peer_address · a06ff88d
      Yaroslav Dynnikov authored
      The `peer_address` parameter is an inbound address used for
      communication with the peer. It shouldn't be confused with the listen
      address. The persisted `peer_address` may become obsolete due to
      circumstances beyond picodata's control (e.g. DNS or IP changes). Thus
      there's no point in validating it up front, including the uniqueness
      check.
      
      There's also no use case for looking up a peer by `peer_address`.
      
      To sum up, an index over `peer_address` is useless. It only creates
      problems and causes panics.
      
      Close https://git.picodata.io/picodata/picodata/picodata/-/issues/88
  7. May 30, 2022
  8. May 26, 2022
  9. May 23, 2022
  10. May 21, 2022
    • refactor: employ topology module in start_boot · 33ac49d9
      Yaroslav Dynnikov authored
      It's necessary to encapsulate topology management logic away from main.
    • feature: topology module · 955aa02e
      Yaroslav Dynnikov authored
      It encapsulates the logic of processing a batch of JoinRequests.
      
      The topology module will be quite important in picodata. This first
      version lacks a lot of features, but a few commits later it's going to
      implement quite a lot of logic.
      
      When a new instance joins, there's one complex thing: the raft leader
      has to decide where this new instance is going to be placed, i.e. which
      replicaset it should join. Many different parameters influence this:
      `replication_factor`, `failure-domain`, and of course the existing
      topology. So, this new `topology` module must make the decision.
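
To make the placement decision concrete, here is a deliberately simplified sketch. It is an assumption, not the module's real logic: it only considers `replication_factor` and the existing topology, ignores failure domains entirely, and the `Topology` struct, its `join` method and the `r1`, `r2`, ... naming scheme are hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified view of the cluster topology.
pub struct Topology {
    replication_factor: usize,
    /// Replicaset id -> ids of the instances that belong to it.
    replicasets: BTreeMap<String, Vec<String>>,
}

impl Topology {
    pub fn new(replication_factor: usize) -> Self {
        assert!(replication_factor > 0, "replication_factor must be positive");
        Topology {
            replication_factor,
            replicasets: BTreeMap::new(),
        }
    }

    /// Places a new instance and returns the id of the chosen replicaset.
    /// Assumed policy: fill the first replicaset that still has room,
    /// otherwise start a new one.
    pub fn join(&mut self, instance_id: &str) -> String {
        let with_room = self
            .replicasets
            .iter()
            .find(|(_, members)| members.len() < self.replication_factor)
            .map(|(id, _)| id.clone());

        // Invented naming scheme for new replicasets: r1, r2, ...
        let id = with_room.unwrap_or_else(|| format!("r{}", self.replicasets.len() + 1));

        self.replicasets
            .entry(id.clone())
            .or_default()
            .push(instance_id.to_string());
        id
    }
}
```

Keeping the decision in one place like this is what makes it unit-testable without starting a cluster.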
      
      This patch only refactors the current Picodata behavior, and doesn't
      bring new features for its users. Instead, it opens the door to future
      development.
      
      Also, this patch provides a unit-testing basis for the future features.
  11. May 20, 2022
  12. May 17, 2022
  13. May 16, 2022
    • test: fix flaky args::tests · 18ccf158
      Yaroslav Dynnikov authored
      By default cargo runs tests in parallel in multiple threads. Both
      `test_log_level` and `test_parse` access environment variables, which
      are shared across threads. Consequently, modifying them concurrently
      makes the tests fail.
      
      This patch merges the two tests into one so that they run sequentially.
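
A rough sketch of the same idea, assuming a made-up variable `SKETCH_LOG_LEVEL` and helper names that are not taken from the picodata sources. All steps that touch the process-wide environment live in one function, so they cannot interleave the way two separate `#[test]` functions can.

```rust
use std::env;

/// Hypothetical variable name, used only for this illustration.
const VAR: &str = "SKETCH_LOG_LEVEL";

/// Reads the log level from the environment, defaulting to "info".
pub fn log_level() -> String {
    env::var(VAR).unwrap_or_else(|_| "info".to_string())
}

/// Runs both checks one after another and returns what was observed.
/// In a real test suite this body would sit in a single `#[test]` function.
///
/// `set_var` and `remove_var` are `unsafe fn` as of the 2024 edition;
/// on earlier editions the `unsafe` blocks are merely redundant.
#[allow(unused_unsafe)]
pub fn run_env_checks() -> Vec<String> {
    let mut seen = Vec::new();

    // Step 1: variable unset, the default applies.
    unsafe { env::remove_var(VAR) };
    seen.push(log_level());

    // Step 2: variable set, its value is picked up.
    unsafe { env::set_var(VAR, "debug") };
    seen.push(log_level());

    // Leave the environment as we found it.
    unsafe { env::remove_var(VAR) };
    seen
}
```

An alternative that keeps the tests separate is to guard the environment with a shared lock or to run with `cargo test -- --test-threads=1`; the commit chose merging instead.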
  14. May 13, 2022
  15. May 12, 2022
    • chore: refine logging · c01f07df
      Yaroslav Dynnikov authored
      1. Lower log level of connection errors in `network.rs`.
      2. Give raft fibers a name.
    • feature: concurrent join requests handling · 9b079eae
      Yaroslav Dynnikov authored
      There were some problems with join requests synchronization. Raft
      forbids proposing a configuration change if there's another one
      uncommitted (see [1]). In that case, it replaces the `EntryConfChange`
      with an `EntryNormal`. This can happen at any time, even without bugs
      in the code, due to network partitioning, and it's the responsibility
      of the picodata product to handle it properly.
      
      Earlier, there was no way to wait until raft leaves the joint state.
      This slowed down cluster assembly and made it race-prone. Waiting for
      cluster readiness is also important in tests. Some operations (the
      most important of them being leader switching) are impossible until
      an instance finishes its promotion to a voter. For instance, raft
      rejects `MsgTimeoutNow` unless the node is promotable (see [2]). This
      made some testing scenarios flaky.
      
      This patch introduces a new synchronization primitive, `JointStateLatch`.
      The latch is held on the leader and is locked upon
      `raw_node.propose_conf_change()`. It's unlocked only when the second
      (implicit) conf change, which represents leaving the joint state, is
      committed. The latch also tracks the index of the corresponding
      `EntryConfChange`. Even if raft ignores it for any reason, the latch is
      still unlocked as soon as the committed index exceeds the index tracked
      by the latch.
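
The latch behavior described above can be sketched roughly as follows. Only the name `JointStateLatch` comes from the commit message; the fields, method names and the single-threaded, non-blocking shape are assumptions made for illustration, and the real primitive presumably lets callers wait on it.

```rust
/// Illustrative latch: remembers the index of the pending `EntryConfChange`.
#[derive(Debug, Default)]
pub struct JointStateLatch {
    /// `Some(index)` while a conf change is in flight, `None` otherwise.
    pending: Option<u64>,
}

impl JointStateLatch {
    pub fn is_locked(&self) -> bool {
        self.pending.is_some()
    }

    /// Called when a conf change is proposed at log position `index`.
    /// Fails with the index of the in-flight change if already locked.
    pub fn lock(&mut self, index: u64) -> Result<(), u64> {
        match self.pending {
            Some(in_flight) => Err(in_flight),
            None => {
                self.pending = Some(index);
                Ok(())
            }
        }
    }

    /// Called when the implicit "leave joint state" conf change is committed.
    pub fn notify_left_joint(&mut self) {
        self.pending = None;
    }

    /// Called whenever the committed index advances. If raft dropped the
    /// conf change, this still releases the latch once the committed index
    /// exceeds the tracked one.
    pub fn notify_committed(&mut self, committed: u64) {
        if let Some(index) = self.pending {
            if committed > index {
                self.pending = None;
            }
        }
    }
}
```

Tracking the index is what keeps the latch from staying locked forever when the proposal is silently turned into an `EntryNormal`.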
      
      [1] https://github.com/tikv/raft-rs/blob/v0.6.0/src/raft.rs#L2014-L2026
      [2] https://github.com/tikv/raft-rs/blob/v0.6.0/src/raft.rs#L2314
      
      Close https://git.picodata.io/picodata/picodata/picodata/-/issues/47
      Close https://git.picodata.io/picodata/picodata/picodata/-/issues/53
    • fix: pytest wait_ready implementation · 7b94717a
      Yaroslav Dynnikov authored
      Waiting for a valid `leader_id` on a node isn't enough. It may already
      have one, but still be a Learner. Instead, the fixture should wait until
      the node is promoted to voter.
  16. May 08, 2022
  17. Apr 27, 2022