- Jun 20, 2022
-
-
-
-
Valentin Syrovatskiy authored
-
- Jun 17, 2022
-
-
Bootstrapping the replication was implemented in 2dac77c5. But it was configured on the new instance only. The old instance (that joined earlier) couldn't update `box.cfg({replication})` until now. Close https://git.picodata.io/picodata/picodata/picodata/-/issues/52
-
- Jun 15, 2022
-
-
- Jun 06, 2022
-
-
Georgy Moshkin authored
If proc_discover is invoked after raft node was initialized but before raft leader was elected, it would return an error before this commit. Because of that it was impossible to restart the whole cluster at once. This commit change proc_discover such that in case leader_id is not ready, the normal discovery algorithm takes place. Closes #93
-
- Jun 01, 2022
-
-
Sergey V authored
* Make `--cluster-id` CLI mandatory. * Handle cluster_id mismatch in raft_join. When an instance attempts to join the cluster and the instances's `--instance-id` parameter mismatches the cluster_id of the cluster an error is raised inside the raft_join handler.
-
- May 31, 2022
-
-
Yaroslav Dynnikov authored
The `peer_address` parameter is an inbound address used for communication with the peer. It shouldn't be confused with the listen address. The persisted `peer_address` may become obsolete due to circumstances beyond picodata control (e.g. DNS or IP changes). Thus there's no point in its prior validation, including the uniqueness check. There's also no such task as getting peer by peer_address. To sum up, an index over `peer_address` is useless. It only creates problems and causes panics. Close https://git.picodata.io/picodata/picodata/picodata/-/issues/88
-
- May 30, 2022
-
-
-
Yaroslav Dynnikov authored
Picodata already assigns `replicaset_id` to an instance when it joins, but it wasn't used in Tarantool `box.cfg` yet. Now it is. It's also important to set up listen port in `start_join` immediately. Without it Tarantool will stuck waiting for connection to self. Part of https://git.picodata.io/picodata/picodata/picodata/-/issues/52
-
-
- May 26, 2022
-
-
Yaroslav Dynnikov authored
-
- May 23, 2022
-
-
Yaroslav Dynnikov authored
- Add corresponding field to the Peer struct. - Generate it in the topology module. - Use it in `box.cfg`. Close https://git.picodata.io/picodata/picodata/picodata/-/issues/51
-
Yaroslav Dynnikov authored
Address `replication_factor` when choosing `relicaset_id` for a new instance. It dosn't consider `failure_domain` yet, but takes into account the number of instances. Close https://git.picodata.io/picodata/picodata/picodata/-/issues/68
-
Yaroslav Dynnikov authored
- Choose it in the topology module if it's not provided in a `JoinRequest`. - Persist in `raft_group` space. - Respond with an error if `JoinRequest` contains different `replicaset_id`. - In `JoinResponse` it's transferred automatically. Part of https://git.picodata.io/picodata/picodata/picodata/-/issues/51
-
Yaroslav Dynnikov authored
- Generate it in the topology module - Persist it in `raft_group` space - Transfer it in `JoinResponse` - Use it in `box.cfg` Close https://git.picodata.io/picodata/picodata/picodata/-/issues/50
-
- May 21, 2022
-
-
Yaroslav Dynnikov authored
It encapsulates the logics of a JoinRequest batch processing. Topology module will be quite important in picodata. This first version misses a lot of features, but a few commits later it's going to implement quite a lot of logics. When a new instance is joined - there's one complex thing: raft leader has to decide where this new instance is going to be emplaced, i.e. what replicaset should it join. There're many different parameters have an influence - `repliction_factor`, `failure-domain`, and of course the existing topology. So, this new `topology` module must make the decision. This patch only refactors the current Picodata behavior, and doesn't bring new features for its users. Instead, it opens the door to a future development. Also, this patch provides a unit-testing basis for the future features.
-
- May 20, 2022
-
-
Yaroslav Dynnikov authored
Both JoinRequest and JoinResponse are going to be used in other modules. Move them one level above from `traft::node::*` to `traft::*`.
-
Yaroslav Dynnikov authored
One of the most tricky Raft cases is a so-called ABA problem [1]. In that case it's important to protect a batch of join requests with a term number. Since the whole batch is supplied with atomicity-sensitive uuids, applying it on a different term may cause an inconsistency, which is very, veeeery bad. [1] https://en.wikipedia.org/wiki/ABA_problem
-
- May 13, 2022
-
-
- May 12, 2022
-
-
Yaroslav Dynnikov authored
1. Lower log level of connection errors in `netork.rs`. 2. Give raft fibers a name.
-
Yaroslav Dynnikov authored
There were some problems with join requests synchronization. Raft forbids proposing a configuration change if there's another one uncommitted (see [1]). In that case, it replaces an `EntryConfChange` with an `EntryNormal`. It could happen at any time even without bugs in code due to the network partitioning, and its the repsonsibility of the picodata product to handle it properly. Earlier, there was no way to wait when raft leaves the joint state. It used to slow down cluster assembling and made it race-prone. The waiting for the cluster readiness is also important in tests. Some operations (the most important amongst them is leader switching) are impossible until instance finishes promotion to a voter. For instance, raft rejects `MsgTimeoutNow` unless the node is promotable (see [2]). It makes some testing scenarios flaky. This patch introduces new synchronization primitive - `JointStateLatch`. The latch is held on the leader and is locked upon `raw_node.propose_conf_change()`. It's unlocked only when the second (implicit) conf change that represents leaving joint state is committed. The latch also tracks the index of the corresponding `EntryConfChange`. Even if raft ignores it for any reason, the latch is still unlocked as soon as the committed index exceeds the one of the latch. [1] https://github.com/tikv/raft-rs/blob/v0.6.0/src/raft.rs#L2014-L2026 [2] https://github.com/tikv/raft-rs/blob/v0.6.0/src/raft.rs#L2314 Close https://git.picodata.io/picodata/picodata/picodata/-/issues/47 Close https://git.picodata.io/picodata/picodata/picodata/-/issues/53
-
Yaroslav Dynnikov authored
Waiting for a valid `leader_id` on a node isn't enough. It may already have one, but still be a Learner. Instead, the fixture should wait until the node is promoted to voter.
-
- Apr 21, 2022
-
-
Yaroslav Dynnikov authored
-
- Apr 20, 2022
-
-
Yaroslav Dynnikov authored
-
Yaroslav Dynnikov authored
Use common `Notify` instead of a cond. It makes the code more clear and reliable, because each request is tracked individually and can't be affected by different yields in the main_loop.
-
Yaroslav Dynnikov authored
-
Yaroslav Dynnikov authored
-
Yaroslav Dynnikov authored
Replace macros with function in `traft::node`.
-
Yaroslav Dynnikov authored
-
- Apr 18, 2022
-
-
Yaroslav Dynnikov authored
-
Yaroslav Dynnikov authored
-
Yaroslav Dynnikov authored
feat: raft peer discovery PoC chore: prevent C functions from being optimized out feat: improve peer discovery fix: fix tests after making instance_id arg mandatory Smart supevision with fork Make it work Under development IPC messages to supervisor One more little step: entrypoint enum Arrange IPC from child to supervisor Remove tarantool_main macro Persist snapshot Fix some fresh bugs Implement postjoin Discovery under refactoring Enhance discovery Working on discovery Fix all discovery bugs known so far Draft join algorithm Fail applying snapshot Joining a learner works Cleanup snapshot generation Reorganize traft code and call join automatically Change peer.commit_index type from option to u64 Implement autopromotion to voter Implement read_index Take read_index before self-romotion Cleanup excess logs Cleanup logs and code Deep refactoring in progress Finish refactoring db schema Embed entries applying inside traft node Refactor raft node communication Replace fiber channel with a mailbox, which is a `Vec<_>` + fiber cond. It allows to batch raft commands in a more predictable way and makes the code less error-prone. Remove commented code Simplify raft nodes interaction over net_box Eliminate `traft::Message` struct because its internals aren't used. Instead, serialize `raft::Message` using protobuf. Batch ConnectionPool requests 1. Send messages in batches. 2. Allow changing connection uri. 3. Close unused connections after `inactivity_timeout`. Enhance the raft node 1. Collect results from raft node 2. Fix initial bootstrap which used to fail due to fiber race. 3. Wrap raft storage operations in a transaction. Bump tarantool module Add documentation draft Cleanup warnings Try fixing tests Fix test_storage_log Fix test_traft_pool Fix luatest single and couple Start fixing threesome test Implement concurrent join requests handling
-
- Feb 22, 2022
-
-
Yaroslav Dynnikov authored
This patch introduces two tests: 1. `couple.test_failover` reproduces heatbeat timeout on a follower that leads to a new election. 2. `threesome.test_leader_dispuption` simulates disconnected follower. It shouldn't disrupt the leader in this case. Leader death is already tested in `threesome.test_log_rollback`. Close https://gitlab.com/picodata/picodata/picodata/-/issues/25
-
Yaroslav Dynnikov authored
Also, fix the code to pass the test: 1. Use `replace` instead of `insert` when appending entries. They may be overridden. 2. Increase pool worker queue size. Raft node may send several messages within a single tick. Without this change the second message is dropped. The second change is a poorly designed workaround. Constant limit doesn't seem to be both cost-effective and reliable at the same time, but we don't have a better solution at the moment. Close https://gitlab.com/picodata/picodata/picodata/-/issues/26
-
Yaroslav Dynnikov authored
It makes testing easier. Monkey patching the handler isn't necessary anymore. Instead, it's mocked now.
-
- Feb 21, 2022
-
-
Yaroslav Dynnikov authored
Follower makes a proposal. It's applied on both instances. This patch also introduces a new API `picolib.raft_status`. Close https://gitlab.com/picodata/picodata/picodata/-/issues/24
-
- Feb 18, 2022
-
-
Georgy Moshkin authored
+ tarantool-sys submodule + tarantool-patches directory + build.rs build script to patch and build tarantool + static linking with tarantool + refactoring for cli arguments
-