applier: fix upstream.lag calculations

upstream.lag is the delta between the moment when a row was written to
master's journal and the moment when it was received by the replica.
It's an important metric to check whether the replica has fallen too far
behind master.

Not all the rows coming from master have a valid time of creation. For
example, RAFT system messages don't have one, and we can't assign
correct time to them: these messages do not originate from the journal,
and assigning current time to them would lead to jumps in upstream.lag
results.

Stop updating upstream.lag for rows which don't have creation time
assigned.

The upstream.lag calculation changes were meant to fix the flaky
replication/errinj.test:

    Test failed! Result content mismatch:
    --- replication/errinj.result Fri Aug 13 15:15:35 2021
    +++ /tmp/tnt/rejects/replication/errinj.reject Fri Aug 13 15:40:39 2021
    @@ -310,7 +310,7 @@
     ...
     box.info.replication[1].upstream.lag < 1
     ---
    -- true
    +- false
     ...

But the changes were not enough, because now the test may see the
initial lag value (TIMEOUT_INFINITY). So fix the test as well by waiting
until upstream.lag becomes < 1.
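
The rule described above (leave the lag untouched for rows that carry no
creation time) can be illustrated with a small standalone model. This is
only a sketch in Python, not the applier code: the names update_lag and
INITIAL_LAG are hypothetical, using None to mark a row without a creation
time is an assumption, and math.inf merely stands in for TIMEOUT_INFINITY,
whose real value is not given here.

```python
import math

# Stand-in for the initial lag value (called TIMEOUT_INFINITY above);
# the real constant's value is an assumption we do not rely on.
INITIAL_LAG = math.inf


def update_lag(current_lag, row_created_at, received_at):
    """Return the lag after receiving one row.

    Assumption: a row without a creation time is marked with None.
    """
    if row_created_at is None:
        # E.g. a RAFT system message: keep the previous lag as is
        # instead of inventing a timestamp for the row.
        return current_lag
    return received_at - row_created_at


lag = INITIAL_LAG
# (created_at, received_at) pairs; None means "no creation time".
rows = [(None, 100.0), (99.8, 100.1), (None, 250.0)]
history = []
for created_at, received_at in rows:
    lag = update_lag(lag, created_at, received_at)
    history.append(lag)
```

In this model the first entry of history is still the initial value,
because the first row has no creation time. A check like "lag < 1" made
at that point fails, which is why the test has to wait for the condition
rather than assert it immediately. The last row does not cause a jump:
the lag keeps the value computed from the previous journal row.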