fix: buckets resolving bugs (!538) · Merge requests · core / sbroad

Merged Arseniy Volynets requested to merge av/fix-buckets-discovery into main 6 months ago

Summarize the changes

fix: buckets resolving bugs

For the query select * from t where sk = 1 and sk = 2 we failed to resolve empty buckets. The problem was that we didn't compute the distribution of the (sk, sk) tuple correctly: it has two keys, one on the first column and one on the second column. But we always computed only a single key.

To compute all keys we now take a Cartesian product of all groups of columns of a tuple, where each group consists of the tuple columns that correspond to a single column of the sharding key.

Suppose the tuple is (a, b, a), where a and b refer to sharding columns. Then we have two groups:

a -> {0, 2}

b -> {1}

And the distribution keys are: {0, 2} x {1} = {(0, 1), (2, 1)}
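The grouping and product described above can be sketched as follows. This is a minimal illustration, not the code from this merge request: the function name distribution_keys and the representation of a tuple as a slice of column names are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: returns every distribution key of `tuple` as a list
/// of tuple positions, one position per sharding-key column.
/// `tuple[i]` is the name of the column that tuple position `i` refers to;
/// `sharding_key` lists the sharding-key columns in order.
fn distribution_keys(tuple: &[&str], sharding_key: &[&str]) -> Vec<Vec<usize>> {
    // Group tuple positions by the column they refer to.
    let mut groups: BTreeMap<&str, Vec<usize>> = BTreeMap::new();
    for (pos, col) in tuple.iter().enumerate() {
        groups.entry(*col).or_default().push(pos);
    }

    // Cartesian product: start from one empty key and extend it
    // with every position of each sharding column in turn.
    let mut keys: Vec<Vec<usize>> = vec![Vec::new()];
    for col in sharding_key {
        let Some(positions) = groups.get(*col) else {
            // A sharding column is absent from the tuple: no key can be formed.
            return Vec::new();
        };
        let mut next = Vec::with_capacity(keys.len() * positions.len());
        for key in &keys {
            for &pos in positions {
                let mut extended = key.clone();
                extended.push(pos);
                next.push(extended);
            }
        }
        keys = next;
    }
    keys
}
```

Calling distribution_keys(&["a", "b", "a"], &["a", "b"]) yields the keys (0, 1) and (2, 1) from the example, and distribution_keys(&["sk", "sk"], &["sk"]) yields two single-column keys, one per occurrence of sk.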

Another bug was that bucket discovery used Buckets::Any for the Motion(Local) node. This led to DML queries being executed on all nodes instead of only on the child's buckets. Currently we use Motion(Local) in the following cases:

Update/Delete: we materialize the reading subtree and then do the DML operation. It is unclear why this required Buckets::Any.
UnionAll between sharded and local tables: here we materialize the global subtree on only a single storage to avoid duplicates. So the subtree with the motion should have the same buckets as its child (the child will always have Buckets::Any).
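The change in behaviour for Motion(Local) can be sketched like this. The Buckets enum below is a simplified stand-in written for this example: only the Any variant is named in the description, while the Filtered variant and both function names are assumptions, and the real types in the code base may differ.

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for the planner's bucket set (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Buckets {
    /// The variant that bucket discovery used to assign to Motion(Local).
    Any,
    /// Assumed variant: an explicit set of bucket ids.
    Filtered(BTreeSet<u64>),
}

/// Behaviour before the fix: the child's buckets are ignored.
fn local_motion_buckets_before(_child: &Buckets) -> Buckets {
    Buckets::Any
}

/// Behaviour after the fix: Motion(Local) inherits the buckets of its child.
fn local_motion_buckets_after(child: &Buckets) -> Buckets {
    child.clone()
}
```

With a child restricted to specific buckets, the old rule discards the restriction and the new rule keeps it. For the UnionAll case, where the child already has Buckets::Any, both rules give the same result.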

Ensure that

New code is covered by unit and integration tests.
Related issues will be closed automatically with GitLab's closing pattern (Closes #issue_number).
~~Public modules are documented (check the rendered version with cargo doc --open).~~
~~(if PEST grammar has changed) EBNF grammar reflects these changes (check the result with railroad diagram generator).~~
