
fix: buckets resolving bugs

Arseniy Volynets requested to merge av/fix-buckets-discovery into main

Summarize the changes

  • fix: buckets resolving bugs
  • For the query select * from t where sk = 1 and sk = 2 we failed to resolve empty buckets. The problem was that we didn't compute the distribution of the (sk, sk) tuple correctly: it has two keys, one on the first column and one on the second column, but we always computed only a single key.

To compute all keys we now take the Cartesian product of all column groups of a tuple, where each group consists of the tuple positions that correspond to a single column of the sharding key.

Suppose the tuple is (a, b, a), where a and b refer to sharding columns. Then we have two groups:
a -> {0, 2}
b -> {1}
And the distribution keys are: {0, 2} x {1} = {(0, 1), (2, 1)}
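The grouping and product described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from this merge request: the function name distribution_keys and the encoding of the tuple as a slice of Option<usize> (Some(k) when a position refers to the k-th sharding column, None otherwise) are hypothetical.

```rust
/// Hypothetical helper (not the project's API): returns every distribution
/// key of a tuple as a list of tuple positions, one position per sharding
/// key column, in sharding key order.
///
/// `tuple[i]` is `Some(k)` when position `i` refers to the k-th column of
/// the sharding key, and `None` when it refers to any other column.
fn distribution_keys(tuple: &[Option<usize>], sharding_key_len: usize) -> Vec<Vec<usize>> {
    if sharding_key_len == 0 {
        return Vec::new();
    }

    // groups[k] holds the tuple positions that refer to sharding column k.
    let mut groups: Vec<Vec<usize>> = vec![Vec::new(); sharding_key_len];
    for (pos, col) in tuple.iter().enumerate() {
        if let Some(k) = col {
            if *k < sharding_key_len {
                groups[*k].push(pos);
            }
        }
    }

    // If some sharding column is missing from the tuple, there is no key.
    if groups.iter().any(|g| g.is_empty()) {
        return Vec::new();
    }

    // Cartesian product of the groups, built one group at a time.
    let mut keys: Vec<Vec<usize>> = vec![Vec::new()];
    for group in &groups {
        let mut next = Vec::with_capacity(keys.len() * group.len());
        for prefix in &keys {
            for &pos in group {
                let mut key = prefix.clone();
                key.push(pos);
                next.push(key);
            }
        }
        keys = next;
    }
    keys
}
```

With this encoding, the tuple (a, b, a) is passed as [Some(0), Some(1), Some(0)] with a sharding key of length 2, and the (sk, sk) tuple from the failing query is passed as [Some(0), Some(0)] with a sharding key of length 1.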

  • Another bug was that bucket discovery used Buckets::Any for the Motion(Local) node. This led to DML queries being executed on all nodes instead of only on the child's buckets. Currently we use Motion(Local) in the following cases:
  1. Update/Delete: we materialize the reading subtree and then do the DML operation. It is unclear why this required Buckets::Any.
  2. UnionAll between sharded and global tables: here we materialize the global subtree on a single storage only, to avoid duplicates. So the subtree with the motion should have the same buckets as its child (the child will always have Buckets::Any).
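The intended behaviour for Motion(Local) can be illustrated with a toy model. Everything here is an assumption made for the sake of the example: the Plan enum, the two-variant Buckets enum, and the discover function are hypothetical and much simpler than the real plan and bucket types.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for the bucket set of a plan node (hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum Buckets {
    /// The subtree may be executed on any single node.
    Any,
    /// The subtree must be executed on these buckets only.
    Filtered(BTreeSet<u64>),
}

/// Toy plan with only the two node kinds needed for the example (hypothetical).
enum Plan {
    Scan { buckets: Buckets },
    LocalMotion { child: Box<Plan> },
}

/// Bucket discovery for the toy plan: a local motion does not move data
/// between nodes, so it inherits the buckets of its child instead of
/// unconditionally reporting `Buckets::Any`.
fn discover(plan: &Plan) -> Buckets {
    match plan {
        Plan::Scan { buckets } => buckets.clone(),
        Plan::LocalMotion { child } => discover(child),
    }
}
```

In this model a local motion over a scan restricted to some buckets stays restricted to those buckets, while a local motion over a child with Buckets::Any still yields Buckets::Any, which matches the UnionAll case.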

Ensure that

  • New code is covered by unit and integration tests.
  • Related issues would be automatically closed with gitlab's closing pattern (Closes #issue_number).
  • Public modules are documented (check the rendered version with cargo doc --open).
  • (if PEST grammar has changed) EBNF grammar reflects these changes (check the result with the railroad diagram generator).

close #843 (closed)
