Verified Commit ed9808c2 authored 5 months ago by Denis Smirnov

fix: bucket calculation for duplicated columns


The query `select * from t where sk = 1 and sk = 2` discovered
the bucket for the constant 1, rather than an empty set. The reason
was that the tuple merge transformed `sk = 1 and sk = 2` to
`(sk, sk) = (1, 2)`, while the distribution took into account only
the first position (constant 1).

To compute all keys we now take a Cartesian product between all
groups of columns of a tuple, where each group consists of the tuple
positions corresponding to a single column of the sharding key.

Suppose the tuple is (a, b, a) and the sharding key is (a, b). Then
we have two groups of tuple positions:
a -> {0, 2}
b -> {1}

And the distribution keys are:
{0, 2} x {1} = {(0, 1), (2, 1)}
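
The grouping and product described above can be sketched as follows. This is an illustrative Python sketch, not the code from this commit: the function name `distribution_keys` and the representation of tuples as lists of column names are assumptions made for the example.

```python
from itertools import product


def distribution_keys(tuple_columns, sharding_key):
    """Return every combination of tuple positions that forms a sharding key.

    Hypothetical helper: `tuple_columns` lists the column name at each
    tuple position, `sharding_key` lists the sharding columns in order.
    """
    # Group tuple positions by the column they reference.
    groups = {}
    for position, column in enumerate(tuple_columns):
        groups.setdefault(column, []).append(position)

    # If a sharding column is absent from the tuple, no key can be built.
    if any(column not in groups for column in sharding_key):
        return []

    # Cartesian product of the groups, taken in sharding-key order.
    return list(product(*(groups[column] for column in sharding_key)))


print(distribution_keys(["a", "b", "a"], ["a", "b"]))  # [(0, 1), (2, 1)]
```

For the query from the description, the tuple `(sk, sk)` with sharding key `(sk)` yields the keys `(0,)` and `(1,)`, so both constants 1 and 2 are taken into account instead of only the first one.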

Co-authored-by: Arseniy Volynets <a.volynets@picodata.io>

parent 8507c789

1 merge request: !1414 sbroad import

Showing with 56 additions and 16 deletions
