Skip to content
Snippets Groups Projects
Verified Commit ed9808c2 authored by Denis Smirnov's avatar Denis Smirnov
Browse files

fix: bucket calculation for duplicated columns


The queries `select * from t where sk = 1 and sk = 2` discovered
the bucket for the constant 1, rather then an empty set. The reason
was that the tuple merge transformed `sk = 1 and sk = 2` to
`(sk, sk) = (1, 2)`, while the distribution took into account only
the first position (constant 1).

To compute all keys we now take a cartesian product between all
groups of columns of a tuple, where each group consists of columns
corresponding to single column of sharding key.

Suppose tuple is (a, b, a). (a, b) refer to sharding columns, then
we have two groups:
a -> {0, 2}
b -> {1}

And the distribution keys are:
{0, 2} x {1} = {(0, 1), (2, 1)}

Co-authored-by: default avatarArseniy Volynets <a.volynets@picodata.io>
parent 8507c789
No related branches found
No related tags found
1 merge request!1414sbroad import
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment