    fix: bucket calculation for duplicated columns · ed9808c2
    Denis Smirnov authored
    
    The query `select * from t where sk = 1 and sk = 2` resolved to
    the bucket for the constant 1 rather than to an empty set. The reason
    was that the tuple merge transformed `sk = 1 and sk = 2` into
    `(sk, sk) = (1, 2)`, while the distribution took into account only
    the first position (constant 1).
    
    To compute all keys we now take a Cartesian product of all
    groups of columns of a tuple, where each group consists of the
    columns corresponding to a single column of the sharding key.
    
    Suppose the tuple is (a, b, a) and the sharding key is (a, b). Then
    we have two groups of tuple positions:
    a -> {0, 2}
    b -> {1}
    
    And the distribution keys are:
    {0, 2} x {1} = {(0, 1), (2, 1)}
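
The grouping and product above can be sketched as follows. This is only an illustration in Python, not the code from this commit: the function name `distribution_keys` and its list-based arguments are hypothetical.

```python
from itertools import product


def distribution_keys(tuple_columns, sharding_key):
    """Return every combination of tuple positions that forms a sharding key.

    Hypothetical helper: both arguments are plain lists of column names.
    """
    # Group tuple positions by the sharding column they refer to.
    groups = {column: [] for column in sharding_key}
    for position, column in enumerate(tuple_columns):
        if column in groups:
            groups[column].append(position)
    # One position is picked from each group, in sharding key order.
    # A sharding column absent from the tuple leaves an empty group,
    # so the product is empty and no key can be built.
    return list(product(*(groups[column] for column in sharding_key)))


print(distribution_keys(["a", "b", "a"], ["a", "b"]))  # [(0, 1), (2, 1)]
```

For the query from the description, `distribution_keys(["sk", "sk"], ["sk"])` yields `[(0,), (1,)]`, so both constants 1 and 2 are taken into account rather than only the first one.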
    
    Co-authored-by: Arseniy Volynets <a.volynets@picodata.io>