  1. Nov 13, 2024
  2. Nov 05, 2024
  3. Nov 02, 2024
  4. Oct 29, 2024
    • Arseniy Volynets's avatar
      fix: incorrect equivalence classes · b2833f6a
      Arseniy Volynets authored
      - `propagate_equality` transformation did not
      compute equality classes correctly. Its 'merge'
      function was completely wrong: it tried to add
      the intersection of classes to another class
      instead of taking their union
      - to merge classes correctly we must do it
      when we add a new pair of equal expressions,
      otherwise there will later be too many classes
      that contain common elements. So the 'merge'
      function was removed and 'insert' now merges
      two classes that contain common elements
      - Also this logic is now covered by tests
      b2833f6a
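
The union-on-insert idea from the commit above can be sketched as follows. This is a minimal illustration, not the project's code: the `EqClasses` type, its methods, and the use of `u32` ids for expressions are all assumptions made for the example.

```rust
use std::collections::BTreeSet;

/// Hypothetical container: each class is a set of expression ids
/// that are known to be equal to each other.
#[derive(Debug, Default)]
pub struct EqClasses {
    classes: Vec<BTreeSet<u32>>,
}

impl EqClasses {
    /// Record `a = b`. Every existing class that already contains
    /// `a` or `b` is unioned into one class, so no two classes
    /// ever share an element afterwards.
    pub fn insert(&mut self, a: u32, b: u32) {
        let mut merged: BTreeSet<u32> = BTreeSet::from([a, b]);
        let mut rest = Vec::with_capacity(self.classes.len() + 1);
        for class in self.classes.drain(..) {
            if class.contains(&a) || class.contains(&b) {
                // Union, not intersection.
                merged.extend(class);
            } else {
                rest.push(class);
            }
        }
        rest.push(merged);
        self.classes = rest;
    }

    pub fn classes(&self) -> &[BTreeSet<u32>] {
        &self.classes
    }
}
```

With this shape, inserting `1 = 2`, then `3 = 4`, then `2 = 3` leaves a single class holding all four ids, so a separate 'merge' pass is not needed.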
  5. Oct 28, 2024
    • Arseniy Volynets's avatar
      fix: incorrect equality cols for Eq · 29b7bd56
      Arseniy Volynets authored
      - In case we have equality between columns
      of the same table, `eq_cols_from_eq` should
      return empty equality cols, but it returned
      None, which led to wrong motion. Fixed that.
      - It was found after a merge tuples transform
      fix, in front_sql_join_single_left_5 test
      where now there are two equalities instead of one.
      Earlier, there always was one equality because
      merge tuples produced a single (..) = (..)
      term for an and-chain. And this test didn't work,
      because this term contained one equality pair.
      Now there are two terms: (..) = (..) and (..)=(..)
      and one of them does not contain any equality
      pairs.
      29b7bd56
    • Arseniy Volynets's avatar
      fix: merge tuple transformation didn't group cols · 7fc3ff62
      Arseniy Volynets authored
      - merge tuple transformation that merges several
      and-ed equalities into equalities of rows didn't
      group columns by the child they refer to. This led
      to rows where we couldn't find sharding keys,
      because they were scattered across different
      rows:
      
      ```
      sk(t1) = (a, b), sk(t2) = (e, f)
      ... on (t1.a, t2.f) = (t2.e, t1.b)
      ```
      
      But now correct rows are generated:
      
      ```
      ... on (t1.a, t1.b) = (t2.e, t2.f)
      ```
      7fc3ff62
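
One way to picture the grouping described above is the sketch below. It assumes a simplified model in which every column reference carries the index of the join child it belongs to. `ColRef`, `col`, `RowEq`, and `group_by_child` are hypothetical names for the example, and the real transformation may orient and group pairs differently.

```rust
use std::collections::BTreeMap;

/// Hypothetical column reference: the join child it belongs to and its name.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ColRef {
    pub child: usize,
    pub name: &'static str,
}

pub fn col(child: usize, name: &'static str) -> ColRef {
    ColRef { child, name }
}

/// A row equality `(left columns) = (right columns)`.
pub type RowEq = (Vec<ColRef>, Vec<ColRef>);

/// Turn and-ed `left = right` pairs into row equalities in which each
/// side of a row only contains columns of a single child.
pub fn group_by_child(pairs: Vec<(ColRef, ColRef)>) -> Vec<RowEq> {
    let mut groups: BTreeMap<(usize, usize), RowEq> = BTreeMap::new();
    for (l, r) in pairs {
        // Orient the pair so the smaller child index is on the left.
        let (l, r) = if l.child <= r.child { (l, r) } else { (r, l) };
        let entry = groups.entry((l.child, r.child)).or_default();
        entry.0.push(l);
        entry.1.push(r);
    }
    groups.into_values().collect()
}
```

For the pairs `t1.a = t2.e` and `t2.f = t1.b` this yields one row equality with `(t1.a, t1.b)` on the left and `(t2.e, t2.f)` on the right, which matches the corrected rows shown in the commit message.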
  6. Oct 23, 2024
  7. Oct 18, 2024
  8. Oct 17, 2024
  9. Oct 15, 2024
    • Arseniy Volynets's avatar
      feat: show buckets estimation in explain · 56933ca1
      Arseniy Volynets authored and Denis Smirnov committed
      - Add new line in explain reporting on which buckets query will
        be executed.
      - For queries consisting of a single subtree we can say exactly on
        which buckets it will be executed, for queries with more subtrees
        (with motions), we provide an upper bound of total buckets used
        in the query. Upper bound is computed by merging buckets from the
        leaf subtrees.
      - For a DML query with a non-local motion we can't provide an
        upper bound, so we print 'buckets = unknown'
      
      Examples:
      ```
      explain select a from t
      ->
      projection ("t"."a"::integer -> "a")
          scan "t"
      execution options:
          vdbe_max_steps = 45000
          vtable_max_rows = 5000
      buckets = [1-3000]
      
      explain select t.a from t join t as t2
      on t.a = t2.b
      ->
      projection ("t"."a"::integer -> "a")
        ...
      execution options:
          vdbe_max_steps = 45000
          vtable_max_rows = 5000
      buckets <= [1-3000]
      
      explain select id from _pico_table
      ->
      projection ("_pico_table"."id"::unsigned -> "id")
          scan "_pico_table"
      execution options:
          vdbe_max_steps = 45000
          vtable_max_rows = 5000
      buckets = any
      
      explain insert into t values (1, 2)
      ->
      insert "t" on conflict: fail
      motion [policy: segment([ref("COLUMN_1")])]
          values
              value row (...)
      execution options:
          vdbe_max_steps = 45000
          vtable_max_rows = 5000
      buckets = unknown
      ```
      56933ca1
  10. Oct 14, 2024
  11. Oct 07, 2024
    • Denis Smirnov's avatar
      fix: bucket calculation for duplicated columns · ed9808c2
      Denis Smirnov authored
      
      Queries such as `select * from t where sk = 1 and sk = 2` discovered
      the bucket for the constant 1 rather than an empty set. The reason
      was that the tuple merge transformed `sk = 1 and sk = 2` to
      `(sk, sk) = (1, 2)`, while the distribution took into account only
      the first position (constant 1).
      
      To compute all keys we now take a cartesian product of all
      groups of columns of a tuple, where each group consists of the
      positions corresponding to a single column of the sharding key.

      Suppose the tuple is (a, b, a) and the sharding key is (a, b).
      Then we have two groups:
      a -> {0, 2}
      b -> {1}
      
      And the distribution keys are:
      {0, 2} x {1} = {(0, 1), (2, 1)}
      
      Co-authored-by: Arseniy Volynets <a.volynets@picodata.io>
      ed9808c2
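
The group-then-product step described above can be sketched like this. Columns are modelled as plain strings and `distribution_keys` is a hypothetical helper; the actual planner works on its own plan nodes rather than on names.

```rust
/// For every sharding-key column collect the tuple positions that
/// reference it, then return the cartesian product of those groups.
/// Each resulting key lists one tuple position per sharding-key column.
pub fn distribution_keys(tuple: &[&str], sharding_key: &[&str]) -> Vec<Vec<usize>> {
    // One group of positions per sharding-key column.
    let groups: Vec<Vec<usize>> = sharding_key
        .iter()
        .map(|sk_col| {
            tuple
                .iter()
                .enumerate()
                .filter(|(_, col)| *col == sk_col)
                .map(|(pos, _)| pos)
                .collect()
        })
        .collect();

    // Cartesian product, built one group at a time.
    let mut keys: Vec<Vec<usize>> = vec![Vec::new()];
    for group in &groups {
        let mut next = Vec::with_capacity(keys.len() * group.len());
        for prefix in &keys {
            for &pos in group {
                let mut key = prefix.clone();
                key.push(pos);
                next.push(key);
            }
        }
        keys = next;
    }
    keys
}
```

For the tuple `(a, b, a)` with sharding key `(a, b)` this returns the keys `(0, 1)` and `(2, 1)`. For `(sk, sk)` with sharding key `(sk)` it returns both positions, so both constants are considered instead of only the first one.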
    • Denis Smirnov's avatar
      fix: bucket discovery for local motions · 8507c789
      Denis Smirnov authored
      
      Previously, during bucket discovery, we used Buckets::Any for
      Motion(Local) nodes. This caused DML queries to be executed on all
      nodes instead of targeting specific bucket children. We now apply
      Motion(Local) only in the following cases:
      
      - update/delete. When materializing the reading subtree for DML
        operations, Buckets::Any was used, but the reason for this is
        unclear.
      - union all between sharded and local tables. To prevent duplicates,
        we materialize the global subtree only on a single storage node.
        Consequently, the subtree with Motion(Local) must have the same
        buckets as its child (the child node will always have Buckets::Any).
      
      Co-authored-by: Arseniy Volynets <a.volynets@picodata.io>
      8507c789
  12. Oct 04, 2024
    • Arseniy Volynets's avatar
      fix: wrong slices calculation · d5aa241e
      Arseniy Volynets authored
      - Previously we computed slices based on their
      level in bfs tree traversal. This was wrong, as
      motions that were independent could end up in
      different slices
      - Fix that: the slice a motion belongs to is now
      the maximum number of other motion nodes on a path
      from this motion to any leaf node
      d5aa241e
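
The slice rule from the commit above can be sketched on a toy plan stored as an arena. The `Node` layout and both function names are assumptions made for illustration, and the recursion is left unmemoized for brevity.

```rust
/// Hypothetical plan node stored in an arena and addressed by index.
pub struct Node {
    pub is_motion: bool,
    pub children: Vec<usize>,
}

/// Maximum number of motion nodes on any path from `id` (inclusive)
/// down to a leaf.
fn motions_below(plan: &[Node], id: usize) -> usize {
    let deepest = plan[id]
        .children
        .iter()
        .map(|&child| motions_below(plan, child))
        .max()
        .unwrap_or(0);
    deepest + usize::from(plan[id].is_motion)
}

/// Returns `(motion node id, slice number)` pairs. The slice number is
/// the maximum count of *other* motions between the motion and a leaf.
pub fn motion_slices(plan: &[Node]) -> Vec<(usize, usize)> {
    plan.iter()
        .enumerate()
        .filter(|(_, node)| node.is_motion)
        .map(|(id, _)| (id, motions_below(plan, id) - 1))
        .collect()
}
```

Two motions that sit at different depths under a join but have no motions beneath them both land in slice 0, whereas a level-based numbering would have separated them.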
  13. Oct 03, 2024
    • Arseniy Volynets's avatar
      refactor: remove ExecuteOptions · 093f271f
      Arseniy Volynets authored
      - ExecuteOptions is a hashmap that always
      stores only 1 entry. Moreover this led to
      a bug, when we expected that it stores ALL
      execution options, while it stored only
      vdbe_max_steps
      093f271f
    • Arseniy Volynets's avatar
      fix: vtable max rows limit not applied · b6574768
      Arseniy Volynets authored
      - We didn't apply the vtable max rows value
      when executing local sql. We tried to look it
      up in the execute options hashmap and took
      the default value instead, because it was
      never stored in the hashmap.
      b6574768
  14. Oct 02, 2024
  15. Oct 01, 2024
  16. Sep 30, 2024
  17. Sep 27, 2024
    • Arseniy Volynets's avatar
      fix: used to error on concat with parameters · 6687b5c9
      Arseniy Volynets authored
      - When adding concat to plan, we had a check
      that both concat children are expressions.
      - This led to an error when the children are
      parameters.
      - Replace error with debug assert that checks
      for expression | parameter children. Debug assert
      is used because it is quite unlikely that we will
      mess up expression children, no need to check it
      in release builds.
      6687b5c9
    • Arseniy Volynets's avatar
      feat: support select without scan · 97d2b814
      Arseniy Volynets authored
      - Support simple queries that do not select
      data from tables. This is similar to `values ..`,
      but allows adding aliases for expressions.
      - Such selects can also be used in subqueries
      - Examples:
      ```
      select 1;
      select (select count(*) from t);
      select 2 as foo, 3 as bar;
      
      select a from t
      where b in (select 100)
      ```
      97d2b814
    • Andrey Strochuk's avatar
      c389ed4a
    • Arseniy Volynets's avatar
      fix: wrong hash calculation of plan subtree · 277d570e
      Arseniy Volynets authored
      - We didn't traverse output during subtree
      traversal when calculating hash. Some nodes
      (Motion, Projection) store non-trivial
      information there, which makes it possible to
      distinguish different plans. We came across this
      through a collision between two different queries:
      ```
      SELECT w.n FROM t JOIN w ON t.n = w.n
      LIMIT 3
      
      SELECT id, count(*) FROM t
      GROUP BY id
      HAVING id > 2
      LIMIT 3
      ```
      These plans happened to have subtrees that match
      exactly except for the output of a projection, which
      we didn't traverse.
      
      - fix this by traversing subtree fully
      277d570e
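
The effect of skipping the output during hashing can be reproduced on a toy tree. This sketch uses the standard library's `DefaultHasher` and a hypothetical `PlanNode` type. The project's real hasher and node layout are not shown in the commit, so everything named here is an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical plan node: a kind, its output columns, and its children.
pub struct PlanNode {
    pub kind: &'static str,
    pub output: Vec<&'static str>,
    pub children: Vec<PlanNode>,
}

/// Feed a subtree into the hasher. When `with_output` is false the
/// output columns are skipped, mimicking the incomplete traversal.
fn feed(node: &PlanNode, with_output: bool, state: &mut DefaultHasher) {
    node.kind.hash(state);
    if with_output {
        node.output.hash(state);
    }
    // Hash the child count so the tree shape is part of the hash.
    node.children.len().hash(state);
    for child in &node.children {
        feed(child, with_output, state);
    }
}

pub fn subtree_hash(root: &PlanNode, with_output: bool) -> u64 {
    let mut state = DefaultHasher::new();
    feed(root, with_output, &mut state);
    state.finish()
}
```

Two projections over the same scan that differ only in their output lists hash identically when `with_output` is false, and get distinct hashes once the output is traversed as well.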
  18. Sep 26, 2024
    • Arseniy Volynets's avatar
      fix: node mapped twice · 07574e00
      Arseniy Volynets authored
      - A ValuesRow node has `data` and `output`
      fields that refer to the same node in the query
      `values (1)`. For except between a global
      and a sharded table we applied a special
      transformation, in which we cloned the
      global subtree.
      - Subtree cloner didn't respect the DAG structure
      of our plan: there may be multiple references
      to the same node. Instead it expected a tree.
      Fix that
      07574e00
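
A cloner that respects shared nodes usually keeps a map from old ids to new ids. The sketch below shows that idea on a hypothetical arena of `Node` values. It is not the project's cloner, and `clone_subtree` is an illustrative name.

```rust
use std::collections::HashMap;

/// Hypothetical plan node stored in an arena and addressed by index.
#[derive(Debug, Clone)]
pub struct Node {
    pub label: &'static str,
    pub children: Vec<usize>,
}

/// Clone the subtree rooted at `root` into the same arena and return
/// the id of the copy. `memo` maps old ids to new ids, so a node that
/// is referenced several times is copied only once.
pub fn clone_subtree(
    arena: &mut Vec<Node>,
    root: usize,
    memo: &mut HashMap<usize, usize>,
) -> usize {
    if let Some(&new_id) = memo.get(&root) {
        return new_id;
    }
    let label = arena[root].label;
    let children = arena[root].children.clone();
    let new_children: Vec<usize> = children
        .into_iter()
        .map(|child| clone_subtree(arena, child, memo))
        .collect();
    let new_id = arena.len();
    arena.push(Node { label, children: new_children });
    memo.insert(root, new_id);
    new_id
}
```

For a row node whose two child slots point at the same constant, the copy also points twice at a single new constant, rather than mapping the constant a second time.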
    • Arseniy Volynets's avatar
      fix: error on union under insert · 5480dfbf
      Arseniy Volynets authored
      5480dfbf
  19. Sep 24, 2024
  20. Sep 23, 2024