- Oct 14, 2024
-
-
EmirVildanov authored
-
- Oct 07, 2024
-
-
Denis Smirnov authored
The query `select * from t where sk = 1 and sk = 2` discovered the bucket for the constant 1 rather than an empty set. The reason was that the tuple merge transformed `sk = 1 and sk = 2` into `(sk, sk) = (1, 2)`, while the distribution took into account only the first position (constant 1).

To compute all keys we now take a cartesian product of all groups of tuple columns, where each group consists of the columns that correspond to a single column of the sharding key. Suppose the tuple is (a, b, a) and (a, b) are the sharding columns. Then we have two groups:
a -> {0, 2}
b -> {1}
And the distribution keys are:
{0, 2} x {1} = {(0, 1), (2, 1)}

Co-authored-by: Arseniy Volynets <a.volynets@picodata.io>
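The grouping described in this commit message can be sketched in a few lines. This is only an illustration in Python, not the project's code: the function name `sharding_key_positions` and the representation of a tuple as a list of column names are assumptions made for the example.

```python
from itertools import product


def sharding_key_positions(tuple_columns, sharding_key):
    """Return every combination of tuple positions that forms a full sharding key.

    Hypothetical helper: `tuple_columns` lists the column name at each tuple
    position, `sharding_key` lists the sharding key columns in order.
    """
    groups = []
    for key_column in sharding_key:
        # All positions of the tuple that refer to this sharding key column.
        positions = [i for i, col in enumerate(tuple_columns) if col == key_column]
        if not positions:
            # The tuple does not cover the whole sharding key.
            return []
        groups.append(positions)
    # One distribution key per element of the cartesian product of the groups.
    return list(product(*groups))


# The example from the message: tuple (a, b, a), sharding key (a, b).
print(sharding_key_positions(["a", "b", "a"], ["a", "b"]))  # [(0, 1), (2, 1)]
```

Each resulting position tuple selects one candidate key from the right-hand side of the equality, so no repeated sharding column is silently ignored.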
-
Denis Smirnov authored
Previously, during bucket discovery, we used Buckets::Any for Motion(Local) nodes. This caused DML queries to be executed on all nodes instead of targeting specific bucket children. We now apply Motion(Local) only in the following cases:
- update/delete. When materializing the reading subtree for DML operations, Buckets::Any was used, but the reason for this is unclear.
- union all between sharded and local tables. To prevent duplicates, we materialize the global subtree only on a single storage node. Consequently, the subtree with Motion(Local) must have the same buckets as its child (the child node will always have Buckets::Any).

Co-authored-by: Arseniy Volynets <a.volynets@picodata.io>
-
- Oct 04, 2024
-
-
Arseniy Volynets authored
- Previously we computed slices based on their level in the BFS tree traversal. This was wrong, as independent motions could end up in different slices.
- Fix that: the slice of a motion is now the maximum number of other motion nodes on a path from this motion to any leaf node.
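The new rule can be illustrated with a small sketch. It is written in Python and assumes a plan given as a hypothetical `children` mapping plus a set of motion node names; none of these names come from the project.

```python
def motion_slices(children, motions):
    """Map each motion to the max number of other motions on a path to a leaf.

    `children` maps a node name to the list of its child names and `motions`
    is the set of motion node names (both are assumptions of this sketch).
    """
    cache = {}

    def motions_below(node):
        # Max number of motion nodes strictly below `node` on any path to a leaf.
        if node not in cache:
            cache[node] = max(
                (motions_below(child) + (1 if child in motions else 0)
                 for child in children.get(node, [])),
                default=0,
            )
        return cache[node]

    return {motion: motions_below(motion) for motion in motions}


# m1 and m3 are independent, although they sit at different depths of the plan.
plan = {
    "join": ["m1", "m2"],
    "m1": ["scan_a"],
    "m2": ["projection"],
    "projection": ["m3"],
    "m3": ["scan_b"],
}
slices = motion_slices(plan, {"m1", "m2", "m3"})
print(sorted(slices.items()))  # [('m1', 0), ('m2', 1), ('m3', 0)]
```

A level-based scheme would separate `m1` and `m3` because they are at different depths, while the path-based rule puts both into slice 0.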
-
- Oct 03, 2024
-
-
Arseniy Volynets authored
- ExecuteOptions was a hashmap that always stored exactly one entry. Moreover, this led to a bug: we expected it to store ALL execution options, while it stored only vdbe_max_steps.
-
Arseniy Volynets authored
- We didn't apply the vtable max rows value when executing local SQL. We tried to look it up in the execute options hashmap, but it was never stored there, so the default value was used instead.
-
- Oct 02, 2024
-
-
-
Arseniy Volynets authored
- Earlier we returned empty buckets for Values/ValuesRow nodes and then chose a random bucket for execution if they were a subtree root.
- Buckets::Any provides the same semantics, so use it instead and simplify the `empty_query_result` function, which now always returns an empty result.
-
Denis Smirnov authored
-
- Oct 01, 2024
-
-
-
Arseniy Volynets authored
-
- Sep 30, 2024
-
-
Вартан Бабаян authored
-
Denis Smirnov authored
-
- When creating a Values node in IR we didn't check that all values rows have the same length.
- This led to panic on earlier pipeline stages: syntax plan build
- Add a check that all values rows have the same length
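A minimal sketch of such a check, in Python and with a hypothetical function name; rows are modeled as plain lists, which is an assumption of the example and not the IR representation.

```python
def check_values_rows(rows):
    """Raise ValueError unless all rows of a VALUES list have the same length.

    Hypothetical helper: each row is modeled as a list of expressions.
    """
    if not rows:
        return
    expected = len(rows[0])
    for index, row in enumerate(rows):
        if len(row) != expected:
            raise ValueError(
                f"values row {index} has {len(row)} columns, expected {expected}"
            )


check_values_rows([[1, 2], [3, 4]])  # accepted: both rows have two columns
try:
    check_values_rows([[1, 2], [3]])  # rejected: second row is shorter
except ValueError as error:
    print(error)
```

Failing early with a regular error keeps malformed input from reaching stages that assume rectangular rows.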
-
- Sep 27, 2024
-
-
Arseniy Volynets authored
- When adding concat to the plan, we had a check that both concat children are expressions.
- This led to an error when the children were parameters.
- Replace the error with a debug assert that checks for expression | parameter children. A debug assert is used because it is quite unlikely that we will mess up expression children, so there is no need to check it in release builds.
-
Arseniy Volynets authored
- Support simple queries that do not select data from tables. This is similar to `values ..`, but allows adding aliases for expressions.
- Such selects can also be used in subqueries
- Examples:
```
select 1;
select (select count(*) from t);
select 2 as foo, 3 as bar;
select a from t where b in (select 100)
```
-
Andrey Strochuk authored
-
Arseniy Volynets authored
- We didn't traverse the output during subtree traversal when calculating the hash. Some nodes (Motion, Projection) store non-trivial information, which allows distinguishing different plans. We came across this through a collision between two different queries:
```
SELECT w.n FROM t JOIN w ON t.n = w.n LIMIT 3
SELECT id, count(*) FROM t GROUP BY id HAVING id > 2 LIMIT 3
```
These plans happened to have subtrees that match exactly except for the output of the projection, which we didn't traverse.
- Fix this by traversing the subtree fully
-
- Sep 26, 2024
-
-
Arseniy Volynets authored
- In a ValuesRow node the `data` and `output` fields refer to the same nodes in the query `values (1)`. For except between a global and a sharded table we applied a special transformation, in which we cloned the global subtree.
- The subtree cloner didn't respect the DAG structure of our plan: there may be multiple references to the same node. Instead it expected a tree. Fix that
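One common way to clone a DAG without duplicating shared nodes is to remember which originals were already copied. The sketch below shows that technique in Python; the `PlanNode` class and `clone_subtree` function are hypothetical and are not the project's cloner.

```python
class PlanNode:
    """Hypothetical plan node: a name plus references to child nodes."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def clone_subtree(node, cloned=None):
    """Clone a subtree, copying every shared node exactly once."""
    if cloned is None:
        cloned = {}
    key = id(node)
    if key in cloned:
        # Already copied: reuse the copy so shared references stay shared.
        return cloned[key]
    copy = PlanNode(node.name)
    cloned[key] = copy
    copy.children = [clone_subtree(child, cloned) for child in node.children]
    return copy


# Two fields of one row refer to the same constant, as in `values (1)`.
constant = PlanNode("constant 1")
row = PlanNode("values_row", [constant, constant])
row_copy = clone_subtree(row)
print(row_copy.children[0] is row_copy.children[1])  # True
```

A cloner that assumes a tree would instead produce two separate copies of `constant`, and the two references would no longer point to the same node.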
-
Arseniy Volynets authored
-
- Sep 24, 2024
-
-
Вартан Бабаян authored
-
Denis Smirnov authored
-
Arseniy Volynets authored
-
- Sep 23, 2024
-
-
Arseniy Volynets authored
-
Arseniy Volynets authored
-
Arseniy Volynets authored
-
- Sep 20, 2024
-
-
Artur Sabirov authored
-
Artur Sabirov authored
-
-
The previous version used yaml-rust, which is marked as unmaintained. This gets reported by the dependency scanner we use for certification. It is worth noting that serde-yaml itself is now archived. There is a promising replacement library, but there is no need to hurry with that just yet. In any case, we don't need this library in the production build, only in tests, so it was moved to development dependencies.

Co-authored-by: Denis Smirnov <sd@picodata.io>
-
- Sep 18, 2024
-
-
- Sep 17, 2024
-
-
Arseniy Volynets authored
- Support the like operator with the signature `expr1 LIKE expr2 [ESCAPE expr3]`, which returns TRUE only if expr1 matches the string specified by expr2 (the pattern). '_' in the pattern matches any single character, '%' matches any sequence of zero or more characters. All other characters match themselves, case-sensitively.
- The optional escape clause specifies the character to use for escaping '_' and '%'
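The matching rules above can be modeled by translating the pattern into a regular expression. This is a Python sketch of those rules only: the function name `like` is hypothetical, NULL handling is left out, and treating an escape character before an ordinary character as that literal character is an assumption of the example.

```python
import re


def like(value, pattern, escape=None):
    """Return True if `value` matches the LIKE `pattern` (case-sensitive)."""
    regex = []
    i = 0
    while i < len(pattern):
        char = pattern[i]
        if escape is not None and char == escape:
            # The character after the escape is taken literally.
            i += 1
            if i == len(pattern):
                raise ValueError("escape character at the end of the pattern")
            regex.append(re.escape(pattern[i]))
        elif char == "_":
            regex.append(".")  # any single character
        elif char == "%":
            regex.append(".*")  # any sequence of zero or more characters
        else:
            regex.append(re.escape(char))
        i += 1
    return re.fullmatch("".join(regex), value, flags=re.DOTALL) is not None


print(like("abc", "a_c"))                # True
print(like("abc", "A%"))                 # False: matching is case-sensitive
print(like("50%", "50!%", escape="!"))   # True: '!%' is a literal percent sign
```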
-
EmirVildanov authored
-
EmirVildanov authored
-
Arseniy Volynets authored
- to_params uses a subtree iterator to create parameters for a cached query. But our plan is a DAG, not a tree. We have multiple nodes referring to the same node, which leads to a situation where we traverse the same subquery node multiple times. This in turn leads to a wrong number of parameters produced by to_params
- Fix this by keeping track of already traversed parameter nodes in to_params
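The effect of tracking visited parameter nodes can be shown on a toy plan. The sketch is in Python; `ParamNode` and this `to_params` are hypothetical stand-ins, not the project's types or function.

```python
class ParamNode:
    """Hypothetical plan node; `kind` is 'parameter' for parameter nodes."""

    def __init__(self, kind, children=(), value=None):
        self.kind = kind
        self.children = list(children)
        self.value = value


def to_params(root):
    """Collect parameter values, visiting each parameter node only once."""
    seen = set()
    params = []

    def visit(node):
        if node.kind == "parameter":
            if id(node) not in seen:
                seen.add(id(node))
                params.append(node.value)
            return
        for child in node.children:
            visit(child)

    visit(root)
    return params


# The same subquery is referenced from two places in the plan.
subquery = ParamNode("subquery", [ParamNode("parameter", value=42)])
root = ParamNode("projection", [ParamNode("selection", [subquery]), subquery])
print(to_params(root))  # [42]
```

Without the `seen` set the shared subquery would contribute its parameter twice, giving `[42, 42]`.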
-
EmirVildanov authored
-
- Sep 16, 2024
-
-
EmirVildanov authored
-
EmirVildanov authored
feat: support additional SubQuery children for any relational operator, interpret SubQuery with single output as expression
-
EmirVildanov authored
-
EmirVildanov authored
-