- Nov 22, 2023
- Sep 28, 2023
- Jul 13, 2023
Denis Smirnov authored
- Jul 04, 2023
Denis Smirnov authored
- Jun 15, 2023
Denis Smirnov authored
This commit redesigns the distributed INSERT command. Previously we dispatched the INSERT SQL command to the storages. If the INSERT could be done locally (without building a virtual table on the router), we used the SQL "bucket_id(<string>)" function to recalculate buckets on the storage via SQL. The main disadvantage of this approach was that it worked only with a "bucket_id" SQL function taking a single argument. An attempt to support multiple parameters of different types (tuple columns, as we plan to implement for the Picodata engine in the future) ran into serious technical problems. The old implementation also had performance issues:
- we created temporary spaces on the storage even for a single VALUES row insertion;
- we always dispatched VALUES to the storage to build a virtual table;
- the resulting SQL was too verbose (as we produced a subquery under the INSERT node).
It was decided to drop the old approach and migrate to non-SQL insertion. This introduces a new type of motion - the local segment. It allows us to build a virtual table on the storage in Rust memory and then, using the space API, transform and insert the collected tuples directly into the target space within a single transaction. We can also materialize VALUES (if they contain only constants) on the router and avoid the redundant transmission over the network.
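For illustration, a minimal sketch of the non-SQL insertion path on the storage, assuming the tarantool crate's Space API; the helper name, tuple layout and error handling are hypothetical and not the actual sbroad code.

```rust
// A minimal sketch, not the actual sbroad code; it assumes the `tarantool`
// crate's `Space::find` / `Space::insert` API. Error handling is elided.
use tarantool::space::Space;

/// Hypothetical helper: insert tuples that were already materialized
/// (e.g. constant-only VALUES materialized on the router, or a virtual
/// table built by the local segment motion) directly into the target space,
/// without generating any SQL.
fn insert_materialized(space_name: &str, tuples: &[(u64, String)]) {
    let mut space = Space::find(space_name).expect("target space must exist");

    // In the real implementation the whole batch is applied within a single
    // transaction, so a failure on any tuple rolls back the entire insert.
    for tuple in tuples {
        space.insert(tuple).expect("insert failed");
    }
}
```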
- May 02, 2023
Denis Smirnov authored
Implement the picodata engine for sbroad to integrate distributed SQL into the picodata binary. In fact, the picodata engine simply exports public Rust functions from api.rs. On the picodata side they (and some Lua functions) are imported and included into the binary at the build phase. This commit is a POC and has some problems:
- DDL is still not implemented in picodata (_pico_space is mocked);
- we include the tarantool module symbols twice (first in sbroad, second in the picodata binary).
But it works anyway, and all these problems should be solved in the next commits.
- Feb 08, 2023
Denis Smirnov authored
- Jan 19, 2023
Denis Smirnov authored
We stop using VALUES to store temporary tuples on the storages and switch to tarantool spaces instead. This is done to avoid problems with the auto-generated column names in VALUES, the parser stack, and parameter limitations. Tarantool forbids using multiple space engines in a single transaction, so for vinyl tables we have to use vinyl spaces as temporary storage, while for memtx tables we can use temporary memtx spaces. One more important change is that we can't insert values of different numeric types into a number column (as we don't cast them the way the local SQL does).
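As an illustration of the engine-matching rule above, a hypothetical selection helper; the enum and function names are illustrative only and not sbroad's actual types.

```rust
/// Illustrative only: the space engines relevant to the rule above.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum SpaceEngine {
    Memtx,
    Vinyl,
}

/// Pick the engine for the temporary tuple storage so that it can take part
/// in the same transaction as the target space (Tarantool forbids mixing
/// engines within one transaction).
fn temporary_engine_for(target: SpaceEngine) -> SpaceEngine {
    match target {
        // For memtx targets we can use temporary memtx spaces.
        SpaceEngine::Memtx => SpaceEngine::Memtx,
        // Vinyl has no temporary spaces, so a regular vinyl space is used.
        SpaceEngine::Vinyl => SpaceEngine::Vinyl,
    }
}
```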
- Jan 16, 2023
Denis Smirnov authored
Reduce the amount of heap allocations (use recursion instead of a heap-allocated stack).
- Dec 29, 2022
Denis Smirnov authored
BREAKING CHANGE: API functions have changed signatures. This commit changes the way the router dispatches commands to the storages. Previously, the router compiled SQL statements with parameters from the plan subtrees and sent them to the storages. Now the router sends the raw IR subtrees to the storages.
1. The subtrees are constructed from the original plan nodes for performance reasons: the node's memory chunk is extracted from the original plan tree (replaced with an invalid parameter node) and reused in the sub-plan.
2. The router-storage message consists of two parts: required and optional. The required part is the hash of the sub-plan (excluding constants - the analogue of the SQL pattern in the previous version) and the parameters. The optional part is the IR itself and the syntax node tree (precompiled on the router to skip redundant work on the multiple storages). The storage uses lazy deserialization of the message:
- first it deserializes the part with the hash and the parameters (to check the plan cache);
- if the cache lookup fails, it deserializes the IR and the syntax node tree and updates the cache (a simplified sketch follows after this list).
3. The SHA256 hash was replaced with BLAKE3 for performance reasons.
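Below is a simplified sketch of the two-part message layout and the lazy deserialization on the storage side; the type and field names are illustrative, not sbroad's actual ones, and the cached value is shown as a plain string for brevity.

```rust
// Illustrative sketch of the required/optional message split and the
// hash-keyed plan cache; not the real sbroad types.
use std::collections::HashMap;

/// Always deserialized first: just enough to probe the plan cache.
struct RequiredPart {
    /// BLAKE3 hash (32 bytes) of the sub-plan with constants excluded -
    /// the analogue of the SQL pattern in the previous version.
    plan_key: [u8; 32],
    /// Parameter values; bound at execution time regardless of cache state.
    params: Vec<u64>,
}

/// Deserialized only on a cache miss.
struct OptionalPart {
    /// Serialized IR sub-plan plus the syntax node tree precompiled
    /// on the router, so every storage skips that work.
    ir_and_syntax_tree: Vec<u8>,
}

struct Storage {
    /// Plan cache keyed by the sub-plan hash.
    cache: HashMap<[u8; 32], String>,
}

impl Storage {
    fn prepare(&mut self, required: RequiredPart, optional_raw: &[u8]) -> String {
        if let Some(sql) = self.cache.get(&required.plan_key) {
            // Cache hit: the optional part is never deserialized.
            return sql.clone();
        }
        // Cache miss: only now pay for deserializing the IR and the syntax
        // tree, compile the local statement and update the cache.
        let optional = OptionalPart { ir_and_syntax_tree: optional_raw.to_vec() };
        let sql = format!("/* compiled from {} IR bytes */", optional.ir_and_syntax_tree.len());
        self.cache.insert(required.plan_key, sql.clone());
        sql
    }
}
```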
- Nov 22, 2022
Denis Smirnov authored
- Sep 21, 2022
Igor Kuznetsov authored
- Sep 19, 2022
Igor Kuznetsov authored
- Sep 15, 2022
Igor Kuznetsov authored
- Sep 09, 2022
Denis Smirnov authored
Implement opentelemetry instrumentation in sbroad. How to test:
docker run --name jaeger -d --rm -p6831:6831/udp -p6832:6832/udp \
    -p16686:16686 -p14268:14268 jaegertracing/all-in-one:latest
Then run a stress test with sbroad. The results will be available at http://localhost:16686/
- Sep 08, 2022
Igor Kuznetsov authored
- Aug 09, 2022
Denis Smirnov authored
Thanks to the updates in the tarantool module, we no longer use tarantool symbols to work with decimals. As a bonus, we can remove our mocking framework together with the dynamic linking of the decNumber library into the cargo test binary.
- Aug 02, 2022
Denis Smirnov authored
Denis Smirnov authored
Previously, we created static archives of the msgpuck and decNumber libraries and statically linked them into the test executable. After the tarantool module migrated to dlsym, we can no longer use static linking. As a result, we build shared libraries for msgpuck and decNumber and link them dynamically into the unit test binary.
- Jul 05, 2022
Дмитрий Кольцов authored
The picodata feature enables the Picodata (Tarantool fork) functionality. The schema feature enables the functionality needed to operate on space metadata.
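For illustration, this is the usual Cargo feature-gating pattern such flags rely on; the module and function names below are hypothetical and not sbroad's actual layout.

```rust
// Illustrative feature gating only; module and function names are hypothetical.

// Compiled only when the crate is built with `--features picodata`,
// e.g. code relying on the Picodata fork of Tarantool.
#[cfg(feature = "picodata")]
mod picodata_only {
    pub fn uses_fork_specific_api() { /* ... */ }
}

// Compiled only with `--features schema`, e.g. helpers that read and
// modify space metadata.
#[cfg(feature = "schema")]
mod schema_only {
    pub fn inspect_space_metadata() { /* ... */ }
}
```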
Дмитрий Кольцов authored
We want reproducible builds, and Cargo.lock gives us exactly that. At the start of development we followed the rule that "libraries do not include Cargo.lock in git, binaries do". However, we misinterpreted the role of sbroad: it is built as a "cdylib", so we should treat it as an executable and include Cargo.lock.