    feat: non-SQL insert · c64f009b
    Denis Smirnov authored
    The current commit redesigns the distributed INSERT command. Previously
    we dispatched the INSERT SQL command to the storages. If INSERT could be
    done locally (without building a virtual table on the router), we
    used the SQL "bucket_id(<string>)" function to recalculate buckets
    on the storage.
    
    This approach had one main disadvantage: it worked only with a
    "bucket_id" SQL function that took a single argument. An attempt to
    support multiple parameters of different types (tuple columns, as we
    plan to implement for the Picodata engine in the future) faced
    serious technical problems.
    Also, the old implementation had performance issues:
    - we created temporary spaces on the storage even for a single VALUE
      insertion
    - we always dispatched VALUES to the storage to build a virtual table
    - the resulting SQL was too verbose (as we produced a subquery under
      INSERT node)
    
    It was decided to get rid of the old approach and migrate to non-SQL
    insertion. To do so, a new type of motion was introduced: the local
    segment. It allows us to build a virtual table on the storage in
    Rust memory and then, using the space API, transform the collected
    tuples and insert them directly into the target space within a single
    transaction. We can also materialize VALUES (if they contain only
    constants) on the router and avoid the redundant network transmission.
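
The local segment flow described above can be sketched roughly as follows. This is a minimal in-memory model, not the code from this commit: `Value`, `Space`, `bucket_id`, `insert_local`, `BUCKET_COUNT` and the FNV-1a hash are all hypothetical stand-ins chosen for illustration. The real implementation uses the engine's own value types, its own bucket hash and the Tarantool space API with a real transaction. The sketch only shows the idea: compute the bucket from several typed key columns in Rust, extend each tuple of the virtual table with it, and apply the whole batch or nothing.

```rust
use std::collections::BTreeMap;

/// Hypothetical column value; the real engine has its own value type.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Integer(i64),
    String(String),
}

type Tuple = Vec<Value>;

/// Assumed number of buckets in the cluster (illustrative only).
const BUCKET_COUNT: u64 = 3000;

/// FNV-1a step, used here only as a stand-in for the engine's real hash.
fn fnv1a(mut hash: u64, bytes: &[u8]) -> u64 {
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Bucket for a sharding key made of several columns of different types,
/// which the single-argument SQL function could not express.
fn bucket_id(key: &[&Value]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for value in key {
        hash = match value {
            // A type tag keeps Integer and String encodings apart.
            Value::Integer(i) => fnv1a(fnv1a(hash, &[0]), &i.to_le_bytes()),
            Value::String(s) => fnv1a(fnv1a(hash, &[1]), s.as_bytes()),
        };
    }
    hash % BUCKET_COUNT + 1
}

/// In-memory model of the target space; the first column is assumed to be
/// an integer primary key.
#[derive(Default)]
struct Space {
    rows: BTreeMap<i64, Tuple>,
}

/// Transforms the virtual table (appends the bucket column) and inserts it
/// into the space. Staging the rows first models the single transaction:
/// on any error the space is left untouched.
fn insert_local(
    space: &mut Space,
    vtable: Vec<Tuple>,
    key_positions: &[usize],
) -> Result<usize, String> {
    let mut staged: BTreeMap<i64, Tuple> = BTreeMap::new();
    for mut tuple in vtable {
        let key = key_positions
            .iter()
            .map(|&p| tuple.get(p).ok_or(format!("no column at position {}", p)))
            .collect::<Result<Vec<&Value>, String>>()?;
        let bucket = bucket_id(&key);
        let pk = match tuple.first() {
            Some(Value::Integer(pk)) => *pk,
            _ => return Err("primary key must be an integer".to_string()),
        };
        if space.rows.contains_key(&pk) || staged.contains_key(&pk) {
            return Err(format!("duplicate primary key {}", pk));
        }
        tuple.push(Value::Integer(bucket as i64));
        staged.insert(pk, tuple);
    }
    let inserted = staged.len();
    space.rows.extend(staged); // "commit": apply the whole batch at once
    Ok(inserted)
}

fn main() {
    // VALUES with constants only, already materialized as tuples.
    let vtable = vec![
        vec![Value::Integer(1), Value::String("a".to_string())],
        vec![Value::Integer(2), Value::String("b".to_string())],
    ];
    let mut space = Space::default();
    let inserted = insert_local(&mut space, vtable, &[0, 1]).unwrap();
    println!("inserted {} tuples", inserted);

    // A batch with a duplicate key is rejected as a whole.
    let bad = vec![
        vec![Value::Integer(3), Value::String("c".to_string())],
        vec![Value::Integer(1), Value::String("d".to_string())],
    ];
    assert!(insert_local(&mut space, bad, &[0, 1]).is_err());
    assert_eq!(space.rows.len(), 2);
}
```

Staging into a separate map before touching the space is just one simple way to model all-or-nothing behavior in plain Rust; with a real storage engine the same effect would come from its transaction mechanism.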