vinyl: rework upsert operation
The previous upsert implementation had a few drawbacks which led to a number of bugs and issues.

Issue #5092 (redundant update operations execution)

In a nutshell, application of an upsert on top of another upsert consists of two actions (see vy_apply_upsert()): execute and squash. Consider an example:

    insert({1, 1})                 -- terminal statement, stored on disk
    upsert({1}, {{'-', 2, 20}})    -- old ups1
    upsert({1}, {{'+', 2, 10}})    -- new ups2

The 'execute' step takes update operations from the new upsert and combines them with the key of the old upsert. {1} + {'+', 2, 10} can't be evaluated since the key consists of only one field. Note that when an upsert doesn't fold into an insert, the upsert's tuple and the tuple stored in the index can differ. In our particular case, the tuple stored on disk has two fields ({1, 1}), so the upsert's update operation can be applied to it:

    {1, 1} + {'+', 2, 10} --> {1, 11}

If the upsert's operation can't be executed using the key of the old upsert, we simply continue with the squash step. In turn, 'squash' is a combination of update operations: arithmetic operations are combined so that we don't have to store several actions over the same field; the remaining operations are merged into a single array. As a result, we get one upsert with squashed operations:

    upsert({1}, {{'+', 2, -10}})

Then vy_apply_upsert() is called again to apply the new upsert on top of the terminal statement, insert({1, 1}). Since the tuple now has a second field, the update operations can be executed and the corresponding result is {1, -9}. It is the final result of the upsert application procedure.

Now imagine that we have the following upserts:

    upsert({1, 1}, {{'-', 2, 20}})    -- old ups1
    upsert({1}, {{'+', 2, 10}})       -- new ups2

In this case execution finishes successfully and modifies the old upsert's tuple:

    {1, 1} + {'+', 2, 10} --> {1, 11}

However, we still have to squash/accumulate update operations since they may later be applied to the tuple stored on disk. In the end, we have the following upsert: upsert({1, 11}, {{'+', 2, -10}}).
Then it is applied on top of insert({1, 1}) and we get the same result as in the first case: {1, -9}. The only difference is that the upsert's tuple was modified.

As one can see, executing update operations on the upsert's tuple is redundant when the index already contains a tuple with the same key (i.e. when the upsert turns into an update). Instead, we can accumulate/squash update operations only. When the last upsert is being applied, we can execute all update operations either on the tuple fetched from the index (i.e. the upsert is an update) OR on the tuple specified in the first upsert (i.e. the first upsert is an insert).

Issue #5105 (upsert doesn't follow associative property)

Secondly, the current approach breaks the associative property: after the upserts' update operations are merged into one array, a failure in the operations of one upsert causes the operations of the other upserts in the same array to be skipped as well. For instance:

    -- Index is over second field.
    i = s:create_index('pk', {parts={2, 'uint'}})
    s:replace{1, 2, 3, 'default'}
    s:upsert({2, 2, 2}, {{'=', 4, 'upserted'}})
    -- First update operation modifies primary key, so upsert must be ignored.
    s:upsert({2, 2, 2}, {{'#', 1, 1}, {'!', 3, 1}})

After merging the two upserts we get the following one:

    upsert({2, 2, 2}, {{'=', 4, 'upserted'}, {'#', 1, 1}, {'!', 3, 1}})

While executing update operations, we don't distinguish operations that come from different upserts. Thus, if one operation fails, the rest are ignored as well. As a result, the first upsert (in the general case, all preceding squashed upserts) won't be applied, even though it is absolutely correct. What is more, the user gets no error or warning about this.

Issue #1622 (no upsert result validation)

After upsert application, there is no check verifying that the result satisfies the space's format: number of fields, their types, overflows etc. Because of this, tuples violating the format may appear in the space, which in turn may lead to unpredictable consequences.
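To make the #5105 problem above concrete, here is a minimal sketch in Python. It is a toy model, not Tarantool code: tuples are plain lists, only the '=', '#' and '!' operations are modelled, and the helper names (apply_op, apply_ops, PK_FIELD) are hypothetical. It assumes that an array of operations is applied all-or-nothing and is discarded when it changes the primary key field.

```python
# Toy model of the old behaviour (NOT Tarantool code; all names are hypothetical).
# Tuples are Python lists, field numbers are 1-based as in the example above.

PK_FIELD = 2  # the index in the example is built over the second field


class UpdateError(Exception):
    pass


def apply_op(tup, op):
    """Apply one update operation to a copy of the tuple."""
    kind, field, arg = op
    idx = field - 1
    res = list(tup)
    if kind == '=' and idx <= len(res):
        res[idx:idx + 1] = [arg]      # assign, or append right after the last field
    elif kind == '!' and idx <= len(res):
        res.insert(idx, arg)          # insert a new field before position `field`
    elif kind == '#' and idx < len(res):
        del res[idx:idx + arg]        # delete `arg` fields starting at `field`
    else:
        raise UpdateError("cannot apply %r" % (op,))
    return res


def apply_ops(tup, ops):
    """Apply a flat array of operations; on any failure keep the tuple as is."""
    res = list(tup)
    try:
        for op in ops:
            res = apply_op(res, op)
        if len(res) < PK_FIELD or res[PK_FIELD - 1] != tup[PK_FIELD - 1]:
            raise UpdateError("primary key modified")
    except UpdateError:
        return list(tup)              # the whole array is silently ignored
    return res


stored = [1, 2, 3, 'default']
ups1 = [('=', 4, 'upserted')]
ups2 = [('#', 1, 1), ('!', 3, 1)]     # shifts fields, so the key field changes

# Upserts applied one after another: only the faulty second one is ignored.
one_by_one = apply_ops(apply_ops(stored, ups1), ups2)

# Upserts merged into a single flat array: the valid first one is lost too.
merged = apply_ops(stored, ups1 + ups2)

print(one_by_one)
print(merged)
```

Under these assumptions the two results differ: applying the upserts one by one keeps the effect of the first upsert, while the merged array drops it together with the faulty second one.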
To resolve these issues, let's group the update operations of each upsert into a separate array, so that operations related to a particular upsert are stored together. In terms of the previous example we will get:

    upsert({2, 2, 2}, {{{'=', 4, 'upserted'}}, {{'#', 1, 1}, {'!', 3, 1}}})

Also note that we no longer have to apply update operations to the tuple in vy_apply_upsert() when it deals with two upserts: it can be done once we face a terminal statement; or, if there is no underlying statement (i.e. it is a delete statement or no statement at all), we apply all update arrays except the first one to the upsert's tuple. If one of the operations from an array fails, we skip the rest of the operations from this array and proceed to the next array. After successful application of the update operations of each array, we check that the resulting tuple fits the space format. If it doesn't, we roll back the applied operations, log an error and move on to the next group of operations.

Finally, arithmetic operations can no longer be combined: this requirement arises from issue #5105. Otherwise, the result of combining upserts might turn out to be inapplicable to the tuple stored on disk (e.g. the result applied to the tuple leads to integer overflow; in this case only the last upsert leading to the overflow must be ignored).

Closes #1622
Closes #5105
Closes #5092
Part of #5107
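The grouped scheme described in this message can be sketched as follows. This is a toy Python model under stated assumptions, not the vy_apply_upsert() implementation: FORMAT is a hypothetical space format, skipped groups are collected in a list instead of being logged, and the helpers apply_op, fits_format, apply_groups and resolve are invented for illustration.

```python
# Toy model of grouped upsert operations (NOT Tarantool code; names are hypothetical).

PK_FIELD = 2                      # index over the second field, as in the example
FORMAT = (int, int, int, str)     # hypothetical space format: field count and types


class UpdateError(Exception):
    pass


def apply_op(tup, op):
    """Apply one update operation to a copy of the tuple."""
    kind, field, arg = op
    idx = field - 1
    res = list(tup)
    if kind == '=' and idx <= len(res):
        res[idx:idx + 1] = [arg]
    elif kind == '!' and idx <= len(res):
        res.insert(idx, arg)
    elif kind == '#' and idx < len(res):
        del res[idx:idx + arg]
    elif kind == '+' and idx < len(res) and isinstance(res[idx], int):
        res[idx] += arg
    else:
        raise UpdateError("cannot apply %r" % (op,))
    return res


def fits_format(tup):
    return len(tup) == len(FORMAT) and all(
        isinstance(value, ftype) for value, ftype in zip(tup, FORMAT))


def apply_groups(tup, groups):
    """Apply each group independently; a failed group is rolled back and skipped."""
    current = list(tup)
    skipped = []
    for n, ops in enumerate(groups):
        candidate = list(current)     # work on a copy so rollback is free
        try:
            for op in ops:
                candidate = apply_op(candidate, op)
            if not fits_format(candidate):
                raise UpdateError("result violates space format")
            if candidate[PK_FIELD - 1] != current[PK_FIELD - 1]:
                raise UpdateError("primary key modified")
        except UpdateError as err:
            skipped.append((n, str(err)))
            continue
        current = candidate
    return current, skipped


def resolve(base, upsert_tuple, groups):
    """With no underlying statement the upsert's tuple is used as the base
    and the first group of operations is not applied to it."""
    if base is None:
        return apply_groups(upsert_tuple, groups[1:])
    return apply_groups(base, groups)


groups = [
    [('=', 4, 'upserted')],           # valid
    [('#', 1, 1), ('!', 3, 1)],       # changes the key field: skipped
    [('=', 3, 'oops')],               # wrong field type: skipped
    [('+', 3, 10)],                   # valid, still applied
]
result, skipped = resolve([1, 2, 3, 'default'], [5, 2, 0, 'new'], groups)
fresh, _ = resolve(None, [5, 2, 0, 'new'], groups)
print(result, [n for n, _ in skipped])
print(fresh)
```

In this model each group either applies completely or leaves the tuple untouched, so a faulty upsert in the middle of the chain no longer cancels the valid ones around it.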
Changed files:
- src/box/vinyl.c: 8 additions, 4 deletions
- src/box/vy_stmt.h: 4 additions, 1 deletion
- src/box/vy_upsert.c: 180 additions, 127 deletions
- test/vinyl/upgrade/upsert/00000000000000000000.vylog: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/00000000000000000004.snap: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/00000000000000000004.vylog: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/00000000000000000004.xlog: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/00000000000000000010.snap: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/00000000000000000010.vylog: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/00000000000000000010.xlog: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/512/0/00000000000000000002.index: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/512/0/00000000000000000002.run: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/512/0/00000000000000000004.index: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/512/0/00000000000000000004.run: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/512/0/00000000000000000006.index: 0 additions, 0 deletions
- test/vinyl/upgrade/upsert/512/0/00000000000000000006.run: 0 additions, 0 deletions
- test/vinyl/upsert.result: 567 additions, 0 deletions
- test/vinyl/upsert.test.lua: 224 additions, 0 deletions
- test/vinyl/upsert_upgrade.result: 59 additions, 0 deletions
- test/vinyl/upsert_upgrade.test.lua: 32 additions, 0 deletions