Commit 785a1bdb authored by Vladimir Davydov, committed by Roman Tsisyk

vinyl: track key intervals in conflict manager

Currently, the conflict manager tracks only the keys returned by the
read iterator, so Vinyl isn't truly serializable: select() can return
phantom records, e.g.

  space: {10}, {20}, {30}, {40}, {50}

  Transaction 1                         Transaction 2
  -------------                         -------------
  box.begin()
  space:select({30}, {iterator='GE'})
  -- returns {30}, {40}, {50}
                                        box.begin()
                                        space:insert{35}
                                        space:insert{45}
                                        space:insert{55}
                                        box.commit()
  space:select({30}, {iterator='GE'})
  -- returns {30}, {35}, {40}, {45}, {50}, {55};
  -- were it serializable, the transaction would
  -- be sent to read view so that this select()
  -- would return the same set of values as the
  -- previous one
  box.commit()
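
The scenario above can be modelled with a small sketch that contrasts the
two tracking strategies. This is an illustration only, not Vinyl code: keys
are plain integers, and the helper names conflicts_by_keys and
conflicts_by_interval are hypothetical.

```python
# Hypothetical model: keys are plain integers, not Vinyl tuples.

def conflicts_by_keys(read_keys, written_key):
    """Key tracking: a write conflicts only if it hits a key that was returned."""
    return written_key in read_keys


def conflicts_by_interval(lo, hi, written_key):
    """Interval tracking: a write conflicts if it lands inside the scanned range.

    hi=None stands for an unbounded upper end, as with iterator='GE'.
    """
    return written_key >= lo and (hi is None or written_key <= hi)


read_keys = {30, 40, 50}   # keys returned by the first select() of Transaction 1
writes = [35, 45, 55]      # keys inserted by Transaction 2

# Writes that key tracking fails to notice (the phantoms).
missed = [k for k in writes if not conflicts_by_keys(read_keys, k)]
# Writes that interval tracking notices.
caught = [k for k in writes if conflicts_by_interval(30, None, k)]
```

With key tracking none of the three inserts touches a tracked key, so
Transaction 1 is never flagged. With the interval [30, +inf) all three are
detected.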

Besides, tracking individual keys read by a transaction can be very
expensive in terms of memory: think of calling select(*) on a big
space.

So this patch makes the conflict manager track intervals instead of
individual keys. To achieve that it splits tx_manager->read_set in two:

 - vy_tx->read_set. Contains intervals read by a transaction. Needed to
   efficiently search intervals that should be merged with a new one.
   Intervals in this tree cannot intersect.

 - vy_index->read_set. Contains intervals read by all transactions from
   an index. Needed to efficiently search transactions that conflict
   with a write. Intervals can intersect.

When vy_tx_track() is called, it first looks up all intervals
intersecting with the new interval in vy_tx->read_set, removes them, and
extends the new interval to span them. Then it inserts the new interval
into both vy_index->read_set and vy_tx->read_set. The vy_index->read_set
is used on commit to send all transactions that read intervals modified
by the committed statement to read view.
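
The tracking and commit steps described above can be sketched roughly as
follows. This is a simplified model under stated assumptions, not the actual
implementation: plain lists stand in for the trees, there is a single index,
intervals are closed ranges of integers, and the names Tx, Index, Interval,
tx_track and on_commit are hypothetical. Removing merged intervals from the
index read set as well is also an assumption.

```python
class Interval:
    def __init__(self, tx, lo, hi):
        self.tx, self.lo, self.hi = tx, lo, hi


class Index:
    def __init__(self):
        self.read_set = []          # intervals of all transactions; may intersect


class Tx:
    def __init__(self, name):
        self.name = name
        self.read_set = []          # this transaction's intervals; never intersect
        self.in_read_view = False


def tx_track(tx, index, lo, hi):
    """Record that tx read [lo, hi], merging with intervals it already holds."""
    overlapping = [i for i in tx.read_set if i.lo <= hi and lo <= i.hi]
    for old in overlapping:
        tx.read_set.remove(old)
        index.read_set.remove(old)  # assumption: dropped from the index set too
        lo, hi = min(lo, old.lo), max(hi, old.hi)
    merged = Interval(tx, lo, hi)
    tx.read_set.append(merged)
    index.read_set.append(merged)
    return merged


def on_commit(writer, index, key):
    """Send every other transaction that read a range covering key to read view."""
    for i in index.read_set:
        if i.tx is not writer and i.lo <= key <= i.hi:
            i.tx.in_read_view = True


idx = Index()
t1, t2 = Tx("t1"), Tx("t2")
tx_track(t1, idx, 10, 20)
tx_track(t1, idx, 30, 40)
merged = tx_track(t1, idx, 15, 35)  # bridges both earlier intervals
on_commit(t2, idx, 25)              # t2 commits a write to key 25
```

After the third call t1 holds a single interval [10, 40], so the write to
key 25 by t2 marks t1 as sent to read view even though t1 never read key 25
itself.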

Note that we no longer differentiate 'gaps', i.e. non-existent keys read
by a transaction. Gaps were used to avoid aborting a transaction when a
non-existent key it had read was deleted. We can't track gaps without
bloating the read set on select(*).

Closes #2671
parent 1fb352ac