Skip to content
Snippets Groups Projects
Commit a2d1d2a2 authored by Vladimir Davydov's avatar Vladimir Davydov Committed by Konstantin Osipov
Browse files

vinyl: purge dropped indexes from vylog on garbage collection

Currently, when an index is dropped, we remove all ranges/slices
associated with it and mark all runs as dropped in vylog immediately.
To find ranges/slices/runs, we use vy_lsm struct, see vy_log_lsm_prune.

The problem is vy_lsm struct may be inconsistent with the state stored
in vylog if index drop races with compaction, because we first write
changes done by compaction task to vylog and only then update vy_lsm
struct, see vy_task_compact_complete. Since write to vylog yields, this
opens a time window during which the index can be dropped. If this
happens, objects that were created by compaction but haven't been logged
yet (such as new runs, slices, ranges) will be deleted from vylog by
index drop, and this will permanently break vylog, making recovery
impossible.

To fix this issue, let's rework garbage collection of objects associated
with dropped indexes as follows. Now when an index is dropped, we write
a single record to vylog, VY_LOG_DROP_LSM, i.e. just mark the index as
dropped without deleting associated objects. Actual index cleanup takes
place in the garbage collection procedure, see vy_gc, which purges all
ranges/slices linked to marked indexes from vylog and marks all their
runs as dropped. When all runs are actually deleted from disk and
"forgotten" in vylog, we remove the index record from vylog by writing
VY_LOG_FORGET_LSM record. Since garbage collection procedure uses vylog
itself instead of vy_lsm struct for iterating over vinyl objects, no
race between index drop and dump/compaction can now lead to broken
vylog.

Closes #3416
parent 264f7e3f
No related branches found
No related tags found
No related merge requests found
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment