vinyl: fix compaction vs checkpoint race resulting in invalid gc
The callback invoked upon compaction completion uses checkpoint_last() to determine whether compacted runs may be deleted: if the max LSN stored in a compacted run (run->dump_lsn) is greater than the LSN of the last checkpoint (gc_lsn), then the run doesn't belong to the last checkpoint and hence is safe to delete, see commit 35db70fa ("vinyl: remove runs not referenced by any checkpoint immediately").

The problem is that checkpoint_last() isn't synchronized with vylog rotation: it returns the signature of the last successfully created memtx snapshot and is updated in memtx_engine_commit_checkpoint(), after the vylog is rotated. If a compaction task completes after the vylog is rotated but before the snap file is renamed, it will assume that the compacted runs do not belong to the last checkpoint, although they do (as they have been appended to the rotated vylog), and delete them.

To eliminate this race, let's use the vylog signature instead of the snap signature in vy_task_compact_complete().

Closes #3437
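The race window described in the commit message can be illustrated with a small toy model. This is only a sketch, not the code from the patch: `struct checkpoint_model` and the `model_*` functions are hypothetical names invented for this example, and the two signature fields merely stand in for what checkpoint_last() and the vylog would report. It assumes, as the message states, that the vylog is rotated first and the snapshot signature is updated only afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the checkpoint state (hypothetical, not Tarantool's structs). */
struct checkpoint_model {
	/* Stands in for the value checkpoint_last() would return. */
	int64_t snap_signature;
	/* Stands in for the signature of the current vylog file. */
	int64_t vylog_signature;
};

/* First step of a checkpoint: the vylog is rotated to the new signature. */
static inline void
model_rotate_vylog(struct checkpoint_model *m, int64_t signature)
{
	assert(signature >= m->vylog_signature);
	m->vylog_signature = signature;
}

/* Second step: the snap file is renamed and the snapshot signature catches up. */
static inline void
model_commit_snapshot(struct checkpoint_model *m)
{
	m->snap_signature = m->vylog_signature;
}

/* The check from the commit message: a run is deletable if dump_lsn > gc_lsn. */
static inline bool
model_run_is_deletable(int64_t run_dump_lsn, int64_t gc_lsn)
{
	return run_dump_lsn > gc_lsn;
}
```

With a previous checkpoint at signature 100, a new checkpoint at 200, and a compacted run whose dump_lsn is 150, the check against the snapshot signature reports the run as deletable between the two steps, while the check against the vylog signature keeps it. Once the snapshot is committed, both checks agree.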
Changed files:
- src/box/memtx_engine.c: 8 additions, 0 deletions
- src/box/vy_log.c: 6 additions, 0 deletions
- src/box/vy_log.h: 6 additions, 0 deletions
- src/box/vy_scheduler.c: 1 addition, 1 deletion
- src/errinj.h: 1 addition, 0 deletions
- test/box/errinj.result: 2 additions, 0 deletions
- test/vinyl/errinj.result: 69 additions, 0 deletions
- test/vinyl/errinj.test.lua: 29 additions, 0 deletions