vinyl: skip bad vylog records in force_recovery mode
We've had a number of issues when Tarantool was permanently broken (unable to recover after restart) because of a bad vylog record. The `force_recovery` mode didn't help so the user would have no other choice but to rebootstrap.

A funny thing is those bugs were usually caused by a race between the garbage collector and dump/compaction when a vylog record was written for a dropped index. The worst thing that could happen if we ignored such a bad record is an unused run file not deleted from disk. Apparently, this is better than a permanent recovery failure so let's support the `force_recovery` mode in vylog.

The tricky part here is handling checkpoint after restart. The problem is that to create a vylog checkpoint, we load the previous vylog file so we have to ignore errors if it was loaded in the `force_recovery` mode.

Closes #10292

NO_DOC=bug fix

(cherry picked from commit c68e8a8e029d849d68c6018ed00b5a79cc769222)
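The skip-on-error behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the code from `src/box/vy_log.c`: the record layout, the `recovery` context, and the functions `apply_record` and `recover_log` are all hypothetical names invented for this illustration. It only shows the general idea that a record referring to a dropped index aborts recovery in strict mode but is logged and skipped when a `force_recovery` flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record types; the real vylog has many more. */
enum { REC_CREATE_INDEX, REC_DROP_INDEX, REC_CREATE_RUN };

enum { MAX_INDEXES = 8 };

/* Hypothetical record: NOT the real vy_log_record layout. */
struct log_record {
    int type;
    long index_id;
};

/* Hypothetical recovery context tracking which indexes exist. */
struct recovery {
    bool index_exists[MAX_INDEXES];
    size_t applied;
    size_t skipped;
};

/* Apply one record; return 0 on success, -1 if the record is bad. */
static int
apply_record(struct recovery *r, const struct log_record *rec)
{
    if (rec->index_id < 0 || rec->index_id >= MAX_INDEXES)
        return -1;
    bool *exists = &r->index_exists[rec->index_id];
    switch (rec->type) {
    case REC_CREATE_INDEX:
        if (*exists)
            return -1;
        *exists = true;
        return 0;
    case REC_DROP_INDEX:
        if (!*exists)
            return -1;
        *exists = false;
        return 0;
    case REC_CREATE_RUN:
        /* A run written for a dropped index is a bad record. */
        return *exists ? 0 : -1;
    default:
        return -1;
    }
}

/*
 * Replay all records. In strict mode the first bad record aborts
 * recovery; with force_recovery it is reported and skipped.
 */
static int
recover_log(struct recovery *r, const struct log_record *records,
            size_t count, bool force_recovery)
{
    assert(records != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (apply_record(r, &records[i]) == 0) {
            r->applied++;
            continue;
        }
        if (!force_recovery)
            return -1;
        fprintf(stderr, "skipping bad record #%zu\n", i);
        r->skipped++;
    }
    return 0;
}
```

In this sketch the cost of skipping a bad "create run" record is only that the run is never registered, which mirrors the commit's observation that the worst outcome is an unused run file left on disk.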
Showing
- changelogs/unreleased/gh-10292-vy-log-force-recovery.md (4 additions, 0 deletions)
- src/box/vy_log.c (78 additions, 9 deletions)
- src/box/vy_log.h (9 additions, 0 deletions)
- src/lib/core/errinj.h (1 addition, 0 deletions)
- test/vinyl-luatest/gh_10292_vylog_force_recovery_test.lua (66 additions, 0 deletions)