box: rework internal garbage collection API

The current gc implementation has a number of flaws:

 - It tracks checkpoints, not consumers, which makes it impossible to
   identify the reason why gc isn't invoked. All we can see is the
   number of users of each particular checkpoint (reference counter),
   while it would be good to know what references it (replica or
   backup).

 - While tracking checkpoints suits well for backup and initial join,
   it doesn't look good when used for subscribe, because replica is
   supposed to track a vclock, not a checkpoint.

 - Tracking checkpoints from box/gc also violates encapsulation:
   checkpoints are, in fact, memtx snapshots, so they should be
   tracked by memtx engine, not by gc, as they are now. This results
   in atrocities, like having two snap xdirs - one in memtx, another
   in gc.

 - Garbage collection is invoked by a special internal function,
   box.internal.gc.run(), which is passed the signature of the oldest
   checkpoint to save. This function is then used by the snapshot
   daemon to maintain the configured number of checkpoints. This
   brings unjustified complexity to the snapshot daemon
   implementation: instead of just calling box.snapshot() periodically
   it has to take on responsibility to invoke the garbage collector
   with the right signature. This also means that garbage collection
   is disabled unless snapshot daemon is configured to be running,
   which is confusing, as snapshot daemon is disabled by default.

So this patch reworks box/gc as follows:

 - Checkpoints are now tracked by memtx engine and can be accessed via
   a new module box/src/checkpoint.[hc], which provides simple
   wrappers around corresponding MemtxEngine methods.

 - box/gc.[hc] now tracks not checkpoints, but individual consumers
   that can be registered, unregistered, and advanced. Each consumer
   has a human-readable name displayed by box.internal.gc.info():

     tarantool> box.internal.gc.info()
     ---
     - consumers:
       - name: backup
         signature: 8
       - name: replica 885a81a9-a286-4f06-9cb1-ed665d7f5566
         signature: 12
       - name: replica 5d3e314f-bc03-49bf-a12b-5ce709540c87
         signature: 12
       checkpoints:
       - signature: 8
       - signature: 11
       - signature: 12
     ...

 - box.internal.gc.run() is removed. Garbage collection is now invoked
   automatically by box.snapshot() and doesn't require the snapshot
   daemon to be up and running.
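
The consumer model described above (register, unregister, advance) can be illustrated with a minimal, self-contained sketch. This is not the code from this patch: every type and function name below (gc_state_sketch, consumer_register, consumer_unregister, consumer_advance, gc_oldest_needed), the fixed-size table, and the "never move backwards" rule in advance are assumptions made for illustration only. It shows one way a collector could derive the oldest signature that must be kept from a set of named consumers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical limits, chosen only to keep the sketch short. */
enum { GC_MAX_CONSUMERS = 16, GC_NAME_MAX = 64 };

/* One registered consumer: who it is and the oldest signature it needs. */
struct gc_consumer_sketch {
	int in_use;
	char name[GC_NAME_MAX];
	int64_t signature;
};

/* Zero-initialize this structure before first use. */
struct gc_state_sketch {
	struct gc_consumer_sketch consumers[GC_MAX_CONSUMERS];
};

/* Register a named consumer; returns NULL if the table is full. */
struct gc_consumer_sketch *
consumer_register(struct gc_state_sketch *gc, const char *name,
		  int64_t signature)
{
	assert(gc != NULL && name != NULL);
	for (int i = 0; i < GC_MAX_CONSUMERS; i++) {
		struct gc_consumer_sketch *c = &gc->consumers[i];
		if (c->in_use)
			continue;
		c->in_use = 1;
		/* snprintf truncates over-long names and always terminates. */
		snprintf(c->name, sizeof(c->name), "%s", name);
		c->signature = signature;
		return c;
	}
	return NULL;
}

/* Drop a consumer so that it no longer pins any files. */
void
consumer_unregister(struct gc_consumer_sketch *c)
{
	assert(c != NULL);
	c->in_use = 0;
}

/* Move a consumer forward; attempts to move backwards are ignored. */
void
consumer_advance(struct gc_consumer_sketch *c, int64_t signature)
{
	assert(c != NULL && c->in_use);
	if (signature > c->signature)
		c->signature = signature;
}

/*
 * Return the smallest signature still needed by any consumer,
 * or `fallback` if no consumer is registered.
 */
int64_t
gc_oldest_needed(const struct gc_state_sketch *gc, int64_t fallback)
{
	assert(gc != NULL);
	int64_t oldest = fallback;
	int found = 0;
	for (int i = 0; i < GC_MAX_CONSUMERS; i++) {
		const struct gc_consumer_sketch *c = &gc->consumers[i];
		if (!c->in_use)
			continue;
		if (!found || c->signature < oldest)
			oldest = c->signature;
		found = 1;
	}
	return oldest;
}
```

In this sketch a "backup" consumer at signature 8 and a replica at signature 12 would pin everything from signature 8 onwards. Because each entry carries a name, an info call could report which consumer holds the collector back, which a bare reference counter cannot do.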
Changed files:
- src/box/CMakeLists.txt: 1 addition, 0 deletions
- src/box/box.cc: 51 additions, 31 deletions
- src/box/box.h: 1 addition, 1 deletion
- src/box/checkpoint.cc: 60 additions, 0 deletions
- src/box/checkpoint.h: 97 additions, 0 deletions
- src/box/engine.cc: 8 additions, 12 deletions
- src/box/engine.h: 10 additions, 2 deletions
- src/box/gc.c: 194 additions, 124 deletions
- src/box/gc.h: 58 additions, 73 deletions
- src/box/lua/cfg.cc: 12 additions, 0 deletions
- src/box/lua/init.c: 33 additions, 22 deletions
- src/box/lua/load_cfg.lua: 2 additions, 5 deletions
- src/box/lua/snapshot_daemon.lua: 2 additions, 36 deletions
- src/box/memtx_engine.cc: 42 additions, 10 deletions
- src/box/memtx_engine.h: 33 additions, 0 deletions
- src/box/relay.cc: 14 additions, 9 deletions
- src/box/replication.cc: 3 additions, 13 deletions
- src/box/replication.h: 2 additions, 14 deletions
- src/box/vinyl_engine.cc: 2 additions, 1 deletion
- src/box/vinyl_engine.h: 1 addition, 1 deletion