Memory leak on temporary space drop
The problem was originally detected by a stress test for picodata SQL, after SQL internally switched from data-temporary tables to fully temporary ones. The test quickly fails with errors like:
...
2023-10-23 16:40:15.254 [93320] main/138/pico.sql vshard.rs:56 E> LuaError(ExecutionError("sbroad: failed to create space: TMP_a327cbb3638f42e5a47779fc0eab667a_18: tarantool error: MemoryIssue: Failed to allocate 16384 bytes in mempool for new slab")): dql_on_some
...
A simple repro via Lua (see also the helper sketch after this list):
- Check box.slab.stats() for the 56-byte mempool:
... - mem_free: 16000 mem_used: 32816 item_count: 586 item_size: 56 slab_count: 3 slab_size: 16384 ...
- Create a temporary space with s = box.schema.space.create('temp', { type = 'temporary' }) and check box.slab.stats() again:
... - mem_free: 15944 mem_used: 32872 item_count: 587 item_size: 56 slab_count: 3 slab_size: 16384 ...
- Drop the temporary space with box.schema.space.drop(s.id) and check box.slab.stats() once more (the leak: item_count stays the same!):
... - mem_free: 15944 mem_used: 32872 item_count: 587 item_size: 56 slab_count: 3 slab_size: 16384 ...
In the case of data-temporary tables there is no leak and item_count returns to the original value of 586.
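A minimal Lua helper that automates the three steps above. It is only a sketch: it assumes the box.slab.stats() fields shown in the output (item_size, item_count) and that the space name 'leak_check' is free.

-- Sketch: report the item_count delta of the mempool with the given
-- item size around a create+drop cycle of a temporary space.
local function leak_delta(item_size)
    local function count()
        for _, pool in pairs(box.slab.stats()) do
            if pool.item_size == item_size then
                return pool.item_count
            end
        end
        return 0
    end
    local before = count()
    local s = box.schema.space.create('leak_check', { type = 'temporary' })
    box.schema.space.drop(s.id)
    return count() - before
end

-- For 'temporary' spaces the delta stays positive (the leak);
-- for 'data-temporary' ones it eventually returns to zero.
print(leak_delta(56))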
Activity
- Denis Smirnov added bug label
- Denis Smirnov assigned to @darthunix
- Denis Smirnov changed the description
- Author Owner
I wondered what type of object is stored inside the leaked slab (its item size is 56 bytes). I added an assertion inside mempool_create(), and the answer was obvious: we have a leaked tuple.
* thread #1, queue = 'com.apple.main-thread', stop reason = hit program assert
    frame #4: 0x00000001006ede9c tarantool`mempool_create(pool=0x00000001008e0a98, cache=0x00000001008d26f0, objsize=56) at mempool.h:233:3
   230      uint32_t objsize)
   231  {
   232      if (objsize == 56)
-> 233          assert(0);
   234      size_t overhead = (objsize > sizeof(struct mslab) ?
   235          objsize : sizeof(struct mslab));
   236      size_t slab_size = (size_t) (overhead / OVERHEAD_RATIO);
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = hit program assert
    frame #0: 0x00007ff802ba230e libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007ff802bd9f7b libsystem_pthread.dylib`pthread_kill + 263
    frame #2: 0x00007ff802b23ca5 libsystem_c.dylib`abort + 123
    frame #3: 0x00007ff802b22fbe libsystem_c.dylib`__assert_rtn + 314
  * frame #4: 0x00000001006ede9c tarantool`mempool_create(pool=0x00000001008e0a98, cache=0x00000001008d26f0, objsize=56) at mempool.h:233:3
    frame #5: 0x00000001006ed3fb tarantool`small_mempool_create(alloc=0x00000001008e0798) at small.c:257:3
    frame #6: 0x00000001006ed15f tarantool`small_alloc_create(alloc=0x00000001008e0798, cache=0x00000001008d26f0, objsize_min=16, granularity=8, alloc_factor=1.04999995, actual_alloc_factor=0x00007ff7bfeff434) at small.c:324:2
    frame #7: 0x00000001005a4932 tarantool`tuple_init(hash=(tarantool`lua_hash at lj_api.c:1338)) at tuple.c:314:2
    frame #8: 0x0000000100157f7a tarantool`::box_init() at box.cc:5285:6
    frame #9: 0x0000000100005aed tarantool`main(argc=1, argv=0x0000000102808200) at main.cc:809:3
    frame #10: 0x00007ff8028a9310 dyld`start + 2432
- Author Owner
The leaked tuple is in space 280, i.e. _space (found with some more dirty debug patches):
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 * frame #0: 0x000000010f9f74f0 tarantool`collect_slab_stats_cb(stats_=0x00000001119ff2d8, ctx_=0x00000001119ff360) at alter.cc:2015:3 frame #1: 0x000000010ffc4d8a tarantool`small_stats(alloc=0x00000001101e6488, totals=0x00000001119ff338, cb=(tarantool`collect_slab_stats_cb(void const*, void*) at alter.cc:2008), cb_ctx=0x00000001119ff360) at small.c:469:7 frame #2: 0x000000010f9f746a tarantool`SmallAlloc::stats(alloc_stats=0x00000001119ff368, cb=(tarantool`collect_slab_stats_cb(void const*, void*) at alter.cc:2008), cb_ctx=0x00000001119ff360) at allocator.h:122:3 frame #3: 0x000000010f9f5e51 tarantool`log_stats() at alter.cc:2027:2 frame #4: 0x000000010f9ea623 tarantool`on_replace_dd_space((null)=0x0000000110183800, event=0x000000011182e038) at alter.cc:2178:3 frame #5: 0x000000010fbdb59d tarantool`trigger_run_list(list=0x00000001119ff918, event=0x000000011182e038) at trigger.cc:119:7 frame #6: 0x000000010fbdb367 tarantool`trigger_run(list=0x00007fcd22a26900, event=0x000000011182e038) at trigger.cc:152:9 frame #7: 0x000000010fa0d584 tarantool`txn_commit_stmt(txn=0x000000011182e038, request=0x00000001119ffa80) at txn.c:702:13 frame #8: 0x000000010fa1ca9c tarantool`box_process_rw(request=0x00000001119ffa80, space=0x00007fcd22a26860, result=0x00000001119ffb98) at box.cc:478:6 frame #9: 0x000000010fa26380 tarantool`box_process1(request=0x00000001119ffa80, result=0x00000001119ffb98) at box.cc:3162:9 frame #10: 0x000000010fa27825 tarantool`box_delete(space_id=280, index_id=0, key="\x91\xce@", key_end="PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP"..., result=0x00000001119ffb98) at box.cc:3357:9 frame #11: 0x000000010fb454c6 tarantool`lbox_index_delete(L=0x0000000110565378) at index.c:140:11 frame #12: 0x000000010fc1340b tarantool`lj_BC_FUNCC + 68 frame #13: 0x000000010fc20eba tarantool`lua_pcall(L=0x0000000110565378, nargs=0, nresults=0, errfunc=0) at lj_api.c:1163:12 frame #14: 0x000000010fb80d83 tarantool`luaT_call(L=0x0000000110565378, nargs=0, nreturns=0) at utils.c:618:6 frame #15: 0x000000010fb76067 tarantool`lua_main(L=0x0000000110565378, is_debugging=false, argc=1, argv=0x00007fcd22008800) at init.c:1027:11 frame #16: 0x000000010fb75839 tarantool`run_script_f(ap=0x0000000111810528) at init.c:1180:7 frame #17: 0x000000010f8dd71a tarantool`fiber_cxx_invoke(f=(tarantool`run_script_f at init.c:1038), ap=0x0000000111810528) at fiber.h:1240:10 frame #18: 0x000000010fbacfc5 tarantool`fiber_loop(data=0x0000000000000000) at fiber.c:998:18 frame #19: 
0x000000010ffe912e tarantool`coro_init at coro.c:108:3
- Author Owner
Ok, the leak is inside box_insert->box_process1->box_process_rw->space_execute_dml.
- Author Owner
I have rechecked box.slab.stats() and found out that it shows an increasing number of objects even for data-temporary spaces. I didn't see them before because temporary-space tuples fit into the 56-byte mempool, while data-temporary ones go into 160-byte slabs. Anyway, the stress test crashes and we need to fix it. I have made a small memory profiling function that dumps the mempool state to stderr on each SmallAlloc::alloc call. Here is the memory profile of a single picodata instance with temporary spaces.
Here is the mempool state at instance start:
24 bytes mempool (objects: 155, slabs: 1, used: 3720, total: 16272)
32 bytes mempool (objects: 2599, slabs: 6, used: 83168, total: 97632)
56 bytes mempool (objects: 595, slabs: 3, used: 33320, total: 48816)
160 bytes mempool (objects: 204, slabs: 3, used: 32640, total: 48816)
312 bytes mempool (objects: 34, slabs: 1, used: 10608, total: 32656)
648 bytes mempool (objects: 5, slabs: 1, used: 3240, total: 65424)
And this is at the instance crash with "Failed to allocate 16384 bytes in mempool for new slab" (after running the stress test):
24 bytes mempool (objects: 1042, slabs: 2, used: 25008, total: 32544)
32 bytes mempool (objects: 2599, slabs: 6, used: 83168, total: 97632)
56 bytes mempool (objects: 596, slabs: 3, used: 33376, total: 48816)
160 bytes mempool (objects: 199, slabs: 3, used: 31840, total: 48816)
312 bytes mempool (objects: 34, slabs: 1, used: 10608, total: 32656)
648 bytes mempool (objects: 5, slabs: 1, used: 3240, total: 65424)
Note that it is the 24-byte mempool that grows: from 155 objects at start to 1042 at the crash.
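The dumps above come from a C hook on SmallAlloc::alloc; a rough Lua-side approximation that prints a similar per-mempool summary, assuming only the box.slab.stats() fields shown earlier, could look like this:

-- Sketch: print a per-mempool summary, similar to the C-side dump above,
-- to stderr from the Lua console ("total" = mem_used + mem_free).
local function dump_mempools()
    for _, pool in pairs(box.slab.stats()) do
        io.stderr:write(string.format(
            '%d bytes mempool (objects: %d, slabs: %d, used: %d, total: %d)\n',
            pool.item_size, pool.item_count, pool.slab_count,
            pool.mem_used, pool.mem_used + pool.mem_free))
    end
end

dump_mempools()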
- Author Owner
I have increased memtx memory to 64 MB, but still hit the 24-byte mempool overflow:
24 bytes mempool (objects: 156, slabs: 1, used: 3744, total: 16272)
32 bytes mempool (objects: 2599, slabs: 6, used: 83168, total: 97632)
56 bytes mempool (objects: 595, slabs: 3, used: 33320, total: 48816)
160 bytes mempool (objects: 208, slabs: 3, used: 33280, total: 48816)
312 bytes mempool (objects: 34, slabs: 1, used: 10608, total: 32656)
648 bytes mempool (objects: 5, slabs: 1, used: 3240, total: 65424)
-> (Failed to allocate...)
24 bytes mempool (objects: 1044, slabs: 2, used: 25056, total: 32544)
32 bytes mempool (objects: 2599, slabs: 6, used: 83168, total: 97632)
56 bytes mempool (objects: 596, slabs: 3, used: 33376, total: 48816)
160 bytes mempool (objects: 207, slabs: 3, used: 33120, total: 48816)
312 bytes mempool (objects: 34, slabs: 1, used: 10608, total: 32656)
648 bytes mempool (objects: 5, slabs: 1, used: 3240, total: 65424)
- Author Owner
Here is the same test with data-temporary spaces (no leak in the 24-byte mempool):
24 bytes mempool (objects: 157, slabs: 1, used: 3768, total: 16272)
32 bytes mempool (objects: 2599, slabs: 6, used: 83168, total: 97632)
56 bytes mempool (objects: 595, slabs: 3, used: 33320, total: 48816)
160 bytes mempool (objects: 212, slabs: 3, used: 33920, total: 48816)
312 bytes mempool (objects: 34, slabs: 1, used: 10608, total: 32656)
648 bytes mempool (objects: 5, slabs: 1, used: 3240, total: 65424)
-> (Failed to allocate...)
24 bytes mempool (objects: 157, slabs: 1, used: 3768, total: 16272)
32 bytes mempool (objects: 2599, slabs: 6, used: 83168, total: 97632)
56 bytes mempool (objects: 601, slabs: 3, used: 33656, total: 48816)
160 bytes mempool (objects: 231, slabs: 3, used: 36960, total: 48816)
312 bytes mempool (objects: 54, slabs: 1, used: 16848, total: 32656)
648 bytes mempool (objects: 5, slabs: 1, used: 3240, total: 65424)
- Author Owner
Backtrace for an allocation that falls into the 24-byte mempool:
* frame #0: 0x00000001050a3087 picodata`MemtxAllocator<SmallAlloc>::alloc(size=17) at memtx_allocator.h:320:4 frame #1: 0x00000001050a2ba0 picodata`tuple* memtx_tuple_new_raw_impl<SmallAlloc>(tuple_format*, char const*, char const*, bool) [inlined] MemtxAllocator<SmallAlloc>::alloc_tuple(size=13) at memtx_allocator.h:256:26 frame #2: 0x00000001050a2b98 picodata`tuple* memtx_tuple_new_raw_impl<SmallAlloc>(format=0x00007ff3d352e3b0, data="\x92\U00000002", end=<unavailable>, validate=<unavailable>) at memtx_engine.cc:1588:18 frame #3: 0x00000001050a41f4 picodata`memtx_space_execute_replace(space=0x00007ff3d352e250, txn=<unavailable>, request=<unavailable>, result=<unavailable>) at memtx_space.c:393:3 frame #4: 0x00000001050f85dd picodata`space_execute_dml(space=0x00007ff3d352e250, txn=0x000000010a428038, request=0x00000001179fa2f0, result=0x00000001179fa2a8) at space.c:811:7 frame #5: 0x0000000105121165 picodata`::box_process_rw(request=0x00000001179fa2f0, space=0x00007ff3d352e250, result=0x00000001179fa600) at box.cc:461:6 frame #6: 0x0000000105127d33 picodata`::box_process1(request=<unavailable>, result=<unavailable>) at box.cc:3162:9 [artificial] frame #7: 0x000000010512869b picodata`::box_insert(space_id=<unavailable>, tuple=<unavailable>, tuple_end=<unavailable>, result=<unavailable>) at box.cc:3328:9 frame #8: 0x0000000105960474 picodata`tarantool::space::Space::insert::hff7987cadb53a987(self=0x00000001179fad2c, value=0x00000001179fb1f8) at space.rs:644:9 frame #9: 0x000000010572135e picodata`sbroad::backend::sql::space::TmpSpace::initialize::h4d43092fbdcbe89d(exec_plan=0x00000001179fd328, base=(data_ptr = "64da4e1570ec467b99aa7b122d6facf8", length = 32), motion_id=18, buckets=0x00000001179fd178, engine=0x00000001179fcd37) at space.rs:115:23 frame #10: 0x00000001058c55ec picodata`sbroad::backend::sql::ir::_$LT$impl$u20$sbroad..executor..ir..ExecutionPlan$GT$::to_sql::_$u7b$$u7b$closure$u7d$$u7d$::hff8325199384dc8e at ir.rs:453:41 frame #11: 0x00000001057c3303 picodata`sbroad::otm::child_span::he3f87337f3417eef(name=(data_ptr = "\"syntax.ordered.sql\"for Motion node ()sbroad/sbroad-core/src/executor/ir.rsinvalid motion child nodecurrent node () is not motionMotion node () must have a single child only (actual ) is not sub querynode (", length = 20), f={closure_env#0} @ 0x00000001179fce68) at otm.rs:255:22 frame #12: 0x000000010590ca23 picodata`sbroad::backend::sql::ir::_$LT$impl$u20$sbroad..executor..ir..ExecutionPlan$GT$::to_sql::h8e67b3b4c7113be4(self=0x00000001179fd328, nodes=&[&sbroad::backend::sql::tree::SyntaxData] @ 0x00000001179fd0b8, buckets=0x00000001179fd178, name_base=(data_ptr = "64da4e1570ec467b99aa7b122d6facf8", length = 32)) at ir.rs:235:29 frame #13: 0x00000001056fc2c9 picodata`sbroad::executor::engine::helpers::compile_optional::hf9f26ec20548eb84(optional=0x00000001179fd328) at helpers.rs:270:5 frame #14: 0x00000001056fbf2f picodata`sbroad::executor::engine::helpers::compile_encoded_optional::hf6a7c448bec2e426(raw_optional=0x00000001179fed80) at helpers.rs:253:5 frame #15: 0x0000000105703783 picodata`sbroad::executor::engine::helpers::execute_non_cacheable_dql_with_raw_optional::h41f6ccfb8f3622c4(raw_optional=0x00000001179fed80, vtable_max_rows=5000, opts=<unavailable>) at helpers.rs:1681:46 frame #16: 0x00000001049eb441 picodata`picodata::sql::storage::StorageRuntime::execute_plan::h3cd3939c63c75b71(self=0x00000001179fe0a0, required=0x00000001179fee48, raw_optional=0x00000001179fed80) at storage.rs:234:21 frame #17: 0x000000010495a207 
picodata`picodata::sql::execute::__tp_inner::_$u7b$$u7b$closure$u7d$$u7d$::h0381ac32dc46e9ad at sql.rs:411:15 frame #18: 0x0000000104f220ad picodata`sbroad::otm::query_span::h3f6dfe28a96477a4(name=(data_ptr = "\"api.storage\"execute", length = 13), id=(data_ptr = "", length = 0), tracer=0x00000001179ff237, ctx=0x00000001179ff210, sql=(data_ptr = "", length = 0), f={closure_env#0} @ 0x00000001179fecb8) at otm.rs:337:22 frame #19: 0x0000000104e82b17 picodata`picodata::sql::execute::__tp_inner::h0e4f245e4c3085fe(raw=&tarantool::tuple::RawBytes @ 0x00000001179ff250) at sql.rs:409:5 frame #20: 0x0000000104e820b2 picodata`execute(__tp_ctx=FunctionCtx @ 0x00000001179ff318, __tp_args=(start = "\x92ْ", end = "\x96\xce@")) at sql.rs:399:1 frame #21: 0x000000010509c76d picodata`module_func_call(mf=0x00007ff3d3459588, args=<unavailable>, ret=0x00000001179ff980) at module_cache.c:240:11 frame #22: 0x00000001050fe908 picodata`func_c_call(base=<unavailable>, args=<unavailable>, ret=<unavailable>) at func.c:533:9 frame #23: 0x00000001050fed62 picodata`func_call_no_access_check(base=0x00007ff3d34594c0, args=<unavailable>, ret=0x00000001179ff980) at func.c:612:11 frame #24: 0x00000001051b7383 picodata`port_msgpack_set_plain(base=<unavailable>, plain=<unavailable>, len=<unavailable>) at call.c:103:3 frame #25: 0x0000000105231813 picodata`lj_BC_FUNCC + 68 frame #26: 0x00000001051b7877 picodata`port_msgpack_set_plain [inlined] error_unlink_effect(e=0x00000001179ffc08) at diag.h:222:6 frame #27: 0x0000000105231813 picodata`lj_BC_FUNCC + 68 frame #28: 0x000000010523933d picodata`lua_pcall(L=0x000000010a903f00, nargs=<unavailable>, nresults=<unavailable>, errfunc=0) at lj_api.c:1163:12 frame #29: 0x0000000104fa3dc3 picodata`luaT_call(L=<unavailable>, nargs=<unavailable>, nreturns=<unavailable>) at utils.c:618:6 frame #30: 0x00000001051b69e4 picodata`port_msgpack_set_plain(base=<unavailable>, plain=<unavailable>, len=<unavailable>) at call.c:103:3 frame #31: 0x00000001051b8bcc picodata`port_msgpack_destroy(base=<unavailable>) at call.c:93:2 frame #32: 0x00000001050fed62 picodata`func_call_no_access_check(base=<unavailable>, args=0x00000001179ffcb0, ret=<unavailable>) at func.c:612:11 frame #33: 0x0000000105155979 picodata`box_process_call(request=0x0000000115410188, port=0x00000001179ffe98) at call.c:173:7 frame #34: 0x000000010506cd8c picodata`tx_process_call(m=<unavailable>) at iproto.cc:2291:8 frame #35: 0x0000000104f34944 picodata`cmsg_deliver(msg=<unavailable>) at cbus.c:553:2 frame #36: 0x0000000104f35a2c picodata`fiber_pool_f(ap=<unavailable>) at fiber_pool.c:64:3 frame #37: 0x0000000104f8dc3c picodata`fiber_cxx_invoke(f=<unavailable>, ap=<unavailable>)(__va_list_tag*), __va_list_tag*) at fiber.h:1240:10 frame #38: 0x0000000104f2ee5c picodata`fiber_loop(data=<unavailable>) at fiber.c:998:18 frame #39: 0x0000000104f9347c picodata`coro_init at coro.c:108:3
- Author Owner
Finally, I got the Lua repro. The key point is to make an insertion into the space (an empty space doesn't leak):
for i = 1, 1000 do
    s = box.schema.space.create('temp', { type = 'temporary' })
    box.space.temp:format({ { name = 'id', type = 'unsigned' }, { name = 'value', type = 'unsigned' } })
    box.space.temp:create_index('primary', { parts = { 'id' } })
    box.space.temp:insert({ 1, 1 })
    box.schema.space.drop(s.id)
end
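For comparison, a control-case sketch of the same workload over a data-temporary space, which does not overflow the 24-byte mempool according to the profiling dumps above (the type = 'data-temporary' spelling and the space name temp_dt are assumptions, adjust to your build):

-- Control case: same loop, but with a data-temporary space.
-- Here the dropped tuples are eventually reclaimed by the memtx gc fiber.
for i = 1, 1000 do
    local s = box.schema.space.create('temp_dt', { type = 'data-temporary' })
    s:format({ { name = 'id', type = 'unsigned' }, { name = 'value', type = 'unsigned' } })
    s:create_index('primary', { parts = { 'id' } })
    s:insert({ 1, 1 })
    box.schema.space.drop(s.id)
end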
- Author Owner
When we run the test with data-temporary spaces, we can see the memtx gc fiber working. Here is the trace:
* frame #0: 0x000000010e848d6c picodata`matras_destroy(m=0x00007fad1c13e150) at matras.c:118:25
    frame #1: 0x000000010e97053d picodata`void memtx_tree_index_gc_free<true>(memtx_gc_task*) [inlined] NS_USE_HINT::memtx_tree_destroy(tree=0x00007fad1c13e0f8) at bps_tree.h:1475:2
    frame #2: 0x000000010e970531 picodata`void memtx_tree_index_gc_free<true>(memtx_gc_task*) [inlined] void memtx_tree_index_free<true>(index=0x00007fad1c13e0b0) at memtx_tree.cc:806:2
    frame #3: 0x000000010e970531 picodata`void memtx_tree_index_gc_free<true>(task=0x00007fad1c13e1e0) at memtx_tree.cc:850:2
    frame #4: 0x000000010e992dd4 picodata`memtx_engine_gc_f(__va_list_tag*) [inlined] memtx_engine_run_gc(memtx=0x00007fad1d00fa00, stop=<unavailable>) at memtx_engine.cc:1314:3
    frame #5: 0x000000010e992d98 picodata`memtx_engine_gc_f(va=<unavailable>) at memtx_engine.cc:1326:3
    frame #6: 0x000000010e880c3c picodata`fiber_cxx_invoke(f=<unavailable>, ap=<unavailable>)(__va_list_tag*), __va_list_tag*) at fiber.h:1240:10
    frame #7: 0x000000010e821e5c picodata`fiber_loop(data=<unavailable>) at fiber.c:998:18
    frame #8: 0x000000010e88647c picodata`coro_init at coro.c:108:3
In the case of fully temporary spaces, no gc happens (it looks like we don't register their matras arena for destruction).
- Author Owner
All the drop space/index/sequence opcodes are executed inside the
alter_space_do()
function* frame #0: 0x000000010e9ffffe picodata`alter_space_do(stmt=0x0000000113c28288, alter=0x0000000113c283a8) at alter.cc:936:14 frame #1: 0x000000010e9fb9c5 picodata`on_replace_dd_index((null)=<unavailable>, event=<unavailable>) at alter.cc:2649:3 frame #2: 0x000000010e83a42a picodata`trigger_run_list(list=0x000000011f5ff8c8, event=0x0000000113c28038) at trigger.cc:119:7 frame #3: 0x000000010e83a211 picodata`::trigger_run(list=0x00007fad1c769370, event=0x0000000113c28038) at trigger.cc:152:9 frame #4: 0x000000010ea0c452 picodata`txn_commit_stmt(txn=0x0000000113c28038, request=0x000000011f5ff9b0) at txn.c:702:13 frame #5: 0x000000010ea14436 picodata`::box_process_rw(request=0x000000011f5ff9b0, space=0x000000012169ead8, result=0x0000000000000000) at box.cc:478:6 frame #6: 0x000000010ea1af13 picodata`::box_process1(request=<unavailable>, result=<unavailable>) at box.cc:3162:9 [artificial] frame #7: 0x000000010ea1b9e7 picodata`::box_delete(space_id=288, index_id=0, key=<unavailable>, key_end=<unavailable>, result=<unavailable>) at box.cc:3357:9 frame #8: 0x000000010eab4d0a picodata`lbox_index_delete(L=0x000000012169ead8) at index.c:140:11 frame #9: 0x000000010eb249f3 picodata`lj_BC_FUNCC + 68 frame #10: 0x000000010eaab32b picodata`execute_lua_eval(L=0x000000012169ead8) at call.c:435:2 frame #11: 0x000000010eb249f3 picodata`lj_BC_FUNCC + 68 frame #12: 0x000000010eb2c51d picodata`lua_pcall(L=0x000000012169ead8, nargs=1, nresults=-1, errfunc=0) at lj_api.c:1163:12 frame #13: 0x000000010e896dc3 picodata`luaT_call(L=<unavailable>, nargs=<unavailable>, nreturns=<unavailable>) at utils.c:618:6 frame #14: 0x000000010eaa9bc4 picodata`box_process_lua(handler=HANDLER_EVAL, ctx=0x000000011f5ffc68, ret=0x000000011f5ffe98) at call.c:646:6 frame #15: 0x000000010eaa9cc6 picodata`box_lua_eval(expr=<unavailable>, expr_len=<unavailable>, args=<unavailable>, ret=<unavailable>) at call.c:692:9 frame #16: 0x000000010ea48da8 picodata`box_process_lua(handler=<unavailable>, ctx=<unavailable>, ret=<unavailable>) at call.c:639:23 frame #17: 0x000000010e95fcfd picodata`tx_process_call(m=<unavailable>) at iproto.cc:2296:8 frame #18: 0x000000010e827944 picodata`cmsg_deliver(msg=<unavailable>) at cbus.c:553:2 frame #19: 0x000000010e828a2c picodata`fiber_pool_f(ap=<unavailable>) at fiber_pool.c:64:3 frame #20: 0x000000010e880c3c picodata`fiber_cxx_invoke(f=<unavailable>, ap=<unavailable>)(__va_list_tag*), __va_list_tag*) at fiber.h:1240:10 frame #21: 0x000000010e821e5c picodata`fiber_loop(data=<unavailable>) at fiber.c:998:18 frame #22: 0x000000010e88647c picodata`coro_init at coro.c:108:3
- Author Owner
The problem is that alter_space_do() registers alter_space_commit() as an on-commit trigger at alter.cc:1009:
/*
 * Install transaction commit/rollback triggers to either
 * finish or rollback the DDL depending on the results of
 * writing to WAL.
 */
txn_stmt_on_commit(stmt, on_commit);
alter_space_commit()'s
responsibility is to free the memory on space drop. It is never called for temporary spaces, but called for data-temporary, so there are no leaks (here is a trace for data-temporary):* frame #0: 0x000000010ea0082a picodata`alter_space_commit(trigger=0x0000000113c28478, event=0x0000000113c28038) at alter.cc:817:62 frame #1: 0x000000010e83a42a picodata`trigger_run_list(list=0x00007ff7b1f50a08, event=0x0000000113c28038) at trigger.cc:119:7 frame #2: 0x000000010e83a5c1 picodata`::trigger_run_reverse(list=0x0000000113c28198, event=0x0000000113c28038) at trigger.cc:166:9 frame #3: 0x000000010ea0c856 picodata`txn_complete_success(txn=0x0000000113c28038) at txn.c:799:7 frame #4: 0x000000010ea0f570 picodata`txn_on_journal_write(entry=0x0000000113c284e8) at txn.c:856:3 frame #5: 0x000000010ea48392 picodata`tx_complete_batch [inlined] journal_async_complete(entry=<unavailable>) at journal.h:257:2 frame #6: 0x000000010ea48383 picodata`tx_complete_batch [inlined] tx_schedule_queue(queue=0x00000001146001c0) at wal.c:283:3 frame #7: 0x000000010ea4836d picodata`tx_complete_batch(msg=0x00000001146001a0) at wal.c:379:2 frame #8: 0x000000010e827518 picodata`cbus_process [inlined] cmsg_deliver(msg=0x00000001146001a0) at cbus.c:553:2 frame #9: 0x000000010e82750b picodata`cbus_process(endpoint=<unavailable>) at cbus.c:788:3 frame #10: 0x000000010e882ddb picodata`ev_invoke_pending(loop=0x0000000111c86258) at ev.c:3797:11 frame #11: 0x000000010e8836d8 picodata`ev_run(loop=0x0000000111c86258, flags=0) at ev.c:4221:7 frame #12: 0x000000010e8805d0 picodata`::tarantool_main(argc=<unavailable>, argv=<unavailable>, cb=(picodata`picodata::main_run::trampoline::hc3c3036662e73318 at main.rs:52), cb_data=0x00007fad1c105ee0)(void *), void *) at main.cc:855:4 frame #13: 0x000000010dfec9cb picodata`picodata::main_run::hbb245d4b20992f1c(args=Run @ 0x00007ff7b1f52058) at main.rs:136:26 frame #14: 0x000000010dfebddc picodata`picodata::main::hb855b5b5f3fabad9 at main.rs:30:38 ...
The reason is that the tx_complete_batch() function completes execution of a batch of WAL write requests; with temporary spaces we don't write to WAL, so this function is never executed.
- Maintainer
Fully temporary spaces are also supported in upstream Tarantool 3.0. If this is reproducible there, we should probably report the bug upstream.
- Maintainer
Regarding your repro, I observe that box.slab.stats() doesn't decrease the item_count after dropping data-temporary or normal spaces either
- Author Owner
There is a special memtx gc fiber that frees the memory asynchronously (memtx_engine_gc_f()).
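One way to observe that asynchronous collection from the Lua console is to poll box.slab.stats() for a while after the drop. A small sketch; the mempool size, the expected count, and the timings in the usage comment are illustrative values taken from the dumps above:

local fiber = require('fiber')

-- Sketch: wait until the item_count of the given mempool drops back
-- to the expected value, showing that memtx gc frees tuples asynchronously.
local function wait_for_gc(item_size, expected, timeout)
    local deadline = fiber.clock() + (timeout or 5)
    while fiber.clock() < deadline do
        for _, pool in pairs(box.slab.stats()) do
            if pool.item_size == item_size and pool.item_count <= expected then
                return true
            end
        end
        fiber.sleep(0.1)
    end
    return false
end

-- e.g. after dropping a data-temporary space this eventually returns true,
-- while for a fully temporary space it times out (the leak):
-- wait_for_gc(160, 204, 5)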
- Georgy Moshkin mentioned in merge request picodata/tarantool!126
- Yaroslav Dynnikov changed milestone to %23.12.0
- Georgy Moshkin closed with merge request tarantool!126 (merged)
- Georgy Moshkin mentioned in merge request !721 (closed)