From d6cf327faf89b362005d709174efc9fb77df9feb Mon Sep 17 00:00:00 2001
From: Alexander Turenko <alexander.turenko@tarantool.org>
Date: Wed, 29 Jan 2020 11:18:33 +0300
Subject: [PATCH] test: stabilize flaky fiber memory leak detection

After the #4736 regression fix (in fact it just reverts the new logic
in small), it is possible again that a fiber's region holds memory for
a while, but releases it eventually. When the used memory exceeds the
128 KiB threshold, fiber_gc() puts 'garbage' slabs back to slab_cache
and subtracts them from the region_used() metric. But until that point
those slabs are accounted in region_used() and so in the fiber.info()
metrics.

This commit fixes flakiness of test cases of the following kind:

 | fiber.info()[fiber.self().id()].memory.used -- should be zero
 | <...workload...>
 | fiber.info()[fiber.self().id()].memory.used -- should be zero

The problem is that the first `<...>.memory.used` value may be
non-zero. It depends on the previous tests that were executed on this
tarantool instance.

The obvious way to solve it would be to print the difference between
the `<...>.memory.used` values before and after a workload instead of
the absolute values. This, however, does not work, because the first
slab in a region can be almost fully used at the point where a test
case starts, so the next slab will be acquired from the slab_cache.
This means that the previous slab becomes 'garbage' and is not
collected until the 128 KiB threshold is exceeded: the latter
`<...>.memory.used` check will return a bigger value than the former
one. However, if the threshold is reached during the workload, the
latter check may show a smaller value than the former one. In short,
the test case would be unstable after this change.

It is resolved by restarting the tarantool instance before such test
cases to ensure that there are no 'garbage' slabs in the current
fiber's region.

Note: This works only if a test case reserves only one slab at a time:
otherwise some memory may be held after the case (and so the memory
check after the workload will fail). However, it seems that our cases
are small enough not to trigger this situation.

Calling region_free() would be enough, but we have no Lua API for it.

Fixes #4750.
---
 test/engine/snapshot.result            | 3 +++
 test/engine/snapshot.test.lua          | 4 ++++
 test/sql/gh-3199-no-mem-leaks.result   | 9 ++++++---
 test/sql/gh-3199-no-mem-leaks.test.lua | 5 ++++-
 4 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/test/engine/snapshot.result b/test/engine/snapshot.result
index 9b4824ba39..f4ac7287aa 100644
--- a/test/engine/snapshot.result
+++ b/test/engine/snapshot.result
@@ -238,6 +238,9 @@ box.space.test.index.secondary:select{}
 box.space.test:drop()
 ---
 ...
+-- Hard way to flush garbage slabs in the fiber's region. See
+-- gh-4750.
+test_run:cmd('restart server default')
 -- Check that box.snapshot() doesn't leave garbage one the region.
 -- https://github.com/tarantool/tarantool/issues/3732
 fiber = require('fiber')
diff --git a/test/engine/snapshot.test.lua b/test/engine/snapshot.test.lua
index 13eecccc2c..3f5a89e0bb 100644
--- a/test/engine/snapshot.test.lua
+++ b/test/engine/snapshot.test.lua
@@ -42,6 +42,10 @@ box.space.test.index.primary:select{}
 box.space.test.index.secondary:select{}
 box.space.test:drop()
 
+-- Hard way to flush garbage slabs in the fiber's region. See
+-- gh-4750.
+test_run:cmd('restart server default')
+
 -- Check that box.snapshot() doesn't leave garbage one the region.
 -- https://github.com/tarantool/tarantool/issues/3732
 fiber = require('fiber')
diff --git a/test/sql/gh-3199-no-mem-leaks.result b/test/sql/gh-3199-no-mem-leaks.result
index 35d8572b48..d8590779a2 100644
--- a/test/sql/gh-3199-no-mem-leaks.result
+++ b/test/sql/gh-3199-no-mem-leaks.result
@@ -7,13 +7,16 @@ engine = test_run:get_cfg('engine')
 _ = box.space._session_settings:update('sql_default_engine', {{'=', 2, engine}})
 ---
 ...
-fiber = require('fiber')
----
-...
 -- This test checks that no leaks of region memory happens during
 -- executing SQL queries.
 --
 -- box.cfg()
+-- Hard way to flush garbage slabs in the fiber's region. See
+-- gh-4750.
+test_run:cmd('restart server default')
+fiber = require('fiber')
+---
+...
 box.execute('CREATE TABLE test (id INT PRIMARY KEY, x INTEGER, y INTEGER)')
 ---
 - row_count: 1
diff --git a/test/sql/gh-3199-no-mem-leaks.test.lua b/test/sql/gh-3199-no-mem-leaks.test.lua
index 41648d0fc1..d517c3373c 100644
--- a/test/sql/gh-3199-no-mem-leaks.test.lua
+++ b/test/sql/gh-3199-no-mem-leaks.test.lua
@@ -1,7 +1,6 @@
 test_run = require('test_run').new()
 engine = test_run:get_cfg('engine')
 _ = box.space._session_settings:update('sql_default_engine', {{'=', 2, engine}})
-fiber = require('fiber')
 
 -- This test checks that no leaks of region memory happens during
 -- executing SQL queries.
@@ -9,6 +8,10 @@ fiber = require('fiber')
 
 -- box.cfg()
 
+-- Hard way to flush garbage slabs in the fiber's region. See
+-- gh-4750.
+test_run:cmd('restart server default')
+fiber = require('fiber')
 box.execute('CREATE TABLE test (id INT PRIMARY KEY, x INTEGER, y INTEGER)')
 box.execute('INSERT INTO test VALUES (1, 1, 1), (2, 2, 2)')
 
-- 
GitLab