From d6cf327faf89b362005d709174efc9fb77df9feb Mon Sep 17 00:00:00 2001
From: Alexander Turenko <alexander.turenko@tarantool.org>
Date: Wed, 29 Jan 2020 11:18:33 +0300
Subject: [PATCH] test: stabilize flaky fiber memory leak detection

After the #4736 regression fix (which in fact just reverts the new logic
in small), a fiber's region may again hold memory for a while and
release it only eventually. When the used memory exceeds the 128 KiB
threshold, fiber_gc() puts 'garbage' slabs back to slab_cache and
subtracts them from the region_used() metric. Until that point those
slabs are accounted in region_used() and therefore in fiber.info()
metrics.

This commit fixes flakiness of test cases of the following kind:

 | fiber.info()[fiber.self().id()].memory.used -- should be zero
 | <...workload...>
 | fiber.info()[fiber.self().id()].memory.used -- should be zero

The problem is that the first `<...>.memory.used` value may be non-zero.
It depends on the previous tests that were executed on this tarantool
instance.

The obvious way to solve it would be to print the difference between
the `<...>.memory.used` values before and after a workload instead of
the absolute values. This, however, does not work: the first slab in a
region can be almost full at the point where a test case starts, so the
next slab will be acquired from the slab_cache. The previous slab then
becomes 'garbage' and is not collected until the 128 KiB threshold is
exceeded, so the latter `<...>.memory.used` check returns a bigger value
than the former one. However, if the threshold is reached during the
workload, the latter check may show a smaller value than the former
one. In short, the test case would remain unstable after such a change.
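
The two failure modes above can be reproduced with a toy model. This is
a Python sketch, not Tarantool code: ToyRegion, delta(), the 64 KiB slab
size and the rule that a gc below the threshold rewinds only the current
slab are all assumptions made for illustration. Only the 128 KiB
threshold comes from the text above.

```python
KIB = 1024
GC_THRESHOLD = 128 * KIB  # threshold named in the commit message
SLAB_SIZE = 64 * KIB      # hypothetical slab size, chosen for illustration


class ToyRegion:
    """Simplified stand-in for a fiber's region; not Tarantool code."""

    def __init__(self, slabs=()):
        # Used bytes per slab; the last entry is the slab in current use.
        self.slabs = list(slabs)

    def used(self):
        return sum(self.slabs)

    def alloc(self, size):
        # Take a fresh slab when the current one cannot fit the request.
        if not self.slabs or self.slabs[-1] + size > SLAB_SIZE:
            self.slabs.append(0)
        self.slabs[-1] += size

    def gc(self):
        if self.used() > GC_THRESHOLD:
            # Assumed: above the threshold all slabs are released.
            self.slabs.clear()
        elif self.slabs:
            # Assumed: below it only the current slab is rewound.
            self.slabs[-1] = 0


def delta(region, workload):
    """Return used-after minus used-before for a workload followed by gc."""
    before = region.used()
    for size in workload:
        region.alloc(size)
    region.gc()
    return region.used() - before


workload = [2 * KIB, 10 * KIB]
# Almost full first slab: the workload spills into a new slab.
print(delta(ToyRegion([60 * KIB]), workload))
# Lots of garbage already held: the workload crosses the threshold.
print(delta(ToyRegion([50 * KIB, 50 * KIB, 20 * KIB]), workload))
# Fresh region, as after a restart.
print(delta(ToyRegion(), workload))
```

In this model the same workload yields a positive, a negative and a zero
difference depending only on the state left by earlier allocations.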

This is resolved by restarting the tarantool instance before such test
cases to ensure that there are no 'garbage' slabs in the current fiber's
region.

Note: This works only if a test case reserves no more than one slab at
a time: otherwise some memory may be held after the case (and so the
memory check after the workload will fail). However, it seems that our
cases are small enough not to trigger this situation.

A call to region_free() would be enough, but we have no Lua API for it.

Fixes #4750.
---
 test/engine/snapshot.result            | 3 +++
 test/engine/snapshot.test.lua          | 4 ++++
 test/sql/gh-3199-no-mem-leaks.result   | 9 ++++++---
 test/sql/gh-3199-no-mem-leaks.test.lua | 5 ++++-
 4 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/test/engine/snapshot.result b/test/engine/snapshot.result
index 9b4824ba39..f4ac7287aa 100644
--- a/test/engine/snapshot.result
+++ b/test/engine/snapshot.result
@@ -238,6 +238,9 @@ box.space.test.index.secondary:select{}
 box.space.test:drop()
 ---
 ...
+-- Hard way to flush garbage slabs in the fiber's region. See
+-- gh-4750.
+test_run:cmd('restart server default')
 -- Check that box.snapshot() doesn't leave garbage one the region.
 -- https://github.com/tarantool/tarantool/issues/3732
 fiber = require('fiber')
diff --git a/test/engine/snapshot.test.lua b/test/engine/snapshot.test.lua
index 13eecccc2c..3f5a89e0bb 100644
--- a/test/engine/snapshot.test.lua
+++ b/test/engine/snapshot.test.lua
@@ -42,6 +42,10 @@ box.space.test.index.primary:select{}
 box.space.test.index.secondary:select{}
 box.space.test:drop()
 
+-- Hard way to flush garbage slabs in the fiber's region. See
+-- gh-4750.
+test_run:cmd('restart server default')
+
 -- Check that box.snapshot() doesn't leave garbage one the region.
 -- https://github.com/tarantool/tarantool/issues/3732
 fiber = require('fiber')
diff --git a/test/sql/gh-3199-no-mem-leaks.result b/test/sql/gh-3199-no-mem-leaks.result
index 35d8572b48..d8590779a2 100644
--- a/test/sql/gh-3199-no-mem-leaks.result
+++ b/test/sql/gh-3199-no-mem-leaks.result
@@ -7,13 +7,16 @@ engine = test_run:get_cfg('engine')
 _ = box.space._session_settings:update('sql_default_engine', {{'=', 2, engine}})
 ---
 ...
-fiber = require('fiber')
----
-...
 -- This test checks that no leaks of region memory happens during
 -- executing SQL queries.
 --
 -- box.cfg()
+-- Hard way to flush garbage slabs in the fiber's region. See
+-- gh-4750.
+test_run:cmd('restart server default')
+fiber = require('fiber')
+---
+...
 box.execute('CREATE TABLE test (id INT PRIMARY KEY, x INTEGER, y INTEGER)')
 ---
 - row_count: 1
diff --git a/test/sql/gh-3199-no-mem-leaks.test.lua b/test/sql/gh-3199-no-mem-leaks.test.lua
index 41648d0fc1..d517c3373c 100644
--- a/test/sql/gh-3199-no-mem-leaks.test.lua
+++ b/test/sql/gh-3199-no-mem-leaks.test.lua
@@ -1,7 +1,6 @@
 test_run = require('test_run').new()
 engine = test_run:get_cfg('engine')
 _ = box.space._session_settings:update('sql_default_engine', {{'=', 2, engine}})
-fiber = require('fiber')
 
 -- This test checks that no leaks of region memory happens during
 -- executing SQL queries.
@@ -9,6 +8,10 @@ fiber = require('fiber')
 
 -- box.cfg()
 
+-- Hard way to flush garbage slabs in the fiber's region. See
+-- gh-4750.
+test_run:cmd('restart server default')
+fiber = require('fiber')
 
 box.execute('CREATE TABLE test (id INT PRIMARY KEY, x INTEGER, y INTEGER)')
 box.execute('INSERT INTO test VALUES (1, 1, 1), (2, 2, 2)')
-- 
GitLab