lua: add fiber.top() listing fiber cpu consumption
Implement a new function in Lua fiber library: top(). It returns a table containing fiber cpu usage stats. The table has two entries: "cpu_misses" and "cpu". "cpu" itself is a table listing all the alive fibers and their cpu consumtion. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - cpu: 107/lua: instant: 30.967324490456 time: 0.351821993 average: 25.582738345233 104/lua: instant: 9.6473633128437 time: 0.110869897 average: 7.9693406131877 101/on_shutdown: instant: 0 time: 0 average: 0 103/lua: instant: 9.8026528631511 time: 0.112641118 average: 18.138387232255 106/lua: instant: 20.071174377224 time: 0.226901357 average: 17.077908441831 102/interactive: instant: 0 time: 9.6858e-05 average: 0 105/lua: instant: 9.2461986412164 time: 0.10657528 average: 7.7068458630827 1/sched: instant: 20.265286315108 time: 0.237095335 average: 23.141537169257 cpu_misses: 0 ... ``` The two entries in a table returned by `fiber.top()` are `cpu_misses` and `cpu`. `cpu` itself is a table whose keys are strings containing fiber ids and names. The three metrics available for each fiber are: 1) instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) average (per cent), which is calculated as an exponential moving average of `instant` values over all previous event loop iterations. 3) time (seconds), which estimates how much cpu time each fiber spent processing during its lifetime. More info on `cpu_misses` field returned by `fiber.top()`: `cpu_misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the latest measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results. Fiber.top() doesn't work on arm architecture at the moment. Please note, that enabling fiber.top() slows down fiber switching by about 15 per cent, so it is disabled by default. To enable it you need to issue `fiber.top_enable()`. You can disable it back after you finished debugging using `fiber.top_disable()`. "Time" entry is also added to each fibers output in fiber.info() (it duplicates "time" entry from fiber.top().cpu per fiber). Note, that "time" is only counted while fiber.top is enabled.
Showing
- src/lib/core/fiber.c 193 additions, 2 deletionssrc/lib/core/fiber.c
- src/lib/core/fiber.h 70 additions, 0 deletionssrc/lib/core/fiber.h
- src/lua/fiber.c 72 additions, 0 deletionssrc/lua/fiber.c
- test/app/fiber.result 83 additions, 0 deletionstest/app/fiber.result
- test/app/fiber.test.lua 40 additions, 0 deletionstest/app/fiber.test.lua
Loading
Please register or sign in to comment