chore(tests): more extensive benchmarks
While profiling the clients, I decided to move the benchmarks into a new crate with a custom benchmarking harness. This makes profiling easier and gives more raw runtime to sample. I also needed to profile the Lua net.box library.
Benchmarking Results
As it turns out, when compiled with --release the performance of all clients is practically identical: the differences between the means are well within the reported spread. This was overlooked in the previous benchmark.
lua_netbox: 16641.59709ns +- 7978.959165410804ns
netbox: 15863.29184ns +- 3094.862121340621ns
network_client: 16923.68175ns +- 2970.5274764941873ns
This is probably due to async state machines being heavily optimized in release mode.
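The "practically identical" reading can be checked directly against the figures above. The following is a minimal sketch, assuming the `+-` values are standard deviations (the description does not say which spread measure is used); the `within_spread` helper and its criterion are illustrative, not part of this MR.

```rust
/// (name, mean ns, spread ns) copied from the results above, rounded to 2 decimals.
const RESULTS: [(&str, f64, f64); 3] = [
    ("lua_netbox", 16641.60, 7978.96),
    ("netbox", 15863.29, 3094.86),
    ("network_client", 16923.68, 2970.53),
];

/// Hypothetical criterion: the means differ by less than the smaller of the two spreads.
fn within_spread(a: (&str, f64, f64), b: (&str, f64, f64)) -> bool {
    (a.1 - b.1).abs() < a.2.min(b.2)
}

fn main() {
    // Compare every pair of clients once.
    for (i, a) in RESULTS.iter().enumerate() {
        for b in &RESULTS[i + 1..] {
            println!(
                "{} vs {}: diff {:.2}ns, within spread: {}",
                a.0,
                b.0,
                (a.1 - b.1).abs(),
                within_spread(*a, *b)
            );
        }
    }
}
```

Under this criterion all three pairs fall within each other's spread. This is a rough sanity check, not a statistical test.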
Suggestions
The performance testing approach in this MR seems better to me than in !288 (merged) due to:
- Being easier to profile - smaller call stack
- Having a custom async harness - produces more accurate results for async functions
- Testing the Lua net.box client directly in Lua - gives accurate results
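To illustrate what a custom async harness of this kind might look like, here is a minimal std-only sketch. It is not the harness from this MR: `block_on`, `bench_async`, and `mean_and_std` are hypothetical names, the thread-parking executor is an assumption (the real harness would presumably drive futures on the runtime the clients use), and the workload is a stand-in.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::Instant;

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-threaded executor: polls the future until it completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // Sleep until the waker unparks this thread, then poll again.
            Poll::Pending => thread::park(),
        }
    }
}

/// Mean and sample standard deviation; needs at least two samples.
fn mean_and_std(samples: &[f64]) -> (f64, f64) {
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    let var = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var.sqrt())
}

/// Times `iters` sequential runs of the future produced by `make_fut`, in ns.
fn bench_async<F, Fut>(iters: usize, mut make_fut: F) -> (f64, f64)
where
    F: FnMut() -> Fut,
    Fut: Future,
{
    let samples: Vec<f64> = (0..iters)
        .map(|_| {
            let start = Instant::now();
            block_on(make_fut());
            start.elapsed().as_nanos() as f64
        })
        .collect();
    mean_and_std(&samples)
}

fn main() {
    // Stand-in workload; a real benchmark would await a client request here.
    let (mean, std) = bench_async(1_000, || async { std::hint::black_box(2u64 + 2) });
    println!("dummy: {mean}ns +- {std}ns");
}
```

Timing each future from creation to completion inside the harness keeps executor setup out of the measurement, and the short call stack of `block_on` is what makes such a setup easy to profile.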
Therefore I suggest merging these benchmarks and removing the ones from !288 (merged).
Conclusion
There probably is not much room for optimization in our async network client.