Research: allow_filtering analogue
At the moment our users can execute any supported SQL query, and this is a problem for resource utilization:
- Memory. A query such as `select * from t` can dump millions of rows into memory (as a negative scenario) and we can face OOM, while `select * from t where pk = 1` returns a predictable number of rows (0 or 1).
- Time. Tarantool has a single transactional thread, and as long as the local Tarantool plan has no Yield node, the query monopolizes that thread and other queries have to wait until it is done. We don't want unpredictable latency spikes.
Solution. The ideal solution is to have a fair resource manager that tracks memory and time utilization for each query and switches between tasks. But we don't live in this ideal world at the moment, so we want a blunt, easy-to-implement solution to protect our users: something like the `ALLOW FILTERING` option in Cassandra (I don't like this name, but we can use our own). Any query that can block the transaction thread for an unpredictable time (or can return more than X rows?) would be cancelled unless this option is set.
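To make the idea concrete, here is a minimal sketch of such a guard. It is not sbroad code: `QueryInfo`, `SCHEMA_PK`, `is_bounded`, and `check_query` are hypothetical names, and the only rule it assumes is that a query is "bounded" when every primary key column is fixed by an equality. The real classification is exactly what this research has to work out.

```python
from dataclasses import dataclass


class UnboundedQueryError(Exception):
    """Raised when a query may touch an unpredictable number of rows."""


@dataclass(frozen=True)
class QueryInfo:
    # Hypothetical, heavily simplified summary of a parsed SELECT.
    table: str
    eq_columns: frozenset = frozenset()  # columns constrained by `col = const`
    allow_filtering: bool = False        # hypothetical opt-in flag


# Hypothetical schema metadata: table name -> primary key columns.
SCHEMA_PK = {"t": ("pk",)}


def is_bounded(query: QueryInfo, schema_pk: dict) -> bool:
    """Assumed rule: all PK columns fixed by equality means at most one row."""
    return all(col in query.eq_columns for col in schema_pk[query.table])


def check_query(query: QueryInfo, schema_pk: dict) -> None:
    """Reject a potentially unbounded query unless the user opted in."""
    if not is_bounded(query, schema_pk) and not query.allow_filtering:
        raise UnboundedQueryError(
            f"query on {query.table!r} may scan the whole table; "
            "set the opt-in flag to run it anyway"
        )


# `select * from t where pk = 1` passes the check.
check_query(QueryInfo("t", frozenset({"pk"})), SCHEMA_PK)

# `select * from t` is rejected unless the flag is set.
check_query(QueryInfo("t", allow_filtering=True), SCHEMA_PK)
```

In this sketch the check needs only the primary key columns from the schema. Whether that is enough, or whether secondary indexes and sharding keys must also be considered, is left open.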
The goal of this research is to:
- look for similar solutions (other than Cassandra)
- classify the supported queries without `allow filtering`
- find out what information from the schema we need for sbroad (PK?)
- design the resource limits (rows? bytes? total or max per shard?)
As a result of the research, we should produce a list of tasks for protecting our users from memory and time spikes.