Fix severe slowdown on certain strings in LuaJIT
Apply a patch from Yura Sokolov.

The default "fast" string hash function samples only a few positions in a string; the remaining bytes do not affect the result. The function performs well for short strings, but long strings can yield extremely high collision rates.

An adaptive scheme is implemented in which two hash functions are used simultaneously. A bucket is first picked based on the output of the fast hash function. If an item is to be inserted into a collision chain longer than a certain threshold, another bucket is picked based on the stronger hash function. Because two hash functions are in use, an insert must consider two buckets. The second bucket is usually NOT considered thanks to a Bloom filter, which is rebuilt during a GC cycle.
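The scheme described above can be illustrated with a minimal Python sketch. It is not the patch itself and does not mirror LuaJIT's C code: the names `StringTable`, `fast_hash`, `strong_hash`, and the constants `CHAIN_LIMIT`, `NBUCKETS` and `BLOOM_BITS` are all hypothetical, the sparse hash is a stand-in that samples three bytes, FNV-1a stands in for the stronger hash, and the filter is a one-bit-per-item simplification of a Bloom filter.

```python
CHAIN_LIMIT = 4    # hypothetical chain-length threshold
NBUCKETS = 64      # hypothetical fixed bucket count
BLOOM_BITS = 64    # hypothetical filter width


def fast_hash(s: bytes) -> int:
    """Sparse hash: mixes only the length and three sampled bytes."""
    n = len(s)
    if n == 0:
        return 0
    h = n
    for i in (0, n // 2, n - 1):
        h = (h * 31 + s[i]) & 0xFFFFFFFF
    return h


def strong_hash(s: bytes) -> int:
    """FNV-1a: every byte affects the result."""
    h = 0x811C9DC5
    for b in s:
        h = ((h ^ b) * 0x01000193) & 0xFFFFFFFF
    return h


class StringTable:
    def __init__(self):
        # Each chain holds (string, placed_by_strong_hash) pairs.
        self.buckets = [[] for _ in range(NBUCKETS)]
        self.bloom = 0

    @staticmethod
    def _bloom_bit(strong: int) -> int:
        return 1 << ((strong >> 16) % BLOOM_BITS)

    def find(self, s: bytes):
        # Always check the bucket chosen by the fast hash.
        for item, _ in self.buckets[fast_hash(s) % NBUCKETS]:
            if item == s:
                return item
        # Check the second bucket only if the filter says "maybe".
        strong = strong_hash(s)
        if self.bloom & self._bloom_bit(strong):
            for item, _ in self.buckets[strong % NBUCKETS]:
                if item == s:
                    return item
        return None

    def intern(self, s: bytes) -> bytes:
        found = self.find(s)
        if found is not None:
            return found
        primary = self.buckets[fast_hash(s) % NBUCKETS]
        if len(primary) < CHAIN_LIMIT:
            primary.append((s, False))
        else:
            # Chain too long: place the item by the stronger hash instead.
            strong = strong_hash(s)
            self.buckets[strong % NBUCKETS].append((s, True))
            self.bloom |= self._bloom_bit(strong)
        return s

    def rebuild_filter(self) -> None:
        # Stand-in for the rebuild done during a GC cycle.
        self.bloom = 0
        for chain in self.buckets:
            for item, via_strong in chain:
                if via_strong:
                    self.bloom |= self._bloom_bit(strong_hash(item))


def colliding_strings(count: int) -> list:
    """Strings that share length, first, middle and last byte."""
    return [b"A%04dM%04dZ" % (i, i) for i in range(count)]
```

With strings produced by `colliding_strings`, every key has the same `fast_hash` value, so only the first `CHAIN_LIMIT` of them land in the primary chain and the rest are spread out by `strong_hash`. For strings that never overflow a chain, the filter stays empty and `find` touches a single bucket.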