Skip to content
Snippets Groups Projects
user avatar
Vladimir Davydov authored
Tuple bloom filter is an array of bloom filters, each of which reflects
lookups by all possible partial keys. To optimize the overall bloom
filter size, we need to know how many unique elements there are for each
partial key. To achieve that, we require the caller to pass the number
of key parts that have been hashed for the given tuple. Here's how it
looks in Vinyl:

        uint32_t hashed_parts = writer->last_stmt == NULL ? 0 :
                tuple_common_key_parts(stmt, writer->last_stmt,
                                       writer->key_def);
        tuple_bloom_builder_add(writer->bloom, stmt,
                                writer->key_def, hashed_parts);

Actually, there's no need in such a requirement as instead we can
calculate the hash value for the given tuple, compare it with the hash
of the tuple added last time, and add the new hash only if the two
values differ. This should be accurate enough while allowing us to get
rid of the cumbersome tuple_common_key_parts helper. Note, such a check
will only work if tuples are added in the order defined by the key
definition, but that already holds - anyway, one wouldn't be able to
use tuple_common_key_parts either if it wasn't true.

While we are at it, refresh the obsolete comment to tuple_bloom_builder.
afd30b95
History
Name Last commit Last update