The elements in a large universe
have different membership likelihoods and query frequencies in many applications. Thus, the number of hash functions assigned to each element of
in the traditional Bloom filter can be further optimized to minimize the average false positive rate. We propose an improved weighted Bloom filter (IWBF) that assigns an optimal number of hash functions to each element and has a less average false positive rate compared to the weighted Bloom filter. We show a tight space lower bound for any approximated membership querying algorithm that represents a small subset
and answers membership queries with predefined false positive rates, when the query frequencies and membership likelihoods of the elements in
are known. We also provide an approximate space lower bound for approximated membership querying algorithms that have an average false positive rate, and show that the number of bits used in IWBF is within a factor of
of the approximate space lower bound.