Twitter | Pretraživanje | |
Daniel Lemire
Xor Filters: Faster and Smaller Than Bloom Filters
In software, you frequently need to check whether some objects is in a set. For example, you might have a list of forbidden Web addresses. As someone enters a new Web address, you may want to check...
Reply Retweet Označi sa "sviđa mi se" More
Jonah H. Harris 20. pro
Odgovor korisniku/ci @lemire
This article felt like a bait-and-switch. It sounded interesting until the very end, when finding out it’s immutable-only and, as such, not an online-capable algorithm. That’s a pretty big caveat and a huge difference when basing the article on bloom and cuckoo comparisons. :(
Reply Retweet Označi sa "sviđa mi se"
Daniel Lemire 20. pro
Odgovor korisniku/ci @jonahharris
Neither Bloom nor cuckoo filters are online algorithms. It is all fixed capacities.
Reply Retweet Označi sa "sviđa mi se"
Todd Lipcon 20. pro
Odgovor korisniku/ci @lemire
Any comparison to vacuum filter (vldb19?)
Reply Retweet Označi sa "sviđa mi se"
Daniel Lemire 20. pro
Odgovor korisniku/ci @tlipcon
Xor filters are going to be faster, smaller; the implementation far simpler (~200 lines).
Reply Retweet Označi sa "sviđa mi se"
Loris Cro 20. pro
Odgovor korisniku/ci @lemire
CF nerd here, thanks for the paper I'll make sure to read it carefully. In the meantime, two nitpicks: 1. Cuckoo's aren't that hard, I have a <250 LOC implementation that doesn't cut corners.
Reply Retweet Označi sa "sviđa mi se"
Daniel Lemire 20. pro
Odgovor korisniku/ci @croloris
Your implementation is great and quite simple. I don't think I ever wrote that this wasn't possible. I will write however that your implementation of an xor filter would be simpler. At least the query part...
Reply Retweet Označi sa "sviđa mi se"
Andreas Kipf 20. pro
Odgovor korisniku/ci @lemire
Nice implementation! The space savings are impressive. Have you considered comparing against cache-sectorized and register-blocked Bloom filters?
Reply Retweet Označi sa "sviđa mi se"
Andreas Kipf 20. pro
Odgovor korisniku/ci @lemire
This repo contains reproducibility instructions:
Reply Retweet Označi sa "sviđa mi se"
Alexander Nasonov 21. pro
Odgovor korisniku/ci @lemire
When I experimented with random acyclic graphs, murmur was quite bad on some inputs, look at how many tests are commented out:
Reply Retweet Označi sa "sviđa mi se"
Daniel Lemire 21. pro
Odgovor korisniku/ci @nasonov
Murmur is quite good. Are you sure that your implementation is correct?
Reply Retweet Označi sa "sviđa mi se"