diff options
author | Chandler Carruth <chandlerc@gmail.com> | 2012-03-01 18:55:25 +0000 |
---|---|---|
committer | Chandler Carruth <chandlerc@gmail.com> | 2012-03-01 18:55:25 +0000 |
commit | 0b66c6fca22e85f732cf58f459a06c06833d1882 (patch) | |
tree | 6fd138d33fad2d4b3b5a071367708da25281b327 /lib/Support | |
parent | 4b1212b4bfac98c688d484bf22ae158875f06ad5 (diff) | |
download | external_llvm-0b66c6fca22e85f732cf58f459a06c06833d1882.zip external_llvm-0b66c6fca22e85f732cf58f459a06c06833d1882.tar.gz external_llvm-0b66c6fca22e85f732cf58f459a06c06833d1882.tar.bz2 |
Rewrite LLVM's generalized support library for hashing to follow the API
of the proposed standard hashing interfaces (N3333), and to use
a modified and tuned version of the CityHash algorithm.
Some of the highlights of this change:
-- Significantly higher quality hashing algorithm with very well
distributed results, and extremely few collisions. Should be close to
a checksum for up to 64-bit keys. Very little clustering or clumping of
hash codes, to better distribute load on probed hash tables.
-- Built-in support for reserved values.
-- Simplified API that composes cleanly with other C++ idioms and APIs.
-- Better scaling performance as keys grow. This is the fastest
algorithm I've found and measured for moderately sized keys (such as
show up in some of the uniquing and folding use cases)
-- Support for enabling per-execution seeds to prevent table ordering
or other artifacts of hashing algorithms to impact the output of
LLVM. The seeding would make each run different and highlight these
problems during bootstrap.
This implementation was tested extensively using the SMHasher test
suite, and pased with flying colors, doing better than the original
CityHash algorithm even.
I've included a unittest, although it is somewhat minimal at the moment.
I've also added (or refactored into the proper location) type traits
necessary to implement this, and converted users of GeneralHash over.
My only immediate concerns with this implementation is the performance
of hashing small keys. I've already started working to improve this, and
will continue to do so. Currently, the only algorithms faster produce
lower quality results, but it is likely there is a better compromise
than the current one.
Many thanks to Jeffrey Yasskin who did most of the work on the N3333
paper, pair-programmed some of this code, and reviewed much of it. Many
thanks also go to Geoff Pike Pike and Jyrki Alakuijala, the original
authors of CityHash on which this is heavily based, and Austin Appleby
who created MurmurHash and the SMHasher test suite.
Also thanks to Nadav, Tobias, Howard, Jay, Nick, Ahmed, and Duncan for
all of the review comments! If there are further comments or concerns,
please let me know and I'll jump on 'em.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151822 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'lib/Support')
-rw-r--r-- | lib/Support/CMakeLists.txt | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/lib/Support/CMakeLists.txt b/lib/Support/CMakeLists.txt index 6cec47d..0b69238 100644 --- a/lib/Support/CMakeLists.txt +++ b/lib/Support/CMakeLists.txt @@ -26,6 +26,7 @@ add_llvm_library(LLVMSupport FoldingSet.cpp FormattedStream.cpp GraphWriter.cpp + Hashing.cpp IntEqClasses.cpp IntervalMap.cpp IntrusiveRefCntPtr.cpp |