xxHash  0.8.0
Extremely fast non-cryptographic hash function
Tuning parameters

Macros

#define XXH_NO_LONG_LONG
 Define this to disable 64-bit code.
 
#define XXH_FORCE_MEMORY_ACCESS   0
 Controls how unaligned memory is accessed.
 
#define XXH_FORCE_ALIGN_CHECK   0
 If defined to non-zero, adds a special path for aligned inputs (XXH32() and XXH64() only).
 
#define XXH_NO_INLINE_HINTS   0
 When non-zero, sets all functions to static.
 
#define XXH32_ENDJMP   0
 Whether to use a jump for XXH32_finalize.
 
#define XXH_OLD_NAMES
 Redefines old internal names.
 
#define XXH_DEBUGLEVEL   0
 Sets the debugging level.
 
#define XXH_CPU_LITTLE_ENDIAN   XXH_isLittleEndian()
 Whether the target is little endian.
 
#define XXH_VECTOR   XXH_SCALAR
 Overrides the vectorization implementation chosen for XXH3.
 
#define XXH_ACC_ALIGN   8
 Selects the minimum alignment for XXH3's accumulators.
 

Enumerations

enum  XXH_VECTOR_TYPE {
  XXH_SCALAR = 0 , XXH_SSE2 = 1 , XXH_AVX2 = 2 , XXH_AVX512 = 3 ,
  XXH_NEON = 4 , XXH_VSX = 5
}
 Possible values for XXH_VECTOR.
 

Detailed Description

Various macros to control xxHash's behavior.

Macro Definition Documentation

◆ XXH_NO_LONG_LONG

#define XXH_NO_LONG_LONG

Define this to disable 64-bit code.

Useful if you only use the XXH32 family and your compiler is strict C90, which has no long long type.

◆ XXH_FORCE_MEMORY_ACCESS

#define XXH_FORCE_MEMORY_ACCESS   0

Controls how unaligned memory is accessed.

By default, unaligned memory is accessed through memcpy(), which is safe and portable.

Unfortunately, on some target/compiler combinations, the generated assembly is sub-optimal.

The switch below allows selection of a different access method in the search for improved performance.

Possible options:
  • XXH_FORCE_MEMORY_ACCESS=0 (default): memcpy
    Use memcpy(). Safe and portable. Note that most modern compilers will eliminate the function call and treat it as an unaligned access.
  • XXH_FORCE_MEMORY_ACCESS=1: __attribute__((packed))
    Depends on compiler extensions and is therefore not portable. This method is safe if your compiler supports it, and generally as fast or faster than memcpy.
  • XXH_FORCE_MEMORY_ACCESS=2: Direct cast
    Casts directly and dereferences. This method doesn't depend on the compiler, but it violates the C standard as it directly dereferences an unaligned pointer. It can generate buggy code on targets which do not support unaligned memory accesses, but in some circumstances, it's the only known way to get the most performance.
  • XXH_FORCE_MEMORY_ACCESS=3: Byteshift
    Also portable. This can generate the best code on old compilers which don't inline small memcpy() calls, and it might also be faster on big-endian systems which lack a native byteswap instruction. However, some compilers will emit literal byteshifts even if the target supports unaligned access.
    Warning
    Methods 1 and 2 rely on implementation-defined behavior. Use these with care, as what works on one compiler/platform/optimization level may cause another to read garbage data or even crash.
    See http://fastcompression.blogspot.com/2015/08/accessing-unaligned-memory.html for details.

Prefer these methods in the following priority order: 0 > 3 > 1 > 2.
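
As a rough illustration of the two portable options, here is a minimal sketch of a memcpy-based read (method 0) and a byteshift read (method 3). The function names read32_memcpy and read32le_byteshift are hypothetical, not xxHash's internal names, and the byteshift variant assumes the value is stored in little-endian byte order.

```c
#include <stdint.h>
#include <string.h>

/* Method 0 (sketch): copy the bytes into a local variable.
 * Works at any alignment; the result is in the host's byte order. */
static uint32_t read32_memcpy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Method 3 (sketch): assemble the value one byte at a time.
 * Works at any alignment and always decodes little-endian data,
 * regardless of the host's byte order. */
static uint32_t read32le_byteshift(const void *p)
{
    const uint8_t *b = (const uint8_t *)p;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

Both functions accept a pointer at any alignment. Only the byteshift version returns the same value on every host; the memcpy version returns the bytes in native order.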

◆ XXH_FORCE_ALIGN_CHECK

#define XXH_FORCE_ALIGN_CHECK   0

If defined to non-zero, adds a special path for aligned inputs (XXH32() and XXH64() only).

This is an important performance trick for architectures without decent unaligned memory access performance.

It checks for input alignment, and when conditions are met, uses a "fast path" employing direct 32-bit/64-bit reads, resulting in dramatically faster read speed.

The check costs one initial branch per hash, which is generally negligible, but not zero.

Moreover, it's not useful to generate an additional code path if memory access uses the same instruction for both aligned and unaligned addresses (e.g. x86 and aarch64).

In these cases, the alignment check can be removed by setting this macro to 0, so that the code always uses unaligned memory access. The alignment check is automatically disabled on x86, x64, and arm64, which are platforms known to offer good unaligned memory access performance.

This option does not affect XXH3 (only XXH32 and XXH64).
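
The idea of an aligned fast path can be sketched as follows. This is not xxHash's code: is_aligned4 and sum_words32 are hypothetical names, and a plain sum stands in for the hashing loop so that the two read strategies stay visible.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero when p sits on a 4-byte boundary. */
static int is_aligned4(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}

/* Stand-in for a hash loop: sums nwords 32-bit words starting at input. */
static uint32_t sum_words32(const void *input, size_t nwords)
{
    uint32_t sum = 0;
    size_t i;

    if (is_aligned4(input)) {
        /* Fast path: one branch up front, then direct 32-bit reads.
         * Assumes the bytes may legally be read as uint32_t. */
        const uint32_t *words = (const uint32_t *)input;
        for (i = 0; i < nwords; i++)
            sum += words[i];
    } else {
        /* General path: memcpy-based reads that tolerate any alignment. */
        const unsigned char *bytes = (const unsigned char *)input;
        for (i = 0; i < nwords; i++) {
            uint32_t w;
            memcpy(&w, bytes + 4 * i, sizeof w);
            sum += w;
        }
    }
    return sum;
}
```

Both paths produce the same result; the alignment test only decides which read strategy is used.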

◆ XXH_NO_INLINE_HINTS

#define XXH_NO_INLINE_HINTS   0

When non-zero, sets all functions to static.

By default, xxHash tries to force the compiler to inline almost all internal functions.

This usually improves performance due to reduced jumping and improved constant folding, but it significantly increases the size of the binary, which may be undesirable.

Additionally, sometimes the forced inlining can be detrimental to performance, depending on the architecture.

XXH_NO_INLINE_HINTS marks all internal functions as static, giving the compiler full control over whether to inline or not.

When not optimizing (-O0), optimizing for size (-Os, -Oz), or using -fno-inline with GCC or Clang, this will automatically be defined.
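
A switch of this kind might be wired up roughly as below. The names MY_NO_INLINE_HINTS, MY_FORCE_INLINE, and rotl32 are hypothetical. The sketch assumes the GCC/Clang predefined macros __NO_INLINE__ and __OPTIMIZE_SIZE__ are what signal "not optimizing" and "optimizing for size".

```c
#include <stdint.h>

/* Drop the hints automatically when the compiler reports that it will not
 * inline (__NO_INLINE__) or that it optimizes for size (__OPTIMIZE_SIZE__). */
#ifndef MY_NO_INLINE_HINTS
#  if defined(__NO_INLINE__) || defined(__OPTIMIZE_SIZE__)
#    define MY_NO_INLINE_HINTS 1
#  else
#    define MY_NO_INLINE_HINTS 0
#  endif
#endif

#if MY_NO_INLINE_HINTS
#  define MY_FORCE_INLINE static   /* plain static: the compiler decides */
#elif defined(__GNUC__)
#  define MY_FORCE_INLINE static __inline__ __attribute__((always_inline, unused))
#elif defined(_MSC_VER)
#  define MY_FORCE_INLINE static __forceinline
#else
#  define MY_FORCE_INLINE static   /* unknown compiler: no hint available */
#endif

/* Small internal helper of the kind that benefits from inlining. */
MY_FORCE_INLINE uint32_t rotl32(uint32_t x, unsigned r)
{
    /* Masking keeps both shift counts in 0..31, so r == 0 is well defined. */
    return (x << (r & 31u)) | (x >> ((32u - r) & 31u));
}
```

Building with -DMY_NO_INLINE_HINTS=1 leaves the helper as an ordinary static function without changing its behavior.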

◆ XXH32_ENDJMP

#define XXH32_ENDJMP   0

Whether to use a jump for XXH32_finalize.

By default, XXH32_finalize uses multiple branches in the finalizer. This is generally faster, but depending on the exact architecture, a jmp may be preferable.

This setting is only likely to make a difference for very small inputs.

◆ XXH_OLD_NAMES

#define XXH_OLD_NAMES

Redefines old internal names.

Provided for compatibility with code that used xxHash's internals before the names were changed to improve namespacing. There is no other reason to use this.

◆ XXH_DEBUGLEVEL

#define XXH_DEBUGLEVEL   0

Sets the debugging level.

XXH_DEBUGLEVEL is expected to be defined externally, typically via the compiler's command line options. The value must be a number.
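
The text above does not say what each level enables. A common pattern, shown here only as an assumption, is to turn on assertions at level 1 and above. MY_DEBUGLEVEL, MY_ASSERT, and stripe_count are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Expected from the command line, e.g. -DMY_DEBUGLEVEL=1; defaults to 0. */
#ifndef MY_DEBUGLEVEL
#  define MY_DEBUGLEVEL 0
#endif

#if MY_DEBUGLEVEL >= 1
#  define MY_ASSERT(c) assert(c)   /* checks enabled */
#else
#  define MY_ASSERT(c) ((void)0)   /* checks compiled out */
#endif

/* Hypothetical helper: number of whole stripes that fit in len bytes. */
static size_t stripe_count(size_t len, size_t stripe_len)
{
    MY_ASSERT(stripe_len != 0);
    return len / stripe_len;
}
```

Because the value is tested with #if, it must be a number, which matches the requirement stated above.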

◆ XXH_CPU_LITTLE_ENDIAN

#define XXH_CPU_LITTLE_ENDIAN   XXH_isLittleEndian()

Whether the target is little endian.

Defined to 1 if the target is little endian, or 0 if it is big endian. It can be defined externally, for example on the compiler command line.

If it is not defined, a runtime check (which is usually constant folded) is used instead.

Note
This is not necessarily defined to an integer constant.
See also
XXH_isLittleEndian() for the runtime check.
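
A runtime check of this kind can be sketched as follows. The names is_little_endian and MY_CPU_LITTLE_ENDIAN are hypothetical stand-ins, and the union-based probe is one common way to write such a check, not necessarily the library's exact code.

```c
/* Hypothetical runtime check: returns 1 on little-endian hosts, 0 otherwise.
 * Compilers can usually fold this to a constant. */
static int is_little_endian(void)
{
    const union {
        unsigned int  u;
        unsigned char c[sizeof(unsigned int)];
    } one = { 1 };
    /* The lowest-addressed byte holds the 1 only on little-endian hosts. */
    return one.c[0];
}

/* An external definition (e.g. -DMY_CPU_LITTLE_ENDIAN=1) takes precedence;
 * otherwise the macro expands to the runtime check. */
#ifndef MY_CPU_LITTLE_ENDIAN
#  define MY_CPU_LITTLE_ENDIAN is_little_endian()
#endif
```

Because the fallback expands to a function call rather than an integer constant, a macro defined this way can be used in an ordinary if statement but not in a preprocessor #if.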

◆ XXH_VECTOR

#define XXH_VECTOR   XXH_SCALAR

Overrides the vectorization implementation chosen for XXH3.

Can be defined to 0 to disable SIMD, or to any of the values listed in XXH_VECTOR_TYPE.

If this is not defined, it uses predefined macros to determine the best implementation.
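
Detection through predefined macros might look roughly like this. The MY_* names are hypothetical stand-ins for the XXH_VECTOR_TYPE values, and the feature-test conditions are a simplified assumption rather than the library's actual logic.

```c
/* Hypothetical stand-ins for the XXH_VECTOR_TYPE values. */
#define MY_SCALAR 0
#define MY_SSE2   1
#define MY_AVX2   2
#define MY_AVX512 3
#define MY_NEON   4
#define MY_VSX    5

/* An external definition (e.g. -DMY_VECTOR=0) skips detection entirely. */
#ifndef MY_VECTOR
#  if defined(__AVX512F__)
#    define MY_VECTOR MY_AVX512
#  elif defined(__AVX2__)
#    define MY_VECTOR MY_AVX2
#  elif defined(__SSE2__) || defined(_M_X64)
#    define MY_VECTOR MY_SSE2
#  elif defined(__ARM_NEON) || defined(__ARM_NEON__)
#    define MY_VECTOR MY_NEON
#  elif defined(__VSX__)
#    define MY_VECTOR MY_VSX
#  else
#    define MY_VECTOR MY_SCALAR
#  endif
#endif

/* Maps a selection to a readable name, e.g. for a startup log line. */
static const char *vector_name(int v)
{
    switch (v) {
    case MY_SCALAR: return "scalar";
    case MY_SSE2:   return "sse2";
    case MY_AVX2:   return "avx2";
    case MY_AVX512: return "avx512";
    case MY_NEON:   return "neon";
    case MY_VSX:    return "vsx";
    default:        return "unknown";
    }
}
```

Since the values are plain macros, the selection can also be tested at compile time with #if MY_VECTOR == MY_AVX2.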

◆ XXH_ACC_ALIGN

#define XXH_ACC_ALIGN   8

Selects the minimum alignment for XXH3's accumulators.

When using SIMD, this should match the alignment required for the chosen vector type: for example, 32 for AVX2.

Default: Auto detected.
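
To show what such an alignment requirement means in practice, here is a minimal C11 sketch. MY_ACC_ALIGN, MY_ACC_NB, my_acc_state, and acc_is_aligned are hypothetical names, and the choice of eight 64-bit accumulators is an assumption made for the example.

```c
#include <stdalign.h>
#include <stdint.h>

#define MY_ACC_ALIGN 32   /* hypothetical: 32 bytes suits AVX2's 256-bit vectors */
#define MY_ACC_NB    8    /* assumed accumulator count for this sketch */

/* The accumulator array is forced onto a MY_ACC_ALIGN-byte boundary,
 * which also raises the alignment of the enclosing struct. */
typedef struct {
    alignas(MY_ACC_ALIGN) uint64_t acc[MY_ACC_NB];
} my_acc_state;

/* Returns nonzero when the accumulators meet the requested alignment. */
static int acc_is_aligned(const my_acc_state *s)
{
    return ((uintptr_t)s->acc % MY_ACC_ALIGN) == 0;
}
```

An object of this type declared on the stack or at file scope satisfies the alignment automatically. Memory obtained from malloc() may not, so heap-allocated state would need an aligned allocator.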

Enumeration Type Documentation

◆ XXH_VECTOR_TYPE

Possible values for XXH_VECTOR.

Note that these are actually implemented as macros.

If XXH_VECTOR is not defined, it is detected automatically. XXH_X86DISPATCH overrides this.

Enumerator
XXH_SCALAR 

Portable scalar version

XXH_SSE2 

SSE2 for Pentium 4, Opteron, all x86_64.

Note
SSE2 is also guaranteed on Windows 10, macOS, and Android x86.
XXH_AVX2 

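AVX2 for Haswell and newer; on AMD, Excavator (the last Bulldozer-family core) and newer. The original Bulldozer core does not support AVX2.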

XXH_AVX512 

AVX512 for Skylake-X/SP (server and workstation parts) and Ice Lake

XXH_NEON 

NEON for most ARMv7-A and all AArch64

XXH_VSX 

VSX and ZVector for POWER8/z13 (64-bit)