Mirror of https://git.tukaani.org/xz.git (synced 2024-04-04 12:36:23 +02:00)
liblzma: CLMUL CRC64: Work around a bug in MSVC, second attempt.
This affects only 32-bit x86 builds. x86-64 is OK as is.

I still cannot easily test this myself. The reporter has tested this and it passes the tests included in the CMake build and performance is good: raw CRC64 is 2-3 times faster than the C version of the slice-by-four method. (Note that liblzma doesn't include a MSVC-compatible version of the 32-bit x86 assembly code for the slice-by-four method.)

Thanks to Iouri Kharon for figuring out a fix, testing, and benchmarking.
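When the optimized code cannot be tested locally, one way to check a build is to compare it against a slow bit-at-a-time reference. The following is a minimal sketch, not code from liblzma: the names crc64_bitwise and crc64_reference_check are hypothetical, and it assumes the CRC64 variant commonly catalogued as CRC-64/XZ (reflected ECMA-182 polynomial, initial value and final XOR of all ones).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical reference implementation (not part of liblzma).
// Processes one bit at a time, so it is slow but easy to verify.
// Passing the previous return value as crc continues a running CRC.
static uint64_t
crc64_bitwise(const uint8_t *buf, size_t size, uint64_t crc)
{
	// Reflected form of the ECMA-182 polynomial.
	const uint64_t poly = UINT64_C(0xC96C5795D7870F42);

	crc = ~crc;

	for (size_t i = 0; i < size; ++i) {
		crc ^= buf[i];

		for (int bit = 0; bit < 8; ++bit)
			crc = (crc >> 1) ^ ((crc & 1) ? poly : 0);
	}

	return ~crc;
}

// Compares the reference against the published CRC-64/XZ check value
// for the ASCII string "123456789".
static void
crc64_reference_check(void)
{
	const uint8_t msg[] = "123456789";
	assert(crc64_bitwise(msg, 9, 0) == UINT64_C(0x995DC9BBDF1939FA));
}
```

An optimized implementation with the same calling convention could be fed the same buffers, including split buffers with the CRC carried over between calls, and its results compared with the reference.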
Parent: b7fb438ea0
Commit: c337983e92
1 changed file with 18 additions and 0 deletions
@@ -184,6 +184,20 @@ calc_hi(uint64_t poly, uint64_t a)
 MASK_H(in, mask, high)
 
 
+// MSVC (VS2015 - VS2022) produces bad 32-bit x86 code from the CLMUL CRC
+// code when optimizations are enabled (release build). According to the bug
+// report, the ebx register is corrupted and the calculated result is wrong.
+// Trying to workaround the problem with "__asm mov ebx, ebx" didn't help.
+// The following pragma works and performance is still good. x86-64 builds
+// aren't affected by this problem.
+//
+// NOTE: Another pragma after the function restores the optimizations.
+// If the #if condition here is updated, the other one must be updated too.
+#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) && !defined(__clang__) \
+		&& defined(_M_IX86)
+#	pragma optimize("g", off)
+#endif
+
 // EDG-based compilers (Intel's classic compiler and compiler for E2K) can
 // define __GNUC__ but the attribute must not be used with them.
 // The new Clang-based ICX needs the attribute.
@@ -371,6 +385,10 @@ crc64_clmul(const uint8_t *buf, size_t size, uint64_t crc)
 #	pragma GCC diagnostic pop
 #endif
 }
+#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) && !defined(__clang__) \
+		&& defined(_M_IX86)
+#	pragma optimize("", on)
+#endif
 #endif
 
 
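The comment in the diff warns that the two #if conditions must be kept in sync by hand. A minimal sketch of one way to avoid that duplication, assuming a helper macro is acceptable in the file: the compiler check is evaluated once and both pragmas test the result. The names DEMO_MSVC_X86_NO_GLOBAL_OPT, demo_mix, and demo_check are hypothetical and are not part of liblzma; the function body is a placeholder and has nothing to do with the CRC code.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical macro: true only for real MSVC targeting 32-bit x86.
// Intel's classic compiler and clang-cl also define _MSC_VER, so they
// are excluded the same way as in the commit.
#if defined(_MSC_VER) && !defined(__INTEL_COMPILER) && !defined(__clang__) \
		&& defined(_M_IX86)
#	define DEMO_MSVC_X86_NO_GLOBAL_OPT 1
#else
#	define DEMO_MSVC_X86_NO_GLOBAL_OPT 0
#endif

// The pragma must appear outside a function. It applies to the
// functions defined after it.
#if DEMO_MSVC_X86_NO_GLOBAL_OPT
#	pragma optimize("g", off)
#endif

// Placeholder for the function that is miscompiled.
static uint32_t
demo_mix(uint32_t a, uint32_t b)
{
	return (a << 1) ^ b;
}

// An empty option list with "on" restores the optimization settings
// that were given on the command line.
#if DEMO_MSVC_X86_NO_GLOBAL_OPT
#	pragma optimize("", on)
#endif

// Small self-check; the result is the same with every compiler because
// the pragmas only affect code generation.
static void
demo_check(void)
{
	assert(demo_mix(3, 5) == 3);
}
```

With other compilers the macro evaluates to 0 and both pragma blocks vanish, so the function is compiled with the normal optimization settings.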