Using the aligned methods requires more care to ensure that
the address really is aligned, so it's nicer if the aligned
methods are prefixed. The next commit will remove the unaligned_
prefix from the unaligned methods which in liblzma are used in
more places than the aligned ones.
Add a configure option --enable-unsafe-type-punning to get the
old non-conforming memory access methods. It can be useful with
old compilers or in some other less typical situations but
shouldn't normally be used.
Omit the packed struct trick for unaligned access. While it's
best in some cases, this is simpler. If the memcpy trick doesn't
work, one can request unsafe type punning from configure.
Because CRC32/CRC64 code needs fast aligned reads, if no very
safe way to do it is found, type punning is used as a fallback.
This sucks but since it currently works in practice, it seems to
be the least bad option. It's never needed with GCC >= 4.7 or
Clang >= 3.6 since these support __builtin_assume_aligned and
thus fast aligned access can be done with the memcpy trick.
Other things:
- Support GCC/Clang __builtin_bswapXX
- Cleaner bswap fallback macros
- Minor cleanups
This adds a configure option --enable-path-for-scripts=PREFIX
which defaults to empty except on Solaris it is /usr/xpg4/bin
to make POSIX grep and others available. The Solaris case had
been documented in INSTALL with a manual fix but it's better
to do this automatically since it is needed on most Solaris
systems anyway.
Thanks to Daniel Richard G.
Or any off_t which isn't very big (like signed 64 bit integer
that most system have). A small off_t could overflow if the
file being decompressed had long enough run of zero bytes,
which would result in corrupt output.
Now memcpy() or GNU C packed structs for unaligned access instead
of type punning. See the comment in this commit for details.
Avoiding type punning with unaligned access is needed to
silence gcc -fsanitize=undefined.
New functions: unaliged_readXXne and unaligned_writeXXne where
XX is 16, 32, or 64.
I should have always known this but I didn't. Here is an example
as a reminder to myself:
int mycopy(void *dest, void *src, size_t n)
{
memcpy(dest, src, n);
return dest == NULL;
}
In the example, a compiler may assume that dest != NULL because
passing NULL to memcpy() would be undefined behavior. Testing
with GCC 8.2.1, mycopy(NULL, NULL, 0) returns 1 with -O0 and -O1.
With -O2 the return value is 0 because the compiler infers that
dest cannot be NULL because it was already used with memcpy()
and thus the test for NULL gets optimized out.
In liblzma, if a null-pointer was passed to memcpy(), there were
no checks for NULL *after* the memcpy() call, so I cautiously
suspect that it shouldn't have caused bad behavior in practice,
but it's hard to be sure, and the problematic cases had to be
fixed anyway.
Thanks to Jeffrey Walton.
"xz -dcfv not_an_xz_file" crashed (all four options are
required to trigger it). It caused xz to call
lzma_get_progress(&strm, ...) when no coder was initialized
in strm. In this situation strm.internal is NULL which leads
to a crash in lzma_get_progress().
The bug was introduced when xz started using lzma_get_progress()
to get progress info for multi-threaded compression, so the
bug is present in versions 5.1.3alpha and higher.
Thanks to Filip Palian <Filip.Palian@pjwstk.edu.pl> for
the bug report.