See the code comment for reasoning. It's far from perfect but
hopefully good enough for certain cases while hopefully doing
nothing bad in other situations.
At presets -5 ... -9, 4020 MiB vs. 4096 MiB makes no difference
on how xz scales down the number of threads.
The limit has to be a few MiB below 4096 MiB because otherwise
things like "xz --lzma2=dict=500MiB" won't scale down the dict
size enough and xz cannot allocate enough memory. With
"ulimit -v $((4096 * 1024))" on x86-64, the limit in xz had
to be no more than 4085 MiB. Some safety margin is good though.
This is hack but it should be useful when running 32-bit xz on
a 64-bit kernel that gives full 4 GiB address space to xz.
Hopefully this is enough to solve this:
https://bugzilla.redhat.com/show_bug.cgi?id=1196786
FreeBSD has a patch that limits the result in tuklib_physmem()
to SIZE_MAX on 32-bit systems. While I think it's not the way
to do it, the results on --memlimit-compress have been good. This
commit should achieve practically identical results for compression
while leaving decompression and tuklib_physmem() and thus
lzma_physmem() unaffected.
xz --flush-timeout=2000, old version:
1. xz is started. The next flush will happen after two seconds.
2. No input for one second.
3. A burst of a few kilobytes of input.
4. No input for one second.
5. Two seconds have passed and flushing starts.
The first second counted towards the flush-timeout even though
there was no pending data. This can cause flushing to occur more
often than needed.
xz --flush-timeout=2000, after this commit:
1. xz is started.
2. No input for one second.
3. A burst of a few kilobytes of input. The next flush will
happen after two seconds counted from the time when the
first bytes of the burst were read.
4. No input for one second.
5. No input for another second.
6. Two seconds have passed and flushing starts.
The same code sequence repeats so it's nicer as a separate function.
Note that in one case there was no test for opt_mode != MODE_TEST,
but that was only because that condition would always be true, so
this commit doesn't change the behavior there.
When input blocked, xz --flush-timeout=1 would wake up every
millisecond and initiate flushing which would have nothing to
flush and thus would just waste CPU time. The fix disables the
timeout when no input has been seen since the previous flush.
Using the aligned methods requires more care to ensure that
the address really is aligned, so it's nicer if the aligned
methods are prefixed. The next commit will remove the unaligned_
prefix from the unaligned methods which in liblzma are used in
more places than the aligned ones.
Add a configure option --enable-unsafe-type-punning to get the
old non-conforming memory access methods. It can be useful with
old compilers or in some other less typical situations but
shouldn't normally be used.
Omit the packed struct trick for unaligned access. While it's
best in some cases, this is simpler. If the memcpy trick doesn't
work, one can request unsafe type punning from configure.
Because CRC32/CRC64 code needs fast aligned reads, if no very
safe way to do it is found, type punning is used as a fallback.
This sucks but since it currently works in practice, it seems to
be the least bad option. It's never needed with GCC >= 4.7 or
Clang >= 3.6 since these support __builtin_assume_aligned and
thus fast aligned access can be done with the memcpy trick.
Other things:
- Support GCC/Clang __builtin_bswapXX
- Cleaner bswap fallback macros
- Minor cleanups
This adds a configure option --enable-path-for-scripts=PREFIX
which defaults to empty except on Solaris it is /usr/xpg4/bin
to make POSIX grep and others available. The Solaris case had
been documented in INSTALL with a manual fix but it's better
to do this automatically since it is needed on most Solaris
systems anyway.
Thanks to Daniel Richard G.
LZMA_TIMED_OUT is *internally* used as a value for lzma_ret
enumeration. Previously it was #defined to 32 and cast to lzma_ret.
That way it wasn't visible in the public API, but this was hackish.
Now the public API has eight LZMA_RET_INTERNALx members and
LZMA_TIMED_OUT is #defined to LZMA_RET_INTERNAL1. This way
the code is cleaner overall although the public API has a few
extra mysterious enum members.
Or any off_t which isn't very big (like signed 64 bit integer
that most system have). A small off_t could overflow if the
file being decompressed had long enough run of zero bytes,
which would result in corrupt output.
Now memcpy() or GNU C packed structs for unaligned access instead
of type punning. See the comment in this commit for details.
Avoiding type punning with unaligned access is needed to
silence gcc -fsanitize=undefined.
New functions: unaliged_readXXne and unaligned_writeXXne where
XX is 16, 32, or 64.