xz-archive

mirror of https://git.tukaani.org/xz.git synced 2024-04-04 12:36:23 +02:00

Author	SHA1	Message	Date
Lasse Collin	a648978b20	xzgrep: Make the fix for ZDI-CAN-16587 more robust. I don't know if this can make a difference in the real world but it looked kind of suspicious (what happens with sed implementations that cannot process very long lines?). At least this commit shouldn't make it worse.	2022-07-19 00:10:55 +03:00
Lasse Collin	bd7b290f3f	xzgrep: Use grep -H --label when available (GNU, *BSDs). It avoids the use of sed for prefixing filenames to output lines. Using sed for that is slower and prone to security bugs so now the sed method is only used as a fallback. This also fixes an actual bug: When grepping a binary file, GNU grep nowadays prints its diagnostics to stderr instead of stdout and thus the sed-method for prefixing the filename doesn't work. So with this commit grepping binary files gives reasonable output with GNU grep now. This was inspired by zgrep but the implementation is different.	2022-07-18 22:06:10 +03:00
Lasse Collin	b56729af9f	xzgrep: Use -e to specify the pattern to grep. Now we don't need the separate test for adding the -q option as it can be added directly in the two places where it's needed.	2022-07-18 21:10:25 +03:00
Lasse Collin	bad61b5997	Scripts: Use printf instead of echo in a few places. It's a good habit as echo has some portability corner cases when the string contents can be anything.	2022-07-18 19:18:48 +03:00
Lasse Collin	6a4a4a7d26	xzgrep: Add more LC_ALL=C to avoid bugs with multibyte characters. Also replace one use of expr with printf. The rationale for LC_ALL=C was already mentioned in `69d1b3fc29` that fixed a security issue. However, unrelated uses weren't changed in that commit yet. POSIX says that with sed and such tools one should use LC_ALL=C to ensure predictable behavior when strings contain byte sequences that aren't valid multibyte characters in the current locale. See under "Application usage" in here: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html With GNU sed invalid multibyte strings would work without this; it's documented in its Texinfo manual. Some other implementations aren't so forgiving.	2022-07-17 21:36:25 +03:00
Lasse Collin	b48f9d615f	xzgrep: Fix parsing of certain options. Fix handling of "xzgrep -25 foo" (in GNU grep "grep -25 foo" is an alias for "grep -C25 foo"). xzgrep would treat "foo" as filename instead of as a pattern. This bug was fixed in zgrep in gzip in 2012. Add -E, -F, -G, and -P to the "no argument required" list. Add -X to "argument required" list. It is an intentionally-undocumented GNU grep option so this isn't an important option for xzgrep but it seems that other grep implementations (well, those that I checked) don't support -X so I hope this change is an improvement still. grep -d (grep --directories=ACTION) requires an argument. In contrast to zgrep, I kept -d in the "no argument required" list because it's not supported in xzgrep (or zgrep). This way "xzgrep -d" gives an error about option being unsupported instead of telling that it requires an argument. Both zgrep and xzgrep tell that it's unsupported if an argument is specified. Add comments.	2022-07-17 20:57:06 +03:00
Lasse Collin	2d2201bc63	Tests: Add the .lzma files to test_files.sh.	2022-07-14 20:33:05 +03:00
Lasse Collin	ce5549a591	Tests: Add .lzma test files.	2022-07-14 19:37:42 +03:00
Lasse Collin	107c93ee5c	liblzma: Rename a variable and improve a comment.	2022-07-14 18:12:38 +03:00
Lasse Collin	511feb5ead	Update THANKS.	2022-07-13 22:24:41 +03:00
Lasse Collin	9595a3119b	liblzma: Add optional autodetection of LZMA end marker. Turns out that this is needed for .lzma files as the spec in LZMA SDK says that end marker may be present even if the size is stored in the header. Such files are rare but exist in the real world. The code in liblzma is so old that the spec didn't exist in LZMA SDK back then and I had understood that such files weren't possible (the lzma tool in LZMA SDK didn't create such files). This modifies the internal API so that LZMA decoder can be told if EOPM is allowed even when the uncompressed size is known. It's allowed with .lzma and not with other uses. Thanks to Karl Beldan for reporting the problem.	2022-07-13 22:24:07 +03:00
Lasse Collin	0c0f8e9761	xz: Document the special memlimit case of 2000 MiB on MIPS32. See commit `fc3d3a7296`.	2022-07-12 18:53:04 +03:00
Jia Tan	d1bfa3dc70	Created script to generate code coverage reports. The script uses lcov and genhtml after running the tests to show the code coverage statistics. The script will create a coverage directory where it is run. It can be run both in and out of the source directory.	2022-07-10 22:42:22 +03:00
Jia Tan	86a30b0255	Tests: Add more tests into test_check.	2022-06-16 17:39:59 +03:00
Lasse Collin	82e30fed66	Tests: Use char[][24] array for enum_strings_lzma_ret. Array of pointers to short strings is a bit pointless here and now it's fully const.	2022-06-16 15:02:57 +03:00
Lasse Collin	5ba9459e6c	Tests: tuktest.h: Add tuktest_error_impl to help with error conditions.	2022-06-16 14:12:14 +03:00
Lasse Collin	b339892668	Tests: tuktest.h: Rename file_from_* and use tuktest_malloc there.	2022-06-16 13:29:59 +03:00
Lasse Collin	d8b63a0ad6	Tests: tuktest.h: Add malloc wrapper with automatic freeing.	2022-06-16 13:08:19 +03:00
Lasse Collin	1d51536a4b	Tests: tuktest.h: Move a function.	2022-06-16 11:47:37 +03:00
Lasse Collin	70c7555f64	Tests: test_vli: Remove an invalid test-assertion. lzma_vli is unsigned so trying a signed value results in a compiler warning from -Wsign-conversion. (lzma_vli)-1 equals LZMA_VLI_UNKNOWN anyway which is the next assertion.	2022-06-14 22:21:15 +03:00
Lasse Collin	154b73c5a1	Tests: test_vli: Add const where appropriate.	2022-06-14 22:17:01 +03:00
Jia Tan	0354d6cce3	Added vli tests to .gitignore	2022-06-14 22:03:55 +03:00
Jia Tan	a08f5ccf6b	Created tests for all functions exported in vli.h Achieved 100% code coverage vli_encoder.c, vli_decoder.c, and vli_size.c	2022-06-14 22:00:34 +03:00
jiat75	1e3eb61815	Added parallel test artifacts to .gitignore	2022-06-14 21:47:09 +03:00
Lasse Collin	00e3613f12	Tests: Use good-1-empty-bcj-lzma2.xz in test_bcj_exact_size. It's much nicer this way so that the test data isn't a hardcoded table inside the C file.	2022-06-14 21:29:21 +03:00
Lasse Collin	86bab755be	Tests: Add file reading helpers to tuktest.h.	2022-06-14 21:26:13 +03:00
Lasse Collin	83d2337b72	Tests: tuktest.h: Move a printf from a macro to a helper function.	2022-06-14 18:21:57 +03:00
Lasse Collin	f9e8176ea7	Tests: Add test file good-1-empty-bcj-lzma2.xz. This is from test_bcj_exact_size.c. It's good to have it as a standalone file.	2022-06-14 17:20:49 +03:00
Jia Tan	aa75c5563a	Tests: Created tests for hardware functions. Created tests for all API functions exported in src/liblzma/api/lzma/hardware.h. The tests are fairly trivial but are helpful because they will inform users if their machines cannot support these functions. They also improve the code coverage metrics.	2022-06-10 16:58:47 +03:00
Lasse Collin	5c8ffdca20	Tests: Convert test_check to tuktest. Thanks to Jia Tan for help with all the tests.	2022-06-02 21:06:58 +03:00
Lasse Collin	faf5ff8899	Tests: Convert test_block_header to tuktest.	2022-06-02 20:45:05 +03:00
Lasse Collin	754d39fbeb	Tests: Convert test_bcj_exact_size to tuktest. The compress() and decompress() functions were merged because the latter depends on the former so they need to be a single test case.	2022-06-02 20:28:23 +03:00
Lasse Collin	96da21470f	Tests: Include tuktest.h in tests.h. This breaks -Werror because none of the tests so far use tuktest.h and thus there are warnings about unused variables and functions.	2022-06-02 20:27:00 +03:00
Lasse Collin	df71ba1c99	Tests: Add tuktest.h mini-test-framework.	2022-06-02 20:25:21 +03:00
Lasse Collin	4773608554	Build: Enable Automake's parallel test harness. It has been the default for quite some time already and the old serial harness isn't discouraged. The downside is that with parallel tests one cannot print progress info or other diagnostics to the terminal; all output from the tests will be in the log files only. But now that the compression tests are separated the parallel tests will speed things up.	2022-05-23 21:31:36 +03:00
Lasse Collin	9a6dd6d46f	Tests: Split test_compress.sh into separate test unit for each file. test_compress.sh now takes one command line argument: a filename to be tested. If it begins with "compress_generated_" the file will be created with create_compress_files. This will allow parallel execution of the slow tests.	2022-05-23 21:31:20 +03:00
Lasse Collin	c7758ac9c7	Test: Make create_compress_files.c a little more flexible. If a command line argument is given, then only the test file of that type is created. It's quite dumb in the sense that unknown names don't give an error but it's good enough here. Also use EXIT_FAILURE instead of 1 as exit status for errors.	2022-05-23 20:59:47 +03:00
Lasse Collin	4a8e4a7b0a	Tests: Remove unneeded commented lines from test_compress.sh.	2022-05-23 20:17:42 +03:00
Lasse Collin	2ee50d150e	Tests: Remove progress indicator from test_compress.sh. It will be useless with Automake's parallel tests.	2022-05-23 20:16:00 +03:00
Lasse Collin	2ce4f36f17	liblzma: Silence a warning. The actual initialization is done via mythread_sync and seems that GCC doesn't necessarily see that it gets initialized there.	2022-05-23 19:37:18 +03:00
Lasse Collin	5d8f3764ef	xz: Fix build with --disable-threads.	2022-04-14 20:53:16 +03:00
Lasse Collin	1d59289727	xz: Change the cap of the default -T0 memlimit for 32-bit xz. The SIZE_MAX / 3 was 1365 MiB. 1400 MiB gives a little more room and it looks like a round (artificial) number in --info-memory once --info-memory is made to display it. Also, using #if avoids useless code on 64-bit builds.	2022-04-14 14:50:17 +03:00
Lasse Collin	c77fe55ddb	xz: Add a default soft memory usage limit for --threads=0. This is a soft limit in the sense that it only affects the number of threads. It never makes xz fail and it never makes xz change settings that would affect the compressed output. The idea is to make -T0 have more reasonable behavior when the system has very many cores or when memory-hungry compression options are used. This also helps with 32-bit xz, preventing it from running out of address space. The downside of this commit is that now the number of threads might become too low compared to what the user expected. I hope this to be an acceptable compromise as the old behavior has been a source of well-argued complaints for a long time.	2022-04-14 14:20:46 +03:00
Lasse Collin	0adc13bfe3	xz: Make -T0 use multithreaded mode on single-core systems. The main problem with the old behavior is that the compressed output is different on single-core systems vs. multicore systems. This commit fixes it by making -T0 use one thread in multithreaded mode on single-core systems. The downside of this is that it uses more memory. However, if --memlimit-compress is used, xz can (thanks to the previous commit) drop to the single-threaded mode still.	2022-04-14 13:00:40 +03:00
Lasse Collin	898faa9728	xz: Changes to --memlimit-compress and --no-adjust. In single-threaded mode, --memlimit-compress can make xz scale down the LZMA2 dictionary size to meet the memory usage limit. This obviously affects the compressed output. However, if xz was in threaded mode, --memlimit-compress could make xz reduce the number of threads but it wouldn't make xz switch from multithreaded mode to single-threaded mode or scale down the LZMA2 dictionary size. This seemed illogical and there was even a "FIXME?" about it. Now --memlimit-compress can make xz switch to single-threaded mode if one thread in multithreaded mode uses too much memory. If memory usage is still too high, then the LZMA2 dictionary size can be scaled down too. The option --no-adjust was also changed so that it no longer prevents xz from scaling down the number of threads as that doesn't affect compressed output (only performance). After this commit --no-adjust only prevents adjustments that affect compressed output, that is, with --no-adjust xz won't switch from multithreaded mode to single-threaded mode and won't scale down the LZMA2 dictionary size. The man page wasn't updated yet.	2022-04-14 12:38:00 +03:00
Lasse Collin	cad299008c	xz: Add --memlimit-mt-decompress along with a default limit value. --memlimit-mt-decompress allows specifying the limit for multithreaded decompression. This matches memlimit_threading in liblzma. This limit can only affect the number of threads being used; it will never prevent xz from decompressing a file. The old --memlimit-decompress option is still used at the same time. If the value of --memlimit-decompress (the default value or one specified by the user) is less than the value of --memlimit-mt-decompress, then --memlimit-mt-decompress is reduced to match --memlimit-decompress. Man page wasn't updated yet.	2022-04-12 00:04:30 +03:00
Lasse Collin	fe87b4cd53	liblzma: Threaded decoder: Improve setting of pending_error. It doesn't need to be done conditionally. The comments try to explain it.	2022-04-06 23:11:59 +03:00
Lasse Collin	90621da7f6	liblzma: Add a new flag LZMA_FAIL_FAST for threaded decoder. In most cases if the input file is corrupt the application won't care about the uncompressed content at all. With this new flag the threaded decoder will return an error as soon as any thread has detected an error; it won't wait to copy out the data before the location of the error. I don't plan to use this in xz to keep the behavior consistent between single-threaded and multi-threaded modes.	2022-04-06 13:16:00 +03:00
Lasse Collin	64b6d496dc	liblzma: Threaded decoder: Always wait for output if LZMA_FINISH is used. This makes the behavior consistent with the single-threaded decoder when handling truncated .xz files. Thanks to Jia Tan for finding this issue.	2022-04-05 12:24:57 +03:00
Lasse Collin	e671bc8828	liblzma: Threaded decoder: Support zpipe.c-style decoding loop. This makes it possible to call lzma_code() in a loop that only reads new input when lzma_code() didn't fill the output buffer completely. That isn't the calling style suggested by the liblzma example program 02_decompress.c so perhaps the usefulness of this feature is limited. Also, it is possible to write such a loop so that it works with the single-threaded decoder but not with the threaded decoder even after this commit, or so that it works only if lzma_mt.timeout = 0. The zlib tutorial <https://zlib.net/zlib_how.html> is a well-known example of a loop where more input is read only when output isn't full. Porting this as is to liblzma would work with the single-threaded decoder (if LZMA_CONCATENATED isn't used) but it wouldn't work with threaded decoder even after this commit because the loop assumes that no more output is possible when it cannot read more input ("if (strm.avail_in == 0) break;"). This cannot be fixed at liblzma side; the loop has to be modified at least a little. I'm adding this in any case because the actual code is simple and short and should have no harmful side-effects in other situations.	2022-04-02 21:49:59 +03:00

1835 commits