xz-archive/tests/files/README


.lzma Test Files
----------------

0. Introduction

    This directory contains a bunch of files to test handling of .lzma files
    in .lzma decoder implementations. Many of the files have been created
    by hand with a hex editor, thus there is no better "source code" than
    the files themselves. All the test files (*.lzma) and this README have
    been put into the public domain.


1. File Types

    Good files (good-*.lzma) must decode successfully without requiring
    a lot of CPU time or RAM. If the decoder supports only Single-Block
    Streams, then good-multi-*.lzma won't decode, of course.

    Bad files (bad-*.lzma) must cause the decoder to give an error. Like
    with the good files, these files must not require a lot of CPU time
    or RAM before they get detected to be broken.

    Malicious files (malicious-*.lzma) are good in terms of the file format
    specification, but try to trigger excessive CPU, RAM or disk usage in
    the decoder. To prevent malicious files from putting the decoder in
    an infinite loop (*) or eating all available RAM or disk space,
    decoders should have internal limiters that catch these situations.

    (*) Strictly speaking not infinite, but if decoding of a small file
        would take a few weeks or even years, it's an infinite loop in
        practice.
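
    As a rough illustration of the three categories above, a test harness
    could map each file name prefix to an expected outcome. This is only a
    sketch: DecodeError, LimitExceeded, and the decode callable are
    hypothetical names, not part of any real decoder API, and treating
    "stopped by a limiter" as the expected result for every malicious file
    is an assumption.

```python
from pathlib import Path
from typing import Callable, Iterable, List


class DecodeError(Exception):
    """Hypothetical: raised by the decoder under test for a broken file."""


class LimitExceeded(Exception):
    """Hypothetical: raised when an internal limiter stops decoding."""


def expected_outcome(name: str) -> str:
    """Map a test file name to the outcome described in this README."""
    if name.startswith("good-"):
        return "ok"
    if name.startswith("bad-"):
        return "error"
    if name.startswith("malicious-"):
        return "limited"  # assumption: a limiter is expected to trigger
    raise ValueError(f"unrecognized test file name: {name}")


def actual_outcome(data: bytes, decode: Callable[[bytes], bytes]) -> str:
    """Run the (hypothetical) decoder and classify what happened."""
    try:
        decode(data)
    except LimitExceeded:
        return "limited"
    except DecodeError:
        return "error"
    return "ok"


def check_files(paths: Iterable[Path],
                decode: Callable[[bytes], bytes]) -> List[str]:
    """Return the names of files whose outcome differs from the expected one."""
    failures = []
    for path in paths:
        if actual_outcome(path.read_bytes(), decode) != expected_outcome(path.name):
            failures.append(path.name)
    return failures
```

    A decoder that supports only Single-Block Streams would need to skip
    good-multi-*.lzma before using such a check.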


2. Descriptions of Individual Files

2.1. Good Files

    good-single-none.lzma uses the implicit Copy filter with a known
    Uncompressed Size.

    good-single-none-pad.lzma is good-single-none.lzma with Footer Padding.

    good-cat-single-none-pad.lzma is two good-single-none-pad.lzma files
    concatenated as is. Fully decoding this file requires that the decoder
    supports decoding concatenated files.

    good-single-subblock_implicit.lzma uses the implicit Subblock filter.

    good-single-lzma.lzma is an LZMA-compressed file with EOPM.

    good-single-subblock-lzma.lzma has a basic combination of Subblock and
    LZMA filters.

    good-single-none-empty_1.lzma is an empty file with implicit Copy
    filter and no integrity Check.

    good-single-none-empty_2.lzma is an empty file with implicit Copy
    filter and CRC32 as Check.

    good-single-none-empty_3.lzma is an empty file with implicit Copy
    filter, known Compressed Size, and no integrity Check.

    good-single-lzma-empty.lzma is an empty file with LZMA filter and no
    integrity Check.

    good-single-subblock_rle.lzma takes advantage of Subblock filter's
    run-length encoding.

    good-single-delta-lzma.tiff.lzma is an image file that compresses
    better with Delta+LZMA than with plain LZMA.

    good-single-lzma-flush_1.lzma has a flush marker in the middle of
    the file, and no EOPM.

    good-single-lzma-flush_2.lzma has a flush marker in the middle of
    the file and just before EOPM.

    good-multi-none-1.lzma is a basic Multi-Block Stream with two Data
    Blocks and a Footer Metadata Block.

    good-multi-none-2.lzma is good-multi-none-1.lzma with Total Size and
    Uncompressed Size added to the Footer Metadata Block.

    good-multi-none-extra_1.lzma has the `Extra is present' flag set but
    no actual Extra Records.

    good-multi-none-extra_2.lzma has two non-empty Extra Records.

    good-multi-none-extra_3.lzma has an Extra Record that has empty Data.

    good-multi-none-header_1.lzma has a very minimal Header Metadata Block
    with only the Metadata Flags field.

    good-multi-none-header_2.lzma has all information in both Header and
    Footer Metadata Blocks. The Size of Header Metadata Block field in the
    Header Metadata Block holds a wrong value, but the decoder must ignore
    this field when it appears in the Header Metadata Block.

    good-multi-none-header_3.lzma has Index only in the Header Metadata
    Block. Footer Metadata Block contains only Size of Header Metadata
    Block and Total Size.

    good-multi-none-block_1.lzma has Index in Header Metadata Block. The
    Compressed Size and Uncompressed Size fields are present in the Data
    Blocks. There is some Footer Padding between the Blocks.


2.2. Bad Files

    bad-single-none-truncated.lzma is good-single-none.lzma without the
    last byte of the file.

    bad-cat-single-none-pad_garbage_1.lzma is good-cat-single-none-pad.lzma
    with 0xFE appended to the end of the file. 0xFE doesn't begin a .lzma
    or LZMA_Alone format file.

    bad-cat-single-none-pad_garbage_2.lzma is good-cat-single-none-pad.lzma
    with 0xFF appended to the end of the file. 0xFF begins a .lzma format
    file, thus the decoder has to detect that the file is incomplete.

    bad-cat-single-none-pad_garbage_3.lzma is good-cat-single-none-pad.lzma
    with 0x5D appended to the end of the file. 0x5D is the most common
    first byte of an LZMA_Alone format file.
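
    The three garbage files differ only in how the trailing byte looks to
    a decoder that tries to continue with another concatenated file. A
    minimal sketch of that distinction, using only the byte values named
    above (the function name and the returned labels are made up for
    illustration):

```python
def classify_trailing_byte(first: int) -> str:
    """Describe what a single trailing byte looks like to a decoder.

    Labels are illustrative; all three test files must be rejected.
    """
    if first == 0xFF:
        # Looks like the start of a .lzma file, so the decoder has to
        # notice that the rest of the header is missing.
        return "incomplete .lzma header"
    if first == 0x5D:
        # Most common first byte of an LZMA_Alone file.
        return "possible LZMA_Alone start"
    # Anything else (such as 0xFE) begins neither format.
    return "garbage"


for byte in (0xFE, 0xFF, 0x5D):
    print(f"0x{byte:02X}: {classify_trailing_byte(byte)}")
```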

    bad-single-none-footer_filter_flags.lzma has different Stream Flags
    in Stream Footer than in Stream Header.

    bad-single-none-too_long_vli.lzma has a 10-byte variable-length integer.
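
    A length-limited integer decoder could look roughly like the sketch
    below. It ASSUMES a base-128 layout (seven payload bits per byte,
    high bit set on every byte except the last) and a nine-byte maximum,
    as later documented for .xz; this README does not describe the
    encoding, so treat both as assumptions. DecodeError is a hypothetical
    exception type.

```python
from typing import Tuple

VLI_MAX_BYTES = 9  # assumption: a 10-byte integer is therefore too long


class DecodeError(Exception):
    """Hypothetical error type for malformed input."""


def decode_vli(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one integer starting at pos; return (value, next position)."""
    value = 0
    for i in range(VLI_MAX_BYTES):
        if pos + i >= len(buf):
            raise DecodeError("incomplete variable-length integer")
        byte = buf[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:  # high bit clear marks the last byte
            return value, pos + i + 1
    raise DecodeError("variable-length integer is too long")
```

    With this layout, ten bytes where the first nine have the high bit
    set are rejected without reading the tenth byte.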

    bad-single-none-empty.lzma is like good-single-none-empty_3.lzma but
    with a non-zero value in the Compressed Size field.

    bad-single-data_after_eopm_1.lzma has LZMA+Subblock, where the Subblock
    filter gives one byte of data to LZMA after LZMA has detected EOPM.

    bad-single-data_after_eopm_2.lzma is like
    bad-single-data_after_eopm_1.lzma but Subblock gives 256 MiB of data
    to LZMA after LZMA has detected EOPM.

    bad-single-subblock_subblock.lzma has Subblock+Subblock, where the
    Subblock decoder is given End of Input in the middle of a Subblock.

    bad-single-subblock-padding_loop.lzma contains a huge amount of
    consecutive Padding bytes, which isn't allowed by the Subblock filter
    format. If it were allowed, this file would hang the decoder for a very
    long time (weeks to years).

    bad-single-subblock1023-slow.lzma is similar to
    malicious-single-subblock31-slow.lzma except that this uses 1023 bytes
    of Padding in every place instead of 31 bytes. The Subblock filter
    format specification allows Padding of at most 31 bytes, thus this file
    must get detected as bad without producing any output. Allowing larger
    Padding than 31 bytes was considered (so this test file was created),
    but it seemed to be a bad idea since it would increase worst-case CPU
    usage.

    bad-single-lzma-flush_beginning.lzma has a flush marker at the beginning
    of the LZMA data.

    bad-single-lzma-flush_twice.lzma has two flush markers with no data
    between them.

    bad-multi-none-1.lzma has data after the last field in the Metadata
    Block and the `Extra is present' flag is not set.

    bad-multi-none-2.lzma has a wrong Total Size in the Footer Metadata
    Block.

    bad-multi-none-3.lzma has a wrong Uncompressed Size in the Footer
    Metadata Block.

    bad-multi-none-index_1.lzma has a wrong value in the Number of Data
    Blocks field.

    bad-multi-none-index_2.lzma has Metadata that is too short to contain
    all the Index Records.

    bad-multi-none-index_3.lzma has a wrong value in the Total Size field
    in the Index.

    bad-multi-none-index_4.lzma has a wrong value in the Uncompressed Size
    field in the Index.

    bad-multi-none-extra_1.lzma has an incomplete Extra Record at the end
    of the Metadata Block.

    bad-multi-none-extra_2.lzma has an incomplete variable-length integer
    as the Extra Record ID.

    bad-multi-none-extra_3.lzma has an incomplete Extra Record at the end
    of the Metadata Block.

    bad-multi-none-header_1.lzma has an empty Header Metadata Block (even
    the Metadata Flags field is not present).

    bad-multi-none-header_2.lzma has Index in the Header Metadata Block,
    which describes only one Data Block, while the Stream actually has
    two Data Blocks. A sophisticated decoder should give an error when
    it detects the second Data Block; all Multi-Block decoders must
    detect the file as corrupt at some point.

    bad-multi-none-header_3.lzma contains a Total Size that is too small
    in the Header Metadata Block. A sophisticated decoder should abort
    decoding before the second Data Block, preferably before the first
    Data Block has been finished; all Multi-Block decoders must detect
    the file as corrupt at some point.

    bad-multi-none-header_4.lzma is like bad-multi-none-header_3.lzma but
    with an Uncompressed Size that is too small.

    bad-multi-none-header_5.lzma has Index in the Header Metadata Block,
    but the Total Size field is missing from the Footer Metadata Block.

    bad-multi-none-header_6.lzma has both Index and Total Size in Header
    Metadata Block, but Total Size doesn't match the Index. A sophisticated
    decoder should abort before decoding any Data Blocks; all Multi-Block
    decoders must detect the file as corrupt at some point.

    bad-multi-none-block_1.lzma has a wrong Uncompressed Size in the first
    Data Block. A sophisticated decoder should detect this error before
    producing any output, because it can see that the Uncompressed Size
    doesn't match the Index in the Header Metadata Block; all Multi-Block
    decoders must detect the file as corrupt at some point.

    bad-multi-none-block_2.lzma has a Compressed Size that is too big in
    the first Data Block. A sophisticated decoder may be able to detect the
    file as corrupt before producing any output, because Compressed Size +
    size of Block Header exceeds the Total Size stored in the Index in the
    Header Metadata Block. A sophisticated decoder should be able to detect
    the error before the end of the first Data Block; all Multi-Block
    decoders must detect the file as corrupt at some point.
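
    The early check described for bad-multi-none-block_2.lzma is a single
    comparison. In the sketch below the function and parameter names are
    hypothetical; it encodes only the one condition stated above and is
    not a complete validation of a Data Block.

```python
def block_can_fit(block_header_size: int,
                  compressed_size: int,
                  index_total_size: int) -> bool:
    """Return False when the Block cannot fit in the indexed Total Size.

    The file is corrupt if Compressed Size + size of Block Header
    exceeds the Total Size stored in the Index.
    """
    return block_header_size + compressed_size <= index_total_size


# Illustrative numbers only, not taken from the test file.
print(block_can_fit(block_header_size=4, compressed_size=13,
                    index_total_size=20))   # fits
print(block_can_fit(block_header_size=4, compressed_size=1000,
                    index_total_size=20))   # cannot fit: corrupt
```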


2.3. Malicious Files

    malicious-single-subblock31-slow.lzma requires quite a bit of CPU time
    per decoded byte. It contains LZMA-compressed Subblock filter data that
    has as much Padding as the specification allows. LZMA is also used as
    a Subfilter, to further slow down the decoder. Every Subfilter instance
    produces only one byte of output. If you can create a file that wastes
    notably more CPU cycles than this file, please contact Lasse Collin.

    malicious-single-subblock-256MiB.lzma is a tiny file that produces
    256 MiB of output. It uses Subblock filter's run-length encoding
    to achieve this.

    malicious-single-subblock-64PiB.lzma is a tiny file that produces
    64 PiB of output (if you have patience to wait). This is done by
    chaining two Subblock filters and using their run-length encoders.
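
    One way the two sizes line up: if each of the two chained filters
    expands its input to 256 MiB (an assumption, the per-filter figure is
    not stated here), the sizes multiply to exactly 64 PiB. The throughput
    used for the time estimate is purely illustrative.

```python
MIB = 1 << 20
GIB = 1 << 30
PIB = 1 << 50

single_filter_output = 256 * MIB  # 2**28 bytes from one run-length encoder
# Assumption: chaining two such filters multiplies the expansion.
chained_output = single_filter_output * single_filter_output  # 2**56 bytes

assert chained_output == 64 * PIB

assumed_throughput = 1 * GIB  # bytes per second, illustrative only
seconds = chained_output // assumed_throughput
days = seconds / 86400
print(f"{seconds} seconds, about {days:.0f} days")
```

    At that assumed rate the output would take a bit over two years to
    produce.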

    malicious-multi-metadata-64PiB.lzma is like
    malicious-single-subblock-64PiB.lzma but the huge amount of output
    is in a Metadata Block. Trying to decode this file may take years
    unless the decoder catches that the Metadata has unreasonable size.
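
    A limiter on the amount of output can be sketched as a wrapper around
    a stream of decoded chunks. Everything named below (LimitExceeded,
    limit_output, fake_rle_output) is hypothetical; fake_rle_output only
    stands in for a decoder that expands a run-length record lazily.

```python
from typing import Iterable, Iterator


class LimitExceeded(Exception):
    """Hypothetical error raised when a limiter stops decoding."""


def limit_output(chunks: Iterable[bytes], max_bytes: int) -> Iterator[bytes]:
    """Pass chunks through until their total size would exceed max_bytes."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > max_bytes:
            raise LimitExceeded(f"output exceeds {max_bytes} bytes")
        yield chunk


def fake_rle_output(byte: bytes, repeat: int,
                    chunk_size: int = 1 << 16) -> Iterator[bytes]:
    """Stand-in decoder: produce `repeat` copies of one byte, lazily."""
    remaining = repeat
    while remaining > 0:
        n = min(chunk_size, remaining)
        yield byte * n
        remaining -= n


try:
    # Ask for 64 PiB but allow only 1 MiB; the limiter stops early.
    for _ in limit_output(fake_rle_output(b"\0", 1 << 56), 1 << 20):
        pass
except LimitExceeded as exc:
    print("stopped:", exc)
```

    Because the chunks are produced lazily, the limiter triggers after
    about 1 MiB instead of after the full output has been generated.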
Added tests/files/README. 2008-01-07 18:09:44 +02:00