Updated faq.txt.

Some questions worth answering were removed, because I currently don't have good up to date answers to them.
2024-04-04 12:36:23 +02:00 · 2009-08-18 00:26:48 +03:00 · 2009-08-18 00:26:48 +03:00 · b198e770a1
commit b198e770a1
parent fe111a25cd
1 changed files with 73 additions and 166 deletions
--- a/doc/faq.txt
+++ b/doc/faq.txt
@ -2,185 +2,96 @@
 XZ Utils FAQ
 ============
-Q:  What are LZMA, LZMA Utils, lzma, .lzma, liblzma, LZMA SDK, LZMA_Alone,
+Q:  What do the letters XZ mean?
    7-Zip and p7zip?
-A:  LZMA stands for Lempel-Ziv-Markov chain-Algorithm. LZMA is the name
+A:  Nothing. They are just two letters, which come from the file format
-    of the compression algorithm designed by Igor Pavlov. He is the author
+    suffix .xz. The .xz suffix was selected, because it seemed to be
-    of 7-Zip, which is a great LGPL'd compression tool for Microsoft
+    pretty much unused. It is no deeper meaning.
    Windows operating systems. In addition to 7-Zip itself, also LZMA SDK
    is available on the website of 7-Zip. LZMA SDK contains LZMA
    implementations in C++, Java and C#. The C++ version is the original
    implementation which is used also in 7-Zip itself.
    Excluding the unrar plugin, 7-Zip is free software (free as in
    freedom). Thanks to this, it was possible to port it to POSIX
    platforms. The port was done and is maintained by myspace (TODO:
    myspace's real name?). p7zip is a port of 7-Zip's command line version;
    p7zip doesn't include the 7-Zip's GUI.
    In POSIX world, users are used to gzip and bzip2 command line tools.
    Developers know APIs of zlib and libbzip2. LZMA Utils try to ease
    adoption of LZMA on free operating systems by providing a compression
    library and a set of command line tools. The library is called liblzma.
    It provides a zlib-like API making it easy to adapt LZMA compression in
    existing applications. The main command line tool is known as lzma,
    whose command line syntax is very similar to that of gzip and bzip2.
    The original command line tool from LZMA SDK (lzma.exe) was found from
    a directory called LZMA_Alone in the LZMA SDK. It used a simple header
    format in .lzma files. This format was also used by LZMA Utils up to
    and including 4.32.x. In LZMA Utils documentation, LZMA_Alone refers
    to both the file format and the command line tool from LZMA SDK.
    Because of various limitations of the LZMA_Alone file format, a new
    file format was developed. Extending some existing format such as .gz
    used by gzip was considered, but these formats were found to be too
    limited. The filename suffix for the new .lzma format is `.lzma'. The
    same suffix is also used for files in the LZMA_Alone format. To make
    the transition to the new format as transparent as possible, LZMA Utils
    support both the new and old formats transparently.
    7-Zip and LZMA SDK: <http://7-zip.org/>
    p7zip: <http://p7zip.sourceforge.net/>
    LZMA Utils: <http://tukaani.org/lzma/>
-Q:  What LZMA implementations there are available?
+Q:  What are LZMA and LZMA2?
-A:  LZMA SDK contains implementations in C++, Java and C#. The C++ version
+A:  LZMA stands for Lempel-Ziv-Markov chain-Algorithm. It is the name
-    is the original implementation which is part of 7-Zip. LZMA SDK
+    of the compression algorithm designed by Igor Pavlov for 7-Zip.
-    contains also a small LZMA decoder in C.
+    LZMA is based on LZ77 and range encoding.
-    A port of LZMA SDK to Pascal was made by Alan Birtles
+    LZMA2 is an updated version of the original LZMA to fix a couple of
-    <http://www.birtles.org.uk/programming/>. It should work with
+    practical issues. In context of XZ Utils, LZMA is called LZMA1 to
-    multiple Pascal programming language implementations.
+    emphasize that LZMA is not the same thing as LZMA2. LZMA2 is the
-
+    primary compression algorithm in the .xz file format.
    LZMA Utils includes liblzma, which is directly based on LZMA SDK.
    liblzma is written in C (C99, not C89). In contrast to C++ callback
    API used by LZMA SDK, liblzma uses zlib-like stateful C API. I do not
    want to comment whether both/former/latter/neither API(s) are good or
    bad. The only reason to implement a zlib-like API was, that many
    developers are already familiar with zlib, and very many applications
    already use zlib. Having a similar API makes it easier to include LZMA
    support in existing applications.
    See also <http://en.wikipedia.org/wiki/LZMA#External_links>.
-Q:  Which file formats are supported by LZMA Utils?
+Q:  There are many LZMA related projects. How does XZ Utils relate to them?
-A:  Even when the raw LZMA stream is always the same, it can be wrapped
+A:  7-Zip and LZMA SDK are the original projects. LZMA SDK is roughly
-    in different container formats. The preferred format is the new .lzma
+    a subset of the 7-Zip source tree.
    format. It has magic bytes (the first six bytes: 0xFF 'L' 'Z' 'M'
    'A' 0x00). The format supports chaining up to seven filters, splitting
    data to multiple blocks for easier multi-threading and rough
    random-access reading. The file integrity is verified using CRC32,
    CRC64, or SHA256, and by verifying the uncompressed size of the file.
-    LZMA SDK includes a tool called LZMA_Alone. It supports uses a
+    p7zip is 7-Zip's command line tools ported to POSIX-like systems.
    primitive header which includes only the mandatory stream information
    required by the LZMA decoder. This format can be both read and
    written by liblzma and the command line tool (use --format=alone to
    create such files).
-    .7z is the native archive format used by 7-Zip. This format is not
+    LZMA Utils provide a gzip-like lzma tool for POSIX-like systems.
-    supported by liblzma, and probably will never be supported. You
+    LZMA Utils are based on LZMA SDK. XZ Utils are the successor to
-    should use e.g. p7zip to extract .7z files.
+    LZMA Utils.
-    It is possible to implement custom file formats by using raw filter
+    There are several other projects using LZMA. Most are more or less
-    mode in liblzma. In this mode the application needs to store the filter
+    based on LZMA SDK.
    properties and provide them to liblzma before starting to uncompress
    the data.
-Q:  How can I identify files containing LZMA compressed data?
+Q:  Do XZ Utils support the .7z format?
-A:  The preferred filename suffix for .lzma files is `.lzma'. `.tar.lzma'
+A:  No. Use 7-Zip (Windows) or p7zip (POSIX-like systems) to handle .7z
-    may be abbreviated to `.tlz'. The same suffixes are used for files in
+    files.
    LZMA_Alone format. In practice this should be no problem since tools
    included in LZMA Utils support both formats transparently.
    Checking the magic bytes is easy way to detect files in the new .lzma
    format (the first six bytes: 0xFF 'L' 'Z' 'M' 'A' 0x00). The "file"
    command version FIXME contains magic strings for this format.
    The old LZMA_Alone format has no magic bytes. Its header cannot contain
    arbitrary bytes, thus it is possible to make a guess. Unfortunately the
    guessing is usually too hard to be reliable, so don't try it unless you
    are desperate.
-Q:  Does the lzma command line tool support sparse files?
+Q:  I have many .tar.7z files. Can I convert them to .tar.xz without
    spending hours recompressing the data?
-A:  Sparse files can (of course) be compressed like normal files, but
+A:  In the "extra" directory, there is a script named 7z2lzma.bash which
-    uncompression will not restore sparseness of the file. Use an archiver
+    is able to convert some .7z files to the .lzma format (not .xz). It
-    tool to take care of sparseness before compressing the data with lzma.
+    needs the 7za (or 7z) command from p7zip. The script may silently
-
+    produce corrupt output if certain assumptions are not met, so
-    The reason for this is that archiver tools handle files, while
+    decompress the resulting .lzma file and compare it against the
-    compression tools handle streams or buffers. Being a sparse file is
+    original before deleting the original file!
    a property of the file on the disk, not a property of the stream or
    buffer.
-Q:  Can I recover parts of a broken LZMA file (e.g. corrupted CD-R)?
+Q:  I have many .lzma files. Can I quickly convert them to the .xz format?
-A:  With LZMA_Alone and single-block .lzma files, you can uncompress the
+A:  For now, no. Since XZ Utils supports the .lzma format, it's usually
-    file until you hit the first broken byte. The data after the broken
+    not too bad to keep the old files in the old format. If you want to
-    position is lost. LZMA relies on the uncompression history, and if
+    do the conversion anyway, you need to decompress the .lzma files and
-    bytes are missing in the middle of the file, it is impossible to
+    then recompress to the .xz format.
    reliably continue after the broken section.
-    With multi-block .lzma files it may be possible to locale the next
+    Technically, there is a way to make the conversion relatively fast
-    block in the file and continue decoding there. A limited recovery
+    (roughly twice the time that normal decompression takes). Writing
-    tool for this kind of situations is planned.
+    such a tool would take quite a bit time though, and would probably
    be useful to only a few people. If you really want such a conversion
    tool, contact Lasse Collin and offer some money.
-Q:  Is LZMA patented?
+Q:  Can I recover parts of a broken .xz file (e.g. corrupted CD-R)?
-A:  No, the authors are not aware of any patents that could affect LZMA.
+A:  It may be possible if the file consist of multiple blocks, which
-    However, due to nature of software patents, the authors cannot
+    typically is not the case if the file was created in single-threaded
-    guarantee, that LZMA isn't affected by any third party patent.
+    mode. There is no recovery program yet.
-Q:  Where can I find documentation about how LZMA works as an algorithm?
+Q:  Is (some part of) XZ Utils patented?
-A:  Read the source code, Luke. There is no documentation about LZMA
+A:  Lasse Collin is not aware of any patents that could affect XZ Utils.
-    internals. It is possible that Igor Pavlov is the only person on
+    However, due to nature of software patents, it's not possible to
-    the Earth that completely knows and understands the algorithm.
+    guarantee that XZ Utils isn't affected by any third party patent(s).
    You could begin by downloading LZMA SDK, and start reading from
    the LZMA decoder to get some idea about the bitstream format.
    Before you begin, you should know the basics of LZ77 and
    range coding algorithms. LZMA is based on LZ77, but LZMA is
    *a lot* more complex. Range coding is used to compress the
    final bitstream like Huffman coding is used in Deflate.
-Q:  What are filters?
+Q:  Where can I find documentation about the file format and algorithms?
-A:  In context of .lzma files, a filter means an implementation of a
+A:  The .xz format is documented in xz-file-format.txt. It is a container
-    compression algorithm. The primary filter is LZMA, which is why
+    format only, and doesn't include descriptions of any non-trivial
-    the names of the tools contain the letters LZMA.
+    filters.
-    liblzma and the new .lzma format support also other filters than LZMA.
+    Documenting LZMA and LZMA2 is planned, but for now, there is no other
-    There are different types of filters, which are suitable for different
+    documentation that the source code. Before you begin, you should know
-    types of data. Thus, to select the optimal filter and settings, the
+    the basics of LZ77 and range coding algorithms. LZMA is based on LZ77,
-    type of the input data being compressed needs to be known.
+    but LZMA is *a lot* more complex. Range coding is used to compress
-
+    the final bitstream like Huffman coding is used in Deflate.
    Some filters are most useful when combined with another filter like
    LZMA. These filters increase redundancy in the data, without changing
    the size of the data, by taking advantage of properties specific to
    the data being compressed.
    So far, all the filters are always reversible. That is, no matter what
    data you pass to a filter encoder, it can be always defiltered back to
    the original form. Because of this, it is safe to compress for example
    a software package that contains other file types than executables
    using a filter specific to the architechture of the package being
    compressed.
    The old LZMA_Alone format supports only the LZMA filter.
 Q:  I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma?
@ -189,27 +100,23 @@ A:  BCJ filter is called "x86" in liblzma. BCJ2 is not included,
    because it requires using more than one encoded output stream.
-Q:  Can I use LZMA in proprietary, non-free applications?
+Q:  How do I build a program that needs liblzmadec (lzmadec.h)?
-A:  Yes. See the file COPYING for details.
+A:  liblzmadec is part of LZMA Utils. XZ Utils has liblzma, but no
    liblzmadec. The code using liblzmadec should be ported to use
    liblzma instead. If you cannot or don't want to do that, download
    LZMA Utils from <http://tukaani.org/lzma/>.
-Q:  I would like to help. What can I do?
+Q:  The default build of liblzma is too big. How can I make it smaller?
-A:  See the TODO file. Please contact Lasse Collin before starting to do
+A:  Give --enable-small to the configure script. Use also appropriate
-    anything, because it is possible that someone else is already working
+    --enable or --disable options to include only those filter encoders
-    on the same thing.
+    and decoders and integrity checks that you actually need. Use
    CFLAGS=-Os (with GCC) or equivalent to tell your compiler to optimize
    for size. See INSTALL for information about configure options.
-
+    If the result is still too big, take a look at XZ Embedded. It is
-Q:  How can I contact the authors?
+    a separate project, which provides a limited but signinificantly
-
+    smaller XZ decoder implementation than XZ Utils.
 A:  Lasse Collin is the maintainer of LZMA Utils. You can contact him
    either via IRC (Larhzu on #tukaani at Freenode or IRCnet). Email
    should work too, <lasse.collin@tukaani.org>.
    Igor Pavlov is the father of LZMA. He is the author of 7-Zip
    and LZMA SDK. <http://7-zip.org/>
    NOTE: Please don't bother Igor Pavlov with questions specific
    to LZMA Utils.