Frequently Asked Question

3.4 Data Compression
Last Updated 6 months ago

Data compression is a crucial aspect of managing file storage and transfer in Linux. By reducing file sizes, compression saves storage space and speeds up file transfers. Two of the most popular tools for packaging files in Linux are TAR and ZIP. Both bundle multiple files into a single archive, but they work differently: TAR only archives and leaves compression to a separate tool such as gzip or bzip2, while ZIP archives and compresses in one step. This article covers how each tool works and the commands most commonly used with it.

1. TAR (Tape Archive)

TAR stands for Tape Archive. It combines multiple files and directories into a single archive file, which makes them easier to transfer or back up. TAR does not compress by default; compression is added by running the archive through an additional tool such as gzip or bzip2.

Features of TAR:

  1. Archiving: TAR combines multiple files into a single archive.
  2. Compression (Optional): TAR doesn't compress files by itself. To compress, it relies on tools such as gzip or bzip2, which the tar command can invoke through options like z and j.
  3. Preservation of File Structure: TAR maintains the directory structure, file permissions, ownership, and timestamps.
  4. Backups: Because the directory structure and metadata are restored on extraction, TAR is commonly used for creating backups.
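
The preservation of structure and permissions can be checked with a small round trip. The sketch below is only an illustration: the project directory, run.sh, and notes.txt are made-up names, and everything happens in a throwaway directory created with mktemp.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical sample tree: one executable script, one plain file.
mkdir -p project/docs
printf '#!/bin/sh\necho hello\n' > project/run.sh
printf 'some notes\n' > project/docs/notes.txt
chmod 755 project/run.sh

# Archive the tree, then unpack it into a different directory.
tar -cf project.tar project
mkdir restored
tar -xf project.tar -C restored

# The nested path and the executable bit should both survive the round trip.
ls -l restored/project/run.sh restored/project/docs/notes.txt
```

In the ls output, run.sh should still carry its execute permission and notes.txt should still sit under docs/.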

Creating a TAR Archive:

tar -cvf archive.tar /path/to/files

  • c: Create a new archive
  • v: Verbose mode (lists each file as it is processed)
  • f: Use the archive file named immediately after the options (here, archive.tar)

Extracting a TAR Archive:

tar -xvf archive.tar

  • x: Extract files from the archive
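
Before unpacking an archive, it can help to list its contents and to extract into a dedicated directory rather than the current one. A minimal sketch, assuming GNU tar or bsdtar; demo.tar, src, unpacked, and single are placeholder names:

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Build a small placeholder archive to experiment with.
mkdir src
printf 'alpha\n' > src/a.txt
printf 'beta\n' > src/b.txt
tar -cf demo.tar src

# t: list the archive's contents without extracting anything.
listing=$(tar -tf demo.tar)
printf '%s\n' "$listing"

# C: change to the given directory before extracting (it must already exist).
mkdir unpacked
tar -xf demo.tar -C unpacked

# Extract a single member by naming it exactly as it appears in the listing.
mkdir single
tar -xf demo.tar -C single src/b.txt
```

After this runs, unpacked/ holds the whole src tree, while single/ holds only src/b.txt.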

Creating a Compressed TAR Archive (using Gzip):

tar -czvf archive.tar.gz /path/to/files

  • z: Compress using gzip

Creating a Compressed TAR Archive (using Bzip2):

tar -cjvf archive.tar.bz2 /path/to/files

  • j: Compress using bzip2

Extracting a Compressed Archive:

tar -xzvf archive.tar.gz    # for Gzip
tar -xjvf archive.tar.bz2   # for Bzip2
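
To see what the z option buys, the sketch below archives the same data with and without gzip and prints both sizes. The log file is generated placeholder data chosen to be very repetitive, so the saving will be larger than on typical real files. The last step relies on current GNU tar and bsdtar detecting the compression format on their own when reading from a file, so plain -xf is enough there.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Generate highly repetitive (and therefore very compressible) sample data.
mkdir logs
i=0
while [ "$i" -lt 5000 ]; do
    echo "2024-01-01 INFO request handled successfully"
    i=$((i + 1))
done > logs/app.log

tar -cf logs.tar logs        # archive only
tar -czf logs.tar.gz logs    # archive, then compress with gzip

plain_size=$(wc -c < logs.tar)
gzip_size=$(wc -c < logs.tar.gz)
echo "logs.tar:    $plain_size bytes"
echo "logs.tar.gz: $gzip_size bytes"

# When reading, tar recognises the gzip format itself, so z can be omitted.
mkdir restored
tar -xf logs.tar.gz -C restored
cmp logs/app.log restored/logs/app.log && echo "contents identical"
```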

2. ZIP

ZIP is another popular tool, but it works differently from TAR: it archives multiple files and compresses them in a single step, whereas TAR leaves compression to a separate tool. ZIP is widely used in cross-platform environments because its archives can be easily extracted on Windows, macOS, and Linux.

Features of ZIP:

  1. Combined Archiving and Compression: ZIP archives and compresses files at the same time.
  2. Cross-Platform: ZIP archives are widely supported across different operating systems, including Windows and macOS.
  3. Individual File Compression: ZIP compresses each file individually, making it easy to extract specific files without decompressing the entire archive.
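
Because each file is compressed on its own, a single member can be pulled out by name. The following is a rough sketch, assuming the Info-ZIP zip and unzip commands are installed (they are separate packages and may be missing on minimal systems); site.zip and the files inside it are invented for the example.

```shell
#!/bin/sh
set -eu

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    trap 'rm -rf "$workdir"' EXIT
    cd "$workdir"

    # Hypothetical sample tree with two files.
    mkdir -p site/css
    printf '<h1>hi</h1>\n' > site/index.html
    printf 'body { margin: 0; }\n' > site/css/main.css
    zip -r -q site.zip site

    # Name the member exactly as stored; -d chooses the destination directory.
    unzip -q site.zip site/css/main.css -d extracted

    # Only the requested file should have been written.
    find extracted -type f
else
    echo "zip/unzip not installed; skipping example" >&2
fi
```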

Creating a ZIP Archive:

zip -r archive.zip /path/to/files

  • r: Recursively adds the directory's files and subdirectories to the ZIP archive.

Extracting a ZIP Archive:

unzip archive.zip

Adding Files to an Existing ZIP Archive:

zip -u archive.zip newfile.txt

  • u: Updates the ZIP archive by adding new files and replacing entries whose source file has been modified more recently than the archived copy.
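
The update behaviour can be tried out on a scratch archive. This is a small sketch under the same assumption that Info-ZIP zip and unzip are available; notes.zip, a.txt, and b.txt are placeholder names.

```shell
#!/bin/sh
set -eu

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    trap 'rm -rf "$workdir"' EXIT
    cd "$workdir"

    # Start with an archive that holds a single file.
    printf 'first\n' > a.txt
    zip -q notes.zip a.txt

    # A new file appears later; -u adds it to the existing archive.
    printf 'second\n' > b.txt
    zip -u -q notes.zip a.txt b.txt

    # Confirm that both entries are now present.
    unzip -l notes.zip
else
    echo "zip/unzip not installed; skipping example" >&2
fi
```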


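Compressing Files with Maximum Compression Level:

zip -9 -r archive.zip /path/to/files

  • 9: Specifies the highest compression level, which is also the slowest.
  • r: Still required when the path is a directory; without it, the directory's contents are not added.

Listing the Contents of a ZIP Archive:

unzip -l archive.zip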
