A classic Unix archiving format for bundling multiple files into a single container
The TAR (Tape Archive) format is one of the oldest and most established archiving formats in computing, originating on Unix systems in the late 1970s (the tar utility first shipped with Version 7 Unix in 1979). Initially designed for backing up data to magnetic tape drives (hence the name), TAR has evolved into a standard method for bundling multiple files and directories into a single file while preserving file system information like permissions, ownership, and directory structure.
Unlike many other archiving formats, TAR itself does not provide compression — it simply combines multiple files into a single container. However, TAR files are commonly compressed after creation using utilities like gzip, bzip2, or xz, resulting in files with extensions like .tar.gz, .tar.bz2, or .tar.xz (sometimes abbreviated as .tgz, .tbz, etc.).
Despite its age, TAR remains essential in modern computing, particularly in Unix-like environments such as Linux, macOS, and BSD systems where it's the standard format for software distribution, backups, and system maintenance. Its simplicity, reliability, and preservation of Unix file attributes make it indispensable for system administrators, developers, and power users.
The TAR format consists of a sequence of file entries, each with a header containing metadata (permissions, owner, size, timestamp, etc.) followed by the file's content. Modern implementations support several format variations including the original (legacy) format, GNU extensions, and the PAX/POSIX format which offers better support for large files, long filenames, and extended attributes.
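To make the header-then-content layout concrete, here is a minimal sketch that builds a one-file archive in memory with Python's standard tarfile module and reads a few fields back out of the raw bytes. The file name hello.txt and the helper field() are made up for illustration; the byte offsets are those of the ustar header layout, and other variants (GNU, PAX) add extra records on top of it.

```python
import io
import tarfile

# Build a tiny one-file archive in memory so the example needs no files on disk.
payload = b"hello, tar\n"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    info = tarfile.TarInfo(name="hello.txt")  # hypothetical file name
    info.size = len(payload)
    info.mode = 0o644
    tar.addfile(info, io.BytesIO(payload))

raw = buf.getvalue()
header = raw[:512]  # every entry starts with one 512-byte header block


def field(start, length):
    """Return a NUL-terminated text field from the header block."""
    return header[start:start + length].split(b"\0", 1)[0].decode("ascii").strip()


name = field(0, 100)            # file name
mode = int(field(100, 8), 8)    # permission bits, stored as octal text
size = int(field(124, 12), 8)   # content length in bytes, stored as octal text
magic = field(257, 6)           # "ustar" identifies the POSIX header layout
content = raw[512:512 + size]   # file data follows the header, padded to 512 bytes

print(name, oct(mode), size, magic, content)
```

Numeric fields are stored as octal text rather than binary integers, which is part of why the classic format has limits on file size and name length that the GNU and PAX variants work around.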
TAR archives, usually compressed with gzip or xz, are the standard format for distributing software packages in the Unix/Linux world. Source code tarballs (.tar.gz or .tar.xz files) are the primary distribution method for open-source software. Many Linux package managers use TAR as the underlying format for their package files, often with custom headers or extensions.
TAR's ability to preserve file system attributes makes it ideal for system backups where maintaining permissions, ownership, and directory structures is critical. System administrators often use TAR to create backups of important system directories or user data, preserving all the metadata needed for a complete restoration.
When transferring complex directory structures between systems, TAR provides a convenient way to bundle everything into a single file while preserving all important attributes. This makes it useful for migrating websites, applications, or user environments between different servers or systems.
For projects containing numerous small files that need to be distributed as a unit, TAR offers a simple way to bundle everything together. This is common in web development (bundling website assets), academic research (datasets with many files), and documentation projects (collections of related documents and images).
Modern containerization technologies like Docker make extensive use of the TAR format for packaging and distributing container images. TAR's ability to preserve file attributes and directory structures makes it perfect for capturing the file system state needed for containers to function correctly across different environments.
TAR has varying levels of native support across different operating systems:
Various applications can work with TAR files:
The TAR ecosystem includes several format variants and compression combinations:
Feature | TAR | ZIP | RAR | 7Z | CPIO |
---|---|---|---|---|---|
Built-in Compression | | | | | |
Metadata Preservation | | | | | |
Cross-platform Support | | | | | |
Random Access to Files | | | | | |
Compression Efficiency | | | | | |
Open Standard | | | | | |
Native OS Support | | | | | |
TAR excels in metadata preservation and adhering to open standards, making it ideal for Unix/Linux environments and system backups. ZIP offers better cross-platform compatibility and random access to files, while 7Z provides superior compression efficiency. RAR has good compression but is a proprietary format, and CPIO (another Unix archive format) offers similar metadata preservation to TAR but with lower general adoption.
When converting from other archive formats to TAR, first extract the contents of the original archive, then create a new TAR archive from the extracted files. This ensures that all files are included with their directory structure. On Unix/Linux systems, a command like tar -cf output.tar extracted_folder/ creates the archive. Permissions and ownership are recorded in the archive by default; the -p option matters at extraction time, as in tar -xpf output.tar, where it restores the stored permission bits exactly. To add compression, use -z for gzip or -j for bzip2.
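The extract-then-repack approach can be sketched in Python with the standard zipfile and tarfile modules. The function name zip_to_tar and the sample file names are hypothetical; note also that a ZIP source usually carries no Unix ownership, so the resulting TAR records the attributes of the freshly extracted files.

```python
import os
import tarfile
import tempfile
import zipfile


def zip_to_tar(zip_path, tar_path):
    """Extract zip_path into a temporary directory, then repack it as an uncompressed tar."""
    with tempfile.TemporaryDirectory() as staging:
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(staging)
        with tarfile.open(tar_path, "w") as tar:
            # Add each top-level entry under its own name so the staging path is not stored.
            for entry in sorted(os.listdir(staging)):
                tar.add(os.path.join(staging, entry), arcname=entry)


# Demo with a small ZIP created on the fly.
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "input.zip")
tar_path = os.path.join(workdir, "output.tar")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("docs/readme.txt", "read me")
    zf.writestr("notes.txt", "notes")

zip_to_tar(zip_path, tar_path)
with tarfile.open(tar_path) as tar:
    converted_names = sorted(tar.getnames())
print(converted_names)
```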
To bundle individual files into a TAR archive, simply list them as arguments to the tar command, as in tar -cf archive.tar file1 file2 directory/ for two files and a directory. For large collections, consider using wildcards or find commands to locate all relevant files. On Windows without command-line tar, use archive managers like 7-Zip to create TAR archives from selected files and folders.
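For scripted use, the same bundling step might look like the following sketch with Python's tarfile module. The helper name bundle and the demo file names are illustrative only; storing names relative to a base directory is a design choice that keeps absolute paths out of the archive.

```python
import os
import tarfile
import tempfile


def bundle(tar_path, paths, base_dir):
    """Write the given files and directories into an uncompressed tar archive."""
    with tarfile.open(tar_path, "w") as tar:
        for path in paths:
            # Directories are added recursively; names are stored relative to base_dir.
            tar.add(path, arcname=os.path.relpath(path, base_dir))


# Demo tree: two files and one directory containing a file.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "directory"))
for rel in ("file1", "file2", os.path.join("directory", "inner.txt")):
    with open(os.path.join(root, rel), "w") as fh:
        fh.write(rel)

archive = os.path.join(root, "archive.tar")
bundle(archive, [os.path.join(root, n) for n in ("file1", "file2", "directory")], root)

with tarfile.open(archive) as tar:
    bundled_names = tar.getnames()
print(bundled_names)
```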
Since TAR itself doesn't provide compression, you'll usually want to compress the archive as it is created. The most common method is gzip, as in tar -czf archive.tar.gz files/ for a one-step compressed archive. For better compression at the cost of speed, use bzip2 with tar -cjf archive.tar.bz2 files/ instead. For what is typically the highest compression ratio of the three, use xz with tar -cJf archive.tar.xz files/ and expect slower compression. Most modern tar implementations support these compression methods directly, so no separate compression step is needed.
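As a rough illustration of the trade-off, this sketch writes the same payload with each compression mode supported by Python's tarfile module and compares the resulting sizes. The sample text is deliberately repetitive, so the ratios are far better than real data would give; the helper name archive_size is made up for the example.

```python
import io
import tarfile

# Highly repetitive sample data, so every compressor has something to shrink.
payload = b"the quick brown fox jumps over the lazy dog\n" * 5000


def archive_size(mode):
    """Return the size in bytes of an in-memory archive written with the given mode."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        info = tarfile.TarInfo("sample.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return len(buf.getvalue())


# "w" is plain tar; the suffixes correspond to the -z, -j and -J options.
modes = [("tar", "w"), ("tar.gz", "w:gz"), ("tar.bz2", "w:bz2"), ("tar.xz", "w:xz")]
sizes = {label: archive_size(mode) for label, mode in modes}

for label, size in sizes.items():
    print(f"{label:8} {size} bytes")
```

The uncompressed archive is slightly larger than the payload itself because of the header and padding blocks.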
To convert a TAR file to another archive format, first extract the TAR file with tar -xf archive.tar on the command line. If the TAR file is compressed (e.g., .tar.gz), include the appropriate decompression option, such as -z for gzip. After extraction, create the new archive format from the extracted files using the appropriate tool (e.g., zip, 7z, rar). Some archive managers like 7-Zip can perform this conversion directly without an intermediate extraction step.
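A conversion that skips the intermediate files on disk can be sketched by streaming entries from one container to the other. This is only an illustration using Python's tarfile and zipfile modules, not how any particular archive manager works; the function name tar_to_zip is hypothetical, and the sketch copies regular files only, dropping permissions, ownership, and links.

```python
import io
import tarfile
import zipfile


def tar_to_zip(tar_bytes):
    """Copy regular files from a (possibly compressed) tar into a new zip, all in memory."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        with zipfile.ZipFile(out, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            for member in tar:
                if member.isfile():  # skip directories, links, and devices
                    zf.writestr(member.name, tar.extractfile(member).read())
    return out.getvalue()


# Demo input: a small gzip-compressed tar built in memory.
source = io.BytesIO()
with tarfile.open(fileobj=source, mode="w:gz") as tar:
    for name, text in [("a.txt", b"alpha"), ("sub/b.txt", b"beta")]:
        info = tarfile.TarInfo(name)
        info.size = len(text)
        tar.addfile(info, io.BytesIO(text))

zip_bytes = tar_to_zip(source.getvalue())
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
    zip_names = zf.namelist()
    b_text = zf.read("sub/b.txt")
print(zip_names, b_text)
```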
When working with compressed TAR files (.tar.gz, .tar.bz2, etc.), most modern tar implementations can automatically detect and handle the compression method. Use tar -xf archive.tar.gz to extract regardless of compression type. If you encounter older tar versions that don't support automatic detection, add the appropriate option: -z for gzip, -j for bzip2, or -J for xz compression.
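Python's tarfile module offers a comparable choice between automatic detection and an explicit compression type, which the sketch below demonstrates on in-memory archives. The helper names make_archive and list_names are illustrative; the read mode "r:*" asks the library to detect the compression, while "r:gz" corresponds to naming it explicitly.

```python
import io
import tarfile


def make_archive(mode):
    """Build a one-file archive in memory using the given write mode."""
    data = b"same content in every archive\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        info = tarfile.TarInfo("note.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_names(data, mode="r:*"):
    """Open archive bytes and return member names; "r:*" detects the compression."""
    with tarfile.open(fileobj=io.BytesIO(data), mode=mode) as tar:
        return tar.getnames()


# Automatic detection works the same for plain, gzip, bzip2, and xz archives.
detected = {m: list_names(make_archive(m)) for m in ("w", "w:gz", "w:bz2", "w:xz")}

# Naming the compression explicitly, the counterpart of passing -z.
explicit = list_names(make_archive("w:gz"), mode="r:gz")
print(detected, explicit)
```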
When extracting TAR archives, especially for system backups or software installation, pay attention to permission preservation. Use tar -xpf archive.tar to restore the stored permission bits exactly rather than filtering them through the current umask; modification times are restored by default. Restoring the original ownership generally requires running as root, so a non-root user ends up owning the extracted files and may be unable to set certain permissions. For cross-platform transfers, permissions might not translate exactly between different operating systems.
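What can be restored at extraction depends on what the archive recorded. The following sketch, which assumes a POSIX system where chmod is honored, archives a script with mode 750 and then inspects the metadata stored for it. The file names run.sh and backup.tar are hypothetical.

```python
import os
import stat
import tarfile
import tempfile

# Create a sample script with a specific permission mode (assumes a POSIX file system).
src = tempfile.mkdtemp()
script = os.path.join(src, "run.sh")
with open(script, "w") as fh:
    fh.write("#!/bin/sh\necho ok\n")
os.chmod(script, 0o750)

archive = os.path.join(src, "backup.tar")
with tarfile.open(archive, "w") as tar:
    tar.add(script, arcname="run.sh")

# Read back what the archive recorded for the entry.
with tarfile.open(archive) as tar:
    member = tar.getmember("run.sh")
    stored_mode = stat.S_IMODE(member.mode)   # permission bits
    stored_uid, stored_gid = member.uid, member.gid
    stored_mtime = member.mtime               # modification time in seconds since the epoch

print(oct(stored_mode), stored_uid, stored_gid, stored_mtime)
```

The numeric owner IDs are stored alongside user and group names, which is why ownership may map to different accounts when an archive is restored on another machine.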
Use verbose mode (-v) when creating or extracting to see which files are being processed.
Use --exclude to avoid including unnecessary files like temporary files or version control directories.
To list the contents of a TAR archive without extracting it, use tar -tf archive.tar on the command line. This works for compressed TAR files as well; the command automatically detects the compression format in most modern TAR implementations. To see more detailed information including file sizes, permissions, and timestamps, add the -v (verbose) option, as in tar -tvf archive.tar for the same archive. On Windows without a command-line tar utility, you can use archive managers like 7-Zip to browse TAR file contents through their graphical interface.
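The exclusion and listing steps can also be sketched together in Python. Here the filter argument of TarFile.add stands in for an exclude rule, and iterating over the archive stands in for a verbose listing; the rule set (a .git directory and .tmp files) and the project layout are assumptions made for the example.

```python
import os
import tarfile
import tempfile

SKIP_NAMES = {".git"}        # hypothetical exclusion rules for this sketch
SKIP_SUFFIXES = (".tmp",)


def skip_unwanted(info):
    """Filter for TarFile.add: returning None leaves the entry (and its children) out."""
    base = os.path.basename(info.name)
    if base in SKIP_NAMES or base.endswith(SKIP_SUFFIXES):
        return None
    return info


# Demo project with one wanted file, one temporary file, and a .git directory.
root = tempfile.mkdtemp()
project = os.path.join(root, "project")
os.makedirs(os.path.join(project, ".git"))
samples = [
    ("main.py", "print('hi')\n"),
    ("cache.tmp", "scratch"),
    (os.path.join(".git", "config"), "[core]\n"),
]
for rel, text in samples:
    with open(os.path.join(project, rel), "w") as fh:
        fh.write(text)

archive = os.path.join(root, "project.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    tar.add(project, arcname="project", filter=skip_unwanted)

# List the contents without extracting; compression is detected automatically.
with tarfile.open(archive) as tar:
    listing = [(m.name, m.size, oct(m.mode & 0o777)) for m in tar]

for name, size, mode in listing:
    print(mode, size, name)
names = [name for name, _, _ in listing]
```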