Introduction
What if you could compress data four times faster while keeping the same compression ratio you get today? On a multi-core machine the pbzip2 command-line utility can do this, as it lets you select the number of CPUs and the amount of RAM used during compression.
Regular tar and bzip2 compression
We all know the regular command for compressing a directory with tar and bzip2. The command below archives and compresses our sandbox directory FOOBAR. We also prefix it with time to measure exactly how long it takes to produce the compressed file FOOBAR1.tar.bz2 from the 242MB FOOBAR directory:
# time tar cjf FOOBAR1.tar.bz2 FOOBAR/
real 0m20.030s
user 0m19.828s
sys 0m0.304s
From the above time output we can see that it took about 20 seconds to create the following compressed file:
# ls -lh FOOBAR1.tar.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:25 FOOBAR1.tar.bz2
Faster compression with pbzip2
By default, pbzip2 uses all available CPUs and 100MB of RAM to perform compression. The following Linux command compresses the directory using pbzip2. Once again we use time to measure the execution time:
# time tar -c FOOBAR | pbzip2 -c > FOOBAR2.tar.bz2
real 0m4.777s
user 0m35.588s
sys 0m1.060s
Alternatively, the command below yields the same result:
# time tar cf FOOBAR3.tar.bz2 --use-compress-prog=pbzip2 FOOBAR
real 0m4.764s
user 0m35.508s
sys 0m1.136s
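The timings reported so far can be turned into a speedup figure with a little arithmetic. The sketch below is only an illustration: the variable names are made up for this example, and the numbers are copied from the two time outputs shown above (the tar cjf run and the tar | pbzip2 pipeline).

```shell
# Timings copied from the `time` output in this article (seconds).
bzip2_real=20.030    # tar cjf, regular bzip2
pbzip2_real=4.777    # tar | pbzip2, all CPUs
pbzip2_user=35.588   # CPU time summed over the whole pipeline

# Wall-clock speedup: how many times sooner the parallel run finished.
speedup=$(LC_ALL=C awk -v a="$bzip2_real" -v b="$pbzip2_real" \
  'BEGIN { printf "%.2f", a / b }')

# user/real approximates how many CPUs were kept busy on average.
busy_cpus=$(LC_ALL=C awk -v u="$pbzip2_user" -v r="$pbzip2_real" \
  'BEGIN { printf "%.2f", u / r }')

echo "speedup: ${speedup}x, average busy CPUs: ${busy_cpus}"
# prints: speedup: 4.19x, average busy CPUs: 7.45
```

Note that total CPU time rose from about 19.8 to about 35.6 seconds, so the parallel run spends more CPU work in exchange for a shorter wall-clock time.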
Reserve Resources
As already mentioned, pbzip2 allows the user to select the number of CPUs and the amount of RAM dedicated to the compression. The example below uses only a single CPU to perform the requested compression:
# time tar -c FOOBAR | pbzip2 -c -p1 > FOOBAR4.tar.bz2
real 0m20.348s
user 0m19.972s
sys 0m0.648s
To dedicate a specific amount of RAM, use the -m switch. By default pbzip2 uses 100MB. The example below performs compression using 1 CPU and 10MB of RAM:
# time tar -c FOOBAR | pbzip2 -c -p1 -m10 > FOOBAR5.tar.bz2
real 0m20.362s
user 0m19.932s
sys 0m0.704s
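The -p and -m switches can also be filled in from the machine itself instead of being hard-coded. The following is a minimal sketch under a few assumptions: the policy of leaving one CPU free, the variable names, and the output name FOOBAR.tar.bz2 are all hypothetical choices made for this illustration. It only prints the resulting command so that it can be reviewed before being run.

```shell
# Count online CPUs; nproc is part of GNU coreutils, getconf is a fallback.
total_cpus=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Assumed policy: leave one CPU free for other work, never below one thread.
threads=$(( total_cpus - 1 ))
if [ "$threads" -lt 1 ]; then
  threads=1
fi

# Memory limit in MB for -m (100 matches the default mentioned above).
mem_mb=100

# Build and print the command instead of executing it.
cmd="tar -c FOOBAR | pbzip2 -c -p${threads} -m${mem_mb} > FOOBAR.tar.bz2"
echo "$cmd"
```

On a machine with 8 CPUs this would print a command containing -p7 -m100.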
Compression Level
As is usually the case with compression utilities, pbzip2 also lets you set the compression level. The range is from 1 to 9; the default is 9, which also gives the best compression ratio. To change the compression level to, for example, 1, use -1:
# time tar -c FOOBAR | pbzip2 -c -1 > FOOBAR6.tar.bz2
real 0m3.786s
user 0m28.612s
sys 0m0.364s
Using the above example you will end up with a faster execution time but a larger file:
# ls -lh *.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:02 FOOBAR1.tar.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:41 FOOBAR2.tar.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:43 FOOBAR3.tar.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:48 FOOBAR4.tar.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:54 FOOBAR5.tar.bz2
-rw-r--r-- 1 root root 67M Mar 10 21:00 FOOBAR6.tar.bz2
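To put the level 1 versus level 9 trade-off into numbers, the figures above can be compared directly. This is a rough sketch only: the variable names are invented for the example, and because ls -lh rounds sizes to whole megabytes, the size percentage is approximate.

```shell
# Figures copied from this article: pbzip2 at level 9 (default) vs level 1.
size_l9=54      # archive size in MB at level 9, from ls -lh
size_l1=67      # archive size in MB at level 1, from ls -lh
time_l9=4.777   # "real" seconds at level 9
time_l1=3.786   # "real" seconds at level 1

# How much larger the level 1 archive is, in percent.
size_growth=$(LC_ALL=C awk -v a="$size_l1" -v b="$size_l9" \
  'BEGIN { printf "%.1f", (a / b - 1) * 100 }')

# How much wall-clock time level 1 saved, in percent.
time_saved=$(LC_ALL=C awk -v a="$time_l1" -v b="$time_l9" \
  'BEGIN { printf "%.1f", (1 - a / b) * 100 }')

echo "level 1: ${size_growth}% larger, ${time_saved}% less time than level 9"
# prints: level 1: 24.1% larger, 20.7% less time than level 9
```

For this particular data set, the lowest level saved roughly a fifth of the time at the cost of an archive about a quarter larger; other data may behave differently.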
Decompression
Decompressing with pbzip2 does not produce significant, if any, time savings in comparison with bzip2. The following Linux commands can be used to decompress bzip2-compressed data using the pbzip2 utility:
# tar xf FOOBAR1.tar.bz2 --use-compress-prog=pbzip2
OR
# pbzip2 -dc FOOBAR1.tar.bz2 | tar x