Introduction
What if you could compress data four times faster while keeping the same compression ratio? The pbzip2 command-line utility can do this: it lets you choose how many CPUs and how much RAM to use during the compression process.
Regular tar and bzip2 compression
We all know the regular command for compressing a directory with tar and bzip2. The command below will tar and compress our sandbox directory FOOBAR. We also prefix the command with time to measure exactly how long it takes to produce the compressed file FOOBAR1.tar.bz2 from the 242MB FOOBAR directory:
# time tar cjf FOOBAR1.tar.bz2 FOOBAR/
real 0m20.030s
user 0m19.828s
sys 0m0.304s
From the time output above we can see that it took about 20 seconds to create the following compressed file:
# ls -lh FOOBAR1.tar.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:25 FOOBAR1.tar.bz2
Faster compression with pbzip2
By default, pbzip2 uses all available CPUs and 100MB of RAM to perform compression. The following Linux command compresses the directory using pbzip2. Once again we use time to measure the execution time:
# time tar -c FOOBAR | pbzip2 -c > FOOBAR2.tar.bz2
real 0m4.777s
user 0m35.588s
sys 0m1.060s
Alternatively, the command below yields the same result:
# time tar cf FOOBAR3.tar.bz2 --use-compress-prog=pbzip2 FOOBAR
real 0m4.764s
user 0m35.508s
sys 0m1.136s
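To put the timings side by side, the short sketch below divides the real (wall-clock) and user (CPU) times reported above for the bzip2 and pbzip2 runs. The ratio helper is a hypothetical name introduced only for this illustration, and the numbers are simply copied from the time output:

```shell
# Wall-clock ("real") and CPU ("user") seconds copied from the time output above.
bzip2_real=20.030
pbzip2_real=4.777
bzip2_user=19.828
pbzip2_user=35.588

# Hypothetical helper: print $1 divided by $2, rounded to two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Wall-clock speedup of pbzip2 over bzip2.
ratio "$bzip2_real" "$pbzip2_real"

# Total CPU time used by pbzip2 relative to bzip2.
ratio "$pbzip2_user" "$bzip2_user"
```

On these numbers the wall-clock speedup is roughly 4.19, while pbzip2 spends about 1.79 times as much CPU time in total. The run finishes sooner because the work is spread across several CPUs, not because less work is done.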
Reserve Resources
As already mentioned, pbzip2 lets the user select the number of CPUs and the amount of RAM dedicated to the compression. The example below uses only a single CPU to perform the requested compression:
# time tar -c FOOBAR | pbzip2 -c -p1 > FOOBAR4.tar.bz2
real 0m20.348s
user 0m19.972s
sys 0m0.648s
To set the amount of RAM dedicated to the compression, use the -m switch. By default pbzip2 uses 100MB. The example below performs compression using 1 CPU and 10MB of RAM:
# time tar -c FOOBAR | pbzip2 -c -p1 -m10 > FOOBAR5.tar.bz2
real 0m20.362s
user 0m19.932s
sys 0m0.704s
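Instead of hard-coding the CPU count, a script could derive it from the machine. The sketch below is one possible approach, assuming the nproc utility from GNU coreutils is available (it falls back to 1 otherwise). The name half_cpus is a hypothetical helper, and using half of the CPUs is an arbitrary choice made for illustration:

```shell
# Hypothetical helper: print half of the available CPUs, but never fewer than 1.
half_cpus() {
    # nproc is assumed to come from GNU coreutils; fall back to 1 if it is missing.
    total=$(nproc 2>/dev/null || echo 1)
    half=$((total / 2))
    if [ "$half" -lt 1 ]; then
        half=1
    fi
    echo "$half"
}

# Example usage (not run here), combining the -p and -m switches shown above:
# tar -c FOOBAR | pbzip2 -c -p"$(half_cpus)" -m10 > FOOBAR.tar.bz2
```

Leaving some CPUs free this way can be useful when the compression runs on a machine that is also serving other work.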
Compression Level
As is usually the case with compression utilities, pbzip2 also allows the compression level to be set. The range is from 1 to 9; the default is 9, which also gives the best compression ratio. To change the compression level to, e.g., 1, use -1:
# time tar -c FOOBAR | pbzip2 -c -1 > FOOBAR6.tar.bz2
real 0m3.786s
user 0m28.612s
sys 0m0.364s
Using the above example you will end up with a faster execution time but a larger file:
# ls -lh *.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:02 FOOBAR1.tar.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:41 FOOBAR2.tar.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:43 FOOBAR3.tar.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:48 FOOBAR4.tar.bz2
-rw-r--r-- 1 root root 54M Mar 10 20:54 FOOBAR5.tar.bz2
-rw-r--r-- 1 root root 67M Mar 10 21:00 FOOBAR6.tar.bz2
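The listing above can be turned into rough compression ratios. The following sketch assumes the rounded sizes printed by ls -lh and the 242MB original size quoted earlier, so the results are approximate. The name percent_of_original is a hypothetical helper:

```shell
# Sizes in MB as reported in this article (rounded by ls -lh).
original=242
level9=54
level1=67

# Hypothetical helper: compressed size $1 as a percentage of original size $2.
percent_of_original() {
    awk -v c="$1" -v o="$2" 'BEGIN { printf "%.1f\n", 100 * c / o }'
}

# Default level 9 archive.
percent_of_original "$level9" "$original"

# Level 1 archive (FOOBAR6.tar.bz2).
percent_of_original "$level1" "$original"
```

With these rounded figures, level 9 leaves about 22.3% of the original size and level 1 about 27.7%.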
Decompression
Decompressing with pbzip2 does not produce a significant time saving, if any, in comparison with bzip2. The following Linux commands can be used to decompress bzip2-compressed data using the pbzip2 utility:
# tar xf FOOBAR1.tar.bz2 --use-compress-prog=pbzip2
OR
# pbzip2 -dc FOOBAR1.tar.bz2 | tar x
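If a script has to run on machines where pbzip2 may not be installed, one option is to fall back to bzip2, since both tools read the same .bz2 format. The following is a minimal sketch, in which pick_decompressor and extract_archive are hypothetical helper names:

```shell
# Hypothetical helper: prefer pbzip2 when it is installed, otherwise use bzip2.
pick_decompressor() {
    if command -v pbzip2 >/dev/null 2>&1; then
        echo pbzip2
    else
        echo bzip2
    fi
}

# Hypothetical helper: decompress archive $1 to stdout and unpack it with tar.
extract_archive() {
    "$(pick_decompressor)" -dc "$1" | tar x
}

# Example usage (not run here):
# extract_archive FOOBAR1.tar.bz2
```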