This example shows how to join columns from multiple files into a single comma-separated values (CSV) file. To read columns from multiple files side by side, we can use the paste command. Consider the following example. In our sandbox directory we have three files, each containing a single column of data:
$ ls
f1 f2 f3
$ cat f1
az
dr
qw
rt
er
$ cat f2
iu
dr
gg
hh
jj
qq
ee
ui
$ cat f3
qp
df
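To follow along, the three sample files can be recreated with printf. This is only a sketch: the scratch directory created by mktemp is an assumption of this example and not part of the original setup.

```shell
# Work in a throwaway directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# printf '%s\n' prints each argument on its own line,
# which gives one value per row in a single-column file.
printf '%s\n' az dr qw rt er > f1
printf '%s\n' iu dr gg hh jj qq ee ui > f2
printf '%s\n' qp df > f3

# Show how many rows each file has.
wc -l f1 f2 f3
```

The files deliberately have different lengths, which is why some cells are empty once the columns are joined.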
Next, we can join them together using paste:
$ paste f1 f2 f3
az      iu      qp
dr      dr      df
qw      gg
rt      hh
er      jj
        qq
        ee
        ui
By default, the paste command uses a TAB character to separate the columns. This behavior can be overridden with the -d option. For example, to create a comma-delimited file instead of a tab-delimited one:
$ paste -d , f1 f2 f3
az,iu,qp
dr,dr,df
qw,gg,
rt,hh,
er,jj,
,qq,
,ee,
,ui,
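The argument to -d is a list of delimiters rather than a single character: paste uses them in turn between consecutive columns and starts again from the first one on each output line. The sketch below recreates the sample files in a scratch directory (an assumption of this example) and uses a different delimiter after each column.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' az dr qw rt er > f1
printf '%s\n' iu dr gg hh jj qq ee ui > f2
printf '%s\n' qp df > f3

# ',' is placed between columns 1 and 2, ';' between columns 2 and 3.
mixed=$(paste -d ',;' f1 f2 f3)
printf '%s\n' "$mixed"
```

With a single character such as `-d ,` the same delimiter is reused between every pair of columns, which is what a CSV file needs.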
OK, this was easy. But what about joining selected columns from files that each contain several columns? Consider the following TAB-delimited sandbox files, where each file contains more than one column:
$ ls
f4 f5 f6
$ cat f4
qw      mn      qw
ty      ix      ao
pi      er      sy
$ cat f5
rk      wp
lp      cy
wn      em
$ cat f6
tr      er      wm
ut      vb      mq
rp      el      st
Running paste on all three files joins all of their columns into a single output:
$ paste f4 f5 f6
qw      mn      qw      rk      wp      tr      er      wm
ty      ix      ao      lp      cy      ut      vb      mq
pi      er      sy      wn      em      rp      el      st
Once we have the above output, we can use the cut or awk commands to select only the columns we are interested in. In the next example we join the second and third columns from the f4 file, the first column from the f5 file and the last column from the f6 file, with , as the delimiter:
$ paste f4 f5 f6 | awk 'BEGIN { OFS = "," } { print $2, $3, $4, $8 }'
mn,qw,rk,wm
ix,ao,lp,mq
er,sy,wn,st
Note that awk lets you list the output columns in any order, so this is also a valid command:
$ paste f4 f5 f6 | awk 'BEGIN { OFS = "," } { print $4, $8, $2, $3 }'
rk,wm,mn,qw
lp,mq,ix,ao
wn,st,er,sy
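One caveat with the awk approach: by default awk splits fields on runs of whitespace, so empty cells (such as those in the paste f1 f2 f3 output) shift the field numbers. The following sketch, which recreates f1, f2 and f3 in a scratch directory (an assumption of this example), sets the input separator FS to a tab so that column positions are preserved.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' az dr qw rt er > f1
printf '%s\n' iu dr gg hh jj qq ee ui > f2
printf '%s\n' qp df > f3

# Default splitting: leading tabs are skipped, so on rows where f1 has
# run out, the value from f2 becomes $1 instead of $2.
naive=$(paste f1 f2 f3 | awk 'BEGIN { OFS = "," } { print $2, $3 }')

# Tab as input separator: an empty cell still counts as a field.
fixed=$(paste f1 f2 f3 | awk 'BEGIN { FS = "\t"; OFS = "," } { print $2, $3 }')

printf 'default FS, row 6: %s\n' "$(printf '%s\n' "$naive" | sed -n '6p')"
printf 'tab FS,     row 6: %s\n' "$(printf '%s\n' "$fixed" | sed -n '6p')"
```

For files such as f4, f5 and f6, where every cell is filled, both variants give the same result, so the explicit FS only matters when columns have different lengths or contain blanks.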
Similarly, the cut command in combination with tr can be used to join or extract multiple columns from a comma-separated values (CSV) file or from STDIN:
$ paste f4 f5 f6 | tr '\t' ',' | cut -d , -f2,3,4,8
mn,qw,rk,wm
ix,ao,lp,mq
er,sy,wn,st
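Unlike awk, cut always writes the selected fields in the order they appear in the input, whatever order is given to -f. The sketch below illustrates this with the multi-column sample files, recreated in a scratch directory (an assumption of this example). It also relies on the fact that cut splits on tabs by default, so the tr step can come after cut instead of before it.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'qw\tmn\tqw\nty\tix\tao\npi\ter\tsy\n' > f4
printf 'rk\twp\nlp\tcy\nwn\tem\n' > f5
printf 'tr\ter\twm\nut\tvb\tmq\nrp\tel\tst\n' > f6

# Select on the tab-delimited stream first, then convert tabs to commas.
in_order=$(paste f4 f5 f6 | cut -f 2,3,4,8 | tr '\t' ',')

# Listing the fields in a different order does not reorder the output.
reordered=$(paste f4 f5 f6 | cut -f 4,8,2,3 | tr '\t' ',')

printf '%s\n' "$in_order"
[ "$in_order" = "$reordered" ] && echo "cut kept the input order"
```

When the columns have to appear in a different order, awk is the better fit.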
Finally, to save the new CSV output, use shell redirection to write it to a file. For example, to create a new file called mydata.csv:
$ paste f4 f5 f6 | tr '\t' ',' | cut -d , -f2,3,4,8 > mydata.csv
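Keep in mind that > replaces the contents of an existing mydata.csv, while >> would append to it. If the CSV file needs a header row, one option is to group the commands so that a single redirection captures both the header and the data. This is a sketch only: the column names are hypothetical placeholders and the sample files are recreated in a scratch directory.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'qw\tmn\tqw\nty\tix\tao\npi\ter\tsy\n' > f4
printf 'rk\twp\nlp\tcy\nwn\tem\n' > f5
printf 'tr\ter\twm\nut\tvb\tmq\nrp\tel\tst\n' > f6

# Everything printed inside the braces goes to mydata.csv in one go.
{
  echo 'col_a,col_b,col_c,col_d'   # hypothetical header names
  paste f4 f5 f6 | tr '\t' ',' | cut -d , -f2,3,4,8
} > mydata.csv

cat mydata.csv
```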