Calculate column average using bash shell

The purpose of this tutorial is to show how the Bash shell can calculate the average value of a single column of a text file on a Linux system.

Read below to see the code that you can use on your own system to calculate a column average.

In this tutorial you will learn:

  • How to calculate column average using bash shell
Software Requirements and Linux Command Line Conventions

Category: Requirements, Conventions or Software Version Used
System: Any Linux system
Software: Bash shell
Other: Privileged access to your Linux system as root or via the sudo command.
Conventions:
  # – requires the given Linux commands to be executed with root privileges, either directly as the root user or by using the sudo command
  $ – requires the given Linux commands to be executed as a regular non-privileged user

Calculate column average using bash shell




Our example file contains the following two lines:

line1 4.5
line2 6

The average of these two numbers is 5.25. Let’s see how we can arrive at that average by using Bash. Our solution should be scalable so that it could work even if we had more lines.
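
To follow along, the example file can be recreated with a single printf command. The snippet below is a minimal sketch that assumes the file is named file.txt (the name used by the commands in this tutorial) and lives in the current working directory. It also checks the expected result by hand with awk, which is only one of several tools that could do this arithmetic:

```shell
# Create the two-line example file used in this tutorial.
printf 'line1 4.5\nline2 6\n' > file.txt

# Show the file so we can confirm its contents.
cat file.txt

# Sanity check of the expected average: (4.5 + 6) / 2
awk 'BEGIN { print (4.5 + 6) / 2 }'   # prints 5.25
```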

  1. One way to do this is to use a combination of a Bash for loop with the awk, echo, and bc commands. Execute the code below, assuming that file.txt is in your current working directory:
    $ count=0; total=0; for i in $( awk '{ print $2; }' file.txt );\
    do total=$(echo $total+$i | bc ); \
    ((count++)); done; echo "scale=2; $total / $count" | bc
    
  2. And here is a shell script version of the above command so we can see what is happening in more detail:
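    #!/bin/bash

    count=0   # number of values read so far
    total=0   # running sum of the second column

    # awk prints the second column; the loop visits each value in turn
    for i in $( awk '{ print $2; }' file.txt )
    do
        total=$(echo "$total+$i" | bc)   # bc handles the decimal addition
        ((count++))
    done

    # scale=2 tells bc to keep two decimal places in the division
    echo "scale=2; $total / $count" | bc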
    

    For each line in file.txt, awk extracts the second column, and the for loop assigns each extracted value to $i in turn. The echo and bc commands add each $i to the running total stored in $total, while $count records the number of loop iterations. The last line uses echo and bc to calculate the average to two decimal places. Note that bc truncates, rather than rounds, at the given scale.

  3. Here is another method of calculating the average by using only the awk command.
    $ awk '{ total += $2; count++ } END { print total/count }' file.txt 
    
  4. Here is yet another method that relies on awk, paste, wc, and bc. It works for any number of lines, but because wc -l counts every line, it assumes that each line of the file holds a value (no blank lines) and that the file ends with a newline.
    $ sum=$(awk '{print $2}' file.txt | paste -sd+ | bc); echo "$sum / $(wc -l < file.txt)" | bc -l
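
The methods above all assume that every line carries a number in its second column. As a hedged sketch of how the awk approach could be made more defensive, the function below skips blank lines and non-numeric fields and refuses to divide by zero. The function name column_average, its optional column argument, and the two-decimal output format are assumptions made for illustration; they are not part of the original commands.

```shell
#!/bin/bash

# column_average FILE [COLUMN]
# Hypothetical helper: prints the mean of the numeric values found in
# COLUMN (default: 2) of FILE, formatted to two decimal places.
column_average() {
    local file="$1"
    local col="${2:-2}"

    awk -v col="$col" '
        # Only count fields that look like plain decimal numbers,
        # so blank lines and text such as "n/a" are ignored.
        $col ~ /^-?[0-9]+(\.[0-9]+)?$/ { total += $col; count++ }
        END {
            if (count == 0) {
                # Nothing to average: report the problem and fail.
                print "no numeric values found" > "/dev/stderr"
                exit 1
            }
            printf "%.2f\n", total / count
        }
    ' "$file"
}
```

With the two-line example file, running column_average file.txt prints 5.25. Because the count comes from the values that were actually summed, rather than from a line count, stray blank lines in the file do not skew the result.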
    

Closing Thoughts




In this tutorial, we learned how to calculate the average of a column in Bash. The Bash shell is flexible enough that it almost always gives us a way to accomplish tasks such as this one. As usual, there are many different ways to tackle the problem and a variety of tools that can be used for the task. We covered some of the simplest ones here. Maybe you can think of an even better way.


