Correct Variable Parsing and Quoting in Bash

Incorrect quoting in original source code can easily lead to bugs when the input provided by users is not as expected or not uniform. Over time, when Bash scripts change, an unforeseen side effect of an incorrectly quoted variable can lead to a bug even in otherwise untouched code. This is even more important for security related applications which may be prone to hacking attempts. Learn how to do quoting and variable parsing/validation properly from the outset, and avoid many of these issues! Let’s get started…

In this tutorial series you will learn:

  • How to quote your Bash variables properly
  • The caveats and results of incorrect quoting
  • How to ensure variable values are what they are supposed to be
  • How to check for empty, numeric and text based variable values
Correct Variable Parsing and Quoting in Bash

Correct Variable Parsing and Quoting in Bash

Software requirements and conventions used

Software Requirements and Linux Command Line Conventions
Category Requirements, Conventions or Software Version Used
System Linux Distribution-independent
Software Bash command line, Linux based system
Other Any utility which is not included in the Bash shell by default can be installed using sudo apt-get install utility-name (or yum instead of apt-get)
Conventions # – requires linux-commands to be executed with root privileges either directly as a root user or by use of sudo command
$ – requires linux-commands to be executed as a regular non-privileged user

Example 1: Quote those variables!

Unless you’re working with numerical values, and even in that case at times, it is wise to always quote your text-based variables when checking for equality etc. Let’s look at an example:

$ VAR1="a"; if [ ${VAR1} == "a" ]; then echo 'Yes!'; fi
Yes!
$ VAR1=; if [ ${VAR1} == "a" ]; then echo 'Yes!'; fi
bash: [: ==: unary operator expected


First we set VAR1 to the value a and subsequently checked if VAR1 equaled a. That worked, and we may think our code is fine and leave it as-is inside our script. However, sometime later and after many code changes, we start seeing bash: [: ==: unary operator expected – a somewhat cryptic message telling us there is something amiss with our code.

The reason is shown in the second example. If somehow our variable is empty, i.e. has failed to be set properly (or has been erased since setting), then we will be presented with an error as Bash effectively reads this; if [ == "a" ] which is a statement which does not make much sense, and it fails to compute.

If we properly quoted our variable with double quotes ("), this would not happen:

$ VAR1=; if [ "${VAR1}" == "a" ]; then echo 'Yes!'; fi
$ 

This time, Bash read the statement as if [ "" == "a" ] – a statement both easier on the eyes and the Bash compiler. No output is generated as clearly an empty string does not equal the letter a.

Example 2: Taking quoting a bit further

Once you have been working with Bash for a while, you will learn some of it’s language idioms. One such idiom is the – let’s call it privilege (and it surely is a convenience!) – to be able to quote numeric variables even if a numeric operation is being executed:

$ VAR1=13; if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi
Yes!
$ VAR1=7; if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi

Even though VAR1 is set to a numeric value, Bash will accept the " quoting around VAR1 and correctly produce the outcome of the if statement using the is equal (i.e. -eq) comparison operation.

Yet, we have not reached full circle yet, as the following still fails;

$ VAR1=; if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi
bash: [: : integer expression expected

This time an integer expression is expected, yet an empty variable (i.e. "" was passed), and this certainly is not numeric. Is there a way to fix this? Sure:

Example 3: Checking for zero length

$ VAR1=; if [ -n "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; fi
$ VAR1=13; if [ -n "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; fi
Yes!

Here we use a pre-check to see if the variable does not have a length of zero by using the conditional statement -n which means that the string does not have a length of zero. This could also be swapped for the reverse by using ! -z where -z means the string has a length of zero and the ! negates the same, i.e. reverses the outcome:

$ VAR1=; if [ ! -z "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; fi
$ VAR1=13; if [ ! -z "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; fi
Yes!
$ VAR1=7; if [ ! -z "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; fi
$ 


We also added the =7 example here to show how the if statement functions correctly. Always test your if statements and conditions in a variety of situations, use cases and generic exceptions (bad values, no value, odd values, etc.) if you would like to make sure your code will be free of bugs.

Example 4: An almost complete check

There is still a shortcoming in the last example. Did you pick it up? Basically, if we pass textual values to the string, or if statement still fails:

$ VAR1='a'; if [ ! -z "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; fi
bash: [: a: integer expression expected

This can be overcome by using a subshell, grep, and some regular expressions. For more information on regular expressions, see our Bash regexps for beginners with examples and advanced Bash regex with examples articles. For more information on Bash subshells, see our Linux Subshells for Beginners with Examples and Advanced Linux Subshells with Examples articles.

The syntax is not too complex:

$ VAR1=7; if [ "$(echo "${VAR1}" | grep -o '[0-9]\+')" == "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; fi
$ VAR1=13; if [ "$(echo "${VAR1}" | grep -o '[0-9]\+')" == "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; fi
Yes!
$ VAR1='a'; if [ "$(echo "${VAR1}" | grep -o '[0-9]\+')" == "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; fi
$

Great. Here we check the contents of VAR1 to be numeric by using a grep -o (grep only; i.e. grep only the part matched by the search string, which is in this case a regular expression). We select any number character from 0-9 and this one or more times (as indicated by the \+ qualifier to the [0-9] selection range). Then, we try and match this grep matched part only text against the original variable. Is it same? If yes, then our variable consists of only numbers.

When we expand our outer if statement a bit to include an else clause which will tell us if a variable is not numeric, and when we try and pass in 'a' as input we see that the various inputs are each parsed correctly;

$ VAR1=7; if [ "$(echo "${VAR1}" | grep -o '[0-9]\+')" == "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; else echo 'Variable not numeric!'; fi
$ VAR1=13; if [ "$(echo "${VAR1}" | grep -o '[0-9]\+')" == "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; else echo 'Variable not numeric!'; fi
Yes!
$ VAR1='a'; if [ "$(echo "${VAR1}" | grep -o '[0-9]\+')" == "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; else echo 'Variable not numeric!'; fi
Variable not numeric!


So now we have a perfect line for our code, no? No… We’re still missing something… Do you see what?

Example 5: A complete check

Did you see the issue? We haven’t checked for an empty variable yet!

$ VAR1=''; if [ "$(echo "${VAR1}" | grep -o '[0-9]\+')" == "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; else echo 'Variable not numeric!'; fi
bash: [: : integer expression expected

Ouch. I trust by now you see why I regularly mention in my articles to always check your code creations in one way or another. Sure, Bash lends itself to quick and easy scripting, but if you want to make sure things will continue to work properly when changing your scripts or adding extra code, you will want to make sure your tests, inputs and outputs are clean and clearly defined. The fix is easy:

$ VAR1=''; if [ ! -z "${VAR1}" -a "$(echo "${VAR1}" | grep -o '[0-9]\+')" == "${VAR1}" ]; then if [ "${VAR1}" -eq 13 ]; then echo 'Yes!'; fi; else echo 'Variable not numeric!'; fi
Variable not numeric!

Here, using the fist if statement, we add an additional condition for variable VAR1 to not (!) be a zero length variable. This works well given the current setup as the second part of the first if statement can still proceed irrespective of the contents of VAR1.

Conclusion

In this article, we looked at how to correctly quote and parse/evaluate variables, and we explored how complex it was to write a perfect variable checking piece of Bash code. Learning how to do these things correctly from the outset will greatly limit the amount of possible bugs which can be introduced by accident.

Enjoy, and double quote those variables! 🙂



Comments and Discussions
Linux Forum