Bash Advanced Variable Idioms for Case Sensitivity Management

Whenever we work with textual strings, sooner or later the issue of case comes up. Does a word need to be fully uppercase, fully lowercase, or capitalized only at the start of the word or sentence?

An idiom is a natural language expression of a simple programming task. For example, in the sleep 10 command (which pauses the terminal one is working in for ten seconds), the word sleep is a natural language expression of a time-based coding construct. The sleep command itself is provided by the GNU coreutils software package rather than by Bash.

There are a number of special variable-bound idioms (i.e. suffixes which can be added to a variable name inside ${...}, indicating what we would like to do with the variable) which can be used in Bash to do these types of conversions on the fly, instead of having to use, for example, the sed stream editor with a regular expression to do the same.

If you are interested in using regular expressions, have a look at our Bash Regexps For Beginners With Examples and Advanced Bash Regex With Examples articles!

This makes working with variables that need case modification, or testing them in if statements, a whole lot easier, and provides great flexibility. The idioms can be used directly inside if statements and do not need a subshell with sed.

While the syntax looks slightly complex to start with, once you learn a little mental support trick to remember the right keys, you will be well on your way to using these idioms in your next script or Bash one-liner at the command line!

In this tutorial you will learn:

  • How to use the ^, ^^, , and ,, Bash variable suffix idioms
  • How to use a [] character list pattern in combination with these
  • How to use the ^ and , idioms directly from within if statements
  • Detailed examples exemplifying the use of ^, ^^, , and ,,

Software requirements and conventions used

Software Requirements and Linux Command Line Conventions

Category    | Requirements, Conventions or Software Version Used
System      | Linux distribution-independent
Software    | Bash command line (version 4.0 or later for the ^ and , idioms), Linux based system
Other       | Any utility which is not included in the Bash shell by default can be installed using sudo apt-get install utility-name (or yum install for RedHat based systems)
Conventions | # – requires linux-commands to be executed with root privileges, either directly as the root user or by use of the sudo command
            | $ – requires linux-commands to be executed as a regular non-privileged user


Example 1: Making full variables uppercase

Let us start with an example showing how to print a variable as uppercase:

$ VAR='make me uppercase'; echo "${VAR^^}"
MAKE ME UPPERCASE

We first set the variable VAR to make me uppercase. When printing it, we used ^^ at the end of the variable name (a suffix, a Bash idiom) to tell the Bash interpreter to substitute our text with its uppercase version.
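The idiom only affects what the expansion produces; the variable itself keeps its original value. A minimal sketch of storing the converted text for later use, where UPPER is just an illustrative name and Bash 4.0 or later is assumed:

```bash
#!/usr/bin/env bash
VAR='make me uppercase'

# The expansion yields an uppercase copy; VAR itself is left untouched.
UPPER="${VAR^^}"

echo "${UPPER}"   # MAKE ME UPPERCASE
echo "${VAR}"     # make me uppercase
```

To change the variable in place, one can simply assign the expansion back to it with VAR="${VAR^^}".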

Note that any time one sets a variable, one uses the VAR= syntax, leaving off the leading $ variable idiom. Subsequent uses, which are not re-assignments by themselves, use the $ syntax. Hence, the echo uses $.

You can also see { and } being used around the variable name. Whilst this is not strictly necessary:

$ VAR=1; echo $VAR
1

It is highly recommended to always enclose variable names in { and }, as this avoids mistakes, such as cases where it is not clear to the Bash interpreter where a variable name ends:

$ VAR='a'; echo "$VARa"

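In this example, the Bash interpreter sees a variable name commencing ($) and keeps reading for as long as it finds characters that are valid in a variable name (letters, digits and underscores). It therefore looks up VARa, which is not set, and prints an empty line. A space ends the variable name, as can be seen here: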

$ VAR='a'; echo "$VAR a"
a a

Here we had to introduce a space just to make Bash recognize the variable, and that space now also appears in the output.

In other words, in our former example, the variable name that Bash sees is VARa and it is unable to split/see where the variable ends and the rest of the string-to-output starts or re-starts. Let us compare this with properly encapsulating variables with { and }:

$ VAR='a'; echo "${VAR}a"
aa

Here no issues are seen; it is clear to the Bash interpreter that ${VAR} is the variable and a is the text to follow after it, all thanks to properly encapsulating our variable.
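Because the closing brace marks exactly where the expansion ends, ordinary text can also follow a case idiom directly. A minimal sketch, with NAME and KEY as hypothetical names chosen only for illustration:

```bash
#!/usr/bin/env bash
NAME='report'

# The closing brace ends the expansion, so _DIR is plain text appended to it.
KEY="${NAME^^}_DIR"

echo "${KEY}"   # REPORT_DIR
```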

This matters even more when using the special ^^ idiom and other such Bash idioms, which only work inside { and }. Let us exemplify this:

$ VAR='make me uppercase'; echo $VAR^^
make me uppercase^^
$ VAR='make me uppercase'; echo "$VAR^^"
make me uppercase^^

In this case, Bash is able to see that we would like the VAR variable to be printed, but it interprets ^^ as standard text. As can be seen clearly from this and the previous examples, it is best practice to always surround variable names with { and }.

Example 2: Making full variables lowercase

Now that we have seen how to make a full variable uppercase by using the ^^ idiom, let us look at how to change full variables to lowercase by using the ,, idiom:

$ VAR='MAKE ME LOWERCASE'; echo "${VAR,,}"
make me lowercase

It is an interesting syntax idiom to use ,, as a suffix to the variable, but it works correctly as shown.
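A common use for ,, is normalizing user input before testing it. The following is a sketch only; normalize_answer is a hypothetical helper name, not something Bash provides:

```bash
#!/usr/bin/env bash
# normalize_answer is a hypothetical helper, used here for illustration.
normalize_answer() {
  local answer="${1,,}"   # lowercase copy of the first argument
  case "${answer}" in
    y|yes) echo 'yes' ;;
    n|no)  echo 'no' ;;
    *)     echo 'unknown' ;;
  esac
}

normalize_answer 'YES'     # yes
normalize_answer 'No'      # no
normalize_answer 'maybe'   # unknown
```

Lowercasing once up front means the case statement only has to list one spelling of each answer.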



A little mental support trick to remember these

A great way/method to remember anything is to visually confirm or imagine whatever needs to be remembered. If you can add a few mental constructs onto this, like making relations with other things, you are likely to remember the same next time.

One way to remember these is to have a look at your physical keyboard. If you are using a QWERTY keyboard like me, you will see that ^ is SHIFT-6 and ‘,’ is right next to the m. How does this help?

Firstly, the 6/^ key is at the top (think upper), and the , key is at the bottom (think lower). Next, the , key is also the non-alphabet key on the bottom row closest to the 6 key. Lastly, both keys are on the right hand side of the keyboard, reminding one that these idioms are a suffix, not a prefix, to a variable.

Once you have visually confirmed this once or twice, it will likely stick in memory quite well and you’ll be able to use these idioms in your next Bash script or one-liner without having to re-reference the syntax.

Example 3: Changing specific letters

We can also make a specific letter uppercase:

$ VAR='ababab cdcdcd'; echo "${VAR^^b}"
aBaBaB cdcdcd

Or lowercase:

$ VAR='ABABAB CDCDCD'; echo "${VAR,,C}"
ABABAB cDcDcD

There are two gotchas/limitations here. Firstly, we must specify the letter in the case it currently has in the text. Thus, specifying a lowercase c when we want to lowercase the letter C will not work:

$ VAR='ABABAB CDCDCD'; echo "${VAR,,c}"
ABABAB CDCDCD

This is because there simply is no lowercase c in the text; there is only C (uppercase). Specifying C, as in the example before this one, works fine.

We also cannot specify multiple letters by using either of these plausible-looking, but non-working formats:

$ VAR='ABABAB CDCDCD'; echo "${VAR,,CD}"
ABABAB CDCDCD
$ VAR='ABABAB CDCDCD'; echo "${VAR,,C,,D}"
ABABAB CDCDCD

The way to get this to work correctly is to use the bracket format [...selection list...], as known from regular expressions and shell filename patterns, as follows:

$ VAR='ABABAB CDCDCD'; echo "${VAR,,[CD]}"
ABABAB cdcdcd
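The same bracket list works with ^^ as well. A short sketch, using made-up sample strings, that uppercases only the vowels and lowercases only the letters A and D:

```bash
#!/usr/bin/env bash
TEXT='linux command line'
# Uppercase only the characters listed between [ and ].
VOWELS_UP="${TEXT^^[aeiou]}"
echo "${VOWELS_UP}"   # lInUx cOmmAnd lInE

VAR='ABABAB CDCDCD'
# Lowercase only A and D; B and C are left alone.
SOME_LOW="${VAR,,[AD]}"
echo "${SOME_LOW}"    # aBaBaB CdCdCd
```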

Starting the sentence with an uppercase or lowercase character

Changing only the first letter is possible also:

$ VAR='ababab cdcdcd'; echo "${VAR^}"
Ababab cdcdcd
$ VAR='ABABAB CDCDCD'; echo "${VAR,}"
aBABAB CDCDCD

Here we used a single ^ or , to make the first letter uppercase or lowercase.
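A single ^ only touches the first character of the whole string. To capitalize every word, one possible approach is to split the text into an array first, since Bash applies the idiom to each array element in turn. This is a sketch under that assumption; title_case is a hypothetical helper name:

```bash
#!/usr/bin/env bash
# title_case is a hypothetical helper, used here for illustration.
title_case() {
  local -a words
  # Split the first argument into words on whitespace.
  read -r -a words <<< "$1"
  # ^ is applied to every element of the array separately.
  echo "${words[@]^}"
}

title_case 'ababab cdcdcd'   # Ababab Cdcdcd
```

Note that this collapses runs of spaces between words into a single space, which may or may not be what a given script wants.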



Using these Bash variable suffix idioms from within if statements

We can also use these Bash variable suffix idioms directly from within if statements:

$ VAR='abc'; if [ "${VAR^^}" == "ABC" ]; then echo 'Matched!'; else echo 'Not Matched!'; fi
Matched!

Here we have a variable VAR with value abc. Next, inside the if statement, we convert the contents of the variable to ABC on the fly by using ${VAR^^} as our first compare string (the variable itself remains unchanged). Next, we compare with ABC and we have a match, proving that our inline substitution to uppercase worked.

This is much simpler than starting a subshell and doing the same using sed and a regular expression:

$ VAR='abc'; if [ "$(echo "${VAR}" | sed 's|[a-z]|\U&|g')" == "ABC" ]; then echo 'Matched!'; else echo 'Not Matched!'; fi
Matched!

The \U& in this sed instruction can be read as ‘change any capture (done by [a-z] and referenced by & in \U&) to the uppercase (\U) equivalent thereof’. Note that \U is a GNU sed extension. Compare the complexity of this solution to the previous one.
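Where a script must also run under a Bash older than 4.0, in which the ^ and , idioms are not available, a pipeline through tr is a commonly used fallback. A minimal sketch, assuming a POSIX tr is installed (UPPER is an illustrative name):

```bash
#!/usr/bin/env bash
VAR='abc'

# tr maps every lowercase character to its uppercase counterpart.
UPPER="$(printf '%s' "${VAR}" | tr '[:lower:]' '[:upper:]')"

if [ "${UPPER}" = 'ABC' ]; then echo 'Matched!'; else echo 'Not Matched!'; fi
```

Like the sed variant, this still needs a subshell and an external program, so the built-in idiom remains the lighter choice wherever it is available.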

Another if example

$ VAR='abc'; if [[ "${VAR^^b}" == *"B"* ]]; then echo 'Matched!'; else echo 'Not Matched!'; fi
Matched!

In this example, we changed the text abc to aBc by using ${VAR^^b} as described earlier (uppercase only the letter b). Then we use a compare which has an asterisk to the left and the right of the letter B. This means we are looking for …any string… followed by B followed by …any string… (note that one can also leave off the starting or ending asterisk in order to match sentences starting with, or ending with B respectively).
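Putting the pieces together, a case-insensitive comparison can be sketched by lowercasing both sides before comparing them. The function name equals_ignore_case is an assumption made for this example:

```bash
#!/usr/bin/env bash
# equals_ignore_case is a hypothetical helper, used here for illustration.
# It succeeds when both arguments are equal once lowercased.
equals_ignore_case() {
  [[ "${1,,}" == "${2,,}" ]]
}

if equals_ignore_case 'Linux' 'LINUX'; then
  echo 'Matched!'
else
  echo 'Not Matched!'
fi
```

Quoting the right-hand side inside [[ ]] makes Bash compare it literally rather than treating it as a pattern.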

Conclusion

In this article, we explored the Bash variable suffix idioms ^, ^^, , and ,,. We had a look at how they can be used to substitute strings to their upper and lowercase variants, and how to work with one or more individual letters, including making the first letter uppercase or lowercase.

We also explored how to use these idioms further from within Bash if statements. Finally we provided a proposed memory support trick to remember what characters can be used, and where, as Bash idioms for upper and lowercase substitution of text.

Leave us a thought with your coolest text case substitution commands! Enjoy!


