There are many basic operations that we never think twice about in daily life. Using the find and replace function in a word processor, for example, is straightforward. But what about when you do not have a graphical interface? What if you need to build the same operation into a script?
You could write the logic yourself with loops and if-else statements, but that is long and takes too much effort. For something this commonplace there is a quicker method, and the standard command-line tools available in Bash provide it.
Today we look at the tr command, which translates (replaces), squeezes (removes repetition), or deletes characters from standard input and writes the result to standard output.
Bash tr command basic usage
The most basic syntax looks like this:
tr [OPTION]... SET1 [SET2]
Here, OPTION refers to any of the flags tr provides. We will take a look at them later on. SET1 is the set of characters to operate on, and SET2 is the set of characters that replace the SET1 characters, position by position. SET2 is optional because some operations, such as deletion, need only SET1. This will make much more sense with examples.
The tr command reads only from standard input and does not accept a file name as an argument, so we need to pipe or redirect some input into it. The old reliable echo command does the job. For example:
echo 'FOSSLinux' | tr 'SL' 'lw'
The output is FOllwinux: every ‘S’ is replaced with ‘l’ and every ‘L’ with ‘w’.
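In a script you usually want the result in a variable rather than on the screen. The following is a minimal sketch of that; the variable name result is only illustrative.

```shell
# Translate every 'S' to 'l' and every 'L' to 'w', capturing the result.
result=$(echo 'FOSSLinux' | tr 'SL' 'lw')
echo "$result"   # prints: FOllwinux
```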
What happens when you make SET1 larger than SET2?
echo 'FOSSLinux' | tr 'SLnf' 'lw'
The output is FOllwiwux. When SET1 is longer than SET2, GNU tr reuses the last character of SET2 for every SET1 character that has no counterpart, so both ‘n’ and ‘f’ map to ‘w’ here. This is not specific to this example: whenever SET2 runs out of characters, tr falls back to its last element.
Another observation from this example is that even though we mentioned ‘f’ in SET1, the ‘F’ was not translated. That is because the tr command is case-sensitive. If we had mentioned ‘F’ in SET1 instead, it would have been translated to ‘w’ as well.
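The two behaviours can be compared side by side. This sketch assumes GNU coreutils tr; other implementations may handle a short SET2 differently. The variable names are illustrative.

```shell
# SET1 is longer than SET2, so GNU tr pads SET2 with its last character:
# S->l, L->w, n->w, f->w. The input has no lowercase 'f', so 'F' is untouched.
padded=$(echo 'FOSSLinux' | tr 'SLnf' 'lw')
echo "$padded"       # prints: FOllwiwux

# Listing the uppercase 'F' instead makes tr translate it too.
with_upper=$(echo 'FOSSLinux' | tr 'SLnF' 'lw')
echo "$with_upper"   # prints: wOllwiwux
```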
Complement
The complement flag (-c) makes tr operate on every character except those mentioned in SET1. Using the same example still:
echo 'FOSSLinux' | tr -c 'SL' 'lw'
The complement of SET1 contains every character except ‘S’ and ‘L’, so it is far longer than SET2. As before, tr falls back to the last element of SET2, ‘w’, and uses it for every input character other than ‘S’ and ‘L’. The output is wwSSLwwwww.
There is another observation to be made here: unlike in the prior cases, the prompt does not move to the next line. A line normally ends with a newline character (\n), which tells the terminal to continue on the following line. Since everything except ‘S’ and ‘L’ has been replaced, the newline has been replaced too, which is where the extra ‘w’ at the end of the output comes from.
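If the trailing newline should survive, one option is to add ‘\n’ to SET1 so that it is excluded from the complement. This is a small sketch assuming GNU coreutils tr; the variable name kept is illustrative.

```shell
# Everything except 'S' and 'L' is translated, including the trailing newline.
echo 'FOSSLinux' | tr -c 'SL' 'lw'   # prints: wwSSLwwwww (with no newline after it)
echo                                 # emit a newline so later output starts on a fresh line

# Adding '\n' to SET1 keeps the newline out of the complement, so it is preserved.
kept=$(echo 'FOSSLinux' | tr -c 'SL\n' 'lw')
echo "$kept"                         # prints: wwSSLwwww
```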
Delete
The delete flag (-d) is simple to understand: it deletes the characters that the user mentions. Since this is only deletion with no translation, it requires only SET1 and no SET2. For example:
echo 'FOSSLinux' | tr -d 'SL'
This deletes every ‘S’ and ‘L’ from the input string, leaving FOinux.
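As a quick check, the deletion can be captured in a variable. Deletion is case-sensitive like translation, which the second command in this sketch illustrates. The variable names are illustrative.

```shell
# Delete every 'S' and 'L' from the input.
deleted=$(echo 'FOSSLinux' | tr -d 'SL')
echo "$deleted"     # prints: FOinux

# Lowercase 's' and 'l' do not occur in the input, so nothing is removed.
unchanged=$(echo 'FOSSLinux' | tr -d 'sl')
echo "$unchanged"   # prints: FOSSLinux
```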
Squeeze repeats
The squeeze repeats flag (-s) does precisely what it says: it collapses each run of a consecutively repeated character into a single instance. When only SET1 is given, it squeezes the characters listed in SET1. When both sets are given, tr first translates the SET1 characters to SET2 and then squeezes repeats of the characters in SET2. Example:
echo 'FOOSSLinux' | tr -s 'SO' '_b'
Here, ‘S’ is translated to ‘_’ and ‘O’ to ‘b’, and the resulting runs of ‘_’ and ‘b’ are squeezed to a single character each, so the output is Fb_Linux. If you want to remove the repetition of certain characters without any translation, mention only SET1:
echo 'FOOSSLLLinux' | tr -s 'SO'
In the output, FOSLLLinux, the repeated ‘S’ and ‘O’ characters are reduced to one each, while the three L’s are left alone because ‘L’ is not in SET1.
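Both forms of squeezing can be tried together. This is a minimal sketch with illustrative variable names.

```shell
# Two sets: S->_ and O->b, then each run of '_' or 'b' collapses to one character.
squeezed=$(echo 'FOOSSLinux' | tr -s 'SO' '_b')
echo "$squeezed"        # prints: Fb_Linux

# One set: only runs of 'S' and 'O' collapse; the three 'L's are untouched.
squeezed_only=$(echo 'FOOSSLLLinux' | tr -s 'SO')
echo "$squeezed_only"   # prints: FOSLLLinux
```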
Truncate
We already saw what happens when SET1 has more elements than SET2: the last element of SET2 replaces every SET1 character that has no corresponding element. For example:
echo 'FOSSLinux' | tr 'FOSL' 'lw'
Here, ‘F’ corresponds to ‘l’ and ‘O’ corresponds to ‘w’, which is the extent of the correspondence. The remaining elements of SET1, ‘S’ and ‘L’, fall back to the last element of SET2, ‘w’, so the output is lwwwwinux. This is desirable in some instances, but not always. When it is not, we can use the truncate (-t) flag:
echo 'FOSSLinux' | tr -t 'FOSL' 'lw'
This truncates (reduces) SET1 to the length of SET2 and leaves the extra elements as they were, without any translation whatsoever. The output is lwSSLinux.
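The difference is easiest to see with both commands next to each other. This sketch assumes GNU coreutils tr, since -t may not be available in other implementations. The variable names are illustrative.

```shell
# Without -t, 'S' and 'L' fall back to the last character of SET2.
without_t=$(echo 'FOSSLinux' | tr 'FOSL' 'lw')
echo "$without_t"   # prints: lwwwwinux

# With -t, SET1 is cut down to 'FO', so 'S' and 'L' pass through unchanged.
with_t=$(echo 'FOSSLinux' | tr -t 'FOSL' 'lw')
echo "$with_t"      # prints: lwSSLinux
```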
Specific use cases
Now that we have seen the main things tr can do, it is time to see how they come into use in real life.
Extract numbers
A straightforward example is extracting only the digits from a sentence. For instance, you need to extract the number from a line where someone mentions their age. If the sentence is “I am 19 years old” and you only need “19” out of it, you delete all the characters except the numerical digits.
echo "I am 19 years old" | tr -cd '[:digit:]'
The command has a simple breakdown. [:digit:] is the character class that matches the numerical digits. The complement flag (-c) makes tr operate on everything that is not a digit, and the delete flag (-d) deletes those characters, so only 19 remains. The class is quoted so that the shell passes it to tr unchanged instead of treating the brackets as a file name pattern. Note that the trailing newline is deleted too, since it is not a digit.
This example also demonstrates that you can use different combinations of the flags as you might need.
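In a script, the extracted digits would typically go into a variable. The sketch below does that and also shows a caveat: when the input contains more than one number, the digits run together. The second sentence and the variable names are made up for illustration.

```shell
# Keep only the digits by deleting the complement of [:digit:].
age=$(echo "I am 19 years old" | tr -cd '[:digit:]')
echo "$age"    # prints: 19

# Caveat: digits from separate numbers are joined into one string.
both=$(echo "I am 19 and my brother is 21" | tr -cd '[:digit:]')
echo "$both"   # prints: 1921
```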
Separate elements of a CSV file
A CSV file is a file that holds ‘comma separated values’. It is a very common way of storing data, where the different elements are separated only by commas. What if you want to print each of those elements on its own line?
For this example, we use a CSV file named distros.csv.
To print each element on a separate line, we translate the commas into the newline character (\n). The command becomes:
cat distros.csv | tr ',' '\n'
In the output, every comma has become a line break, so each element appears on its own line.
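To try this without an existing file, the sketch below creates a temporary CSV with made-up contents. The distribution names are hypothetical sample data, not the contents of the distros.csv used above. It also reads the file with input redirection, which works just as well as piping from cat.

```shell
# Hypothetical sample data; the real distros.csv may look different.
csv_file=$(mktemp)
printf 'Ubuntu,Fedora,Debian\n' > "$csv_file"

# Translate every comma into a newline, reading the file via redirection.
lines=$(tr ',' '\n' < "$csv_file")
echo "$lines"

# Clean up the temporary file.
rm -f "$csv_file"
```

With this sample data, the three names are printed on three separate lines.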
Conclusion
The tr command is an essential tool for working in Bash, and for Bash scripting in particular. It translates, squeezes, or deletes characters simply and quickly. Fluency in commands like tr leads to overall mastery of Bash. We hope this article was helpful. Cheers!