The tr command

The translate command (tr) translates characters from one set of characters to another. It works character by character, not on whole strings or patterns.

For example, suppose we have a string of lowercase letters and we want to convert them all to uppercase. The simplest way to do that is to use the translate command.

Translate does not take a filename as an argument; it only acts on a stream of characters. Working with the file columns.txt from earlier, we may want to translate all the names to uppercase. Previously we had the line:

Hamish 35

We now want to translate that to:

HAMISH 35

without editing the file. We cat our file (columns.txt) and pipe the output of the cat command to the input of the translate command, which translates all lowercase letters to uppercase:

cat columns.txt | tr '[a-z]' '[A-Z]'
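If columns.txt is not to hand, the same translation can be tried on a single line. This is only a minimal sketch: printf stands in for cat-ting the file, and the variable name upper is made up for the illustration.

```shell
# Assumption: printf supplies one sample line in place of "cat columns.txt".
# tr reads the line on stdin and maps each lowercase letter to uppercase.
upper=$(printf 'Hamish 35\n' | tr '[a-z]' '[A-Z]')
echo "$upper"
```

Digits and spaces are not in the first set, so tr passes them through unchanged.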

Remember, we have not modified the file columns.txt, so how do we save the output? Simple: redirect the output of the translate command with '>' to a file called UpCaseColumns.txt:

cat columns.txt | tr '[a-z]' '[A-Z]' > UpCaseColumns.txt

Since the tr command does not take a filename like sed did, we could also have written the above example as:

tr '[a-z]' '[A-Z]' < columns.txt > UpCaseColumns.txt

As you can see, the input to the translate command now comes not from a pipe, but from columns.txt, which the shell connects to tr's stdin with '<'. Either way we achieve what we set out to do: using tr as part of a pipeline, or redirecting a file to its stdin ('<').
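To see that the two forms really are interchangeable, here is a small sketch that runs both against a throwaway file. The temporary file from mktemp and the variable names are assumptions for the illustration; they stand in for columns.txt.

```shell
# Assumption: a temporary file stands in for columns.txt.
tmp=$(mktemp)
printf 'Hamish 35\n' > "$tmp"

# Form 1: cat the file and pipe it into tr.
via_pipe=$(cat "$tmp" | tr '[a-z]' '[A-Z]')

# Form 2: let the shell redirect the file to tr's stdin.
via_redirect=$(tr '[a-z]' '[A-Z]' < "$tmp")

echo "$via_pipe"
echo "$via_redirect"
rm -f "$tmp"
```

In both cases tr itself only ever reads stdin; the difference is whether cat or the shell supplies it.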

We can also use translate in another way: to distinguish between spaces and tabs. Spaces and tabs can be a pain when using scripts to compile system reports. What we need is a way of translating these characters. Now, there are many ways to skin a cat in Linux and shell scripting. I'm going to show you one way, although I'm sure you could now write a sed expression to do the same thing.

Assume that I have a file with a number of columns in it, but I am not sure about the number of spaces or tabs between the different columns. I need some way of changing these into a single space. Why? Because having one or more spaces, or one or more tabs, between the columns will produce significantly different output if we extract information from the file with a shell script. How do we convert many spaces or tabs into a single space? Well, translate is our right-hand man (or woman) for this particular task. In order not to waste our time modifying our columns.txt, let's work on the free command, which shows you free memory on your system. Type:

free

If you look at the output you will see that there are lots of spaces between the fields. How do we reduce multiple spaces between fields to a single space? We can use tr to squeeze characters (you can squeeze any character, but in this case we want to squeeze spaces):

free | tr -s ' '

The -s switch tells the translate command to squeeze: each run of a repeated character is replaced by a single occurrence of that character. (Read the info page on tr to find out about its other switches.)
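Note that tr -s ' ' squeezes spaces only; tabs are left alone. One way to handle the mixed spaces-and-tabs case described earlier is to translate tabs to spaces first and then squeeze. The sketch below uses a made-up sample line rather than real free output, and the variable name line is an assumption.

```shell
# Assumption: a made-up line with mixed spaces and tabs between the columns.
# Step 1: tr '\t' ' ' turns every tab into a space.
# Step 2: tr -s ' ' squeezes each run of spaces down to one.
line=$(printf 'Mem:    2048\t\t1024   512\n' | tr '\t' ' ' | tr -s ' ')
echo "$line"
```

After this, every column is separated by exactly one space, which is what a script extracting fields needs.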

We could squeeze zeroes with:

free | tr -s '0'

Which would obviously make zero sense!

Going back to our previous command of squeezing spaces, you'll see immediately that our memory usage table (which is what the free command produces) becomes much more usable because we've removed superfluous spaces.

Perhaps we want only some of the fields from the output. We could redirect the output of this into a file with:

free | tr -s ' ' > file.txt

Traditional systems would have you use a text editor to cut and paste the fields you are interested in into a new file. Do we want to do that? Absolutely not! We're lazy, so we want to find a better way of doing this.

What I'm interested in is the line that contains 'Mem'. As part of your project, you should be building a set of scripts to monitor your system. Memory sounds like a good one that you may want to save. Instead of just redirecting the output of the tr command to a file, let's first pass it through sed, where we keep only the lines beginning with the word "Mem":

free | tr -s ' ' | sed '/^Mem/!d'
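Because the numbers free prints differ on every system, the sed filter is easier to study on fixed input. The sketch below uses made-up figures in the general shape of squeezed free output; the sample text and the variable names are assumptions, not real measurements.

```shell
# Assumption: made-up, already-squeezed lines shaped like free's output.
sample='total used free
Mem: 2048 1024 512
Swap: 4096 0 4096'

# '/^Mem/!d' deletes every line that does NOT start with "Mem".
mem_line=$(printf '%s\n' "$sample" | sed '/^Mem/!d')
echo "$mem_line"
```

The same selection can be written as sed -n '/^Mem/p', which prints only the matching lines instead of deleting the others.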

This returns only the line that we're interested in. We could run this over and over again to watch the values change.

Let's take this one step further. We're only interested in the second, third and fourth fields of the line (representing total memory, used memory and free memory respectively). How do we retrieve only these fields?