One of the most important concepts in programming is the array. An array can be thought of as a collection of values stored together under one name. Because the values in an array are kept together, they are usually operated on jointly or in succession. Arrays are handy in real-life scenarios, as we often have to deal with sets of related data.
Bash commands, combined with the shell's syntax for variables, conditions, and loops, form a complete programming language, and writing programs in it is referred to as Bash scripting. Today, we will bring these two areas together and see how arrays can be used in Bash scripts.
Introduction to arrays
As mentioned before, an array is a collection of data. But that is not enough because a haphazard collection is of no use unless it has some characteristics or ways to be used that make our lives easier.
Types of arrays
Indexed array
The best way to understand the concept of an indexed array is to think of a real-life numbered list written down on paper. Let us take the example of a grocery list. A list like this has specific properties. First, the list has a name, in this case “grocery.” Second, the items in the list are numbered, which means that each item occupies a certain numeric position. Finally, there is the size of the list (the number of items) and the items themselves. These are the properties of a list that you can manipulate.
Similarly, an indexed array has a name, and each item holds a value. Each item has a specific position inside the array, and the array overall has a size, which is the number of items present inside the array. Now let us see how we can configure these different properties of an array for a Bash script.
Associative array
For an associative array, there are no numeric positions of items. Here, the property is based on key-value pairs. This kind of array is helpful in cases where specific values are permanently associated with certain other keywords. For example, we will take the states of the United States. TX refers to Texas, CA to California, NY to New York, etc. As mentioned, the abbreviations are permanently linked to the states.
As usual, associative arrays have a size, a name, etc. The major difference between indexed and associative arrays is that items are referred to by their index in indexed arrays, while keys in associative arrays refer to values.
Creating an array
Indexed array
Let’s continue with our example and create a grocery list:
grocery=(Almonds Jam Rice Apples)
To print this list, use the echo command with ${grocery[*]} (there is a whole section about reading arrays later on, so don’t worry about the syntax for now). Executing a script that contains both lines prints the four items on a single line, separated by spaces.
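As a minimal sketch, assuming the two lines are saved in a file and run with Bash, the complete script and its output might look like this. The double quotes around the expansion are an addition for safety, not part of the original command.

```bash
#!/usr/bin/env bash
# Create an indexed array with four items.
grocery=(Almonds Jam Rice Apples)

# Print every item on one line, separated by spaces.
echo "${grocery[*]}"
# Output: Almonds Jam Rice Apples
```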
Using the declare command
The previous method of creating an indexed array was straightforward. There is another way to create arrays, using the declare command, which is a more “proper” way. To create the same array, the command becomes:
declare -a grocery=(Almonds Jam Rice Apples)
Here, the -a flag denotes that you want to create an indexed array.
The printing command remains the same.
Associative array
An associative array has to be declared explicitly before it is used, and the usual way to do that is the declare command (associative arrays require Bash 4.0 or later). The flag changes to -A, which denotes an associative array. We will build upon the states example:
declare -A states=(["TX"]="Texas" ["CA"]="California" ["NV"]="Nevada")
The echo command is used to print the values that belong to given keys. Don’t worry about the syntax for now; we will explain it in depth later.
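A short sketch of how the declaration and a lookup by key could fit together in one script. It assumes Bash 4.0 or later, and the choice of keys to print is only for illustration.

```bash
#!/usr/bin/env bash
# -A creates an associative array (Bash 4.0 or later).
declare -A states=(["TX"]="Texas" ["CA"]="California" ["NV"]="Nevada")

# Look up values by their keys.
echo "${states[TX]}"   # Texas
echo "${states[NV]}"   # Nevada
```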
Printing arrays
There are various ways to read and print elements of a list in Bash. Each case is helpful for different scenarios.
Individual elements
Indexed arrays
The first part is to read individual elements. For this purpose, we need to know the index, or position, of an element in the array. A thing to note is that, just like in Python, indexing begins at 0. So for this array, Almonds is at index 0, Jam at index 1, Rice at index 2, and Apples at index 3.
If I want the second element of the array, I will have to use the index 1:
echo ${grocery[1]}
The final result is Jam, the second item in the list.
As you can notice here, we have used curly brackets around the array’s name. We don’t need to do this for a simple variable, but the curly brackets are necessary for an array.
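The following sketch shows element access by index and what happens when the curly brackets are left out. The variable names first, second, and no_braces are made up for this illustration.

```bash
#!/usr/bin/env bash
grocery=(Almonds Jam Rice Apples)

first=${grocery[0]}
second=${grocery[1]}
echo "$first"       # Almonds
echo "$second"      # Jam

# Without curly brackets, Bash expands $grocery (which means element 0)
# and then treats [1] as literal text.
no_braces="$grocery[1]"
echo "$no_braces"   # Almonds[1]
```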
Associative arrays
To print an individual element of an associative array, you need to know the key of the desired element. For example, in our list of states, we need to see the value of the key TX. The required command is:
echo ${states[TX]}
The curly brackets are not necessary around the name of a variable in Bash usually, but they are in the case of arrays.
All elements
Printing all the elements of an array is a derivative of printing individual elements. We use the wildcard character * (asterisk) to achieve this. In Bash, using * means you are trying to target everything. To get a clearer idea, say you want to list everything that begins with the letter ‘D’; then you can type in:
ls D*
This matches only the files and directories whose names begin with the letter ‘D.’ Similarly, to list all the elements of an array, or everything in an array, we use this character in place of the index.
Indexed array
echo ${grocery[*]}
This is the command from earlier in the article, so you have seen how it works. The asterisk refers to all the elements of the group.
Associative array
Using the asterisk to print all elements:
echo ${states[*]}
This expansion prints only the values of an associative array, not the keys, and Bash does not guarantee that the values appear in the order in which they were declared. There are ways to print both keys and values, and we will explore them further.
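A small sketch contrasting the values with the keys of the same array. The ${!states[@]} expansion, which yields the keys, is covered in the iteration section; the helper array keys is an assumption made for this example. No expected output is shown for the first two lines because the order is chosen by Bash.

```bash
#!/usr/bin/env bash
declare -A states=(["TX"]="Texas" ["CA"]="California" ["NV"]="Nevada")

echo "${states[*]}"      # all values, in an order chosen by Bash
echo "${!states[@]}"     # all keys

# Copy the keys into an indexed array to count them.
keys=("${!states[@]}")
echo "${#keys[@]}"       # 3
```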
Iterating
Indexed arrays
Another way to list the elements of an array is to print them out one at a time. For this, we will have to use the for loop. It will be easier to explain with the code written first:
for elem in "${grocery[@]}"
do
    echo "$elem"
done
There’s quite a bit to unpack here. First, how does a for loop work? It is a fundamental construct in programming that runs a block of code repeatedly. If you want every item in a collection to go through the same process, one at a time, a for loop is the ideal candidate. We have a pretty good example here already.
The for loop is instructed to go through the array “grocery.” The expression "${grocery[@]}" expands to every element of the array, each as a separate word, so the ‘@’ symbol signifies that we want Bash to loop through the entire array and not only one element. The variable ‘elem‘ is set to one of those elements on each pass through the loop.
When the loop starts for the first time, ‘elem‘ holds the array’s first element (index 0), so “Almonds.” Next, the loop body says what to do with ‘elem‘. The body begins with the keyword ‘do.’ In this case, we want to print the element using echo. Finally, ‘done‘ signals to Bash that the body is complete.
After this, the loop moves on to the next element, and ‘elem‘ becomes “Jam.” The whole thing happens again and again until the array has no more elements to operate on.
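To make the passes through the loop visible, here is a sketch of the same loop with a counter added. The count variable and the "Item N:" label are additions for illustration only.

```bash
#!/usr/bin/env bash
grocery=(Almonds Jam Rice Apples)

count=0
for elem in "${grocery[@]}"
do
    # Each pass assigns the next element to elem.
    count=$((count + 1))
    echo "Item $count: $elem"
done
# Output:
# Item 1: Almonds
# Item 2: Jam
# Item 3: Rice
# Item 4: Apples
```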
Associative arrays
Starting with the code:
for k in "${!states[@]}"
do
    echo "${states[$k]}"
done
The first thing to see here is the exclamation mark. The expression "${!states[@]}" expands to the keys of the array rather than its values, and the variable k holds one key on each pass through the loop. Bash does not guarantee any particular order for the keys of an associative array, but suppose the first key it produces is “TX.” The keyword do indicates the beginning of the tasks that are carried out for each item. The only task here is to print ${states[$k]}. In this iteration, k is “TX,” so the line is equivalent to printing ${states["TX"]}, which is the value corresponding to the key “TX.”
As you can guess, the keyword done marks the end of the tasks that need to be done for each item in the loop. When the first iteration ends, k takes the next key, for example “CA.” This continues until there are no more key-value pairs left in the array, so executing this script prints each state name on its own line.
But if you want to make it a little more friendly, you can always print the key before its value. So the script will be modified to:
for k in "${!states[@]}"
do
    echo "$k : ${states[$k]}"
done
This will give a more friendly result:
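Because Bash does not promise any particular key order, the lines may appear in a different order than the declaration. A hedged sketch that sorts the keys first gives reproducible output; mapfile, sort, and the sorted_keys array are additions here, not part of the original script.

```bash
#!/usr/bin/env bash
declare -A states=(["TX"]="Texas" ["CA"]="California" ["NV"]="Nevada")

# Read the keys, one per line, through sort into an indexed array.
mapfile -t sorted_keys < <(printf '%s\n' "${!states[@]}" | sort)

for k in "${sorted_keys[@]}"
do
    echo "$k : ${states[$k]}"
done
# Output:
# CA : California
# NV : Nevada
# TX : Texas
```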
You will notice another curious thing here: we have used double quotation marks around the variables when referring to them. We didn’t do that before. There is a reason for that as well. To explain it better, let’s alter the indexed array to include Peanut Butter and the associative array to include [NY]=New York, both written without quotes, and run the for loops again.
We didn’t want that now, did we? “Peanut” and “Butter” have been separated into two elements in the indexed array, and NY only means “New” in the associative one. How would Bash know any better, right? It treats every unquoted whitespace it encounters as a separation between elements. To remedy this, we place such elements in double quotes when declaring the array, that is, "Peanut Butter" and [NY]="New York".
Now executing the script prints Peanut Butter as a single item and New York as the value for NY.
This is also why the script holds all its variable expansions inside double quotes. This prevents whitespace inside the values from being split into separate words.
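The effect of the quotes can be checked by counting loop passes. In this sketch, the counters unquoted and quoted are hypothetical names, and the array is declared with "Peanut Butter" correctly quoted so that only the quoting of the expansion differs.

```bash
#!/usr/bin/env bash
grocery=(Almonds Jam Rice Apples "Peanut Butter")

# Unquoted expansion: "Peanut Butter" is split into two words.
unquoted=0
for elem in ${grocery[@]}
do
    unquoted=$((unquoted + 1))
done

# Quoted expansion: one word per element.
quoted=0
for elem in "${grocery[@]}"
do
    quoted=$((quoted + 1))
done

echo "unquoted: $unquoted, quoted: $quoted"
# Output: unquoted: 6, quoted: 5
```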
Slicing
Indexed array
Another way to print an array is to print only the elements in a required range of indices. For example, suppose you only want the first three elements, index 0 to 2. A first attempt at printing only those elements might be:
echo "${grocery[@]:0:2}"
Executing this script prints only the first two elements, Almonds and Jam. The reason is that the second number in this syntax is not an ending index but the number of elements to print, counted from the starting index given by the first number. So if we want to print the first three elements:
echo "${grocery[@]:0:3}"
An excellent way to visualize this is that it starts at index 0 and takes three elements (indices 0, 1, and 2), and hence doesn’t include index 3 itself.
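A few more slices, as a sketch, show that the two numbers are a starting index and a count. The slice array is an assumption added to capture one result.

```bash
#!/usr/bin/env bash
grocery=(Almonds Jam Rice Apples)

echo "${grocery[@]:0:2}"   # Almonds Jam        (start at 0, take 2)
echo "${grocery[@]:0:3}"   # Almonds Jam Rice   (start at 0, take 3)
echo "${grocery[@]:1:2}"   # Jam Rice           (start at 1, take 2)
echo "${grocery[@]:2}"     # Rice Apples        (start at 2, take the rest)

# A slice can also be stored as a new array.
slice=("${grocery[@]:1:2}")
echo "${#slice[@]}"        # 2
```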
Number of elements in an array
Indexed array
To get the number of elements in an array, only a small modification needs to be made to the basic printing statement: a # is placed before the array’s name inside the curly brackets.
For our case, it would look like this:
echo "${#grocery[@]}"
Executing it in the script:
Associative array
Similar to an indexed array, executing this line in the script gives the number of elements (key-value pairs):
echo "${#states[@]}"
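As a sketch, here are both counts for the arrays as first declared in this article. The last line is an extra: with a single index instead of @, the same # gives the length of that one element.

```bash
#!/usr/bin/env bash
grocery=(Almonds Jam Rice Apples)
declare -A states=(["TX"]="Texas" ["CA"]="California" ["NV"]="Nevada")

echo "${#grocery[@]}"   # 4  (number of elements)
echo "${#states[@]}"    # 3  (number of key-value pairs)
echo "${#grocery[0]}"   # 7  (length of the string "Almonds")
```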
Inserting an element into an array
Inserting an element in an array here means adding a new element to the end of the array. This can be done with an operator that parallels the common way of incrementing a variable. For example, in many languages, if you want a variable to increase its value by one after each pass through a loop, you can write at the end of the loop body:
var = var + 1
In shorthand, it looks like this:
var += 1
Bash uses the same += operator to append elements to arrays:
Associative array
Let us add an element for Massachusetts in the array:
states+=(["MA"]="Massachusetts")
Indexed array
Let us add Yogurt to our grocery list with the statement:
grocery+=("Yogurt")
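A sketch of both appends in one script follows. The array demo is hypothetical and is only there to show what happens if the parentheses are forgotten.

```bash
#!/usr/bin/env bash
grocery=(Almonds Jam Rice Apples)
declare -A states=(["TX"]="Texas" ["CA"]="California" ["NV"]="Nevada")

# Append to the end of the indexed array.
grocery+=("Yogurt")
echo "${grocery[*]}"    # Almonds Jam Rice Apples Yogurt

# Add a new key-value pair to the associative array.
states+=(["MA"]="Massachusetts")
echo "${states[MA]}"    # Massachusetts

# Pitfall: without parentheses, += appends text to element 0.
demo=(Almonds Jam)
demo+="Yogurt"
echo "${demo[0]}"       # AlmondsYogurt
```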
Replacing an element in an array
Indexed array
Replacing an item in an array requires that you know the index of the target element. Let us change the newly added sixth element to Muesli. We can do that with the command:
grocery[5]="Muesli"
Now printing the array again:
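A minimal sketch of the replacement, assuming the list already holds six items (the original four plus Peanut Butter and Yogurt), so that index 5 is the sixth element:

```bash
#!/usr/bin/env bash
# Assumed state of the list at this point in the article.
grocery=(Almonds Jam Rice Apples "Peanut Butter" Yogurt)

# Assign a single value to one index; no parentheses here.
grocery[5]="Muesli"

echo "${grocery[5]}"     # Muesli
echo "${#grocery[@]}"    # 6  (the size does not change)
```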
Deleting an element from an array
Indexed array
Finally, let’s complete the journey of the sixth element by removing it, which takes us back to the earlier version of the array. This again requires the index of the element. To remove the sixth element, the statement we need is shown below (the single quotes keep the shell from treating the square brackets as a filename pattern):
unset 'grocery[5]'
Checking if it worked:
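One way to check the result, sketched under the assumption that the list holds six items before the deletion. The second half goes a step beyond the article: removing an element from the middle leaves a gap in the indices, and reassigning the array closes it.

```bash
#!/usr/bin/env bash
# Assumed state of the list before the deletion.
grocery=(Almonds Jam Rice Apples "Peanut Butter" Muesli)

unset 'grocery[5]'
echo "${#grocery[@]}"      # 5

# Removing from the middle leaves the other indices unchanged.
unset 'grocery[1]'
echo "${!grocery[@]}"      # 0 2 3 4

# Reassigning the array from its own elements renumbers them.
grocery=("${grocery[@]}")
echo "${!grocery[@]}"      # 0 1 2 3
```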
Associative array
Like an indexed array, we will use the unset command to delete an element, but we will use the key since there is no indexing in an associative array. We will remove the element for Massachusetts that we added in the last section (again quoted, so that the shell does not treat the brackets as a filename pattern):
unset 'states[MA]'
Executing the script:
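A sketch of how the deletion could be verified. The fallback text "not found" is an assumption for this example and uses the ${parameter:-default} expansion to show something when the key is missing.

```bash
#!/usr/bin/env bash
declare -A states=(["TX"]="Texas" ["CA"]="California" ["NV"]="Nevada" ["MA"]="Massachusetts")

unset 'states[MA]'

echo "${#states[@]}"                # 3
echo "${states[MA]:-not found}"     # not found
echo "${states[TX]:-not found}"     # Texas
```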
Conclusion
Arrays are a vital part of Bash scripting and of programming logic in general. As mentioned before, almost any program that models a real-life situation has to handle collections of data. Learning to manipulate those data sets is the bread and butter of a programmer.
We hope this article was helpful to you. Cheers!