Bash scripts are a highly efficient means of automating tasks, particularly those that take advantage of other existing programs. This automation often requires repeating a similar operation several times, which is precisely where the for loop comes into its own.

Linux and Mac system administrators are typically familiar with scripting via the terminal, but even Windows users can get in on the action with the Windows Subsystem for Linux.

How Bash Scripts Work

A bash script is simply a plain text file containing a series of commands that the bash shell can read and execute. Bash is the default shell on most Linux distributions and on versions of macOS before Catalina.

If you’ve never worked with a shell script before, you should begin with the absolute simplest case. This will allow you to practice key concepts including the creation of the script and its execution.

First, create the following file in a convenient location (ideally, open a terminal and navigate to the desired directory first):

    #!/bin/bash
    echo "Hello, World"

The first line tells whatever runs this program how to run it (i.e. using the bash interpreter). The second is just a command like any other you might enter on the command line. Save that file as hello_world.sh, then:

    $ chmod +x hello_world.sh
    $ ./hello_world.sh

The chmod command on the first line makes the file executable, meaning that it can be run by typing its path (here, ./hello_world.sh), as in the second line.

If you see the words “Hello, World” appear printed on a line in your terminal, then everything’s working as required.

How For Loops Work

In general programming, there are two main types of for loop: numeric and foreach. The numeric type is traditionally the most common, but in bash usage, it’s usually the other way round.

Numeric for loops typically focus on a single integer which determines how many iterations will be carried out, for example:

    for (i = 0; i < 100; i++) {
        /* statements to execute repeatedly */
    }

This is a familiar-looking for loop that will iterate exactly 100 times, unless i is altered within the loop, or another statement causes execution of the for loop to halt.

Foreach loops, in contrast, tend to operate on structures such as lists or arrays, and iterate for every item within that collection:

    people = [ "Peter", "Paul", "Mary" ]

    foreach (people as person) {
        if (person == "Paul") {
            ...
        }
    }

Some languages use a slightly different syntax which swaps the order of collection and item:

    people = [ "Peter", "Paul", "Mary" ]

    for (person in people) {
        if (person == "Paul") {
            ...
        }
    }

For in Loops

In bash, the foreach (or for in) loop is more common. The basic syntax is:

    for arg in [list]
    do
        # statements to execute repeatedly
        # the value of arg can be obtained using $arg
    done

For example, to iterate through three explicitly-named files:

    for file in one.c two.c three.c
    do
        ls "$file"
    done

If such files exist in the current directory, the output from this script will be:

    one.c
    two.c
    three.c

Instead of a fixed set of files, the list can be obtained via a glob pattern (one including wildcards – special characters that represent other characters). In the following example, the for loop iterates across all files (in the current directory) whose names end in ".xml":

    for file in *.xml
    do
        ls -l "$file"
    done

Here’s some example output:

    -rw-r--r-- 1 bobby staff 2436 3 Nov 2019 feed.xml
    -rw-r--r-- 1 bobby staff 6447 27 Oct 16:24 sitemap.xml

This may look very much like a long-winded way of doing:

    $ ls -l *.xml

But there’s a significant difference: the for loop executes the ls program once per matching file (twice here), passing it a single filename each time. In the standalone ls example, the shell expands the glob pattern (*.xml) to the matching filenames first and then passes all of them, as individual command-line arguments, to one instance of ls.
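
The same difference can be observed without ls. The following is a minimal sketch, not part of the original example: count_args is a hypothetical helper that reports how many arguments a single call received, and the two .xml files are created in a temporary directory purely for illustration.

```shell
#!/bin/bash
# Create two sample files in a throwaway directory (illustrative only).
workdir=$(mktemp -d)
touch "$workdir/feed.xml" "$workdir/sitemap.xml"

# Hypothetical helper: reports how many arguments one call received.
count_args() {
    echo "called with $# argument(s)"
}

# One call: the shell expands the glob and passes every match at once.
single_call=$(count_args "$workdir"/*.xml)

# One call per match: the loop hands over a single filename each time.
loop_calls=$(for file in "$workdir"/*.xml; do count_args "$file"; done)

echo "$single_call"
echo "$loop_calls"

rm -rf "$workdir"
```

The first echo prints one line reporting two arguments; the second prints two lines, each reporting one argument.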

Here’s an example that uses the wc (word count) program to make the difference more obvious:

    $ wc -l *.xml
    44 feed.xml
    231 sitemap.xml
    275 total

The wc program counts the number of lines in each file separately, then prints a total count across all of them. In contrast, if wc operates within a for loop:

    for file in *.xml
    do
        wc -l "$file"
    done

You’ll still see the count for each file:

    44 feed.xml
    231 sitemap.xml

But there is no overall summary total because wc is run in isolation, each time the loop iterates.
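
If a total is still wanted, the loop can keep one itself. The sketch below is one possible approach rather than the only one; the sample files and their contents are made up for illustration. It also guards against a detail of glob-driven loops: when nothing matches, bash by default leaves the pattern as literal text, so the loop would otherwise run once with a name that does not exist.

```shell
#!/bin/bash
# Sample files in a throwaway directory (contents are illustrative).
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/feed.xml"         # 2 lines
printf 'a\nb\nc\n' > "$workdir/sitemap.xml"   # 3 lines

total=0
for file in "$workdir"/*.xml
do
    # If the pattern matched nothing it is left unexpanded; skip it.
    [ -e "$file" ] || continue

    # Reading from standard input makes wc print only the number.
    lines=$(wc -l < "$file")
    total=$((total + lines))
done

echo "$total total"

rm -rf "$workdir"
```

With the two sample files above, the script prints a total of 5.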

When a List is Not a List

There’s a common and easily made mistake when dealing with for loops, caused by the way bash handles quoted strings. Looping through a list of files should be done like this:

    for file in one.c two.c

Not like this:

    for file in "one.c two.c"

The second example encloses the filenames in double-quotes, which results in a list with just a single item; the for loop will only execute one time. When the list is held in a variable, leave the variable unquoted in the for statement so that bash splits its value into separate words:

    FILES="one.c two.c"
    for file in $FILES
    do
        ...
    done

Note that the variable declaration itself does need to enclose its value in double-quotes!
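
Because the unquoted variable is split on whitespace, this approach breaks down for filenames that themselves contain spaces. A bash array is one alternative, sketched here with made-up filenames (the name "my notes.c" is an assumption chosen to show the difference):

```shell
#!/bin/bash
# Each array element stays intact, even if it contains a space.
files=("one.c" "two.c" "my notes.c")

count=0
for file in "${files[@]}"
do
    echo "file: $file"
    count=$((count + 1))
done

echo "$count iterations"
```

The loop runs three times, once per element, whereas a plain string holding the same names would be split into four words.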

For Without a List

With nothing to iterate through, a for loop operates on whatever command-line arguments were provided to the script when invoked. For example, if you have a script named args.sh containing the following:

    #!/bin/bash
    for a
    do
        echo "$a"
    done

Then executing args.sh will give you the following:

    $ ./args.sh one two three
    one
    two
    three

Bash recognizes this case and treats for a; do as the equivalent of for a in "$@"; do, where "$@" is a special parameter that expands to the command-line arguments, each as a separate word.
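
The quoting matters once an argument contains a space. The sketch below uses two hypothetical functions (inside a function, the same rules apply to the function's own arguments) to compare the list-less form with an unquoted $@:

```shell
#!/bin/bash
# Hypothetical helper: counts iterations of the list-less form.
count_implicit() {
    n=0
    # No list given, so the loop walks the arguments as "$@" would.
    for a
    do
        n=$((n + 1))
    done
    echo "$n"
}

# Hypothetical helper: counts iterations when $@ is left unquoted.
count_unquoted() {
    n=0
    # Unquoted, each argument is split again on whitespace.
    for a in $@
    do
        n=$((n + 1))
    done
    echo "$n"
}

implicit=$(count_implicit "one two" three)
unquoted=$(count_unquoted "one two" three)
echo "implicit: $implicit, unquoted: $unquoted"
```

With the two arguments "one two" and three, the list-less form iterates twice, while the unquoted form iterates three times.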

Emulating a Traditional Numeric For Loop

Bash scripts often deal with lists of files or lines of output from other commands, so the for in type of loop is common. However, the traditional C-style form is still supported:

    for (( i=1; i<=5; i++ ))
    do
        echo $i
    done

This is the classic form with three parts in which:

  1. a variable is initialised (i=1) when the loop is first encountered
  2. the loop continues so long as the condition (i<=5) is true
  3. each time around the loop, the variable is incremented (i++)

Iterating between two values is a common enough requirement that there’s a shorter, slightly less confusing alternative:

    for i in {1..5}
    do
        echo $i
    done

The brace expansion that takes place effectively translates the above for loop into:

    for i in 1 2 3 4 5
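
One limitation of the brace form is worth knowing: brace expansion happens before variables are expanded, so a range whose end point is held in a variable is not expanded into numbers. The sketch below (the variable names are illustrative) shows the effect and falls back on the C-style form, which evaluates the variable arithmetically:

```shell
#!/bin/bash
n=3

# The braces are not expanded here, so the loop runs once with i set
# to the literal text {1..3}.
literal=""
for i in {1..$n}
do
    literal="$i"
done
echo "brace form saw: $literal"

# The C-style form reads n as a number and counts as intended.
count=0
for (( i=1; i<=n; i++ ))
do
    count=$((count + 1))
done
echo "C-style form ran $count times"
```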
    

Finer Loop Control With Break and Continue

More complex for loops often need a way of exiting early or immediately restarting the main loop with the next value in turn. To do so, bash borrows the break and continue statements that are common in other programming languages. Here’s an example that uses both to find the first file that’s more than 100 characters long (measured in bytes, since that is what wc -c counts):

    #!/bin/bash
    for file in *
    do
        if [ ! -f "$file" ]
        then
            echo "$file is not a file"
            continue
        fi

        # wc -c reports the size in bytes
        num_chars=$(wc -c < "$file")
        echo "$file is $num_chars characters long"

        if [ "$num_chars" -gt 100 ]
        then
            echo "Found $file"
            break
        fi
    done

The for loop here operates on all files in the current directory. If the file is not a regular file (e.g. if it’s a directory), the continue statement is used to restart the loop with the next file in turn. If it’s a regular file, the second conditional block will determine if it contains more than 100 characters. If so, the break statement is used to immediately leave the for loop (and reach the end of the script).
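
The same control flow can be tried without depending on the contents of a directory. In this minimal sketch the word list and the "stop" sentinel are invented for illustration: empty entries are skipped with continue, and the loop ends at the sentinel with break.

```shell
#!/bin/bash
kept=""
for word in alpha "" beta stop gamma
do
    # Skip empty entries and move on to the next word.
    if [ -z "$word" ]
    then
        continue
    fi

    # Leave the loop entirely at the sentinel value.
    if [ "$word" = "stop" ]
    then
        break
    fi

    kept="$kept $word"
done

echo "kept:$kept"
```

Only alpha and beta are kept: the empty entry is skipped, and gamma is never reached because the loop has already exited.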

Conclusion

A bash script is a file containing a set of instructions that can be executed. A for loop allows part of a script to be repeated many times. With the use of variables, external commands, and the break and continue statements, bash scripts can apply more complex logic and carry out a wide range of tasks.