
4d. For-Loops


Introduction

A key feature of programming is its ability to automate repetitive tasks. A standard way to implement this is via for-loops. They allow you to execute the same block of code, treating each element of a list as a different input.

For-loops are delimited with the keywords for and end. To illustrate this syntax, consider the function println(a), which evaluates a and displays its output in the REPL. In case a is a string, println(a) will simply display the word stored in a. The following script repeatedly applies println to display each word in a list.

Code

for x in ["hello","beautiful","world"]
    println(x)
end

Output in REPL
hello
beautiful
world
Remark
The keyword in can be replaced by ∈ or =. [note] Recall that ∈ can be written through Tab Completion using the command \in. Consequently, the following are all equivalent.

for x in ["hello","beautiful","world"] 
    println(x)
end

for x ∈ ["hello","beautiful","world"] 
    println(x)
end

for x = ["hello","beautiful","world"] 
    println(x)
end

Furthermore, we can use any term to describe the iteration variable.

An Alternative Name For The Iteration Variable

for word in ["hello","beautiful","world"] 
    println(word)
end

Output in REPL
hello
beautiful
world

Based on this example, we can identify three components that characterize a for-loop:

  • A code block to be executed: represented in the example by println(x).

  • A list of elements: represented in the example by ["hello","beautiful","world"]. This specifies the elements over which we'll apply the code block, and allows for elements of any data type (e.g., strings, numbers, and even functions). The only requirement is that the list must be an iterable object, defined as a collection whose elements can be accessed individually. Vectors, as in the example, are iterable objects, but there are others. One important type of iterable object is ranges, which we'll explore later.

  • An iteration variable: represented in the example by x. This serves as a label that takes on the value of each element in the list, one at a time, as the loop iterates. The iteration variable is a local variable, with no significance outside the for-loop. Its sole purpose is to provide a convenient way to access and manipulate the elements of the list within the loop.
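
As a minimal sketch of these three components, the loop below iterates over a vector that holds a string, a number, and a function. The names items and item are chosen here purely for illustration.

```julia
# a vector mixing a string, an integer, and a function (illustrative names)
items = ["hello", 3, sqrt]

for item in items                 # `item` is the iteration variable
    println(typeof(item))         # code block applied to each element of the list
end
```

Each iteration displays the type of the element currently labeled by item.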


The following sections will demonstrate how different objects can be employed as lists. We'll also show that for-loops can iterate over elements that may not be immediately apparent, such as functions.

Always Wrap For-Loops in Functions
At this stage of the website, we're still introducing fundamental concepts. Thus, the subjects are presented as simply as possible. This is why for-loops in particular will be written in the global scope.

However, you should always wrap for-loops in functions. Executing for-loops outside a function severely degrades performance, and is additionally subject to different rules regarding variable scoping. [note] In fact, older versions of Julia restricted the use of for-loops in the global scope.
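
A minimal sketch of this recommendation is shown next. The function name sum_squares and the variable total are hypothetical, introduced only to illustrate a for-loop living inside a function.

```julia
# hypothetical helper: the loop lives inside a function, so all its variables are local
function sum_squares(v)
    total = zero(eltype(v))       # accumulator starting at zero, with the same type as the elements of `v`
    for a in v
        total += a^2              # add the square of each element
    end
    return total
end

result = sum_squares([1, 2, 3])   # returns 14
```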

About High Performance
For-loops can negatively impact performance in certain languages (e.g., Matlab, Python, and R). These languages recommend instead using vectorized code, which operates over the whole vector rather than its individual elements. In Part II of this website, we'll see that such a recommendation has no bearing on Julia. Indeed, for-loops are as fast as vectorized code, and speeding up operations in Julia commonly relies on them.
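
To make the comparison concrete, the sketch below computes the same result with a for-loop and with vectorized code. It only checks that both approaches agree and doesn't measure speed. The function name squares_loop is hypothetical, and eachindex, which iterates over the indices of a vector, is a standard Julia function.

```julia
# hypothetical function squaring each element through an explicit for-loop
function squares_loop(x)
    y = similar(x)                # uninitialized vector with the same size and element type as `x`
    for i in eachindex(x)         # iterates over the indices of `x`
        y[i] = x[i]^2
    end
    return y
end

x          = [1, 2, 3]
same_result = squares_loop(x) == x .^ 2     # true: both approaches yield [1, 4, 9]
```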

Iterating over Indices

So far, we've considered a simple list like ["hello", "beautiful", "world"] to illustrate how for-loops work. In real applications, nonetheless, manually specifying each element in a list quickly becomes impractical. Fortunately, we can streamline the process when the list follows a predictable pattern, such as a sequence of numbers: instead of listing each element individually, we can directly describe the pattern that generates the list.

Building on this insight, we'll explore how to define ranges, which embody this approach by specifying a sequence of numbers. The technique not only allows us to iterate over numerical sequences, but also provides the possibility of accessing specific elements within a collection.

Ranges

Ranges in Julia are defined through the syntax <begin>:<steps>:<end>, where <begin> is the starting value and <end> is the ending value. Likewise, <steps> determines the increment between values, with a default increment of one if the term is omitted. By specifying a negative value for <steps>, you can also reverse the order of the range. All this is demonstrated below.

for i in 1:2:5
    println(i)
end

Output in REPL
1
3
5

for i in 3:-1:1
    println(i)
end

Output in REPL
3
2
1

Remark
Ranges in Julia aren't limited to for-loops. For instance, they can define vectors comprising sequential values when combined with the function collect.

Code

x = collect(4:6)

Output in REPL
julia> x
3-element Vector{Int64}:
 4
 5
 6
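
The function collect also offers a convenient way to inspect which values a range contains. The following sketch, with variable names chosen only for illustration, covers a few cases that may not be obvious from the syntax alone.

```julia
odd_values  = collect(1:2:6)      # [1, 3, 5]: the sequence never goes past <end>
half_steps  = collect(0:0.5:2)    # [0.0, 0.5, 1.0, 1.5, 2.0]: <steps> need not be an integer
empty_range = collect(5:1)        # empty vector: counting up from 5 never reaches 1
```

In particular, a for-loop over a range like 5:1 executes zero iterations, rather than throwing an error.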

Iterating over Indices of an Array

Ranges can additionally be used to access elements of a collection. In combination with a for-loop, this enables us to apply the same block of code to each element of a vector.

In particular, it's possible to iterate over all indices of a vector x through the expression 1:length(x), where length(x) returns the number of elements in x. The same functionality can be achieved by the function eachindex(x). In fact, this is the recommended method for iterating over all elements, as it returns an iterator optimized for each type of array.

x = [4, 6, 8]

for i in 1:length(x)
    println(x[i])
end

Output in REPL
4
6
8

x = [4, 6, 8]

for i in eachindex(x)
    println(x[i])
end

Output in REPL
4
6
8

Remark
There are other approaches to iterating over all indices of a vector x. For instance, you can use LinearIndices(x), or firstindex(x):lastindex(x) to specify a range from the first to the last index of x.

This multiplicity of methods is necessary to handle non-standard indices, such as those provided by the OffsetArrays.jl package. This package lets arrays start at an index other than 1, such as 0, a convention found in many other languages. Nevertheless, unless you're creating a package for third-party use, you don't have to worry about which approach to implement. Indeed, for standard vectors, they can all be used interchangeably, as shown in the examples below.

x = [4, 6, 8]

for i in eachindex(x)
    println(x[i])
end

Output in REPL
4
6
8

x = [4, 6, 8]

for i in 1:length(x)
    println(x[i])
end

Output in REPL
4
6
8

x = [4, 6, 8]

for i in LinearIndices(x) 
    println(x[i])
end

Output in REPL
4
6
8

x = [4, 6, 8]

for i in firstindex(x):lastindex(x)
    println(x[i])
end

Output in REPL
4
6
8

Among all the alternatives, I recommend sticking to eachindex. This automatically selects the most efficient method for each type of collection. Additionally, it employs the same syntax, regardless of the indexing approach.
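
Iterating over indices is especially useful when the goal is to modify the elements of a vector, rather than merely read them. The sketch below illustrates this with a hypothetical function double!, whose name is not part of Julia and is only an assumption for the example.

```julia
# hypothetical helper that doubles every element of `v` in place
function double!(v)
    for i in eachindex(v)
        v[i] = 2 * v[i]           # overwrite the i-th element
    end
    return v
end

x = [4, 6, 8]
double!(x)                        # `x` is now [8, 12, 16]
```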

Rules for Variable Scope in For-Loops

Similar to functions, for-loops create a new scope for variables. In fact, the scoping rules for both are similar, with one key difference: when run interactively, a for-loop can reassign a global variable, whereas a function cannot do so unless the variable is declared with the keyword global.

Warning!
The rules of variable scope we present apply generally, with one exception that arises from following bad practices. Since this case is rare, we won't cover it.

Basically, the issue occurs when i) the for-loop is not wrapped in a function, ii) a local variable shares its name with a global variable, and iii) the script is run non-interactively (i.e., run by using the function include). [note] Recall that there are two methods to execute a script, as we saw here. The first method is the one we're using, where you work interactively with Julia. This includes running commands in the REPL's prompt julia> and executing a script through a code editor. The second method consists of executing a file that stores a script, through the function include.

Unless the three conditions hold simultaneously, you don't have to worry about this scenario. And even if this occurs, Julia will display a warning in the REPL indicating that there's a problem with your code.

To formalize the variable scope of for-loops, we'll refer to a variable x. The rules governing its scope are:

  • the variable of iteration x is always local, regardless of whether there's a variable x defined outside the for-loop.

  • if there's no variable named x outside the for-loop, x is a new local variable. Moreover, this variable won't be accessible outside the for-loop.

  • if there's a variable named x outside the for-loop, x refers to this variable.

The following code snippets illustrate the first two rules, which exclusively refer to local variables. The second example is particularly noteworthy, as it highlights a common mistake made by beginners: running a for-loop that defines a local variable, and then trying to access it outside the for-loop.

x = 2

for x in ["hello"]          # this 'x' is local, not related to 'x = 2'
    println(x)
end

Output in REPL
hello

#no `x` outside the for-loop

for word in ["hello"]
    x = word                # `x` is local to the for-loop, not available outside it
end

Output in REPL
julia> x
ERROR: UndefVarError: x not defined

Likewise, the following examples demonstrate the consequences of the last rule, which concerns variables defined outside the for-loop.

x = [2, 4, 6]

for i in eachindex(x)
    x[i] = x[i] * 10        # it modifies the `x` defined outside the for-loop
end

Output in REPL
julia> x
3-element Vector{Int64}:
 20
 40
 60

x = [2, 4, 6]

for word in ["hello"]
    x = word                        # it reassigns the `x` defined outside the for-loop
end

Output in REPL
julia> x
"hello"
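
The same rules operate when the for-loop is wrapped in a function, which is the recommended practice. The sketch below assumes a hypothetical function longest_word: the variable longest is defined before the loop, so the loop refers to it, whereas word only exists inside the loop.

```julia
# hypothetical function returning the longest word in a list
function longest_word(words)
    longest = ""                  # defined outside the for-loop
    for word in words             # `word` is local to the for-loop
        if length(word) > length(longest)
            longest = word        # it refers to the `longest` defined outside the for-loop
        end
    end
    return longest
end

longest_word(["hello", "beautiful", "world"])    # returns "beautiful"
```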

Array Comprehensions

It's possible to create arrays by applying a code block to each element of a collection. The technique is known as array comprehensions, and has a syntax similar to a for-loop: [<expression> for... if...], where <expression> can be an operation or a function.

For illustration purposes, consider a vector x. Suppose the goal is to create a vector y, where each element in y is the square of the corresponding element in x. [note] This example only aims at explaining the syntax of array comprehensions—y could be simply created by y = x .* x or y = x .^ 2. The following code snippets show two approaches to creating y using array comprehensions.

x = [1,2,3]


y = [a^2 for a in x]                  # or y = [x[i]^2 for i in eachindex(x)]

Output in REPL
julia> y
3-element Vector{Int64}:
 1
 4
 9

x = [1,2,3]

foo(a) = a^2
y      = [foo(a) for a in x]          # or y = [foo(x[i]) for i in eachindex(x)]

Output in REPL
julia> y
3-element Vector{Int64}:
 1
 4
 9
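
To see what an array comprehension replaces, the following sketch builds the same vector through an explicit for-loop. The function name squares_push is hypothetical, and the example assumes integer inputs. It relies on push!, a standard Julia function that appends an element to a vector.

```julia
# hypothetical function replicating `[a^2 for a in x]` with an explicit for-loop
function squares_push(x)
    y = Int[]                     # empty vector of integers (assumes integer inputs)
    for a in x
        push!(y, a^2)             # append the square of each element
    end
    return y
end

x        = [1, 2, 3]
same_out = squares_push(x) == [a^2 for a in x]    # true
```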

You can also add conditions to array comprehensions, by placing them at the end of the expression.

Comprehension with Condition

x = [i for i in 1:4 if i ≤ 3]

Output in REPL
julia> x
3-element Vector{Int64}:
 1
 2
 3
Remark
Array comprehensions can also be used to create matrices. Regarding syntax, this involves a comma to separate the description of each dimension.

Comprehension for Matrices

y = [i * j for i in 1:2, j in 1:2]

Output in REPL
julia> y
2×2 Matrix{Int64}:
 1  2
 2  4

Multiple Iterations

Thus far, we've considered for-loops that iterate over single values. We now extend their application to iterate over multiple values. Specifically, we'll examine two scenarios: iterating simultaneously over two lists and iterating over both the indices and values of a vector.

Iterating over Two Lists

There are two ways to simultaneously iterate over two lists x and y, depending on the combination of elements desired. First, if you seek to iterate over all the possible combinations of their elements, you need the function Iterators.product(x,y). This is part of the module Iterators, which belongs to Base and is therefore available by default in each Julia session.

Alternatively, you can iterate over pairs of elements that occupy the same position in x and y. This is implemented through the function zip(x,y). In this case, the i-th iteration considers the pair formed by the i-th elements of x and y.

list1 = [1, 2]
list2 = [3, 4]

for (a,b) in Iterators.product(list1,list2) #it takes all possible combinations
    println([a,b])
end

Output in REPL
[1, 3]
[2, 3]
[1, 4]
[2, 4]

list1 = [1, 2]
list2 = [3, 4]

for (a,b) in zip(list1,list2)               #it takes pairs of elements with the same index
    println([a,b])
end

Output in REPL
[1, 3]
[2, 4]
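
The two functions also differ when the lists have different lengths. The sketch below, with variable names chosen only for illustration, shows that zip stops once the shorter list is exhausted, while Iterators.product still covers every combination.

```julia
list1 = [1, 2, 3]
list2 = [4, 5]

pairs_zip      = collect(zip(list1, list2))                 # [(1, 4), (2, 5)]: the element 3 is never visited
n_combinations = length(Iterators.product(list1, list2))    # 6 combinations (3 × 2)
```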

It's also possible to iterate over multiple values via array comprehensions. While iteration over pairs still requires the function zip, iterating over all possible combinations can be achieved with a simpler syntax. This involves the inclusion of multiple for clauses within the comprehension.

x = [i * j for i in 1:2 for j in 1:2]

Output in REPL
julia> x
4-element Vector{Int64}:
 1
 2
 2
 4

x = [i * j for (i,j) in zip(1:2, 1:2)]

Output in REPL
julia> x
2-element Vector{Int64}:
 1
 4
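
Note that consecutive for clauses behave differently from a single for clause with comma-separated variables. The sketch below contrasts both forms by storing the pair (i,j) itself, which makes the order of iteration visible. The names A and v are only illustrative.

```julia
A = [(i, j) for i in 1:2, j in 1:3]       # comma-separated: a 2×3 matrix
v = [(i, j) for i in 1:2 for j in 1:3]    # consecutive `for` clauses: a 6-element vector

size(A)       # (2, 3)
v[2]          # (1, 2): the variable of the last `for` clause changes fastest
vec(A)[2]     # (2, 1): matrices are stored column by column, so `i` changes fastest
```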

Simultaneously Iterating over Indices and Values

We can also iterate over each index-value pair of a vector. This is implemented through the function enumerate.

x = ["hello", "world"]

for (index,value) in enumerate(x)
    println("$index $value")
end

Output in REPL
1 hello
2 world

x = [10, 20]


y = [index * value for (index,value) in enumerate(x)]

Output in REPL
julia> y
2-element Vector{Int64}:
 10
 40
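
A typical use of enumerate is recovering the positions of the elements that satisfy some condition. The sketch below combines it with a conditional array comprehension. The threshold of 20 and the name positions are arbitrary choices for illustration.

```julia
x = [10, 25, 5, 40]

# positions of the elements greater than 20
positions = [index for (index, value) in enumerate(x) if value > 20]    # [2, 4]
```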

Iterating over Functions

Functions in Julia are first-class objects. This means that they can be manipulated just like other fundamental data types, such as strings and numbers. In particular, it allows us to define a vector of functions and apply each to an object. The following example illustrates this capability, by using a vector of functions to compute descriptive statistics for a vector x.

Code

using Statistics    # loaded to access the functions `mean` and `median`

x              = [10, 50, 100]
list_functions = [maximum, minimum, mean, median]

descriptive(vector,list) = [foo(vector) for foo in list]

Output in REPL
julia> descriptive(x, list_functions)
4-element Vector{Real}:
 100
  10
  53.333333333333336
  50.0
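
Since the output above doesn't indicate which function produced each number, a possible extension is to pair each result with the name of its function. The sketch below does this through nameof, a standard Julia function that returns the name of a function as a Symbol. To keep the example free of packages, it uses sum instead of mean and median, and the name report is only illustrative.

```julia
x              = [10, 50, 100]
list_functions = [maximum, minimum, sum]

# pair the name of each function with the result of applying it to `x`
report = [(nameof(foo), foo(x)) for foo in list_functions]
# [(:maximum, 100), (:minimum, 10), (:sum, 160)]
```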