
3e. Map and Broadcasting (Vectorization)

PhD in Economics

Introduction

This section explores element-wise operations on iterable collections, which are characterized by their ability to access elements sequentially. Examples of such collections include vectors, tuples, and ranges.

The first approach covered is the map function. This applies a function to each individual element, making it particularly convenient for transforming collections.

After this, we'll shift our focus to a fundamental technique in Julia known as broadcasting. This allows the user to apply functions and operators element-wise while keeping the code compact. The technique is quite versatile, supporting operations on collections of equal size, as well as on combinations of scalars and same-size collections. Its distinctive syntax, which involves adding a dot . to the function or operator, makes it easily identifiable.

Remark
On this website, the terms broadcasting and vectorization will be used interchangeably. Strictly speaking, vectorization is restricted to applications with arrays of the same size, while broadcasting is an extension allowing for scalars.
Warning!
Later on the website, we'll explore for-loops as an alternative approach to transforming arrays. Several languages recommend vectorizing operations to improve speed and strongly discourage for-loops.
Such advice does not apply to Julia. In fact, when it comes to optimizing code in Julia, for-loops are often the key to achieving faster performance. In this context, the main advantage of vectorization in Julia is to streamline code without sacrificing much speed.
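
To make the comparison concrete, the following minimal sketch computes the same transformation with broadcasting and with a for-loop. The variable names y_broadcast and y_loop are ours, chosen only for illustration, and the snippet makes no claim about which version is faster.

```julia
x = [1, 2, 3]

# broadcasting version: one compact expression
y_broadcast = 2 .* x .+ 1

# for-loop version: fill a preallocated vector element by element
y_loop = similar(x)                 # vector with the same type and size as 'x'
for i in eachindex(x)
    y_loop[i] = 2 * x[i] + 1
end

println(y_broadcast == y_loop)      # both approaches give the same vector
```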

The "Map" Function

The map function is available in many programming languages, enabling the user to create a new collection with transformed elements. It can be applied in two ways, depending on the number of input collections.

In its most basic form, map takes a single-argument function foo and a collection x. Its syntax is map(foo,x), and returns a new collection with i-th element foo(x[i]). A common use case for map involves using an anonymous function as the argument foo, as illustrated below.

x = [1, 2, 3]


z = map(log,x)

Output in REPL
julia>
z
3-element Vector{Float64}: 0.0 0.69315 1.09861

julia>
[log(x[1]), log(x[2]), log(x[3])]
3-element Vector{Float64}: 0.0 0.69315 1.09861

x = [1, 2, 3]


z = map(a -> 2 * a, x)

Output in REPL
julia>
z
3-element Vector{Int64}: 2 4 6

julia>
[2*x[1], 2*x[2], 2*x[3]]
3-element Vector{Int64}: 2 4 6
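
The examples above use vectors, but map accepts the other collections mentioned in the introduction too. The sketch below applies the same function to a vector, a tuple, and a range. The function name double is ours, introduced only for illustration.

```julia
double(a) = 2 * a                      # hypothetical helper for a single element 'a'

from_vector = map(double, [1, 2, 3])   # a vector yields a vector
from_tuple  = map(double, (1, 2, 3))   # a tuple yields a tuple
from_range  = map(double, 1:3)         # a range yields a vector

println(from_vector)
println(from_tuple)
println(from_range)
```

Note that the type of the result depends on the input: mapping over a tuple returns a tuple, while mapping over a range returns a vector.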

The second way to use map is with functions that take multiple arguments. In this case, the syntax is map(foo, x, y), where foo is a two-argument function, returning a new collection with i-th element foo(x[i], y[i]). When the collections x and y have different sizes, foo is applied element-wise until the shortest collection is exhausted. Note that this rule applies even when either x or y is a scalar, in which case map returns a single element.

To demonstrate the use of map with multiple arguments, let's consider the addition operation +. As you may recall, + can be used in two ways: as an operator (e.g., 2 + 3) and as a function (e.g., +(2, 3)). By using + as a function, we can apply map to perform element-wise addition of multiple collections.

x = [ 1, 2, 3]
y = [-1,-2,-3]

z = map(+, x, y)        # recall that `+` exists as both operator and function

Output in REPL
julia>
z
3-element Vector{Int64}: 0 0 0

julia>
[+(x[1],y[1]), +(x[2],y[2]), +(x[3],y[3])]
3-element Vector{Int64}: 0 0 0

x = [ 1, 2, 3]
y = [-1,-2,-3]

z = map((a,b) -> a+b, x, y)

Output in REPL
julia>
z
3-element Vector{Int64}: 0 0 0

julia>
[x[1]+y[1], x[2]+y[2], x[3]+y[3]]
3-element Vector{Int64}: 0 0 0

x = [ 1, 2, 3]
y = [-1,-2]

z = map(+, x, y)        # recall that `+` is both an operator and a function

Output in REPL
julia>
z
2-element Vector{Int64}: 0 0

julia>
[+(x[1],y[1]), +(x[2],y[2])]
2-element Vector{Int64}: 0 0

x = [ 1, 2, 3]
y =  -1

z = map(+, x, y)        # recall that `+` is both an operator and a function

Output in REPL
julia>
z
1-element Vector{Int64}: 0

julia>
[+(x[1],y[1])]
1-element Vector{Int64}: 0

Broadcasting

The function map can quickly become cumbersome when dealing with complex functions or multiple arguments. This is where broadcasting comes into play, offering a more concise syntax for element-wise operations.

The concept of broadcasting will be explored in a step-by-step manner. We'll first explore how broadcasting applies to vectors of equal size, covering both functions and operators. After this, we'll demonstrate how broadcasting can also accommodate combinations of scalars and vectors, despite not supporting vectors of different sizes in general. In these cases, the scalar is treated as a vector that matches the size of the vectors.

Unlike in many other programming languages, broadcasting is an intrinsic feature of Julia and applies to any function or operator. In particular, it applies even to user-defined functions.

Broadcasting Functions

Broadcasting allows functions to be applied element-wise, significantly expanding their range of applications. The feature is implemented by simply appending a dot after the name of the function, as in foo.(x).

Remarkably, any function foo can be broadcast by calling it as foo.(x). This implies that the functionality is automatically available for all user-defined functions. Additionally, broadcasting isn't restricted to numeric collections: it applies regardless of the type of the elements. For instance, we'll show below that it can be used for string manipulation.

Similarly to map, broadcasting can be applied to both single- and multiple-argument functions. Each case warrants separate consideration.

In the case of single-argument functions, broadcasting foo over a collection x returns a new collection with foo(x[i]) as its i-th element. The following examples demonstrate its usage.

# `log(a)` is a function for a single element `a`

x = [1,2,3]

Output in REPL
julia>
log.(x)
3-element Vector{Float64}: 0.0 0.69315 1.09861

julia>
[log(x[1]), log(x[2]), log(x[3])] # identical to log.(x)
3-element Vector{Float64}: 0.0 0.69315 1.09861

square(a) = a^2     #user-defined function for a single element 'a'

x = [1,2,3]

Output in REPL
julia>
square.(x)
3-element Vector{Int64}: 1 4 9

julia>
[square(x[1]), square(x[2]), square(x[3])] # identical to square.(x)
3-element Vector{Int64}: 1 4 9

As for multiple-argument functions, suppose a function foo and collections x and y. Then, foo.(x,y) returns a new collection with foo(x[i],y[i]) as its i-th element.

Importantly, collections with different sizes aren't allowed, establishing a clear contrast between broadcasting and map. We'll see later that the sole exception to this rule is when one of the objects is a scalar (or a single-element collection).

To illustrate the usage, we make use of the function max, which returns the maximum of its arguments (e.g., max(1,2) returns 2).

# 'max(a,b)' returns 'a' if 'a>b', and 'b' otherwise

x = [0, 4, 0]
y = [2, 0, 8]

Output in REPL
julia>
max.(x,y)
3-element Vector{Int64}: 2 4 8

julia>
[max(x[1],y[1]), max(x[2],y[2]), max(x[3],y[3])] # identical to max.(x,y)
3-element Vector{Int64}: 2 4 8

foo(a,b) = a + b        # user-defined function for single elements 'a' and 'b'

x = [-2, -4, -10]
y = [ 2,  4,  10]

Output in REPL
julia>
foo.(x,y)
3-element Vector{Int64}: 0 0 0

julia>
[foo(x[1],y[1]), foo(x[2],y[2]), foo(x[3],y[3])] # identical to foo.(x,y)
3-element Vector{Int64}: 0 0 0
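
The contrast with map regarding collections of different sizes can be checked directly. In the sketch below, map silently stops at the shortest collection, while broadcasting throws a DimensionMismatch error. The try/catch block is only there so that the script keeps running, and the names z_map and z_broadcast are ours.

```julia
x = [1, 2, 3]
y = [-1, -2]

# 'map' stops once the shortest collection is exhausted
z_map = map(+, x, y)

# broadcasting rejects vectors with lengths 3 and 2
z_broadcast = try
    x .+ y
catch err
    err                     # store the caught exception instead of a result
end

println(z_map)
println(z_broadcast isa DimensionMismatch)
```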

Remark
Broadcasting can be applied to functions with arguments of any type, not just numbers. For instance, consider the function string, which concatenates string arguments to form a sentence (e.g., string("hello ","world") returns "hello world").

Code

country       =  ["France", "Canada"]
is_in         =  [" is in "  , " is in "]
region        =  ["Europe", "North America"]

Output in REPL
julia>
string.(country, is_in, region)
2-element Vector{String}: "France is in Europe" "Canada is in North America"

Broadcasting Operators

It's also possible to broadcast operators, so that they apply element-wise. This requires prepending a dot to the operator.

For its application, it's helpful to recall the syntax of operators based on the number of operands. Specifically, the syntax of unary operators is <symbol>x, so that .√x broadcasts √. Likewise, the syntax for binary operators is x <symbol> y, such that x .+ y computes the element-wise sum of vectors x and y, resulting in [x[1]+y[1], x[2]+y[2], ...].

x = [ 1,  2,  3]
y = [-1, -2, -3]

Output in REPL
julia>
x .+ y
3-element Vector{Int64}: 0 0 0

x = [1, 2, 3]

Output in REPL
julia>
.√x
3-element Vector{Float64}: 1.0 1.41421 1.73205
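
Since operators are also functions, a dotted operator can be written in several equivalent ways. The sketch below compares the dotted operator with the function broadcast and with the function-call form of +. The variable names are ours.

```julia
x = [ 1,  2,  3]
y = [-1, -2, -3]

a = x .+ y                 # dotted binary operator
b = broadcast(+, x, y)     # same operation through the function 'broadcast'
c = (+).(x, y)             # '+' used as a function, broadcast with the usual dot

d = .√x                    # dotted unary operator
e = sqrt.(x)               # '√' is another name for the function 'sqrt'

println(a == b == c)
println(d == e)
```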

Broadcasting Operators With Single-Element Objects

All the cases covered so far involved arguments of the same size. In contrast, broadcasting isn't possible with collections of different sizes, such as x = [1,2] and y = [3,4,5].

One exception to this rule occurs when functions/operations are broadcasted over vectors of equal size and scalars. In these cases, scalars are treated as objects having the same size as the vectors, with all entries equal to the scalar number. For example, given x = [1,2,3] and y = 2, the expression x .+ y produces the same result as defining y = [2,2,2] and then executing x .+ y. This is demonstrated below.

x = [0,10,20]
y = 5

Output in REPL
julia>
x .+ y
3-element Vector{Int64}: 5 15 25

x = [0,10,20]
y = [5, 5, 5]

Output in REPL
julia>
x .+ y
3-element Vector{Int64}: 5 15 25
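
The same expansion applies to single-element collections and to broadcasted functions, not only to operators with scalars. The following sketch, with variable names of our own choosing, illustrates both cases.

```julia
x = [0, 10, 20]

with_scalar = x .+ 5        # scalar expanded to match 'x'
with_vector = x .+ [5]      # one-element vector, expanded in the same way
with_tuple  = x .+ (5,)     # one-element tuple, expanded in the same way

at_least_12 = max.(x, 12)   # scalars are also expanded in broadcasted functions

println(with_scalar == with_vector == with_tuple)
println(at_least_12)
```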

Remark
Scalar expansion isn't limited to numbers. In broadcasting, a string is treated as a single element, so the example for strings presented above can be rewritten as follows.

Code

country       =  ["France", "Canada"]
is_in         =  " is in "
region        =  ["Europe", "North America"]

Output in REPL
julia>
string.(country, is_in, region)
2-element Vector{String}: "France is in Europe" "Canada is in North America"

Iterable Objects

So far, our examples have focused on broadcasting vectors. Furthermore, we've explored broadcasting by treating functions and operators separately. Next, we apply broadcasting to other collections and to combined operations.

We first show that broadcasting isn't limited to vectors. It also applies to other collections, such as tuples and ranges, as the following examples demonstrate.

x = (1, 2, 3)    # or simply x = 1, 2, 3

Output in REPL
julia>
log.(x)
(0.0, 0.69315, 1.09861)

julia>
x .+ x
(2, 4, 6)

x = 1:3

Output in REPL
julia>
log.(x)
3-element Vector{Float64}: 0.0 0.69315 1.09861

julia>
x .+ x
2:2:6

x = (1, 2, 3)    # or simply x = 1, 2, 3
y = 1:3

Output in REPL
julia>
x .+ y
3-element Vector{Int64}: 2 4 6

It's also possible to simultaneously broadcast operators and functions. Given the pervasiveness of such operations, Julia provides the macro @. for an effortless application. The macro should be added at the beginning of the statement, and has the effect of automatically adding a "dot" to each operator and function found in a statement.

To demonstrate its use, suppose we seek to add two vectors element-wise, and then transform the resulting elements by squaring each.

x = [1, 0, 2]
y = [1, 2, 0]

square(x) = x^2

Output in REPL
julia>
square.(x .+ y)
3-element Vector{Int64}: 4 4 4

x = [1, 0, 2]
y = [1, 2, 0]

square(x) = x^2

Output in REPL
julia>
@. square(x + y)
3-element Vector{Int64}: 4 4 4

x = [1, 0, 2]
y = [1, 2, 0]

temp = x .+ y
z    = temp .^ 2

Output in REPL
julia>
temp
3-element Vector{Int64}: 2 2 2

julia>
z
3-element Vector{Int64}: 4 4 4
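
The macro becomes more convenient as the number of operations grows. As a minimal sketch, the expression below mixes a function with several operators, and the two versions should produce the same vector. The names manual and with_macro are ours.

```julia
x = [1, 0, 2]
y = [1, 2, 0]

square(a) = a^2

manual     = square.(x .+ y) .- 2 .* y     # every dot written by hand
with_macro = @. square(x + y) - 2 * y      # '@.' adds the dots for us

println(manual)
println(manual == with_macro)
```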

Broadcasting Functions vs Broadcasting Operators

We've demonstrated that both functions and operators can be broadcasted. This permits us to implement operations in two distinct ways: either broadcast a function that operates on a single element or define a function that directly performs the broadcasted operation.

By computing the squared elements of a vector x, the examples below demonstrate both approaches.

x                 = [1, 2, 3]

number_squared(a) = a ^ 2         # function for a single element 'a'

Output in REPL
julia>
number_squared.(x)
3-element Vector{Int64}: 1 4 9

x                 = [1, 2, 3]

vector_squared(x) = x .^ 2         # function for a vector 'x'

Output in REPL
julia>
vector_squared(x) # '.' not needed (it'd be redundant)
3-element Vector{Int64}: 1 4 9

While both approaches yield the same output, defining a function that operates on a single element is the more advisable choice. This is due to several reasons. Firstly, number_squared(a) enables users to seamlessly perform computations on both scalar values and collections, simply by choosing between executing the function or its broadcasted version. Secondly, it decouples the specific operation from whether its application should be element-wise or not. Lastly, the notation number_squared.(x) explicitly conveys that the operation is element-wise, an aspect that would remain hidden in vector_squared(x).
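
The decoupling mentioned above can be illustrated with a sketch that goes slightly beyond vectors and assumes a 2x2 matrix A, which is our own example. The same function number_squared then gives a scalar square, an element-wise square, or a matrix product, depending only on how it's called.

```julia
number_squared(a) = a ^ 2       # function for a single element 'a'

A = [1 2; 3 4]                  # hypothetical 2x2 matrix, used only for illustration

on_scalar    = number_squared(3)     # plain call on a number
element_wise = number_squared.(A)    # broadcast: squares each entry of 'A'
matrix_power = number_squared(A)     # plain call on a matrix: computes A * A

println(on_scalar)
println(element_wise)
println(matrix_power)
```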

Broadcasting Over One Argument Only

When we broadcast a function or operator over vectors x and y, both are simultaneously iterated. However, we could be interested in operations where we want to iterate only over x, while keeping y fixed. A specific scenario where this could arise is with the function in(a,list). This function assesses whether the scalar a equals some element in the vector list. Thus, executing in(2, [1,2,3]) returns true, because 2 is an element of the vector [1,2,3].

Suppose instead that, given a vector x, we wish to verify whether each of its numbers equals a number in list = [1,2,3]. Below, we show that this can't be implemented directly by broadcasting in.

x    = [1, 2]
list = [1, 2, 3]

Output in REPL
julia>
in.(x, list)
ERROR: DimensionMismatch: arrays could not be broadcast to a common size; got a dimension with lengths 2 and 3

x    = [1, 2, 4]
list = [1, 2, 3]

Output in REPL
julia>
in.(x, list)
3-element BitVector: 1 1 0

In the first example, in.(x, list) errors because x and list must either have the same size, or one of them must be a scalar. The second example provides an output, but it's not the one we're aiming for: it checks whether 1==1, 2==2, and 4==3, when our goal is to check whether 1 is in [1,2,3], 2 is in [1,2,3], and 4 is in [1,2,3].

Intuitively, we need a way to signal Julia that list should be treated as a single element, while iterating over x. This can be accomplished either by enclosing list in a vector/tuple or by using the function Ref. The syntax of each case is respectively [list], (list,), and Ref(list).

Both [list] and (list,) wrap the variable in a vector/tuple whose only element is list itself. [note] Recall that tuples with a single element must be written with a trailing comma, as in (list,). If instead we use the expression (list), this would be interpreted as list, and hence treated as a vector. Explaining what Ref does is beyond our scope at this point. What matters for practical purposes is that Ref(list) makes list be treated as a single element, and is as performant as using a tuple (there's some performance penalty in using [list]).

x    = [2, 4, 6]
list = [1, 2, 3]        # 'x[1]' equals the element 2 in 'list'

Output in REPL
julia>
in.(x, [list])
3-element BitVector: 1 0 0

x    = [2, 4, 6]
list = [1, 2, 3]        # 'x[1]' equals the element 2 in 'list'

Output in REPL
julia>
in.(x, (list,))
3-element BitVector: 1 0 0

x    = [2, 4, 6]
list = [1, 2, 3]        # 'x[1]' equals the element 2 in 'list'

Output in REPL
julia>
in.(x, Ref(list))
3-element BitVector: 1 0 0

The output is a BitVector, and hence equivalent to [true, false, false]. The result reflects that x[1] equals 2 and 2 belongs to list, whereas x[2] and x[3] don't equal any element in list.
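
To confirm that list is held fixed while only x is iterated, the sketch below compares the broadcasted versions with map and with a manual element-by-element check. The variable names are ours.

```julia
x    = [2, 4, 6]
list = [1, 2, 3]

with_ref   = in.(x, Ref(list))
with_tuple = in.(x, (list,))
with_map   = map(a -> in(a, list), x)     # 'list' is fixed inside the anonymous function
by_hand    = [in(x[1], list), in(x[2], list), in(x[3], list)]

println(with_ref == with_tuple == with_map == by_hand)
```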

Warning!
Although it's possible to wrap the list in a vector to be treated as a single element, you should use either a tuple or Ref. They are more performant.

Although we've considered functions, the same principle applies to operators. This can be seen through ∈, which is an operator analogous to the function in. [note] ∈ can also be used as a function, and its syntax is the same as in (i.e., ∈(x,list))

x    = [2, 4, 6]
list = [1, 2, 3]

Output in REPL
julia>
x .∈ (list,) # only 'x[1]' equals an element in 'list'
3-element BitVector: 1 0 0

x    = [2, 4, 6]
list = [1, 2, 3]

Output in REPL
julia>
x .∈ Ref(list) # only 'x[1]' equals an element in 'list'
3-element BitVector: 1 0 0

We've avoided the broadcasting option with a vector, with the goal of emphasizing that we should always broadcast using Ref or a tuple.

Currying and Fixing Arguments (OPTIONAL)

Currying is a technique that transforms the evaluation of a function with multiple arguments into the evaluation of a sequence of functions, each with a single argument. [note] The name comes from the mathematician Haskell Curry, not the spice. For instance, a curried version of f(x,y) can be written f(x)(y) and would provide an identical output.

Our interest in currying lies in its ability to simplify broadcasting: it enables the treatment of an argument as a single object, without the need to use Ref or encapsulate objects as vectors/tuples. The technique could seem confusing for new users. In particular, it requires a good understanding of functions as first-class objects, entailing that functions can be treated as variables themselves. My primary goal is that you can at least recognize the syntax of currying, and thus be able to read code that applies the technique.

We start by illustrating how currying can be applied in general.

addition(x,y) = 2 * x + y

Output in REPL
julia>
addition(2,1)
5

addition(x,y) = 2 * x + y

# the following two definitions are equivalent (define only one of them, since both use the same name)
curried(x) = (y -> addition(x,y))
# curried  = x -> (y -> addition(x,y))

Output in REPL
julia>
curried(2)(1)
5

addition(x,y) = 2 * x + y
curried(x)    = (y -> addition(x,y))

#the following are equivalent
f    = curried(2)     # function of 'y', with 'x' fixed to 2
g(y) = addition(2,y)

Output in REPL
julia>
f(1)
5

julia>
g(1)
5

The key to understanding the syntax is that curried(x) is a function itself, with y as its argument. The last example illustrates this clearly through the equivalence between f = curried(2) and g(y) = addition(2,y). These functions help us understand the logic behind currying, but are only useful for the specific case of x=2. Instead, curried(x) allows the user to call the function through curried(x)(y), and so can be used for any x.

As for broadcasting, recall that any function foo in Julia can be broadcast through foo.(y). We've also established that curried(x) is a function just like any other. Therefore, curried(x) plays the same role as foo, and so we can broadcast over y for a fixed x through curried(x).(y).

a = 2
b = [1,2,3]
addition(x,y) = 2 * x + y

curried(x) = (y -> addition(x,y))   # 'curried(x)' is a function, and 'y' its argument

Output in REPL
julia>
curried(a).(b)
3-element Vector{Int64}: 5 6 7

a = 2
b = [1,2,3]
addition(x,y) = 2 * x + y
curried(x)    = (y -> addition(x,y))

#the following are equivalent
f    = curried(a)             # 'f' is a function, and 'y' its argument
g(y) = addition(2,y)

Output in REPL
julia>
f.(b)
3-element Vector{Int64}: 5 6 7

julia>
g.(b)
3-element Vector{Int64}: 5 6 7

Let's now explore how the currying technique can help treat a vector as a single element in broadcasting. To illustrate this, consider the function in used previously. This function has a built-in curried version, which can be applied through in(list).(x) for vectors list and x. To better demonstrate its usage, the following examples compare an implementation with Ref, our own curried implementation, and the built-in curried in.

x    = [2, 4, 6]
list = [1, 2, 3]

Output in REPL
julia>
in.(x,Ref(list))
3-element BitVector: 1 0 0

x    = [2, 4, 6]
list = [1, 2, 3]

our_in(list_elements) = (x -> in(x,list_elements))   # 'our_in(list_elements)' is a function

Output in REPL
julia>
our_in(list).(x) # it broadcasts only over 'x'
3-element BitVector: 1 0 0

x    = [2, 4, 6]
list = [1, 2, 3]

Output in REPL
julia>
in(list).(x) # similar to 'our_in'
3-element BitVector: 1 0 0
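
The built-in curried in isn't an isolated case. As a sketch, assuming a reasonably recent Julia 1.x version, the single-argument form in(list) can be stored in a variable like any other function, an equivalent object can be built explicitly with Base.Fix2, and comparison operators such as > offer a similar single-argument form. The names is_in_list, fixed, and greater_than_3 are ours.

```julia
x    = [2, 4, 6]
list = [1, 2, 3]

is_in_list = in(list)              # function of one argument, with 'list' fixed
fixed      = Base.Fix2(in, list)   # explicitly fixes the second argument of 'in'

greater_than_3 = >(3)              # function of one argument 'a', computing 'a > 3'

println(is_in_list.(x))
println(fixed.(x))
println(greater_than_3.(x))
```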