All code snippets in this post are collected in the file allCode.jl
. They've been tested under Julia 1.10.4. Throughout, generic placeholders are denoted as <function>
or <operator>
. This is just notation, and the symbols <
and >
should not be misconstrued as Julia's syntax.
Unit | Acronym | Measure in Seconds |
---|---|---|
Seconds | s | 1 |
Milliseconds | ms | 10⁻³ |
Microseconds | μs | 10⁻⁶ |
Nanoseconds | ns | 10⁻⁹ |
Reductions are a powerful technique for operations that take collections as inputs and return a single element. They arise naturally when we need to compute summary statistics such as the average, variance, or maximum of a collection.
The reduction process works by iteratively applying an operation to pairs of elements, accumulating the results at each step until the final output is obtained. For example, to compute sum(x)
for a vector x
, a reduction would start by adding the first two elements, then add the third element to that running total, and continue this cumulative process until all elements have been summed.
The method is particularly convenient when we need to transform a vector's elements prior to computing a summary statistic. By operating on scalars, reductions sidestep the need to materialize intermediate outputs, thereby reducing memory allocations. For example, to compute the sum of the log-transformed values in x
, a reduction would avoid the creation of the intermediate vector log.(x)
for computing sum(log.(x))
.
A reduction is a process that applies an operator or function to pairs of elements in a collection, ultimately yielding a single output value. Its implementation relies on a for-loop, which updates the result in each iteration.
A classic example of a reduction is the addition of all elements in a vector. This applies +
to pairs of elements, iteratively updating the accumulated sum. The methodology is demonstrated below.
x = rand(100)
foo(x) = sum(x)
foo(x)
48.447
x = rand(100)
function foo(x)
    output = 0.
    for i in eachindex(x)
        output = output + x[i]
    end
    return output
end
foo(x)
48.447
x = rand(100)
function foo(x)
    output = 0.
    for i in eachindex(x)
        output += x[i]
    end
    return output
end
foo(x)
48.447
The last tab makes use of an updating operator, which offers the concise syntax <operator>=
. When this operator is applied, the variable on the left-hand side is employed as the first operand. For instance, x += a
is equivalent to x = x + a
. Other updating operators include -=
and *=
.
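As a quick sanity check, the following snippet (with illustrative values) confirms the equivalence between both forms.

```julia
a = 10
b = 10

a = a + 3    # explicit form
b += 3       # updating operator: shorthand for b = b + 3

a == b       # true, both equal 13
```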
Reductions rely on either a binary operator or a two-argument function. An example of the former is +
, which we used above. As for functions, max(a,b)
, which returns the maximum of the scalars a
and b
, can be used for computing the maximum of a collection x
.
A reduction operation can only be applied if the binary operation satisfies two mathematical requirements:
Associativity: the way in which operations are grouped does not change the result. For example, addition is associative because (a + b) + c = a + (b + c)
.
Existence of an identity element: there exists an element that leaves any other element unchanged when the binary operation is applied. For example, the identity element of addition is 0 because a + 0 = a
.
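As a minimal sketch with integer values, both requirements can be checked directly (note that floating-point addition is only approximately associative due to rounding).

```julia
a, b, c = 2, 3, 4

# associativity: the grouping of operations doesn't change the result
(a + b) + c == a + (b + c)    # true

# identity element: adding 0 leaves any element unchanged
a + 0 == a                    # true
```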
For the operations commonly implemented as reductions, the identity elements are the following.
Operation | Identity Element |
---|---|
Sum | 0 |
Product | 1 |
Maximum | -Inf |
Minimum | Inf |
The identity element constitutes the initial value that we provide to start the iteration. Considering these differences in the identity elements, below we indicate how to implement the reduction for each operation. In each case, the function foo1
indicates the desired outcome, while foo2
provides the same outcome via a reduction.
x = rand(100)
foo1(x) = sum(x)
function foo2(x)
    output = 0.
    for i in eachindex(x)
        output += x[i]
    end
    return output
end
x = rand(100)
foo1(x) = prod(x)
function foo2(x)
    output = 1.
    for i in eachindex(x)
        output *= x[i]
    end
    return output
end
x = rand(100)
foo1(x) = maximum(x)
function foo2(x)
    output = -Inf
    for i in eachindex(x)
        output = max(output, x[i])
    end
    return output
end
x = rand(100)
foo1(x) = minimum(x)
function foo2(x)
    output = Inf
    for i in eachindex(x)
        output = min(output, x[i])
    end
    return output
end
To illustrate the application through a concrete example, consider the operation sum(log.(x))
for a vector x
. This operation transforms x
via log.(x)
and then sums the transformed elements. By default, broadcasting materializes the resulting vector, therefore allocating memory for log.(x)
. However, we're not interested in this result itself: log.(x)
is a temporary vector that will be eventually "reduced" to a scalar. [note] In the section Lazy Operations, we'll explore an alternative, where broadcast doesn't materialize intermediate results. Taking this into account, we should implement an approach that bypasses this unnecessary allocation. Reductions do so by defining a scalar output
, which we iteratively update by summing the transformed values of x
.
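For instance, a reduction for sum(log.(x)) could be sketched as follows, where foo1 materializes the temporary vector and foo2 avoids it.

```julia
x = rand(100)

foo1(x) = sum(log.(x))    # allocates the intermediate vector log.(x)

function foo2(x)          # reduction: accumulates log(x[i]) directly
    output = 0.
    for i in eachindex(x)
        output += log(x[i])
    end
    return output
end

foo1(x) ≈ foo2(x)         # true: same result, without the intermediate vector
```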
Below, we demonstrate how to implement the technique for common reductions.
x = rand(100)
foo1(x) = sum(3 .* x)
function foo2(x)
    output = 0.
    for i in eachindex(x)
        output += 3 * x[i]
    end
    return output
end
@btime foo1($x)
48.528 ns (1 allocation: 896 bytes)
@btime foo2($x)
24.900 ns (0 allocations: 0 bytes)
x = rand(100)
foo1(x) = prod(3 .* x)
function foo2(x)
    output = 1.
    for i in eachindex(x)
        output *= 3 * x[i]
    end
    return output
end
@btime foo1($x)
47.658 ns (1 allocation: 896 bytes)
@btime foo2($x)
43.592 ns (0 allocations: 0 bytes)
x = rand(100)
foo1(x) = maximum(3 .* x)
function foo2(x)
    output = -Inf
    for i in eachindex(x)
        output = max(output, 3 * x[i])
    end
    return output
end
@btime foo1($x)
196.296 ns (1 allocation: 896 bytes)
@btime foo2($x)
171.915 ns (0 allocations: 0 bytes)
x = rand(100)
foo1(x) = minimum(3 .* x)
function foo2(x)
    output = Inf
    for i in eachindex(x)
        output = min(output, 3 * x[i])
    end
    return output
end
@btime foo1($x)
194.364 ns (1 allocation: 896 bytes)
@btime foo2($x)
165.600 ns (0 allocations: 0 bytes)
The previous examples show that reductions require the explicit writing of a for-loop. Unfortunately, this can prove unwieldy to write and read. In light of this, Julia streamlines the process by adding methods to sum
, prod
, maximum
, and minimum
to implement reductions for you.
The syntax is foo(<transforming function>, x)
, where foo
is one of the functions mentioned and x
is the vector to be transformed. For instance, the following examples consider a transformation 3 .* x
. [note] Julia provides the reduce
and mapreduce
functions for implementing additional types of reductions. mapreduce
is the more general of the two, as it additionally allows for transforming the vector by applying the function map
. However, the built-in implementations of sum
, prod
, maximum
, and minimum
are more efficient, as they're customized to those particular operations.
x = rand(100)
foo(x) = sum(a -> 3 * a, x)
@btime foo($x)
7.000 ns (0 allocations: 0 bytes)
x = rand(100)
foo(x) = prod(a -> 3 * a, x)
@btime foo($x)
7.207 ns (0 allocations: 0 bytes)
x = rand(100)
foo(x) = maximum(a -> 3 * a, x)
@btime foo($x)
174.550 ns (0 allocations: 0 bytes)
x = rand(100)
foo(x) = minimum(a -> 3 * a, x)
@btime foo($x)
169.035 ns (0 allocations: 0 bytes)
Reductions can also be performed when the transforming function requires multiple arguments. To do this, you should enclose the variables using zip
and refer to each variable through indices. We illustrate this below, where the transforming function is x .* y
.
x = rand(100); y = rand(100)
foo(x,y) = sum(a -> a[1] * a[2], zip(x,y))
@btime foo($x, $y)
29.276 ns (0 allocations: 0 bytes)
x = rand(100); y = rand(100)
foo(x,y) = prod(a -> a[1] * a[2], zip(x,y))
@btime foo($x, $y)
48.227 ns (0 allocations: 0 bytes)
x = rand(100); y = rand(100)
foo(x,y) = maximum(a -> a[1] * a[2], zip(x,y))
@btime foo($x, $y)
171.721 ns (0 allocations: 0 bytes)
x = rand(100); y = rand(100)
foo(x,y) = minimum(a -> a[1] * a[2], zip(x,y))
@btime foo($x, $y)
166.937 ns (0 allocations: 0 bytes)
Julia provides the functions reduce
and mapreduce
, which offer a generic interface for reduction operations. The difference between them lies in that reduce
applies the reduction directly, while mapreduce
transforms the collection's elements prior to doing so.
It's worth remarking that you should still rely on the dedicated functions sum
, prod
, maximum
, and minimum
for such operations, as they've been optimized for their respective tasks. In this context, our primary use case of reduce
and mapreduce
is when working with packages that provide their own versions. These packages could implement reductions by applying a different approach or algorithm. For instance, the package Folds
provides a parallelized version of both map
and mapreduce
, enabling the utilization of all available CPU cores. Its syntax is identical to Julia's built-in functions.
Let's first consider reduce
, which performs a reduction without transforming the collection. The syntax for the reduction is reduce(<function>,x)
, where <function>
is a two-argument function. The following example demonstrates its usage.
x = rand(100)
foo(x) = reduce(+, x)
@btime foo($x)
6.600 ns (0 allocations: 0 bytes)
x = rand(100)
foo(x) = reduce(*, x)
@btime foo($x)
6.700 ns (0 allocations: 0 bytes)
x = rand(100)
foo(x) = reduce(max, x)
@btime foo($x)
167.733 ns (0 allocations: 0 bytes)
x = rand(100)
foo(x) = reduce(min, x)
@btime foo($x)
167.978 ns (0 allocations: 0 bytes)
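Note that reduce also accepts an init keyword argument to supply the identity element explicitly. For operators like +, Julia already knows the identity element, but for an arbitrary two-argument function, init is required when the collection may be empty. A minimal sketch:

```julia
x = rand(100)

reduce(+, x; init = 0.)                         # same result as reduce(+, x)

# for an arbitrary function, Julia can't infer the identity element,
# so reducing an empty collection requires init
reduce((a, b) -> a + b, Float64[]; init = 0.)   # returns 0.0
```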
Julia additionally provides the mapreduce
function, which combines the functionality of map
and reduce
. Recall that map(foo,x)
transforms each element of the collection x
by applying foo
element-wise. Consequently, mapreduce(<transformation>,<reduction>,x)
first transforms x
's elements through map
, and then applies a reduction to the resulting output.
Note that reduce
can be considered as a special case of mapreduce
, where the latter transforms x
through the identity function, identity(x)
. Likewise, mapreduce(<transformation>,<operator>,x)
produces the same result as reduce(<operator>,map(<transformation>,x))
. However, mapreduce
is more efficient, since it avoids the allocation of the intermediate transformed vector. This is demonstrated below, where we consider a transformation 3 .* x
eventually reduced through sum
. In other words, the output is equivalent to sum(3 .* x)
.
x = rand(100)
foo(x) = mapreduce(a -> 3 * a, +, x)
@btime foo($x)
6.700 ns (0 allocations: 0 bytes)
x = rand(100)
foo(x) = reduce(+, map(a -> 3 * a, x))
@btime foo($x)
54.684 ns (1 allocation: 896 bytes)