8f. Type Stability with Tuples

Martin Alfaro

PhD in Economics

Introduction

A function is considered type stable when, given the types of its arguments, the compiler can accurately predict single concrete types for its expressions. This definition, while universal, takes on different forms when applied to specific objects. So far, we've exclusively concentrated on scalars and vectors, whose conditions for type stability are relatively straightforward.

In this section, we begin the analysis of type stability for other data structures by covering tuples. Guaranteeing type stability with tuples is more nuanced than with vectors, as characterizing their type demands more information. Exploring them will test our understanding of type stability, requiring a clear grasp of its definition and subtleties.

Warning! - Tuples Are Only Suitable For Small Collections

Remember that tuples should only be used for collections that comprise a few elements. Using them for large collections will result in significant performance degradation, or directly trigger fatal errors.

Comparing Tuples and Vectors

Tuples and vectors are the most ubiquitous forms of collections in Julia, with tuples playing a vital role for two reasons. Firstly, tuples are more performant, as they avoid the overhead of memory allocation. This aspect will be expanded on when we explore static vectors, which are essentially tuples that can be handled as vectors. The second reason is that tuples automatically encompass the case of named tuples, which are merely tuples having symbols as keys, instead of indices.

In comparison to vectors, tuples possess a more intricate type system. To appreciate this, let's compare the information needed for their type descriptions.

The information for vectors is relatively minor, as they represent collections of elements sharing a homogeneous type. Additionally, since the size of vectors can change after creation, their type doesn't need to specify the number of elements. For instance, the type Vector{Float64} conveys that all elements have type Float64, allowing for any number of elements.

For their part, tuples are fixed-size collections that can accommodate heterogeneous types. This makes a tuple's type more demanding to characterize, as its description requires both the number of elements and the type of each element. For instance, the variable tup = ("hello", 1) has type Tuple{String, Int64}, indicating that the first element has type String and the second one Int64. The type also implicitly fixes the number of elements at two, as there's no possibility of appending or removing elements.

The fact that the number of elements is part of the type becomes clear when tuples contain N elements of the same type T. For this case, Julia provides the convenient alias NTuple{N, T}, which is just syntactic sugar for Tuple{T, T,...,T} with T appearing N times. Don't interpret NTuple as an abbreviation for the type NamedTuple: the "N" in NTuple refers to the number N of elements.
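The difference in type information can be checked directly in the REPL. The following is a minimal sketch, with variable names chosen purely for illustration. It shows that vectors of different lengths share a type, while tuples of different lengths don't.

```julia
v_short = [1.0, 2.0]
v_long  = [1.0, 2.0, 3.0, 4.0]

# vectors of different lengths share the same type
typeof(v_short) === typeof(v_long)                        # true (both `Vector{Float64}`)

t_short = (1.0, 2.0)
t_long  = (1.0, 2.0, 3.0, 4.0)

# tuples of different lengths have different types
typeof(t_short) === typeof(t_long)                        # false

# `NTuple{N, T}` is an alias for a tuple type with `T` repeated `N` times
NTuple{3, Float64} === Tuple{Float64, Float64, Float64}   # true
typeof(t_short)    === NTuple{2, Float64}                 # true
```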

In the following, we show that the choice between tuples and vectors may have different implications for type stability.

Slices of Heterogeneous Tuples Can Still Be Type Stable

A tuple's type records the type of each element individually, so every element keeps its original type. In contrast, vectors necessarily hold elements of a uniform type, so Julia picks the narrowest type that encompasses all the elements' types.

This behavior means that vectors whose elements can't be converted to a common concrete type require an abstract type to characterize all elements (e.g., Any). This affects all their slices, which inherit the abstract element type, and hence operations on them result in type instability. Instead, operations on slices of tuples can be type stable, even when tuples contain elements of different types.

tup    = (1, 2, "hello")        # type is `Tuple{Int64, Int64, String}`

foo(x) = sum(x[1:2])

@code_warntype foo(tup)         # type stable (output is `Int64`)

vector = [1, 2, "hello"]        # type is `Vector{Any}`

foo(x) = sum(x[1:2])

@code_warntype foo(vector)      # type UNSTABLE
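To see why the two cases differ, it helps to inspect the types of the slices themselves. The following sketch reuses the values from the example above and assumes a 64-bit system, where integer literals are Int64.

```julia
tup    = (1, 2, "hello")
vector = [1, 2, "hello"]

# the tuple slice keeps the concrete type of each element
typeof(tup[1:2])        # Tuple{Int64, Int64}

# the vector slice inherits the abstract element type of the original vector
typeof(vector[1:2])     # Vector{Any}
```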

Notice that type promotion can sometimes solve this issue. Through this mechanism, Julia attempts to convert each element of the vector into a common concrete type, thus avoiding the need for abstract types like Any. This is what occurs below, where numbers of different types are combined.

tup    = (1, 2, 3.5)            # type is `Tuple{Int64, Int64, Float64}` 

foo(x) = sum(x)

@code_warntype foo(tup)         # type stable (output returned is `Float64`)

vector = [1, 2, 3.5]            # type is `Vector{Float64}` (type promotion)

foo(x) = sum(x)

@code_warntype foo(vector)      # type stable (output returned is `Float64`)
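Whether promotion is available for a given pair of types can be queried with the built-in function promote_type. The sketch below is only illustrative: it contrasts the numeric case, where a common concrete type exists, with the Int64 and String case, where Julia falls back to Any.

```julia
# a common concrete type exists for these numbers
promote_type(Int64, Float64)      # Float64
typeof([1, 2, 3.5])               # Vector{Float64}

# no common concrete type exists, so the fallback is `Any`
promote_type(Int64, String)       # Any
typeof([1, 2, "hello"])           # Vector{Any}

# tuples never promote: each element keeps its own type
typeof((1, 2, 3.5))               # Tuple{Int64, Int64, Float64}
```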

Tuples Contain More Information than Vectors

Given the differences in type information, conversions between tuples and vectors can pose several challenges for type stability. To see this, let's start with the simplest case, where a tuple is converted to a vector. The conclusions drawn from this case are straightforward, as they're essentially a corollary from the previous analysis: type stability holds when the tuple contains type-homogeneous elements or when the types are heterogeneous but can be promoted to a common type.

For the examples, recall that a type's name can also be called as a function (a constructor) that converts values into that type. Below, we use the function Vector for this purpose.

tup = (1, 2, 3)               # `Tuple{Int64, Int64, Int64}` or just `NTuple{3, Int64}`


function foo(tup)
    x = Vector(tup)           # 'x' has type `Vector{Int64}`
    sum(x)
end

@code_warntype foo(tup)       # type stable

tup = (1, 2, 3.5)             # `Tuple{Int64, Int64, Float64}`


function foo(tup)
    x = Vector(tup)           # 'x' has type `Vector{Float64}`
    sum(x)
end

@code_warntype foo(tup)       # type stable

tup = (1, 2, "hello")         # `Tuple{Int64, Int64, String}`


function foo(tup)
    x = Vector(tup)           # 'x' has type `Vector{Any}`
    sum(x)
end

@code_warntype foo(tup)       # type UNSTABLE

For its part, creating a tuple from a vector inside a function will inevitably cause type instability, regardless of the vector's characteristics. The reason is that a vector's type carries no information about the number of elements it contains. Consequently, the compiler must treat the resulting tuple as having a variable number of elements, with each possible number of elements corresponding to a different concrete type.

x   = [1, 2, "hello"]           # 'Vector{Any}' has no info on each individual type


function foo(x)
    tup = Tuple(x)              # 'tup' has type `Tuple`

    sum(tup[1:2])
end

@code_warntype foo(x)           # type UNSTABLE

x   = [1, 2, 3]                 # 'Vector{Int64}' has no info on the number of elements


function foo(x)
    tup = Tuple(x)              # 'tup' has type `Tuple{Vararg{Int64}}` (`Vararg` means "variable arguments")

    sum(tup[1:2])
end

@code_warntype foo(x)           # type UNSTABLE
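The same conclusion can be reached without reading the output of @code_warntype. The sketch below relies on Base.return_types, an introspection helper that isn't part of Julia's exported API, so treat it as a diagnostic aid whose exact output may vary across Julia versions. The function name to_tuple is hypothetical.

```julia
to_tuple(x) = Tuple(x)      # hypothetical helper, converts a vector into a tuple

# type the compiler infers when only the argument's type is known
inferred = only(Base.return_types(to_tuple, (Vector{Int64},)))

# the inferred type can't be concrete, as the number of elements is unknown
isconcretetype(inferred)                  # false

# at runtime, in contrast, the resulting tuple always has a concrete type
isconcretetype(typeof(to_tuple([1, 2, 3])))   # true
```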

Addressing Variable Arguments: Dispatch By Value

A key takeaway from the previous example is that defining tuples from vectors inside a function invariably introduces type instability. A simple remedy is to convert the vector to a tuple outside the function, and then pass the tuple as a function argument. This is demonstrated in the code snippet below.

x   = [1, 2, 3]
tup = Tuple(x)

foo(tup) = sum(tup[1:2])

@code_warntype foo(tup)         # type stable

This approach should be your first option when transforming vectors to tuples. Nonetheless, there may be scenarios where defining the tuple inside the function is unavoidable. In such cases, there are a few alternatives that can be implemented.

Note first that simply passing the vector's number of elements as an argument doesn't solve the issue. The reason is that the compiler generates method instances based on information about types, not values. Thus, a function argument like length(x) merely informs the compiler that the number of elements has type Int64, without providing any additional insight.
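The point can be illustrated with a minimal sketch, where the vectors x and y are arbitrary examples. Even though they differ in length, the arguments passed to a function would have identical types in both cases, so the compiler can't tell the two calls apart.

```julia
x = [1, 2, 3]
y = [1, 2, 3, 4, 5]

# the lengths differ in value, but not in type
typeof(length(x))                                     # Int64
typeof(length(x)) === typeof(length(y))               # true

# hence, the types of the arguments `(x, length(x))` and `(y, length(y))` coincide
typeof((x, length(x))) === typeof((y, length(y)))     # true
```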

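Instead, one effective solution is to set the tuple's length with a literal value. Below, the first version computes the length via length(x) and remains type unstable, while the second one hard-codes the length as 3 and is type stable.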

x   = [1, 2, 3]

function foo(x)
    tup = NTuple{length(x), eltype(x)}(x)

    sum(tup)
end

@code_warntype foo(x)        # type UNSTABLE

x   = [1, 2, 3]

function foo(x)
    tup = NTuple{3, eltype(x)}(x)

    sum(tup)
end

@code_warntype foo(x)         # type stable

The downside of this solution is that it defeats the purpose of having a generic function, as it restricts the function to vectors of a single predetermined size. To eliminate the type instability without constraining functionality, we need to introduce a more advanced solution. This is based on a technique known as dispatch by value. Since this approach is more complex to implement, I recommend using it only when passing the tuple as a function argument is infeasible.

Next, we lay out the principles of dispatch by value, and then apply the technique to the specific case of tuples.

Defining Dispatch By Value

Dispatch by value enables passing information about values to the compiler. Implementing this feature, nonetheless, requires a workaround, since the compiler only gathers information about types. The hack consists of creating a type that stores values as type parameters. In the case of tuples, this type parameter is simply the vector's number of elements.

The functionality is implemented via the built-in type Val, whose syntax is best explained with an example. Suppose you have a function foo and a value a that you wish the compiler to know. The technique requires defining foo with an unnamed argument that is type-annotated as ::Val{a}. After this, you must call foo passing the argument Val(a), which creates an instance of the type Val{a}.
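Before applying the technique, the following sketch isolates the syntax of Val. The function get_value is hypothetical and serves no purpose other than showing that the value stored as a type parameter can be recovered inside the function.

```julia
# `Val(a)` creates an instance whose type has `a` as a parameter
typeof(Val(3))                          # Val{3}

# different values give rise to different types
typeof(Val(3)) === typeof(Val(4))       # false

# hypothetical function with an unnamed argument annotated as `::Val{a}`
get_value(::Val{a}) where {a} = a

get_value(Val(3))                       # 3
get_value(Val(:left))                   # :left
```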

To illustrate the syntax, we revisit an example from previous sections. This considers a variable y that could be an Int64 or Float64, contingent upon a condition. The ambiguity of y's type is then transmitted to any subsequent operation, leading to type instability.

Dispatch by value is implemented by defining the condition as a type parameter of Val. In this way, the compiler will receive information about whether condition is true or false, and therefore know y's type. This makes it possible to specialize its operations.

function foo(condition)
    y = condition ? 1 : 0.5      # either `Int64` or `Float64`
    
    [y * i for i in 1:100]
end

@code_warntype foo(true)         # type UNSTABLE
@code_warntype foo(false)        # type UNSTABLE

function foo(::Val{condition}) where condition
    y = condition ? 1 : 0.5      # either `Int64` or `Float64`
    
    [y * i for i in 1:100]
end

@code_warntype foo(Val(true))    # type stable
@code_warntype foo(Val(false))   # type stable

Warning!

The function argument Val must be defined with {}, but called with (). This is because Val{a} denotes a type, and types take their parameters inside {}. In contrast, Val(a) is a function call that creates an instance of that type.
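The reason why the technique works is that Val(true) and Val(false) have different types, so each call is compiled separately. The sketch below makes this visible by inspecting the types of the outputs. The name scaled is hypothetical, with its body replicating the example above, and the types shown assume a 64-bit system.

```julia
# hypothetical name, same body as the function with `Val` above
function scaled(::Val{condition}) where {condition}
    y = condition ? 1 : 0.5

    [y * i for i in 1:100]
end

# each value of `condition` corresponds to a different argument type
typeof(Val(true)) === typeof(Val(false))    # false

# and each argument type yields its own output type
typeof(scaled(Val(true)))                   # Vector{Int64}
typeof(scaled(Val(false)))                  # Vector{Float64}
```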

Dispatching by Value with Tuples

Let's now revisit the conversion of vectors to tuples. As we previously discussed, type instability arises because vectors don't store the size as part of their type information, leaving the compiler without sufficient information to determine the tuple's type.

Dispatch by value provides a solution to this issue: by passing the vector's length as a type parameter, the function call becomes type stable.

x = [1, 2, 3]

function foo(x, N)
    tuple_x = NTuple{N, eltype(x)}(x)   

    2 .+ tuple_x    
end

@code_warntype foo(x, length(x))        # type UNSTABLE

x = [1, 2, 3]

function foo(x, ::Val{N}) where N
    tuple_x = NTuple{N, eltype(x)}(x)   

    2 .+ tuple_x    
end

@code_warntype foo(x, Val(length(x)))   # type stable
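Unlike the approach with a hard-coded length, the function remains generic. The following sketch applies the same function to vectors of different lengths. The name shift_as_tuple is hypothetical, with its body replicating the example above.

```julia
# hypothetical name, same body as the function with `Val` above
function shift_as_tuple(x, ::Val{N}) where {N}
    tuple_x = NTuple{N, eltype(x)}(x)

    2 .+ tuple_x
end

x = [1, 2, 3]
y = [10, 20]

# the same function handles both lengths, each through a different type `Val{N}`
shift_as_tuple(x, Val(length(x)))       # (3, 4, 5)
shift_as_tuple(y, Val(length(y)))       # (12, 22)
```

Keep in mind that length(x) is still a runtime value. This means that Val(length(x)) should be created outside the function doing the work, exactly as we did when passing the tuple itself as an argument.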
