<function>
or <operator>
).
This is just notation, and the symbols <
and >
should not be misconstrued as Julia's syntax.
Unit | Acronym | Measure in Seconds |
---|---|---|
Seconds | s | 1 |
Milliseconds | ms | 10⁻³ |
Microseconds | μs | 10⁻⁶ |
Nanoseconds | ns | 10⁻⁹ |
High performance in Julia depends critically on the notion of type stability. Its definition is relatively straightforward: a function is type-stable when the types of its expressions can be inferred from the types of its arguments. When this property holds, Julia can specialize the computation method, resulting in significant performance gains.
Despite its simplicity, type stability is subject to various nuances, and a careful definition requires a solid foundation in two key areas: Julia's type system and the inner workings of functions. The current section equips you with the necessary knowledge to grasp the former, deferring the internals of functions to the next section. Moreover, we focus on scalars and vectors, leaving more complex objects for subsequent sections.
Before you continue, I recommend reviewing the basics of types introduced here.
Variables in Julia are mere tags for objects. In turn, objects hold values described by types. The most common types for scalars are Float64
and Int64
, with the vector counterparts being Vector{Float64}
and Vector{Int64}
. Recall that Vector
is an alias for a one-dimensional array, so that a type like Vector{Float64}
is equivalent to Array{Float64,1}
.
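As a quick check of the statements above, the following sketch inspects a few values with `typeof`. The variable names are illustrative only.

```julia
# Variables are labels; `typeof` reports the type of the object they refer to.
a = 1.5
v = [1.0, 2.0, 3.0]

println(typeof(a))    # Float64
println(typeof(v))    # Vector{Float64}

# `Vector{T}` is an alias for a one-dimensional `Array{T, 1}`
same_type = Vector{Float64} === Array{Float64, 1}
println(same_type)    # true
```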
Int
As an Alternative to Int64
Int
as the default type for integers. The type Int
is an alias that adapts its value to your CPU's architecture. Most modern computers are 64-bit systems, making Int
equivalent to Int64
. On 32-bit systems, Int
becomes Int32
. Julia's type system is organized in a hierarchical way. This allows for the definition of subsets and supersets of types, which are called subtypes and supertypes in the context of types. [note] Types don't necessarily have a subtype-supertype hierarchy. For example, Float64
and Vector{String}
exist independently, without a hierarchical relationship. This fact will become clearer when the concepts of abstract and concrete types are defined. For instance, the type Any
is a supertype that includes all possible types in Julia, occupying the highest position in any type hierarchy. Another example of supertype is Number
, which encompasses all numeric types (Float64
, Float32
, Int64
, etc.).
Supertypes provide great flexibility for writing code. They enable the grouping of values to define operations in common. For instance, defining +
for the abstract type Number
ensures its applicability to all number types, regardless of whether they are integers, floats, or their numerical precision.
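A minimal sketch of this idea follows. The function `describe` is a hypothetical helper introduced only for illustration: a single definition annotated with `Number` applies to every numeric type, while non-numeric values are rejected.

```julia
# `describe` is a hypothetical helper, not part of Julia
describe(x::Number) = "a number of type $(typeof(x))"

println(describe(Int64(1)))   # a number of type Int64
println(describe(2.5f0))      # a number of type Float32
println(describe(3.0))        # a number of type Float64

# strings are not numbers, so no method of `describe` applies to them
println(applicable(describe, "text"))   # false
```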
A special supertype known as Union
will be instrumental for our examples. It represents variables that can potentially hold values with different types, and its syntax is Union{<type1>, <type2>, ...}
. For example, a variable with type Union{Int64, Float64}
could be either an Int64
or Float64
. Note that, by definition, unions are always supertypes of their arguments.
Missing
. For instance, if we load a column that contains both integers and empty entries, the resulting data will be stored with type Vector{Union{Missing,Int64}}
.
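The behavior of unions described above can be explored with a short sketch. The names `U` and `col` are illustrative, and the last check uses `Int` so that it also holds on 32-bit systems.

```julia
U = Union{Int64, Float64}

println(Int64(1) isa U)   # true
println(1.5 isa U)        # true
println("text" isa U)     # false
println(Int64 <: U)       # true: a union is a supertype of each of its arguments

# a vector mixing integers and `missing` gets a union as its element type
col = [1, missing, 3]
println(eltype(col) == Union{Missing, Int})   # true
```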
The hierarchical nature of types makes it possible to represent subtypes and supertypes as trees. This gives rise to the notions of abstract and concrete types.
An abstract type acts as a parent category, which necessarily breaks down into subtypes. The type Any
in Julia is a prime example. In contrast, a concrete type represents an irreducible unit lacking subtypes, entailing that it's final.
The diagram below illustrates the difference between abstract and concrete types for scalars. The example is based on the hierarchy of the type Number
. It's worth noting that the labels included match the corresponding type name in Julia. [note] The Signed
subtype of Integer
allows for the representation of negative and positive integers. Julia also offers the type Unsigned
, which only accepts non-negative integers and comprises subtypes such as UInt64
and UInt32
.
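The hierarchy can also be explored programmatically. The sketch below relies on the built-in functions `isabstracttype`, `isconcretetype`, and `supertype`; the helper `ancestry` is a hypothetical name introduced here to walk from a type up to `Any`.

```julia
println(isabstracttype(Number))   # true
println(isconcretetype(Number))   # false
println(isconcretetype(Int64))    # true

# `ancestry` is a hypothetical helper: it collects a type and all its supertypes
function ancestry(T::Type)
    chain = Type[T]
    while T !== Any
        T = supertype(T)
        push!(chain, T)
    end
    return chain
end

# visits Int64, Signed, Integer, Real, Number, Any (in that order)
println(ancestry(Int64))
```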
For scalars, the distinction between abstract and concrete types is relatively straightforward. For vectors, in contrast, the difference is more subtle, as shown in the diagram below.
The tree reveals that Vector{T}
for a given type T
is a concrete type. By definition, this implies that variables can be instances of some Vector{T}
. Moreover, Vector{T}
can't have subtypes. Consequently, vectors like Vector{Int64}
aren't a subtype of Vector{Any}
, despite Int64
being a subtype of Any
. This feature stands in stark contrast to scalars, where Any
is an abstract type. However, it also perfectly aligns with the understanding of vectors as collections of homogeneous elements, in the sense of sharing the same type.
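The contrast between scalars and vectors can be verified with a few checks. This is only a sketch based on the built-in functions `isconcretetype`, `isabstracttype`, and `eltype`.

```julia
# both vector types are concrete, even though `Any` itself is abstract
println(isconcretetype(Vector{Int64}))   # true
println(isconcretetype(Vector{Any}))     # true
println(isabstracttype(Any))             # true

# the subtype relation between element types does not carry over to vectors
println(Int64 <: Any)                    # true
println(Vector{Int64} <: Vector{Any})    # false

# every element of a `Vector{T}` is declared with the same type `T`
println(eltype([1, 2, 3]) === Int)       # true
println(eltype(Any[1, 2.5]) === Any)     # true
```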
In Julia, instantiation refers to the process of creating an object with a certain type. Importantly, only objects with concrete types can be instantiated, which entails that there can't be values with abstract types. This fundamental principle has implications for certain colloquial expressions we commonly use. For example, when we say that a variable has type Any
, it actually means that the variable can assume any concrete type, as long as it's a subtype of Any
.
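To make this concrete, the sketch below stores values in a vector whose element type is `Any` and then asks for the type of each value. The name `items` is illustrative.

```julia
# the container is declared with the abstract type `Any`...
items = Any[1, 2.5, "text", true]

# ...but every value stored in it has a concrete type
for item in items
    println(repr(item), " => ", typeof(item))
end

all_concrete = all(isconcretetype(typeof(item)) for item in items)
println(all_concrete)           # true
println(isconcretetype(Any))    # false
```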
This distinction will become crucial in what follows, particularly for type-annotating variables. It implies that declaring a variable with an abstract type amounts to restricting the set of possible concrete types, with the variable ultimately adopting one of these concrete types.
At this point, you may be wondering how all this relates to type stability. The answer is given by how Julia performs computations.
Given an operation, high performance requires specialization of the computation method. We'll see that this is unfeasible in the global scope, as Julia treats global variables as embodying any type. In contrast, when we wrap code in a function, Julia begins by identifying concrete types for each argument. With this information, it attempts to identify concrete types for all expressions within the function. When concrete types can indeed be identified, we say that the function is type stable, and Julia is able to specialize its method. Otherwise, if expressions could adopt various concrete types, performance is substantially degraded. This is because Julia is forced to consider multiple implementations, one for each possible type.
For scalars and vectors, this essentially means that expressions must ultimately operate over primitive types. Examples of numeric primitive types are integers and floating-point numbers, such as Int64
, Float64
, and Bool
. Thus, operations like sum
over Vector{Int64}
or Vector{Float64}
allow for specialization, while operating over Vector{Any}
precludes it.
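One way to see whether Julia can pin down a concrete type is to ask the compiler what it infers. The sketch below is an assumption-laden illustration: `stable` and `unstable` are hypothetical functions, and `Base.return_types` is an internal reflection helper rather than a documented public API, so its output may vary across Julia versions.

```julia
# hypothetical functions used only for illustration
stable(x) = x > 0 ? 1 : 0        # both branches yield an integer
unstable(x) = x > 0 ? 1 : 1.0    # branches yield an integer or a float

# ask the compiler which return type it infers for an `Int64` argument
inferred_stable = only(Base.return_types(stable, (Int64,)))
inferred_unstable = only(Base.return_types(unstable, (Int64,)))

println(isconcretetype(inferred_stable))     # true: a single concrete type
println(isconcretetype(inferred_unstable))   # false: a union of two types
```

When the inferred type is concrete, Julia can specialize the computation. When it's a union or an abstract type, the generated code has to account for several possibilities.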
String
doesn't pose a problem for type stability. This is because String
is internally represented as a collection of characters, which are represented by the primitive type Char
.
The remainder of the section is devoted to operators and functions that handle types. Specifically, we'll present the operator <:
to identify supertypes, along with several approaches to declaring a variable's type.
It's quite possible that you won't use any of the techniques presented. The reason is that, as we'll see, Julia identifies the types of variables passed to a function. Nevertheless, the operators introduced will be crucial to understand upcoming sections.
<:
The symbol <:
assesses whether a type T
is a subtype of S
. This can be employed as an operator T <: S
or as a function <:(T,S)
. For example, Int64 <: Number
or <:(Int64, Number)
verifies whether Int64
is a subtype of Number
, which would return true
.
# all the statements below are `true`
Float64 <: Any
Int64 <: Number
Int64 <: Int64
# all the statements below are `false`
Float64 <: Vector{Any}
Int64 <: Vector{Number}
Int64 <: Vector{Int64}
The fact that Int64 <: Int64
evaluates to true illustrates a fundamental principle: every type is a subtype of itself. Moreover, in the case of concrete types, this is the only subtype.
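Since `<:` returns a Boolean, it can be used like any other predicate. The following sketch shows the function form and uses the operator to filter a list of types. The names `candidates` and `integer_types` are illustrative.

```julia
# operator form and function form are interchangeable
println(Int64 <: Number)      # true
println(<:(Int64, Number))    # true

# keep only the types that are subtypes of `Integer`
candidates = [Int64, Float64, UInt8, String]
integer_types = filter(T -> T <: Integer, candidates)
println(integer_types)        # contains Int64 and UInt8
```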
where
By combining <:
with the type Union
, you can also check if a type belongs to a set of types. For example, Int64 <: Union{Int64, Float64}
evaluates whether Int64
is a subtype of either Int64
or Float64
, thus returning true
.
The approach can be made more widely applicable by using the keyword where
, along with a type parameter T
that can take multiple values. The syntax is <type depending on T> where T <: <set of types>
, where T
can be represented by any other character.
# all the statements below are `true`
Float64 <: Any
Int64 <: Union{Int64, Float64}
Int64 <: Union{T, String} where T <: Number # `String` represents text
# all the statements below are `true`
Vector{Float64} <: Vector{T} where T <: Any
Vector{Int64} <: Vector{T} where T <: Union{Int64, Float64}
Vector{Number} <: Vector{T} where T <: Any
# all the statements below are `false`
Vector{Float64} <: Vector{Any}
Vector{Int64} <: Vector{Union{Int64, Float64}}
Vector{Number} <: Vector{Any}
# all the statements below are `true`
Vector{Float64} <: Vector{<:Any}
Vector{Int64} <: Vector{<:Union{Int64, Float64}}
Vector{Number} <: Vector{<:Any}
# all the statements below are `false`
Vector{Float64} <: Vector{Any}
Vector{Int64} <: Vector{Union{Int64, Float64}}
Vector{Number} <: Vector{Any}
Types constructed through parameters like T
are known as parametric types. In the example above, they allow us to distinguish between a concrete type like Vector{Any}
and a set of concrete types Vector{T} where T <: Any
, where the latter encompasses Vector{Int64}
, Vector{Float64}
, Vector{String}
, etc.
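The difference between the concrete type and the set of concrete types becomes tangible when testing actual values. In the sketch below, `accepts_numbers` is a hypothetical helper, and the parentheses around the `where` expression are added only to make the grouping explicit.

```julia
# `accepts_numbers` is a hypothetical helper: true for any vector whose
# element type is a subtype of `Number`
accepts_numbers(v) = v isa (Vector{T} where T <: Number)

ints = [1, 2, 3]                 # Vector{Int}
floats = [1.0, 2.0]              # Vector{Float64}
abstract_vec = Number[1, 2.5]    # Vector{Number}
strings = ["a", "b"]             # Vector{String}

println(accepts_numbers(ints))           # true
println(accepts_numbers(floats))         # true
println(accepts_numbers(abstract_vec))   # true
println(accepts_numbers(strings))        # false

# in contrast, `Vector{Number}` only matches that exact type
println(ints isa Vector{Number})           # false
println(abstract_vec isa Vector{Number})   # true
```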
Any
<:
and simply write where T
, Julia implicitly interprets the statement as where T <: Any
.
# all the statements below are `true`
Float64 <: Any
Float64 <: T where T <: Any # identical to the line above
Vector{Int64} <: Vector{T} where T <: Any
# all the statements below are `true`
Float64 <: Any
Float64 <: T where T # identical to the line above
Vector{Int64} <: Vector{T} where T
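The equivalence between these notations can be checked by comparing the types themselves. This is a small sketch with illustrative names `A`, `B`, and `C`; parentheses are added around each `where` expression for readability.

```julia
A = (Vector{T} where T)
B = (Vector{T} where T <: Any)
C = Vector{<:Any}

# all three describe the same set of types, which is also what `Vector` denotes
println(A == B)        # true
println(B == C)        # true
println(C == Vector)   # true
```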
We now indicate how to type-annotate a variable. The technique can be used to either assert a variable's type during an assignment or to restrict the types of function arguments.
There are two approaches to type-annotating variables. The first one relies on the binary operator ::
and its syntax is x::<type>
. The second approach leverages the Boolean binary operator <:
, which must be combined with ::
and the keyword where
. Specifically, the syntax is x::T where T <: <type>
, where T
can be replaced with any other character.
Next, we illustrate both approaches, considering type annotations for assignments and for function arguments separately.
Let's start by illustrating the approaches for scalar assignments. Each tab below declares an identical type for x
and for y
.
x::Int64 = 2 # only reassignments to `Int64` are possible
y::Number = 2 # only reassignments to `Float64`, `Float32`, `Int64`, etc are possible
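x = 2.5 # InexactError: `2.5` can't be converted to `Int64`
y = 2.5 # works, since `Float64` is a subtype of `Number`
y = "hello" # MethodError: a `String` can't be converted to a `Number`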
x::T where T <: Int64 = 2 # only reassignments to `Int64` are possible
y::T where T <: Number = 2 # only reassignments to `Float64`, `Float32`, `Int64`, etc are possible
x = 2.5
y = 2.5
y = "hello"
x
in an assignment, you can't modify x
's type afterwards. The only way to fix this is by starting a new Julia session. The fact that x
has the same type across all tabs follows because T <: Int64
only includes Int64
. This occurs as Int64
is a concrete type, which by definition has no subtypes other than itself. Considering this, it's common to directly assert scalar types through ::
rather than <:
.
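Because an annotation can't be undone within a session, the sketch below places the annotated variable inside functions, where it remains local and the example can be rerun freely. The function names are hypothetical. The example also shows a detail worth knowing: when a value is assigned to an annotated variable, Julia first attempts to convert it to the declared type, and only raises an error if the conversion isn't possible.

```julia
# hypothetical helper: `3.0` can be represented exactly as an integer
function convert_on_assignment()
    x::Int64 = 2
    x = 3.0          # converted to `Int64(3)`
    return x
end

# hypothetical helper: `2.5` has no exact `Int64` representation
function reject_inexact()
    x::Int64 = 2
    try
        x = 2.5
    catch err
        return err isa InexactError
    end
    return false
end

println(convert_on_assignment())           # 3
println(typeof(convert_on_assignment()))   # Int64
println(reject_inexact())                  # true
```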
In contrast, the choice between ::
or <:
has different implications when a vector's type is asserted. The reason is that declaring Vector{Number}
is quite different from Vector{T} where T <: Number
. The former establishes that Vector{Number}
is the only possible concrete type, while the latter admits any Vector{T} whose element type T is a subtype of Number
.
x::Vector{Any} = [1,2,3] # `x` will always be `Vector{Any}`
y::Vector{Number} = [1,2,3] # `y` will always be `Vector{Number}`
typeof(x)
typeof(y)
x::Vector{T} where T <: Any = [1,2,3] # `x` can be reassigned to `Vector{Float64}`, `Vector{String}`, etc
y::Vector{T} where T <: Number = [1,2,3] # `y` can be reassigned to `Vector{Float64}`, `Vector{Int64}`, etc
typeof(x)
typeof(y)
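The contrast between both annotations can be sketched with local variables, which keeps the example rerunnable. The function names are hypothetical, and the shorthand `Vector{<:Number}` is used in place of `Vector{T} where T <: Number`.

```julia
# hypothetical helper: the value is converted to the single declared type
function fixed_eltype()
    y::Vector{Number} = [1, 2, 3]
    return typeof(y)
end

# hypothetical helper: any vector of numbers is accepted without conversion
function flexible_eltype()
    z::Vector{<:Number} = [1, 2, 3]
    first_type = typeof(z)
    z = [1.5, 2.5]
    return first_type, typeof(z)
end

println(fixed_eltype())      # Vector{Number}
println(flexible_eltype())   # a vector of integers, then a vector of floats
```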
The principles outlined apply even when a variable's type isn't explicitly annotated. The reason is that an assignment without ::
implicitly annotates the variable with Any
, where Any
is the supertype that encompasses all possible types. Specifically, statements like x = 2
and x::Any = 2
are equivalent.
The same occurs when omitting <:
from the expression where T
, which implicitly takes T <: Any
. Thus, for instance, x = 2
is equivalent to x::T where T = 2
and x::T where T <: Any = 2
. Considering this, all the variables below restrict types in the same way.
# all are equivalent
a = 2
b::Any = 2
# all are equivalent
a = 2
b::T where T = 2
c::T where T <: Any = 2
The default restriction to Any
is the reason why we can reassign variables with any value. For instance, given a = 1
, executing a = "hello"
afterwards is valid because a
is implicitly type-annotated with Any
.
where
, especially when where T
is shorthand for where T <: Any
. These concise statements can easily lead to confusion.
a::T where T = 2 # this is not `T = 2`, it's `a = 2`
a::T where {T} = 2 # slightly less confusing notation
a::T where {T <: Any} = 2 # slightly less confusing notation
foo(x::T) where T = 2 # this is not `T = 2`, it's `foo(x) = 2`
foo(x::T) where {T} = 2 # slightly less confusing notation
foo(x::T) where {T <: Any} = 2 # slightly less confusing notation
Function arguments can also be type-annotated. The examples below illustrate this, where the function only processes integers.
function foo1(x::Int64, y::Int64)
x + y
end
foo1(1, 2) # returns 3
foo1(1.5, 2) # MethodError: `1.5` is not an `Int64`
function foo2(x::Vector{T}, y::Vector{T}) where T <: Int64
x .+ y
end
foo2([1,2], [3,4]) # returns [4, 6]
foo2([1,2], [3.0, 4.0]) # MethodError: the second argument is a `Vector{Float64}`
Note that employing the same parameter T
for both arguments forces variables to share the same type. Moreover, types like Int64
preclude the use of Float64
, even for numbers like 3.0
. If you seek to stay flexible, a more suitable approach is to use an abstract type like Number
and two different type parameters.
function foo2(x::T, y::T) where T <: Number
x + y
end
foo2(1.5, 2.0) # returns 3.5
foo2(1.5, 2) # MethodError: `x` and `y` must share the same type
function foo3(x::T, y::S) where {T <: Number, S <: Number}
x + y
end
foo3(1.5, 2.0) # returns 3.5
foo3(1.5, 2) # returns 3.5
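Rather than triggering an error, you can ask Julia whether a method exists for given arguments through the built-in function `applicable`. The sketch below uses the hypothetical functions `same_type` and `any_numbers`, which mirror the two annotation styles discussed above.

```julia
# one shared parameter: both arguments must have the same concrete type
same_type(x::T, y::T) where {T <: Number} = x + y

# two parameters: each argument can be a different kind of number
any_numbers(x::T, y::S) where {T <: Number, S <: Number} = x + y

println(applicable(same_type, 1.5, 2.0))    # true
println(applicable(same_type, 1.5, 2))      # false
println(applicable(any_numbers, 1.5, 2))    # true
println(any_numbers(1.5, 2))                # 3.5
```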
In fact, the greatest flexibility is achieved when we don't type-annotate function arguments at all, as they will implicitly default to Any
. This can be observed below, where all the tabs define the same function.
function foo(x, y)
x + y
end
function foo(x::Any, y::Any)
x + y
end
function foo(x::T, y::S) where {T <: Any, S <: Any}
x + y
end
function foo(x::T, y::S) where {T, S}
x + y
end
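A tentative way to confirm that these definitions coincide is to count the methods of the function after entering all of them. The sketch uses a hypothetical function `bar` and assumes that each definition replaces the previous one because their signatures are identical.

```julia
# four ways of writing the same signature for the hypothetical function `bar`
bar(x, y) = x + y
bar(x::Any, y::Any) = x + y
bar(x::T, y::S) where {T <: Any, S <: Any} = x + y
bar(x::T, y::S) where {T, S} = x + y

# each definition overwrote the previous one, so a single method remains
println(length(methods(bar)))   # 1
println(bar(1, 2.5))            # 3.5
```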
To conclude this section, we present a particular approach for defining variables. This replicates the values of another variable x
, while constructing the object with a concrete type. The approach relies on the use of special functions called constructors, which create new instances of a concrete type. These functions are useful for transforming a variable x
into another type, provided the transformation is possible.
Constructors are implemented by Type(x)
, where Type
should be replaced with the literal name of the type (e.g., Vector{Float64}
). Like any other function, it supports broadcasting.
x = 1
y = Float64(x)
z = Bool(x)
y
z
x = [1, 2, 3]
y = Vector{Any}(x)
y
x = [1, 2, 3]
y = Float64.(x)
y
x = 1
y = Number(x)
typeof(y)
x = [1, 2]
y = (Vector{T} where T)(x)
typeof(y)
x = 1
z = Any(x)
We can alternatively employ the function convert(T,x)
, which transforms x
to the type T
when possible.
x = 1
y = convert(Float64, x)
z = convert(Bool, x)
y
z
x = [1, 2, 3]
y = convert(Vector{Any}, x)
y
x = [1, 2, 3]
y = convert.(Float64, x)
y
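Although constructors and `convert` often yield the same result, they aren't interchangeable in every case. The sketch below highlights two behaviors: `convert` returns the original object when it already has the requested type, whereas an array constructor creates a new array, and both throw an `InexactError` when the value can't be represented exactly. The variable names are illustrative.

```julia
x = [1, 2, 3]

# `convert` has nothing to do here, so it returns `x` itself
same_object = convert(Vector{Int}, x) === x
println(same_object)     # true

# the constructor builds a new array with equal contents
copied = Vector{Int}(x)
println(copied === x)    # false
println(copied == x)     # true

# conversions that would lose information are rejected
inexact = try
    convert(Int64, 2.5)
    false
catch err
    err isa InexactError
end
println(inexact)         # true
```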