In the previous section, we introduced the fundamentals of memory allocation, highlighting that objects can be stored on either the heap or the stack. We also emphasized the conventional terminology used in programming in general and in Julia in particular, where "allocations" refers exclusively to allocations on the heap. This convention gives rise to the expression "an object allocates", meaning that the object is stored on the heap.
The distinction isn't merely a way to economize on words. Rather, it reflects that heap allocations are the ones that matter for performance: they involve more complex memory management, which can slow down a program significantly.
In fact, the close relationship between performance and heap allocations can be appreciated through the macros @time and @btime. To provide a comprehensive measure of performance, they report not only the total runtime of an operation, but also the heap allocations involved.
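As a rough sketch of what these reports look like, the snippet below times a small workload. The function `make_data` is a hypothetical example written for illustration. The snippet uses `@time` and its relative `@timed`, both available in Base; `@btime` comes from the BenchmarkTools package and is left out so that the snippet runs without installing anything.

```julia
# Hypothetical workload: creates a 1_000-element vector, which lives on the heap
make_data() = rand(1_000)

make_data()                   # warm-up call, so that compilation is not measured

@time make_data()             # prints elapsed time plus number and size of allocations

stats = @timed make_data()    # same measurements, returned as a named tuple
stats.time                    # elapsed time in seconds
stats.bytes                   # heap bytes allocated by the call
```

The line printed by `@time` shows the runtime followed by the allocations, while the fields of `stats` expose the same quantities for further use in code.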
In the following, we initiate our analysis of memory allocation by categorizing objects into two groups: those that allocate and those that don't.
We start by focusing on objects that don't allocate memory. They include:
Numbers
Tuples
Named Tuples
Ranges
As these objects don't allocate, neither does creating, accessing, or operating on them. This is demonstrated below.
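A minimal sketch of such a check follows. It assumes `@allocated` from Base as the measuring tool, which returns the number of heap bytes allocated while evaluating an expression. The function `no_alloc_demo` is hypothetical and only serves to combine the four kinds of objects listed above.

```julia
# Hypothetical function that creates, accesses, and operates on
# numbers, a tuple, a named tuple, and a range
function no_alloc_demo(a, b)
    tup = (a, b, a + b)          # tuple
    nt  = (lo = a, hi = b)       # named tuple
    rng = a:b                    # range
    return tup[3] + nt.hi + sum(rng)
end

no_alloc_demo(1, 10)                       # warm-up call, so that compilation is not measured
bytes = @allocated no_alloc_demo(1, 10)    # heap bytes allocated by the call
```

On a typical Julia 1.x installation, `bytes` should come out as zero, in line with the claim that none of these objects is stored on the heap.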
Arrays are the most common objects that allocate memory. These allocations occur not only when we create an array and assign it to a variable, but also when computations that return arrays are performed on the fly. The following example illustrates this point.
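One way to sketch both situations is shown below, again assuming `@allocated` from Base as the measuring tool. All function names are hypothetical. The measurements are taken inside a function, a choice made in this sketch so that global variables don't add allocations of their own.

```julia
make_vector(n) = collect(1:n)    # creates and returns a new Vector{Int}
sum_doubled(x) = sum(2 .* x)     # `2 .* x` builds an unnamed vector on the fly

# Hypothetical helper that measures both operations
function array_bytes(n)
    x = make_vector(n)                          # setup, not measured
    create_bytes = @allocated make_vector(n)    # array that is created and returned
    temp_bytes   = @allocated sum_doubled(x)    # array that only exists during the computation
    return (create = create_bytes, temp = temp_bytes)
end

array_bytes(100)                 # warm-up call, so that compilation is not measured
report = array_bytes(100)
```

Both `report.create` and `report.temp` should be positive. The exact byte counts depend on the Julia version, so none are quoted here.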
Slicing is another operation that creates an array, and therefore allocates. This is due to the default behavior of slicing, which returns a new copy rather than a view of the original object. The sole exception to this rule is when a single element is accessed, in which case no new allocation occurs.
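The contrast between a slice and a single-element access can be sketched as follows, assuming `@allocated` from Base as the measuring tool. The function names are hypothetical, and the measurements are wrapped in a function so that global variables don't distort them.

```julia
take_slice(x)   = x[1:2]    # slice: a new vector holding copies of the first two elements
take_element(x) = x[2]      # single element: just a number

# Hypothetical helper that measures both operations
function slicing_bytes(x)
    slice_bytes   = @allocated take_slice(x)
    element_bytes = @allocated take_element(x)
    return (slice = slice_bytes, element = element_bytes)
end

x = collect(1:100)
slicing_bytes(x)            # warm-up call, so that compilation is not measured
report = slicing_bytes(x)
```

Here `report.slice` should be positive, whereas `report.element` should be zero on a typical Julia 1.x installation.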
Other operations that involve array creation include array comprehensions and broadcasting. Notably, broadcasting allocates memory even when its result is only an intermediate value that is computed internally and never explicitly returned. This specific case is demonstrated in "Broadcasting 2" below.
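The three cases can be sketched along the following lines, assuming `@allocated` from Base as the measuring tool. The function names are hypothetical, and the labels in the comments are an assumption about which case the text calls "Broadcasting 2": the one where the broadcast result is consumed internally.

```julia
squares_comprehension(x) = [a * a for a in x]    # comprehension: returns a new vector
squares_broadcast(x)     = x .* x                # "Broadcasting 1": returns a new vector
sum_of_squares(x)        = sum(x .* x)           # "Broadcasting 2": returns a number, but
                                                 # `x .* x` is still built as a temporary vector

# Hypothetical helper that measures the three operations
function creation_bytes(x)
    comprehension = @allocated squares_comprehension(x)
    broadcast1    = @allocated squares_broadcast(x)
    broadcast2    = @allocated sum_of_squares(x)
    return (; comprehension, broadcast1, broadcast2)
end

x = collect(1:100)
creation_bytes(x)            # warm-up call, so that compilation is not measured
report = creation_bytes(x)
```

All three fields of `report` should be positive. For `sum_of_squares`, the allocation comes entirely from the temporary vector, since the returned value is a plain number.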