
Simple World Heatmaps in Julia

PhD in Economics

Summary

  • Goal: To plot world heatmaps in Julia to visually represent country statistics. The approach handles all computations in Julia through DataFrames, and then executes a single command in R from within Julia (through RCall). I also provide a cleaned dataset containing map coordinates, which is ready to be used and identifies countries by their ISO codes.

  • Relevance of the Note: I was looking for a simple way to draw world heatmaps in Julia. While existing Julia packages such as GMT.jl offer more advanced functionalities, they may be overkill for simple uses. Furthermore, standard datasets for map coordinates in R usually require further cleaning before they're ready to be used. To address these issues, I provide a straightforward approach if you want to handle the data and computations through Julia's DataFrames.


The final outcome is a world heatmap representing each country's share of world income (fake data).

  • Links to Download the Files of this Note:

    • World map coordinates

    • The Julia code is here under the name allCode.jl. If you wish to run the code with the same package versions used in this post, download instead allCode_withPkg.jl along with Project.toml and Manifest.toml.

1st Step: Handling Our Data in Julia

The Packages

The following script loads all the packages in Julia and R needed for plotting heatmaps. Keep in mind that any code written in Julia between R""" and """ is interpreted as R code, a feature provided by the package RCall. Note: RCall occasionally fails to recognize R's path after installation. If this occurs, try specifying the path to R on your computer and then building the package. On my computer, I performed these steps by running ENV["R_HOME"]="C:\\Program Files\\R\\R-4.3.0" and Pkg.build("RCall").
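If RCall cannot find R, the two steps from the note can be wrapped as in the sketch below. The helper name set_r_home and its build keyword are hypothetical, introduced only for illustration, and the path in the usage comment is the one from my computer, so yours will likely differ.

```julia
using Pkg

# Hypothetical helper: point RCall to an R installation and optionally rebuild it.
function set_r_home(r_home::AbstractString; build::Bool = false)
    isdir(r_home) || error("R installation not found at: $r_home")
    ENV["R_HOME"] = r_home
    build && Pkg.build("RCall")     # rebuild so that RCall picks up the new path
    return ENV["R_HOME"]
end

# Usage with the path from this note (adjust it to your own installation):
# set_r_home("C:\\Program Files\\R\\R-4.3.0"; build = true)
```

Checking that the folder exists before building is a design choice of this sketch. It simply turns a wrong path into an immediate error.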

Loading Packages
using DataFrames, RCall, CSV, Downloads

R"""
library(ggplot2)
library(svglite) #to save graphs in svg format, otherwise not necessary
"""

The (Fake) Data

We illustrate the approach through a self-contained example. To implement this, we generate some mock data representing each country's share of global income. This is contained in a DataFrame called our_df. The code for creating the DataFrame is omitted, as the specifics of how the mock data were generated are unimportant for our purposes. What matters is that, after cleaning our data and computing statistics, our_df will have the following variables.

Output in REPL
julia> our_df[1:5,:]
5×2 DataFrame
 Row │ iso3    share_income
     │ String  Float64
─────┼──────────────────────
   1 │ ABW         0.694218
   2 │ AFG         0.48446
   3 │ AGO         0.450032
   4 │ AIA         0.169835
   5 │ ALB         0.487509
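
Since the code that generated our_df is omitted, the following is only a minimal sketch of how a DataFrame with the same two columns could be built. The five ISO codes are taken from the output above. The name mock_df, the seed, the random draws, and the normalization to percentages that add up to 100 are assumptions made for illustration, not the procedure actually used for this note.

```julia
using DataFrames, Random

Random.seed!(1234)                                   # arbitrary seed, only for reproducibility

iso3_codes = ["ABW", "AFG", "AGO", "AIA", "ALB"]     # codes shown in the REPL output
raw_income = rand(length(iso3_codes))                # made-up incomes

# shares expressed in percent, so that they add up to 100 (an assumption)
mock_df = DataFrame(iso3         = iso3_codes,
                    share_income = 100 .* raw_income ./ sum(raw_income))
```

Any DataFrame with a column iso3 and a numeric column to plot would work equally well in the steps that follow.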

Now, we proceed to plot a world heatmap that represents the variable share_income.

2nd Step: Adding Map Coordinates to the Data

To plot each country's share of the world's income, we need map coordinates. The ggplot2 package in R can provide this information by executing map_data("world"), which internally calls the maps package. However, this dataset requires further cleaning before it can be used for world maps. (Note: for instance, the dataset doesn't include each country's ISO code by default. Adding this information requires matching country names through regular expressions, a method that is not guaranteed to work, as the documentation indicates.) Building on these data, I provide a cleaned dataset for map coordinates at this link. The dataset also includes some variables that identify zones, enabling you to restrict the plot to specific regions.

The dataset with map coordinates can be merged with our DataFrame by running the code snippet below. This creates a new DataFrame called merged_df, which we'll use to plot the heatmap.

Merging Datasets
# we merge our dataframe with the file having map coordinates
link           = Downloads.download("https://alfaromartino.github.io/data/countries_mapCoordinates.csv")
df_coordinates = DataFrame(CSV.File(link)) |> x-> dropmissing(x, :iso3)

merged_df = leftjoin(df_coordinates, our_df, on=:iso3)
Output in REPL
julia> merged_df
4×11 DataFrame
 Row │ long      lat       group    order  short_name_country  iso2      iso3     iso_name  zone           subzone          share_income
     │ Float64   Float64   Float64  Int64  String              String3?  String3  String?   String15?      String15?        Float64?
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ -65.4285  43.5614     260.0  15259  Canada              CA        CAN      Canada    North America  North America        0.13938
   2 │  74.6131  40.2722     430.0  30429  China               CN        CHN      China     Asia           Asia                 2.7678
   3 │ -60.2417   5.25796    667.0  45666  Guyana              GY        GUY      Guyana    Latin America  South America        0.240533
   4 │ -87.7618  18.4461     985.0  61077  Mexico              MX        MEX      Mexico    Latin America  Central America      0.559792

Specifically, the map coordinates are represented through four variables: long, lat, group, and order. These variables and iso3 are all we need to identify and draw a world map. Additionally, share_income represents our statistic of interest.
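
To see what the merge does, here is a small self-contained sketch with made-up rows. All names starting with toy_ and all the values are hypothetical. It shows that leftjoin keeps every coordinate row and assigns missing to countries without a statistic, and that antijoin can list the ISO codes in our data that found no match in the coordinates. The sort! call is a precaution I'm assuming is harmless, since polygons are drawn following the row order within each group.

```julia
using DataFrames

# toy coordinates: two points for Canada, one for Mexico, one for Antarctica
toy_coordinates = DataFrame(iso3  = ["CAN", "CAN", "MEX", "ATA"],
                            long  = [-65.4, -64.9, -87.8, 0.0],
                            lat   = [43.6, 44.1, 18.4, -80.0],
                            group = [1, 1, 2, 3],
                            order = [1, 2, 3, 4])

# toy statistics: no row for Antarctica, and one code absent from the map data
toy_stats = DataFrame(iso3         = ["CAN", "MEX", "XXX"],
                      share_income = [0.14, 0.56, 0.01])

toy_merged = leftjoin(toy_coordinates, toy_stats, on = :iso3)
sort!(toy_merged, [:group, :order])     # keep each polygon's points in drawing order

# codes in our statistics that found no match among the coordinates
unmatched = antijoin(toy_stats, toy_coordinates, on = :iso3)
```

Inspecting unmatched before plotting is a quick way to detect misspelled or non-standard ISO codes in your own data.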

3rd Step: Calling R to Plot the Map

Once we've computed our statistics in Julia and identified countries by their ISO codes, the final step is to use R to create the heatmap. To accomplish this, we must first define a file path in Julia, where our graphs will be stored. If you run the code I provided, the folder will be located at C:\maps.

Creating Path to Store the Graph
graphs_folder = joinpath("C:/", "maps")

isdir(graphs_folder) || mkdir(graphs_folder) #create 'maps' folder if it doesn't exist
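
If you are not on Windows, or prefer another location, a more portable variant could look like the sketch below. The helper make_graphs_folder is hypothetical, and using your home directory as the base is just one possible choice.

```julia
# Hypothetical helper: create the 'maps' folder inside `base` and return its path.
function make_graphs_folder(base::AbstractString)
    folder = joinpath(base, "maps")
    mkpath(folder)          # creates the folder, and does nothing if it already exists
    return folder
end

# Usage, storing the graphs inside your home directory:
# graphs_folder = make_graphs_folder(homedir())
```

Unlike mkdir, mkpath does not throw an error when the folder already exists, so the isdir check becomes unnecessary.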

To use R from within Julia, we need to convert the DataFrame into R's format. This can be done in two ways. The first option is to send merged_df directly to R through @rput, which creates an R data frame with the same name. However, this option is only recommended if you plan to further manipulate the data in R, which is not our case. The second option is to simply interpolate merged_df when we reference it in R.

Below, we provide the implementation of both options. After this, we'll exclusively use the second option, as it's simpler.

# First option: we send merged_df to R with @rput
@rput merged_df     # we send our merged dataframe to R

# we create the graph using RCall
R"""
#baseline code
our_gg <- ggplot() + geom_polygon(data = merged_df, 
                                  aes(x=long, y = lat, group = group, 
                                      fill=share_income)) 

#saving the graph
ggsave(filename = file.path($(graphs_folder),"file_graph0.svg"), plot = our_gg)
"""

# Second option: we interpolate merged_df directly in the R code
# we create the graph using RCall
R"""
#baseline code
our_gg <- ggplot() + geom_polygon(data = $(merged_df), 
                                  aes(x=long, y = lat, group = group, 
                                      fill=share_income)) 

#saving the graph
ggsave(filename = file.path($(graphs_folder),"file_graph0.svg"), plot = our_gg)
"""


We refer to this graph as the baseline case, since it uses the default layout provided by ggplot2. The difference between the approaches can be seen through the use of @rput or the interpolation of merged_df. In both options, though, graphs_folder is directly interpolated, indicating to R where to store the graphs (we could've alternatively used @rput to send the variable graphs_folder to R).

The fill option constitutes the only information you need to fill in if you use this code as a template. It indicates the variable to be plotted, which is share_income in our case. The rest of the code should remain the same, regardless of what you wish to plot. In particular, aes(x=long, y = lat, group = group) must be specified in the same way every time you use it, as it provides the necessary information for drawing a world map.

And that's it! You can use the code snippet as a template for your own graphs. The resulting graph is shown below.

Plotted Graph
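
If you plot several statistics, one way to avoid editing the fill option each time is to rename the statistic to a fixed name in Julia before passing the data to R. The sketch below is only a suggestion and not part of the code of this note. The helper prepare_for_plot, the column name value, and the toy rows are all hypothetical.

```julia
using DataFrames

# Hypothetical helper: keep only the columns needed for the map and
# give the statistic to be plotted the fixed name `value`.
function prepare_for_plot(df::AbstractDataFrame, stat::Symbol)
    return select(df, :long, :lat, :group, :order, stat => :value)
end

# toy rows mimicking the structure of merged_df
toy_df = DataFrame(long  = [-65.4, 74.6],
                   lat   = [43.6, 40.3],
                   group = [260.0, 430.0],
                   order = [15259, 30429],
                   iso3  = ["CAN", "CHN"],
                   share_income = [0.14, 2.77])

plot_df = prepare_for_plot(toy_df, :share_income)
```

With this approach, the R snippet would interpolate plot_df instead of merged_df and use fill=value, so the R code would stay identical regardless of the statistic chosen.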

As the baseline graph may not be visually appealing, we next provide further options to enhance its appearance. The final code we'll provide can also be used as a template that can be "copied and pasted", similar to the baseline script.

4th Step (optional): Tweaking the Graph

For each option added to the graph, we state the corresponding code and the resulting plot. We only change one option at a time, and you can skip the rest of this section if you wish to apply the code off the shelf. The final code at the end of this section includes all the options we describe below, and produces the graph shown at the beginning of this note.

Getting Rid of Antarctica

Code
merged_df2 = merged_df[.!(occursin.(r"Antarct", merged_df.short_name_country)),:]

R"""
#base code
our_gg <- ggplot() + geom_polygon(data = $(merged_df2), 
                                  aes(x=long, y = lat, group = group, fill=share_income))

#saving the graph
ggsave(filename = file.path($(graphs_folder),"file_graph01a.svg"), plot = our_gg)
"""
Plotted Graph
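
The same kind of row filtering can restrict the map to a region through the zone variable of the coordinates dataset. The following is a minimal sketch on made-up rows. The zone labels are the ones visible in the output of merged_df, while toy_df, its values, and the missing zone for Antarctica are assumptions for illustration.

```julia
using DataFrames

toy_df = DataFrame(short_name_country = ["Canada", "Guyana", "Mexico", "Antarctica"],
                   zone               = ["North America", "Latin America", "Latin America", missing],
                   share_income       = [0.14, 0.24, 0.56, missing])

# keep only the rows whose zone is "Latin America"; rows with a missing zone are dropped
latin_america_df = subset(toy_df, :zone => ByRow(==("Latin America")), skipmissing = true)
```

The option skipmissing = true matters here because zone allows missing values, and a comparison with missing would otherwise make subset throw an error.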


Getting Rid of Axes, Grids, and Background

We first define the following function using RCall:

Code
R"""
user_theme <- function(){
  theme(
    panel.background = element_blank(),
    panel.border     = element_blank(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),

    axis.line    = element_blank(),
    axis.text.x  = element_blank(),
    axis.text.y  = element_blank(),
    axis.ticks   = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank()
      )
      } 
"""

We then plot the graph:

Code
R"""
#base code
our_gg <- ggplot() + geom_polygon(data = $(merged_df), 
                                  aes(x=long, y = lat, group = group, fill=share_income))

#we get rid of background, grid, and latitude/longitude
our_gg <- our_gg + user_theme()

#saving the graph
ggsave(filename = file.path($(graphs_folder),"file_graph01b.svg"), plot = our_gg)
"""
Plotted Graph


Changing Aspect Ratio

Code
R"""
#base code
our_gg <- ggplot() + geom_polygon(data = $(merged_df), 
                                  aes(x=long, y = lat, group = group, fill=share_income))

#changing aspect ratio
our_gg <- our_gg + coord_fixed(1.3) 

#saving the graph
ggsave(filename = file.path($(graphs_folder),"file_graph01c.svg"), plot = our_gg)
"""
Plotted Graph


Changing Color Scale

A quirky feature of ggplot2 is that, by default, lighter colors are associated with higher values. The following code inverts the color scale, so that darker colors correspond to higher values.

Code
R"""
#base code
our_gg <- ggplot() + geom_polygon(data = $(merged_df), 
                                  aes(x=long, y = lat, group = group, fill=share_income))

#to change color scale
# see colors in R here https://derekogle.com/NCGraphing/resources/colors
our_gg <- our_gg + scale_fill_gradient(low="lightskyblue", high="steelblue4", name = "Income Share") 

#additional options for the legend of color scale 
our_gg <- our_gg + theme(
                         legend.position = "bottom",
                         legend.key.height = unit(0.5, "cm"),
                         legend.key.width = unit(2, "cm"))

#saving the graph
ggsave(filename = file.path($(graphs_folder),"file_graph01d.svg"), plot = our_gg)
"""
Plotted Graph


Changing Color of Borders

Code
R"""
#base code + borders
our_gg <- ggplot() + geom_polygon(data = $(merged_df), 
                                  aes(x=long, y = lat, group = group, fill=share_income),
                                  color = "white", linewidth=0.1) 

#saving the graph
ggsave(filename = file.path($(graphs_folder),"file_graph01e.svg"), plot = our_gg)
"""
Plotted Graph


Cropping the Graph

Code
R"""
#base code
our_gg <- ggplot() + geom_polygon(data = $(merged_df), 
                                  aes(x=long, y = lat, group = group, fill=share_income))

#saving the graph
height <- 5
ggsave(filename = file.path($(graphs_folder),"file_graph01f.svg"), plot = our_gg, width = height * 3, height = height)
"""
Plotted Graph


Final Code that Includes All Options

You can use the following code off the shelf to plot a world heatmap. First, execute the following lines in Julia, without making any changes.

Code
R"""
user_theme <- function(){
  theme(
    panel.background = element_blank(),
    panel.border     = element_blank(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),

    axis.line    = element_blank(),
    axis.text.x  = element_blank(),
    axis.text.y  = element_blank(),
    axis.ticks   = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank()
      )
      } 
"""

Then, execute the script below, making only two modifications: insert the name of your Julia DataFrame in the data option, and indicate the variable to be plotted in fill. In our example, these options respectively correspond to merged_df2 (if we do not want to plot Antarctica, otherwise merged_df) and share_income.

Code
merged_df2 = merged_df[.!(occursin.(r"Antarct", merged_df.short_name_country)),:]

R"""
#base code
our_gg <- ggplot() + geom_polygon(data = $(merged_df2), 
                                  aes(x=long, y = lat, group = group, 
                                      fill=share_income),
                                  color = "white", linewidth=0.05)
                                  
#we get rid of background, grid, and latitude/longitude
our_gg <- our_gg + user_theme()

#changing aspect ratio
our_gg <- our_gg + coord_fixed(1.3) 

#to change color scale (see color codes here https://derekogle.com/NCGraphing/resources/colors)
our_gg <- our_gg + scale_fill_gradient(low="lightskyblue", high="steelblue4", name = "Income Share")

#additional options for the legend of color scale 
our_gg <- our_gg + theme(
                         legend.position = "bottom",
                         legend.key.height = unit(0.5, "cm"),
                         legend.key.width = unit(2, "cm"))

#saving the graph
height <- 5
ggsave(filename = file.path($(graphs_folder),"file_graph01.svg"), plot = our_gg, width = height * 3, height = height)
"""
Plotted Graph