Analytics Made Accessible


Save Time with ChaRt Templates

**Download the R syntax and data used in this post**

Image reads: ChaRt Templates enclosed in double curly brackets.

Not too long ago, I wrote a blog post on how to turn a chart into a template in Excel. Well, I recently had a conversation with an old colleague of mine, and they wanted to learn more about how to create chart templates in the R programming environment.

TL;DR: Do you work in R and love the ggplot2 package? Read this post to learn how to create a chart template in R.

So, why do we care about templates?

Designing data visualizations can be a daunting and lengthy process. So, when you get a chart just right, there is nothing more annoying (and time-consuming) than having to start the process all over again on another project for the same type of chart. If you find yourself spending more time creating charts than sharing your organization's data stories, invest in some chart templates.

Templates eliminate mundane design decisions. And bonus: they guarantee that everyone in your organization is producing materials that feature charts that are professional and consistently on-brand.  

Let's say you are an evaluator, and you just completed a large study on the perspectives of art museum directors. As a part of the study, you conducted a survey. One of the questions on the survey asked museum directors:

When making decisions about which artwork to acquire at your museum, which of the following do you prioritize?

Respondents could select all that apply from the following list:

  • Artist Identity (e.g., Jewish, Islamic, Christian, Buddhist, etc.)

  • Time Period (e.g., contemporary, romanticism, medieval)

  • Region (e.g., European, Southeast Asian, African)

  • Medium (e.g., paintings, prints, sculptures)

  • Religion (e.g., Jewish, Islamic, Christian, Buddhist, etc.)

You run the numbers and produce the following summary table:

| Priority        | Percentage |
|-----------------|------------|
| Region          | 59%        |
| Artist Identity | 79%        |
| Religion        | 10%        |
| Medium          | 53%        |
| Time Period     | 67%        |

You decide that you want to visualize the data using a lollipop chart and get to work on creating the plot in R.

## Install and load tidyverse
# install.packages("tidyverse")
library(tidyverse)

## Data to plot
art_data <- tibble(Area = c("Region", "Artist identity", "Religion", "Medium", "Time period"),
                   Percentage = c(0.59, 0.79, 0.10, 0.53, 0.67))

## Plot data
art_data %>%
  # Sort by value, then lock that order in as the factor levels
  arrange(Percentage) %>%
  mutate(Area = factor(Area, levels = Area)) %>%
  ggplot(aes(x = Area, y = Percentage)) +
  # The "stick" of each lollipop
  geom_segment(aes(x = Area, xend = Area, y = 0, yend = Percentage),
               color = "#5b5b5b", linewidth = 0.75) +
  # The "candy" at the end of each stick
  geom_point(color = "#7030A0", size = 8, alpha = 1) +
  scale_y_continuous(limits = c(0, 1), breaks = seq(0, 1, by = 0.25),
                     labels = paste0("\n", seq(0, 1, by = 0.25) * 100, "%")) +
  scale_x_discrete(expand = c(0, 1)) +
  coord_flip() +
  # Custom gridlines at each axis break
  geom_hline(yintercept = seq(0, 1, by = 0.25), colour = "#BFBFBF",
             linewidth = 0.20) +
  labs(x = "", y = "") +
  theme_void() +
  theme(legend.position = "none",
        plot.title = element_blank(),
        axis.ticks.x = element_blank(),
        axis.ticks.length = unit(0.10, "cm"),
        axis.text.y = element_text(size = 11, color = "#404040", family = "Arial",
                                   margin = margin(r = 1.25), hjust = 1),
        axis.text.x = element_text(size = 11, color = "#404040", family = "Arial"),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        axis.line = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        plot.margin = margin(t = 0.5, r = 0.5, b = 0.5, l = 1, "cm"))

The Model Chart

Image of a lollipop chart plotting values for 5 categories: Artist identity (79%); Time Period (67%); Region (59%); Medium (53%); Religion (10%).

You send the chart off to your boss and pat yourself on the back. A job well done. Your boss falls in love with it and says, "This is amazing! Let's do the same thing for the school project data!"

You say, "Sure!" while groaning internally. That last chart you created and all the fancy formatting took you half an hour to get just right.

Never fear; you DO NOT need to start over. Just turn your chart into a template. That way, if your boss circles back and decides they want to apply the same chart to a different dataset, you can create the plot in under a minute.

 

HOW TO TURN A CHART INTO A TEMPLATE IN R (in 2 steps!)

STEP 1: Create a Function

One of the benefits of working in R is that you can (pretty much) turn anything into a function. A chart is no different. Here, our goal is to create a user-defined function that quickly creates a lollipop chart.

The basic (horizontal) lollipop chart has two components:

  • Values plotted along the x-axis; and

  • Categories (or groups) plotted along the y-axis

So, at a minimum, the function you create for this chart type must have two inputs (or arguments). Let's call these two arguments variable and category. A third argument called dataset would also be helpful; this would allow us to specify the data frame or tibble containing the variables. You will also need to write a series of statements within the body of the function that will generate the plot.

Lucky for us, we have some existing syntax we can pull from:

# Un-stylized Lollipop Chart
art_data %>%
  arrange(Percentage) %>%
  mutate(Area = factor(Area, levels = Area)) %>%
  ggplot(aes(x = Area, y = Percentage)) +
  geom_segment(aes(x = Area, xend = Area, y = 0, yend = Percentage),
               color = "#5b5b5b", linewidth = 0.75) +
  geom_point(color = "#7030A0", size = 8, alpha = 1) +
  coord_flip()

Image of a lollipop chart plotting values for 5 categories: Artist identity (79%); Time Period (67%); Region (59%); Medium (53%); Religion (10%).

But wait, you cannot use the EXACT same syntax, because the whole point of a function is to avoid writing the same block of code repeatedly. So, rather than hardcoding data frame or variable names, pass them to the function as arguments each time you call it to generate a lollipop chart.

Data frames and tibbles are easy to reference. Say you wanted to create a function (with a single argument) that prints the first two rows of any data frame or tibble. You could write:

First_Two_Rows <- function(dataset){
  dataset %>%
    slice(1:2)
}

set.seed(0721)
boop <- data.frame(a = sample(1:50, 10, replace = TRUE))
First_Two_Rows(dataset = boop)
1   37
2   21

To refer to variables within a data frame (without quotes), use the curly-curly operator ({{ }}). Say you wanted to create a function that filters a dataset and retains any rows where a variable value is greater than 10. You could write:

Ten_greater <- function(dataset, variable){
  dataset %>%
    filter({{variable}} > 10)
}

set.seed(0721)
boop2 <- data.frame(a = sample(1:20, 10, replace = TRUE))
Ten_greater(dataset = boop2, variable = a)
1 20
2 20
3 14
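
If you would rather pass the column name as a quoted string, dplyr's `.data` pronoun is one alternative to curly-curly. The sketch below is a hypothetical variant of Ten_greater (the name Ten_greater_chr and the toy data are made up for illustration, not part of the original post):

```r
library(dplyr)

# Hypothetical variant of Ten_greater(): `variable` is a string here,
# so the column is looked up with the .data pronoun instead of {{ }}
Ten_greater_chr <- function(dataset, variable){
  dataset %>%
    filter(.data[[variable]] > 10)
}

# Toy data, made up for illustration
toy <- data.frame(a = c(3, 12, 20, 7))

# Keeps the rows where a is 12 and 20
Ten_greater_chr(dataset = toy, variable = "a")
```

Curly-curly keeps calls short (variable = a), while the string version can be handy when column names arrive as text, for example when looping over a vector of names.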

Now that you have a sense of how to reference data frames and variables, let's create a function that generates a basic lollipop chart. To start, our function is going to have three arguments:

  1. dataset (a tibble or data frame);

  2. variable (a variable in your dataset that represents values to plot along the x-axis; numbers or percentages); and

  3. category (a variable in your dataset that represents group names to plot along the y-axis).

Importantly, make sure to include an arrange statement that sorts values:

lolli_chart <- function(dataset, variable, category){

  dataset %>%
    arrange({{variable}}) %>%
    mutate({{category}} := factor({{category}}, levels = {{category}})) %>%
    ggplot(aes(x = {{category}}, y = {{variable}})) +
    geom_segment(aes(x = {{category}}, xend = {{category}}, y = 0, yend = {{variable}}),
                 color = "#5b5b5b", linewidth = 0.75) +
    geom_point(color = "#7030A0", size = 8, alpha = 1) +
    coord_flip()

}

set.seed(0721)
fake_data <-
tibble(group = sample(paste0("Group ", letters[1:5]), size = 5, replace = FALSE), per = runif(5, min = 45, max = 85)/100)

lolli_chart(dataset = fake_data, variable = per, category = group)

Image of an un-stylized lollipop chart plotting values for 5 categories: Group b: 80%, Group d: 62%, Group e: 56%, Group c: 71%, and Group a: 46%.
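
Why does the arrange statement matter? ggplot2 orders a discrete axis by factor levels, and factor() sorts levels alphabetically unless told otherwise. Sorting the rows first and then using the sorted column as the levels makes the lollipops appear in value order. Here is a small sketch of just that step; the helper name order_by_value and the toy data are assumptions made for illustration:

```r
library(dplyr)

# Hypothetical helper that isolates the sort-then-factor step.
# `:=` is used because a {{ }} expression cannot sit on the left of `=`
order_by_value <- function(dataset, variable, category){
  dataset %>%
    arrange({{ variable }}) %>%
    mutate({{ category }} := factor({{ category }}, levels = {{ category }}))
}

# Toy data, deliberately out of order
toy <- tibble(grp = c("b", "a", "c"), val = c(0.3, 0.1, 0.2))

# Levels follow the sorted values: "a" "c" "b"
levels(order_by_value(toy, variable = val, category = grp)$grp)

# Without the sort, factor() falls back to alphabetical: "a" "b" "c"
levels(factor(toy$grp))
```

One caveat: factor levels must be unique, so this approach assumes each category appears in only one row of the dataset.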

Looking good, but what if you wanted to control what values are shown along the x-axis? No problem, simply add three additional arguments:

  1. low_num (minimum value)

  2. high_num (maximum value); and

  3. by_num (unit increments).

So, if you wanted to generate a lollipop chart showing percentages where the x-axis values range from 0 (0%) to 1 (100%) and increase in increments of 0.25 (or 25%), you would enter 0 for low_num, 1 for high_num, and 0.25 for by_num. Note: Here, the scaling is applied to the y-axis because coord_flip is used:

lolli_chart <- function(dataset, variable, category, low_num, high_num, by_num){

  dataset %>%
    # Sort once; the factor levels below preserve this order
    arrange({{variable}}) %>%
    mutate({{category}} := factor({{category}}, levels = {{category}})) %>%
    ggplot(aes(x = {{category}}, y = {{variable}})) +
    geom_segment(aes(x = {{category}}, xend = {{category}}, y = 0, yend = {{variable}}),
                 color = "#5b5b5b", linewidth = 0.75) +
    geom_point(color = "#7030A0", size = 8, alpha = 1) +
    # Axis range, break positions, and percent labels
    scale_y_continuous(limits = c(low_num, high_num),
                       breaks = seq(low_num, high_num, by = by_num),
                       labels = paste0("\n", seq(low_num, high_num, by = by_num) * 100, "%")) +
    coord_flip()
}

set.seed(0721)
fake_data <-
tibble(group = sample(paste0("Group ", letters[1:5]), size = 5, replace = FALSE),
          per = runif(5, min = 45, max = 85)/100)

lolli_chart(dataset = fake_data, variable = per, category = group, low_num = 0, high_num = 1, by_num = 0.25)

Image of an un-stylized lollipop chart plotting values for 5 categories: Group b: 80%, Group d: 62%, Group e: 56%, Group c: 71%, and Group a: 46%.
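
To see what the three new arguments actually hand to scale_y_continuous, it can help to run the seq and paste0 calls on their own. This is a base R sketch using the same values as the example above; the object names axis_breaks and axis_labels are made up for illustration:

```r
# Same inputs passed to lolli_chart() in the example
low_num  <- 0
high_num <- 1
by_num   <- 0.25

# Positions of the axis breaks
axis_breaks <- seq(low_num, high_num, by = by_num)

# Labels: proportion -> whole-number percent, with a leading newline
# that nudges the text away from the axis
axis_labels <- paste0("\n", axis_breaks * 100, "%")

print(axis_breaks)
# [1] 0.00 0.25 0.50 0.75 1.00

# Strip the newline just to make the printed labels easier to read
print(sub("\n", "", axis_labels, fixed = TRUE))
# [1] "0%"   "25%"  "50%"  "75%"  "100%"
```

Because the labels multiply by 100, the function assumes the values are stored as proportions (0.59, not 59). Also keep in mind that, by default, ggplot2 drops values that fall outside the limits you set, so choose low_num and high_num to cover your data (and zero, where the segments start).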

Okay, the final step: applying a custom theme. (This is probably my favorite part.) For this example, you want the same purple color to be applied to the dots at the end of each line, but you can still play around with the overall look and feel of the plot.  

Years ago, I created a theme called ama_theme_x that I use for most of my charts and graphs. The theme:

  1. Removes the chart background color and panel border

  2. Removes gridlines

  3. Removes tick marks

  4. Removes axis labels

  5. Sets the chart axis font to Arial

  6. Sets the chart axis font color to dark grey (hex: #404040)

  7. Sets custom margins

Note: In this example, I also used geom_hline to create customized gridlines for the x-axis and removed the x and y-axis labels with the labs function.

When you are done, be sure to add the custom theme to your function!

### ggplot theme 
ama_theme_x <- function(){

  theme_void() %+replace%
    theme(legend.position = "none",
          plot.title = element_blank(),
          axis.ticks.x = element_blank(),
          axis.ticks.length = unit(0.10, "cm"),
          axis.text.y = element_text(size = 11, color = "#404040", family = "Arial",
                                     margin = margin(r = 1.25), hjust = 1),
          axis.text.x = element_text(size = 11, color = "#404040", family = "Arial"),
          axis.title.y = element_blank(),
          axis.title.x = element_blank(),
          axis.line = element_blank(),
          panel.border = element_blank(),
          panel.background = element_blank(),
          panel.grid.major.y = element_blank(),
          panel.grid.major.x = element_blank(),
          panel.grid.minor.x = element_blank(),
          plot.margin = margin(t = 0.5, r = 0.5, b = 0.5, l = 1, "cm"))
}


## lollipop chart function
lolli_chart <- function(dataset, variable, category, low_num, high_num, by_num){

  dataset %>%
    # Sort once; the factor levels below preserve this order
    arrange({{variable}}) %>%
    mutate({{category}} := factor({{category}}, levels = {{category}})) %>%
    ggplot(aes(x = {{category}}, y = {{variable}})) +
    geom_segment(aes(x = {{category}}, xend = {{category}}, y = 0, yend = {{variable}}),
                 color = "#5b5b5b", linewidth = 0.75) +
    # Custom gridlines, drawn before the points so the dots sit on top
    geom_hline(yintercept = seq(low_num, high_num, by = by_num),
               colour = "#BFBFBF", linewidth = 0.20) +
    geom_point(color = "#7030A0", size = 8, alpha = 1) +
    scale_y_continuous(limits = c(low_num, high_num),
                       breaks = seq(low_num, high_num, by = by_num),
                       labels = paste0("\n", seq(low_num, high_num, by = by_num) * 100, "%")) +
    coord_flip() +
    labs(x = "", y = "") +
    ama_theme_x()

}

STEP 2: Apply the Function

Now that you have created the function (and custom theme), apply it to a new dataset:

set.seed(0721)
dat_to_use <-
  tibble(group = sample(paste0("Organization ", letters[1:7]),
                        size = 7, replace = FALSE),
         percentages = runif(7, min = 45, max = 85)/100)

lolli_chart(dataset = dat_to_use, variable = percentages, category = group,
            low_num = 0, high_num = 1, by_num = 0.25)

And voila:

Image of lollipop chart plotting values for 7 categories: Organization f: 67%, Organization b: 58%, Organization e: 55%, Organization c: 65%, Organization g: 65%, Organization d: 50%, Organization a: 48%.

These tips have saved me a ton of time when designing custom charts and graphs in R. The best part? You do not have to sacrifice design quality. Try it yourself, and let me know what you think in the comments!

(This example did not include error handling or other checks. In practice, you would want to include these statements to ensure, for example, the inputs match what you would expect or that the variables you are referencing exist within the dataset.)
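
As a rough idea of what those checks might look like, here is a minimal sketch. The helper check_lolli_inputs is hypothetical (it is not part of the post's downloadable syntax), and the specific checks and messages are assumptions about what you might want to guard against:

```r
library(rlang)

# Hypothetical helper: validates inputs before any plotting code runs
check_lolli_inputs <- function(dataset, variable, category){
  # Convert the bare column names into strings so they can be looked up
  var_name <- as_name(enquo(variable))
  cat_name <- as_name(enquo(category))

  if (!is.data.frame(dataset)) {
    stop("`dataset` must be a data frame or tibble.", call. = FALSE)
  }

  missing_cols <- setdiff(c(var_name, cat_name), names(dataset))
  if (length(missing_cols) > 0) {
    stop("Column(s) not found in `dataset`: ",
         paste(missing_cols, collapse = ", "), call. = FALSE)
  }

  if (!is.numeric(dataset[[var_name]])) {
    stop("`", var_name, "` must be numeric.", call. = FALSE)
  }

  # factor(levels = ...) fails on repeated levels, so catch that early
  if (anyDuplicated(dataset[[cat_name]]) > 0) {
    stop("`", cat_name, "` must not contain duplicate values.", call. = FALSE)
  }

  invisible(TRUE)
}

# Toy data, made up for illustration
toy <- data.frame(group = c("a", "b"), per = c(0.4, 0.6))

# Passes silently
check_lolli_inputs(toy, variable = per, category = group)

# Would stop with an error naming the missing column:
# check_lolli_inputs(toy, variable = nope, category = group)
```

Failing early with a plain-language message tends to be friendlier than letting ggplot2 raise an error from deep inside the plotting code.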

If R is not your thing, venture over to my other post to learn how to create chart templates in Excel.

 

**The data used in the first example are drawn from a 2020 report detailing the perspectives of art museum directors (p. 32). All other data were generated in R.**