I am creating multiple graphs from the same basic template, each with a different subject as the data source. Sometimes two points are very close together, so I wanted to use ggrepel to keep the text labels from overlapping, but I cannot get the labels to reliably stay apart when the points are close to each other. I am making around 100 graphs; when the points are far apart everything is fine, but when they are close, ggrepel doesn't seem to help.
[Image: plot with overlapping text geoms]
Here I have no force, no box.padding, only a little nudge, and I set the direction to adjust on the "y" axis only. I have tried adjusting force and box.padding, and maybe I just don't know the right settings, but nothing I have tried looks good. Also, the size of the plot is fixed because it needs to fit into a template report, which is an important constraint. I would appreciate any and all help! Thanks.
library(dplyr)    # tibble(), filter(), %>%
library(ggplot2)
library(ggrepel)

toy_data <- tibble(
fy = c("2019", "2020", "2022", "2023", "2024"),
v1 = c(0.0438688113641111, 0.052572706935123, 0.0111137572437882, 0.0118893212278426, 0.01225),
v2 = c(0.0411, 0.0741, 0.1075, 0.0255, 0.0244),
v3 = c(0.42359475940349, 0.40169349585931, 0.279370529327611, 0.324827447947991, 0.344568567026194),
id = c(1, 1, 1, 1, 1)
)
toy_data %>%
filter(id == 1) %>%
ggplot() +
geom_line(aes(x = fy,
y = v2,
group = 1),
color = "purple",
linewidth = 2) +
geom_point(aes(x = fy,
y = v2),
color = 'purple',
size=2,
shape=21,
fill="white") +
geom_line(aes(x = fy,
y = v1,
group = 1),
color = "blue",
linewidth = 2) +
geom_point(aes(x = fy,
y = v1),
color = 'blue',
size=2,
shape=21,
fill="white") +
geom_line(aes(x = fy,
y = v3,
group = 1),
color = "darkgreen",
linewidth = 2) +
geom_point(aes(x = fy,
y = v3),
color = 'darkgreen',
size=2,
shape=21,
fill="white") +
geom_text_repel(data = toy_data %>%
filter(fy == 2019, id == 1),
aes(x = "2019",
y = v2,
label = paste0(as.character(round(v2 * 100, 2)), "%")),
nudge_x = -.25,
direction = "y",
color = "purple",
# force = 1.2,
# force_pull = 2,
# box.padding = .95,
min.segment.length = Inf,
fontface = 'bold',
seed = 102
) +
geom_text_repel(data = toy_data %>%
filter(fy == 2024, id == 1),
aes(x = "2024",
y = v2,
label = paste0(as.character(round(v2 * 100, 2)), "%")),
nudge_x = .25,
direction = "y",
color = "purple",
# force = 1.2,
# force_pull = 2,
# box.padding = .95,
min.segment.length = Inf,
fontface = 'bold',
seed = 102) +
geom_text_repel(data = toy_data %>%
filter(fy == 2019, id == 1),
aes(x = "2019",
y = v1,
label = paste0(as.character(round(v1 * 100, 2)), "%")),
nudge_x = -.25,
# nudge_y = .005,
direction = "y",
color = "blue",
# force = 1.2,
# force_pull = 2,
# box.padding = .95,
min.segment.length = Inf,
fontface = 'bold',
seed = 102
) +
geom_text_repel(data = toy_data %>%
filter(fy == 2024, id == 1),
aes(x = "2024",
y = v1,
label = paste0(as.character(round(v1 * 100, 2)), "%")),
nudge_x = .25,
direction = "y",
color = "blue",
# force = 1.2,
# force_pull = 2,
# box.padding = .95,
min.segment.length = Inf,
fontface = 'bold',
seed = 102
) +
geom_text_repel(data = toy_data %>%
filter(fy == 2019, id == 1),
aes(x = "2019",
y = v3,
label = paste0(as.character(round(v3 * 100, 2)), "%")),
nudge_x = -.25,
direction = "y",
color = "darkgreen",
# force = 1.2,
# force_pull = 2,
# box.padding = .95,
min.segment.length = Inf,
fontface = 'bold',
seed = 102
) +
geom_text_repel(data = toy_data %>%
filter(fy == 2024, id == 1),
aes(x = "2024",
y = v3,
label = paste0(as.character(round(v3 * 100, 2)), "%")),
nudge_x = .25,
direction = "y",
color = "darkgreen",
# force = 1.2,
# force_pull = 2,
# box.padding = .95,
min.segment.length = Inf,
fontface = 'bold',
seed = 102)
3 Answers
As alluded to by Greg Snow, it is better to format your data to work for ggplot2 rather than trying to make ggplot2 work for your data. If you combine your v* data into a single column, then you only have to call each geom_*() once.
To do this, you can pivot your data to long form using tidyr::pivot_longer(). I have also created a named vector for assigning colour values. Your example plot only has labels at the first and last value, so I modified the filter() for geom_text_repel(); edit to suit.
If you only intend to label the first and last value for each v*, it can be easier to use two geom_text_repel() calls to control each side independently. It also makes sense to convert your fy values to numeric. Currently 2021 is missing from the axis, which makes the plot 'lie': with fy stored as character the x-axis is discrete, so 2020 and 2022 are drawn one step apart even though years are really a continuous variable.
You will need to play around with the label placement parameters to get them to match your example plot.
library(tidyr)
library(dplyr)
library(ggplot2)
library(ggrepel)
# Map list of colours to columns
col_map <- c(
v1 = "blue",
v2 = "purple",
v3 = "darkgreen"
)
# Pivot to long form
toy_data <- toy_data %>%
pivot_longer(-c(fy, id))
head(toy_data)
# # A tibble: 6 × 4
# fy id name value
# <chr> <dbl> <chr> <dbl>
# 1 2019 1 v1 0.0439
# 2 2019 1 v2 0.0411
# 3 2019 1 v3 0.424
# 4 2020 1 v1 0.0526
# 5 2020 1 v2 0.0741
# 6 2020 1 v3 0.402
# Convert fy to numeric
toy_data$fy <- as.numeric(toy_data$fy)
toy_data %>%
filter(id == 1) %>%
ggplot() +
geom_line(aes(fy, value, group = name, colour = name),
linewidth = 2) +
geom_point(aes(fy, value, group = name, colour = name),
size = 2,
shape = 21,
fill = "white") +
geom_text_repel(data = toy_data %>%
filter(row_number() == 1, .by = name),
aes(fy, value, group = name, colour = name,
label = paste0(as.character(round(value * 100, 2)), "%")),
nudge_x = -0.25,
direction = "both",
min.segment.length = Inf,
fontface = 'bold') +
geom_text_repel(data = toy_data %>%
filter(row_number() == n(), .by = name),
aes(fy, value, group = name, colour = name,
label = paste0(as.character(round(value * 100, 2)), "%")),
nudge_x = 0.25,
direction = "both",
min.segment.length = Inf,
fontface = 'bold') +
scale_colour_manual(name = "Values",
values = col_map) +
scale_x_continuous(expand = expansion(add = c(0.75, 0.75)),
breaks = 2019:2024) +
guides(colour = guide_legend(override.aes = list(label = "")))
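As a starting point for the label placement parameters mentioned above, here is a minimal sketch of the geom_text_repel() arguments that are usually worth adjusting when two labels sit close together. It uses only the two nearly identical 2019 values from the question; the specific numbers (box.padding = 0.4, force = 2, and so on) are untuned assumptions to experiment with, not recommended settings, and the axis limits are only there to leave room for the labels.

```r
library(ggplot2)
library(ggrepel)

# The two 2019 values from the question that are close enough to collide
labels_2019 <- data.frame(
  fy    = 2019,
  name  = c("v1", "v2"),
  value = c(0.0438688113641111, 0.0411)
)

p <- ggplot(labels_2019, aes(fy, value, colour = name)) +
  geom_point() +
  geom_text_repel(
    aes(label = scales::percent(value, accuracy = 0.01)),
    direction = "y",           # only move labels vertically
    nudge_x = -0.25,           # start the labels left of the points
    hjust = 1,                 # right-align text so it ends near the point
    box.padding = 0.4,         # empty space kept around each label (assumed value)
    point.padding = 0.2,       # empty space kept around each point (assumed value)
    force = 2,                 # strength of label-to-label repulsion (assumed value)
    force_pull = 0.5,          # strength of the pull back towards the point (assumed value)
    max.overlaps = Inf,        # never drop a label because of overlaps
    min.segment.length = Inf,  # never draw connector segments
    seed = 102,                # make the placement reproducible
    show.legend = FALSE
  ) +
  scale_x_continuous(limits = c(2018, 2020))

p
```

Because the repulsion only acts between labels drawn by the same layer, these settings only matter once both labels are passed to a single geom_text_repel() call, as they are here.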
I have great news for you -- this can be made much easier by making your data tidy, then all the points and lines can be specified with one geom each. Here I use two repel layers to make the left and right sides, respectively.
toy_data |>
filter(id == 1) |>
tidyr::pivot_longer(v1:v3) |>
ggplot(aes(fy, value, color = name, group = name)) +
geom_line(linewidth = 2) +
geom_point(shape = 21, size = 2, fill = "white") +
geom_text_repel(aes(label = scales::percent(value, accuracy = 0.01)),
hjust = 1.2, direction = "y",
data = ~dplyr::filter(., fy == "2019")) +
geom_text_repel(aes(label = scales::percent(value, accuracy = 0.01)),
hjust = -0.2, direction = "y",
data = ~dplyr::filter(., fy == "2024")) +
guides(color = "none") +
scale_color_manual(values = c("blue", "purple", "darkgreen"))
Result, using 14 lines of code:
By comparison, here's what the original 125 lines of code give me:
I think that the problem is that you are using multiple calls to geom_text_repel with just one text item per call. This means that each call does not know where the other text labels are and therefore cannot repel them. You need to combine all of the positions and labels into a single data frame and use one call to geom_text_repel (or two calls, one for the right labels and one for the left).
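A minimal sketch of that idea, assuming the toy_data values from the question: every label goes into one data frame, and each side of the plot gets a single geom_text_repel() call so that the labels on that side can see each other. The names long_data, label_data, series and side are made up for this illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)
library(ggrepel)

toy_data <- tibble(
  fy = c(2019, 2020, 2022, 2023, 2024),
  v1 = c(0.0438688113641111, 0.052572706935123, 0.0111137572437882, 0.0118893212278426, 0.01225),
  v2 = c(0.0411, 0.0741, 0.1075, 0.0255, 0.0244),
  v3 = c(0.42359475940349, 0.40169349585931, 0.279370529327611, 0.324827447947991, 0.344568567026194)
)

# Long form: one row per year and series
long_data <- toy_data %>%
  pivot_longer(v1:v3, names_to = "series", values_to = "value")

# One data frame holding every label, tagged with the side it belongs to
label_data <- long_data %>%
  filter(fy %in% range(fy)) %>%
  mutate(side  = if_else(fy == min(fy), "left", "right"),
         label = scales::percent(value, accuracy = 0.01))

p <- ggplot(long_data, aes(fy, value, colour = series)) +
  geom_line(linewidth = 2) +
  geom_point(size = 2, shape = 21, fill = "white") +
  # All left-hand labels in one layer, so they repel each other
  geom_text_repel(data = filter(label_data, side == "left"),
                  aes(label = label),
                  nudge_x = -0.25, hjust = 1, direction = "y",
                  min.segment.length = Inf, fontface = "bold",
                  seed = 102, show.legend = FALSE) +
  # All right-hand labels in a second layer
  geom_text_repel(data = filter(label_data, side == "right"),
                  aes(label = label),
                  nudge_x = 0.25, hjust = 0, direction = "y",
                  min.segment.length = Inf, fontface = "bold",
                  seed = 102, show.legend = FALSE) +
  scale_colour_manual(values = c(v1 = "blue", v2 = "purple", v3 = "darkgreen")) +
  scale_x_continuous(breaks = 2019:2024, expand = expansion(add = 0.75))

p
```

With six separate calls, each layer contains a single label and has nothing to repel against, which is why changing force or box.padding in the original code appeared to have no effect.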