Rounding in frequency tables in R

Question 1

I was wondering if anyone proficient in R/R Markdown could help me with an issue. I want to generate a frequency table, and so far I have been using tableby() from the arsenal package, as it is easy to integrate into an R Markdown docx/html document. However, I have been asked to report rounded frequencies (to the nearest 5 or 10) and have not found a way to do it.

I have generated a simple fake dataset, as I cannot share my data for confidentiality reasons, and this is how I would produce a normal table.

set.seed(1234)
library(dplyr)
library(arsenal)
x1 <- c(rep("Man",40),rep("Woman",60)) %>% as.factor()
x2 <- sample(c("Sick","Healthy"),100,replace=TRUE) %>% as.factor()
df <- data.frame(x1,x2)
Control_notrounded <- tableby.control(digits=0,digits.pct=2,cat.stats=c("countpct","Nmiss2"))
table <- tableby(x1~x2,control=Control_notrounded,data=df)
print(summary(table))

However, although a traditional rounding function rounds to the nearest 10 when passed digits=-1, this does not work with tableby.control(): I get a warning indicating that digits must be >= 0.
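For reference, this is the base R behaviour being described, as a minimal sketch. The round_to() helper and the counts vector are hypothetical names made up for illustration; they are not part of arsenal or any other package. Note that round() resolves exact ties to the even value (round(2.5) gives 2), which can matter for counts that sit exactly halfway between two multiples.

```r
# Hypothetical helper (not from any package): round to the nearest multiple of `base`.
round_to <- function(x, base = 5) {
  round(x / base) * base
}

# Made-up example counts
counts <- c(43, 57, 8)

round(counts, digits = -1)   # nearest 10 via negative digits: 40 60 10
round_to(counts, base = 5)   # nearest 5: 45 55 10
round_to(counts, base = 10)  # nearest 10: 40 60 10
```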

Control_rounded <- tableby.control(digits=-1,digits.pct=2,cat.stats=c("countpct","Nmiss2"))
table2 <- tableby(x1~x2,control=Control_rounded,data=df)
print(summary(table2))

Is there any way to do that? Otherwise, does anyone know of an alternative package that makes it relatively straightforward to create frequency tables with rounded values?

Answer

I can recommend the gtsummary package for creating baseline tables instead. You can then use the round_5_gtsummary() function from this small GitHub package:

set.seed(1234)
library(dplyr)
library(gtsummary)
library(stringr)
x1 <- c(rep("Man",40),rep("Woman",60)) %>% as.factor()
x2 <- sample(c("Sick","Healthy"),100,replace=TRUE) %>% as.factor()
df <- data.frame(x1,x2)
# One-time setup: install devtools, then the add-on package from GitHub
install.packages("devtools")
devtools::install_github("zheer-kejlberg/Z.gtsummary.addons")
library(Z.gtsummary.addons)
df %>% tbl_summary(by = "x1") %>% 
 add_overall(last = TRUE) %>% 
 round_5_gtsummary() %>%
 add_p()

WEIGHTED VERSION

# Create IPT weights
library(WeightIt)
df$w <- weightit(x1~x2, data = df, estimand = "ATT", focal = "Man")$weights

Use survey to create a svydesign object. Then apply tbl_svysummary() to that:

library(survey)
df %>% survey::svydesign(~1, data = ., weights = ~w) %>%
 tbl_svysummary(by = "x1", include=c(x2)) %>%
 add_overall(last = TRUE) %>%
 round_5_gtsummary() %>%
 add_p()

ALTERNATIVE WAY:

To use the built-in tbl_summary(digits=) argument to separately round the counts and percentages, you can do:

library(gtsummary)
library(dplyr)
set.seed(1234)
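round_5 <- function(vec) {
  # Values below 1 are treated as proportions and returned as percentages
  # rounded to the nearest 5; all other values are treated as counts and
  # rounded to the nearest 5. Caveat: a proportion of exactly 1 (100%)
  # falls into the count branch.
  fun <- function(x) {
    if (x < 1) {
      return(round(x * 100 / 5) * 5)
    } else {
      return(round(x / 5) * 5)
    }
  }
  purrr::map_vec(vec, .f = fun)
}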
df <- data.frame(
 x1 = c(rep("Man", 40), rep("Woman", 60)) %>% as.factor(),
 x2 = sample(c("Sick", "Healthy"), 100, replace = TRUE) %>% as.factor()
)
df %>% 
 tbl_summary(
 by = "x1",
 digits = all_categorical() ~ round_5
 ) %>% 
 add_overall(last = TRUE) %>% 
 add_p()

Note, this version doesn't recalculate percentages after rounding the counts; rather, it just rounds both separately.
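If the percentages should instead be recalculated from the rounded counts, a minimal base R sketch could look like the following. It does not use gtsummary, and round_to() is a hypothetical helper introduced only for illustration. Keep in mind that the rounded column totals may no longer match the true group sizes (40 and 60).

```r
set.seed(1234)
df <- data.frame(
  x1 = factor(c(rep("Man", 40), rep("Woman", 60))),
  x2 = factor(sample(c("Sick", "Healthy"), 100, replace = TRUE))
)

# Hypothetical helper (not from any package): round to the nearest multiple of `base`.
round_to <- function(x, base = 5) round(x / base) * base

# Cross-tabulate (rows: x2, columns: x1), then round each cell count
raw_counts     <- table(df$x2, df$x1)
rounded_counts <- round_to(raw_counts, base = 5)

# Column percentages recomputed from the rounded counts, not the raw ones
rounded_pct <- round(100 * prop.table(rounded_counts, margin = 2), 1)

rounded_counts
rounded_pct
```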

Comment

Wow, that's amazing, thanks! Would that approach allow the incorporation of sampling weights?

Comment

Sure - updated my answer with a weighted example.

Accepted answer by ZKA · 2023-11-17 02:55:05Z
