I am rewriting some old code where I take a dataframe in r
and convert it using the tidyverse
packages to a list of lists, where each element is one row of the original dataframe - which is itself a list with each column an element.
My previous function achieved it like so:
library(tidyverse)
feat <- function(coords, height = 20, vjust = 75, fill = "orange"){
inst <- purrr::flatten(purrr::by_row(coords, ..f = function(x)
rec <- list(type = "rect",
x = as.numeric(x[1]),
y = vjust,
width = as.numeric(x[2]-x[1]),
height = height,
fill = fill),
.labels = FALSE))
return(inst)
}
However by_row()
has been depreciated from the purrr
package and so I would like to rewrite it. This is my attempt:
feat2 <- function(coords, height = 20, vjust = 75, fill = "orange"){
inst <- coords %>%
mutate(type = "rect",
x = as.numeric(start),
y = vjust,
width = as.numeric(end-start),
height = height,
fill = fill) %>%
select(-start, -end) %>%
mutate(count = 1:n()) %>%
nest(-count) %>%
select(-count) %>%
mutate(data = map(data, ~ flatten(.x))) %>% pull()
return(inst)
}
which does the job but I feel there should be a quicker, more elegant way to achieve this. Do you have any ideas on that aspect?
Here is an example data set:
coords <- structure(list(start = c(126, 433, 603, 1604), end = c(327, 495,
644, 1831)), .Names = c("start", "end"), row.names = c(NA, -4L
), class = "data.frame")
and the desired output:
result <- list(structure(list(type = "rect", x = 126, y = 75, width = 201,
height = 20, fill = "blue"), .Names = c("type", "x", "y",
"width", "height", "fill")), structure(list(type = "rect", x = 433,
y = 75, width = 62, height = 20, fill = "blue"), .Names = c("type",
"x", "y", "width", "height", "fill")), structure(list(type = "rect",
x = 603, y = 75, width = 41, height = 20, fill = "blue"), .Names = c("type",
"x", "y", "width", "height", "fill")), structure(list(type = "rect",
x = 1604, y = 75, width = 227, height = 20, fill = "blue"), .Names = c("type",
"x", "y", "width", "height", "fill")))
2 Answers 2
without any external packages:
feat3 <- function(coords, height = 20, vjust = 75, fill = "blue"){
xx <- apply(coords, 1, function(x) {
list(type = "rect",
x = as.numeric(x[1]),
y = vjust,
width = as.numeric(x[2] - x[1]),
height = height,
fill = fill)
}
)
return(xx)
}
all.equal(feat3(coords), result)
# [1] TRUE
Benchmarks:
microbenchmark::microbenchmark(
feat3(coords),
feat2(coords), unit = "relative"
)
# Unit: relative
# expr min lq mean median uq max neval cld
# feat3(coords) 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 100 a
# feat2(coords) 315.8261 297.2919 254.5025 258.7822 248.2652 286.9575 100 b
-
\$\begingroup\$ It's somewhat undesirable that
apply
first converts all the data to characters. It also forces you to usex[1]
andx[2]
to access specific columns, so you lose code readability a bit... I would suggest you build a data.framex
then dounname(do.call(Map, c(list, x)))
to convert to a list of list. It's not nearly as fast but it is more robust and maintainable. In fact, the op might want to definedf_to_lol <- function(df) unname(do.call(Map, c(list, df)))
so he can use it within tidyverse, e.g.,mutate(...) %>% df_to_lol
. \$\endgroup\$flodel– flodel2017年12月13日 23:58:43 +00:00Commented Dec 13, 2017 at 23:58 -
\$\begingroup\$ @flodel It sounds that you should write a separate answer... \$\endgroup\$minem– minem2017年12月14日 07:24:15 +00:00Commented Dec 14, 2017 at 7:24
-
\$\begingroup\$ There are not a lot of people posting answers in the [r] group (thanks for joining and welcome). I prefer we help each other improve our answers than get into a competition for best answer. \$\endgroup\$flodel– flodel2017年12月17日 22:10:13 +00:00Commented Dec 17, 2017 at 22:10
I also asked this question over on the Rstudio community forums where user mgirlich provided this answer:
feat4 <- function(coords, height = 20, vjust = 75, fill = "orange") {
coords %>%
# transmute keeps only the columns specified, so there is no need to
# deselect start and end afterwards
transmute(
type = "rect",
x = as.numeric(start),
y = vjust,
width = as.numeric(end - start),
height = height,
fill = fill
) %>%
purrr::transpose()
}
It is slower than the answer by @minem but the use of the transpose()
function was new to me so I'll post it here.
by_row
has been migrated topurrrlyr
andpurrrlyr::by_row
doesn't seem to be deprecated (though I'll admit the tools inpurrrlyr
seem to be left "aside" of the rest of the tidyverse). \$\endgroup\$