I am trying to translate the following from Stata to R:
clear
set obs 1000
generate y = floor((10-0+1)*runiform() +0)
recode y (7=0) (8=0) (9=1) (10=2)
I thought I had it with the following code:
library(dplyr)
mydata <- y ~ floor((10-0+1)*runif(1000)+0)
recode (mydata, '7'=0, '8'=0, '9'=1, '10'=2)
However, the last line keeps giving me an error:
Error in UseMethod("recode") : no applicable method for 'recode' applied to an object of class "formula"
Any ideas?
1 Answer
You can use cut:
n = 1000L
y = cut(runif(n, 0, 11), c(-Inf, 9, 10, Inf), right = FALSE, ordered_result = TRUE)
You can see how it worked with table:
table(y)
# y
#  [-Inf,9)    [9,10) [10, Inf)
#       813        91        96
If you really want the codes, you can use as.integer(y)-1L. Read ?cut and ?factor for more details on ordinal data in R.
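As a minimal sketch of that last step (the seed and the name `codes` are assumptions added for illustration, not part of the answer above), the ordered factor can be converted to 0-based integer codes like this:

```r
set.seed(1)  # assumed seed, only so the draw is reproducible
n <- 1000L

# Same binning as above: [-Inf,9) -> level 1, [9,10) -> level 2, [10,Inf) -> level 3
y <- cut(runif(n, 0, 11), c(-Inf, 9, 10, Inf), right = FALSE, ordered_result = TRUE)

# Factor levels are stored as 1-based integers, so subtract 1 to get 0, 1, 2
codes <- as.integer(y) - 1L
table(codes)
```

The counts will differ from run to run unless the seed is fixed.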
I've been assuming so far that there is some rhyme or reason to the recoding rule. If there is not, best to store it in a separate table and draw values from there (which is the same thing I would do in Stata):
rec = data.frame(old = c(7,8,9,10), new = c(0,0,1,2))
n = 1000L
y = floor(runif(n, 0, 11))
DF = data.frame(id = 1:n, y)
library(data.table)
setDT(DF)
DF[rec, on=c(y = "old"), y := new]
DF[, .N, keyby=y]
#    y   N
# 1: 0 288
# 2: 1 179
# 3: 2 174
# 4: 3 101
# 5: 4  82
# 6: 5  93
# 7: 6  83
You'd need to install the data.table package for this to work, though.
2 Comments
base R? Thx
The match function would work, maybe m = match(DF$y, rec$old); DF$y = ifelse(is.na(m), DF$y, rec$new[m]). You could also post a bounty to see if someone else has a better answer: stackoverflow.com/help/bounty
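For anyone who wants the base R route spelled out, here is a rough sketch of the match idea. The helper name `recode_lookup` and the seed are hypothetical, introduced only for this example; values with no entry in the lookup table are left unchanged.

```r
# Lookup table: old value -> new value
rec <- data.frame(old = c(7, 8, 9, 10), new = c(0, 0, 1, 2))

# Hypothetical helper: replace values found in `old`, keep everything else
recode_lookup <- function(x, old, new) {
  m <- match(x, old)             # position of each x in `old`, NA if absent
  ifelse(is.na(m), x, new[m])    # unmatched values pass through untouched
}

recode_lookup(c(0, 6, 7, 8, 9, 10), rec$old, rec$new)
# [1] 0 6 0 0 1 2

set.seed(1)  # assumed seed, only for reproducibility
n <- 1000L
DF <- data.frame(id = seq_len(n), y = floor(runif(n, 0, 11)))
DF$y <- recode_lookup(DF$y, rec$old, rec$new)
table(DF$y)
```

This needs no packages beyond base R.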
mydata <- floor((10-0+1)*runif(1000)+0)
10 - 0 + 1 is 11, always and in both Stata and R. Similarly, adding 0 is useless. In both cases, if these are steps towards something more general, then fine, but that is not at all obvious.
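A quick way to convince yourself of this, as a sketch (the seed and the names `a` and `b` are assumptions for illustration): draw with the long expression and with the simplified one from the same seed and compare.

```r
set.seed(42)  # assumed seed so both draws use the same uniform numbers
a <- floor((10 - 0 + 1) * runif(1000) + 0)  # expression as written in the question

set.seed(42)
b <- floor(11 * runif(1000))                # simplified form

identical(a, b)  # the two vectors should match element for element
range(a)         # values lie between 0 and 10
```

If the longer form is meant as a template for integers from some lower bound lo to some upper bound hi, it would read floor((hi - lo + 1) * runif(n) + lo), with lo = 0 and hi = 10 here.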