I am trying to translate the following Stata code to R:

clear
set obs 1000
generate y = floor((10-0+1)*runiform() +0)
recode y (7=0) (8=0) (9=1) (10=2)

I thought I had it with the following code:

library(dplyr)
mydata <- y ~ floor((10-0+1)*runif(1000)+0)
recode (mydata, '7'=0, '8'=0, '9'=1, '10'=2)

However, the last line keeps giving me an error:

Error in UseMethod("recode") : no applicable method for 'recode' applied to an object of class "formula".

Any ideas?

Philipp HB
asked Aug 29, 2016 at 14:49
  • Try mydata <- floor((10-0+1)*runif(1000)+0) Commented Aug 29, 2016 at 14:56
  • Minor details: 10 - 0 + 1 is 11, always and in both Stata and R. Similarly, adding 0 is useless. In both cases, if these are steps towards something more general, then fine, but that is not at all obvious. Commented Aug 29, 2016 at 17:42
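
To illustrate the first comment, here is a minimal base R sketch of why the original line fails. The seed and the names f and y_num are my own and are used only for illustration. The point is that "y ~ ..." creates a formula object rather than evaluating the right-hand side, so recode() is handed something it has no method for.

```r
set.seed(1)                           # arbitrary seed, only so the draw is repeatable

f <- y ~ floor(11 * runif(1000))      # the tilde builds a formula; nothing is evaluated
class(f)                              # "formula"

y_num <- floor(11 * runif(1000))      # dropping "y ~" gives an actual numeric vector
class(y_num)                          # "numeric"
range(y_num)                          # whole numbers between 0 and 10
```

With a numeric vector in hand, the recoding step has real values to work on.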

1 Answer

You can use cut:

n = 1000L
y = cut(runif(n, 0, 11), c(-Inf, 9, 10, Inf), right = FALSE, ordered_result = TRUE)

You can see how it worked with table:

table(y)
# y
#  [-Inf,9)    [9,10) [10, Inf) 
#       813        91        96 

If you really want the codes, you can use as.integer(y)-1L. Read ?cut and ?factor for more details on ordinal data in R.
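
As a rough sketch of that last step (the seed and the name codes are my own, added only for illustration), the integer codes can be tabulated against the intervals to check which bin gets which code:

```r
set.seed(42)                          # arbitrary seed, only for repeatability
n <- 1000L

# Same binning as above: everything below 9, then [9,10), then 10 and up
y <- cut(runif(n, 0, 11), c(-Inf, 9, 10, Inf),
         right = FALSE, ordered_result = TRUE)

# Factor levels are numbered from 1, so subtract 1 to get codes 0, 1, 2
codes <- as.integer(y) - 1L

table(y, codes)                       # each interval lines up with exactly one code
```

Note that this collapses every value below 9 into code 0, which is what the binning above does; it does not keep 0 through 6 as separate values.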


I've been assuming so far that there is some rhyme or reason to the recoding rule. If there is not, best to store it in a separate table and draw values from there (which is the same thing I would do in Stata):

rec = data.frame(old = c(7,8,9,10), new = c(0,0,1,2))  # lookup table: old value -> new value
n = 1000L
y = floor(runif(n, 0, 11))
DF = data.frame(id = seq_len(n), y)  # one id per row
library(data.table)
setDT(DF)
# update join: overwrite y only in the rows that match rec$old
DF[rec, on=c(y = "old"), y := new]
DF[, .N, keyby=y]
#    y   N
# 1: 0 288
# 2: 1 179
# 3: 2 174
# 4: 3 101
# 5: 4  82
# 6: 5  93
# 7: 6  83

You'd need to install the data.table package for this to work, though.

answered Aug 29, 2016 at 15:16

2 Comments

Cool answer, @Frank, but any clue how to perform the recoding using only base R? Thanks.
Thanks @ÁlvaroA.Gutiérrez-Vargas. I think something involving the match function would work, maybe: m = match(DF$y, rec$old); DF$y = ifelse(is.na(m), DF$y, rec$new[m]). You could also post a bounty to see if someone else has a better answer: stackoverflow.com/help/bounty
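
Spelled out as a self-contained sketch, the match idea from the comment might look like this in base R only. The seed and the name y_new are my own; the original one-liner overwrites DF$y in place, while this version keeps the old vector so the result can be checked against it.

```r
set.seed(123)                         # arbitrary seed, only for repeatability

# Lookup table of recoding rules, as in the answer
rec <- data.frame(old = c(7, 8, 9, 10), new = c(0, 0, 1, 2))
y <- floor(runif(1000, 0, 11))        # whole numbers 0..10

# For each y, the row of rec whose old value matches, or NA if no rule applies
m <- match(y, rec$old)

# Keep y where there is no rule; otherwise take the new value
y_new <- ifelse(is.na(m), y, rec$new[m])

table(y_new)                          # only the values 0 through 6 remain
```

Values 0 through 6 pass through untouched because match returns NA for them, which mirrors how Stata's recode leaves unlisted values alone.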
