
I'm trying to do something very simple in R that I can do in Stata but I can't quite get it right.

Here is a sample of my data:

data<-data.frame(
 C1=c(rep(2,5), rep(20,5), rep(70,5)),
 C2=c(rep(20,5), rep(70,5), rep(80,5)),
 year=rep(1990:1994, 3), 
 VAR1=NA,
 VAR2=NA,
 VAR3=NA
)

In Stata I can do this:

replace VAR1 = 1 if C1 == 2 & C2 == 20 & year == 1990
replace VAR2 = 60 if C1 == 2 & C2 == 20 & year == 1990
replace VAR3 = 70 if C1 == 2 & C2 == 20 & year == 1990

Annoyingly, Stata syntax does not allow this:

replace VAR1 = 1 & VAR2 = 60 & VAR3 = 70 if C1 == 2 & C2 == 20 & year == 1990

Using the first Stata code, this

data1<-data.frame(C1=c(2),C2=c(20),year=c(1990),VAR1=NA,VAR2=NA,VAR3=NA)

becomes this

data2<-data.frame(C1=c(2),C2=c(20),year=c(1990),VAR1=c(1),VAR2=c(60),VAR3=c(70))

I can't find anything similar to this problem (it's very likely that I'm not searching for the right phrase).

I'd like to do the equivalent of the first Stata command in R, or preferably the second.

Nick Cox
asked Jul 21, 2019 at 0:03
  • The Stata syntax you want uses & in two quite different senses. & is a logical operator, not punctuation in a list of things to be done. Commented Jul 21, 2019 at 6:25

2 Answers


If the condition is the same for all the columns, you can compute it once as a logical row index and then assign the values to all the columns together.

inds <- with(data, C1 == 2 & C2 == 20 & year == 1990)
data[inds, paste0("VAR", 1:3)] <- as.list(c(1, 60, 70))
data
# C1 C2 year VAR1 VAR2 VAR3
#1 2 20 1990 1 60 70
#2 2 20 1991 NA NA NA
#3 2 20 1992 NA NA NA
#4 2 20 1993 NA NA NA
#5 2 20 1994 NA NA NA
#6 20 70 1990 NA NA NA
#7 20 70 1991 NA NA NA
#8 20 70 1992 NA NA NA
#9 20 70 1993 NA NA NA
#10 20 70 1994 NA NA NA
#11 70 80 1990 NA NA NA
#12 70 80 1991 NA NA NA
#13 70 80 1992 NA NA NA
#14 70 80 1993 NA NA NA
#15 70 80 1994 NA NA NA
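
If this pattern comes up often, the same idea can be wrapped in a small function so that it reads more like a single Stata-style statement. The following is only a minimal base R sketch; `replace_where` is a hypothetical helper name, not a function from any package, and it assumes the columns to be set already exist in the data frame.

```r
# Hypothetical helper (not part of base R or any package):
# set several existing columns at once on the rows where a condition holds.
replace_where <- function(df, rows, values) {
  stopifnot(is.logical(rows), length(rows) == nrow(df))
  stopifnot(all(names(values) %in% names(df)))
  rows <- rows & !is.na(rows)  # treat an NA condition as "no match"
  for (nm in names(values)) {
    df[[nm]][rows] <- values[[nm]]
  }
  df
}

# Sample data from the question
data <- data.frame(
  C1 = c(rep(2, 5), rep(20, 5), rep(70, 5)),
  C2 = c(rep(20, 5), rep(70, 5), rep(80, 5)),
  year = rep(1990:1994, 3),
  VAR1 = NA,
  VAR2 = NA,
  VAR3 = NA
)

result <- replace_where(
  data,
  with(data, C1 == 2 & C2 == 20 & year == 1990),
  list(VAR1 = 1, VAR2 = 60, VAR3 = 70)
)
result[1:2, ]
```

The function returns a modified copy, so the original `data` is unchanged unless you assign the result back to it.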

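If you might have different conditions for different columns, have a look at the dplyr package, which makes this kind of replacement easier to write using pipes. Note that the result has to be assigned back (for example, data <- data %>% ...) to keep the changes.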

library(dplyr)
data %>%
 mutate(VAR1 = replace(VAR1, C1 == 2 & C2 == 20 & year == 1990, 1), 
 VAR2 = replace(VAR2, C1 == 2 & C2 == 20 & year == 1990, 60), 
 VAR3 = replace(VAR3, C1 == 2 & C2 == 20 & year == 1990, 70))
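
When the condition is in fact shared, the repetition above can be avoided with `across()`. This is a sketch that assumes dplyr 1.0.0 or later (for `across()` and `cur_column()`); `new_vals` is an illustrative name for a lookup of column name to replacement value.

```r
library(dplyr)

# Sample data from the question
data <- data.frame(
  C1 = c(rep(2, 5), rep(20, 5), rep(70, 5)),
  C2 = c(rep(20, 5), rep(70, 5), rep(80, 5)),
  year = rep(1990:1994, 3),
  VAR1 = NA,
  VAR2 = NA,
  VAR3 = NA
)

# Illustrative lookup: names must match the columns to be modified
new_vals <- c(VAR1 = 1, VAR2 = 60, VAR3 = 70)

result <- data %>%
  mutate(across(
    all_of(names(new_vals)),
    # cur_column() gives the name of the column currently being processed
    ~ replace(.x, C1 == 2 & C2 == 20 & year == 1990, new_vals[[cur_column()]])
  ))
```

The condition is written once, and adding another column only means adding an entry to `new_vals`.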
answered Jul 21, 2019 at 2:26

Here is one option using data.table

library(data.table)
nm1 <- grep("VAR", names(data))
setDT(data)[C1 == 2 & C2 == 20 & year == 1990, (nm1) := .(1, 60, 70)]
data
# C1 C2 year VAR1 VAR2 VAR3
# 1: 2 20 1990 1 60 70
# 2: 2 20 1991 NA NA NA
# 3: 2 20 1992 NA NA NA
# 4: 2 20 1993 NA NA NA
# 5: 2 20 1994 NA NA NA
# 6: 20 70 1990 NA NA NA
# 7: 20 70 1991 NA NA NA
# 8: 20 70 1992 NA NA NA
# 9: 20 70 1993 NA NA NA
#10: 20 70 1994 NA NA NA
#11: 70 80 1990 NA NA NA
#12: 70 80 1991 NA NA NA
#13: 70 80 1992 NA NA NA
#14: 70 80 1993 NA NA NA
#15: 70 80 1994 NA NA NA

Another option is to set the key when converting to a data.table and then pass the key values in i, which selects the matching rows with a keyed join:

setDT(data, key = c("C1", "C2", "year"))
data[.(2, 20, 1990), (nm1) := .(1, 60, 70)]

Or using tidyverse

library(tidyverse)
i1 <- with(data, C1 == 2 & C2 == 20 & year == 1990)
data %>% 
 select(starts_with("VAR")) %>%
 map2_df(., c(1, 60, 70), ~ replace(.x, i1, .y)) %>%
 bind_cols(data %>% 
 select(1:3), .)

data

data <- structure(list(C1 = c(2, 2, 2, 2, 2, 20, 20, 20, 20, 20, 70, 
70, 70, 70, 70), C2 = c(20, 20, 20, 20, 20, 70, 70, 70, 70, 70, 
80, 80, 80, 80, 80), year = c(1990L, 1991L, 1992L, 1993L, 1994L, 
1990L, 1991L, 1992L, 1993L, 1994L, 1990L, 1991L, 1992L, 1993L, 
1994L), VAR1 = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_), VAR2 = c(NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_), VAR3 = c(NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_)), 
class = "data.frame", row.names = c(NA, 
-15L))
answered Jul 21, 2019 at 3:47
