I have a column of labelled values; let's call it Country. When I run:

attr(dat[["Country"]], "labels")

I get the following named vector:

         USA      Germany       France           UK        Spain        India Saudi Arabia 
           1            2            3            4            5            6            7 

Now I have a new column of unlabelled integer values; let's call it newCountry. I would like to replace those integer values with the corresponding labels from the original Country column. In other words, I would like an efficient way to go from this...

3 2 2 1 5 4

to this...

France Germany Germany USA Spain UK

asked Jan 29, 2020 at 18:22
  • Just extract attr(dat[["Country"]], "labels")[i] where i <- c(3, 2, 2, 1, 5, 4). Commented Jan 29, 2020 at 18:30
  • I have tried that, but it just returns the number again, not the actual label. Commented Jan 29, 2020 at 18:42
  • OK, it seems that the labels have names. Assign the output of attr(etc) to, say, labs. What does names(labs) return? If it returns the country names, then extract from those: names(labs)[i]. Commented Jan 29, 2020 at 18:44
  • Cool! It does return the country names. However, this approach may only work if the label numbers (1, 2, 3, 4, 5...) start at 1 and are sequential. Commented Jan 29, 2020 at 18:51
  • No, it will work irrespective of the labels themselves, it's the names of a vector (which happens to be the labels) that is being subset by i. Maybe it's better if I explain in an answer? Commented Jan 29, 2020 at 18:53

1 Answer

The data frame has a column, Country, with the attribute "labels" set. In turn, this attribute, which is just a vector, has the attribute "names" set. So the steps to get the "names" of the "labels" are:

  1. Get the "labels" of column Country;
  2. Get the "names" of the vector of labels;
  3. Extract the names corresponding to a vector of indices, the vector i.

First read in the posted data.

nms <- scan(text = "USA Germany France UK Spain India 'Saudi Arabia'",
 what = character())
i <- scan(text = "3 2 2 1 5 4")

Now create a data set example.

labs <- setNames(1:7, nms)
dat <- data.frame(Country = sample(letters, 7))
attr(dat[["Country"]], "labels") <- labs

And extract what the question asks for, following the steps above.

labsCountry <- attr(dat[["Country"]], "labels")
names(labsCountry)[i]
#[1] "France" "Germany" "Germany" "USA" "Spain" "UK"

Or a one-liner:

names(attr(dat[["Country"]], "labels"))[i]
#[1] "France" "Germany" "Germany" "USA" "Spain" "UK"

To see that this does not depend on the values of the labels, create a second example.

labs2 <- setNames(101:107, nms)
attr(dat[["Country"]], "labels") <- labs2

And though the "labels" are different, the same instructions work:

attr(dat[["Country"]], "labels")
#          USA      Germany       France           UK        Spain        India Saudi Arabia 
#          101          102          103          104          105          106          107 
labsCountry <- attr(dat[["Country"]], "labels")
names(labsCountry)[i]
#[1] "France" "Germany" "Germany" "USA" "Spain" "UK"

answered Jan 29, 2020 at 19:04

3 Comments

Well, I am facing an issue with this solution. When I get a labelled value of 0, this doesn't work, since the match is done by index and my labelled values start at 0. Imagine USA is 0 instead of 1. Do you have any idea how we could fix this?
@SarahíAguilar In R, vectors are 1-based. If the label of USA is "0", it will still be the 1st element of the labels vector and the code should work. Can you edit the question with an example where it fails?
Exactly. But I have managed to fix this already. What I did was simply build a table of the values and their labels and then match it against the array values. Thanks, though!
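
For readers who hit the same case, the value-based lookup described in the last comment might look something like the sketch below. It is not the commenter's actual code. The names labs0, byValue and byPosition and the 0-based coding are made up for illustration; only match() and names() are base R. The idea is to look each value up in the labels vector instead of using it as a position.

```r
# Hypothetical labels whose coding starts at 0 instead of 1 (illustration only)
labs0 <- c(USA = 0, Germany = 1, France = 2, UK = 3, Spain = 4)
newCountry <- c(2, 1, 1, 0, 4, 3)

# match() returns the position of each value within labs0,
# so the lookup is by label value rather than by vector position
byValue <- names(labs0)[match(newCountry, labs0)]
byValue

# Positional indexing, for comparison: an index of 0 is silently dropped
# and every other value is shifted by one label
byPosition <- names(labs0)[newCountry]
byPosition
```

With this approach, a value that has no entry in the labels vector comes back as NA rather than being silently dropped or mapped to the wrong label.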
