1
\$\begingroup\$

This function takes a numeric vector and returns "con" if it's continuous and "bin" if it's binary.

It does not take the multinomial case into account, i.e. if a variable y has three possible values 0, 1, 2, it's treated like a continuous variable.

Code:

checkBinaryTrait = function(v, naVal="NA") {
 if(!is.numeric(v)) stop("Only numeric vectors are accepted.")
 vSet = unique(v)
 if(!missing(naVal)) vSet[which(vSet == naVal)] = NA
 vSet = vSet[which(!is.na(vSet))]
 if(any(as.integer(vSet) != vSet)) return("con")
 if(length(vSet) > 2) return("con")
 "bin"
}

Tests:

v = c(1, 1.1, 1, 1.1, NA)
checkBinaryTrait(v)
v = c(1, 2, 1, 2, NA)
checkBinaryTrait(v)
v = c(-9, 2.3, 4.1, -9, -9)
checkBinaryTrait(v, -9)
v = c(-9, 2, 4, -9, -9)
checkBinaryTrait(v, -9)
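
For reference, here is a sketch of what these calls return when traced by hand, written as assertions. The function is repeated unchanged so the block runs on its own; the last two checks (a three-valued vector and a constant vector) are extra cases that are not part of the original tests.

```r
checkBinaryTrait = function(v, naVal="NA") {
 if(!is.numeric(v)) stop("Only numeric vectors are accepted.")
 vSet = unique(v)
 if(!missing(naVal)) vSet[which(vSet == naVal)] = NA
 vSet = vSet[which(!is.na(vSet))]
 if(any(as.integer(vSet) != vSet)) return("con")
 if(length(vSet) > 2) return("con")
 "bin"
}

stopifnot(
 checkBinaryTrait(c(1, 1.1, 1, 1.1, NA)) == "con",
 checkBinaryTrait(c(1, 2, 1, 2, NA)) == "bin",
 checkBinaryTrait(c(-9, 2.3, 4.1, -9, -9), -9) == "con",
 checkBinaryTrait(c(-9, 2, 4, -9, -9), -9) == "bin",
 # extra case: three integer values are treated as continuous
 checkBinaryTrait(c(0, 1, 2, 1, 0)) == "con",
 # extra case: a constant vector has one unique value and counts as binary
 checkBinaryTrait(c(5, 5, 5)) == "bin"
)
```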
asked Nov 18, 2014 at 16:40
\$\endgroup\$
2
  • \$\begingroup\$ How do you treat a vector that contains identical values only? \$\endgroup\$ Commented Nov 18, 2014 at 18:51
  • \$\begingroup\$ Do you want to treat NA as a value too? \$\endgroup\$ Commented Nov 18, 2014 at 18:59

3 Answers

3
\$\begingroup\$

Using an explicit return is generally discouraged in R, since a function returns the value of its last evaluated expression anyway. You can get the same effect by rewriting with else if and else, like this:

checkBinaryTrait = function(v, naVal="NA") {
 if (!is.numeric(v)) stop("Only numeric vectors are accepted.")
 vSet = unique(v)
 if (!missing(naVal)) vSet[vSet == naVal] = NA
 vSet = vSet[!is.na(vSet)]
 if (any(as.integer(vSet) != vSet)) "con"
 else if (length(vSet) > 2) "con"
 else "bin"
}

I also removed the unnecessary which calls. This code still passes all your tests.

Actually the last statement can be further simplified to:

 if (any(as.integer(vSet) != vSet) || length(vSet) > 2) "con"
 else "bin"

You might also want to change the return type to TRUE or FALSE, in which case the last statement would become simply:

 !(any(as.integer(vSet) != vSet) || length(vSet) > 2)

And then, how about renaming checkBinaryTrait to is.binary?
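
Putting the last two suggestions together, a minimal sketch of the renamed function could look like the following. The name is.binary and the logical return value are only the proposals from this answer, not something the question asked for, and the rest of the body is kept as in the rewrite above.

```r
# Sketch only: is.binary is the rename proposed in this answer.
is.binary <- function(v, naVal = "NA") {
  if (!is.numeric(v)) stop("Only numeric vectors are accepted.")
  vSet <- unique(v)
  # Mark the user-supplied missing-value code as NA, if one was given
  if (!missing(naVal)) vSet[vSet == naVal] <- NA
  vSet <- vSet[!is.na(vSet)]
  # TRUE when all remaining values are whole numbers and there are at most two
  !(any(as.integer(vSet) != vSet) || length(vSet) > 2)
}

is.binary(c(1, 2, 1, 2, NA))        # TRUE
is.binary(c(1, 1.1, 1, 1.1, NA))    # FALSE
is.binary(c(-9, 2, 4, -9, -9), -9)  # TRUE
```

A logical result can be used directly in conditions such as if (is.binary(y)), which is the main reason to prefer it over the "bin"/"con" strings.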

Finally, the <- operator is more common than = for assignment. For example, Google's R style guide explicitly forbids using = for assignment.

answered Nov 18, 2014 at 20:31
\$\endgroup\$
3
\$\begingroup\$

Let me address some portions of your code before providing an alternative implementation.

  • missing(naVal)

    I would prefer not using this approach but an appropriate neutral default value for naVal. We can use NULL for this purpose.

  • vSet[which(vSet == naVal)] = NA

    Replacing values with NA before removing them is an unnecessary step. Furthermore, replacing values with NA is easier with the is.na<- function, for example, is.na(vSet) <- vSet == naVal.

  • vSet[which(!is.na(vSet))]

    You can omit NA values with the na.omit function.
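
To illustrate the last two points, here is a small sketch of is.na<- and na.omit on a toy vector. The values and the sentinel -9 are made up for the example; as.vector is only used to strip the bookkeeping attributes that na.omit attaches to its result.

```r
vSet <- c(-9, 2, 4)

# Mark every element equal to the sentinel as NA, in place
is.na(vSet) <- vSet == -9
vSet                               # NA 2 4

# Drop the NA values; na.omit records the removed positions as an attribute
cleaned <- na.omit(vSet)
as.vector(cleaned)                 # 2 4
```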


Here's an alternative implementation. For details, have a look at the comments.

checkBinaryTrait <- function(v, naVal = NULL) { 
 if( !is.numeric(v) ) stop("Only numeric vectors are accepted.")
 # remove NA's
 v2 <- na.omit(v)
 # get unique values
 v_unique <- unique(v2)
 # remove 'naVal's
 v_unique2 <- v_unique[! v_unique %in% naVal]
 # count number of unique values and check whether all values are integers
 if ( length(v_unique2) > 2L || 
      any(as.integer(v_unique2) != v_unique2) ) "con" else "bin"
}

Some tests:

> checkBinaryTrait(v, -9)
[1] "bin"
> checkBinaryTrait(c(1, 1.1, 1, 1.1, NA))
[1] "con"
> checkBinaryTrait(c(1, 2, 1, 2, NA))
[1] "bin"
> checkBinaryTrait(c(-9, 2.3, 4.1, -9, -9), -9)
[1] "con"
> checkBinaryTrait(c(-9, 2, 4, -9, -9), -9)
[1] "bin"

This implementation also allows multiple naVal values:

> checkBinaryTrait(c(1, 2, 2, 1, -9, -9.9), c(-9, -9.9))
[1] "bin"
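
Both the NULL default and the support for several naVal values come from how %in% behaves. A short sketch on a made-up vector:

```r
x <- c(1, 2, -9, -9.9)

# Nothing is ever "in" NULL, so the default removes no values
x %in% NULL                         # FALSE FALSE FALSE FALSE
kept_default <- x[!x %in% NULL]     # same as x

# With a vector of codes, every matching element is removed in one step
kept_codes <- x[!x %in% c(-9, -9.9)]
kept_codes                          # 1 2
```

Unlike ==, %in% never returns NA, so the result can be used as a logical index without going through which.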
answered Nov 18, 2014 at 21:40
\$\endgroup\$
1
\$\begingroup\$

I felt factor() was appropriate here. Assuming decimal values should also count as valid values (the question does not say so explicitly):

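checkBinaryTrait = function(v,naVal = "NA"){
 if (!is.numeric(v)) stop("Only numeric vectors are accepted.")
 # Filter with %in% rather than -which(): an empty which() result would drop every element
 keep = v[!(v %in% naVal)]
 # factor() ignores NA, so levels() holds the distinct non-missing values
 if(length(levels(factor(keep))) < 3) "bin" else "con"
}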

If only integers are to be considered, coercing into integers:

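checkBinaryTrait = function(v,naVal = "NA"){
 if (!is.numeric(v)) stop("Only numeric vectors are accepted.")
 # Filter with %in% rather than -which(): an empty which() result would drop every element
 keep = v[!(v %in% naVal)]
 # as.integer() truncates decimals before the distinct values are counted
 if(length(levels(factor(as.integer(keep)))) < 3) "bin" else "con"
}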
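
As a side note on removing sentinel values by index, here is a short sketch of why a logical filter is safer than a negative which() index. The vector and the sentinel -9 are made up for illustration.

```r
v <- c(1, 2, 1, 2)

# No element equals -9, so which() returns integer(0)
idx <- which(v == -9)

# A negative empty index is still an empty index: every element is dropped
dropped_all <- v[-idx]
length(dropped_all)                 # 0

# A logical filter keeps the vector intact when nothing matches
kept <- v[!(v %in% -9)]
length(kept)                        # 4
```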
answered Dec 2, 2014 at 10:47
\$\endgroup\$
