Showing posts with label referencing sequential variables. Show all posts
Monday, January 10, 2011
Example 8.20: Referencing lists of variables, part 2
In Example 8.19, we discussed how to refer to a group of variables with sequential names, such as varname1, varname2, varname3. This is trivial in SAS and can be done in R, as we showed. It's also sometimes useful to refer to all variables whose names begin with a common character string. For example, in the HELP data set, there are the variables cesd, cesd1, cesd2, cesd3, and cesd4.
SAS
In SAS, this can be done with the : operator, which works much like the * wildcard available in many operating systems: cesd: matches every variable whose name begins with cesd.
proc means data="c:\book\help.sas7bdat" mean;
var cesd:;
run;
Variable            Mean
------------------------
CESD1         22.7154472
CESD2         23.5837321
CESD3         22.0685484
CESD4         20.1428571
CESD          32.8476821
------------------------
R
This functionality is not built into R. But, as with the sequentially named variable problem, you can use the string functions available within R to replicate the effect.
In this case, we use the names() function (section 1.3.4) to get a list of the variables in the data set, then search for names whose beginnings match the desired string using the substr() function (section 1.4.3). Note that the substr() == comparison returns a vector of logicals, rather than variable names; that logical vector is then used to select the matching columns.
ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
# colMeans() is used here: mean() no longer accepts a data frame in R 3.0.0 and later
colMeans(ds[, substr(names(ds), 1, 4) == "cesd"], na.rm=TRUE)
cesd1 cesd2 cesd3 cesd4 cesd
22.71545 23.58373 22.06855 20.14286 32.84768
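As a minimal sketch of the same idea, the prefix test can be tried on a small made-up data frame. The toy object and its values are illustrative assumptions, not the HELP data. The sketch also shows two other ways to build the same logical vector: startsWith() (available in R 3.3.0 and later) and grepl() with an anchored regular expression. colMeans() computes the column means.

```r
# Toy data frame standing in for the HELP data (values are made up)
toy <- data.frame(id    = 1:3,
                  cesd  = c(30, 40, 50),
                  cesd1 = c(20, NA, 24),
                  cesd2 = c(10, 12, 14),
                  age   = c(31, 45, 52))

# Three ways to get a logical vector that is TRUE where the name begins with "cesd"
keep_substr <- substr(names(toy), 1, 4) == "cesd"
keep_starts <- startsWith(names(toy), "cesd")   # requires R >= 3.3.0
keep_regex  <- grepl("^cesd", names(toy))       # "^" anchors the match at the start

# Column means of the selected variables, ignoring missing values
prefix_means <- colMeans(toy[, keep_starts], na.rm = TRUE)
print(prefix_means)
```

The startsWith() and grepl() forms avoid counting the characters in the prefix, which is the main inconvenience of the substr() version.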
The typing required for the previous statement is rather involved, and requires counting characters. You may want to make a function to do this instead.
The function will accept a data frame as input and return the data frame with just the desired variables. It looks much like the direct version displayed above, but uses the substitute() function to access the "varname" parameter as text, rather than as an object. I store those characters in the object vname.
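matchin = function(dsname, varname) {
  # capture the unquoted argument as a character string
  vname = deparse(substitute(varname))
  # keep the columns whose names begin with that string
  return(dsname[substr(names(dsname), 1, nchar(vname)) == vname])
}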
Now we can just type
colMeans(matchin(ds, cesd), na.rm=TRUE)
with results identical to those displayed above.
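A possible variant, sketched here under stated assumptions, takes the prefix as a quoted character string instead of capturing an unquoted name. The helper name match_prefix and the toy data frame are hypothetical and are not part of the original example.

```r
# Hypothetical helper: select the columns whose names start with `prefix`
match_prefix <- function(df, prefix) {
  stopifnot(is.data.frame(df), is.character(prefix), length(prefix) == 1)
  # drop = FALSE keeps the result a data frame even if only one column matches
  df[, substr(names(df), 1, nchar(prefix)) == prefix, drop = FALSE]
}

# Toy data frame (values are made up)
toy <- data.frame(cesd  = c(30, 40, 50),
                  cesd1 = c(20, NA, 24),
                  age   = c(31, 45, 52))

selected <- match_prefix(toy, "cesd")
print(colMeans(selected, na.rm = TRUE))
```

Passing the prefix as a string costs two extra quote characters, but it lets the prefix come from a variable or a loop, which is awkward when the argument is captured unevaluated.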
Monday, January 3, 2011
Example 8.19: Referencing lists of variables
In section 1.11.4 (p. 50), we discuss referring to lists of variables in a data set. In SAS, this can be done for variables stored in adjacent columns with the "var_x -- var_y" syntax and for variables with sequentially enumerated suffixes with the "var_n1 - var_n2" syntax. We state in the above referenced section that R has no straightforward equivalent ability to reference a list of variables by name, though to reference by location is trivial. Wayne Richter (of the NY State Department of Environmental Conservation) pointed out a reference from Muenchen's excellent text that makes this task relatively straightforward to undertake in R for variables with sequential numerical suffixes.
R
Here we demonstrate this by displaying the means of the cesd1, cesd2, cesd3, and cesd4 variables measuring depressive symptoms at each of the follow-up time points for the HELP study.
ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
# colMeans() is used here: mean() no longer accepts a data frame in R 3.0.0 and later
colMeans(ds[, paste('cesd', seq(1, 4), sep = '')], na.rm=TRUE)
which generates the output:
cesd1 cesd2 cesd3 cesd4
22.71545 23.58373 22.06855 20.14286
This approach selects a set of variables by generating a character vector of variable names using the paste() function (section 1.4.5) and the seq() function (section 1.11.3). Then the colMeans() function is applied to the selected columns to give the mean of each.
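The name-building step can be checked in isolation on a small made-up data frame. This is a sketch only: the toy object and its values are assumptions, not the HELP data. It also shows paste0(), which behaves like paste() with sep = "".

```r
# Toy data frame (values are made up)
toy <- data.frame(cesd1 = c(20, NA, 24),
                  cesd2 = c(10, 12, 14),
                  cesd3 = c(5, 7, 9),
                  age   = c(31, 45, 52))

# Build the character vector "cesd1", "cesd2", "cesd3" in two equivalent ways
vars_paste  <- paste("cesd", 1:3, sep = "")
vars_paste0 <- paste0("cesd", 1:3)   # paste0() is paste() with sep = ""

# Index the data frame by name, then take the mean of each selected column
seq_means <- colMeans(toy[, vars_paste0], na.rm = TRUE)
print(seq_means)
```

Because the columns are selected by name, the result does not depend on where those columns sit in the data frame.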
SAS
This task is straightforward in SAS, using the - syntax (section 1.11.4) in the var statement in proc means.
proc means data=ds maxdec=2 n mean;
var cesd1 - cesd4;
run;
The MEANS Procedure

Variable    Label        N          Mean
-----------------------------------------
CESD1       1 cesd     246         22.72
CESD2       2 cesd     209         23.58
CESD3       3 cesd     248         22.07
CESD4       4 cesd     266         20.14
-----------------------------------------