missoNet


Multi-task regression and network estimation with missing responses — no imputation required!

missoNet jointly estimates regression coefficients and the response network (precision matrix) from multi-response data where some responses are missing (MCAR/MAR/MNAR). Estimation is based on unbiased estimating equations with separate L1 regularization for coefficients and the precision matrix, enabling robust multi-trait analysis under incomplete outcomes.


Why missoNet?

If you only have a single response, classical lasso/elastic net (e.g., glmnet) is simpler and likely faster.


Installation

CRAN (stable)

    install.packages("missoNet")

GitHub (development)

    # install.packages("devtools")
    devtools::install_github("yixiao-zeng/missoNet", build_vignettes = TRUE)

Quick start

    library(missoNet)

    # Example data with ~15% missing responses (MCAR)
    sim <- generateData(n = 300, p = 50, q = 10, rho = 0.15, missing.type = "MCAR")

    # Fit along two lambda paths; choose via BIC (no CV)
    fit <- missoNet(X = sim$X, Y = sim$Z, GoF = "BIC")

    # Extract estimates at the selected solution
    Beta  <- fit$est.min$Beta   # p x q regression coefficients
    Theta <- fit$est.min$Theta  # q x q precision (conditional network)

    # Visualize selection path
    plot(fit, type = "scatter")
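Because Theta is a precision matrix, its off-diagonal entries can be rescaled into partial correlations, and its nonzero pattern can be read as the edges of the conditional network. The base-R sketch below illustrates this on a small hand-made matrix. `precision_to_pcor()` and `Theta_toy` are illustrative names and not part of missoNet; in practice you would pass `fit$est.min$Theta` instead of the toy matrix.

```r
# Hypothetical helper (not a missoNet function): convert a precision matrix
# into partial correlations, rho_ij = -theta_ij / sqrt(theta_ii * theta_jj).
precision_to_pcor <- function(Theta) {
  d <- sqrt(diag(Theta))
  pcor <- -Theta / outer(d, d)
  diag(pcor) <- 1
  pcor
}

# Toy 3 x 3 precision matrix standing in for fit$est.min$Theta
Theta_toy <- matrix(c( 2.0, -0.8, 0.0,
                      -0.8,  2.0, 0.5,
                       0.0,  0.5, 1.0),
                    nrow = 3, byrow = TRUE)

pcor <- precision_to_pcor(Theta_toy)

# Adjacency matrix of the conditional network: an edge wherever an
# off-diagonal precision entry is nonzero
adj <- Theta_toy != 0
diag(adj) <- FALSE
n_edges <- sum(adj[upper.tri(adj)])
```

A zero entry in Theta means the two responses are conditionally independent given the others, which is why the sparsity pattern, rather than the marginal correlation, defines the network.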

Cross-validation & prediction

    # 5-fold CV over (lambda.beta, lambda.theta)
    cvfit <- cv.missoNet(X = sim$X, Y = sim$Z, kfold = 5)

    # Inspect CV heatmap and selected models (min and 1-SE variants)
    plot(cvfit, type = "heatmap")

    # Predict responses (the training X is reused here for illustration;
    # pass a new predictor matrix to newx in practice)
    Y_hat <- predict(cvfit, newx = sim$X, s = "lambda.min")

Tip: Try s = "lambda.1se.beta" or s = "lambda.1se.theta" for a sparser, more conservative model, when those solutions are available.
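When responses are partly missing, prediction error can only be assessed on the entries that were actually observed. The sketch below shows one way to do that in base R, using toy matrices in place of `sim$Z` and `Y_hat` and assuming missing responses are coded as `NA`. `observed_mse()` is a hypothetical helper, not a missoNet function.

```r
# Hypothetical helper: per-response mean squared error computed over
# observed (non-NA) entries only.
observed_mse <- function(Y_obs, Y_pred) {
  stopifnot(all(dim(Y_obs) == dim(Y_pred)))
  colMeans((Y_obs - Y_pred)^2, na.rm = TRUE)
}

# Toy data: 3 samples x 2 responses (matrix() fills column by column)
Y_obs  <- matrix(c(1, NA, 3,
                   2,  2, NA), nrow = 3)
Y_pred <- matrix(c(1.5, 0, 2,
                   2,   1, 0), nrow = 3)

mse <- observed_mse(Y_obs, Y_pred)
```

Rows with a missing response contribute nothing to that response's error, so columns with many missing values are evaluated on fewer samples and their error estimates are correspondingly noisier.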


Parallel processing

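    library(parallel)

    # Leave one core free; detectCores() can return NA on some platforms
    n_cores <- max(1, detectCores() - 1, na.rm = TRUE)
    cl <- makeCluster(n_cores)
    cvfit <- cv.missoNet(X = sim$X, Y = sim$Z, kfold = 5,
                         parallel = TRUE, cl = cl)
    stopCluster(cl)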
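If the fit stops with an error, the `stopCluster()` call above is never reached and the worker processes stay alive. One common pattern, sketched here with a hypothetical `with_cluster()` wrapper (not part of missoNet or the parallel package), is to register the cleanup with `on.exit()`. The `parLapply()` call is only a stand-in for the `cv.missoNet(..., parallel = TRUE, cl = cl)` call.

```r
library(parallel)

# Hypothetical wrapper: create a cluster, run f(cl), and always stop the
# cluster afterwards, even if f() throws an error.
with_cluster <- function(n_workers, f) {
  cl <- makeCluster(n_workers)
  on.exit(stopCluster(cl), add = TRUE)
  f(cl)
}

res <- with_cluster(2, function(cl) {
  # Stand-in workload; replace with the cross-validation call
  parLapply(cl, 1:4, function(i) i^2)
})
```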

Advanced usage

Custom penalty factors

    # Reduce the penalty on predictors believed a priori to be important
    # (here, the first two predictors, across all responses)
    p <- ncol(sim$X); q <- ncol(sim$Z)
    beta.pen.factor <- matrix(1, p, q)
    beta.pen.factor[c(1, 2), ] <- 0.1

    fit <- missoNet(X = sim$X, Y = sim$Z,
                    beta.pen.factor = beta.pen.factor)
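Indexing rows by position is easy to get wrong once the design matrix has many columns. The sketch below builds the same kind of p x q penalty-factor matrix by predictor name instead. The toy matrix `X_toy`, the variable names, and `prior_vars` are made up for illustration; whether missoNet accepts or ignores the row names is not stated here, so `unname(pen)` is a safe choice when passing the result on.

```r
# Toy design matrix with named predictors (stands in for sim$X)
X_toy <- matrix(0, nrow = 5, ncol = 4,
                dimnames = list(NULL, c("age", "sex", "snp1", "snp2")))
q_toy <- 3  # number of responses

# Hypothetical prior knowledge: these predictors get a lighter penalty
prior_vars <- c("age", "sex")
stopifnot(all(prior_vars %in% colnames(X_toy)))

# One row per predictor, one column per response; 1 = default penalty
pen <- matrix(1, nrow = ncol(X_toy), ncol = q_toy,
              dimnames = list(colnames(X_toy), NULL))
pen[prior_vars, ] <- 0.1
```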

Adaptive search (faster large runs)

    fit <- missoNet(X = sim$X, Y = sim$Z,
                    adaptive.search = TRUE,
                    n.lambda.beta = 50,
                    n.lambda.theta = 50)

Documentation

    vignette("missoNet-introduction")
    vignette("missoNet-cross-validation")
    vignette("missoNet-case-study")

If vignettes are not available from CRAN binaries on your platform, install from source using the GitHub command above with build_vignettes = TRUE.


Performance notes

Actual performance depends on sparsity, the signal-to-noise ratio, and the missingness mechanism.


When to use (and not)

Great for

Not ideal for

- Single-response regression (use glmnet or similar)
- Extremely sparse information (e.g., >50% missing responses across most traits)
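To see whether a dataset falls into the second case, you can check the per-response missing rate before fitting. A minimal base-R sketch, using a toy matrix in place of your response matrix and assuming missing values are coded as `NA`:

```r
# Toy response matrix: 4 samples x 3 responses (filled column by column)
Z_toy <- matrix(c(1, NA, NA, NA,
                  1,  2,  3, NA,
                  1,  2,  3,  4), nrow = 4)

# Fraction of missing entries in each response column
miss_rate <- colMeans(is.na(Z_toy))

# Share of responses with more than half of their values missing
frac_heavy <- mean(miss_rate > 0.5)
```

Here only the first response is more than 50% missing, so `frac_heavy` is one third. A value close to 1 would indicate the "extremely sparse information" situation described above.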


Citation

If you use missoNet in your research, please cite:

    @article{zeng2025missonet,
      title   = {Multivariate regression with missing response data for modelling regional DNA methylation QTLs},
      author  = {Zeng, Yixiao and Alam, Shomoita and Bernatsky, Sasha and Hudson, Marie and Colmegna, In{\'e}s and Stephens, David A and Greenwood, Celia MT and Yang, Archer Y},
      journal = {arXiv preprint arXiv:2507.05990},
      year    = {2025},
      url     = {https://arxiv.org/abs/2507.05990}
    }

Contributing

Contributions and issues are welcome! Please open a discussion or pull request on the GitHub repository.


License

GPL-2. See the LICENSE file.
