1 National Science Library (Chengdu), Chinese Academy of Sciences
2 School of Statistics, Renmin University of China
Abstract
Social network platforms today generate vast amounts of data, including network structures and large numbers of user-defined tags that reflect users’ interests. The dimensionality of these personalized tags can be ultrahigh, which poses challenges for modeling in targeted preference analysis. Traditional categorical feature screening methods overlook the network structure, which can lead to an incorrect feature set and suboptimal prediction accuracy. This study focuses on feature screening for network-involved preference analysis based on ultrahigh-dimensional categorical tags. We introduce the concepts of self-related features and network-related features, defined as those directly related to the response and those related to the response through the network structure, respectively. We then propose a pseudo-likelihood ratio feature screening procedure that identifies both types of features. The theoretical properties of this procedure are investigated under different scenarios. Extensive simulations and a real data analysis on Sina Weibo validate our findings.
Funding Statement
This research is supported by the National Natural Science Foundation of China (NSFC 72471230, 71873137), the Special Research Assistant Grant Program of the Chinese Academy of Sciences (E4C00013), the Sichuan Provincial Fund Project for Philosophy and Social Sciences (SCJJ24ND326), the fund for building world-class universities (disciplines) of the Renmin University of China, the MOE Project of Key Research Institute of Humanities and Social Sciences (grant 22JJD110001), the Public Computing Cloud, Renmin University of China, and Big Data and Responsible Artificial Intelligence for National Governance, Renmin University of China.
Acknowledgments
The authors would like to thank the anonymous referees, an Associate Editor, and the Editor for their insightful and constructive comments that improved the quality of this article.
Danyang Huang and Bo Zhang are the corresponding authors.
Citation
Wei Hu. Danyang Huang. Bo Zhang. "Pseudo-likelihood ratio screening based on network data with applications." Ann. Appl. Stat. 19 (3) 2517 - 2538, September 2025. https://doi.org/10.1214/25-AOAS2058
Information