Density Estimation
Last update: 17 Sep 2025 08:57
First version: 5 October 2010
Two topics of particular interest: estimating conditional
densities, and estimating the densities of short subsequences from
time series.
Recommended, bigger picture:
- Luc Devroye and Gabor Lugosi, Combinatorial Methods in Density
Estimation
- Peter Hall, Jeff Racine and Qi Li, "Cross-Validation and the Estimation of Conditional Probability Densities", Journal of the American Statistical Association 99 (2004): 1015--1026 [PDF]
- Jeffrey S. Racine, "Nonparametric Econometrics: A Primer",
Foundations and Trends in Econometrics
3 (2008): 1--88 [Good primer of nonparametric techniques
for regression, density estimation and hypothesis testing; next to no economic
content (except for examples). Presumes reasonable familiarity with parametric
statistics. PDF
reprint]
- Jeffrey S. Simonoff, Smoothing Methods in Statistics
- Larry Wasserman
- All of Statistics
- All of Nonparametric Statistics
Recommended, close-ups:
- Andrew R. Barron and Chyong-Hwa Sheu, "Approximation of Density
Functions by Sequences of Exponential Families", Annals of Statistics 19 (1991): 1347--1369
- Giulio Biroli and Marc Mézard, "Kernel Density Estimators in Large Dimensions", arxiv:2408.05807 [Dep't. of "of course it's really a spin glass". More exactly: for large enough bandwidths and rich enough sample sizes (relative to the dimension), there are lots of sample points near any place where we're evaluating the density and a central limit theorem holds. Below that point, as the bandwidth shrinks or we have fewer samples, things start to look spin-glass-y, with weird large fluctuations. And below that, every density estimate is basically driven by a few, often just one, sample points.]
- Susan M. Buchman, Ann B. Lee, Chad M. Schafer, "High-Dimensional Density Estimation via SCA: An Example in the Modelling of Hurricane Tracks", arxiv:0907.0199
- Bruce E. Hansen
- "Nonparametric Conditional Density
Estimation" [PDF preprint, 2004]
- "Nonparametric Estimation of Smooth Conditional Distributions" [Preprint]
- Dirk Husmeier, Neural Networks for Conditional Probability Estimation: Forecasting Beyond Point Predictions
- Rafael Izbicki, A Spectral Series Approach to High-Dimensional Nonparametric Inference [Ph.D. thesis, CMU statistics department, 2014]
- Rafael Izbicki, Ann Lee, Chad Schafer, "High-Dimensional Density Ratio Estimation with Extensions to Approximate Likelihood Computation",
AISTATS 2014: 420--429
- Daniel McDonald, "Minimax Density Estimation for Growing Dimension", AISTATS 2017: 194--203
- Abdelkader Mokkadem, Mariane Pelletier, Yousri Slaoui, "The stochastic approximation method for the estimation of a multivariate probability density", arxiv:0807.2960
- Alessandro Rinaldo, Aarti Singh, Rebecca Nugent, Larry Wasserman, "Stability of Density-Based Clustering", arxiv:1011.2771
- Makoto Yamada, Taiji Suzuki, Takafumi Kanamori, Hirotaka Hachiya,
Masashi Sugiyama, "Relative Density-Ratio Estimation for Robust Distribution Comparison",
Neural Computation 25 (2013): 1324--1370 [This is not the relative density between \( p \) and \( q \) in the Handcock-Morris sense, just the ratio between \( p \) and \( ap+(1-a)q \), for adjustable \( a \). (This is to keep the density ratio from going to infinity anywhere.) The thing seems a bit hackish, but still worth considering...]
- Lin Yuan, Sergey Kirshner, Robert Givan, "Estimating Densities with Non-Parametric Exponential Families", arxiv:1206.5036
- Yan Zheng, Jeffrey Jestes, Jeff M. Phillips, Feifei Li, "Quality and efficiency for kernel density estimates in large data", pp. 433--444 of SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data [Reprint via Dr. Li]
- Victoria Zinde-Walsh, "Nonparametric functionals as generalized functions", arxiv:1303.1435
To read:
- Ethan Anderes, Marc Coram, "A general spline representation for nonparametric and semiparametric density estimates using diffeomorphisms", arxiv:1205.5314
- Alain Berlinet, Gérard Biau and Laurent Rouvière,
"Optimal L1 Bandwidth selection for variable kernel density estimates",
Statistics and
Probability Letters 74 (2005): 116--128 ["[O]ne can
improve performance of kernel density estimates by varying the bandwidth with
the location and/or the sample data at hand. Our interest in this paper is in
the data-based selection of a variable bandwidth... an automatic selection
procedure inspired by the combinatorial tools developed in Devroye and
Lugosi... the expected L1 error of the corresponding selected estimate is up to
a given constant multiple of the best possible error plus an additive term
which tends to zero under mild assumptions"]
- Blair Bilodeau, Dylan J. Foster, Daniel M. Roy, "Minimax rates for conditional density estimation via empirical entropy", Annals of Statistics 51 (2023): 762--790, arxiv:2109.10461
- Z. I. Botev, J. F. Grotowski, and D. P. Kroese, "Kernel density estimation via diffusion", Annals of Statistics 38 (2010): 2916--2957
- Serge Cohen, Erwan Le Pennec, "Conditional Density Estimation by Penalized Likelihood Model Selection", arxiv:1103.2021
- Tilman M. Davies, Martin L. Hazelton, Jonathan C. Marshall, "sparr: Analyzing Spatial Relative Risk Using Fixed and Adaptive Kernel Density Estimation in R", Journal of Statistical Software 39:1 (2011)
- Sam Efromovich
- Bradley Efron and Robert Tibshirani, "Using Specially Designed Exponential Families for Density Estimation", Annals of Statistics
24 (1996): 2431--2461
- Michael Feindt, "A Neural Bayesian Estimator for Conditional
Probability Densities", physics/0402093
- Evarist Giné and Hailin Sang, "Uniform asymptotics for kernel density estimators with variable bandwidths", arxiv:1007.4350
- Evarist Giné and Richard Nickl
- David Haussler, Manfred Opper, "Mutual information, metric entropy and cumulative relative entropy risk",
Annals of Statistics 25 (1997): 2451--2492
- Nima S. Hejazi, Mark J. van der Laan and David Benkeser, "haldensify: Highly adaptive lasso conditional density estimation in R", Journal of Open Source Software 7 (2022): 4522
- Han Liu, John Lafferty and Larry Wasserman, "Tree
Density Estimation", arxiv:1001.1557
- Han Liu, Min Xu, Haijie Gu, Anupam Gupta, John Lafferty, Larry Wasserman, "Forest Density Estimation", Journal of Machine
Learning Research 12 (2011): 907--951
- Yanyuan Ma, Jeffrey D. Hart and Raymond J. Carroll, "Density Estimation in Several Populations With Uncertain Population Membership", Journal of the American Statistical Association 106 (2011): 1180--1192
- Reason Lesego Machete, "Early Warning with Calibrated and Sharper Probabilistic Forecasts", arxiv:1112.6390
- Brendan P. M. McCabe, Gael M. Martin, David Harris, "Efficient probabilistic forecasts for counts", Journal
of the Royal Statistical Society B 73 (2011): 253--272
- Andrew B. Nobel, Gusztav Morvai, Sanjeev R. Kulkarni, "Density estimation from an individual numerical sequence", IEEE Transactions on
Information Theory 44 (1998): 537--541, arxiv:0710.2500
- Andriy Norets, "Approximation of conditional densities by smooth mixtures of regressions", Annals of Statistics 38
(2010): 1733--1766, arxiv:1010.0581
- Michael Nussbaum, "Asymptotic Equivalence of Density Estimation and
Gaussian White Noise", Annals of Statistics 24
(1996): 2399--2430
- Seonjo Park and Panos M. Pardalos, "Deep data density estimation through Donsker-Varadhan representation", Annals of Mathematics and Artificial Intelligence 93 (2025): 7--17 [Honestly, this looks weird, but possibly cool]
- Alessandro Rinaldo and Larry Wasserman, "Generalized Density
Clustering", Annals of Statistics 38
(2010): 2678--2722, arxiv:0907.3454
- Olga Y. Savchuk, Jeffrey D. Hart, and Simon J. Sheather, "Indirect
Cross-Validation for Density
Estimation", Journal
of the American Statistical Association 105 (2010):
415--423
- Bharath Sriperumbudur, Kenji Fukumizu, Arthur Gretton, Aapo Hyvärinen, Revant Kumar, "Density Estimation in Infinite Dimensional Exponential Families", Journal of Machine Learning Research 18:57 (2017): 1--59
- Yuefeng Wu, Subhashis Ghosal, "Kullback Leibler property of kernel mixture priors in Bayesian density estimation", Electronic Journal
of Statistics 2 (2008): 298--331, arxiv:0710.2746
- Bin Yu, "Density Estimation in the \( L^{\infty} \) Norm for Dependent Data with Applications to the Gibbs Sampler", Annals of Statistics
21 (1993): 711--735
- Adriano Zanin Zambom, Ronaldo Dias, "A Review of Kernel Density Estimation with Applications to Econometrics", arxiv:1212.2812