Regularized Learning Algorithm


A Regularized Learning Algorithm is a supervised model-based learning algorithm that ...



References

2016

Model | Fit measure | Entropy measure[1] [2]
AIC/BIC | [math]\displaystyle{ \|Y-X\beta\|_2 }[/math] | [math]\displaystyle{ \|\beta\|_0 }[/math]
Ridge regression | [math]\displaystyle{ \|Y-X\beta\|_2 }[/math] | [math]\displaystyle{ \|\beta\|_2 }[/math]
Lasso[3] | [math]\displaystyle{ \|Y-X\beta\|_2 }[/math] | [math]\displaystyle{ \|\beta\|_1 }[/math]
Basis pursuit denoising | [math]\displaystyle{ \|Y-X\beta\|_2 }[/math] | [math]\displaystyle{ \lambda\|\beta\|_1 }[/math]
Rudin-Osher-Fatemi model (TV) | [math]\displaystyle{ \|Y-X\beta\|_2 }[/math] | [math]\displaystyle{ \lambda\|\nabla\beta\|_1 }[/math]
Potts model | [math]\displaystyle{ \|Y-X\beta\|_2 }[/math] | [math]\displaystyle{ \lambda\|\nabla\beta\|_0 }[/math]
RLAD[4] | [math]\displaystyle{ \|Y-X\beta\|_1 }[/math] | [math]\displaystyle{ \|\beta\|_1 }[/math]
Dantzig Selector[5] | [math]\displaystyle{ \|X^\top (Y-X\beta)\|_\infty }[/math] | [math]\displaystyle{ \|\beta\|_1 }[/math]
SLOPE[6] | [math]\displaystyle{ \|Y-X\beta\|_2 }[/math] | [math]\displaystyle{ \sum_{i=1}^p \lambda_i|\beta|_{(i)} }[/math]

Elastic net regularization combines the LASSO (L1) and ridge regression (L2) penalties in a single weighted objective; a minimal numerical sketch of these penalized objectives follows the reference list below.
  1. Bishop, Christopher M. (2007). Pattern Recognition and Machine Learning (corrected printing). New York: Springer. ISBN 978-0387310732.
  2. Duda, Richard O. (2004). Pattern Classification + Computer Manual: Hardcover Set (2nd ed.). New York: Wiley. ISBN 978-0471703501.
  3. Tibshirani, Robert (1996). "Regression Shrinkage and Selection via the Lasso" (PostScript). Journal of the Royal Statistical Society, Series B 58 (1): 267–288. MR 1379242. http://www-stat.stanford.edu/~tibs/ftp/lasso.ps . Retrieved 2009-03-19.
  4. Wang, Li; Gordon, Michael D.; Zhu, Ji (2006). "Regularized Least Absolute Deviations Regression and an Efficient Algorithm for Parameter Tuning". Sixth International Conference on Data Mining (ICDM'06). pp. 690–700.
  5. Candes, Emmanuel; Tao, Terence (2007). "The Dantzig selector: Statistical estimation when p is much larger than n". Annals of Statistics 35 (6): 2313–2351. arXiv:math/0506081. doi:10.1214/009053606000001523. MR 2382644.
  6. Bogdan, Małgorzata; van den Berg, Ewout; Su, Weijie; Candès, Emmanuel J. (2013). "Statistical estimation and testing via the ordered L1 norm". arXiv:1310.1969v2. http://arxiv.org/pdf/1310.1969v2.pdf .
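
The rows of the table above all share the pattern "fit measure + regularization penalty". As a minimal numerical sketch of that pattern (assuming NumPy and hypothetical toy data; names, sizes, and parameter values are illustrative, not from the source), the following evaluates the ridge, lasso, and elastic net objectives for one candidate coefficient vector, using the squared forms common in practice:

  import numpy as np

  # Hypothetical toy problem: 50 samples, 10 features (illustrative only).
  rng = np.random.default_rng(0)
  X = rng.normal(size=(50, 10))               # design matrix
  beta = rng.normal(size=10)                  # candidate coefficients
  Y = X @ beta + 0.1 * rng.normal(size=50)    # noisy responses

  lam = 0.5        # regularization strength (assumed value)
  l1_ratio = 0.5   # elastic net mix between the L1 and L2 penalties (assumed value)

  fit = np.sum((Y - X @ beta) ** 2)        # squared fit measure ||Y - X beta||_2^2
  ridge_pen = np.sum(beta ** 2)            # ridge penalty ||beta||_2^2
  lasso_pen = np.sum(np.abs(beta))         # lasso penalty ||beta||_1
  enet_pen = l1_ratio * lasso_pen + (1 - l1_ratio) * ridge_pen   # elastic net penalty

  print("ridge objective      :", fit + lam * ridge_pen)
  print("lasso objective      :", fit + lam * lasso_pen)
  print("elastic net objective:", fit + lam * enet_pen)

In practice the objective is minimized over beta (e.g., by coordinate descent for the lasso); the sketch only shows how the fit and entropy measures from the table combine into a single objective.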

2015

  • https://www.quora.com/What-is-the-difference-between-L1-and-L2-regularization/answer/Justin-Solomon
    • QUOTE: ... you can view regularization as a prior on the distribution from which your data is drawn (most famously Gaussian for least-squares), as a way to punish high values in regression coefficients, and so on.
  • Compressibility and K-term approximation http://cnx.org/contents/U4hLPGQD@5/Compressible-signals#uid10
    • QUOTE: A signal's compressibility is related to the lp space to which the signal belongs. An infinite sequence x(n) is an element of an lp space for a particular value of p if and only if its lp norm is finite: [math]\displaystyle{ \|x\|_p = \left(\sum_i |x_i|^p\right)^{\frac{1}{p}} \lt \infty }[/math]

      The smaller p is, the faster the sequence's values must decay in order to converge so that the norm is bounded. In the limiting case of p=0, the "norm" is actually a pseudo-norm and counts the number of non-zero values. As p decreases, the size of its corresponding lp space also decreases. The figure shows various lp unit balls (all sequences whose lp norm is 1) in 3 dimensions.

      As the value of p decreases, the size of the corresponding lp space also decreases. This can be seen visually when comparing the size of the spaces of signals, in three dimensions, for which the lp norm is less than or equal to one. The volume of these lp "balls" decreases with p.
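
The prior interpretation in the Quora answer above can be made concrete with a standard worked equation (restated here for clarity, not taken from the quoted page): modeling the responses as Y = Xβ plus Gaussian noise of variance [math]\displaystyle{ \sigma^2 }[/math], and placing a zero-mean Gaussian prior of variance [math]\displaystyle{ \tau^2 }[/math] on each coefficient, the maximum a posteriori (MAP) estimate minimizes a ridge-style objective:

[math]\displaystyle{ \hat{\beta} = \arg\max_\beta \; p(Y \mid X,\beta)\,p(\beta) = \arg\min_\beta \; \|Y-X\beta\|_2^2 + \frac{\sigma^2}{\tau^2}\|\beta\|_2^2 }[/math]

so the regularization strength corresponds to the ratio of noise variance to prior variance; a Laplace prior on the coefficients yields the L1 (lasso) penalty in the same way.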
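
To make the quoted lp definition concrete, here is a small sketch assuming NumPy and a hypothetical sparse vector (names and values are illustrative); it evaluates the limiting l0 "pseudo-norm" alongside the l1 and l2 norms:

  import numpy as np

  def lp_norm(x, p):
      # lp norm as in the quoted formula: (sum_i |x_i|^p)^(1/p); a pseudo-norm for p < 1.
      return np.sum(np.abs(x) ** p) ** (1.0 / p)

  x = np.array([3.0, 0.0, -1.0, 0.0, 0.5])   # hypothetical sparse sequence

  l0 = np.count_nonzero(x)   # limiting p = 0 case: counts non-zero entries -> 3
  l1 = lp_norm(x, 1)         # sum of absolute values -> 4.5
  l2 = lp_norm(x, 2)         # Euclidean length -> about 3.20
  print(l0, l1, l2)

The l0 pseudo-norm counts non-zero entries and the l1 norm grows linearly even for small coefficients, which is why penalties based on them (AIC/BIC, lasso, RLAD, and the Dantzig selector in the 2016 table above) promote sparse coefficient vectors.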


