Scoring algorithm
In statistics, the scoring algorithm, also known as Fisher's scoring,[1] is a form of Newton's method used to solve maximum likelihood equations numerically. It is named after Ronald Fisher.
Sketch of derivation
Let $Y_1, \ldots, Y_n$ be independent and identically distributed random variables with twice-differentiable p.d.f. $f(y; \theta)$, and suppose we wish to calculate the maximum likelihood estimator (MLE) $\theta^*$ of $\theta$. First, suppose we have a starting point $\theta_0$ for our algorithm, and consider a Taylor expansion of the score function, $V(\theta)$, about $\theta_0$:
$$V(\theta) \approx V(\theta_0) - \mathcal{J}(\theta_0)(\theta - \theta_0),$$
where
$$\mathcal{J}(\theta_0) = -\sum_{i=1}^{n} \left. \nabla \nabla^{\top} \right|_{\theta=\theta_0} \log f(Y_i; \theta)$$
is the observed information matrix at $\theta_0$. Now, setting $\theta = \theta^*$, using that $V(\theta^*) = 0$, and rearranging gives us:
$$\theta^* \approx \theta_0 + \mathcal{J}^{-1}(\theta_0) V(\theta_0).$$
We therefore use the algorithm
$$\theta_{m+1} = \theta_m + \mathcal{J}^{-1}(\theta_m) V(\theta_m),$$
and under certain regularity conditions it can be shown that $\theta_m \rightarrow \theta^*$.
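As a minimal numerical sketch of the iteration above (an illustration, not part of the original derivation), consider the rate parameter $\lambda$ of a Poisson sample, for which the score is $V(\lambda) = \sum_i Y_i / \lambda - n$ and the observed information is $\mathcal{J}(\lambda) = \sum_i Y_i / \lambda^2$. The function name and data below are hypothetical:

```python
import numpy as np

def newton_scoring_poisson(y, lam0, tol=1e-10, max_iter=100):
    """Iterate lam <- lam + J^{-1}(lam) V(lam) for a Poisson rate.

    Illustrative sketch: the score and observed information are the
    closed-form Poisson expressions, so J^{-1} V is a scalar division.
    """
    n, s = len(y), float(np.sum(y))
    lam = lam0
    for _ in range(max_iter):
        v = s / lam - n        # score V(lam)
        j = s / lam ** 2       # observed information J(lam)
        step = v / j           # scalar analogue of J^{-1} V
        lam += step
        if abs(step) < tol:
            break
    return lam

y = np.array([2, 3, 1, 4, 2, 3])
print(newton_scoring_poisson(y, lam0=1.0))  # converges to the sample mean, 2.5
```

Here the fixed point of the update is exactly the MLE $\bar{Y}$, and the iteration converges quadratically from any positive start.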
Fisher scoring
In practice, $\mathcal{J}(\theta)$ is usually replaced by $\mathcal{I}(\theta) = \mathrm{E}[\mathcal{J}(\theta)]$, the Fisher information, thus giving us the Fisher scoring algorithm:
$$\theta_{m+1} = \theta_m + \mathcal{I}^{-1}(\theta_m) V(\theta_m).$$
Under some regularity conditions, if $\theta_m$ is a consistent estimator, then $\theta_{m+1}$ (the correction after a single step) is 'optimal' in the sense that its error distribution is asymptotically identical to that of the true maximum-likelihood estimate.[2]
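The substitution of expected for observed information matters most when the observed information is badly behaved. A standard illustrative case (a hypothetical sketch, not from the article) is the location parameter of a Cauchy sample, where the expected information is the constant $\mathcal{I}(\theta) = n/2$ while the observed information can even be negative far from the optimum:

```python
import numpy as np

def fisher_scoring_cauchy(y, theta0, tol=1e-10, max_iter=500):
    """Fisher scoring theta <- theta + I^{-1} V(theta) for a Cauchy
    location parameter; for this model I(theta) = n/2 is constant.

    Illustrative sketch; convergence is local, so a robust start such
    as the sample median is used below.
    """
    n = len(y)
    theta = theta0
    for _ in range(max_iter):
        r = y - theta
        v = np.sum(2.0 * r / (1.0 + r ** 2))  # score V(theta)
        step = v / (n / 2.0)                  # I^{-1} V with I = n/2
        theta += step
        if abs(step) < tol:
            break
    return theta

y = np.array([-3.0, -0.5, 0.1, 0.8, 4.0])
theta_hat = fisher_scoring_cauchy(y, theta0=np.median(y))
print(theta_hat)  # a root of the score equation, near the sample median
```

Because $\mathcal{I}$ is constant here, each step is just a fixed multiple of the score, which sidesteps inverting a possibly indefinite observed information matrix at each iterate.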
References
1. Longford, Nicholas T. (1987). "A fast scoring algorithm for maximum likelihood estimation in unbalanced mixed models with nested random effects". Biometrika. 74 (4): 817–827. doi:10.1093/biomet/74.4.817.
2. Li, Bing; Babu, G. Jogesh (2019). "Bayesian Inference". Springer Texts in Statistics. New York, NY: Springer. Theorem 9.4. doi:10.1007/978-1-4939-9761-9_6. ISBN 978-1-4939-9759-6.
Further reading
- Jennrich, R. I.; Sampson, P. F. (1976). "Newton–Raphson and Related Algorithms for Maximum Likelihood Variance Component Estimation". Technometrics. 18 (1): 11–17. doi:10.1080/00401706.1976.10489395. JSTOR 1267911.