Matrix t-distribution
Notation | {\displaystyle {\rm {T}}_{n,p}(\nu ,\mathbf {M} ,{\boldsymbol {\Sigma }},{\boldsymbol {\Omega }})} |
---|---|
Parameters | {\displaystyle \mathbf {M} } location (real {\displaystyle n\times p} matrix); {\displaystyle {\boldsymbol {\Sigma }}} scale (positive-definite real {\displaystyle n\times n} matrix); {\displaystyle {\boldsymbol {\Omega }}} scale (positive-definite real {\displaystyle p\times p} matrix); {\displaystyle \nu >0} degrees of freedom (real) |
Support | {\displaystyle \mathbf {X} \in \mathbb {R} ^{n\times p}} |
PDF | {\displaystyle {\frac {\Gamma _{p}\left({\frac {\nu +n+p-1}{2}}\right)}{(\pi )^{\frac {np}{2}}\Gamma _{p}\left({\frac {\nu +p-1}{2}}\right)}}|{\boldsymbol {\Omega }}|^{-{\frac {n}{2}}}|{\boldsymbol {\Sigma }}|^{-{\frac {p}{2}}}\times \left|\mathbf {I} _{n}+{\boldsymbol {\Sigma }}^{-1}(\mathbf {X} -\mathbf {M} ){\boldsymbol {\Omega }}^{-1}(\mathbf {X} -\mathbf {M} )^{\rm {T}}\right|^{-{\frac {\nu +n+p-1}{2}}}} |
CDF | No analytic expression |
Mean | {\displaystyle \mathbf {M} } if {\displaystyle \nu >1}, else undefined |
Mode | {\displaystyle \mathbf {M} } |
Variance | {\displaystyle \mathrm {cov} (\mathrm {vec} (\mathbf {X} ))={\frac {{\boldsymbol {\Sigma }}\otimes {\boldsymbol {\Omega }}}{\nu -2}}} if {\displaystyle \nu >2}, else undefined |
CF | see below |
In statistics, the matrix t-distribution (or matrix variate t-distribution) is the generalization of the multivariate t-distribution from vectors to matrices.[1] [2]
The matrix t-distribution stands in the same relationship to the multivariate t-distribution as the matrix normal distribution does to the multivariate normal distribution: if the matrix has only one row, or only one column, the matrix distribution reduces to the corresponding multivariate (vector) distribution. The matrix t-distribution is the compound distribution that results from an infinite mixture of a matrix normal distribution with an inverse Wishart distribution placed over either of its covariance matrices,[1] and the multivariate t-distribution can be generated in a similar way.[2]
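The compound construction can be sketched as a sampler. The following is a minimal illustration, not a reference implementation: the function name `sample_matrix_t` is hypothetical, the mixing distribution is assumed to be an inverse Wishart on the row covariance with {\displaystyle \nu +n-1} degrees of freedom and scale {\displaystyle {\boldsymbol {\Sigma }}} (the convention that reproduces the density and second moments quoted in this article), and {\displaystyle \nu } is assumed to be a positive integer so that the Wishart draw can be built from Gaussian vectors.

```python
import numpy as np

def sample_matrix_t(nu, M, Sigma, Omega, size, rng):
    """Draw `size` matrices by mixing a matrix normal over an inverse Wishart.

    Assumption: nu is a positive integer, and the row covariance V is
    inverse Wishart with nu + n - 1 degrees of freedom and scale Sigma.
    """
    n, p = M.shape
    k = nu + n - 1
    # S ~ Wishart_n(k, Sigma^{-1}): sum of k outer products of N(0, Sigma^{-1}) vectors.
    A = np.linalg.cholesky(np.linalg.inv(Sigma))
    G = rng.standard_normal((size, k, n)) @ A.T
    S = np.einsum("ski,skj->sij", G, G)
    V = np.linalg.inv(S)  # inverse Wishart draw for each sample
    # X | V ~ matrix normal with mean M, row covariance V, column covariance Omega.
    Lv = np.linalg.cholesky(V)
    Lo = np.linalg.cholesky(Omega)
    Z = rng.standard_normal((size, n, p))
    return M + Lv @ Z @ Lo.T

# Example: compare the empirical E[(X - M)(X - M)^T] with Sigma * tr(Omega) / (nu - 2).
rng = np.random.default_rng(0)
nu = 10
M = np.zeros((2, 3))
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Omega = np.diag([1.0, 2.0, 3.0])
X = sample_matrix_t(nu, M, Sigma, Omega, size=20000, rng=rng)
D = X - M
empirical = np.einsum("sij,skj->ik", D, D) / X.shape[0]
expected = Sigma * np.trace(Omega) / (nu - 2)
```

With enough draws, `empirical` should be close to `expected`; the agreement is only approximate because it is a Monte Carlo estimate.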
In a Bayesian analysis of a multivariate linear regression model based on the matrix normal distribution, the matrix t-distribution is the posterior predictive distribution.[3]
Definition
For a matrix t-distribution, the probability density function at a point {\displaystyle \mathbf {X} \in \mathbb {R} ^{n\times p}} is
- {\displaystyle f(\mathbf {X} ;\nu ,\mathbf {M} ,{\boldsymbol {\Sigma }},{\boldsymbol {\Omega }})=K\times \left|\mathbf {I} _{n}+{\boldsymbol {\Sigma }}^{-1}(\mathbf {X} -\mathbf {M} ){\boldsymbol {\Omega }}^{-1}(\mathbf {X} -\mathbf {M} )^{\rm {T}}\right|^{-{\frac {\nu +n+p-1}{2}}},}
where the normalizing constant K is given by
- {\displaystyle K={\frac {\Gamma _{p}\left({\frac {\nu +n+p-1}{2}}\right)}{(\pi )^{\frac {np}{2}}\Gamma _{p}\left({\frac {\nu +p-1}{2}}\right)}}|{\boldsymbol {\Omega }}|^{-{\frac {n}{2}}}|{\boldsymbol {\Sigma }}|^{-{\frac {p}{2}}}.}
Here {\displaystyle \Gamma _{p}} is the multivariate gamma function.
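As a sketch of how the density above might be evaluated numerically, the following computes its logarithm using NumPy and the standard library. The function names `multigammaln` and `matrix_t_logpdf` are chosen here for illustration; the multivariate gamma function is expanded with its standard product formula, and log-determinants are used to avoid overflow.

```python
import math
import numpy as np

def multigammaln(a, p):
    """Log of the multivariate gamma function Gamma_p(a) via its product formula."""
    return p * (p - 1) / 4 * math.log(math.pi) + sum(
        math.lgamma(a + (1 - j) / 2) for j in range(1, p + 1)
    )

def matrix_t_logpdf(X, nu, M, Sigma, Omega):
    """Log-density of the matrix t-distribution in the (nu, M, Sigma, Omega) form."""
    n, p = M.shape
    D = X - M
    # I_n + Sigma^{-1} D Omega^{-1} D^T, using linear solves instead of explicit inverses.
    inner = np.eye(n) + np.linalg.solve(Sigma, D) @ np.linalg.solve(Omega, D.T)
    _, logdet_inner = np.linalg.slogdet(inner)
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_omega = np.linalg.slogdet(Omega)
    log_K = (
        multigammaln((nu + n + p - 1) / 2, p)
        - (n * p / 2) * math.log(math.pi)
        - multigammaln((nu + p - 1) / 2, p)
        - (n / 2) * logdet_omega
        - (p / 2) * logdet_sigma
    )
    return log_K - (nu + n + p - 1) / 2 * logdet_inner

# Example evaluation at an arbitrary point.
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Omega = np.diag([1.0, 2.0, 3.0])
M = np.zeros((2, 3))
X = np.array([[0.3, -1.0, 0.5], [1.2, 0.0, -0.7]])
value = matrix_t_logpdf(X, 5.0, M, Sigma, Omega)
```

For {\displaystyle n=p=1} with {\displaystyle \Sigma \Omega =\nu } and {\displaystyle M=0}, this expression coincides with the log-density of the standard univariate Student t-distribution, which gives a simple sanity check.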
Properties
If {\displaystyle \mathbf {X} \sim {\mathcal {T}}_{n\times p}(\nu ,\mathbf {M} ,\mathbf {\Sigma } ,\mathbf {\Omega } )}, then we have the following properties:[2]
Expected values
The mean, or expected value, exists if {\displaystyle \nu >1} and is:
- {\displaystyle E[\mathbf {X} ]=\mathbf {M} }
and we have the following second-order expectations, if {\displaystyle \nu >2}:
- {\displaystyle E[(\mathbf {X} -\mathbf {M} )(\mathbf {X} -\mathbf {M} )^{T}]={\frac {\mathbf {\Sigma } \operatorname {tr} (\mathbf {\Omega } )}{\nu -2}}}
- {\displaystyle E[(\mathbf {X} -\mathbf {M} )^{T}(\mathbf {X} -\mathbf {M} )]={\frac {\mathbf {\Omega } \operatorname {tr} (\mathbf {\Sigma } )}{\nu -2}}}
where {\displaystyle \operatorname {tr} } denotes trace.
More generally, for appropriately dimensioned matrices A, B and C:
- {\displaystyle {\begin{aligned}E[(\mathbf {X} -\mathbf {M} )\mathbf {A} (\mathbf {X} -\mathbf {M} )^{T}]&={\frac {\mathbf {\Sigma } \operatorname {tr} (\mathbf {A} ^{T}\mathbf {\Omega } )}{\nu -2}}\\E[(\mathbf {X} -\mathbf {M} )^{T}\mathbf {B} (\mathbf {X} -\mathbf {M} )]&={\frac {\mathbf {\Omega } \operatorname {tr} (\mathbf {B} ^{T}\mathbf {\Sigma } )}{\nu -2}}\\E[(\mathbf {X} -\mathbf {M} )\mathbf {C} (\mathbf {X} -\mathbf {M} )]&={\frac {\mathbf {\Sigma } \mathbf {C} ^{T}\mathbf {\Omega } }{\nu -2}}\end{aligned}}}
Transformation
Transpose transform:
- {\displaystyle \mathbf {X} ^{T}\sim {\mathcal {T}}_{p\times n}(\nu ,\mathbf {M} ^{T},\mathbf {\Omega } ,\mathbf {\Sigma } )}
Linear transform: let A be an r-by-n matrix of full rank r ≤ n and B a p-by-s matrix of full rank s ≤ p. Then:
- {\displaystyle \mathbf {AXB} \sim {\mathcal {T}}_{r\times s}(\nu ,\mathbf {AMB} ,\mathbf {A\Sigma A} ^{T},\mathbf {B} ^{T}\mathbf {\Omega B} )}
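The parameter mapping under a linear transform can be written out directly. The helper below is a hypothetical illustration (the name `transform_params` is not from any library); it only computes the parameters of {\displaystyle \mathbf {AXB} } and does not check the rank conditions.

```python
import numpy as np

def transform_params(nu, M, Sigma, Omega, A, B):
    """Parameters of A X B when X has parameters (nu, M, Sigma, Omega).

    The degrees of freedom are unchanged; rank conditions on A and B are not checked.
    """
    return nu, A @ M @ B, A @ Sigma @ A.T, B.T @ Omega @ B

# Example: selecting the first row of a 2-by-3 matrix variate (r = 1, s = 3).
M = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Omega = np.diag([1.0, 2.0, 3.0])
A = np.array([[1.0, 0.0]])
B = np.eye(3)
nu_new, M_new, Sigma_new, Omega_new = transform_params(5.0, M, Sigma, Omega, A, B)
```

Here the transformed variate has a single row, so it falls under the one-row case in which the matrix t-distribution reduces to a multivariate t-distribution.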
The characteristic function and various other properties can be derived from the re-parameterised formulation (see below).
Re-parameterized matrix t-distribution
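Notation | {\displaystyle {\rm {T}}_{n,p}(\alpha ,\beta ,\mathbf {M} ,{\boldsymbol {\Sigma }},{\boldsymbol {\Omega }})} |
---|---|
Parameters | {\displaystyle \mathbf {M} } location (real {\displaystyle n\times p} matrix); {\displaystyle {\boldsymbol {\Sigma }}} scale (positive-definite real {\displaystyle n\times n} matrix); {\displaystyle {\boldsymbol {\Omega }}} scale (positive-definite real {\displaystyle p\times p} matrix); {\displaystyle \alpha >(p-1)/2} shape parameter; {\displaystyle \beta >0} scale parameter |
Support | {\displaystyle \mathbf {X} \in \mathbb {R} ^{n\times p}} |
PDF | {\displaystyle {\frac {\Gamma _{p}(\alpha +n/2)}{(2\pi /\beta )^{\frac {np}{2}}\Gamma _{p}(\alpha )}}|{\boldsymbol {\Omega }}|^{-{\frac {n}{2}}}|{\boldsymbol {\Sigma }}|^{-{\frac {p}{2}}}\times \left|\mathbf {I} _{n}+{\frac {\beta }{2}}{\boldsymbol {\Sigma }}^{-1}(\mathbf {X} -\mathbf {M} ){\boldsymbol {\Omega }}^{-1}(\mathbf {X} -\mathbf {M} )^{\rm {T}}\right|^{-(\alpha +n/2)}} |
CDF | No analytic expression |
Mean | {\displaystyle \mathbf {M} } if {\displaystyle \alpha >p/2}, else undefined |
Variance | {\displaystyle {\frac {2({\boldsymbol {\Sigma }}\otimes {\boldsymbol {\Omega }})}{\beta (2\alpha -p-1)}}} if {\displaystyle \alpha >(p+1)/2}, else undefined |
CF | see below |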
An alternative parameterisation of the matrix t-distribution uses two parameters {\displaystyle \alpha } and {\displaystyle \beta } in place of {\displaystyle \nu }.[3]
This formulation reduces to the standard matrix t-distribution with {\displaystyle \beta =2,\alpha ={\frac {\nu +p-1}{2}}.}
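The correspondence between the two parameterisations can be checked with a few lines of arithmetic. The helper names below are hypothetical; the check uses the scalar factors multiplying {\displaystyle {\boldsymbol {\Sigma }}\otimes {\boldsymbol {\Omega }}} in the variance expressions quoted in this article for each form.

```python
def to_alpha_beta(nu, p):
    """Map the standard degrees of freedom nu to (alpha, beta) for p columns."""
    return (nu + p - 1) / 2, 2.0

def variance_factor_standard(nu):
    """Scalar multiplying the Kronecker product in the standard form (needs nu > 2)."""
    return 1 / (nu - 2)

def variance_factor_alpha_beta(alpha, beta, p):
    """Scalar multiplying the Kronecker product in the re-parameterized form."""
    return 2 / (beta * (2 * alpha - p - 1))

nu, p = 5.0, 3
alpha, beta = to_alpha_beta(nu, p)
same = abs(variance_factor_alpha_beta(alpha, beta, p) - variance_factor_standard(nu)) < 1e-12
```

The same substitution turns the existence conditions {\displaystyle \alpha >p/2} and {\displaystyle \alpha >(p+1)/2} into {\displaystyle \nu >1} and {\displaystyle \nu >2} respectively.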
This formulation of the matrix t-distribution can be derived as the compound distribution that results from an infinite mixture of a matrix normal distribution with an inverse multivariate gamma distribution placed over either of its covariance matrices.
Properties
If {\displaystyle \mathbf {X} \sim {\rm {T}}_{n,p}(\alpha ,\beta ,\mathbf {M} ,{\boldsymbol {\Sigma }},{\boldsymbol {\Omega }})} then[2] [3]
- {\displaystyle \mathbf {X} ^{\rm {T}}\sim {\rm {T}}_{p,n}(\alpha ,\beta ,\mathbf {M} ^{\rm {T}},{\boldsymbol {\Omega }},{\boldsymbol {\Sigma }}).}
The property above comes from Sylvester's determinant theorem:
- {\displaystyle \det \left(\mathbf {I} _{n}+{\frac {\beta }{2}}{\boldsymbol {\Sigma }}^{-1}(\mathbf {X} -\mathbf {M} ){\boldsymbol {\Omega }}^{-1}(\mathbf {X} -\mathbf {M} )^{\rm {T}}\right)=}
- {\displaystyle \det \left(\mathbf {I} _{p}+{\frac {\beta }{2}}{\boldsymbol {\Omega }}^{-1}(\mathbf {X} ^{\rm {T}}-\mathbf {M} ^{\rm {T}}){\boldsymbol {\Sigma }}^{-1}(\mathbf {X} ^{\rm {T}}-\mathbf {M} ^{\rm {T}})^{\rm {T}}\right).}
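The determinant identity can be verified numerically on random inputs. This is a small sanity check rather than a proof; the helper `random_spd` and the chosen dimensions are illustrative assumptions, and `D` stands in for {\displaystyle \mathbf {X} -\mathbf {M} }.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, beta = 3, 4, 2.0

def random_spd(k):
    """Random symmetric positive-definite k-by-k matrix (illustrative helper)."""
    R = rng.standard_normal((k, k))
    return R @ R.T + k * np.eye(k)

Sigma, Omega = random_spd(n), random_spd(p)
D = rng.standard_normal((n, p))  # plays the role of X - M

Sigma_inv_D = np.linalg.solve(Sigma, D)      # Sigma^{-1} (X - M), shape (n, p)
Omega_inv_Dt = np.linalg.solve(Omega, D.T)   # Omega^{-1} (X - M)^T, shape (p, n)

# n-by-n determinant on the left, p-by-p determinant on the right.
lhs = np.linalg.det(np.eye(n) + (beta / 2) * Sigma_inv_D @ Omega_inv_Dt)
rhs = np.linalg.det(np.eye(p) + (beta / 2) * Omega_inv_Dt @ Sigma_inv_D)
```

Up to floating-point error, `lhs` and `rhs` agree, which is why the density takes the same value whether it is written for {\displaystyle \mathbf {X} } or for {\displaystyle \mathbf {X} ^{\rm {T}}}.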
If {\displaystyle \mathbf {X} \sim {\rm {T}}_{n,p}(\alpha ,\beta ,\mathbf {M} ,{\boldsymbol {\Sigma }},{\boldsymbol {\Omega }})} and {\displaystyle \mathbf {A} (n\times n)} and {\displaystyle \mathbf {B} (p\times p)} are nonsingular matrices then[2] [3]
- {\displaystyle \mathbf {AXB} \sim {\rm {T}}_{n,p}(\alpha ,\beta ,\mathbf {AMB} ,\mathbf {A} {\boldsymbol {\Sigma }}\mathbf {A} ^{\rm {T}},\mathbf {B} ^{\rm {T}}{\boldsymbol {\Omega }}\mathbf {B} ).}
The characteristic function is[3]
- {\displaystyle \phi _{T}(\mathbf {Z} )={\frac {\exp({\rm {tr}}(i\mathbf {Z} '\mathbf {M} ))|{\boldsymbol {\Omega }}|^{\alpha }}{\Gamma _{p}(\alpha )(2\beta )^{\alpha p}}}|\mathbf {Z} '{\boldsymbol {\Sigma }}\mathbf {Z} |^{\alpha }B_{\alpha }\left({\frac {1}{2\beta }}\mathbf {Z} '{\boldsymbol {\Sigma }}\mathbf {Z} {\boldsymbol {\Omega }}\right),}
where
- {\displaystyle B_{\delta }(\mathbf {WZ} )=|\mathbf {W} |^{-\delta }\int _{\mathbf {S} >0}\exp \left({\rm {tr}}(-\mathbf {SW} -\mathbf {S^{-1}Z} )\right)|\mathbf {S} |^{-\delta -{\frac {1}{2}}(p+1)}d\mathbf {S} ,}
and where {\displaystyle B_{\delta }} is the type-two Bessel function of Herz of a matrix argument.
Notes
1. Zhu, Shenghuo; Yu, Kai; Gong, Yihong (2007). "Predictive Matrix-Variate t Models". In J. C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, NIPS '07: Advances in Neural Information Processing Systems 20, pages 1721–1728. MIT Press, Cambridge, MA, 2008. The notation is changed a bit in this article for consistency with the matrix normal distribution article.
2. Gupta, Arjun K.; Nagar, Daya K. (1999). Matrix Variate Distributions. CRC Press. Chapter 4.
3. Iranmanesh, Anis; Arashi, M.; Tabatabaey, S. M. M. (2010). "On Conditional Applications of Matrix Variate Normal Distribution". Iranian Journal of Mathematical Sciences and Informatics, 5:2, pp. 33–43.