In many situations, there is a symmetric matrix of interest $\Sigma$, but one only has access to a perturbed version of it, $\hat{\Sigma} = \Sigma + H$. How are the eigenvalues and eigenspaces of $\Sigma$ affected by the perturbation $H$?
The basic example is Principal Component Analysis (PCA). Let $\Sigma = \mathbb{E}[X X^{*}]$ for some random vector $X \in \mathbb{R}^d$, and let $\hat{\Sigma}$ be the sample covariance matrix built from $n$ independent copies of $X$. Namely, we observe $X_1, \dots, X_n$ i.i.d. random variables distributed as $X$ and we set

$$\hat{\Sigma} := \frac{1}{n} \sum_{i=1}^{n} X_i X_i^{*}.$$
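As a concrete illustration (not part of the original argument), here is a minimal NumPy sketch of this setup; the dimension, sample size and the spiked covariance used to generate the data are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 20, 500

# Population covariance: a "spiked" model with one strong direction.
u = np.zeros(d); u[0] = 1.0
Sigma = np.eye(d) + 4.0 * np.outer(u, u)                   # Sigma = I + 4 u u^*

# Draw n i.i.d. copies of X ~ N(0, Sigma) and form the sample covariance.
X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)    # shape (n, d)
Sigma_hat = X.T @ X / n                                    # (1/n) sum_i X_i X_i^*

H = Sigma_hat - Sigma                                      # the perturbation
print("||H||_op =", np.linalg.norm(H, 2))

# Compare the leading eigenvector of Sigma_hat with the true spike direction u.
eigvals, eigvecs = np.linalg.eigh(Sigma_hat)               # ascending eigenvalues
v_hat = eigvecs[:, -1]
print("|<v_hat, u>| =", abs(v_hat @ u))                    # close to 1 if the estimate is good
```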
1. Some intuition
An eigenspace of $\Sigma$ is the span of some eigenvectors of $\Sigma$. We can decompose $\Sigma$ into its action on an eigenspace $S$ and its action on the orthogonal complement $S^{\perp}$:

$$\Sigma = E_0 \Lambda_0 E_0^{*} + E_1 \Lambda_1 E_1^{*},$$

where the columns of $E_0$ form an orthonormal basis of $S$, the columns of $E_1$ form an orthonormal basis of $S^{\perp}$, and $\Lambda_0$, $\Lambda_1$ are diagonal matrices containing the corresponding eigenvalues. Similarly, we decompose the perturbed matrix $\hat{\Sigma} = \Sigma + H$ with respect to one of its eigenspaces $\hat{S}$:

$$\hat{\Sigma} = F_0 \hat{\Lambda}_0 F_0^{*} + F_1 \hat{\Lambda}_1 F_1^{*},$$

where the columns of $F_0$ form an orthonormal basis of $\hat{S}$, the columns of $F_1$ form an orthonormal basis of $\hat{S}^{\perp}$, and $\dim(\hat S) = \dim(S)$.
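The decomposition above can be formed directly from a full eigendecomposition. The following sketch (illustrative only; the choice of which eigenvalues go into $S$ is arbitrary here) builds $E_0, \Lambda_0, E_1, \Lambda_1$ with NumPy and checks the identity.

```python
import numpy as np

def eigenspace_split(Sigma, idx):
    """Split the eigendecomposition of a symmetric matrix into its action on the
    eigenspace spanned by the eigenvectors indexed by `idx` and on its complement."""
    eigvals, eigvecs = np.linalg.eigh(Sigma)              # ascending eigenvalues
    mask = np.zeros(len(eigvals), dtype=bool)
    mask[idx] = True
    E0, L0 = eigvecs[:, mask], np.diag(eigvals[mask])     # basis of S
    E1, L1 = eigvecs[:, ~mask], np.diag(eigvals[~mask])   # basis of S^perp
    return E0, L0, E1, L1

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6)); Sigma = (A + A.T) / 2    # a symmetric test matrix

E0, L0, E1, L1 = eigenspace_split(Sigma, idx=[4, 5])      # S = top-2 eigenspace
recon = E0 @ L0 @ E0.T + E1 @ L1 @ E1.T
print(np.allclose(recon, Sigma))                          # True
```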
Suppose we find a few eigenvalues of $\hat{\Sigma}$ and the corresponding eigenspace $\hat{S}$. We would like to claim that $\hat{S}$ is close to the corresponding eigenspace $S$ of $\Sigma$, in the sense that every vector of $S$ is well approximated by its orthogonal projection onto $\hat{S}$. A vector of $S$ can be written as $E_0 \alpha$ for some $\alpha \in \mathbb{R}^{\mathrm{dim}(S)}$, and its orthogonal projection onto $\hat{S}$ is $F_0 F_0^{*} E_0 \alpha$.

Therefore vectors in $S$ will be well-approximated by their projections onto $\hat{S}$ if $\|F_1^{*} E_0\|$ is “small”, since $E_0\alpha - F_0 F_0^{*} E_0 \alpha = F_1 F_1^{*} E_0 \alpha$.
The condition we will need is a separation between the eigenvalues corresponding to $S$ and those corresponding to $\hat{S}^{\perp}$. Suppose the eigenvalues corresponding to $S$ (the diagonal entries of $\Lambda_0$) are all contained in an interval $[a, b]$. Then we will require that the eigenvalues corresponding to $\hat{S}^{\perp}$ (the diagonal entries of $\hat{\Lambda}_1$) be excluded from the interval $(a - \delta, b + \delta)$ for some $\delta > 0$. To see why such a separation is necessary, consider the following example.
Here, the “size” of $H$ is of order $\delta$, yet the eigenvectors move by a large angle. Take

$$\Sigma = \begin{pmatrix} 1+\delta & 0 \\ 0 & 1-\delta \end{pmatrix}, \qquad \hat{\Sigma} = \begin{pmatrix} 1 & \delta \\ \delta & 1 \end{pmatrix}, \qquad H = \hat{\Sigma} - \Sigma = \begin{pmatrix} -\delta & \delta \\ \delta & \delta \end{pmatrix}.$$

The matrix $\Sigma$ has eigenvalues $\lambda_1 = 1 + \delta$ and $\lambda_2 = 1 - \delta$ with eigenvectors $v_1 = (1, 0)$ and $v_2 = (0, 1)$, while $\hat{\Sigma}$ has the same eigenvalues $\hat{\lambda}_1 = 1 + \delta$ and $\hat{\lambda}_2 = 1 - \delta$ but eigenvectors $\hat v_1 = (1/\sqrt 2, 1/\sqrt 2)$ and $\hat v_2 = (-1/\sqrt 2, 1/\sqrt 2)$. Hence $|\hat v_2^{*} v_1| = 1/ \sqrt 2$: the eigenvectors of $\Sigma$ and $\hat{\Sigma}$ are at a $45°$ angle, and this misalignment is arbitrarily large relative to $\delta$, since $\|H\|_2 = \sqrt{2}\,\delta$ can be made as small as we like.
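A quick numerical check of this example (illustrative only; $\delta = 0.01$ is an arbitrary choice):

```python
import numpy as np

delta = 0.01
Sigma = np.diag([1 + delta, 1 - delta])
Sigma_hat = np.array([[1.0, delta],
                      [delta, 1.0]])
H = Sigma_hat - Sigma

# The perturbation is tiny...
print("||H||_op =", np.linalg.norm(H, 2))            # ~ sqrt(2) * delta

# ...but the leading eigenvectors are 45 degrees apart.
v1 = np.linalg.eigh(Sigma)[1][:, -1]                 # top eigenvector of Sigma
v1_hat = np.linalg.eigh(Sigma_hat)[1][:, -1]         # top eigenvector of Sigma_hat
print("|<v1_hat, v1>| =", abs(v1_hat @ v1))          # ~ 1/sqrt(2), not ~ 1
```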
2. Distances between subspaces
Let $\mathcal E$ and $\mathcal F$ be $r$-dimensional subspaces of $\mathbb{R}^d$, with $r \leq d$. Let $\Pi_{\mathcal E}$ and $\Pi_{\mathcal F}$ be the orthogonal projectors onto these two subspaces.

We first consider the angle between two vectors, that is, the case $r = 1$. Then $\mathcal E$ and $\mathcal F$ are each spanned by a single unit vector; denote these two vectors by $e$ and $f$. The angle between $\mathcal E$ and $\mathcal F$ is defined by:

$$\cos\theta := |e^{*} f|, \qquad \theta \in [0, \pi/2].$$
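For instance (a small illustrative snippet, with arbitrarily chosen vectors):

```python
import numpy as np

e = np.array([1.0, 0.0])
f = np.array([1.0, 1.0]) / np.sqrt(2)

theta = np.arccos(abs(e @ f))          # angle between span(e) and span(f)
print(np.degrees(theta))               # 45.0
```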
Now, we need to extend this concept to subspaces (when $r > 1$). Let $E$ and $F$ be $d\times r$ matrices with orthonormal columns such that $\mathrm{ran}(E) = \mathcal E$ and $\mathrm{ran}(F) = \mathcal F$, so that $\Pi_{\mathcal E} = E E^{*}$ and $\Pi_{\mathcal F} = F F^{*}$.
Now, we define the angles between the subspaces $\mathcal E$ and $\mathcal F$.

Definition
The canonical (or principal) angles between $\mathcal E$ and $\mathcal F$ are

$$\theta_i := \arccos(\sigma_i), \qquad i = 1, \dots, r,$$

where $\sigma_1, \dots , \sigma_r$ are the singular values of $E^{*} F$ (equivalently, of $F^{*} E$).

A general result known as the CS decomposition in linear algebra gives the following: when $d \geq 2r$, one can choose orthonormal bases of $\mathcal E$, $\mathcal E^{\perp}$ and $\mathcal F$ such that

$$\begin{pmatrix} E^{*} \\ E_{\perp}^{*} \end{pmatrix} F = \begin{pmatrix} \cos\Theta \\ \sin\Theta \\ 0 \end{pmatrix},$$

where $E_{\perp}$ is a $d \times (d-r)$ matrix whose orthonormal columns span $\mathcal E^{\perp}$, $\Theta := \mathrm{diag}(\theta_1, \dots, \theta_r)$, and $\cos\Theta$, $\sin\Theta$ are applied entrywise on the diagonal. In particular, the singular values of $(I - \Pi_{\mathcal E})F$ are $\sin\theta_1, \dots, \sin\theta_r$.
Another way of defining the canonical angles is the following:

Definition
The canonical angles between the spaces $\mathcal E$ and $\mathcal F$ are

$$\theta_i := \arcsin(\mu_i) \qquad \text{for } i = 1, \dots, r,$$

where $\mu_1, \dots, \mu_r$ are the singular values of $(I - \Pi_{\mathcal E})F$.
Now, given the definition of the canonical angles, we can define the distance between the subspaces $\mathcal E$ and $\mathcal F$ as follows.

Definition
The distance between $\mathcal E$ and $\mathcal F$ is

$$d(\mathcal E, \mathcal F) := \left\|\sin\Theta(\mathcal E, \mathcal F)\right\|_{op} = \max_{1 \leq i \leq r} \sin\theta_i,$$

which is a metric over the space of $r$-dimensional linear subspaces of $\mathbb{R}^d$. Equivalently,

$$d(\mathcal E, \mathcal F) = \left\|\Pi_{\mathcal E} - \Pi_{\mathcal F}\right\|_{op} = \left\|(I - \Pi_{\mathcal F})\,\Pi_{\mathcal E}\right\|_{op}.$$
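Below is a small NumPy sketch (illustrative; the subspaces are random) that computes the canonical angles both ways and checks the stated equivalences. SciPy also provides `scipy.linalg.subspace_angles` for the same purpose.

```python
import numpy as np

rng = np.random.default_rng(2)
d, r = 8, 3

# Random r-dimensional subspaces of R^d, represented by orthonormal bases.
E, _ = np.linalg.qr(rng.standard_normal((d, r)))
F, _ = np.linalg.qr(rng.standard_normal((d, r)))

# First definition: theta_i = arccos of the singular values of E^* F.
cos_theta = np.linalg.svd(E.T @ F, compute_uv=False)
theta = np.arccos(np.clip(cos_theta, 0.0, 1.0))

# Second definition: theta_i = arcsin of the singular values of (I - EE^*) F.
sin_theta = np.linalg.svd((np.eye(d) - E @ E.T) @ F, compute_uv=False)
print(np.allclose(np.sort(np.sin(theta)), np.sort(sin_theta)))          # True

# Distance between the subspaces and its equivalent formulations.
dist = np.max(np.sin(theta))
Pi_E, Pi_F = E @ E.T, F @ F.T
print(np.isclose(dist, np.linalg.norm(Pi_E - Pi_F, 2)))                 # True
print(np.isclose(dist, np.linalg.norm((np.eye(d) - Pi_F) @ Pi_E, 2)))   # True
```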
3. Davis-Kahan Theorem
Theorem
Let $\Sigma$ and $\hat{\Sigma} = \Sigma + H$ be symmetric matrices with

$$\Sigma = E_0 \Lambda_0 E_0^{*} + E_1 \Lambda_1 E_1^{*} \qquad \text{and} \qquad \hat{\Sigma} = F_0 \hat{\Lambda}_0 F_0^{*} + F_1 \hat{\Lambda}_1 F_1^{*},$$

where $[E_0 \; E_1]$ and $[F_0 \; F_1]$ are orthogonal matrices. If the eigenvalues of $\Lambda_0$ are contained in an interval $[a, b]$, and the eigenvalues of $\hat{\Lambda}_1$ are excluded from the interval $(a - \delta, b + \delta)$ for some $\delta > 0$, then

$$\|F_1^{*} E_0\| \leq \frac{\|F_1^{*} H E_0\|}{\delta} \leq \frac{\|H\|}{\delta}$$

for any unitarily invariant norm $\|\cdot\|$. Note that the singular values of $F_1^{*} E_0$ are exactly the sines of the canonical angles between $S$ and $\hat{S}$, so the left-hand side equals $\|\sin\Theta(S, \hat{S})\|$.
Proof.
Since $\Sigma E_0 = E_0 \Lambda_0$, we have

$$F_1^{*} \Sigma E_0 = F_1^{*} E_0 \Lambda_0.$$

Furthermore, $F_1^{*} (\Sigma+ H) = F_1^{*} \hat{\Sigma} = \hat{\Lambda}_1 F_1^{*}$, so that

$$F_1^{*} H E_0 = F_1^{*} \hat{\Sigma} E_0 - F_1^{*} \Sigma E_0 = \hat{\Lambda}_1 F_1^{*} E_0 - F_1^{*} E_0 \Lambda_0.$$

Let $c := \frac{a+b}{2}$ and $r := \frac{b-a}{2} \geq 0$. Then

$$F_1^{*} H E_0 = (\hat{\Lambda}_1 - cI)\, F_1^{*} E_0 - F_1^{*} E_0 \,(\Lambda_0 - cI).$$

Here we have used a centering trick so that $\Lambda_0 - cI$ has its eigenvalues in $[-r, r]$ while the eigenvalues of $\hat{\Lambda}_1 - cI$ lie outside $(-r -\delta, r + \delta)$. Consequently, $\|\Lambda_0-cI\|_2\le r$ and $\|\left(\hat{\Lambda}_1-cI\right)^{-1} \|_2 \leq \left( r+\delta \right)^{-1}$, and therefore

$$\|F_1^{*} H E_0\| \geq \|(\hat{\Lambda}_1 - cI)\, F_1^{*} E_0\| - \|F_1^{*} E_0 \,(\Lambda_0 - cI)\| \geq (r+\delta)\,\|F_1^{*} E_0\| - r\,\|F_1^{*} E_0\|.$$

We conclude that

$$\|F_1^{*}H E_0\| \geq \delta \|F_1^{*} E_0\|. \qquad \square$$
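As an illustration (not part of the proof), the bound can be checked numerically. In the sketch below, $S$ is taken to be the top eigenspace of a $\Sigma$ with a well separated spectrum, $[a, b]$ encloses the eigenvalues of $\Lambda_0$, and $\delta$ is computed from the eigenvalues of $\hat{\Lambda}_1$; all numerical choices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
d, k = 10, 3                                         # ambient dimension, dim(S)

# A symmetric Sigma whose top-k eigenvalues are well separated from the rest.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
lam = np.concatenate([np.linspace(0.0, 1.0, d - k), [5.0, 5.5, 6.0]])   # ascending
Sigma = Q @ np.diag(lam) @ Q.T

H = 0.1 * rng.standard_normal((d, d)); H = (H + H.T) / 2   # small symmetric perturbation
Sigma_hat = Sigma + H
lam_hat, F = np.linalg.eigh(Sigma_hat)               # ascending eigenvalues

E0 = Q[:, -k:]                                       # orthonormal basis of S (top-k eigenspace of Sigma)
F1 = F[:, :-k]                                       # orthonormal basis of S_hat^perp
a, b = lam[-k], lam[-1]                              # [a, b] contains the eigenvalues of Lambda_0

# delta: distance from the eigenvalues of hat(Lambda)_1 to the interval [a, b].
mu = lam_hat[:-k]
delta = np.min(np.maximum.reduce([a - mu, mu - b, np.zeros(d - k)]))

lhs = np.linalg.norm(F1.T @ E0, 2)                   # ||F_1^* E_0||_op = sin(largest canonical angle)
rhs = np.linalg.norm(H, 2) / delta                   # ||H||_op / delta
print(delta > 0, lhs <= rhs, lhs, rhs)
```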
Another version of the Davis-Kahan Theorem, which is more popular among statisticians, is the following.
Theorem
Let $\Sigma$ and $\hat{\Sigma} = \Sigma + H$ be $d \times d$ symmetric matrices with respective eigenvalues $\lambda_1 \leq \dots \leq \lambda_d$ and $\hat{\lambda}_1 \leq \dots \leq \hat{\lambda}_d$.

Fix $1 \leq r \leq s \leq d$, and let $E$ and $F$ be the $d \times (s-r+1)$ matrices with orthonormal columns given by eigenvectors of $\Sigma$ and $\hat{\Sigma}$ corresponding to the eigenvalues $\lambda_r, \dots, \lambda_s$ and $\hat{\lambda}_r, \dots, \hat{\lambda}_s$ respectively.

Let $\mathcal E$ and $\mathcal F$ be the subspaces spanned by the columns of $E$ and $F$. Define the eigengap as

$$\delta := \min\left( \lambda_r - \hat{\lambda}_{r-1},\ \hat{\lambda}_{s+1} - \lambda_s \right),$$

where we define $\hat{\lambda}_0 = -\infty$ and $\hat{\lambda}_{d+1} = \infty$. If $\delta > 0$, then

$$\left\|\sin\Theta(\mathcal E, \mathcal F)\right\|_F \leq \frac{\|H\|_F}{\delta}.$$

The result also holds for the operator norm $\|\cdot\|_{op}$.
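A numerical sanity check of this version (illustrative; the spectrum of $\Sigma$, the index range $r, s$ and the noise level are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
d, r, s = 12, 9, 11                                  # target eigenvalues lambda_r, ..., lambda_s (1-indexed)

# Sigma with eigenvalues 1, 2, ..., d (consecutive gaps of size 1).
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
Sigma = Q @ np.diag(np.arange(1.0, d + 1)) @ Q.T
H = 0.05 * rng.standard_normal((d, d)); H = (H + H.T) / 2
Sigma_hat = Sigma + H

lam, U = np.linalg.eigh(Sigma)                       # ascending eigenvalues
lam_hat, U_hat = np.linalg.eigh(Sigma_hat)

E = U[:, r - 1:s]                                    # eigenvectors for lambda_r..lambda_s
F = U_hat[:, r - 1:s]                                # eigenvectors for hat(lambda)_r..hat(lambda)_s

lam_hat_ext = np.concatenate([[-np.inf], lam_hat, [np.inf]])   # hat(lambda)_0 = -inf, hat(lambda)_{d+1} = +inf
delta = min(lam[r - 1] - lam_hat_ext[r - 1], lam_hat_ext[s + 1] - lam[s - 1])

sin_theta = np.linalg.svd((np.eye(d) - E @ E.T) @ F, compute_uv=False)
lhs = np.linalg.norm(sin_theta)                      # ||sin Theta||_F
rhs = np.linalg.norm(H, 'fro') / delta
print(delta > 0, lhs <= rhs, lhs, rhs)
```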
[Figure: sintheta.png — illustration of the $\sin\Theta$ distance and the eigengap $\delta$.]
By Weyl’s theorem, $|\hat{\lambda}_j - \lambda_j| \leq \|H\|_{op}$ for every $j$, so one can show that a sufficient (but not necessary) condition for $\delta > 0$ in the Davis-Kahan theorem is

$$\min\left( \lambda_r - \lambda_{r-1},\ \lambda_{s+1} - \lambda_s \right) > \|H\|_{op},$$

with the convention $\lambda_0 = -\infty$ and $\lambda_{d+1} = +\infty$; indeed, in that case $\delta \geq \min\left( \lambda_r - \lambda_{r-1},\ \lambda_{s+1} - \lambda_s \right) - \|H\|_{op} > 0$.
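A quick check of Weyl’s inequality itself (illustrative, with an arbitrary random matrix and perturbation):

```python
import numpy as np

rng = np.random.default_rng(5)
d = 15
A = rng.standard_normal((d, d)); Sigma = (A + A.T) / 2
H = rng.standard_normal((d, d)); H = (H + H.T) / 2

lam = np.linalg.eigvalsh(Sigma)                      # ascending eigenvalues
lam_hat = np.linalg.eigvalsh(Sigma + H)

# Weyl: each eigenvalue moves by at most ||H||_op.
print(np.max(np.abs(lam_hat - lam)) <= np.linalg.norm(H, 2) + 1e-12)   # True
```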
When the matrices are not symmetric, there exists a generalized version of the Davis-Kahan Theorem called Wedin’s theorem.