# High-dimensional changepoint estimation via sparse projection [comments by Yudong Chen]

I congratulate the authors for such a brilliant paper. The data-driven projection approach also discussed in previous papers (e.g. Aston and Kirch (2014)) turns out to be very useful in tackling offline high-dimensional changepoint estimation. I would like to suggest that the result for single changepoint estimation can be extended to sub-gaussian (see Vershynin (2012)) noise structure. Only the tail bound for the CUSUM transformation of noise requires to be relaxed as the Ornstein-Uhlenbeck process can not be used. For completeness, I define the notion of sub-subgaussianity first, then I list the corresponding modified results for sub-gaussian noise:
Definition. A real-valued random variable ${X}$ is said to have a sub-gaussian distribution with parameter ${\sigma^2>0}$, denoted ${X\in \textbf{sG}(\sigma^2)}$ if for all ${t \in \mathbb{R}}$ it holds that

$\displaystyle \mathbb{E}\left[\mathrm{e}^{tX}\right]\le \mathrm{e}^{\frac{t^2 \sigma^2}{2}} .$

Lemma 4.(online supplement) Let ${W=(W_1,...,W_n)}$ have independent components with ${W_i \in \textbf{sG}(\sigma^2)}$ for all ${i}$. Then for ${u>0}$, we have

$\displaystyle \mathbb{P}\left( \lVert E \rVert_\infty \ge u\sigma \right) \le 2(n-1)\mathrm{e}^{-\frac{u^2}{2}}.$

This result comes directly from the tail bound of a sub-gaussian distribution and a union bound.
Proposition 1.(main paper) Suppose that ${\hat{M}}$ satisfies (15) for either ${\mathcal{S}=\mathcal{S}_1}$ or ${\mathcal{S}=\mathcal{S}_2}$. Let ${\hat{v}\in \mathrm{argmax}_{\tilde{v}\in\mathbb{S}^{p-1}} \lVert M^T\tilde{v} \rVert_2}$ be the leading singular vector of ${\hat{M}}$. If we choose ${\lambda \ge 2\sigma \sqrt{\log{(pn)}}}$, then

$\displaystyle \sup_{P \in \mathcal{P}(n,p,k,1,\vartheta, \tau, \sigma^2)} \mathbb{P}_P\left(\sin \angle(\hat{v},v) > \frac{32\lambda\sqrt{k}}{\tau\vartheta\sqrt{n}}\right) \le \frac{2}{pn}.$

This result comes from the following modification to eqn (20) in the paper:

$\displaystyle \mathbb{P}\left( \lVert E \rVert_\infty \ge 2\sigma \sqrt{\log{(pn)}} \right) \le 2p(n-1)(pn)^{-2} \le \frac{2}{pn}.$

Theorem 1.(main paper) Suppose ${\sigma^2>0}$ is known. Let ${\hat{z}}$ be the output of Algorithm 3 with input ${X \sim P \in \mathcal{P}(n,p,k,1,\vartheta, \tau, \sigma^2)}$ and ${\lambda := 2\sigma\sqrt{\log{(pn)}}}$. There exist universal constants ${C,C'>0}$ such that if ${n \ge 10}$ is even, ${z}$ is even and

$\displaystyle \frac{C\sigma}{\vartheta \tau} \sqrt{\frac{k\log{(pn)}}{n}}\le 1,$

then

$\displaystyle \mathbb{P}_P\left( \frac{1}{n}|\hat{z}-z|\le \frac{C'\sigma^2 \log n}{n\vartheta^2} \right) \le 1-\frac{4}{pn}-\frac{4}{n}-\frac{32\log{(n/2)}}{n}.$

The proof is very similar to the one in the paper, while we can only choose ${\lambda_1}$ to be ${2\sigma \sqrt{\log n}}$ and use Lemma 4, Lemma 5 and Proposition 1 to bound the probability of ${\Omega_1^c \cup \Omega_2^c}$ again.
I would also like to remark that the method presented in the paper, though it is offline, is reasonably efficient even in online setting. However there is still very much to be explored in online changepoint detection. I notice that Soh and Chandrasekaran (2017) presented a method using convex optimisation and filtering. The method can be applied in online setting, however the assumption that the signal is sparse is weaker.

1.  Aston, J. A. D. and Kirch, C. (2014) Efficiency of change point tests in high dimensional settings. arXiv preprint, arxiv:1409.1771v2.
2. Soh, Y. S. and Chandrasekaran, V. (2017) High-dimensional change-point estimation: combining filtering with convex optimization. Appl. Comput. Harmon. Anal., 43, 122-147.
3. Vershynin, R. (2012) Introduction to the non-asymptotic analysis of random matrices. Compressed sensing, 210-268, Cambridge Univ. Press, Cambridge.

## Author: xi'an

I am a professor of Statistics at Université Paris Dauphine, France, and University of Warwick, United Kingdom, with a definitely unhealthy (but so far not fatal) fascination for mountains and (easy) climbing, in particular for Scotland in Winter, an almost-daily run, and a reading list mainly centred at fantasy books… Plus a blog that often seems to take most of my time. Not that anyone forces me to edit it...!