I congratulate the authors for such a brilliant paper. The data-driven projection approach also discussed in previous papers (e.g. Aston and Kirch (2014)) turns out to be very useful in tackling offline high-dimensional changepoint estimation. I would like to suggest that the result for single changepoint estimation can be extended to sub-gaussian (see Vershynin (2012)) noise structure. Only the tail bound for the CUSUM transformation of noise requires to be relaxed as the Ornstein-Uhlenbeck process can not be used. For completeness, I define the notion of sub-subgaussianity first, then I list the corresponding modified results for sub-gaussian noise:

**Definition.** A real-valued random variable is said to have a *sub-gaussian* distribution with parameter , denoted if for all it holds that

**Lemma 4.**(online supplement) Let have independent components with for all . Then for , we have

This result comes directly from the tail bound of a sub-gaussian distribution and a union bound.

**Proposition 1.**(main paper) Suppose that satisfies (15) for either or . Let be the leading singular vector of . If we choose , then

This result comes from the following modification to eqn (20) in the paper:

**Theorem 1.**(main paper) Suppose is known. Let be the output of Algorithm 3 with input and . There exist universal constants such that if is even, is even and

then

The proof is very similar to the one in the paper, while we can only choose to be and use Lemma 4, Lemma 5 and Proposition 1 to bound the probability of again.

I would also like to remark that the method presented in the paper, though it is offline, is reasonably efficient even in online setting. However there is still very much to be explored in online changepoint detection. I notice that Soh and Chandrasekaran (2017) presented a method using convex optimisation and filtering. The method can be applied in online setting, however the assumption that the signal is sparse is weaker.

- Aston, J. A. D. and Kirch, C. (2014) Efficiency of change point tests in high dimensional settings.
*arXiv preprint*, arxiv:1409.1771v2. - Soh, Y. S. and Chandrasekaran, V. (2017) High-dimensional change-point estimation: combining filtering with convex optimization.
*Appl. Comput. Harmon. Anal.*,**43**, 122-147. - Vershynin, R. (2012) Introduction to the non-asymptotic analysis of random matrices.
*Compressed sensing*, 210-268, Cambridge Univ. Press, Cambridge.