
The problem of tail approximation using CLT

Reference textbook: High-Dimensional Statistics: A Non-Asymptotic Viewpoint, by Martin J. Wainwright.

Tail bounds for a standard normal random variable $g \sim N(0, 1)$, valid for all $t > 0$:

$$
\left( \frac{1}{t} - \frac{1}{t^{3}} \right) \cdot \frac{1}{\sqrt{2\pi}} e^{-t^{2}/2} \leq P\{ g \geq t \} \leq \frac{1}{t} \cdot \frac{1}{\sqrt{2\pi}} e^{-t^{2}/2}
$$

$$
P\{ g \geq t \} \leq \frac{1}{\sqrt{2\pi}} e^{-t^{2}/2} \quad \text{when } t \geq 1
$$
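As a quick numerical sanity check, the two-sided bound can be compared against the exact Gaussian tail. This is a minimal sketch using only the Python standard library; the function names `gaussian_tail`, `density`, and `tail_bounds` are illustrative and do not come from the textbook.

```python
import math


def gaussian_tail(t: float) -> float:
    """Exact P{g >= t} for g ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def density(t: float) -> float:
    """Standard normal density (1 / sqrt(2*pi)) * exp(-t^2 / 2)."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)


def tail_bounds(t: float) -> tuple[float, float]:
    """Lower and upper bounds on P{g >= t} from the inequality above (t > 0)."""
    phi = density(t)
    lower = (1.0 / t - 1.0 / t**3) * phi
    upper = phi / t
    return lower, upper


for t in (0.5, 1.0, 2.0, 4.0):
    lower, upper = tail_bounds(t)
    print(f"t={t}: {lower:.3e} <= {gaussian_tail(t):.3e} <= {upper:.3e}")
```

For $t < 1$ the lower bound is negative and therefore uninformative. The two bounds get closer together as $t$ grows.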

Tail Approximation for $S_N \sim B(N, 1/2)$

By Chebyshev's inequality

$$
P\left\{ S_{N} \geq \frac{3}{4} N \right\} \leq P\left\{ \left| S_{N} - \frac{N}{2} \right| \geq \frac{N}{4} \right\} \leq \frac{4}{N}
$$
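The last inequality can be spelled out as follows, using only that $S_N$ is a sum of $N$ independent Bernoulli(1/2) variables, so that $\mathbb{E} S_N = N/2$:

$$
\begin{aligned}
\operatorname{Var}(S_N) &= N \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{N}{4}, \\
P\left\{ \left| S_N - \frac{N}{2} \right| \geq \frac{N}{4} \right\} &\leq \frac{\operatorname{Var}(S_N)}{(N/4)^{2}} = \frac{N/4}{N^{2}/16} = \frac{4}{N}.
\end{aligned}
$$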

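By the CLT (a heuristic step, using the Gaussian tail bound with $t = \sqrt{N/4} \geq 1$, i.e. $N \geq 4$)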

$$
\mathbb{P}\left\{ S_{N} \geq \frac{3}{4} N \right\} = \mathbb{P}\left\{ Z_{N} = \frac{S_{N} - N/2}{\sqrt{N/4}} \geq \sqrt{N/4} \right\} \approx \mathbb{P}\{ g \geq \sqrt{N/4} \} \leq \frac{1}{\sqrt{2\pi}} e^{-N/8}
$$
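The threshold $\sqrt{N/4}$ and the exponent $N/8$ come from standardizing $S_N$ with mean $N/2$ and variance $N/4$. As a short worked step:

$$
\begin{aligned}
S_N \geq \frac{3}{4} N
&\iff S_N - \frac{N}{2} \geq \frac{N}{4}
\iff Z_N \geq \frac{N/4}{\sqrt{N/4}} = \sqrt{N/4}, \\
t = \sqrt{N/4}
&\implies e^{-t^{2}/2} = e^{-N/8}.
\end{aligned}
$$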

Berry-Esseen CLT, with $\rho = \mathbb{E}\left| X_{1} - \mu \right|^{3} / \sigma^{3}$:

$$
\left| \mathbb{P}\left\{ Z_{N} \geq t \right\} - \mathbb{P}\{ g \geq t \} \right| \leq \frac{\rho}{\sqrt{N}}
$$

Thus the CLT approximation error is only guaranteed to be of order $1/\sqrt{N}$. This swamps the exponentially small Gaussian tail $e^{-N/8}$ and ruins the desired exponential decay.
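To get a feel for the gap, the quantities above can be tabulated for a few values of $N$. The sketch below is illustrative only: the function names are made up for this example, and it assumes the summands are Bernoulli(1/2), for which $\mu = 1/2$, $\sigma = 1/2$, and $\mathbb{E}|X_1 - \mu|^3 = 1/8$, so $\rho = 1$.

```python
import math


def exact_tail(n: int) -> float:
    """Exact P{S_N >= 3N/4} for S_N ~ B(N, 1/2), by summing binomial coefficients."""
    k_min = (3 * n + 3) // 4  # integer ceiling of 3N/4
    favourable = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return favourable / 2**n


def chebyshev_bound(n: int) -> float:
    """Chebyshev bound 4/N on the same tail."""
    return 4.0 / n


def clt_heuristic(n: int) -> float:
    """Gaussian tail bound (1/sqrt(2*pi)) * exp(-N/8) suggested by the CLT."""
    return math.exp(-n / 8.0) / math.sqrt(2.0 * math.pi)


def berry_esseen_error(n: int, rho: float = 1.0) -> float:
    """Berry-Esseen error rho/sqrt(N); rho = 1 is an assumption valid for Bernoulli(1/2)."""
    return rho / math.sqrt(n)


print(f"{'N':>5} {'exact':>12} {'Chebyshev':>12} {'CLT bound':>12} {'B-E error':>12}")
for n in (16, 64, 100, 400):
    print(
        f"{n:>5} {exact_tail(n):>12.3e} {chebyshev_bound(n):>12.3e} "
        f"{clt_heuristic(n):>12.3e} {berry_esseen_error(n):>12.3e}"
    )
```

In the printed table the Berry-Esseen error term is larger than the Gaussian tail bound for every $N$ shown. Adding that error to the Gaussian tail therefore yields only a polynomially decaying guarantee.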