A new sampling-based learning method for neural networks is proposed. Derived from an integral representation of neural networks, a probability distribution of hidden parameters, called the oracle distribution, is introduced. Because rigorous sampling from the oracle distribution is in general numerically difficult, a linear-time sampling algorithm is also developed. Numerical experiments showed that when hidden parameters were initialized by sampling from the oracle distribution, the subsequent backpropagation converged faster and to better parameters than when they were initialized by a normal distribution.
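To make the experimental comparison concrete, the following is a minimal sketch of the protocol: initialize a one-hidden-layer network's hidden parameters either from a plain normal distribution or from a data-dependent sampler, then fit the output layer and compare errors. The `init_data_driven` function here is a hypothetical stand-in built from pairs of training points; the paper's actual oracle sampler is derived from the integral representation and is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (stand-in for the paper's benchmarks).
X = rng.normal(size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * np.cos(X[:, 1])

def init_normal(n_hidden, dim, rng):
    """Baseline: hidden weights and biases drawn from a normal distribution."""
    return rng.normal(size=(n_hidden, dim)), rng.normal(size=n_hidden)

def init_data_driven(n_hidden, X, rng):
    """Hypothetical data-dependent sampler: each hidden unit's hyperplane
    is built from a pair of distinct training points, so the units are
    adapted to the data geometry. This is only an illustration, not the
    oracle distribution of the paper."""
    n = len(X)
    i = rng.integers(0, n, size=n_hidden)
    # Offset guarantees j != i, so the difference vector is nonzero a.s.
    j = (i + 1 + rng.integers(0, n - 1, size=n_hidden)) % n
    d = X[i] - X[j]
    W = d / (np.linalg.norm(d, axis=1, keepdims=True) ** 2)
    b = -np.einsum('nd,nd->n', W, X[j])  # place the hinge at X[j]
    return W, b

def fit_output_layer(W, b, X, y):
    """Freeze hidden parameters, solve the output weights by least
    squares after a ReLU hidden layer, and return the training MSE
    (a cheap proxy for comparing initializations before backprop)."""
    H = np.maximum(W @ X.T + b[:, None], 0.0)   # hidden activations
    c, *_ = np.linalg.lstsq(H.T, y, rcond=None)  # output weights
    return float(np.mean((H.T @ c - y) ** 2))

mse_normal = fit_output_layer(*init_normal(50, 2, rng), X, y)
mse_data = fit_output_layer(*init_data_driven(50, X, rng), X, y)
print(mse_normal, mse_data)
```

In the paper's experiments the comparison is between full backpropagation runs started from each initialization; the least-squares fit above only illustrates how the two initializations plug into the same training pipeline.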