Speech enhancement improves the quality and intelligibility of speech in applications such as speech recognition, hearing aids, and personal assistant devices. Because acoustic environments vary, online enhancement is essential for practical use: the system must observe the environment and adapt its enhancement accordingly. Adaptive filters have previously been used for online enhancement, but neural-network-based online enhancement has not previously been proposed. In this paper, we employ a unique architecture based on Long Short-Term Memory (LSTM) networks to enhance single-channel speech online. The LSTM network is trained online in a novel way, by minimizing Stein’s unbiased risk estimate (SURE). This retraining method lets the network learn denoising without a clean sample or ground truth. To avoid retraining on every sample, we use policy iteration with a reward function based on ITU-T P.563, a widely used single-ended perceptual measure. LSTM retraining increases the PESQ of the enhanced speech by 0.53 on average. The proposed method also improves intelligibility, raising STOI by 0.22.
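
To illustrate why minimizing Stein's unbiased risk estimate needs no clean reference, the following is a minimal sketch in Python, not the implementation used in this paper. It assumes additive white Gaussian noise with known standard deviation `sigma` and replaces the LSTM with a soft-thresholding denoiser, whose divergence has a closed form. The function names, the sparse test signal, and all parameter values are hypothetical.

```python
import random


def soft_threshold(y, t):
    """Shrink each sample toward zero by t (a simple stand-in for the denoiser)."""
    return [max(abs(v) - t, 0.0) * (1.0 if v > 0 else -1.0) for v in y]


def sure_soft_threshold(y, t, sigma):
    """Estimate the per-sample MSE of soft thresholding from noisy data only.

    Assumes y = x + n with n ~ N(0, sigma^2) i.i.d. (an assumption of this sketch).
    SURE = ||f(y) - y||^2 / N - sigma^2 + (2 sigma^2 / N) * div f(y)
    """
    n = len(y)
    f = soft_threshold(y, t)
    residual = sum((fi - yi) ** 2 for fi, yi in zip(f, y)) / n
    # d f_i / d y_i is 1 where |y_i| > t and 0 elsewhere, so the divergence is a count.
    divergence = sum(1 for v in y if abs(v) > t)
    return residual - sigma ** 2 + 2.0 * sigma ** 2 * divergence / n


def true_mse(x, y, t):
    """Reference MSE against the clean signal x (unavailable in the online setting)."""
    f = soft_threshold(y, t)
    return sum((fi - xi) ** 2 for fi, xi in zip(f, x)) / len(x)


def demo(n=20000, sigma=0.5, t=0.8, seed=0):
    """Compare SURE (no clean signal) with the true MSE on a synthetic sparse signal."""
    rng = random.Random(seed)
    x = [rng.choice([-3.0, 3.0]) if rng.random() < 0.1 else 0.0 for _ in range(n)]
    y = [xi + rng.gauss(0.0, sigma) for xi in x]
    return sure_soft_threshold(y, t, sigma), true_mse(x, y, t)


if __name__ == "__main__":
    sure, mse = demo()
    print(f"SURE estimate: {sure:.4f}  true MSE: {mse:.4f}")
```

In `demo`, the SURE value is computed from the noisy samples alone, yet it closely tracks the true mean squared error, which requires the clean signal. For a network such as an LSTM the divergence term has no closed form and would have to be approximated, for example by a Monte Carlo perturbation.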