This chapter provides an introduction to stochastic approximation theory, a branch of stochastic optimization now widely used in machine learning and deep learning applications. We show when and how to design a stochastic gradient, a stochastic pseudo-gradient, or a recursive zero-search stochastic algorithm. We prove the main convergence theorems as well as the corresponding rates of convergence (central limit theorem). The Ruppert & Polyak averaging procedure, which makes it possible to minimize the asymptotic variance of such procedures, is also analyzed. Various applications to finance are developed: computation of implied parameters (volatility, correlation, etc.), calibration, and the computation of value-at-risk and conditional value-at-risk (expected shortfall).
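To fix ideas before the formal development, here is a minimal Python sketch (not taken from the chapter) of the kind of recursion involved: a Robbins–Monro stochastic gradient for the value-at-risk, combined with Ruppert & Polyak averaging and a companion averaging recursion for the conditional value-at-risk. The function name, step sequence, and test distribution are illustrative assumptions, not the chapter's definitive scheme.

```python
import numpy as np

def var_cvar_sa(draw_loss, alpha=0.95, n_iter=200_000, seed=0):
    """Hypothetical sketch of a stochastic gradient recursion for the
    value-at-risk (the alpha-quantile), i.e. the zero of the derivative
    of V(xi) = xi + E[(X - xi)_+] / (1 - alpha), with Ruppert-Polyak
    averaging of the iterates and a companion average for the CVaR."""
    rng = np.random.default_rng(seed)
    xi, xi_bar, cvar = 0.0, 0.0, 0.0
    for n in range(1, n_iter + 1):
        x = draw_loss(rng)                 # one simulated loss X_{n+1}
        gamma = n ** -0.75                 # slowly decreasing step gamma_n
        # stochastic gradient step: H(xi, x) = 1 - 1_{x >= xi}/(1 - alpha)
        xi -= gamma * (1.0 - (x >= xi) / (1.0 - alpha))
        # Ruppert-Polyak average of the VaR iterates
        xi_bar += (xi - xi_bar) / n
        # running average of v(xi_n, X_{n+1}) = xi_n + (X - xi_n)_+/(1-alpha)
        cvar += (xi + max(x - xi, 0.0) / (1.0 - alpha) - cvar) / n
    return xi_bar, cvar

# Usage with standard normal losses:
# exact values are VaR_0.95 ~ 1.645 and CVaR_0.95 ~ 2.063.
v, c = var_cvar_sa(lambda rng: rng.standard_normal())
print(f"VaR ~ {v:.3f}, CVaR ~ {c:.3f}")
```

The averaged estimate `xi_bar` illustrates the point made above: running the basic recursion with a step decreasing more slowly than $1/n$ and then averaging recovers the optimal asymptotic variance without tuning the step to the (unknown) target.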