By changing the definition of the winning unit, Kohonen's original learning rule can be viewed as performing stochastic gradient descent on an energy function. This is shown in two ways: by explicitly computing derivatives, and as a limiting case of a "soft" version of self-organizing maps with probabilistic winner assignments. Kinks in a one-dimensional map and twists in a two-dimensional map correspond to local minima in the energy landscape of the network weights. Changing the determination of the winning unit has no effect on the basic properties of the Kohonen learning algorithm, which remains a relatively simple procedure with remarkable self-organizing capabilities. At a more theoretical level, many results concerning the original Kohonen learning algorithm are not particularly surprising from a stochastic-approximation or optimization point of view, yet they are often difficult to prove for technical reasons.
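The idea of redefining the winner can be sketched in code. In this minimal sketch (an illustration, not the authors' implementation) the winner of a 1-D map is not the unit closest to the input but the unit i* that minimizes the neighborhood-weighted sum of squared distances, E(x, i) = ½ Σⱼ h(i, j) ‖x − wⱼ‖²; with that choice, the usual update wⱼ ← wⱼ + η h(i*, j)(x − wⱼ) is a stochastic gradient step on E. All names (`h`, `energy`, `step`) and the Gaussian neighborhood are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, dim = 10, 2        # 1-D map of 10 units, 2-D inputs (assumed sizes)
w = rng.random((n_units, dim))   # weight vectors, one row per unit
idx = np.arange(n_units)
sigma, eta = 1.5, 0.1       # neighborhood width and learning rate (assumed)

def h(i):
    """Gaussian neighborhood function centered on map position i."""
    return np.exp(-((idx - i) ** 2) / (2.0 * sigma ** 2))

def energy(x, i):
    """Energy contribution of input x if unit i is declared the winner."""
    return 0.5 * np.sum(h(i) * np.sum((x - w) ** 2, axis=1))

def step(x):
    """One stochastic gradient step with the modified winner definition."""
    global w
    # Modified winner: minimize the neighborhood-weighted squared error,
    # not just the single-unit distance as in Kohonen's original rule.
    i_star = min(range(n_units), key=lambda i: energy(x, i))
    # Standard Kohonen update; for this winner it is exactly -eta * dE/dw.
    w += eta * h(i_star)[:, None] * (x - w)
    return i_star

for _ in range(200):
    step(rng.random(dim))
```

Because the update is a gradient step on E for the chosen winner, each step with a small learning rate lowers the energy of the presented sample, which is what allows kinks and twists to be interpreted as local minima of this energy.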