1 Introduction

The mass difference between proton and neutron had been puzzling for a long time. Ever since Heisenberg had introduced isospin symmetry to explain the near degeneracy of these two levels [1], it was taken for granted that the strong interaction is invariant under isospin rotations and that the mass difference is of electromagnetic origin. In this framework, it was difficult, however, to understand the experimental fact that the neutral particle is heavier than the charged one. A first step towards a resolution of the paradox was taken by Coleman and Glashow, who introduced the tadpole dominance hypothesis [2, 3], which associates the bulk of the electromagnetic self-energies with an octet operator. The origin of the tadpole remained mysterious, however. The puzzle was solved only in 1975, when it was realised that the strong interaction does not conserve isospin, because the masses of the up- and down-quarks strongly differ [4]. The crude estimates for the ratios of the three lightest quark masses obtained in that work, \(m_u/m_d\simeq 0.67\) and \(m_s/m_d\simeq 22.5\), have in the meantime been improved considerably. In particular, Weinberg [5] pointed out that in the chiral limit, the Dashen theorem provides an independent estimate of the quark mass ratios, as it determines the electromagnetic self-energies of the kaons in terms of those of the pions. Neglecting higher orders in the expansion in powers of \(m_u,m_d\) and \(m_s\), he obtained the estimate \(m_u/m_d\simeq 0.56\), \(m_s/m_d\simeq 20.1\). Also, the decay \(\eta \rightarrow 3\pi \) turned out to be a very sensitive probe of isospin breaking [6–10]. The quark mass ratios obtained from that source also confirmed the picture. According to the most recent edition of the FLAG review [11], the current lattice averages are \(m_u/m_d=0.46(3)\), \(m_s/m_d= 20.0(5)\).

1.1 Cottingham formula, dispersion relations

The analysis of [4] relies on the Cottingham formula [12], which invokes dispersion relations to relate the spin-averaged nucleon matrix elements of the time-ordered product, \(\langle p|Tj^\mu (x)j^\nu (y)|p\rangle \), to those of the commutator of the electromagnetic current, \(\langle p|[j^\mu (x),j^\nu (y)]|p\rangle \). Lorentz invariance and current conservation determine the Fourier transforms of these matrix elements in terms of two invariant amplitudes, which only depend on the two variables \(\nu =p\cdot q/m\) and \(q^2\), where \(m\) is the nucleon mass and \(q\) the photon momentum. We stick to the notation used in [4] and denote the invariant amplitudes by \(T_1(\nu ,q^2), T_2(\nu ,q^2)\) and \(V_1(\nu ,q^2), V_2(\nu ,q^2)\), respectively. Explicit formulae that specify the matrix elements \(\langle p|Tj^\mu (x)j^\nu (y)|p\rangle \) and \(\langle p|[j^\mu (x),j^\nu (y)]|p\rangle \) in terms of the invariant amplitudes are listed in Appendix A, where we also exhibit the relations between the structure functions \(V_1(\nu ,q^2)\), \(V_2(\nu ,q^2)\) and the cross sections \(\sigma _{T}\) and \(\sigma _{L}\) of electron scattering.

In the space-like region and for \(\nu \ge 0\), the structure functions represent the imaginary parts of the time-ordered amplitudes:

$$\begin{aligned}&\text {Im}\,T_1(\nu ,q^2)=\pi V_1(\nu ,q^2),\nonumber \\&\text {Im}\,T_2(\nu ,q^2)=\pi V_2(\nu ,q^2),\quad \nu \ge 0,\quad q^2\le 0.\end{aligned}$$
(1)

While the functions \(V_1(\nu ,q^2), V_2(\nu ,q^2)\) are odd under \(\nu \rightarrow -\nu \), the time-ordered amplitudes \(T_1(\nu ,q^2), T_2(\nu ,q^2)\) are even. In view of the contributions arising from Regge exchange, \(V_1(\nu ,q^2) \sim \nu ^\alpha \), \(V_2(\nu ,q^2)\sim \nu ^{\alpha -2}\), only \(T_2\) obeys an unsubtracted dispersion relation, while for \(T_1\) a subtraction is needed.Footnote 1 For \(q^2<0\), the dispersion relations thus take the form

$$\begin{aligned}&T_1(\nu ,q^2)= S_1(q^2)+2\nu ^2\int _0^\infty \frac{\mathrm{d}\nu '}{\nu '} \,\frac{V_1(\nu ',q^2)}{\nu '^2-\nu ^2-i\epsilon },\nonumber \\&T_2(\nu ,q^2)= 2\int _0^\infty \mathrm{d}\nu '\, \nu ' \,\frac{ V_2(\nu ',q^2)}{\nu '^2-\nu ^2-i\epsilon }. \end{aligned}$$
(2)

The formulae hold in the cut \(\nu \)-plane; the upper and lower half-planes are glued together along the interval \(|\nu |<Q^2/2m\) of the real axis (throughout, we use \(Q^2\equiv -q^2\) whenever this is convenient). As illustrated with the discussion in Appendix E, it is important that kinematic singularities, zeros, and constraints be avoided – throughout this paper, we work with the amplitudes defined in Appendix A, which are free of these [14–16].

We refer to \(S_1(q^2)\) as the subtraction function. It represents the value of the amplitude \(T_1(\nu ,q^2)\) at \(\nu =0\). For later use we introduce the analogous notation also for \(T_2(\nu ,q^2)\):

$$\begin{aligned} S_1(q^2)\equiv T_1(0,q^2),\quad S_2(q^2)\equiv T_2(0,q^2).\end{aligned}$$
(3)

1.2 Reggeons and fixed poles

In [4] it is assumed that the asymptotic behaviour is determined by Reggeon exchange. The contribution of a Regge pole to a scattering amplitude at large centre-of-mass energy squared s and small momentum transfer \(t\le 0\) has the form (see e.g. [17]):

$$\begin{aligned} T(s,t) = - \frac{\pi \beta _{\alpha }(t)}{\sin \pi \alpha (t) } \{\exp [-i\pi \alpha (t)]+\tau \}s^{\alpha (t)}, \end{aligned}$$
(4)

where \(\alpha (t)\) and \(\beta _\alpha (t)\) denote the trajectory and the residue, respectively, and \(\tau \) is the signature. In the context of the present paper, we are concerned with \(t=0\) and \(\tau =1\). The continuation of the asymptotic formula (4) to low energies is not unique. For definiteness, we work with the representation

$$\begin{aligned} T^\mathrm{R}_1(\nu ,q^2)= & {} -\sum _{\alpha >0} \frac{\pi \beta _{\alpha }(Q^2)}{\sin \pi \alpha }\nonumber \\&\times \,\{(s_0-s_+-i\epsilon )^{\alpha }+(s_0-s_--i\epsilon )^{\alpha }\}, \end{aligned}$$
(5)

where \(s_+\) and \(s_-\) stand for \(s_{\pm }=(p\pm q)^2=m^2\pm 2m\nu -Q^2\) and \(s_0\ge m^2\) is a constant. Equation (5) is manifestly symmetric under photon crossing. Unless the intercept \(\alpha \) is an integer,Footnote 2 the first term in the curly brackets contains a branch cut along the positive real axis, starting at \(2m\nu =s_0-m^2+Q^2\). The second is real there. One readily checks that, on the upper rim of this cut, the individual terms in the sum (5) differ from the asymptotic expression (4) only through contributions of \(O(s^{\alpha -1})\).

The basic assumption made in [4] is that, in the limit \(\nu \rightarrow \infty \) at fixed \(q^2\), only the Reggeons survive, so that the difference tends to zero:Footnote 3

$$\begin{aligned} T_1 (\nu ,q^2)- T^\mathrm{R}_1(\nu ,q^2)\rightarrow 0.\end{aligned}$$
(6)

We refer to this hypothesis as Reggeon dominance. A nonzero limiting value in (6) would represent a \(\nu \)-independent term. In Regge language, a term of this type would correspond to a fixed pole at angular momentum \(J=0\). The Reggeon dominance hypothesis (6) thus excludes the occurrence of such a fixed pole.

The presence or absence of a fixed pole at \(J=0\) in Compton scattering is a standard topic in Regge pole theory [18] and the literature contains several works advocating the presence of such a contribution. In particular, the universality conjecture formulated in [19] has received considerable attention (see e.g. [20] and the papers quoted therein).

Note, however, that these considerations go beyond the safe grounds provided by asymptotic freedom. While the short-distance properties of QCD ensure that, if both \(\nu \) and \(q^2\) are large, the behaviour of \(T_1(\nu ,q^2)\) and \(T_2(\nu ,q^2)\) is governed by the perturbative expansion in powers of the strong coupling constant, the behaviour in the Regge region, where only \(\nu \) becomes large, is not controlled by the short-distance properties of QCD. In particular, values of \(q^2\) of the order of \(\Lambda _\text {QCD}^2\) are outside the reach of perturbation theory, even if \(\nu \) is large.

The perturbative analysis shows that an infinite set of graphs needs to be summed up to understand the high-energy behaviour of the amplitudes in the Regge region. The dominating contributions can be represented in terms of poles and cuts in the angular momentum plane (Reggeon calculus, Reggeon field theory). The behaviour of the sum thus differs qualitatively from the one of the individual diagrams.

There is solid experimental evidence for the presence of Reggeons also in the data. Equation (6) amounts to the assumption that the asymptotic behaviour of the current correlation function can be understood in terms of these. In the analysis described in the present paper, this assumption plays a key role. In particular, as will be demonstrated explicitly below, it uniquely fixes the subtraction function relevant for the difference between proton and neutron in terms of the electron cross sections, so that the entire self-energy difference can be expressed in terms of these cross sections. In other words, the necessity of a subtraction in the fixed-\(q^2\) dispersion relation for \(T_1(\nu ,q^2)\) modifies the relation between the self-energy difference and the electron cross sections, but does not destroy it.Footnote 4

The subtraction functions occurring in the fixed-t dispersion relations relevant for real Compton scattering are analysed in [21, 22]. As shown there, the experimental information on the differential cross sections can be used to impose bounds on the subtraction functions. In particular, these bounds lead to the conclusion that the electric polarisability of the proton is necessarily larger than the magnetic one, in conformity with experiment. An update of this work with the data available today is highly desirable. Unfortunately, this approach to the problem cannot readily be extended to virtual Compton scattering, because data on the differential cross sections are available only for real photons.

1.3 Recent work

The numerical analysis of [4] was based on the scaling laws proposed by Bjorken [23]. The data available at the time were perfectly consistent with these, but Bjorken scaling correctly accounts for the short-distance properties of QCD only to leading order in the perturbative expansion in powers of \(\alpha _s\). The higher-order contributions generate specific violations of Bjorken scaling [24, 25]. In the meantime, the implications of the phenomenon and the corresponding modification of the short-distance properties of the matrix elements \(\langle p|Tj^\mu (x)j^\nu (y)|p\rangle \) have been investigated by Collins [26]. Unfortunately, however, he did not reevaluate the self-energy difference in this framework. In fact, the question of whether the Reggeons do dominate the asymptotic behaviour or whether the amplitude in addition contains a fixed pole at \(I = 1\), \(J = 0\) is not touched at all in that work.

Motivated in part by the study of hadron electromagnetic mass shifts on the lattice (see, e.g., [27–29]), the Cottingham formula has recently been reexamined [30–36], but the central issue in this context – the possible occurrence of fixed poles – is not addressed in these papers, either. Instead, the electron cross sections \(\sigma _{T}\), \(\sigma _{L}\) and the subtraction function \(S_1(q^2)\) are treated as physically independent quantities. The main problem with the framework set up in [31] is that a direct experimental determination of \(S_1(q^2)\) is not available. To bridge the gap, the authors set up a model which parameterises the dependence of the subtraction function on \(q^2\). The overall normalisation, \( S_1(0)\), can in principle be determined from the difference between the magnetic polarisabilities of proton and neutron, although the experimental value is subject to rather large uncertainties [37]. The main difficulty in this approach, however, is the momentum dependence of the subtraction function, which leads to a systematic uncertainty that is difficult to quantify.

1.4 Structure of the present paper

The remaining sections are organised as follows. In Sect. 2, we show how the Reggeon dominance hypothesis (6) fixes the subtraction function \(S_1(q^2)\) from space-like data alone. In Sect. 3, we discuss the splitting of the amplitudes \(T_i\) into elastic and inelastic contributions. We derive sum rules for the nucleon polarisabilities in Sect. 4, while a thorough phenomenological analysis is provided in Sect. 5. In particular, the sum rules allow us to predict the difference between the electric polarisabilities of proton and neutron. In view of the fact that the proton polarisabilities are experimentally known more accurately, our result can be turned into a prediction of the electric polarisability of the neutron, which is consistent with observation but somewhat more precise. The magnetic polarisabilities then follow from the Baldin sum rule. Section 6 is devoted to the electromagnetic self-energies of proton and neutron. We discuss the renormalisation of the Cottingham formula, in particular the role of the subtraction function in the evaluation of the self-energy, and provide a comparison with recent work on the issue. A summary and concluding remarks are given in Sect. 7. In Appendix A, we detail the notation used. Appendix B reviews those properties of Compton scattering we are making use of. In particular, we discuss the frame-dependence of the spin average and derive the low-energy theorem which underlies the sum rule for the electric polarisability. Appendices C and D contain a short discussion of the role of causality in our analysis. Last but not least, we note that in [30–33] a comparison with the analysis of [4] is attempted. Unfortunately, many of the statements made there are simply incorrect. Some of the misconceptions are rectified in Appendix E.

2 Determination of the subtraction function

The Regge amplitude obeys a once-subtracted fixed-\(q^2\) dispersion relation:

$$\begin{aligned} T_1^\mathrm{R}(\nu ,q^2) =T_1^\mathrm{R}(0,q^2)+2\nu ^2\int _0^\infty \frac{\mathrm{d}\nu '}{\nu '} \,\frac{V_1^\mathrm{R}(\nu ',q^2)}{\nu '^2-\nu ^2-i\epsilon }.\end{aligned}$$
(7)

In the space-like region and for \(\nu \ge 0\), the absorptive part of the amplitude specified in (5) is given by

$$\begin{aligned} V^\mathrm{R}_1(\nu ,q^2) =\sum _{\alpha >0} \beta _\alpha (Q^2)\,\theta (s_+-s_0) \left( s_+-s_0\right) ^{\alpha }.\end{aligned}$$
(8)
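The relation between the Regge amplitude (5) and its absorptive part (8) can be checked numerically. The following Python sketch is an illustration only and is not part of the original analysis: the function names, the single-Reggeon truncation and the toy values of \(\alpha \), \(\beta _\alpha \), \(s_0\) and \(Q^2\) are assumptions chosen for the example. It evaluates one term of (5) slightly below the real axis of \(s_0-s_\pm \), as prescribed by the \(-i\epsilon \), and compares \(\text {Im}\,T_1^\mathrm{R}\) with \(\pi V_1^\mathrm{R}\).

```python
import math

M_N = 0.938272  # nucleon mass in GeV (illustrative value)


def s_plus(nu, Q2, m=M_N):
    """s_+ = (p + q)^2 = m^2 + 2 m nu - Q^2."""
    return m * m + 2.0 * m * nu - Q2


def s_minus(nu, Q2, m=M_N):
    """s_- = (p - q)^2 = m^2 - 2 m nu - Q^2."""
    return m * m - 2.0 * m * nu - Q2


def regge_T1(nu, Q2, alpha, beta, s0, eps=1e-12):
    """One term of the sum in Eq. (5); beta plays the role of beta_alpha(Q^2)."""
    prefactor = -math.pi * beta / math.sin(math.pi * alpha)
    # The small negative imaginary part implements the -i*epsilon prescription.
    term_plus = complex(s0 - s_plus(nu, Q2), -eps) ** alpha
    term_minus = complex(s0 - s_minus(nu, Q2), -eps) ** alpha
    return prefactor * (term_plus + term_minus)


def regge_V1(nu, Q2, alpha, beta, s0):
    """One term of the sum in Eq. (8), valid for nu >= 0."""
    x = s_plus(nu, Q2) - s0
    return beta * x ** alpha if x > 0.0 else 0.0


# Toy inputs (hypothetical numbers, GeV units)
ALPHA, BETA, S0, Q2 = 0.55, 1.3, 1.2, 0.5

above_cut = 5.0  # value of nu with s_+ > s0
below_cut = 0.1  # value of nu with s_+ < s0
im_above = regge_T1(above_cut, Q2, ALPHA, BETA, S0).imag
im_below = regge_T1(below_cut, Q2, ALPHA, BETA, S0).imag
print(im_above, math.pi * regge_V1(above_cut, Q2, ALPHA, BETA, S0))
print(im_below)
```

Above the branch point the two printed numbers in the first line should agree, while below it the imaginary part vanishes up to terms of order `eps`. The symmetry of (5) under \(\nu \rightarrow -\nu \) can be tested in the same way by comparing `regge_T1(nu, ...)` with `regge_T1(-nu, ...)`.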

The Reggeon dominance hypothesis (6) implies that the difference between the full amplitude and the Regge contributions, \(\overline{T}_1(\nu ,q^2)\equiv T_1(\nu ,q^2)- T_1^\mathrm{R}(\nu ,q^2)\), obeys an unsubtracted dispersion relation. In particular, the value of \(\overline{T}_1(0,q^2)=S_1(q^2)-T_1^\mathrm{R}(0,q^2)\) is given by an integral over the difference \(V_1(\nu ,q^2)-V_1^\mathrm{R}(\nu ,q^2)\). Hence the subtraction function can be represented as

$$\begin{aligned} S_1(q^2)= & {} T_1^\mathrm{R}(0,q^2)+2 \int _0^\infty \frac{\mathrm{d}\nu }{\nu } \,\{ V_1(\nu ,q^2)- V_1^\mathrm{R}(\nu ,q^2)\}. \end{aligned}$$
(9)

This formula explicitly represents the subtraction function in terms of measurable quantities: the structure function \( V_1(\nu ,q^2)\) is determined by the cross sections for inclusive electron–nucleon scattering. The high-energy behaviour of these cross sections also determines the Reggeon residues \(\beta _\alpha (Q^2)\) and thereby fixes the term \(T^\mathrm{R}_1(0,q^2)\), as well as the corresponding contribution to the structure function, \(V_1^\mathrm{R}(\nu ,q^2)\).

If the trajectory intercepts \(\alpha \) were all below zero, the unsubtracted dispersion integral over \(V_1^\mathrm{R}(\nu ,q^2)\) would converge and would exactly compensate the first term on the right of (9) – the subtraction function would then be given by the unsubtracted dispersion integral over \(V_1(\nu ,q^2)\). The expression for the subtraction function in (9) shows how the divergence of the unsubtracted dispersion integral generated by the Reggeons is handled: the corresponding contribution is removed from the integrand, so that the integral converges also at the physical values of the intercepts. The modification is compensated by the term \(T_1^\mathrm{R}(0,q^2)\), which must be added to the integral over the remainder. The procedure amounts to analytic continuation in \(\alpha \) from negative values, where \(T^\mathrm{R}_1(0,q^2)\) is given by the unsubtracted dispersion integral over \(V_1^\mathrm{R}(\nu ,q^2)\) to the physical values, where that representation does not hold any more.
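The statement that, for negative intercepts, \(T_1^\mathrm{R}(0,q^2)\) coincides with the unsubtracted dispersion integral over \(V_1^\mathrm{R}(\nu ,q^2)\) can be verified for a single Reggeon. With \(c\equiv s_0-m^2+Q^2>0\) and the substitution \(2m\nu =c\,(1+u)\), the integral \(2\int \mathrm{d}\nu \,V_1^\mathrm{R}/\nu \) becomes \(2\beta _\alpha c^\alpha \int _0^\infty \mathrm{d}u\,u^\alpha /(1+u)\), which converges for \(-1<\alpha <0\). The Python sketch below is a hedged illustration: the function names, the trapezoidal quadrature in the variable \(y=\log u\) and the toy parameter values are assumptions made for the example.

```python
import math


def mellin_integral(alpha, y_max=150.0, h=0.05):
    """Estimate int_0^inf du u^alpha / (1 + u) for -1 < alpha < 0.

    With u = exp(y) the integrand becomes exp((alpha + 1) y) / (1 + exp(y)),
    which decays exponentially in both directions, so a plain trapezoidal
    sum on [-y_max, y_max] is sufficient (endpoint terms are negligible).
    """
    n = int(round(y_max / h))
    total = 0.0
    for k in range(-n, n + 1):
        y = k * h
        total += math.exp((alpha + 1.0) * y) / (1.0 + math.exp(y))
    return total * h


def regge_T1_at_zero(alpha, beta, c):
    """T_1^R(0, q^2) for one Reggeon from Eq. (5), with c = s0 - m^2 + Q^2 > 0."""
    return -2.0 * math.pi * beta * c ** alpha / math.sin(math.pi * alpha)


def unsubtracted_integral(alpha, beta, c):
    """2 * int dnu/nu V_1^R(nu, q^2) from Eq. (8), after 2 m nu = c (1 + u)."""
    return 2.0 * beta * c ** alpha * mellin_integral(alpha)


# Toy inputs (hypothetical): negative intercept, residue, and c in GeV^2
ALPHA_NEG, BETA, C = -0.3, 1.3, 0.8
print(regge_T1_at_zero(ALPHA_NEG, BETA, C), unsubtracted_integral(ALPHA_NEG, BETA, C))
```

The two numbers agree because \(\int _0^\infty \mathrm{d}u\,u^\alpha /(1+u)=-\pi /\sin \pi \alpha \) in this range. For \(\alpha >0\) the integral diverges, whereas the closed expression returned by `regge_T1_at_zero` remains well defined; this is the analytic continuation referred to in the text.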

We emphasise that the specific form used for the Regge parameterisation does not matter. In particular, the Regge amplitude specified in (5) involves a free parameter, \(s_0\). Since it does not affect the leading term in the asymptotic behaviour, the value used for \(s_0\) is irrelevant – our results are independent thereof. In the following, we simplify the equations by taking \(s_0\) in the range \(s_0\ge (m+M_\pi )^2\), which has the advantage that \(V_1^\mathrm{R}(\nu ,q^2)\) then vanishes outside the inelastic region.

3 Elastic and inelastic contributions

3.1 Elastic part

The contributions to the structure functions arising from the elastic reaction \(e+N\rightarrow e+N\) are determined by the electromagnetic form factors of the nucleon. In the space-like region, these contributions are restricted to the lines \(q^2 =\pm 2\nu m\) and read (\(i = 1,2\))

$$\begin{aligned} \begin{aligned}&V_i^\text {el}(\nu ,q^2)= v_i^\text {el}(q^2)\{\delta (q^2+2m\nu )-\delta (q^2-2m\nu )\},\\&v_1^\text {el}(q^2)=\frac{2m^2}{4m^2-q^2}\{G_E^2(q^2)-G_M^2(q^2)\},\\&\begin{aligned} v_2^\text {el}(q^2)&=\frac{2m^2}{(-q^2)(4m^2-q^2)}\\&\quad \times \,\{4m^2G_E^2(q^2)-q^2G_M^2(q^2)\},\end{aligned} \end{aligned} \end{aligned}$$
(10)

where \(G_E(t)\) and \(G_M(t)\) are the Sachs form factors.

The elastic contributions to the time-ordered amplitudes \(T_1,T_2\) cannot be specified as easily. In perturbation theory, they are usually referred to as Born terms, and it is not a trivial matter to specify them at higher orders of the calculation. In effective low-energy theories, the decomposition into a Born term and a 'structure part' is not a simple matter, either. For a detailed discussion of these aspects, we refer to [38–40]. In the framework of dispersion theory, however, the decomposition is unambiguous. The reason is that analytic functions are fully determined by their singularities and their asymptotic behaviour: dispersion theory provides a representation of the amplitudes in terms of their singularities. In our framework, this representation is given by the dispersion relations (2) and the sum rule (9). The elastic contribution is the part of the amplitude which is generated by the singularities due to the elastic intermediate states. These are specified in (10). Accordingly, the elastic parts of \(T_1,T_2\) are obtained by simply replacing \(V_1,V_2\) with \(V_1^\text {el},V_2^\text {el}\) and dropping the Regge contributions. In the case of \(T_2\), this leads to

$$\begin{aligned} T_2^\text {el}(\nu ,q^2)=2\int _0^\infty \mathrm{d}\nu ' \nu ' \,\frac{V_2^\text {el}(\nu ',q^2)}{\nu '^2-\nu ^2-i\epsilon }.\end{aligned}$$
(11)

In the case of \(T_1(\nu ,q^2)\) there are two contributions, one from the subtraction function, the other from the subtracted dispersion integral:

$$\begin{aligned} T_1^\text {el}(\nu ,q^2)=S^\text {el}_1(q^2)+2\nu ^2\int _0^\infty \frac{\mathrm{d}\nu '}{\nu '} \,\frac{V_1^\text {el}(\nu ',q^2)}{\nu '^2-\nu ^2-i\epsilon }.\end{aligned}$$
(12)

The sum rule (9) for the subtraction function implies

$$\begin{aligned} S^\text {el}_1(q^2)=2 \int _0^\infty \frac{\mathrm{d}\nu }{\nu } \,V_1^\text {el}(\nu ,q^2).\end{aligned}$$
(13)

Taken together, the two terms on the right hand side of (12) yield the unsubtracted dispersion integral, so that the expression takes the same form as the one for \(T_2^\text {el}(\nu ,q^2)\):

$$\begin{aligned} T_1^\text {el}(\nu ,q^2)=2\int _0^\infty \mathrm{d}\nu ' \nu '\, \frac{V_1^\text {el}(\nu ',q^2)}{\nu '^2-\nu ^2-i\epsilon }.\end{aligned}$$
(14)

Inserting the explicit expressions for the elastic contributions to the structure functions, we obtain

$$\begin{aligned} T_1^\text {el}(\nu ,q^2)= & {} \frac{4m^2q^2}{(4m^2\nu ^2-q^4)(4m^2-q^2)}\nonumber \\&\times \,\{G_E^2(q^2)-G_M^2(q^2)\},\nonumber \\ T_2^\text {el}(\nu ,q^2)= & {} -\frac{4m^2}{(4m^2\nu ^2-q^4)(4m^2-q^2)}\nonumber \\&\times \,\{4m^2G_E^2(q^2)-q^2G_M^2(q^2)\}. \end{aligned}$$
(15)

Both functions tend to zero when \(\nu \) becomes large: by construction, the elastic part of \(T_1(\nu ,q^2)\) does not contain a singularity at infinity. Moreover, as demonstrated in Appendix D, even taken by themselves, the elastic contributions can be represented in manifestly causal form.

The explicit expression for the elastic part of the subtraction function,

$$\begin{aligned} S^\text {el}_1(q^2)=-\frac{4m^2}{q^2(4m^2-q^2)}(G_E^2(q^2)-G_M^2(q^2)),\end{aligned}$$
(16)

exclusively involves the form factors, which are known very precisely.
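The consistency of (13), (15) and (16) is easily checked numerically. The Python sketch below is an illustration under stated assumptions: the Sachs form factors are replaced by a simple dipole parameterisation with scale \(0.71\,\text {GeV}^2\), which is not the input used in this paper, and the function names as well as the numerical constants are chosen for the example only. In (13), the \(\delta \)-function in (10) fixes \(\nu =-q^2/2m\) and contributes the Jacobian \(1/2m\).

```python
M_N = 0.938272    # proton mass in GeV (illustrative value)
MU_P = 2.792847   # proton magnetic moment in nuclear magnetons
LAMBDA2 = 0.71    # dipole scale in GeV^2 (assumed parameterisation)


def dipole(q2):
    """Dipole shape, used here only as a stand-in for the measured form factors."""
    return 1.0 / (1.0 - q2 / LAMBDA2) ** 2


def G_E(q2):
    return dipole(q2)


def G_M(q2):
    return MU_P * dipole(q2)


def v1_el(q2, m=M_N):
    """Coefficient v_1^el(q^2) of Eq. (10)."""
    return 2.0 * m * m / (4.0 * m * m - q2) * (G_E(q2) ** 2 - G_M(q2) ** 2)


def T1_el(nu, q2, m=M_N):
    """Elastic part of T_1, Eq. (15)."""
    denom = (4.0 * m * m * nu * nu - q2 * q2) * (4.0 * m * m - q2)
    return 4.0 * m * m * q2 / denom * (G_E(q2) ** 2 - G_M(q2) ** 2)


def S1_el(q2, m=M_N):
    """Elastic part of the subtraction function, Eq. (16)."""
    return -4.0 * m * m / (q2 * (4.0 * m * m - q2)) * (G_E(q2) ** 2 - G_M(q2) ** 2)


def S1_el_from_sum_rule(q2, m=M_N):
    """Eq. (13): the delta function fires at nu0 = -q2/(2m) with Jacobian 1/(2m)."""
    nu0 = -q2 / (2.0 * m)
    return 2.0 * v1_el(q2, m) / (2.0 * m) / nu0


q2 = -0.5  # space-like momentum transfer in GeV^2
print(S1_el(q2), T1_el(0.0, q2), S1_el_from_sum_rule(q2))
```

All three numbers coincide, which reflects the fact that the elastic part of \(T_1\) obeys an unsubtracted dispersion relation. The agreement does not depend on the dipole assumption, since the form factors enter only through the common factor \(G_E^2-G_M^2\).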

3.2 Inelastic part

We refer to the remainder as the inelastic part of the amplitude:

$$\begin{aligned} T_i(\nu ,q^2)=T_i^\text {el}(\nu ,q^2)+T_i^\text {inel}(\nu ,q^2),\quad i= 1,2.\end{aligned}$$
(17)

In contrast to the elastic part, which contains the poles generated by the elastic intermediate states and is singular at the origin, the inelastic part is regular there. At high energies, the converse is true: while the elastic part tends to zero, the inelastic part includes the contributions from the Reggeons, which are singular at infinity. In particular, the sum rule for the inelastic part of the subtraction function reads

$$\begin{aligned} S^\text {inel}_1(q^2)= & {} T_1^\mathrm{R}(0,q^2)+2\int _{\nu _\text {th}}^\infty \frac{\mathrm{d}\nu }{\nu } \{V_1(\nu ,q^2)-V_1^\mathrm{R}(\nu ,q^2)\} , \end{aligned}$$
(18)

where \(\nu _\text {th}=M_\pi +(M_\pi ^2-q^2)/2m\) denotes the inelastic threshold. The dispersive representation for the inelastic part of \(T_1(\nu ,q^2)\) then becomes

$$\begin{aligned} T_1^\text {inel}(\nu ,q^2)=S^\text {inel}_1(q^2)+\, 2\nu ^2 \int _{\nu _\text {th}}^\infty \frac{\mathrm{d}\nu '}{\nu '} \,\frac{V_1(\nu ',q^2)}{\nu '^2-\nu ^2-i\epsilon }. \end{aligned}$$
(19)
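The position of the inelastic threshold follows from the condition \(s_+=(m+M_\pi )^2\), while the elastic contribution sits at \(s_+=m^2\), i.e. at \(\nu =Q^2/2m\). A short Python sketch of this kinematics is given below; the function names and the numerical values of the masses are assumptions made for illustration.

```python
M_N = 0.938272   # nucleon mass in GeV (illustrative value)
M_PI = 0.139570  # charged pion mass in GeV (illustrative value)


def s_plus(nu, Q2, m=M_N):
    """s_+ = (p + q)^2 = m^2 + 2 m nu - Q^2, with Q^2 = -q^2."""
    return m * m + 2.0 * m * nu - Q2


def nu_elastic(Q2, m=M_N):
    """Location of the elastic delta function in Eq. (10) for nu >= 0."""
    return Q2 / (2.0 * m)


def nu_threshold(Q2, m=M_N, m_pi=M_PI):
    """Inelastic threshold nu_th = M_pi + (M_pi^2 + Q^2) / (2 m)."""
    return m_pi + (m_pi ** 2 + Q2) / (2.0 * m)


for Q2 in (0.0, 0.5, 2.0):
    gap = nu_threshold(Q2) - nu_elastic(Q2)
    print(Q2, nu_elastic(Q2), nu_threshold(Q2), gap)
```

The gap between the elastic line and the inelastic threshold, \(M_\pi +M_\pi ^2/2m\), does not depend on \(Q^2\).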

In the case of \(T_2(\nu ,q^2)\), a subtraction is not needed. The contribution from the elastic intermediate state to the dispersion integral in (2) coincides with the expression for \(T_2^\text {el}(\nu ,q^2)\) in (15). Removing this part, which is even more singular at the origin than \(T_1^\text {el}(\nu ,q^2)\), we obtain the following representation for the inelastic part:

$$\begin{aligned} T_2^\text {inel}(\nu ,q^2)= 2\int _{\nu _\text {th}}^\infty \mathrm{d}\nu '\, \nu ' \,\frac{ V_2(\nu ',q^2)}{\nu '^2-\nu ^2-i\epsilon }.\end{aligned}$$
(20)

3.3 Subtraction function in terms of cross sections

The structure function \(V_1(\nu ,q^2)\) is a linear combination of the transverse and longitudinal cross sections, see Appendix A:

$$\begin{aligned}&V_1(\nu ,q^2)= \frac{m\nu }{2\alpha _\text {em}}k(\nu ,Q^2)\{\bar{\sigma }_{L}(\nu ,Q^2)-\sigma _{T}(\nu ,Q^2)\},\nonumber \\&\bar{\sigma }_{L}(\nu ,Q^2)\equiv \frac{\nu ^2}{Q^2}\sigma _{L}(\nu ,Q^2),\\&k(\nu ,Q^2) \equiv \frac{1}{2\pi ^2} \frac{\nu -Q^2/2m}{\nu (\nu ^2+Q^2)}.\nonumber \end{aligned}$$
(21)
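For numerical work, (21) translates directly into code. The following Python sketch is a minimal illustration; the function names are hypothetical, the cross sections are passed in as plain numbers in whatever units the user adopts, and no data or model for \(\sigma _{T}\), \(\sigma _{L}\) is implied.

```python
import math

ALPHA_EM = 1.0 / 137.035999  # fine structure constant
M_N = 0.938272               # nucleon mass in GeV (illustrative value)


def kernel(nu, Q2, m=M_N):
    """Kinematic factor k(nu, Q^2) of Eq. (21)."""
    return (nu - Q2 / (2.0 * m)) / (2.0 * math.pi ** 2 * nu * (nu * nu + Q2))


def sigma_L_bar(nu, Q2, sigma_L):
    """Rescaled longitudinal cross section of Eq. (21); requires Q2 > 0."""
    return nu * nu / Q2 * sigma_L


def V1(nu, Q2, sigma_T, sigma_L, m=M_N):
    """Structure function V_1 in terms of the cross sections, Eq. (21)."""
    diff = sigma_L_bar(nu, Q2, sigma_L) - sigma_T
    return m * nu / (2.0 * ALPHA_EM) * kernel(nu, Q2, m) * diff


# At Q^2 = 0 the kernel reduces to 1 / (2 pi^2 nu^2)
print(kernel(2.0, 0.0), 1.0 / (2.0 * math.pi ** 2 * 4.0))
```

The kernel vanishes on the elastic line \(\nu =Q^2/2m\) and reduces to \(1/(2\pi ^2\nu ^2)\) at \(Q^2=0\), which is the weight familiar from the Baldin sum rule.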

The representation of the subtraction function thus involves integrals over the transverse and longitudinal cross sections. For \(S_1^\text {inel}(q^2)\) and \(S_2^\text {inel}(q^2)\), the following integrals are relevant:

$$\begin{aligned} \Sigma ^{T}(Q^2)= & {} \int _{\nu _\text {th}}^\infty \mathrm{d}\nu \; k(\nu ,Q^2)\,\sigma _{T}(\nu ,Q^2), \end{aligned}$$
(22)
$$\begin{aligned} \Sigma ^{L}_1(Q^2)= & {} \frac{\alpha _\text {em}}{m}T_1^\mathrm{R}(0,q^2)\nonumber \\&+ \int _{\nu _\text {th}}^\infty \mathrm{d}\nu \; k(\nu ,Q^2)\,\Delta \bar{\sigma }_{L}(\nu ,Q^2), \end{aligned}$$
(23)
$$\begin{aligned}&\Sigma ^{L}_2(Q^2)= \int _{\nu _\text {th}}^\infty \mathrm{d}\nu \; k(\nu ,Q^2)\,\sigma _{L}(\nu ,Q^2),\end{aligned}$$
(24)
$$\begin{aligned}&\Delta \bar{\sigma }_{L}(\nu ,Q^2) \equiv \bar{\sigma }_{L}(\nu ,Q^2)-\bar{\sigma }_{L}^\mathrm{R}(\nu ,Q^2). \end{aligned}$$
(25)

Expressed in terms of these, \(S_1^\text {inel}(q^2)\) and \(S_2^\text {inel}(q^2)\) are given by

$$\begin{aligned}&S_1^\text {inel}(q^2)=\frac{m}{\alpha _\text {em}}\Sigma _1(Q^2),\nonumber \\&S_2^\text {inel}(q^2)=\frac{m}{\alpha _\text {em}}\Sigma _2(Q^2),\nonumber \\&\Sigma _1(Q^2)=-\Sigma ^{T}(Q^2)+\Sigma ^{L}_1(Q^2) ,\nonumber \\&\Sigma _2(Q^2)=\Sigma ^{T}(Q^2)+\Sigma ^{L}_2(Q^2) . \end{aligned}$$
(26)

While the transverse parts of \(\Sigma _1(Q^2)\) and \(\Sigma _2(Q^2)\) only differ in sign, the longitudinal parts are quite different. Regge asymptotics implies that \(\sigma _{T}\) as well as \(\sigma _{L}\) grow in proportion to \(\nu ^{\alpha -1}\). Accordingly, the integral \(\Sigma ^{T}(Q^2)\) converges – it represents a generalisation of the integral relevant for the Baldin sum rule to \(Q^2\ne 0\) (cf. Sect. 4.2). While \(\Sigma _2^{L}(Q^2)\) is dominated by the contributions from the low-energy region and rapidly converges as well, it is essential that Reggeon exchange be accounted for in \(\Sigma _1^{L}(Q^2)\).

We are assuming that, at high energies, the longitudinal cross section can be approximated with a representation of the form

$$\begin{aligned} \bar{\sigma }_{L}^\mathrm{R}(\nu ,Q^2)= & {} 8\pi ^2\alpha _\text {em}\,\frac{\nu ^2}{2m\nu -Q^2}\nonumber \\&\times \,\sum _{\alpha >0} \beta _\alpha (Q^2)(2m\nu -Q^2+m^2-s_0)^\alpha . \end{aligned}$$
(27)

At \(Q^2=0\), a Reggeon term proportional to \(\nu ^{\alpha }\) in \(V_1\) corresponds to a contribution to \(\bar{\sigma }_{L}\) that is proportional to \(\nu ^{\alpha +1}\). For nonzero values of \(Q^2\), however, the factor in front of the sum implies that the corresponding cross section contains sub-leading contributions. As discussed at the end of Sect. 2, the specific form used for the Regge parameterisation is not essential – as long as it satisfies a once-subtracted dispersion relation and correctly represents the asymptotic behaviour of the physical cross section. We stick to the one specified in (5), which leads to (27).

3.4 Chiral expansion

Chiral perturbation theory (\(\chi \)PT) exploits the fact that in the limit \(m_u,m_d\rightarrow 0\) (at fixed \(\Lambda _\text {QCD}, m_s, \ldots , m_t\)) QCD acquires an exact chiral symmetry, which strongly constrains the low-energy properties of the amplitudes. The chiral perturbation series provides a representation of the quantities of interest in powers of momenta and quark masses. In the chiral limit, the pion is a massless particle, but when the quark masses \(m_u,m_d\) are turned on, the pion picks up mass in proportion to the square root thereof, \(M_\pi ^2 =(m_u+m_d) B+O(m_q^2 \log m_q)\).

In the context of the present paper, we only need the chiral expansion of the form factors \(G_E(q^2)\), \(G_M(q^2)\), and of the functions \(S_1(q^2)\), \(S_2(q^2)\). These quantities involve a single momentum variable, \(q^2\). As we work in the isospin limit, \(m_u=m_d\), the corresponding chiral perturbation series involves an expansion in the two variables \(M_\pi \) and \(q^2\). The series can be ordered in powers of \(M_\pi \); the coefficients then depend on the ratio

$$\begin{aligned} \tau =-\frac{q^2}{4M_\pi ^2},\end{aligned}$$
(28)

which counts as a quantity of O(1). In contrast to the straightforward Taylor series in powers of \(q^2\), the chiral expansion is able to cope with the infrared singularities generated by the pions.

To leading order in the chiral expansion, the infrared singularities are described by a set of one-loop graphs of the effective theory [41]. In the case of the magnetic Sachs form factor, for instance, the evaluation of the relevant graphs within Heavy Baryon \(\chi \)PT leads to the following expression for the first non-leading term in the chiral expansion [42]:Footnote 5

$$\begin{aligned} G^p_M(q^2)= & {} \mu ^p - \frac{g_A^2 mM_\pi }{16\pi F_\pi ^2} \left\{ (1+\tau )\frac{\arctan \sqrt{\tau }}{\sqrt{\tau }}-1\right\} \nonumber \\&+\,O(M_\pi ^2\log M_\pi ),\nonumber \\ G^n_M(q^2)= & {} \mu ^n +\frac{g_A^2 mM_\pi }{16\pi F_\pi ^2} \left\{ (1+\tau )\frac{\arctan \sqrt{\tau }}{\sqrt{\tau }}-1\right\} \nonumber \\&+\,O(M_\pi ^2\log M_\pi ). \end{aligned}$$
(29)

Up to and including \(O(M_\pi )\), the magnetic form factor can thus be represented in terms of the magnetic moment \(\mu \), the pion decay constant, \(F_\pi =92.21(14)\,\text {MeV}\) [44], and the nucleon matrix element of the axial charge, \(g_A=1.2723(23)\) [45]. The formula shows that, up to higher-order contributions, the singularity is described by a function of the ratio \(\tau =(-q^2)/4M_\pi ^2\): the scale is set by the pion mass, not by \(\Lambda _\text {QCD}\). The presence of a scale that disappears in the chiral limit also manifests itself in the slope of the form factor at \(q^2=0\), i.e. in the magnetic radius: the above representation shows that the chiral expansion of the magnetic radii of proton and neutron starts with a term of \(O(1/M_\pi )\).
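The origin of the \(O(1/M_\pi )\) term in the slope can be made explicit: for small \(\tau \), the curly bracket in (29) behaves as \(2\tau /3-2\tau ^2/15+\cdots \), so that the loop contribution to \(\mathrm{d}G_M^p/\mathrm{d}q^2\) at \(q^2=0\) equals \(g_A^2m/(96\pi F_\pi ^2M_\pi )\). The Python sketch below illustrates this; the function names are hypothetical, the pion and nucleon masses are inserted by hand, and only the loop term of (29) is evaluated, not the full form factor.

```python
import math

G_A = 1.2723     # axial charge, as quoted in the text
F_PI = 92.21     # pion decay constant in MeV, as quoted in the text
M_PI = 139.570   # pion mass in MeV (assumed value)
M_N = 938.272    # nucleon mass in MeV (assumed value)


def loop_function(tau):
    """(1 + tau) * arctan(sqrt(tau)) / sqrt(tau) - 1 for tau >= 0."""
    if tau < 1e-8:
        return 2.0 * tau / 3.0  # leading term of the Taylor series
    r = math.sqrt(tau)
    return (1.0 + tau) * math.atan(r) / r - 1.0


def delta_GM_proton(q2, m_pi=M_PI):
    """Loop term of G_M^p in Eq. (29); q2 in MeV^2, space-like (q2 <= 0)."""
    tau = -q2 / (4.0 * m_pi ** 2)
    coeff = G_A ** 2 * M_N * m_pi / (16.0 * math.pi * F_PI ** 2)
    return -coeff * loop_function(tau)


def slope_at_origin(m_pi=M_PI):
    """Analytic slope d(delta G_M^p)/dq^2 at q^2 = 0, in MeV^-2."""
    return G_A ** 2 * M_N / (96.0 * math.pi * F_PI ** 2 * m_pi)


h = 1.0  # MeV^2, step for a finite-difference estimate of the slope
numerical_slope = (delta_GM_proton(0.0) - delta_GM_proton(-h)) / h
print(numerical_slope, slope_at_origin())
```

Halving `m_pi` doubles the slope, which is the numerical counterpart of the statement that the magnetic radius diverges in the chiral limit.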

The low-energy behaviour of the electric Sachs form factors is less singular:

$$\begin{aligned} G^p_E(q^2)= & {} 1+O(M_\pi ^2\log M_\pi ),\nonumber \\ G^n_E(q^2)= & {} O(M_\pi ^2\log M_\pi ). \end{aligned}$$
(30)

Accordingly, the chiral expansion of the electric radii does not start with a term of \(O(1/M_\pi )\), but with a chiral logarithm, comparable to the situation with the charge radius of the pion.

The subtraction function also diverges if the chiral limit is taken at a fixed value of the ratio \(q^2/M_\pi ^2\): the leading term in the chiral expansion of \(S^\text {inel}_1(q^2)\) is of order \(1/M_\pi \) and is determined by \(F_\pi \) and \(g_A\) as well [46]:

$$\begin{aligned} S^\text {inel}_1(q^2)= & {} -\frac{g_A^2 m}{64\pi F_\pi ^2 M_\pi \tau }\left\{ 1-\frac{\arctan \sqrt{\tau }}{\sqrt{\tau }}\right\} \nonumber \\&+\,O(\log M_\pi ). \end{aligned}$$
(31)

The expansion of the analogous term in \(T_2\) starts with [46]:

$$\begin{aligned} S_2^\text {inel}(q^2)= & {} -\frac{g_A^2 m}{64\pi F_\pi ^2 M_\pi \tau }\left\{ 1-(1+4\tau )\frac{\arctan \sqrt{\tau }}{\sqrt{\tau }}\right\} \nonumber \\&+\,O(\log M_\pi ). \end{aligned}$$
(32)

In either case, the leading term is the same for proton and neutron – for \(S^\text {inel}_1(q^2)\) and \(S^\text {inel}_2(q^2)\), the chiral expansion of the difference between proton and neutron only starts at \(O(\log M_\pi )\).
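At \(q^2=0\), the curly brackets in (31) and (32), divided by \(\tau \), tend to \(1/3\) and \(-11/3\), respectively. Combined with the low-energy theorems (33) and (34) of Sect. 4.1, and dropping the term \(\kappa ^2/4m^2\), which is of higher order in the chiral counting, this gives the leading-order values \(\beta _M=\alpha _\text {em}g_A^2/(192\pi F_\pi ^2M_\pi )\) and \(\alpha _E=10\,\beta _M\). The Python sketch below evaluates these expressions; the function names, the pion mass and the conversion constant \(\hbar c\) are assumptions made for the example, and the numbers represent only the leading term of the chiral expansion, not the results of this paper.

```python
import math

ALPHA_EM = 1.0 / 137.035999  # fine structure constant
G_A = 1.2723                 # axial charge, as quoted in the text
F_PI = 92.21                 # pion decay constant in MeV, as quoted in the text
M_PI = 139.570               # pion mass in MeV (assumed value)
HBARC = 197.327              # MeV fm, converts MeV^-3 to fm^3


def bracket_S1(tau):
    """{1 - arctan(sqrt(tau))/sqrt(tau)} / tau, from Eq. (31)."""
    if tau < 1e-6:
        return 1.0 / 3.0 - tau / 5.0
    r = math.sqrt(tau)
    return (1.0 - math.atan(r) / r) / tau


def bracket_S2(tau):
    """{1 - (1 + 4 tau) arctan(sqrt(tau))/sqrt(tau)} / tau, from Eq. (32)."""
    if tau < 1e-6:
        return -11.0 / 3.0 + 17.0 * tau / 15.0
    r = math.sqrt(tau)
    return (1.0 - (1.0 + 4.0 * tau) * math.atan(r) / r) / tau


def S_inel_LO_over_m(bracket, tau):
    """Leading chiral term of S_i^inel(q^2) divided by m, in MeV^-3."""
    return -G_A ** 2 / (64.0 * math.pi * F_PI ** 2 * M_PI) * bracket(tau)


to_units = HBARC ** 3 * 1.0e4  # MeV^-3 -> 10^-4 fm^3
beta_M_LO = -ALPHA_EM * S_inel_LO_over_m(bracket_S1, 0.0) * to_units
sum_LO = ALPHA_EM * S_inel_LO_over_m(bracket_S2, 0.0) * to_units
alpha_E_LO = sum_LO - beta_M_LO
print(beta_M_LO, alpha_E_LO)
```

The ratio \(\alpha _E/\beta _M=10\) at this order follows from the two limits alone and is independent of the numerical input.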

4 Nucleon polarisabilities

4.1 Low-energy theorems

In contrast to the elastic parts, which are singular at the origin, the inelastic contributions to \(T_1(\nu ,q^2), T_2(\nu ,q^2)\) do admit a Taylor series expansion in powers of \(\nu \) and \(q^2\). Two low-energy theorems relate the leading terms in this expansion to the polarisabilities of the nucleon. The theorems amount to rather nontrivial statements, because the functions \(T_1(\nu ,q^2), T_2(\nu ,q^2)\) represent the virtual Compton scattering amplitude in the forward direction, while the experimental determination of the polarisabilities relies on real Compton scattering at nonzero scattering angle. A concise derivation is given in Appendix B.

In the above notation, the low-energy theorems take the simple form

$$\begin{aligned} S_1^\text {inel}(0)= & {} - \frac{\kappa ^2}{4m^2}-\frac{m}{\alpha _\text {em}}\,\beta _M,\end{aligned}$$
(33)
$$\begin{aligned} S_2^\text {inel}(0)= & {} \frac{m}{\alpha _\text {em}}\,(\alpha _E+\beta _M), \end{aligned}$$
(34)

where \(\kappa \) is the anomalous magnetic moment, \(\alpha _E\) and \(\beta _M\) are the electric and magnetic polarisabilities of the particle, and \(\alpha _\text {em}\) is the fine structure constant. These relations show that the polarisabilities contain an elastic as well as an inelastic part, while their sum, \(\alpha _E+\beta _M\), is purely inelastic:

$$\begin{aligned} \alpha _E^\text {el}= \frac{\alpha _\text {em}\kappa ^2}{4m^3},\quad \beta _M^\text {el}=-\frac{\alpha _\text {em}\kappa ^2}{4m^3}.\end{aligned}$$
(35)

Table 1 shows that the elastic parts only represent a small fraction of the polarisabilities.
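
As a rough numerical illustration of (35), the following minimal sketch evaluates \(\alpha _\text {em}\kappa ^2/4m^3\) for proton and neutron. The anomalous magnetic moments, the nucleon masses and the conversion constant \(\hbar c\) are standard values inserted here as assumptions; they are not taken from the text.

```python
ALPHA_EM = 1 / 137.036   # fine structure constant (assumed standard value)
HBARC = 0.1973270        # GeV fm, converts GeV^-1 to fm (assumed standard value)


def elastic_part(kappa, mass_gev):
    """Magnitude of the elastic parts in Eq. (35), in units of 1e-4 fm^3."""
    value_gev3 = ALPHA_EM * kappa**2 / (4 * mass_gev**3)  # in GeV^-3
    return value_gev3 * HBARC**3 / 1e-4


# kappa and m for proton and neutron: assumed inputs, not quoted in the text
el_proton = elastic_part(1.793, 0.93827)
el_neutron = elastic_part(-1.913, 0.93957)

print(f"proton : {el_proton:.2f} x 1e-4 fm^3")
print(f"neutron: {el_neutron:.2f} x 1e-4 fm^3")
```

With these inputs, the elastic parts come out at roughly half a unit of \(10^{-4}\,\text {fm}^3\), to be compared with polarisabilities of several units.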

Table 1 Experimental values of the nucleon polarisabilities, in units of \(10^{-4}\,\text {fm}^3\), as determined from EFT extractions in Compton scattering [47, 49, 51] and analyses of the Baldin sum rule [48, 50] (see also [37, 52]). The latter results were imposed in [47, 49], so that the quoted errors for \(\alpha _E\) and \(\beta _M\) are anticorrelated

4.2 Sum rules for the polarisabilities

The left-hand side of the low-energy theorem (33) represents the inelastic part of the subtraction function at \(q^2=0\):

$$\begin{aligned} \beta _M^\text {inel}=-\frac{\alpha _\text {em}}{m}\,S^\text {inel}_1(0).\end{aligned}$$
(36)

The representation for the subtraction function in (18) thus amounts to a sum rule for the inelastic part of the magnetic polarisability. Adding the elastic contribution, the sum rule takes the form

$$\begin{aligned} \beta _M = \Sigma ^{T} (0)-\Sigma ^{L}_1(0)-\frac{\alpha _\text {em}\kappa ^2}{4m^3}.\end{aligned}$$
(37)

To our knowledge, this sum rule is new. It states that, in the absence of fixed poles, the magnetic polarisabilities of proton and neutron are determined by the cross sections for photo- and electroproduction. If the amplitude \(T_1(\nu ,q^2)\) were to obey an unsubtracted dispersion relation, the Regge terms in the expression for \(\Sigma ^{L}_1(Q^2)\) could be dropped, so that the sum rule would reduce to the one proposed in [53]. Regge asymptotics implies that a subtraction is needed, but if the Reggeon trajectories and residues are known, the subtraction can be expressed in terms of these.

Evaluating the dispersive representation (20) at \(\nu =q^2=0\), we obtain

$$\begin{aligned} S_2^\text {inel}(0)= 2\int _{\nu _\text {th}}^\infty \frac{\mathrm{d}\nu }{\nu }V_2(\nu ,0) =\frac{m}{2\pi ^2\alpha _\text {em}}\int _{\nu _\text {th}}^\infty \frac{\mathrm{d}\nu }{\nu ^2}\,\sigma _\text {tot}(\nu ) .\end{aligned}$$
(38)

The low-energy theorem (34) thus represents the familiar Baldin sum rule [54]. The integral occurring here is a limiting case of the quantity \(\Sigma ^{T}(Q^2)\) introduced in (22): the Baldin sum rule amounts to

$$\begin{aligned} \alpha _E+\beta _M=\Sigma ^{T} (0).\end{aligned}$$
(39)
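
The integral on the right of (38) is straightforward to evaluate numerically once a representation of the cross section is available. The sketch below is only an illustration under stated assumptions: the function `baldin_integral`, the toy cross section `toy_sigma` and its parameters are hypothetical and do not correspond to any of the parameterisations used in this work. The toy form is chosen because the integral is then known in closed form, which allows a check of the unit conversions.

```python
import math

HBARC = 0.1973270         # GeV fm (assumed standard value)
MICROBARN_IN_FM2 = 1e-4   # 1 microbarn = 1e-4 fm^2


def baldin_integral(sigma_ub, nu_th_gev, n_steps=20000):
    """(1 / 2 pi^2) * int_{nu_th}^inf dnu sigma(nu) / nu^2, in 1e-4 fm^3.

    sigma_ub: cross section in microbarn as a function of nu in GeV.
    The substitution u = 1/nu maps the range onto 0 < u < 1/nu_th with
    dnu / nu^2 = du, so that a plain midpoint rule can be used.
    """
    du = (1.0 / nu_th_gev) / n_steps
    total = 0.0
    for i in range(n_steps):
        u = (i + 0.5) * du
        total += sigma_ub(1.0 / u) * du              # microbarn / GeV
    total_fm3 = total * MICROBARN_IN_FM2 * HBARC     # fm^3
    return total_fm3 / (2 * math.pi**2) / 1e-4


NU_TH = 0.15      # GeV, hypothetical threshold
SIGMA_0 = 100.0   # microbarn, hypothetical normalisation


def toy_sigma(nu):
    """Hypothetical cross section falling like 1/nu above threshold."""
    return SIGMA_0 * NU_TH / nu


numeric = baldin_integral(toy_sigma, NU_TH)
# closed form for the toy model: int dnu sigma / nu^2 = SIGMA_0 / (2 NU_TH)
analytic = (SIGMA_0 * MICROBARN_IN_FM2 * HBARC / (2 * NU_TH)
            / (2 * math.pi**2) / 1e-4)
print(numeric, analytic)
```

The weight \(1/\nu ^2\) suppresses the high-energy tail, which is why the sum rule is governed by the low-energy region.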

Comparison with (37) shows that the electric polarisability obeys a sum rule that exclusively involves the longitudinal cross section and the anomalous magnetic moment:

$$\begin{aligned} \alpha _E = \Sigma ^{L}_1(0)+\frac{\alpha _\text {em}\kappa ^2}{4m^3} .\end{aligned}$$
(40)

5 Numerical analysis

5.1 Experimental information

We evaluate the cross section integrals on the following basis.

\({{W}<1.3}\): At low energies, the resonance \(\Delta (1232)\) generates the most important inelastic contribution. It decays almost exclusively into \(\pi N\) final states, which have been thoroughly explored. The SAID, MAID, Dubna–Mainz–Taipei (DMT), and chiral-MAID collaborations provide pion photo- and electroproduction cross sections into these channels [55–63]. For \(W< 1.3\) and real photons (\(Q^2=0\)), the transverse cross section is well approximated by the sum over these contributions. In particular, the representations we are using are consistent with isospin symmetry, which implies that the contributions from the \(\Delta \) to the proton and neutron cross sections are the same up to symmetry-breaking effects of \(O(m_u-m_d,\alpha _\text {em})\), which are expected to be very small. Moreover, as seen from Fig. 1 (left panel), the \(\Delta \) dominates in the transverse cross sections and gives very small contributions to the longitudinal ones. This property is directly related to the smallness of the C2 Coulomb quadrupole form factor for the \(\Delta N\gamma ^*\) transition. In the non-relativistic quark model, where both the nucleon and the \(\Delta \) are zero-orbital-momentum three-quark states, this form factor, as well as that of the E2 electric quadrupole, vanishes altogether [66–68].

Fig. 1 Cross sections \(\sigma _{T}\) and \(\bar{\sigma }_{L}\equiv \sigma _{L}\nu ^2/Q^2\) in the \(\Delta \)-region, for \(Q^2= 0\)

The comparison of the full lines in the two panels of Fig. 1 shows that, in the region where the \(\Delta \) generates the dominant contribution, the transverse cross sections for proton and neutron are indeed nearly the same: the differences are smaller than the individual terms by an entire order of magnitude [66]. For the polarisabilities, the behaviour of the ratio \(\sigma _{L}/Q^2\) in the limit \(Q^2\rightarrow 0\) is relevant. Since MAID and DMT offer a representation also for this quantity, these parameterisations are particularly convenient for us. For definiteness, we identify the central values of the cross sections in the region \(W<1.3\) with the average of MAID and DMT, abbreviated as MD: \(\sigma _\text {MD}=\frac{1}{2}(\sigma _\text {MAID}+\sigma _\text {DMT})\). As far as the proton cross sections are concerned, the results obtained with SAID, MAID, and DMT are practically the same, but Fig. 2 shows that for the small differences between proton and neutron, this is not the case. The uncertainties in the input used for the cross sections do affect our numerical results and will be discussed together with these.

Fig. 2 Consistency check at the transition point \(W=1.3\). The plot compares the representations of MAID and DMT used below that point with the BC-parametrisation used above it. The difference between MAID and DMT and the band attached to BC represent an estimate of the uncertainties to be attached to these parameterisations. As discussed in the text, the picture implies that these parameterisations provide a coherent framework only for \(Q^2>0.5\)

\({1.3<{W}<3}\): In the intermediate region, we rely on the work of Bosted and Christy (BC), who provide parameterisations of the transverse and longitudinal proton and neutron cross sections in the resonance region, \(m+M_\pi <W<3.2\), in the range \(0<Q^2<8\) [69, 70]. These contain a wealth of information, but suffer from a number of shortcomings. First, their fit to the data is carried out under the assumption that the ratio \(\sigma _{L}/\sigma _{T}\) is the same for proton and neutron. An experimental analysis that does not rely on this assumption would be most welcome. Second, the parameterisation does not properly cover the region of very small photon virtualities (cf. [71]): (a) The algebraic form of the representation used for \(\sigma _{L}\) implies that the quantity \(\bar{\sigma }_{L}\equiv \sigma _{L}\nu ^2/Q^2\) vanishes when \(Q^2\) tends to zero instead of approaching a nonzero limiting value. (b) Isospin symmetry implies that the contributions of the resonance \(\Delta (1232)\) to proton and neutron are the same, but, as noted in [36], the BC-parameterisation does not respect this symmetry to the expected accuracy. (c) The parameterisation of the contribution from the resonance N(1530) exhibits an unphysical dependence on \(Q^2\): in the tiny interval \(0< Q^2< 0.001\), the contribution from this resonance to the transverse cross section of the proton varies by about 40 %. Although this artefact only manifests itself at very small values of \(Q^2\), it seriously affects our calculation because the results obtained for the polarisabilities depend on whether we simply evaluate the sum rules at \(Q^2=0\) or use very small positive values of \(Q^2\) – for the physical cross sections, a difference of this sort cannot arise.

In the interval \(1.3<W<3\), we use the following crude estimate for mean values and errors: (i) The central value is identified with the result obtained with the BC-parameterisation. (ii) In order to wash out the spikes occurring at very small values of \(Q^2\), we assign an 8 % uncertainty to the BC-representation of the proton cross sections: \(\Delta \sigma ^p = 0.08\,\sigma ^p\). (iii) Since the difference between the proton and neutron cross sections is much smaller than the individual terms, small relative errors in the latter can generate large relative errors in the difference. For this reason, we use the same error estimate for \(\sigma ^{p-n}\) as for the individual terms, i.e. work with \(\Delta \sigma ^{p-n}= 0.08\,\sigma ^p\).
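
The effect of prescription (iii) can be made concrete with a short sketch. The cross section values below are hypothetical numbers chosen for illustration only; the function name is likewise not part of any existing code.

```python
def bc_uncertainties(sigma_p, rel=0.08):
    """Absolute errors assigned in steps (ii) and (iii) of the text.

    Returns (delta sigma^p, delta sigma^{p-n}); both are taken to be
    rel * sigma^p, i.e. the difference inherits the error of the proton term.
    """
    delta_p = rel * sigma_p
    delta_diff = rel * sigma_p
    return delta_p, delta_diff


# hypothetical cross sections in arbitrary units, for illustration only
sigma_p, sigma_n = 100.0, 90.0
delta_p, delta_diff = bc_uncertainties(sigma_p)
rel_err_diff = delta_diff / abs(sigma_p - sigma_n)

print(f"relative error of sigma^p     : {delta_p / sigma_p:.0%}")
print(f"relative error of sigma^(p-n) : {rel_err_diff:.0%}")
```

In this example, a difference that amounts to a tenth of the proton cross section carries a relative uncertainty ten times larger than the proton cross section itself.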

The comparison of the representations for the difference between the proton and neutron cross sections used below and above \(W = 1.3\) offers a consistency test on our calculations. Figure 2 compares the representations of MAID and DMT with the uncertainty band attached to BC at the transition point. The left panel shows that the representations for the difference of the transverse cross sections used below and above that point agree with one another only for \(Q^2>0.5\). The problem arises from the deficiencies mentioned above, which prevent us from reliably evaluating the cross section integrals at low values of \(Q^2\). The right panel shows that the uncertainties in the difference of the longitudinal cross sections are considerable, but within these, the representations used are coherent.

\({{W}>3}\): We estimate the contributions from higher energies with the representation of Alwall and Ingelman (AI) [72]. It is based on the vector-meson-dominance model [73–76] and offers a parameterisation of the transverse and longitudinal cross sections of the form

$$\begin{aligned} \sigma _{T}= & {} \beta ^{T}_{P}(Q^2)s^{\alpha _P-1}+\beta _R^{T}(Q^2) s^{\alpha _R-1},\nonumber \\ \sigma _{L}= & {} \beta ^{L}_P(Q^2)s^{\alpha _P-1}+\beta _R^{L}(Q^2) s^{\alpha _R-1}, \end{aligned}$$
(41)

where \(s=W^2\) is the square of the centre-of-mass energy. The Pomeron cut is approximated by a Regge pole at \(\alpha _P=1.091\), while the Reggeons with the quantum numbers of f and \(a_2\) are lumped together in a single contribution with \(\alpha _R= 0.55\).

The Pomeron residues of proton and neutron are the same:

$$\begin{aligned} \beta ^{T}_P(Q^2)^n= \beta ^{T}_P(Q^2)^p,\quad \beta ^{L}_P(Q^2)^n= \beta ^{L}_P(Q^2)^p.\end{aligned}$$
(42)

For the remainder, we follow [4], invoke SU(3), and stick to the value of the D/F ratio quoted there (for the definition of the Regge couplings D and F and a review of their determination, we refer to [77]):

$$\begin{aligned}&\beta _R^{T}(Q^2)^n = \xi \,\beta _R^{T}(Q^2)^p,\quad \beta _R^{L}(Q^2)^n=\xi \, \beta _R^{L}(Q^2)^p,\nonumber \\&\quad \xi = \frac{6F-4D}{9F-D}\simeq 0.74. \end{aligned}$$
(43)
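
The structure of (41)–(43) can be sketched in a few lines. The intercepts and the value of \(\xi \) are those quoted in the text, whereas the residues `BETA_POM` and `BETA_REG` are hypothetical numbers at some fixed \(Q^2\), inserted only to make the example run. The point of the sketch is that the Pomeron term cancels in the difference between proton and neutron, which therefore falls with energy.

```python
ALPHA_P = 1.091   # Pomeron intercept quoted in the text
ALPHA_R = 0.55    # effective intercept of the f and a_2 Reggeons
XI = 0.74         # neutron/proton ratio of the Reggeon residue, Eq. (43)

# hypothetical residues at fixed Q^2 (arbitrary units), for illustration only
BETA_POM = 70.0
BETA_REG = 130.0


def sigma_proton(s):
    """Two-term Regge form of Eq. (41) for the proton."""
    return BETA_POM * s**(ALPHA_P - 1) + BETA_REG * s**(ALPHA_R - 1)


def sigma_neutron(s):
    """Same Pomeron residue, Eq. (42); Reggeon residue rescaled by xi, Eq. (43)."""
    return BETA_POM * s**(ALPHA_P - 1) + XI * BETA_REG * s**(ALPHA_R - 1)


def sigma_diff(s):
    """Proton minus neutron: only the non-leading Reggeon term survives."""
    return (1 - XI) * BETA_REG * s**(ALPHA_R - 1)


for s in (10.0, 100.0, 1000.0):
    print(s, sigma_proton(s), sigma_diff(s))
```

Since \(\alpha _R-1<0\), the difference decreases with \(s\), while the individual cross sections grow slowly on account of \(\alpha _P>1\).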

The parameterisations for the structure function \(F_2\) of Capella et al. [78, 79] and for the ratio \(\sigma _{L}/\sigma _{T}\) of Sibirtsev et al. [80] provide an alternative Regge representation of the cross sections, which we refer to as CS. In Fig. 3, the consistency check made at the transition point \(W=1.3\) is repeated for \(W=3\). The plot shows that, for \(Q^2<1.4\), the central representation of AI is indeed contained in the uncertainty band attached to BC, while the one of CS runs above it. The comparison indicates that, at low values of \(Q^2\), working with AI yields a coherent picture, while with CS this is not the case. For \(Q^2>2\), however, the situation is reversed: there, the AI-representation yields values for the difference between the transverse cross sections that are too small while the CS-representation is consistent with the values obtained from BC. This confirms the conclusion reached in [72]: the above form of the AI-representation applies as it stands only for \(Q^2 < 1\). At higher values of \(Q^2\), the parameterisation underestimates the size of the structure function \(F_2\) and further contributions have to be added for the vector-meson-dominance formulae to become compatible with the observed behaviour. Since we do not account for these and the uncertainties we attach to the central representation do not cover the gap, the input we are working with becomes incoherent for \(Q^2>2\). The right panel, on the other hand, shows that the representations we are using for the longitudinal cross section do survive the consistency test, irrespective of the value of \(Q^2\).

Note that we are discussing the properties of the difference between the proton and neutron cross sections. The main problem here is that all of the well-established properties of the proton cross sections drop out when taking the difference between proton and neutron. High precision is required to measure the remainder, in particular also at high energies, where the Pomeron dominates the scenery. Also, since the longitudinal cross section is significantly smaller than the transverse one, pinning it down accurately is notoriously difficult. In both of the above representations, the ratio \(\sigma _{L}/\sigma _{T}\) is taken to be energy-independent. This appears to be consistent with experiment, but since we are not aware of a theoretical explanation, a test of the validity of this assumption would be very useful. For recent applications of these representations to the amplitudes under consideration we refer to [71, 81, 82].

Fig. 3 Consistency check at the transition point \(W = 3\); see main text

5.2 Evaluation of \(\Sigma ^{T}\) and \(\Sigma _2\) for the proton

We start the discussion of the cross section integrals with the one over the transverse cross section, \(\Sigma ^{T}(Q^2)\), which is specified in (22). The value at the origin is relevant for the sum of the electric and magnetic polarisabilities, \(\Sigma ^{T}(0)=\alpha _E+\beta _M\). Since the longitudinal cross section vanishes at \(Q^2=0\), the function \(\Sigma _2(Q^2)= \Sigma ^{T}(Q^2)+\Sigma _2^{L}(Q^2)\) takes the same value there. In fact, Fig. 4 shows that \(\Sigma _2(Q^2)\) is dominated by the transverse part also at nonzero virtuality – the longitudinal part amounts to a modest correction.

As pointed out in [71], the structure function \(F_1(x,Q^2)\) can also be used to continue the integral relevant for the Baldin sum rule to nonzero values of \(Q^2\):

$$\begin{aligned} \Sigma _{F_1}(Q^2)= & {} \frac{8m\alpha _\text {em}}{Q^4}\int _0^{x_\text {th}}dx\, x F_1\nonumber \\= & {} \frac{1}{2\pi ^2 }\int _{\nu _\text {th}}^\infty \frac{d\nu }{\nu ^3}\; (\nu -Q^2/2m)\,\sigma _{T} . \end{aligned}$$
(44)

Since the integrand differs from the one relevant for \(\Sigma ^{T}(Q^2)\) only by the factor \(1+Q^2/\nu ^2\), the quantity \(\Sigma _{F_1}\) also reduces to \(\alpha _E+\beta _M\) when \(Q^2\) vanishes, but drops off somewhat less rapidly when \(Q^2\) grows.

Fig. 4 Cross section integrals related to the Baldin sum rule for the proton (numerical values in units of \(10^{-4}\,\text {fm}^3\)). At \(Q^2=0\), all of the quantities shown reduce to \(\alpha _E^p+\beta _M^p\). The plot focuses on small values of \(Q^2\), where the behaviour is dominated by the Nambu–Goldstone bosons – in the chiral limit these generate an infrared singularity. The short-dashed line shows the parameter-free result obtained from \(\chi \)PT at leading order. The cross section integrals are evaluated with the parameterisation specified in Sect. 5.1, except for \(\tilde{\Sigma }_{F_1}\), where the contribution from the region of the Delta is calculated with BC instead of MD

The lines for \(\Sigma ^{T}\), \(\Sigma _2\) and \(\Sigma _{F_1}\) in Fig. 4 are obtained by using the parameterisations specified in Sect. 5.1. As stated there, the contributions from the region \(W<1.3\) are evaluated with the mean of MAID and DMT, but we could just as well have used SAID – on this plot, the difference would barely be visible.

In [71], the function \(\Sigma _{F_1}(Q^2)\) is instead evaluated with the BC-parameterisation, also in the region of the \(\Delta \)-resonance. This leads to the behaviour indicated by the dash-dotted line labelled \(\tilde{\Sigma }_{F_1}\). The topmost line, which is obtained by evaluating the same formula with the MD-parameterisation, is higher by about 0.8 units. The difference is closely related to the fact that the BC-parameterisation does not respect isospin symmetry to the expected accuracy (see the discussion in Sect. 5.3).

As pointed out by Bernard et al. [83], \(\chi \)PT neatly explains the size of the combination of polarisabilities occurring in the Baldin sum rule. The parameter-free expression (32) for the leading term in the chiral perturbation series of \(\Sigma _2(Q^2)\) is shown as a dashed line. The comparison with the experimental result for \(\alpha _E+\beta _M\) shows that, at small values of \(Q^2\), the leading term of the chiral series dominates. In the limit \(Q^2\rightarrow 0\), this term reduces to

$$\begin{aligned} \alpha _E+\beta _M=\frac{11\alpha _\text {em}g_A^2}{192 \pi F_\pi ^2 M_\pi }.\end{aligned}$$
(45)

In the chiral limit this formula diverges in inverse proportion to \(M_\pi \): if the quarks are taken massless, \(T_2^\text {inel}(\nu ,q^2)\) contains an infrared singularity at \(\nu =q^2=0\).
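
The step from (32) to (45) rests on the small-\(\tau \) behaviour of the curly bracket in (32), which tends to \(-11\tau /3\). The sketch below checks this limit numerically and then evaluates (45). The values of \(g_A\), \(F_\pi \) and \(M_\pi \) are standard numbers inserted here as assumptions; they are not quoted in this section.

```python
import math

ALPHA_EM = 1 / 137.036   # fine structure constant (assumed standard value)
HBARC = 0.1973270        # GeV fm (assumed standard value)
G_A = 1.27               # axial coupling (assumed input)
F_PI = 0.0922            # GeV, pion decay constant (assumed input)
M_PI = 0.13957           # GeV, charged pion mass (assumed input)


def bracket_s2(tau):
    """Curly bracket of Eq. (32): 1 - (1 + 4 tau) arctan(sqrt(tau)) / sqrt(tau)."""
    root = math.sqrt(tau)
    return 1.0 - (1.0 + 4.0 * tau) * math.atan(root) / root


# the ratio bracket / tau should approach -11/3 as tau -> 0
slope = bracket_s2(1e-6) / 1e-6

# leading chiral term of Eq. (45), converted from GeV^-3 to 1e-4 fm^3
baldin_lo_gev3 = 11 * ALPHA_EM * G_A**2 / (192 * math.pi * F_PI**2 * M_PI)
baldin_lo = baldin_lo_gev3 * HBARC**3 / 1e-4

print(f"bracket/tau at small tau: {slope:.4f}  (expected -11/3 = {-11/3:.4f})")
print(f"leading-order alpha_E + beta_M: {baldin_lo:.1f} x 1e-4 fm^3")
```

With these inputs the leading term comes out close to 14 in units of \(10^{-4}\,\text {fm}^3\), which illustrates the statement that it dominates the experimental value of \(\alpha _E+\beta _M\).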

The same singularity also shows up in the \(Q^2\)-dependence, which exhibits the presence of an unusually small scale: at leading order of the chiral expansion, the function \(\Sigma _2(Q^2)\) depends on \(Q^2\) only via the variable \(\tau =Q^2/4M_\pi ^2\). Hence the scale is set by \(2M_\pi \) rather than \(M_\rho \). Figure 4 shows that, in reality, \(\Sigma _2(Q^2)\) drops even more rapidly, partly on account of the second-sheet pole associated with the \(\Delta \), partly due to other higher-order contributions of the chiral series [40, 46, 84, 85].

The spike seen in Fig. 4 at tiny values of \(Q^2\) illustrates the artefact mentioned in Sect. 5.1, which concerns the contribution from the resonance N(1530): if the numerical values of the integrals in the region \(0.002 < Q^2 <0.005\) are fitted with a low-order polynomial, the extrapolation to \(Q^2=0\) is higher than the result of the direct evaluation at \(Q^2=0\), by about 0.4 units. Since the experimental information from real Compton scattering and from photoproduction is more stringent than that from electron scattering, which for these very small values of \(Q^2\) necessarily involves extrapolations, we think that the results obtained by evaluating the integral over the transverse cross section at \(Q^2=0\) are more reliable. The value obtained there with MAID or DMT is \((\alpha _E+\beta _M)^p=14.1\), while SAID yields a result that is lower by about 0.1 units. The numbers obtained at \(Q^2=0\) with the parameterisations we are using thus agree with the result \((\alpha _E+\beta _M)^p=13.8(4)\) quoted in the review [37], which stems from [48].

5.3 \(\Sigma ^{T}\) and \(\Sigma _2\): proton–neutron difference

Figure 5 shows the difference between the integrals over the proton and neutron cross sections. The picture looks very different from Fig. 4: while there, the curves start at \(\Sigma \simeq 14\) and rapidly drop with \(Q^2\), those in Fig. 5 start at \(\Sigma \simeq 0\) and stay there. Since the integrals under consideration are rapidly convergent, the behaviour of the cross sections in the resonance region is relevant. The reason why not much is left in the difference between proton and neutron is that, in that region, the proton and neutron cross sections are nearly the same. In particular, as mentioned in Sect. 5.1, isospin symmetry implies that the most prominent low-energy phenomenon, the \(\Delta \), drops out when taking the difference between the proton and neutron cross sections. The cancellation of the main contributions also manifests itself in the chiral perturbation series: the leading terms in \(\Sigma _2^p\) and \(\Sigma _2^n\) are large, of order \(1/M_\pi \), but the coefficients are the same, so that the chiral expansion of \(\Sigma _2^{p-n}\) only starts at O(1).

Fig. 5 Cross section integrals relevant for the difference between proton and neutron

For our cross section integrals to exhibit these features, it is essential that the representations we are using in the region of the \(\Delta \) respect isospin symmetry. The dash-dotted line illustrates the fact that the BC-parameterisation of the cross sections violates this constraint quite strongly: the bump seen around \(Q^2\simeq 0.1\) arises from the difference between proton and neutron which occurs in that parameterisation in the region of the \(\Delta \). As mentioned above, the difference between the parameterisations MD and BC in the region \(W<1.3\) also shows up in Fig. 4. It so happens that the difference between the results obtained via extrapolation from \(Q^2>0.002\) and via evaluation at \(Q^2=0\) nearly cancels the one between the contributions from the region of the \(\Delta \) obtained with BC and with MD, so that the number obtained for \(\alpha _E^p+\beta _M^p\) in [81] agrees with experiment.

The spike seen at very small values of \(Q^2\) is about twice as large as the one in Fig. 4 and manifests itself much more prominently because the difference between proton and neutron is an order of magnitude smaller than the individual terms. The value obtained at \(Q^2=0\) is consistent with the experimental result, \((\alpha _E+\beta _M)^{p-n}=-1.4(6)\).

5.4 Pomeron exchange

The integrals considered in the preceding two subsections converge rapidly. Their properties are governed by the low-energy behaviour of the cross sections – the asymptotic behaviour does not play a significant role. For the integral \(\Sigma _1^{L}(Q^2)\) specified in (23), the situation is very different: for this integral to converge, it is essential that the asymptotic behaviour of the longitudinal cross section be known, so that it can properly be accounted for. At high energies, the leading contribution stems from Pomeron exchange, which generates a branch point at \(J=1\) in the angular momentum plane. In phenomenological parameterisations, such as the one specified in (41), the branch cut is often approximated by a Regge pole in the range \(1<\alpha _P<2\). For this parameterisation to have the required asymptotic accuracy, it must describe the contribution from the Pomeron up to terms that disappear in the limit \(\nu \rightarrow \infty \).

The Regge representation we are using to describe the asymptotic behaviour of the structure functions leads to the parameterisation (27). In this framework, the Pomeron term in (41) not only generates a leading contribution to the cross section with \(\alpha =\alpha _P\), but also a daughter with \(\alpha =\alpha _P-1\). Furthermore, in contrast to the situation with the parameterisation of the contributions from the non-leading Reggeons, the value of the parameter \(s_0\) does matter here: a change in the value of \(s_0\) generates an asymptotic contribution proportional to \(\nu ^{\alpha _P-1}\). If the integral in (23) does converge for one particular value of \(s_0\), it diverges for any other value.

As an illustration of the mathematical problem we are facing here, consider a contribution of the form

$$\begin{aligned} \Delta T_1(\nu ,q^2)= & {} \frac{1}{2}\xi (q^2)\{(s_1-m^2-2m\nu -q^2)^\delta \nonumber \\&+\,(s_1-m^2+2m\nu -q^2)^\delta \}, \end{aligned}$$
(46)

which is free of fixed poles. For \(\nu \ge 0\), \(q^2\le 0\), the corresponding absorptive part is given by

$$\begin{aligned} \Delta V_1(\nu ,q^2)= & {} -\frac{\sin \pi \delta }{2\pi }\xi (q^2)\,\theta (m^2+2m\nu +q^2-s_1)\nonumber \\&\times \,(m^2+2m\nu +q^2-s_1)^\delta .\end{aligned}$$
(47)

In the limit \(\delta \rightarrow 0\), the modification of the structure function disappears, while the change in the time-ordered amplitude does not, but takes the form of a fixed-pole contribution, \(\Delta T_1(\nu ,q^2)\rightarrow \xi (q^2)\), which can have any desired value.
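
The limit can be made explicit with a short expansion, given here as a sketch. Using \(x^\delta =1+\delta \ln x+O(\delta ^2)\) and \(\sin \pi \delta =\pi \delta +O(\delta ^3)\) in (46) and (47), one finds

$$\begin{aligned} \Delta T_1(\nu ,q^2)= & {} \xi (q^2)\left\{ 1+\frac{\delta }{2}\ln \left[ (s_1-m^2-q^2)^2-4m^2\nu ^2\right] \right\} +O(\delta ^2),\nonumber \\ \Delta V_1(\nu ,q^2)= & {} -\frac{\delta }{2}\,\xi (q^2)\,\theta (m^2+2m\nu +q^2-s_1)+O(\delta ^2). \end{aligned}$$

The first line holds as written below the threshold, \(s_1-m^2-q^2>2m|\nu |\), where both powers in (46) are real; above it, the logarithm acquires an imaginary part, which is likewise of \(O(\delta )\). The absorptive part thus vanishes linearly in \(\delta \), while \(\Delta T_1\) tends to the \(\nu \)-independent term \(\xi (q^2)\).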

In short: although the hypothesis that the Reggeons properly account for the behaviour at large values of \(\nu \) uniquely determines the subtraction function even if Pomeron exchange contributes, the evaluation of (23) requires knowledge of the asymptotic behaviour to an accuracy that is beyond reach. In the absence of theoretical information about the properties of the Pomeron, we are dealing with what Hadamard [86] called an ill-posed problem: in principle, the data do determine the solution, but tiny changes in the data (structure function) can lead to substantial changes in the solution (subtraction function). For this reason, we do not discuss the sum rules for the individual polarisabilities of proton and neutron any further.

A model-independent determination of the subtraction function occurring in the dispersive representation of the proton Compton amplitude is also of interest in connection with the proton radius puzzle (for a recent review see [87]). As pointed out in [88], at least part of the discrepancy could be explained if for some reason the contribution to the Lamb shift that is governed by the virtual Compton scattering amplitude were significantly larger than expected. The \(\chi \)PT analyses [40, 46, 84, 85] as well as the recent works on effective field theory [89] and finite-energy sum rules [90] were largely motivated by this puzzle; an improved knowledge of the subtraction function would be of interest also in that context. Unfortunately, however, a major breakthrough in the theoretical understanding of the Pomeron is required before the sum rule set up above could reliably be evaluated.

5.5 Evaluation of \(\Sigma _1^{L}\) for the proton–neutron difference

In the present subsection, we focus on the difference between proton and neutron, where the Pomeron drops out: the asymptotic behaviour of \(\sigma _{L}(\nu ,Q^2)^{p-n}\) is dominated by the non-leading terms in (41), which grow less rapidly with \(\nu \), so that the problems discussed in the preceding subsection do not arise. The analysis of the difference is of interest for two reasons: (1) the sum rule is obtained under the same premises (absence of fixed poles, Reggeon dominance hypothesis) as the Cottingham formula. Consequently, confronting the result with existing experimental information on the polarisabilities, one may test the validity of this hypothesis; (2) our result for \(\alpha _E^{p-n}\) is somewhat more accurate than the determination based on the current experimental information. Combined with the experimental values of the polarisabilities of the proton and the Baldin sum rule, this yields an improved prediction for the polarisabilities of the neutron.

Figure 6 compares the integrals over the transverse and longitudinal cross sections, for the difference between proton and neutron. The function \(\Sigma ^{T}(Q^2)^{p-n}\) already occurred in Fig. 5 – we are now merely focusing on a smaller range in the variable \(Q^2\). The plot shows that the integral \(\Sigma _1^{L}(Q^2)^{p-n}\) behaves in a qualitatively different way. Both integrals are small, but while \(\Sigma ^{T}(Q^2)\) exhibits the pronounced spike at \(Q^2=0\) discussed earlier, the dependence on \(Q^2\) of \(\Sigma _1^{L}(Q^2)\) is dominated by the contribution from the region of the \(\Delta \), which is well understood – in particular, the MAID and DMT representations show nearly the same \(Q^2\)-dependence. Using the mean of the two as central value and half of the difference as an estimate for the uncertainty for the contributions from \(W<1.3\) would in our opinion represent a fair recipe, but to stay on the conservative side, we double the error estimate. For the value of the integral at \(Q^2=0\) this prescription yields \(\Sigma _1^{L}(0)_\text {MD}=-1.4(4)\). The contributions from intermediate energies, \(1.3<W<3\), are small: the estimate \(\Sigma _1^{L}(0)_\text {BC}=0.2(2)\) covers the deficiencies of the representation used there. Above that range, we use the AI-representation, attach an uncertainty of 30 % to it, and get \(\Sigma _1^{L}(0)_\text {AI}=- 0.3(1)\). Adding errors in quadrature, we finally obtain

$$\begin{aligned} \Sigma _1^{L}(0)^{p-n}=-1.6(4) .\end{aligned}$$
(48)

Fig. 6 Cross section integrals relevant for the subtraction function

5.6 Prediction for the polarisabilities of the neutron

In view of Eq. (40), the result (48) amounts to a prediction for the difference between the electric polarisabilities of proton and neutron:

$$\begin{aligned} \alpha _E^{p-n}=-1.7(4).\end{aligned}$$
(49)

This is consistent with the current experimental value, \(\alpha _E^{p-n}=-0.9(1.6)\), but significantly more precise. The numerical result obtained from the Baldin sum rule for the difference in the value of \(\alpha _E+\beta _M\) between proton and neutron, \((\alpha _E+\beta _M)^{p-n}=-1.4(6)\), then implies

$$\begin{aligned} \beta _M^{p-n}=0.3(7).\end{aligned}$$
(50)

According to (36), this result also determines the value of the subtraction function relevant for the self-energy difference at \(Q^2=0\):

$$\begin{aligned} S_1^\text {inel}(0)^{p-n}=-0.3(1.2)\,\text {GeV}^{-2} .\end{aligned}$$
(51)

Finally, combining the current experimental result for the electric and magnetic polarisabilities of the proton, \(\alpha _E^p=10.65(50)\) and \(\beta _M^p=3.15(50)\), with the numbers for \(\alpha _E^{p-n}\) and \((\alpha _E+\beta _M)^{p-n}\), we arrive at a prediction for the electric and magnetic polarisabilities of the neutron:

$$\begin{aligned} \alpha _E^n=12.3(7),\quad \beta _M^n=2.9(0.9).\end{aligned}$$
(52)

These are also consistent with the current experimental values, \(\alpha _E^n=11.55(1.5)\), \(\beta _M^n=3.65(1.50)\), and more precise.
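
The arithmetic behind (50) and (52) can be retraced with a few lines. The sketch below is a naive propagation that starts from the rounded numbers quoted in the text and treats all errors as uncorrelated; both are assumptions of this illustration, so agreement with the quoted results can only be expected up to rounding in the last digit.

```python
import math

# inputs quoted in the text, in units of 1e-4 fm^3
alpha_p, alpha_p_err = 10.65, 0.50       # alpha_E of the proton
beta_p, beta_p_err = 3.15, 0.50          # beta_M of the proton
alpha_diff, alpha_diff_err = -1.7, 0.4   # alpha_E^{p-n}, Eq. (49)
sum_diff, sum_diff_err = -1.4, 0.6       # (alpha_E + beta_M)^{p-n}, Baldin sum rule

# Eq. (50): beta_M^{p-n} = (alpha_E + beta_M)^{p-n} - alpha_E^{p-n}
beta_diff = sum_diff - alpha_diff
beta_diff_err = math.hypot(sum_diff_err, alpha_diff_err)

# Eq. (52): neutron values from the proton values and the differences
alpha_n = alpha_p - alpha_diff
alpha_n_err = math.hypot(alpha_p_err, alpha_diff_err)
beta_n = beta_p - beta_diff
beta_n_err = math.hypot(beta_p_err, beta_diff_err)

print(f"beta_M^(p-n) = {beta_diff:.2f} +- {beta_diff_err:.2f}")
print(f"alpha_E^n    = {alpha_n:.2f} +- {alpha_n_err:.2f}")
print(f"beta_M^n     = {beta_n:.2f} +- {beta_n_err:.2f}")
```

Uncorrelated errors are used here for simplicity only; as noted in the caption of Table 1, the quoted errors for \(\alpha _E\) and \(\beta _M\) are anticorrelated.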

Note that the procedure used avoids relying on the available parameterisations of the transverse cross section. These contain sharp spikes at very small values of \(Q^2\), which make the evaluation of \(\Sigma ^{T}(0)\) problematic. We make use of the fact that those present in the longitudinal cross section are much milder and allow us to assign a meaningful uncertainty to \(\Sigma _1^{L}(0)\). We also emphasise that the fluctuations exclusively affect the behaviour at small values of \(Q^2\). For the evaluation of the electromagnetic self-energy to be discussed in Sect. 6, these deficiencies are of no concern, because phase space suppresses the contributions from the vicinity of the point \(Q^2=0\).

5.7 Result for the subtraction function

According to (26), the inelastic part of the subtraction function relevant for the self-energy is determined by the difference between the integrals \(\Sigma _1^{L}(Q^2)^{p-n}\) and \(\Sigma ^{T}(Q^2)^{p-n}\). The central values of these integrals are shown in Fig. 6. The narrow band in Fig. 7 indicates the corresponding result for the subtraction function. The width of the band is obtained by evaluating the uncertainties in the contributions arising from the three subintervals, separately for the transverse and longitudinal contributions, and adding the results in quadrature. For better visibility, the vertical axis is stretched with the inverse of the dipole form factor, \(N=(1+Q^2/M_d^2)^2\), \(M_d^2=0.71\,\text {GeV}^2\). As discussed in Sect. 5.1, the region \(Q^2<0.5\) contains unphysical fluctuations – this is why we chop the uncertainty band off there. Note also that, although the calculation returns reasonable results even at \(Q^2=2\), it is not reliable there, because it does not account for the contributions by which the AI-parameterisation needs to be supplemented in order to agree with experiment at those values of \(Q^2\) (see Sect. 5.1).

Fig. 7 Momentum dependence of the subtraction function (GeV units). For better visibility, the vertical axis is stretched with the inverse of the dipole form factor, \(N=(1+Q^2/M_d^2)^2\), \(M_d^2=0.71\,\text {GeV}^2\). In this normalisation, the ansatz proposed in [31] (WCM) represents a broad band of nearly constant width, determined by the experimental value of the difference between the magnetic polarisabilities of proton and neutron. The curves are drawn for the current experimental value, which is indicated by the error bar on the left and concerns the value at \(Q^2=0\), but is displaced to make it visible. The range obtained with the model in [36] (ESTY) starts with the same width at \(Q^2=0\), but shrinks as \(Q^2\) grows. The comparatively narrow third band represents our work. We do not present an error estimate in the region \(0<Q^2<0.5\), because there our results are sensitive to the inadequacies of the parameterisations used for the cross sections, but we do show our prediction for the value of the subtraction function at \(Q^2=0\)

The figure also indicates the value \(S_1^\text {inel}(0)^{p-n}= 1.0(2.7)\) obtained from the current experimental result for \(\beta _M^{p-n}\), as well as our prediction in (51). These numbers concern the value of the subtraction function at \(Q^2=0\), but are slightly displaced for better visibility.
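The construction of the band can be illustrated with a minimal sketch. It only shows the two operations described in the text: adding the subinterval uncertainties of the transverse and longitudinal contributions in quadrature, and stretching with the factor \(N\). The numerical uncertainties below are hypothetical placeholders chosen for illustration, not values from our analysis, and the function names are our own.

```python
import math

M_D2 = 0.71  # GeV^2, dipole mass squared quoted in the text


def dipole_stretch(q2):
    """Factor N = (1 + Q^2/M_d^2)^2 used to stretch the vertical axis."""
    return (1.0 + q2 / M_D2) ** 2


def band_half_width(errors_transverse, errors_longitudinal):
    """Add the subinterval uncertainties of both contributions in quadrature."""
    all_errors = list(errors_transverse) + list(errors_longitudinal)
    return math.sqrt(sum(e * e for e in all_errors))


# Hypothetical uncertainties (arbitrary units) for the three subintervals.
sigma_t = [0.03, 0.04, 0.0]
sigma_l = [0.0, 0.0, 0.12]

half_width = band_half_width(sigma_t, sigma_l)
# Width as it would appear on the stretched axis at Q^2 = M_d^2, where N = 4.
stretched_half_width = dipole_stretch(M_D2) * half_width
```

Because the errors are combined in quadrature, the largest single subinterval uncertainty dominates the width of the band.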

5.8 Comparison with previous work

Recently, Walker-Loud et al. [31] proposed a simple ansatz for the subtraction function. In our notation, their proposal amounts to

$$\begin{aligned} S_\text {WCM}(q^2)= & {} -\left( \frac{m_0^2}{m_0^2-q^2}\right) ^{2}\frac{m\beta _M}{\alpha _\text {em}}\nonumber \\&+\,\frac{1}{q^2}\{G_M^2(q^2)-F_D^2(q^2)\}. \end{aligned}$$
(53)

The singularity at \(q^2=0\) arises from the elastic contribution in (15). The corresponding expression for the inelastic part of the subtraction function,

$$\begin{aligned} S_\text {WCM}^\text {inel}(q^2)= & {} -\left( \frac{m_0^2}{m_0^2-q^2}\right) ^{2}\frac{m\beta _M}{\alpha _\text {em}}\nonumber \\&-\frac{4m^2\{G_E(q^2)-G_M(q^2)\}^2}{(4m^2-q^2)^2}, \end{aligned}$$
(54)

is regular at \(q^2=0\) and one readily checks that the ansatz is consistent with the low-energy theorem (36). It amounts to an extrapolation of that formula to nonzero values of \(q^2\), controlled by the parameter \(m_0\). In Fig. 7, this expression is indicated as a broad band of nearly constant width.

The ansatz (53) for the subtraction function generates a logarithmic divergence in the integral (62) for the corresponding contribution to the self-energy difference. As discussed in Sect. 6.1, the self-energy difference indeed diverges logarithmically. The divergence is absorbed in the electromagnetic renormalisation of \(m_u\) and \(m_d\), which is of order \(e^2m_u\), \(e^2m_d\). As pointed out by Erben et al. [36], the logarithmic divergence generated by the ansatz (53) is not proportional to the masses of the two lightest quarks and can thus not be absorbed in their renormalisation: the particular extrapolation proposed in [31] is not consistent with the short-distance properties of QCD. The variant proposed in [36],

$$\begin{aligned} S_\text {ESTY}^\text {inel}(q^2)= & {} -\left( \frac{m_1^2-c\,q^2}{m_1^2-q^2}\right) \left( \frac{m_1^2}{m_1^2-q^2}\right) ^{2}\frac{m\beta _M}{\alpha _\text {em}}\nonumber \\&-\frac{4m^2\{G_E(q^2)-G_M(q^2)\}^2}{(4m^2-q^2)^2}, \end{aligned}$$
(55)

repairs this shortcoming, as it disconnects the behaviour at small values of \(q^2\) from the asymptotic behaviour. This expression is represented by the central band that gradually shrinks as \(Q^2\) increases.
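The different asymptotic behaviour of the two models can be made explicit with a small sketch. It keeps only the polarisability term of (54) and (55), continued to \(q^2=-Q^2\) and weighted with \(Q^2\) as in the self-energy integral, and uses the closed forms of the resulting integrals. The common prefactor \(m\beta _M/\alpha _\text {em}\) is dropped, the form-factor term (identical in both models) is ignored, \(c=0\) is chosen for simplicity, and the mass parameter of \(0.71\,\text {GeV}^2\) is an assumption made only for illustration.

```python
import math


def wcm_weighted_integral(cutoff2, m0_sq):
    """Integral of Q^2 * (m0^2/(m0^2+Q^2))^2 over Q^2 from 0 to cutoff2."""
    a, big = m0_sq, cutoff2
    return a * a * (math.log((a + big) / a) + a / (a + big) - 1.0)


def esty_c0_weighted_integral(cutoff2, m1_sq):
    """Integral of Q^2 * (m1^2/(m1^2+Q^2))^3, i.e. the c = 0 variant of (55)."""
    a, big = m1_sq, cutoff2
    return a ** 3 * (1.0 / (2.0 * a) - 1.0 / (a + big) + a / (2.0 * (a + big) ** 2))


MASS_SQ = 0.71  # GeV^2, hypothetical value used for both m0^2 and m1^2

# Growth of the WCM-type integral when the cut-off is raised by a factor 1000:
# it approaches m0^4 * log(1000), i.e. a logarithmic divergence.
wcm_growth = wcm_weighted_integral(1e6, MASS_SQ) - wcm_weighted_integral(1e3, MASS_SQ)

# The c = 0 variant converges to m1^4 / 2 when the cut-off is removed.
esty_limit = esty_c0_weighted_integral(1e8, MASS_SQ)
```

The extra power of \(m_1^2/(m_1^2-q^2)\) is what makes the integral converge in this simplified setting. For \(c\ne 0\), a logarithmic term with coefficient proportional to \(c\) reappears.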

The explicit choice made in [36] for the coefficient c implicitly assumes that the contribution from the subtracted dispersion integral, \(m_\gamma ^{\text {disp}}\), stays finite when the cut-off is removed, so that the logarithmic divergence then exclusively arises from the term \(m_\gamma ^S\), which stems from the subtraction function. As discussed in detail in [4], however, the deep inelastic region also contributes to the coefficient of the logarithmic divergence. The scaling violations do not extinguish this contribution [26]. Hence the choice made for c cannot be taken literally, but it does have the proper quark mass factors, so that the divergence arising from the subtraction function is suppressed. Since the authors cut the integral over the subtraction function off at \(\Lambda ^2=2\,\text {GeV}^2\), it barely makes any difference whether c is set equal to zero or taken from [36]. In fact, one of the variants of the model studied in [35] does correspond to \(c=0\).

6 Self-energy

6.1 Cottingham formula

The electromagnetic self-energy of a hadron diverges logarithmically. To first order in \(\alpha _\text {em}\) the renormalised electromagnetic Lagrangian requires counter terms proportional to the operators \(\mathbbm {1}\), \(\bar{q}q\) and \(O_G =G_{\mu \nu }^a G^{a\mu \nu }\):

$$\begin{aligned} \mathcal{L}_\text {em}= & {} -\frac{e^2}{2}\int \mathrm{d}^4y \tilde{D}_\Lambda (x-y)Tj^\mu (x)j_\mu (y)+\Delta E\,\mathbbm {1}\nonumber \\&+\sum _{q=u,d,\ldots }\delta m_q\,\bar{q}q-\frac{\delta g}{2g^3}O_G, \end{aligned}$$
(56)

where \(\tilde{D}_\Lambda (x)\) is the regularised photon propagator in coordinate space. The counter term proportional to the unit operator does not contribute to the self-energy. The remainder is determined by the renormalisation of the quark masses and of the coupling constant g required by the electromagnetic interaction. To leading order, these are given by (see for instance [91])

$$\begin{aligned}&\delta m_q=\frac{3e^2}{16\pi ^2}\,\log \frac{\Lambda ^2}{\mu ^2}\; Q_q^2\,m_q,\nonumber \\&\delta g= -\frac{e^2g^3}{256\pi ^4}\,\log \frac{\Lambda ^2}{\mu ^2}\sum _{q=u,d,\ldots }Q_q^2 . \end{aligned}$$
(57)

The form of the regularisation used for the photon propagator is irrelevant – it exclusively affects the value of the running scale \(\mu \).

The proton and neutron matrix elements of the operator (56) lead to a version of the Cottingham formula [12] that is valid in QCD:

$$\begin{aligned} m_\gamma= & {} \frac{i e^2}{2m(2\pi )^4} \int \mathrm{d}^4q D_\Lambda (q^2)\{3 q^2 T_1+(2\nu ^2+q^2)T_2\}\nonumber \\&+\,\text {counter terms}. \end{aligned}$$
(58)

It represents the electromagnetic self-energy in terms of the time-ordered amplitudes \(T_1\) and \(T_2\) specified in Appendix A.

6.2 Elastic part of the self-energy

Analogously to the electric and magnetic polarisabilities, the self-energy also consists of an elastic and an inelastic part,

$$\begin{aligned} m_\gamma = m_\gamma ^\text {el}+ m_\gamma ^\text {inel}.\end{aligned}$$
(59)

The contribution from the elastic intermediate states remains finite when the cut-off is removed. It is obtained by replacing \(T_1,T_2\) with the elastic parts \(T_1^\text {el},T_2^\text {el}\), which are given explicitly in (15), and replacing \(D_\Lambda (q^2)\) with the full photon propagator, \(D(q^2)=(-q^2-i\epsilon )^{-1}\). With a Wick rotation, the expression can be brought to the form

$$\begin{aligned}&m^\text {el}_\gamma =\frac{\alpha _\text {em}}{8\pi m^3}\int _0^\infty dQ^2 Q^2\{f_1 \,v_1^\text {el}(-Q^2)+f_2 \,v_2^\text {el}(-Q^2)\},\nonumber \\&f_1= 3\left\{ \sqrt{1+\frac{1}{y}}-1\right\} ,\nonumber \\&f_2= (1-2y)\sqrt{1+\frac{1}{y}}+2y, \end{aligned}$$
(60)

where \(v_1^\text {el}(q^2)\) and \(v_2^\text {el}(q^2)\) represent the sums of squares of form factors specified in (10). The variable y stands for \(y\equiv \nu ^2/Q^2\). For the elastic contribution, which is concentrated on the line \(Q^2=2m\nu \), we have \(y=Q^2/4m^2\). In [4], the dipole approximation for the Sachs form factors was used, which yields \( m_\gamma ^\text {el}=0.63\,\text {MeV}\) for the proton and \(-0.13\,\text {MeV}\) for the neutron, so that the elastic contribution to the self-energy difference amounts to \((m_\gamma ^\text {el})^{p-n} =0.76\,\text {MeV}\).
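For orientation, the kinematic weights in (60) are easily evaluated numerically. The sketch below implements \(f_1\) and \(f_2\) as written above and the value of y on the elastic line. The nucleon mass of \(0.938\,\text {GeV}\) is an assumed round value, and the function names are ours. The form factor combinations \(v_1^\text {el}\), \(v_2^\text {el}\) of (10) are not reproduced here.

```python
import math

NUCLEON_MASS = 0.938  # GeV, assumed round value


def f1(y):
    """Weight of v_1 in (60): 3 * (sqrt(1 + 1/y) - 1)."""
    return 3.0 * (math.sqrt(1.0 + 1.0 / y) - 1.0)


def f2(y):
    """Weight of v_2 in (60): (1 - 2y) * sqrt(1 + 1/y) + 2y."""
    return (1.0 - 2.0 * y) * math.sqrt(1.0 + 1.0 / y) + 2.0 * y


def y_elastic(q2, m=NUCLEON_MASS):
    """y = Q^2 / (4 m^2) on the elastic line Q^2 = 2 m nu."""
    return q2 / (4.0 * m * m)


# Elastic self-energies in the dipole approximation quoted from [4] (MeV).
m_el_proton = 0.63
m_el_neutron = -0.13
m_el_difference = m_el_proton - m_el_neutron  # 0.76 MeV
```

Both weights are positive and decrease monotonically with y, so the elastic integral is dominated by small values of \(Q^2\), where the form factors are best known.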

In the meantime, the precision to which the form factors are known has increased significantly. For a thorough review of the experimental information, we refer to [92]. The above estimates of the elastic contributions to the proton and neutron self-energies do receive significant corrections, but the difference between proton and neutron is affected by less than \(0.02\,\text {MeV}\). Compared to the uncertainties in the contributions arising from the deep inelastic region, the departures from the dipole approximation are too small to matter.

6.3 Inelastic part of the self-energy

The inelastic part receives three distinct contributions:

$$\begin{aligned} m_\gamma ^\text {inel}= m^S_\gamma +m^\text {disp}_\gamma + m^\text {ct}_\gamma .\end{aligned}$$
(61)

The term \(m^S_\gamma \) arises from the subtraction function \(S^\text {inel}_1(q^2)\), \(m^{\text {disp}}_\gamma \) is given by a dispersion integral over the structure functions, and \(m^{\text {ct}}_\gamma \) accounts for the fact that the electromagnetic interaction renormalises the quark masses as well as the coupling constant of QCD. In the above discussion of the polarisabilities, renormalisation did not play any role, because these concern the properties of \(T_1,T_2\) at low energies. In fact, the inelastic part of the magnetic polarisability exclusively picks up the contribution from the subtraction function specified in (36). In the decomposition used in (61), we have \(\beta ^\text {inel}=\beta ^S\), \(\beta ^{\text {disp}} =\beta ^{\text {ct}}=0\).

The term \( m^S_\gamma \) is obtained by replacing \(T_1(\nu ,q^2)\) in (58) by the subtraction function \(S^\text {inel}_1(q^2)\), performing a Wick rotation, and averaging over the directions of the Euclidean momentum. The result reads

$$\begin{aligned} m_\gamma ^S=\frac{3\alpha _\text {em}}{8\pi m}\int _0^{\Lambda ^2}dQ^2\,Q^2\, S^\text {inel}_1(-Q^2).\end{aligned}$$
(62)

This term measures the size of the self-energy arising from the subtraction function (more precisely, the inelastic part thereof – the remainder is included in \(m_\gamma ^\text {el}\)).
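The steps leading from (58) to (62) can be spelt out as follows. The sketch assumes the convention \(e^2=4\pi \alpha _\text {em}\), the Wick rotation \(q^0=iQ^0\) (so that \(\mathrm{d}^4q=i\,\mathrm{d}^4Q\) and \(q^2=-Q^2\)), and the cut-off form \(D_\Lambda (-Q^2)=\theta (\Lambda ^2-Q^2)/Q^2\) of the regularised propagator. Since the subtraction function depends only on \(Q^2\), the angular average reduces to \(\int \mathrm{d}^4Q\rightarrow \pi ^2\int \mathrm{d}Q^2\,Q^2\):

$$\begin{aligned} m_\gamma ^S&=\frac{i e^2}{2m(2\pi )^4}\int i\,\mathrm{d}^4Q\,\frac{\theta (\Lambda ^2-Q^2)}{Q^2}\,(-3Q^2)\,S^\text {inel}_1(-Q^2)\\ &=\frac{3e^2}{32\pi ^4 m}\,\pi ^2\int _0^{\Lambda ^2}\mathrm{d}Q^2\,Q^2\,S^\text {inel}_1(-Q^2) =\frac{3\alpha _\text {em}}{8\pi m}\int _0^{\Lambda ^2}\mathrm{d}Q^2\,Q^2\,S^\text {inel}_1(-Q^2). \end{aligned}$$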

The second term on the right of (61) is obtained by replacing the amplitudes \(T_1,T_2\) with their inelastic parts \(T_1^\text {inel},T_2^\text {inel}\) and dropping the contribution from the subtraction function in the dispersive representation for \(T_1^\text {inel}\). The explicit expression reads

$$\begin{aligned} m_\gamma ^{\text {disp}}= & {} \frac{\alpha _\text {em}}{2\pi m}\int _0^{\Lambda ^2}dQ^2 Q^2\int _{\nu _\text {th}}^\infty d\nu \,\nu \nonumber \\&\times \,\Bigg \{\bigg (f_1-\frac{3}{2y}\bigg ) \,V_1(\nu ,-Q^2)+f_2 \,V_2(\nu ,-Q^2)\Bigg \}. \end{aligned}$$
(63)

The term with \(3/2y\) makes the difference between the unsubtracted and subtracted dispersion integral over \(V_1\): it removes the leading term in the behaviour of \(f_1\) when \(Q^2\) is held fixed and \(\nu \) tends to \(\infty \), so that the integral over \(\nu \) converges, despite the growth of \(V_1\) generated by Reggeon exchange. On the other hand, when \(Q^2\) becomes large, the behaviour in the deep inelastic region is relevant. In QCD, the contributions from that region diverge logarithmically if the cut-off is removed. In (63), we have simply cut the integral off at \(Q^2=\Lambda ^2\) – this amounts to a regularisation of the photon propagator in Euclidean space: \(D_\Lambda (-Q^2)=\theta (\Lambda ^2-Q^2)/Q^2\).
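The effect of the subtraction on the kernel can be read off from the expansion of the weights in (60) for large y, obtained here directly from the expressions given there:

$$\begin{aligned} \sqrt{1+\frac{1}{y}}&=1+\frac{1}{2y}-\frac{1}{8y^2}+O(y^{-3}),\\ f_1-\frac{3}{2y}&=-\frac{3}{8y^2}+O(y^{-3}),\qquad f_2=\frac{3}{4y}+O(y^{-2}). \end{aligned}$$

With \(y=\nu ^2/Q^2\), the weight multiplying \(V_1\) thus falls off as \(\nu ^{-4}\) at fixed \(Q^2\) once the term \(3/2y\) is removed, compared with \(\nu ^{-2}\) for \(f_1\) alone.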

In the normalisation of the states (A.2), the mass shift generated by the counter terms in (56) is given by

$$\begin{aligned} m^{\text {ct}}_\gamma =-\sum _{q=u,d,\ldots }\frac{\delta m_q}{2m}\,\langle p|\bar{q}q|p\rangle +\frac{\delta g}{4mg^3}\langle p|O_G|p\rangle .\end{aligned}$$
(64)

Neglecting second-order isospin-breaking effects proportional to \(e^2(m_u-m_d)\), the proton and neutron matrix elements of operators with isospin zero are the same. Hence the operators \(O_G\), \(\bar{s}s\), \(\bar{c}c\), ... drop out in the self-energy difference. Moreover, isospin symmetry relates the neutron matrix elements of the light quarks to those for the proton, e.g. \(\langle k|\bar{u}u| k \rangle ^{n}=\langle k|\bar{d}d| k \rangle ^{p}\). Using these properties, the contribution from the electromagnetic renormalisation of the quark masses to the self-energy difference can be brought to the form

$$\begin{aligned} (m^{\text {ct}}_\gamma )^{p-n} =-\frac{\alpha _\text {em}}{24\pi m}(4m_u-m_d)\log \frac{\Lambda ^2}{\mu ^2} \langle p| \bar{u}u-\,\bar{d}d |p\rangle . \end{aligned}$$
(65)

The formula shows that the coefficient of the logarithmic divergence is proportional to the masses of the two lightest quarks. In the chiral limit the divergence disappears altogether: if u and d are taken massless, the self-energy difference approaches a finite limit if the cut-off is removed. In reality, the contributions from the deep inelastic region do generate a logarithmic divergence, albeit with a small coefficient. An update of the analysis performed in [4] is needed to account for the scaling violations in the corresponding contributions to the renormalised self-energy difference.
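For completeness, the steps that lead from (57) and (64) to (65) can be sketched as follows, assuming the quark charges \(Q_u=2/3\), \(Q_d=-1/3\) and the convention \(e^2=4\pi \alpha _\text {em}\). Isospin symmetry turns the neutron matrix elements of \(\bar{u}u\) and \(\bar{d}d\) into the proton matrix elements of \(\bar{d}d\) and \(\bar{u}u\), respectively, so that only the difference \(\delta m_u-\delta m_d\) survives:

$$\begin{aligned} (m^{\text {ct}}_\gamma )^{p-n}&=-\frac{\delta m_u-\delta m_d}{2m}\,\langle p| \bar{u}u-\bar{d}d |p\rangle ,\\ \delta m_u-\delta m_d&=\frac{3e^2}{16\pi ^2}\,\log \frac{\Lambda ^2}{\mu ^2}\left( \frac{4}{9}\,m_u-\frac{1}{9}\,m_d\right) =\frac{\alpha _\text {em}}{12\pi }\,(4m_u-m_d)\log \frac{\Lambda ^2}{\mu ^2}. \end{aligned}$$

Inserting the second line in the first reproduces the coefficient \(\alpha _\text {em}/24\pi m\) in (65).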

6.4 Numerical evaluation

In [31], the contribution from the subtraction function to the self-energy difference is evaluated with \(\Lambda ^2=2\,\text {GeV}^2\). According to (26), the inelastic part of the subtraction function is given by the difference between two cross section integrals. The part which involves the transverse cross sections, \(\Sigma ^{T}(Q^2)\), generates a convergent contribution to Eq. (62) for the self-energy. As discussed above, our numerical representation of \(\Sigma ^{T}(Q^2)\) becomes incoherent at values of \(Q^2\) below 0.5, but phase space suppresses that region, so that our estimate, \(m_\gamma ^S(\Sigma ^{T})\simeq -0.14\,\text {MeV}\), should be close to the truth (actually, with a coherent representation of the available experimental information, this part could be evaluated rather accurately, even without cutting the integral off). The corresponding integral over the longitudinal cross section, \(\Sigma _1^{L}(Q^2)\), is less sensitive to the shortcomings of the representation we are using (this is why we were able to obtain a rather accurate prediction for the difference between the electric polarisabilities of proton and neutron). Numerically, the contribution from that integral to the self-energy difference is tiny: \(m_\gamma ^S(\Sigma _1^{L})\simeq -0.03\,\text {MeV}\). In other words: the contributions from the Reggeons do require a subtraction, but taken together with those arising from low energies, the entire contribution from the longitudinal cross section to the subtraction function generates a negligibly small part of the self-energy difference. Together with the number for the contributions from the transverse cross section given above, we obtain

$$\begin{aligned} m_\gamma ^S=-0.17\,\text {MeV}.\end{aligned}$$
(66)

This is to be compared with the number obtained by instead inserting Eq. (54) in Eq. (62). With the central value \(\beta _M^{p-n}= -1\) used as input in [31], we obtain \((m_\gamma ^S)^\text {WCM}=0.50\,\text {MeV}\). Keeping all other parts of the calculation in [31] as they are, but replacing the ansatz for the subtraction function made there with our prediction, the numerical result for the self-energy difference, \(m_\gamma ^\text {WCM}=1.30\,\text {MeV}\), is lowered by \(0.67\,\text {MeV}\), so that the central value becomes \(m_\gamma =0.63\,\text {MeV}\). Repeating the exercise with the model of [36], i.e. replacing Eq. (54) by (55), we instead obtain \((m_\gamma ^S)^\text {ESTY}=0.20\,\text {MeV}\), so that in this case, the central value \(m_\gamma ^\text {ESTY}=1.04\,\text {MeV}\) is lowered by \(0.37\,\text {MeV}\), which leads to \(m_\gamma =0.67\,\text {MeV}\). In either case, the early estimate obtained in [4], \(m_\gamma ^\text {GL}=0.76(30)\,\text {MeV}\), is confirmed. Comparing their parameterisation with recent lattice data on the electromagnetic self-energy difference, the authors of [31, 35] obtain results for the difference of the magnetic polarisabilities, \(\beta _M^{p-n}=-0.87(85)\) and \(\beta _M^{p-n}=-1.12(40)\), respectively, which are lower than our prediction in (50). The difference reflects the fact that, in Fig. 7, the bands that correspond to their models run above ours. While these extractions involve a model dependence which is difficult to quantify, there has recently been progress in the direct calculation of the polarisability from the lattice; see [93].
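The bookkeeping behind these numbers is simple enough to be checked in a few lines. The sketch below only re-adds the central values quoted in the text (all in MeV) and carries no information on uncertainties. The helper function is ours and merely expresses the replacement of one subtraction term by another.

```python
# Central values quoted in the text, in MeV.
m_s_transverse = -0.14    # contribution of Sigma^T to (62)
m_s_longitudinal = -0.03  # contribution of Sigma_1^L to (62)
m_s_ours = m_s_transverse + m_s_longitudinal  # value in (66)


def replace_subtraction_term(total_model, m_s_model, m_s_new):
    """Swap the model's subtraction-function term for a new one, keeping the rest."""
    return total_model - m_s_model + m_s_new


# WCM [31]: total 1.30, subtraction-function term 0.50.
m_gamma_wcm_replaced = replace_subtraction_term(1.30, 0.50, m_s_ours)
# ESTY [36]: total 1.04, subtraction-function term 0.20.
m_gamma_esty_replaced = replace_subtraction_term(1.04, 0.20, m_s_ours)

# Early estimate of [4]: 0.76(30).
gl_central, gl_error = 0.76, 0.30
within_gl_range = all(
    abs(value - gl_central) <= gl_error
    for value in (m_gamma_wcm_replaced, m_gamma_esty_replaced)
)
```

Both modified central values land within the range \(0.76(30)\,\text {MeV}\) of the early estimate.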

Note that the momentum dependence of the subtraction function must match the behaviour in the deep inelastic region. Taken by itself, the contribution from the subtraction function is very sensitive to the choice of the cut-off \(\Lambda \). As shown in [4], the term \(m_\gamma ^{\text {disp}}\) is equally sensitive, but the sum of the two contributions is nearly independent of \(\Lambda \), because the Cottingham formula only contains the very weak logarithmic divergence that is related to the electromagnetic renormalisation of the quark masses \(m_u\) and \(m_d\). As indicated in (65), the coefficient of the divergence is proportional to these masses and hence very small. Also, it does not come exclusively from the subtraction function. The contributions to \(m_\gamma ^{\text {disp}}\) arising from the deep inelastic region contribute to the coefficient of the logarithmic divergence as well. These were estimated in [4] on the basis of the data available at the time, which did not show any violations of Bjorken scaling. In the meantime, there has been considerable progress in understanding the properties of the structure functions in the deep inelastic region and there is very clear evidence for scaling violations. For a thorough review of these developments, we refer to [94]. A corresponding update of the results obtained on the basis of the Cottingham formula would be of high interest, also in view of the progress made in calculating electromagnetic self-energies on the lattice, but this goes beyond the scope of the present paper.

7 Summary and conclusion

  1.

    Causality relates the imaginary part of the amplitude for Compton scattering on the nucleon in the forward direction to the cross section of the process \(e+N\rightarrow e+\text {anything}\). The relation holds for real photons as well as virtual photons of space-like momentum, \(q^2\le 0\). The spin-averaged forward scattering amplitude involves two invariants, which we denote by \(T_1(\nu ,q^2)\) and \(T_2(\nu ,q^2)\). Their imaginary parts are determined by the transverse and longitudinal cross sections of electron scattering, \(\sigma _{T}\) and \(\sigma _{L}\).

  2.

    Regge asymptotics implies that only \(T_2(\nu ,q^2)\) obeys an unsubtracted fixed-\(q^2\) dispersion relation, while the one for \(T_1(\nu ,q^2)\) requires a subtraction, which represents the value of the amplitude at \(\nu =0\): \(S_1(q^2)=T_1(0,q^2)\). The dispersive representation of the spin-averaged forward Compton scattering amplitude thus consists of two parts: an integral over the cross sections \(\sigma _{T},\sigma _{L}\) and an integral over the subtraction function \(S_1\). The same also holds for the Cottingham formula, which represents the electromagnetic self-energy of the nucleon in terms of the spin-averaged forward Compton amplitude.

  3.

    It had been pointed out long ago [4] that – unless the Compton amplitude contains a fixed pole at \(J=0\) – the subtraction function is unambiguously determined by the cross sections of electron scattering. We do not know of a proof that the Compton amplitude of QCD is free of fixed poles, but we assume that this is the case and refer to this assumption as Reggeon dominance. As briefly discussed in Sect. 1.2, the validity of this hypothesis is questioned in the literature. Indeed, an analysis of the Compton amplitude based on first principles that would determine the behaviour in the Regge region (high energies, low photon virtualities) is not available. If the hypothesis were to fail, this would be most interesting, as it would imply that the known contributions generated by the short-distance singularities and the exchange of Reggeons do not fully account for the high-energy behaviour of QCD.

  4.

    On the basis of Reggeon dominance, we have derived an explicit representation of the subtraction function in terms of the electron scattering cross sections. The representation requires the asymptotic behaviour of the longitudinal cross section to be known up to contributions that disappear at high energies. For the proton Compton amplitude, where Pomeron exchange generates the dominating contribution, the available information does not suffice to reliably evaluate the subtraction function. In the difference between proton and neutron, however, the Pomeron drops out. We have shown that the experimental information available at low photon virtuality does suffice to work out the subtraction function relevant for this difference.

  5.

    In [31], the electron cross sections \(\sigma _{T}\), \(\sigma _{L}\) and the subtraction function \(S_1(q^2)\) are instead treated as physically independent quantities. The authors invoke the low-energy theorem that relates the value of the subtraction function at \(q^2=0\) to the magnetic polarisability and use experimental information about the latter to pin down the value of the subtraction function at the origin. As direct experimental information about the \(q^2\)-dependence is not available, the authors construct a model for that. Figure 7 compares their model with our prediction. As pointed out in [36], the model of [31] is not consistent with the fact that the coefficient of the logarithmic divergence vanishes in the chiral limit. The alternative ansatz for the subtraction function proposed there, which does obey this constraint, is also shown in Fig. 7.

  6.

    The authors of [31] use their ansatz for the subtraction function to evaluate the difference between the self-energies of proton and neutron and obtain \(m_\gamma ^\text {WCM}=1.30(03)(47)\,\text {MeV}\), significantly higher than the result obtained in [4], \(m_\gamma ^\text {GL}=0.76(30)\,\text {MeV}\). The difference is blamed on a 'technical oversight' committed in [4]. This claim is wrong: it suffices to replace their ansatz for the subtraction function with the parameter-free representation used in [4], which is spelt out explicitly in (26) above. Leaving all other elements of their calculation as they are, the central value for the self-energy difference then drops to \(m_\gamma =0.63\,\text {MeV}\), thereby neatly confirming the old result. The same conclusion is reached with the calculation performed in [36].

  7.

    We emphasise that the present work only concerns low photon virtualities. An update of the analysis carried out in [4] which accounts for the progress made on the experimental and theoretical sides during the last 40 years – in particular an evaluation of the contributions from the deep inelastic region which accounts for the violations of Bjorken scaling – is still missing.

  8.

    Our representation for the subtraction function also leads to a prediction for the difference between the electric polarisabilities of proton and neutron. The result is given in (49). Using the currently accepted results obtained from the Baldin sum rule, this also determines the difference of the magnetic polarisabilities. Using in addition the comparatively precise known value of the electric polarisability of the proton, we also obtain an estimate for the polarisabilities of the neutron. The result is given in (52).

  9.

    The fact that the results obtained from Reggeon dominance are consistent with experiment and even somewhat more precise amounts to a nontrivial test of the hypothesis that the Compton amplitude is free of fixed poles. Quite apart from the possibility of taking new data at small photon virtuality, an improved representation of the available experimental information on the cross sections would allow us to reduce the uncertainties quite substantially – in particular, if the deficiencies of the available parameterisations mentioned in Sect. 5.1 could be removed, the main source of uncertainties in our calculation would immediately disappear.

  10.

    The main problem we are facing with our analysis is that all of the well-established features of electron scattering drop out when taking the difference between proton and neutron: the leading terms of the chiral perturbation series are the same, the contribution from the most prominent resonance, the \(\Delta (1232)\), is the same, and the leading asymptotic term due to Pomeron exchange is also the same. Since all of these contributions cancel out, not much is left over. Only a fixed pole could prevent the subtraction function relevant for the difference between proton and neutron from being small. The available data do not exclude the occurrence of a fixed pole, but they indicate that if the phenomenon occurs at all, then the pole must have a rather small residue.