Abstract
We present a procedure to solve the inverse Ising problem, that is, to find the interactions between a set of binary variables from the measure of their equilibrium correlations. The method consists in constructing and selecting specific clusters of spins, based on their contributions to the cross-entropy of the Ising model. Small contributions are discarded to avoid overfitting and to make the computation tractable. The properties of the cluster expansion and its performances on synthetic data are studied. To make the implementation easier we give the pseudo-code of the algorithm.
Notes
We have to minimize here rather than maximize since the true Lagrange multipliers take imaginary values, the couplings and fields being their imaginary part.
The vocable ‘field’ should, strictly speaking, be used when the variables σ i are spins taking ±1 values. For 0,1 variables, the denomination ‘chemical potential’ would be more appropriate. We keep the simpler denomination ‘field’ hereafter.
The minimal strength of the couplings which can be ‘detected’ depends on the quality of the sampling, and scales as \(\sqrt{\log N/B}\), see Sect. 4.3.2.
A further theoretical argument supporting the existence of the cancellation property is, in the case of perfect sampling, the fact that the entropy S must be extensive in N. As S is the sum of ∼2^N cluster entropies, those contributions must compensate each other.
In the present work, where spins take 0,1 values, the couplings (J ij ) and fields (h i ) are related to their ±1-spin counterparts (\(\hat{J}_{ij},\hat{h}_{i}\)) by the transformation \(J_{ij}=4 \hat{J}_{ij}\) and \(h_{i}=-\frac{1}{2} \sum_{j\neq i} J_{ij} +\hat{h}_{i}\). The numerical experiments of [17] were done with ±1 spins.
If \(p=\frac{1}{2}\), ΔS IS is of the order of B^{−K}.
This statement is widely believed to be true in the theory-of-liquids literature. The fast decay of the inverse susceptibility, χ^{−1}(r), or, equivalently, of the direct pair correlation, g(r), is used to approximately close the hierarchy of correlation functions. The Percus-Yevick closure scheme, which gives an accurate equation of state for liquids of hard spheres, assumes that the inverse susceptibility vanishes above the interaction range of the potential (the diameter of a particle).
We have verified that the dependence on N is weaker in the case of periodic boundary conditions.
The numerical experiments of [17] were done with ±1 spins, with coupling parameter θ and field ν=0; in the present work, where spins take 0,1 values, the corresponding couplings and fields are J=4θ and \(h_{i}=-\frac{1}{2} \sum_{j\neq i} J_{ij}\).
In mean-field spin-glasses, as the couplings scale as the inverse square root of the number N of spins, only loop diagrams have non-zero weights in the thermodynamical limit.
With N=40, p=0.024, we find that the standard deviation is of the order of 10^{−13} for B=10^{6}, and 2⋅10^{−11} for B=10^{5}, see Fig. 7.
References
Brush, S.G.: Rev. Mod. Phys. 39, 883 (1967)
Schneidman, E., Berry, M.J. II, Segev, R., Bialek, W.: Nature 440, 1007 (2006)
Tkacik, G., Schneidman, E., Berry, M.J. II, Bialek, W.: arXiv:q-Bio.NC/0611072 (2006)
Marre, O., El Boustani, S., Frégnac, Y., Destexhe, A.: Phys. Rev. Lett. 102, 138101 (2009)
Peyrache, A., et al.: Nat. Neurosci. 12, 919 (2009)
Weigt, M., et al.: Proc. Natl. Acad. Sci. 106, 67 (2009)
Balakrishnan, S., et al.: Proteins 79, 1061 (2011)
Bialek, W., Cavagna, A., Giardina, I., Mora, T., Silvestri, E., Viale, M., Walczak, A.M.: arXiv:1107.0604 (2011)
Cocco, S., Monasson, R.: Phys. Rev. Lett. 106, 090601 (2011)
Jaynes, E.T.: Proc. IEEE 70, 939 (1982)
Cocco, S., Leibler, S., Monasson, R.: Proc. Natl. Acad. Sci. 106, 14058 (2009)
Opper, M., Saad, D. (eds.): Advanced Mean Field Methods: Theory and Practice. MIT Press, Cambridge (2001)
Ackley, D.H., Hinton, G.E., Sejnowski, T.J.: Cogn. Sci. 9, 147 (1985)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, Berlin (2009)
Wallace, C.S.: Statistical and Inductive Inference by Minimum Message Length. Information Science and Statistics. Springer, Berlin (2005)
Ravikumar, P., Wainwright, M.J., Lafferty, J.: Ann. Stat. 38, 1287 (2010)
Bento, J., Montanari, A.: Which graphical models are difficult to learn? In: NIPS, 2009
Jerrum, M., Sinclair, A.: The Markov chain Monte Carlo method: an approach to approximate counting and integration. In: Hochbaum, D.S. (ed.) Approximation Algorithms for NP-Hard Problems. PWS, Boston (1996)
Cagliotti, E., Kuna, T., Lebowitz, J.L., Speer, E.R.: Markov Process. Relat. Fields 12, 257 (2006)
Kuna, T., Lebowitz, J.L., Speer, E.R.: J. Stat. Phys. 129, 417 (2007)
Swendsen, R.H.: Phys. Rev. Lett. 52, 1165 (1984)
Meinshausen, N., Bühlmann, P.: Ann. Stat. 34, 1436 (2006)
Aurell, E., Ekeberg, M.: arXiv:1107.3536 (2011)
Plefka, T.: J. Phys. A, Math. Gen. 15, 1971 (1982)
Georges, A., Yedidia, J.: J. Phys. A, Math. Gen. 24, 2173 (1991)
Georges, A.: In: Mancini, F., Avella, A. (eds.) Lectures on the Physics of Highly Correlated Electron Systems VIII: 8th Training Course in the Physics Correlated Electron Systems and High-Tc Superconductors, AIP Conf. Proc., vol. 715, p. 3 (2004)
Thouless, D.J., Anderson, P.W., Palmer, R.G.: Philos. Mag. 35, 593 (1977)
Sherrington, D., Kirkpatrick, S.: Phys. Rev. Lett. 35, 1792 (1975)
Tanaka, T.: Phys. Rev. E 58, 2302 (1998)
Roudi, Y., Aurell, E., Hertz, J.: Front. Comput. Neurosci. 3, 22 (2009)
Sessak, V., Monasson, R.: J. Phys. A 42, 055001 (2009)
Pelizzola, A.: J. Phys. A 38, R309 (2005)
Mézard, M., Mora, T.: J. Physiol. (Paris) 103, 107 (2009)
Marinari, E., Van Kerrebroeck, V.: J. Stat. Mech. P02008 (2010)
de Dominicis, C.: J. Math. Phys. 3, 983 (1962)
Hansen, J.P., McDonald, I.R.: Theory of Simple Liquids. Academic Press, New York (1976)
Frisch, H.L., Lebowitz, J.L.: The Equilibrium Theory of Classical Fluids. Benjamin, New York (1964). (A lecture note and reprint volume)
Flajolet, P., Gourdon, X., Dumas, P.: Theor. Comput. Sci. 144, 3 (1995)
Schnitzer, M.J., Meister, M.: Neuron 37, 499–511 (2003)
Gumbel, E.J.: Statistics of Extremes. Dover, New York (2004)
MacKay, D.J.C.: Neural Comput. 4, 415 (1991)
Bray, A.J., Moore, M.A.: J. Phys. A 10, 1927 (1977)
Percus, J.K., Yevick, G.J.: Phys. Rev. 110, 1 (1958)
Borzi, C., Ord, G., Percus, J.K.: J. Stat. Phys. 46, 51 (1986)
Barton, J.: private communication
Gori, G., Trombettoni, A.: J. Stat. Mech. P10021 (2011)
Fisher, M.: J. Math. Phys. (N.Y.) 5, 944 (1964)
Fisher, M.: Phys. Rev. 162, 480 (1967)
Zobin, D.: Phys. Rev. 5, 2387 (1978)
Acknowledgements
We are grateful to J. Barton, J. Lebowitz, and E. Speer for very useful and stimulating discussions, in particular regarding the correspondence between the inverse susceptibility and the direct correlation functions, and the practical implementation of the inference algorithm. We thank E. Aurell for pointing out to us the difference between P and Q, see Sect. 5.4.1.
Appendices
Appendix A: Expression of the Entropy of Clusters with Size K=3
In this appendix, we give the analytical expression for the entropy of a cluster of K=3 spins. Using this expression instead of numerically minimizing the cross-entropy (7) offers a valuable computational speed-up, as there are O(N^3) clusters of size K=3. We start from the definition of the entropy in terms of the distribution P(σ):
We then replace the probabilities P(σ 1,σ 2,σ 3) of the eight configurations of the three spins above with their expressions in terms of the probabilities {p i ,p kl } in the data, and of the probability p 123 that the three spins are equal to 1: P(1,1,1)=p 123, P(1,1,0)=p 12−p 123, P(1,0,1)=p 13−p 123, P(0,1,1)=p 23−p 123, P(0,0,1)=p 3−p 23−p 13+p 123, P(0,1,0)=p 2−p 12−p 23+p 123, P(1,0,0)=p 1−p 13−p 12+p 123, P(0,0,0)=1−p 1−p 2−p 3+p 12+p 13+p 23−p 123. The only unknown quantity (not available from p) is the probability p 123. To determine p 123 we impose
which means that the three-body coupling J 123 vanishes. Condition (A.2) gives a third-degree equation for p 123,
with
Upon substitution of p 123 in (A.1) we obtain the desired cross-entropy S 3, as a function of the three average values p i and the three two-point averages p kl . The expression of the cluster-entropy is given by,
according to (28). The expressions of the cluster-entropies for one and two spins are given by (23) and (25), respectively. Similarly, one obtains the contributions of the 3-spin cluster to the interaction parameters by differentiating ΔS with respect to the p i ’s and the p kl ’s.
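For implementations that prefer to avoid the closed-form roots of the cubic (A.2)-(A.3), p 123 can equivalently be obtained by bisection, since the vanishing-J 123 balance condition is monotonically increasing in p 123 on the interval where all eight probabilities are positive. A minimal sketch in Python (function names are ours):

```python
import math

def probs3(p1, p2, p3, p12, p13, p23, p123):
    """Eight configuration probabilities of three 0,1 spins, given the
    one- and two-point averages and the triplet probability p123."""
    return [
        p123,                                       # P(1,1,1)
        p12 - p123,                                 # P(1,1,0)
        p13 - p123,                                 # P(1,0,1)
        p23 - p123,                                 # P(0,1,1)
        p1 - p12 - p13 + p123,                      # P(1,0,0)
        p2 - p12 - p23 + p123,                      # P(0,1,0)
        p3 - p13 - p23 + p123,                      # P(0,0,1)
        1 - p1 - p2 - p3 + p12 + p13 + p23 - p123,  # P(0,0,0)
    ]

def solve_p123(p1, p2, p3, p12, p13, p23, eps=1e-12):
    """Triplet probability such that the three-body coupling J_123 vanishes,
    i.e. P111*P100*P010*P001 = P110*P101*P011*P000 (condition (A.2))."""
    def balance(x):
        P = probs3(p1, p2, p3, p12, p13, p23, x)
        return (math.log(P[0] * P[4] * P[5] * P[6])
                - math.log(P[1] * P[2] * P[3] * P[7]))
    # interval on which all eight probabilities are positive
    lo = max(0.0, p12 + p13 - p1, p12 + p23 - p2, p13 + p23 - p3) + eps
    hi = min(p12, p13, p23, 1 - p1 - p2 - p3 + p12 + p13 + p23) - eps
    for _ in range(200):          # balance() is increasing in x: bisection
        mid = 0.5 * (lo + hi)
        if balance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def S3(p1, p2, p3, p12, p13, p23):
    """Cross-entropy of the 3-spin cluster, eq. (A.1)."""
    P = probs3(p1, p2, p3, p12, p13, p23,
               solve_p123(p1, p2, p3, p12, p13, p23))
    return -sum(q * math.log(q) for q in P)
```

As a sanity check, for independent spins (p ij = p i p j) the solver returns p 123 = p 1 p 2 p 3 and S3 reduces to the sum of three single-spin entropies.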
Appendix B: Diagrammatic Expansion of the Cluster-Entropies in Powers of the Connected Correlations
A better understanding of the cluster expansion and of the role of the reference entropy S 0 can be gained through the diagrammatic expansion of the entropy S(p) in powers of the connected correlations (high-temperature expansion),
Note that the entry M ij of the matrix M defined in (21) vanishes linearly with c ij . Thus, an expansion in powers of c ij is equivalent to an expansion in powers of M ij . A procedure to derive the diagrammatic expansion of S(p) in a systematic way is proposed in [31]. The diagrammatic expansion provides a simple representation of the cluster-entropies, in which the entropy S(p) is written as a sum of connected diagrams (Fig. 26). Each diagram is made of sites, connected or not by one or more edges. Each site symbolizes a variable and carries a factor p i . The presence of n (≥0) edges between sites k and l results in a multiplicative factor (c kl )^n. The contribution of a diagram to the entropy is the product of these factors, times a function of the p i specific to the topology of the diagram, see [31]. Diagrams of interest include (Fig. 26):
- the N single-point diagrams, whose contributions are ΔS (i)(p i );
- the ‘loop’ diagrams, which consist of a circuit with K edges going through K sites i 1→i 2→⋯→i K →i 1, and whose contributions to the entropy are
  $$ S_{\mathit{loop}} (\mathbf{p}|i_1,i_2,\ldots,i_K) = (-1)^{K-1} M_{i_1,i_2}M_{i_2,i_3}\ldots M_{i_{K-1},i_K}M_{i_K,i_1} ; $$ (B.2)
- the Eulerian circuit diagrams, for which there exists a closed path visiting each edge exactly once;
- the non-Eulerian diagrams, with the lowest number of links (smallest power in M).
The entropy for two variables i,j, S(p i ,p j ,p ij ) (24), is the sum of the two single-point diagrams i and j, plus the sum of all connected diagrams made of the two sites i and j with an arbitrary large number of edges (n≥2) in between (first two columns in Fig. 26). According to (25), the cluster-entropy ΔS (i,j)(p i ,p j ,p ij ) is equal to the latter sum (second column in Fig. 26). More generally, the entropy of a cluster ΔS Γ(p) is the infinite sum of all diagrams whose sites are the indices in Γ.
We now interpret the Mean Field expression for the entropy, S MF , in the diagrammatic framework. We start from identity (21), and rewrite,
Using the fact that the diagonal elements of M are equal to unity, the term corresponding to K=1 above vanishes. For K≥2, we have
where the matrix \(\hat{M}\) has the same off-diagonal elements as M, and zero diagonal elements. Each term in the above sum corresponds to an Eulerian circuit over K′≤K sites, where K′ is the number of distinct indices in (i 1,i 2,…,i K ). Note that the same circuit can be obtained from different K-tuples of indices. Consider for instance the longest circuits, obtained for K′=K, i.e. all indices distinct. 2K different K-tuples (i 1,i 2,…,i K ) correspond to the same circuit, as neither the starting site nor the orientation of the loop matters. For instance, i 1→i 2→i 3→i 1, i 2→i 3→i 1→i 2, i 1→i 3→i 2→i 1, … are all equivalent. This multiplicity factor 2K precisely cancels the factor 2K in the denominator of (B.3). The contribution corresponding to a circuit therefore coincides with expression (B.2) for the loop entropy. We conclude that
- S MF (p) sums up all loop diagrams exactly;
- S MF (p), in addition, sums up Eulerian circuit diagrams, but with weights a priori different from their values in the cross-entropy S(p). An exception is the three-variable Eulerian diagram shown in Fig. 26, whose weights in S MF and S coincide;
- no non-Eulerian diagram is taken into account in S MF (p).
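The resummation behind these statements rests on the trace-log identity \(-\frac{1}{2}\log\det(1+\hat{M}) = \sum_{K\geq 2} (-1)^{K}\,\mathrm{Tr}\,\hat{M}^{K}/(2K)\) (the K=1 term drops because \(\mathrm{Tr}\,\hat{M}=0\)); we assume this is the content of (B.3) up to single-site terms. It is easily checked numerically on a random weakly correlated matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6

# random symmetric M with unit diagonal and weak off-diagonal entries,
# standing in for the correlation matrix of (21)
A = 0.05 * rng.standard_normal((N, N))
Mhat = (A + A.T) / 2.0
np.fill_diagonal(Mhat, 0.0)
M = np.eye(N) + Mhat

S_MF = -0.5 * np.log(np.linalg.det(M))

# -1/2 log det(1 + Mhat) = sum_{K>=2} (-1)^K Tr(Mhat^K) / (2K)
series = sum((-1) ** K * np.trace(np.linalg.matrix_power(Mhat, K)) / (2.0 * K)
             for K in range(2, 40))

print(S_MF, series)  # the truncated series matches to high accuracy
```

The truncation at K=40 is harmless here because the spectral radius of \(\hat{M}\) is small, so the series converges geometrically.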
In conclusion, the diagrammatic expansion provides a natural justification for the choice of the reference entropy S 0(p)=S MF (p). In addition, it provides us with the dominant contributions to the cluster-entropies once the Mean-Field entropy is subtracted, see Fig. 26. A detailed study of those dominant contributions is presented in Appendix D.
Appendix C: Properties of the Cluster-Entropies of the One-dimensional Ising Model
Consider the one-dimensional Ising model with nearest-neighbor couplings and periodic boundary conditions. The Hamiltonian of the model is
where the spins σ i take 0,1 values. The parameters of the model are the N identical fields h i =h, the N couplings J i,i+1=J between neighbors and the remaining N×(N−3)/2 zero couplings J i,j =0 between non-neighbors.
We recall a few elementary facts about the model. The transfer matrix is
The eigenvalues are \(\lambda_{\pm}=\frac{1}{2} (e^{J+h}+1\pm \sqrt{(e^{J+h}-1)^{2}+4 e^{h}} )\), and the two components of the eigenvectors are, respectively, \(v_{\pm} (1)=-(1-\lambda_{\pm })/\sqrt{e^{h}+(1-\lambda_{\pm})^{2}}\) and \(v_{\pm} (2)=e^{h/2}/\sqrt {e^{h}+(1-\lambda_{\pm})^{2}}\). The probability that a spin is up is given by, in the N→∞ limit,
and the connected correlation at distance d is
where the correlation length is given by ξ=−1/log(λ −/λ +).
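The quoted eigenvalues are easy to verify numerically. A minimal sketch, assuming the symmetric convention T[σ,σ′]=exp(Jσσ′+h(σ+σ′)/2) for the transfer matrix (C.2) (an assumption on our part, but the one whose trace and determinant reproduce the λ ± above):

```python
import numpy as np

J, h = 1.3, -0.7

# transfer matrix for 0,1 spins, assuming the symmetric convention
# T[s, s'] = exp(J s s' + h (s + s') / 2)
T = np.array([[1.0,             np.exp(h / 2.0)],
              [np.exp(h / 2.0), np.exp(J + h)]])
lam = np.sort(np.linalg.eigvalsh(T))[::-1]        # lam[0] >= lam[1]

# closed-form eigenvalues quoted in the text
disc = np.sqrt((np.exp(J + h) - 1.0) ** 2 + 4.0 * np.exp(h))
lam_plus = 0.5 * (np.exp(J + h) + 1.0 + disc)
lam_minus = 0.5 * (np.exp(J + h) + 1.0 - disc)

xi = -1.0 / np.log(lam_minus / lam_plus)          # correlation length of (C.4)
print(lam_plus, lam_minus, xi)
```

For J>0 both eigenvalues are positive, so the correlation length ξ is well defined.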
C.1 Calculation of the Cluster-Entropies and Cancellation Property
In this section, we show the exact cancellation property between the entropies of clusters with different sizes discussed in Sect. 4.1. We will see that this property is a direct consequence of the existence of a unique interaction path along the unidimensional chain.
C.1.1 Case S 0=0
We first consider the case where the reference entropy is zero. Let Γ=(i 1,i 2,…,i K ) be a cluster of size K, with i 1<i 2<⋯<i K . Due to the unidimensional nature of the interactions, the Gibbs distribution over the K-spin configurations σ obeys the chain rule,
where P(⋅,⋅) and P(⋅|⋅) denote, respectively, joint and conditional probabilities. Inserting the above formula into expression (9) for the cross-entropy, we obtain
Each variable \(\sigma_{i_{l}}\), with l=2,…,K−1, appears three times in (C.6), which explains the presence of three fields h with the same index i l . After optimization over \(\mathbf{J}=(\{J_{i_{l},i_{l+1}}\},\{h^{\rightarrow}_{i_{l}}\},\{h^{\leftarrow}_{i_{l}}\},\{h^{0}_{i_{l}}\})\) all these fields are equal, and we obtain
Hence the cross-entropy S(p) is the sum of the 1-cluster entropies and of the entropies of the 2-clusters made of adjacent sites. None of the other cluster-entropies appears, which proves that they cancel each other. To illustrate the cancellation mechanism, consider the case K=3. According to (C.7),
Comparing with (28) we obtain
which shows that the entropy of a 3-cluster and that of a 2-cluster with the same extremities i 1,i 3 are opposite to one another. By recursive application of (C.7), this result is immediately generalized to higher values of K. The entropy of a K-cluster is simply the entropy of the 2-cluster with the same extremities, multiplied by (−1)^{K−2}. Hence, identity (43) is established.
According to formula (C.4) for the connected correlation, the entropy of a two-site cluster is a function of the distance d between the two sites:
where
To obtain the expression (C.11) for F, we have used formula (26) for the 2-spin cluster-entropy, with p 1=p 2=p and p 12=p 2+c 12, where the correlation c 12 is given by (C.4). Note that F(u)=O(u 2) for small u, in agreement with scaling (41).
C.1.2 Case S 0=S MF
We now introduce the reference entropy S 0=S MF . The matrix M defined in (21) has elements
The inverse of M, G=M −1, is a tridiagonal matrix, whose non-zero elements are
Consider now the Gaussian model over N real-valued variables φ i , whose energy function is given by
For this Gaussian model, the logarithm of the partition function is (up to an irrelevant additive constant) \(\log Z[\mathbf{G}]= -\frac{1}{2} \log \det G\). By construction, model (C.14) is the solution of the inverse Gaussian problem with data 〈φ i 〉=0, 〈φ i φ j 〉=M ij . Hence, S 0 can be interpreted as the cross-entropy of the Gaussian model (C.14) under those data. A key feature of this Gaussian model is that its interaction matrix G ij is tridiagonal: only nearest-neighbor variables are coupled to each other, according to (C.13). We conclude that the Gaussian model is one-dimensional. Consequently, it obeys a chain rule similar to (C.5). This is the only requirement for the main conclusion of Sect. C.1.1 to hold: in the cluster expansion of S 0, the entropy of a K-cluster is simply equal to the entropy of the 2-cluster with the same extremities, multiplied by (−1)^{K−2}. As both the expansion of S and that of S 0 enjoy this property, so does the expansion of S−S 0.
We conclude this section with the expression of the 2-cluster entropy ΔS (i,i+d). In the presence of the reference entropy S 0=S MF , we subtract the following contribution from expression (C.10), see (21),
Hence the function F(u) defined in (C.11) must have \(\frac{1}{2} \log(1-u^{2})\) subtracted from it. One easily checks that \(F(u)- \frac{1}{2} \log (1-u^{2})= O(u^{3})\), in agreement with scaling (45).
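These scalings can be checked directly. The sketch below computes the 2-cluster entropy as the joint entropy of the four configurations minus the two single-spin entropies (our reading of (26)), with p 1=p 2=p and c=u p(1−p): the ratio ΔS/u² tends to −1/2, and after subtracting ½ log(1−u²) the remainder is O(u³), with a coefficient that matches the prefactor (2p−1)²/(6p(1−p)) appearing in (C.16):

```python
import math

def S1(p):
    # single-spin entropy
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

def dS2(p, c):
    # 2-spin cluster-entropy: joint entropy of the four configurations
    # minus the two single-spin entropies
    P = [p * p + c, p * (1 - p) - c, (1 - p) * p - c, (1 - p) ** 2 + c]
    return -sum(q * math.log(q) for q in P) - 2.0 * S1(p)

p = 0.3
for u in (1e-1, 1e-2, 1e-3):
    c = u * p * (1 - p)                 # u = c / (p(1-p)) for equal frequencies
    full = dS2(p, c)
    residue = full - 0.5 * math.log(1.0 - u * u)
    print(u, full / u ** 2, residue / u ** 3)
```

As u decreases, the first ratio approaches −1/2 (the two-link loop diagram removed by S MF) and the second approaches (2p−1)²/(6p(1−p)), the cubic coefficient.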
C.2 Examples and Calculation of Diagrammatic Coefficients
We have studied the histograms of cluster-entropies with K=2 and K=3 spins for specific choices of J,h. The averages p i and p ij were calculated exactly through formulas (C.3) and (C.4) (perfect sampling). We have found that
- for K=2, the cluster-entropies take discrete values, labelled by the distance d between the two extremities of the cluster, e.g. Γ=(i,i+d). Expanding F(u) to the lowest order in u (for S 0=S MF ) we find the asymptotic formula for the 2-cluster entropy:
  $$ \Delta S_{i,i+d} \simeq\frac{(2p-1)^2}{6p(1-p)}e^{-3d/\xi} , $$ (C.16)
  in agreement with (D.1). We have verified that this formula is in very good agreement with the numerics at large distance d.
- for K=3, we find that the entropies of 3-clusters of the type (i 1,i 2,i 3) depend only on the distance d=i 3−i 1 between the extremities. In addition, the values are exactly the opposite of the ones found for 2-clusters with the same distance d, as expected. Two differences are:
  - the peak at d=1 in the histogram is absent, because the minimal distance between three spins is d=2. The largest 3-spin entropy thus corresponds to triplets of the type (i,i+1,i+2);
  - the height of the peak (number of clusters) corresponding to distance d is (d−1)N. The degeneracy (d−1) is the number of ways of choosing the location of site i 2 between i 1 and i 3.

We now show how the value of the cluster entropy can be recovered from the leading terms in the diagrammatic expansion calculated in Sect. D.4. Let d′=j−i<d be the distance between the first two sites in the cluster. For each diagram in Fig. 27, Table 1 gives the sum of the distances of its links, i.e. the power of exp(−1/ξ).
Interestingly, the lowest total distances are found in diagrams (a) and (d), although the latter is of higher order (6) in the connected correlation than the former (5). Hence, contrary to the case of independent spins (Appendix F), diagrams (a) and (d) give the dominant contributions to the entropy. Summing the contributions of (a) and (d) we find
To derive the coefficients α (5) and α (6), we impose that ΔS (i,j,k) be the opposite of (C.16). We deduce that α (5) and α (6) are given by (D.4) and (D.6), respectively.
The exact cancellation property discussed above has important consequences for the inferred fields and couplings. Consider for instance the coupling J i,i+2, which vanishes in the 1D-Ising model with nearest-neighbor interactions (C.1). As the connected correlation c i,i+2 is not equal to zero, a contribution to the coupling will be collected from the cluster (i,i+2) itself, equal to
Other contributions will come from larger clusters. For instance the cluster (i,i+1,i+2) will give an additional
The sum of the two contributions above vanishes due to the cancellation property. One can check that the contributions coming from all the other clusters vanish too, which makes the coupling J i,i+2 =0, as it should be.
Appendix D: Leading Diagrammatic Contributions to Small Cluster-Entropies
We analyze the dominant diagrams contributing to the cluster-entropies for the various values of the cluster sizes, K, in the limit of small connected correlations c kl .
D.3 Case K=2
The entropy ΔS (i,j) of a 2-spin cluster is the sum of all diagrammatic contributions containing two spins and an arbitrary number of links between them, the number of links being the power of the expansion parameter M ij =c ij /(p i (1−p i )p j (1−p j )) (Fig. 26 and Appendix B). For small values of M ij , the largest contribution to ΔS (i,j) is the one with three links (cubic power of M ij ), since the reference entropy S 0=S MF removes the two-link loop diagram. The entropy contribution of this diagram was computed in [31], with the result
where
The superscript 3 refers to the power of the connected correlation.
D.4 Case K=3
For K=3, the leading contribution to ΔS (i,j,k) in powers of M was not derived analytically in [31]. Based on the studies of the unidimensional Ising model (Appendix C) and of independent spins (Appendix F), we find that the leading diagrams are (a), (b), (c) in Fig. 27 (bold diagrams in Fig. 26), whose sum is given by
with
Note that \(\alpha^{(5)}_{ijk}\) differs from \(\alpha^{(5)}_{jik}\). We have also found the coefficients of the subsequent diagrams, of order M^6. These diagrams are labelled (d), (e), (f) in Fig. 27. Their total contribution to the cluster-entropy is
with
D.5 Generic Case K≥4
The above results for K=3 are easily generalized to any value of the cluster size K≥4. The diagrammatic expansion of a K-spin cluster includes all circuits where pairs of spins are linked together. Each diagram with (one or two) links between i l and i l+1 (l=1,…,K−1) and (one or two) links between i 1 and i K gives
At the next order in power of M ij , each diagram with three links between i l and i l+1 (l=1,…,K−1) gives a contribution
Appendix E: Critical Correlation Length ξ c for the Absolute Convergence
In this appendix, we briefly explain why the cluster-entropy series is absolutely convergent if and only if the correlation length ξ is smaller than
Here, Ω=2 when the reference entropy is S 0=0, and Ω=3 when S 0=S MF . Parameter v denotes the number of neighbors of a site on the lattice, supposed to be uniform. For instance, v=2D on a hypercubic lattice in dimension D≥1.
Consider a set of K distinct points on the lattice. Let \({\mathcal{N}}(L)\) be the number of closed paths of length L visiting all K points. We obviously have \({\mathcal{N}} (L)\leq v^{L}\). Hence, the series
is convergent if ξ<ξ c . Conversely, let L 0 be the length of the shortest closed path \({\mathcal{C}}_{0}\) going through the K points. A closed path of length L 0+L 1 can be built from \({\mathcal{C}}_{0}\) by attaching a closed loop of length L 1 to any one of the sites in \({\mathcal{C}}_{0}\). Hence, for L≥L 0+2, \({\mathcal{N}} (L)\geq L_{0}v^{L-L_{0}}\). We deduce that the series (E.2) is divergent if ξ>ξ c .
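The two bounds sandwich the series between geometric series with ratio v e^{−Ω/ξ}, so absolute convergence holds precisely when this ratio is below 1, i.e. for ξ < Ω/log v — which we take to be the critical value ξ c (an assumption consistent with the bounds above). A quick numerical illustration for the square lattice (v=4) with S 0=S MF (Ω=3):

```python
import math

def tail(xi, Omega=3, v=4, Lmax=200):
    # geometric upper bound on the series (E.2): sum_L v^L exp(-Omega L / xi)
    r = v * math.exp(-Omega / xi)
    return r, sum(r ** L for L in range(1, Lmax))

xi_c = 3.0 / math.log(4.0)        # Omega / log v, here for Omega = 3, v = 4
for xi in (0.9 * xi_c, 1.1 * xi_c):
    r, s = tail(xi)
    print(xi, r, s)               # the ratio r crosses 1 exactly at xi = xi_c
```

Below ξ c the partial sums saturate; above ξ c they grow without bound as Lmax increases.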
Appendix F: Distribution of Cluster-Entropies for the Independent Spin Model
We generate B configurations of N independent spins σ i . Spin i is equal to 1 with probability p and to 0 with probability 1−p (for simplicity we assume that all the frequencies p i are equal to the same value p). The empirical connected correlations c ij computed from the B sampled configurations are generally non-zero. The marginal distribution of c ij is Gaussian, with zero mean and standard deviation (48). The largest values of the correlations are, for a system with N spins, of the order of
according to extreme value theory.
We have checked the validity of formulas (D.3) and (D.5) for the cluster-entropies ΔS ij and ΔS ijk , respectively, against numerical estimates from randomly sampled configurations. The agreement, for B=10^6 sampled configurations, is excellent, due to the small value of c B ≃2⋅10^{−5}.
F.6 Distribution of Cluster-Entropies for K=2
To derive the analytical expression of the distribution in the N→∞ limit, we use the small-correlation formula (D.1) for ΔS (1,2), and the fact that the distribution of the connected correlation is Gaussian. As a result, approximating α 1,2 with its average value α obtained by substituting p 1 and p 2 with p in (D.2), we obtain
This distribution is a stretched exponential at infinity, and diverges at zero. Its standard deviation is
This analytical prediction is in very good agreement with the distribution of cluster-entropies found from numerical data for large values of B, e.g. B=10^6 for p=0.024. For finite N, the largest correlations are given by (F.1), and the cluster-entropies cannot exceed a maximal value ΔS (1,2) of the order of
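The stretched-exponential shape and the width of this distribution both follow from pushing a Gaussian variable through the cubic map of (D.1). Since E[c^6]=15 c B^6 for a centered Gaussian, the standard deviation of α c^3 is √15 α c B^3 (our evaluation; it scales as B^{−3/2} when c B ∝ B^{−1/2}). A sampling check in rescaled units (setting α = c B = 1, so only the universal factor √15 is probed):

```python
import numpy as np

rng = np.random.default_rng(1)
# rescaled units: correlations c in units of c_B, entropies in units of alpha
c = rng.normal(0.0, 1.0, size=10 ** 6)
dS = c ** 3                  # leading small-correlation form, Delta S ~ alpha c^3

# for a centered Gaussian, E[c^6] = 15, so std(Delta S) = sqrt(15) here
print(dS.std(), np.sqrt(15.0))
```

The heavy tails of c^3 are also visible in a histogram of dS, consistent with the stretched-exponential decay stated above.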
F.7 Distribution of the Cluster-Entropies for K=3
The leading-order contribution to the entropy of a 3-cluster is given by (D.3). We want to calculate the distribution of ΔS (1,2,3) when the connected correlations c ij are random Gaussian variables of zero mean and variance (c B )^2. We neglect the correlations between c 12, c 13, c 23, which is legitimate for large B. Let us call x=ΔS (1,2,3)/(α(c B )^5), with α given by (D.4). Approximating the connected correlations c ij as independent stochastic variables, we can evaluate the variance of each of the three terms of the sum in (D.3) as the product of the variances of its factors. We obtain
Let P(x) be the probability density of x. Though we have not been able to find a closed expression for P(x), the asymptotic behavior of P at large or small arguments can be characterized analytically.
F.7.3 Large x Behavior
The Mellin transform of P [38] is
where
The tail of P(x) at large x can be studied by considering large values of λ. We expect the dominant contribution to the multiple integral on the right-hand side of (F.6) to come from large correlations. The location of the main contribution to the integral is the value of (c 12,c 13,c 23) which maximizes F. As F is invariant under any permutation of its arguments, we look for a maximum where c 12=c 13=c 23≡c ∗. A straightforward calculation shows that
We now use the saddle-point method again, this time to estimate the integral on the left hand side of (F.6). We obtain
which holds when λ is very large. Hence, F ∗(λ) is the Legendre transform of log P(x). Solving (F.9) gives
at large x. The distribution of the cluster entropies ΔS (1,2,3) thus follows a stretched exponential with exponent \(\frac{2}{5}\). This decay is much slower than exponential, and leads to the large tails visible in Fig. 7.
F.7.4 Small x Behavior
In order for the rescaled entropy x to be small, at least one of the three correlations must be small, according to (D.3). Without loss of generality, we may assume that c 12 is the smallest of the three. As c 12 appears with power one in one term of (D.3) and with power two in the other two, we approximate \(x \simeq c_{12}c_{13}^{2}c_{23}^{2}\). The Mellin transform of P is, for negative λ,
The largest pole is located at \(\lambda=-\frac{1}{2}\), and is of order 2. According to standard results on the inversion of Mellin transforms [38], we obtain a precise characterization of the divergence of the probability density at small x,
where C is a constant.
F.8 Distribution of Cluster-Entropies for Generic K≥4
In general, for K≥3, the leading contribution to \(\Delta S_{(i_{1},i_{2},\ldots i_{K})}\) (D.7) is a sum of K×(K−1)!/2 terms, each being the product of K random variables, of which K−1 appear squared and one appears linearly. The factor K counts the number of ways of choosing the single link in a circuit with K spins; the factor (K−1)!/2 is the number of non-equivalent circuits going through K spins. We define the rescaled entropy x through
The approach followed in Sect. F.7 to calculate the asymptotic behavior of the probability density P of x for K=3 can be extended without difficulty to any value of K>3. We find that P(x) diverges when x→0, with
where C is a constant. Hence the shape of the distribution of x is, up to logarithmic terms, independent of K. By contrast, the tail of the distribution at large x is very sensitive to K,
As in the K=3 case, the distribution of the cluster entropies ΔS follows a stretched exponential. The exponent of the stretched exponential decreases with K. The variance of the distribution can be easily evaluated, with the result
Appendix G: Inverse Susceptibility Matrix for the Unidimensional Ising Model
Hereafter, we want to invert the matrix χ, whose elements are given in (60). The matrix is of dimension \(\frac{1}{2} N(N-1)\), and each element is labelled by two indices (i,j) and (k,l), with i<j and k<l. Each index (i,j) can be represented by a site of coordinates i and j on the half-grid of Fig. 28(a). We now show that the non-zero entries of the inverse susceptibility matrix, (χ −1) ij,kl , are in one-to-one correspondence with the sites (i,j) and (k,l) that are either identical, or nearest neighbors, or diagonally opposed on the elementary mesh of the half-grid (Fig. 28(b, c, d)). Depending on the value of the difference j−i, the number of those sites is equal to 9, 8, or 6.
We start with the case j−i≥3 (Fig. 28(b)). By symmetry, the nine unknown matrix elements (χ −1) ij,kl take only three independent values, denoted by γ for (k,l)=(i,j), β for (k,l) and (i,j) nearest neighbors, and α for (k,l)=(i±1,j±1). We now write the matrix inversion identity,
for various values of (m,n). Let d=j−i. For m=i,n=j, constraint (G.1) gives
which should hold for all d≥3. We deduce two coupled equations for the three unknown variables:
For m=i+1,n=j, constraint (G.1) is equivalent to
The d-dependent term in the equation above cancels by virtue of (G.3). We are left with an additional equation over α,β,γ:
By symmetry of the matrices χ,χ −1, no new constraint is obtained when the values of m,n are further varied. Solving (G.3), (G.4), (G.6) we obtain
The analysis of the other cases, j=i+2 (Fig. 28(c)) and j=i+1 (Fig. 28(d)), can be done along the same lines. We do not detail the calculations, and simply report the results. The case j=i+2 is very similar to the previous one. There are 8 coefficients to be calculated, with three independent values α′,β′,γ′. It turns out that
As for the last case, j=i+1, we call α″ the values of the entries of χ −1 with (k,l)=(i−1,j−1),(i−1,j+1),(i+1,j+1), β″ the values of the entries with (k,l)=(i−1,j),(i,j+1), and γ″ the diagonal element corresponding to (k,l)=(i,j). After some elementary algebra, we find
All those results are reported in (61).
Cite this article
Cocco, S., Monasson, R. Adaptive Cluster Expansion for the Inverse Ising Problem: Convergence, Algorithm and Tests. J Stat Phys 147, 252–314 (2012). https://doi.org/10.1007/s10955-012-0463-4