01.12.2019 | Regular article | Issue 1/2019 | Open Access

# Segregation in religion networks

Journal:
EPJ Data Science > Issue 1/2019
Authors:
Jiantao Hu, Qian-Ming Zhang, Tao Zhou

## Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## 1 Introduction

Religious belief is a kind of mental cement that connects and blends people of diverse colors, ages and sexes. It can facilitate human cooperation [1–6], promote civic engagement [7–10], improve life satisfaction [11–15], intersect with politics [16–20], affect people’s mental and physical health [21–24], influence social morality [25, 26], coexist with science [27–30] and even boost economic development [31, 32]. On the other hand, people usually do not wish to have close relationships or frequent interactions with others of different religious beliefs, and thus people of different faiths tend to form relatively isolated communities [33]. Analogous to the separation of races [34], such religious segregation strongly (and usually negatively) influences cultural evolution, economic development, political patterns, and so on [35, 36]. Furthermore, some religions are even hostile to each other, which increases prejudice between religions [37–39] and leads to regional violence, intergroup conflict and moral prejudice against atheists [40–46].
The above studies provided fruitful results about religions, but few researchers have paid attention to believers in China. According to the latest data from the “2012 China Family Panel Studies (CFPS)”, only 10% of the population in China is religious. Although this is a small proportion, it is still worth studying, since the total number of believers exceeds 100 million. Among the believers, 64.66% are Buddhists, 22.03% are Christians, 5.17% are Taoists, and 4.41% are Muslims. To date, research on religions in China has mainly been based on questionnaire surveys. Some statistical results [47] have been reported, including the number of believers of different faiths, their gender ratio, their educational level, etc. Nevertheless, the relationships among different religious groups are still unknown, mainly because of the lack of available data. In recent years, online social platforms have recorded interactions among huge populations, which makes it possible to quantify the segregation between different religions.
We focus on weibo.com, one of the largest online social platforms in China. We identify 6875 believers in Christianity, Buddhism, Islam and Taoism, and construct a directed network based on the follower → followee relationships among these believers. By analyzing the mixing pattern of the religion network, we find that most links are created between individuals holding the same belief. This homophily [33] is most pronounced among Muslims and Taoists. In other words, the religion network is highly segregated: only 1.6% of links connect different religions. Although these few cross-religion links are obviously important to the network connectivity, it is surprising that they are remarkably more important than the links with the highest betweennesses [48] or bridgenesses [49]. In particular, we also find that 46.7% of these cross-religion links are probably related to charitable issues. The contribution of this paper is two-fold. Firstly, we provide quantitative insights into religious segregation. Secondly, we suggest that charitable issues might play a positive role in facilitating religious syncretism.

## 2 Dataset

The dataset was collected from Sina Weibo (weibo.com) in April 2016. Sina Weibo is a Chinese microblogging website (similar to twitter.com) with over 500 million registered users and about 340 million monthly active users. Users can follow others: if user A follows user B, then B is called a followee and A is a follower who receives B’s updates. Users are allowed to describe themselves with several tags and a short description.
Here we focus on the users who believe in Christianity, Buddhism, Islam or Taoism. To extract the believers from weibo.com, we built a set of religious keywords covering the most frequently used words related to religion (the keywords are in Chinese; the full list is available at http://www.dcjingsai.com/common/share/73.html). We then searched for these keywords on weibo.com and obtained about 170K users. Because word segmentation of Chinese text is ambiguous (see footnote 1), we only keep the users who have at least two keyword occurrences (repeats allowed) in their descriptions and nicknames (a nickname is similar to the profile name on Twitter and can be customized by the user). About 9K users remain after this step. Lastly, we checked these 9K candidates one by one by hand to make sure that all the 6875 users under consideration are religious users.
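The two-keyword filter described above can be sketched as follows. This is only an illustration under stated assumptions: the four keywords and the toy user records are hypothetical placeholders (the actual keyword list is the one published at the URL above), and plain substring counting is assumed as the matching rule.

```python
# Hypothetical keyword list for illustration only; not the authors' list.
KEYWORDS = ["基督", "佛", "道教", "穆斯林"]


def keyword_hits(text: str, keywords=KEYWORDS) -> int:
    """Count keyword occurrences by substring matching (repeats count)."""
    return sum(text.count(kw) for kw in keywords)


def is_candidate(nickname: str, description: str, min_hits: int = 2) -> bool:
    """Keep a user whose nickname and description together contain
    at least `min_hits` keyword occurrences."""
    return keyword_hits(nickname) + keyword_hits(description) >= min_hits


# Toy user records (hypothetical).
users = [
    {"nickname": "佛缘", "description": "每日念佛"},  # two hits: kept
    {"nickname": "小明", "description": "喜欢旅行"},  # no hits: dropped
    {"nickname": "阿杰", "description": "基督徒"},    # one hit: dropped
]
candidates = [u for u in users if is_candidate(u["nickname"], u["description"])]
```

Substring matching sidesteps word segmentation altogether, which is why a single hit is weak evidence and a second hit plus a manual check are used as safeguards.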
Focusing on these religious users, we extract a subgraph $$G ( {V,E} )$$ from weibo.com, where V and E denote the sets of nodes and links, respectively. The node set V contains 6875 believers in four major religions in China: 3153 Christians, 2791 Buddhists, 470 Muslims and 461 Taoists. The link set E contains 76,678 directed links, so the average degree is 11.15. Figure 1(a) presents a visual layout of the network, from which one can clearly see that connections inside a religion are dense while connections between different religions are much sparser. As shown in Fig. 1(b) and Fig. 1(c), both the out-degree and in-degree distributions are power-law-like, $$p(k) \sim {k^{ - \alpha }}$$, where k denotes the degree and α is the power-law exponent. The exponents, estimated by the maximum likelihood method [50], are $$\alpha \approx 2.93$$ and $$\alpha \approx 2.47$$ for the out-degree and in-degree distributions, respectively. Further, we analyze the induced subgraph [51] of each religion, which contains the users in this religion and all links connecting these users. The induced subgraph of each religion exhibits the scale-free property [52] (see Fig. 2), indicating the existence of leaders (with large in-degrees) and enthusiasts (with large out-degrees).
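As a rough illustration of how a power-law exponent can be estimated by maximum likelihood, the sketch below uses a widely used closed-form approximation for discrete data, $$\alpha \approx 1 + n / \sum_{i} \ln ( k_{i} / (k_{\min} - 1/2) )$$. Whether this is exactly the procedure of [50], and which lower cutoff $$k_{\min}$$ was used, are assumptions; the function name and the toy degree list are hypothetical.

```python
import math


def powerlaw_alpha_mle(degrees, k_min=1):
    """Approximate maximum-likelihood exponent for a discrete power law.

    Only degrees >= k_min enter the fit; k_min is an assumed cutoff.
    """
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


# Toy in-degree sequence (hypothetical, not the Weibo data).
toy_degrees = [1, 1, 1, 2, 1, 3, 1, 2, 7, 1, 1, 4, 1, 2, 15]
alpha_hat = powerlaw_alpha_mle(toy_degrees, k_min=1)
```

The approximation is known to be more accurate for larger cutoffs, so for real data the cutoff should be chosen with care rather than left at 1.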

## 3 Results

### 3.1 The basic structure and mixing pattern

Neglecting the directions of links, we obtain an undirected version $$G' ( {V,E'} )$$ of G, in which two nodes i and j are connected if there is a link from i to j or from j to i. $$G'$$ contains 64,712 links in total. It is clustered, as indicated by its high clustering coefficient [53] $$C = 0.37$$, and it has a pronounced community structure, with a high modularity [54] $$Q = 0.57$$ when the individuals in each religion are treated as one community. However, neither the clustering coefficient nor the modularity is sufficient to characterize the aggregation of believers in the same religion or the segregation of believers in different religions, since the former only considers local organization and the latter is very sensitive to community sizes [55]. Accordingly, we look into the detailed mixing pattern of the religion network. Let $${e_{ij}}$$ denote the fraction of links from religion i to religion j ($$i,j = 1,2,3,4$$), $${a_{i}} = \sum_{j} {{e_{ij}}}$$ the fraction of links originating from religion i, and $${b_{j}} = \sum_{i} {{e_{ij}}}$$ the fraction of links pointing to religion j. The corresponding mixing matrix is shown in Table 1. The religion network is clearly highly assortative, with most links connecting believers in the same religion: only 1.6% of links connect believers of different faiths. We further calculate the assortativity coefficient r [56], which lies in $$[-1,1]$$, with $$r=1$$ corresponding to perfect assortative mixing (see Methods). The assortativity coefficient of the religion network is surprisingly high, $$r = 0.973$$. It is even higher than that of some well-known social networks with remarkable segregation, such as sexual partnerships mixed by race [57] ($$r = 0.621$$) and the Twitter web of politicians in the Democratic and Republican parties [58] ($$r = 0.954$$). Besides the assortativity coefficient, we also use three other indices that are commonly employed to quantify network-level segregation: the E–I index ($${S _{\mathrm{EI}}}$$) [59], the Gupta–Anderson–May index (GAM index) [60] and the odds ratio for within-group ties (Odds-ratio WG ties) [61]. Their definitions are given in Methods. Table 2 shows the values of these indices for the religion network in comparison with the political network and the ethnic network. Consistent with the assortativity coefficient, the religion network exhibits the highest degree of segregation.
Table 1
Mixing matrix of the religion network. The number in the ith row and jth column represents $${e_{ij}}$$, the fraction of links from religion i to religion j

| From / To | Christianity | Buddhism | Islam | Taoism | $${a_{i}}$$ |
|---|---|---|---|---|---|
| Christianity | 0.5594303 | 0.0028952 | 0.000104332 | 0.0002739 | 0.5627038 |
| Buddhism | 0.0017606 | 0.2971778 | 0.000091291 | 0.0048254 | 0.3038551 |
| Islam | 0.0001956 | 0.0005999 | 0.056769869 | 0.0001304 | 0.0576958 |
| Taoism | 0.0001695 | 0.0046167 | 0.000013042 | 0.070946 | 0.0757453 |
| $${b_{j}}$$ | 0.5615561 | 0.3052897 | 0.05697853 | 0.0761757 | |
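
The assortativity coefficient quoted in the text can be recomputed from the entries of Table 1. The following is a minimal sketch using the formula given in Methods; the variable and function names are illustrative, and the matrix is typed in from the table.

```python
# Mixing matrix e_ij from Table 1
# (rows: source religion, columns: target religion;
#  order: Christianity, Buddhism, Islam, Taoism).
E = [
    [0.5594303, 0.0028952, 0.000104332, 0.0002739],
    [0.0017606, 0.2971778, 0.000091291, 0.0048254],
    [0.0001956, 0.0005999, 0.056769869, 0.0001304],
    [0.0001695, 0.0046167, 0.000013042, 0.070946],
]


def assortativity(e):
    """r = (sum_i e_ii - sum_i a_i b_i) / (1 - sum_i a_i b_i)."""
    n = len(e)
    a = [sum(e[i]) for i in range(n)]                       # row sums a_i
    b = [sum(e[i][j] for i in range(n)) for j in range(n)]  # column sums b_j
    trace = sum(e[i][i] for i in range(n))
    ab = sum(a[i] * b[i] for i in range(n))
    return (trace - ab) / (1.0 - ab)


r = assortativity(E)
cross_fraction = 1.0 - sum(E[i][i] for i in range(4))
print(round(r, 3))                     # 0.973
print(round(100 * cross_fraction, 1))  # 1.6 (percent of cross-religion links)
```

Both printed values agree with the figures reported in the text, $$r = 0.973$$ and 1.6% cross-religion links.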

Table 2
Segregation measures applied to the religion network, the Twitter web of politicians in the Democratic and Republican parties (political network) and the sexual partnerships network mixed by race (ethnic network). The value of Odds-ratio WG ties for the ethnic network cannot be obtained, since it requires the number of non-existent links in the ethnic network, which is unknown

| Measure | Religion network | Political network | Ethnic network |
|---|---|---|---|
| $$-{S_{\mathrm{EI}}}$$ | 0.969 | 0.954 | 0.477 |
| Assortativity coefficient | 0.973 | 0.954 | 0.621 |
| GAM index | 0.964 | 0.954 | 0.520 |
| Odds-ratio WG ties | 101.05 | 37.94 | N/A |

We further compare the mixing matrix of the religion network G with that of its randomized counterpart $${G^{\mathrm{null}}}$$, i.e., a null model of G obtained by the degree-preserved link-rewiring process [62] (see Methods). This process ensures that each node in $${G^{\mathrm{null}}}$$ has exactly the same in-degree and out-degree as in G. Since G is heterogeneous (with power-law-like degree distributions), such heterogeneity alone might lead to some segregation. Because the original network G and the null model $${G^{\mathrm{null}}}$$ have the same degree heterogeneity, a direct comparison of their mixing patterns isolates the role of the connecting pattern (the choice of whom to connect to) in producing the segregated structure. Table 3 shows the mixing matrix of the null network. We define the connecting ratio from religion i to religion j of G relative to $${G^{\mathrm{null}}}$$ as $$\rho _{ij} = e_{ij} / e_{ij}^{\mathrm{null}}$$, where $$e_{ij}^{\mathrm{null}}$$ is the fraction of links from religion i to religion j ($$i,j = 1,2,3,4$$) in the null network. Table 4 shows these ratios, from which one can observe two remarkable phenomena: (i) believers statistically tend to connect with others of the same faith, as indicated by $${\rho _{ii}} > 1$$ for all i, with Islam and Taoism exhibiting the highest levels of homophily, $${\rho _{33}} = 18.84$$ and $${\rho _{44}} = 12.42$$; (ii) the ratios associated with Buddhism, $${\rho _{2 \bullet }}$$ and $${\rho _{ \bullet 2}}$$, are the largest off-diagonal elements in their corresponding columns and rows, indicating that Buddhism plays the key role in cross-religion communication in China.
Table 3
Mixing matrix of the null network. The number in the ith row and jth column represents $$e_{ij}^{\mathrm{null}}$$, the fraction of links from religion i to religion j in the null network

| From / To | Christianity | Buddhism | Islam | Taoism | $${a_{i}}$$ |
|---|---|---|---|---|---|
| Christianity | 0.3152012 | 0.1703618 | 0.032669084 | 0.0444717 | 0.5627038 |
| Buddhism | 0.1704531 | 0.0945904 | 0.016940974 | 0.0218707 | 0.3038551 |
| Islam | 0.0329169 | 0.0176452 | 0.003012598 | 0.0041211 | 0.0576958 |
| Taoism | 0.042985 | 0.0226923 | 0.004355878 | 0.0057122 | 0.0757453 |
| $${b_{j}}$$ | 0.5615561 | 0.3052897 | 0.05697853 | 0.0761757 | |

Table 4
Connecting ratios of the religion network to the null network. The number in the ith row and jth column represents the ratio of the fraction of links from religion i to religion j in the religion network to that in the null network

| From / To | Christianity | Buddhism | Islam | Taoism |
|---|---|---|---|---|
| Christianity | 1.7748355 | 0.0169946 | 0.00319361 | 0.0061584 |
| Buddhism | 0.010329 | 3.1417345 | 0.00538876 | 0.2206321 |
| Islam | 0.0059429 | 0.0339985 | 18.8441558 | 0.0316456 |
| Taoism | 0.0039442 | 0.2034483 | 0.00299401 | 12.420091 |
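
The connecting ratios $$\rho _{ij} = e_{ij} / e_{ij}^{\mathrm{null}}$$ follow from an element-wise division of Table 1 by Table 3. The sketch below reproduces the two observations stated in the text; the matrices are typed in from the tables and the variable names are illustrative.

```python
RELIGIONS = ["Christianity", "Buddhism", "Islam", "Taoism"]

# e_ij from Table 1 (religion network).
E = [
    [0.5594303, 0.0028952, 0.000104332, 0.0002739],
    [0.0017606, 0.2971778, 0.000091291, 0.0048254],
    [0.0001956, 0.0005999, 0.056769869, 0.0001304],
    [0.0001695, 0.0046167, 0.000013042, 0.070946],
]
# e_ij^null from Table 3 (degree-preserving null network).
E_NULL = [
    [0.3152012, 0.1703618, 0.032669084, 0.0444717],
    [0.1704531, 0.0945904, 0.016940974, 0.0218707],
    [0.0329169, 0.0176452, 0.003012598, 0.0041211],
    [0.042985, 0.0226923, 0.004355878, 0.0057122],
]

# Element-wise ratio rho_ij = e_ij / e_ij^null.
rho = [[E[i][j] / E_NULL[i][j] for j in range(4)] for i in range(4)]

# (i) Homophily: every diagonal ratio exceeds 1.
homophily = all(rho[i][i] > 1 for i in range(4))

# (ii) For each non-Buddhist religion, find its strongest cross-religion target.
strongest_target = {
    RELIGIONS[i]: RELIGIONS[max((j for j in range(4) if j != i),
                                key=lambda j: rho[i][j])]
    for i in range(4) if RELIGIONS[i] != "Buddhism"
}
```

Here `strongest_target` maps Christianity, Islam and Taoism all to Buddhism, matching observation (ii).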

### 3.2 Cross-religion links analysis

As indicated by the structural statistics, a tiny number of cross-religion links (i.e., links connecting individuals in different religions, see Fig. 3(a) for a visualization) play a critical role in maintaining the global connectivity of the religion network. To quantify the significance of cross-religion links, we apply link percolation dynamics [63], in which links are ranked by a certain criterion and then removed one by one in that order. If removing links according to a criterion rapidly breaks the network into pieces, the criterion is regarded as a good indicator of link importance. For convenience, we consider the undirected version $$G'$$, which contains 1124 cross-religion links in total. The global connectivity is measured by $${R_{GC}}$$, the ratio of the number of nodes in the giant component (i.e., the largest connected component) to the total number of nodes N. As $${\rho _{r}}$$, the fraction of links removed, increases, the percolation dynamics may pass through a phase transition at which the network suddenly breaks into many small fragments, accompanied by a sharp drop of $${R_{GC}}$$. To precisely locate the critical point $$\rho _{r}^{c}$$, we adopt the normalized susceptibility $$\widetilde{S} = \sum_{s < s_{\max }} {\frac{{{n_{s}} {s^{2}}}}{N}}$$ [64], where $${n_{s}}$$ denotes the number of components of size s, and s runs from the smallest size to the second largest size. If there is a percolation phase transition, the $$\widetilde{S}({\rho _{r}})$$ curve shows an obvious peak at the critical point $$\rho _{r}^{c}$$, at which the network disintegrates. A set of links whose removal leads to a faster decay of $${R_{GC}}$$ and a smaller value of $$\rho _{r}^{c}$$ is considered more significant in maintaining the network connectivity.
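The two quantities tracked during link percolation, $${R_{GC}}$$ and $$\widetilde{S}$$, can be computed as in the sketch below. It is a plain recomputation on a toy graph (two groups joined by one bridge), not the authors' implementation; node labels 0..n-1, the function names and the toy edge list are assumptions. The susceptibility follows the formula literally, summing over all components strictly smaller than the largest one.

```python
from collections import defaultdict, deque


def component_sizes(n, edges):
    """Sizes of connected components of an undirected graph on nodes 0..n-1."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = [False] * n
    sizes = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue, size = deque([start]), 0
        while queue:  # breadth-first search over one component
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
        sizes.append(size)
    return sizes


def connectivity_measures(n, edges):
    """Return (R_GC, normalized susceptibility) for the current link set."""
    sizes = component_sizes(n, edges)
    s_max = max(sizes)
    susceptibility = sum(s * s for s in sizes if s < s_max) / n
    return s_max / n, susceptibility


def percolation_curve(n, ordered_edges):
    """Remove links in the given order; record (rho_r, R_GC, S) at each step."""
    m = len(ordered_edges)
    return [(removed / m, *connectivity_measures(n, ordered_edges[removed:]))
            for removed in range(m + 1)]


# Toy graph: group {0,1,2,3}, group {4,5,6}, one bridge (3, 4) removed first.
order = [(3, 4), (0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (4, 5), (5, 6), (6, 4)]
curve = percolation_curve(7, order)
rho_c, r_gc_at_peak, s_peak = max(curve, key=lambda point: point[2])
```

In this toy case the susceptibility peaks right after the single bridge is removed. Recomputing components at every step costs O(m) per step, so for a network with tens of thousands of links an incremental scheme (for example, adding links in reverse order with a union-find structure) would be the more practical choice.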
We compare the following four methods of identifying links that are significant for maintaining connectivity: (i) removing the 1124 cross-religion links in a random order; (ii) removing links in descending order of their betweennesses [48]; (iii) removing links in descending order of their bridgenesses [49]; (iv) removing links in descending order of their degrees [65]. The explicit definitions of betweenness, bridgeness and degree for an arbitrary link are presented in Methods. As shown in Fig. 3(b), as $${\rho _{r}}$$ increases, $${R_{GC}}$$ decreases much faster when the cross-religion links are removed first. Remarkable peaks of the susceptibility are observed only for the cross-religion links and the largest-betweenness-first method, and the critical point of the former ($$\rho _{r}^{c} = 1119/64\text{,}712$$) is one order of magnitude smaller than that of the latter ($$\rho _{r}^{c} = 14\text{,}956/64\text{,}712$$). Overall, the cross-religion links identify connectivity-critical links far better than betweenness, bridgeness or degree.
Table 5
Distribution and average degrees of nodes of different types. In each illustration, the black node is the ego under consideration and the white node(s) is (are) its neighbor(s). The first four rows show the distribution of nodes of different types in different religions. The numbers in brackets denote the number of charitable nodes. The average degree of the religion network is 11.15

| | Type 1 | Type 2 (•→∘) | Type 3 (∘→•) | Type 4 (∘→•→∘) |
|---|---|---|---|---|
| Christianity | 2930 | 153 | 61 (1) | 9 (1) |
| Buddhism | 2475 | 170 | 78 (24) | 68 (6) |
| Islam | 417 | 41 | 5 (0) | 7 (0) |
| Taoism | 271 | 109 | 42 (0) | 39 (1) |
| Average out-degree | 9.1674 | 25.0085 | 22.7366 | 38.7236 |
| Average in-degree | 6.7857 | 9.3002 | 134.3978 | 48.2602 |

## 4 Discussion

In summary, although everybody has observed some evidence of religious segregation in daily life, this paper provides a quantitative analysis based on a religion network extracted from weibo.com. The extent of network segregation between religions, measured by the assortativity coefficient, is even higher than that between races or political parties. In fact, to our knowledge, the present religion network exhibits the highest segregation among all previously reported social networks. Among the four religions under consideration, Buddhism plays the most significant role in promoting cross-religion communication. We cannot yet tell whether this is a phenomenon specific to China, where Buddhism is one of the few mainstays of the culture, or a universal phenomenon arising from the inclusive and tolerant nature of Buddhist doctrines. A solid answer to this question requires more data from twitter.com as well as other representative social networks at the national level. We have also found that the smaller religions in China, namely Islam and Taoism, show a much higher level of cohesion (see Table 4), which probably reflects the general observation that smaller subculture groups usually show a higher level of homophily and denser interactions [66].
A tiny fraction of cross-religion links maintains the global connectivity: their removal leads to a much faster breakdown of the network than the removal of the links with the highest betweennesses or bridgenesses. We therefore want to understand why these cross-religion links are created. To our surprise, about half of these links point to charitable nodes. This evidence suggests that charity may be a common interest that can cross the ideological barriers between religions. Accordingly, encouraging and holding charity-related activities, while inviting participants from different religions, may be an effective way to facilitate cross-religion communication.
In this paper, we demonstrate the effectiveness and validity of the data-driven paradigm in the study of religious issues, and we believe it will become the mainstream methodology in the near future [67, 68]. However, the current data set is very small in comparison with the whole population of Chinese religious believers. The reported findings are therefore only a small and early step towards a comprehensive picture of the communication patterns between believers of different faiths. Three open issues are left for further study. First, we would like to test the universality of the present observations using data from other countries. Secondly, we want to follow the evolution of the connecting patterns of religion networks by tracing temporal data [69]. Lastly, it would be interesting to examine the role of religious believers in the whole social network, instead of a network containing only believers, for example, whether followers of each religion form echo chambers in which they only share similar posts and only engage in conversation with similar users. This is of particular importance for countries like China, where theists are a minority and their social inclusion needs to be promoted.

## 5 Methods

### 5.1 Segregation measures

Segregation measures quantify whether, and to what extent, links tend to connect nodes of the same type.

(1) Assortativity coefficient [56] is defined as $$r = \frac{{\sum_{i} {{e_{ii}}} - \sum_{i} {{a_{i}} {b_{i}}} }}{{1 - \sum_{i} {{a_{i}}{b_{i}}} }}$$, where $${e_{ii}}$$, $${a_{i}}$$ and $${b_{i}}$$ are introduced in the main text. In the case of perfect assortative mixing, all links connect nodes of the same type, leading to $$\sum_{i} {{e_{ii}}} = 1$$ and $$r = 1$$.

(2) E–I index [59] measures the difference between the numbers of between-group and within-group links, $${S_{\mathrm{EI}}} = \frac{{\sum_{g} {\sum_{h \ne g} {{m_{gh}}} } - \sum_{g} {{m_{gg}}} }}{{\sum_{g} {\sum_{h} {{m_{gh}}} } }}$$, where $${m_{gh}}$$ is the number of links from group g to group h, and $$m_{gg}$$ is the number of links inside group g.

(3) Gupta–Anderson–May index [60] is defined as $${S_{\mathrm{GAM}}}=\frac{{ ( {\sum_{g} {{f_{gg}}} } ) - 1 }}{ {K - 1}}$$, where $${f_{gg}} = \frac{{{m_{gg}}}}{{\sum_{h} {{m_{gh}}} }}$$ is the proportion of the links originating from group g that stay inside group g, and K is the total number of groups in the network.

(4) Odds-ratio for within-group ties (Odds-ratio WG ties) [61] is calculated as $${S_{\mathrm{ORWG}}}=\frac{ {\sum_{g} {{m_{gg}}} \sum_{g} {\sum_{h \ne g} {\overline{m}_{gh}}} }}{{\sum_{g} {\overline{m}_{gg}} \sum_{g} {\sum_{h \ne g} {{m_{gh}}} } }}$$, where $$\overline{m}_{gh}$$ denotes the number of non-existent links (i.e., unconnected node pairs) between groups g and h, and $$\overline{m}_{gg}$$ the number of non-existent links inside group g.
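
As a consistency check, the three additional indices can be recomputed for the religion network from Table 1 and the group sizes given in the Dataset section. The sketch below rests on assumptions that are not stated in the source: link counts are reconstructed by multiplying the fractions in Table 1 by the 76,678 links and rounding, and the non-existent links are counted over ordered node pairs without self-loops. Under these assumptions the values in Table 2 are reproduced.

```python
# e_ij from Table 1; order: Christianity, Buddhism, Islam, Taoism.
E = [
    [0.5594303, 0.0028952, 0.000104332, 0.0002739],
    [0.0017606, 0.2971778, 0.000091291, 0.0048254],
    [0.0001956, 0.0005999, 0.056769869, 0.0001304],
    [0.0001695, 0.0046167, 0.000013042, 0.070946],
]
TOTAL_LINKS = 76678                 # directed links in G
SIZES = [3153, 2791, 470, 461]      # believers per religion
K = len(SIZES)

# Reconstructed link counts m_gh (assumption: fractions * total, rounded).
M = [[round(x * TOTAL_LINKS) for x in row] for row in E]

internal = sum(M[g][g] for g in range(K))   # within-group links
total = sum(sum(row) for row in M)
external = total - internal                 # between-group links

# E-I index: (external - internal) / total.
s_ei = (external - internal) / total

# GAM index: ((sum of within-group shares f_gg) - 1) / (K - 1).
f = [M[g][g] / sum(M[g]) for g in range(K)]
s_gam = (sum(f) - 1) / (K - 1)

# Odds ratio for within-group ties, over ordered pairs (assumption).
n = sum(SIZES)
pairs_within = sum(s * (s - 1) for s in SIZES)
pairs_between = n * (n - 1) - pairs_within
absent_within = pairs_within - internal
absent_between = pairs_between - external
s_orwg = (internal * absent_between) / (absent_within * external)

print(round(-s_ei, 3), round(s_gam, 3), round(s_orwg, 2))
```

The reconstructed counts sum to exactly 76,678 links, of which 1,202 are directed cross-religion links.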

### 5.2 Degree-preserved link-rewiring process

This process randomly reshuffles links while keeping the out-degree and in-degree of each node unchanged [62]. At each time step, we randomly select two links $$A \to B$$ and $$C \to D$$. If the link $$A \to D$$ or $$C \to B$$ already exists, we discard this pair and select two links again; otherwise the two links $$A \to B$$ and $$C \to D$$ are replaced by $$A \to D$$ and $$C \to B$$. We repeat this operation for a sufficiently long time ($$10^{6}$$ steps in this paper) to obtain the randomized counterpart (called the null network) of the original network.
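The rewiring step described above can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes the input has no duplicate links or self-loops, additionally rejects swaps that would create a self-loop (a rule not stated in the text), counts only successful swaps as steps, and caps the number of attempts so that the loop always terminates. The function names and the toy edge list are hypothetical.

```python
import random
from collections import Counter


def rewire_directed(edges, n_steps, seed=0):
    """Degree-preserving rewiring: swap targets of two randomly chosen links."""
    rng = random.Random(seed)
    edge_list = list(edges)
    edge_set = set(edge_list)
    m = len(edge_list)
    done, attempts = 0, 0
    max_attempts = 100 * n_steps  # safety cap (assumption)
    while done < n_steps and attempts < max_attempts:
        attempts += 1
        i, j = rng.randrange(m), rng.randrange(m)
        if i == j:
            continue
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if a == d or c == b:
            continue  # would create a self-loop (extra rule, assumption)
        if (a, d) in edge_set or (c, b) in edge_set:
            continue  # target link already exists: reselect
        # Replace A->B and C->D by A->D and C->B.
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def degree_sequences(edges):
    """Out-degree and in-degree of every node as two Counters."""
    return Counter(u for u, _ in edges), Counter(v for _, v in edges)


# Toy directed network (hypothetical).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3), (2, 5)]
rewired = rewire_directed(edges, n_steps=50, seed=1)
```

Each accepted swap leaves the out-degrees of A and C and the in-degrees of B and D unchanged, which is why the null network keeps the full degree heterogeneity of the original.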

### 5.3 Benchmark link centralities

The betweenness centrality of a link l is the sum, over all pairs of nodes, of the fraction of shortest paths between the pair that pass through l [48], $$B{C_{l}} = \sum_{s,t \in V,s \ne t} {\frac{{\sigma ( {s,l,t} )}}{ {\sigma ( {s,t} )}}}$$, where $$\sigma ( {s,t} )$$ is the number of shortest paths between nodes s and t, and $$\sigma ( {s,l,t} )$$ is the number of those paths passing through link l. The bridgeness of a link l is defined as $${B_{l}}={\sqrt{{S_{x}}{S_{y}}} } / {{S_{l}}}$$ [49], where x and y are the two endpoints of link l, and $${S_{x}}$$, $${S_{y}}$$ and $${S_{l}}$$ are the sizes of the maximum cliques (i.e., complete subgraphs) that contain node x, node y and link l, respectively. The degree of a link l is defined as $${D_{l}}={k_{x}}{k_{y}}$$ [65], where $${k_{x}}$$ and $${k_{y}}$$ are the degrees of the two endpoints of l.
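Of the three benchmarks, the link degree $${D_{l}}={k_{x}}{k_{y}}$$ is the simplest to compute. The sketch below ranks the links of a toy undirected graph by $${D_{l}}$$ in descending order, which is the removal order used by method (iv). It assumes a static ranking computed once on the intact network (the text does not say whether degrees are recalculated after each removal); the function names and the toy edge list are hypothetical.

```python
from collections import Counter


def link_degrees(edges):
    """Map each undirected link (x, y) to D_l = k_x * k_y."""
    k = Counter()
    for x, y in edges:
        k[x] += 1
        k[y] += 1
    return {(x, y): k[x] * k[y] for x, y in edges}


def removal_order(edges):
    """Links sorted by descending link degree (ties keep input order)."""
    d = link_degrees(edges)
    return sorted(edges, key=lambda link: d[link], reverse=True)


# Toy graph: hub 0 with neighbours 1, 2, 3, plus one link between 1 and 2.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
d_l = link_degrees(toy_edges)
order = removal_order(toy_edges)
```

In the toy graph, the links attached to the hub and a degree-2 neighbour rank first, and the link to the leaf node 3 ranks last.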

### Acknowledgements

The authors would like to acknowledge Jun Wang for providing the raw data.

### Availability of data and materials

The datasets analyzed during the current study are available at http://www.dcjingsai.com/common/share/73.html.

### Competing interests

The authors declare that they have no competing interests.

### Footnotes

1. Chinese text does not naturally consist of words but of characters; one, two or more characters form a word. A Chinese sentence looks like ABCDEFCGHHIAD, a sequence of characters without any blanks between them. At the word level, this sentence can be decomposed as “AB CD EFCG HH I AD” or as “AB CDEF CGH HI AD”. There is usually more than one way to decompose a sentence into words. Some decompositions lead to meaningless sentences, but it is quite possible for a sentence to have several decompositions with reasonable meanings. Word segmentation therefore usually depends on the semantics of the whole text and is a big challenge in Chinese natural language processing [70, 71].
