In this section, we discuss the results when 5 % of the edges were masked as unknown edges. Figures 5 and 6 show the results using Adamic/Adar and Katz, respectively. In each figure, (a), (b), and (c) show the results for the Enron, Irvine, and Friends-and-Family datasets, respectively. First, let us look at the result in Fig. 5a. The abscissa of the figure is the AUC value, while the ordinate is \(r\), the ratio of the number of usable edges to the total number of edges in the network graph, as described in the previous section. When \(r=1\), all the edges in the network graph are used, so there is no difference among the methods. On the other hand, when \(r=0.6\), only the top 60 % of the edges with the highest weights are used for link prediction. The figure plots the results of the proposed method (\(n=1, 2, 3\)), the count-based method, the period-based method, and the random method. As Fig. 5a shows, the AUC values became smaller as \(r\) became smaller. This is simply because link prediction became more difficult when the number of usable edges was limited. As we expected, the random method gave the lowest performance among all the methods, which demonstrates the effectiveness of edge weighting on the basis of temporal characteristics of interpersonal communication. Fig. 5a also shows that the proposed method was superior to the other methods. Next, comparing the proposed method with \(n=1\), 2, and 3, setting \(n\) to 1 gave the best result when \(r\) was between 0.7 and 0.9. In other words, when usable edges were limited to 70–90 %, taking into account only the period giving the maximum value of the energy spectral density was sufficient, and additionally considering the periods giving the second and third maximum values brought no benefit. This is a realistic and reasonable range (i) when the number of edges recordable in a GDB is limited, or (ii) when only a limited number of edges is used for high-speed analysis. However, the cases of \(n=2\) and 3 were slightly better than \(n=1\) when \(r\) was 0.6 and 0.65. This observation suggests that, as the number of usable edges is more strictly limited, considering the periods giving the second and third maximum values of the energy spectral density becomes more meaningful for maintaining the prediction accuracy.
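The evaluation procedure described above (keep the top fraction \(r\) of edges by weight, score node pairs, and measure AUC) can be illustrated with a minimal sketch. It makes several assumptions that the text does not confirm: the function names, the toy edge weights, and the node pairs are hypothetical, and AUC is computed here as the probability that a masked edge scores higher than a non-edge, with ties counted as 0.5. It is not the authors' implementation.

```python
import math


def keep_top_edges(weighted_edges, r):
    """Return the fraction r of edges with the highest weights, as (u, v) pairs."""
    ranked = sorted(weighted_edges, key=lambda e: e[2], reverse=True)
    k = int(round(r * len(ranked)))
    return [(u, v) for u, v, _ in ranked[:k]]


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def adamic_adar(adj, u, v):
    """Adamic/Adar score: sum of 1 / log(degree) over common neighbours.

    A common neighbour is adjacent to both u and v, so its degree is at
    least 2 and the logarithm is strictly positive.
    """
    common = adj.get(u, set()) & adj.get(v, set())
    return sum(1.0 / math.log(len(adj[z])) for z in common)


def auc(masked_scores, non_edge_scores):
    """Fraction of (masked edge, non-edge) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in masked_scores:
        for q in non_edge_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(masked_scores) * len(non_edge_scores))


# Hypothetical toy data: (node, node, weight). Weights are made up.
weighted_edges = [
    ("a", "b", 5.0),
    ("a", "c", 4.0),
    ("b", "d", 3.0),
    ("c", "d", 2.0),
    ("d", "e", 1.0),
]
masked_edges = [("b", "c")]            # assumed hidden true edge
non_edges = [("a", "d"), ("c", "e")]   # assumed node pairs with no true edge

usable = keep_top_edges(weighted_edges, r=0.6)  # top 60 % by weight
adj = build_adjacency(usable)
result = auc(
    [adamic_adar(adj, u, v) for u, v in masked_edges],
    [adamic_adar(adj, u, v) for u, v in non_edges],
)
print(f"AUC at r=0.6: {result:.2f}")
```

Repeating the last step for several values of \(r\) and for each weighting method would give one curve per method, which is the kind of comparison the figures report.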
As shown in Fig. 5b, c, the results obtained using the other datasets also presented the three trends we have already observed: the random method gave the lowest performance; the proposed method was the most effective of all; and the proposed method with \(n=1\) was generally better than with \(n=2\) and 3. However, some observations differed from dataset to dataset. For example, in Fig. 5b, the difference between the random method and the other methods was small. Another example is that, in Fig. 5c, setting \(n=2, 3\) in the proposed method was the most effective of all the methods when \(r\) was 0.7 or smaller. From the discussion of the results in Fig. 5, we have reached the following conclusions: (1) in the proposed method, using only the maximum value of the energy spectral density for edge weighting, by setting \(n=1\), is the most effective choice in the realistic range of usable edges; (2) the proposed method performs best as long as \(n\) is set appropriately; (3) the proposed method works well for various types of datasets.
Next, we discuss Fig. 6a–c. Compared with Fig. 5, the AUC values were generally higher in Fig. 6. This is reasonable because Katz takes paths of all lengths between two nodes into account, whereas Adamic/Adar considers only 2-hop relationships. We confirmed the three trends we had observed in Fig. 5: the random method gave the lowest performance; the proposed method was the most effective of all; and, in the proposed method, \(n=1\) was better than \(n=2\) and 3. However, the results differed slightly among the datasets. The period-based method was less effective than the random method in Fig. 6a. The differences among the methods were small in Fig. 6b. Nevertheless, the overall observations are the same as those from Fig. 5: (1) in the proposed method, using only the maximum value of the energy spectral density for edge weighting, by setting \(n=1\), is the most effective choice in the realistic range of usable edges; (2) the proposed method performs best as long as \(n\) is set appropriately; (3) the proposed method works well for various types of datasets. In addition, the discussion of Figs. 5 and 6 has verified that (4) the proposed method performs well with different link prediction methods.
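To illustrate why Katz can score node pairs that Adamic/Adar cannot, the following sketch computes a truncated Katz index, in which each walk of length \(l\) between two nodes contributes \(\beta^l\) to their score. The damping factor `beta`, the truncation length `max_len`, and the function name are assumptions chosen for illustration; the text does not state the settings used in the experiments.

```python
def katz_scores(nodes, edges, beta=0.1, max_len=4):
    """Truncated Katz index: sum over l = 1..max_len of beta**l * (#walks of length l).

    beta and max_len are illustrative assumptions, not values from the text.
    """
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)

    # Undirected 0/1 adjacency matrix.
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[index[u]][index[v]] = 1
        adj[index[v]][index[u]] = 1

    scores = [[0.0] * n for _ in range(n)]
    walks = [row[:] for row in adj]  # walks[i][j]: number of walks of the current length
    for length in range(1, max_len + 1):
        for i in range(n):
            for j in range(n):
                scores[i][j] += (beta ** length) * walks[i][j]
        # Extend every walk by one step (matrix product walks * adj).
        walks = [
            [sum(walks[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]

    return {(u, v): scores[index[u]][index[v]] for u in nodes for v in nodes if u != v}


# Path graph a - b - c - d: nodes a and d share no common neighbour,
# so a 2-hop measure gives them a score of zero, while Katz still
# credits the single 3-hop walk between them.
scores = katz_scores(["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")])
print(scores[("a", "c")], scores[("a", "d")])
```

The untruncated Katz index sums over all walk lengths and requires \(\beta\) to be smaller than the reciprocal of the largest eigenvalue of the adjacency matrix for the series to converge; the truncation here simply sidesteps that condition for a small example.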