1 Introduction
2 Jaya algorithm
3 Team formation problem (TFP)
4 The proposed algorithm
Parameter | Definition |
---|---|
V
| Experts set |
SN(V, E) | Social network of experts |
S
| Skill set |
T
| A task that has required skills |
X
| Team of experts |
\(C(s_k)\)
| Experts set with skill \(s_k\) |
\(s(v_i)\)
| Skill of expert \(v_i\) |
\(e(v_i,v_j)\)
| Communication cost between experts \(v_i\) and \(v_j\) |
4.1 Principles of Jaya algorithm
-
Initialization. The algorithm starts by generating the initial population randomly \(X^{t+1}_{j,k}\), \(j=1,\ldots , m\), m is the number of problem variables, \(k=1,2,\ldots ,SS\), and SS is the population size.
-
Population evaluating. At iteration t, the solutions in the population are evaluating where the best \((X^{t}_{\mathrm{best}})\) and worst \((X^{t}_{\mathrm{worst}})\) solutions are assigned.
-
Solutions updating. Each solution in the population is updating based on the best and worst solutions as shown in Eq. 1:where \(r^t_{1,j}\) and \(r^t_{2,j}\) are two random numbers in the range [0, 1]. If the new solution \(X^{t+1}_{j,k}\) is better than the current solution \(X^{t}_{j,k}\), then the new solution becomes the current solution.$$\begin{aligned} X^{t+1}_{j,k}= & {} X^{t}_{j,k}+r^t_{1,j}[(X^{t}_{j,\mathrm{best}})\nonumber \\&-|(X^{t}_{j,k})|]-r^t_{2,j}[(X^{t}_{j,\mathrm{worst}})-|(X^{t}_{j,k})|] \end{aligned}$$(1)
-
Termination criteria The previous steps are repeated until termination criteria satisfied.
4.2 A modified swap operator (MSO)
4.3 Improved Jaya algorithm with a modified swap operator (IJMSO)
-
Initialization. In IJMSO, the initial population are generated randomly \(X_{j,k}^{t}\), where \(j=1,\ldots ,m \), m is the dimension of the problem and \(k=1, \ldots ,SS\), and SS is the population size.
-
Solution evaluation. The objective function \(f(X^{t})\) for each solution is calculated, and the best and the worst solutions are assigned in the population. The communication cost between two experts can be computed as shown in Eq. 2.where \(s(v_i)\) and \(s(v_j)\) are the skill set of expert \(v_i\) and expert \(v_j\). TFP is an optimization problem and it can be defined as shown in Eq. 3.$$\begin{aligned} e_{ij}= 1- \frac{s(v_i)\cap s(v_j)}{s(v_i)\cup s(v_j)} \end{aligned}$$(2)where \((x_i\in X)\) is a cardinality of team X and \(e_{ij} \in [0,1]\) is a weight (communication cost) between each two adjacent experts in each solution.$$\begin{aligned} \mathrm{Min} f(x_i)= \sum _{i=1}^{x_i}\sum _{j=i+1}^{x_i} e_{ij} \end{aligned}$$(3)
-
Single-point crossover. In order to improve the current solution, we obtained the overall best solution \(X^{t}_\mathrm{best}\) and the current solution \((X^{t})\) and we select the best solution from the obtained offspring
-
Solution update. The position of each solution in the population is updated according to Eq. 4, which represent the main conversion from continuous domain to discrete domain through using different operators and modified swap operator.where “\(\oplus \)” is a combining operator of two swap operators. The mark “\(\otimes \)” means the probability of \(r^t_{1,j}\) that all swap operators are selected in the swap sequences \((X^{t}_{j,\mathrm{cross}})-(X^{t}_{j,k})\) and the probability of \(r^t_{2,j}\) that all swap operators are selected in the swap sequences \((X^{t}_{j,\mathrm{worst}})-(X^{t}_{j,k})\).$$\begin{aligned} X^{t+1}_{j,k}= & {} X^{t}_{j,k} \oplus r^t_{1,j} \otimes [(X^{t}_{j,\mathrm{cross}})-(X^{t}_{j,k})]\nonumber \\&-r^t_{2,j} \otimes [(X^{t}_{j,\mathrm{worst}})-(X^{t}_{j,k})] \end{aligned}$$(4)If the new solution is better than the current solution, then we accept the new solution; otherwise, we select the current solution to be the new solution.
-
Termination criteria. The overall process are repeated until number of iterations.
4.4 An illustrative example of IJMSO for TFP
-
\(s(a)=\{publications,conference,research\}\),
-
\(s(b)=\{conference,funding,publications\}\),
-
\(s(c)=\{journals,phd,research\}\),
-
\(s(d)=\{publications,cv\}\),
-
\(s(e)=\{conference,phd\}\).
-
Initialization. In the IJMSO algorithm, initial population is generated randomly as illustrated in Table 2.
Solution_id | Solution |
---|---|
A | (a,c,e) |
B | (d,e,a) |
C | (d,c,b) |
D | (a,c,b) |
-
Solution evaluation. The solution is evaluated by calculating the summation of all communication cost between all experts in it as shown in Eqs. 2, 3. Table 3 shows an example of solution evaluation process.The solution with the overall minimum weight represents a gbest (the global best solution), while the solution with overall maximum weight represents a gworst (global worst solution).
Solution_id | Solution | f(x) | |
---|---|---|---|
A | (a,c,e) | 0.57 | gworst |
B | (d,e,a) | 0.4 | |
C | (d,c,b) | 0.2 | gbest |
D | (a,c,b) | 0.5 |
-
Solution updating. In each iteration, the solution is updated and computed according Eq. 4.In the above example and based on Table 3, “gbest” is solution C and “gworst” is solution A. In each iteration, the solution is updated and computed according to the update Eq. 4 as follows.The main steps for updating individual A is represented as follows:The main steps for updating individual B is represented as follows1.Consider \(r_1=1\) and \(r_2=0.7\)3.The one with minimum communication cost is chosen as a result of crossover; in this case, \(f(A_1)= 0.2\) and \(f( A_2)=0.5\). Therefore, the \(X^{t}_\mathrm{cross}\) solution is \(A_1= (d,c,e)\)5.For “gbest,” \(A_1-A\) : MSO (1,2,0)6.For “gworst,” \(A-A = 0\) (i.e., identical solutions)7.Solution A is updated as follows \(A = (a,c,e)\) \(\oplus \) (MSO \((1,2,0))= (a,c,e)\)8.Individual A(a, c, e) is updated to (a, c, e)9.The communication cost of it is \(f(A)=0.57\)The same procedure is applied for solution C and D.1.Consider \(r_1=1\) and \(r_2=0.7\)2.A single-point crossover applied with the same procedure with an individual B as shown in Fig. 5b3.The one with minimum communication cost is chosen as a result of crossover; in this case, \(f(B_1)= 0.4\) and \(f(B_2)=0.4\). Therefore, the \(X^{t}_\mathrm{cross}\) solution is \(B_1 (d,c,e)\)5.For “gbest” part \(B_1-B\) : MSO (2,1,0)6.For “gworst” part, \(A-B\) = MSO (1,0,2) , MSO(2,0,1), MSO(3,2,0). It means for \(skill_\mathrm{id}=1\) exclude \(expert_\mathrm{id} =2\) and replace it with another expert chosen randomly. \(A-B\) = MSO (1,0,1) , MSO(2,0,0), MSO(3,2,1)7.Solution B is updated as follows: \(B {=}(d{,}e{,}a)\) \(\oplus \) (MSO(2,1,0), MSO(1,0,1), MSO(2,0,0)) = (d, c, a) \(\oplus \) (MSO (1,0,1), MSO(2,0,0)) = (a, c, a) \(\oplus \) (MSO(2,0,0)) = (a, c, a)8.Individual B(d, e, a) is updated to (a, c, a)9.The communication cost of it is \(F(B)=0.17\)According to that example, the next iteration for solution A is still the same, but for solution B changed from 0.4 to 0.17 and the same “gbest” can be updated according to the solution that has a minimum communication cost.
-
Termination criteria. The overall steps are repeated until satisfied number of iterations which result the most feasible team is formed so far for required skills (i.e., the global best solution “gbest” so far).
5 Numerical experiments
5.1 Parameter setting
Exp. no. | No. of | No. of individuals | No. of |
---|---|---|---|
iterations | in initial population | skills | |
1 | 5 | 4 | 2 |
2 | 10 | 4 | 4 |
3 | 15 | 6 | 6 |
4 | 20 | 6 | 8 |
5 | 25 | 8 | 10 |
5.2 DBLP dataset
-
The set of experts contains the authors with at least three published papers in DBLP. There are 77 authors that have published papers more than three.
-
If there are two experts have sharing papers’ skills, then they become connected and their communication cost is calculated as shown in Eq. 2.
-
We have considered the most important shared skills between experts extracted from the title of 267 papers.
5.2.1 Comparison between IJMSO and other meta-heuristic algorithms with DBLP dataset
Exp. no. | No. of skills | GA | PSO | ABO | Jaya | IJMSO | |
---|---|---|---|---|---|---|---|
1 | 2 | Best | 0.85 | 0.85 | 0.63 | 0.72 | 0.5 |
Worst | 0.96 | 0.97 | 0.93 | 0.96 | 0.93 | ||
Mean | 0.915 | 0.923 | 0.875 | 0.889 | 0.722 | ||
St.d | 0.0392 | 0.0333 | 0.0891 | 0.0707 | 0.1996 | ||
2 | 4 | Best | 5.05 | 5.19 | 5.07 | 4.88 | 4.57 |
Worst | 5.76 | 5.66 | 5.63 | 5.5 | 5.23 | ||
Mean | 5.383 | 5.405 | 5.25 | 5.163 | 4.805 | ||
St.d | 0.2440 | 0.1816 | 0.1594 | 0.1809 | 0.2445 | ||
3 | 6 | Best | 13.61 | 13.12 | 13 | 12.83 | 11.66 |
Worst | 14.23 | 14.21 | 13.74 | 13.72 | 12.95 | ||
Mean | 13.961 | 13.792 | 13.369 | 13.212 | 12.61 | ||
St.d | 0.2206 | 0.2912 | 0.2794 | 0.2947 | 0.3599 | ||
4 | 8 | Best | 25.66 | 25.32 | 24.82 | 23.53 | 22.22 |
Worst | 26.82 | 26.86 | 26.26 | 25.37 | 24.79 | ||
Mean | 26.045 | 25.946 | 25.475 | 24.692 | 23.552 | ||
St.d | 0.3636 | 0.5116 | 0.4512 | 0.5611 | 0.8486 | ||
5 | 10 | Best | 40.95 | 41.5 | 39.44 | 38.05 | 37.04 |
Worst | 42.79 | 42.11 | 42.11 | 40.94 | 40.5 | ||
Mean | 41.93 | 41.80167 | 40.86 | 39.635 | 38.732 | ||
St.d | 0.5284 | 0.2513 | 0.8419 | 0.9158 | 1.2010 |
5.3 StackExchange dataset
-
The expert set consists of users that have at least 10 posts (i.e., distinct tags) in academia.stackexchange (192 users)
-
If there are two experts with share post’ tags (skills), then they become connected. The communication cost \(e_{ij}\) of expert i and j is evaluated as shown in Eq. 2.
-
We have considered the most important shared skills such as “publications,” “phd” and “conference” between experts extracted from the tags of distinct posts’ title by users using StringTokenizer in Java.
5.3.1 Comparison between IJMSO and other meta-heuristic algorithms with StackExchange dataset
Exp. no. | No. of skills | GA | PSO | ABO | Jaya | IJMSO | |
---|---|---|---|---|---|---|---|
1 | 2 | Best | 0.81 | 0.81 | 0.81 | 0.78 | 0.77 |
Worst | 0.88 | 0.86 | 0.85 | 0.86 | 0.82 | ||
Mean | 0.83 | 0.84 | 0.82 | 0.82 | 0.79 | ||
St.d | 0.0263 | 0.0163 | 0.0125 | 0.0231 | 0.0164 | ||
2 | 4 | Best | 4.3 | 4.25 | 4.28 | 4.24 | 4.06 |
Worst | 5.14 | 5.2 | 5.16 | 5.07 | 4.59 | ||
Mean | 4.83 | 4.87 | 4.74 | 4.53 | 4.28 | ||
St.d | 0.3673 | 0.3417 | 0.3695 | 0.2975 | 0.1477 | ||
3 | 6 | Best | 12.11 | 12.05 | 11.53 | 11.40 | 10.50 |
Worst | 13.12 | 13.18 | 12.80 | 12.32 | 11.79 | ||
Mean | 12.759 | 12.76 | 12.05 | 11.95 | 11.35 | ||
St.d | 0.3565 | 0.3909 | 0.3370 | 0.2666 | 0.4161 | ||
4 | 8 | Best | 21.89 | 23 | 20.45 | 20.67 | 19.43 |
Worst | 24.24 | 24.25 | 23.19 | 23.09 | 22.63 | ||
Mean | 23.31 | 23.68 | 22.23 | 22.08 | 21.54 | ||
St.d | 0.6262 | 0.4273 | 0.9435 | 0.7386 | 1.0354 | ||
5 | 10 | Best | 36.92 | 36.57 | 35.98 | 35.38 | 33.69 |
Worst | 42 | 42.11 | 42.11 | 36.84 | 38.22 | ||
Mean | 38.40 | 38.55 | 37.22 | 36.144 | 35.48 | ||
St.d | 1.3947 | 1.8753 | 1.8337 | 0.5002 | 1.1868 |
5.4 The confidence interval (CI) test
5.4.1 Confidence interval (CI) of IJMSO and the other meta-heuristic algorithm for DBLP dataset
No. of skills | GA | PSO | ABO | Jaya | IJMSO |
---|---|---|---|---|---|
2 | \(0.915 \pm 0.0243\) | \(0.923 \pm 0.0206\) | \(0.875 \pm 0.0552\) | \(0.889 \pm 0.0438\) | \(0.722 \pm 0.1237\) |
4 | \(5.383 \pm 0.1512\) | \(5.405 \pm 0.1125\) | \(5.25 \pm 0.0988\) | \(5.163 \pm 0.1121\) | \(4.805 \pm 0.1516\) |
6 | \(13.961 \pm 0.1367\) | \(13.792 \pm 0.1805\) | \(13.369 \pm 0.1731\) | \(13.212 \pm 0.1826\) | \(12.61 \pm 0.2230\) |
8 | \(26.045 \pm 0.2254\) | \(25.946 \pm 0.3171\) | \(25.475 \pm 0.2796\) | \(24.692 \pm 0.3477\) | \(23.552 \pm 0.5259\) |
10 | \(41.93 \pm 0.3275\) | \(41.80167 \pm 0.1557\) | \(40.86 \pm 0.5218\) | \(39.635 \pm 0.5676\) | \(38.732 \pm 0.7444\) |
5.4.2 Confidence interval (CI) of IJMSO and other meta-heuristic algorithm for StackExchange dataset
No. of skills | GA | PSO | ABO | Jaya | IJMSO |
---|---|---|---|---|---|
2 | \(0.836 \pm 0.0163\) | \(0.84 \pm 0.0101\) | \(0.823 \pm 0.0077\) | \(0.827 \pm 0.0143\) | \(0.794 \pm 0.0102\) |
4 | \(4.835 \pm 0.2276\) | \(4.873 \pm 0.2118\) | \(4.742 \pm 0.2290\) | \(4.530 \pm 0.1843\) | \(4.280 \pm 0.0915\) |
6 | \(12.759 \pm 0.2209\) | \(12.763 \pm 0.2423\) | \(12.058 \pm 0.2089\) | \(11.959 \pm 0.1652\) | \(11.347 \pm 0.2579\) |
8 | \(23.307 \pm 0.3881\) | \(23.687 \pm 0.2648\) | \(22.237 \pm 0.5848\) | \(22.081 \pm 0.4578\) | \(21.545 \pm 0.6417\) |
10 | \(38.407 \pm 0.8644\) | \(38.55 \pm 1.1623\) | \(37.226 \pm 1.1365\) | \(36.144 \pm 0.3100\) | \(35.484 \pm 0.7355\) |
5.5 The running time of IJMSO and the other algorithms
Exp. no. | No. of skills | GA | PSO | ABO | Jaya | IJMSO | |
---|---|---|---|---|---|---|---|
1 | 2 | Best | 0.22 | 0.56 | 0.25 | 0.60 | 0.56 |
Worst | 0.42 | 0.98 | 0.51 | 0.80 | 1.13 | ||
Mean | 0.29 | 0.74 | 0.34 | 0.70344 | 0.69 | ||
2 | 4 | Best | 1.40 | 2 | 1.40 | 2.10 | 2.20 |
Worst | 1.90 | 4.8 | 1.80 | 2.8 | 2.50 | ||
Mean | 1.55 | 2.62 | 1.56 | 2.31 | 2.35 | ||
3 | 6 | Best | 9.13 | 10.53 | 8.93 | 10.80 | 10.73 |
Worst | 15.66 | 16.40 | 14.46 | 14.2 | 13.46 | ||
Mean | 11.01 | 12.34 | 10.46 | 12.20 | 11.76 | ||
4 | 8 | Best | 21.88 | 24.56 | 21.88 | 24.85 | 24.70 |
Worst | 27.29 | 29.77 | 25.66 | 33.93 | 33.50 | ||
Mean | 23.88 | 26.64 | 23.76 | 27.75 | 26.91 | ||
5 | 10 | Best | 57.96 | 59.80 | 56.28 | 60.65 | 58.60 |
Worst | 64.24 | 69.36 | 64.15 | 70.24 | 71.77 | ||
Mean | 61.30 | 63.39 | 60.54 | 64.09 | 63.67 |
Exp. no. | No. of skills | GA | PSO | ABO | Jaya | IJMSO | |
---|---|---|---|---|---|---|---|
1 | 2 | Best | 1.072 | 1.53 | 1.07 | 1.46 | 1.08 |
Worst | 3.45 | 3.44 | 3.52 | 3.39 | 3.73 | ||
Mean | 1.47 | 1.85 | 1.94 | 1.87 | 1.91 | ||
2 | 4 | Best | 11.33 | 11.46 | 11.15 | 11.74 | 10.87 |
Worst | 34.80 | 35.48 | 37.20 | 36.89 | 37.87 | ||
Mean | 14.59 | 14.99 | 17.54 | 17.40 | 17.81 | ||
3 | 6 | Best | 56.84 | 58.85 | 56.98 | 58.04 | 58.01 |
Worst | 65.77 | 174.62 | 177.69 | 73.84 | 184.40 | ||
Mean | 59.22 | 72.81 | 71.50 | 61.92 | 96.54 | ||
4 | 8 | Best | 135.83 | 141.86 | 135.85 | 140.76 | 141.51 |
Worst | 148.00 | 157.24 | 449.11 | 465.46 | 437.40 | ||
Mean | 142.24 | 146.38 | 175.39 | 204.76 | 230.49 | ||
5 | 10 | Best | 358.77 | 369.32 | 361.53 | 359.27 | 389.24 |
Worst | 396.02 | 1136.48 | 382.75 | 1124.89 | 1094.82 | ||
Mean | 383.10 | 534.88 | 374.79 | 459.04 | 472.21 |