1 Introduction
- The child has symptoms of strawberry red tongue and swollen red hands.
- This kid is suffering from Kawasaki disease.
- The experiments are conducted on the recent TREC 2016 CDS task dataset. The results obtained on this new dataset are consistent with those on the TREC 2014 and 2015 CDS datasets.
- The proposed approach is further evaluated on five standard IR test collections. The results show that our approach outperforms strong baselines on IR tasks other than CDS.
2 Related Work
2.1 BM25 and PRF
2.2 State-of-the-Art CDS Methods
2.3 The Best-Performing Methods in the TREC CDS Tasks
3 Feedback-Based Semantic Relevance
3.1 Generating Embeddings of Biomedical Articles
3.2 Using Embeddings for CDS
4 Experimental Settings
4.1 Datasets
Topic type: diagnosis
Summary: A 78-year-old male presents with frequent stools and melena.
Description: 78 M transferred to nursing home for rehab after CABG. Reportedly readmitted with a small NQWMI. Yesterday, he was noted to have a melanotic stool and then today he had approximately 9 loose BM some melena and some frank blood just prior to transfer, unclear quantity
Note: 78 M w/pmh of CABG in early [**Month (only) 3**] at [**Hospital6 4406**] (transferred to nursing home for rehab on [**12-8**] after several falls out of bed.) He was then readmitted to [**Hospital6 1749**] on [**3120-12-11**] after developing acute pulmonary edema/CHF/unresponsiveness?. There was a question whether he had a small MI; he reportedly had a small NQWMI. He improved with diuresis and was not intubated. Yesterday, he was noted to have a melanotic stool earlier this evening and then approximately 9 loose BM w/ some melena and some frank blood just prior to transfer, unclear quantity
Model | infNDCG | infAP | MAP | R-Prec |
---|---|---|---|---|
Summary field | | | | |
BM25 | 0.2524 | 0.0805 | 0.1537 | 0.2004 |
\(BM25+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.2698 (+6.89%*) | 0.0935 (+16.15%*) | 0.1628 (+5.92%*) | 0.2067 (+3.14%) |
\(BM25+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.2748 (+8.87%*) | 0.0953 (+18.39%*) | 0.1645 (+7.03%*) | 0.2083 (+3.94%) |
Description field | | | | |
BM25 | 0.2460 | 0.0700 | 0.1440 | 0.2065 |
\(BM25+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.2751 (+11.83%*) | 0.0918 (+31.14%*) | 0.1623 (+12.71%*) | 0.2196 (+6.34%*) |
\(BM25+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.2830 (+15.04%*) | 0.0911 (+30.14%*) | 0.1661 (+15.35%*) | 0.2206 (+6.83%*) |
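The percentages reported next to each score appear to be relative changes over the baseline row of the same block. The following is a minimal sketch under that assumption; the function name `relative_change` is hypothetical and not taken from the paper, and the two input values are the Summary-field infNDCG scores of BM25 and \(BM25+SEM_{d_{Para}\text {-}D^k_{PRF}}\).

```python
def relative_change(baseline: float, score: float) -> float:
    """Return the percentage change of `score` relative to `baseline`."""
    return (score - baseline) / baseline * 100.0


# Summary-field infNDCG values as reported for BM25 and BM25+SEM (Para).
bm25 = 0.2524
bm25_sem_para = 0.2698

# Format with an explicit sign and two decimals, as in the result tables.
print(f"{relative_change(bm25, bm25_sem_para):+.2f}%")  # prints +6.89%
```

The same computation reproduces the other entries of that row, for example 0.0805 to 0.0935 for infAP gives 16.15%.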
Model | infNDCG | infAP | MAP | R-Prec |
---|---|---|---|---|
Summary field | | | | |
BM25 | 0.2695 | 0.0736 | 0.1650 | 0.2198 |
\(BM25+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.2980 (+10.58%*) | 0.0831 (+12.91%*) | 0.1758 (+6.55%*) | 0.2345 (+6.69%*) |
\(BM25+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.2986 (+10.80%*) | 0.0842 (+14.40%*) | 0.1791 (+8.55%*) | 0.2408 (+9.55%*) |
Description field | | | | |
BM25 | 0.2724 | 0.0733 | 0.1641 | 0.2184 |
\(BM25+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.2877 (+5.62%*) | 0.0837 (+14.19%*) | 0.1762 (+7.37%*) | 0.2325 (+6.46%*) |
\(BM25+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.3016 (+10.72%*) | 0.0873 (+19.10%*) | 0.1806 (+10.05%*) | 0.2370 (+8.52%*) |
Model | infNDCG | infAP | MAP | R-Prec |
---|---|---|---|---|
Summary field | | | | |
\(BM25_{PRF}\) | 0.2081 | 0.0274 | 0.0806 | 0.1491 |
\(BM25_{PRF}+SEM_{d_{pv}\text {-}D^k_{PRF}}\) | 0.2260 (+8.60%*) | 0.0318 (+16.06%*) | 0.0817 (+1.36%) | 0.1501 (+0.67%) |
\(BM25_{PRF}+SEM_{d_{add}\text {-}D^k_{PRF}}\) | 0.2493 (+19.80%*) | 0.0345 (+25.91%*) | 0.0837 (+3.85%*) | 0.1486 (−0.33%) |
Description field | | | | |
\(BM25_{PRF}\) | 0.1547 | 0.0153 | 0.0523 | 0.1121 |
\(BM25_{PRF}+SEM_{d_{pv}\text {-}D^k_{PRF}}\) | 0.1724 (+11.44%*) | 0.0222 (+45.10%*) | 0.0594 (+13.58%*) | 0.1148 (+2.40%) |
\(BM25_{PRF}+SEM_{d_{add}\text {-}D^k_{PRF}}\) | 0.1786 (+15.45%*) | 0.0225 (+47.06%*) | 0.0583 (+11.47%*) | 0.1195 (+6.60%*) |
Note field | | | | |
\(BM25_{PRF}\) | 0.1698 | 0.0206 | 0.0669 | 0.1197 |
\(BM25_{PRF}+SEM_{d_{pv}\text {-}D^k_{PRF}}\) | 0.1849 (+8.90%*) | 0.0242 (+17.90%*) | 0.0665 (−0.60%) | 0.1154 (−3.60%) |
\(BM25_{PRF}+SEM_{d_{add}\text {-}D^k_{PRF}}\) | 0.1957 (+15.25%*) | 0.0255 (+23.79%*) | 0.0709 (+5.98%*) | 0.1243 (+3.84%) |
Method | infNDCG | infAP | MAP | R-Prec |
---|---|---|---|---|
2014 CDS task | | | | |
\(BM25+Sim_{d_{Para}\text {-}Q}\) | 0.2618 | 0.0763 | 0.1579 | 0.1518 |
\(BM25+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.2698 (+3.06%) | 0.0935 (+22.54%*) | 0.1628 (+3.10%) | 0.2067 (+36.17%*) |
\(BM25+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.2748 (+4.97%*) | 0.0953 (+24.90%*) | 0.1645 (+4.18%) | 0.2083 (+37.22%*) |
2015 CDS task | | | | |
\(BM25+Sim_{d_{Para}\text {-}Q}\) | 0.2742 | 0.0657 | 0.1642 | 0.1491 |
\(BM25+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.2980 (+8.68%*) | 0.0831 (+26.48%*) | 0.1758 (+7.06%*) | 0.2345 (+57.28%*) |
\(BM25+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.2986 (+8.90%*) | 0.0842 (+28.16%*) | 0.1791 (+9.07%*) | 0.2408 (+61.50%*) |
Method | infNDCG | infAP |
---|---|---|
SNUMedinfo | 0.2674 | 0.0659 |
\(BM25+SEM_{d\text {-}D^k_{PRF}}\) | 0.2830 | 0.0911 |
Method | infNDCG | infAP |
---|---|---|
WSU-IR | 0.2939 | 0.0842 |
\(BM25+SEM_{d\text {-}D^k_{PRF}}\) | 0.3016 (+2.62%) | 0.0873 (+3.68%) |
4.2 Experimental Design
5 Evaluation Results
Method | infNDCG | infAP | MAP | R-Prec |
---|---|---|---|---|
Automatic run | | | | |
wsuirdaa | 0.2939 | 0.0842 | 0.1864 | 0.2306 |
\(wsuirdaa+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.3130 (+6.50%*) | 0.0896 (+6.41%*) | 0.1905 (+2.20%) | 0.2396 (+3.90%) |
\(wsuirdaa+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.3157 (+7.42%*) | 0.0898 (+6.65%*) | 0.1926 (+3.33%) | 0.2469 (+7.07%*) |
Manual run | | | | |
wsuirdma | 0.3109 | 0.0880 | 0.1968 | 0.2493 |
\(wsuirdma+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.3265 (+5.02%*) | 0.0940 (+6.82%*) | 0.2015 (+2.39%) | 0.2605 (+4.49%) |
\(wsuirdma+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.3335 (+7.27%*) | 0.0963 (+9.43%*) | 0.2054 (+4.37%) | 0.2643 (+6.02%*) |
Method | infNDCG | infAP | MAP | R-Prec |
---|---|---|---|---|
Automatic run | | | | |
\(wsuirdaa+SEM_{d_{LDA}\text {-}D^k_{PRF}}\) | 0.2963 | 0.0853 | 0.1864 | 0.2306 |
\(wsuirdaa+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.3130 (+5.64%*) | 0.0896 (+5.04%*) | 0.1905 (+2.20%) | 0.2396 (+3.90%) |
\(wsuirdaa+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.3157 (+6.55%*) | 0.0898 (+5.28%*) | 0.1926 (+3.33%) | 0.2469 (+7.07%*) |
Manual run | | | | |
\(wsuirdma+SEM_{d_{LDA}\text {-}D^k_{PRF}}\) | 0.3117 | 0.0887 | 0.1970 | 0.2494 |
\(wsuirdma+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.3265 (+4.75%*) | 0.0940 (+5.98%*) | 0.2015 (+2.28%) | 0.2605 (+4.45%) |
\(wsuirdma+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.3335 (+6.99%*) | 0.0963 (+8.57%*) | 0.2054 (+4.26%) | 0.2643 (+5.97%*) |
5.1 Application of the Semantic Relevance Score to Other State-of-the-Art Methods
6 Experimental Results on Other IR Test Collections
6.1 Experimental Settings
Collection | TREC Task | Topics | # of Topics | # of Docs |
---|---|---|---|---|
disk1&2 | 1, 2, 3 ad hoc | 51–200 | 150 | 741,856 |
disk4&5 | Robust 2004 | 301–450, 601–700 | 250 | 528,155 |
WT10G | 9, 10 Web | 451–550 | 100 | 1,692,096 |
GOV2 | 2004–2006 Terabyte Ad-hoc | 701–850 | 150 | 25,178,548 |
ClueWeb09 B | 2009–2011 Web | wt1–150 | 150 | 50,220,423 |
6.2 Results
Model | disk1&2 | disk4&5 | WT10G | GOV2 | CW09B |
---|---|---|---|---|---|
\(BM25\) | 0.2408 | 0.2534 | 0.2123 | 0.3008 | 0.2251 |
\(BM25+SEM_{d_{LDA}\text {-}D^k_{PRF}}\) | 0.2517 (+4.53%*) | 0.2675 (+5.56%*) | 0.2158 (+1.65%) | 0.3193 (+6.15%*) | 0.2306 (+2.44%*) |
\(BM25+SEM_{d_{TFIDF}\text {-}D^k_{PRF}}\) | 0.2477 (+2.87%) | 0.2554 (+0.79%) | 0.2187 (+3.01%) | 0.3043 (+1.16%) | 0.2311 (+2.67%*) |
\(BM25+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.2820 (+17.11%*) | 0.2862 (+12.94%*) | 0.2427 (+14.32%*) | 0.3138 (+4.32%*) | 0.2452 (+8.93%*) |
\(BM25+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.2727 (+13.25%*) | 0.2796 (+10.34%*) | 0.2423 (+14.13%*) | 0.3184 (+5.85%*) | 0.2404 (+6.80%*) |
Model | disk1&2 | disk4&5 | WT10G | GOV2 | CW09B |
---|---|---|---|---|---|
\(BM25_{PRF}\) | 0.3083 | 0.2966 | 0.2445 | 0.3430 | 0.2536 |
\(BM25_{PRF}+SEM_{d_{LDA}\text {-}D^k_{PRF}}\) | 0.3084 (+0.03%) | 0.2966 (+0.0%) | 0.2446 (+0.04%) | 0.3479 (+1.43%) | 0.2596 (+2.37%*) |
\(BM25_{PRF}+SEM_{d_{TFIDF}\text {-}D^k_{PRF}}\) | 0.3093 (+0.32%) | 0.2967 (+0.03%) | 0.2445 (+0.0%) | 0.3430 (+0.0%) | 0.2587 (+2.01%) |
\(BM25_{PRF}+SEM_{d_{Para}\text {-}D^k_{PRF}}\) | 0.3110 (+0.88%) | 0.2990 (+0.81%) | 0.2541 (+3.93%*) | 0.3484 (+1.57%) | 0.2635 (+4.87%*) |
\(BM25_{PRF}+SEM_{d_{Sum}\text {-}D^k_{PRF}}\) | 0.3105 (+0.71%) | 0.2985 (+0.64%) | 0.2541 (+3.93%*) | 0.3523 (+2.71%*) | 0.2677 (+5.42%*) |
Model | disk1&2 | disk4&5 | CW09B |
---|---|---|---|
Locally-trained | 0.563 | 0.517 | 0.258 |
Our method | 0.5779 | 0.5261 | 0.2633 |