
2017 | OriginalPaper | Book chapter

Exploiting Source-Object Networks to Resolve Object Conflicts in Linked Data

Authors: Wenqiang Liu, Jun Liu, Haimeng Duan, Wei Hu, Bifan Wei

Published in: The Semantic Web

Publisher: Springer International Publishing


Abstract

Considerable effort has been devoted to increasing the scale of Linked Data. However, integrating data from multiple sources raises an inevitable problem: different sources often provide conflicting objects for a given predicate of the same real-world entity, which causes the so-called object conflict problem. At present, the object conflict problem has not received sufficient attention in the Linked Data community. In this paper, we first formalize object conflict resolution as computing the joint distribution of variables on a heterogeneous information network called the Source-Object Network, which captures three correlations from objects and Linked Data sources. We then introduce a novel approach based on network effects, called ObResolution (object resolution), to identify the true object among multiple conflicting objects. ObResolution adopts a pairwise Markov Random Field (pMRF) to model all evidence under a unified framework. Extensive experiments on six real-world datasets show that our method achieves higher accuracy than existing approaches and that it is robust and consistent across domains.
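To make the idea of a joint distribution over a source-object network more concrete, the following is a minimal sketch of a generic pairwise Markov Random Field on a toy network, with marginals computed by brute-force enumeration. It is not the ObResolution method: the network, the variable names (`sources`, `objects`, `claims`), the prior of 0.7, and the coupling strength `EPS` are all assumptions chosen for illustration, and the abstract does not specify the potentials or the inference procedure the paper uses.

```python
from itertools import product

# Hypothetical toy network (illustrative only, not from the paper).
sources = ["s1", "s2", "s3"]
objects = ["o1", "o2"]  # two conflicting objects for one predicate
# A (source, object) edge means the source claims that object.
claims = [("s1", "o1"), ("s2", "o1"), ("s3", "o2")]

EPS = 0.2            # assumed coupling strength
SOURCE_PRIOR = 0.7   # assumed prior that a source is trustworthy


def node_potential(name, state):
    """Assumed priors: sources lean trustworthy, objects are uniform."""
    if name in sources:
        return SOURCE_PRIOR if state == 1 else 1.0 - SOURCE_PRIOR
    return 0.5


def edge_potential(source_state, object_state):
    """Assumed compatibility: trustworthy sources tend to claim true objects."""
    return 1.0 - EPS if source_state == object_state else EPS


def joint_weight(assignment):
    """Unnormalized pMRF weight: product of node and edge potentials."""
    weight = 1.0
    for name, state in assignment.items():
        weight *= node_potential(name, state)
    for s, o in claims:
        weight *= edge_potential(assignment[s], assignment[o])
    return weight


def marginals():
    """P(variable = 1) for every variable, by enumerating all assignments."""
    names = sources + objects
    totals = {n: 0.0 for n in names}
    z = 0.0  # partition function
    for states in product([0, 1], repeat=len(names)):
        assignment = dict(zip(names, states))
        w = joint_weight(assignment)
        z += w
        for n, st in assignment.items():
            if st == 1:
                totals[n] += w
    return {n: totals[n] / z for n in names}


if __name__ == "__main__":
    m = marginals()
    best = max(objects, key=lambda o: m[o])
    print(best, {o: round(m[o], 3) for o in objects})
```

Under these assumed potentials, the object claimed by two sources receives a higher marginal probability of being true than the object claimed by one. Enumeration is only feasible for a handful of variables; networks of realistic size would need approximate inference.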

Metadata
Title
Exploiting Source-Object Networks to Resolve Object Conflicts in Linked Data
Authors
Wenqiang Liu
Jun Liu
Haimeng Duan
Wei Hu
Bifan Wei
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-58068-5_4
