2017 | OriginalPaper | Chapter

SourceVote: Fusing Multi-valued Data via Inter-source Agreements

Authors: Xiu Susie Fang, Quan Z. Sheng, Xianzhi Wang, Mahmoud Barhamgi, Lina Yao, Anne H. H. Ngu

Published in: Conceptual Modeling

Publisher: Springer International Publishing

Abstract

Data fusion is the fundamental research problem of identifying the true values of data items of interest from conflicting multi-sourced data. Although considerable research effort has been devoted to this topic, existing approaches generally assume that every data item has exactly one true value, which fails to reflect the real world, where data items with multiple true values are common. In this paper, we propose a novel approach, SourceVote, to estimate value veracity for multi-valued data items. SourceVote models the endorsement relations among sources by quantifying their two-sided inter-source agreements. In particular, two graphs are constructed to model inter-source relations. Two aspects of source reliability are then derived from these graphs and used for estimating value veracity and for initializing existing data fusion methods. Empirical studies on two large real-world datasets demonstrate the effectiveness of our approach.
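The abstract gives no formulas, so the following is only a minimal sketch of the general idea: build two agreement graphs among sources and derive a reliability score per source from each. Everything concrete here is an assumption made for illustration and is not taken from the chapter: the `claims` layout, the edge weighting (the fraction of one source's stance that another source shares), the reading of the second graph as agreement on values that both sources omit from the pooled candidates, and the PageRank-style iteration with a damping factor of 0.85.

```python
def candidates(claims):
    """Pool every value that any source claims for each data item."""
    pool = {}
    for per_item in claims.values():
        for item, values in per_item.items():
            pool.setdefault(item, set()).update(values)
    return pool


def agreement_graph(claims, positive=True):
    """Build a weighted directed graph; edge a -> b means b shares a's stance.

    positive=True compares the values each source claims; positive=False
    compares the pooled candidate values each source leaves out (assumed
    here to be an implicit rejection).
    """
    pool = candidates(claims)
    graph = {a: {} for a in claims}
    for a in claims:
        for b in claims:
            if a == b:
                continue
            weight = 0.0
            # Only items covered by both sources contribute.
            for item in claims[a].keys() & claims[b].keys():
                if positive:
                    mine, theirs = claims[a][item], claims[b][item]
                else:
                    mine = pool[item] - claims[a][item]
                    theirs = pool[item] - claims[b][item]
                if mine:
                    weight += len(mine & theirs) / len(mine)
            if weight > 0:  # no edge when nothing is shared
                graph[a][b] = weight
    return graph


def rank(graph, damping=0.85, iterations=50):
    """PageRank-style power iteration over a weighted graph."""
    nodes = list(graph)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for a in nodes:
            total = sum(graph[a].values())
            if total == 0:
                # A source with no outgoing edges spreads its score evenly.
                for v in nodes:
                    nxt[v] += damping * score[a] / n
            else:
                for b, w in graph[a].items():
                    nxt[b] += damping * score[a] * w / total
        score = nxt
    return score


# Hypothetical toy input: three sources describing one multi-valued item.
claims = {
    "s1": {"book1": {"A", "B"}},
    "s2": {"book1": {"A", "B"}},
    "s3": {"book1": {"C"}},
}
pos_scores = rank(agreement_graph(claims, positive=True))
neg_scores = rank(agreement_graph(claims, positive=False))
```

In this toy input, `s1` and `s2` endorse each other in both graphs while `s3` shares nothing with either, so `s3` ends up with the lowest score. The uniform term `(1 - damping) / n` keeps every source reachable, which is one simple way to stand in for links between sources that share no values.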

Footnotes
1. Here we neglect the smoothing links; that is, the graphs contain no link between two sources that share no common value.
2. Note that we did not apply SourceVote to Voting, because Voting assumes that all sources are equally reliable.

Metadata
Copyright Year: 2017
DOI: https://doi.org/10.1007/978-3-319-69904-2_13
