2018 | OriginalPaper | Chapter

Relationship Matching of Data Sources: A Graph-Based Approach

Authors: Zaiwen Feng, Wolfgang Mayer, Markus Stumptner, Georg Grossmann, Wangyu Huang

Published in: Advanced Information Systems Engineering

Publisher: Springer International Publishing

Abstract

Relationship matching is a key step in transforming structured data sources, such as relational databases and spreadsheets, into a common data model. The matching task is the automatic identification of correspondences between the relationships among source columns and the relationships of the common data model. Numerous techniques have been developed for this purpose. However, existing work does not recognize relationship types between entities at the instance level in information obtained from data sources, nor does it resolve the resulting ambiguities. In this paper, we develop a method for resolving ambiguous relationship types between entity instances in structured data. The proposed method can be used as a standalone matching technique or to complement existing relationship matching techniques for data sources. An evaluation on a large real-world data set demonstrated the high accuracy of our approach (>80%).

Metadata
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-91563-0_33
