13 June 2021 | Special Issue Article

A data reuse strategy based on deep learning for high dimensional data’s pattern and instance similarity

Authors: Feng Wu, Hongwei Lv, Tongrang Fan, Wenbin Zhao, Jiaqi Wang

Published in: Computing

Abstract

A data reuse strategy is an effective way to save storage space and improve data utilization in data management. Given the successful application of deep learning to text mining, this paper proposes a deep-learning-based data reuse strategy built on the pattern similarity and instance similarity of high-dimensional data. Traditional feature analysis is combined with a convolutional neural network to analyze the pattern similarity of data dimensions and thereby optimize the similar dimension pairs among high-dimensional data sets. For instance similarity, a semantic similarity model, IA-LSTM, is designed around an inner-attention mechanism; it builds association mappings among data entities by computing the similarity of short texts. Using the pattern and instance similarity from the proposed strategy, reusable data entities are discovered, and column storage is designed to improve data reuse efficiency.
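
As a rough illustration of the pattern-similarity step only, the sketch below compares the columns (dimensions) of two tables using hand-crafted features and cosine similarity, and reports the pairs that score above a threshold. The three features, the 0.9 threshold, the function names, and the sample tables are all assumptions made for illustration. The strategy described above also uses a convolutional neural network, which is not reproduced here.

```python
import math
from itertools import product


def column_features(values):
    """Hand-crafted features for one column of strings (illustrative choice)."""
    n = len(values) or 1
    total_chars = sum(len(v) for v in values) or 1
    digits = sum(ch.isdigit() for v in values for ch in v)
    alphas = sum(ch.isalpha() for v in values for ch in v)
    return [
        digits / total_chars,    # share of digit characters
        alphas / total_chars,    # share of alphabetic characters
        len(set(values)) / n,    # ratio of distinct values
    ]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_dimension_pairs(table_a, table_b, threshold=0.9):
    """Return (column_a, column_b, score) for every cross-table pair
    whose feature vectors reach the (assumed) similarity threshold."""
    feats_a = {name: column_features(vals) for name, vals in table_a.items()}
    feats_b = {name: column_features(vals) for name, vals in table_b.items()}
    pairs = []
    for (name_a, fa), (name_b, fb) in product(feats_a.items(), feats_b.items()):
        score = cosine(fa, fb)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return pairs


# Hypothetical sample tables: column name -> list of cell values.
table_a = {"phone": ["13800001111", "13900002222"], "name": ["Li Wei", "Zhang Min"]}
table_b = {"tel": ["13700003333", "13600004444"], "full_name": ["Wang Fang", "Chen Jie"]}

pairs = similar_dimension_pairs(table_a, table_b)
```

With these sample tables, the digit-only columns pair with each other and the name columns pair with each other, while the cross pairs fall below the threshold. Candidate pairs of this kind would then be the input to instance-level matching.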

Metadata
Title
A data reuse strategy based on deep learning for high dimensional data’s pattern and instance similarity
Authors
Feng Wu
Hongwei Lv
Tongrang Fan
Wenbin Zhao
Jiaqi Wang
Publication date
13 June 2021
Publisher
Springer Vienna
Published in
Computing
Print ISSN: 0010-485X
Electronic ISSN: 1436-5057
DOI
https://doi.org/10.1007/s00607-021-00964-4