Big-data: transformation from heterogeneous data to semantically-enriched simplified data

Multimedia Tools and Applications

Abstract

In big data, data originates from many distributed and heterogeneous sources in the form of audio, video, and text, and it arrives in real time, which makes it too massive and complex for traditional systems to handle. Such data needs a semantically enriched representation for better utilization, yet that representation must remain simple. The Resource Description Framework (RDF), introduced by the World Wide Web Consortium (W3C), makes such a representation possible. However, bringing rapidly growing data from different sources and formats into RDF form is still an open issue. Addressing it requires improvements that support the exchange of information among applications while keeping the stored data simple. The improvement considered here is a big data representation in which data is first transformed into Extensible Markup Language (XML) and then into RDF triples that are linked in real time, and this transformation needs to become more data-friendly. In this study we develop a process that translates data without any loss of information. This requires managing data and metadata so that they do not increase complexity and remain strongly linked to each other. The metadata is kept general so that it stays useful across data sources rather than being dedicated to one specific type. The paper presents a model that explains how the process works, together with the corresponding algorithms that show how it is implemented. A case study shows the transformation of textual relational database data into RDF, and the results are discussed at the end.
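The two-step idea in the abstract (relational data to XML, then XML to RDF triples) can be illustrated with a minimal sketch. This is not the authors' algorithm, which is not reproduced in the abstract; it only shows the general shape of such a pipeline under stated assumptions. The namespace `http://example.org/`, the table name `student`, the sample row, and all function names are hypothetical. Output is written in N-Triples syntax, and every column value is treated as a plain string literal.

```python
import xml.etree.ElementTree as ET

BASE = "http://example.org/"  # hypothetical namespace, not taken from the paper


def row_to_xml(table, key, row):
    """Wrap one relational row in an XML element; non-key columns become children."""
    record = ET.Element(table, attrib={"id": str(row[key])})
    for column, value in row.items():
        if column == key:
            continue
        child = ET.SubElement(record, column)
        child.text = str(value)
    return record


def xml_to_triples(record):
    """Turn each child element into one (subject, predicate, object) triple."""
    subject = f"<{BASE}{record.tag}/{record.get('id')}>"
    triples = []
    for child in record:
        predicate = f"<{BASE}{record.tag}#{child.tag}>"
        # Escape backslashes and quotes so the literal stays valid; other
        # control characters (such as newlines) are not handled in this sketch.
        literal = child.text.replace("\\", "\\\\").replace('"', '\\"')
        triples.append((subject, predicate, f'"{literal}"'))
    return triples


def to_ntriples(triples):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical sample row standing in for a relational database record.
row = {"id": 7, "name": "Ada", "city": "Lahore"}
record = row_to_xml("student", "id", row)
triples = xml_to_triples(record)
print(to_ntriples(triples))
```

In this sketch the primary key becomes part of the subject IRI and each remaining column becomes one predicate, so every value in the row maps to exactly one triple and nothing is dropped along the way. Typed literals, foreign keys, and shared metadata are left out for brevity.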

Author information

Corresponding author

Correspondence to Muhammad Farhan.

Cite this article

Malik, K.R., Ahmad, T., Farhan, M. et al. Big-data: transformation from heterogeneous data to semantically-enriched simplified data. Multimed Tools Appl 75, 12727–12747 (2016). https://doi.org/10.1007/s11042-015-2918-5

  • DOI: https://doi.org/10.1007/s11042-015-2918-5
