2023 | OriginalPaper | Chapter

A Review of Integration of Data Warehousing and WWW in the Last Decade

Authors: Priyanka Bhutani, Anju Saha, Anjana Gosain

Published in: Proceedings of Third International Conference on Computing, Communications, and Cyber-Security

Publisher: Springer Nature Singapore

Abstract

The data warehouse (DW) is a powerful technology for storing and analysing huge volumes of historical data in support of business intelligence. The World Wide Web, or simply the Web, has revolutionized the way information is authored, shared, searched and accessed. In the past few decades, a significant amount of research has been done in both the DW and Web domains. The integration of data warehousing and the World Wide Web has led to a variety of new opportunities as well as challenges for researchers and industry. The main motivation for this systematic review of research works integrating DW and the Web in the last decade is to provide the groundwork for research advancement in this field. A total of 27 relevant research works were identified. An in-depth analysis was performed to find the problems addressed, the most relevant research categories, the tools or techniques applied and the application domains of these research works. Our results yielded seven categories and four sub-categories of research employing the integration of DW and the Web. We also found some open research issues: future research should focus on generalized solutions for handling semantic heterogeneity, change propagation and quality analysis of identified Web sources for the DW.

Metadata
Copyright Year
2023
DOI
https://doi.org/10.1007/978-981-19-1142-2_58