Skip to main content
Erschienen in: Social Network Analysis and Mining 1/2023

01.12.2023 | Original Article

Fulmqa: a fuzzy logic-based model for social media data quality assessment

verfasst von: Oumaima Reda, Ahmed Zellou

Erschienen in: Social Network Analysis and Mining | Ausgabe 1/2023

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Nowadays, with the advancement of information technology and the growing importance of social media, social media platforms such as Twitter and Facebook have become deeply entrenched in our lives and are growing rapidly around the world. On these platforms, users have the freedom to share and publish whatever they want without control, leading to faster generation and dissemination of information, as a result, rumors and false information can then be spread and seen by a larger number of people. Consequently, assessing the quality of these data proves to be of major importance. In this paper, we aim to present a new and efficient quality assessment model including the quality metrics needed for an accurate assessment. Our work presents an extension of the SMDQM (Social Media Data Quality Model) (Reda and Zellou, in: International conference on innovative research in applied science, engineering and technology, IEEE, 2022), which is used to assess the quality of data provided by social media platforms. We suggest using fuzzy logic to describe data quality metrics in order to overcome imprecision and subjectivity. Next, we perform extensive experiments to evaluate the performance of our model using a real-world implementation by performing an evaluation on two separate Twitter data sets. The findings indicate that the model can successfully evaluate all tweets with high performance.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Abbasi MA, Liu H (2013) Measuring user credibility in social media. In: International conference on social computing, behavioral-cultural modeling, and prediction. Springer, Berlin, Heidelberg, pp 441–448 Abbasi MA, Liu H (2013) Measuring user credibility in social media. In: International conference on social computing, behavioral-cultural modeling, and prediction. Springer, Berlin, Heidelberg, pp 441–448
Zurück zum Zitat Ajarroud O, Zellou A, Idri A (2018) A new filtering-based query processing: improving semantic caching efficiency in mediation systems. In: Proceedings of the 12th international conference on intelligent systems: theories and applications, SITA’18, October 2018, Article no. 12, ACM international conference proceeding series. Rabat, Morocco, pp 1–6 Ajarroud O, Zellou A, Idri A (2018) A new filtering-based query processing: improving semantic caching efficiency in mediation systems. In: Proceedings of the 12th international conference on intelligent systems: theories and applications, SITA’18, October 2018, Article no. 12, ACM international conference proceeding series. Rabat, Morocco, pp 1–6
Zurück zum Zitat Alizamini FG, Pedram MM, Alishahi M, Badie K (2010) Data quality improvement using fuzzy association rules. In: 2010 International conference on electronics and information engineering, vol 1. IEEE, pp V1–468 Alizamini FG, Pedram MM, Alishahi M, Badie K (2010) Data quality improvement using fuzzy association rules. In: 2010 International conference on electronics and information engineering, vol 1. IEEE, pp V1–468
Zurück zum Zitat Alrubaian M, Al-Qurishi M, Al-Rakhami M, Hassan MM, Alamri A (2017) Reputation-based credibility analysis of twitter social network users. Concurr Comput Pract Exp 29(7):e3873CrossRef Alrubaian M, Al-Qurishi M, Al-Rakhami M, Hassan MM, Alamri A (2017) Reputation-based credibility analysis of twitter social network users. Concurr Comput Pract Exp 29(7):e3873CrossRef
Zurück zum Zitat Alrubaian M, Al-Qurishi M, Alamri A, Al-Rakhami M, Hassan MM, Fortino G (2018) Credibility in online social networks: a survey. IEEE Access 7:2828–2855CrossRef Alrubaian M, Al-Qurishi M, Alamri A, Al-Rakhami M, Hassan MM, Fortino G (2018) Credibility in online social networks: a survey. IEEE Access 7:2828–2855CrossRef
Zurück zum Zitat Ardagna D, Cappiello C, Samá W, Vitali M (2018) Context-aware data quality assessment for big data. Futur Gener. Comput. Syst. 89:548–562CrossRef Ardagna D, Cappiello C, Samá W, Vitali M (2018) Context-aware data quality assessment for big data. Futur Gener. Comput. Syst. 89:548–562CrossRef
Zurück zum Zitat Berti-Équille L (1999) Qualité des données multi-sources et recommandation multi-critère. In: Actes du congrès francophone INFormatique des ORganisations et systèmes d’INformation décisionnels (INFORSID’99), pp 185–204 Berti-Équille L (1999) Qualité des données multi-sources et recommandation multi-critère. In: Actes du congrès francophone INFormatique des ORganisations et systèmes d’INformation décisionnels (INFORSID’99), pp 185–204
Zurück zum Zitat Bird S (2006) NLTK: the natural language toolkit. In: Proceedings of the COLING/ACL 2006 interactive presentation sessions, pp 69–72 Bird S (2006) NLTK: the natural language toolkit. In: Proceedings of the COLING/ACL 2006 interactive presentation sessions, pp 69–72
Zurück zum Zitat Caballero I, Verbo E, Serrano M, Calero C, Piattini M (2009) Tailoring data quality models using social network preferences. In: International conference on database systems for advanced applications, Springer, Berlin, Heidelberg, pp 152–166 Caballero I, Verbo E, Serrano M, Calero C, Piattini M (2009) Tailoring data quality models using social network preferences. In: International conference on database systems for advanced applications, Springer, Berlin, Heidelberg, pp 152–166
Zurück zum Zitat Cai L, Zhu Y (2015) The challenges of data quality and data quality assessment in the big data era. Data Sci J 14:2CrossRef Cai L, Zhu Y (2015) The challenges of data quality and data quality assessment in the big data era. Data Sci J 14:2CrossRef
Zurück zum Zitat Crosby PB (1979) Quality is free. McGraw-Hill, New York, p 309 Crosby PB (1979) Quality is free. McGraw-Hill, New York, p 309
Zurück zum Zitat Deming WE (1982) Quality, productivity and competitive position. Massachusetts Institute of Technology Center for Advanced Engineering Study, Cambridge, MA, USA Deming WE (1982) Quality, productivity and competitive position. Massachusetts Institute of Technology Center for Advanced Engineering Study, Cambridge, MA, USA
Zurück zum Zitat Earley J (1970) An efficient context-free parsing algorithm. Commun ACM 13(2):94–102CrossRef Earley J (1970) An efficient context-free parsing algorithm. Commun ACM 13(2):94–102CrossRef
Zurück zum Zitat Ehrlinger L, Wöß W (2018) A novel data quality metric for minimality. In: International workshop on data quality and trust in big data, Springer, Cham, pp 1–15 Ehrlinger L, Wöß W (2018) A novel data quality metric for minimality. In: International workshop on data quality and trust in big data, Springer, Cham, pp 1–15
Zurück zum Zitat El Alaoui I, Gahi Y, Messoussi R (2019) Big data quality metrics for sentiment analysis approaches. In: Proceedings of the 2019 international conference on big data engineering, pp 36–43 El Alaoui I, Gahi Y, Messoussi R (2019) Big data quality metrics for sentiment analysis approaches. In: Proceedings of the 2019 international conference on big data engineering, pp 36–43
Zurück zum Zitat Elmasri R, Navathe SB (2000) Fundamentals of database systems, 3rd edn. Addison-Wesley, Reading, MA Elmasri R, Navathe SB (2000) Fundamentals of database systems, 3rd edn. Addison-Wesley, Reading, MA
Zurück zum Zitat Even A, Shankaranarayanan G (2009) Dual assessment of data quality in customer databases. J Data Inf Qual (JDIQ) 1(3):1–29CrossRef Even A, Shankaranarayanan G (2009) Dual assessment of data quality in customer databases. J Data Inf Qual (JDIQ) 1(3):1–29CrossRef
Zurück zum Zitat Fagroud FZ, Ajallouda L, Lahmar EHB, Zellou A, El Filali S (2021) A brief survey on internet of things (IoT). In: 1st International conference on digital technologies and applications. Lecture notes in networks and systems, 211 LNNS, ICDTA, pp 335–344 Fagroud FZ, Ajallouda L, Lahmar EHB, Zellou A, El Filali S (2021) A brief survey on internet of things (IoT). In: 1st International conference on digital technologies and applications. Lecture notes in networks and systems, 211 LNNS, ICDTA, pp 335–344
Zurück zum Zitat Gabr MI, Yehia MH, Doaa SE (2021) Data quality dimensions, metrics, and improvement techniques. Futur Comput Inform J 6(1):3 Gabr MI, Yehia MH, Doaa SE (2021) Data quality dimensions, metrics, and improvement techniques. Futur Comput Inform J 6(1):3
Zurück zum Zitat Gupta P, Pathak V, Goyal N, Singh J, Varshney V, Kumar S (2019) Content credibility check on twitter. Commun Comput Inf Sci 899:197–212 Gupta P, Pathak V, Goyal N, Singh J, Varshney V, Kumar S (2019) Content credibility check on twitter. Commun Comput Inf Sci 899:197–212
Zurück zum Zitat Hassenstein MJ, Vanella P (2022) Data quality-concepts and problems. Encyclopedia 2022(2):498–510CrossRef Hassenstein MJ, Vanella P (2022) Data quality-concepts and problems. Encyclopedia 2022(2):498–510CrossRef
Zurück zum Zitat Hitzler P, Zaveri A, Rula A, Maurino A, Pietrobon R, Lehmann J, Auer S (2016) Quality assessment for linked data: a survey a systematic literature review and conceptual framework. Semant Web 1:1–5CrossRef Hitzler P, Zaveri A, Rula A, Maurino A, Pietrobon R, Lehmann J, Auer S (2016) Quality assessment for linked data: a survey a systematic literature review and conceptual framework. Semant Web 1:1–5CrossRef
Zurück zum Zitat Hutto C, Gilbert E (2014) Vader: a parsimonious rule-based model for sentiment analysis of social media text. In: Proceedings of the international AAAI conference on web and social media, vol. 8, pp 216–225 Hutto C, Gilbert E (2014) Vader: a parsimonious rule-based model for sentiment analysis of social media text. In: Proceedings of the international AAAI conference on web and social media, vol. 8, pp 216–225
Zurück zum Zitat International Organization for Standardization ISO/IEC 25012:2008(E) (2008) Software engineering-software product quality requirementsand evaluation (SQuaRE)-data quality model. International Organization for Standardization, Geneva, Switzerland International Organization for Standardization ISO/IEC 25012:2008(E) (2008) Software engineering-software product quality requirementsand evaluation (SQuaRE)-data quality model. International Organization for Standardization, Geneva, Switzerland
Zurück zum Zitat International Organization for Standardization–ISO (1994) Quality management and quality assurance: vocabulary ISO 8402:1994 International Organization for Standardization–ISO (1994) Quality management and quality assurance: vocabulary ISO 8402:1994
Zurück zum Zitat International Standards Organization (ISO) 8402 (1994) Quality management and quality assurance International Standards Organization (ISO) 8402 (1994) Quality management and quality assurance
Zurück zum Zitat Juran JM (2003) Juran on leadership for quality. Simon and Schuster, New York Juran JM (2003) Juran on leadership for quality. Simon and Schuster, New York
Zurück zum Zitat Laranjeiro N, Soydemir SN, Bernardino J (2015) A survey on data quality: classifying poor data. In: 2015 IEEE 21st Pacific rim international symposium on dependable computing (PRDC), IEEE, pp 179–188 Laranjeiro N, Soydemir SN, Bernardino J (2015) A survey on data quality: classifying poor data. In: 2015 IEEE 21st Pacific rim international symposium on dependable computing (PRDC), IEEE, pp 179–188
Zurück zum Zitat Larsen PM (1980) Industrial application of fuzzy logic control. Int J Man Mach Stud 12:3–10CrossRef Larsen PM (1980) Industrial application of fuzzy logic control. Int J Man Mach Stud 12:3–10CrossRef
Zurück zum Zitat Lee YW, Pipino LL, Funk JD, Wang RY (2006) Journey to data quality. MIT Press, Cambridge, MA Lee YW, Pipino LL, Funk JD, Wang RY (2006) Journey to data quality. MIT Press, Cambridge, MA
Zurück zum Zitat Müller H, Naumann F, Freytag JC (2003) Data quality in genome databases. Humboldt University of Berlin, Berlin Müller H, Naumann F, Freytag JC (2003) Data quality in genome databases. Humboldt University of Berlin, Berlin
Zurück zum Zitat Nikiforova A (2020) Definition and evaluation of data quality: user-oriented data object-driven approach to data quality assessment. Balt J Modern Comput 8(3):391–432 Nikiforova A (2020) Definition and evaluation of data quality: user-oriented data object-driven approach to data quality assessment. Balt J Modern Comput 8(3):391–432
Zurück zum Zitat Olson JE (2003) Data quality: the accuracy dimension. Elsevier, Amsterdam Olson JE (2003) Data quality: the accuracy dimension. Elsevier, Amsterdam
Zurück zum Zitat Radulovic F, Mihindukulasooriya N, García-Castro R, Gómez-Pérez A (2018) A comprehensive quality model for linked data. Semant Web 9(1):3–24CrossRef Radulovic F, Mihindukulasooriya N, García-Castro R, Gómez-Pérez A (2018) A comprehensive quality model for linked data. Semant Web 9(1):3–24CrossRef
Zurück zum Zitat Reda O, Zellou A (2023) Assessing the quality of social media data: a systematic literature review. Bull Electr Eng Inform 12(2):1115–1126CrossRef Reda O, Zellou A (2023) Assessing the quality of social media data: a systematic literature review. Bull Electr Eng Inform 12(2):1115–1126CrossRef
Zurück zum Zitat Reda O, Zellou A (2022) SMDQM-social media data quality assessment model. In: 2022 2nd International conference on innovative research in applied science, engineering and technology (IRASET), IEEE, pp 1–7 Reda O, Zellou A (2022) SMDQM-social media data quality assessment model. In: 2022 2nd International conference on innovative research in applied science, engineering and technology (IRASET), IEEE, pp 1–7
Zurück zum Zitat Reuter C, Ludwig T, Ritzkatis M, Pipek V (2015, May) Social-QAS: tailorable quality assessment service for social media content. In: International symposium on end user development, Springer, Cham, pp 156–170 Reuter C, Ludwig T, Ritzkatis M, Pipek V (2015, May) Social-QAS: tailorable quality assessment service for social media content. In: International symposium on end user development, Springer, Cham, pp 156–170
Zurück zum Zitat Ross TJ (2012) Fuzzy logic with engineering applications, 3rd edn. Wiley, New York, p 585 Ross TJ (2012) Fuzzy logic with engineering applications, 3rd edn. Wiley, New York, p 585
Zurück zum Zitat Scannapieco M (2006) Data quality: concepts, methodologies and techniques. Data-centric systems and applications. Springer, Cham Scannapieco M (2006) Data quality: concepts, methodologies and techniques. Data-centric systems and applications. Springer, Cham
Zurück zum Zitat Sint R, Schaffert S, Stroka S, Ferstl R (2009) Combining unstructured, fully structured and semi-structured information in semantic wikis. In: 4th Workshop on semantic wikis-the semantic Wiki web 6 the European semantic web conference Hersonissos, Crete, Greece, June 2009, pp 73 Sint R, Schaffert S, Stroka S, Ferstl R (2009) Combining unstructured, fully structured and semi-structured information in semantic wikis. In: 4th Workshop on semantic wikis-the semantic Wiki web 6 the European semantic web conference Hersonissos, Crete, Greece, June 2009, pp 73
Zurück zum Zitat Tayi GK, Ballou DP (1998) Examining data quality. Commun ACM 41(2):54–57CrossRef Tayi GK, Ballou DP (1998) Examining data quality. Commun ACM 41(2):54–57CrossRef
Zurück zum Zitat Verma PK, Sharma V, Agarwal S (2019) Credibility investigation for tweets and its users. In: Proceedings of the 3rd international conference on computing methodologies and communication, ICCMC 2019, pp 925–928 Verma PK, Sharma V, Agarwal S (2019) Credibility investigation for tweets and its users. In: Proceedings of the 3rd international conference on computing methodologies and communication, ICCMC 2019, pp 925–928
Zurück zum Zitat Wang X, Ruan D, Kerre EE (2009) Mathematics of fuzziness-basic issues. Studies in fuzziness and soft computing, vol 245. Springer, Berlin/Heidelberg, p 220 Wang X, Ruan D, Kerre EE (2009) Mathematics of fuzziness-basic issues. Studies in fuzziness and soft computing, vol 245. Springer, Berlin/Heidelberg, p 220
Zurück zum Zitat Wayne SR (1983) Quality control circle and company wide quality control. Qual Prog 16(10):14–17 Wayne SR (1983) Quality control circle and company wide quality control. Qual Prog 16(10):14–17
Zurück zum Zitat Woodall P, Parlikad AK (2010) A hybrid approach to assessing data quality. In ICIQ Woodall P, Parlikad AK (2010) A hybrid approach to assessing data quality. In ICIQ
Zurück zum Zitat Yang J, Yu M, Qin H, Lu M, Yang C (2019) A twitter data credibility framework-hurricane Harvey as a use case. ISPRS Int J Geo Inf 8(3):111CrossRef Yang J, Yu M, Qin H, Lu M, Yang C (2019) A twitter data credibility framework-hurricane Harvey as a use case. ISPRS Int J Geo Inf 8(3):111CrossRef
Zurück zum Zitat Yousfi A, El Yazidi MH, Zellou A (2018) Assessing the performance of a new semantic similarity measure designed for schema matching for mediation systems. In: Nguyen N, Pimenidis E, Khan Z, Trawinski B (eds) Computational collective intelligence: 10th international conference on computational collective intelligence. ICCCI’18, Bristol, UK, September 5-7, Proceeding, part I, vol 11055. Springer, pp. 64–74. Print_ISBN: 978-3-319-98442-1. Online_ISBN: 978-3-319-98443-8 Yousfi A, El Yazidi MH, Zellou A (2018) Assessing the performance of a new semantic similarity measure designed for schema matching for mediation systems. In: Nguyen N, Pimenidis E, Khan Z, Trawinski B (eds) Computational collective intelligence: 10th international conference on computational collective intelligence. ICCCI’18, Bristol, UK, September 5-7, Proceeding, part I, vol 11055. Springer, pp. 64–74. Print_ISBN: 978-3-319-98442-1. Online_ISBN: 978-3-319-98443-8
Metadaten
Titel
Fulmqa: a fuzzy logic-based model for social media data quality assessment
verfasst von
Oumaima Reda
Ahmed Zellou
Publikationsdatum
01.12.2023
Verlag
Springer Vienna
Erschienen in
Social Network Analysis and Mining / Ausgabe 1/2023
Print ISSN: 1869-5450
Elektronische ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-023-01148-y

Weitere Artikel der Ausgabe 1/2023

Social Network Analysis and Mining 1/2023 Zur Ausgabe

Premium Partner