2024 | OriginalPaper | Book chapter

Enhancing Fairness and Accuracy in Machine Learning Through Similarity Networks

Written by: Samira Maghool, Elena Casiraghi, Paolo Ceravolo

Published in: Cooperative Information Systems

Publisher: Springer Nature Switzerland

Abstract

Machine Learning is a powerful tool for uncovering relationships and patterns within datasets. However, applying it to large datasets can lead to biased outcomes and quality issues, due to confounding variables that are indirectly related to the outcome of interest. Achieving fairness often requires altering the training data, for example by rebalancing imbalanced groups (privileged/unprivileged) or excluding sensitive features, which can reduce accuracy. To address this, we propose a solution inspired by similarity network fusion that preserves the dataset structure by integrating global and local similarities. We evaluate our method in terms of dataset complexity, fairness, and accuracy. Experimental results show the similarity network’s effectiveness in balancing fairness and accuracy. We discuss implications and future directions.
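The idea of integrating global and local similarities can be illustrated with a minimal sketch. It is a simplified, fusion-style cross-diffusion over two feature views, written under stated assumptions. It is not the chapter's actual method. The function names (`global_similarity`, `local_similarity`, `fuse`), the RBF kernel, the parameters `sigma`, `k` and `iters`, and the random toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def global_similarity(X, sigma=1.0):
    """Dense RBF similarity between all pairs of rows, row-normalized (assumed kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    return W / W.sum(axis=1, keepdims=True)

def local_similarity(W, k=3):
    """Keep only each row's k strongest neighbours (self excluded), then renormalize."""
    S = W.copy()
    np.fill_diagonal(S, 0.0)
    drop = np.argsort(S, axis=1)[:, :-k]      # indices of all but the k largest entries
    np.put_along_axis(S, drop, 0.0, axis=1)
    return S / S.sum(axis=1, keepdims=True)

def fuse(views, k=3, sigma=1.0, iters=10):
    """Simplified cross-diffusion: each view's global matrix is updated through
    its own local matrix and the average global matrix of the other views."""
    P = [global_similarity(X, sigma) for X in views]
    S = [local_similarity(p, k) for p in P]
    for _ in range(iters):
        updated = []
        for v in range(len(views)):
            others = [P[u] for u in range(len(views)) if u != v]
            avg = sum(others) / len(others)
            M = S[v] @ avg @ S[v].T
            updated.append(M / M.sum(axis=1, keepdims=True))
        P = updated
    return sum(P) / len(P)                    # fused similarity network

# Toy data: two hypothetical feature views describing the same 8 records.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 4))
view_b = rng.normal(size=(8, 2))
fused = fuse([view_a, view_b], k=3)
print(fused.shape)
```

In this sketch the dense matrix carries the global structure of each view, while the k-nearest-neighbour matrix restricts diffusion to local neighbourhoods. The fused matrix could then serve as input to a downstream classifier whose fairness and accuracy are evaluated separately.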

Title: Enhancing Fairness and Accuracy in Machine Learning Through Similarity Networks
Written by: Samira Maghool, Elena Casiraghi, Paolo Ceravolo
Publisher: Springer Nature Switzerland
Book: Cooperative Information Systems
Print ISBN: 978-3-031-46845-2
Electronic ISBN: 978-3-031-46846-9
Copyright year: 2024
DOI: https://doi.org/10.1007/978-3-031-46846-9_1
