Abstract
Perspectives on the role and responsibility of the data-management research community in designing, developing, using, and overseeing automated decision systems.
Index Terms
- Responsible data management