Abstract
We frequently compute a score for each item in a data set, sometimes for its intrinsic value, but more often as a step towards classification, ranking, and so forth. Computing this score fairly is essential. In this tutorial, we develop a framework for thinking about this task, then present techniques for responsible scoring and link them to traditional data management challenges.