On the Within-Group Unfairness of Screening Classifiers

Nastaran Okati, Stratis Tsirtsis, Manuel Gomez Rodriguez
Abstract: Screening classifiers are increasingly used to identify qualified candidates in a variety of selection processes. In this context, it has been recently shown that if a classifier is calibrated, one can identify the smallest set of candidates which contains, in expectation, a desired number of qualified candidates using a threshold decision rule. This lends support to focusing on calibration as the only requirement for screening classifiers. In this paper, we argue that screening policies that use calibrated classifiers may suffer from an understudied type of within-group unfairness---they may unfairly treat qualified members within demographic groups of interest. Further, we argue that this type of unfairness can be avoided if classifiers satisfy within-group monotonicity, a natural monotonicity property within each of the groups. Then, we introduce an efficient post-processing algorithm based on dynamic programming to minimally modify a given calibrated classifier so that its probability estimates satisfy within-group monotonicity. We validate our algorithm using US Census survey data and show that within-group monotonicity can often be achieved at a small cost in terms of prediction granularity and shortlist size.
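To make the threshold decision rule referenced above concrete, here is a minimal sketch (not the authors' code; the scores and the target k are invented): given calibrated probability estimates, it returns the smallest shortlist whose expected number of qualified candidates reaches k.

```python
# Minimal sketch of a threshold decision rule over calibrated scores.
import numpy as np

def shortlist(scores: np.ndarray, k: float) -> np.ndarray:
    """Return indices of the smallest shortlist whose scores sum to >= k."""
    order = np.argsort(scores)[::-1]          # highest estimated quality first
    cum = np.cumsum(scores[order])            # expected #qualified so far
    size = int(np.searchsorted(cum, k) + 1)   # first prefix reaching k
    return order[:size]

scores = np.array([0.9, 0.8, 0.35, 0.3, 0.1])  # toy calibrated estimates
print(shortlist(scores, k=1.5))                 # -> [0 1] (0.9 + 0.8 >= 1.5)
```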

Long-term Fair Decision Making Through Deep Generative Models

Yaowei Hu, Yongkai Wu, Lu Zhang
Abstract: This paper studies long-term fair machine learning which aims to ensure that decision models will impose fair influences on different groups of people in the long run. To define long-term fairness, we leverage the causal time series graph and use the 1-Wasserstein distance between different interventional distributions of features at a sufficiently large time step as the fairness metric. Then, we propose a three-phase deep generative framework where the decision model is trained on high-fidelity generated time series data. We formulate the optimization problem as a performative risk minimization and adopt the repeated gradient descent algorithm for learning. The empirical evaluation shows the efficacy of the proposed method using both synthetic and semi-synthetic time series datasets.
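For readers unfamiliar with the fairness metric, the following sketch computes the 1-Wasserstein distance between the (one-dimensional, simulated) feature distributions of two groups; the group names and distributions are toy assumptions, not the paper's generative model or data.

```python
# Sketch of the 1-Wasserstein distance used as a long-term fairness metric.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
features_group_a = rng.normal(loc=0.0, scale=1.0, size=5000)  # simulated features, group a
features_group_b = rng.normal(loc=0.3, scale=1.0, size=5000)  # simulated features, group b

unfairness = wasserstein_distance(features_group_a, features_group_b)
print(f"long-term unfairness (1-Wasserstein): {unfairness:.3f}")  # ~0.3 for these means
```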

Alleviating Filter Bubbles and Polarization in News Recommendation via Dynamic Calibration

Han Zhang, Ziwei Zhu, James Caverlee
Abstract: Recent work on news recommendation systems has demonstrated that recommendation algorithms can over-expose users to articles that support pre-existing opinions. Such a filter bubble problem can intensify over time if users and the recommender form a closed feedback loop, eventually resulting in severe political polarization. While empirical work has uncovered this problem in a dynamic recommendation process, how to effectively break this cycle remains elusive. Hence, in this work, we propose a Dynamic Calibration method for news recommendation, which calibrates the recommendations from the perspectives of both rankings and predicted scores. Extensive experiments demonstrate the strong performance of the proposed Dynamic Calibration algorithm and also illustrate the effectiveness of the two modules in the proposed method.
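The abstract does not spell out the calibration mechanism, so the sketch below shows only a generic calibration heuristic, greedy re-ranking toward a target category distribution, to make the idea of calibrating rankings concrete; it is not the proposed Dynamic Calibration algorithm, and all scores, categories, and the target mix are invented.

```python
# Generic greedy calibrated re-ranking sketch (illustrative only).
import numpy as np

def calibrated_rerank(scores, categories, target, k, lam=0.5):
    """Greedily pick k items, trading off score against closeness to `target` mix."""
    chosen, counts = [], {c: 0 for c in target}
    for _ in range(k):
        best, best_val = None, -np.inf
        for i, (s, c) in enumerate(zip(scores, categories)):
            if i in chosen:
                continue
            counts[c] += 1                                   # tentatively add item i
            mix = np.array([counts[x] for x in target], float)
            miscal = np.abs(mix / mix.sum() - np.array([target[x] for x in target])).sum()
            counts[c] -= 1                                   # undo the tentative add
            val = (1 - lam) * s - lam * miscal               # relevance minus miscalibration
            if val > best_val:
                best, best_val = i, val
        chosen.append(best)
        counts[categories[best]] += 1
    return chosen

scores = [0.9, 0.85, 0.8, 0.4, 0.35]                 # toy predicted scores
cats = ["left", "left", "left", "right", "right"]    # toy article leanings
print(calibrated_rerank(scores, cats, {"left": 0.5, "right": 0.5}, k=4))  # e.g. [0, 3, 1, 4]
```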

Adaptation Speed of Causal Models Concerning Fairness

Yujie Lin, Chen Zhao, Minglai Shao, Xujiang Zhao, Haifeng Chen
Abstract: In machine learning tasks aimed at bidirectional training, it is common practice to also employ the source corpus as the target corpus, which requires training two models with opposing directions. The question of which model demonstrates superior adaptability to domain shifts holds substantial significance across various disciplines. Specifically, we examine the case wherein an original distribution p undergoes transformations resulting from an unknown intervention, leading to the emergence of a modified distribution p*. Multiple factors, such as causal dependencies among variables within p, influence the rate of adaptation when aligning p with p*. Nevertheless, real-life scenarios necessitate the consideration of fairness during the training process, particularly when incorporating a sensitive variable (bias) situated between a cause and an effect variable. To investigate this scenario, we scrutinize a simplified structural causal model (SCM) featuring a cause-bias-effect structure, wherein variable A functions as a sensitive intermediary between the cause and the effect. The two models have cause-effect directions that are, respectively, consistent with and opposite to the cause-bias-effect SCM. By subjecting variables within the SCM to unknown interventions, we can simulate various domain shifts to facilitate analysis. Consequently, we compare the adaptation speeds of the two models across four shift scenarios and establish the connection between their adaptation speeds across all interventions.
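To illustrate the setup, the following toy sketch samples from a linear-Gaussian cause-bias-effect SCM and models a domain shift as an intervention on the cause's mechanism; the coefficients are arbitrary assumptions and the paper's adaptation-speed comparison is not reproduced.

```python
# Toy linear-Gaussian cause-bias-effect SCM (C -> A -> E, plus C -> E) with a shift on C.
import numpy as np

def sample_scm(n, cause_mean, rng):
    c = rng.normal(cause_mean, 1.0, size=n)                 # cause C
    a = 0.8 * c + rng.normal(0.0, 1.0, size=n)              # sensitive bias A
    e = 1.2 * a + 0.3 * c + rng.normal(0.0, 1.0, size=n)    # effect E
    return np.column_stack([c, a, e])

rng = np.random.default_rng(0)
source = sample_scm(10_000, cause_mean=0.0, rng=rng)   # original distribution p
shifted = sample_scm(10_000, cause_mean=1.5, rng=rng)  # p* after intervening on C
print("mean of E before/after shift:",
      source[:, 2].mean().round(2), shifted[:, 2].mean().round(2))
```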

Fair Multiclass Classification for a Black-Box Classifier

Grigori Jasnovidov, Elizaveta Tarasova
Abstract: Algorithmic fairness is a widely studied area of machine learning. The tasks of fair regression and fair binary classification are by now quite well explored. However, only a few works consider the problem of fair multi-class classification, despite its potential usefulness in areas like credit scoring, school and university admission, criminal justice, etc. Indeed, in all these settings, the predicted label may take more than two values: credit liability may be estimated as 'low', 'medium', or 'high'; the risk of recidivism may also have several levels; the future performance of a student can be evaluated as a non-binary variable. In this paper, we present a post-processing algorithm that increases fairness in multi-class classification problems. The core of our approach is a linear programming problem that allows our algorithm to relabel some predictions of the initial classifier in order to improve fairness at a small possible loss in accuracy. We evaluate the performance of our algorithm on synthetic and real datasets. The results show that, depending on the dataset, our algorithm increases fairness without a statistically significant loss in accuracy.
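As a rough illustration of the kind of linear program such a post-processing step can solve (not the paper's exact formulation; the classifier probabilities, groups, and tolerance eps are invented), the sketch below fractionally relabels predictions to maximize agreement with the original classifier while keeping per-class selection rates close across two groups, then rounds by argmax.

```python
# Generic demographic-parity relabeling LP for a multiclass classifier.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, n_classes = 8, 3
probs = rng.dirichlet(np.ones(n_classes), size=n)   # toy classifier probabilities
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])          # toy sensitive attribute
eps = 0.05                                          # allowed per-class rate gap

c = -probs.ravel()                                  # maximize expected agreement
A_eq = np.zeros((n, n * n_classes))                 # each sample gets one label
for i in range(n):
    A_eq[i, i * n_classes:(i + 1) * n_classes] = 1.0
b_eq = np.ones(n)

A_ub, b_ub = [], []
for k in range(n_classes):                          # |rate_group0 - rate_group1| <= eps
    row = np.zeros(n * n_classes)
    for i in range(n):
        row[i * n_classes + k] = (1.0 if group[i] == 0 else -1.0) / (group == group[i]).sum()
    A_ub += [row, -row]
    b_ub += [eps, eps]

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              A_eq=A_eq, b_eq=b_eq, bounds=(0.0, 1.0))
fair_labels = res.x.reshape(n, n_classes).argmax(axis=1)  # round fractional relabeling
print("relabelled predictions:", fair_labels)
```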

Are Your Reviewers Being Treated Equally? Discovering Subgroup Structures to Improve Fairness in Spam Detection

Jiaxin Liu, Yuefei Lyu, Xi Zhang, Sihong Xie
Abstract: User-generated product reviews are essential for online platforms like Amazon and Yelp. However, the presence of fake reviews misleads customers. Graph neural networks (GNNs) are the state-of-the-art method for detecting suspicious reviewers, exploiting the topology of the graph connecting reviewers, reviews, and products. Nevertheless, the discrepancy in detection accuracy across different groups of reviewers degrades reviewer engagement and customer trust in review websites. Unlike prior work, which attributes unfairness to differences between groups, we study subgroup structures within groups that can also cause discrepancies in how different groups are treated. This paper addresses the challenges of defining, approximating, and utilizing a new subgroup structure for fair spam detection. We first identify subgroup structures in the review graph that lead to discrepant accuracy across groups. The complex dependencies over the review graph make it difficult to tease out subgroups hidden within larger groups. We design a model that can be trained to jointly infer the hidden subgroup memberships and exploit them to calibrate the detection accuracy across groups. Comprehensive comparisons against baselines on three large Yelp review datasets demonstrate that the subgroup membership can be identified and exploited for group fairness.
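The following toy simulation (all numbers invented) illustrates the phenomenon the paper targets: an accuracy gap between observed groups that is actually driven by a hidden subgroup whose prevalence differs across groups, which is why exploiting subgroup membership can help calibrate accuracy across groups.

```python
# Toy illustration of a group accuracy gap driven by hidden subgroup composition.
import numpy as np

rng = np.random.default_rng(0)
n = 4000
group = rng.integers(0, 2, size=n)                 # observed reviewer group
p_sub0 = np.where(group == 0, 0.8, 0.3)            # subgroup-0 prevalence differs by group
subgroup = (rng.random(n) >= p_sub0).astype(int)   # hidden subgroup
y_true = rng.integers(0, 2, size=n)                # spam / not spam labels

# Simulated detector whose accuracy depends only on the hidden subgroup.
acc = np.where(subgroup == 0, 0.95, 0.70)
y_pred = np.where(rng.random(n) < acc, y_true, 1 - y_true)

for g in (0, 1):
    print(f"group {g} accuracy: {(y_pred == y_true)[group == g].mean():.3f}")
for s in (0, 1):
    print(f"subgroup {s} accuracy: {(y_pred == y_true)[subgroup == s].mean():.3f}")
```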

Neural-Informed Decision Trees: A Fair, Interpretable and Expressive Tree Model

Georgia Perakis, Asterios Tsiourvas
Abstract: We study the problem of creating a highly expressive, interpretable, and simultaneously fair machine learning model. We propose neural-informed decision trees (NIDTs), a fair model that combines the predictive power of neural networks with the inherent interpretability of decision trees. NIDTs perform axis-aligned splits on the features of the dataset to create an interpretable decision path, and at each leaf, use a linear predictor that uses both the features as well as the embeddings coming from a task-specific neural network to capture non-linearities in the data. To generate NIDTs we propose a decomposition training scheme. The proposed training method enables the direct integration of fairness constraints by solving a constrained convex optimization problem at each leaf, resulting in a certified fair model. We evaluate NIDTs on 15 publicly available datasets, where we show that NIDTs outperform multiple interpretable tree-based models, as well as the neural network that informs them. We also show the interpretable aspects of the method by extracting a drug-dosage prescription policy using a real-world dataset. Finally, we demonstrate the fairness of NIDTs on a real-world dataset by directly incorporating fairness constraints into the model, resulting in a certified fair model that eliminates gender bias in prediction.
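As a hedged structural sketch of the model described above (not the authors' decomposition training scheme, and without the fairness constraint), the code below routes samples with a shallow axis-aligned tree and fits a linear predictor at each leaf on the raw features concatenated with stand-in neural embeddings.

```python
# Structural sketch: shallow tree routing + per-leaf linear model on [features, embeddings].
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
emb = np.random.default_rng(0).normal(size=(len(X), 4))   # stand-in NN embeddings

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
leaf_id = tree.apply(X)                                   # interpretable axis-aligned routing

leaf_models = {}
for leaf in np.unique(leaf_id):
    mask = leaf_id == leaf
    Z = np.hstack([X[mask], emb[mask]])                   # features + embeddings at the leaf
    if len(np.unique(y[mask])) > 1:                       # skip pure leaves
        leaf_models[leaf] = LogisticRegression(max_iter=1000).fit(Z, y[mask])

print(f"fitted {len(leaf_models)} leaf-level linear predictors")
```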

Fair Learning to Rank with Distribution-free Risk Control

Ruocheng Guo, Jean-Francois Ton, Yang Liu
Abstract: Learning to Rank (LTR) methods are vital in online economies, affecting both users and item providers. Fairness in LTR models is crucial to allocate exposure proportionally to item relevance. Deterministic ranking models can lead to unfair exposure distributions when items with the same relevance receive slightly different scores. Stochastic LTR models, incorporating the Plackett-Luce (PL) model, address fairness issues but have limitations in computational cost and performance guarantees. To overcome these limitations, we propose FairLTR-RC, a novel post-hoc, model-agnostic method. FairLTR-RC leverages a pretrained scoring function to create a stochastic LTR model, eliminating the need for expensive training. Furthermore, FairLTR-RC provides finite-sample guarantees on a user-specified utility using the distribution-free risk control framework. By additionally incorporating the Thresholded PL (TPL) model, we achieve an effective trade-off between utility and fairness. Experimental results on several benchmark datasets demonstrate that FairLTR-RC significantly improves fairness in widely-used deterministic LTR models while guaranteeing a specified level of utility.
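The sketch below shows only the Plackett-Luce sampling step that turns a pretrained scoring function into a stochastic ranker, via the standard Gumbel perturb-and-sort trick; the risk-control calibration and the Thresholded PL variant are not included, and the scores are invented.

```python
# Plackett-Luce ranking sampling via Gumbel perturb-and-sort.
import numpy as np

def sample_pl_ranking(scores: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Sample one ranking from the Plackett-Luce model with logits `scores`."""
    gumbel = rng.gumbel(size=scores.shape)
    return np.argsort(-(scores + gumbel))       # perturb, then sort descending

rng = np.random.default_rng(0)
scores = np.array([2.0, 1.99, 0.5])             # near-tied items share top exposure
rankings = np.stack([sample_pl_ranking(scores, rng) for _ in range(10_000)])
top1_share = (rankings[:, 0][:, None] == np.arange(3)).mean(axis=0)
print("share of samples ranked first per item:", top1_share.round(3))
```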

Learning Fair and Domain Generalization Representation

Dong Li, Chen Zhao, Minglai Shao, Xujiang Zhao
Abstract: Many papers focus on generalization to out-of-distribution (OOD) data with multi-source domains or on learning fair representations, but little work combines the two. To this end, we propose a new approach, Learning Fair and Domain Generalization Representation (LFDGR), which aims to simultaneously learn a representation that is invariant to both domain information and sensitive attributes. We first learn the semantic information and the domain information with a semantic encoder and a content encoder, respectively, obtaining a domain-invariant representation by disentangling the domain information. Then we use a Gaussian mixture model (GMM) with a statistical parity constraint to learn a prototype in the latent space, achieving both group fairness and individual fairness. Finally, we train a dedicated classifier to verify the performance of the learned representation on fairness and domain generalization. Experiments on tabular and image datasets verify that our method exhibits excellent performance on both fairness and domain generalization.
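For concreteness, here is a hedged architectural sketch in PyTorch of the two-encoder disentanglement described above; the class name LFDGRSketch and all layer sizes are assumptions, and the GMM prototype with the statistical parity constraint is omitted.

```python
# Two-encoder disentanglement sketch: the classifier only sees the semantic code.
import torch
import torch.nn as nn

class LFDGRSketch(nn.Module):                       # hypothetical name
    def __init__(self, in_dim=20, code_dim=8, n_classes=2):
        super().__init__()
        self.semantic_enc = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(),
                                          nn.Linear(32, code_dim))
        self.domain_enc = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(),
                                        nn.Linear(32, code_dim))
        self.classifier = nn.Linear(code_dim, n_classes)

    def forward(self, x):
        z_sem = self.semantic_enc(x)                # intended to be domain-invariant
        z_dom = self.domain_enc(x)                  # absorbs domain information
        return self.classifier(z_sem), z_sem, z_dom

model = LFDGRSketch()
logits, z_sem, z_dom = model(torch.randn(4, 20))
print(logits.shape, z_sem.shape, z_dom.shape)       # torch.Size([4, 2]) ...
```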

Fair Few-shot Learning with Auxiliary Sets

Song Wang, Jing Ma, Lu Cheng, Jundong Li
Abstract: Recently, there has been growing interest in developing machine learning (ML) models that promote fairness, i.e., eliminate biased predictions towards certain populations (e.g., individuals from a specific demographic group). Most existing works learn such models based on well-designed fairness constraints in optimization. Nevertheless, in many practical ML tasks, only very few labeled data samples can be collected, which can lead to inferior fairness performance. This is because existing fairness constraints are designed to restrict the prediction disparity among different sensitive groups, but with few samples it becomes difficult to accurately measure the disparity, thus rendering fairness optimization ineffective. In this paper, we define the fairness-aware learning task with limited training samples as the fair few-shot learning problem. To deal with this problem, we devise a novel framework that accumulates fairness-aware knowledge across different meta-training tasks and then generalizes the learned knowledge to meta-test tasks via fairness adaptation. To compensate for insufficient training samples, we propose a strategy to select and leverage an auxiliary set for each meta-test task. These auxiliary sets contain several labeled training samples that enhance fairness adaptation in meta-test tasks, thereby allowing the learned fairness-oriented knowledge to transfer. Finally, we conduct extensive experiments on three real-world datasets to validate the superiority of our framework against state-of-the-art baselines.
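The sketch below illustrates one plausible form of fairness adaptation on a meta-test task: the few support examples are pooled with an auxiliary set and a demographic-parity gap penalty is added to the loss. The auxiliary-set selection strategy, the loss weighting lam, and the linear model are placeholder assumptions, not the paper's framework.

```python
# Sketch of fairness-aware few-shot adaptation with an auxiliary set.
import torch
import torch.nn as nn

def adapt(model, x_support, y_support, s_support, x_aux, y_aux, s_aux,
          lam=1.0, steps=20, lr=0.05):
    """Adapt on support + auxiliary samples with a demographic-parity regularizer."""
    x = torch.cat([x_support, x_aux]); y = torch.cat([y_support, y_aux])
    s = torch.cat([s_support, s_aux])
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        p = torch.sigmoid(model(x)).squeeze(-1)
        task_loss = nn.functional.binary_cross_entropy(p, y.float())
        dp_gap = (p[s == 0].mean() - p[s == 1].mean()).abs()   # demographic-parity gap
        (task_loss + lam * dp_gap).backward()
        opt.step()
    return model

# Toy meta-test task: 4 support samples plus a 16-sample auxiliary set.
torch.manual_seed(0)
model = nn.Linear(5, 1)
x_sup, y_sup, s_sup = torch.randn(4, 5), torch.randint(0, 2, (4,)), torch.tensor([0, 0, 1, 1])
x_aux, y_aux, s_aux = torch.randn(16, 5), torch.randint(0, 2, (16,)), torch.randint(0, 2, (16,))
adapt(model, x_sup, y_sup, s_sup, x_aux, y_aux, s_aux)
```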