From Feature Engineering Bookcamp by Sinan Ozdemir

This article series covers

● Recognizing and mitigating bias in our data and model
● Quantifying fairness through various metrics
● Applying feature engineering techniques to remove bias from our model without sacrificing model performance

Take 35% off Feature Engineering Bookcamp by entering fccozdemir into the discount code box at checkout at manning.com.
Building a Bias-aware Model
For information on the dataset, the mechanics and importance of model bias and fairness, and building the basic model, check out part 1 and part 2.
Let’s begin constructing a more bias-aware model using two feature engineering techniques. We will start by applying a familiar transformation to construct a new, less-biased column and then move on to our feature extraction method of the book. Our goal is to minimize the bias of our model without sacrificing a great deal of model performance.
Feature Construction – Using the Yeo-Johnson transformer to treat disparate impact
We’re going to do something similar to the Box-Cox transformation to transform some of our features in order to make them appear more normal. To set this up, we have to investigate the reasons our model is underpredicting recidivism for non-African-American people. One approach would be to remove race entirely from our dataset and expect the ML model to remove all bias. Unfortunately, this is rarely the answer.
Unprivileged and privileged groups of people experience different opportunities, and this likely presents itself in the data through correlated features. The most likely cause of our model’s bias is that at least one of our features is highly correlated with race, and our model is able to reconstruct someone’s racial identity through this feature. To find this feature, let’s start by finding the correlation coefficient between our numerical features and being African-American.
compas_df.corrwith(compas_df['race'] == 'African-American').sort_values()

age            -0.179095
juv_count       0.111835
priors_count    0.202897
Both age and priors_count are highly correlated with our boolean label of simply being African-American, so let’s take a closer look at each. Let’s start with age. We can plot a histogram and print out some basic statistics, and we will see that across our four racial categories, age is relatively similar, with a similar mean, standard deviation, and median. This signals that even though age is negatively correlated with being African-American, this relationship is not a huge contributing factor to our model’s bias.
# Age is not very skewed
compas_df.groupby('race')['age'].plot(
    figsize=(20,5), kind='hist', xlabel='Age', title='Histogram of Age'
)
compas_df.groupby('race')['age'].describe()
Figure 1. Distribution of age by group. The table on top implies that the age distribution is not drastically different across groups, implying less impact on disparate treatment and impact. It is worth noting that the average and median age of African-Americans is about 10-15% lower than those of the other categories, which is probably why we see a strong correlation between our age column and our African-American indicator column.
Let’s turn our attention to priors_count and do the same printout. When we do, we will see some stark contrasts with age.
# Priors is extremely skewed, as seen in the differences in mean/median/std across the racial categories
compas_df.groupby('race')['priors_count'].plot(
    figsize=(20,5), kind='hist', xlabel='Count of Priors', title='Histogram of Priors'
)
compas_df.groupby('race')['priors_count'].describe()
Figure 2. At first glance, it may seem like the pattern of priors for all races follows a similar shape: distributions of priors count show a similar right skew across racial groups. However, for reasons out of many people’s control, the median and mean prior counts for African-Americans are nearly twice those of the other groups.
Two things of note:
● African-American priors are hugely right-skewed, as evidenced by the mean being over twice the median
● African-American priors are nearly twice as high as those of the other racial groups, due to a long history of systemic criminal justice issues
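The mean-versus-median comparison above is a quick skew check. As a minimal illustration with synthetic data (not the COMPAS figures), the mean of a right-skewed sample sits well above its median:

```python
import numpy as np

# Synthetic, right-skewed counts (illustration only; not the COMPAS data)
rng = np.random.default_rng(0)
sample = rng.exponential(scale=3.0, size=10_000).round()

# Right skew pulls the mean above the median
print(sample.mean() > np.median(sample))  # True
```

In the COMPAS table, the same signal appears as a group mean roughly twice the group median.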
The fact that priors_count is so correlated with race and is skewed differently for the different racial categories is a huge problem, mainly because the ML model can likely pick up on this fact and bias itself against certain races simply by looking at the priors_count column. To remedy this, we will create a custom transformer that modifies a column in place by applying the Yeo-Johnson transformation to each racial category’s subset of values. This will help remove the disparate impact that this column would have on our group fairness.
As pseudocode, it would look like this:

For each group label:
    Get the subset of priors_count values for that group
    Apply the Yeo-Johnson transformation to the subset
    Modify the column in place for that group label with the new values
By applying the transformation to each subset of values rather than to the column as a whole, we force each group’s set of values to be approximately normal with a mean of 0 and a standard deviation of 1, making it harder for the model to reconstruct a particular group label from a given priors_count value.
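A minimal sketch of this per-group idea using scikit-learn’s PowerTransformer on synthetic data (the group names and values below are made up for illustration, not taken from COMPAS):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)
# Two hypothetical groups with the same right-skewed shape but different scales
df = pd.DataFrame({
    'group': ['a'] * 500 + ['b'] * 500,
    'priors_count': np.concatenate([
        rng.exponential(2.0, 500),
        rng.exponential(6.0, 500),
    ])
})

# Fit one Yeo-Johnson transformer per group, then overwrite the column in place
for group, subset in df.groupby('group'):
    pt = PowerTransformer(method='yeo-johnson', standardize=True)
    df.loc[df['group'] == group, 'priors_count'] = pt.fit_transform(
        subset[['priors_count']]
    ).ravel()

# Each group now has (approximately) mean 0 and standard deviation 1
print(df.groupby('group')['priors_count'].agg(['mean', 'std']).round(2))
```

Because each group is standardized separately, a given transformed value no longer reveals which group it came from via scale alone.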
Let’s construct a custom scikit-learn transformer to perform this operation.
Listing 1. Disparate Treatment Mitigation through Yeo-Johnson
from sklearn.preprocessing import PowerTransformer  # A
from sklearn.base import BaseEstimator, TransformerMixin  # A

class NormalizeColumnByLabel(BaseEstimator, TransformerMixin):
    def __init__(self, col, label):
        self.col = col
        self.label = label
        self.transformers = {}

    def fit(self, X, y=None):  # B
        for group in X[self.label].unique():
            self.transformers[group] = PowerTransformer(
                method='yeo-johnson', standardize=True
            )
            self.transformers[group].fit(
                X.loc[X[self.label] == group][self.col].values.reshape(-1, 1)
            )
        return self

    def transform(self, X, y=None):  # C
        C = X.copy()
        for group in X[self.label].unique():
            C.loc[X[self.label] == group, self.col] = self.transformers[group].transform(
                X.loc[X[self.label] == group][self.col].values.reshape(-1, 1)
            )
        return C
# A Imports
# B fit a PowerTransformer for each group label
# C when transforming a new DataFrame, we use the transform method of our already-fit transformers and modify the dataframe in place
With our new transformer in hand, let’s apply it to our training data to see that our priors counts have been modified so that each group label has a mean priors count of 0 and a standard deviation of 1.
n = NormalizeColumnByLabel(col='priors_count', label='race')
X_train_normalized = n.fit_transform(X_train, y_train)

X_train_normalized.groupby('race')['priors_count'].hist(figsize=(20,5))
X_train_normalized.groupby('race')['priors_count'].describe()
Figure 3. After applying the Yeo-Johnson transformation to each subgroup’s subset of prior counts, the distributions look much less skewed and much less different from one another. This will make it difficult for the ML model to reconstruct race from this feature.
Listing 2. Our first bias-aware model
clf_tree_aware = Pipeline(steps=[
    ('normalize_priors', NormalizeColumnByLabel(col='priors_count', label='race')),  # A
    ('preprocessor', preprocessor),
    ('classifier', classifier)
])
clf_tree_aware.fit(X_train, y_train)
aware_y_preds = clf_tree_aware.predict(X_test)

exp_tree_aware = dx.Explainer(clf_tree_aware, X_test, y_test,
                              label='Random Forest DIR', verbose=False)  # B
mf_tree_aware = exp_tree_aware.model_fairness(protected=race_test, privileged="Caucasian")

# performance is virtually unchanged overall
pd.concat([exp.model_performance().result for exp in [exp_tree, exp_tree_aware]])

# We can see a small drop in parity loss
mf_tree.plot(objects=[mf_tree_aware], type='stacked')  # C
# A Add in our new transformer before our preprocessor to fix the priors_count before doing anything else
# B Check out our model performance
# C Investigate change in parity loss
Our new bias-aware model with disparate impact removal is working quite well! We can see a small boost in model performance and a small decrease in cumulative parity loss.
Figure 4. The top bar represents the sum of our bias metrics for our bias-aware model, which is seeing a minor boost in model performance (noted in the metric table) in all metrics except recall, where it is unchanged. The bottom bar shows the original bias-unaware stacked plot that we saw earlier. Overall, our new bias-aware model performs better in some ML metrics and shows a decrease in bias based on our parity loss bar chart. We are on the right track!
Feature Extraction – Learning Fair Representation (LFR) using AIF360
Up until now, we haven’t done anything to address our model’s unawareness of sensitive features. Rather than remove race completely, we are going to apply our first feature extraction technique, called learning fair representation (LFR), using AI Fairness 360 (aif360), an open-source toolkit developed by IBM that gives data scientists access to preprocessing, in-processing, and postprocessing bias mitigation techniques. The idea of LFR is to map our data X into a new, fairer representation.
For our use case, we are going to attempt to map our categorical variables (4 / 6 of them representing race) into a new “fairer” vector space that preserves statistical parity and retains as much information as possible from our original X.
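To build intuition before wiring up aif360, here is a heavily simplified, self-contained sketch of the representation idea behind LFR: each row is re-expressed as soft weights over k prototype points. In real LFR the prototypes are learned jointly with a fairness constraint; here they are fixed by hand purely for illustration (all names and values below are made up, and this is not the aif360 implementation):

```python
import numpy as np

def prototype_representation(X, prototypes):
    """Map each row of X to soft assignment weights over k prototypes.
    LFR's fair representation is built from weights like these, but with
    the prototypes learned under statistical-parity constraints."""
    # Squared distance from every row to every prototype: shape (n, k)
    d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    # Softmax over negative distances: closer prototypes get more weight
    w = np.exp(-d2)
    return w / w.sum(axis=1, keepdims=True)

# Toy binary-feature rows and two hand-picked prototypes (illustration only)
X = np.array([[1., 0., 0.],
              [0., 1., 1.],
              [1., 1., 0.]])
protos = np.array([[1., 0., 0.],
                   [0., 1., 1.]])
weights = prototype_representation(X, protos)
print(weights.sum(axis=1))  # each row's weights sum to 1
```

The transformed data (the weights) stands in for the original features, which is why LFR can discard information that identifies the protected group while keeping enough structure to predict the label.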
aif360 can be a bit tricky to use, as it forces you to use its own version of a dataframe called the BinaryLabelDataset. Below is a custom scikit-learn transformer that will:
● Take in X, a DataFrame of binary values created by our categorical preprocessor
● Convert the DataFrame into a BinaryLabelDataset
● Fit the LFR module from the aif360 package
● Transform any new dataset using the now-fit LFR to map it onto our new fair representation
Listing 3. Custom LFR Transformer
from aif360.algorithms.preprocessing.lfr import LFR
from aif360.datasets import BinaryLabelDataset

class LFRCustom(BaseEstimator, TransformerMixin):
    def __init__(self, col, protected_col,
                 unprivileged_groups, privileged_groups):
        self.col = col
        self.protected_col = protected_col
        self.TR = None
        self.unprivileged_groups = unprivileged_groups
        self.privileged_groups = privileged_groups

    def fit(self, X, y=None):
        d = pd.DataFrame(X, columns=self.col)
        d['response'] = list(y)
        binary_df = BinaryLabelDataset(  # A
            df=d,
            protected_attribute_names=self.protected_col,
            label_names=['response']
        )
        # Input reconstruction quality - Ax
        # Output prediction error - Ay
        # Fairness constraint - Az
        self.TR = LFR(unprivileged_groups=self.unprivileged_groups,
                      privileged_groups=self.privileged_groups,
                      seed=0, k=2, Ax=0.5, Ay=0.2, Az=0.2,  # B
                      verbose=1)
        self.TR.fit(binary_df, maxiter=5000, maxfun=5000)
        return self

    def transform(self, X, y=None):
        d = pd.DataFrame(X, columns=self.col)
        if y is not None:
            d['response'] = list(y)
        else:
            d['response'] = False
        binary_df = BinaryLabelDataset(  # A
            df=d,
            protected_attribute_names=self.protected_col,
            label_names=['response']
        )
        return self.TR.transform(binary_df).convert_to_dataframe()[0].drop(
            ['response'], axis=1)
# A Conversion to and from the aif360 BinaryLabelDataset object
# B These parameters can be found on the aif360 website and were discovered through offline grid searching
In order to use our new transformer, we will need to modify our pipeline slightly and make use of the FeatureUnion object.
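As a quick refresher, FeatureUnion fits several transformer branches on the same input and concatenates their outputs side by side. A minimal, self-contained sketch on toy data (the column names here are stand-ins, not our COMPAS pipeline):

```python
import pandas as pd
from sklearn.pipeline import FeatureUnion
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

# Toy stand-in frame with one numeric and one categorical column
df = pd.DataFrame({'age': [25, 40, 33], 'race': ['a', 'b', 'a']})

union = FeatureUnion([
    ('num', ColumnTransformer([('scale', StandardScaler(), ['age'])])),
    ('cat', ColumnTransformer([('ohe', OneHotEncoder(), ['race'])])),
])

out = union.fit_transform(df)
print(out.shape)  # (3, 3): 1 scaled numeric column + 2 one-hot columns
```

In our real pipeline, the categorical branch will additionally run LFR after the one-hot encoding, while the numerical branch only scales.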
Listing 4. Model with Disparate Impact Removal and LFR
categorical_preprocessor = ColumnTransformer(transformers=[
    ('cat', categorical_transformer, categorical_features)
])  # A

# Right now the aif360 package can only support one privileged and one unprivileged group
privileged_groups = [{'Caucasian': 1}]  # B
unprivileged_groups = [{'Caucasian': 0}]  # B

lfr = LFRCustom(
    col=['African-American', 'Caucasian', 'Hispanic', 'Other', 'Male', 'M'],
    protected_col=sorted(X_train['race'].unique()),
    privileged_groups=privileged_groups,
    unprivileged_groups=unprivileged_groups
)

categorical_pipeline = Pipeline([
    ('transform', categorical_preprocessor),
    ('LFR', lfr),
])

numerical_features = ["age", "priors_count"]
numerical_transformer = Pipeline(steps=[
    ('scale', StandardScaler())
])
numerical_preprocessor = ColumnTransformer(transformers=[
    ('num', numerical_transformer, numerical_features)
])  # A

preprocessor = FeatureUnion([  # C
    ('numerical_preprocessor', numerical_preprocessor),
    ('categorical_pipeline', categorical_pipeline)
])

clf_tree_more_aware = Pipeline(steps=[  # D
    ('normalize_priors', NormalizeColumnByLabel(col='priors_count', label='race')),
    ('preprocessor', preprocessor),
    ('classifier', classifier)
])

clf_tree_more_aware.fit(X_train, y_train)
more_aware_y_preds = clf_tree_more_aware.predict(X_test)
#A Isolate the numerical and categorical preprocessor so that we can fit the LFR to the categorical data separately
#B Tell aif360 that rows with a Caucasian label of 1 are privileged and rows with a Caucasian label of 0 are unprivileged
#C Use FeatureUnion to combine our categorical data and our numerical data
#D Our new pipeline will remove disparate impact/treatment via Yeo-Johnson and will apply LFR to our categorical data to address model unawareness
That was a lot of code to simply apply an LFR module to our dataframe. Truly, the only reason it was so much is the need to transform our pandas DataFrame into aif360’s custom data object and back. Now that we have fit our model, let’s take a final look at our model’s fairness.
exp_tree_more_aware = dx.Explainer(clf_tree_more_aware, X_test, y_test,
                                   label='Random Forest DIR + LFR', verbose=False)
mf_tree_more_aware = exp_tree_more_aware.model_fairness(protected=race_test, privileged="Caucasian")

pd.concat([exp.model_performance().result
           for exp in [exp_tree, exp_tree_aware, exp_tree_more_aware]])
We can see that our final model with disparate impact removal and LFR applied has arguably better model performance than our original baseline model.
Figure 5. Our final bias-aware model has improved accuracy, F1, and precision and sees only a minor drop in recall and AUC. This is wonderful because it shows that by reducing bias, we have also gotten our ML model to perform better in more “classical” metrics like accuracy. Win-win!
We also want to check in on our cumulative parity loss to make sure we are heading in the right direction.
mf_tree.plot(objects=[mf_tree_aware, mf_tree_more_aware], type='stacked')
When we check our plot, we can see that our fairness metrics are decreasing as well! This is all-around great news. Our model is not suffering performance-wise compared to our baseline model, and it is also acting much more fairly.
Figure 6. Our final bias-aware model, with disparate impact removal and LFR, is the fairest model yet. Again, keep in mind that smaller means less bias, which is generally better for us. We are definitely making the right moves to see such a drop in bias and an increase in model performance after some fairly simple transformations to our data!
Let’s take a look at our dalex model fairness check one last time. Recall that for our unaware model, we had 7 numbers outside of our range of (0.8, 1.25) and bias detected in 4 / 5 metrics.
mf_tree_more_aware.fairness_check()

# 4 / 15 numbers out of the range of (0.8, 1.25)
Bias detected in 3 metrics: TPR, FPR, STP

Conclusion: your model is not fair because 2 or more criteria exceeded acceptable limits set by epsilon.

Ratios of metrics, based on 'Caucasian'. Parameter 'epsilon' was set to 0.8 and therefore metrics should be within (0.8, 1.25)

                       TPR       ACC       PPV       FPR       STP
African-American  1.626829  1.058268  1.198953  1.538095  1.712329
Hispanic          1.075610  1.102362  0.965096  0.828571  0.893836
Other             0.914634  0.996850  0.806283  1.100000  0.962329
We now have only 4 numbers out of our range as opposed to the 7 previously, and bias is now detected in only 3 metrics rather than 4. All in all, our work seems to have improved our model performance slightly and reduced our cumulative parity loss at the same time.
We’ve done a lot of work on this data, but would we be comfortable submitting this model as is to be considered an accurate and fair recidivism predictor? Absolutely not! Our work in this article series barely scratches the surface of bias and fairness awareness and only focused on a few preprocessing techniques. We did not even begin to discuss the other forms of bias mitigation available to us in depth.
Summary
● Model fairness is as important as, and sometimes more important than, model performance alone
● There are multiple ways of defining fairness in our model, each with pros and cons
● Pure unawareness in a model is usually not enough, given correlating factors in our data
● Statistical parity and equalized odds are two common definitions of fairness, but they can sometimes contradict one another
● We can mitigate bias before, during, and after training a model
● Disparate impact removal and learning fair representation helped our model become more fair and also led to a small bump in model performance
● Preprocessing alone is not enough to mitigate bias; we would also have to work on in-processing and postprocessing methods to further minimize our bias
That’s all for this article series.
If you want to learn more about the book, check it out on Manning’s liveBook platform here.