From Feature Engineering Bookcamp by Sinan Ozdemir

This article series covers

●      Recognizing and mitigating bias in our data and model

●      Quantifying fairness through various metrics

●      Applying feature engineering techniques to remove bias from our model without sacrificing model performance

Take 35% off Feature Engineering Bookcamp by entering fccozdemir into the discount code box at checkout.

Building a Bias-aware Model

For information on the dataset, the mechanics and importance of model bias and fairness, and building the basic model, check out part 1 and part 2.

Let’s begin constructing a more bias-aware model using two feature engineering techniques. First, we will apply a familiar transformation to construct a new, less biased column, and then we will move on to our first feature extraction method of the book. Our goal is to minimize the bias of our model without sacrificing a great deal of model performance.

Feature Construction – Using the Yeo-Johnson transformer to treat the disparate impact

We’re going to do something similar to the Box-Cox transformation to transform some of our features in order to make them appear more normal. To set up why, we have to investigate the reasons our model is under-predicting recidivism for non-African-American people. One approach would be to remove race entirely from our dataset and expect the ML model to be free of bias. Unfortunately, this is rarely the answer.

Unprivileged and privileged groups of people experience different opportunities, and this likely presents itself in the data through correlated features. The most likely cause of our model’s bias is that at least one of our features is highly correlated with race, and our model is able to reconstruct someone’s racial identity through this feature. To find this feature, let’s start by computing the correlation coefficient between each of our numerical features and being African-American.

 compas_df.corrwith(compas_df['race'] == 'African-American').sort_values()
 age              -0.179095
 juv_count         0.111835
 priors_count      0.202897

Both age and priors_count are correlated with our boolean label of simply being African-American, so let’s take a closer look at each. Starting with age: if we plot a histogram and print out some basic statistics, we will see that across our four racial categories, age is relatively similar, with comparable means, standard deviations, and medians. This signals that even though age is negatively correlated with being African-American, the relationship is not a huge contributing factor to our model’s bias.

 # Age is not very skewed
 compas_df['age'].plot(
     kind='hist', xlabel='Age', title='Histogram of Age'
 )

Figure 1. Distribution of age by group. The table on top implies that the age distribution is not drastically different across groups, thereby implying less disparate treatment and impact. It is worth noting that the average and median age of African-Americans is about 10-15% lower than those of the other categories, which is probably why we are seeing a strong correlation between our age column and our African-American indicator column.

Let’s turn our attention to priors_count and do the same printout. When we do, we will see some stark contrasts from age.

 # Priors is extremely skewed, as seen in the differences in mean/median/std across the racial categories
 compas_df['priors_count'].plot(
     kind='hist', xlabel='Count of Priors', title='Histogram of Priors'
 )

Figure 2. At first glance, it may seem like priors follow a similar pattern for all races: the distributions of priors count show a similar right skew across racial groups. However, for reasons out of many people’s control, the median and mean prior count for African-Americans are nearly twice those of the other groups.

Two things of note:

  1. African-American priors are hugely right skewed, as evidenced by the mean being over twice the median
  2. African-American prior counts are nearly twice as high as those of the other racial groups, due to a long history of systemic criminal justice issues
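Point 1 is easy to verify numerically: in a heavily right-skewed sample, the long tail drags the mean far above the median. Here is a quick synthetic illustration (a lognormal sample, not the COMPAS data):

```python
import numpy as np

# Strongly right-skewed synthetic sample (illustration only, not COMPAS data)
rng = np.random.default_rng(42)
sample = rng.lognormal(mean=0.0, sigma=1.5, size=10_000)

# The long right tail drags the mean well above the median
print(float(np.median(sample)))  # theoretical median of this distribution is 1.0
print(float(sample.mean()) > 2 * float(np.median(sample)))  # True
```

Seeing a mean more than double the median, as we do for African-American priors, is therefore a strong signal of heavy right skew.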

The fact that priors_count is so correlated with race and is skewed differently for the different racial categories is a huge problem, mainly because the ML model can likely pick up on this and bias itself against certain races simply by looking at the priors_count column. To remedy this, we will create a custom transformer that modifies the column in place by applying the Yeo-Johnson transformation to each racial category’s subset of values. This will help remove the disparate impact this column would have on our group fairness.

As pseudocode, it would look like

 For each group label:
         Get the subset of priors_count values for that group
         Apply the yeo-johnson transformation to the subset
         Modify the column in place for that group label with the new values

By applying the transformation on each subset of values rather than applying to the column as a whole, we are forcing each group’s set of values to be normal with a mean of 0 and a standard deviation of 1, making it harder for the model to reconstruct a particular group label from a given priors_count value.
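Before wrapping this logic in a custom transformer, the per-group idea can be sketched directly. The snippet below uses synthetic data with hypothetical column names (`group`, `value`), not the COMPAS columns:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PowerTransformer

# Synthetic stand-in for a skewed column whose scale differs by group
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'group': ['a'] * 200 + ['b'] * 200,
    'value': np.concatenate([
        rng.exponential(2.0, 200),  # group 'a': right-skewed, small scale
        rng.exponential(8.0, 200),  # group 'b': same shape, larger scale
    ]),
})

# Fit one Yeo-Johnson transformer per group and overwrite that group's values
for g in df['group'].unique():
    mask = df['group'] == g
    pt = PowerTransformer(method='yeo-johnson', standardize=True)
    df.loc[mask, 'value'] = pt.fit_transform(
        df.loc[mask, 'value'].values.reshape(-1, 1)
    ).flatten()

# Each group now sits near mean 0 / standard deviation 1,
# so 'value' no longer encodes the group's original scale
print(df.groupby('group')['value'].agg(['mean', 'std']).round(2))
```

After the loop, knowing a row’s `value` tells you essentially nothing about its `group`, which is exactly the effect we want on priors_count.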

Let’s construct a custom scikit-learn transformer to perform this operation.

Listing 1. Disparate Treatment Mitigation through Yeo-Johnson

 from sklearn.preprocessing import PowerTransformer  # A
 from sklearn.base import BaseEstimator, TransformerMixin  # A

 class NormalizeColumnByLabel(BaseEstimator, TransformerMixin):
     def __init__(self, col, label):
         self.col = col
         self.label = label
         self.transformers = {}

     def fit(self, X, y=None):  # B
         for group in X[self.label].unique():
             self.transformers[group] = PowerTransformer(
                 method='yeo-johnson', standardize=True
             ).fit(
                 X.loc[X[self.label] == group][self.col].values.reshape(-1, 1)
             )
         return self

     def transform(self, X, y=None):  # C
         C = X.copy()
         for group in X[self.label].unique():
             C.loc[X[self.label] == group, self.col] = self.transformers[group].transform(
                 X.loc[X[self.label] == group][self.col].values.reshape(-1, 1)
             ).flatten()
         return C

# A Imports

# B fit a PowerTransformer for each group label

# C when transforming a new DataFrame, we use the transform method of our already-fit transformers and modify the dataframe in place

With our new transformer in hand, let’s apply it to our training data and confirm that the priors counts have been modified so that each group label has a mean priors count of 0 and a standard deviation of 1.

 n = NormalizeColumnByLabel(col='priors_count', label='race')
 X_train_normalized = n.fit_transform(X_train, y_train)

Figure 3. After applying the Yeo-Johnson transformation to each subgroup’s subset of prior counts, the distributions look much less skewed and much less different from one another. This will make it difficult for the ML model to reconstruct race from this feature.

Listing 2. Our first bias-aware model

 clf_tree_aware = Pipeline(steps=[
     ('normalize_priors', NormalizeColumnByLabel(col='priors_count', label='race')),  # A
     ('preprocessor', preprocessor),
     ('classifier', classifier)
 ]).fit(X_train, y_train)
 aware_y_preds = clf_tree_aware.predict(X_test)

 exp_tree_aware = dx.Explainer(clf_tree_aware, X_test, y_test, label='Random Forest DIR', verbose=False)  # B
 mf_tree_aware = exp_tree_aware.model_fairness(protected=race_test, privileged="Caucasian")

 # performance is virtually unchanged overall
 pd.concat([exp.model_performance().result for exp in [exp_tree, exp_tree_aware]])

 # We can see a small drop in parity loss
 mf_tree.plot(objects=[mf_tree_aware], type='stacked')  # C

# A  Add in our new transformer before our preprocessor to fix the priors_count before doing anything else

# B Check out our model performance

# C Investigate change in parity loss

Our new bias-aware model with Disparate Impact Removal is working quite well! We actually see a small boost in model performance and a small decrease in cumulative parity loss.

Figure 4. The top bar represents the sum of our bias metrics for our bias-aware model, which sees a minor boost in model performance (noted in the metric table) in all metrics except recall, where it is unchanged. The bottom bar shows the original bias-unaware stacked plot that we saw earlier. Overall, our new bias-aware model performs better in some ML metrics and shows a decrease in bias based on our parity loss bar chart. We are on the right track!

Feature Extraction – Learning fair representation implementation using AIF360

Up until now, we haven’t done anything to address our model’s unawareness of sensitive features. Rather than remove race completely, we are going to apply our first feature extraction technique, learning fair representation (LFR), using AI Fairness 360 (aif360), an open-source toolkit developed by IBM that gives data scientists access to preprocessing, in-processing, and postprocessing bias mitigation techniques. The idea of LFR is to map our data X into a new, fairer representation.

For our use case, we are going to attempt to map our categorical variables (4 of the 6 represent race) into a new, “fairer” vector space that preserves statistical parity and retains as much information as possible from our original X.
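As a reminder of what “preserves statistical parity” means, here is a minimal sketch with hypothetical group labels and predictions (not our model’s actual output): statistical parity compares the positive-prediction rates across groups, and a difference near zero means parity.

```python
import pandas as pd

# Hypothetical predictions for a privileged (1) and an unprivileged (0) group
preds = pd.DataFrame({
    'privileged': [1, 1, 1, 0, 0, 0, 0, 0],
    'prediction': [1, 0, 1, 1, 0, 1, 1, 0],
})

# Statistical parity: positive-prediction rate per group
rates = preds.groupby('privileged')['prediction'].mean()
parity_difference = rates[0] - rates[1]
print(round(parity_difference, 3))  # -0.067: close to 0 means near-parity
```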

aif360 can be a bit tricky to use because it forces you to use its own version of a dataframe called the BinaryLabelDataset. Below is a custom scikit-learn transformer that will:

  1. Take in X, a DataFrame of binary values created by our categorical preprocessor
  2. Convert the DataFrame into a BinaryLabelDataset
  3. Fit the LFR module from the aif360 package
  4. Transform any new dataset using the now-fit LFR to map it onto our new fair representation

Listing 3. Custom LFR Transformer

 from aif360.algorithms.preprocessing.lfr import LFR
 from aif360.datasets import BinaryLabelDataset

 class LFRCustom(BaseEstimator, TransformerMixin):
     def __init__(self, col, protected_col, unprivileged_groups, privileged_groups):
         self.col = col
         self.protected_col = protected_col
         self.TR = None
         self.unprivileged_groups = unprivileged_groups
         self.privileged_groups = privileged_groups

     def fit(self, X, y=None):
         d = pd.DataFrame(X, columns=self.col)
         d['response'] = list(y)
         binary_df = BinaryLabelDataset(  # A
             df=d,
             protected_attribute_names=self.protected_col,
             label_names=['response']
         )
         # Input reconstruction quality - Ax
         # Output prediction error - Ay
         # Fairness constraint - Az
         self.TR = LFR(unprivileged_groups=self.unprivileged_groups,
                       privileged_groups=self.privileged_groups, seed=0,
                       k=2, Ax=0.5, Ay=0.2, Az=0.2)  # B
         self.TR.fit(binary_df, maxiter=5000, maxfun=5000)
         return self

     def transform(self, X, y=None):
         d = pd.DataFrame(X, columns=self.col)
         if y is not None:
             d['response'] = list(y)
         else:
             d['response'] = False
         binary_df = BinaryLabelDataset(  # A
             df=d,
             protected_attribute_names=self.protected_col,
             label_names=['response']
         )
         return self.TR.transform(binary_df).convert_to_dataframe()[0].drop(['response'], axis=1)  # A

# A Conversion to and from the aif360 BinaryLabelDataset object

# B These parameters can be found on the aif360 website and were discovered through offline grid searching

In order to use our new transformer, we will need to modify our pipeline slightly and make use of the FeatureUnion object.
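If FeatureUnion is unfamiliar, here is a minimal sketch on toy data: each branch transforms the same input independently, and the outputs are concatenated column-wise into one feature matrix.

```python
import numpy as np
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

union = FeatureUnion([
    ('standardized', StandardScaler()),  # branch 1: z-scores
    ('min_maxed', MinMaxScaler()),       # branch 2: scaled to [0, 1]
])
out = union.fit_transform(X)
print(out.shape)  # (3, 4): two columns from each branch, stacked side by side
```

In our pipeline, one branch will handle the numerical columns and the other will run the categorical columns through the LFR transformer, and FeatureUnion will stitch the results back together for the classifier.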

Listing 4. Model with Disparate Impact Removal and LFR

 categorical_preprocessor = ColumnTransformer(transformers=[
     ('cat', categorical_transformer, categorical_features)
 ])  # A

 # Right now the aif360 package can only support one privileged and one unprivileged group
 privileged_groups = [{'Caucasian': 1}]  # B
 unprivileged_groups = [{'Caucasian': 0}]  # B

 lfr = LFRCustom(
     col=['African-American', 'Caucasian', 'Hispanic', 'Other', 'Male', 'M'],
     protected_col=sorted(X_train['race'].unique()),
     unprivileged_groups=unprivileged_groups,
     privileged_groups=privileged_groups
 )
 categorical_pipeline = Pipeline([
     ('transform', categorical_preprocessor),
     ('LFR', lfr),
 ])

 numerical_features = ["age", "priors_count"]
 numerical_transformer = Pipeline(steps=[
     ('scale', StandardScaler())
 ])
 numerical_preprocessor = ColumnTransformer(transformers=[
     ('num', numerical_transformer, numerical_features)
 ])  # A

 preprocessor = FeatureUnion([  # C
     ('numerical_preprocessor', numerical_preprocessor),
     ('categorical_pipeline', categorical_pipeline)
 ])

 clf_tree_more_aware = Pipeline(steps=[  # D
     ('normalize_priors', NormalizeColumnByLabel(col='priors_count', label='race')),
     ('preprocessor', preprocessor),
     ('classifier', classifier)
 ]).fit(X_train, y_train)
 more_aware_y_preds = clf_tree_more_aware.predict(X_test)

#A Isolate the numerical and categorical preprocessor so that we can fit the LFR to the categorical data separately

#B Tell aif360 that rows with a Caucasian label of 1 are privileged and rows with a Caucasian label of 0 are unprivileged

#C use FeatureUnion to combine our categorical data and our numerical data

#D Our new pipeline will remove disparate impact/treatment via yeo-johnson and will apply LFR to our categorical data to address model unawareness

That was a lot of code simply to apply an LFR module to our dataframe. Truly, the only reason there was so much is the need to transform our pandas DataFrame into aif360’s custom data object and back. Now that we have fit our model, let’s take a final look at our model’s fairness.

 exp_tree_more_aware = dx.Explainer(clf_tree_more_aware, X_test, y_test, label='Random Forest DIR + LFR', verbose=False)
 mf_tree_more_aware = exp_tree_more_aware.model_fairness(protected=race_test, privileged="Caucasian")
 pd.concat([exp.model_performance().result for exp in [exp_tree, exp_tree_aware, exp_tree_more_aware]])

We can see that our final model with disparate impact removal and LFR applied has arguably better model performance than our original baseline model.

Figure 5. Our final bias-aware model has improved accuracy, f1, and precision and is seeing only a minor drop in recall and AUC. This is wonderful because it shows that by reducing bias, we have gotten our ML model to perform better in more “classical” metrics like accuracy at the same time. Win-win!

We also want to check in on our cumulative parity loss to make sure we are heading in the right direction.

 mf_tree.plot(objects=[mf_tree_aware, mf_tree_more_aware], type='stacked')

When we check our plot, we can see that our fairness metrics are decreasing as well! This is all-around great news: our model is not suffering performance-wise relative to our baseline model, and it is also acting much more fairly.

Figure 6. Our final bias-aware model, with Disparate Impact Removal and LFR, is the fairest model yet. Again, keep in mind that smaller means less bias, which is generally better for us. We are definitely making the right moves to see such a drop in bias and an increase in model performance after some pretty simple transformations to our data!

Let’s take a look at our dalex model fairness check one last time. Recall that for our unaware model, we had 7 numbers outside of our range of (0.8, 1.25) and bias detected in 4 of 5 metrics.

 mf_tree_more_aware.fairness_check()  # 4 / 15 numbers out of the range of (0.8, 1.25)
 Bias detected in 3 metrics: TPR, FPR, STP
 Conclusion: your model is not fair because 2 or more criteria exceeded acceptable limits set by epsilon.
 Ratios of metrics, based on 'Caucasian'. Parameter 'epsilon' was set to 0.8 and therefore metrics should be within (0.8, 1.25)
                        TPR       ACC       PPV       FPR       STP
 African-American  1.626829  1.058268  1.198953  1.538095  1.712329
 Hispanic          1.075610  1.102362  0.965096  0.828571  0.893836
 Other             0.914634  0.996850  0.806283  1.100000  0.962329

We now have only 4 numbers out of our range as opposed to the 7 previously, and bias is now detected in only 3 metrics rather than 4. All in all, our work seems to have improved our model performance slightly and reduced our cumulative parity loss at the same time.

We’ve done a lot of work on this data, but would we be comfortable submitting this model as is to be considered an accurate and fair recidivism predictor? Absolutely not! Our work in this article series barely scratches the surface of bias and fairness awareness and only focused on a few preprocessing techniques. We did not even begin to discuss the other forms of bias mitigation available to us in depth.


  • Model fairness is as important as, and sometimes more important than, model performance alone
  • There are multiple ways of defining fairness in our model, each with pros and cons
    • Pure unawareness of a model is usually not enough given correlating factors in our data
    • Statistical Parity and Equalized Odds are two common definitions of fairness but can sometimes contradict one another
  • We can mitigate bias before, during, and after training a model
  • Disparate Impact Removal and Learning Fair Representation helped our model become fairer and also led to a small bump in model performance
  • Preprocessing alone is not enough to mitigate bias. We would also have to work on in-processing and post-processing methods to further minimize our bias

That’s all for this article series.

If you want to learn more about the book, check it out on Manning’s liveBook platform here.