Algorithmic Bias: Where is the Problem?

What is algorithmic bias? That is, how can we define it in a meaningful, constructive way that can ultimately help us create a more equitable society? To begin thinking about this question more deeply, we must consider a few different ideas of algorithmic bias, some clarifications, and then what benchmark we are comparing algorithms against.

When I first read the ProPublica article about predictive policing (found here) over a year ago, I was caught off guard. I was convinced that there was some sort of problem, but I had to work through what exactly that problem was and what it meant for algorithms and for the responsibility of the software engineers who come up with them. My view shifted, however, after reading some clarifications of the ProPublica article and some statistical studies showing that the data itself was biased (these studies were based on the data that ProPublica published; to their credit, they responsibly published all of the data that they used). It's also important here to define what I mean by bias. In this colloquial sense, I simply mean that the data shows a disparate impact against a group of people based on ethically non-essential characteristics, like race. I believe this is also a common use of the term when speaking about bias in this context.
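This colloquial notion of disparate impact can be made concrete. One common quantitative proxy is the ratio of favorable-outcome rates between two groups; the sketch below is a minimal illustration using made-up data, not figures from the ProPublica analysis, and the function name is my own.

```python
# Sketch: disparate impact measured as a ratio of favorable-outcome rates.
# All data here is hypothetical, purely to illustrate the definition.

def disparate_impact_ratio(outcomes_protected, outcomes_reference):
    """Ratio of favorable-outcome rates (1 = favorable outcome) between
    a protected group and a reference group. A ratio well below 1.0
    suggests the protected group receives favorable outcomes less often."""
    rate_protected = sum(outcomes_protected) / len(outcomes_protected)
    rate_reference = sum(outcomes_reference) / len(outcomes_reference)
    return rate_protected / rate_reference

# Hypothetical outcomes: 1 = favorable label, 0 = unfavorable label.
group_a = [1, 1, 0, 1, 0, 1, 1, 0]   # 5/8 favorable
group_b = [1, 0, 0, 1, 0, 0, 1, 0]   # 3/8 favorable

ratio = disparate_impact_ratio(group_b, group_a)
print(f"Disparate impact ratio: {ratio:.2f}")  # (3/8) / (5/8) = 0.60
```

The point of the sketch is only that "bias" in this sense is a property we can measure in the data itself, independent of how any particular algorithm works.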

Following the ProPublica article, a common reaction is to be up in arms against the dangers of a technology like predictive policing: will it increase the disparity, or lock it in place so that we can't improve it? While these fears are legitimate, it is important to acknowledge the real problem that this article unearths without jumping to a conclusion about what is to blame, namely the algorithms. It is a major danger, and an error, to assign blame based on a lack of understanding or a lack of information: to blame algorithms for all bias simply because we don't understand what the algorithm is doing. In this case, it turns out that the algorithm itself was sound, but the data was skewed by bias that already exists in the world. The important point is that the algorithm was constructed in such a manner that it did its job exactly, with no "bias" or mistakes of its own. It just so happened that the data the algorithm used to make predictions (in this case, about which areas were more likely to have crime) was skewed by an inherent disparity that exists in the world.

If we compare algorithms to a benchmark of perfection, they will fall short. By the nature of uncertainty, there will always be false positives and false negatives, though a very good algorithm can limit them. However, there are always false positives and false negatives when humans make important decisions too; as an example, consider the study that examined the increase in harsher rulings following a loss by the judge's home football team. So what sort of benchmark should we compare against? If an algorithm consistently performs better and more equitably than a human, should we use that algorithm? The rational answer seems to be a clear yes, but when we really think about putting life-or-death decisions into the hands of a machine, many would likely say that we should not. We should then consider what the real difference between the two cases is, and why one might not be comfortable choosing the algorithm. Is it a lack of transparency? (If so, algorithmic transparency and education about the algorithm might help.) Or is it a lack of an intangible humanity? Moving forward, it will be exceedingly important for us as a society to think about the different ways we can define algorithmic bias constructively, and to consider in which sorts of situations the use of algorithms might be okay, and why.
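The talk of false positives and false negatives can also be made concrete. A minimal sketch, using entirely hypothetical labels and predictions: even when two groups see the same overall data, the kinds of errors a decision-maker commits can differ sharply between them, which is exactly the sort of disparity worth auditing for.

```python
# Sketch: comparing false positive / false negative rates across two groups.
# All labels and predictions below are hypothetical, for illustration only.

def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate).
    1 = the event occurred / was predicted; 0 = it did not."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    return fp / negatives, fn / positives

# Two groups with identical true outcomes but different predictions.
truth_a = [0, 0, 0, 1, 1, 0]
preds_a = [0, 1, 0, 1, 1, 0]   # 1 false positive, 0 false negatives
truth_b = [0, 0, 0, 1, 1, 0]
preds_b = [1, 1, 0, 1, 0, 0]   # 2 false positives, 1 false negative

fpr_a, fnr_a = error_rates(truth_a, preds_a)
fpr_b, fnr_b = error_rates(truth_b, preds_b)
print(f"Group A: FPR={fpr_a:.2f}, FNR={fnr_a:.2f}")  # FPR=0.25, FNR=0.00
print(f"Group B: FPR={fpr_b:.2f}, FNR={fnr_b:.2f}")  # FPR=0.50, FNR=0.50
```

Whether we hold an algorithm or a human judge to these numbers, the same accounting applies; the question the paragraph above raises is which baseline, human or machine, we should accept.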
