Wrongful (but not wrong) Prediction
In a 1976 episode of Welcome Back, Kotter, Juan Epstein, a character in Kotter’s infamous classroom of remedial students called the Sweathogs, decides he wants to become a veterinarian. His teacher, Gabe Kotter, supports his dream and encourages him to speak to a career counselor at the school. Enter technology. The counselor runs the stats and informs Epstein that, based on his grades and scores, he should aim for manual labor. Of course, in sitcom fashion, a heroic Epstein oversees a pregnant hamster’s complicated delivery, impressing the counselor and alerting her to the deficits of her automated process of prediction.
That was 1976. A check on algorithmic bias allowed a fictional student to pursue a dream. But where are the checks on algorithmic bias now? Organizations use algorithms to predict which students will end up with a prison sentence, which parents are more likely to be dangerous, who will need public services, which areas have more crime, or who, among the homeless, is most in need of housing. In some cases, an algorithm can be harmless and simply help organizations prepare. But predictive algorithms are not expressing a truth; they are making predictions by performing statistics on a dataset. Some of them are better than fortune tellers, yet far from perfect (see Eubanks, p. 145, on the Allegheny Family Screening Tool, which measured 76 percent in predictive accuracy; Eubanks reminds us that a model with zero predictive ability would be like a coin toss, likely to be correct about 50 percent of the time). Models and the algorithms in them also embed many human judgments, for example, people’s ideas about the qualities that would make someone deserve housing more than someone else (Eubanks, Chapter 3). Even at their most accurate, they often answer the wrong question or predict wrongfully.
When I say predict wrongfully, I mean that a prediction based on accurate data is used to perpetuate a wrong. Accuracy does not imply a causal connection between metrics and outcomes, and group data does not always accurately predict individual behavior. The problems the Juan Epstein character faced continue to plague students. As a hypothetical, let’s assume data indicates that people who have been unable to pay bills, who have been incarcerated, who shop at the dollar store, and who are single parents are less likely to perform well at a certain job, even if they are qualified. An algorithm filters them out. Yet through reading personal stories about people who have persevered, become educated and learned skills, and even started their own companies, we see that resilience and perseverance might be indicators of success. Positive personal stories are excluded from the algorithm or are so rare that they do not feed into the prediction, while negative experiences, mostly experiences associated with poverty, are included. Using large datasets to filter out large groups is easy, and it leaves behind quite a bit of algorithmic roadkill.
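The hypothetical above can be sketched in a few lines of Python. All numbers are invented for illustration and come from no real dataset: assume 1,000 qualified applicants share the flagged characteristics and 40 percent of them would do well in the job, against 60 percent of 1,000 other applicants. The function name and the 0.5 threshold are also assumptions. A filter that acts on the group rate alone can be right on average and still turn away every flagged applicant who would have succeeded.

```python
# Hypothetical numbers, invented for illustration only.
applicants = {
    # group: (number of qualified applicants, share who would succeed)
    "flagged": (1000, 0.40),
    "not_flagged": (1000, 0.60),
}


def filter_by_group(applicants, threshold=0.5):
    """Reject every member of a group whose success rate is below threshold.

    Returns, per group, how many would-be successes were hired or rejected.
    """
    results = {}
    for group, (count, success_rate) in applicants.items():
        would_succeed = round(count * success_rate)
        hired = success_rate >= threshold
        results[group] = {
            "hired": count if hired else 0,
            "successes_hired": would_succeed if hired else 0,
            "successes_rejected": 0 if hired else would_succeed,
        }
    return results


results = filter_by_group(applicants)
for group, outcome in results.items():
    print(group, outcome)
```

In this sketch the filter turns away 400 people who would have done well, and nothing in the group rate says which 400 they are.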
Algorithms affect job seekers and employers: they predict who is most likely to become pregnant and quit a job, and they steer job seekers away from jobs based on demographics rather than skills. While a given predictive model is not necessarily inaccurate, how government agencies and businesses use algorithms remains an open and important issue. We need a stronger public voice in how algorithms are used in policing, criminal justice, education, health care, employment, the distribution of public services, and surveillance. More data often leads to more accuracy in predicting group behavior but does not necessarily tell us more about an individual (O’Neil, p. 170).
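One way to see the gap between group and individual prediction is a short calculation, sketched here in Python. The 30 percent rate, the sample sizes, and the function names are assumptions chosen for illustration, and the sketch treats everyone in the group as sharing the same rate. As the dataset grows, the uncertainty about the group's rate (its standard error) shrinks toward zero, while the uncertainty about whether any one person will have the outcome (the standard deviation of a single yes-or-no outcome) does not move.

```python
import math


def group_rate_standard_error(rate, sample_size):
    """Standard error of a group rate estimated from sample_size people."""
    return math.sqrt(rate * (1 - rate) / sample_size)


def individual_outcome_std(rate):
    """Standard deviation of one person's yes/no outcome at the given rate."""
    return math.sqrt(rate * (1 - rate))


rate = 0.30  # hypothetical share of a group with some outcome
for sample_size in (100, 10_000, 1_000_000):
    se = group_rate_standard_error(rate, sample_size)
    sd = individual_outcome_std(rate)
    print(f"n={sample_size:>9,}  group SE={se:.4f}  individual SD={sd:.4f}")
```

The first column of numbers keeps falling as the sample grows; the second stays the same on every line, which is the sense in which more data sharpens the picture of the group without sharpening the picture of a person.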
In the example of who might quit upon childbirth, it may be that the algorithm reflects characteristics of people who did not have access to childcare, but the predictive model could prevent someone who has access to childcare from being offered the job, and it could allow discrimination against women and anyone for whom securing childcare is challenging.
Validating the model is another tricky process. If people are turned away from jobs, they are not part of the new data gathered to answer the question: who is good at this job? In the pregnancy prediction, the people who keep the jobs may be delaying pregnancy, may be financially secure with many childcare options, or may live in a state and county with affordable services. As the feedback loop continues, the ones who were offered and took the jobs and have had a chance to prove themselves will be in the dataset, and their demographics or qualities will be read as markers of people likely to stay at work. It will look like a good predictive algorithm if more women push back pregnancy, are burdened by debt, or are not interested in having children or in taking time off for maternity leave if they do. This type of model may (for better or worse depending on outlook) make an employer prefer to hire someone with student loans to repay, as that is a known reason that people delay pregnancy. The model’s constantly updated dataset will not adjust unless people it deems a risk are hired and stay, but they will not be hired and given the opportunity to stay because the algorithm has weeded them out.
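The feedback loop can be sketched as a small simulation in Python. Everything here is hypothetical: two groups, A and B, with the same true retention rate of 80 percent, a model that starts with an unfounded low estimate for B, and a hiring rule that only hires from groups the model scores at or above 0.5. To keep the sketch deterministic, it uses the expected number of people who stay instead of random draws.

```python
# Hypothetical setup: both groups truly stay at the same rate.
TRUE_RETENTION = {"A": 0.80, "B": 0.80}


def run_feedback_loop(initial_estimates, rounds=5, hires_per_round=100,
                      threshold=0.5):
    """Re-estimate retention each round using only data from people hired."""
    estimates = dict(initial_estimates)
    hired = {group: 0 for group in estimates}
    stayed = {group: 0 for group in estimates}
    for _ in range(rounds):
        for group, estimate in estimates.items():
            if estimate < threshold:
                continue  # filtered out: no hires, so no new outcome data
            hired[group] += hires_per_round
            # Expected number who stay, used in place of random draws.
            stayed[group] += round(hires_per_round * TRUE_RETENTION[group])
        for group in estimates:
            if hired[group] > 0:
                estimates[group] = stayed[group] / hired[group]
    return estimates, hired


# The model begins with a biased, too-low estimate for group B.
estimates, hired = run_feedback_loop({"A": 0.70, "B": 0.40})
print(estimates)
print(hired)
```

Group A's estimate moves from 0.70 to the true 0.80, while group B's stays at 0.40 despite an identical true rate, because no one from B is ever hired and so no outcome data for B is ever collected.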
Even accurate algorithms can beg the question, a logical fallacy in which the premise is also the conclusion (for example, “everyone uses TikTok because it is the most used app”). The word “because” is incorrect there: the premise is the same as the conclusion. In algorithm use, this looks something like “a lot of people like you do this because so many people doing this are people like you.” That leads the algorithm to make a prediction that adds a presumed “because” where none belongs, using the past to predict the future. If a confounding variable or the true root cause is overlooked, policy will go astray, punishing and mischaracterizing the poor. After all, if you share characteristics with a group deemed more likely to steal, rob, loiter, be disorderly, or use drugs, you are being stereotyped and aligned with people who may be nothing like you rather than people a lot like you.
We glorify tech (often for good reason!). But we trust algorithms more than we trust our instincts (Mansharamani; Eubanks). We love STEM here in the United States. Technology and the sciences are crucial to society, comfort, access to information, wellness, medical care, transportation, the military, and the environment. Yet the use of tech in areas like criminology, policing, access to health care, education and college admissions, and job markets comes with a duty to eliminate bias and to end feedback loops that leave the already disenfranchised behind.
Cathy O’Neil, Weapons of Math Destruction, 2016, Broadway Books: New York.
Virginia Eubanks, Automating Inequality, 2018, Picador: New York.
Vikram Mansharamani, Think for Yourself: Restoring Common Sense in the Age of Experts and Artificial Intelligence, 2020, Harvard Business Review Press: Cambridge.