Why algorithmic auditing can’t fully cope with AI bias in hiring


AI is often touted as an antidote to bias and discrimination in hiring. But there is growing recognition that AI itself can be biased, putting companies that use algorithms to drive hiring decisions at legal risk.

The challenge now for executives and HR managers is figuring out how to spot and eradicate racial bias, sexism and other forms of discrimination in AI — a complex technology few laypeople can begin to understand.

Algorithmic auditing, a process for verifying that decision-making algorithms produce the expected outcomes without violating legal or ethical parameters, is emerging as the most likely fix. But algorithmic auditing is so new, lacking in standards and subject to vendor influence that it has come under attack from academics and policy advocates. Vendors are already squabbling over whose audits are the most credible.

That raises a big question: How can companies trust AI if they can’t trust the process for auditing it?

Why is algorithmic auditing important?

Before delving into answers, it’s important to remember the ends that algorithmic auditing is meant to serve when directed at automated hiring tools.

Many companies strive to include people of color and other minorities in their workforces, a goal that grew more urgent with the rise of diversity programs and the Black Lives Matter movement. Ultimately, they don’t have much choice: Discrimination has been illegal for years in most jurisdictions around the world.

In the U.S., the 1964 Civil Rights Act required that employment policies not have a disparate impact and adversely affect members of a protected class (see Figure 1 for terminology). Disparate impact (also commonly called adverse impact) was later quantified in the Uniform Guidelines on Employee Selection Procedures, which federal agencies adopted in 1978, and was written into the statute by a 1991 revision of the Civil Rights Act. The guidelines include a four-fifths rule that requires the selection rate of any race, sex or ethnic group to be no less than four-fifths (80%) that of the group with the highest rate.

Figure 1. Key terms in AI bias in hiring automation tools. These rules and concepts are typically programmed into AI hiring software sold in the U.S.
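
To make the four-fifths rule concrete, here is a minimal Python sketch of the arithmetic. The function names and the applicant counts are hypothetical and chosen only for illustration; they do not come from any vendor's software, and a real adverse impact analysis would also account for sample size and legal context.

```python
from typing import Dict, Tuple

# Hypothetical input format: group name -> (number selected, number of applicants)
Counts = Dict[str, Tuple[int, int]]


def selection_rates(counts: Counts) -> Dict[str, float]:
    """Selection rate for each group: selected / applicants."""
    return {group: selected / total for group, (selected, total) in counts.items()}


def impact_ratios(counts: Counts) -> Dict[str, float]:
    """Each group's selection rate divided by the highest group's rate."""
    rates = selection_rates(counts)
    top = max(rates.values())
    if top == 0:
        raise ValueError("no one was selected, so impact ratios are undefined")
    return {group: rate / top for group, rate in rates.items()}


def passes_four_fifths(counts: Counts, threshold: float = 0.8) -> bool:
    """True if every group's impact ratio is at least the threshold (80%)."""
    return all(ratio >= threshold for ratio in impact_ratios(counts).values())


# Made-up numbers: group_a is selected at 60%, group_b at 30%.
example = {"group_a": (48, 80), "group_b": (12, 40)}
print(impact_ratios(example))
print(passes_four_fifths(example))
```

In this made-up example, group_b's selection rate is half that of group_a, well under the 80% threshold, so the check fails.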

Companies have been operating under these rules, which are administered by the federal Equal Employment Opportunity Commission (EEOC), for about half a century. But it’s only in the past couple of decades that they’ve had recruitment and talent management software to help with employment decisions, such as hiring, promotion and career development, that expose them to disparate impact challenges. With AI now automating more of the data analysis and decision-making, organizations are anxious to know whether the technology is helping or hurting.

But investigating the algorithms used in hiring is hard, according to Alex Engler, a fellow at The Brookings Institution, a Washington, D.C., think tank. Engler is a data scientist who has sharply criticized AI-based hiring tools and algorithmic auditing.

“You’re not working necessarily with [vendors] that always want to tell you exactly what’s going on, partly because they know it opens them up to criticism and because it’s proprietary and sort of a competitive disadvantage,” Engler said.

How AI can introduce bias into hiring

It’s more than a little ironic that the controversy over hiring algorithms is swirling around vendors that hype AI as key to reducing hiring bias on the theory that machines are less biased than people.

But bias can creep into the AI in recruitment software if the data used to train machine learning algorithms is heavy with historically favored groups, such as white males. Analyzing performance reviews to predict the success of candidates can then disadvantage minorities who are underrepresented in the data.

Another AI technology, natural language processing (NLP), can stumble on heavily accented English. Facial analysis tools have been shown to misread the expressions of darker-skinned people, filtering out qualified candidates and ultimately producing a less diverse talent pool. The use of AI in automated interviewing to detect emotions and character traits has also come under fire for potential bias against minorities.

Few vendors have had to thread the needle on ethical use of AI in hiring more than HireVue Inc., a pioneer in using NLP and facial analysis in video interviews.


HireVue dropped facial analysis from new assessments in early 2020 after widespread criticism by Engler and others that the AI could discriminate against minorities. More recently, the software’s ability to analyze vocal tone, which Engler said holds similar risks, has come under scrutiny after customers expressed “nonspecific concerns,” according to HireVue CEO Kevin Parker. He said the company has decided to stop using that feature because it no longer has predictive value. It won’t be in new AI models and will be removed from older ones that come up for review.

HireVue will instead rely on NLP, which has improved so much that it can produce reliable transcripts and not be thrown by heavy accents, said Lindsey Zuloaga, HireVue’s chief data scientist. “It’s the words that matter — what you say, not how you say it.”

Steps already taken to mitigate bias

Three vendors interviewed for this article, HireVue, Modern Hire and Pymetrics, all say they regularly test their AI models for bias.

“We do a lot of testing and issue a pretty voluminous technical report to the customer, looking at different constituencies and the outcomes, and the work we’ve done if we discovered bias in the process,” Parker said.

Zuloaga described the bias mitigation process in more detail. “We have a lot of inputs that go into the model,” she said. “When we use video features, or facial action units [facial movements that count toward the score] or tonal things — any of those, just like language, will be punished by the model if they cause adverse impact.” Models are reviewed at least annually, she added.

Pymetrics Inc. takes a different approach to hiring, applying cognitive science to games designed to determine a candidate’s soft skills, though it also offers structured video interviews.

Pymetrics essentially uses reporting to monitor for bias, according to CEO Frida Polli.

“The best way to check if an algorithm has bias is to look at what the output is,” Polli said. “Think of it like emissions testing. You don’t actually need to know all of the nuances of a car engine … to know whether it’s above or below the allowed pollution level.”

Before releasing any algorithm it builds for an employer, Pymetrics shares a report to confirm that the algorithm is performing above the four-fifths rule — “so, basically free of bias,” Polli said. “We show them ahead of time that we are falling above that threshold, that the algorithm is performing above that before we ever use it.” Then, after the algorithm has “been in use for some period of time, we test it continually to ensure that it hasn’t fallen below that guidance,” she added. 
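
The kind of ongoing, output-only check Polli describes could look something like the following Python sketch. It is an illustration under stated assumptions, not Pymetrics' actual method: the reporting periods, group names, counts and function names are all hypothetical. It looks only at outcomes, in the spirit of the emissions-testing analogy, and flags any period in which the lowest group's selection rate falls below 80% of the highest group's.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: group name -> (number selected, number of applicants)
Counts = Dict[str, Tuple[int, int]]


def min_impact_ratio(counts: Counts) -> float:
    """Lowest group selection rate divided by the highest group selection rate."""
    rates = [selected / total for selected, total in counts.values() if total > 0]
    if not rates or max(rates) == 0:
        # Assumption: with no applicants or no selections there is no
        # measurable disparity, so the period is treated as passing.
        return 1.0
    return min(rates) / max(rates)


def flag_periods(history: Dict[str, Counts], threshold: float = 0.8) -> List[str]:
    """Return the periods whose minimum impact ratio is below the threshold."""
    return [
        period
        for period, counts in history.items()
        if min_impact_ratio(counts) < threshold
    ]


# Made-up outcome data for two reporting periods.
history = {
    "2021-Q1": {"group_a": (50, 100), "group_b": (45, 100)},  # ratio 0.9
    "2021-Q2": {"group_a": (50, 100), "group_b": (35, 100)},  # ratio 0.7
}
print(flag_periods(history))
```

Here only the second period would be flagged for review, because group_b's rate has slipped to 70% of group_a's.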

Figure 2. Algorithmic auditing steps. Algorithmic auditors typically follow these steps when analyzing AI hiring algorithms for bias and discrimination.

How algorithmic auditing can mitigate AI bias

Auditing algorithms could be the most practical way to verify that an AI tool avoids bias. It also lets vendors reassure customers without exposing their intellectual property (IP). But in its current form, algorithmic auditing is extremely labor intensive.

Jake Appel, chief strategist at O’Neil Risk Consulting & Algorithmic Auditing (ORCAA), explained the process. His company conducted an audit for HireVue that was released in January.

First, ORCAA identifies the use case for the technology. Next, it identifies stakeholders and asks them how an algorithm could fail them.

“We then do the work of translating their concerns into statistical tests or analytic questions that could be asked of the algorithm,” Appel said. If problems are detected, ORCAA suggests fixes, such as changing data sets or removing an algorithm.
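
As one illustration of what a concern translated into a statistical test might look like, the Python sketch below asks whether the gap between two groups' selection rates is larger than chance would plausibly explain. It uses a standard two-proportion z-test. This is an assumed example, not ORCAA's methodology, and the function name and counts are hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    selected_a: int, applicants_a: int, selected_b: int, applicants_b: int
) -> Tuple[float, float]:
    """Two-sided z-test for a difference between two selection rates.

    Returns (z statistic, p-value). A small p-value suggests the gap between
    the two groups' selection rates is unlikely to be due to chance alone.
    """
    rate_a = selected_a / applicants_a
    rate_b = selected_b / applicants_b
    # Pooled selection rate under the null hypothesis of no difference.
    pooled = (selected_a + selected_b) / (applicants_a + applicants_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / applicants_a + 1 / applicants_b))
    if std_err == 0:
        raise ValueError("everyone or no one was selected; the test is undefined")
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: 48 of 80 applicants selected in one group, 12 of 40 in another.
z, p = two_proportion_z_test(48, 80, 12, 40)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A test like this complements the four-fifths rule: the ratio shows how large a disparity is, while the p-value indicates whether a sample is big enough to take that disparity seriously.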

Vendors, academics and algorithmic auditors disagree on whether it’s necessary to see the program code to determine bias.

Mike Hudy, chief science officer at Modern Hire, a HireVue competitor in AI-based video interview and recruitment software, said transparency must be balanced against a vendor’s need to protect its IP — but claimed some vendors use that as an excuse not to share code. He said Modern Hire, which itself does…

