I then went on to work for multiple firms that placed a premium on candidates from Ivy League/top-tier (Stanford, Duke, etc.) schools.
This taught me that:
- There are pros and cons to any selection criteria.
- There are smart people everywhere. One of the smartest people I ever worked for spent several years in prison for drug dealing. He was on par with many of the Managing Directors I've worked for.
- There was a study where they asked big bank recruiters which school consistently produced people who were excellent employees 2-3 years out from hiring, and the answer was Penn State (not my alma mater).
- There used to be "manager's choice" hires where managers had 1 slot in a training program where they could select whoever they wanted. Sometimes that was terrible. Sometimes that person was top of their training program.
- Smart people are just as capable of creating problems as less intelligent people. In some ways they are better at it, especially if the incentives reward them for creating those problems.
alexpotato
I think this partially buries the lede:
"As a single hiring vendor comes to dominate screening for an industry, it may be more likely that candidates are shut out."
If we move to using just a small number of AI models to help do things like hiring, we will amplify biases and possibly completely lock out portions of the population. We need to be very careful when using AI systems to evaluate people in general -- not because they might be biased (which they might be), but because even a small bias, if used by virtually everyone, can be damning.
kenjackson
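A back-of-the-envelope sketch of the point about a small bias used by virtually everyone. It assumes each screener wrongly rejects a given qualified candidate with probability `p_reject`, and that a shared model returns the same verdict for every employer. The function name and the numbers are illustrative assumptions, not taken from the article.

```python
def prob_locked_out(p_reject: float, n_employers: int, shared_model: bool) -> float:
    """Probability that one candidate is rejected by every employer.

    Assumption: with a shared model, a single verdict is reused by all
    employers; otherwise each employer decides independently.
    """
    if shared_model:
        # One verdict, repeated everywhere: the error does not average out.
        return p_reject
    # Independent screeners: all of them must make the same mistake.
    return p_reject ** n_employers


# Hypothetical numbers: a 5% wrongful-rejection rate across 10 employers.
independent = prob_locked_out(0.05, 10, shared_model=False)
shared = prob_locked_out(0.05, 10, shared_model=True)
print(f"independent screeners: {independent:.2e}")
print(f"one shared model:      {shared:.2e}")
```

Under these assumptions the per-screener error rate is the same in both cases; what changes is whether the errors are correlated across employers.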
Did I miss the part of the article where they break down how they determined race? Is the algorithm blind to race? It looks like they specifically looked at 83k people applying to ~100 companies which notably were Fortune 500 companies. Could there simply be candidate discrepancies here? Hard for me to follow the full methodology but it doesn't necessarily seem either malicious or that well structured. Don't you need to have a control group of applicants who are similar on paper? To allege DISCRIMINATION is quite bold.
Definitely open to opposing or critical views
wand3r
Misleading title: the paper [0] does not mention any CV screening that might suggest racial or gender bias. It is purely about an assessment tool. No AI or LLMs.
I'm not saying AI is not biased, but this study does not prove that.
[0] https://arxiv.org/pdf/2605.27371
From the paper:
> Fig. 1. The pymetrics process.
> Stage 1: Applicants apply to positions.
> Stage 2: Applicants are directed to the pymetrics platform to play assessment games.
> Stage 3: pymetrics algorithms use applicant gameplay features to recommend 58.2% of applicants per position on average.
> Stage 4: Employers decide which applicants to interview or hire, typically rejecting applicants that were not recommended by pymetrics.
Oras
> We find that people who submit multiple applications to positions screened by the same algorithmic hiring vendor are more likely to be rejected from every position to which they apply than would be true if the companies made decisions statistically independently from one another. Ten percent of applicants who submit four applications are rejected from all the places to which they apply.
> Our research also found that this pattern does not appear to be the case in other circumstances. We analyzed data from the largest prior study of hiring decisions, which sent 83,000 applications to 108 Fortune 500 firms during the same time period as our study and did not focus on whether AI was used to make decisions. We found that the rate at which applicants were rejected from every firm they applied to in this data was no higher than what you’d expect if each company decided independently of the others.
It sounds like this study was using real-world applicants, and the other study they're comparing against was using synthetic applicants.
Consider the chance of being accepted as being composed of signal+bias+noise. Noise is random. Signal is a per-applicant value, and what's meant to be measured. Bias is a per-group value, and an artifact of the measuring process.
If acceptance/rejection is independent between positions applied for (as in the synthetic applicant study), that suggests that it's random or composed entirely of noise; i.e. there is no signal; i.e. the applicants are all equally qualified.
If acceptance/rejection is correlated, that means there is some nonzero amount of (signal+bias). But real-world applicants are not all identical, so there should be some amount of signal. So you can't just assume zero signal in order to infer that there must be bias.
tbrownaw
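The signal+bias+noise argument above can be checked with a toy simulation. This is a sketch under stated assumptions: scores are `signal + noise` with Gaussian draws, bias is set to zero, the acceptance threshold is 0, and every name and parameter is hypothetical rather than taken from the paper.

```python
import random


def all_rejected_rate(signal_sd, n_applicants=20000, n_positions=4, seed=0):
    """Return (observed, baseline) rates of being rejected everywhere.

    observed: share of applicants rejected by all positions.
    baseline: what independence would predict, i.e. the marginal
              rejection rate raised to the power n_positions.
    """
    rng = random.Random(seed)
    total_rejections = 0
    rejected_everywhere = 0
    for _ in range(n_applicants):
        # Per-applicant signal, shared by all of this applicant's applications.
        signal = rng.gauss(0.0, signal_sd)
        # Each position adds its own independent noise; no group bias at all.
        rejections = sum(
            (signal + rng.gauss(0.0, 1.0)) < 0.0 for _ in range(n_positions)
        )
        total_rejections += rejections
        rejected_everywhere += rejections == n_positions
    marginal = total_rejections / (n_applicants * n_positions)
    observed = rejected_everywhere / n_applicants
    baseline = marginal ** n_positions
    return observed, baseline


# Identical applicants (no signal) versus applicants who differ in quality.
print("no signal:  ", all_rejected_rate(signal_sd=0.0))
print("with signal:", all_rejected_rate(signal_sd=1.0))
```

In this toy setup the no-signal case should sit close to the independence baseline, while the with-signal case should exceed it even though bias is zero. That is the commenter's point: correlated rejections alone do not separate signal from bias.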
Anyone who’s done hiring wouldn’t be shocked by this:
> We find applicants are more likely to be rejected from every position they apply to than would be predicted by the baseline of each position making statistically independent decisions.
Obviously a rejected resume is more likely to be rejected by every other employer and an accepted resume is more likely to be accepted by every other employer. Like online dating, most employers are looking for some baseline indicators that you are going to be successful and stable.
daft_pink
The European Union passed the Artificial Intelligence Act, which classifies:
> High-risk – AI applications that are expected to pose significant threats to health, safety, or the fundamental rights of persons. Notably, AI systems used in health, education, recruitment, critical infrastructure management, law enforcement or justice. They are subject to quality, transparency, human oversight and safety obligations.
That's pretty common-sense legislation to me.
alain94040
"Cards held by African-American sellers sold for approximately 20% ($0.90) less than cards held by Caucasian sellers, and the race effect was more pronounced in sales of minority player cards."
ortusdux
> To measure adverse impact, we apply the EEOC’s “four-fifths rule,” which flags a position when one group is recommended at less than 80% of the rate of the most-recommended group
That seems like a nonsensical way to measure racial discrimination. What could justify it?
dash2
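For reference, the four-fifths check as described in the quoted sentence is simple arithmetic. The sketch below assumes per-group counts of applicants and recommendations are available; the function name, group labels, and counts are made up for illustration and are not the paper's data or code.

```python
def four_fifths_flags(recommended, applied, threshold=0.8):
    """Flag groups whose recommendation rate falls below `threshold`
    times the rate of the most-recommended group.

    recommended, applied: dicts mapping group label -> count (hypothetical).
    """
    rates = {group: recommended[group] / applied[group] for group in applied}
    top_rate = max(rates.values())
    if top_rate == 0:
        # Nobody was recommended, so there is no reference rate to compare to.
        return {group: False for group in rates}
    return {group: rates[group] / top_rate < threshold for group in rates}


# Made-up counts: group_a is recommended at 60%, group_b at 40%.
flags = four_fifths_flags(
    recommended={"group_a": 60, "group_b": 40},
    applied={"group_a": 100, "group_b": 100},
)
print(flags)
```

Here `group_b`'s rate is about 67% of `group_a`'s, which is under the 80% line, so the position would be flagged. The check compares rates only; it says nothing about why the rates differ, which is what the question above is getting at.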
Some job application websites I've seen actually have a yes or no option to consent to AI review that they claim is to simply assist HR and not actually screen you. I always select no. There is no way that selecting yes would ever be in my interest. I'm sorry, I'm going to force a real human to look at my stuff if I still can.
asdff