Big Data and Discrimination
The NYT has an interesting report on the new wave of startups using “big data” to assess individual creditworthiness. The story, however, is too cute by half:
By law, lenders cannot discriminate against loan applicants on the basis of race, religion, national origin, sex, marital status, age or the receipt of public assistance. Big-data lending, though, relies on software algorithms largely working on their own and learning as they go.
The danger is that with so much data and so much complexity, an automated system is in control. The software could end up discriminating against certain racial or ethnic groups without being programmed to do so.
The data scientists focus on finding reliable correlations in the data rather than trying to determine why, for instance, proper capitalization may be a hint of creditworthiness.
The problem with these “reliable correlations” isn’t that they’re wrong; it’s that they are virtually guaranteed to correlate with race, religion, national origin, sex, marital status, age, and the receipt of public assistance. Evaluation criteria like whether people capitalize names on forms are seemingly neutral, but there is a reason that the big credit bureaus don’t use them. The predictable consequence would be systematically higher borrowing costs for disadvantaged groups, and the end result would look a lot like discrimination based on suspect categories. As a society, we have generally decided that access to credit should be more equitable, even if that makes credit risk somewhat harder to forecast.
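To make the proxy argument concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the two groups, the counts, and the approval rule are assumptions, not figures from the NYT story or from any real lender. It only shows that a rule which never looks at group membership can still produce very different approval rates by group when its "neutral" input is unevenly distributed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    group: str         # protected attribute; the scoring rule never reads it
    capitalizes: bool  # seemingly neutral proxy feature


def build_population():
    # Hypothetical counts: the proxy is simply more common in group A.
    people = []
    people += [Applicant("A", True)] * 80
    people += [Applicant("A", False)] * 20
    people += [Applicant("B", True)] * 40
    people += [Applicant("B", False)] * 60
    return people


def approve(applicant):
    # The rule looks only at the proxy feature, never at the group.
    return applicant.capitalizes


def approval_rate_by_group(people):
    # Audit step: group labels are used here only to measure the outcome.
    totals, approved = {}, {}
    for p in people:
        totals[p.group] = totals.get(p.group, 0) + 1
        approved[p.group] = approved.get(p.group, 0) + int(approve(p))
    return {g: approved[g] / totals[g] for g in totals}


rates = approval_rate_by_group(build_population())
print(rates)  # {'A': 0.8, 'B': 0.4}
```

With these invented numbers, group B is approved half as often as group A even though the rule is "blind" to group. A real scoring model would be far more complicated, but the mechanism would be the same.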
The startups featured here are trying to do an end run around these anti-discrimination laws. Which is fine, I suppose. They are small outfits and so can embrace tactics that Equifax etc. would shun as unacceptably risky. My guess is that the DOJ will largely ignore them as long as they stay confined to niche markets, but would intervene in a big way if they became standard credit scorers. However, I think reporters should be clear about the issue and the stakes involved. Discriminatory access to capital isn’t some unforeseen side-effect of better credit-scoring based on “big data”; it’s the entire point.