On Being an Outlier (www.goethe.de)

submitted 7 months ago by JoBo@feddit.uk to c/technology@lemmy.world

Proponents of AI and other optimists are often ready to acknowledge the numerous problems, threats, dangers, and downright murders enabled by these systems to date. But they also dismiss critique and assuage skepticism with the promise that these casualties are themselves outliers — exceptions, flukes — or, if not, they are imminently fixable with the right methodological tweaks.

Common practices of technology development can produce this kind of naivete. Alberto Toscano calls this a “Culture of Abstraction.” He argues that logical abstraction, core to computer science and other scientific analysis, influences how we perceive real-world phenomena. This abstraction away from the particular and toward idealized representations produces and sustains apolitical conceits in science and technology. We are led to believe that if we can just “de-bias” the data and build in logical controls for “non-discrimination,” the techno-utopia will arrive, and the returns will come pouring in. The argument here is that these adverse consequences are unintended. The assumption is that the intention of algorithmic inference systems is always good — beneficial, benevolent, innovative, progressive.

Stafford Beer gave us an effective analytical tool to evaluate a system without getting sidetracked by arguments about intent rather than its real impact. This tool is called POSIWID and it stands for “The Purpose of a System Is What It Does.” This analytical frame provides “a better starting point for understanding a system than a focus on designers’ or users’ intention or expectations.”

[–] ozymandias117@lemmy.world 1 points 7 months ago (1 children)

after you’ve controlled for all the things that cause the gender pay gap

Isn’t that a continuation of “why the outlier was culled”?

Putting more emphasis on how the data set is selected, while hard, would be very useful.

[–] JoBo@feddit.uk 2 points 7 months ago (1 children)

Isn’t that a continuation of “why the outlier was culled”?

Not sure I follow, but I think the answer is "no".

If you control for all the causes of a difference, the difference will disappear. Which is fine if you're looking for causal factors which are not already known to be causal factors, but no good at all if you're trying to establish whether or not a difference exists.

It's really quite difficult to ask a coherent question with real-world data from the messy, complicated reality of human beings.

A simple example:

Women are more likely to die from complications after a coronary artery bypass.

But if you include body surface area (a measure of body size) in your model, the difference between men and women disappears.

And if you go the whole hog and measure vein size, the importance of body size disappears too.

And, while we can never do an RCT to prove it, it makes perfect sense that smaller veins would increase the risk for a surgery which involves operating on blood vessels.

None of that means women do not, in fact, have a higher risk of dying after coronary artery bypass surgery. Collect all the data which has ever existed and women will still be more likely to die from the surgery. We have explained the phenomenon and found what is very likely to be the direct cause of higher mortality. Being a woman just makes you more likely to have that risk factor.
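A minimal sketch of the arithmetic behind this example, in Python. The probabilities are made up for illustration (they are not clinical data), and the names `P_SMALL_VEINS`, `P_DEATH`, `marginal_risk` and `stratified_risk` are hypothetical. It assumes vein size is the only thing that affects mortality, and that sex only changes how likely someone is to have small veins.

```python
from fractions import Fraction as F

# Made-up illustrative numbers, NOT real clinical data.
# P(small veins | sex): the only place sex enters the model.
P_SMALL_VEINS = {"women": F(60, 100), "men": F(20, 100)}
# P(death | vein size): identical for both sexes by construction.
P_DEATH = {"small": F(5, 100), "large": F(2, 100)}


def marginal_risk(sex):
    """Overall risk for one sex, averaging over vein size (no adjustment)."""
    p_small = P_SMALL_VEINS[sex]
    return p_small * P_DEATH["small"] + (1 - p_small) * P_DEATH["large"]


def stratified_risk(sex, vein):
    """Risk within one vein-size stratum; sex has no direct effect here."""
    return P_DEATH[vein]


# Question 1: does a difference between women and men exist?
crude_gap = marginal_risk("women") - marginal_risk("men")

# Question 2: is there a difference left once vein size is held fixed?
adjusted_gaps = {
    vein: stratified_risk("women", vein) - stratified_risk("men", vein)
    for vein in P_DEATH
}

print("risk, women:", float(marginal_risk("women")))
print("risk, men:  ", float(marginal_risk("men")))
print("crude gap:  ", float(crude_gap))
print("gap within each vein-size stratum:",
      {vein: float(gap) for vein, gap in adjusted_gaps.items()})
```

Under these assumed numbers the crude gap is 1.2 percentage points, while the gap within each vein-size stratum is exactly zero. Both results are correct; they answer different questions.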

It is rare that the answer is as neat and simple as this. It is very easy to ask a different question from the one you thought you were asking (or pretend to be answering one question when you answered another).

You can't just throw masses of data into a pot and expect sensible answers to come out. This is the key difference between statisticians and data scientists. And, not to throw shade on data scientists, they often end up explaining to the world that oestrogen makes people more likely to die from complications of coronary artery bypass surgery.

[–] ozymandias117@lemmy.world 1 points 7 months ago (1 children)

Maybe it’s a crude interpretation, but over-controlling for all the causes of a change and removing outliers from the data used to train these AI models seem like similar issues when trying to actually understand the data.

[–] JoBo@feddit.uk -1 points 7 months ago (1 children)

The data cannot be understood. These models are too large for that.

Apple says it doesn't understand why its credit card gives lower credit limits to women than men even if they have the same (or better) credit scores, because they don't use sex as a datapoint. But it's freaking obvious why, if you have a basic grasp of the social sciences and humanities. Women were not given the legal right to their own bank accounts until the 1970s. After that, banks could be forced to grant them bank accounts but not to extend the same amount of credit. Women earn and spend in ways that are different, on average, to men. So the algorithm does not need to be told that the applicant is a woman, it just identifies them as the sort of person who earns and spends like the class of people with historically lower credit limits.

Apple's 'sexist' credit card investigated by US regulator

Garbage in, garbage out. Society has been garbage for marginalised groups since forever and there's no way to take that out of the data. Especially not big data. You can try but you just end up playing whack-a-mole with new sources of bias, many of which cannot be measured well, if at all.
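A toy sketch of that proxy effect, in Python. Every record is invented, and the names (`history`, `fit_mean_by_pattern`, `mean_prediction`, the spending patterns "A" and "B") are hypothetical. The "model" is just the average historical limit per spending pattern, a deliberately crude stand-in for whatever a real lender uses. It never reads the sex column.

```python
from collections import defaultdict

# Invented historical records: (sex, spending_pattern, credit_limit).
# Assumption: past limits were set unequally, 1000 for women, 2000 for men,
# and spending pattern "A" is more common among women, "B" among men.
history = (
    [("woman", "A", 1000)] * 3 + [("woman", "B", 1000)]
    + [("man", "B", 2000)] * 3 + [("man", "A", 2000)]
)


def fit_mean_by_pattern(records):
    """'Train' a sex-blind model: mean historical limit per spending pattern."""
    limits = defaultdict(list)
    for _sex, pattern, limit in records:  # the sex field is deliberately ignored
        limits[pattern].append(limit)
    return {pattern: sum(v) / len(v) for pattern, v in limits.items()}


model = fit_mean_by_pattern(history)


def mean_prediction(records, sex):
    """Average predicted limit for one sex; sex is used only to audit the output."""
    preds = [model[pattern] for s, pattern, _limit in records if s == sex]
    return sum(preds) / len(preds)


print("model:", model)
print("mean predicted limit, women:", mean_prediction(history, "woman"))
print("mean predicted limit, men:  ", mean_prediction(history, "man"))
```

With these invented numbers the sex-blind model still predicts a lower average limit for women (1375 against 1625), because spending pattern stands in for sex. Dropping the column does not drop the history.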

[–] ozymandias117@lemmy.world 2 points 7 months ago (1 children)

You are pointing out specific biases that we already know about. The article you posted seems to propose using the data to find our unknown biases as well.

[–] JoBo@feddit.uk 0 points 7 months ago

It's asking why we don't use it for that purpose, not suggesting that there is anything easy about doing so. I don't know how you think science works, but it's not like that.