Nature: AI generates covertly racist decisions about people based on their dialect (www.nature.com)

submitted 9 months ago by antifuchs@awful.systems to c/techtakes@awful.systems

Got the pointer to this from Allison Parrish who says it better than I could:

it's a very compelling paper, with a super clever methodology, and (i'm paraphrasing/extrapolating) shows that "alignment" strategies like RLHF only work to ensure that it never seems like a white person is saying something overtly racist, rather than addressing the actual prejudice baked into the model.

L0rdMathias@sh.itjust.works 3 points 9 months ago

Interesting results, interesting insights, neat takeaways and argumentation.

It's unfortunate they only tested models that were trained on SAE and they didn't have a control group of language models in other dialects. Seems like a huge oversight.

I wonder how this would play out with a model that has been trained on AAE, another non-SAE dialect, or even one trained in English but optimized for a non-English language.