LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 6 days ago

LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users

bunnossin [she/her, it/its]@hexbear.net · 6 days ago

I knew AIs were racist in the sense that they’re biased towards generating white people, but what the fuck?

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 6 days ago

These AIs have American values firmly embedded in them it turns out.

JoeByeThen [he/him, they/them]@hexbear.net · edit-2 6 days ago

I mean, have these tests been run on deepseek? The base of deepseek was originally shortcutted by using chatgpt conversations, wasn’t it? A lot of this would seem inherent to the culture of the western internet just being mass processed into a corpus.

Also, overall but can’t say I’m surprised.

Edit: I should note I haven’t had time to read the study yet, I’m responding purely to your commentary.

SuperZutsuki [they/them]@hexbear.net · 6 days ago

What really annoys me about deepseek (the web version) is that it will just refuse to give you any information on things like the June 4 Incident and the Uyghur “Genocide”. It’s really disappointing that a Chinese AI model is programmed to never give info about these things rather than dispel the Western propaganda. I guarantee thousands of libs have asked it about the Incident and saw the refusal to give information as evidence of a see see pee cover-up and became further entrenched in their belief that thousands were killed at Tiananmen Square.

Carl [he/him]@hexbear.net · 6 days ago

Zhipu AI (which uses GLM) is the same way: get too close to a topic that’s controversial in China and it closes the link. I was talking to it about cults when I noticed this for the first time. Whenever its web search encountered any results about the Falun Gong it would trigger the safety stop and I’d have to try again. The same thing happens if you ask about Communist history: if its web search puts Western propaganda about Mao and co into the context, it shuts down. In that case I think it was China’s “Martyrs and Heroes Law” prompting the stop, since if anything defamatory enters the model’s context it might reproduce that information, so the safety stop triggers immediately.

I think it’s because the Chinese government has been clear that LLM providers are responsible for their LLMs’ outputs, so Chinese-based companies make their public models tread very carefully around sensitive topics.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 6 days ago

The difference is that DeepSeek has open weights and anybody can download and run the model themselves. And you can tune the model any way you like, so even if DeepSeek had some baked in biases, anybody can publish a new version without them. That’s why developing this stuff in the open is so important.

JoeByeThen [he/him, they/them]@hexbear.net · 6 days ago

Sure, but can you fine tune the culture out of it without the whole base (I forgot the proper word, sorry) collapsing? The training data isn’t open, right? Like while I totally agree with you about the importance of openness, this shit is coming from the training data and our shit culture it was derived from.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 6 days ago

Yes, you absolutely can. That’s precisely what LoRAs are for. You can completely change the way the model responds by adding a layer on top. All the core knowledge stays the same. I’ve actually done this myself. I rented some time on runpod to train a LoRA on Lovecraft that I applied to a base Qwen model.
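For anyone who hasn’t looked under the hood, here’s a rough sketch of what “adding a layer on top” means mechanically. This is a toy in plain Python, not the code from an actual training run: the class name `LoRALinear`, the `alpha` scaling and the tiny sizes are all assumptions made up for illustration. The idea is that the base weight matrix stays frozen and the only trainable parts are two small low-rank matrices whose product gets added to the base output.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Toy linear layer with a low-rank adapter (hypothetical, for illustration).

    output = W x + (alpha / rank) * B (A x)
    W is frozen; only A and B would be updated during fine-tuning.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen base weights, never modified here
        self.scale = alpha / rank
        # A starts small and random, B starts at zero, so the adapter
        # contributes nothing until it has been trained.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
layer = LoRALinear(identity, rank=1)
x = [1.0, 2.0, 3.0, 4.0]
before = layer.forward(x)  # adapter is a no-op while B is all zeros
layer.B[0][0] = 1.0        # stand-in for a training update to the adapter
after = layer.forward(x)   # first output shifts, base weights are untouched
```

With rank 1 on a 4x4 layer the adapter has 8 trainable numbers against 16 in the base matrix. On a real model the ratio is far smaller, which is why renting a bit of GPU time is enough.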

JoeByeThen [he/him, they/them]@hexbear.net · 6 days ago

OVERFITTING! The word I’m thinking of is overfitting. Lol and yeah I swear I know what a LoRA is, but I don’t think you have a chance in hell of using a LoRA to consistently remove cultural discrimination from a model. I very much think that’s wishful thinking. You’d be playing whack-a-mole and then you’re still hoping that you don’t introduce some ‘stop talking about gremlins’ type version of some asshole that doesn’t believe racism exists because America had a black president. Lol.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 6 days ago

I think at some point you can be fairly sure that the model performs well enough. And the simplest thing it can do is literally just act as a translator layer on top of the model. So, if you give a query, it’ll reformulate it in a way the model is known to respond well to. You can do a random sample test to see that you’re generally getting the results you expect too.

At the end of the day, models shouldn’t be treated like oracles in the first place. They’re useful tools for pointing you in the right direction or working through a problem. But it should always be the human making the decision in the end, and doing their own due diligence to verify the information.
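To make the translator-layer and random-sample idea concrete, here’s a minimal sketch. Everything in it is a stand-in: `toy_model`, `strip_persona`, `PERSONA_MARKER` and the pass criterion are hypothetical placeholders, and in practice the rewrite step would be a tuned model rather than a string replace. It only shows the shape of the approach: wrap the model so every query is reformulated first, then score a random sample of queries.

```python
import random

PERSONA_MARKER = "[persona]"  # hypothetical tag standing in for persona cues


def toy_model(query):
    """Stand-in model that answers worse when it sees persona cues."""
    return "vague answer" if PERSONA_MARKER in query else "detailed answer"


def strip_persona(query):
    """Stand-in rewrite step that reformulates the query in a neutral form."""
    return query.replace(PERSONA_MARKER, "").strip()


def with_translator(model, rewrite):
    """Wrap a model so every query is rewritten before the model sees it."""
    def wrapped(query):
        return model(rewrite(query))
    return wrapped


def spot_check(model, queries, is_acceptable, sample_size, seed=0):
    """Return the fraction of a random sample of queries that pass the check."""
    rng = random.Random(seed)
    sample = rng.sample(queries, min(sample_size, len(queries)))
    passed = sum(1 for q in sample if is_acceptable(q, model(q)))
    return passed / len(sample)


def is_detailed(query, answer):
    return answer == "detailed answer"


queries = [f"{PERSONA_MARKER} question {i}" for i in range(20)]
raw_rate = spot_check(toy_model, queries, is_detailed, sample_size=5)
wrapped_rate = spot_check(
    with_translator(toy_model, strip_persona), queries, is_detailed, sample_size=5
)
```

The spot check only tells you about the sample you drew, so it’s a sanity check on the wrapper, not proof that the bias is gone.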

soybeanis [they/them]@hexbear.net · 6 days ago

it “chose”

carpoftruth [any, any]@hexbear.net · 6 days ago

Tool developed by digesting English language works best when users give instructions in English. That tracks to me. Do the Chinese models do better in Chinese or are they effective across a broader array of languages?

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 6 days ago

It’s a lot worse than that. It’s not that the model doesn’t understand the question; it chooses the answer based on the persona it interacts with. It’s not a capability limitation.