- cross-posted to:
- hackernews@lemmy.bestiver.se
A Norwegian man said he was horrified to discover that ChatGPT outputs had falsely accused him of murdering his own children.
According to a complaint filed Thursday by European Union digital rights advocates Noyb, Arve Hjalmar Holmen decided to see what information ChatGPT might provide if a user searched his name. He was shocked when ChatGPT responded with outputs falsely claiming that he was sentenced to 21 years in prison as “a convicted criminal who murdered two of his children and attempted to murder his third son,” a Noyb press release said.
It’s AI. There’s nothing to delete but the erroneous response. There is no database of facts to edit. It doesn’t know fact from fiction, and the response is also very much skewed by the context of the query. I could easily get it to say the same about nearly any random name just by asking it about a bunch of family murders and then asking about a name it doesn’t recognize. It is more likely to assume that person is in the same category as the others, especially if one or more of the names have any association (real or fictional) with murder.
I don’t care why. That is still libel and it is illegal for good reason. If you can’t stop this in all cases, then your AI is, and should be, illegal.
None of the moneybags will listen, unfortunately. But I’m with you. The rollout of AI was extremely irresponsible. Just to make it profitable as quickly as possible.
Seems to me libel would require AI to have credibility, which it does not.
It’s a tool. Like most useful tools it can do harmful things. We know almost nothing about the provenance of this output. It could have been poisoned either accidentally or deliberately.
But above all, the problem is ignorant people believing the output of AI is truth. It’s pretty good at some things, but the more esoteric the knowledge, the less reliable it is. It’s best to treat AI as a storyteller. Yeah there are a lot of facts in there but when they don’t serve the story they can be embellished. I don’t see the harm in just acknowledging that and moving on.
I’m not a lawyer, but the most conclusive missing piece of what we commonly understand to be libel is that the information has to be published.
I thought about that.
The definition of publish could get a little murky here. Actually the best defense here is that, so far as we know, this was not disclosed to a third party by ChatGPT (that’s pretty flimsy, though, because it likely has no idea who it is talking to.)
I acknowledge there is some level of nuance here, which is why I come back to no one should have any expectation that AI will be factual. The disclaimers are everywhere. There is really no excuse for anyone to treat the output as gospel.
Meanwhile, AI vendors:
“AI will soon be the only way we access information and make decisions!”
Except it’s not libel. It’s a one time string of text generated exclusively for him. Literally no one would have known what it said if the guy didn’t get the exact thing he wants “deleted” published online for everyone to see. Now it’ll be linked to his name forever, but the llm didn’t do that.
Libel requires the claims to be published or broadcasted, so it isn’t. A predictive text algorithm strung some random words together, and the guy got offended.
It’s like suing because your phone keyboard autosuggested “is a murderer” as the next words after you wrote your name. Btw, I tried it a few times for lulz and managed to get it to write out “bluGill and the kids are going to get it on”, so I guess you can sue Google now?
I read it as they aren’t using libel as cause for their complaint but failure to comply with GDPR.
I have this gun machine that shoots in all directions randomly. I can’t predict it, so I can’t stop it from shooting you. So sorry. It’s uncontrollable.
Yeah but I can just ignore the bullets because they are nerf. And I have my own nerf guns as well.
I mean at some point any analogy fails, but AI is nothing like a gun.
They may seem like nerf when they first come out of the AI, but they turn into real bullets once they start filling people’s heads with convincing enough lies and falsehoods, and those people start wielding their own weapons against minorities, democracy, and the government. If the election of Trump 2.0 has not convinced you of the immense danger of disinformation and misinformation, I have literally no idea how anything could ever possibly get through to you.
That doesn’t really change anything. The internet is full of AI slop and just people outright lying. Nothing is reliable any more outside of the word of an actual expert.
This has been happening since before Trump. Hell Trump 45 was before the wave of truly capable AI.
AI doesn’t change this at all except people ought to know they are getting info from a bullshit source if they are getting it from AI themselves.
Even nerf bullets can hurt you if they’re shot at you in sufficient quantities.
Or speed. Some of the homebrew mods are ridiculous.
AI is a thing people choose to host, and they are responsible for the outcomes of its use. The internal workings and limitations of the machine do not make the owners less responsible.
Okay, so I agree with none of that, but you’re saying as long as we host our own AI or rent our own processing from the cloud we’re in the clear? I want to make sure that’s your fundamental argument because that leaves all open models in the clear and frankly I could be down with that. I like AI but I’m not a huge fan of AI companies.
So insurance companies use AI to screen claims.
It denies a claim for life saving intervention - person dies. Who is responsible for that? Historically it would be the insurance company - and worker. Would it be them or the AI company?
Psych screening tools were using it to pre screen calls.
AI tells the person to kill themselves. Who is at fault if they do it? A psych screener would lose their job and their license. What and who is impacted if AI does it?
QA check on a car or product is passed by AI but should have failed.
Thousands die before the recall. Who is at fault for it? The Company leveraging AI. Or the AI itself?
The company using AI for that shit is responsible. There is no responsible way to remove a human from the process. These aren’t reasonable uses of AI no matter how badly companies want to save money by not hiring.
I’m not sure you get my point.
If I’m providing a service, and that service is creating and publishing disparaging information about you, you should have recourse against me. I don’t get off the hook just because of the way I’ve set up the technology.
Right. Well if your service is a well-known bullshitter I wouldn’t give a fuck. That being said, I’d be happy to agree that AI should all be open source and self-hosted. I run local AI myself, but the quality isn’t there. I’d have to rent time on a big boy machine if the big players went away. That would be a little inconvenient because I’d want to have a whole bunch of requests queued up to use maximum power over minimum time and that’s not really how anyone uses AI.
Maybe I could share that rental with other AI enthusiasts… hmmm.
Maybe people need to learn that AI hallucinates
you misspelled “is fucking wrong all the goddamn time”
It would be more accurate to say that, rather than knowing anything at all, they have a model of the statistical relationship between a series of tokens and subsequent tokens: which words are apt to follow other words. Because the training set contains many true things, the words produced in response to queries often contain true statements and almost always contain statements that LOOK like true statements.
Since it has no inherent model of the world to draw on, only such statistical relationships, you should check anything important.
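To make the "which words are apt to follow other words" point concrete, here is a toy sketch in Python. It is a word-pair (bigram) counter, vastly simpler than a real LLM, and the corpus, function names, and seed are all made up for illustration. The point is only that a model built from nothing but follow-counts can emit a fluent sentence that never appeared in its training text.

```python
import random
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, length, rng):
    """Extend `start` by sampling each next word in proportion to its count."""
    out = [start]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:
            break  # nothing ever followed this word in training
        words = list(options.keys())
        weights = list(options.values())
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return " ".join(out)

# Hypothetical three-sentence "training set" (an assumption for this sketch).
corpus = (
    "the man was a teacher . "
    "the man was a father . "
    "the killer was a convicted criminal ."
)
model = train_bigrams(corpus)

# The model stores no facts, only counts of what followed what.
# After "a" it has seen "teacher", "father" and "convicted", so it can
# produce "the man was a convicted criminal ." even though no training
# sentence says that.
sample = generate(model, "the", 6, random.Random(0))
print(sample)
```

Nothing in `model` marks a sentence as true or false, which is why checking anything important against a real source is the only safeguard.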
you say more accurate but all I see is a very roundabout way of saying fucking wrong all the goddamn time
So then what’s the use of the program if it uses a bunch of energy to just make shit up?
sometimes you need a machine that makes things up according to a given specification.
Because it makes up things that are 99% correct and in some areas the 99% + verification and expansion can be superior time wise to the 100% manual route
What models are you seeing where things are 99% correct? Google’s search chat bot can’t even keep Windows vs Mac hotkey commands straight.
And when it hallucinates harmful things, protections need to be put onto the output.
Ok so explain particularly what this means
If you have a service, and that service is generating things that harm people, you should have to stop it.
If creating text is like shooting bullets, we should require a license for text editors.
You can pry Vim from my cold, dead hands!
Can’t exit it on your own?
The severity of the impact should not dictate whether a person is accountable for a thing they own, or not.
So, licenses for everything?
Anyway, we hold the person accountable who does (or rarely does not) do something, not the owner of a thing. Which is why a libel accusation makes 0 sense here.
The fact you chose to make your data storage unreadable, doesn’t relieve you of the responsibilities inherent to storing the data.
Throwing away my car key won’t protect me from paying parking tickets I accrue while being physically unable to move my car.
It’s not unreadable, it doesn’t exist.
The responses are just statistically what sounds vaguely like what you want to hear.
They can erase the chat responses, but that won’t stop it from generating it again.
Generative AI doesn’t start with facts and work from there. It’s just statistically what you want to hear.
It’s not unreadable, it doesn’t exist.
Then what do you mean trained AI models are?
The ai model is trained on data and encodes unknown parts of that data in its weights.
This is data storage. Unmanageable, almost unknowable data storage, but still data storage.
If it didn’t store data it couldn’t learn from its training.
You’re still placing more intent and facts into those processes than actually exist.
You can’t even get it to count how many letter p’s are in the word apple. At least not last time I tried.
That storage you’re talking about isn’t facts. It’s how sentences are structured and what they “mean”.
As for the output “meaning” it’s still just guessing what you want to hear. No facts involved.
You’re still placing more intent and facts into those processes than actually exist.
No? When they train AIs on data they lose control of that data. If the data is sensitive, they aren’t being responsible.
GPT models are as you say dumb statistical models, I agree. But in its weights are encoded ghost images of its training data. The model being dumb is not sufficient to make the data storing itself defensible in my opinion.
Sure, but are you suggesting they somehow encoded, falsely, that they were a murderer?
Because it’s very unlikely.
It fabricated this from nowhere. So there’s nothing to delete. Because it’s just a response to a prompt.
Well, here we are. We skipped using this tech for only search automation and leapfrogged to directly making shit up (once again).
To me it’s clear that these tools are primarily useful as bullshit generators, and I expect them to hallucinate and be inaccurate. But the companies trying to capitalize on the “AI” bubble are saying that these tools can be useful and accurate. I imagine OpenAI is going to have to invoke the Fox News defense in this case, and claim that “no reasonable person would take this seriously”.
Don’t use hallucinate to describe what it is doing, that is humanizing it and making the tech seem more advanced than it is. It is randomly mashing words together without understanding the meaning of any of them
The technical term was created to promote the misunderstanding that LLMs “think”. The “experts” want people to think LLMs are far more advanced than they actually are. You can add as many tokens to your context as you want - every model is still, fundamentally, a text generator. Humanizing it more than that is naive or deceptive, depending on how much money you have riding on the bubble.
You didn’t read the article I linked. The term came into use before LLMs were a thing, it was originally used in relation to image processing.
Thank you!
Leapfrogged? It never left. LLMs were made to make shit up.
It’s all hallucinations.
Some (many) just happen to be very close to factual.
It’s sad to see that the marketing of these tools has been so effective that few realize how they work and what they do.
It really is sad. I often hear, “I even asked ChatGPT and it said…” as if that means their response is valid. I’ve heard people say it who I thought would know better, too.
The number of times I’ve heard that by people expecting it to win them arguments is incredibly discouraging.
hallucinations
It’s called libel.
Surely you jest because it’s so clearly not if you understand how LLMs work (at the core it’s a statistical model - and therefore all approximation to a varying degree).
But great can come out of this case if it gets far enough.
Imagine the ilk of OpenAI, Google, Anthropic, XAI, etc. being forced to admit that an LLM can’t actually do anything but generate approximations of language. That these models (again LLMs in particular) produce approximations of language that are so good they’re often indistinguishable from the versions our brains approximate.
But at the core they cannot produce facts because the way they are made includes artificially injected randomness layered on-top of mathematically encoded values that merely get expressed as tiny pieces of language (tokens) - ones that happen to be close to each other in a massively multidimensional vector space.
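As a rough illustration of the "artificially injected randomness" part, here is a minimal Python sketch of temperature sampling, a common way to turn a model's per-token scores into a random choice. The token list and scores are invented for the example, and the function names are mine; it does not model the vector-space part of the description, and it is not any vendor's actual code.

```python
import math
import random

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, scores, temperature, rng):
    """Pick one token at random, weighted by its temperature-adjusted probability."""
    probs = softmax_with_temperature(scores, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Made-up candidate next tokens and made-up scores (assumptions for this sketch).
tokens = ["teacher", "father", "criminal"]
scores = [2.0, 1.5, 0.5]

cold = softmax_with_temperature(scores, 0.1)  # nearly always the top-scoring token
hot = softmax_with_temperature(scores, 2.0)   # unlikely tokens get a real chance

print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
print(sample_token(tokens, scores, 2.0, random.Random()))
```

At a low temperature the output is close to deterministic; at a higher one the low-scoring continuation gets picked a meaningful fraction of the time, with nothing in the procedure checking whether it is true.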
TLDR - they’d be forced to admit the emperor has no clothes and that’s a win for everyone (except maybe this one guy).
Also it’s worth noting I use LLMs for work almost daily and have studied them quite a bit. I’m not a hater on the tech. Only the capitalists trying to force it down everyone’s throat in such a way that we blindly adopt it for everything.
this is confusing. did you think I meant you’re engaging in libel against llms or something? that’s the only way I can make sense of your reply.
Really?
I read your reply as saying the output is (can be) libellous - which it cannot be because it is not based on a dataset which resolves to anything absolute.
Maybe we’re just missing each other - struggling to parse each others’ output. ;)
well I must be missing something because all I’m getting is that you’re saying it’s full of shit as a defense against libel.
Is it really him that it’s saying did this? I mean, I could look up my dad’s name and all I get are articles about a serial killer who just happened to have the same name; and that’s not generated by AI. Names aren’t usually unique identifiers.
There’s a list of names of people who have sued OpenAI, they often cause ChatGPT to shut down.
We should keep those names handy just in case cyber dogs are ever chasing us.
Certain names, including “David Mayer,” “Brian Hood,” “Jonathan Turley,” “Jonathan Zittrain,” “David Faber,” and “Guido Scorza,” cause ChatGPT to produce an error message and terminate the chat session, likely due to a hard-coded filter or privacy concerns.
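For what a "hard-coded filter" could look like in principle, here is a hypothetical Python sketch. How ChatGPT actually handles these names is not public in this thread, so the blocklist contents (two names taken from the list above), the function name, and the error text are all assumptions made for illustration.

```python
# Hypothetical blocklist; the two names are taken from the list above.
BLOCKED_NAMES = ["David Mayer", "Brian Hood"]

# Invented placeholder message, not the real product's wording.
ERROR_MESSAGE = "I'm unable to produce a response."

def filter_output(text, blocked=BLOCKED_NAMES):
    """Return the text unchanged unless it mentions a blocked name."""
    lowered = text.casefold()
    for name in blocked:
        if name.casefold() in lowered:
            return ERROR_MESSAGE  # whole response is dropped, not edited
    return text

print(filter_output("Here is a recipe for waffles."))
print(filter_output("Brian Hood is a mayor in Australia."))
```

A filter like this runs on the generated text after the fact. It can suppress a name, but it does nothing to the model's weights, so it also blocks every true statement about anyone who shares that name.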
Well now it will say that Arve Hjalmar Holmen is a twit who doesn’t understand how ChatGPT works and what to expect from it
Arve Hjalmar Holmen