Who is responsible then? Cuz the devs basically gotta let the AI go to town on many websites and documents for any sort of training set.
So you mean to say, you can’t blame the developers, because they just made a tool (one that scrapes data from everywhere possible), can’t blame the tool (don’t mind that AI is scraping all your data), and can’t blame the end users, because some dirty minded people search or post inappropriate things…?
So where’s the blame go?
First, you need to figure out exactly what it is that the “blame” is for.
If the problem is the abuse of children, well, none of that actually happened in this case so there’s no blame to begin with.
If the problem is possession of CSAM, then that’s on the guy who generated them, since they didn’t exist at any point before then. The trainers wouldn’t have needed to have any of that in the training set, so if you want to blame them you’re going to need to do a completely separate investigation into that; the ability of the AI to generate images like that doesn’t prove anything.
If the problem is the creation of CSAM, then again, it’s the guy who generated them.
If it’s the provision of general-purpose art tools that were later used to create CSAM, then sure, the AI trainers are in trouble. As are the camera makers and the pencil makers, as I mentioned sarcastically in my first comment.
You obviously don’t understand squat about AI.
AI only knows what has gone through its training data, both from the developers and the end users.
Hell, back in 2003 I wrote an adaptive AI for optical character recognition (OCR). I designed it for English, but also with a crude ability to learn.
I could have taught that thing hieroglyphics if I wanted to. But AI will never generate things that it’s never seen before.
Funny that AI has an easier time rendering inappropriate material than it does human hands…
Ha.
Yes, and as I’ve said repeatedly, it’s able to synthesize novel images from the things it has learned.
If you train an AI with pictures of green cars and pictures of red apples, it’ll be able to figure out how to generate images of red cars and green apples for you.
Exactly. And if you ask it for the opposite of an older MILF, then how does it know what younger ladies look like?
It’s possible to legally photograph young people. Completely ordinary legal photographs of young people exist, from which an AI can learn the concept of what a young person looks like.
The only example I can think of with what you said is just a couple brief innocent scenes from The Blue Lagoon.
Short of that, I don’t know of (nor care for any references to) any other legal public images or video of that sort.
I dunno, I’m just bumfuzzled how AI, whether public or private, could have sufficient information to generate such things these days.
Do a Google Image search for “child” or “teenager” or other such innocent terms and you’ll find plenty of them.
I think you’re underestimating just how well AI is able to learn basic concepts from images. A lot of people imagine these AIs as being some sort of collage machine that pastes together little chunks of existing images, but that’s not what’s going on under the hood of modern generative art AIs. They learn the underlying concepts and characteristics of what things are, and are able to remix them conceptually.
And conceptually, if I had never seen my cousin in the nude, I’d never know what young people look like naked.
No, that’s not a concept, that’s a fact. AI has seen inappropriate things, and it doesn’t fully know the difference.
You can’t blame the AI itself, but you can and should blame any and all users that have knowingly fed it bad data.
Is an image of a child inappropriate? Fully clothed, nothing going on.
Is the image of an adult engaging in sexual activity inappropriate?
Based on those two concepts, it can generate inappropriate child sexual imagery.
You may have done OCR work a while ago, but that is not the same type of machine learning that goes into typical generative AI systems in the modern world. It very much seems as though you are profoundly misunderstanding how this technology operates if you think it can’t generate a novel combination of previously trained concepts without a prior example.
I’m referring to the inappropriate photography and videos out there. Please learn to read.
You’re not the brightest spoon in the drawer are you?
“Naked” and “child” are two concepts it can learn and combine without needing to be taught “naked child”.
It does not need to see an example of every type of thing it can generate.
It can combine pornographic concepts learned in isolation with disparate, unrelated concepts.
It does not need to have been trained on child porn to generate child porn.
I haven’t and won’t attempt any testing on this, for obvious reasons. But at the same time, if any AI system out there can manage to generate images of pre-puberty private parts, then the training data must have included inappropriate material to be able to distinguish the differences.