• FaceDeer@fedia.io
      6 months ago

      3,226 suspected images out of 5.8 billion. About 0.00006%. And probably mislabeled to boot, or it would have been caught earlier. I doubt it had any significant impact on the model’s capabilities.

    • wandermind@sopuli.xyz
      6 months ago

      I know. So to confirm, you’re saying that you’re okay with AI generated CSAM as long as the training data for the model didn’t include any CSAM?

      • xmunk@sh.itjust.works
        6 months ago

        No, I’m not. I still have ethical objections, and I don’t believe CSAM could be generated without some CSAM in the training set. I think it’s generally problematic to sexually fantasize about underage persons, though I know that’s an extremely unpopular opinion here.

        • wandermind@sopuli.xyz
          6 months ago

          So why are you posting all over this thread about how CSAM was included in the training set if, in your opinion, that is ultimately irrelevant to the topic of the post and discussion: the morality of using AI to generate CSAM?

          • xmunk@sh.itjust.works
            6 months ago

            Because all over this thread are claims that AI CSAM can be generated without actual CSAM in the training data. We currently don’t have AI CSAM that is taint-free, and it’s unlikely we ever will, due to how generative AI works.

            • wandermind@sopuli.xyz
              6 months ago

              So at best we don’t know whether or not AI CSAM without CSAM training data is possible. “This AI used CSAM training data” is not an answer to that question. It is even less of an answer to the question “Should AI generated CSAM be illegal?” Just like “elephants get killed for their ivory” is not an answer to “should pianos be illegal?”

              If your argument is that yes, all AI CSAM should be illegal whether or not the training used real CSAM, then argue that point. Whether or not any specific AI used CSAM to train is an irrelevant non sequitur. A lot of what you’re doing now is replying to “pencils should not be illegal just because some people write bad stuff” with the equivalent of “this one guy did some bad stuff before writing it down”. That is completely unrelated to the argument being made.