OpenAI and Anthropic are ignoring an established rule that prevents bots scraping online content

IndustryStandard@lemmy.world · 9 months ago

lemmyvore@feddit.nl · 9 months ago

I’ve yet to understand how the hell they get away with “I don’t know how it works”. Either figure out how it works or stop using it, shithead. It’s software not magic beans.

There are lots of complicated fields out there, and none of them gets a pass for “I don’t know how my drugs work” or “I don’t know how my rockets work”. That’s absolutely ridiculous.

Balder@lemmy.world · 9 months ago

It’s just how machine learning has always been.

We only know a model’s behavior through testing, so we only understand it roughly in proportion to how much testing has been done. The model’s internals have always been a black box of numbers that individually mean nothing, and if you track which neurons fire here and there, it’ll look random, because it probably is.

Remember that machine learning models aren’t carefully designed. They’re just brute-force trained for a long time, with the numbers adjusted again and again depending on whether the results land closer to or further from the desired output.
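
To make the "adjust the numbers again and again" part concrete, here is a rough toy sketch, not how any real lab trains anything. It assumes plain gradient descent on a two-number model; the function name `train`, the learning rate, the step count, and the made-up data are all hypothetical choices for illustration.

```python
import random


def train(data, steps=2000, lr=0.01, seed=0):
    """Fit y = w*x + b by nudging w and b to shrink the squared error."""
    rng = random.Random(seed)
    # Start from arbitrary numbers; nobody "designs" these values.
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    n = len(data)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y   # how far the output is from the target
            grad_w += 2 * err * x   # how the squared error changes with w
            grad_b += 2 * err       # how the squared error changes with b
        # Move each number a little in the direction that reduces the error.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Made-up training data following the hidden rule y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(-5, 6)]
w, b = train(data)
print(w, b)
```

With only two numbers you can read the result off directly, since they end up near the slope and intercept of the hidden rule. Scale that same loop up to billions of numbers and there is no such readable meaning per number, which is the black-box problem described above.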

archive.ph