How to block AI Crawler Bots using robots.txt file (www.cyberciti.biz)
Posted by Cynicus Rex@lemmy.ml to Privacy@lemmy.ml · English · 1 year ago · 46 comments
Da Bald Eagul@feddit.nl · 1 year ago: That is what they meant, yes. The title promises a block, completely preventing crawlers from accessing the site. That is not what is delivered.
JackbyDev@programming.dev · 1 year ago: Is it a lie or a simplification for beginners?
thanks_shakey_snake@lemmy.ca · 1 year ago: Lie. Or at best, dangerously wrong. Like saying “Crosswalks make cars incapable of harming pedestrians who stay within them.”
JackbyDev@programming.dev · 1 year ago: It’s better than saying something like “there’s no point in robots.txt because bots can disobey it” though.
thanks_shakey_snake@lemmy.ca · 1 year ago: Maybe? But it’s not like that’s the only alternative thing to say, lol
ReversalHatchery@beehaw.org · 1 year ago (edited): Is it, though? I mean, robots.txt is the Do Not Track of the opposite side of the connection.
Eager Eagle@lemmy.world · 1 year ago: the word disallow is right there
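To illustrate the point argued in the thread, that a robots.txt rule is a request rather than a block, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt content, the `GPTBot` and `SomeOtherBot` user-agent names, the `example.com` URL, and the `is_allowed` helper are all assumptions chosen for illustration; they are not taken from the linked article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this sketch):
# it asks one crawler to stay away and allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return what robots.txt *asks* of this user agent for this URL.

    This check runs inside the crawler. Nothing on the server side
    enforces the answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A well-behaved crawler calls something like this before fetching
# and skips the page when the result is False.
polite_bot_may_fetch = is_allowed("GPTBot", "https://example.com/page")
other_bot_may_fetch = is_allowed("SomeOtherBot", "https://example.com/page")

# A crawler that never performs this check can still request the page;
# the Disallow line does not stop the HTTP request.
```

The `Disallow` line only has an effect if the crawler chooses to run a check like this and honor the result. Actually refusing requests would need something enforced on the server, which robots.txt does not provide.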