• RelativeArea1@sh.itjust.works
    13 days ago

    This is such a fucking stupid situation: we finally got somewhat faster internet, and these bots messing with each other are hogging the bandwidth.

    • dual_sport_dork 🐧🗡️@lemmy.world
      13 days ago

      Especially since the solution I cooked up for my site works just fine and took a lot less work. It simply identifies the incoming requests from these damn bots (which is not difficult, since they ignore all directives and sanity and try to slam your site with like 200+ requests per second, which makes 'em easy to spot) and IP bans them. This is considerably simpler, and doesn’t require an entire nuclear-plant-powered AI to combat the opposition’s nuclear-plant-powered AI.

      In fact, anybody who doesn’t exhibit a sane crawl rate gets blocked from my site automatically. For a while, most of them were coming from Russian IP address ranges for some reason. These days Amazon is the worst offender; I guess their Rufus AI or whatever the fuck it is tries to pester other retail sites to “learn” about products rather than sticking to its own domain.
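
A minimal sketch of that kind of automatic block, assuming a per-IP sliding window. The class name `CrawlRateBanner`, its parameters, and the one-second window are hypothetical and not taken from the site described above; the 200-request threshold only mirrors the figure mentioned earlier:

```python
import time
from collections import defaultdict, deque


class CrawlRateBanner:
    """Ban any IP that sends more than max_requests within window_seconds."""

    def __init__(self, max_requests=200, window_seconds=1.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.banned = set()

    def allow(self, ip, now=None):
        """Return True if the request should be served, False if the IP is banned."""
        if ip in self.banned:
            return False
        now = time.monotonic() if now is None else now
        window = self.hits[ip]
        window.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while window and now - window[0] > self.window_seconds:
            window.popleft()
        if len(window) > self.max_requests:
            # Crawl rate is not sane: ban the IP and forget its history.
            self.banned.add(ip)
            del self.hits[ip]
            return False
        return True
```

In practice something like this would probably sit in front of the request handler, with the banned set pushed out to a firewall rule rather than held in memory. That part is left out here.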

      Fuck 'em. Route those motherfuckers right to /dev/null.

      • Buelldozer@lemmy.today
        13 days ago

        “and try to slam your site with like 200+ requests per second”

        Your solution would do nothing to stop the crawlers that are operating at 10ish rps. There are ones out there operating at a mere 2 rps, but when multiple companies are doing it at the same time 24x7x365 it adds up.
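
To put rough numbers on "it adds up", here is a back-of-the-envelope calculation. The 2 and 10 rps rates come from the comment above; the count of five simultaneous crawlers is purely an assumption for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def requests_per_day(rps, crawlers=1):
    """Total daily requests from `crawlers` bots, each at a steady `rps`."""
    return rps * crawlers * SECONDS_PER_DAY


slow = requests_per_day(2)                  # one crawler at 2 rps
medium = requests_per_day(10)               # one crawler at 10 rps
combined = requests_per_day(2, crawlers=5)  # hypothetical: five crawlers at 2 rps

print(slow)      # 172800
print(medium)    # 864000
print(combined)  # 864000
```

None of these would ever come near a 200 rps ban threshold, yet under that assumption five "polite" 2 rps crawlers together generate the same daily load as one 10 rps crawler.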

        Some incredibly talented people have been battling this since last year and your solution has been tried multiple times. It’s not effective in all instances and can require a LOT of manual intervention and SysAdmin time.

        https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/

        • confusedbytheBasics@lemmy.world
          13 days ago

          Yep. After you ban all the easy-to-spot ones, you’re still left with far too many hard-to-ID bots. At least if your site is popular and large.