• 5opn0o30
    21 up · 5 down · 2 months ago

    Wow. A lot of cynicism here. The AI bots are (currently) honoring robots.txt, so this is an easy way to tell them to go away. Honeypot URLs and blocking published IP ranges can serve as a second line of defense. They’re no different from other bots that have existed for years.
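
For anyone who wants to see what the robots.txt approach looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content is an assumption made up for illustration: `GPTBot` and `CCBot` are just two examples of published crawler tokens, and `/trap/` is a hypothetical path. It only shows how a compliant crawler would interpret the rules; it does nothing against bots that ignore the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only. GPTBot and CCBot are example
# crawler tokens; /trap/ is an assumed path that no well-behaved bot should visit.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /trap/
"""


def allowed(user_agent: str, path: str) -> bool:
    """Return True if a crawler that honors ROBOTS_TXT may fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    print(allowed("GPTBot", "/index.html"))       # named AI crawler: disallowed everywhere
    print(allowed("Mozilla/5.0", "/index.html"))  # generic agent: allowed
    print(allowed("Mozilla/5.0", "/trap/page"))   # generic agent: disallowed under /trap/
```

The wildcard entry goes last here purely for readability; the named entries apply to the crawlers that match them and everything else falls through to `*`.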

    • digdilem (English)
      9 up · 0 down · 2 months ago (edited 2 months ago)

      In my experience, the AI bots are absolutely not honoring robots.txt, and there are literally hundreds of unique ones. Everyone and their dog has unleashed AI/LLM harvesters over the past year without much thought for the impact on low-bandwidth sites.

      Many of them aren’t even identifying themselves as AI bots, and are faking human user agents instead.
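
When the user agent can't be trusted, filtering has to key on something the bot can't easily fake, such as its source address or its behavior. The sketch below is one possible take on the honeypot and IP-range ideas mentioned in this thread, not a tested production setup. Everything named here is an assumption: `/trap/` is a hypothetical honeypot path (disallowed in robots.txt and never linked visibly), and `203.0.113.0/24` is a reserved documentation range standing in for a real published crawler range.

```python
import ipaddress

# Hypothetical honeypot prefix: disallowed in robots.txt and not linked anywhere
# a human would click, so any client requesting it is ignoring the rules.
HONEYPOT_PREFIX = "/trap/"

# Placeholder network (a reserved documentation range), standing in for
# whatever published crawler ranges a site operator chooses to block.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]

# Client IPs that have requested a honeypot URL during this process's lifetime.
flagged: set[str] = set()


def should_block(client_ip: str, path: str) -> bool:
    """Decide whether to refuse a request, without looking at the user agent."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in BLOCKED_NETWORKS):
        return True
    if path.startswith(HONEYPOT_PREFIX):
        flagged.add(client_ip)
    return client_ip in flagged


if __name__ == "__main__":
    print(should_block("203.0.113.7", "/"))        # inside a blocked range
    print(should_block("198.51.100.5", "/"))       # unknown client, normal page
    print(should_block("198.51.100.5", "/trap/a")) # touches the honeypot
    print(should_block("198.51.100.5", "/"))       # stays blocked afterwards
```

An in-memory set is used only to keep the example short. A real deployment would likely need expiry and shared storage, and harvesters that rotate through many addresses would still slip past per-IP flagging.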