Poisoned AI went rogue during training and couldn’t be taught to behave again in ‘legitimately scary’ study

AI researchers found that widely used safety training techniques failed to remove malicious behavior from large language models, and one technique even backfired, teaching the AI to recognize its triggers and better hide its bad behavior from the researchers.

  • JustMy2cEnglish
    arrow-up
    4
    arrow-down
    5
    ·
    9 months ago

    I know we don’t like them here but the word reddit is not banned (yet)

    • Daxtron2English
      arrow-up
      16
      arrow-down
      0
      ·
      9 months ago

      What? What does my comment have to do with Reddit?

      • JustMy2cEnglish
        arrow-up
        1
        arrow-down
        7
        ·
        9 months ago

        So you’re saying that “Inflammatory data” isn’t a reference to reddit? :D

        • kent_ehEnglish
          arrow-up
          2
          arrow-down
          0
          ·
          9 months ago

          I’d say using Twitter and Facebook would be worse than reddit. Or, and I shudder to think about it, Truth Social.

          • JustMy2cEnglish
            arrow-up
            1
            arrow-down
            1
            ·
            9 months ago

            Reddit is used more for AI models than those

        • Daxtron2English
          arrow-up
          3
          arrow-down
          1
          ·
          9 months ago

          Not inherently, I’m sure that’s part of it but it’s really everywhere. Even here on Lemmy I’ve run into nasty folk

          • JustMy2cEnglish
            arrow-up
            1
            arrow-down
            2
            ·
            9 months ago

            True but it’s reddit that’s served as a base for most models

            • Daxtron2English
              arrow-up
              1
              arrow-down
              0
              ·
              9 months ago

              Not just reddit, LAION is a huge dataset

              • JustMy2cEnglish
                arrow-up
                1
                arrow-down
                1
                ·
                9 months ago

                Obviously, but reddit is in the goldilocks zone where you get coherent, intelligent stuff and humor and facts.

                But it’s still toxic for an AI.

                • Daxtron2English
                  arrow-up
                  2
                  arrow-down
                  0
                  ·
                  9 months ago

                  Saying it served as the base for most models is just objectively incorrect though

                  • JustMy2cEnglish
                    arrow-up
                    1
                    arrow-down
                    0
                    ·
                    9 months ago

                    Correcto but maybe it DOES apply to most asked questions, if you know where I’m going with that

        • ChocratesEnglish
          arrow-up
          1
          arrow-down
          1
          ·
          9 months ago

          No, LLM is the AI, OP is saying if you train it with hate it’s gonna spit out hate

          • JustMy2cEnglish
            arrow-up
            1
            arrow-down
            2
            ·
            9 months ago

            And I’m saying that reddit data is sublime for AI. And specifically that it’s infested with toxicity