• CombatWombat1212 (English) · ↑ 40 · ↓ 0 · 11 hours ago

    So do I, every time I ask it a slightly complicated programming question.

    • Saik0 (English) · ↑ 11 · ↓ 0 · 8 hours ago

      And sometimes even really simple ones.

      • werefreeatlast (English) · ↑ 6 · ↓ 0 · 7 hours ago

        How many w’s are in “Howard likes strawberries”? It would be awesome to know!

        • Saik0 (English) · ↑ 4 · ↓ 0 · 7 hours ago (edited 7 hours ago)

          So I keep seeing people reference this, and I found it a curious concept that LLMs have problems with it. So I asked them. Several of them.

          Outside of this image, Codestral (my default) actually got it correct and didn’t talk itself out of being correct. But that’s no fun, so I asked 5 others at once.

          https://lemmy.saik0.com/pictrs/image/141fdba1-59d3-4dda-9f3f-3e5b447c8e12.png

          What’s sad is that Dolphin Mixtral is a 26.44GB model
          Gemma 2 is the 5.44GB variant
          Gemma 2B is the 1.63GB variant
          LLaVa Llama3 is the 5.55GB variant
          Mistral is the 4.11GB variant

          So I asked Codestral again, because why not! And this time it talked itself out of being correct.

          https://lemmy.saik0.com/pictrs/image/bb8c91ec-f113-4725-b703-e4076e6fbfa3.png

          Edit: fixed newline formatting.

          • Regrettable_incident (English) · ↑ 1 · ↓ 0 · 3 hours ago

            Interesting... I’d say Gemma 2B wasn’t actually wrong, it just didn’t answer the question you asked! I wonder if they have this problem with other letters. Maybe it’s something to do with how we say w as “double-you”... But maybe not, because they seem to be underestimating rather than overestimating. But yeah, I guess the fuckers just can’t count. You’d think a question using the phrase ‘How many...’ would be a giveaway that they might need to count something rather than rely on a knowledge base.
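
            For comparison, actually counting is trivial to do deterministically. A minimal Python sketch, assuming a case-insensitive count is what the question means (the helper name count_letter is made up for illustration):

```python
def count_letter(text: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in text."""
    # Lowercase both sides so "W" and "w" are treated as the same letter.
    return text.lower().count(letter.lower())


phrase = "Howard likes strawberries"
print(count_letter(phrase, "w"))  # prints 2
print(count_letter(phrase, "r"))  # prints 4
```

            Swapping in any other letter works the same way, which makes it easy to check whatever answer a model gives.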

          • werefreeatlast (English) · ↑ 1 · ↓ 0 · 4 hours ago

            LOL 😆😅! I totally made it up! And it worked! So maybe it’s not just R’s that it has trouble counting. It’s any letter at all.

        • Excrubulent (English) · ↑ 2 · ↓ 0 · 7 hours ago

          I’d be happy to help! There are 3 "w"s in the string “Howard likes strawberries”.