The research from Purdue University, first spotted by news outlet Futurism, was presented earlier this month at the Computer-Human Interaction Conference in Hawaii and looked at 517 programming questions on Stack Overflow that were then fed to ChatGPT.

“Our analysis shows that 52% of ChatGPT answers contain incorrect information and 77% are verbose,” the new study explained. “Nonetheless, our user study participants still preferred ChatGPT answers 35% of the time due to their comprehensiveness and well-articulated language style.”

Disturbingly, programmers in the study didn’t always catch the mistakes being produced by the AI chatbot.

“However, they also overlooked the misinformation in the ChatGPT answers 39% of the time,” according to the study. “This implies the need to counter misinformation in ChatGPT answers to programming questions and raise awareness of the risks associated with seemingly correct answers.”

  • zbyte64 · 5 months ago

    Compilers are deterministic and you can reason about how they came to their results, and because of that they are useful.

    • FaceDeer · 5 months ago

      No, they’re useful because they produce useful machine code.

      • zbyte64 · 5 months ago

        That’s a distinction without a difference. The code is useful because we can reason about how it was made and then make deterministic changes. Try using a compiler that gives you a qualitatively different result each time it runs even though the inputs are the same.

        • FaceDeer · 5 months ago

          It’s useful because it does the stuff we want it to do.

          You’re focusing on a very high-level philosophical meaning of “usefulness.” I’m focusing on what actually does what I need it to do.

          • zbyte64 · 5 months ago

            I’m providing explicit examples of compilers doing “the stuff we want it to do”. LLMs do what we want 50% of the time, and the output still needs modifications afterwards. Imagine having to correct a compiler’s output and calling that compiler “useful”.

            • FaceDeer · 5 months ago

              So if something isn’t perfect it’s not “useful”?

              I use LLMs when programming. Despite their imperfection they save me an enormous amount of time. I can confidently confirm that LLMs are useful from personal direct experience.