But it’s not cheap.

  • Martineski · 1 month ago

    I’m curious how it will do on the private benchmark that AI Explained made. I think it was called SimpleBench?

  • happybadger [he/him] · 1 month ago

    “The model is definitely better at solving the AP math test than I am, and I was a math minor in college,” OpenAI’s chief research officer, Bob McGrew, tells me. He says OpenAI also tested o1 against a qualifying exam for the International Mathematics Olympiad, and while GPT-4o correctly solved only 13 percent of problems, o1 scored 83 percent.

    That’s still unreliable enough that I wouldn’t trust it to actually do anything. If it scoured its database for a trigonometry textbook and cited a solution to a problem that was as correct as any web calculator, cool. That’d be as useful as Google was in 2010. 83% is the kind of score I get on advanced mathematics tests when I have no idea what I’m doing but half-remember the basic steps to get an answer.

    • NoiseColor · 1 month ago

      You get an 83% if you don’t know what you’re doing?

      I wish my scores were as high in those situations.

  • vrighter · 1 month ago

    No, it doesn’t have reasoning abilities. It just replicates you trying to coax it into giving you something decent, hides the process from you, and then charges you for it.