• Honytawk@lemmy.zip
    1 year ago

    If it were only guessing, it would never be able to create a single functioning program. Which it has, many times.

    This isn’t some infinite monkeys on typewriter stuff.

    It writes code and can check for itself whether it is correct.

    I’ve seen ChatGPT write an entire website in WordPress, including setting up a MySQL database for users, from a user stating their wishes aloud into a microphone and then not touching the computer once.

    How is that guessing?

    • Norgur@kbin.social
      1 year ago

      No, it does not “check itself”. You mixed up “completely random guesses” and stochastically calculated guesses… ChatGPT has an obscenely large corpus of training data that was further refined by a blatant disregard for copyright and tons and tons of exploited workers in low-wage countries, right?

      So imagine the topic “set up WordPress”. ChatGPT has just about every article indexed that’s on the internet about this. Word for word. So it’s able to assign a number to each word and calculate the probability of each word following every other word it scanned. Since WordPress follows a very clear pattern as to how it’s set up, those probabilities will be very clear-cut.
      The details the user entered can be stitched in because ChatGPT can very easily detect variables given the huge amount of data. Imagine a CREATE USER MySQL command. ChatGPT’s sources will be almost identical up until it comes to the username, which suddenly leads to a drop in certainty regarding the next word. So there’s your variable. Now stitch in the word the user typed after the word “User” and Bob’s your uncle.
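
      To make the “drop in certainty” idea concrete, here is a minimal sketch using a toy bigram counter. This is a deliberate simplification and not how ChatGPT is actually built (it uses a neural network over tokens, not a lookup table of word pairs). The three-line corpus, the usernames, and the helper names `next_word_probs` and `entropy` are all made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy corpus (hypothetical): near-identical tutorial lines that differ only
# in the username, mimicking the CREATE USER example above.
corpus = [
    "CREATE USER alice IDENTIFIED BY secret",
    "CREATE USER bob IDENTIFIED BY secret",
    "CREATE USER carol IDENTIFIED BY secret",
]

# Count how often each word follows each other word (a bigram model).
follow_counts = defaultdict(Counter)
for line in corpus:
    words = line.split()
    for prev, nxt in zip(words, words[1:]):
        follow_counts[prev][nxt] += 1

def next_word_probs(prev):
    """Probability of each word following `prev`, estimated from raw counts."""
    counts = follow_counts[prev]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def entropy(prev):
    """Uncertainty, in bits, about which word follows `prev`."""
    return sum(p * math.log2(1 / p) for p in next_word_probs(prev).values())

# After "CREATE" the next word is certain; after "USER" it is not.
for word in ["CREATE", "USER", "IDENTIFIED"]:
    print(word, next_word_probs(word), round(entropy(word), 2))
```

      In this toy data, the uncertainty is zero after “CREATE” and “IDENTIFIED” but jumps after “USER”, which is the spot where a user-supplied value would be slotted in.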

      ChatGPT can “write programs” because programming (just like human language) follows clear patterns that become pretty distinct once the amount of data you analyze becomes large enough.

      ChatGPT does not check anything it spurts out. It just generates a word and calculates which word is most likely to follow that one.
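
      As a rough illustration of “generate a word, then pick the most likely next one”, the sketch below does greedy word-by-word generation over a toy bigram table. It is only an analogy under stated assumptions: the corpus and the `generate` helper are hypothetical, and a real language model conditions on far more context than the single previous word.

```python
from collections import Counter, defaultdict

# Toy corpus (hypothetical) standing in for many similar setup guides.
corpus = [
    "install wordpress then create the database",
    "install wordpress then create the user",
    "install wordpress then create the database",
]

# Count which word follows which.
follow_counts = defaultdict(Counter)
for line in corpus:
    words = line.split()
    for prev, nxt in zip(words, words[1:]):
        follow_counts[prev][nxt] += 1

def generate(start, max_words=10):
    """Greedily append the most frequent follower of the last word."""
    out = [start]
    # Stop at the length limit or when the last word has no known follower.
    while len(out) < max_words and follow_counts.get(out[-1]):
        most_likely = follow_counts[out[-1]].most_common(1)[0][0]
        out.append(most_likely)
    return " ".join(out)

print(generate("install"))  # prints "install wordpress then create the database"
```

      Note that nothing in this loop ever runs or verifies the output; it only follows the counts.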

      It only knows which sources of its training data it should use because those were sorted and categorized by humans slaving away in Africa and Asia, doing all the categorization by hand.