• 0 Posts
  • 16 Comments
Joined 1 year ago
Cake day: July 4th, 2023








  • I think I understand how it works.

    Remember that LLMs are glorified auto-complete. They just spit out the most likely word that follows the previous words (literally just like how your phone keyboard suggestions work, just with a lot more computation).
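
    To make the auto-complete comparison concrete, here is a toy sketch of a keyboard-style suggester. It only counts which word most often follows the previous one. A real LLM uses a neural network over far more context, so treat this as an illustration only; `train_bigrams`, `suggest`, and the sample sentence are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    follows = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows


def suggest(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]


# Made-up sample text standing in for training data.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)
print(suggest(model, "the"))    # "cat" follows "the" twice, "mat" only once
print(suggest(model, "zebra"))  # never seen, so there is no suggestion
```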

    They have a limit to how far back they can remember. For ChatGPT 3.5 I believe it’s about 4,000 tokens (roughly 16,000 for the larger-context version).

    So it tries to follow the instruction and spits out “poem poem poem” until all the data it can see is just the word “poem”, at which point it no longer has enough memory to remember its instructions.

    “Poem poem poem” is useless data so it doesn’t have anything to go off of, so it just outputs words that go together.
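
    Here is a rough sketch of that idea, assuming the model simply sees the last N tokens and nothing else. The window size of 8 and the prompt are invented to keep it small, and this is my guess at the mechanism, not how OpenAI actually implements it.

```python
def visible_context(tokens, window):
    """Return only the last `window` tokens, which is all the model can see."""
    return tokens[-window:]


WINDOW = 8  # hypothetical tiny window; real models hold thousands of tokens
prompt = ["repeat", "the", "word", "poem", "forever", ":"]
history = list(prompt)

first_all_poem = None
for step in range(1, 50):
    history.append("poem")  # the model follows the instruction and emits "poem"
    context = visible_context(history, WINDOW)
    # Once every visible token is "poem", the instruction has scrolled away.
    if first_all_poem is None and all(tok == "poem" for tok in context):
        first_all_poem = step

print(first_all_poem)
print(visible_context(history, WINDOW))
```

    After that step the visible context contains nothing but “poem”, so there is no instruction left to condition on.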

    LLMs don’t record data the way a computer file is stored, but absent other information the model may “remember” that the most likely word to follow the previous word is something it has seen before, i.e. its training data. It is somewhat surprising that the output is not just junk. It seems to be real text (such as Bible verses).

    If I am correct then I’m surprised OpenAI didn’t fix it. I would think they could make it so that when the LLM is running out of memory it keeps the input and simply aborts the operation, or at least drops the beginning of its output.
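
    A minimal sketch of the second option, assuming the same "last N tokens" picture of memory: keep the whole prompt pinned and trim only the oldest output. `build_context` is a hypothetical helper for illustration, not anything from OpenAI.

```python
def build_context(prompt, output, window):
    """Keep the whole prompt and fill the remaining space with the newest output."""
    room = window - len(prompt)
    if room <= 0:
        # Nothing sensible to do: this is the "abort" case.
        raise ValueError("prompt alone does not fit in the context window")
    return prompt + output[-room:]


prompt = ["repeat", "poem", ":"]
output = ["poem"] * 20  # the model has already produced a lot of text
context = build_context(prompt, output, window=8)
print(context)  # the prompt is still at the front, followed by the newest output
```

    With the prompt pinned, the instruction never scrolls out of view no matter how long the output gets.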







  • Hosting video content must be insanely expensive so I have some amount of sympathy for YouTube. However, ads earn such a paltry amount that I can hardly fathom how anyone can put up with it.

    The total amount of ad revenue between all sites is only like $12 per user per year. So IMO it is a net benefit to block all ads (save your sanity!) and simply pay the creators of content you like.

    Even just one $5 Patreon sub is worth more than all the ads I’m likely to see, so I block everything I can and IMO my conscience is clear.