May 16

The Runtime Mechanics of Large Language Models

3 Comments

QB

Deep.

You appear to be describing the application of an LLM as something like the iterative predictions of a Markov chain…?

There is a superficial similarity here: both generate a sequence one step at a time by sampling from a conditional distribution. A simple Markov chain predicts the next state from the current state alone. An LLM predicts the next token from the entire context window, so very long-range correlations and structures can influence generation.

So yes, superficially it resembles a Markov chain, but only superficially.
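To make the contrast concrete, here is a minimal toy sketch in Python. Everything in it is an assumption for illustration: the two-sequence corpus, the function names `train` and `next_token_distribution`, and above all the use of a count table keyed on the preceding tokens. A real LLM uses a learned neural network rather than a lookup table; the table only stands in for "conditioning on everything in the window" versus "conditioning on the current token alone".

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: the last token always repeats the first one,
# so predicting it requires looking four tokens back.
corpus = [
    ["A", "x", "x", "x", "A"],
    ["B", "x", "x", "x", "B"],
]

def train(corpus, context_len):
    """Count next-token frequencies keyed by the last `context_len` tokens."""
    counts = defaultdict(Counter)
    for seq in corpus:
        for i in range(1, len(seq)):
            key = tuple(seq[max(0, i - context_len):i])
            counts[key][seq[i]] += 1
    return counts

def next_token_distribution(counts, prefix, context_len):
    """Return next-token probabilities given the tail of `prefix`."""
    key = tuple(prefix[-context_len:])
    total = sum(counts[key].values())
    return {tok: n / total for tok, n in counts[key].items()}

# context_len=1: the next token depends only on the current token.
markov = train(corpus, context_len=1)
# context_len=4: the whole prefix (a stand-in for a context window) is visible.
full = train(corpus, context_len=4)

prefix = ["A", "x", "x", "x"]
print(next_token_distribution(markov, prefix, 1))
print(next_token_distribution(full, prefix, 4))
```

With only the current token "x" to go on, the first model cannot tell whether the sequence began with "A" or "B", so it splits its probability between them (and gives most of it to another "x"). The second model sees the opening "A" and assigns it all of the probability.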

Audrius Berzanskis

Executable Language