Endless Dimensions of Tokens

At first glance, tokens look like simple pieces of text, but in modern AI systems, they are the entry point into something far stranger: an effectively endless geometric space of meaning, where language becomes coordinates in an extremely high-dimensional structure.

Sitting in a conference room, being given a simplistic overview of an utterly complex and astounding alteration to reality (artificial intelligence, LLMs, neural networks), I had time to wonder, in my monotropic, hyper-focused sort of way, about the fundamental makeup of a new kind of spacetime. The concept itself is not new to me, but its ever-changing evolution is, especially as we move toward quantum matrix multiplication density. (Quantum computing will likely make the token fabric much denser by embedding more information into fewer dimensions: the Word2ket effect.) So while everyone else was typing verbatim notes on the presentation, I was typing complex corrections to everything being said. It led me to ruminate for several hours on the images that the rivers, or vectors, of tokens inspire in my mind.

Like translucent layers of matter on warped origami planes, folding in and expanding through one another beyond comprehension. Like galaxies. Like clouds of starlings in the sky. No different from the mesh supporting the theory of relativity: in both systems, an entity's value is defined by its position in a multi-dimensional space relative to everything else.

You see… tokens cannot simply be calculated by word count. Tokens are not words. They aren't even, really, letters. They are numbers, points extending eternal vectors across countless planes.

Take a word: river

It is not processed as a “word” but as a token, mapped to an integer.

Example: "river" » 9821

A sentence becomes an array of integers: "the river flows" » [12, 9821, 77]

Here, language is flattened into a numeric sequence.
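
A minimal sketch of that flattening step in Python. The vocabulary and the IDs here are hypothetical, chosen to match the example above; a real tokenizer (BPE, WordPiece, and so on) learns its vocabulary from data and often splits words into sub-word pieces.

# Hypothetical toy vocabulary; real models learn tens of thousands of entries
vocab = {"the": 12, "river": 9821, "flows": 77}

def tokenize(text):
    # Toy whole-word lookup; real tokenizers fall back to sub-word pieces
    return [vocab[word] for word in text.split()]

print(tokenize("the river flows"))  # [12, 9821, 77]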

Then: each token is mapped to a vector, a list of numbers: river » [0.21, -1.8, 0.44, …]

Now: each token lives in a space with hundreds or thousands of dimensions. It becomes a geometry of meaning.
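
One way to picture that geometry in code, as a sketch with NumPy. The embedding matrix here is filled with random numbers purely for illustration; in a trained model these vectors are learned, and cosine similarity between them is the standard measure of how close two tokens sit in the space.

import numpy as np

vocab_size, dims = 50_000, 768  # hundreds or thousands of dimensions per token
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(vocab_size, dims))  # random here; learned in practice
river = embeddings[9821]  # a token ID simply indexes one row, one vector

def cosine_similarity(a, b):
    # Angle between two vectors: the geometry of "how related are these tokens?"
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(river, embeddings[12]))  # near 0 for untrained, random vectors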

But the model does not see all tokens at once. It only sees a fixed window (the context window):

Context window, e.g., size = 6 tokens

Full sequence:

[ A B C D E F G H ]

Active window:

[ C D E F G H ]
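
In code, that window is nothing more than a slice over the tail of the sequence; a sketch:

def active_window(tokens, window_size=6):
    # Keep only the most recent window_size tokens; everything earlier falls away
    return tokens[-window_size:]

sequence = ["A", "B", "C", "D", "E", "F", "G", "H"]
print(active_window(sequence))  # ['C', 'D', 'E', 'F', 'G', 'H']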

NOW: Every layer above is ultimately encoded in binary:

9821 » 10011001011101 « base 2

Each bit position is a power of 2.
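
You can check that base-2 layer directly:

token_id = 9821
print(format(token_id, "b"))  # 10011001011101
# Sum of powers of 2: 8192 + 1024 + 512 + 64 + 16 + 8 + 4 + 1 = 9821
print(int("10011001011101", 2))  # 9821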

T O K E N S :

A galaxy of words, where stars are concepts, constellations are meanings, and distance reflects semantic similarity in high-dimensional space.

Starling Murmuration


FYI: AI models use tokens as the fundamental unit for processing language. In English, one token averages about 4 characters, or roughly 0.75 words.
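
A rough estimator built on those averages; the ratios are approximations only and vary by tokenizer and by language:

def estimate_tokens(text):
    # Heuristic: ~4 characters per token, ~0.75 words per token (English)
    by_chars = len(text) / 4
    by_words = len(text.split()) / 0.75
    return round((by_chars + by_words) / 2)

print(estimate_tokens("the river flows"))  # 4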
