Q, K, V Projections
How one vector is transformed into query, key, and value versions used by attention.
Attention Is All You Need · 2026 A1
How text becomes ordered rows of numbers before attention begins.
LLMs do not work directly on text. They work on tokens, which are pieces of text in a fixed order, and on embeddings, which are vectors of numbers attached to those tokens.
Text → tokens in order → embedding rows:

    token 0 → (0.41, -0.12, 0.87, -0.33)
    token 1 → (0.05, 0.62, -0.14, 0.48)
    token 2 → (-0.36, 0.91, 0.11, -0.07)
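That pipeline can be sketched in a few lines of Python. The vocabulary, the whitespace tokenizer, and the 4-dimensional embedding numbers below are invented for illustration; they are not the model's real components:

```python
import numpy as np

# Toy vocabulary and a tiny 4-dimensional embedding table (made-up numbers).
vocab = {"the": 0, "cat": 1, "sat": 2}
embedding_table = np.array([
    [ 0.41, -0.12,  0.87, -0.33],  # row for "the"
    [ 0.05,  0.62, -0.14,  0.48],  # row for "cat"
    [-0.36,  0.91,  0.11, -0.07],  # row for "sat"
])

text = "the cat sat"
tokens = text.split()                   # pretend tokens are words
token_ids = [vocab[t] for t in tokens]  # tokens in order -> ids in order
rows = embedding_table[token_ids]       # one embedding row per position

print(rows.shape)  # (3, 4): n tokens, d numbers in each row
```

The output of the whole pipeline is an ordered stack of number rows, which is exactly what the later attention stages consume.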
Token 0 becomes i = 0, token 1 becomes i = 1, and so on. Attention later uses those positions when it decides what each position may look back at.
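One common way those positions get used is a causal "look back" rule: position i may attend to itself and to earlier positions only. A minimal sketch, where the mask construction is illustrative rather than this assignment's exact code:

```python
import numpy as np

n = 4  # number of token positions: i = 0, 1, 2, 3

# allowed[i, j] is True when position i may look back at position j.
# Lower-triangular: each position sees itself and everything before it.
allowed = np.tril(np.ones((n, n), dtype=bool))

print(allowed.astype(int))
# Position 0 sees only itself; position 3 sees all four positions.
```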
In this assignment, you first practise the representation idea with a simplified embedding step.
Stage 1 builds one-hot vectors for the tokens as a simple stand-in for embeddings. That is the representation exercise students do directly. After that, the later attention stages use richer embedding rows as the prompt matrix that gets projected into $\mathbf{Q}$, $\mathbf{K}$, and $\mathbf{V}$.
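Both halves can be sketched side by side. The vocabulary size, matrix dimensions, and random weight values below are invented for illustration; the assignment's actual sizes may differ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 stand-in: one-hot vectors, one row per token id.
vocab_size = 5
token_ids = [2, 0, 4]
one_hot = np.eye(vocab_size)[token_ids]  # shape (n, vocab_size)

# Later stages: a prompt matrix X of richer embedding rows (n x d),
# projected into Q, K, V by three separate weight matrices (d x d_k).
n, d, d_k = 3, 8, 4
X = rng.normal(size=(n, d))
W_q = rng.normal(size=(d, d_k))
W_k = rng.normal(size=(d, d_k))
W_v = rng.normal(size=(d, d_k))

Q = X @ W_q  # one query row per token position
K = X @ W_k  # one key row per token position
V = X @ W_v  # one value row per token position

print(one_hot.shape, Q.shape, K.shape, V.shape)
```

Each of Q, K, and V keeps one row per token position, so the ordering established at tokenization carries straight through the projections.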
The prompt matrix can be read as:

- n rows, one per token position
- d numbers in each row

That is why the rest of the assignment is mostly about array loops and matrix-style computations rather than text processing.
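The "array loops" view of one such computation: each entry of a projected matrix is a small dot-product loop. The sizes and numbers here are illustrative, not the assignment's real values:

```python
# Computing Q = X @ W_q with explicit loops over an n x d prompt matrix.
n, d, d_k = 2, 3, 2
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]   # n rows, d numbers in each row
W_q = [[1.0, 0.0],
       [0.0, 1.0],
       [1.0, 1.0]]       # d rows, d_k columns

Q = [[0.0] * d_k for _ in range(n)]
for i in range(n):       # each token position
    for j in range(d_k):  # each output dimension
        for k in range(d):  # dot product over the d numbers in row i
            Q[i][j] += X[i][k] * W_q[k][j]

print(Q)  # [[4.0, 5.0], [10.0, 11.0]]
```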
Models do not literally operate on words. A token might be a whole word, a piece of a word, or a single punctuation mark.
For this assignment, you can safely pretend tokens are words because the important fact is just that token positions have an order and the attention rules depend on that order.