Tokens and Embeddings
How text is split into tokens, then converted into numeric vectors a model can work with.
Attention Is All You Need · 2026 A1
Required concept
How a single active slot can represent a category or label.
One-hot encoding is a way to represent one choice from a small set using a vector of zeros with exactly one 1.
Example mapping
The position of the `1` is the identity. Everything else stays `0`.
| Category | slot 0 | slot 1 | slot 2 | slot 3 |
|---|---|---|---|---|
| cat | 1 | 0 | 0 | 0 |
| dog | 0 | 1 | 0 | 0 |
| bird | 0 | 0 | 1 | 0 |
| fish | 0 | 0 | 0 | 1 |
One-hot encoding gives each category its own slot. The vector is sparse on purpose: the category is encoded by where the 1 appears, not by the size of the number.
For example, if you had three categories, you could write them as:
cat -> (1, 0, 0)
dog -> (0, 1, 0)
bird -> (0, 0, 1)

The important idea is that the position of the 1 carries the identity. The vector is not trying to be smooth or semantic. It is just a direct numeric label.
In a model, one-hot vectors are often used when the system needs a discrete choice in numeric form. They are a simple bridge between categories and vectors, and they are easy to read because only one slot is active.
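As a minimal sketch of the idea above, here is one way to build one-hot vectors in Python. The category names and their ordering are taken from the example table; the function name `one_hot` is just illustrative.

```python
# Categories in a fixed order: the slot index is the identity.
categories = ["cat", "dog", "bird", "fish"]

def one_hot(category: str) -> list[int]:
    """Return a vector of zeros with a single 1 at the category's slot."""
    vec = [0] * len(categories)
    vec[categories.index(category)] = 1
    return vec

print(one_hot("bird"))  # [0, 0, 1, 0]
```

Note that only the position of the 1 changes between categories; the vector length stays fixed at the size of the category set.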
A one-hot vector is a very sparse representation. It says "this item is category k" and leaves the rest of the slots empty.
An embedding is different: it usually has many non-zero values and can encode richer relationships between items. In that sense, a one-hot vector is the simplest possible kind of vector representation.
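One way to see the connection between the two is that multiplying a one-hot vector by an embedding matrix simply selects one row of that matrix, which is why embedding lookup is ordinarily implemented as row indexing. The sketch below uses a made-up 4x3 embedding matrix purely for illustration; the specific values carry no meaning.

```python
# Hypothetical embedding matrix: one dense row per category,
# in the same order as the one-hot slots (cat, dog, bird, fish).
embedding_matrix = [
    [0.2, -0.1, 0.5],   # "cat"
    [0.3,  0.4, -0.2],  # "dog"
    [-0.5, 0.1, 0.0],   # "bird"
    [0.7, -0.3, 0.1],   # "fish"
]

one_hot_bird = [0, 0, 1, 0]

# Vector-matrix product: each output entry is a sum over rows,
# weighted by the one-hot entries (all zero except one).
dense = [
    sum(one_hot_bird[i] * embedding_matrix[i][j] for i in range(4))
    for j in range(3)
]
print(dense)                         # [-0.5, 0.1, 0.0]
print(dense == embedding_matrix[2])  # True: same as direct row lookup
```

Because only one weight is non-zero, the product collapses to a single row, so real systems skip the multiplication and index the row directly.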
The toy input generator uses a small entity one-hot region in its structured mode so repeated noun-like mentions can share a simple identity slot. That is a debugging aid, not something you have to implement in the assignment itself.
If you want the broader “tokens become vectors” picture, read Tokens and Embeddings.