COMP10002 Foundations of Algorithms
Toy Demo
A small end-to-end demo that connects the assignment's numeric inputs and outputs back to token positions in real text.
Generate a toy input
From the assignment bundle root:
python3 toy_llm_generate_input.py --mode structured --n 6 --g 2 --d 6 \
--text "the cat sat on the mat and purred softly" \
--show-tokens > toy-input.txt
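What --show-tokens reports is easy to reproduce by hand. A minimal sketch, assuming simple whitespace tokenization with positions counted from 0 (the generator's actual tokenizer may differ):

```python
# Hypothetical illustration of token positions for the demo sentence,
# assuming whitespace tokenization and 0-based indexing.
text = "the cat sat on the mat and purred softly"
tokens = text.split()
for pos, tok in enumerate(tokens):
    print(pos, tok)
# position 0 is "the", position 1 is "cat", and so on
```

Note that the sentence has nine whitespace tokens, so check how your chosen --n interacts with the text length when you vary the parameters.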
The --mode structured option intentionally makes the matrices and embeddings easier to reason about. It is meant for learning and debugging, not for grading.
The structured generator also uses a small one-hot encoding region for entity identity, so repeated noun-like mentions can share a simple slot.
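To make the one-hot idea concrete, here is a sketch of what such a region could look like. The slot assignments and the entity_region helper are made up for illustration; only the shape of the idea (repeated mentions of the same entity getting the same slot) comes from the text above:

```python
import numpy as np

# Hypothetical one-hot "entity identity" region inside an embedding of
# width d = 6. Slot assignments here are invented for illustration.
d = 6
entities = {"cat": 0, "mat": 1}  # repeated noun-like mentions share a slot

def entity_region(token):
    """Return the entity-identity part of a token's embedding."""
    v = np.zeros(d)
    if token in entities:
        v[entities[token]] = 1.0
    return v

print(entity_region("cat"))  # [1. 0. 0. 0. 0. 0.]
print(entity_region("sat"))  # all zeros: not an entity slot
```

Because every mention of "cat" gets the identical slot, dot products between those embeddings are large, which is what makes the structured mode easier to reason about.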
Then run your program:
./a1 < toy-input.txt > toy-output.txt
Explain the output
Use the helper script to turn the numeric output back into a token-oriented explanation of the Stage 4 weights:
python3 toy_llm_explain_output.py \
--input toy-input.txt \
--text "the cat sat on the mat and purred softly" \
--output toy-output.txt
If the explanation does not perfectly match your grammar intuition, that is fine. Attention is computed from the vectors and matrices, not from English rules directly. The goal of the toy demo is to make the output feel less abstract.
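The point that attention weights come from vectors rather than grammar can be seen in a toy computation. This is the standard scaled dot-product formulation; the assignment's Stage 4 may differ in details, and the Q and K matrices here are random placeholders:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

d = 4                                  # toy vector width
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, d))            # one query vector per token
K = rng.normal(size=(3, d))            # one key vector per token

# Scores and weights depend only on the numbers in Q and K,
# not on any English rule about the underlying tokens.
scores = Q @ K.T / np.sqrt(d)
weights = np.apply_along_axis(softmax, 1, scores)
print(weights)                         # each row sums to 1
```

Perturbing a single entry of Q changes a whole row of weights, which is why the explanation script's output can drift away from grammatical intuition while still being exactly right about the arithmetic.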