COMP10002 Foundations of Algorithms
Annotated Scaffold
A guided reading of the provided `a1.c`, with direct links from the main spec and tooltip-style explanations for the names, arrays, and helper functions you keep seeing.
Overview
This page is meant to be kept open beside `a1.c`.
The main assignment page tells you what to implement. This page is about how to read the scaffold you were given without getting lost in the arrays, the stage order, or the helper names.
Why this page exists
The scaffold does some important work for you already:
- it allocates the main arrays
- it calls the stages in order
- it prints the stage headers and formatted rows
- it provides the printing helpers you should normally leave alone
That means your job is mostly to fill in the TODO functions correctly and keep the existing output contract intact.
File map
Top of file
Comment block, includes, and constants
The header identifies the file, the complexity-analysis block is where students write their Big-O notes, and the preprocessor section defines the fixed array bounds such as MAX_TOKENS, MAX_D, and MAX_GEN.
Interface
Function prototypes
This section tells you the shape of every function before you reach the implementations. When you are coding a stage from the main page, the prototype is the fastest way to remind yourself which arrays are inputs and which arrays you are expected to fill.
Execution flow
main() is the roadmap
The `main` function is worth reading carefully once, even if you do not edit it much. It shows the exact order in which the scaffold expects the stages to happen, which arrays are reused later, and where the printing already happens.
Your code
TODO functions
The core work lives in the TODO block: `read_input`, `create_embeddings`, `compute_projection`, the combined prompt score/weight function, the prompt output function, and the generation-with-cache function. Those are the places you should spend most of your implementation effort.
Leave these alone
Provided helpers
The printing helpers already implement the expected formatting, including clamping near-zero values. This is one of the easiest places to accidentally break the autograder if you start “cleaning things up”.
Walking Through main()
The easiest way to understand the scaffold is to read main() as a pipeline:
/* Some details of the main function such as printing and variable declarations
   are omitted here for brevity, but are included in the skeleton code */
int main(void) {
/* allocate all the main arrays */
read_input(&n, &d, &g, &text_len, embedding_table, mask, prompt, gen, wq, wk, wv);
/* Stage 1: sort and print the one-hot table */
create_embeddings(embedding_table, text_len);
/* build Q, K, V */
compute_projection(n, d, prompt, wq, q_matrix);
compute_projection(n, d, prompt, wk, k_matrix);
compute_projection(n, d, prompt, wv, v_matrix);
/* scores -> weights -> outputs */
compute_attention_scores_or_weights_prompt(
n, d, mask, q_matrix, k_matrix, scores, NO_SOFTMAX);
compute_attention_scores_or_weights_prompt(
n, d, mask, q_matrix, k_matrix, weights, APPLY_SOFTMAX);
compute_attention_output_prompt(n, d, weights, v_matrix, out_prompt);
/* generation with cache */
compute_generation_with_cache(..., t, ..., output);
}
What to notice first
The scaffold already decides the stage order for you. If your output is wrong, one of the earlier filled arrays is usually the real cause.
Why there are separate arrays
The scaffold does not overwrite the prompt matrix in place. It keeps prompt, then separate q_matrix, k_matrix, v_matrix, then separate scores, weights, and out_prompt so each stage can be inspected independently.
What printing means for you
Notice where printf already happens. That is why the spec keeps warning you not to add extra output and not to change the existing formatting helpers.
read_input(...)
Stage 0
Load the whole snapshot into memory
This function is the bridge between the input file and every later computation. If this stage is wrong, nothing downstream is trustworthy.
void read_input(int *n, int *d, int *g, int *text_len,
char embedding_table[MAX_TEXT_SIZE][MAX_TOKEN_LENGTH + 1],
int mask[MAX_TOKENS],
double prompt[MAX_TOKENS][MAX_D],
double gen[MAX_GEN][MAX_D],
double wq[MAX_D][MAX_D], double wk[MAX_D][MAX_D],
double wv[MAX_D][MAX_D]);
What this function should do
Use scanf to read the values in the exact order the spec gives: sizes, Stage 1 token text, prompt mask, prompt rows, generated rows, then Wq, Wk, and Wv. This is a storage stage, not a computation stage.
Why the scalar values are pointers
n, d, g, and text_len are passed by address so read_input can write them back into the variables owned by main().
Common mistake
Mixing up loop bounds. The prompt loops run over n rows, gen runs over g, and each matrix Wq/Wk/Wv runs over d × d entries.
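The loop-bound pattern matters more than the I/O call itself. The sketch below illustrates it using `sscanf` on an in-memory string instead of `scanf` on stdin so it can run standalone; `read_matrix` is a hypothetical helper for this page, not part of the scaffold, but the `rows × cols` nesting is exactly the shape each of the Wq/Wk/Wv reads needs.

```c
#include <stdio.h>

#define MAX_D 4

/* Read rows*cols doubles from *cursor into m, advancing the cursor past
   each consumed number. %n records how many characters sscanf consumed. */
static void read_matrix(const char **cursor, int rows, int cols,
                        double m[MAX_D][MAX_D]) {
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            int used = 0;
            sscanf(*cursor, "%lf%n", &m[i][j], &used);
            *cursor += used;
        }
    }
}
```

In the real `read_input` you would call `scanf("%lf", ...)` directly, but the bounds are the same: `n` rows for the prompt, `g` rows for `gen`, and `d × d` entries per weight matrix.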
create_embeddings(...)
Stage 1
Turn the token list into a sorted one-hot table
This is the warm-up representation stage. It is separate from the later numeric prompt embeddings, and its whole job is to sort the unique token strings and print the one-hot rows.
int compare_tokens(const void *a, const void *b);
void create_embeddings(char embedding_table[MAX_TEXT_SIZE][MAX_TOKEN_LENGTH + 1],
int text_len);
What this function should do
Sort the token strings, keep each unique token once, and print one one-hot vector per token in sorted order.
What the vector length is
The Stage 1 vector length is the number of unique tokens you ended up with after sorting and deduplicating, not the later prompt embedding dimension d.
Common mistake
Mixing up “first seen” with the final printed order. First seen helps define the unique set; the printed order is the sorted order.
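The sort-then-dedupe step can be sketched as below. This is one plausible way to do it, assuming fixed-width token strings; the helper name `sort_and_dedupe` is illustrative, not from the scaffold, and the returned count is the Stage 1 one-hot vector length.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_TOKEN_LENGTH 20

/* Comparator suitable for qsort over fixed-width token strings. */
static int cmp_tok(const void *a, const void *b) {
    return strcmp((const char *)a, (const char *)b);
}

/* Sort the tokens, then keep the first copy of each run of duplicates.
   Returns the number of unique tokens (the one-hot vector length). */
static int sort_and_dedupe(char toks[][MAX_TOKEN_LENGTH + 1], int n) {
    qsort(toks, n, sizeof toks[0], cmp_tok);
    int uniq = 0;
    for (int i = 0; i < n; i++) {
        if (uniq == 0 || strcmp(toks[uniq - 1], toks[i]) != 0) {
            if (uniq != i) {
                strcpy(toks[uniq], toks[i]);
            }
            uniq++;
        }
    }
    return uniq;
}
```

Because the array is sorted first, duplicates are always adjacent, so one comparison against the previous kept token is enough to spot them.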
compute_projection(...)
Stage 2
Apply the same matrix-multiply pattern three times
This is the reusable projection helper. The scaffold calls it three times so the same loop structure builds Q, K, and V.
void compute_projection(int count, int d, double src[MAX_TOKENS][MAX_D],
double w[MAX_D][MAX_D],
double dest[MAX_TOKENS][MAX_D]);
Loop shape
One loop over source rows, one loop over output columns, and an inner accumulation loop over the shared dimension d.
Mental model
Each output component is “one source row dotted with one column of the weight matrix”. The projection explainer shows that visually.
Common mistake
Forgetting to reset the accumulator to 0.0 for each destination component before the inner loop starts.
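Put together, the loop shape looks like the sketch below. This is one plausible body consistent with the prototype above (array bounds shrunk for the example), not the official solution; note where the accumulator is reset.

```c
#define MAX_TOKENS 8
#define MAX_D 4

/* dest[i][j] = sum over the shared dimension k of src[i][k] * w[k][j]:
   one source row dotted with one column of the weight matrix. */
static void compute_projection(int count, int d, double src[MAX_TOKENS][MAX_D],
                               double w[MAX_D][MAX_D],
                               double dest[MAX_TOKENS][MAX_D]) {
    for (int i = 0; i < count; i++) {
        for (int j = 0; j < d; j++) {
            double sum = 0.0;          /* reset per destination component */
            for (int k = 0; k < d; k++) {
                sum += src[i][k] * w[k][j];
            }
            dest[i][j] = sum;
        }
    }
}
```

The same body serves all three calls in `main`, which is exactly why the scaffold factors it out.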
compute_attention_scores_or_weights_prompt(...)
Stages 3 and 4
Use one function twice: first for scores, then for weights
The scaffold combines raw score computation and weight computation (using a stable softmax) into one function. The first call fills the square prompt score matrix; the second call repeats the same masking logic and turns each row into attention weights. The only reason the scores are computed separately at all is to give you a checkpoint for verifying that your program's output is correct.
void compute_attention_scores_or_weights_prompt(
int n, int d, const int mask[MAX_TOKENS],
double q[MAX_TOKENS][MAX_D], double k[MAX_TOKENS][MAX_D],
double scores_or_weights[MAX_TOKENS][MAX_TOKENS],
int apply_softmax);
What the first call does
When apply_softmax == NO_SOFTMAX, compute the scaled dot products, apply both masks, and store -INFINITY in the masked positions. If mask[i] == 0, the whole row is masked.
What the second call does
When apply_softmax == APPLY_SOFTMAX, use the same allowed positions but turn each row into a stable softmax over the unmasked columns. If a row has no valid columns, the whole output row should be zeros.
Why the output is square
Each prompt row is scored against every prompt position, so the output is n × n even though q and k are n × d.
compute_attention_output_prompt(...)
Stage 5
Use the weights to blend value rows into output rows
This is the weighted-sum stage. The previous stage decided how much each earlier row matters; this stage uses those coefficients to build the new output vector.
void compute_attention_output_prompt(
int n, int d,
double weights[MAX_TOKENS][MAX_TOKENS],
double v[MAX_TOKENS][MAX_D], double out[MAX_TOKENS][MAX_D]);
Why the dimensions fit
A weight row has length n, and each column of v also has n entries. That is why you can think of each output component as “one weight row dotted with one column of v”.
What changes conceptually
This is the point where a prompt row stops being “just itself” and becomes a mixture of the value rows from the positions it attended to.
Common mistake
Looping over the wrong dimension in the inner sum. The inner sum here runs over prompt positions j, not over embedding components.
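The weighted-sum loop can be sketched as below (array bounds shrunk for the example). This is one plausible body matching the prototype above, not the official solution; the point to notice is that the inner index `j` runs over prompt positions.

```c
#define MAX_TOKENS 8
#define MAX_D 4

/* out[i][c] = sum over prompt positions j of weights[i][j] * v[j][c]:
   one weight row dotted with one column of v. */
static void attention_output(int n, int d,
                             double weights[MAX_TOKENS][MAX_TOKENS],
                             double v[MAX_TOKENS][MAX_D],
                             double out[MAX_TOKENS][MAX_D]) {
    for (int i = 0; i < n; i++) {
        for (int c = 0; c < d; c++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++) {   /* positions, not components */
                sum += weights[i][j] * v[j][c];
            }
            out[i][c] = sum;
        }
    }
}
```

If the inner loop ran over `d` instead of `n`, the code would compile and produce numbers, just the wrong ones, which is why this is flagged as the common mistake.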
compute_generation_with_cache(...)
Stage 6
Do the same attention logic again, but reuse history
This function is the most crowded because it has to compute the new generated token's projections, append to the caches, score against the cache, produce the output, and return the dot-product count.
long compute_generation_with_cache(
int n, int d, int g, int t, const int mask[MAX_TOKENS],
double gen[MAX_GEN][MAX_D], double wq[MAX_D][MAX_D],
double wk[MAX_D][MAX_D], double wv[MAX_D][MAX_D],
double k_cache[MAX_TOKENS + MAX_GEN][MAX_D],
double v_cache[MAX_TOKENS + MAX_GEN][MAX_D],
double output[MAX_D]);
What makes this different
Prompt attention compares prompt rows to other prompt rows. Generation compares one fresh generated query against the already-built cache of older keys and values.
Why the cache arrays are bigger
The cache has to hold both the original prompt rows and the later generated rows, so the row capacity is MAX_TOKENS + MAX_GEN.
What the return value means
The function returns a long because the stage also has to report how many dot products were computed for that generated token.
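The cache pattern at the heart of this stage can be sketched in isolation. Everything below is illustrative (the name `append_and_count` and the simplified signature are not from the scaffold): it shows only the append-then-score-against-history step and the dot-product counting, without the projections, scaling, softmax, or output blend.

```c
#define MAX_D 4
#define CACHE_CAP 16

/* Append the new key row at index t, then score one query against all
   t+1 cached keys, counting one dot product per cached row. */
static long append_and_count(int d, int t,
                             const double new_k[MAX_D],
                             const double q[MAX_D],
                             double k_cache[CACHE_CAP][MAX_D],
                             double scores[CACHE_CAP]) {
    long dots = 0;
    for (int c = 0; c < d; c++) {
        k_cache[t][c] = new_k[c];           /* append to the cache */
    }
    for (int j = 0; j <= t; j++) {          /* one query vs whole history */
        double sum = 0.0;
        for (int c = 0; c < d; c++) {
            sum += q[c] * k_cache[j][c];
        }
        scores[j] = sum;
        dots++;
    }
    return dots;
}
```

The count grows by one for every row already in the cache, which is exactly the saving the cache buys you: old keys and values are never recomputed, only re-read.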
Provided Helpers
These are the functions you should read once, understand, and usually leave alone:
static double clamp_near_zero(double value) { ... }
void print_vector(int d, const double row[]) { ... }
void print_attention_matrix(int n,
const double matrix[MAX_TOKENS][MAX_TOKENS]) { ... }
void print_embedding_matrix(int n, int d,
const double matrix[MAX_TOKENS][MAX_D]) { ... }
int compare_tokens(const void *a, const void *b) { ... }
Do not casually rewrite the printing helpers
Why they matter:
- clamp_near_zero(...) rounds tiny floating-point numbers to 0.0
- print_vector(...) prints a length-d vector rounded to 3 decimal places, separated by spaces
- print_embedding_matrix(...) prints an n by d embedding matrix with each row on a new line
- print_attention_matrix(...) prints an n by n attention matrix with each row on a new line
- compare_tokens(...) is an optional helper function for sorting an array of strings with qsort
These helpers are part of the expected output behaviour, so changing them is risky unless the spec explicitly tells you to.