Phase 1 · Foundations of intelligence

The transformation

What changes between entering the bench and leaving it.

The reader arrives at the bench with whatever picture of AI public discourse has installed: a single mysterious thing, sometimes magical, sometimes dismissed. Phase 1 replaces that picture with a working noun.

An intelligence system is a thing with inputs, internal state, outputs, and a learning signal. It runs as optimisation against an objective. Its capability is shaped by what signal it was trained on, what representation it built, and what constraints it lives under. None of this requires mathematics yet. It requires the right vocabulary and the right mental model.

By the end of Phase 1, the reader can describe any AI system as: this input → this representation → this objective → this output, trained with this signal under these constraints. The maths, the hardware, the architecture, and the training stack all get layered onto that skeleton in the phases that follow.
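The skeleton can be sketched in a few lines of Python. This is a toy under stated assumptions, not a course artefact: the class `TinySystem`, its single `weight`, the squared-error objective, and the clipping constraint are all hypothetical choices made for illustration. The update rule inside `learn` is ordinary gradient descent, which later phases treat properly; here it only stands in for "the learning signal feeds back into the internal state".

```python
from dataclasses import dataclass


@dataclass
class TinySystem:
    """Hypothetical one-parameter system: output = weight * representation."""
    weight: float = 0.0        # internal state, adjusted by the learning signal
    max_weight: float = 10.0   # a constraint: the state may not leave this range

    def represent(self, raw: str) -> float:
        # Representation: turn the raw input (text) into a number the system can use.
        return float(raw)

    def forward(self, raw: str) -> float:
        # Input -> representation -> output.
        return self.weight * self.represent(raw)

    def objective(self, raw: str, target: float) -> float:
        # Objective: squared error between output and target (lower is better).
        return (self.forward(raw) - target) ** 2

    def learn(self, raw: str, target: float, step: float = 0.05) -> None:
        # Learning signal: the error, fed back into the internal state.
        x = self.represent(raw)
        error = self.forward(raw) - target
        self.weight -= step * 2 * error * x
        # Constraint: clip the state to its allowed range.
        self.weight = max(-self.max_weight, min(self.max_weight, self.weight))


def train(system, examples, epochs=200):
    # Repeatedly show every example and let the signal adjust the state.
    for _ in range(epochs):
        for raw, target in examples:
            system.learn(raw, target)
    return system


examples = [("1", 2.0), ("2", 4.0), ("3", 6.0)]  # toy data: target = 2 * input
system = train(TinySystem(), examples)
print(round(system.weight, 3))          # the learned internal state
print(round(system.forward("4"), 3))    # output on an input not in the examples
```

Read against the sentence above: the string is the input, `represent` builds the representation, `objective` is what is optimised, `forward` produces the output, `learn` applies the signal, and `max_weight` is the constraint.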

Phase 1 in one line

Phase 1 teaches what the machine fundamentally is: optimisation against an objective, shaped by signal and constraints, surfacing as representation. Mechanism first.

The systems loop

The shape that recurs through every later phase.

The diagram below is the seed diagram of the course. Phase 4's transformer block is the same loop instantiated with attention and feed-forward layers. Phase 5's training loop is the same loop with feedback to the parameters. The progressive diagram evolution starts here.

Fig 1 · The Phase 1 systems loop. Forward arrows (amber) carry input through representation, internal state, and output. The learning signal (green, dashed) closes the loop by feeding back from target or reward into the internal state. Constraints (red, dotted) bound every block. This shape recurs at higher fidelity in the transformer block (L43), the training loop (L49), and the deployment stack (L58–L67).

The 10 stations

The bench, left to right.

Each station is a physical object on the bench, anchored to one concept. The route is the spine of Phase 1. Walking the bench is the consolidation step that turns the lessons into structural memory.

L1 · The reading lamp · what is an intelligence system: A working definition. Inputs, internal state, outputs, learning signal.
L2 · The blank page · pattern, prediction, compression: Prediction and compression are the same problem under the hood.
L3 · The folded map · generalisation: Memorisation vs abstraction. Training error vs test error.
L4 · The dust in sunlight · emergence (first pass): When more becomes different. Mechanistic, not mystical. Revisited at L75.
L5 · The toolbox · learning paradigms: Supervised, unsupervised, self-supervised, reinforcement.
L6 · The maze on the wall · sequential decision making: State, action, reward, policy. Why RL is harder than supervised learning.
L7 · The mirror · representation: How the system sees itself seeing the world. Where the first core law lands.
L8 · The spool of solder · tokens: Quantising messy input into discrete units. Byte pair encoding.
L9 · The compass · embeddings (intuition): Meaning as direction in a high-dimensional space. Geometry teaser.
L10 · The dust cover · current AI perimeter: Stable capability, brittle trick, confidently wrong output that looks like capability.
S1 · Synthesis · the whole bench, end to end: Compress the systems vocabulary into one mental model. Bridge to Phase 2.
C1 · Calibration · mechanism check: 4–6 open-ended questions. Gate to Phase 2.
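
The claim at station L2, that prediction and compression are the same problem, can be made concrete with a small sketch. It relies on one standard fact: an ideal coder spends about -log2(p) bits on a symbol it predicted with probability p. The toy message and the two predictors below are hypothetical, chosen only to show that the better predictor yields the shorter encoding.

```python
import math
from collections import Counter


def code_length_bits(text, probs):
    """Bits an ideal coder would need for `text` under the given symbol probabilities."""
    return sum(-math.log2(probs[ch]) for ch in text)


text = "aaaaaaabaaaaaaab"  # hypothetical toy message: mostly 'a', occasionally 'b'

# Predictor A: knows nothing, treats 'a' and 'b' as equally likely.
uniform = {"a": 0.5, "b": 0.5}

# Predictor B: has learned the symbol frequencies of the message.
counts = Counter(text)
learned = {ch: n / len(text) for ch, n in counts.items()}

bits_uniform = code_length_bits(text, uniform)
bits_learned = code_length_bits(text, learned)
print(bits_uniform, bits_learned)  # the learned predictor needs fewer bits
```

The only difference between the two runs is the quality of the prediction; the shorter encoding follows from it directly.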
    

Phase 1 themes

What Phase 1 reinforces and what it refuses.

Four themes thread the bench. Each one cuts against a specific tendency in how AI is talked about elsewhere.

Theme · 1

Mechanism over mysticism

"The model just understands" is not an explanation. Phase 1 names objective, representation, signal, and constraint instead. Where a behaviour is currently unexplained, the lesson says so and references current interpretability work.

Theme · 2

Optimisation over magic

Capability is a function of what was optimised against, not a property the system "wants" to have. L5 makes the paradigms explicit; L6 makes the temporal credit assignment problem of RL explicit; the whole phase resists language that hides the optimisation.

Theme · 3

Representation over anthropomorphism

The system operates on its representation of the world, not on the world. Choice of representation often matters more than the algorithm running on top. L7 lands this as a core law (representation shapes computation) that recurs through Phase 3, Phase 4, and Phase 6.

Theme · 4

Constraints over hype

The capability perimeter (L10) is the operational consequence of constraints: data, compute, signal, deployment. The honest perimeter is what separates engineering from press release.

The capability perimeter (L10)

Three honest categories.

By the end of Phase 1, the reader can sort claimed AI capabilities into the three categories below, with the mechanism that puts each one there. This sorting is the operational habit Phase 1 installs.

stable capability

The system is reliably useful at this. The mechanism is well-understood; the failure modes are bounded. Example: text classification on in-distribution data.

brittle trick

The system can do this on benchmarks. On real inputs slightly outside the training distribution, it falls over. The brittleness traces back to a specific representation or signal limit.

confidently wrong

The system produces output that looks like a capability but isn't. The output is fluent; the underlying claim is false. The mechanism is overconfidence in low-evidence regions of the input space.
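
One way to see this mechanism in miniature, as a sketch rather than an account of any production system: a logistic classifier's confidence grows with distance from its decision boundary, whether or not any training data ever lived out there. The weight, bias, and the assumed training range of roughly [-1, 1] are hypothetical values picked for illustration.

```python
import math


def confidence(x, weight=2.0, bias=0.0):
    """Probability a hypothetical logistic classifier assigns to its predicted class."""
    p_positive = 1.0 / (1.0 + math.exp(-(weight * x + bias)))
    return max(p_positive, 1.0 - p_positive)


# Assumption: the training inputs all fell roughly in the range [-1, 1].
near_data = confidence(0.5)      # inside the region the system has evidence about
far_from_data = confidence(50)   # far outside it: no evidence, yet near-certain output

print(round(near_data, 3), round(far_from_data, 3))
```

The input with the least supporting evidence receives the most confident answer. Fluent text from a large model is a far more complicated case, but the sorting habit is the same: ask how much evidence sits behind the confidence.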

Core laws established in Phase 1

What lands here · what recurs later

Representation shapes computation. Established at L7. Recurs through Phase 3 (the chip is designed for matmul on representations), Phase 4 (architectures encode different representation choices), Phase 6 (embeddings and retrieval live or die on representation quality).
Optimisation shapes capability. Threaded across L5 and L6, and revisited every time Phase 5 runs an objective on a model. The system gets what its objective rewards.
Constraints shape systems. The compute-spectrum lens lands lightly at L1 (intelligence systems exist across tiers) and at L10 (the perimeter is shaped by what hardware you have). This law turns into the whole of Phase 3.
Geometry enables generalisation. Teased at L9 with embeddings as direction. The full geometric treatment lives in Phase 2 (L11–L13) and is then used as scaffolding through the rest of the course.
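
"Embeddings as direction" can be previewed with cosine similarity, which compares the direction of two vectors and ignores their length. This is a minimal sketch: the three-dimensional vectors below are hand-picked and hypothetical, whereas real embeddings are learned and have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors: 1.0 same direction, 0.0 orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.8, 0.9, 0.2],
    "solder": [0.1, 0.2, 0.9],
}

cat_kitten = cosine(embeddings["cat"], embeddings["kitten"])
cat_solder = cosine(embeddings["cat"], embeddings["solder"])
print(round(cat_kitten, 3), round(cat_solder, 3))  # related pair scores higher
```

Scaling a vector leaves its cosine with any other vector unchanged, which is why the intuition is about direction rather than magnitude.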

Bridge to Phase 2

From vocabulary to apparatus.

Phase 1 leaves the reader with a working mental model of intelligence-as-optimisation. The model is durable but unquantified. You can talk about representation; you can't yet say what a representation looks like as a 768-dimensional vector. You can name optimisation; you can't yet describe what a gradient is or which direction to step.

Phase 2 is the apparatus. The whiteboard wall sketches the maths the rest of the course depends on: vectors, matrices, gradients, probability, entropy, parallelism, and scaling intuition. None of it is heavier than it has to be. Each piece earns its place because it shows up later.

The S1 synthesis runs the bench in one breath and names the bridge explicitly. The C1 calibration gates the move. If C1 doesn't stick, you walk the bench again before crossing the workshop to the wall.
