# The memory palace (v3)

A workshop you walk through in your head. 7 rooms plus a staircase, 79 numbered concept stations. Each station is a physical object, anchored to a concept. The route is the spine of the course.

You walk the route from start to current lesson every time you review. That's how the order sticks.

v3 adds **the workshop doorway** as Station 0 (Lesson 0 orientation), plus a closing **walk-and-view** for each room as the phase synthesis lesson. Calibration assessments aren't palace stations; they're operational checks done at the central workbench in the middle of the workshop.

---

## The route

0. **The workshop doorway** (Lesson 0, orientation). You stand at the entrance before you go in. A pinned schematic of the workshop hangs beside you, showing all 7 rooms. You read it. After each phase you come back in your head and tick off the room.
1. **The bench** (phase 1, foundations of intelligence). You sit. You think before you build. 10 stations along the bench, left to right. Closing walk: S1, all 10 stations in one breath.
2. **The whiteboard wall** (phase 2, mathematical intuition). You stand. You sketch the maths you'll need. 11 stations across the wall. Closing view: S2, the whole wall read as one picture.
3. **The server bay** (phase 3, hardware). Through a heavy door, into the noisy room with the racks. 12 stations around the cold aisle. Closing walk: S3, the cold-aisle loop, clockwise.
4. **The drafting table** (phase 4, neural architectures). Back into the main workshop. The big flat table where architectures get drawn. 14 stations across the table. Closing view: S4, all 14 sketches laid out in order.
5. **The foundry** (phase 5, training and scaling). Down a corridor, into the room where things get made. 10 stations along the production line. Closing walk: S5, the production line end to end.
6. **The lab bench** (phase 6, engineering and deployment). Beside the foundry, the wiring rig where you assemble systems. 10 stations along the bench. Closing view: S6, every connection traced.
7. **The stairs** (phase 7A, research literacy). A metal staircase up to the roof, with a small reading area as you climb. 3 stations on the way up. Closing view: S7A, look down the staircase and see the 3 landings.
8. **The roof** (phase 7B, frontier intelligence). Up the stairs, onto the flat roof. You look out. 9 stations across the parapet. Closing view: S7B, the full horizon survey.

**The central workbench** sits in the middle of the workshop where the corridors meet. After each phase, you sit at the central workbench and take the calibration (C1-C7B). It's not a palace station; it's an operational test bench you return to between rooms.

---

## Why workshop, still

You navigate workshops without thinking. The route is muscle memory. Anchoring abstract AI concepts to a place you already have spatial memory for is the cheat code that makes 79 lessons stick.

If you have a real workshop you know, overlay this onto it. If not, the one below is fine.

---

## Phase 0: the workshop doorway (orientation)

Before any room. 1 station, plus the schematic on the wall.

0. **The doorway, with the workshop schematic pinned beside it.** "Entering the stack." You stand at the entrance, read the schematic, see the route before you walk it. Worldview: AI is mechanistic; you are being taught to think like a systems engineer; representation, optimisation, geometry, hardware, and constraints recur through every phase.

## Phase 1: the bench (foundations of intelligence)

Left to right along the bench. 10 stations.

1. **The reading lamp.** "What is an intelligence system." The lamp turns on, you sit down, you start thinking.
2. **The blank page.** "Pattern, prediction, compression." A page waiting to be filled with the shortest description of what's true.
3. **The folded map.** "Generalisation, memorisation, and abstraction." A compressed representation of terrain that survives travelling through unfamiliar places.
4. **The dust suspended in sunlight above the bench.** "Emergence (first pass)." Hidden structure becoming visible under the right illumination; gradual accumulation crossing a visibility threshold; complex behaviour built from many tiny interactions. (Revisited at station 75 on the parapet.)
5. **The toolbox.** "Learning paradigms." 4 drawers: supervised, unsupervised, self-supervised, reinforcement.
6. **The maze on the wall.** "Sequential decision making and reward." A small printed gridworld pinned above the bench. You can see the goal cell and the start cell. State, action, reward, policy.
7. **The mirror.** "Representation." How the system sees itself seeing the world.
8. **The spool of solder.** "Tokens." Unrolls in chunks, not infinitely thin.
9. **The compass.** "Embeddings (intuition)." Direction matters more than position.
10. **The dust cover.** "What current AI can and can't do." Some things are stably under the cover; some things only look like they are.

**S1. The whole bench, end to end.** Walk all 10 stations in one breath. Compress what they share. Core laws callback. Bridge: cross the workshop to the wall.

## Phase 2: the whiteboard wall (mathematical & computational intuition)

Standing in front of the wall, reading left to right. 11 stations.

11. **The arrow on the board.** "Vectors." Direction and magnitude, sketched.
12. **The pinned map.** "Distance, similarity, and semantic geometry." Dots clustered on a grid, threads connecting neighbours, angle arcs between arrows. Where meaning becomes geometry.
13. **The layered grids.** "Dimensions, feature spaces, and representation capacity." Transparent overlays stacked, each adding a new axis of distinction. The wall gains depth.
14. **The grid.** "Matrices and linear transforms." Bend the grid, that's the matrix doing work.
15. **The stack of transparent sheets.** "Projections, subspaces, and information selection." Light shines through a thick stack pinned to the wall; each sheet filters out some detail, so only the marks that matter reach the surface. Where the layered grids (station 13) added axes of distinction, this stack takes them away. Projection = selective preservation: keep what matters for the task, discard the rest.
16. **The dice rack.** "Probability, uncertainty, and belief." Each die shows a different possible future; no single die is certain; read together they form a distribution. Probability = uncertainty made measurable.
17. **The funnel and bins.** "Distributions, sampling, and possible futures." A clear funnel pours coloured balls into a row of bins; the pattern they pile into is the distribution, drawing one ball is a sample, and a valve sets how widely they scatter (temperature). Distribution = landscape of possibilities; sampling = drawing from it.
18. **The fog machine.** "Entropy, uncertainty, and surprise." Fog fills a network of paths: thick fog means many routes stay plausible (high entropy), cleared fog means one obvious route (low entropy). Entropy = how much uncertainty remains; fog = uncertainty; visibility = predictability.
19. **The slope and valley.** "Gradients and optimisation landscapes." A contour-line landscape with a glowing ball high on a slope; at each step an arrow points down the steepest descent and the ball rolls until it settles in a valley. Gradient = slope; optimisation = following the slope downhill; training = repeated downhill movement.
20. **The conveyor belt.** "Parallelism." A belt of identical boxes: one worker handles them one at a time (serial), while a warehouse of thousands of workers each grab a box as it passes (parallel). One worker = serial; the warehouse = parallel; modern AI works because the warehouse exists.
21. **The lever and machine.** "Compute scaling intuition." A giant lever drives a machine that keeps producing better outputs; each pull takes more effort than the last and each improvement is smaller. Lever = compute invested; output quality = capability; rising effort = diminishing returns. More compute helps, but each gain costs more than the previous one.

**S2. The whole wall, as one picture.** Step back. The maths is a single toolkit. Bridge: through the heavy door into the server bay.

## Phase 3: the server bay (hardware & systems)

Through the heavy door. Around the cold aisle, clockwise. 12 stations.

22. **The old CPU on the shelf.** "The general-purpose CPU." Serial, deep pipeline, fast at one thing.
23. **The GPU board on the rack.** "Why GPUs exist." Throughput over latency.
24. **The warp/lane diagram pinned to the rack.** "The GPU execution model." SIMT, lanes, warps.
25. **The VRAM stick on the bench.** "VRAM." Capacity and bandwidth, the memory wall.
26. **The tensor core schematic.** "Tensor cores and matmul engines." Silicon shaped for the operation that dominates everything.
27. **The memory hierarchy ladder.** "Memory hierarchies." Each rung is roughly 10× slower than the one above.
28. **The roofline poster.** "The roofline model." Compute-bound on one side, memory-bound on the other.
29. **The TPU pod model.** "TPUs and other accelerators." Systolic arrays, designed-for-one-workload silicon.
30. **The NPU in the phone.** "NPUs and edge inference." Different constraints, different shape of chip.
31. **The NVLink ribbon.** "Interconnects." The bandwidth between chips, not just inside them.
32. **The quantisation scale.** "Quantisation." FP32 down to int4, what you give up at each step.
33. **The cluster diagram on the back wall.** "Distributed compute patterns." Data, tensor, pipeline, expert.

**S3. The cold-aisle walk, clockwise.** All 12 stations in one loop. Memory hierarchy as the spine; the roofline as the constraint surface; matmul-shaped silicon as the response. Bridge: back into the main workshop, to the drafting table.

## Phase 4: the drafting table (neural architectures)

Around the big flat table. 14 stations.

34. **The neuron sketch.** "The perceptron." Linear separability, the XOR wall.
35. **The stack of layers.** "MLPs and universal approximation." Stacking nonlinearities is, in principle, enough.
36. **The flowing chain.** "Backpropagation." The chain rule, applied at scale.
37. **The image filter.** "Convolutional nets." Locality and weight sharing.
38. **The film reel.** "Recurrent nets." State through time, frame by frame.
39. **The locked memory box.** "Vanishing gradients and LSTMs." Gates that protect the signal.
40. **The spotlight.** "Attention." Soft lookup over the past, learned weighting.
41. **The grid of spotlights.** "Multi-head attention." Many lenses at once.
42. **The ruler.** "Positional encoding." Order has to be told, not inferred.
43. **The block stack.** "The transformer block." The unit cell that gets repeated.
44. **The denoising sequence.** "Diffusion." Noise as a training signal.
45. **The committee.** "Mixture of experts." Many small specialists, route to a few.
46. **The pair of senses.** "Multimodal architectures." Vision and language wired together.
47. **The label maker.** "Discriminative architectures." Not everything generates. Classifiers, rankers, two-tower retrievers; the other half of the field.

**S4. All 14 sketches laid out in order.** The architecture chain end to end, each step responding to the previous step's limit and the era's hardware. The transformer is the survivor; diffusion, MoE, multimodal, discriminative are its companions. Bridge: down the corridor to the foundry.

## Phase 5: the foundry (training & scaling)

Walking the production line. 10 stations.

48. **The ingot stack.** "Datasets and corpora." Raw material before any processing.
49. **The casting oven.** "Pretraining." Hot, long, expensive.
50. **The triangle scale model.** "The compute-data-parameters triangle." 3 axes, scale them together.
51. **The slope graph on the wall.** "Scaling laws." The curve that predicts the next model.
52. **The polishing bench.** "Supervised fine-tuning." First post-training step.
53. **The judge stand.** "RLHF." A reward model decides, the policy adjusts. (Second pass on RL from station 6.)
54. **The simpler scale.** "DPO and lighter alternatives." Less machinery, similar outcome.
55. **The synthesis vat.** "Synthetic data and self-training." Models teaching models.
56. **The shrinker.** "Distillation." Big in, small out, most of the capability preserved.
57. **The distributed forge.** "Distributed training systems." A single logical model, sharded over thousands of chips.

**S5. The production line, end to end.** From raw ingot to a serving-ready model. Pretraining cost, post-training stack, scaling laws as the curve, distillation as the move that brings a model down a tier. Bridge: next door to the lab bench.

## Phase 6: the lab bench (engineering & deployment)

The wiring rig beside the foundry. 10 stations.

58. **The keyboard.** "The API call." What crosses the wire.
59. **The vector compass II.** "Embeddings in practice." Indexing, similarity, where they break.
60. **The retrieval rig.** "RAG end-to-end." Chunk, embed, retrieve, rerank, generate.
61. **The ranking dial.** "Classical ML in production." Recommenders, two-tower ranking, gradient-boosted trees on tabular data. Most production ML isn't generative.
62. **The robot arm.** "Tool use and function calling." The model reaches out.
63. **The agent shelf.** "Agents." Small loops that occasionally work.
64. **The socket strip.** "MCP and protocols." The protocol layer for tool use.
65. **The multimeter.** "Evaluation." Did it actually work, measured properly.
66. **The local inference rig.** "Inference engines." KV cache, batching, speculative decoding.
67. **The fume hood.** "Private and on-prem AI." Filtered, controlled, off the public internet.

**S6. The wiring rig, complete.** Every connection traced. Hybrid systems are the production reality; the LLM is one component; classical ML still runs most of the world. Bridge: up the metal staircase.

## Phase 7A: the stairs (research literacy)

A metal staircase to the roof, with reading material as you climb. 3 stations.

68. **The reading desk.** "Reading an AI paper." A clipped-together paper on the first landing. You read the load-bearing claim.
69. **The benchmark scale.** "Benchmarks and how they lie." A scale at the second landing with a clearly rigged calibration. You learn to spot the rigging.
70. **The mirror under the magnifier.** "Reading scaling graphs critically." A log-log plot at the top of the stairs, magnified so the axis tricks become obvious.

**S7A. Looking down the staircase.** Three landings together: load-bearing claim, benchmark honesty, graph honesty. You are now equipped to read primary sources. Bridge: step out onto the roof.

## Phase 7B: the roof (frontier intelligence)

Up the last step. Onto the flat roof. 9 stations along the parapet, looking out.

71. **The chess piece.** "Reasoning models." RL on chains of thought. (Third pass on RL from stations 6 and 53.)
72. **The globe.** "World models." Predicting the world's next state, not just text.
73. **The walking robot.** "Embodiment and robotics foundation models." Sensors and motors meet foundation models.
74. **The reflection pool.** "Self-modeling and introspection." The model representing its own state.
75. **The dust on the parapet.** "Emergence revisited (mechanism, not magic)." Deliberate callback to station 4. The same theme at a higher altitude, with all the architecture context now in place.
76. **The candle on the parapet.** "Theories of consciousness." Old questions, new tools, honest uncertainty.
77. **The chain and the open book.** "Alignment and interpretability." Scalable oversight, mech interp, faithful CoT.
78. **The forking road sign.** "AGI hypotheses and timelines." The honest spread of views.
79. **The lighthouse.** "Future hardware." Photonics, neuromorphic, beyond-CMOS. The next physical substrate.

**S7B. The full horizon survey.** Reasoning, world models, embodiment, self-modelling, emergence-revisited, consciousness, alignment, AGI views, future hardware. All 5 core laws show up here at once. The course ends with the reader equipped to follow primary sources for the next 1-2 years.

---

## How to actually use this

After each lesson, close the page. Stand up. Walk the route in your head from station 1 to the current station. Name each station's concept out loud.

When you can walk all 79 without a hitch, the course has stuck.

If a station goes fuzzy, that's your re-study signal. Open that lesson.

The hardware room (phase 3) gets walked more often than the others. Every architecture (phase 4) and training (phase 5) lesson has an interleaved retrieval question that pulls you back into the server bay. So you walk that route twice for every architecture lesson. By the end of phase 5 you should be able to recite the memory hierarchy ladder cold.

The maze on the wall (station 6, RL) is another high-traffic node. Stations 53 (RLHF), 71 (Reasoning models), 72 (World models), and 73 (Robotics) all interleave back to it.

The dust pattern repeats too. Station 4 (emergence first pass) is referenced from stations 43 (transformer block), 45 (MoE), 51 (scaling laws), 71 (reasoning), 72 (world models), and finally reframed at station 75 (emergence revisited). When you walk the route, you should feel the theme threading through the rooms.