AIPOWERHOUSE
One day. One real ticket from our own backlog. You run the whole lifecycle — SPECIFY → GENERATE → COMPREHEND — with an agent doing the typing.
RULE OF THE DAYShort talks, long labs. If I speak for more than fifteen minutes, something has gone wrong.
Timeboxes are hard; done-when beats done-everything. Whatever state your lab is in when time runs out, that's what we debrief.
Every piece of work is a sandwich. You are the bread on both ends; the AI is the filling — the only loop you follow today.
- Specify — you say what you want and what the rules are. A clear spec means less guessing.
- Generate — the agent does the work. It writes code, tests, and edits.
- Comprehend — you read it back, ask questions, and decide. This is where quality is set.
- Skip Comprehend and you did not save time. You just pushed the work to later.
Metaphor: Dan Shipper, Every's podcast “AI & I”, with Kieran Klaassen. The Specify→Generate→Comprehend framing is this engagement's own adaptation.
You write less and less code over time. The time does not vanish — it moves to the two human ends.
The front of the sandwich. Vague in, vague out — judgment goes in here, while changing your mind is still cheap. Unclear requirements are this company's single biggest source of delay; this block is the antidote.
Your tool for the day: a session, your repo, and control over what it may do.
- Start it in the repo root. Now it can see your real code.
- Permissions: it asks before acting, until you choose to let it run.
- Talk to it in plain language. Point at real files. No magic words.
- Esc interrupts at any time. You are always the one in charge.
Your ticket becomes a plan so clear someone else could build it.
Stretch: two competing approaches from the agent; one sentence on why you chose yours.
It read, it ran, it looked at the result, and it went again. That loop is the difference between an agent and a chatbot.
- Each step is a tool call. The result decides the next step.
- It acts — it doesn't just answer. That's what makes it wave 2.
- The loop stops when the goal is met, or when it needs you.
Long session, many files — and the answers went vague. The context window is nearly full. This is normal; now you know its name.
- The agent's short-term memory has a hard limit. Old details fall off or blur.
- The fix: compact the session, start fresh, or write the state to a file first.
- Long runs are managed, not endured. This is half of what supervision means.
Many agents can work at the same time. But only one person reviews. That is the slow part.
- Agents spread out. Many tasks generate at the same time.
- Human review is the one slow step. It sets how fast the factory really goes.
Today was the whole sandwich, fast. The next sessions slow it down — each one takes a station you ran today and builds it properly, with homework on real tickets in between.
Everyone drove an agent through Specify → Generate → Comprehend on our own code — and the lessons you wrote into CLAUDE.md are still here tomorrow. That loop, repeated, is the factory.
The harder parts — many agents at once, autonomous routines, real review at volume — come in the next sessions. We are not learning a tool. We are building a factory the team owns. Today was day one.
Six blocks. This is the only one where I talk for more than fifteen minutes. Everything after it, you build — on a real ticket from our own backlog.
If one of these is false for you, raise a hand now — the rescue corner fixes it while we frame the day.
First to call one out loud claims it; we tally at the demo circle. Noticing these moments is the actual skill this track teaches.
You wouldn't take a new colleague's word for everything on day one. Same rule here. First contact is read-only: find out what it knows, where it bluffs, and how it works.
Calibrate: what does it actually know about our codebase — and where does it bluff?
Stretch: make it draw the subsystem as a diagram — and grade the diagram too.
No tricks. Give the agent what a new teammate would need on their first day.
- Say the goal and the rules — not every step.
- Point to the real files, and to examples of how we already do it.
- Say what “done” looks like, in checkable terms.
- Let it ask. A good agent interviews you before it builds.
Don't let it build yet. Make it ask questions until you both mean the same thing.
- The agent interviews you: edge cases, assumptions, what must not change.
- Disagreements surface now, in planning — not later, in a pull request.
- Only then does it produce a plan you can hand over with confidence.
A feeling is not a finish line. Done is a list of checks that pass or fail.
- 3–5 concrete checks per ticket: given this, when that, then this.
- The checks become tests. The tests give the agent a target.
- “Looks right” is not done. A passing check is done.
Every vague spot your pair marked in Lab 2 lives here. The seam is the most expensive place to be vague — and it is where our delays begin.
The skill of this block is knowing when to step in — and when to sit on your hands.
Your hands leave the keyboard. Your attention doesn't.
- Checks first: the done-when list from your plan becomes failing tests, then code.
- Intervene when it asks, when it drifts off-plan, or when a spotter moment fires.
- Don't grab the wheel at the first wobble — watch it try to recover first.
- Read the tool calls as they happen. Narrate to your pair what it's doing and why.
Hand over the plan from Lab 2. Your job changes: supervise.
Stretch: kick off a second, smaller task in a separate session — and feel what it does to your supervision of the first.
You met all four this morning without their names. Now they have names — we go deeper on each in later sessions.
The back of the sandwich — and the part that decides whether AI makes us faster or just busier. Code you can't explain isn't done.
Write down what you expect the diff to contain — before you open it.
- Expectation first: your model of the change must exist before the agent's.
- Every surprise is either something you misunderstood or something it overdid. Classify each one.
- Ask your pair one question about their diff. If they can't answer, that's comprehension debt — found while it's cheap.
- AI review augments human review. It never replaces it.
Evidence first. Then read the diff like a stranger wrote it.
Stretch: ask the review agent for security findings — injection, authorization, input validation.
The gap between code that exists and code anyone understands. It compounds quietly — and it is the failure mode this whole track is built to prevent.
- Throughput without comprehension makes a team fast and fragile at the same time.
- The unit of review is the unit of understanding. Keep changes small.
- You just practiced the antidote: expectation-first review, and one honest question.
Everything you explained twice today is a lesson the system should never need again. Write it down, prove it works, and the factory gets smarter. Then we demo, measure, and close.
One file the agent reads every session. Today's friction becomes tomorrow's head start.
- Rules, traps, where things live — written once, read every session.
- It's the team's shared memory, kept in version control.
- A team that writes lessons down owns a factory that improves weekly. A team that doesn't starts from zero every morning.
Make tomorrow's agent smarter than today's. One rule, written and proven.
Stretch: turn a workflow you repeated today — like the grill prompt — into a reusable skill stub.
A comprehension check, not a victory lap.
“It went perfectly” earns follow-up questions. Caught failures earn applause.
If we only count speed, we end up approving work nobody read. So we count both — agreed now, before story points become the only number.
Baseline taken today. Same questions at week 8 — results shown, not claimed.