AI Development Workflow — 5-Part Series
  1. AI-Assisted Development: A Loop, Not a Chat
  2. /plan-issue: Collaborative Planning with AI
  3. /work-issue: Autonomous Implementation
  4. /qa-run: AI-Driven QA That Closes the Loop
  5. Specialist Agents: Looking at Every Page with Different Eyes

/work-issue: Autonomous Implementation

This is Part 3 of my series on AI-assisted development. Part 1 covers the loop, Part 2 covers planning.

/plan-issue is collaborative — I sit with the AI and make decisions. /work-issue is the opposite. It’s autonomous. The AI picks up a planned issue from Linear, implements it, tests it, commits it, and marks it Done. I’m not in the room.

/work-issue works fine interactively, but it gets way more useful in headless mode with claude -p. You can put it on a loop and leave it running overnight. My day now looks like: plan issues in the afternoon, start the batch script in the evening, review commits and run QA in the morning. My laptop is basically always doing something.
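That loop itself doesn't need much. A minimal sketch, assuming the claude CLI is on PATH; `run_batch` and the prompt argument are illustrative names, not the real orchestrator:

```shell
# Minimal overnight loop: one fresh claude -p process per issue, so no
# context carries over between issues. run_batch is a hypothetical name.
run_batch() {
  local n=$1 prompt=$2
  for ((i = 1; i <= n; i++)); do
    claude -p "$prompt" || break   # stop the batch on the first failure
  done
}

# Usage: run_batch 50 "$(cat .claude/commands/work-issue.md)"
```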

This only works because planning happened first. The spec in the Linear issue tells the AI exactly what to build, which files to touch, what tests to write, and what’s out of scope. Without that, autonomous implementation is unreliable. I tried — it produced code I’d have to rewrite anyway. That was probably the most important lesson in this whole process: autonomous doesn’t mean unsupervised, it means well-prepared.

What I learned building the protocol

/work-issue follows a rigid protocol defined in .claude/commands/work-issue.md. I didn’t start with 13 steps. I started with “just implement this issue” and added structure every time something went wrong. Each step exists because I learned the hard way what happens without it.

1. Pick an issue

/work-issue                          # Auto-pick oldest Todo issue
/work-issue --id=PROJ-42             # Specific issue
/work-issue --git=pr                 # Auto-pick, create a PR
/work-issue --id=PROJ-42 --git=pr   # Specific issue, create a PR

Auto-pick grabs the oldest issue in Todo — which is the queue that /plan-issue populates. This means issues get processed in the order they were planned.

There’s also parent issue detection. If the picked issue has subtasks, the AI works on the first Todo subtask instead of the parent. When all subtasks are done, the parent auto-completes.
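The selection rule is simple to state. A sketch, with subtasks encoded as hypothetical `id=status` pairs rather than Linear's real API payload:

```shell
# Parent-issue rule: work the first subtask still in Todo; if none remain,
# the parent itself is complete. Input format is illustrative, not Linear's.
pick_subtask() {
  local pair
  for pair in "$@"; do
    if [ "${pair#*=}" = "Todo" ]; then
      echo "${pair%%=*}"      # first Todo subtask wins
      return 0
    fi
  done
  echo "parent-done"          # no Todo subtasks left: auto-complete parent
}
```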

2. Claim it

The issue moves to In Progress immediately. This is critical for batch mode — it prevents the next run from picking the same issue.

3. Git setup

Three strategies:

  • main (default) — work directly on main. Good for small, well-planned changes.
  • branch — create a feature branch like issue/proj-42-add-notifications. Good for larger changes you want to review before merging.
  • pr — create branch + push + open a GitHub PR. Good for team workflows.
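Under the hood the three strategies reduce to a few git commands. A sketch with a hypothetical `git_setup` helper; the `gh pr create` call stands in for however the PR actually gets opened:

```shell
# Map a git strategy to setup commands. Branch naming mirrors the
# issue/proj-42-add-notifications example; the gh call is illustrative.
git_setup() {
  local strategy=$1 slug=$2
  case "$strategy" in
    main)   ;;   # work directly on the current branch
    branch) git checkout -b "issue/$slug" ;;
    pr)     git checkout -b "issue/$slug" &&
            git push -u origin "issue/$slug" &&
            gh pr create --fill ;;
    *)      echo "unknown strategy: $strategy" >&2; return 1 ;;
  esac
}
```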

4. Plan (internally)

The AI reads the issue spec from Linear, fetches comments and related issues, reads the architecture docs from docs/ (same ones used during planning — platform overview, tech stack, data model), and explores the relevant code. It builds an internal plan — no plan file is written, it just organizes its approach.

If the issue is too vague (bad spec, missing details), the AI adds a Linear comment explaining what’s unclear and stops. This is an important guardrail — it’s better to stop than to guess wrong.

5-6. Implement and test

The AI writes code following existing patterns. This is where the spec pays off:

  • “Files likely affected” tells it where to look
  • “Technical approach” tells it what to do
  • “Acceptance criteria” tells it when to stop
  • “Out of scope” tells it what NOT to do

It writes tests for every implementation — backend Go tests, frontend Vitest tests. This isn’t optional; the protocol requires it.

7-8. Verify

make test          # All tests
make test-frontend # Frontend specifically
make typecheck     # TypeScript checks
make check         # fmt + vet + lint + tests

If tests fail, the AI fixes and retries. If lint fails, it fixes and retries. Up to 3 attempts before noting the issue and moving on.
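The retry rule can be sketched as a small wrapper; `retry_check` is a hypothetical name, and in the real protocol the "fix" between attempts is the model editing code, not a shell step:

```shell
# Run a verification command up to 3 times, leaving room for a fix
# between attempts. The fix step here is only a log line.
retry_check() {
  local attempt
  for attempt in 1 2 3; do
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt failed, fixing and retrying..." >&2
  done
  echo "giving up after 3 attempts: $*" >&2
  return 1
}

# Usage: retry_check make test
```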

9. Self-review

The AI runs git diff and checks its own work for:

  • Leftover debug statements
  • Hardcoded values
  • Missing error handling
  • SQL injection or security issues
  • Naming inconsistencies

This catches a surprising number of things. AI is good at following checklists. For example, on PROJ-38 (the login redirect fix), the self-review caught a hardcoded API endpoint that should have been environment-based. The AI fixed it before committing, changing /api/v1/auth to process.env.API_BASE_URL + '/auth'. Small catch, but the kind of thing that would've broken in staging.
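Part of the checklist is mechanical enough to sketch as a grep over the diff. The patterns below cover one item (debug statements) and are illustrative; the full review is the model reading the diff:

```shell
# Scan added lines of a diff for debug leftovers. Patterns are
# illustrative examples, not the whole self-review checklist.
review_diff() {
  local diff=$1 found=0
  if echo "$diff" | grep -nE '^\+.*(console\.log|fmt\.Println|debugger)'; then
    echo "debug statements left in added lines" >&2
    found=1
  fi
  return $found
}

# Usage: review_diff "$(git diff)"
```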

10-11. Commit and push

Only files relevant to the issue get staged — never git add -A. The commit message follows a consistent format:

PROJ-42: Add review deadline notifications

- Add notification_preferences table and migration
- Implement daily cron job for deadline checking
- Add NotificationBadge component to Calendar page

Co-Authored-By: Claude Opus 4.6 <[email protected]>
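Put together, the staging and commit step might look like this sketch; the staged paths are illustrative, and the message body is the example above:

```shell
# Stage only the files this issue touched -- never git add -A.
commit_issue() {
  # Paths below are illustrative, not the actual changeset:
  git add internal/notifications/ web/src/components/NotificationBadge.tsx
  git commit \
    -m "PROJ-42: Add review deadline notifications" \
    -m "- Add notification_preferences table and migration
- Implement daily cron job for deadline checking
- Add NotificationBadge component to Calendar page" \
    -m "Co-Authored-By: Claude Opus 4.6 <[email protected]>"
}
```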

12. Update Linear

The AI comments on the issue with a summary of what was done, which files changed, the commit hash, and test results. Then it marks the issue Done.

If this was a subtask, it checks if all siblings are done. If so, it auto-completes the parent.

13. Notify

curl -d "brot: Completed PROJ-42 — Add review deadline notifications" "$NTFY_URL"

This sends a push notification via ntfy to my phone. When I’m running batch mode and doing something else entirely, I get a ping for each completed issue. It’s the “ding” that lets me know work is happening in the background.

The real unlock: headless batch mode

The batch orchestrator is a shell script that processes multiple issues one after the other. Each issue gets its own headless claude -p call — fresh context, no leftover state from the previous issue:

./scripts/work-linear-issues.sh -n 50 --stop-on-error

I usually kick this off in the evening with 40-50 planned issues in Linear. Takes a few hours to go through them — each issue runs for 2-5 minutes depending on how big it is. By morning, they're done.

What the script does:

  1. Queries Linear for the next Todo issue
  2. Spawns claude -p with the work-issue prompt in a fresh context
  3. Monitors progress with a live spinner that shows elapsed time and files being changed
  4. Handles failures — retries on transient API errors, stops or continues based on flags
  5. Prints a summary when done
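The control flow of those steps can be sketched as a loop; `next_todo_issue` and `run_claude` are hypothetical stand-ins for the Linear query and the headless claude -p call, and the live spinner is omitted:

```shell
# Orchestrator sketch: query, spawn, count, summarize. next_todo_issue
# and run_claude are hypothetical helpers, not the real script's names.
process_issues() {
  local max=$1 stop_on_error=$2 ok=0 failed=0 id
  while [ $((ok + failed)) -lt "$max" ]; do
    id=$(next_todo_issue) || break        # 1. next Todo issue from Linear
    if run_claude "$id"; then             # 2. fresh claude -p per issue
      ok=$((ok + 1))
    else
      failed=$((failed + 1))              # 4. failure: stop or continue
      [ "$stop_on_error" = "true" ] && break
    fi
  done
  echo "Completed: $ok    Failed: $failed"   # 5. summary line
}
```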
╭─────────────────────────────────────────────╮
│  Linear Issue Processor                     │
│                                             │
│  Issues: 5    Model: auto     Git: main     │
│  Stop on error: false                       │
╰─────────────────────────────────────────────╯

  ● PROJ-38 (High) Fix login redirect loop [opus]
  ✓ PROJ-38: Fix login redirect loop (3m 42s)
    Files (3):
      internal/api/auth.go
      web/src/hooks/useAuth.ts
      web/src/components/LoginForm.test.tsx
    Commit: a1b2c3d PROJ-38: Fix login redirect loop

  ● PROJ-39 (Medium) Add empty state to library [sonnet]
  ✓ PROJ-39: Add empty state to library (1m 15s)
    ...

╭─────────────────────────────────────────────╮
│  Summary                                    │
├─────────────────────────────────────────────┤
│  ✓ PROJ-38  Fix login redirect loop   3m42s │
│  ✓ PROJ-39  Add empty state to lib    1m15s │
│  ✓ PROJ-40  Update review badges      2m03s │
│  ✗ PROJ-41  Refactor calendar view    5m00s │
│  ✓ PROJ-42  Add notifications         4m22s │
├─────────────────────────────────────────────┤
│  Completed: 4    Failed: 1    Total: 5      │
│  Elapsed: 16m 22s                           │
╰─────────────────────────────────────────────╯

Fresh context per issue

Each issue gets its own claude -p call — completely fresh, no memory of the previous issue. This is on purpose:

  • No leftover confusion from the previous issue
  • Each one starts clean with just the spec and the codebase

The downside is it has to re-read the docs and explore the code every time. The upside is it doesn’t get confused.

Failure handling

When an issue fails:

  • The AI leaves it In Progress with a comment explaining what went wrong
  • The working tree is cleaned up (git checkout .)
  • The script either stops (--stop-on-error) or continues to the next issue
  • A detailed log is saved to logs/work-issues/
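The failure path can be sketched with a hypothetical `linear_comment` helper and a `LOG_DIR` variable standing in for logs/work-issues/:

```shell
# Failure handler sketch: comment, clean up, log. linear_comment and
# LOG_DIR are assumptions, not the real script's interface.
on_failure() {
  local id=$1 reason=$2
  linear_comment "$id" "Failed: $reason"   # issue stays In Progress
  git checkout .                           # clean up the working tree
  mkdir -p "$LOG_DIR"
  echo "$(date -u +%FT%TZ) $id $reason" >> "$LOG_DIR/failures.log"
}
```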

Failed issues don’t disappear — they stay visible in Linear for me to investigate and re-plan.

Notifications: staying in the loop without being in the room

Both /plan-issue and /work-issue send push notifications via ntfy when they complete. My global Claude Code config (~/.claude/CLAUDE.md) defines the pattern:

curl -d "brot: Completed PROJ-42 — Add review deadline notifications" "$NTFY_URL"

In practice, when I kick off a batch in the evening, I wake up to a trail of notifications:

brot: Completed PROJ-38 — Fix login redirect loop
brot: Completed PROJ-39 — Add empty state to library
brot: Completed PROJ-40 — Update review status badges
brot: Completed PROJ-42 — Add review notifications

(PROJ-41 failed silently — no notification — which is itself a signal to go check the logs.)

This matters more than I expected. When the batch runs overnight, waking up to 30-40 notifications means stuff got done while I slept. No notification for an issue means something went wrong — time to check the logs and re-plan. Simple, but it makes the whole overnight thing feel way less like a black box.

Guardrails I learned to add the hard way

Every guardrail in this list exists because I didn’t have it once and something went wrong:

  • One issue per run. Never attempt multiple issues in the same context.
  • Never git push --force or git reset --hard. No destructive git operations.
  • Never modify .env files. Secrets stay untouched.
  • Never delete existing tests. Only add or modify.
  • If something feels wrong, stop. Add a comment and leave the issue In Progress.
  • Only stage files you changed. Never git add -A (which would catch unrelated formatting changes from linting).
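The git guardrails amount to a deny-list. In practice they live as rules in the prompt, not as code, but a wrapper form is an easy way to see the list; `guarded_git` is hypothetical:

```shell
# Deny-list sketch for destructive git operations. A wrapper like this is
# an illustration only; the real guardrails are prompt rules.
guarded_git() {
  case " $* " in
    *" push --force"* | *" push -f "* | *" reset --hard"* | *" add -A "*)
      echo "refusing destructive git command: git $*" >&2
      return 1 ;;
  esac
  git "$@"
}
```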

These guardrails exist because autonomous means unattended — and in my case, literally overnight while I’m asleep. The cost of the AI making a bad decision at 2am is higher than the cost of it being conservative. I’d rather wake up to a failed issue (that I can re-plan) than to a force-pushed branch or a deleted test file.

Where I’ve seen this succeed and fail

Works well for:

  • Well-planned issues with clear acceptance criteria
  • Issues that follow existing patterns (new handler, new component, new test)
  • Bug fixes with reproduction steps in the spec
  • Small-to-medium scope (1-5 files)

Doesn’t work well for:

  • Vague issues (“make the UI better”)
  • Architecture changes that require judgment calls
  • Issues that depend on external services the AI can’t access
  • Large refactors that touch 20+ files

Bottom line: /work-issue is only as good as the spec it gets. Bad spec, bad code. That’s why I bother with /plan-issue.