
Testing

This chapter explains how the test suite works, how to run it, and how to extend it.

Running the suite

All test commands are Makefile targets (see make help for the full list):

Target          What it does
make test       go test ./... (the full suite, no frills)
make test-race  Same with -race for data-race detection
make cover      Same with -cover for a per-package summary
make gate       fmt-check → build → vet → test (the green gate CI runs)

The green gate is the canonical pre-merge bar. A PR is mergeable only when make gate passes cleanly.

VS Code extension (ide/vscode/)

The extension has its own toolchain (npm/tsc/esbuild/mocha) and is kept out of make gate so the Go gate stays fast and doesn't require Node. It has a dedicated CI job instead.

Target          What it does
make ext-build  npm ci + npm run bundle (esbuild bundle to dist/extension.js)
make ext-test   npm ci + npm test (plain-mocha unit suite, src/test/suite/**)

Both targets print a notice and skip (exit 0) if npm is not on $PATH, mirroring how make lint and make reuse treat optional tools.

npm test runs pretest (tsc -p ./) then mocha over out/test/suite/**/*.test.js — the fixture-driven, pure-Node unit tests (per .mocharc.json). It does not invoke @vscode/test-electron: the in-host smoke suite under src/test/vscode-suite/ runs separately via npm run test:electron (needs a VS Code download and a display), and is intentionally not part of make ext-test or CI.

Package overview

Package                              Approach
internal/engine                      End-to-end integration via fake bd + fake claude scripts
internal/anthro                      Unit tests against fake HTTP backends; one opt-in live test
internal/sched                       Pure-logic unit tests for wave scheduling
internal/ledger, beads, dispatch, …  Unit tests per package

Engine end-to-end tests

internal/engine/engine_test.go drives the full engine loop without a network or real tools.

Fake script fixtures

Two shell-script constants are written to t.TempDir() as executable binaries at test startup:

fakeBDScript — a minimal bd stub. On the first ready call it returns ready.json (one open task); on subsequent calls it returns []. Mutations (update, close, comment) succeed silently and are appended to bd.log so tests can assert what the engine asked bd to do.

fakeClaudeScript — a well-behaved implementer stub. It commits one file (agent-work.txt) to the worktree, writes SUMMARY.md to $KORYPH_SUMMARY_PATH, and emits a JSON cost line (total_cost_usd: 0.42) so the ledger cost-recording path is exercised.
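The pattern behind both stubs can be sketched as follows. This is a minimal illustration, not the project's actual fixture code: the script body, the tool.log file name, and the helpers installFake and runOnce are hypothetical, and a POSIX sh is assumed to be available.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// fakeToolScript is a hypothetical stand-in for a stub such as fakeBDScript.
// It appends its arguments to tool.log next to the script and prints "[]".
const fakeToolScript = `#!/bin/sh
echo "$@" >> "$(dirname "$0")/tool.log"
echo '[]'
`

// installFake writes script into dir as an executable file and returns its path.
func installFake(dir, name, script string) (string, error) {
	path := filepath.Join(dir, name)
	if err := os.WriteFile(path, []byte(script), 0o755); err != nil {
		return "", err
	}
	return path, nil
}

// runOnce installs the stub in a fresh temp dir, invokes it once with args,
// and returns the stub's stdout plus the contents of the log it wrote.
func runOnce(args ...string) (string, string, error) {
	dir, err := os.MkdirTemp("", "fake-tools") // a real test would use t.TempDir()
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	bin, err := installFake(dir, "bd", fakeToolScript)
	if err != nil {
		return "", "", err
	}
	out, err := exec.Command(bin, args...).Output()
	if err != nil {
		return "", "", err
	}
	logged, err := os.ReadFile(filepath.Join(dir, "tool.log"))
	if err != nil {
		return "", "", err
	}
	return strings.TrimSpace(string(out)), strings.TrimSpace(string(logged)), nil
}

func main() {
	out, logged, err := runOnce("close", "tb1")
	if err != nil {
		panic(err)
	}
	fmt.Printf("stdout=%s log=%q\n", out, logged)
}
```

Keeping the script as a string constant means the fixture needs no checked-in binaries, and the log file gives tests something concrete to assert against.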

Environment overrides

The engine reads two env vars at startup that redirect which binaries it calls:

Variable           Purpose
KORYPH_BD_BIN      Path to the bd binary (default: bd on $PATH)
KORYPH_CLAUDE_BIN  Path to the claude binary (default: claude on $PATH)

newFixture sets both via t.Setenv to point at the fake scripts, so the tests never touch the real tools. Two further overrides keep runs fast and hermetic: KORYPH_BACKOFF_SEC=0 eliminates requeue delays, and KORYPH_NO_NPX=1 prevents any Node fallback.
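An override of this kind usually reduces to a lookup with a default. The sketch below shows one way it might look; resolveBin is a hypothetical helper, not the engine's real function, and the environment lookup is injected so the logic can be tested without mutating the process environment.

```go
package main

import (
	"fmt"
	"os"
)

// resolveBin returns the value of envVar if it is set and non-empty,
// otherwise fallback (a bare command name to be found on $PATH).
// getenv is injected; production code would pass os.Getenv.
func resolveBin(getenv func(string) string, envVar, fallback string) string {
	if v := getenv(envVar); v != "" {
		return v
	}
	return fallback
}

func main() {
	fmt.Println(resolveBin(os.Getenv, "KORYPH_BD_BIN", "bd"))
	fmt.Println(resolveBin(os.Getenv, "KORYPH_CLAUDE_BIN", "claude"))
}
```

In a test, t.Setenv("KORYPH_BD_BIN", path) sets the variable for the duration of that test and restores the previous value afterwards, so fixtures do not leak into each other.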

The fixture

newFixture(t, fixOpts{}) wires up a complete mock world:

  • a temp git repo seeded with a koryph.project.json and an implementer.md agent persona
  • a registry entry pointing at the temp repo
  • a KORYPH_HOME isolated from ~/.koryph
  • the fake bd and claude binaries, with a ready.json that holds one open task (tb1)

fixOpts lets individual tests vary the setup: expectedIdentity, migrationStatus, workSource, and mergePolicy.

Key tests

Test                                 What it exercises
TestRunOnceMergesAndDrains           Full happy path: frontier → dispatch → auto-merge → ledger → drain
TestRunAccountMismatchFailsClosed    Identity check fires before any state is touched
TestRunMergePendingWithoutAutoMerge  Without --auto-merge, leaves branch + worktree intact and posts a bd comment
TestRunRefusesUnvalidatedProject     Engine rejects a project that hasn't been validated
TestRunRefusesMarkdownWorkSource     Legacy markdown work-source is blocked until migrated

Anthro engine-import guardrail

internal/anthro/anthro_test.go contains TestEngineNeverImportsAnthro, which parses every .go file under internal/engine with the Go AST and fails the build if any of them imports internal/anthro.

The rule: the engine loop must never make per-token API calls implicitly. All Anthropic traffic is routed through the dispatch and review layers explicitly; the engine itself stays cost-neutral.
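A guardrail of this shape can be built on go/parser in ImportsOnly mode. The following is a sketch under assumptions rather than the contents of anthro_test.go: the helper names and the example.com/koryph module path are hypothetical, and the match is a simple suffix check on the import path.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"io/fs"
	"path/filepath"
	"strconv"
	"strings"
)

// importsForbidden reports whether the Go source imports a package whose
// path equals forbidden or ends in "/"+forbidden. If src is nil the file is
// read from disk. Only the import declarations are parsed.
func importsForbidden(filename string, src any, forbidden string) (bool, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return false, err
	}
	for _, imp := range f.Imports {
		path, err := strconv.Unquote(imp.Path.Value) // strip the quotes
		if err != nil {
			return false, err
		}
		if path == forbidden || strings.HasSuffix(path, "/"+forbidden) {
			return true, nil
		}
	}
	return false, nil
}

// offenders walks root and returns every .go file that imports forbidden.
func offenders(root, forbidden string) ([]string, error) {
	var bad []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || !strings.HasSuffix(p, ".go") {
			return nil
		}
		hit, err := importsForbidden(p, nil, forbidden)
		if err != nil {
			return err
		}
		if hit {
			bad = append(bad, p)
		}
		return nil
	})
	return bad, err
}

func main() {
	// Hypothetical module path, used only for illustration.
	src := "package engine\n\nimport \"example.com/koryph/internal/anthro\"\n"
	hit, err := importsForbidden("engine.go", src, "internal/anthro")
	fmt.Println(hit, err)
}
```

A test would call offenders on the engine directory and fail with t.Errorf for each returned path. Parsing imports only keeps the check fast, since function bodies are never read.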

Live API tests

internal/anthro/batch_live_test.go exercises the full BatchSubmit → BatchWait path against the real Anthropic Message Batches API. It is skipped by default; to run it, set KORYPH_BATCH_API_KEY explicitly (the ambient ANTHROPIC_API_KEY is intentionally refused):

KORYPH_BATCH_API_KEY=sk-ant-… \
  go test -v ./internal/anthro/ -run TestBatchLive -timeout 15m
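The skip-by-default behaviour can be sketched as a small gate that consults only the dedicated variable. liveKey is a hypothetical helper, not the real test's code; the point it illustrates is that an ambient ANTHROPIC_API_KEY never enables the live path.

```go
package main

import (
	"fmt"
	"os"
)

// liveKey returns the API key for the live test and whether the test should
// run. Only KORYPH_BATCH_API_KEY is consulted; ANTHROPIC_API_KEY is
// deliberately never read, so a developer's ambient key cannot incur cost.
func liveKey(getenv func(string) string) (string, bool) {
	key := getenv("KORYPH_BATCH_API_KEY")
	return key, key != ""
}

func main() {
	// Inside a test this branch would be:
	//   if !ok { t.Skip("KORYPH_BATCH_API_KEY not set; skipping live test") }
	if _, ok := liveKey(os.Getenv); !ok {
		fmt.Println("skip: KORYPH_BATCH_API_KEY not set")
		return
	}
	fmt.Println("run live test")
}
```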

Extending the suite

Add a new engine scenario: call newFixture with the appropriate fixOpts, then call engine.Run and assert on the returned Outcome, the ledger, and f.bdLog(t).
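Assertions on the bd log tend to reduce to "did the engine issue this subcommand for this task". The helper below is a sketch of such a check. It assumes the log holds one invocation per line with space-separated arguments, which the real bd.log may not match; sawCommand and the sample log are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// sawCommand reports whether any line of log begins with subcmd and carries
// id as one of its remaining arguments.
func sawCommand(log, subcmd, id string) bool {
	for _, line := range strings.Split(log, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 || fields[0] != subcmd {
			continue
		}
		for _, f := range fields[1:] {
			if f == id {
				return true
			}
		}
	}
	return false
}

func main() {
	// Hypothetical log contents, for illustration only.
	log := "update tb1\nclose tb1\n"
	fmt.Println(sawCommand(log, "close", "tb1"))   // true
	fmt.Println(sawCommand(log, "comment", "tb1")) // false
}
```

In an engine test the first argument would be the string returned by f.bdLog(t), with a t.Errorf on a false result.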

Test a new bd behaviour: extend fakeBDScript to handle the new subcommand (for example, by adding a case to the switch inside the script constant), and update the ready.json fixture as needed.

Add a unit test to a sub-package: write a standard _test.go file in the package directory. No external deps or network access are needed for any package except internal/anthro (live test only).