Pipeline Runner

The runner package provides a composable pipeline orchestrator for AI agent harnesses. It coordinates the execute-test-review-refactor loop using role-based interfaces.

Roles

The runner is configured with role implementations. Developer and Reviewer are required; Planner and Tester are optional:

Required Roles

// Developer generates and refactors code
type Developer interface {
    Execute(ctx context.Context, plan string) error
    Refactor(ctx context.Context, issues []review.Issue, guidance string) error
}
 
// Reviewer evaluates code changes
type Reviewer interface {
    Review(ctx context.Context, diff string, context string) (*review.ReviewResult, error)
}

Optional Roles

// Planner creates an implementation strategy before execution
type Planner interface {
    Plan(ctx context.Context, task string) (string, error)
}
 
// Tester runs the project's test suite
type Tester interface {
    RunTests(ctx context.Context) (*testrunner.TestResult, error)
}
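
To make the role contracts concrete, here is a minimal sketch of stub implementations for the two required roles. The `Issue` and `ReviewResult` types are local stand-ins for `review.Issue` and `review.ReviewResult` (their real fields are not shown on this page, so `Passed`, `Issues`, and `Description` are assumptions), and `logDeveloper`, `todoReviewer`, and `reviewPasses` are hypothetical names used only for illustration.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Issue and ReviewResult are local stand-ins for review.Issue and
// review.ReviewResult. Their fields here are assumptions.
type Issue struct{ Description string }

type ReviewResult struct {
	Passed bool
	Issues []Issue
}

// Developer and Reviewer mirror the required role interfaces,
// with the stand-in types substituted.
type Developer interface {
	Execute(ctx context.Context, plan string) error
	Refactor(ctx context.Context, issues []Issue, guidance string) error
}

type Reviewer interface {
	Review(ctx context.Context, diff string, reviewCtx string) (*ReviewResult, error)
}

// logDeveloper is a hypothetical Developer that only records what it was asked to do.
type logDeveloper struct{ calls []string }

func (d *logDeveloper) Execute(ctx context.Context, plan string) error {
	d.calls = append(d.calls, "execute: "+plan)
	return nil
}

func (d *logDeveloper) Refactor(ctx context.Context, issues []Issue, guidance string) error {
	d.calls = append(d.calls, fmt.Sprintf("refactor: %d issue(s)", len(issues)))
	return nil
}

// todoReviewer is a hypothetical Reviewer that fails any diff containing "TODO".
type todoReviewer struct{}

func (todoReviewer) Review(ctx context.Context, diff string, reviewCtx string) (*ReviewResult, error) {
	if strings.Contains(diff, "TODO") {
		return &ReviewResult{Issues: []Issue{{Description: "unresolved TODO in diff"}}}, nil
	}
	return &ReviewResult{Passed: true}, nil
}

// Compile-time checks that the stubs satisfy the role interfaces.
var (
	_ Developer = (*logDeveloper)(nil)
	_ Reviewer  = todoReviewer{}
)

// reviewPasses runs the stub reviewer against a diff and reports the verdict.
func reviewPasses(diff string) bool {
	res, err := todoReviewer{}.Review(context.Background(), diff, "")
	return err == nil && res.Passed
}

func main() {
	fmt.Println(reviewPasses("+ x := 1"))       // true
	fmt.Println(reviewPasses("+ // TODO: fix")) // false
}
```

A real Developer would typically call a model or agent inside Execute and Refactor; stubs like these are mainly useful for exercising the pipeline in tests.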

Pipeline Flow

1. Plan (optional)     → Planner.Plan() produces strategy
2. Execute             → Developer.Execute() generates code
3. Review Loop (1..N):
   a. Test (optional)  → Tester.RunTests()
   b. Review           → Reviewer.Review()
   c. If passed: break
   d. Refactor         → Developer.Refactor()
4. Finalize            → Return Result with metrics

The loop runs up to MaxIterations times (default: 3). Each iteration saves a checkpoint, tracks cost, and deduplicates issues.
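
The flow above can be sketched as a small loop. This is an illustration under stated assumptions, not the runner's actual source: the types are local stand-ins, `runLoop` is a hypothetical function, the Tester step, diff collection, checkpoints, cost tracking, and issue deduplication are omitted, and the mapping of outcomes to the "success", "failed", and "error" statuses is a guess based on the Result type.

```go
package main

import (
	"context"
	"fmt"
)

// Local stand-ins for the review package types (fields are assumptions).
type Issue struct{ Description string }

type ReviewResult struct {
	Passed bool
	Issues []Issue
}

type Developer interface {
	Execute(ctx context.Context, plan string) error
	Refactor(ctx context.Context, issues []Issue, guidance string) error
}

type Reviewer interface {
	Review(ctx context.Context, diff, reviewCtx string) (*ReviewResult, error)
}

type Planner interface {
	Plan(ctx context.Context, task string) (string, error)
}

// runLoop sketches steps 1-4 and returns a status plus the number of review cycles.
func runLoop(ctx context.Context, dev Developer, rev Reviewer, planner Planner, task string, maxIter int) (string, int, error) {
	if maxIter <= 0 {
		maxIter = 3 // documented default
	}
	plan := task // assumption: without a Planner, the task itself is the plan
	if planner != nil { // 1. Plan (optional)
		p, err := planner.Plan(ctx, task)
		if err != nil {
			return "error", 0, err
		}
		plan = p
	}
	if err := dev.Execute(ctx, plan); err != nil { // 2. Execute
		return "error", 0, err
	}
	for i := 1; i <= maxIter; i++ { // 3. Review loop
		// 3a. A Tester, when configured, would run here.
		res, err := rev.Review(ctx, "", plan) // 3b. Review (diff collection omitted)
		if err != nil {
			return "error", i, err
		}
		if res.Passed { // 3c. Passed: stop early
			return "success", i, nil
		}
		if err := dev.Refactor(ctx, res.Issues, ""); err != nil { // 3d. Refactor
			return "error", i, err
		}
	}
	return "failed", maxIter, nil // 4. Iteration budget exhausted
}

// fakeDev counts refactors; fakeRev passes once enough refactors have happened.
type fakeDev struct{ refactors int }

func (d *fakeDev) Execute(context.Context, string) error { return nil }
func (d *fakeDev) Refactor(context.Context, []Issue, string) error {
	d.refactors++
	return nil
}

type fakeRev struct {
	dev       *fakeDev
	passAfter int
}

func (r fakeRev) Review(context.Context, string, string) (*ReviewResult, error) {
	if r.dev.refactors >= r.passAfter {
		return &ReviewResult{Passed: true}, nil
	}
	return &ReviewResult{Issues: []Issue{{Description: "needs work"}}}, nil
}

// demo runs the sketch with fakes that pass review after passAfter refactors.
func demo(passAfter, maxIter int) (string, int) {
	dev := &fakeDev{}
	status, n, _ := runLoop(context.Background(), dev, fakeRev{dev, passAfter}, nil, "add endpoint", maxIter)
	return status, n
}

func main() {
	fmt.Println(demo(1, 3)) // success 2
	fmt.Println(demo(5, 3)) // failed 3
}
```

In this sketch a review that passes on the second cycle reports two iterations, and a review that never passes stops after the iteration budget is spent.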


Configuration

type Config struct {
    // Required
    Developer Developer
    Reviewer  Reviewer
 
    // Optional
    Planner   Planner
    Tester    Tester
 
    // Pipeline settings
    MaxIterations int           // Max review-refactor cycles (default: 3)
    TaskID        string        // Identifier for checkpoints
    WorktreePath  string        // Git worktree path
 
    // Primitive integrations
    Checkpoint *checkpoint.Manager
    Cost       *cost.Tracker
    Issues     *issuetracker.IssueTracker
}
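
Because two roles are required and MaxIterations has a default, a caller may want a pre-flight check before handing a Config to the runner. The sketch below is one way to do that on the caller's side; it does not describe how the runner itself validates input. `Config` here is a trimmed local stand-in, `normalize` and `effectiveIterations` are hypothetical helpers, and treating a zero or negative MaxIterations as "use the default" is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Role stand-ins: the real interfaces are shown in the Roles section.
type Developer interface{}
type Reviewer interface{}

// Config is a trimmed local stand-in for runner.Config.
type Config struct {
	Developer     Developer
	Reviewer      Reviewer
	MaxIterations int
	TaskID        string
}

const defaultMaxIterations = 3 // documented default

// normalize is a hypothetical caller-side check: it rejects a Config that
// lacks a required role and fills in the default iteration count.
func normalize(cfg Config) (Config, error) {
	if cfg.Developer == nil {
		return cfg, errors.New("config: Developer is required")
	}
	if cfg.Reviewer == nil {
		return cfg, errors.New("config: Reviewer is required")
	}
	if cfg.MaxIterations <= 0 {
		cfg.MaxIterations = defaultMaxIterations
	}
	return cfg, nil
}

// effectiveIterations returns the iteration budget after normalization,
// or -1 when the Config is invalid.
func effectiveIterations(cfg Config) int {
	n, err := normalize(cfg)
	if err != nil {
		return -1
	}
	return n.MaxIterations
}

func main() {
	ok := Config{Developer: struct{}{}, Reviewer: struct{}{}, TaskID: "ENG-123"}
	fmt.Println(effectiveIterations(ok))       // 3
	fmt.Println(effectiveIterations(Config{})) // -1
}
```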

Usage

cfg := runner.Config{
    Developer:     myDeveloper,
    Reviewer:      myReviewer,
    Planner:       myPlanner,       // optional
    Tester:        myTester,        // optional
    MaxIterations: 3,
    TaskID:        "ENG-123",
    WorktreePath:  "/tmp/worktree",
    Checkpoint:    checkpointMgr,
    Cost:          costTracker,
    Issues:        issueTracker,
}
 
result, err := runner.Run(ctx, cfg)
if err != nil {
    log.Fatal(err)
}
 
fmt.Printf("Status: %s\n", result.Status)
fmt.Printf("Iterations: %d\n", result.Iterations)
fmt.Printf("Cost: $%.4f\n", result.Cost.Total().TotalCost)

Result

type Result struct {
    Status     string              // "success", "failed", "error"
    Iterations int                 // Number of review cycles completed
    Steps      []StepRecord        // History of each step
    Cost       *cost.Tracker       // Aggregated token usage
    Issues     *issuetracker.Stats // Final issue statistics
}
 
type StepRecord struct {
    Name     string
    Duration time.Duration
    Error    error
    Output   string
}
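
The Steps history lends itself to simple post-run reporting. As a hedged sketch, the helper below totals step durations and finds the first step that recorded an error. `StepRecord` is copied locally so the example is self-contained; `summarize` and `demoSummary` are hypothetical, and the step names ("execute", "test", "review") are illustrative guesses, since this page does not list the values the runner writes to Name.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// StepRecord is a local copy of the documented struct.
type StepRecord struct {
	Name     string
	Duration time.Duration
	Error    error
	Output   string
}

// summarize returns the total duration of all steps and the name of the
// first step that recorded an error ("" if none did).
func summarize(steps []StepRecord) (time.Duration, string) {
	var total time.Duration
	failed := ""
	for _, s := range steps {
		total += s.Duration
		if s.Error != nil && failed == "" {
			failed = s.Name
		}
	}
	return total, failed
}

// demoSummary builds a sample history and summarizes it.
func demoSummary() (int64, string) {
	steps := []StepRecord{
		{Name: "execute", Duration: 2 * time.Second},
		{Name: "test", Duration: 500 * time.Millisecond, Error: errors.New("2 tests failed")},
		{Name: "review", Duration: time.Second},
	}
	total, failed := summarize(steps)
	return total.Milliseconds(), failed
}

func main() {
	ms, failed := demoSummary()
	fmt.Printf("total=%dms first failure=%q\n", ms, failed) // total=3500ms first failure="test"
}
```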

Integration with Primitives

Each pipeline step maps to one or more harness primitives:

Step        Primitives Used
Plan        Memory (best prompts for task type)
Execute     ContextPin (lock related files), FileSummary (reduce token usage)
Test        TestRunner (detect framework, run tests)
Review      Review (canonical issues), Cost (track tokens)
Refactor    IssueTracker (deduplicate), Memory (common issues)
Verify      DiffVerify (check fixes), IssueTracker (track addressed)
Each step   Checkpoint (save progress), Cost (aggregate)

Step-level primitives (contextpin, memory, filesummary) are used inside your role implementations, not by the runner directly. The runner manages checkpoint, cost, and issue tracking automatically.