Background

Good tooling requires understanding the tools. Before applying AI agents to performance engineering tasks — profiling pipeline automation, cross-run analysis, report generation — it's worth building a clear mental model of how parallel agent workflows actually work.

This post documents a practice repository built for exactly that purpose.

The Repository

codeberg.org/srinathv/zed-parallel-agents

The repo demonstrates two distinct kinds of parallelism, applied to the same class of problem:

Approach 1 — Zed Threads (Interactive)

Zed's Agent Panel supports multiple simultaneous Claude sessions, each with its own model, context, and rule set. Three threads running concurrently — one Architect on Opus, one Developer on Sonnet, one Reviewer on Haiku — is functionally equivalent to three agents working in parallel, with the human acting as the orchestrator.

The repo includes .rules files and a Rules Library configuration that give each thread a distinct persona. The thread-rules directory documents what each agent is expected to produce and not produce — tight role boundaries are what make multi-agent systems predictable.

Approach 2 — Claude API Agents (Automated)

At the API level, an agent is a messages.create() call with a strong system prompt. Parallel agents are multiple async calls dispatched via asyncio.gather() using the AsyncAnthropic client.

The pipeline is a two-phase DAG:

Phase 1 (sequential):
  Architect → design specification

Phase 2 (parallel — asyncio.gather):
  Developer ──┐
              ├── both hit the API simultaneously
  Reviewer  ──┘

The Architect runs first because the Developer and Reviewer both depend on its output. Once that dependency is satisfied, the remaining agents run concurrently — total Phase 2 time is max(developer_time, reviewer_time), not their sum.

Project 3 — PyTorch ML Pipeline

A third project applies the same pattern to a machine learning pipeline: a Data Engineer agent and Model Architect agent run in parallel (their work is independent), then an ML Engineer agent synthesises both into a complete PyTorch training script. A hand-written baseline train.py is included for comparison.

Why This Matters

The same DAG structure — parallel where independent, sequential where dependent — applies directly to performance engineering workflows:

  • Simultaneously profiling multiple frameworks (independent) before a comparative analysis (dependent)
  • Running kernel-level and system-level profiling in parallel, then synthesising results
  • Generating reports for multiple benchmarks concurrently, then producing an executive summary

The practice repo makes the pattern concrete before those applications come up.

What's Next

The follow-up post will be a live walkthrough: running agents.py, examining what each agent produces, comparing the agent-generated PyTorch code against the hand-written baseline, and discussing where the pattern breaks down and how to fix it.

The code is all there now — the walkthrough comes after I've run it against a real workload.