CampaignForge AI - The Journey
Chapter 9: Three Flows, Not One
Date: 2026-05-09 | Vertical: B2B SaaS | Budget: $500/month
Where We Left Off
Chapter 8 gave the Performance Analyst a diagnostic brain. It can now tell you why a campaign is underperforming, not just report the numbers. The GATE-7 rework loop means poor performance triggers a structured decision rather than just continuing to spend.
At the end of Chapter 8, the pipeline had this shape: when performance data was strong enough, Agent 06 returned TRIGGER_CONTENT. That recommendation flowed into the content publishing path — content_draft → GATE-6 → content_publish — as an automatic next step in the campaign run.
That design made sense when I built it. By the time I ran the full system end-to-end in Chapter 8, it had stopped making sense.
The Problem With One Graph
The coupling worked like this: a campaign run on Meta Ads could end up blocked on a LinkedIn API call.
Not because the LinkedIn post was required for the campaign to succeed. Not because the performance data depended on it. But because they lived in the same state machine. If content_publish failed — bad credentials, rate limit, platform outage — the campaign pipeline was in a FAILED state. The audit log recorded a pipeline failure. The operator had to diagnose a social media API problem to understand what had happened to their ad campaign.
Those are not the same problem. They should not produce the same error.
The deeper issue: TRIGGER_CONTENT was automatic. The performance analyst decided the campaign had performed well enough, and the next thing that happened — after GATE-6 — was a LinkedIn post. The operator approved the gate, but the trigger itself was algorithmic. I had said from the start that I always want to manually decide when a milestone is worth a chapter. The pipeline I had built did not respect that.
The Question
I asked Claude to analyze whether the system should split into two or three separate flows. I had an intuition about what the answer would be, but I wanted it reasoned out, not assumed.
The framing I gave: there seem to be three distinct things happening in this codebase.
- 1. Building CampaignForge itself — the research, architecture decisions, implementation work. That is what Tim and Claude are doing together. Not automated.
- 2. Publishing the journey — documenting what was built, turning it into LinkedIn posts, X threads, website chapters. That is what `--publish-chapter` does. Operator-triggered.
- 3. Running a campaign for a client — intake brief through monitoring pause. That is the LangGraph state machine.
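The distinction can be made concrete as data. A hypothetical sketch — the flow names and triggers paraphrase the list above; the structure itself is purely illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Flow:
    name: str
    trigger: str   # who or what starts it
    automated: bool

# Three flows, three different triggers, three different owners.
FLOWS = [
    Flow("build",           trigger="Tim and Claude working sessions", automated=False),
    Flow("journey_content", trigger="operator runs --publish-chapter", automated=False),
    Flow("campaign",        trigger="client brief intake",             automated=True),
]
```

Only the campaign flow is a state machine; the other two start when a human decides they should.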
The question was whether 1 and 2 belonged in the same graph, and whether 3 was correctly isolated.
Claude's analysis came back matching my intuition: three separate flows, with failure isolation as the key reason. A LinkedIn API failure should not affect a running campaign. A paused campaign should not block a build chapter from being published. The operator should manually decide when a milestone is worth documenting.
The third insight was the one that settled it: the Journey Content flow is not triggered by campaign performance. It is triggered by the operator deciding that something worth documenting has happened. Those are different triggers with different authorship. The algorithm should not be making that call.
What Changed in the Campaign Graph
GATE-6 and the content publishing path came out of the campaign state machine entirely.
content_draft_node, gate_6, and content_publish_node were removed from the graph. The GATE-6 entry was removed from every gate table, every routing function, every UI card. regenerate_content_draft_for_platforms() — the function that let operators swap LinkedIn for X from the gate card — was removed from ui_support.py and the Streamlit UI.
route_performance() in gates.py was updated: TRIGGER_CONTENT now maps to "continue", which routes to monitoring_pause. When Agent 06 returns TRIGGER_CONTENT, the campaign finishes its run and pauses at monitoring. The recommendation is preserved in the agent output. The operator can read it, decide whether the result warrants a chapter, and invoke --publish-chapter separately if so.
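A sketch of what the updated routing could look like. `route_performance`, `gates.py`, the recommendation names, and the node names are from the chapter; the state shape and the dict-based dispatch are assumptions:

```python
# Sketch of route_performance in gates.py after the change (state shape assumed).
# TRIGGER_CONTENT no longer has its own edge: it maps to "continue", which the
# graph wires to monitoring_pause. The recommendation itself stays in state
# for the operator to read later.

ROUTE_MAP = {
    "CONTINUE": "continue",         # healthy campaign -> monitoring_pause
    "TRIGGER_CONTENT": "continue",  # same destination; signal preserved in state
    "REWORK": "rework",             # -> gate_7
    "CRITICAL": "critical",         # -> error_halt
}

def route_performance(state: dict) -> str:
    """Map Agent 06's recommendation to a graph edge label."""
    recommendation = state["performance_analysis"]["recommendation"]
    return ROUTE_MAP.get(recommendation, "critical")  # fail closed on unknown values

# In graph.py the wiring would then look roughly like:
# graph.add_conditional_edges("performance_analyst", route_performance,
#     {"continue": "monitoring_pause", "rework": "gate_7", "critical": "error_halt"})
```

With this shape, removing the content path meant deleting one entry from `ROUTE_MAP` rather than rewiring the router.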
The campaign graph now has one terminal state: monitoring_pause. The path to it is always through the performance analyst. The content layer is not involved.
performance_analyst
│
├── CONTINUE / TRIGGER_CONTENT → monitoring_pause → END
├── REWORK → gate_7 → creative or strategist
└── CRITICAL → error_halt → END
TRIGGER_CONTENT does not get a special route. It lands in the same place as a well-performing campaign that is simply continuing to run. The operator decides what happens next.
The Dead Code in the Attic
While making these changes, I went looking for everything that referenced GATE-6, content_draft, and content_publish across the codebase. I found what I expected.
I also found something I had been walking past for eight chapters.
Sitting in the repository were the full AWS Step Functions artifacts from Chapter 1: Lambda handlers for every agent, an API Gateway handler for brief intake, an approval handler, an escalation checker, a notifier, a layers directory with a custom LLM client, the full ASL state machine definition. Seventeen files. The complete Chapter 1 implementation.
None of it had been touched since the pivot to LangGraph in Chapter 2. None of it was imported by anything. The test files that covered it were failing with ModuleNotFoundError: No module named 'boto3' because the dependency had been removed long ago. The code had been running next to the LangGraph implementation for six chapters, doing nothing, adding confusion.
The original reason it was kept: I said ADR-001 is preserved as Chapter 1 in git history. That is true — it is tagged at v0.1-aws-stepfunctions-architecture. The document is worth keeping. The code was not providing value. It was actively making the codebase harder to navigate for both me and Claude.
So it came out.
What Got Removed
step_functions/campaign_pipeline.asl.json # 269 lines — the ASL state machine
src/agents/01_product_person/handler.py # Lambda handler
src/agents/06_performance_analyst/handler.py # Lambda handler
src/agents/06_performance_analyst/activate_monitoring.py
src/agents/11_content_publisher/handler.py
src/api/brief_intake/handler.py # API Gateway Lambda
src/api/pipeline_status/handler.py
src/api/audit_reader/handler.py
src/approval/handler/handler.py # Approval gate Lambda
src/approval/escalation_checker/handler.py
src/approval/notifier/handler.py
src/layers/llm_client/python/... # Lambda LLM layer
src/common/pipeline_state.py # DynamoDB state wrapper
tests/test_agent_01.py # Failing boto3 tests
tests/test_approval_handler.py
tests/test_brief_intake.py
Net: 2,857 lines deleted. 256 tests passing after the removal, same as before.
The git history keeps the full Chapter 1 implementation. The working directory keeps only what runs.
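The tag named above makes that retrieval concrete. A self-contained sketch of the pattern — the throwaway repo, file name, and demo tag are invented; substitute the real tag and paths:

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
echo '{"StartAt": "intake"}' > pipeline.asl.json
git add pipeline.asl.json
git -c user.email=dev@local -c user.name=dev commit -qm "chapter 1 implementation"
git tag v0.1-demo                      # stands in for v0.1-aws-stepfunctions-architecture
git rm -q pipeline.asl.json
git -c user.email=dev@local -c user.name=dev commit -qm "remove dead code"
git show v0.1-demo:pipeline.asl.json   # gone from the worktree, intact in history
```

`git show <tag>:<path>` prints any file as it existed at the tag, with nothing checked out into the working directory.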
Updating the Documents
ADR-001 was never marked superseded. It still said "PENDING APPROVAL — Gate 2" at the top. That is the wrong status for an architecture that was designed, implemented, and then explicitly replaced. A SUPERSEDED banner now sits at the top: the AWS code was designed but never provisioned, no credentials were entered, and the decision to pivot is documented in ADR-002.
ADR-002 got a significant update. The status changed from PENDING APPROVAL to APPROVED. The Mermaid diagram now reflects the actual graph: seven gates, monitoring_pause as the terminal node, no content_draft or GATE-6, the GATE-7 rework loop wired in. The gate summary table was expanded from five gates to seven. The project layout was updated to match the actual file tree.
The PRD had a non-negotiable that read: "AWS only. No GCP, no Azure." That constraint was written on May 4th, before Chapter 2 existed. CampaignForge has been running local-first for seven chapters. The non-negotiable was updated:
> Local-first; cloud path must be clear. Phase 1 runs on a single machine: LangGraph StateGraph, SQLite checkpointing, no cloud accounts required. Phase 2 migrates the runtime substrate to AWS without changing agent contracts or gate logic.
The technical path to Phase 2 has not changed: SqliteSaver → AsyncPostgresSaver is one import change in graph.py. The non-negotiable now describes the actual constraint — the migration path must be clear — rather than a cloud preference that was bypassed in Chapter 2 for good reasons.
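Under that constraint, the Phase 2 swap in graph.py would look roughly like this — the module paths come from langgraph's optional checkpoint packages and should be verified against the installed version:

```diff
-from langgraph.checkpoint.sqlite import SqliteSaver
+from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver
```

The checkpointer's constructor arguments change alongside the import, but agent contracts and gate logic stay untouched.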
The agents/04-aws-deployer.md file was renamed to agents/04-deployer.md and rewritten. The old version told the Deployer to execute Terraform, deploy Lambda functions, return AWS resource ARNs, and stay within the $76/month MVP budget cap. None of that is relevant to a local deployment. The new version describes what the Deployer actually does: produce a local deployment artifact, run health checks against the LangGraph process, verify that the audit log is writing, confirm that spend_guard.py is blocking over-cap calls, and return inspectable local paths the Cost Analyst can trust.
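A hypothetical sketch of what those checks might return — the function name and report shape are invented for illustration; the audit log and `spend_guard.py` are from the chapter:

```python
from pathlib import Path

def local_health_report(audit_log: Path, checkpoint_db: Path) -> dict:
    """Return inspectable facts about a local deployment, instead of AWS ARNs.

    Every value is a plain bool or a path string the Cost Analyst can
    verify independently; nothing here requires cloud credentials.
    """
    return {
        "audit_log_path": str(audit_log),
        "audit_log_writing": audit_log.exists() and audit_log.stat().st_size > 0,
        "checkpoint_db_path": str(checkpoint_db),
        "checkpoint_db_present": checkpoint_db.exists(),
    }
```

The point of the shape: where the old Deployer returned ARNs only AWS could resolve, this returns paths anyone on the machine can open.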
The Honest State After Chapter 9
What changed:
- GATE-6 and the content publishing path removed from the Campaign Graph
- `TRIGGER_CONTENT` recommendation routes to `monitoring_pause` (same as `CONTINUE`)
- `regenerate_content_draft_for_platforms()` removed from `ui_support.py` and the UI
- GATE-6 card removed from the Streamlit UI gate sequence
- All AWS Lambda/Step Functions/DynamoDB dead code deleted (17 files, 2,857 lines)
- ADR-001 marked SUPERSEDED
- ADR-002 updated: status APPROVED, Mermaid diagram current, gate table expanded to 7 gates
- PRD non-negotiable #6 updated from "AWS only" to "local-first, cloud path clear"
- `agents/04-aws-deployer.md` → `agents/04-deployer.md` with local-first scope
- 256 tests passing
What did not change:
- Agent 11 is fully operational — it runs as the Journey Content flow via `--publish-chapter`
- `TRIGGER_CONTENT` is still returned by Agent 06 when ROAS qualifies — the signal is preserved in state; the operator reads it and decides
- The three existing chapter publishing paths (LinkedIn, X thread, website HTML) are unaffected
- No live Meta Ads calls
- No live social publishing from within the campaign graph
What Comes Next
The three-flow architecture is the right shape for this system. The campaign graph runs campaigns. The journey content flow publishes chapters. They are independent. They can fail independently. The operator controls what happens at the boundary.
What the campaign graph is still missing is real performance data. The feedback loop exists. The diagnostic brain exists. The rework loop exists. Every piece of the system that processes real Meta Ads results is ready. The missing piece is the API connection that feeds real impression, click, and conversion data into Agent 06 instead of manually loaded fixtures.
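A sketch of what that connection could look like. The Graph API insights endpoint and its string-typed numbers are real Meta behavior, but the API version, the field choice, and the normalized shape Agent 06 consumes are assumptions:

```python
import json
import urllib.parse
import urllib.request

GRAPH = "https://graph.facebook.com/v19.0"  # version pinned arbitrarily for the sketch

def fetch_insights(ad_account_id: str, token: str) -> list[dict]:
    """Pull last-7-day insights for an ad account (live network call)."""
    params = urllib.parse.urlencode({
        "fields": "impressions,clicks,spend,actions",
        "date_preset": "last_7d",
        "access_token": token,
    })
    url = f"{GRAPH}/act_{ad_account_id}/insights?{params}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)["data"]

def normalize_row(row: dict) -> dict:
    """Meta returns numbers as strings; coerce to what the fixtures use."""
    conversions = sum(
        int(action["value"])
        for action in row.get("actions", [])
        if action.get("action_type") == "purchase"
    )
    return {
        "impressions": int(row.get("impressions", 0)),
        "clicks": int(row.get("clicks", 0)),
        "spend": float(row.get("spend", 0.0)),
        "conversions": conversions,
    }
```

`normalize_row` is the piece that matters: if it emits the same shape the manually loaded fixtures use today, Agent 06 never notices the swap.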
That is the next unlock: a campaign run against real Meta Ads data that produces real performance numbers, triggers a real diagnosis, and either routes to monitoring pause or routes to GATE-7 with a recommendation the diagnosis actually earned.
The pipeline is ready for it.