[REDACTED] Episode 4: We Stopped Using Claude Code Mid-Build. Here's What We Built Instead
How we're using the Anthropic SDK, Vercel AI Gateway, and Claude Design to automate CRM cleanup, lead gen, and landing page creation — with just two people running a $1M business.
We need your [Redacted] AI experiences for upcoming episodes! Who's the most underrated AI builder you know? Someone running real systems inside a real business? Send us a message at contact@tweenerfund.com because we want to get them on the show!
Episode 4
Redacted is the show that doesn't clean things up before hitting record. Episode 4 is a double build session: Taylor Cotner walks through the multi-agent HubSpot cleanup pipeline he's been iterating on for weeks, now running on the Anthropic SDK with Claude Code out of the loop, and David Shaner demos how he used Claude Design and Claude Code to rebuild Offline's partner landing page from scratch. Most of the episode is screen sharing, so pull it up on YouTube.
What We Cover
No Claude Code in the loop: Taylor removed Claude Code from his HubSpot pipeline as an agent orchestrator, the decision-maker in the middle of the workflow. It is still his coding tool (he’s building the app with Claude Code), but it no longer runs inside the pipeline. Removing it gave him full control over inputs and outputs at every step.
Custom eval system built from scratch: Taylor built an eval page that looks like an Excel grid, with models as columns and test cases as rows, to measure Haiku, Sonnet, GPT-5, and GPT-5 Mini against real, messy HubSpot data. Each cell shows pass, fail, and cost.
GPT-5 Mini at 10–20× less cost: For the lead qualifier agent, Sonnet evals cost $1.00 per run. GPT-5 Mini costs $0.05. “I can live with that. 10X less the cost.” For the core cleanup evals: $1.50 for Sonnet versus $0.14 on GPT-5 Mini.
$20 for 133 million tokens overnight: Using the Vercel AI Gateway, which lets you swap models without changing your code, Taylor ran 200 HubSpot restaurant cleanups in a single night for $20 total.
Self-grading pipeline: The pipeline grades its own output after every cleanup run. If a job comes back below an A, it automatically spawns a new run with Sonnet, no human catch required. A B grade on 101 Craft Kitchen auto-escalated and came back with an A.
Real mess-ups make the best evals: Almost every eval case came from a real HubSpot error. The system once tried to create a “Kim company” to link a group of unrelated restaurants, so Taylor added an eval to teach it that being linked by an owner contact is not the same as being linked by company structure.
The conveyor belt metaphor: David’s landing page pipeline starts with live sales transcripts from Steve (Offline’s seller) and is designed to end with a generated, voice-of-customer partner landing page. “In an ideal world, I’ve got a black box in the middle.”
Claude Design → Claude Code handoff: Claude Design’s share feature generates a markdown handoff document with a file map, token contract, and panel build notes. When Claude Code picks up the project, it reads this file first, bridging design intent to implementation.
One person, 7–8 hats replaced: David processed customer reviews, tightened company positioning, built wireframes, designed mobile experiences, wrote code, and is about to ship a pull request, all without a designer, copywriter, or front-end developer.
GPT-5 “overthinks”: Their working theory is that GPT-5 (not Mini) gets weird things wrong because it goes too abstract. The temperature/Myers-Briggs analogy, literal versus creative thinking, might explain why Mini outperforms the full model on structured cleanup tasks.
The iceberg: Once the cleanup and landing page are done, the plan is to surface above the water: automated emails, Instagram DMs, and a fully AI-run lead generation function operating on top of the clean data.
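For anyone who wants to picture the self-grading step from the list above, here is a minimal sketch of the "below an A, rerun with Sonnet" rule. It is not Taylor's actual code. The grade scale, the function names, the model labels, and the idea that the first pass runs on GPT-5 Mini are all assumptions made for illustration, and the cleanup function is a stub that returns made-up grades instead of calling any model.

```python
from dataclasses import dataclass
from typing import Callable, List

# Letter grades from worst to best. The exact scale is an assumption.
GRADE_ORDER = ["F", "D", "C", "B", "A"]


@dataclass
class RunResult:
    model: str  # hypothetical label, not a real API model ID
    grade: str  # letter grade the pipeline gave its own output


def needs_escalation(grade: str, passing: str = "A") -> bool:
    """Return True when the grade is below the passing grade."""
    return GRADE_ORDER.index(grade) < GRADE_ORDER.index(passing)


def run_with_escalation(
    job: str,
    run_cleanup: Callable[[str, str], RunResult],
    first_model: str = "gpt-5-mini",   # assumed first-pass model
    escalation_model: str = "sonnet",  # rerun model named in the episode
) -> List[RunResult]:
    """Run a cleanup once, then rerun on the stronger model if it graded below an A."""
    attempts = [run_cleanup(job, first_model)]
    if needs_escalation(attempts[0].grade):
        attempts.append(run_cleanup(job, escalation_model))
    return attempts


def fake_cleanup(job: str, model: str) -> RunResult:
    """Stub standing in for a real cleanup plus grading call (illustrative grades only)."""
    return RunResult(model=model, grade="A" if model == "sonnet" else "B")


if __name__ == "__main__":
    for attempt in run_with_escalation("101 Craft Kitchen", fake_cleanup):
        print(attempt.model, attempt.grade)
```

The sketch escalates once and then stops, so a job that still grades below an A after the rerun would need a human look. Whether the real pipeline retries further is not something the episode notes say.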
How to watch:
It’s best viewed on YouTube to fully see the examples (make sure to subscribe!)
But also available on all audio podcast players through Tweener Talks!
PLUS we have a new spot for show notes and files discussed in the episode. Check it out: https://github.com/instanttaylor/redacted-podcast
What’s Next?
New episodes drop twice a month, every other Wednesday. If you want to be on the show as a guest and show your [REDACTED] builds, email us here.





