For three years, the tech industry was obsessed with prompt engineering.
We acted as if typing the exact right sequence of words into ChatGPT was a dark art. "Act as an expert marketer. Take a deep breath. Think step by step." People wrote entire courses about this. LinkedIn was full of "Top 10 Prompts That Will Change Your Business" posts.
Prompt engineering is already obsolete. Not dying — dead. The new frontier is agentic workflows, and if you're still optimizing your prompts, you're tuning a horse's saddle while your competitors buy motorcycles.
#What is an Agentic Workflow?
In a standard LLM interaction, the human is the orchestrator. You prompt the AI, it generates a draft, you read it, you tell it what to fix, it fixes it. You're the manager; the AI is the intern.
In an agentic workflow, the AI manages itself.
You give the system a high-level goal: "Research the top 10 competitors in the CRM space, write a comparative analysis report, and format it as a markdown file."
The agent then:
- Plans — It writes its own step-by-step task list.
- Executes — It uses a web search tool to find competitors. It reads their websites.
- Drafts — It writes the comparative report.
- Reflects — This is the part that matters. It reads its own output, notices it forgot pricing data for two companies, and goes back to search specifically for that.
- Finalizes — It delivers the completed markdown file.
That fourth step, reflection, is what separates an agent from an autocomplete tool. The ability to evaluate its own work, identify gaps, and self-correct without being told to is what makes agentic systems genuinely useful for business tasks that used to require a human.
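To make the loop concrete, here's a minimal sketch of the five steps in plain Python. Everything in it is an assumption for illustration: `fake_search` stands in for a real web search tool (and deliberately "forgets" pricing for one company on the first pass), and the gap check is a simple field lookup rather than an LLM call.

```python
from dataclasses import dataclass, field
from typing import Optional

REQUIRED_FIELDS = ("name", "pricing")  # what a complete competitor entry needs


@dataclass
class AgentState:
    goal: str
    plan: list = field(default_factory=list)
    findings: dict = field(default_factory=dict)
    report: str = ""
    reflection_rounds: int = 0


def fake_search(company: str, focus: Optional[str] = None) -> dict:
    """Hypothetical search tool. The unfocused pass misses pricing for 'Beta'."""
    data = {"name": company}
    if company != "Beta" or focus == "pricing":
        data["pricing"] = f"${10 * len(company)}/mo"  # made-up numbers
    return data


def find_gaps(findings: dict) -> list:
    """Reflect: list every (company, field) pair that is still missing."""
    return [(c, f) for c, entry in findings.items()
            for f in REQUIRED_FIELDS if f not in entry]


def draft(findings: dict) -> str:
    """Draft: render the findings as a markdown table."""
    lines = ["| Company | Pricing |", "| --- | --- |"]
    for entry in findings.values():
        lines.append(f"| {entry['name']} | {entry.get('pricing', 'MISSING')} |")
    return "\n".join(lines)


def run_agent(goal: str, companies: list, max_rounds: int = 3) -> AgentState:
    state = AgentState(goal=goal)
    state.plan = ["search", "draft", "reflect", "finalize"]   # Plan
    for company in companies:                                  # Execute
        state.findings[company] = fake_search(company)
    state.report = draft(state.findings)                       # Draft
    for _ in range(max_rounds):                                # Reflect
        gaps = find_gaps(state.findings)
        if not gaps:
            break
        state.reflection_rounds += 1
        for company, missing in gaps:
            # Go back and search specifically for the missing field.
            state.findings[company].update(fake_search(company, focus=missing))
        state.report = draft(state.findings)
    return state                                               # Finalize
```

Calling `run_agent("Compare CRM competitors", ["Acme", "Beta"])` takes one reflection round to fill in the missing pricing. The `max_rounds` cap is a design choice worth keeping in any real version: a reflection loop with no budget can spin forever.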
#The 4 Capabilities That Make a Workflow Agentic
Not every AI integration qualifies. To be genuinely agentic, a system needs four things working together.
#Reflection
The ability for an LLM to critique its own output and decide whether it's good enough. This sounds simple. It isn't. You have to prompt for it explicitly — most LLMs will happily return mediocre output if you don't ask them to evaluate it first. The practical implementation: after each major step, the agent runs a self-check prompt ("Does this output accomplish the goal? What's missing?") before proceeding.
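A self-check step might look something like the sketch below. The `call_model` argument is a placeholder for whatever function sends a prompt to your LLM and returns its text; the PASS/FAIL reply convention is my own assumption, not something any provider enforces.

```python
from typing import Callable, Tuple

SELF_CHECK = (
    "Goal: {goal}\n\nOutput:\n{output}\n\n"
    "Does this output accomplish the goal? What's missing?\n"
    "Reply with PASS, or FAIL followed by the gaps."
)


def self_check(goal: str, output: str,
               call_model: Callable[[str], str]) -> Tuple[bool, str]:
    """Ask the model to grade its own output. Returns (passed, raw reply)."""
    reply = call_model(SELF_CHECK.format(goal=goal, output=output)).strip()
    return reply.upper().startswith("PASS"), reply


def generate_with_reflection(goal: str, call_model: Callable[[str], str],
                             max_attempts: int = 3) -> str:
    """Generate, self-check, and retry with the critique until it passes."""
    feedback = ""
    output = ""
    for _ in range(max_attempts):
        if feedback:
            prompt = f"{goal}\n\nFix these problems from the last attempt:\n{feedback}"
        else:
            prompt = goal
        output = call_model(prompt)
        passed, feedback = self_check(goal, output, call_model)
        if passed:
            break
    # Best effort: after max_attempts the last output is returned even if it failed.
    return output
```

In practice you'd want the failure case to be visible to the caller rather than silently returning the last attempt, but the shape is the same: generate, critique, feed the critique back in.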
#Tool Use
Reasoning alone isn't enough. Agents need access to the world: APIs, databases, web search, code execution, file systems. A reasoning engine without tools is a very smart thing that can't do anything. Claude and GPT-4o both support function calling (Anthropic's docs call it tool use), which lets you define exactly which tools an agent can call and what arguments each one accepts.
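Here's a rough, provider-agnostic sketch of the idea. It is not the Anthropic or OpenAI request format; the `tool` decorator, the `TOOLS` registry, the `get_price` tool, and the `{"name": ..., "arguments": ...}` call shape are all hypothetical. What it shows is the general pattern: you describe each tool with a JSON-Schema-style parameter spec, the model emits a structured call, and your code validates and runs it.

```python
import json
from typing import Any, Callable, Dict, List

TOOLS: Dict[str, Dict[str, Any]] = {}


def tool(name: str, description: str, parameters: dict) -> Callable:
    """Register a function as a callable tool along with its schema."""
    def register(fn: Callable) -> Callable:
        TOOLS[name] = {"description": description,
                       "parameters": parameters, "fn": fn}
        return fn
    return register


@tool(
    "get_price",
    "Look up a plan's monthly price in USD.",
    {"type": "object",
     "properties": {"plan": {"type": "string"}},
     "required": ["plan"]},
)
def get_price(plan: str) -> float:
    prices = {"starter": 19.0, "pro": 49.0}  # made-up data
    return prices[plan.lower()]


def tool_specs() -> List[dict]:
    """The descriptions you would hand to the model (without the Python function)."""
    return [{"name": n, "description": t["description"],
             "parameters": t["parameters"]} for n, t in TOOLS.items()]


def dispatch(call_json: str) -> str:
    """Validate and execute one tool call emitted by the model."""
    call = json.loads(call_json)
    name = call.get("name")
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    spec = TOOLS[name]
    args = call.get("arguments", {})
    missing = [p for p in spec["parameters"].get("required", []) if p not in args]
    if missing:
        return json.dumps({"error": f"missing arguments: {missing}"})
    try:
        return json.dumps({"result": spec["fn"](**args)})
    except Exception as exc:  # report tool failures back to the model as data
        return json.dumps({"error": str(exc)})
```

The important design choice is that the agent can only call what's in the registry. An unknown tool name comes back as an error the model can read, not as an exception that kills the run.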
#Planning
The ability to receive a high-level goal and decompose it into a concrete task sequence. ReAct (Reason + Act) is a widely used pattern here: the model explicitly reasons about what to do, takes the action, observes what happened, then reasons about the next step. It's slower than raw generation. It's also dramatically more reliable for complex tasks.
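A stripped-down version of a ReAct-style loop can be sketched like this. The `Thought:` / `Action: tool[input]` text format and the `finish` action are assumptions I'm using for illustration; `call_model` is a placeholder for your LLM call, and `tools` is a plain dict of Python functions.

```python
import re
from typing import Callable, Dict

# Matches lines like: Action: lookup[Acme]
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")


def react(question: str, call_model: Callable[[str], str],
          tools: Dict[str, Callable[[str], str]], max_steps: int = 5) -> str:
    """Alternate model reasoning with tool calls until the model finishes."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = call_model(transcript)           # Reason: the model writes Thought + Action
        transcript += step + "\n"
        match = ACTION_RE.search(step)
        if match is None:
            transcript += "Observation: could not parse an action\n"
            continue
        name, arg = match.groups()
        if name == "finish":
            return arg                          # the model declares its final answer
        if name in tools:
            observation = tools[name](arg)      # Act: run the chosen tool
        else:
            observation = f"unknown tool {name}"
        transcript += f"Observation: {observation}\n"  # Observe: feed the result back
    raise RuntimeError("step budget exhausted without a final answer")
```

Each pass through the loop costs a model call, which is where the slowness comes from. The payoff is that every action is preceded by written reasoning and followed by an observation the model has to account for.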
#Multi-Agent Collaboration
This is where it gets interesting. A single agent working alone hits cognitive limits. Multi-agent systems assign different roles to different model instances: a Researcher agent handles data gathering, a Writer agent handles synthesis, a QA agent tries to break the output and flag problems. They pass work between each other rather than piling everything into one context window.
When I built the first version of an automated competitive intelligence system using this architecture, the output quality jumped significantly compared to a single-agent approach — not because any individual model improved, but because the error-catching layer caught mistakes before they propagated downstream.
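As a toy illustration of the hand-off pattern (not the system described above), here's a sketch with three role functions passing a shared work item. The roles are plain Python stand-ins for separate model instances, and the "source" data is invented. The part to notice is the QA step sending specific gaps back to the Researcher instead of letting an incomplete draft ship.

```python
from dataclasses import dataclass, field
from typing import Tuple

REQUIRED = ("founded", "pricing")  # what QA insists the draft covers


@dataclass
class WorkItem:
    topic: str
    facts: dict = field(default_factory=dict)
    draft: str = ""
    issues: list = field(default_factory=list)


def researcher(item: WorkItem, focus: Tuple[str, ...]) -> WorkItem:
    """Gather only the facts it was asked to focus on."""
    source = {"founded": "2015", "pricing": "$49/mo"}  # made-up data
    for key in focus:
        if key in source:
            item.facts[key] = source[key]
    return item


def writer(item: WorkItem) -> WorkItem:
    """Turn the collected facts into a draft."""
    item.draft = "\n".join(f"{k}: {v}" for k, v in sorted(item.facts.items()))
    return item


def qa(item: WorkItem) -> WorkItem:
    """Try to break the draft: flag every required section that is missing."""
    item.issues = [k for k in REQUIRED if f"{k}:" not in item.draft]
    return item


def run_pipeline(topic: str, max_rounds: int = 3) -> WorkItem:
    item = WorkItem(topic)
    focus: Tuple[str, ...] = ("founded",)  # the first brief is deliberately too narrow
    for _ in range(max_rounds):
        item = qa(writer(researcher(item, focus)))
        if not item.issues:
            break
        focus = tuple(item.issues)  # QA's findings become the next research brief
    return item
```

Each role sees only the work item, not the other roles' full context, which is the point of splitting them up.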
#What This Means for Founders Building Products
If you're building a SaaS product with a "chat interface" where users have to constantly prompt your tool to extract value, you're building for the past.
The next generation of B2B software won't be software you use. It will be software you hire. It runs asynchronously, executes multi-step workflows, and pings you only when it needs a strategic decision.
Practically, this means a few things:
The value shifts from interface to outcome. Users increasingly don't care what the UI looks like if the outcome is automated. A clunky agent that books 20 qualified sales calls per week beats a beautiful chat interface that still requires manual follow-up every time.
"Configured once, runs forever" as a product category. The best AI products in 2026 are ones you set up in an afternoon and then don't touch for weeks. That's a fundamentally different product design challenge than building something users interact with daily.
Reliability matters more than capability. An agent that completes 95% of tasks correctly and hands the rest to a human is far more valuable than one that completes 99% of tasks brilliantly but fails catastrophically, and silently, on the other 1%. Founders building agentic products need to spend as much time on error handling and escalation design as on the core functionality.
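One way to think about escalation design is as a wrapper around every agent task: retry a bounded number of times, validate the result, and hand off to a human with a reason when it still isn't right. The sketch below assumes hypothetical `task` and `validate` callables and a made-up `Outcome` type; it isn't tied to any particular framework.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Outcome:
    status: str                    # "done" or "escalated"
    result: Optional[str] = None
    reason: Optional[str] = None


def run_with_escalation(task: Callable[[], str],
                        validate: Callable[[str], bool],
                        max_retries: int = 2) -> Outcome:
    """Run a task, retrying on errors or bad output, then escalate instead of failing silently."""
    last_error = "no attempts made"
    for attempt in range(max_retries + 1):
        try:
            result = task()
        except Exception as exc:
            last_error = f"{type(exc).__name__}: {exc}"
            continue
        if validate(result):
            return Outcome("done", result=result)
        last_error = f"validation failed on attempt {attempt + 1}"
    # Out of retries: surface the problem to a human along with the last failure.
    return Outcome("escalated", reason=last_error)
```

The caller always gets an `Outcome`, never an unhandled exception, so the "ping a human" path is a normal branch in the workflow rather than an afterthought.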
#The Current Practical Stack
For founders who want to build agentic workflows today — not in theory, but actually running in production — here's what's working:
Orchestration: Make.com for complex business logic with visual debugging. n8n if you want self-hosted and don't mind more setup. LangGraph if you're technical and need fine-grained control over agent state.
Reasoning: Claude 3.5 Sonnet for tasks requiring long-context comprehension and nuanced instruction-following. GPT-4o for structured output and strict JSON formatting. Don't use the same model for everything — match model to task.
Web access: Firecrawl for clean markdown output from web pages, ready for LLM consumption. Exa for semantic search over recent web content.
Memory: Supabase pgvector for persistent memory across runs. Without persistent memory, every agent run starts from zero — fine for one-shot tasks, broken for anything that needs to learn from previous interactions.
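To show what "persistent memory" means mechanically, here's a deliberately tiny stand-in. It is not the Supabase or pgvector API: it uses SQLite from the Python standard library for persistence and a bag-of-words cosine similarity in place of real embeddings. The `MemoryStore` class and its methods are hypothetical names.

```python
import math
import re
import sqlite3
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class MemoryStore:
    def __init__(self, path: str = ":memory:"):
        # Pass a file path instead of ":memory:" to keep memories across runs.
        self._db = sqlite3.connect(path)
        self._db.execute("CREATE TABLE IF NOT EXISTS memories (text TEXT NOT NULL)")

    def remember(self, text: str) -> None:
        with self._db:  # commits the insert
            self._db.execute("INSERT INTO memories (text) VALUES (?)", (text,))

    def recall(self, query: str, k: int = 1) -> List[str]:
        """Return the k stored memories most similar to the query."""
        q = embed(query)
        rows = [r[0] for r in self._db.execute("SELECT text FROM memories")]
        rows.sort(key=lambda t: cosine(q, embed(t)), reverse=True)
        return rows[:k]
```

At the start of each run, the agent would call something like `recall` with the current goal and prepend the results to its context. With a vector database the similarity search happens inside the database instead of in Python, but the flow is the same.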
#The Mistake Most Founders Make
They build agents that are too ambitious.
A single agent trying to own an entire complex workflow end-to-end — research, write, review, format, send — will fail in unpredictable ways. The right approach is to start with one step of an existing human workflow. Automate just the research phase. Then the drafting. Then add the quality check. Build in stages, with a human reviewing outputs at each stage until you trust the reliability.
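The "human reviews until you trust it" idea can be made explicit in code. In this sketch, each stage has to earn a streak of consecutive human approvals before it's allowed to run unreviewed. The `Stage` class, the `review` callback, and the threshold of three are all assumptions picked for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    name: str
    run: Callable[[str], str]
    streak: int = 0        # consecutive human approvals so far
    trust_after: int = 3   # hypothetical threshold before skipping review


def run_stage(stage: Stage, payload: str,
              review: Callable[[str, str], bool]) -> str:
    """Run one stage; route its output through a human until it has earned trust."""
    output = stage.run(payload)
    if stage.streak >= stage.trust_after:
        return output                 # trusted: no review needed
    if review(stage.name, output):
        stage.streak += 1
        return output
    stage.streak = 0                  # one rejection resets the trust streak
    raise ValueError(f"{stage.name} output rejected by reviewer")
```

You'd start with a single `Stage` for research, watch the rejections, and only add a drafting stage once the first one is running unreviewed.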
The teams seeing real ROI from agentic AI in 2026 aren't the ones who built the most sophisticated systems. They're the ones who identified the most painful, repetitive, well-defined task in their operation and automated that one thing completely — then moved to the next.
Stop teaching your team how to prompt. Start teaching them how to build agents. And start with something small enough to actually finish this week.