The Hybrid Brain Architecture: How Runstack Separates Thinking from Doing

Why we split AI reasoning from deterministic execution — and why it matters for your operations.

Runstack Labs · Engineering · 4 min read · January 25, 2026

The problem with pure AI agents

Large language models are remarkable at understanding intent, generating responses, and reasoning through complex scenarios. But they have a fundamental limitation for operational use: they are probabilistic.

Ask an LLM to process a refund, and it might generate the right API call. Or it might hallucinate a parameter. Or it might decide, based on its training data, that the refund amount should be different from what your policy specifies.

For a creative writing tool, probabilistic behavior is a feature. For a system processing customer refunds, it's a liability.

The problem with pure rule-based systems

On the other end of the spectrum, traditional automation — rule-based workflows, decision trees, if-then logic — is perfectly deterministic. Every action is predictable, auditable, and safe.

But rule-based systems are rigid. They can't handle nuance. They can't understand context. They can't adapt when a customer's request doesn't fit neatly into a predefined category.

"I ordered the blue one but got a slightly different shade and I'm not sure if I want to keep it" doesn't map cleanly to any decision tree.

The Hybrid Brain: combining both

Runstack's architecture solves this by separating the system into three layers:

Layer 1: Reasoning (AI Intelligence)

The bottom layer is powered by Claude. It handles:

  • Intent classification: Understanding what the customer wants
  • Context retrieval: Pulling relevant order data, customer history, and policy information
  • Response generation: Crafting a reply in your brand voice
  • Sentiment analysis: Detecting frustration, urgency, or confusion

This layer is probabilistic by design. It needs to be flexible, contextual, and adaptive. That's what AI does well.
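One way to keep a probabilistic layer from leaking into execution is to have it emit a structured proposal that gets parsed and checked before anything else reads it. The sketch below is illustrative only: the `Proposal` fields, the intent labels, and the JSON shape are assumptions for this example, not Runstack's actual schema, and no model API call is shown.

```python
import json
from dataclasses import dataclass

# Hypothetical intent labels; the article does not list Runstack's real taxonomy.
KNOWN_INTENTS = {"refund_request", "return_request", "shipping_status", "other"}


@dataclass(frozen=True)
class Proposal:
    intent: str
    sentiment: str
    draft_reply: str
    confidence: float


def parse_proposal(raw: str) -> Proposal:
    """Turn the model's JSON text into a typed proposal, or raise ValueError."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("proposal must be a JSON object")
    intent = data.get("intent")
    if intent not in KNOWN_INTENTS:
        raise ValueError(f"unknown intent: {intent!r}")
    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return Proposal(
        intent=intent,
        sentiment=str(data.get("sentiment", "neutral")),
        draft_reply=str(data.get("draft_reply", "")),
        confidence=confidence,
    )
```

Anything that fails to parse never reaches a workflow, so a malformed or invented intent becomes an error to handle rather than an action to run.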

Layer 2: Execution (Deterministic Actions)

The middle layer is powered by n8n workflows. Every action the agent takes — issuing a refund, updating a ticket status, sending a response, looking up shipping data — executes through a predefined, guardrailed workflow.

  • Every workflow has explicit input validation
  • Every action is logged with a full audit trail
  • Dollar thresholds trigger human approval
  • API calls use scoped credentials with minimum necessary permissions

This layer is deterministic by design. No hallucinated API calls. No invented refund amounts. No unauthorized actions.
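The guardrails listed above can be pictured as a small deterministic function. This is a sketch in Python for readability, not an n8n workflow definition; the `RefundRequest` type, the outcome labels, and the in-memory audit list are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class RefundRequest:
    order_id: str
    amount: Decimal


def run_refund_workflow(
    req: RefundRequest,
    order_total: Decimal,
    approval_threshold: Decimal,
    audit_log: list,
) -> str:
    """Validate, apply the dollar threshold, and log. Same inputs, same outcome."""
    if not req.order_id or req.amount <= 0 or req.amount > order_total:
        outcome = "rejected"  # fails explicit input validation
    elif req.amount > approval_threshold:
        outcome = "needs_human_approval"  # over the threshold: a person signs off
    else:
        outcome = "executed"  # a real workflow would call the payment API here
    # Every path is logged, including rejections.
    audit_log.append(
        {"order_id": req.order_id, "amount": str(req.amount), "outcome": outcome}
    )
    return outcome
```

Note that the refund amount is checked against the order total rather than trusted, which is the kind of check that catches an invented amount before it reaches a payment provider.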

Layer 3: Human Control Plane

The top layer is yours. You configure:

  • Approval thresholds: Refunds over $50 require human sign-off (configurable)
  • Escalation rules: Ticket types that always route to humans
  • Guardrail settings: What the agent can and cannot do
  • Policy boundaries: Return windows, discount limits, communication guidelines
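As a rough illustration of what such a configuration might look like in code, here is a minimal sketch. The $50 threshold and the action names come from this article, and the 30-day window from the return example later in it; every field name, the escalation ticket types, and the discount limit are assumptions made for the example rather than Runstack's actual Configuration API.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ControlPlaneConfig:
    refund_approval_threshold: Decimal = Decimal("50")  # from the article
    return_window_days: int = 30  # from the article's return example
    max_discount_percent: int = 10  # hypothetical value
    # Hypothetical ticket types that always route to humans.
    always_escalate: frozenset = frozenset({"legal_complaint", "chargeback"})
    # Actions the agent may take, named after the examples in the article.
    allowed_actions: frozenset = frozenset(
        {"issue_refund", "update_ticket", "send_reply", "lookup_shipping"}
    )

    def __post_init__(self):
        # Reject nonsensical settings at load time, not at ticket time.
        if self.refund_approval_threshold < 0:
            raise ValueError("refund_approval_threshold must be >= 0")
        if self.return_window_days <= 0:
            raise ValueError("return_window_days must be positive")
        if not 0 <= self.max_discount_percent <= 100:
            raise ValueError("max_discount_percent must be between 0 and 100")


def route(config: ControlPlaneConfig, ticket_type: str, action: str) -> str:
    """Return who handles the ticket: 'human', 'blocked', or 'agent'."""
    if ticket_type in config.always_escalate:
        return "human"
    if action not in config.allowed_actions:
        return "blocked"
    return "agent"
```

Escalation rules are checked before the allow-list, so a ticket type marked for humans goes to humans even when the requested action would otherwise be permitted.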

Why the separation matters

The key insight is that understanding a customer's request requires flexibility, but acting on it requires safety.

When Atlas receives a ticket saying "I need to return something but it's been 35 days and your policy says 30," the Reasoning Layer understands the nuance. It recognizes this is a borderline case, reviews the customer's purchase history, and weighs their lifetime value.

But the Execution Layer will not process a 35-day return, because your policy says 30 days. Instead, it escalates to your human team with context: "Customer requesting return outside window. High LTV customer (12 orders). Recommendation: approve exception."

The AI thinks. The guardrails enforce. You decide.
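The 35-day return can be written out as a short sketch. The function and field names here are hypothetical; the point is that the AI's recommendation travels along as text for the reviewer, while the decision to process or escalate depends only on the policy numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    action: str  # "process_return" or "escalate"
    note: str


def decide_return(
    days_since_purchase: int,
    window_days: int,
    order_count: int,
    recommendation: str,
) -> Decision:
    """Apply the return window strictly; never act on the recommendation itself."""
    if days_since_purchase <= window_days:
        return Decision("process_return", "Within return window.")
    note = (
        f"Customer requesting return outside window "
        f"({days_since_purchase} days, policy allows {window_days}). "
        f"{order_count} prior orders. Recommendation: {recommendation}."
    )
    return Decision("escalate", note)


# The scenario from the article: 35 days against a 30-day policy, 12 orders.
decision = decide_return(35, 30, 12, "approve exception")
```

Here `decision.action` is `"escalate"`, and the note carries the recommendation to whoever makes the final call.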

Technical specifications

For the technically curious:

  • Reasoning Layer: Claude API via Anthropic, running in Runstack's infrastructure
  • Execution Layer: Self-hosted n8n with custom node types for e-commerce operations
  • Control Plane: Configuration API with role-based access control
  • Audit Trail: Every decision and action logged with timestamps, inputs, outputs, and confidence scores
  • Data Isolation: Per-customer data boundaries with no cross-contamination between tenants
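To make the audit trail item concrete, a single log entry might look something like the sketch below. The field names, the `step` labels, and the JSON-lines format are assumptions for illustration; the article only states that timestamps, inputs, outputs, and confidence scores are recorded.

```python
import json
from datetime import datetime, timezone


def audit_record(tenant_id, step, inputs, outputs, confidence=None) -> str:
    """Serialize one decision or action as a single JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tenant_id": tenant_id,  # hypothetical key for per-customer isolation
        "step": step,  # e.g. "reasoning" or "execution" (hypothetical labels)
        "inputs": inputs,
        "outputs": outputs,
        "confidence": confidence,  # None for deterministic steps
    }
    return json.dumps(record, sort_keys=True)


line = audit_record(
    tenant_id="tenant-123",
    step="reasoning",
    inputs={"ticket_text": "I need to return something"},
    outputs={"intent": "return_request"},
    confidence=0.91,
)
```

Tagging every record with a tenant identifier is one simple way to keep logs separable per customer.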

What this means for your evaluation

If you're evaluating Runstack as a technical stakeholder, here's what matters:

  1. Your data never trains our models. Customer data is used for real-time context retrieval only.
  2. Every action is auditable. You can trace any decision from customer input to final action.
  3. Guardrails are configurable, not optional. You set the rules. The system enforces them.
  4. Human-in-the-loop is built in, not bolted on. Sensitive actions require human approval by default.
