The problem with pure AI agents
Large language models are remarkable at understanding intent, generating responses, and reasoning through complex scenarios. But they have a fundamental limitation for operational use: they are probabilistic.
Ask an LLM to process a refund, and it might generate the right API call. Or it might hallucinate a parameter. Or it might decide, based on its training data, that the refund amount should be different from what your policy specifies.
For a creative writing tool, probabilistic behavior is a feature. For a system processing customer refunds, it's a liability.
The problem with pure rule-based systems
On the other end of the spectrum, traditional automation — rule-based workflows, decision trees, if-then logic — is perfectly deterministic. Every action is predictable, auditable, and safe.
But rule-based systems are rigid. They can't handle nuance. They can't understand context. They can't adapt when a customer's request doesn't fit neatly into a predefined category.
"I ordered the blue one but got a slightly different shade and I'm not sure if I want to keep it" doesn't map cleanly to any decision tree.
The Hybrid Brain: combining both
Runstack's architecture solves this by separating the system into three layers:
Layer 1: Reasoning (AI Intelligence)
The bottom layer is powered by Claude. It handles:
- Intent classification: Understanding what the customer wants
- Context retrieval: Pulling relevant order data, customer history, and policy information
- Response generation: Crafting a reply in your brand voice
- Sentiment analysis: Detecting frustration, urgency, or confusion
This layer is probabilistic by design. It needs to be flexible, contextual, and adaptive. That's what AI does well.
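To make the hand-off concrete, here is a minimal sketch in Python of how probabilistic output could be narrowed into a fixed shape before anything downstream sees it. It assumes the reasoning layer returns JSON. The field names, intent labels, and sentiment labels are illustrative assumptions, not Runstack's actual schema.

```python
import json
from dataclasses import dataclass

# Hypothetical label sets, chosen for illustration only.
KNOWN_INTENTS = {"refund_request", "return_request", "order_status", "other"}
KNOWN_SENTIMENTS = {"neutral", "frustrated", "urgent", "confused"}


@dataclass(frozen=True)
class ReasoningResult:
    intent: str
    sentiment: str
    draft_reply: str


def parse_reasoning_output(raw: str) -> ReasoningResult:
    """Turn the model's raw JSON text into a constrained, typed result."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object from the reasoning layer")

    intent = data.get("intent")
    sentiment = data.get("sentiment")

    # Labels outside the known sets collapse to a safe default
    # instead of flowing downstream unchanged.
    if intent not in KNOWN_INTENTS:
        intent = "other"
    if sentiment not in KNOWN_SENTIMENTS:
        sentiment = "neutral"

    return ReasoningResult(intent, sentiment, str(data.get("draft_reply", "")))
```

The model stays free to interpret the message however it likes, but only a small, known vocabulary leaves this layer.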
Layer 2: Execution (Deterministic Actions)
The middle layer is powered by n8n workflows. Every action the agent takes — issuing a refund, updating a ticket status, sending a response, looking up shipping data — executes through a predefined, guardrailed workflow.
- Every workflow has explicit input validation
- Every action is logged with a full audit trail
- Actions above a configured dollar threshold require human approval
- API calls use scoped credentials with minimum necessary permissions
This layer is deterministic by design. No hallucinated API calls. No invented refund amounts. No unauthorized actions.
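The checks above can be sketched in a few lines. This is a simplified Python illustration, not an n8n workflow definition. The `RefundRequest` type, the `execute_refund` function, and the three outcome strings are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class RefundRequest:
    order_id: str
    amount: Decimal       # amount proposed by the reasoning layer
    order_total: Decimal  # amount actually charged, per the order system


def execute_refund(req: RefundRequest, approval_threshold: Decimal, audit_log: list) -> str:
    """Validate a proposed refund and return "rejected", "needs_approval", or "executed"."""
    # Explicit input validation: the proposal is checked against order data.
    if not req.order_id or req.amount <= 0 or req.amount > req.order_total:
        outcome = "rejected"
    # Dollar threshold: larger refunds wait for a human decision.
    elif req.amount > approval_threshold:
        outcome = "needs_approval"
    else:
        # A real workflow would call the payment API here using scoped credentials.
        outcome = "executed"

    # Every decision is recorded, whatever the outcome.
    audit_log.append({"order_id": req.order_id, "amount": str(req.amount), "outcome": outcome})
    return outcome
```

With a $50 threshold, a $20 refund on an $80 order executes, a $75 refund waits for approval, and a $500 refund on the same order is rejected because it exceeds what was charged. All three leave an audit entry.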
Layer 3: Human Control Plane
The top layer is yours. You configure:
- Approval thresholds: Refunds over $50 require human sign-off (configurable)
- Escalation rules: Ticket types that always route to humans
- Guardrail settings: What the agent can and cannot do
- Policy boundaries: Return windows, discount limits, communication guidelines
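As a rough picture of what these settings might look like in one place, here is a hypothetical Python sketch. The $50 refund threshold and the 30-day return window come from this article. Every other field name and default value is an assumption made for illustration, not Runstack's actual configuration format.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ControlPlaneConfig:
    # Approval thresholds
    refund_approval_threshold: Decimal = Decimal("50")
    # Policy boundaries
    return_window_days: int = 30
    max_discount_percent: int = 10  # hypothetical value
    # Escalation rules: ticket types that always go to a human (hypothetical names)
    always_escalate: frozenset = frozenset({"legal_complaint", "chargeback"})
    # Guardrail settings: the only actions the agent may take (hypothetical names)
    allowed_actions: frozenset = frozenset(
        {"issue_refund", "update_ticket", "send_reply", "lookup_shipping"}
    )

    def permits(self, action: str, ticket_type: str) -> bool:
        """True only if the agent may take this action on this ticket type unaided."""
        return action in self.allowed_actions and ticket_type not in self.always_escalate
```

Keeping the configuration immutable (`frozen=True`) means nothing can change a rule in the middle of handling a ticket. Changing a rule means creating a new configuration.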
Why the separation matters
The key insight is that understanding a customer's request requires flexibility, but acting on it requires safety.
When Atlas receives a ticket saying "I need to return something but it's been 35 days and your policy says 30," the Reasoning Layer understands the nuance. It recognizes this is a borderline case, reads the customer's purchase history, and considers the customer's lifetime value.
But because your policy sets a 30-day window, the Execution Layer does not process the 35-day return. Instead, it escalates to your human team with context: "Customer requesting return outside window. High LTV customer (12 orders). Recommendation: approve exception."
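The 35-day example can be walked through in code. This is a minimal Python sketch under stated assumptions. `ReturnAssessment` and `route_return` are hypothetical names, and the recommendation string stands in for whatever the Reasoning Layer produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReturnAssessment:
    days_since_purchase: int
    order_count: int
    recommendation: str  # advisory text from the reasoning layer


def route_return(assessment: ReturnAssessment, return_window_days: int = 30) -> dict:
    """Apply the return window as a hard rule; the recommendation never overrides it."""
    if assessment.days_since_purchase <= return_window_days:
        return {"action": "process_return"}

    # Outside the window: no automatic return, only an escalation with context.
    return {
        "action": "escalate_to_human",
        "context": (
            f"Customer requesting return outside window "
            f"({assessment.days_since_purchase} days, policy {return_window_days}). "
            f"{assessment.order_count} prior orders. "
            f"Recommendation: {assessment.recommendation}."
        ),
    }
```

The recommendation travels with the ticket, but it has no say in the branch taken. Only the day count and the configured window decide that.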
The AI thinks. The guardrails enforce. You decide.
Technical specifications
For the technically curious:
- Reasoning Layer: Claude, accessed through the Anthropic API from Runstack's infrastructure
- Execution Layer: Self-hosted n8n with custom node types for e-commerce operations
- Control Plane: Configuration API with role-based access control
- Audit Trail: Every decision and action logged with timestamps, inputs, outputs, and confidence scores
- Data Isolation: Per-customer data boundaries with no cross-contamination between tenants
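For a sense of what a single audit entry could contain, here is an illustrative Python sketch. The fields mirror the list above (timestamp, inputs, outputs, confidence score) plus a tenant identifier for isolation. The `AuditRecord` type and the helper names are assumptions, not Runstack's actual log format.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    tenant_id: str     # keeps each customer's entries separable
    action: str
    inputs: dict
    outputs: dict
    confidence: float  # score reported by the reasoning layer, 0.0 to 1.0
    timestamp: str     # ISO 8601, UTC


def make_audit_record(tenant_id: str, action: str, inputs: dict,
                      outputs: dict, confidence: float) -> AuditRecord:
    """Build one audit entry, stamping it with the current UTC time."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return AuditRecord(tenant_id, action, inputs, outputs, confidence,
                       timestamp=datetime.now(timezone.utc).isoformat())


def to_log_line(record: AuditRecord) -> str:
    """Serialize an entry as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```

One JSON object per line keeps the trail easy to append to, search, and filter by tenant.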
What this means for your evaluation
If you're evaluating Runstack as a technical stakeholder, here's what matters:
- Your data never trains our models. Customer data is used for real-time context retrieval only.
- Every action is auditable. You can trace any decision from customer input to final action.
- Guardrails are configurable, not optional. You set the rules. The system enforces them.
- Human-in-the-loop is built in, not bolted on. Sensitive actions require human approval by default.