Most “AI agents” are advanced automation.
That’s where the money is.
Operator view: a map of automation levels, where ROI actually lives, and two working examples you can click through. No vendor pitches, no hype.
The Levels
Adapted from Cobus Greyling’s 5 Levels, simplified for operators. The marked rows are where production systems live reliably right now.
| Lvl | Name | What it actually is | Example | Risk |
|---|---|---|---|---|
| 0 | Manual | A human does every step | Spreadsheet passed between teams | Slow, error-prone |
| 1 | Scripted Automation | Cron jobs, scripts, deterministic chains. No LLM. | Weekly poll of HR system, scripted provisioning | Low — breaks loudly |
| 2 | Smart Forms | Deterministic chain + LLM for one step (parse, draft, classify). Existing rules still route. | Seller describes case in chat → AI fills routing fields → rules route | Low — LLM translates, doesn’t decide |
| 3 | Action-Taking Agent | LLM plans across multiple steps with intermediate feedback. Takes actions in narrow, known domains. Human approves irreversible steps. | Onboarding pipeline: detect new hire → draft → provision → flag exceptions | Medium — needs guardrails, audit logs, human approval gates |
| 4 | Autonomous Operator | LLM makes every decision dynamically | “Just handle it” agent | High — this is where prod incidents come from |
| 5 | Digital Employee | Fully autonomous, persists state, learns continuously | Not reliably achieved | Don’t try |
Where ROI Lives
The pattern that works
Level 2 to low Level 3. A deterministic workflow with an LLM plugged into the one step that actually needs it: parsing messy input, drafting personalized copy, or classifying ambiguous cases. Fortune 500 production agents run about five steps, call one model, and hand off to a human for approval. That’s the shape of every system delivering measurable ROI right now.
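To make the shape concrete, here is a minimal sketch of that pattern, not a production implementation. Everything named in it is an assumption for illustration: the `ROUTING_RULES` table, the queue names, and `keyword_classifier`, which stands in for the single model call so the example runs without any vendor SDK.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical routing table: these are the existing deterministic rules.
ROUTING_RULES = {
    "billing": "finance-queue",
    "access": "it-queue",
    "other": "triage-queue",
}


@dataclass
class RoutedCase:
    text: str
    category: str
    queue: str
    approved: bool


def keyword_classifier(text: str) -> str:
    """Stand-in for the one LLM step; swap in a real model call here."""
    lowered = text.lower()
    if "invoice" in lowered or "refund" in lowered:
        return "billing"
    if "password" in lowered or "login" in lowered:
        return "access"
    return "other"


def route_case(
    text: str,
    classify: Callable[[str], str],
    approve: Callable[[RoutedCase], bool],
) -> RoutedCase:
    category = classify(text)
    # The model only translates. Anything outside the known labels
    # falls back to triage instead of becoming a new route.
    if category not in ROUTING_RULES:
        category = "other"
    case = RoutedCase(text, category, ROUTING_RULES[category], approved=False)
    # A human (or a UI callback) signs off before anything is sent.
    case.approved = approve(case)
    return case
```

The design choice worth noticing is that the classifier's output is checked against the rule table before it is used. The rules still route, and a bad model answer degrades to the triage queue rather than to an unexpected action.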
What MIT found in 2025
95% of enterprise AI deployments delivered zero measurable bottom-line impact. The failure isn’t the model — it’s the approach. Teams build Level 4 when Level 2 would have worked and been reliable.
Example 1 — Onboarding GUI (Level 1, scripted)
Before: a spreadsheet of new hires passed between HR, IT, and enablement. After: an operator uploads the same sheet, the tool maps titles to tools, drafts emails, generates provisioning queries. Click to send. No LLM required.
How it's done: sheet parsed locally, title-to-tool mapping comes from a YAML config the ops team owns. Email drafts open in Outlook. Provisioning commands are shell + SOQL the operator runs themselves. Zero AI in this loop — and that's the point.
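A rough sketch of that loop, under stated assumptions: the sheet is treated as CSV with `name` and `title` columns, and a plain dict stands in for the YAML config so the example needs nothing beyond the standard library. The titles, tool names, and email wording are all hypothetical.

```python
import csv
import io

# Stand-in for the ops-owned YAML config (hypothetical titles and tools).
TITLE_TO_TOOLS = {
    "account executive": ["CRM", "Call Recorder"],
    "sales engineer": ["CRM", "Demo Sandbox"],
}
# Assumed baseline every hire receives.
DEFAULT_TOOLS = ["Email", "Chat"]


def parse_hires(sheet_text: str) -> list[dict]:
    """Parse the uploaded sheet locally and trim stray whitespace."""
    reader = csv.DictReader(io.StringIO(sheet_text))
    return [{key: (value or "").strip() for key, value in row.items()} for row in reader]


def tools_for(title: str) -> list[str]:
    """Deterministic lookup: unknown titles get only the defaults."""
    return DEFAULT_TOOLS + TITLE_TO_TOOLS.get(title.strip().lower(), [])


def draft_email(hire: dict) -> str:
    """Fill a fixed template; the operator reviews it and clicks send."""
    tools = ", ".join(tools_for(hire["title"]))
    return (
        f"Subject: Welcome, {hire['name']}\n\n"
        f"Hi {hire['name']},\nYour accounts: {tools}."
    )
```

For example, `draft_email(parse_hires("name,title\nAda,Account Executive\n")[0])` produces a welcome draft listing the four tools for that title. An unrecognized title never fails; it just gets the default stack, which the operator can spot and correct in the config.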
GUI:
- ✓ Cron polls HR: detects hires & leavers
- ✓ Maps job title → tool stack
- ✓ Sends templated welcome emails
- ✓ Runs provisioning + license cleanup
- ✓ Records assigned course + checklist
Database: who got hired · their tools · their course · their checklist
Agent:
- ✓ Reads the GUI's database
- ✓ Answers new hire questions in chat
- ✓ Guides them through their course + checklist
- ✓ Can't answer? Finds the tool/topic owner
- ◆ Messages that expert and routes the answer back
The GUI does the deterministic work and leaves behind a clean record of every hire. The agent treats that record as its source of truth — then does what the GUI can't: hold a conversation, walk someone through their course, and reach out to the right expert when it hits a question it can't answer. The GUI runs the process; the agent helps the human.
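One way to picture the handoff is a shared database that the GUI writes and the agent only reads. The sketch below uses an in-memory SQLite database. The table names, columns, and sample rows are assumptions made for illustration, not the real schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Simulate the record the GUI leaves behind (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE hires (email TEXT PRIMARY KEY, name TEXT, course TEXT);
        CREATE TABLE hire_tools (email TEXT, tool TEXT);
        CREATE TABLE tool_owners (tool TEXT PRIMARY KEY, owner TEXT);
        """
    )
    conn.execute(
        "INSERT INTO hires VALUES (?, ?, ?)",
        ("ada@example.com", "Ada", "Sales Basics"),
    )
    conn.executemany(
        "INSERT INTO hire_tools VALUES (?, ?)",
        [("ada@example.com", "CRM"), ("ada@example.com", "Chat")],
    )
    conn.execute(
        "INSERT INTO tool_owners VALUES (?, ?)",
        ("CRM", "crm-owner@example.com"),
    )
    conn.commit()
    return conn


def hire_profile(conn: sqlite3.Connection, email: str):
    """Agent side: read-only lookup of who this person is."""
    row = conn.execute(
        "SELECT name, course FROM hires WHERE email = ?", (email,)
    ).fetchone()
    if row is None:
        return None
    tools = [
        tool
        for (tool,) in conn.execute(
            "SELECT tool FROM hire_tools WHERE email = ? ORDER BY tool", (email,)
        )
    ]
    return {"name": row[0], "course": row[1], "tools": tools}


def owner_for(conn: sqlite3.Connection, tool: str):
    """Find the expert to contact when the agent cannot answer itself."""
    row = conn.execute(
        "SELECT owner FROM tool_owners WHERE tool = ?", (tool,)
    ).fetchone()
    return row[0] if row else None
```

The agent never writes to these tables in this sketch. Keeping the write path in the GUI is what lets the record act as a source of truth.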
The Agent — Levels 1 → 4
The GUI fires the trigger and keeps the database current. The agent reads from it and builds on top — answering, guiding, and reaching out to humans and other agents. Click through each level to see what gets added.
Level 1 · Read-only
How it's done: Reads the GUI's database to know who this person is: their assigned course, their checklist, their tool stack. RAG over onboarding docs answers the rest.
When it doesn't know, it dead-ends at "ask someone." That's the ceiling of Level 1 — it talks, it never acts.
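A toy version of that Level 1 behavior might look like the following. It is a sketch under loud assumptions: the `profile` dict stands in for the database lookup, `DOCS` is a made-up two-line corpus, and crude word-overlap matching stands in for real retrieval. A real RAG setup would use embeddings and a model to phrase the answer.

```python
import re

# Hypothetical onboarding docs; a real system would index many more.
DOCS = [
    "Expense reports are submitted through the finance portal by the 5th of each month.",
    "VPN access requires installing the company certificate first.",
]


def tokens(text: str) -> set[str]:
    """Lowercase word set used for the crude overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: list[str], min_overlap: int = 2):
    """Return the best-matching doc, or None if nothing matches well enough."""
    q = tokens(question)
    best = max(docs, key=lambda d: len(q & tokens(d)), default=None)
    if best is None or len(q & tokens(best)) < min_overlap:
        return None
    return best


def answer(question: str, profile: dict, docs: list[str] = DOCS) -> str:
    q = tokens(question)
    # Facts about this person come from the record, not from the docs.
    if "course" in q:
        return f"Your assigned course is {profile['course']}."
    if "tools" in q:
        return "Your tools: " + ", ".join(profile["tools"]) + "."
    hit = retrieve(question, docs)
    # Level 1 ceiling: no match means a dead end, never an action.
    return hit if hit else "I don't know. Please ask someone on your team."
```

Notice there is no code path that sends a message or changes a record. The only outputs are strings, which is what "read-only" means in practice.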