Episode 2 · March 10, 2026

Price Your AI Services: The 3-Tier Playbook Nomads Use to Hit $10K MRR

Intro

This episode is for AI freelancers and agency owners at $0-10K MRR who need a defensible, async-friendly pricing model that clients understand and will renew. You'll get a complete repackaging framework, real margin math, and the tools to price your work based on outcomes, not compute costs.

In This Episode

Santi and Kira tackle the biggest mistake nomad AI builders make: pricing their services like subscriptions instead of business infrastructure. Through a detailed case study, they show how to transform a basic $199/month lead follow-up agent into a $2,500 setup plus $750-$1,500 monthly retainer by adding custom personality training, real-time reporting, and human handoff SOPs. They break down the actual math—mid-tier agents cost $25-$50/month in tokens but require 8-12 hours of monthly maintenance for monitoring, evals, and incident response. The episode covers industry pricing multipliers (healthcare and finance pay double), SLA premiums (2-hour response adds 50%), and compliance levers like data residency requirements. They introduce the "Lisbon Test" for truly location-independent services and provide a pricing calculator to find your real break-even points and operational capacity limits.

Key Takeaways

Token costs have dropped 600x since 2020, making compute 5% of your delivery cost—you're selling reliability and outcomes, not processing power
The three-tier structure (Starter $500-750, Growth $1,200-2,000, Enterprise $3,000+) works because clients need to see where they fit, not guess at value
Your retainer funds monitoring, evals, version management, and incident response—maintenance often exceeds initial build costs over 12-18 months

Companion Resource

The 3‑Tier Pricing Playbook for AI Services (Nomad‑Proof, Setup + Retainer)

Turn your AI automations into a setup + retainer offer with clear tiers, SLAs, and ops that pass the Lisbon Test. Built for nomad founders who want reliable margins and a straight path to $10K MRR without selling hours.

Digital Agency Network, AI Agency Pricing Guide 2026
digitalagencynetwork.com
- AI automation ‘build’ projects commonly priced at $2,500–$15,000+ with monitoring/optimization retainers from $500–$5,000+/month in 2026.
GSA Advantage, Y Point Solutions LLC Price List (47QTCA24D00CM)
gsaadvantage.gov
- GSA MAS (SIN 54151S) lists Artificial Intelligence Engineer rates at $125.72–$196.10/hour for 2025–2026 across Level 1–4 roles (Y Point Solutions LLC).
GSA Advantage, Xfinion (GS‑35F‑339CA) SIN 54151S pricing table 2025–2030
gsaadvantage.gov
- Typical IT services under GSA MAS show $100–$260+/hour bands for roles relevant to LLM/automation delivery (e.g., Software Engineer $130/hr; DevSecOps Engineer $241/hr for 2025–2026).
OpenAI, ‘Introducing GPT‑5.4 mini and nano’
openai.com
- OpenAI introduced GPT‑5.4 nano at $0.20 per 1M input tokens and $1.25 per 1M output tokens (API).
OpenAI, ‘GPT‑4o mini: advancing cost‑efficient intelligence’
openai.com
- GPT‑4o mini (2024) priced at $0.15 per 1M input tokens and $0.60 per 1M output tokens, more than 60% cheaper than GPT‑3.5 Turbo at the time.
ArXiv, ‘Tiered Super‑Moore’s Law: Price Evolution… LLM Inference Services’ (2026)
arxiv.org
- A 2026 economic analysis finds an ~600‑fold decline in LLM token prices from 2020–2026, with a structural break in May 2024 as competition accelerated price cuts.
OpenAI API pricing pages (localized)
openai.com
- OpenAI API documentation indicates a 10% uplift for data‑residency endpoints on certain GPT‑5.x models.
TechRadar Pro coverage of OVH CEO forecast
techradar.com
- Cloud provider OVH’s CEO warned of 5–10% price rises for certain cloud products by April–September 2026 due to AI‑driven hardware demand (DDR4/DDR5 price spikes).
Reddit r/AiAutomations, ‘What running five AI agents actually costs’
reddit.com
- Live founder report: single ‘mid‑tier’ agents often consume 5–10M tokens/month, yielding ~$25–$50/month raw inference cost at $2.50–$5 per 1M tokens.
Promethean Research, 2025 Digital Agency Industry Report
prometheanresearch.com
- Promethean Research (2025) finds pure value‑based pricing used by ~2% of digital agencies; most blend T&M, fixed‑bid, and retainers.
NeurIPS, ‘Hidden Technical Debt in Machine Learning Systems’
proceedings.neurips.cc
- Maintenance/operations for AI systems are not trivial: Sculley et al. (2015) identify ‘hidden technical debt’ where ongoing system complexity often dominates initial model code.
MyFlowMind; Sockly.ai pricing pages
sockly.ai
- Named agency retainers at $2,497/month (MyFlowMind) and ‘setup + monthly’ models at $2,500 setup + $1,500/mo (Sockly) are publicly listed in 2026.
MyFlowMind (automation/AI agency) pricing page
myflowmind.com
- MyFlowMind
- Named, current agency offering with an explicit high‑tier retainer that fits the episode’s Growth/Enterprise tier conversation.
LowCode Agency blog (n8n)
lowcode.agency
- LowCode Agency (n8n development firm)
- Public guidance that retainers typically run $500–$3,000/mo for ongoing automation support, a concrete range for the Starter/Growth tiers.
Sockly.ai pricing
sockly.ai
- Sockly.ai (AI receptionist/agent provider)
- Clear ‘setup + monthly’ pricing structure useful for illustrating repackaging from low‑MRR to setup+retainer.
CallReply pricing page
callreply.com
- CallReply (AI receptionist)
- Example of a premium monthly plan with explicit setup fee; supports top‑tier packaging and SLA discussions.
Reddit r/n8nbusinessautomation
reddit.com
- Founder thread: ‘Charging $2,500 for this’
- Live practitioner signal for repackaging common ‘lead follow‑up’ agents into a setup + sub‑$1k/mo retainer.
Reddit r/agency
reddit.com
- Founder thread sanity‑checking $1.5k/mo agent pricing
- Shows migration path from low‑priced ‘$149–$199/mo’ bots to $1.5k+/mo value‑based agent packages tied to reclaimed hours/ROI.
Reddit r/AiAutomations
reddit.com
- ‘What running five AI agents actually costs’
- Concrete raw‑cost datapoints to ground margin math and cost‑sensitivity (tokens often $25–$50/mo per mid‑tier agent).

Santi: If you're charging one hundred ninety-nine dollars a month for something your client would happily pay twenty-five hundred for, that's on you.

Kira: Twenty-five hundred. For the same deliverable.

Santi: Same deliverable. Same agent. Same outcome. The difference? One is priced like a Zapier subscription. The other is priced like what it actually is — a business system that runs twenty-four seven without them thinking about it.

Kira: Okay but here's what you're not considering — most nomads starting out, they don't have the confidence to ask for twenty-five hundred. They're sitting in a hostel in Canggu thinking "who am I to charge enterprise rates?"

Santi: No no no, not enterprise rates. We're talking about, what, like twelve hundred MRR per client at seventy percent margins after you factor in the token costs, the QA hours, the incident response. That's not enterprise. That's just... not underpricing yourself into the Bali Trap.

Kira: The Bali Trap.

Santi: Where your pricing only works if you're living on three dollars a meal. Look, token costs have dropped six-hundred-fold since 2020. Six hundred. GPT-5.4 nano is twenty cents per million input tokens. The compute is basically free at this point. So why are agencies still pricing like they're selling compute?

Kira: Because they don't know how to price the other stuff. The monitoring. The evals. The "my model just started hallucinating client names at two AM" response plan.

Santi: Here's what actually happened to me last month. I had this real estate guy — been running his lead follow-up bot for one ninety-nine a month. One ninety-nine. He calls me up, says a competitor just quoted him twenty-five hundred setup plus seven fifty monthly for the exact same thing. And he's asking me why mine is so cheap.

Kira: He's asking why you're too cheap.

Santi: He's literally concerned that I'm too cheap. That maybe my thing doesn't do what theirs does. And I'm sitting there realizing — I trained him to see this as a subscription, not as critical infrastructure.

Kira: If you're running a book of five clients or more and you're still pricing your AI services like SaaS subscriptions, you are leaving thousands on the table every month. That's what today is about.

Santi: We're going to show you exactly how to repackage what you're already building — same agents, same workflows — into a three-tier structure that clients actually understand and will pay real money for.

Kira: The problem starts with how we think about what we're selling. Most nomads building AI services think they're selling automation. They're not. They're selling outcomes that happen to be delivered through automation.

Santi: Right, right — and this is where the hourly billing completely falls apart. Because if I bill you a hundred fifty an hour to build an agent, and it takes me ten hours, that's fifteen hundred bucks. But if that agent saves you twenty hours a week for the next year—

Kira: You've massively underpriced.

Santi: Massively. And here's the data on this — GSA rate cards have AI engineers at one twenty-six to one ninety-six an hour in 2026. Government rates. So when you're charging less than a hundred an hour effective rate, you're signaling that either your scope is limited, your SLAs are weak, or you're not meeting enterprise standards.

Kira: Which might be fine if you're just starting out. But it's not sustainable past Visa Run Revenue.

Santi: Three thousand a month.

Kira: Three thousand a month minimum to sustain nomad life with visa runs, coworking spaces, decent internet redundancy. So let's talk about how to get there. And more importantly, how to get past it.

Santi: The repackaging. This is where it gets interesting. So you've got these agencies charging one forty-nine, one ninety-nine a month for lead follow-up bots. Reddit's full of these discussions — "is one ninety-nine too much for an AI agent?" Meanwhile, Sockly is charging twenty-five hundred setup plus fifteen hundred monthly for basically the same thing.

Kira: Same underlying tech.

Santi: Same GPT-4 or Claude calls. Same basic workflow. But completely different packaging. So here's what we did — we took a basic lead follow-up agent and rebuilt the entire offer structure around it.

Kira: Walk me through the before and after.

Santi: Before: one ninety-nine a month, unlimited messages, basic SMS integration, weekly reports. That's it. After: twenty-five hundred dollar setup fee, seven fifty to fifteen hundred monthly retainer depending on volume, and here's the key — we added three things that cost us almost nothing but completely changed the value perception.

Kira: Which three?

Santi: First, the agent itself with custom personality training and response templates. Second, real-time reporting dashboard with conversion tracking. Third — and this is critical — the human handoff SOP with escalation triggers.

Kira: The SOP changes everything.

Santi: Because now it's not just an agent. It's a system. And systems are worth more than tools. Let me show you the math on this. Mid-tier agent, five to ten million tokens a month based on what people are actually reporting. At the two-fifty to five dollars per million tokens those founders are paying, that's roughly twenty-five to fifty dollars in raw compute. On something like GPT-5.4 nano, at twenty cents per million input tokens, it's even less.

Kira: Fifty dollars.

Santi: Fifty dollars maximum. Add three to six hours of QA per month at, let's say, forty an hour internal cost. Another one to two hours for incident response budget. You're looking at total delivery cost of maybe three hundred dollars for a thousand-dollar monthly retainer. That's seventy percent margins.
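
The delivery-cost arithmetic above can be sketched in a few lines of Python. This is a rough model, not the hosts' calculator: the function names are made up for illustration, and pricing incident-response hours at the same $40 internal rate as QA is an assumption, since the episode only gives a rate for QA.

```python
def delivery_cost(token_cost, qa_hours, incident_hours, hourly_cost=40.0):
    """Monthly delivery cost in dollars: raw inference plus internal labor."""
    # Assumption: incident-response hours cost the same internal rate as QA.
    labor = (qa_hours + incident_hours) * hourly_cost
    return token_cost + labor


def gross_margin(retainer, cost):
    """Gross margin as a fraction of the monthly retainer."""
    return (retainer - cost) / retainer


# Low end: $25 of tokens, 3 QA hours, 1 incident-response hour.
low = delivery_cost(25, qa_hours=3, incident_hours=1)
# High end: $50 of tokens, 6 QA hours, 2 incident-response hours.
high = delivery_cost(50, qa_hours=6, incident_hours=2)

for label, cost in (("low", low), ("high", high)):
    margin = gross_margin(1000, cost)
    print(f"{label}: ${cost:.2f} delivery cost, {margin:.1%} gross margin")
```

On a $1,000 retainer the two ends come out at $185 and $370 of delivery cost, so the "maybe three hundred dollars" and seventy percent figures sit inside that range.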

Kira: But here's what actually happens when you're on the ground — clients don't care about your margins. They care about their outcomes. So the question isn't "what does it cost you?" The question is "what result are they buying?"

Santi: Time to first touch under five minutes. Booked calls per week. Lead qualification accuracy. Those are the metrics that matter.

Kira: And this is where the three-tier structure comes in. Because not every client needs the same level of service, but they all need to see where they fit.

Santi: Starter tier — five hundred to seven fifty monthly. Basic agent, email only, next-business-day support, shared Slack channel. This is for people testing the waters.

Kira: Growth tier — where most clients should land — twelve hundred to two thousand monthly. Multi-channel agent, SMS and email, four-hour response SLA, dedicated Slack channel, custom reporting dashboard.

Santi: Enterprise tier — three thousand plus. Multiple agents, API access, two-hour response SLA, white-label options, on-premise deployment if needed. And here's where it gets interesting — data residency requirements.

Kira: The compliance premium.

Santi: OpenAI charges ten percent more for data residency endpoints. Ten percent. But you can charge thirty to fifty percent more to the client because you're taking on the compliance burden. Healthcare, finance, government contractors — they'll pay it without blinking.
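
A small sketch of the spread Santi is describing. The $50 token bill and $1,500 retainer are example figures borrowed from elsewhere in the episode, the function name is hypothetical, and applying the 10% uplift to the whole token bill is an assumption.

```python
def compliance_spread(token_cost, retainer, uplift=0.10, premium=(0.30, 0.50)):
    """Return (extra monthly token cost, (low, high) extra monthly revenue)."""
    extra_cost = token_cost * uplift
    extra_revenue = tuple(retainer * p for p in premium)
    return extra_cost, extra_revenue


extra_cost, (rev_low, rev_high) = compliance_spread(token_cost=50, retainer=1500)
print(f"extra token cost: ${extra_cost:.2f}")
print(f"extra revenue: ${rev_low:.0f} to ${rev_high:.0f}")
```

The sketch leaves out the labor behind the compliance burden (audit logs, agreements, reviews), which is the real cost the premium has to cover.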

Kira: Imagine you're a medical practice. HIPAA compliance isn't optional. So when someone says "we can run this in a HIPAA-compliant environment with BAA agreements and audit logs," that's not a nice-to-have.

Santi: It's table stakes. And it completely changes the pricing conversation. Same with on-premise deployment. Yes, it's more complex. Yes, the cloud costs might actually go up — OVH is predicting five to ten percent increases in 2026 because of hardware demand. But enterprise clients will pay three to five X for on-prem.

Kira: Because they're not buying compute. They're buying control.

Santi: Let's talk about the Lisbon Test.

Kira: Your framework for everything.

Santi: Can you run this from a Lisbon café with sketchy wifi? That's the test. Because if your entire business breaks when the internet drops for twenty minutes, you don't have a location-independent business. You have a location-dependent business that happens to be remote.

Kira: So what passes the test?

Santi: Async operations. Monitoring that alerts you before clients notice problems. Cache strategies so your agents keep running even if the primary model is down. Fallback models — if GPT-5 is having issues, can you automatically switch to Claude? And most importantly, a human runbook that a VA could follow if everything breaks while you're on a flight.
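
One way to picture the fallback-plus-cache idea is the sketch below. It is a minimal illustration, not the hosts' stack: the model callables are stand-ins rather than real provider clients, and serving the last cached reply only after every model has failed is one assumed policy among several.

```python
from typing import Callable, Dict, Sequence


class AllModelsFailed(RuntimeError):
    """Every model failed and nothing is cached: time for the human runbook."""


def reply_with_fallback(prompt: str,
                        models: Sequence[Callable[[str], str]],
                        cache: Dict[str, str]) -> str:
    """Try each model in order, then fall back to the last cached reply."""
    for call_model in models:
        try:
            reply = call_model(prompt)
        except Exception:
            continue  # This model is down or erroring; try the next one.
        cache[prompt] = reply  # Remember the latest good reply.
        return reply
    if prompt in cache:
        return cache[prompt]
    raise AllModelsFailed(f"no model or cache entry for: {prompt!r}")


# Stand-in callables; real code would wrap each provider's client here.
def primary(prompt: str) -> str:
    raise TimeoutError("primary model unavailable")


def secondary(prompt: str) -> str:
    return f"[secondary] reply to: {prompt}"


cache: Dict[str, str] = {}
print(reply_with_fallback("New lead: Ana", [primary, secondary], cache))
```

Raising a dedicated exception when everything fails gives monitoring a single signal to alert on, which is the point where the runbook takes over.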

Kira: That runbook is what you're really selling in the retainer.

Santi: Exactly. The retainer isn't paying for compute. It's paying for the monitoring, the evals, the version management, the incident response. Softment's LLMOps retainer — they explicitly list weekly QA, regression testing, incident support. That's what a real retainer includes.

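Kira: And this is the important part: maintenance isn't trivial. There's research from 2015, the "Hidden Technical Debt in Machine Learning Systems" paper, still cited everywhere, showing that the ongoing operational complexity of ML systems often dominates the initial model code. That's why maintenance often ends up exceeding the initial build cost over twelve to eighteen months.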

Santi: Model drift is real. Prompt degradation is real. API changes — we talked about this last episode — they will break your stuff if you're not watching.

Kira: So when a client says "why should I pay fifteen hundred a month when tokens only cost fifty dollars?" you have a real answer.

Santi: Here's what actually happened — I started tracking every hour I spent on client work. Not just building. Everything. The random Slack message at 10 PM asking why responses seem slower. The monthly audit to check if prompts are still performing. The emergency patch when OpenAI changes their function calling format. It adds up to eight to twelve hours per client per month.

Kira: Even for a "stable" system.

Santi: Especially for a stable system. Because stable means someone is actively maintaining it.

Kira: Let's address the elephant in the room. The objection every client has: "We can build this in-house."

Santi: Sure. They can also build their own CRM.

Kira: But they don't. So what's the real objection?

Santi: They think it's just ChatGPT with a wrapper. They see these Twitter threads — "I built an AI agent in 30 minutes" — and think that's the whole thing. They don't see the error handling, the retry logic, the token optimization, the context window management, the hallucination detection, the—

Kira: Okay but you can't just list features at them. They need to feel the difference.

Santi: This is why we do pilots. Thirty-day pilot, capped usage, clear success metrics, and — this is key — a rollback clause. If it doesn't hit the metrics, they can walk away, no questions asked.

Kira: But here's what actually happens — you make the pilot so good they can't imagine going back.

Santi: And you include a cost transparency appendix. Show them the token logs. Show them the cache hit rates. Show them exactly where their money is going. Because once they see that compute is five percent of the cost and operations is ninety-five percent, the conversation changes.
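
A cost transparency appendix can start from something as small as the summary below. The log layout and field names are hypothetical, cache hits are assumed to cost nothing, and the prices are the GPT-5.4 nano rates from the show notes ($0.20 per 1M input tokens, $1.25 per 1M output tokens).

```python
from dataclasses import dataclass


@dataclass
class Call:
    input_tokens: int
    output_tokens: int
    cache_hit: bool  # True when the reply came from a cache, not the model


def summarize(calls, input_price_per_m, output_price_per_m):
    """Roll a usage log up into totals a client can read."""
    # Assumption: cached replies incur no token charge.
    billable = [c for c in calls if not c.cache_hit]
    cost = sum(c.input_tokens * input_price_per_m
               + c.output_tokens * output_price_per_m
               for c in billable) / 1_000_000
    hit_rate = sum(c.cache_hit for c in calls) / len(calls) if calls else 0.0
    return {"calls": len(calls), "cache_hit_rate": hit_rate, "token_cost": cost}


log = [
    Call(1000, 200, False),
    Call(1000, 200, True),
    Call(2000, 400, False),
    Call(1000, 200, True),
]
print(summarize(log, input_price_per_m=0.20, output_price_per_m=1.25))
```

Putting the token cost next to the retainer in the same report is what makes the compute-versus-operations split visible to the client.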

Kira: From "why is this so expensive" to "what else can we automate."

Santi: The upsell writes itself. Speaking of which — let's talk about pricing levers. Because not all clients are created equal.

Kira: Industry is the biggest one.

Santi: Healthcare and finance pay double. Minimum. Not because they have more money — though they do — but because the stakes are higher. A missed lead for a real estate agent is annoying. A missed patient inquiry could be a liability.

Kira: SLA windows change everything too.

Santi: Twenty-four hour response? Standard rate. Four hour response? Twenty percent premium. Two hour response? Fifty percent premium. And here's the thing — most issues don't actually need two hour response. But enterprise clients need to know that if something critical breaks, you're on it.

Kira: It's insurance.

Santi: It's exactly insurance. They're paying for the option, not the execution. Other levers — legacy system integration adds thirty to fifty percent. Multi-language support adds twenty percent. Custom model training adds fifty to a hundred percent.

Kira: White-labeling.

Santi: White-labeling is beautiful because it costs you nothing and clients will pay twenty to thirty percent more for it. They want to present this as their innovation, not yours. Let them.
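
The levers so far can be collected into a small quoting sketch. The percentages are the ones quoted in the conversation; the names are illustrative, and adding the premiums to the base rather than compounding them is an assumption, since the episode does not say how they stack.

```python
# Response window in hours -> premium on the base retainer.
SLA_PREMIUM = {24: 0.00, 4: 0.20, 2: 0.50}

# (low, high) premiums as fractions of the base retainer.
LEVER_PREMIUM = {
    "legacy_integration": (0.30, 0.50),
    "multi_language": (0.20, 0.20),
    "custom_model_training": (0.50, 1.00),
    "white_label": (0.20, 0.30),
}


def quote_range(base, sla_hours, levers=()):
    """Return the (low, high) monthly quote for a base retainer."""
    sla = SLA_PREMIUM[sla_hours]
    low = sum(LEVER_PREMIUM[name][0] for name in levers)
    high = sum(LEVER_PREMIUM[name][1] for name in levers)
    # Assumption: premiums add to the base rather than compounding.
    return round(base * (1 + sla + low), 2), round(base * (1 + sla + high), 2)


print(quote_range(1200, sla_hours=4, levers=["white_label"]))
```

For a $1,200 Growth-tier base with a four-hour SLA and white-labeling, this lands at $1,680 to $1,800 a month.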

Kira: But here's what you're not considering — some of these levers actually make your life easier. On-premise deployment sounds complex, but once it's running, you're not paying for compute anymore.

Santi: And you can charge for every update.

Kira: Every. Single. Update.

Santi: Let's get specific about the calculator. Because you can't just guess at these numbers.

Kira: The Google Sheet we're sharing has five inputs. Target gross margin — usually sixty to seventy percent. Model costs per thousand tokens — this changes monthly, so keep it updated. Human QA hours per client per month. SLA multiplier — how much premium for faster response. And industry multiplier — healthcare is 2x, finance is 2.2x, general business is 1x.

Santi: The calculator then spits out three things. Your break-even monthly price per client. Your target monthly price to hit margin goals. And your maximum viable clients given your current operational capacity.
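
For listeners who want to see the shape of that calculation before opening the sheet, here is a hedged Python sketch. The actual formulas in the Google Sheet are not given in the episode, so everything below is an assumed reconstruction: the monthly token volume, the $40 internal hourly cost, the 100 available hours, and all names are illustrative.

```python
import math
from dataclasses import dataclass


@dataclass
class Inputs:
    target_margin: float        # e.g. 0.70
    cost_per_1k_tokens: float   # dollars per thousand tokens
    qa_hours: float             # human hours per client per month
    sla_multiplier: float       # e.g. 1.0, 1.2, 1.5
    industry_multiplier: float  # e.g. 1.0 general, 2.0 healthcare, 2.2 finance
    # Assumed extras, not among the five inputs named in the episode.
    tokens_per_month: float = 10_000_000
    hourly_cost: float = 40.0
    capacity_hours: float = 100.0


def price(inp: Inputs) -> dict:
    """Break-even price, target price, and client ceiling for one operator."""
    compute = inp.tokens_per_month / 1000 * inp.cost_per_1k_tokens
    break_even = compute + inp.qa_hours * inp.hourly_cost
    target = break_even / (1 - inp.target_margin)
    target *= inp.sla_multiplier * inp.industry_multiplier
    max_clients = math.floor(inp.capacity_hours / inp.qa_hours)
    return {
        "break_even": round(break_even, 2),
        "target_price": round(target, 2),
        "max_clients": max_clients,
    }


result = price(Inputs(0.70, 0.005, 10, 1.0, 1.0))
print(result)
```

With 10M tokens at $0.005 per thousand, ten hours at $40, and a 70% margin, this gives a $450 break-even, a $1,500 target price, and a ceiling of ten clients in 100 available hours.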

Kira: That last one surprises people.

Santi: Because at fifteen hundred per client with seventy percent margins, you can handle maybe eight to ten clients solo before operations overwhelm you. But most people don't realize they're approaching that cliff until they hit it.

Kira: And then they're turning down three thousand dollar contracts because they literally can't deliver.

Santi: Which brings us back to the beginning. Pricing isn't just about what the market will bear. It's about what you can actually deliver while living the life you want. The Lisbon Test isn't just for your tech stack — it's for your entire business model.

Kira: Can you run this from anywhere?

Santi: Can you run this async?

Kira: Can you run this without burning out?

Santi: Because if you're charging one ninety-nine a month and need fifty-plus clients to hit ten thousand MRR, you're not building a location-independent business. You're building a prison with a view.

Kira: A prison with a view.

Santi: Look — the market data is clear. Digital Agency Network shows AI automation builds going for twenty-five hundred to fifteen thousand. Retainers from five hundred to five thousand monthly. These aren't made-up numbers. These are what agencies are actually charging in 2026.

Kira: And more importantly, what clients are actually paying.

Santi: MyFlowMind — twenty-four ninety-seven monthly for their Scale tier. LowCode Agency — five hundred to three thousand for ongoing automation support. These are public prices. On their websites. Right now.

Kira: So the question isn't whether the market will bear these prices.

Santi: The question is whether you believe your work is worth it.

Kira: And if you don't, that's a different problem entirely.

Kira: So here's what we covered. You're not selling automation — you're selling outcomes. The setup plus retainer model works because it matches how clients think about infrastructure, not subscriptions. And the difference between one ninety-nine and fifteen hundred monthly isn't the tech — it's the packaging, the guarantees, and the operational excellence.

Santi: The real unlock is understanding that token costs are now so low they're almost irrelevant. What you're charging for is reliability, compliance, and the ability to run this from anywhere without it breaking.

Kira: Which brings us to your homework.

Santi: Download the pricing calculator from the show notes. Plug in your actual numbers — not what you hope they are, what they actually are. Token costs, QA hours, incident response time. See what your real margins look like.

Kira: Then take one client — just one — and repackage their current service using the three-tier structure we outlined. Don't change the tech. Don't add features. Just reframe what you're already delivering as a system, not a tool.

Santi: And here's the thing — if you can't confidently explain why your service is worth fifteen hundred a month instead of one ninety-nine, you have a positioning problem, not a pricing problem.

Kira: The templates are all in the show notes. The calculator, the three-tier packaging framework, the email sequence for presenting the new structure to existing clients.

Santi: Plus that Lisbon Test checklist. Because if your business can't survive sketchy wifi and async operations—

Kira: You're not actually location-independent.

Santi: You're just remote.

Kira: Next week we're diving into something completely different — how to build and sell AI-powered digital products while visa-hopping. Not services, not consulting — actual products you can sell while you sleep.

Santi: Including the one product category that's printing money right now but nobody's talking about because it sounds too boring to be true.

Kira: Compliance templates.

Santi: Compliance templates with AI-powered customization. Selling for two to five thousand per package. But that's next week.

Kira: I'm Kira.

Santi: I'm Santi.

Kira: Stop underpricing your work.

Santi: Build something that passes the Lisbon Test.

AI pricing, digital nomad business, AI automation agency, value-based pricing, location-independent business, AI consultant rates, retainer pricing, SaaS pricing, business systems, nomad revenue