Fermata for AI Billing
When you started working on your AI startup, you probably didn’t sweat the billing components. You’ll just use Stripe, right? However, an AI startup has one thing that most other SaaS companies don’t have to worry about: COGS. (That’s Cost of Goods Sold.)
It’s a fancy business term that means there is a cost associated with the service you provide your users. For you, that cost is usage on expensive cloud GPUs, or other AI APIs that charge you for every request. That means you’re not really billing your customers on a per-seat basis, right? You need to bill them based on how much they use. That sounds simple enough. Just tally up how many API requests they consumed over the course of the month and then have Stripe run their credit card.
But what happens when the transaction gets declined?
AI startups need prepaid billing
Since you have a significant infrastructure cost, you can’t just assume good faith for all of your customers. You need to accept payment upfront. Just like a prepaid cellphone.
Fermata makes this process radically simple, with two ways to set up prepaid services:
- Set up prepaid spend accounts that auto-reload
- Set up prepaid token plans
We call both of these Treasury Accounts. We keep track of their ledgers, subtracting usage against their balance.
For prepaid spend accounts, users deposit money into their account, and as they use your platform you bill against that balance. This means that you can bill for small, atomic purchases that normally would be too expensive to run a credit card for. (Think fractions of pennies.) As they spend from their account, you can set up automatic reloading options. For instance, whenever their balance drops below $10.00 we automatically add $50.00 to their account. This prevents users from getting their workflows interrupted by a billing issue.
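To make the auto-reload idea concrete, here is a minimal sketch of the ledger logic. It is not Fermata’s API: the `SpendAccount` class, its field names, and the in-memory ledger are all assumptions made for illustration, and a real system would only credit the account after the card payment succeeds.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class SpendAccount:
    """Hypothetical prepaid spend account; not Fermata's actual data model."""
    balance: Decimal
    reload_threshold: Decimal = Decimal("10.00")  # reload when balance drops below this
    reload_amount: Decimal = Decimal("50.00")     # amount added on each reload
    ledger: list = field(default_factory=list)

    def charge(self, amount: Decimal, memo: str) -> None:
        """Subtract usage from the balance, then top up if it fell below the threshold."""
        self.balance -= amount
        self.ledger.append((memo, -amount))
        if self.balance < self.reload_threshold:
            # Assumption: the card on file is charged successfully here.
            self.balance += self.reload_amount
            self.ledger.append(("auto-reload", self.reload_amount))


account = SpendAccount(balance=Decimal("10.002"))
account.charge(Decimal("0.004"), "1 API request")  # a fraction of a penny
print(account.balance)  # 59.998
```

Using `Decimal` rather than floats keeps sub-cent charges exact, which matters when thousands of tiny deductions add up.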
With prepaid token plans, you can issue a set number of tokens each month. Let’s say you are an AI image generator for interior designers. You can set up three different plans: 1,000 images, 3,000 images, and 10,000 images a month. As your users generate images, they spend from this token bank. At the end of the month, your customers are billed and their token balances are refreshed.
Fermata denotes a pause.
Fermata handles each usage tracking event in real time. Each time your users hit your API, they are billed for their usage in that exact moment. There isn’t a batching process or delay. The instant they hit zero tokens (or zero dollars), we’ll tell you (via webhook or push API) and you can prevent them from submitting new requests.
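As a rough illustration of that flow, the sketch below deducts tokens per request and fires a callback the moment the balance reaches zero. The `TokenAccount` class and the `on_depleted` callback are hypothetical stand-ins for the webhook, not Fermata’s actual interface.

```python
from typing import Callable


class TokenAccount:
    """Hypothetical prepaid token balance; names are illustrative only."""

    def __init__(self, customer_id: str, tokens: int,
                 on_depleted: Callable[[str], None]):
        self.customer_id = customer_id
        self.tokens = tokens
        self.on_depleted = on_depleted  # stands in for a webhook notification

    def record_usage(self, cost: int = 1) -> bool:
        """Deduct tokens immediately; return False if the request should be refused."""
        if self.tokens < cost:
            return False  # paused: not enough balance for this request
        self.tokens -= cost
        if self.tokens == 0:
            self.on_depleted(self.customer_id)  # notify at the exact moment of zero
        return True


notifications = []
account = TokenAccount("cust_123", tokens=2, on_depleted=notifications.append)
results = [account.record_usage() for _ in range(3)]
print(results)        # [True, True, False]
print(notifications)  # ['cust_123']
```

The third request is refused, and your app has already been told why, so it can show an upgrade prompt instead of a generic error.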
Now, pausing users at zero balance sounds like a bad customer experience, but in fact it’s a perfect moment to offer an upsell. You can enable users to purchase additional credits at a discount (outside of their monthly plan). Or you can offer the ability to upgrade to a bigger plan.
Fermata is flexible, allowing you to set up multiple event trackers, accounts for both tokens and dollars, and renewing plans that fit your business perfectly.
Plan changes are complex
Speaking of plan setup: when users want to upgrade, downgrade, or change their monthly plan, this can put you in a tough position. How do you handle refunds or credits? What if they want to change their plan mid-month?
This is a lot of math and billing logic to incorporate into your app. You should focus on building the best AI products, not working through weird billing edge cases.
Fermata gives you the tools you need to help make plan changes possible, fair, and easy. We’ll handle the math and billing.
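To give a sense of the math involved, here is one common approach: time-based proration, where the customer pays (or is credited) the price difference for the unused part of the billing period. This is an assumption for illustration only; it is not necessarily how Fermata calculates plan changes, and the function name and parameters are made up.

```python
from decimal import Decimal, ROUND_HALF_UP


def prorated_plan_change(old_price: Decimal, new_price: Decimal,
                         days_used: int, days_in_period: int) -> Decimal:
    """Amount owed for switching plans mid-period (negative means a credit).

    Assumes simple day-based proration of the price difference.
    """
    if not 0 <= days_used <= days_in_period:
        raise ValueError("days_used must be between 0 and days_in_period")
    days_remaining = days_in_period - days_used
    amount = (new_price - old_price) * days_remaining / days_in_period
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Upgrade from a $30 plan to a $90 plan, 10 days into a 30-day month.
upgrade = prorated_plan_change(Decimal("30"), Decimal("90"), 10, 30)
# The same switch in reverse is a downgrade, so the result is a credit.
downgrade = prorated_plan_change(Decimal("90"), Decimal("30"), 10, 30)
print(upgrade, downgrade)  # 40.00 -40.00
```

Even this simple version raises questions (refund or credit? what about tokens already spent?), which is exactly the kind of edge case you don’t want living in your app.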
You can’t go into postpaid blind either
For your top customers, you might need to offer Net-30 invoicing. And while that might be a more straightforward billing process, it’s still rather risky and leaves you exposed to overuse and nonpayment.
For instance, let’s say you have a major client that usually spends about $20,000 a month on your API. They have been invoiced for their spend in January and February, but they haven’t paid you yet. Now it’s March, and this customer starts ramping up their usage to 3x their normal rate. On most teams, there is a disconnect between the finance and engineering teams. Engineers might know that this customer is ramping up usage, but the finance team is the one that knows they haven’t paid in months.
Fermata makes it easy for your teams to collaborate and solve these postpaid risk problems. For each postpaid client, your finance team can set a credit limit that automatically alerts (and even pauses) usage if it’s exceeded.
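A credit limit check like this can be sketched in a few lines. Everything here is hypothetical: the `check_credit` function, the `Action` values, the $50,000 limit, and the March usage figure are assumptions for illustration, not Fermata’s API or defaults.

```python
from decimal import Decimal
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    ALERT = "alert"
    PAUSE = "pause"


def check_credit(unpaid_invoices: list, unbilled_usage: Decimal,
                 credit_limit: Decimal, pause_on_exceed: bool = False) -> Action:
    """Compare total exposure (unpaid invoices plus unbilled usage) to the limit."""
    exposure = sum(unpaid_invoices, Decimal("0")) + unbilled_usage
    if exposure <= credit_limit:
        return Action.ALLOW
    return Action.PAUSE if pause_on_exceed else Action.ALERT


# January and February invoices are unpaid; March usage is already at $30,000.
unpaid = [Decimal("20000"), Decimal("20000")]
action = check_credit(unpaid, Decimal("30000"), credit_limit=Decimal("50000"))
print(action)  # Action.ALERT
```

The key point is that the check combines what finance knows (unpaid invoices) with what engineering knows (current usage) in one number.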
With live visibility connecting outstanding bills with the customer’s total usage, you can catch the risk early and intervene with your customer success teams.
Fermata solves your core AI billing problems
With prepaid balances you can reduce the risk of nonpayment from your long tail of self-service customers. You can offload the complex plan logic to Fermata and focus on building a better product. And you can collaborate across finance and product engineering teams in a single billing platform.

