
AI Workflows in Production Without Burning Tokens

This article discusses how to bring AI (LLM) into production while controlling token costs to ensure a positive cost-benefit ratio. Using an expense approval case study, it demonstrates how combining AI with deterministic rules can optimize workflows, drastically reducing token consumption while maintaining flexibility and consistency.

Source: Hacker News AI | Author: jusonchan81


Bringing AI Workflows into Production without Burning Tokens

Bringing AI (or the abilities of an LLM) into production is a core goal for most engineering teams today. In this article we look at how to do that while keeping token costs under control, so that the cost vs benefit equation lands on the benefit side and adds value to the business.

Authored by

Gulam Mohiuddeen

Software Engineer

10 min read

June 22, 2026

Let’s make it Agentic!

The push in the market is to use agentic flows. A flow is agentic when you let a model decide how to process a request and expect its ability to parse and understand context to produce the best possible outcome for the use case. The idea is that as models mature and become more “intelligent”, the outcomes improve in quality and beat a fixed, human-coded algorithm.

With this in mind, you often see a use case pushed into production that relies entirely on model calls.

For example, an agent may execute a use case by parsing input, validating data, classifying the request, checking policy, routing it to the right person, and drafting a response, all by calling a model.

This is often quite fast to build with the many agentic frameworks available today, and the demo is usually impressive to management.

Launching it in production for a high-volume use case brings a shock, though, when the bill arrives from the model providers. Token costs are increasing rather than decreasing as models evolve. So is the use case adding enough value to justify the cost of running it?

What about questions such as consistency, latency, security and governance? And an even bigger question, do you really know why a decision was made?

I think a shift is happening in the market now: teams are checking token spend against value creation. The shift is mostly among those who have already shipped a reasonable number of use cases leveraging AI. The camp that is still working to deploy its first use cases is not bothered by the spend, as it hasn't really hit them yet. But it's almost a certainty that once your budget starts to get eaten up, the question will come.

Going to Production with AI

The instinct to use AI for everything was not wrong. But does it always make sense? Teams are starting to ask which steps actually need “intelligence” and which ones just need some rules or logic. Answering that addresses not just token spend but latency and consistency as well. Consistency here means you know why the system is doing something.

But if we are not using AI, are we not losing out? Isn't that what everyone says now? Get on board or be left behind and eaten by more modern competitors.

The solution is to use AI where it yields maximum value, not to apply it blindly to everything. An example will make this concrete.

Expense approvals - This is quite common; every company needs it, and it is typically done in one of two ways:

A human manually reviews and approves each expense based on some published policy

Some rules in an HR system automatically approve some expenses and route others to approvers

Let's say the finance team of a company wants to be more dynamic and respond rapidly to changing trends, with a system that maximizes employee productivity by handling expenses without rules set in stone.

An engineering team asked to build this could do the following: ask the finance team to write the policy in a Google Doc or similar, publish it to the internal portal as the official policy, and then build an AI agent that reads this policy and decides every request against it. Now the finance team can update the policy every now and then and, without any developer in the loop, the update is reflected in each expense approval request. Et voilà! Cool, right?

Steps:

User initiates a chat with an expense agent

User and agent exchange greetings (of course we humans always do that, agent or not!)

User uploads a receipt and explains the expense

Agent parses the receipt and validates the amounts and dates

Agent evaluates the entire expense policy against the request

Agent decides on the request and informs the user

If approved, agent makes a request to the HR system to record the required expense reimbursement
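To see why this flow gets expensive, here is a minimal sketch of it in which every step is a model call. `StubModel`, `POLICY_TEXT`, and the prompts are all hypothetical stand-ins, not a real provider client, and the token count is a crude one-token-per-word assumption. The point it illustrates is that the full policy text is resent on every request.

```python
from dataclasses import dataclass


@dataclass
class StubModel:
    """Stand-in for a real LLM client; it only counts calls and rough tokens."""
    calls: int = 0
    tokens: int = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        # Crude assumption: one token per whitespace-separated word.
        self.tokens += len(prompt.split())
        return "approved"


# Hypothetical policy document, padded to mimic a long policy.
POLICY_TEXT = "Meals up to 50 USD per day are reimbursable. " * 40


def agentic_expense_flow(model: StubModel, user_message: str, receipt_text: str) -> str:
    """Run every step of the flow as a model call, resending the policy each request."""
    model.complete(f"Reply to greeting: {user_message}")
    parsed = model.complete(f"Extract amount and date from: {receipt_text}")
    model.complete(f"Validate these fields: {parsed}")
    decision = model.complete(
        f"Policy: {POLICY_TEXT}\nRequest: {receipt_text}\nDecide."
    )
    model.complete(f"Explain this decision to the user: {decision}")
    return decision


model = StubModel()
agentic_expense_flow(model, "Hi there", "Lunch 32 USD 2026-06-01")
print(model.calls, model.tokens)
```

In this sketch a single request costs five model calls, and the policy prompt dominates the token count. Multiply that by the daily request volume and the bill scales linearly with usage.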

This is a pretty cool agentic flow if you were to build it, and I think the finance team and the entire company are probably going to be thrilled to use it. First, policies will actually be applied in practice (assuming the AI is intelligent), and the finance team has the flexibility to change them every Monday if they want to. Win win. And the CTO can present to the board on how they leveraged the intelligence of the models available today to add value to the business.

The big saving here is the manual approval time the finance team would otherwise have to spend. An even bigger one is the developer time required to keep changing the rules as policies change, plus the lead time for doing so, during which expenses may not be processed as per the latest update.

So is this value worth the new token bills that may now start coming from the model providers?

Optimizing the AI

What could we do differently here? The models are worthy of use for sure. It's proven beyond doubt that they can be very effective in a lot of scenarios. A change to the example above could be this:

On every policy update, steps:

Read the latest policy

Create a set of rules for basic cases extracted from the policy

Create test scenarios for a human to verify

Send the test scenarios and ruleset to the finance team for approval

Deploy rules into production
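The policy-update steps above can be sketched as a small gate: rules only reach production when every test scenario passes and finance has signed off. This is an illustrative sketch, not a specific product's API. `Rule`, `Scenario`, and `deploy` are hypothetical names, and the rules shown are hard-coded stand-ins for what a model might extract from the policy.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    category: str
    max_amount: float
    decision: str  # "approve" or "deny"


@dataclass(frozen=True)
class Scenario:
    category: str
    amount: float
    expected: Optional[str]  # None means "no rule should match"


def evaluate(rules: List[Rule], category: str, amount: float) -> Optional[str]:
    """Return the decision of the first matching rule, or None if nothing matches."""
    for rule in rules:
        if rule.category == category and amount <= rule.max_amount:
            return rule.decision
    return None


def failing_scenarios(rules: List[Rule], scenarios: List[Scenario]) -> List[Scenario]:
    """List the scenarios whose outcome differs from what the reviewer expects."""
    return [s for s in scenarios if evaluate(rules, s.category, s.amount) != s.expected]


def deploy(rules: List[Rule], scenarios: List[Scenario], approved_by_finance: bool) -> Tuple[Rule, ...]:
    """Freeze the ruleset for production only if tests pass and finance approved it."""
    failures = failing_scenarios(rules, scenarios)
    if failures:
        raise ValueError(f"{len(failures)} scenario(s) failed; ruleset not deployed")
    if not approved_by_finance:
        raise ValueError("finance approval missing; ruleset not deployed")
    return tuple(rules)


# Hypothetical output of the model's rule-extraction step.
draft_rules = [Rule("meals", 50.0, "approve"), Rule("travel", 500.0, "approve")]
draft_scenarios = [
    Scenario("meals", 30.0, "approve"),
    Scenario("meals", 80.0, None),   # over the limit, falls through to model or human
    Scenario("gifts", 20.0, None),   # category not covered by any rule
]
production_rules = deploy(draft_rules, draft_scenarios, approved_by_finance=True)
```

The model is called once per policy update here, not once per expense, and the scenarios give the finance team something concrete to review.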

On every expense request, steps:

Present a data entry form to the user

If the user chooses to enter the request in an unstructured form, run the model to convert it into structured data

Run the deployed rules

If a rule matches, approve or deny as per the rules

If no rule matches, either run the model to decide approval or route the request to a human, since this volume is much lower

Inform the user and invoke the HR system for reimbursement
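The per-request steps above can be sketched as a rules-first router with a model or human fallback. All names here (`Expense`, `AUTO_APPROVE_LIMITS`, `ALWAYS_DENIED`, `handle_request`) are hypothetical, and the limits are made-up examples of a deployed ruleset.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple, Union


@dataclass(frozen=True)
class Expense:
    category: str
    amount: float


# Hypothetical deployed ruleset, produced at the last policy update.
AUTO_APPROVE_LIMITS = {"meals": 50.0, "travel": 500.0}
ALWAYS_DENIED = {"alcohol"}


def to_expense(raw: Union[Expense, str], parse_with_model: Callable[[str], Expense]) -> Expense:
    """Structured form input passes through; only free text costs a model call."""
    if isinstance(raw, Expense):
        return raw
    return parse_with_model(raw)


def apply_rules(expense: Expense) -> Optional[str]:
    """Return 'approve' or 'deny' when a rule matches, otherwise None."""
    if expense.category in ALWAYS_DENIED:
        return "deny"
    limit = AUTO_APPROVE_LIMITS.get(expense.category)
    if limit is not None and expense.amount <= limit:
        return "approve"
    return None


def handle_request(
    expense: Expense,
    model_decide: Optional[Callable[[Expense], str]] = None,
) -> Tuple[str, str]:
    """Return (decision, decided_by), trying deterministic rules before anything else."""
    decision = apply_rules(expense)
    if decision is not None:
        return decision, "rules"
    if model_decide is not None:
        return model_decide(expense), "model"
    return "pending", "human"
```

Returning who made the decision alongside the decision itself is a deliberate choice in this sketch: it keeps every outcome traceable to either a named rule path, a model call, or a human reviewer.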

We are still using the model, but only where it matters, and you can cut token costs substantially, by an estimated 80-90%. The more people use structured input, the less the model is needed even to parse inputs. We still end up using the model's intelligence, but in a more consistent way, since it created static, deterministic rules. The large volume of requests is now decided automatically by the rulesets regenerated on each policy update by the finance team. It is the same flexibility as the earlier flow, with a much lower token spend. And we get to use AI for one thing it does very well: writing code. A developer in the loop can also review the generated rules and ensure they are consistent with engineering standards, just like any other code review. Win win win across the board.
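A rough way to sanity-check a savings estimate like this is to compute the expected tokens per request for the hybrid flow. The sketch below is back-of-the-envelope arithmetic only; every number in it (token counts, rule hit rate, share of unstructured input) is a hypothetical assumption to replace with your own measurements, and it ignores the one-off cost of regenerating rules on each policy update.

```python
def hybrid_tokens_per_request(
    rule_hit_rate: float,
    unstructured_rate: float,
    parse_tokens: float,
    decide_tokens: float,
) -> float:
    """Expected model tokens per request when rules run first.

    Parsing is paid only for unstructured input; the decision call is paid
    only when no deployed rule matches.
    """
    for rate in (rule_hit_rate, unstructured_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must be between 0 and 1")
    return unstructured_rate * parse_tokens + (1.0 - rule_hit_rate) * decide_tokens


def savings_fraction(baseline_tokens: float, hybrid_tokens: float) -> float:
    """Fraction of baseline token spend avoided by the hybrid flow."""
    return 1.0 - hybrid_tokens / baseline_tokens


# Hypothetical inputs: 6000 tokens per fully agentic request, 500 to parse
# free text, 4000 for a model decision, 85% rule hit rate, 30% free-text input.
hybrid = hybrid_tokens_per_request(0.85, 0.30, parse_tokens=500, decide_tokens=4000)
print(round(hybrid), round(savings_fraction(6000, hybrid), 3))
```

Under these assumed inputs the hybrid flow spends about 750 tokens per request against 6000, a saving of roughly 87%. The result is most sensitive to the rule hit rate, which is why the quality of the extracted ruleset matters so much.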

Judgment vs. logic

It’s quite easy to decide when you need a model and when you need a simple logical interpretation. In every workflow step decide on this:

Does this step require understanding context, generating language, or making a nuanced decision, or does it just need to follow a rule?

It is a useful question because it removes some of the glamour from the architecture. Most steps, when you ask it honestly, are less mysterious than they first looked.

Steps that need judgment | Steps that need logic
Classifying ambiguous or unstructured input | Routing based on a known field value
Summarizing a document or conversation | Validating a number against a threshold
Generating a policy, workflow, or config | Running a decision table at scale
Explaining a rejection in plain English | Parsing a structured form submission
Helping a non-technical user set up a rule | Executing that rule 50,000 times a day
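The judgment-versus-logic question can also be written down as a checklist per workflow step, which makes the audit of an existing workflow mechanical. This is only a sketch: the `Step` class and its three flags are hypothetical names that mirror the wording of the question above, and the example steps are taken from the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    needs_context: bool = False       # must understand ambiguous or unstructured input
    generates_language: bool = False  # must write text for a human
    nuanced_decision: bool = False    # no clear rule covers the case

    @property
    def needs_model(self) -> bool:
        """A step earns a model call only if at least one flag is set."""
        return self.needs_context or self.generates_language or self.nuanced_decision


def model_steps(steps: List[Step]) -> List[str]:
    """Names of the steps that justify a model call; everything else is plain logic."""
    return [s.name for s in steps if s.needs_model]


workflow = [
    Step("classify unstructured input", needs_context=True),
    Step("validate amount against threshold"),
    Step("route on known field value"),
    Step("explain rejection in plain English", generates_language=True),
]
print(model_steps(workflow))
```

Here two of the four steps end up as model calls; the other two are candidates for a rule, a threshold check, or a decision table.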

About our platform - Unmeshed

Since you are reading this, I'd love to share a bit about our platform, Unmeshed. Unmeshed helps teams build AI-powered workflows that combine model calls, deterministic rules, API integrations, human approvals, and observability in one place.

In Unmeshed, engineering teams can:

Model API calls, rules, decision tables, human approvals, and AI steps in one workflow

See which steps call models and why

Attribute cost to specific workflows and outcomes

Put budgets, scopes, and tool allow-lists around agentic execution

Keep humans in the loop for high-risk or ambiguous decisions

Move repeatable decisions from model calls into deterministic logic over time
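To make ideas like budgets, tool allow-lists, and cost attribution concrete, here is a generic sketch of a guard around agentic execution. It is not Unmeshed's API and makes no claim about how the platform implements these features; `AgentGuard`, `BudgetExceeded`, and `ToolNotAllowed` are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push total spend past the token budget."""


class ToolNotAllowed(RuntimeError):
    """Raised when an agent asks for a tool outside its allow-list."""


class AgentGuard:
    def __init__(self, token_budget: int, allowed_tools: Iterable[str]) -> None:
        self.token_budget = token_budget
        self.allowed_tools = frozenset(allowed_tools)
        # Spend is tracked per workflow so cost can be attributed later.
        self.spent_by_workflow: Dict[str, int] = defaultdict(int)

    @property
    def spent(self) -> int:
        return sum(self.spent_by_workflow.values())

    def charge(self, workflow: str, tokens: int) -> None:
        """Record token spend, refusing any charge that exceeds the budget."""
        if self.spent + tokens > self.token_budget:
            raise BudgetExceeded(f"{workflow}: {tokens} tokens would exceed the budget")
        self.spent_by_workflow[workflow] += tokens

    def check_tool(self, tool: str) -> None:
        """Reject any tool call that is not explicitly allow-listed."""
        if tool not in self.allowed_tools:
            raise ToolNotAllowed(tool)


guard = AgentGuard(token_budget=1000, allowed_tools={"hr_api"})
guard.charge("expense-approval", 600)
guard.charge("onboarding", 300)
guard.check_tool("hr_api")
print(dict(guard.spent_by_workflow))
```

Checking the budget before recording the charge means a refused call leaves the ledger unchanged, so the per-workflow totals stay accurate for attribution.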

The decision table is a rules engine, and you can create complex business rules with AI or manually. This is where you can use models to create your rules, have humans validate them, and then run them for next to nothing compared to making individual model calls on every execution.

If your team is adding AI into business processes and starting to ask where the cost, latency, and outcomes are coming from, Unmeshed gives you one place to design, run, observe, and optimize the workflow.

