2026-06-04 21:41 UTC · In-site rewrite · 4 min read · Updated: 2026-06-30 13:03 UTC

Show HN: Intencion – Product analytics that improves your AI agents continuously

Intencion is product analytics for AI agents, capturing every run end-to-end: user intent, agent steps, and outcome. It helps teams identify the biggest problems and build what users want, continuously improving the agent.

Source: Hacker News AI · Author: sakuraiben

Intencion · Product analytics for AI agents

Product analytics for the agents you ship.

Product analytics for AI agents. Every run is captured end to end: what the user wanted, the steps your agent took, and how it ended. Fix the biggest problem, build what users keep asking for, and your agent gets a little better every week.


Built for teams running a customer-facing agent in production, starting with support and operations, where a failed run means a real ticket or lost revenue.

One line to install. About 1 ms to capture a call, flushed in the background, so your model response returns before anything is sent. Zero added latency on the request path. Prefer a walkthrough? Book a demo.

What users want from your agent (last 7 days · 4,210 runs)

Intent                  Runs    Resolved
Refund a charge         1,204   96%
Track my order          880     92%
Change my plan          410     61%
Cancel subscription     260     79%
Book a callback (new)   118     build next

Top failure for "Change my plan": can't verify identity · 43% of failures · path: verify_user → lookup_account → escalate

1 intent to fix · 1 to build next
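The Resolved column can be read as resolved runs divided by total runs for each intent. Here is a minimal sketch of that aggregation, assuming a flat list of run records. `RunRow` and `resolutionByIntent` are hypothetical names for illustration, not part of the Intencion SDK.

```typescript
// Hypothetical sketch: per-intent resolution from a flat list of runs.
// RunRow and resolutionByIntent are illustrative names, not SDK API.
interface RunRow {
  intent: string;
  resolved: boolean;
}

interface IntentSummary {
  intent: string;
  runs: number;
  resolvedPct: number;
}

function resolutionByIntent(runs: RunRow[]): IntentSummary[] {
  const byIntent = new Map<string, { runs: number; resolved: number }>();
  for (const r of runs) {
    const entry = byIntent.get(r.intent) ?? { runs: 0, resolved: 0 };
    entry.runs += 1;
    if (r.resolved) entry.resolved += 1;
    byIntent.set(r.intent, entry);
  }
  // One row per intent, busiest intent first.
  return Array.from(byIntent.entries())
    .map(([intent, e]) => ({
      intent,
      runs: e.runs,
      resolvedPct: Math.round((e.resolved / e.runs) * 100),
    }))
    .sort((a, b) => b.runs - a.runs);
}
```

Sorting by run count keeps the busiest intents at the top, which matches the ordering in the table above.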

Not observability

Not another tracing tool.

Observability tells you the model ran. Intencion tells you whether the user got what they came for, and what to build so more of them do.

The product layer

What users wanted, and whether they got it.

Spans, tokens, and latency tell you what the system did. They are not the question that sets your roadmap. Intencion maps the intent behind every run to the path your agent took and how it ended, then ranks what to fix and what to build next.

Coexists

Works alongside your stack.

It sits next to your tracing (Langfuse, LangSmith, Datadog) and your product analytics (Amplitude, PostHog). Keep your stack and add one line. We are not a span format and do not ask you to re-emit traces. An OpenTelemetry bridge for teams that already emit spans is on the roadmap.

Not this: Per-call metrics. Tokens, latency, and traces, one span at a time.

But this: Per-intent resolution. Every user goal, grouped, with how many runs actually resolved.

And this: What to build next. The asks your agent cannot handle yet, ranked by how often they come up.

The taxonomy compounds. Every run sharpens your intent clusters, so the longer it runs the better it labels your traffic. Your intent map is built from your own runs, which is hard to copy.

Every run makes the next one better.

Add the SDK once. After that you can see what users want, where your agent succeeds, and where it falls short. Each week you fix the biggest problem and ship what people keep asking for, so the agent keeps getting better. That is the loop.

Users say what they want: intent captured
Your agent takes steps: tool calls traced
It works, or it doesn't: outcome detected
You see what to fix and build: ranked by impact
You ship it, the loop repeats: improvements compound

Outcomes

You define success, not a model.

No model guesses whether your agent worked. A run that returns is a success. A run that throws is a failure, and the error message becomes the reason. Need nuance? Set it yourself. It is deterministic and the same on every replay, so there is no classifier to babysit and no accuracy rate to chase.

How outcomes work, in the docs →

run.ts

// default: returns → success, throws → failure
run.ok(); // force success
run.fail("card declined"); // failure + reason
run.abandon(); // user gave up
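The rules above can be sketched as a small wrapper. This is not Intencion's implementation. `withOutcome`, `RunHandle`, `Outcome`, and `RunResult` are hypothetical names, and only `ok`, `fail`, and `abandon` mirror the snippet. The sketch also assumes that an explicit call takes priority over the default.

```typescript
// Hypothetical sketch of the outcome rules described above, not the SDK itself.
type Outcome =
  | { status: "success" }
  | { status: "failure"; reason: string }
  | { status: "abandoned" };

interface RunHandle {
  ok(): void;
  fail(reason: string): void;
  abandon(): void;
}

interface RunResult<T> {
  outcome: Outcome;
  value?: T;
}

async function withOutcome<T>(
  fn: (run: RunHandle) => Promise<T> | T
): Promise<RunResult<T>> {
  // Holds an outcome only if the caller set one explicitly.
  const state: { explicit?: Outcome } = {};
  const run: RunHandle = {
    ok: () => { state.explicit = { status: "success" }; },
    fail: (reason: string) => { state.explicit = { status: "failure", reason }; },
    abandon: () => { state.explicit = { status: "abandoned" }; },
  };
  try {
    const value = await fn(run);
    // Default: a run that returns is a success.
    return { outcome: state.explicit ?? { status: "success" }, value };
  } catch (err) {
    // Default: a run that throws is a failure, and the message is the reason.
    const reason = err instanceof Error ? err.message : String(err);
    return { outcome: state.explicit ?? { status: "failure", reason } };
  }
}
```

No model is involved at any point, so replaying the same run with the same code yields the same outcome.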

It compounds

Small wins, stacked.

Illustrative example. "Change my plan" resolves in 61 percent of runs, and "can't verify identity" accounts for 43 percent of its failures, all on the path verify_user → lookup_account → escalate. Fix that one step and the intent climbs toward the 90s. Do that every week, top problem and top request, and it adds up: high 70s into the 90s in a month.
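The arithmetic behind the example can be checked in a few lines. This is a rough sketch using only the figures quoted above, under the optimistic assumption that every run failing for the fixed cause would resolve afterwards. `resolutionAfterFix` is a hypothetical helper name.

```typescript
// Illustrative arithmetic only, using the example figures quoted above.
// Assumption: every run that failed for the fixed cause resolves after the fix.
function resolutionAfterFix(resolved: number, failureShare: number): number {
  // resolved: current resolution rate, between 0 and 1
  // failureShare: share of failures caused by the mode being fixed, between 0 and 1
  const failed = 1 - resolved;
  return resolved + failed * failureShare;
}

const afterIdentityFix = resolutionAfterFix(0.61, 0.43);
console.log((afterIdentityFix * 100).toFixed(1) + "%"); // 77.8%
```

Under that assumption, the single fix moves the intent from 61 percent to roughly 78 percent, which is the "high 70s" starting point for the later weeks.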

Resolved runs, week by week (example)

Week 1: fixed identity verification
Week 2: fixed a refund edge case
Week 3: shipped order tracking
Week 4: shipped callback booking

One line

Works with your stack, no rewrites.

Patch your model client once and every call is captured: model, tokens, latency, outcome. It patches at the class level, so calls your framework makes internally (LangChain, the OpenAI Agents SDK, LlamaIndex) are captured too. When your agent calls tools, wrap the run and record each step. No decorators sprinkled everywhere. Published for TypeScript and Python.

Read the docs →

agent.ts

import { Intencion } from "@intencion/sdk";
import Anthropic from "@anthropic-ai/sdk";

const ix = new Intencion({ apiKey });
const anthropic = ix.instrumentAnthropic(new Anthropic());

// every call captured: model, tokens, latency, outcome.
// instrumentOpenAI(new OpenAI()) works the same way.
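To see why patching at the class level also covers calls a framework makes internally, here is a minimal sketch of the general technique. It does not show how the Intencion SDK does it. `FakeClient`, `CallRecord`, `captured`, and `patchClass` are hypothetical stand-ins, and no real model client is involved.

```typescript
// Hypothetical sketch of class-level patching. FakeClient stands in for a
// real model client; none of these names come from an actual SDK.
class FakeClient {
  async create(prompt: string): Promise<string> {
    return `echo: ${prompt}`;
  }
}

interface CallRecord {
  method: string;
  latencyMs: number;
  outcome: "success" | "failure";
}

const captured: CallRecord[] = [];

function patchClass(): void {
  const original = FakeClient.prototype.create;
  // Replace the method on the prototype, so every instance is covered,
  // including instances that a framework constructs internally.
  FakeClient.prototype.create = async function (
    this: FakeClient,
    prompt: string
  ): Promise<string> {
    const start = Date.now();
    try {
      const result = await original.call(this, prompt);
      captured.push({ method: "create", latencyMs: Date.now() - start, outcome: "success" });
      return result;
    } catch (err) {
      captured.push({ method: "create", latencyMs: Date.now() - start, outcome: "failure" });
      throw err; // the caller still sees the original error
    }
  };
}

patchClass();
```

Because the wrapper lives on the prototype, application code does not have to pass around a specially wrapped instance for its calls to be recorded.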

tools.ts

await ix.run({ intent: "refund_request", input, user }, async (run) => {
  const order = await lookupOrder(id);
  run.step({ name: "lookup_order", tool: "orders-db", status: "success" });

  const refund = await issueRefund(order);
  run.step({ name: "issue_refund", tool: "payments", status: "success" });

  return refund; // returns → success, throws → failure
});

~1 ms to capture a call
0 ms on the request path
5 s background flush
1,000 bounded queue
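These numbers suggest a familiar pattern: a cheap in-memory enqueue on the request path, with a timer that ships batches in the background. A minimal sketch of that pattern follows. It does not show the SDK's internals, and `BoundedQueue` is a hypothetical name. Dropping the oldest event when the queue is full is an assumption, since the source does not say what happens at the limit.

```typescript
// Hypothetical sketch, not the SDK's internals. Assumption: when the queue is
// full, the oldest event is dropped so memory stays bounded.
class BoundedQueue<T> {
  private items: T[] = [];
  private timer: ReturnType<typeof setInterval> | undefined;
  private readonly send: (batch: T[]) => void;
  private readonly maxSize: number;
  dropped = 0;

  constructor(send: (batch: T[]) => void, maxSize = 1000) {
    this.send = send;
    this.maxSize = maxSize;
  }

  // Request path: a synchronous in-memory push, no network I/O.
  enqueue(item: T): void {
    if (this.items.length >= this.maxSize) {
      this.items.shift();
      this.dropped += 1;
    }
    this.items.push(item);
  }

  // Background path: hand the whole batch to the sender and start fresh.
  flush(): void {
    if (this.items.length === 0) return;
    const batch = this.items;
    this.items = [];
    this.send(batch);
  }

  // Flush on a timer; 5000 ms mirrors the 5 s figure above.
  start(intervalMs = 5000): void {
    this.timer = setInterval(() => this.flush(), intervalMs);
  }

  stop(): void {
    if (this.timer !== undefined) clearInterval(this.timer);
    this.timer = undefined;
    this.flush(); // send whatever is left
  }
}
```

The bound caps memory use if the backend is unreachable, at the cost of losing the oldest events under sustained pressure.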

What you get

Each row is a real part of the product, built to answer one specific question about your agent.

Intents

Every goal your users bring to the agent, grouped and counted, with how many runs resolved. One row per intent, not twenty log lines. You declare the intent, or we infer it: the run is matched against your existing clusters first, then a small model names anything new and reuses your labels. Editing the taxonomy from the dashboard is on the roadmap.
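The "match first, name only what is new" flow can be sketched as follows. The source does not say how matching works, so Jaccard token overlap and the 0.5 threshold are stand-ins chosen for illustration. The `nameNew` callback plays the role of the small model, and `Cluster`, `tokens`, `jaccard`, and `assignIntent` are all hypothetical names.

```typescript
// Hypothetical sketch: the matching method is an assumption. Jaccard token
// overlap and the 0.5 threshold are stand-ins for illustration.
interface Cluster {
  label: string;
  examples: string[];
}

function tokens(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
}

function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((t) => { if (b.has(t)) shared += 1; });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

function assignIntent(
  input: string,
  clusters: Cluster[],
  nameNew: (input: string) => string, // stands in for the small model
  threshold = 0.5
): string {
  const inputTokens = tokens(input);
  let bestLabel: string | undefined;
  let bestScore = 0;
  for (const cluster of clusters) {
    for (const example of cluster.examples) {
      const score = jaccard(inputTokens, tokens(example));
      if (score > bestScore) {
        bestScore = score;
        bestLabel = cluster.label;
      }
    }
  }
  // Step 1: reuse an existing cluster when the match is close enough.
  if (bestLabel !== undefined && bestScore >= threshold) return bestLabel;
  // Step 2: otherwise name the new intent and remember it for later runs.
  const label = nameNew(input);
  clusters.push({ label, examples: [input] });
  return label;
}
```

Once a new intent has been named, later runs with similar wording fall into step 1, so the naming step is not invoked again for them.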

Run trace

Open any run to see the goal, every tool call it made, the path it took, and how it ended, all on one timeline.

Failure modes

Failed and abandoned runs grouped by cause and ranked by how often they happen, so you fix the biggest one first.
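Grouping by cause and ranking by frequency is a simple aggregation. The sketch below assumes each run carries a status and an optional reason string. `RunRecord`, `FailureMode`, and `rankFailureModes` are hypothetical names, and the "unknown" bucket for runs without a reason is an assumption.

```typescript
// Hypothetical sketch: rank failure causes by how often they occur.
interface RunRecord {
  intent: string;
  status: "success" | "failure" | "abandoned";
  reason?: string;
}

interface FailureMode {
  reason: string;
  count: number;
  share: number; // fraction of all failed or abandoned runs
}

function rankFailureModes(runs: RunRecord[]): FailureMode[] {
  const counts = new Map<string, number>();
  let total = 0;
  for (const run of runs) {
    if (run.status === "success") continue;
    // Assumption: runs with no recorded reason are bucketed as "unknown".
    const reason = run.reason ?? "unknown";
    counts.set(reason, (counts.get(reason) ?? 0) + 1);
    total += 1;
  }
  return Array.from(counts.entries())
    .map(([reason, count]) => ({ reason, count, share: count / total }))
    .sort((a, b) => b.count - a.count);
}
```

The first element of the result is the failure mode to fix first, and its `share` corresponds to figures like the 43 percent quoted for "can't verify identity".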

Emerging

The things users keep asking for that your agent can't handle yet. Your roadmap, straight from real usage.

Private by default

Emails, names, and card numbers are stripped before anything is stored, so nothing sensitive lands in our database.
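As a rough illustration of stripping values before storage, here is a regex-based sketch covering emails and card-like digit runs. It is an assumption about the general approach and does not describe Intencion's actual redaction. `redact`, `EMAIL`, and `CARD` are hypothetical names, and names of people are deliberately left out because they generally need more than a regular expression.

```typescript
// Hypothetical sketch: regex redaction of emails and card-like digit runs.
// Personal names are not handled here; they usually need more than a regex.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
// 13 to 19 digits, optionally separated by single spaces or hyphens.
const CARD = /\b\d(?:[ -]?\d){12,18}\b/g;

function redact(text: string): string {
  return text.replace(EMAIL, "[email]").replace(CARD, "[card]");
}

console.log(redact("Contact jane@example.com, card 4242 4242 4242 4242."));
// Contact [email], card [card].
```

Short digit runs such as order numbers pass through unchanged, so the text remains useful for grouping intents.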

Make your agent better, week after week.

Start free in a minute, no call required. Or click through the live demo first.


See it on your own agent.

Grab 20 minutes. We'll walk through Intencion on a real agent's runs and answer anything. Pick a time below.