SolutionWright Universal

June 30, 2026

Cost Per Task, Not Per Token: How To Price An AI Workflow Honestly

Why per-token pricing hides the real unit economics of an AI engagement, and the cost-per-task framing SolutionWright puts on every quote.

Ask any vendor what their AI workflow costs and most will answer in tokens, seats, or "estimated monthly usage." None of those numbers tell you what you actually want to know: what does it cost to get one real task done, end to end, with the receipts attached?

We price differently. Every SolutionWright quote names the task, names the unit cost, and names what the unit cost includes. If we can't say that, we don't quote yet.

Why per-token pricing hides the bill

Per-token pricing is the natural unit for a model provider — they sell inference. It is the wrong unit for a buyer of an outcome. A token bill rolls up four very different costs into one undifferentiated meter:

  1. The model call itself.
  2. The retries, re-prompts, and tool-loop overhead.
  3. The orchestration, retrieval, and post-processing around the model.
  4. The human review when the model is wrong.

When a vendor quotes "about $0.40 per request" they have almost always priced (1) and silently absorbed (2), (3), and (4) into a retainer, a seat license, or your own staff time (Class C — observed across procurement reviews we have run with clients). The token number looks cheap because three of the four real costs have been moved off the line item.

You see the consequence at month three: invoices are "within forecast" but nothing has shipped, because the cost of getting a task to done was never on the page.

What "one task" actually has to include

A task, for pricing purposes, is the smallest unit of work the buyer would pay for if it were done by a person. Not "a model call." Not "a draft." A finished, accepted output.

To quote that honestly we have to put numbers on:

  • the median and the p90 of how many model calls it takes (because the long tail is the cost),
  • the orchestration and retrieval cost per attempt,
  • the expected human-review minutes per task,
  • the rework rate, and
  • the failure mode — what happens to cost when the workflow can't complete and a human has to take over.

A single number — dollars per accepted task — falls out of that. It is usually larger than the token line and smaller than the retainer. More importantly, it is the number you can actually compare against the value of the task.

Small kits can beat big bills

The reason this matters now, and not five years ago, is that the cost of "getting one task done" has decoupled from the cost of "running a giant model." Themesis reports a striking instance of that decoupling: AIX's SeedIQ system solving ARC-AGI 3 problems on an M1 MacBook Pro in roughly nine seconds at around twenty watts, compared against foundation-model attempts that consumed thousands of GPU-hours (Themesis, 2026) (Class E — external report, not our measurement). The lesson for buyers is not "use SeedIQ"; the lesson is that "compute spent" and "task completed" are different axes and a vendor who quotes you on the first is hiding the second.

If a small, well-shaped pipeline can finish a task for cents, the right question to ask any vendor is not "what is your token price" but "what is your cost per accepted task on a workload that looks like mine, and can I see the trace?"

What we put on a SolutionWright quote

Every engagement we propose names, on one page:

  • The task. Stated as a finished output with an acceptance test.
  • The cost per accepted task. Dollars, not tokens.
  • What's inside that number. Model, orchestration, retrieval, review, rework.
  • The failure cost. What it costs when the task can't be completed automatically.
  • The first instrumented batch. A small run where the above numbers get replaced by measured numbers, on your data, before you commit to a larger spend (Class C — this is how we structure week-one work on every contract).

If a number on that page turns out to be wrong, the ledger shows it, and the contract treats the corrected number as the price going forward. No retroactive markups, no hidden meter.

What to ask your current vendor this week

You don't have to switch anyone to put this discipline on. Three questions are usually enough:

  1. What is the smallest unit of work I am paying you to deliver, and what is the cost of one of them?
  2. Does that number include retries, orchestration, retrieval, and human review?
  3. Show me a trace of one completed unit and the line items that add up to the number.

A vendor who can answer all three in writing is worth keeping. A vendor who can answer two of the three is worth a hard conversation. A vendor who can answer none of them is the reason you are reading this post.


Related:

EvidenceECTagspricingcost-per-taskunit-economicstransparencyai-procurement

Next steps

Bring this into a working session.

The workshop is where these notes turn into receipts on real work. The science page is where the underlying hypothesis is laid out in full, with the falsifier attached.