Per-token pricing rewards efficient prompting and short outputs, which matches experienced teams but punishes newcomers whose verbose system prompts silently inflate every bill. Per-request pricing is predictable — a CFO can forecast from call volume alone — but over-charges for micro-queries and under-charges for long generations, so the provider absorbs variance.
In emerging markets the predictability wins more than it loses: teams paying in F CFA with no credit card typically prefer a flat bill they can reconcile against Wave statements over a cheaper-on-paper token meter they can't audit. A useful middle path is tiered flat pricing with token-metered overages, disclosed upfront — it keeps the predictability that wins budgets while preserving the efficiency incentive for teams who grow into it.