Bill your AI customers per token.

Multi-provider LLM gateway with built-in usage metering. Track costs per customer, enforce quotas, and route requests, all from one SDK.


TypeScript

import { ModelRelay, parseSecretKey } from "@modelrelay/sdk";

const mr = new ModelRelay({ key: parseSecretKey("mr_sk_...") });

const text = await mr.responses.textForCustomer(
  "cust_abc123",
  "You are a helpful assistant.",
  "Summarize our Q4 results",
);
// Usage billed to cust_abc123

Rust

use modelrelay::{ApiKey, Client};

let client = Client::with_key(ApiKey::parse("mr_sk_...")?)
    .build()?;

let text = client.responses()
    .text_for_customer(
        "cust_abc123",
        "You are a helpful assistant.",
        "Summarize our Q4 results",
    )
    .await?;
// Usage billed to cust_abc123

Go

key, _ := sdk.ParseAPIKeyAuth("mr_sk_...")
client, _ := sdk.NewClientWithKey(key)

text, _ := client.Responses.TextForCustomer(ctx,
  "cust_abc123",
  "You are a helpful assistant.",
  "Summarize our Q4 results",
)
// Usage billed to cust_abc123

Every request is attributed to a customer

Your users make AI requests. ModelRelay tracks costs per customer. You keep the margin.

Your Customers

User A: Pro tier
User B: Free tier
User C: Enterprise

MODELRELAY

Cost Tracking: per customer
Provider Routing: Anthropic, OpenAI, xAI

response: { cost_cents: 12 }

You Bill Your Customers

User A: $4.50 this month
User B: Free, 47 actions left
User C: $892 this month

Bill per customer

Every request is attributed to a customer. Export usage for invoicing or sync to Stripe.
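
As a rough illustration of the export step, the sketch below sums per-request costs into one total per customer, the kind of figure that would end up on an invoice. The `UsageRecord` shape and its field names are assumptions made for this example, not ModelRelay's export format.

```typescript
// Hypothetical shape of one exported usage row (field names are assumptions).
interface UsageRecord {
  customerId: string;
  costCents: number;
}

// Sum cost per customer so each total can become an invoice line.
function totalCentsByCustomer(records: UsageRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const r of records) {
    totals.set(r.customerId, (totals.get(r.customerId) ?? 0) + r.costCents);
  }
  return totals;
}

const monthlyTotals = totalCentsByCustomer([
  { customerId: "cust_abc123", costCents: 12 },
  { customerId: "cust_abc123", costCents: 30 },
  { customerId: "cust_xyz789", costCents: 5 },
]);
console.log(monthlyTotals.get("cust_abc123")); // 42
```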

Set your margins

Define free, pro, and enterprise tiers, each with its own models and rate limits.
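
One way to picture tiers and margins: each tier names a model, a rate limit, and a markup applied on top of provider cost. Everything below (the `Tier` fields, the model names, the limits, and the markup values) is hypothetical and only illustrates the arithmetic, not how ModelRelay configures tiers.

```typescript
// Hypothetical tier definition; field names and values are placeholders.
interface Tier {
  model: string;
  requestsPerMinute: number;
  markup: number; // 0.5 means charge provider cost plus 50%
}

type TierName = "free" | "pro" | "enterprise";

const tiers: Record<TierName, Tier> = {
  free: { model: "small-model", requestsPerMinute: 5, markup: 0 },
  pro: { model: "mid-model", requestsPerMinute: 60, markup: 0.5 },
  enterprise: { model: "large-model", requestsPerMinute: 600, markup: 0.3 },
};

// Price charged to the customer for a request that cost `costCents`.
function priceCents(costCents: number, tier: Tier): number {
  return Math.round(costCents * (1 + tier.markup));
}

console.log(priceCents(12, tiers.pro)); // 18
```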

Multi-provider fallbacks

Route to Anthropic, OpenAI, or xAI. Automatic failover when a provider is down.
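
Failover can be pictured as trying providers in priority order and returning the first success. This is a conceptual sketch only: the `Provider` type and the `firstSuccessful` helper are invented for illustration, and the stub providers stand in for real API calls. It does not describe ModelRelay's actual routing logic.

```typescript
// A provider is modeled as a name plus an async call; both are hypothetical.
interface Provider {
  name: string;
  complete: (prompt: string) => Promise<string>;
}

// Try each provider in order and return the first successful result.
// If every provider throws, surface all collected errors in one message.
async function firstSuccessful(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      return { provider: p.name, text: await p.complete(prompt) };
    } catch (err) {
      errors.push(`${p.name}: ${String(err)}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}

// Stub providers: the first always fails, the second echoes the prompt.
const primary: Provider = {
  name: "primary",
  complete: async () => {
    throw new Error("provider unavailable");
  },
};
const backup: Provider = {
  name: "backup",
  complete: async (prompt) => `echo: ${prompt}`,
};

firstSuccessful([primary, backup], "hello").then((r) => console.log(r.provider)); // backup
```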

Cut off abusers

Set spend limits per customer. Requests over the limit fail before they cost you money.
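
A spend limit amounts to a check that runs before the request is forwarded to a provider. The sketch below assumes you know the customer's spend so far and a worst-case cost estimate for the next request; the `SpendState` type and `withinLimit` function are hypothetical names for this example.

```typescript
// Hypothetical per-customer spend state for the current billing period.
interface SpendState {
  spentCents: number;
  limitCents: number;
}

// Returns true only if the request's worst-case cost still fits the limit.
function withinLimit(state: SpendState, estimatedMaxCents: number): boolean {
  return state.spentCents + estimatedMaxCents <= state.limitCents;
}

console.log(withinLimit({ spentCents: 480, limitCents: 500 }, 12)); // true
console.log(withinLimit({ spentCents: 495, limitCents: 500 }, 12)); // false
```

Checking against a worst-case estimate, rather than the actual cost after the fact, is what lets the request be rejected before any provider charge is incurred.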

See every cent

Every response includes token counts and cost. No waiting for monthly invoices.
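
Per-request cost follows from token counts and per-token prices. In the sketch below the prices are made-up placeholders (real prices vary by provider and model), and the `Usage` and `Pricing` field names are assumptions rather than the SDK's response shape.

```typescript
// Hypothetical token usage for one request.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical price table, in cents per million tokens.
interface Pricing {
  inputCentsPerMillion: number;
  outputCentsPerMillion: number;
}

// cost = (input tokens * input price + output tokens * output price) / 1,000,000
function costCents(usage: Usage, pricing: Pricing): number {
  return (
    (usage.inputTokens * pricing.inputCentsPerMillion +
      usage.outputTokens * pricing.outputCentsPerMillion) /
    1_000_000
  );
}

// Placeholder prices: $3 per million input tokens, $15 per million output tokens.
const examplePricing: Pricing = {
  inputCentsPerMillion: 300,
  outputCentsPerMillion: 1500,
};
console.log(costCents({ inputTokens: 20_000, outputTokens: 4_000 }, examplePricing)); // 12
```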

Multi-agent orchestration

Define parallel pipelines with workflow.v0. Fan-out, join, tool calling.
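
Fan-out and join can be pictured with plain promises: start several independent steps in parallel, wait for all of them, then combine the results. The code below is ordinary TypeScript, not workflow.v0 syntax, and the step functions are stand-ins for model calls.

```typescript
// A step takes the shared input and asynchronously produces a string.
type Step = (input: string) => Promise<string>;

// Fan out: run every step on the same input in parallel.
// Join: wait for all results (in step order), then combine them.
async function fanOutJoin(
  input: string,
  steps: Step[],
  join: (results: string[]) => string,
): Promise<string> {
  const results = await Promise.all(steps.map((step) => step(input)));
  return join(results);
}

// Stand-in steps; a real pipeline would call a model in each one.
const summarize: Step = async (input) => `summary(${input})`;
const translate: Step = async (input) => `translation(${input})`;

fanOutJoin("report", [summarize, translate], (r) => r.join(" | ")).then((out) =>
  console.log(out),
); // summary(report) | translation(report)
```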

See how providers compare → Live benchmarks updated every 60s