OpenRouter is one of those tools you can describe in a sentence — one OpenAI-compatible API in front of 400+ models from 60+ providers — and still spend weeks figuring out whether you want it. The pitch is real; the catch is knowing when it's the right answer versus when you should pay providers directly, self-host, or run a heavier gateway like LiteLLM.
This guide is for engineering teams who've already worked out that the AI coding landscape now has three pricing camps and want a hands-on read of the portable camp's most popular front door. We'll cover what OpenRouter does, how to set it up, how to wire it into Cline / Aider / Continue / OpenCode, what the billing model gives you, and where the line falls between "OpenRouter is enough" and "graduate to LiteLLM or run models locally."
What OpenRouter actually is
Strip away the marketing copy and OpenRouter is three things stapled together:
- A unified API. One endpoint (https://openrouter.ai/api/v1/chat/completions) with an OpenAI-compatible request/response shape. Any participating provider's model is selectable by a single model string. Existing OpenAI SDK code becomes drop-in compatible by swapping the base_url.
- A routing layer. When the provider behind your chosen model has an outage or hits a rate limit, OpenRouter falls back to another provider serving the same model. A :nitro variant suffix routes to the fastest available provider for a given model.
- A credit-based billing relationship. You fund one credit balance (card, crypto, or bank transfer) and pay at the underlying provider's published rate. OpenRouter does not mark up provider pricing (what you see in the model catalog is what you pay) but adds a 5.5% platform fee on pay-as-you-go usage.
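To make the routing bullet concrete, here is a minimal sketch of how a request body might opt into the :nitro variant. The helper names `nitro` and `build_payload` are hypothetical, introduced only for illustration; provider fallback needs no extra fields because, as described above, OpenRouter handles it on its side.

```python
import json


def nitro(model: str) -> str:
    """Append the :nitro suffix once; leave an already-suffixed id alone."""
    return model if model.endswith(":nitro") else f"{model}:nitro"


def build_payload(prompt: str, model: str, fastest: bool = False) -> dict:
    """Build an OpenAI-compatible chat body (hypothetical helper).

    When `fastest` is true the model id gets the :nitro suffix, asking
    OpenRouter to route to the fastest available provider.
    """
    return {
        "model": nitro(model) if fastest else model,
        "messages": [{"role": "user", "content": prompt}],
    }


payload = build_payload(
    "Refactor this function...",
    "anthropic/claude-sonnet-4.5",
    fastest=True,
)
print(json.dumps(payload, indent=2))
```

Keeping the suffix logic in one helper means a team can flip latency-sensitive calls to :nitro without touching the rest of the request.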
Two more things matter for shipping teams: custom data policies scope which providers can see your prompts, and prompts/completions are not logged by default (you can opt in for a 1% discount).
OpenRouter publishes a live rankings page tracking weekly usage across the platform: top models, market share by creator, tool-call frequency. That's more useful than any leaderboard we could paste into this article, which would be out of date before you finish reading. Bookmark the page; don't memorize the order.
Setting it up
Generate a key in the OpenRouter dashboard. A cURL call against any model then looks like this:
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Refactor this function..."}]
  }'
Two optional headers are worth setting on day one: HTTP-Referer and X-OpenRouter-Title. They identify your app in the OpenRouter rankings and help separate "our coding tool" from "our chatbot" on the activity page.
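As a small sketch of the attribution headers, the helper below builds one header set per app so activity can be told apart. The function name `attribution_headers` and the example URLs and titles are hypothetical; the header names are the ones given above.

```python
def attribution_headers(app_url: str, app_title: str) -> dict[str, str]:
    """Return the optional headers that label an app's traffic (hypothetical helper)."""
    return {
        "HTTP-Referer": app_url,
        "X-OpenRouter-Title": app_title,
    }


# Placeholder URLs and titles, purely for illustration.
coding_tool = attribution_headers("https://example.com/coding-tool", "Our Coding Tool")
chatbot = attribution_headers("https://example.com/chatbot", "Our Chatbot")

for headers in (coding_tool, chatbot):
    print(headers)
```

With the OpenAI Python SDK, a dict like this can be passed as `default_headers` when constructing the client, so every request from that app carries the same labels.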
Because the API is OpenAI-compatible, any existing OpenAI SDK call works with two changes — base URL and key:
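import os

from openai import OpenAI

# Point the stock OpenAI client at OpenRouter; only the base URL and key change.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# Model ids use OpenRouter's <provider>/<model> form.
resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.5",
    messages=[{"role": "user", "content": "..."}],
)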
That portability is the load-bearing feature. Existing code, existing SDK — you're swapping the destination, not rewriting your application.
Wiring it into your AI coding tool
Most teams adopt OpenRouter not to call it from app code, but to point their AI coding agent at it so every developer's tool talks to one billing account. The mechanics differ slightly per tool.
Cline has a first-class OpenRouter provider. In Cline's settings panel, pick OpenRouter, paste your key, and select a model. Cline pulls the live model list from the OpenRouter catalog and explicitly supports "OpenRouter's 200+ models" alongside Claude, GPT, Gemini, and the major cloud endpoints.
Aider picks up OpenRouter via the standard OPENROUTER_API_KEY env var and an openrouter/<provider>/<model> model string:
export OPENROUTER_API_KEY=sk-or-...
aider --model openrouter/anthropic/claude-sonnet-4.5
The openrouter/ prefix tells Aider's underlying LiteLLM layer which provider to route through. Aider's git-native posture pairs well with OpenRouter — swap models mid-session and the diff history still records what each agent contributed.
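If you script Aider launches for a team, the prefix rule can be captured in a few lines. This is a sketch only: `aider_model` and `aider_command` are hypothetical helper names, and the only Aider flag used is the `--model` flag shown above.

```python
import shlex

PREFIX = "openrouter/"


def aider_model(openrouter_id: str) -> str:
    """Turn an OpenRouter id (<provider>/<model>) into Aider's model string."""
    return openrouter_id if openrouter_id.startswith(PREFIX) else PREFIX + openrouter_id


def aider_command(openrouter_id: str) -> list[str]:
    """Build the argv list for launching Aider against an OpenRouter model."""
    return ["aider", "--model", aider_model(openrouter_id)]


print(shlex.join(aider_command("anthropic/claude-sonnet-4.5")))
# prints: aider --model openrouter/anthropic/claude-sonnet-4.5
```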
Continue.dev configures OpenRouter as a provider entry in config.yaml:
models:
  - name: Claude Sonnet via OpenRouter
    provider: openrouter
    model: anthropic/claude-sonnet-4.5
    apiKey: ${env:OPENROUTER_API_KEY}
    roles:
      - chat
      - edit
The roles array controls which jobs each model handles. The entry above covers chat and edit; add a second entry with the autocomplete role to slot a different OpenRouter-backed model into autocomplete. Autocomplete fires constantly and benefits from a cheap, fast model; chat fires occasionally and benefits from a smarter, pricier one. OpenRouter makes that split a small config change.
OpenCode integrates with OpenRouter through its provider system in opencode.json — set OPENROUTER_API_KEY, pick an OpenRouter-routed model, every session uses it.
Claude Code, Copilot, and Cursor are deliberately not on this list — they're tightly coupled to their provider relationships and don't accept arbitrary OpenAI-compatible base URLs on standard tiers. The point of switching to a router is moving from subscription seat economics to usage economics, which means leaving those tools behind on the routing path. For the single-vendor first-party take instead, we covered the first-party AI coding CLIs separately.
→ Once your coding tool is wired up, the question is what happens to the code that comes back. Deploy AI-generated changes from your terminal with the DeployHQ CLI — same agent on the laptop, same pipeline in production.
How OpenRouter pricing works
OpenRouter is pay-as-you-go with no subscription and pass-through pricing. You buy credits (manual top-up or auto top-up) and each request costs the provider's published rate, deducted from your balance as you go. The pricing page is explicit: "Pricing shown in the model catalog is what you pay which is exactly what you will see on provider's websites." There's no inflated middleman rate.
On top of that pass-through cost, a 5.5% platform fee applies on pay-as-you-go usage. Free-tier accounts (25+ models, 4 providers, capped at 50 requests/day) skip the fee.
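The arithmetic is simple enough to sketch. The example below assumes the 5.5% fee behaves as a flat percentage on top of usage, and the per-million-token prices are placeholders, not quotes for any real model; read the live numbers from the model catalog.

```python
PLATFORM_FEE = 0.055  # 5.5% pay-as-you-go platform fee, as described above


def usage_cost(input_tokens: int, output_tokens: int,
               input_price_per_m: float, output_price_per_m: float) -> float:
    """Pass-through cost in USD at the provider's per-million-token rates."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


def total_with_fee(cost_usd: float, fee_rate: float = PLATFORM_FEE) -> float:
    """Usage cost plus the platform fee (assumed to be a flat percentage)."""
    return cost_usd * (1 + fee_rate)


# Placeholder prices: $3 per million input tokens, $15 per million output tokens.
cost = usage_cost(2_000_000, 500_000, input_price_per_m=3.0, output_price_per_m=15.0)
print(f"usage: ${cost:.2f}, with fee: ${total_with_fee(cost):.2f}")
```

At these placeholder rates, 2M input and 0.5M output tokens come to $13.50 of usage, and the fee adds about 74 cents.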
A few details worth knowing:
- Free models exist. An openrouter/free router auto-picks an available free model. Rate limits are tight (5 requests/day with no credits, 200/day with any credits), which is fine for evaluation but not for production.
- Credits expire after a year if unused. Refunds are available within 24 hours of purchase.
- Enterprise pricing is separate — volume discounts, annual commitments, and invoicing outside the standard credit system.
We're deliberately not quoting per-model dollar amounts here because pricing on the underlying providers moves often. The OpenRouter model catalog shows current per-million-input and per-million-output prices alongside each model — that's the only number worth trusting.
Team setup: provisioning, billing, monitoring
The single credit balance is the easy part. The part that earns OpenRouter its keep is giving each developer a key you can monitor, limit, and revoke without re-issuing your master credentials.
OpenRouter exposes a Provisioning API Keys feature for exactly this. You generate a provisioning key in the dashboard (it can only manage other keys, not call completions), then programmatically create per-developer or per-project keys via the /api/v1/keys endpoints. Each generated key can carry a spend limit with optional daily, weekly, or monthly reset (resets at midnight UTC; weeks run Monday through Sunday) and a descriptive label so the activity page tells you which developer or service it belongs to.
The OpenRouter docs call out three patterns: SaaS applications (one API key per customer), key rotation (programmatic regeneration for compliance), and usage monitoring (auto-disable when a key exceeds its limit). For a coding team, the relevant patterns are the latter two: every developer gets their own key with a sensible weekly cap, replacing "everyone shares the master key" with auditable, capped access.
BYOK fits in here too. OpenRouter supports Bring Your Own Key for 60+ inference providers — plug your direct Anthropic, OpenAI, Azure, AWS Bedrock, or Google Vertex credentials into OpenRouter and it routes through your provider account instead of charging your OpenRouter credits. The fee is 5% of the equivalent OpenRouter rate, and the first 1 million BYOK requests per month are free. That makes BYOK useful for teams that already have committed-use discounts with a provider and just want OpenRouter's tooling layer on top.
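For a rough feel of the BYOK fee, the sketch below estimates a month's charge. It assumes the fee applies only to requests beyond the first million and that every request has the same equivalent OpenRouter cost, which is a simplification; the function name `byok_fee` and the example figures are hypothetical.

```python
BYOK_FEE = 0.05            # 5% of the equivalent OpenRouter rate
FREE_REQUESTS = 1_000_000  # BYOK requests per month with no fee


def byok_fee(requests_this_month: int, avg_equivalent_cost_usd: float) -> float:
    """Estimate the monthly BYOK fee under a uniform per-request cost assumption."""
    billable = max(0, requests_this_month - FREE_REQUESTS)
    return billable * avg_equivalent_cost_usd * BYOK_FEE


# Illustrative volumes and a placeholder average cost of one cent per request.
for volume in (800_000, 1_200_000):
    print(f"{volume:>9,} requests -> fee ${byok_fee(volume, 0.01):.2f}")
```

Under these assumptions a team below the million-request threshold pays nothing for BYOK routing, and the fee only grows with the overage.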
→ Once you've centralized AI billing, you'll want matching centralization on the output side — start a DeployHQ project to ship every AI-generated change through one auditable pipeline.
OpenRouter vs the alternatives
Three honest comparisons.
vs LiteLLM. LiteLLM is the self-hosted equivalent — a proxy you run that gives you a similar OpenAI-compatible front door, plus virtual keys, spend tracking, guardrails, and an admin dashboard. The trade is operational: LiteLLM is more control (prompts never leave your infrastructure except to the providers you've configured) at the cost of running a container, a database, and monitoring. Platform teams with regulatory requirements pick LiteLLM. Smaller teams that want to skip the ops tax pick OpenRouter.
vs paying providers directly. Direct relationships have two failure modes: separate bills (one card per provider, fragmented spend visibility) and lock-in. OpenRouter solves both with one bill and one API. The trade is trusting OpenRouter as the middleman — fine for most teams, a non-starter for regulated environments unless you're using BYOK to keep underlying providers as your data processors.
vs running models locally. If you have GPU capacity sitting around and don't want to pay per token, running models locally is the better path. You give up frontier model quality on hard problems and you take on operational overhead, but the per-request cost drops to zero. For deeper architecture and VRAM math, see our self-hosting AI models guide. The realistic team pattern is hybrid: route routine completions through OpenRouter, run experimental work locally, and escalate the hardest problems to Claude or Codex directly.
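The hybrid pattern can be sketched as a small lookup from task tier to endpoint. Everything here except the OpenRouter base URL and model id is a placeholder: the local server address, the direct-provider URL, and the other model names are hypothetical and would be replaced with whatever your team actually runs.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    ROUTINE = "routine"            # everyday completions
    EXPERIMENTAL = "experimental"  # throwaway or exploratory work
    HARD = "hard"                  # the hardest problems


@dataclass(frozen=True)
class Route:
    base_url: str
    model: str


ROUTES = {
    Tier.ROUTINE: Route("https://openrouter.ai/api/v1", "anthropic/claude-sonnet-4.5"),
    # Hypothetical local OpenAI-compatible server and model name.
    Tier.EXPERIMENTAL: Route("http://localhost:8000/v1", "local-model"),
    # Hypothetical direct-provider endpoint and model name.
    Tier.HARD: Route("https://direct-provider.example/v1", "frontier-model"),
}


def pick_route(tier: Tier) -> Route:
    """Return the endpoint and model for a task tier."""
    return ROUTES[tier]


print(pick_route(Tier.ROUTINE))
```

Keeping the mapping in one table makes the routing policy reviewable in a pull request instead of scattered across each developer's tool settings.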
A short decision table:
| Use OpenRouter when | Use LiteLLM when | Use local models when |
|---|---|---|
| You want one bill and minimal ops | You need self-hosted control | You want zero per-request cost |
| Your developers want model flexibility | You have compliance constraints | You have GPU capacity sitting idle |
| You're OK with a hosted middleman | You're a platform team with ops capacity | You're willing to accept a quality ceiling |
The deployment handoff
Routing is upstream. It governs how prompts get to models and how bills get split. None of it matters once the AI has produced code — at that point, the only question is whether the change ships safely.
Three things matter on the output side. First, a containerized build pipeline that runs your full test suite on every commit — it doesn't matter whether the code came from a developer, Claude via OpenRouter, or DeepSeek via OpenRouter, the build either goes green or it doesn't. Second, one-click rollback to the last good deploy for when something slips through — the difference between a five-minute fix and a multi-hour incident. Third, automated AI code review on PRs before the build pipeline sees the change — AI-generated code from different OpenRouter-routed models drifts in different ways, and dedicated reviewers catch the patterns each model reproduces most often.
OpenRouter solves the input-side problem of "every developer has their own Anthropic invoice." DeployHQ closes the loop on the output side. The two together are what a routing strategy looks like in production.
FAQ
Is OpenRouter cheaper than going direct? On per-token cost alone, it's the same — pass-through pricing means you pay the provider's published rate, plus a 5.5% platform fee on pay-as-you-go (or 5% on BYOK after the 1M-request free tier). What you save is operational overhead: one bill, one balance, one set of keys instead of N.
Does it slow down requests? There's a small added hop, generally negligible for streaming completions. OpenRouter optimizes for this with distributed infrastructure and offers a :nitro variant routing to the fastest provider for a given model. For a normal coding task you won't notice; for very long contexts you might.
Can I use OpenRouter with Claude Code or Copilot directly? No on standard tiers — both are tightly coupled to their own provider relationships and don't accept arbitrary OpenAI-compatible base URLs. The point of switching to a router is leaving those subscription tools behind for the routing path. Cline, Aider, Continue, and OpenCode all accept OpenRouter as a provider.
What about data privacy? Prompts and completions are not logged by default; you can opt in for a 1% discount. Custom data policies scope which providers see your prompts. For stricter requirements, BYOK keeps your direct provider relationship intact and just uses OpenRouter as the routing layer.
Does OpenRouter support BYOK? Yes — for 60+ inference providers including OpenAI, Azure (AI Foundry and OpenAI), AWS Bedrock, and Google Vertex. The fee is 5% of the equivalent OpenRouter rate, and the first 1 million BYOK requests per month are free. Keys are encrypted and you can set priority/fallback behavior per provider.
The bottom line
OpenRouter is the easiest router to start with. One key, one bill, OpenAI-compatible API, drop-in compatibility with the open AI coding tools covered above, provisioning keys for team controls, BYOK if you outgrow the simple credit model. For most shipping teams the question isn't "should I use OpenRouter?" but "should I use OpenRouter or graduate to LiteLLM?" Start with OpenRouter; reach for LiteLLM when you need self-hosted control.
What it doesn't solve is what happens after the model writes the code. That's the matching half (the build pipeline, the rollback, the agent-to-production handoff), and it's where shipping teams get bitten when they treat "routing solved" as "AI workflow solved."
Questions about wiring OpenRouter into a DeployHQ-backed shipping pipeline? Email support@deployhq.com or reach out on X / Twitter.