Model Selection Router

Lock every AI call to the most capable Opus model (no downgrade) and route cost savings through prompt caching, batch APIs and context engineering instead of cutting quality.

An enforcement skill that locks every AI call in your system to the most capable Opus model, with no downgrade to cheaper or faster models permitted. Instead of cutting quality to cut cost, it routes savings through prompt caching, batch APIs, and context engineering, so output quality stays consistent across every agent, script, and report.

$15 one-time

Prices include 20% VAT · Forged on real agency work · One-time, no lock-in

  • Type: Skill
  • Category: AI & LLM
  • Delivery: Email · instant
  • License: One-time
forgehouse, model-selection-router

Inside the run · no black box

See the actual work before you buy it.

Downgrading to a cheaper model rarely saves money once revision rounds are priced in. This skill audits the whole fleet against that math and enforces a single top-model standard.

  1. Scans every agent's frontmatter for the model field. Missing or downgraded entries get flagged, and the alias form is enforced so that new model releases upgrade the whole fleet automatically, with zero code changes.
  2. Scans the skill library for explicit lower-tier model declarations and scans scripts for hardcoded model ID strings, then replaces frozen versions with the alias plus an environment override so no file pins itself to a stale model.
  3. Intercepts downgrade proposals at the argument level: when someone argues that 'a cheaper model is enough for this task', or proposes a hierarchy of 'complex work on the big model, fast iterations on the small one', the proposal is rejected with the total-cost-of-iteration math. Revision rounds, re-prompting time and reputation risk make the downgrade more expensive than it looks.
  4. Routes the cost pressure to the legitimate levers instead: prompt caching, for a discount of up to 90 percent on input tokens in repeated static prefixes; the asynchronous batch API, at a 50 percent discount for non-realtime workloads; and context engineering, which cuts irrelevant chunks out of the token budget.
  5. Re-checks consistency across the whole pipeline, because internal automation output feeds customer-facing reports: a downgrade at the start of a chain becomes a quality loss at the end of it, so the standard applies to every stage including tool-call reasoning.
  6. Produces a structured audit brief: total agents and how many carry the correct model field, skills with explicit downgrades to fix, scripts with hardcoded IDs to fix, the correction actions taken and a final verified count.
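
A minimal sketch of what steps 1, 2 and 6 could look like in Python. It is an illustration, not the skill's shipped code. It assumes agent files open with a `---`-delimited frontmatter block holding a `model:` key, that the alias is the bare string `opus`, and that a frozen ID is a quoted name ending in an eight-digit date stamp. `MODEL_ALIAS`, `AuditBrief`, `frontmatter_model` and `audit` are hypothetical names chosen for this example.

```python
import os
import re
from dataclasses import dataclass, field
from typing import Optional

# Assumption: the alias is "opus"; MODEL_ALIAS is a hypothetical environment override.
EXPECTED_ALIAS = os.environ.get("MODEL_ALIAS", "opus")

# Assumption: a "frozen" model ID is a quoted name ending in an 8-digit date stamp.
PINNED_ID = re.compile(r"""["']([a-z][\w.-]*-\d{8})["']""")


def frontmatter_model(text: str) -> Optional[str]:
    """Return the `model:` value from a leading '---' block, or None if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":  # end of the frontmatter block
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model":
            return value.strip().strip("'\"") or None
    return None


@dataclass
class AuditBrief:
    """Counts and file lists for the audit summary."""
    total_agents: int = 0
    agents_ok: int = 0
    agents_flagged: list = field(default_factory=list)
    scripts_with_pinned_ids: list = field(default_factory=list)


def audit(agents: dict, scripts: dict) -> AuditBrief:
    """Audit {filename: text} maps of agent files and scripts."""
    brief = AuditBrief(total_agents=len(agents))
    for name, text in agents.items():
        if frontmatter_model(text) == EXPECTED_ALIAS:
            brief.agents_ok += 1
        else:
            # Covers both a missing model field and a non-alias value.
            brief.agents_flagged.append(name)
    for name, text in scripts.items():
        if PINNED_ID.search(text):
            brief.scripts_with_pinned_ids.append(name)
    return brief
```

Passing file contents in as dictionaries keeps the sketch free of filesystem assumptions; a real run would read the agent and script directories first and then print the fields of the returned brief.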
Use cases · what happens when you plug it in

One power source. 6 lines out.

  1. Auditing agent frontmatter to confirm every dispatch uses the Opus alias
  2. Reviewing new skill files to block hidden cheaper-model declarations
  3. Catching scripts that hardcode a frozen model ID instead of the alias
  4. Rejecting cost-cutting proposals to switch to a lighter model
  5. Setting up prompt caching to recover cost without dropping quality
  6. Configuring batch processing for non-realtime report and audit workloads
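
The arithmetic behind the caching and batching lines can be sketched as follows. The 90 percent and 50 percent figures are the ones quoted on this page, with the caching figure treated as an upper bound. The function names, the per-million-token price and the token counts are hypothetical placeholders, and any cache-write surcharge a provider may apply is ignored.

```python
def cached_input_cost(input_tokens: int, static_prefix_tokens: int,
                      price_per_mtok: float, cache_discount: float = 0.90) -> float:
    """Input cost when the repeated static prefix is billed at a discounted rate.

    cache_discount=0.90 is the "up to 90 percent" figure, i.e. the best case.
    """
    if not 0 <= static_prefix_tokens <= input_tokens:
        raise ValueError("static prefix must fit inside the input")
    fresh_tokens = input_tokens - static_prefix_tokens
    discounted_tokens = static_prefix_tokens * (1 - cache_discount)
    return (fresh_tokens + discounted_tokens) * price_per_mtok / 1_000_000


def batch_cost(realtime_cost: float, batch_discount: float = 0.50) -> float:
    """Cost of the same workload sent through an asynchronous batch API."""
    return realtime_cost * (1 - batch_discount)
```

With a placeholder price of 10.0 per million input tokens, a 10,000-token call whose first 8,000 tokens are a cached static prefix costs about 0.028 instead of 0.10. The two levers are computed separately here; whether they stack depends on the provider and is not assumed.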
Benefits · what you walk away with

Yours to keep.


Forever

That's what owning means.

The rented stack

  • AI writing tool: subscription · expired · access lost
  • Analytics suite: subscription · expired · access lost
  • Design platform: subscription · expired · access lost

(nothing left)

Your forge

  1. Eliminates the hidden revision loops that downgraded outputs cause downstream
  2. Keeps every client-facing deliverable at one predictable quality bar
  3. Cuts spend through caching and batching rather than quality compromise
  4. Future-proofs every agent so new model releases upgrade automatically via the alias

license: perpetual

subscriptions expire · deeds don't

What's included · the full manifest

Everything in the box.

Total-cost-of-iteration reasoning that counts revision rounds, not just per-call price
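
One way to picture that reasoning is a small cost model. This is a sketch under stated assumptions, not the formula the skill ships with: it assumes every revision round costs one extra model call plus a fixed amount of human review time, and it leaves reputation risk out entirely. The `Option` class and all the numbers in the usage note are hypothetical placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One candidate model tier, priced per task rather than per call."""
    name: str
    cost_per_call: float             # model spend for a single attempt
    expected_revision_rounds: float  # average number of redo rounds per task
    review_cost_per_round: float     # human time spent catching and re-prompting

    def total_cost(self) -> float:
        # One initial attempt plus one extra call for every revision round.
        attempts = 1 + self.expected_revision_rounds
        model_spend = attempts * self.cost_per_call
        human_spend = self.expected_revision_rounds * self.review_cost_per_round
        return model_spend + human_spend
```

With placeholder inputs, `Option("top", 0.50, 0.2, 20.0).total_cost()` comes to about 4.6, while `Option("cheap", 0.05, 1.5, 20.0).total_cost()` comes to about 30.1. The per-call price differs by a factor of ten, but in this made-up scenario the review time dominates the total, so the outcome depends almost entirely on the revision-round estimates you feed in.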

6 parts · one working system · ships instantly by email

From the field · a real case

This wasn’t written at a desk.

Who it's for

This wasn't forged for everyone.

  • Not for you if you'd rather rent a tool than own one.
  • Not for you if you want someone else to run your stack.
  • Not for you if you're happy guessing.
Still here? Good.

Teams running multi-agent or automated pipelines who want consistent top-tier output and disciplined cost control without trading quality for a cheaper model.

then this was forged for you.

Works with

Universal by design: the files run in any AI assistant. They are delivered in the open Agent Skills + MCP format (native in Claude); ChatGPT, Gemini, Cursor and Copilot adapt the same files in their own way.

  • Claude: native format
  • ChatGPT: adapts via open standards
  • Gemini: adapts via open standards
  • Cursor: adapts via open standards
  • Copilot: adapts via open standards
Questions · still in the air

Catch what's on your mind.

  1. Does it audit scripts and cron jobs too, or just interactive agents?

    It covers three targets: it audits agent frontmatter, reviews skill files for hidden cheaper-model declarations, and catches scripts that hardcode a frozen model ID instead of the alias. The audit output template covers agents, skills, and scripts in one pass.

  2. If downgrading is banned, how does the bill stay under control?

    Savings come from mechanics, not quality cuts: prompt caching recovers up to 90 percent on repeated static context, the batch API runs asynchronous workloads at half cost, and context-budget engineering trims irrelevant tokens before each call. The total-cost reasoning also counts the revision rounds that downgraded output causes.

  3. Can I make an exception for one low-stakes task?

    No. The ten-point anti-pattern catalog exists to reject exactly that argument, because exceptions are where quality drift starts. If a task is genuinely cheap, batching and caching make it cheap at full quality.

  4. How is it delivered?

    By email right after purchase: ready to run, downloaded instantly, no setup wait.

  5. One-time or subscription?

    A one-time purchase; no subscription or hidden fees. VAT (20%) is included.

  6. Can I get a refund?

    As a digital product, it can’t be refunded once downloaded. That’s why we show exactly what’s inside and who it’s for, right here.