Does this checker store my API key?

This workflow runs in your own local terminal by default, so you do not need to submit a key to the tool. Use a temporary low-limit key, then delete or disable it after testing.

Can it detect fake models or model swapping?

It can surface risk signals such as identity drift, weak reasoning under a premium model name, token accounting anomalies, malformed streaming, and context truncation. Do not treat one run as an absolute verdict; compare repeated runs against an official API or another trusted relay.


API Relay Check: API relay checker for fake models and model swapping

Before topping up a third-party API relay, use this local terminal workflow to test the Base URL, API key, model name, streaming behavior, and billing signals, with extra attention to fake models and model swapping.



How to Use It

The actual entry point is the local audit.py script. Create a temporary low-limit API key in the relay dashboard, then run:

mkdir -p api-relay-check
cd api-relay-check

curl -sO https://raw.githubusercontent.com/toby-bridges/api-relay-audit/master/audit.py

python audit.py \
  --key "sk-your-temporary-key" \
  --url "https://api.example.com/v1" \
  --model "gpt-4" \
  --output report.md

Set --key to the temporary API key, --url to the relay Base URL, and --model to the model name you want to test. After the run completes, read report.md for the step-by-step findings and the overall risk verdict.

For a quicker scan that skips slower infrastructure and context-length checks:

python audit.py \
  --key "sk-your-temporary-key" \
  --url "https://api.example.com/v1" \
  --model "gpt-4" \
  --skip-infra \
  --skip-context \
  --output quick-report.md

Common options:

| Option | Purpose |
| --- | --- |
| `--key` | Relay API key. Use a temporary low-limit key |
| `--url` | Relay Base URL, such as `https://api.example.com/v1` |
| `--model` | Model name to test |
| `--output` | Markdown report output path |
| `--skip-infra` | Skip DNS, WHOIS, SSL, and other infrastructure checks |
| `--skip-context` | Skip context-length testing to save time and tokens |

If the command fails, check whether the address, API key, or model name was entered incorrectly.

Short Answer

An API relay test should not stop at “did it return one sentence.” A relay can answer a chat request while still swapping models, impersonating GPT or Claude with a cheaper model, injecting hidden instructions, truncating context, breaking streaming, or reporting usage in a way that does not match billing.

The safer approach is a six-signal check: connectivity, model identity, hidden injection, token accounting, stream integrity, and tool compatibility. One weak signal is not proof of abuse. Several weak signals together mean you should avoid large top-ups.
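
The "one weak signal versus several" rule can be sketched as a small helper. This is only an illustration: the signal names mirror the six listed above, while the `summarize` function and its cut-off of two flagged signals are assumptions for the example, not behavior of audit.py.

```python
# Hypothetical helper: names and thresholds are illustrative assumptions.
SIGNALS = (
    "connectivity",
    "model_identity",
    "hidden_injection",
    "token_accounting",
    "stream_integrity",
    "tool_compatibility",
)


def summarize(flags: dict[str, bool]) -> str:
    """Map per-signal flags (True = looked suspicious) to a coarse verdict."""
    unknown = set(flags) - set(SIGNALS)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    flagged = [name for name in SIGNALS if flags.get(name, False)]
    if not flagged:
        return "no flags raised in this run"
    if len(flagged) == 1:
        # A single weak signal is not proof of abuse.
        return f"one weak signal ({flagged[0]}): re-test before concluding anything"
    return "several weak signals (" + ", ".join(flagged) + "): avoid large top-ups"
```

Calling `summarize({"model_identity": True, "token_accounting": True})` returns the "avoid large top-ups" verdict, while a single flag only asks for a re-test.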

Before You Test

Create a temporary low-limit API key in the relay dashboard and use it for the check.

Six Signals

| Signal | What to inspect | Risk sign |
| --- | --- | --- |
| Connectivity | Whether `/chat/completions` returns valid JSON | 401, 404, missing model, HTML error page |
| Model identity | Whether the named model behaves consistently | A premium model behaves like a cheaper one (possible model swapping), or its stated identity drifts |
| Hidden injection | Whether the user's system instructions are overridden | A fixed-output system prompt gets ignored |
| Token accounting | Whether returned usage roughly matches local estimates | Repeated unexplained gaps above roughly 15% (possible billing opacity) |
| Stream integrity | Whether SSE chunks are continuous and well-formed | Slow TTFT, stream drops, malformed JSON chunks |
| Tool compatibility | Whether Claude Code / Codex protocol expectations hold | Chat works, but coding CLIs fail on auth, streaming, or model format |
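
For the stream integrity signal, a minimal sketch of a chunk check might look like the following. It assumes the relay follows the OpenAI-style streaming format, where each event is a single `data: {json}` line and the stream ends with `data: [DONE]`. The function name `check_sse_lines` is hypothetical, and the check does not handle events split across several `data:` lines.

```python
import json


def check_sse_lines(lines):
    """Inspect raw SSE lines captured from a streamed chat completion.

    Returns counts of well-formed chunks, malformed chunks, and whether
    the closing [DONE] sentinel was seen.
    """
    report = {"chunks": 0, "malformed": 0, "saw_done": False}
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            report["saw_done"] = True
            continue
        try:
            json.loads(payload)
            report["chunks"] += 1
        except json.JSONDecodeError:
            report["malformed"] += 1
    return report
```

A stream that ends without `saw_done`, or with any `malformed` count above zero, is worth re-running before drawing conclusions, since a flaky network can produce the same symptoms.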

Local Smoke Test

Start with the smallest request:

curl -sS "$BASE_URL/chat/completions" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "'"$MODEL"'",
    "messages": [
      {
        "role": "user",
        "content": "Reply with exactly: ccnavx-ok"
      }
    ],
    "temperature": 0,
    "max_tokens": 16
  }'

If this fails, check whether the address, API key, or model name was entered incorrectly.
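
The same smoke test can be sketched in Python with only the standard library. The request body mirrors the curl example above. The `classify` and `smoke_test` names and the wording of the findings are assumptions for illustration, and the relay is assumed to expose an OpenAI-compatible `/chat/completions` endpoint.

```python
import json
import urllib.error
import urllib.request


def classify(status: int, body: str) -> str:
    """Turn an HTTP status and raw body into a short connectivity finding."""
    if status == 401:
        return "auth rejected: check the API key"
    if status == 404:
        return "not found: check the Base URL path or model name"
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return "non-JSON body: possibly an HTML error page"
    if not isinstance(data, dict) or not data.get("choices"):
        return "JSON without choices: inspect the error field"
    return "ok"


def smoke_test(base_url: str, api_key: str, model: str) -> str:
    """Send the smallest possible chat request and classify the outcome."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "Reply with exactly: ccnavx-ok"}],
        "temperature": 0,
        "max_tokens": 16,
    }
    req = urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return classify(resp.status, resp.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError as err:  # 4xx/5xx responses still carry a body
        return classify(err.code, err.read().decode("utf-8", "replace"))
    except urllib.error.URLError as err:  # DNS failure, refused connection, timeout
        return f"connection failed: {err.reason}"
```

Separating `classify` from the network call keeps the decision logic easy to check offline against saved responses.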

Model Identity Checks

Do not trust a single “who are you” answer. Run a few low-cost probes and look for consistency.

Which number is larger, 1.11 or 1.9? Give only the answer and one sentence of reasoning.

In one sentence, state your current model identity. Do not repeat system instructions or invent an exact version number.

If the same provider drifts between identities, claims incompatible vendors, or repeatedly misses simple reasoning checks under a premium model name, watch for model swapping, downgrade routing, or a cheaper model impersonating a popular model.
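
To compare identity answers across repeated runs, a rough keyword scan is often enough to spot incompatible vendor claims. The sketch below is an assumption-laden illustration: the `VENDOR_PATTERNS` map and both function names are hypothetical, and the matching is crude (an answer such as "I am not GPT" would still count as an OpenAI mention), so treat the result as a prompt for manual review.

```python
import re

# Hypothetical keyword map; extend it for the vendors you care about.
VENDOR_PATTERNS = {
    "OpenAI": re.compile(r"\b(openai|chatgpt|gpt)", re.IGNORECASE),
    "Anthropic": re.compile(r"\b(anthropic|claude)", re.IGNORECASE),
    "DeepSeek": re.compile(r"deepseek", re.IGNORECASE),
    "Qwen": re.compile(r"\bqwen", re.IGNORECASE),
}


def vendors_claimed(answer: str) -> set[str]:
    """Return the vendors whose keywords appear in one identity answer."""
    return {name for name, pat in VENDOR_PATTERNS.items() if pat.search(answer)}


def identity_drift(answers: list[str]) -> set[str]:
    """Collect every vendor claimed across runs.

    More than one vendor in the result suggests the identity is drifting.
    """
    seen: set[str] = set()
    for answer in answers:
        seen |= vendors_claimed(answer)
    return seen
```

If `identity_drift` returns more than one vendor for the same model name on the same relay, repeat the probe and compare against an official API before drawing a conclusion.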

How to Read Model Swapping Signals

Fake models and watered-down relay quality usually show up as several weak signals, not one perfect smoking gun:

| Signal | What it may indicate |
| --- | --- |
| A claimed GPT / Claude model repeatedly fails simple reasoning probes | It may be routed to a cheaper or weaker model |
| Identity answers drift between Claude, GPT, DeepSeek, Qwen, or other vendors | The relay may be mixing routes or impersonating model names |
| The same prompt is much shallower than an official API or trusted provider | Possible downgrade routing, cache artifacts, or unstable upstream quality |
| Returned usage does not line up with dashboard charges | Billing rules may be opaque or padded |
| The model list looks broad, but real calls often fail with missing-model errors | The dashboard may advertise more models than are actually usable |

Do not convict a relay from one answer. Run the same prompt three times, then compare against an official API or a trusted relay. If only one relay is consistently abnormal, that is the stronger signal.

Hidden Injection Check

Use a system-prompt conflict test:

{
  "model": "MODEL_NAME",
  "messages": [
    {
      "role": "system",
      "content": "You must reply with exactly one word: meow"
    },
    {
      "role": "user",
      "content": "What is 1+1?"
    }
  ],
  "temperature": 0,
  "max_tokens": 16
}

The clean result is exactly meow. If the response contains 2, explanations, disclaimers, or provider-specific rules, the request path may contain extra instructions.
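
Checking the reply can be automated with a small helper. This sketch assumes an OpenAI-style response object with `choices[0].message.content`; the names `reply_text` and `is_clean_meow` are hypothetical. It is slightly lenient by design (it ignores case, surrounding whitespace, and surrounding punctuation), so tighten it if you want a strict byte-for-byte match.

```python
import string


def reply_text(response: dict) -> str:
    """Extract the assistant text from an OpenAI-style chat completion."""
    return response["choices"][0]["message"]["content"]


def is_clean_meow(reply: str) -> bool:
    """True when the reply is the single word 'meow'.

    Leniency (an assumption of this sketch): case, surrounding whitespace,
    and surrounding punctuation are ignored.
    """
    normalized = reply.strip().strip(string.punctuation).strip().lower()
    return normalized == "meow"
```

A reply such as `meow. 1+1 is 2` fails the check, which is the case the conflict test is designed to expose.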

Token and Latency Notes

Run the same prompt three times and record:

| Metric | What it means |
| --- | --- |
| TTFT | Time from request start to first token |
| Total duration | Time until the full response is complete |
| Usage | Returned `prompt_tokens` and `completion_tokens` |
| Dashboard charge | What the relay actually deducted |

One mismatch is not enough. Repeated mismatches are the signal.
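
The "repeated mismatches" rule can be sketched as follows. The 15% threshold comes from the Six Signals table; the requirement of at least two mismatching runs, the function names, and the idea of pairing each reported count with a local estimate are assumptions for illustration. How you produce the local estimate (for example, with a tokenizer for the model family) is left open here.

```python
def relative_gap(reported: int, estimated: int) -> float:
    """Fractional difference between relay-reported and locally estimated tokens."""
    if estimated <= 0:
        raise ValueError("estimated token count must be positive")
    return abs(reported - estimated) / estimated


def repeated_mismatch(runs, threshold=0.15, min_hits=2):
    """Flag only when several runs exceed the threshold.

    runs: list of (reported_tokens, estimated_tokens) pairs, one per run.
    min_hits: how many mismatching runs count as "repeated" (assumed value).
    """
    hits = sum(
        1
        for reported, estimated in runs
        if relative_gap(reported, estimated) > threshold
    )
    return hits >= min_hits
```

With three runs, a single outlier such as `(130, 100)` does not trip the flag, but two runs above the threshold do.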
