Best LiteLLM Alternatives in 2026: Secure AI Gateways

  • Matt Tanner
  • Senior Director, Product Marketing - API Platform, WSO2

A supply chain compromise and fraudulent compliance certifications hit LiteLLM in the same week. Here's what engineering teams should evaluate next.

Why teams are reevaluating LiteLLM in 2026

Two things went wrong for LiteLLM in the last week of March 2026. Together, they've turned LiteLLM security into a front-page topic and shaken confidence in one of the most widely adopted AI gateway libraries in the Python ecosystem.

On March 24, a threat actor linked to the group known as TeamPCP published compromised versions of the LiteLLM Python package (versions 1.82.7 and 1.82.8) to PyPI. The malicious packages included a .pth file that executed automatically on every Python process startup, deploying a multi-stage payload: credential harvesting, reconnaissance, and a persistent backdoor for remote code execution. According to InfoQ, PyPI quarantined the packages after approximately 40 minutes, but LiteLLM sees roughly 3 million downloads per day. Forty minutes was enough.
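To make concrete why a .pth file is such an effective payload vehicle: any line in a .pth file that begins with `import` is exec()'d by the interpreter's site machinery, before your application code runs. A harmless sketch of the mechanism (the attribute name is invented for the demo):

```python
import pathlib
import site
import sys
import tempfile

# A .pth file dropped into any site directory can contain lines that the
# interpreter exec()'s -- the mechanism the compromised packages abused.
# This harmless demo reproduces it in a temporary directory.
demo_dir = pathlib.Path(tempfile.mkdtemp())
(demo_dir / "payload.pth").write_text(
    "import sys; sys.pth_payload_ran = True\n"  # arbitrary code fits here
)

# site.addsitedir processes .pth files the same way interpreter startup does.
site.addsitedir(str(demo_dir))

print(getattr(sys, "pth_payload_ran", False))  # -> True
```

The "payload" here only sets an attribute; the malicious packages used the same hook to launch credential harvesting on every Python process start.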

Two days later, TechCrunch reported that LiteLLM had used Delve, the Y Combinator-backed compliance startup, for its SOC 2 Type 2 and ISO 27001 certifications. The problem: Delve has been accused of systematically fabricating compliance reports for hundreds of clients. According to a whistleblower analysis, Delve produced over 500 structurally identical audit reports across hundreds of companies, with only client names and logos changed. The auditing firms involved had minimal U.S. presence.

In short: the LiteLLM supply chain attack compromised the package itself, and the SOC 2 certification that should have increased confidence that relevant controls were in place was, according to TechCrunch's reporting, based on fabricated compliance evidence from Delve. Delve has denied the allegations, but the certifications are now under a cloud. And the supply chain was already breached.

LiteLLM's CTO has since announced a switch to Vanta for re-certification with an independent auditor. Good move. But it doesn't unwind the months that teams operated under the assumption that LiteLLM's security posture had been independently verified. If you're one of those teams, you're probably already looking at what comes next.

What to look for in an AI gateway after a trust breach

The temptation is to swap one proxy for another and move on. Don't. The LiteLLM situation exposed two distinct failure modes, and your replacement criteria should address both.

Supply chain integrity. The compromise didn't originate from LiteLLM's own code. It came through trivy-action, the GitHub Action wrapper for the Trivy security scanner in their CI/CD pipeline. Attackers rewrote the action's Git tags to point to a malicious release, which exfiltrated LiteLLM's PyPI publishing credentials from the GitHub Actions runner. Your AI gateway's direct code can be spotless, but if a transitive dependency in the build pipeline gets poisoned, none of that matters. Look for gateways with reproducible builds, signed releases, and an open source codebase you can audit yourself. Self-hosted deployment options also reduce your exposure: you control what gets installed and when. This is the OWASP Top 10 for LLM Applications problem applied to production AI infrastructure rather than prompts. Your attack surface extends well beyond the application layer.
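The tag-rewriting vector is easy to screen for in your own pipelines: an action pinned to a mutable tag (`@v1`, `@v0.28.0`) can be silently repointed, while a full 40-character commit SHA cannot. A minimal sketch that flags tag-pinned `uses:` references in a workflow file (the regexes are illustrative, not a full YAML parser):

```python
import re

ACTION_REF = re.compile(r"uses:\s*([\w\-./]+)@([\w.\-]+)")
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")

def tag_pinned_actions(workflow_yaml: str) -> list[str]:
    """Return actions referenced by a mutable tag instead of a commit SHA.

    A tag can be rewritten by an attacker who compromises the action's
    repository; a full commit SHA cannot.
    """
    return [
        f"{name}@{ref}"
        for name, ref in ACTION_REF.findall(workflow_yaml)
        if not FULL_SHA.match(ref)
    ]

workflow = """
  steps:
    - uses: actions/checkout@8ade135a41bc03ea155e62e844d188df1ea18608
    - uses: aquasecurity/trivy-action@v0.28.0
"""
print(tag_pinned_actions(workflow))  # -> ['aquasecurity/trivy-action@v0.28.0']
```

Run something like this over every workflow in your repos; anything it flags should be repinned to a SHA you've reviewed.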

Real compliance verification. Delve's fraud worked because enterprises treated the SOC 2 badge as a checkbox rather than verifying the auditor behind it. When evaluating which AI gateway to adopt, dig one layer deeper on compliance claims. Who performed the audit? Are they an established firm with a verifiable track record? Does the compliance scope actually cover the product you're using, or just a parent company's generic cloud infrastructure? WSO2, for example, holds SOC 2 Type 2 attestation scoped specifically to its public cloud offerings (API Platform Cloud, Choreo, Devant, Asgardeo) covering Security, Confidentiality, and Availability trust service criteria, plus ISO 27001:2022 certification for its Digital Operations function. That specificity matters. You can verify exactly which products are covered and by whom.

Deployment flexibility. After a supply chain attack that came through PyPI, the question of where your gateway runs matters more than it used to. A managed-only SaaS gateway means you're trusting the vendor's entire deployment pipeline on faith. A gateway that offers self-managed deployment alongside SaaS gives you the option to control your own infrastructure, audit dependencies before deployment, and decouple your upgrade cycle from the vendor's release cadence. Self-hosted isn't always the right call, but having the choice matters for enterprise risk management.

Beyond those three, the standard enterprise AI gateway evaluation criteria still apply: multi-model routing across multiple LLM providers, guardrails and data protection, observability, and cost controls. But in April 2026, the security and compliance requirements come first.

WSO2 AI Gateway: Enterprise AI traffic control built on Envoy

WSO2 AI Gateway takes a different architectural approach than LiteLLM. LiteLLM is a Python library that acts as a unified API proxy for routing requests to multiple providers. WSO2's gateway is a custom Go-based engine built on Envoy Proxy and extended for AI workloads. The foundation is open source under the Apache 2.0 license and written primarily in Go (88.9% of the codebase), so the code is auditable and the supply chain is anchored to a well-understood proxy infrastructure rather than a PyPI package.

That architectural choice matters right now. An Envoy-based gateway follows established Kubernetes-native deployment patterns with a dependency tree that security teams already know how to audit. It's not a Python package that auto-executes .pth files on import. For teams searching for LiteLLM alternatives that are open source, the Apache 2.0 licensed codebase is a real differentiator. You can fork it, audit it, and build from source if your security posture demands it.

LLM governance and routing

For outbound LLM traffic (your AI applications calling models), WSO2 handles multi-provider routing with load balancing and automatic failover across major LLM providers including OpenAI, Anthropic, Google Vertex, Azure AI, AWS Bedrock, and Mistral. Table stakes for any LiteLLM replacement. Where it goes further is consumption management: token-based rate limiting calibrated to how LLMs actually charge (per-token, not per-request), prompt management with semantic caching that serves cached responses to semantically similar queries rather than requiring exact text matches, and PII masking that scrubs sensitive data before prompts leave your network.

The rate limiting distinction deserves attention. Most API gateways rate-limit on request count because that's how traditional APIs work. LLM costs scale with token usage, not request volume. A single request with a 10,000-token context window costs 100x more than a simple classification prompt. WSO2's gateway tracks and throttles at the token level, so your finance team can set department-level budgets that actually correspond to spend, not just call volume.

MCP governance

The less obvious but increasingly important capability is MCP (Model Context Protocol) governance. As AI agents proliferate, they need governed access to external tools and internal APIs. WSO2's gateway can convert REST APIs into MCP-compatible servers without custom wrapper code, proxy external MCP servers through centralized policy enforcement, and provide a discoverable catalog for agent developers through its MCP Hub. Tool-level and server-level audit logs give you end-to-end visibility into which agents are accessing which tools, and how often.

Several other gateways in this roundup have started adding MCP support (Portkey and Kong both offer MCP capabilities now), but WSO2's approach is distinctive in that it converts existing REST APIs into MCP-compatible servers without custom wrapper code, rather than only proxying MCP traffic that already exists. If your AI applications and agents interact with internal systems (and if you're reading this, they probably do or will soon), the ability to govern both LLM traffic and MCP tool access through a unified control plane with shared audit trails matters.

Deployment and compliance

Deployment options span SaaS (WSO2's managed service), hybrid (data plane in your network, control plane managed by WSO2), and fully self-hosted. For production teams in regulated environments whose primary concern after the LiteLLM incident is supply chain control, the self-hosted option gives you complete control: you're running infrastructure you've pulled, audited, and deployed yourself. Not trusting a PyPI package to behave.

On compliance, WSO2 holds SOC 2 Type 2 attestation for its public cloud offerings and ISO 27001:2022 certification for Digital Operations, with enterprise controls covering data retention and access. Real certifications from established auditors. Not Delve-generated reports with identical boilerplate text.

Where it's heavier than LiteLLM

WSO2 AI Gateway is part of a broader API platform that includes the standard WSO2 API gateway, API control plane, API developer portal, and analytics and monetization capabilities. If you're coming from a pip install litellm world, that can look like a lot of infrastructure.

Here's the thing, though: you don't have to install and use all of it. WSO2's platform supports unbundled adoption, meaning you can deploy just the AI Gateway as a standalone component without the rest of the platform. The Envoy AI Gateway is open source and independently deployable. Start routing requests immediately with just LLM and MCP governance. If your API management team wants unified governance across AI and traditional APIs, bring in the broader platform. If you're building autonomous agents, the Agent Platform is there when you need it.

This is the real differentiator against the rest of this list. Most LiteLLM alternatives fall into one of two buckets: lightweight proxies with minimal operational overhead that stay lightweight (OpenRouter, Helicone), or full enterprise platforms that require full enterprise commitment from day one. WSO2's unbundled approach means you get a LiteLLM-weight starting point with a clear growth path into enterprise governance features, without re-platforming when your needs change.

Other LiteLLM alternatives worth evaluating

WSO2 fits a specific profile: enterprise teams that need governed, self-hostable AI infrastructure with verified compliance credentials. But the landscape is broader, and different teams have different constraints. Here are the other gateways appearing in best LiteLLM alternatives 2026 lists right now.

Portkey bundles gateway routing, observability, guardrails, and cost tracking into a unified platform. It supports 200+ AI models through an OpenAI-compatible single API with real-time logging and trace analysis. Portkey's open source gateway can be self-hosted, and the platform has added MCP support for connecting to remote MCP servers. It appears in virtually every LiteLLM alternatives comparison, which signals broad model access and market awareness but also means it's the default recommendation. Default doesn't mean best fit.

Kong AI Gateway extends Kong's existing API gateway with AI-specific plugins for rate limiting, prompt management, and intelligent routing. Kong now offers native support for MCP traffic governance through dedicated AI MCP Proxy and OAuth2 plugins. Kong has deep enterprise penetration and a long track record in API management, the kind of track record that matters when you've just been burned by a startup's compliance theater. The AI capabilities are plugin-based rather than purpose-built, but teams already running Kong have a natural adoption path with minimal setup and no new infrastructure.

Cloudflare AI Gateway runs at the edge, adding caching, rate limiting, and usage analytics to LLM API calls. Minimal setup if you're already in Cloudflare's ecosystem, and the edge deployment model means lower latency for geographically distributed teams. The AI-specific governance features (guardrails, PII masking, fine-grained compliance controls) are thinner than purpose-built AI gateways. Cloudflare's overall compliance posture is strong (they hold their own SOC 2), but the AI Gateway product is relatively young.

OpenRouter provides a unified interface across multiple LLM providers with an OpenAI-compatible API. It's the quickest path to broad model access without building your own abstraction layer, and for teams that used LiteLLM primarily for provider normalization, it's the most direct functional replacement. It's a gateway layer for routing, not a governance platform. If you need guardrails, compliance, or observability, you'll need to layer those on separately.

Helicone started as an LLM observability platform (request logging, latency tracking, cost analysis, prompt debugging) but has expanded into an OpenAI-compatible gateway with smart routing across 100+ AI models, automatic failover, and unified billing. It's open source and offers self-hosted deployment, which checks an important box post-LiteLLM. Helicone still excels at observability, but it's growing into a broader middleware layer. Enterprise features like compliance and governance remain thinner than enterprise-focused alternatives.

Comparison at a glance

| Gateway | Self-hosted | Open source | MCP support | Extensibility | Verified compliance | Primary strength |
|---|---|---|---|---|---|---|
| WSO2 AI Gateway | Yes | Yes (Apache 2.0) | Yes (proxy + hub/portal + REST-to-MCP) | Custom Go-based AI guardrails | SOC 2 Type 2, ISO 27001 | Full AI traffic governance |
| Portkey | Yes (OSS gateway) | Yes (gateway) | Yes (remote servers) | Config-driven (Node.js) | Not public | Production observability + routing |
| Kong AI Gateway | Yes | Partial (OSS core) | Yes (proxy + OAuth plugins) | Lua plugins | Kong Inc. SOC 2 | Enterprise API management |
| Cloudflare AI Gateway | No | No | No | Workers (JS/TS) | Cloudflare SOC 2 | Edge performance + caching |
| OpenRouter | No | No | No | None (routing service) | Not public | Quick multi-provider access |
| Helicone | Yes | Yes | No | Config-driven (TS) | Not public | Observability + smart routing |

Two patterns stand out. First, MCP support is spreading fast: WSO2, Portkey, and Kong all offer it now, though the depth varies (WSO2 converts REST APIs to MCP and can also proxy existing MCP ingress and egress traffic; Portkey and Kong proxy existing MCP traffic only). Second, the gateways built by established infrastructure companies (WSO2, Kong, Cloudflare) have verifiable compliance credentials. Many of the newer alternatives haven't published theirs. After the Delve revelation, that distinction carries more weight than it did a month ago.

How to migrate away from LiteLLM

If you're running LiteLLM at production scale today, switching gateways is a real engineering project. Not a one-line change. But most production teams discover they're using 20% of LiteLLM's features, and that narrows the replacement scope significantly.

Step 1: Audit your current usage. Before evaluating replacements, document exactly what LiteLLM does in your stack. Which AI models do you route requests to? What routing logic is active: simple failover, conditional routing, latency-based selection? What rate limits, retries, or caching are configured? Which environment variables and provider API keys does your application reference? Build a spreadsheet. This audit typically takes a senior engineer half a day and saves weeks of guesswork during migration.
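A quick way to seed that spreadsheet is a grep-style sweep over the codebase. The pattern names and scope below are illustrative; extend them to match your own conventions:

```python
import re
from pathlib import Path

# Illustrative patterns: imports, common call sites (litellm.completion,
# litellm.acompletion, litellm.Router), and LITELLM_* environment variables.
PATTERNS = {
    "imports": re.compile(r"^\s*(?:import litellm|from litellm)", re.M),
    "calls": re.compile(r"litellm\.(?:a?completion|Router)\s*\("),
    "env vars": re.compile(r"LITELLM_[A-Z_]+"),
}

def audit_litellm_usage(root: str) -> dict[str, list[str]]:
    """Inventory which files import, call, or configure litellm."""
    hits: dict[str, list[str]] = {name: [] for name in PATTERNS}
    for path in Path(root).rglob("*.py"):
        text = path.read_text(errors="ignore")
        for name, pattern in PATTERNS.items():
            if pattern.search(text):
                hits[name].append(str(path))
    return hits
```

Pipe the result into your migration spreadsheet; each hit is a row that needs an owner and a replacement plan.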

Step 2: Pin and verify your current LiteLLM version. If you haven't already, pin to a known-safe version (anything before 1.82.7 or after the patched release), verify the package hash against LiteLLM's published checksums, and audit your dependency lock files. Sonatype's analysis provides the specific indicators of compromise to check against. If you find the malicious .pth file in any environment, treat that environment as fully compromised. Rotate all credentials, tokens, and API keys that were accessible.
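A starting point for that check, assuming the compromised versions named above. Note that legitimate packages (setuptools, some editable installs) also ship .pth files with import lines, so a hit means "review this file", not "compromised":

```python
from importlib import metadata
from pathlib import Path

BAD_VERSIONS = {"1.82.7", "1.82.8"}  # the compromised releases described above

def litellm_version_status() -> str:
    """Report whether the installed litellm is a known-compromised release."""
    try:
        version = metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return "litellm not installed"
    return ("KNOWN-COMPROMISED" if version in BAD_VERSIONS
            else f"not a known-bad version ({version})")

def executable_pth_files(site_dir: str) -> dict[str, list[str]]:
    """Map each .pth file under site_dir to its exec()'d import lines.

    Flagged files need human review -- legitimate tooling uses this
    mechanism too, so a hit is not proof of malware.
    """
    hits: dict[str, list[str]] = {}
    for pth in Path(site_dir).glob("*.pth"):
        lines = [line for line in pth.read_text(errors="ignore").splitlines()
                 if line.startswith(("import ", "import\t"))]
        if lines:
            hits[pth.name] = lines
    return hits
```

Run `executable_pth_files` against each entry in `site.getsitepackages()` on every environment that installed litellm during the compromise window, and compare findings against Sonatype's published indicators.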

Step 3: Map features to your replacement. Most AI gateways support OpenAI-compatible API formats, which means the application-side changes are often limited to updating the base URL and swapping provider keys in environment variables. WSO2 AI Gateway's proxy API approach, for instance, lets you route through the gateway by pointing your existing OpenAI SDK calls at the gateway endpoint. No SDK swap required. The complexity comes from recreating routing logic, rate limits, and guardrails in the new platform's configuration model. If you were using LiteLLM's Python SDK for model fallbacks, you'll need to map those to your new gateway's routing configuration, which varies by vendor.
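Shown with a plain HTTP request below (with the official OpenAI SDK, the equivalent is passing `base_url=` to the client constructor), the base-URL swap is the whole application-side change. The gateway hostname and key are placeholders for your deployment:

```python
import json
import urllib.request

# Hypothetical gateway endpoint -- the wire format stays OpenAI-compatible,
# so only the base URL and the credential change.
GATEWAY_BASE = "https://ai-gateway.internal.example.com/v1"

def chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-format chat completion request aimed at the gateway."""
    body = json.dumps({
        "model": model,  # the gateway maps this to a configured provider/route
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{GATEWAY_BASE}/chat/completions",
        data=body,
        headers={
            # Gateway-issued credential, not a raw provider key.
            "Authorization": "Bearer GATEWAY_ISSUED_KEY",
            "Content-Type": "application/json",
        },
    )

request = chat_request("gpt-4o", "ping")
print(request.full_url)
# Send with urllib.request.urlopen(request) once the gateway is reachable.
```

Because the payload is unchanged, existing prompt templates and response parsing carry over; the migration work concentrates in the gateway's routing and policy configuration, not in application code.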

Step 4: Run parallel traffic. Don't cut over all at once. Route a percentage of production traffic through the new gateway while keeping the old path active (on a verified-safe LiteLLM version). Compare latency, production reliability, and cost. Pay particular attention to semantic caching behavior, which can differ significantly between implementations. What LiteLLM cached may not be cached identically by your replacement. Give this at least a week with production traffic patterns before committing.
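The percentage split can be as simple as a weighted coin flip at the call site (endpoint names here are placeholders; production deployments usually do this at the load balancer instead):

```python
import random

def pick_route(new_fraction: float, rng=random.random) -> str:
    """Send roughly new_fraction of traffic through the replacement gateway."""
    return "new-gateway" if rng() < new_fraction else "legacy-litellm"

# Ramp sketch: hold each stage until latency, reliability, and cost
# metrics on the new path match the legacy path.
for stage in (0.05, 0.25, 1.0):
    sample = [pick_route(stage) for _ in range(1_000)]
    print(f"{stage:.0%} stage -> {sample.count('new-gateway')} of 1000 routed new")
```

The extremes are deterministic: a fraction of 1.0 routes everything to the new gateway, 0.0 keeps everything on the legacy path, which makes rollback a one-value change.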

Step 5: Decommission and clean up. Once you've validated the new gateway, remove litellm from your dependencies entirely. Rotate any API keys or tokens that were present in the environment during the compromise window, even if you believe you were running a safe version. Update your CI/CD pipelines to pull from the new gateway's dependency chain. Document the migration for your team: what changed, where the new configuration lives, and how to troubleshoot routing issues.

The entire migration could take as little as 1-3 weeks for a team running LiteLLM with standard routing and automatic failover. Teams with complex custom routing logic or extensive LiteLLM SDK integration should budget 3-5 weeks or more.

Lessons from the LiteLLM incident for AI infrastructure teams

The March 2026 LiteLLM situation wasn't a single failure. It was two failures that compounded: a supply chain compromise that injected malicious code, and a compliance certification that should have increased confidence that relevant controls were in place, but was allegedly based on fabricated evidence.

That compounding is the real takeaway. Security controls exist in layers, and when one layer is fake, the whole stack degrades. A SOC 2 report is supposed to verify that a vendor has enterprise controls around access (such as role-based access control), data retention, change management, and incident response. Those are the exact controls that would have caught or mitigated a CI/CD pipeline compromise. When the report is generated by a platform that produced structurally identical reports across hundreds of clients, those controls were never actually verified.

For teams rethinking AI gateway security and choosing their next platform, a few principles fall out of this:

Verify the compliance chain, not just the badge. Ask who audited, look up the firm, confirm the scope covers the product you're using. WSO2 publishes its compliance details on a dedicated security page with scope specifics. That's the standard to hold vendors to. If a vendor can't tell you who their auditor is and what the audit scope covers, that's a red flag.

Prefer auditable infrastructure. Open source gateways built on established foundations, like WSO2's Envoy-based architecture (available on GitHub), let your security team inspect the code, the build process, and the dependency tree. Closed-source proxies require you to trust the vendor's supply chain on faith. After March 2026, "trust us" is not a security architecture.

Consider self-hosted deployment seriously (especially if using these earlier stage tools). If you control the deployment, you control what version runs, when it updates, and what dependencies it pulls. The LiteLLM compromise affected teams that pulled the latest version from PyPI automatically. Teams running pinned, self-hosted infrastructure had a buffer. Not every team should self-host everything. But having the option for security-critical infrastructure reduces your blast radius.

Audit your transitive dependencies. The LiteLLM attack came through a compromised GitHub Action (trivy-action), not through LiteLLM's own code. Run pip audit, npm audit, or your language's equivalent regularly, and review your CI/CD pipeline's dependency chain with the same scrutiny you apply to your application code. Supply chain attacks exploit the gap between "code we wrote" and "code we run."
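One concrete check beyond `pip audit`: flag any requirement that isn't pinned to an exact version, since unpinned lines are where a poisoned upstream release lands first. The regex below is a rough sketch, not a full requirements parser:

```python
import re

# Matches "name==version" (optionally with extras like pkg[a,b]).
EXACT_PIN = re.compile(r"^[A-Za-z0-9_.\-\[\],]+==\S+$")

def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return requirement lines that aren't pinned with ==."""
    flagged = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not line.startswith("-") and not EXACT_PIN.match(line):
            flagged.append(line)
    return flagged

reqs = """
litellm==1.82.6
requests>=2.31        # floats to whatever PyPI serves next
openai
-r extra.txt
"""
print(unpinned_requirements(reqs))  # -> ['requests>=2.31', 'openai']
```

Pinning alone doesn't stop a tag-rewrite attack on your CI actions, but combined with hash-checking installs it means a newly published malicious version can't reach your builds silently.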

Treat AI gateways as critical infrastructure, not dev tools. LiteLLM started as a convenience library for normalizing LLM provider APIs. Many teams adopted it that way: a pip install in a requirements file, not a reviewed infrastructure decision. But an AI gateway handles API key management, sees your prompts, and sits in the request path between your AI applications and your LLM providers. It deserves the same procurement scrutiny as your database, your auth provider, or your cloud platform.

The AI gateway market is maturing fast. The March 2026 incidents are accelerating that maturation by forcing production teams to ask harder questions about the infrastructure they're trusting with their LLM traffic, API keys, and customer data. That's uncomfortable in the short term. Net positive for the ecosystem.

Ready to evaluate WSO2 AI Gateway for your team? Sign up for the free SaaS tier or explore the open source Envoy AI Gateway on GitHub.