AI Gateway

Monitor, control, and optimize your AI applications

Cloudflare AI Gateway provides centralized visibility and control for your AI applications. Connect your apps with a single line of code to monitor usage, costs, and errors. Reduce risks and expenses through caching, rate limiting, request retries, and model fallbacks. Ensure reliability, scalability, and productivity with minimal effort.

Control costs for all your AI apps

Connect your AI apps to AI Gateway for a unified dashboard, and control costs with usage stats, rate limiting, and caching.

Easy analytics & troubleshooting

Gain visibility into prompts, AI API requests, errors, token usage, costs, and more. Logs are available for auditing and troubleshooting.

Support for the most popular AI providers

Unify the top AI providers, including Hugging Face, OpenAI, Anthropic, and Workers AI, for comprehensive visibility into your AI applications.


Make AI applications observable, reliable, and scalable

By shifting features such as rate limiting, caching, and error handling to the proxy layer, organizations can apply unified configurations across AI apps and inference service providers. AI Gateway sits between your application and the AI provider to give you multivendor AI observability and control.
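Because the gateway sits between the application and the provider, connecting an app typically comes down to pointing the provider client at a gateway URL instead of the provider's own base URL. The sketch below builds such a URL in Python. The URL pattern and the `gateway_base_url` helper are assumptions for illustration, and the identifiers are placeholders; check the current AI Gateway documentation for the exact endpoint format before relying on it.

```python
from urllib.parse import quote

# Assumed base URL pattern; confirm against the current AI Gateway docs.
GATEWAY_ROOT = "https://gateway.ai.cloudflare.com/v1"


def gateway_base_url(account_id: str, gateway_id: str, provider: str) -> str:
    """Build the base URL that stands in for the provider's own API base URL.

    Hypothetical helper: each path segment is percent-encoded so that
    unexpected characters cannot alter the path structure.
    """
    parts = [quote(p, safe="") for p in (account_id, gateway_id, provider)]
    return "/".join([GATEWAY_ROOT, *parts])


if __name__ == "__main__":
    # Placeholder identifiers, not real account values.
    print(gateway_base_url("ACCOUNT_ID", "my-gateway", "openai"))
```

Many provider SDKs accept a base URL option at client construction, so the returned string would usually be passed there and the rest of the application code left unchanged.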

What our customers are saying

"Without AI Gateway, it’s difficult to see which applications are driving the majority of the costs with the OpenAI API … We can choose to limit the number of requests used by certain tools to control costs."


Top AI Gateway use cases

Cloudflare AI Gateway helps you monitor, control, and optimize your AI applications


Real-time insights and reliability with logs, metrics, rate limiting, caching, and monitoring.

Effortlessly connect the most popular providers, including Workers AI, Hugging Face, OpenAI, Anthropic, and more, with just one line of code.

Optimize costs and reduce latency with custom caching. Control scaling and prevent excessive activity with rate limiting.
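To make the caching and rate limiting ideas concrete, here is a minimal conceptual sketch in Python of what a proxy layer does with each request: answer from a time-limited cache when it can, and otherwise enforce a request budget before calling the provider. This is not Cloudflare's implementation. The `TTLCache`, `FixedWindowLimiter`, and `handle` names are hypothetical, and a fixed-window counter is just one simple rate limiting strategy chosen for brevity.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # drop the stale entry
            return None
        return value

    def put(self, key: str, value: str) -> None:
        self._store[key] = (self.clock() + self.ttl, value)


class FixedWindowLimiter:
    """Hypothetical limiter: at most `limit` calls per window_seconds."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._window_start = clock()
        self._count = 0

    def allow(self) -> bool:
        now = self.clock()
        if now - self._window_start >= self.window:
            # A new window has started, so reset the counter.
            self._window_start = now
            self._count = 0
        if self._count >= self.limit:
            return False
        self._count += 1
        return True


def handle(prompt: str, cache: TTLCache, limiter: FixedWindowLimiter,
           call_provider: Callable[[str], str]) -> str:
    """Serve from cache if possible; otherwise rate-limit, call, and cache."""
    cached = cache.get(prompt)
    if cached is not None:
        return cached  # no provider call, so no added cost or latency
    if not limiter.allow():
        raise RuntimeError("rate limit exceeded")
    response = call_provider(prompt)
    cache.put(prompt, response)
    return response
```

In this sketch a cache hit never consumes rate limit budget, which is why repeated identical prompts reduce both cost and the chance of hitting the limit.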

Helping organizations worldwide monitor, control, and scale their AI solutions

Get AI Gateway for your enterprise