Cloudflare AI
Live from the Edge

Real inference, running on Cloudflare Workers AI right now. Type a prompt, pick a model, and see the response stream in from the nearest GPU.
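The response streams in token by token over Server-Sent Events. A minimal sketch of how a client could extract tokens from the stream; the `data: {"response": "..."}` payload shape and the `[DONE]` sentinel are assumptions about the Workers AI event format, not this demo's actual source:

```typescript
// Hypothetical SSE token extractor. Assumes each event carries a JSON
// payload of the form {"response": "<token>"} and that the stream ends
// with a literal "[DONE]" sentinel.
function extractTokens(sseChunk: string): string[] {
  const tokens: string[] = [];
  for (const line of sseChunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // assumed end-of-stream marker
    try {
      const parsed = JSON.parse(payload) as { response?: string };
      if (parsed.response) tokens.push(parsed.response);
    } catch {
      // ignore partial or non-JSON frames
    }
  }
  return tokens;
}
```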


Input


Response


Texts to Embed


Enter 2–10 text snippets. We'll generate embeddings and show how similar they are to each other.
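The scores shown are cosine similarity between the embedding vectors. A minimal sketch of the standard computation (not the demo's actual source):

```typescript
// Cosine similarity between two embedding vectors: dot product
// divided by the product of their magnitudes.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Pairwise similarity matrix over all snippets' embeddings.
function similarityMatrix(vectors: number[][]): number[][] {
  return vectors.map((v) => vectors.map((w) => cosine(v, w)));
}
```

Identical directions score 1.0, orthogonal vectors score 0, so the matrix diagonal is always 1.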

Similarity Matrix


What Just Happened

Your prompt was sent to a Cloudflare Worker running on the nearest edge node. The Worker called Workers AI, which routed the request to a GPU in one of 200+ cities. The response streamed back via Server-Sent Events. No server to manage, no GPU to provision. Cost: fractions of a cent.
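A Worker like the one described above fits in a few lines. This is a hedged sketch, not this demo's source: the model name and the shape of the `env.AI` binding are assumptions based on Cloudflare's published Workers AI API.

```typescript
// Hypothetical Worker: forward the prompt to Workers AI and stream the
// model's tokens back to the browser as Server-Sent Events.
interface Env {
  AI: {
    run(model: string, inputs: Record<string, unknown>): Promise<ReadableStream>;
  };
}

export async function handleRequest(request: Request, env: Env): Promise<Response> {
  const { prompt } = (await request.json()) as { prompt: string };

  // stream: true asks Workers AI for an SSE-formatted ReadableStream
  // instead of a buffered completion. Model name is an assumption.
  const stream = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
    prompt,
    stream: true,
  });

  return new Response(stream, {
    headers: { "content-type": "text/event-stream" },
  });
}

export default { fetch: handleRequest };
```

Because the Worker just pipes the stream through, the browser sees tokens as soon as the GPU emits them.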

Zero Cold Start

Workers AI models are always warm: no container spin-up, no Lambda-style cold start. The first token arrives in milliseconds.

Global by Default

Your request was handled by the nearest edge location. Users in Johannesburg hit Johannesburg GPUs; users in London hit London GPUs.

Pay per Neuron

$0.011 per 1,000 neurons. No minimums, no reservations. 10,000 free neurons/day on the free tier.
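Using the numbers above, a day's bill is simple arithmetic. A sketch (real billing granularity may differ; only the rate and free tier come from the text):

```typescript
// Rate and free tier as stated above.
const RATE_PER_1K_NEURONS = 0.011; // USD per 1,000 neurons
const FREE_NEURONS_PER_DAY = 10_000;

// Daily cost in USD: everything past the free tier is billed at the rate.
function dailyCostUSD(neuronsUsed: number): number {
  const billable = Math.max(0, neuronsUsed - FREE_NEURONS_PER_DAY);
  return (billable / 1000) * RATE_PER_1K_NEURONS;
}
```

A day that burns 50,000 neurons costs 40 × $0.011 = $0.44; a day under 10,000 neurons costs nothing.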
