Real inference, running on Cloudflare Workers AI right now. Type a prompt, pick a model, and see the response stream in from the nearest GPU.
Enter 2–10 text snippets. We'll generate embeddings and show how similar they are to each other.
Workers AI models are always warm. No container spin-up, no Lambda cold starts. First token arrives in milliseconds.
Your request was handled by the nearest edge location. Users in Johannesburg hit Johannesburg GPUs. Users in London hit London.
$0.011 per 1,000 neurons. No minimums, no reservations. 10,000 free neurons/day on the free tier.