Bring Your Own Model (BYOM) in TurboStream
Published December 17, 2025

One of the most common requests we’ve gotten while building TurboStream has been simple:
“Can I use my own LLM instead of yours?”
As of today, the answer is yes.
TurboStream now supports Bring Your Own Model (BYOM).
You can plug in your own LLM provider by setting API keys in your env.local file — no code changes required.
This keeps TurboStream flexible, cost-conscious, and aligned with how most developers actually work.
Why BYOM?
TurboStream sits at an awkward but important intersection:
- WebSockets are fast producers
- LLMs are slow, expensive consumers
Because of that, model choice matters a lot:
- Latency affects real-time usefulness
- Token pricing affects how aggressively you can analyze streams
- Model behavior affects alert quality
BYOM lets you:
- use the model you already trust
- control your costs
- swap providers without changing your pipeline
How It Works
TurboStream abstracts LLM providers behind a common interface.
You configure providers using environment variables.
You only need one provider configured, but you can configure multiple and switch between them as needed.
No vendor lock-in. No hardcoded SDKs in your app code.
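Conceptually, the abstraction looks something like the sketch below. This is an illustrative TypeScript sketch under assumed names (LLMProvider, resolveProvider, and the specific key variables), not TurboStream's actual internals.

```ts
// Illustrative sketch only, not TurboStream's actual internals.
// The stream-processing pipeline depends on this interface, never on a vendor SDK.
interface LLMProvider {
  name: string;
  analyze(prompt: string): Promise<string>;
}

// Hypothetical adapter: wraps one vendor behind the common interface.
// Stubbed here so the sketch stays self-contained and runnable.
function makeStubProvider(name: string): LLMProvider {
  return {
    name,
    analyze: async (prompt) => `[${name}] would analyze: ${prompt}`,
  };
}

// Hypothetical selection: whichever provider's keys are present is used.
// The real variable names are whatever your provider's prefix defines.
function resolveProvider(env: NodeJS.ProcessEnv): LLMProvider {
  if (env.OPENAI_API_KEY) return makeStubProvider("openai");
  if (env.ANTHROPIC_API_KEY) return makeStubProvider("anthropic");
  throw new Error("No LLM provider configured");
}

const provider = resolveProvider(process.env);
provider.analyze("summarize the last 500 WebSocket events").then(console.log);
```

The point of the sketch is the boundary: your application code talks to one interface, and configuration decides which vendor sits behind it.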
Supported LLM Providers
| Provider | Environment Variables | Get API Key |
|---|---|---|
| Azure OpenAI | AZURE_OPENAI_* | Azure Portal |
| OpenAI | OPENAI_* | https://platform.openai.com |
| Anthropic (Claude) | ANTHROPIC_* | https://console.anthropic.com |
| Google Gemini | GOOGLE_* | https://aistudio.google.com |
| Mistral | MISTRAL_* | https://console.mistral.ai |
| xAI (Grok) | XAI_* | https://console.x.ai |
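For example, an env.local that configures two providers side by side might look roughly like this. The table above only documents the variable prefixes, so the exact variable names and values below are assumptions for illustration; check the provider-specific docs for the real ones.

```
# Hypothetical env.local sketch; only the prefixes above (OPENAI_*, ANTHROPIC_*, ...)
# are documented, the exact variable names and model values may differ.
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o-mini

# A second provider can sit alongside the first and be switched to later:
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-3-5-sonnet-latest
```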
Switching Between Providers
If you configure multiple providers, TurboStream allows you to switch models without changing your stream processing logic.
This is useful for:
- benchmarking latency (time-to-first-token, total generation time)
- comparing alert quality across models
- balancing cost vs accuracy for different streams
Model selection is explicit and transparent — no silent fallbacks.
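As a rough illustration of the kind of benchmarking this enables, here is a self-contained TypeScript sketch that measures time-to-first-token and total generation time across two providers. The streaming interface and stub providers are assumptions made for the example, not TurboStream's API.

```ts
// Illustrative benchmarking sketch with an assumed streaming interface;
// not TurboStream's actual API.
interface StreamingProvider {
  name: string;
  stream(prompt: string): AsyncIterable<string>; // yields tokens as they arrive
}

// Stub provider so the sketch runs without real API keys; delayMs simulates
// the provider's time-to-first-token.
function stubProvider(name: string, delayMs: number): StreamingProvider {
  return {
    name,
    async *stream(_prompt: string) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      for (const token of ["alert:", "queue", "depth", "rising"]) yield token;
    },
  };
}

// Measure time-to-first-token (TTFT) and total generation time for one provider.
async function benchmark(provider: StreamingProvider, prompt: string): Promise<void> {
  const start = performance.now();
  let ttft: number | null = null;
  for await (const _token of provider.stream(prompt)) {
    if (ttft === null) ttft = performance.now() - start;
  }
  const total = performance.now() - start;
  console.log(`${provider.name}: ttft=${ttft?.toFixed(0)}ms total=${total.toFixed(0)}ms`);
}

(async () => {
  for (const p of [stubProvider("provider-a", 120), stubProvider("provider-b", 300)]) {
    await benchmark(p, "analyze the last 500 WebSocket events");
  }
})();
```

Swapping the stubs for two real configured providers gives you a like-for-like latency comparison without touching the rest of your pipeline.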