# Cloud Horizon AI / Quickstart

First request in five minutes.

One base URL change to the OpenAI SDK, one API key. The same client code your team already wrote, now hitting EU infrastructure under Dutch law. Five steps, copy-pasteable.
## Step 1: Sign up

Join the waitlist. Approved accounts get a workspace and a personal API key within one business day.
## Step 2: Grab an API key

Workspace dashboard, settings, API keys. Keys are scoped to one workspace and revocable independently.
## Step 3: Point your SDK

Set `base_url` to `https://api.cloudhorizons.ai/v1` in the OpenAI client. Nothing else changes.
## Step 4: Pick a model

`kimi-k2.5`, `glm-4.6`, `qwen-3-coder`, `minimax-m2.5`, `llama-3.3-70b`, `mistral-large-3`. Same identifiers in every SDK.
## Step 5: Send a request

Same `chat.completions` call, same response shape. Streaming works. Function calling works. Embeddings live on a separate endpoint.
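With `stream=True`, the response arrives as chunks whose `choices[0].delta` each carry a slice of the message, and joining the deltas reconstructs the full reply. A minimal sketch of the consumer side, using hand-built chunk dicts in place of a live stream (the field names follow the standard OpenAI chunk shape; no network call is made):

```python
# Sketch: reassemble a streamed chat completion from its delta chunks.
# The dicts below are stand-ins for what the API sends with stream=True;
# only the fields used here are shown.

def join_deltas(chunks):
    """Concatenate the content deltas of a chunk stream into one string."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):  # role-only and final chunks carry no content
            parts.append(delta["content"])
    return "".join(parts)

fake_stream = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "The policy "}}]},
    {"choices": [{"delta": {"content": "covers data residency."}}]},
    {"choices": [{"delta": {}}]},  # final chunk: finish_reason only
]

print(join_deltas(fake_stream))  # The policy covers data residency.
```

Against the live API, the loop body is the same: iterate the return value of `client.chat.completions.create(..., stream=True)` and read `chunk.choices[0].delta.content`.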
## Code examples

Four flavors. The Python and Node ones use the official OpenAI SDK with the base URL switched; LangChain does the same through `langchain_openai`; the curl one is for shell smoke-testing.
Python (openai)
from openai import OpenAI
client = OpenAI(
base_url="https://api.cloudhorizons.ai/v1",
api_key="ch_live_...", # your Cloud Horizon key
)
response = client.chat.completions.create(
model="kimi-k2.5",
messages=[{"role": "user", "content": "Summarize this policy doc."}],
extra_headers={
"Cloud-Horizons-Region": "eu-ams-1",
"Cloud-Horizons-Audit-Tag": "tenant=acme;workflow=policy",
},
)
print(response.choices[0].message.content) Node (openai)
```javascript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.cloudhorizons.ai/v1',
  apiKey: process.env.CLOUD_HORIZONS_API_KEY,
})

const response = await client.chat.completions.create(
  {
    model: 'kimi-k2.5',
    messages: [{ role: 'user', content: 'Summarize this policy doc.' }],
  },
  {
    headers: {
      'Cloud-Horizons-Region': 'eu-ams-1',
      'Cloud-Horizons-Audit-Tag': 'tenant=acme;workflow=policy',
    },
  },
)
console.log(response.choices[0].message.content)
```

### curl (shell)
```shell
curl https://api.cloudhorizons.ai/v1/chat/completions \
  -H "Authorization: Bearer $CLOUD_HORIZONS_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Cloud-Horizons-Region: eu-ams-1" \
  -H "Cloud-Horizons-Audit-Tag: tenant=acme;workflow=policy" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Summarize this policy doc."}]
  }'
```

### LangChain (Python)
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.cloudhorizons.ai/v1",
    api_key="ch_live_...",
    model="glm-4.6",
    default_headers={
        "Cloud-Horizons-Region": "eu-fra-1",
        "Cloud-Horizons-Audit-Tag": "tenant=acme",
    },
)
print(llm.invoke("Compare GDPR Article 30 and DORA records of processing.").content)
```

## Common headers
All optional, accepted on every request; no surcharge except the zero-retention path. Set them per request, not globally, so each call carries only the tags it needs.
| Header | Allowed values | What it does |
|---|---|---|
| `Cloud-Horizons-Region` | `eu-ams-1`, `eu-fra-1` | Pins a request to a specific EU region. Returns 422 if the chosen model is not deployed there; no silent failover across borders. |
| `Cloud-Horizons-Audit-Tag` | free-form `key=value` pairs | Attached to the audit log. Use for tenant isolation, workflow attribution, customer ID, anything procurement asks you to trace later. |
| `Cloud-Horizons-Log-Retention` | `0d`, `7d`, `30d`, `90d` | Overrides the default 30-day retention. `0d` is the zero-retention path: prompt and response are never written to disk; 5 percent surcharge. |
| `Cloud-Horizons-Redact-Pii` | `true` / `false` | Strips names, emails, phone numbers, IBAN, BSN, and NHS numbers, and IP addresses before the model sees the prompt; originals are restored in the response on the way back. |
| `Cloud-Horizons-Tenant` | opaque tenant ID | Logical workspace partition for multi-tenant apps. Enforces per-tenant rate limits and audit log isolation independent of the API key. |
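A hypothetical helper (illustration only, not part of any SDK) that assembles these headers and rejects values the table does not allow, so a typo fails locally instead of at the API:

```python
# Sketch: build the optional Cloud-Horizons-* header dict for one request.
# ALLOWED_* sets mirror the values in the table above.

ALLOWED_REGIONS = {"eu-ams-1", "eu-fra-1"}
ALLOWED_RETENTION = {"0d", "7d", "30d", "90d"}

def build_headers(region=None, audit_tag=None, retention=None,
                  redact_pii=None, tenant=None):
    headers = {}
    if region is not None:
        if region not in ALLOWED_REGIONS:
            raise ValueError(f"unknown region: {region}")
        headers["Cloud-Horizons-Region"] = region
    if audit_tag is not None:
        headers["Cloud-Horizons-Audit-Tag"] = audit_tag
    if retention is not None:
        if retention not in ALLOWED_RETENTION:
            raise ValueError(f"retention must be one of {sorted(ALLOWED_RETENTION)}")
        headers["Cloud-Horizons-Log-Retention"] = retention
    if redact_pii is not None:
        headers["Cloud-Horizons-Redact-Pii"] = "true" if redact_pii else "false"
    if tenant is not None:
        headers["Cloud-Horizons-Tenant"] = tenant
    return headers

# Zero-retention request with PII redaction, pinned to Amsterdam:
print(build_headers(region="eu-ams-1", retention="0d", redact_pii=True))
```

Pass the resulting dict as `extra_headers` in the Python SDK, or in the per-request `headers` option in Node.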
## Error codes

Same shape as the OpenAI error model: the error object has `code`, `message`, and `type` fields.
| HTTP | Code | Meaning |
|---|---|---|
| 401 | `unauthorized` | API key missing or revoked. Check the workspace dashboard. |
| 403 | `forbidden` | Key valid, but the model or region is not enabled for this plan. |
| 422 | `unprocessable` | Region pin requested for a model not deployed there. Switch model or drop the pin. |
| 429 | `too_many_requests` | Rate limit hit. The `Cloud-Horizons-RateLimit-Reset` header says when to retry. |
| 498 | `redaction_failed` | PII redaction toggled on but the redactor could not parse the input. Retry without the flag. |
| 503 | `model_overloaded` | Inference capacity saturated. Retry with exponential backoff or fail over to a sibling model. |
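Only 429 and 503 are worth retrying; the 4xx configuration errors will fail the same way again. A minimal backoff sketch under stated assumptions: `ApiError` is a stand-in for whatever your HTTP layer raises, and the flaky call is simulated locally rather than hitting the API:

```python
import random
import time

class ApiError(Exception):
    """Stand-in for your HTTP layer's error; carries the status code."""
    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status

RETRYABLE = {429, 503}  # too_many_requests, model_overloaded

def with_backoff(call, max_attempts=5, base_delay=0.5):
    """Retry a zero-argument `call` on 429/503 with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ApiError as err:
            # 401/403/422/498 are configuration errors: do not retry.
            if err.status not in RETRYABLE or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))

# A flaky stand-in that fails twice with 503, then succeeds:
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ApiError(503)
    return "ok"

print(with_backoff(flaky, base_delay=0.01))  # ok
```

The jitter term spreads retries out so that many clients hitting the same 503 do not all come back in lockstep.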
## Migrating from OpenAI in production

A staged cutover for teams already running OpenAI in production. The shape we recommend:

1. **Wrap the OpenAI client construction.** If you have `new OpenAI(...)` sprinkled across the codebase, centralize it: one factory function, one place to flip `base_url`.
2. **Run a 5 percent shadow.** Send 5 percent of production traffic to Cloud Horizon, log responses to a side table, and compare token counts and latency. Most teams find no drift on the chat completions path.
3. **Map your custom OpenAI features.** If you use vision, audio, or specific tool-call schemas, check our docs for parity. We support the chat-completions, embeddings, and tool-call paths. We do not yet have audio.
4. **Cut over one tenant or workspace at a time.** Per-tenant config is the cleanest unit. Migrate the EU customers first; leave US tenants on OpenAI until residency catches up to your roadmap.
5. **Decommission the OpenAI keys you no longer need.** Reduce blast radius. Once a tenant is fully on Cloud Horizon, revoke the OpenAI key for that workspace.
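The factory wrapper from step 1 can be sketched as one function that decides, per tenant, which provider a client talks to. The `Tenant` and `make_client_config` names are illustration, not part of any SDK; the sketch returns the kwargs you would pass to `OpenAI(...)` rather than constructing a client, so it needs no network or SDK install:

```python
import os
from dataclasses import dataclass

CLOUD_HORIZON_BASE_URL = "https://api.cloudhorizons.ai/v1"

@dataclass
class Tenant:
    tenant_id: str
    on_cloud_horizon: bool  # flipped per tenant during the staged cutover

def make_client_config(tenant):
    """Single place that decides which provider a tenant's client hits.

    Returns kwargs for OpenAI(...). Step 4's per-tenant cutover becomes
    one boolean flip here instead of a codebase-wide search.
    """
    if tenant.on_cloud_horizon:
        return {
            "base_url": CLOUD_HORIZON_BASE_URL,
            "api_key": os.environ.get("CLOUD_HORIZONS_API_KEY", ""),
            "default_headers": {"Cloud-Horizons-Tenant": tenant.tenant_id},
        }
    # Default base_url -> api.openai.com; no extra headers.
    return {"api_key": os.environ.get("OPENAI_API_KEY", "")}

print(make_client_config(Tenant("acme", True))["base_url"])
```

At the call sites, `OpenAI(**make_client_config(tenant))` replaces every hand-built constructor.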
## Where to go next

- **Read the docs.** Full API reference, error model, streaming, function calling, embeddings.
- **Pricing breakdown.** Plans, per-model token rates, the live token-economics calculator.
- **Security details.** Encryption, audit logs, sub-processors, incident response. The page procurement asks for.
- **Compare alternatives.** Side-by-side vs Mistral La Plateforme, OpenRouter, Aleph Alpha.
- **Models catalog.** Six open-weights models with strengths, weaknesses, recommended use cases.
- **EU SaaS deep dive.** How an EU SaaS team adopts Cloud Horizon end-to-end.
## Stuck on something?

Email [email protected]. Personal plan: 2-business-day SLA. Team plan: 4-hour weekday Slack response. Enterprise: named technical account manager.