For the complete documentation index, see llms.txt.
Introduction
CrofAI is an OpenAI-compatible API with lower cost. Use the same SDK, access multiple models, and pay less for inference.
Base URL: https://crof.ai/v1
Authentication
All API requests require an API key passed in the Authorization header. You can create and manage keys from your dashboard.
curl https://crof.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key" \
  -d '{
    "model": "glm-5",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Quickstart
If you already use the OpenAI SDK, you're ready. Just swap the base URL and API key.
from openai import OpenAI

client = OpenAI(
    base_url="https://crof.ai/v1",
    api_key="sk-your-api-key",
)

response = client.chat.completions.create(
    model="glm-5",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
)
print(response.choices[0].message.content)
Chat Completions
Create a model response for a given chat conversation. This is the primary endpoint for interacting with language models.
Endpoint
/v1/chat/completions
Supported Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | ID of the model to use. See Models for available options. |
| messages | array | Yes | A list of messages comprising the conversation. Each message has a role and content. |
| max_tokens | integer | No | The maximum number of tokens to generate. Defaults to model-specific limits. |
| temperature | number | No | Sampling temperature between 0 and 2. Lower values are more deterministic. |
| top_p | number | No | Nucleus sampling threshold. Alternative to temperature. |
| stop | string or array | No | One or more sequences where generation will stop. |
| seed | integer | No | Seed for deterministic sampling. |
| stream | boolean | No | If true, responses are streamed as server-sent events. See Streaming. |
| tools | array | No | A list of functions the model may call. See Tool Use. |
| response_format | object | No | Enforce JSON output schema. See Structured Outputs. |
| reasoning_effort | string | No | Controls thinking depth for reasoning models. Accepted values: "low", "medium", "high", "none". See Reasoning Models. |
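As a rough illustration of how the optional parameters in the table combine, the sketch below assembles a request body and an unsent request using only the Python standard library. The helper name `build_chat_request` and the sample values are illustrative assumptions, not part of the CrofAI API; only the parameter names, the endpoint path, and the Bearer header come from this page.

```python
import json
import urllib.request

BASE_URL = "https://crof.ai/v1"

# Optional parameters documented in the table above.
OPTIONAL_PARAMS = {
    "max_tokens", "temperature", "top_p", "stop", "seed",
    "stream", "tools", "response_format", "reasoning_effort",
}


def build_chat_request(api_key, model, messages, **options):
    """Return an unsent POST request for /chat/completions (hypothetical helper)."""
    unknown = set(options) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"Undocumented parameters: {sorted(unknown)}")
    body = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_chat_request(
    "sk-your-api-key",
    "glm-5",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.2,   # lower values are more deterministic
    max_tokens=256,    # cap on generated tokens
    stop=["\n\n"],     # stop at the first blank line
    seed=42,           # seed for deterministic sampling
)
print(request.full_url)
print(json.loads(request.data))
```

Nothing is sent until the request is passed to `urllib.request.urlopen(request)`, so the body can be inspected first.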
Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1712345678,
  "model": "glm-5",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Quantum computing uses qubits..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 47,
    "total_tokens": 59
  }
}
Models
List all available models. Each model supports different context windows, output limits, and pricing tiers.
Endpoint
/v1/models
Response
Each model object includes context length, pricing, quantization, and an estimated speed in tokens per second.
{
  "context_length": 163840,
  "created": 1755799640,
  "id": "deepseek-v3.2",
  "max_completion_tokens": 163840,
  "name": "DeepSeek: DeepSeek V3.2",
  "pricing": {
    "completion": "0.00000038",
    "prompt": "0.00000028"
  },
  "quantization": "Q4_0",
  "speed": 50
}
For a complete list of models with pricing, see the pricing page.
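To show how the `pricing` fields might be combined with the `usage` block of a chat response, here is a small cost estimate. It assumes the price strings are dollars per token, which this page does not state explicitly, and it pairs the sample pricing above with the token counts from the sample chat response purely for illustration. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal


def estimate_cost(pricing, usage):
    """Estimate the cost of one request, assuming per-token prices."""
    prompt_cost = Decimal(pricing["prompt"]) * usage["prompt_tokens"]
    completion_cost = Decimal(pricing["completion"]) * usage["completion_tokens"]
    return prompt_cost + completion_cost


# Values copied from the sample model object and the sample chat response.
pricing = {"completion": "0.00000038", "prompt": "0.00000028"}
usage = {"prompt_tokens": 12, "completion_tokens": 47, "total_tokens": 59}

cost = estimate_cost(pricing, usage)
print(f"Estimated cost: {cost}")
```

`Decimal` is used instead of `float` so that very small per-token prices are not distorted by binary rounding.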
Errors
CrofAI uses standard HTTP status codes to indicate success or failure. Error responses include a JSON body with details.
| Code | Meaning |
|---|---|
| 400 | Bad Request: invalid parameters or malformed JSON. |
| 401 | Unauthorized: invalid or missing API key. |
| 402 | Payment Required: insufficient credits or expired plan. |
| 429 | Rate Limited: too many requests. |
| 500 | Server Error: something went wrong on our end. Try again. |
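One possible way to act on these codes is sketched below: treat 429 and 500 as worth retrying with a growing delay, and summarize the JSON error body for everything else. The retry policy, the backoff schedule, and the helper names are assumptions for illustration; this page only says that a 500 can be tried again and that error responses carry a JSON body of the shape shown in this section.

```python
import json

# Status codes from the table above that may succeed on a later attempt.
RETRYABLE_STATUS = {429, 500}


def is_retryable(status_code):
    """Return True if retrying the same request might help."""
    return status_code in RETRYABLE_STATUS


def backoff_delays(attempts, base_seconds=1.0):
    """Exponential backoff schedule: base, 2*base, 4*base, ..."""
    return [base_seconds * (2 ** i) for i in range(attempts)]


def describe_error(status_code, body_text):
    """Summarize an error response, tolerating a missing or non-JSON body."""
    try:
        error = json.loads(body_text).get("error", {})
    except (ValueError, TypeError, AttributeError):
        error = {}
    if not isinstance(error, dict):
        error = {}
    code = error.get("code", "unknown")
    message = error.get("message", "no message")
    return f"{status_code} {code}: {message}"


sample_body = (
    '{"error": {"message": "Invalid API key", '
    '"type": "invalid_request_error", "code": "invalid_api_key"}}'
)
print(describe_error(401, sample_body))
print(is_retryable(401), is_retryable(429))
print(backoff_delays(3))
```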
{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
Streaming
Set stream: true to receive responses as server-sent events. Each event contains a delta with the next chunk of generated text.
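For clients that do not use an SDK, the events can be read line by line. The sketch below assumes CrofAI follows the OpenAI wire format, in which each event is a `data: {json}` line and the stream ends with `data: [DONE]`; this page does not spell that format out, so treat it as an assumption. The function name `iter_deltas` and the sample lines are illustrative.

```python
import json


def iter_deltas(lines):
    """Yield text deltas from an iterable of decoded server-sent event lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream marker
        event = json.loads(payload)
        for choice in event.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample events standing in for a real response stream.
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
full_text = "".join(iter_deltas(sample_lines))
print(full_text)
```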
from openai import OpenAI

client = OpenAI(
    base_url="https://crof.ai/v1",
    api_key="sk-your-api-key",
)

stream = client.chat.completions.create(
    model="glm-5",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)
for chunk in stream:
    # Skip chunks that carry no choices or no text.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
Vision Models
Several models support image input. Pass images as URLs or base64-encoded data URLs in the message content array. Vision-capable models are labeled on the pricing page with an icon.
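For a local file, the base64-encoded data URL mentioned above can be built with the standard library. This is a minimal sketch: the helper names `to_data_url` and `image_message` are hypothetical, and the placeholder bytes stand in for a real image.

```python
import base64


def to_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_message(prompt, image_url):
    """Build a user message combining text and one image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# Placeholder bytes stand in for the contents of a real image file,
# e.g. open("photo.png", "rb").read().
fake_image = b"abc"
message = image_message("What's in this image?", to_data_url(fake_image))
print(message["content"][1]["image_url"]["url"])
```

The resulting message can be passed in the `messages` list in place of one that uses a plain image URL.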
# Assumes `client` is configured as in the Quickstart.
response = client.chat.completions.create(
    model="kimi-k2.5",  # vision models are labeled on the pricing page
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {"url": "https://example.com/image.png"}
            }
        ]
    }],
)
Reasoning Models
Reasoning models think before responding. You can control how much compute they use with the reasoning_effort parameter.
| Value | Behavior |
|---|---|
| "none" | Disables reasoning entirely. Lowest latency and cost. |
| "low" | Minimal thinking. Good for simpler tasks. |
| "medium" | Balanced. Suitable for most use cases. |
| "high" | Maximum compute budget. Best for complex problems. |
Reasoning tokens are streamed via delta.reasoning_content and final output via delta.content.
from openai import OpenAI

client = OpenAI(
    base_url="https://crof.ai/v1",
    api_key="sk-your-api-key",
)

response = client.chat.completions.create(
    model="MODEL-FROM-LIST",
    messages=[{"role": "user", "content": "Solve this step by step: what is 17 * 38?"}],
    stream=True,
    reasoning_effort="low",  # "low" | "medium" | "high" | "none"
)
for chunk in response:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    # reasoning_content may be absent on some chunks, so read it defensively.
    reasoning = getattr(delta, "reasoning_content", None)
    if reasoning:
        print(reasoning, end="", flush=True)
    if delta.content:
        print(delta.content, end="", flush=True)
print()
Tool Use
Pass a tools array to let the model call functions you define. The model will return a tool call with the function name and arguments when appropriate.
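The model only names the function and supplies its arguments; running the function is left to your code. A minimal sketch of that dispatch step follows, assuming the arguments arrive as a JSON-encoded string as in the OpenAI format. The registry, the `dispatch_tool_call` helper, and the stand-in `get_horoscope` body are all hypothetical.

```python
import json


def get_horoscope(sign):
    """Stand-in implementation; a real one would look the value up."""
    return f"{sign}: expect a productive day."


# Maps tool names the model may request to local Python callables.
TOOL_REGISTRY = {"get_horoscope": get_horoscope}


def dispatch_tool_call(name, arguments_json):
    """Run the named tool with JSON-encoded arguments and return its result."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    arguments = json.loads(arguments_json or "{}")
    return TOOL_REGISTRY[name](**arguments)


result = dispatch_tool_call("get_horoscope", '{"sign": "Aquarius"}')
print(result)
```

Rejecting unknown names before calling anything keeps the model from invoking functions you did not declare.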
from openai import OpenAI

client = OpenAI(
    base_url="https://crof.ai/v1",
    api_key="sk-your-api-key",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_horoscope",
        "description": "Get today's horoscope for an astrological sign.",
        "parameters": {
            "type": "object",
            "properties": {
                "sign": {
                    "type": "string",
                    "description": "An astrological sign like Taurus or Aquarius",
                },
            },
            "required": ["sign"],
            "additionalProperties": False,
        },
        "strict": True,
    },
}]

messages = [{"role": "user", "content": "What is my horoscope? I am an Aquarius."}]

stream = client.chat.completions.create(
    model="MODEL-FROM-LIST",
    messages=messages,
    tools=tools,
    stream=True,
)
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
    if delta.tool_calls:
        for tc in delta.tool_calls:
            # The name may be missing on later fragments, which carry only arguments.
            if tc.function and tc.function.name:
                print(f"\nTool call: {tc.function.name}")
            if tc.function and tc.function.arguments:
                print(f"Args: {tc.function.arguments}")
Structured Outputs
Use response_format with a JSON schema to enforce structured output. The model will return valid JSON matching your schema.
from openai import OpenAI

client = OpenAI(
    base_url="https://crof.ai/v1",
    api_key="sk-your-api-key",
)

response = client.chat.completions.create(
    model="MODEL-FROM-LIST",
    messages=[
        {"role": "user", "content": "List 3 planets with their diameter in km and whether they have rings."}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "planet_list",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "planets": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "diameter_km": {"type": "number"},
                                "has_rings": {"type": "boolean"}
                            },
                            "required": ["name", "diameter_km", "has_rings"],
                            "additionalProperties": False
                        }
                    }
                },
                "required": ["planets"],
                "additionalProperties": False
            }
        }
    }
)
print(response.choices[0].message.content)
Prompt Caching
Repeated prompts are automatically cached, reducing input token costs by up to 80%. Cached tokens are billed at the $ / M cache rate shown on the pricing page.
No configuration is needed. Caching is enabled by default for all models that support it.
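To make the billing arithmetic concrete, the sketch below computes an input bill when part of a prompt is served from cache. The two rates are made-up numbers, not CrofAI prices: $0.28 per million input tokens and a cache rate 80% lower at $0.056 per million. The function name `input_cost` is hypothetical.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def input_cost(prompt_tokens, cached_tokens, price_per_m, cache_price_per_m):
    """Input cost in dollars when part of the prompt is served from cache."""
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        Decimal(uncached_tokens) * Decimal(price_per_m)
        + Decimal(cached_tokens) * Decimal(cache_price_per_m)
    )
    return total / TOKENS_PER_MILLION


# Hypothetical rates: $0.28 / M input, $0.056 / M cached (80% lower).
without_cache = input_cost(10_000, 0, "0.28", "0.056")
with_cache = input_cost(10_000, 8_000, "0.28", "0.056")
print(f"Without cache: ${without_cache}")
print(f"With 8,000 cached tokens: ${with_cache}")
```

In this made-up case the input bill falls from $0.0028 to $0.001008, a 64% reduction, because only the cached 80% of the prompt receives the discounted rate.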
Usage API
Check your account's remaining requests and credit balance.
Endpoint
/usage_api/
Response
{
  "usable_requests": 450,
  "credits": 12.3456
}
| Field | Description |
|---|---|
| usable_requests | Requests remaining today. null if not on a subscription plan. |
| credits | Available credit balance. |
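import requests

response = requests.get(
    "https://crof.ai/usage_api/",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=30,
)
response.raise_for_status()  # fail early on an HTTP error status
data = response.json()
# usable_requests is null (None in Python) when not on a subscription plan.
print(f"Requests left: {data['usable_requests']}")
print(f"Credits: {data['credits']}")
OpenCode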
CrofAI can be used as a provider in OpenCode. Create an opencode.json file in your project root (or ~/.config/opencode/opencode.json for global config) with the following contents, then set your API key as an environment variable.
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "CrofAI": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "CrofAI",
      "options": {
        "baseURL": "https://crof.ai/v1",
        "apiKey": "API_KEY_HERE"
      },
      "models": {
        "kimi-k2.5": {
          "name": "CrofAI: kimi-k2.5",
          "limit": { "context": 262144, "output": 262144 }
        },
        "deepseek-v3.2": {
          "name": "CrofAI: deepseek-v3.2",
          "limit": { "context": 163840, "output": 163840 }
        }
      }
    }
  }
}
Set your key with export CROFAI_API_KEY="your-api-key-here", then open OpenCode and select CrofAI as your provider. The full list of supported models is on the pricing page.
Anthropic Endpoint
CrofAI also exposes an Anthropic-compatible endpoint for use with the Anthropic SDK or any tool expecting the native Anthropic API format.
Base URL: https://anthropic.nahcrof.com
Messages endpoint: https://anthropic.nahcrof.com/v1/messages
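As a sketch of what a request in the native Anthropic format could look like against this endpoint, the example below builds (but does not send) a Messages request with the standard library. The `x-api-key` and `anthropic-version` headers and the required `max_tokens` field come from the Anthropic API format; whether CrofAI expects exactly these headers is an assumption, and `build_messages_request` and the model placeholder are illustrative.

```python
import json
import urllib.request

MESSAGES_URL = "https://anthropic.nahcrof.com/v1/messages"


def build_messages_request(api_key, model, prompt, max_tokens=256):
    """Return an unsent request in the native Anthropic Messages format."""
    body = {
        "model": model,
        "max_tokens": max_tokens,  # required by the Anthropic Messages format
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        MESSAGES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": api_key,               # Anthropic-style auth header (assumed accepted)
            "anthropic-version": "2023-06-01",  # version header used by the Anthropic API
        },
        method="POST",
    )


request = build_messages_request("sk-your-api-key", "MODEL-FROM-LIST", "Hello!")
print(request.full_url)
print(json.loads(request.data))
```

With the Anthropic SDK, the equivalent setup would be to point the client's base URL at https://anthropic.nahcrof.com and let the SDK add the /v1/messages path.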