
Introduction

CrofAI is an OpenAI-compatible API with lower cost. Use the same SDK, access multiple models, and pay less for inference.

Base URL: https://crof.ai/v1

Authentication

All API requests require an API key passed in the `Authorization` header. You can create and manage keys from your dashboard.

bash

curl https://crof.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key" \
  -d '{
    "model": "glm-5",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
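
To keep the key out of source code, you can read it from the environment and build the same two headers the curl example sends. This is a minimal sketch: the helper name `auth_headers` is hypothetical, and using `CROFAI_API_KEY` as the variable name is an assumption.

```python
import os


def auth_headers(api_key=None):
    """Build the headers sent by the curl example above."""
    # Fall back to the CROFAI_API_KEY environment variable (assumed name).
    key = api_key or os.environ.get("CROFAI_API_KEY")
    if not key:
        raise ValueError("No API key given and CROFAI_API_KEY is not set")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {key}",
    }


headers = auth_headers("sk-your-api-key")
print(headers["Authorization"])
```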

Quickstart

If you already use the OpenAI SDK, you're ready. Just swap the base URL and API key.

python

from openai import OpenAI

client = OpenAI(
    base_url="https://crof.ai/v1",
    api_key="sk-your-api-key",
)

response = client.chat.completions.create(
    model="glm-5",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
)

print(response.choices[0].message.content)

Chat Completions

Create a model response for a given chat conversation. This is the primary endpoint for interacting with language models.

Endpoint

POST /v1/chat/completions

Supported Parameters

Parameter	Type	Required	Description
`model`	string	Yes	ID of the model to use. See Models for available options.
`messages`	array	Yes	A list of messages comprising the conversation. Each message has a `role` and `content`.
`max_tokens`	integer	No	The maximum number of tokens to generate. Defaults to model-specific limits.
`temperature`	number	No	Sampling temperature between 0 and 2. Lower values are more deterministic.
`top_p`	number	No	Nucleus sampling threshold. Alternative to temperature.
`stop`	string | array	No	One or more sequences where generation will stop.
`seed`	integer	No	Seed for deterministic sampling.
`stream`	boolean	No	If true, responses are streamed as server-sent events. See Streaming.
`tools`	array	No	A list of functions the model may call. See Tool Use.
`response_format`	object	No	Enforce JSON output schema. See Structured Outputs.
`reasoning_effort`	string	No	Controls thinking depth for reasoning models. Accepted values: `"low"`, `"medium"`, `"high"`, `"none"`. See Reasoning Models.
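
The table above can be turned into a small client-side check before a request is sent. The sketch below is illustrative only: `build_chat_payload` is a hypothetical helper, and the checks mirror the ranges and accepted values listed in the table rather than any validation the server is known to perform.

```python
ALLOWED_OPTIONS = {
    "max_tokens", "temperature", "top_p", "stop", "seed",
    "stream", "tools", "response_format", "reasoning_effort",
}
REASONING_EFFORTS = {"low", "medium", "high", "none"}


def build_chat_payload(model, messages, **options):
    """Assemble a request body, dropping options left as None."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"Unsupported parameters: {sorted(unknown)}")

    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    effort = options.get("reasoning_effort")
    if effort is not None and effort not in REASONING_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {sorted(REASONING_EFFORTS)}")

    # model and messages are the only required fields.
    payload = {"model": model, "messages": messages}
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = build_chat_payload(
    "glm-5",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.2,
    max_tokens=256,
)
print(payload)
```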

Response

json

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1712345678,
  "model": "glm-5",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Quantum computing uses qubits..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 47,
    "total_tokens": 59
  }
}
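
If you call the endpoint without the SDK, the response body is plain JSON. The following sketch pulls out the fields most callers need; `summarize_completion` is a hypothetical helper and the sample body is the one shown above.

```python
import json

SAMPLE_BODY = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1712345678,
  "model": "glm-5",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Quantum computing uses qubits..."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 12, "completion_tokens": 47, "total_tokens": 59}
}
"""


def summarize_completion(body):
    """Extract the reply text, finish reason, and token usage."""
    first = body["choices"][0]
    usage = body.get("usage", {})
    return {
        "text": first["message"]["content"],
        "finish_reason": first["finish_reason"],
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize_completion(json.loads(SAMPLE_BODY))
print(summary["text"], summary["total_tokens"])
```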

Models

List all available models. Each model supports different context windows, output limits, and pricing tiers.

Endpoint

GET /v1/models

Response

Each model object includes context length, pricing, quantization, and an estimated speed in tokens per second.

json

{
  "context_length": 163840,
  "created": 1755799640,
  "id": "deepseek-v3.2",
  "max_completion_tokens": 163840,
  "name": "DeepSeek: DeepSeek V3.2",
  "pricing": {
    "completion": "0.00000038",
    "prompt": "0.00000028"
  },
  "quantization": "Q4_0",
  "speed": 50
}

For a complete list of models with pricing, see the pricing page.
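
The `pricing` fields can be combined with the `usage` block of a chat response to estimate what a request cost. This sketch assumes the pricing strings are per-token prices (so `"0.00000028"` would correspond to 0.28 per million prompt tokens); that reading is an assumption, and `estimate_cost` is a hypothetical helper. `Decimal` is used to avoid floating-point rounding on very small amounts.

```python
from decimal import Decimal

# Trimmed copy of the model object shown above.
model = {
    "id": "deepseek-v3.2",
    "pricing": {"completion": "0.00000038", "prompt": "0.00000028"},
}


def estimate_cost(model, prompt_tokens, completion_tokens):
    """Estimate request cost, assuming pricing values are per token."""
    pricing = model["pricing"]
    return (
        Decimal(pricing["prompt"]) * prompt_tokens
        + Decimal(pricing["completion"]) * completion_tokens
    )


# 12 prompt tokens and 47 completion tokens, as in the sample chat response.
cost = estimate_cost(model, prompt_tokens=12, completion_tokens=47)
print(cost)
```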

Errors

CrofAI uses standard HTTP status codes to indicate success or failure. Error responses include a JSON body with details.

Code	Meaning
`400`	Bad Request — invalid parameters or malformed JSON.
`401`	Unauthorized — invalid or missing API key.
`402`	Payment Required — insufficient credits or expired plan.
`429`	Rate Limited — too many requests.
`500`	Server Error — something went wrong on our end. Try again.

json

{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
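
One way to act on these codes is to retry only the transient ones. The sketch below retries `429` and `500` with exponential backoff and returns everything else immediately. The backoff schedule, the retry limit, and the `send` callable (any function returning a status code and a parsed JSON body) are assumptions for illustration, not documented CrofAI behavior.

```python
import time

RETRYABLE_STATUSES = {429, 500}


def call_with_retries(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical callable returning (status_code, json_body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts:
            return status, body
        # Wait 1x, 2x, 4x ... base_delay between attempts.
        sleep(base_delay * 2 ** (attempt - 1))


def error_message(body):
    """Read the message from an error body shaped like the example above."""
    return body.get("error", {}).get("message", "Unknown error")


# Demo with canned responses instead of real HTTP calls.
canned = iter([(429, {"error": {"message": "Too many requests"}}), (200, {"ok": True})])
status, body = call_with_retries(lambda: next(canned), sleep=lambda seconds: None)
print(status, body)
```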

Streaming

Set `stream: true` to receive responses as server-sent events. Each event contains a delta with the next chunk of generated text.

python

from openai import OpenAI

client = OpenAI(
    base_url="https://crof.ai/v1",
    api_key="sk-your-api-key",
)

stream = client.chat.completions.create(
    model="glm-5",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)

for chunk in stream:
    # Skip chunks that carry no choices before reading the delta.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
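
If you need the full reply as a string rather than printing as you go, you can accumulate the deltas. This is a sketch under the assumption that chunks follow the OpenAI SDK shape (`choices[0].delta.content`); `collect_text` and `fake_chunk` are hypothetical names, and the stand-in chunks exist only so the example runs without a network call.

```python
from types import SimpleNamespace


def collect_text(chunks):
    """Join the content deltas of a stream into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks may carry no choices
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
    return "".join(parts)


def fake_chunk(content):
    """Stand-in object with the same attribute layout as an SDK chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_stream = [
    fake_chunk("Once "),
    fake_chunk(None),
    fake_chunk("upon a time"),
    SimpleNamespace(choices=[]),
]
print(collect_text(demo_stream))
```

With a real client, pass the object returned by `client.chat.completions.create(..., stream=True)` in place of `demo_stream`.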

Vision Models

Several models support image input. Pass images as URLs or base64-encoded data URLs in the message content array. Vision-capable models are labeled on the pricing page with an icon.

python

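from openai import OpenAI

client = OpenAI(
    base_url="https://crof.ai/v1",
    api_key="sk-your-api-key",
)

response = client.chat.completions.create(
    model="kimi-k2.5",  # vision models are labeled on the pricing page
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {"url": "https://example.com/image.png"}
            }
        ]
    }],
)

print(response.choices[0].message.content)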
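
For a local image, the base64 option means wrapping the file bytes in a data URL and placing it where the HTTP URL would go. A minimal sketch follows; `image_data_url` and `image_message` are hypothetical helpers, and the eight-byte PNG signature stands in for real file contents.

```python
import base64


def image_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_message(prompt, url):
    """Build a user message with one text part and one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": url}},
        ],
    }


# In practice, read the bytes from disk: open("photo.png", "rb").read()
png_signature = b"\x89PNG\r\n\x1a\n"
message = image_message("What's in this image?", image_data_url(png_signature))
print(message["content"][1]["image_url"]["url"])
```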

Reasoning Models

Reasoning models think before responding. You can control how much compute they use with the `reasoning_effort` parameter.

Value	Behavior
`"none"`	Disables reasoning entirely. Lowest latency and cost.
`"low"`	Minimal thinking. Good for simpler tasks.
`"medium"`	Balanced. Suitable for most use cases.
`"high"`	Maximum compute budget. Best for complex problems.

Reasoning tokens are streamed via `delta.reasoning_content` and final output via `delta.content`.

python

from openai import OpenAI

client = OpenAI(
    base_url="https://crof.ai/v1",
    api_key="sk-your-api-key",
)

response = client.chat.completions.create(
    model="MODEL-FROM-LIST",
    messages=[{"role": "user", "content": "Solve this step by step: what is 17 * 38?"}],
    stream=True,
    reasoning_effort="low",  # "low" | "medium" | "high" | "none"
)

for chunk in response:
    try:
        if chunk.choices and chunk.choices[0].delta.reasoning_content:
            print(chunk.choices[0].delta.reasoning_content, end="", flush=True)
    except AttributeError:
        pass
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
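
To keep the model's thinking separate from its answer (for example, to log one and display the other), you can route the two delta fields into separate buffers. This sketch uses `getattr` instead of `try`/`except` because `reasoning_content` may be absent on some chunks; `split_reasoning` and `fake_chunk` are hypothetical names, and the stand-in chunks replace a real stream.

```python
from types import SimpleNamespace


def split_reasoning(chunks):
    """Return (reasoning_text, answer_text) from a stream of chunks."""
    reasoning, answer = [], []
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # reasoning_content may be missing entirely, so default to None.
        thought = getattr(delta, "reasoning_content", None)
        if thought:
            reasoning.append(thought)
        text = getattr(delta, "content", None)
        if text:
            answer.append(text)
    return "".join(reasoning), "".join(answer)


def fake_chunk(**fields):
    """Stand-in object with the same attribute layout as an SDK chunk."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(**fields))])


demo_stream = [
    fake_chunk(reasoning_content="17 * 40 = 680, minus 17 * 2 = 34. ", content=None),
    fake_chunk(content="17 * 38 = "),
    fake_chunk(content="646"),
]
thoughts, final = split_reasoning(demo_stream)
print(final)
```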

Tool Use

Pass a `tools` array to let the model call functions you define. The model will return a tool call with the function name and arguments when appropriate.

python

from openai import OpenAI
import json

client = OpenAI(
    base_url="https://crof.ai/v1",
    api_key="sk-your-api-key",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_horoscope",
        "description": "Get today's horoscope for an astrological sign.",
        "parameters": {
            "type": "object",
            "properties": {
                "sign": {
                    "type": "string",
                    "description": "An astrological sign like Taurus or Aquarius",
                },
            },
            "required": ["sign"],
            "additionalProperties": False,
        },
        "strict": True,
    },
}]

messages = [{"role": "user", "content": "What is my horoscope? I am an Aquarius."}]

stream = client.chat.completions.create(
    model="MODEL-FROM-LIST",
    messages=messages,
    tools=tools,
    stream=True,
)

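for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
    if delta.tool_calls:
        for tc in delta.tool_calls:
            # The name may be set only on the first fragment of a call;
            # later fragments can carry just a piece of the arguments.
            if tc.function and tc.function.name:
                print(f"\nTool call: {tc.function.name}")
            if tc.function and tc.function.arguments:
                print(f"Args: {tc.function.arguments}")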
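
When streaming, the arguments of a tool call can arrive split across several chunks, so they need to be joined before they can be parsed as JSON. The sketch below merges fragments by their `index` field, assuming chunks follow the OpenAI SDK shape. `collect_tool_calls` and `fake_chunk` are hypothetical names, and the stand-in chunks replace a real stream.

```python
import json
from types import SimpleNamespace


def collect_tool_calls(chunks):
    """Merge streamed tool-call fragments into complete calls, by index."""
    calls = {}
    for chunk in chunks:
        if not chunk.choices:
            continue
        for tc in chunk.choices[0].delta.tool_calls or []:
            call = calls.setdefault(tc.index, {"name": "", "arguments": ""})
            if tc.function.name:
                call["name"] = tc.function.name
            if tc.function.arguments:
                call["arguments"] += tc.function.arguments
    # Parse the joined argument strings only after the stream has ended.
    return [
        {"name": call["name"], "arguments": json.loads(call["arguments"] or "{}")}
        for _, call in sorted(calls.items())
    ]


def fake_chunk(index, name=None, arguments=None):
    """Stand-in object with the same attribute layout as an SDK chunk."""
    function = SimpleNamespace(name=name, arguments=arguments)
    tool_call = SimpleNamespace(index=index, function=function)
    delta = SimpleNamespace(content=None, tool_calls=[tool_call])
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_stream = [
    fake_chunk(0, name="get_horoscope", arguments=""),
    fake_chunk(0, arguments='{"sign": '),
    fake_chunk(0, arguments='"Aquarius"}'),
]
calls = collect_tool_calls(demo_stream)
print(calls)
```

Each entry in `calls` can then be dispatched to your own implementation of the named function.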

Structured Outputs

Use `response_format` with a JSON schema to enforce structured output. The model will return valid JSON matching your schema.

python

from openai import OpenAI

client = OpenAI(
    base_url="https://crof.ai/v1",
    api_key="sk-your-api-key",
)

response = client.chat.completions.create(
    model="MODEL-FROM-LIST",
    messages=[
        {"role": "user", "content": "List 3 planets with their diameter in km and whether they have rings."}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "planet_list",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "planets": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "diameter_km": {"type": "number"},
                                "has_rings": {"type": "boolean"}
                            },
                            "required": ["name", "diameter_km", "has_rings"],
                            "additionalProperties": False
                        }
                    }
                },
                "required": ["planets"],
                "additionalProperties": False
            }
        }
    }
)
print(response.choices[0].message.content)
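
The message content arrives as a JSON string, so it still needs to be parsed before use. A minimal sketch, assuming the `planet_list` schema above: `parse_planets` is a hypothetical helper, and the sample string is illustrative rather than real model output.

```python
import json

REQUIRED_FIELDS = {"name", "diameter_km", "has_rings"}


def parse_planets(content):
    """Parse the JSON string and check each planet has the required fields."""
    data = json.loads(content)
    planets = data["planets"]
    for planet in planets:
        missing = REQUIRED_FIELDS - planet.keys()
        if missing:
            raise ValueError(f"Planet entry is missing fields: {sorted(missing)}")
    return planets


sample_content = '{"planets": [{"name": "Saturn", "diameter_km": 116460, "has_rings": true}]}'
for planet in parse_planets(sample_content):
    print(planet["name"], planet["diameter_km"], planet["has_rings"])
```

With a real response, pass `response.choices[0].message.content` in place of `sample_content`.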

Prompt Caching

Repeated prompts are automatically cached, reducing input token costs by up to 80%. Cached tokens are billed at the `$ / M cache` rate shown on the pricing page.

No configuration is needed. Caching is enabled by default for all models that support it.
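
As a worked example of the arithmetic, the sketch below compares the input cost of a prompt with and without cache hits. Every number is hypothetical: the input rate, a cache rate set 80% lower, and the token counts are chosen for illustration, and the actual rates are the ones on the pricing page.

```python
from decimal import Decimal


def prompt_cost(prompt_tokens, cached_tokens, rate_per_million, cache_rate_per_million):
    """Input cost when cached_tokens of the prompt are billed at the cache rate."""
    fresh_tokens = prompt_tokens - cached_tokens
    total = fresh_tokens * rate_per_million + cached_tokens * cache_rate_per_million
    return total / Decimal(1_000_000)


# Hypothetical rates per million tokens; the cache rate is 80% lower.
input_rate = Decimal("0.28")
cache_rate = Decimal("0.056")

without_cache = prompt_cost(10_000, 0, input_rate, cache_rate)
with_cache = prompt_cost(10_000, 8_000, input_rate, cache_rate)
print(without_cache, with_cache)
```

In this scenario 8,000 of 10,000 prompt tokens hit the cache, so the input cost falls from 0.0028 to 0.001008, a 64% reduction. The full 80% applies only when the entire prompt is cached.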

Usage API

Check your account's remaining requests and credit balance.

Endpoint

GET /usage_api/

Response

json

{
  "usable_requests": 450,
  "credits": 12.3456
}

Field	Description
`usable_requests`	Requests remaining today. `null` if not on a subscription plan.
`credits`	Available credit balance.

python

import requests

response = requests.get(
    "https://crof.ai/usage_api/",
    headers={"Authorization": "Bearer YOUR_API_KEY"}
)
data = response.json()
print(f"Requests left: {data['usable_requests']}")
print(f"Credits: {data['credits']}")
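
Because `usable_requests` is `null` for accounts without a subscription plan, code that displays it should handle that case explicitly. A small sketch, with `describe_usage` as a hypothetical helper and the second input invented for illustration:

```python
def describe_usage(data):
    """Turn a Usage API response into a short, human-readable line."""
    requests_left = data.get("usable_requests")
    # JSON null becomes None: the account has no subscription plan.
    if requests_left is None:
        requests_text = "no subscription plan"
    else:
        requests_text = f"{requests_left} requests left today"
    return f"{requests_text}, {data['credits']:.2f} credits"


print(describe_usage({"usable_requests": 450, "credits": 12.3456}))
print(describe_usage({"usable_requests": None, "credits": 3.5}))
```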

OpenCode

CrofAI can be used as a provider in OpenCode. Create an opencode.json file in your project root (or ~/.config/opencode/opencode.json for global config) with the following contents, then set your API key as an environment variable.

json

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "CrofAI": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "CrofAI",
      "options": {
        "baseURL": "https://crof.ai/v1",
        "apiKey": "API_KEY_HERE"
      },
      "models": {
        "kimi-k2.5": {
          "name": "CrofAI: kimi-k2.5",
          "limit": { "context": 262144, "output": 262144 }
        },
        "deepseek-v3.2": {
          "name": "CrofAI: deepseek-v3.2",
          "limit": { "context": 163840, "output": 163840 }
        }
      }
    }
  }
}

Set your key with `export CROFAI_API_KEY="your-api-key-here"`, then open OpenCode and select CrofAI as your provider. The full list of supported models is on the pricing page.

Anthropic Endpoint

CrofAI also exposes an Anthropic-compatible endpoint for use with the Anthropic SDK or any tool expecting the native Anthropic API format.

Base URL: https://anthropic.nahcrof.com

Messages endpoint: https://anthropic.nahcrof.com/v1/messages
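
A request to the Messages endpoint might be assembled as in the sketch below, which uses only the standard library and builds the request without sending it. It assumes the endpoint accepts the same `x-api-key` and `anthropic-version` headers and the same body shape as the native Anthropic API; neither is confirmed here. `build_messages_request` is a hypothetical helper and `MODEL-FROM-LIST` is a placeholder.

```python
import json
import urllib.request

MESSAGES_URL = "https://anthropic.nahcrof.com/v1/messages"


def build_messages_request(api_key, model, prompt, max_tokens=256):
    """Build (but do not send) a POST request in the Anthropic Messages format."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        MESSAGES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": api_key,  # assumed to match the native API
            "anthropic-version": "2023-06-01",  # assumed to match the native API
        },
        method="POST",
    )


request = build_messages_request("sk-your-api-key", "MODEL-FROM-LIST", "Hello!")
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. With the Anthropic SDK, the equivalent setup is to pass the base URL above as the client's `base_url`.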