Documentation Index
Fetch the complete documentation index at: https://docs.cowagent.ai/llms.txt
Use this file to discover all available pages before exploring further.
OpenAI offers the most complete coverage and can simultaneously serve text chat, vision understanding, image generation, speech-to-text (ASR), text-to-speech (TTS), and embedding. A single open_ai_api_key lets the Agent use all of these capabilities.
All capabilities below can be configured in one place via the “Model Management” page in the Web Console, with no need to manually edit the configuration file.
Text Chat
{
"model": "gpt-5.5",
"open_ai_api_key": "YOUR_API_KEY",
"open_ai_api_base": "https://api.openai.com/v1"
}
| Parameter | Description |
|---|
model | Same as OpenAI’s model parameter; supports gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, the gpt-5 series, gpt-4.1, the o-series, etc. Agent mode defaults to gpt-5.5; use gpt-5.4 for better cost-efficiency |
open_ai_api_key | Create one on the OpenAI Platform |
open_ai_api_base | Optional; change it to access a third-party proxy |
bot_type | Not required when using OpenAI’s official models; set to openai when accessing other vendors via the compatible protocol |
Image Understanding
OpenAI models like gpt-5.5, gpt-5.4, gpt-4o, and gpt-4.1 natively support vision. Once open_ai_api_key is configured, the Agent’s Vision tool automatically uses the main model to recognize images. If the main model does not support vision or you want to specify it explicitly, set it in the configuration file:
{
"tools": {
"vision": {
"model": "gpt-5.4-mini"
}
}
}
Supported Vision models: gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-5, gpt-4.1, gpt-4.1-mini, gpt-4o.
Image Generation
Specify the image generation model in the configuration file; the Agent automatically routes image generation skill calls to OpenAI:
{
"skills": {
"image-generation": {
"model": "gpt-image-2"
}
}
}
Supported image generation models: gpt-image-2, gpt-image-1.
Speech-to-Text (ASR)
{
"voice_to_text": "openai",
"voice_to_text_model": "gpt-4o-mini-transcribe"
}
| Parameter | Description |
|---|
voice_to_text | Set to openai to enable OpenAI speech-to-text |
voice_to_text_model | Optional, defaults to gpt-4o-mini-transcribe; can also be gpt-4o-transcribe, whisper-1 |
Credentials are automatically reused from open_ai_api_key.
Text-to-Speech (TTS)
{
"text_to_voice": "openai",
"text_to_voice_model": "tts-1",
"tts_voice_id": "alloy"
}
| Parameter | Description |
|---|
text_to_voice_model | tts-1, tts-1-hd, gpt-4o-mini-tts |
tts_voice_id | Voices: alloy, echo, fable, onyx, nova, shimmer, ash, ballad, coral, sage, verse |
Embedding
{
"embedding_provider": "openai",
"embedding_model": "text-embedding-3-small"
}
Available models: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002. After changing the embedding, run /memory rebuild-index to rebuild the index.