Configuration
Ways to provide configuration
There are multiple ways to provide configuration to Erato.
All sources of configuration are merged into one, so configuration can be specified via e.g. erato.toml as well as via environment variables.
As of now, no precedence order is specified for values that different sources provide for the same configuration key.
erato.toml files
The erato.toml file is the preferred way to provide configuration for Erato.
The file must be placed in the current working directory of the Erato process.
In the Helm chart, the secret from which the erato.toml file is mounted can be specified via backend.configFile.
*.auto.erato.toml files
In addition to the main erato.toml file, Erato will also auto-discover all files matching the pattern *.auto.erato.toml in the current working directory.
This is useful if you e.g. want to split out all the secret values (LLM API keys, Database credentials) into a different file that is not checked into source control.
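As a sketch of such a split, assuming that merging happens per key and using a hypothetical file name secrets.auto.erato.toml, the main file could hold the non-secret provider settings:

```toml
# erato.toml (checked into source control)
[chat_providers]
priority_order = ["main"]

[chat_providers.providers.main]
provider_kind = "openai"
model_name = "gpt-4"
```

The auto-discovered file would then only add the secret value:

```toml
# secrets.auto.erato.toml (hypothetical name; excluded from source control)
[chat_providers.providers.main]
api_key = "sk-..."
```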
Environment variables
Configuration can also be provided via environment variables.
Though it is not recommended, values for nested configuration can also be provided via environment variables.
In that case, each nesting level is separated by double underscores (__). E.g. CHAT_PROVIDER__BASE_URL is equivalent to chat_provider.base_url.
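The same mapping applies to any nested key. A minimal sketch using server.encryption_key, with a placeholder value:

```toml
# Nested form in erato.toml:
[server]
encryption_key = "replace-with-openssl-output"

# The same key as an environment variable (one "__" per nesting level):
#   SERVER__ENCRYPTION_KEY=replace-with-openssl-output
```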
Common config options
prompt
Several configuration fields accept prompts in the same structure:
- string — plain prompt text used directly as the prompt value.
- object — a prompt source specification with fields:
  - source: "static" or "langfuse".
  - prompt: prompt text used when source = "static".
  - prompt_name: Langfuse prompt identifier used when source = "langfuse".
  - label: optional Langfuse environment/label value.
  - fallback: optional fallback prompt text if the Langfuse prompt cannot be loaded.
The prompt format also supports template placeholders where applicable.
Configuration reference
🚧Work in progress; Covers ~80% of available configuration options 🚧
server
server.encryption_key
Optional encryption key used by the backend for encrypting application data before storing it in the database.
The key must be base64-encoded and must decode to exactly 32 bytes. This matches the AES-256-GCM-SIV key size used by the backend.
To generate a suitable key:
openssl rand -base64 32
When providing this value via an environment variable, use SERVER__ENCRYPTION_KEY.
Default value: None
Type: string | None
Example
[server]
encryption_key = "replace-with-openssl-output"
frontend
frontend.theme
The name of the theme to use for the frontend.
When provided, the theme must be part of the frontend bundle (usually located in the public directory), and placed in the custom-theme directory, under the name provided here.
E.g. if frontend.theme is set to my-theme, the theme must be placed in public/custom-theme/my-theme.
See Theming for more information about themes and theme directory structure.
If not provided, the default bundled theme will be used.
Default value: None
Type: string | None
Example
[frontend]
theme = "my-theme"
frontend.translation_po_compilation_mode
Controls how frontend Lingui translation catalogs are served. precompiled serves existing messages.json files. just_in_time compiles sibling messages.po files when messages.json is requested and caches each compiled result for the server lifetime.
Default value: precompiled
Type: precompiled | just_in_time
Example
[frontend]
translation_po_compilation_mode = "just_in_time"
frontend.disable_upload
Whether to disable file upload functionality in the UI. This is useful for embedded scenarios where file uploads should not be available.
Default value: false
Type: boolean
Example
[frontend]
disable_upload = true
frontend.disable_chat_input_autofocus
Whether to disable automatic focusing of the chat input field. This prevents unwanted scrolling behavior when navigating to pages with embedded chat widgets.
When enabled, the chat input field will not automatically receive focus on page load/navigation.
Default value: false
Type: boolean
Example
[frontend]
disable_chat_input_autofocus = true
frontend.disable_logout
Whether to hide logout functionality from the UI. This is useful for embedded scenarios where logout should be handled by the parent application.
When enabled, the logout button/option will be hidden from the user interface.
Default value: false
Type: boolean
Example
[frontend]
disable_logout = true
frontend.enable_message_feedback
Whether to enable message feedback functionality in the UI. When enabled, users can submit thumbs up/down ratings with optional comments for individual messages.
This feature allows collecting user feedback on message quality, which can be useful for improving model performance and understanding user satisfaction.
Default value: false
Type: boolean
Example
[frontend]
enable_message_feedback = true
frontend.enable_message_feedback_comments
Whether to enable the comment text field in message feedback. When enabled, users can add optional text comments along with their thumbs up/down ratings.
This setting requires enable_message_feedback to be true to have any effect.
Default value: false
Type: boolean
Example
[frontend]
enable_message_feedback = true
enable_message_feedback_comments = true
frontend.message_feedback_edit_time_limit_seconds
Time limit in seconds for editing message feedback after creation. When set, feedback can only be edited within this time window after being submitted. When not set (default), feedback can be edited at any time.
This setting only affects editing existing feedback - initial feedback submission is always allowed regardless of this setting. Attempts to edit feedback after the time limit will return a 403 Forbidden error.
Default value: None (unlimited editing)
Type: integer (seconds)
Example
[frontend]
enable_message_feedback = true
message_feedback_edit_time_limit_seconds = 300 # 5 minutes
frontend.sidebar_collapsed_mode
Controls the behavior of the collapsed sidebar state.
When set to "slim", the collapsed sidebar shows icon-only navigation elements. When set to "hidden" (default), the sidebar is completely hidden when collapsed.
Default value: "hidden"
Type: string
Supported values:
- "hidden" - Sidebar is completely hidden when collapsed (default behavior)
- "slim" - Sidebar shows icon-only navigation when collapsed
Example:
[frontend]
sidebar_collapsed_mode = "slim"
frontend.sidebar_logo_path
Optional path to a sidebar-specific logo file. When provided, this logo is displayed in the sidebar header instead of the theme logo. Useful for showing a compact logo variant in the sidebar.
Default value: None (uses theme logo)
Type: string | None
Example:
[frontend]
sidebar_logo_path = "/custom-theme/my-company/sidebar-logo.svg"
frontend.sidebar_logo_dark_path
Optional path to a sidebar-specific logo file for dark mode. When provided along with sidebar_logo_path, this logo is displayed in the sidebar when dark mode is active.
Default value: None (uses sidebar_logo_path or theme logo)
Type: string | None
Example:
[frontend]
sidebar_logo_path = "/custom-theme/my-company/sidebar-logo.svg"
sidebar_logo_dark_path = "/custom-theme/my-company/sidebar-logo-dark.svg"
frontend.additional_environment
Additional values to inject into the frontend environment as global variables.
These will be made available to the frontend JavaScript, and added to the window object.
This is a dictionary where each value can be a string or a map (string key, string value).
This may be useful if you are using a forked version of the frontend, which you need to pass some configuration to.
Default value: None
Type: object<string, any>
Example
[frontend]
additional_environment = { FOO = "bar" }
This will be injected into the frontend as:
window.FOO = "bar";
frontend.extra_frame_ancestors
Additional Content-Security-Policy frame-ancestors sources that may embed the frontend.
Erato always includes 'self'. When the Microsoft Office add-in is enabled, Erato also includes https://outlook.office.com.
Default value: []
Type: array<string>
Example
[frontend]
extra_frame_ancestors = ["https://outlook.cloud.microsoft"]
frontend.allow_any_frame_ancestor
Whether to omit the frame-ancestors directive from the frontend Content-Security-Policy header.
Use this only when Erato must be embeddable from unrestricted parent pages.
Default value: false
Type: boolean
Example
[frontend]
allow_any_frame_ancestor = true
frontend.component_kits.directory
Directory containing runtime-loaded frontend component kits.
Each direct subdirectory is treated as one component kit. For each kit, Erato
serves files under /public/component-kits/<kit-name>/ and injects a root-level
index-<hash>.js module script before the main frontend bundle. If a root-level
.css file is present, it is injected as a stylesheet in the same kit order.
The backend infers kit metadata from files and directories. There is no manifest file in the current component-kit format.
Default value: /app/component-kits
Type: string
Example
[frontend.component_kits]
directory = "/app/component-kits"
i18n
Internationalization-related settings used by the backend when determining user language preferences.
i18n.language
Configuration for language detection and defaulting.
i18n.language.language_detection_priority
Ordered list of language sources to evaluate when resolving a user’s preferred language.
id_token_any expands to checking id_token_xms_pl then id_token_xms_tpl in that order.
Supported values:
- id_token_any
- id_token_xms_pl
- id_token_xms_tpl
- browser_accept_language
Type: array<string>
Default value: ["id_token_any", "browser_accept_language"]
Any value omitted in the order is treated as unsupported for that source and detection continues to the next value.
i18n.language.default_language
Fallback language used when none of the configured sources yields a match.
Type: string
Default value: "en"
Example:
[i18n]
[i18n.language]
language_detection_priority = ["id_token_xms_tpl", "browser_accept_language"]
default_language = "de"
guardrails
Global guardrail definitions used by chat-provider filtering. Guardrail definitions do not run by themselves; enable them through chat_providers.all_providers.guardrails or a provider-specific chat_providers.providers.<provider-id>.guardrails block.
guardrails.prompt_patterns.<pattern-id>
Defines a prompt-injection pattern that can later be selected by ID or by tag.
Type: object
Fields:
- type - Matching strategy. Supported values: "fixed" for substring matching, "regex" for Rust regex crate regular expressions.
- pattern - The substring or regular expression to detect.
- language - Optional BCP-47-style language hint such as "en".
- tags - Tags used to select groups of patterns, for example ["input", "prompt_injection"].
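A regex-based pattern might look like the following sketch. The pattern ID and the expression itself are illustrative assumptions, not patterns shipped with Erato; a TOML literal string (single quotes) avoids double-escaping the backslashes.

```toml
# Hypothetical pattern ID; matches case-insensitively and tolerates
# variable whitespace between the words.
[guardrails.prompt_patterns.ignore_previous_instructions_regex]
type = "regex"
pattern = '(?i)ignore\s+(all\s+)?previous\s+instructions'
language = "en"
tags = ["input", "prompt_injection"]
```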
Example:
[guardrails.prompt_patterns.ignore_previous_instructions]
type = "fixed"
pattern = "ignore all previous instructions"
language = "en"
tags = ["input", "prompt_injection"]
chat_provider (deprecated)
⚠️ Deprecated: Please use chat_providers.providers.<provider-id> instead for multiple provider support and better flexibility.
Configuration for a single chat provider (LLM) that Erato will use for generating responses. This is maintained for backward compatibility but will be automatically migrated to the new chat_providers format.
chat_providers
Configuration for multiple chat providers with priority ordering. This is the recommended way to configure chat providers as it supports multiple providers, fallback mechanisms, and better flexibility.
chat_providers.priority_order
An ordered list of provider IDs that defines the priority for provider selection. The first provider in the list has the highest priority.
Type: array of strings
Required: Yes
Example: ["primary", "backup", "local"]
chat_providers.all_providers.guardrails.filter_input_prompt_injection
Default prompt-injection filtering settings for all configured chat providers.
If chat_providers.providers.<provider-id>.guardrails is configured for a provider, that provider-level block completely overrides this all_providers block instead of merging with it.
Type: object
Fields:
- enabled (default: false) - Enables prompt-injection filtering before requests are sent to the model.
- filter_pattern_ids (default: []) - Explicit pattern IDs from guardrails.prompt_patterns to apply.
- filter_pattern_tags (default: []) - Pattern tags to apply.
Example:
[chat_providers.all_providers.guardrails.filter_input_prompt_injection]
enabled = true
filter_pattern_tags = ["input"]
chat_providers.providers.<provider-id>
A map of provider configurations, where each key is a unique provider ID and the value is a provider configuration object.
Type: object
Required: Yes
Example:
[chat_providers]
priority_order = ["primary", "backup"]
[chat_providers.providers.primary]
provider_kind = "openai"
model_name = "gpt-4"
model_display_name = "GPT-4 (Primary)"
model_description = "Best quality for complex tasks"
model_icon = "builtin-chatgpt"
api_key = "sk-primary-key"
[chat_providers.providers.primary.model_capabilities]
context_size_tokens = 8192
supports_image_understanding = false
cost_input_tokens_per_1m = 30.0
cost_output_tokens_per_1m = 60.0
[chat_providers.providers.backup]
provider_kind = "openai"
model_name = "gpt-3.5-turbo"
model_display_name = "GPT-3.5 Turbo (Backup)"
model_description = "Faster and cheaper fallback model"
model_icon = "simpleicons-anthropic"
api_key = "sk-backup-key"
base_url = "https://api.backup-provider.com/v1/"
[chat_providers.providers.backup.model_capabilities]
context_size_tokens = 16385
supports_image_understanding = false
cost_input_tokens_per_1m = 1.0
cost_output_tokens_per_1m = 2.0
Provider Configuration Fields
Each provider in chat_providers.providers supports the following fields:
- provider_kind - The type of chat provider ("openai", "azure_openai", "ollama")
- model_name - The model identifier for the provider
- model_display_name - Optional display name for the model (falls back to model_name)
- model_description - Optional frontend description shown below the model name; translatable via chat_models.<chat-provider-id>.description
- model_icon - Optional frontend icon id, for example iconoir-*, simpleicons-*, builtin-chatgpt, or /path/to/icon.svg
- model_name_langfuse - Optional model name for Langfuse reporting (falls back to model_name)
- api_key - Optional API key for authentication
- base_url - Optional custom base URL for the provider API
- system_prompt - Optional prompt source specification (string or object; see prompt)
- system_prompt_langfuse - Optional Langfuse prompt configuration (deprecated, mutually exclusive with system_prompt)
- additional_request_parameters - Optional array of additional request parameters
- additional_request_headers - Optional array of additional request headers
- hallucination_suppression - Optional settings to suppress non-substantive outputs
- guardrails - Optional provider-level prompt-injection filtering settings
- model_capabilities - Optional configuration for model capabilities and limitations
- model_settings - Optional configuration for model behavior and generation settings
chat_providers.providers.<provider-id>.provider_kind
The type of chat provider to use.
Type: string
Supported values: "openai", "azure_openai"
Example: "openai"
chat_providers.providers.<provider-id>.model_name
The name of the model to use with the chat provider.
Type: string
Example: "gpt-4o"
chat_providers.providers.<provider-id>.model_display_name
The display name for the model shown to users. If not provided, falls back to model_name.
Type: string | None
Example: "GPT-4 Omni (Production)"
chat_providers.providers.<provider-id>.model_description
An optional description shown below the model display name in the frontend model selector.
The frontend resolves this through a dynamic Lingui translation key using the chat provider id:
chat_models.<chat-provider-id>.description. If no translation exists, the configured value is used as the fallback text.
Type: string | None
Example: "Fast, lower-cost model for everyday questions"
chat_providers.providers.<provider-id>.model_icon
An optional icon identifier shown next to the model in the frontend model selector.
Supported formats:
- iconoir-* for icons resolved from the iconoir-react library
- simpleicons-* for icons resolved from the simple-icons library, for example simpleicons-anthropic
- builtin-chatgpt for the built-in ChatGPT/OpenAI mark
- /path/to/icon.svg for a custom SVG asset served by the frontend
Type: string | None
Example: "builtin-chatgpt"
chat_providers.providers.<provider-id>.model_name_langfuse
The model name to report to Langfuse for observability and tracing. If not provided, falls back to model_name.
This is useful when the provider’s model name differs from the standardized name used in Langfuse. For example, Azure OpenAI uses deployment names that may differ from the underlying model name, or you may want to use a consistent naming convention across different providers for better analytics in Langfuse.
Type: string | None
Example:
[chat_providers.providers.azure-gpt4]
provider_kind = "azure_openai"
model_name = "gpt-4-deployment-prod-v1" # Azure deployment name
model_name_langfuse = "gpt-4" # Standardized name for Langfuse
chat_providers.providers.<provider-id>.base_url
The base URL for the chat provider API. If not provided, will use the default for the provider.
Type: string | None
Example: "https://api.openai.com/v1/", "http://localhost:11434/v1/"
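As a sketch, a provider pointing at a locally hosted endpoint could be configured as follows. This assumes the local server exposes an OpenAI-compatible API; the provider ID "local" and the model name are placeholders.

```toml
[chat_providers]
priority_order = ["local"]

# "local" is a hypothetical provider ID; the model name is a placeholder.
[chat_providers.providers.local]
provider_kind = "openai"
model_name = "my-local-model"
base_url = "http://localhost:11434/v1/"
```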
chat_providers.providers.<provider-id>.api_key
The API key for the chat provider.
Type: string | None
Example: "sk-..."
chat_providers.providers.<provider-id>.system_prompt
A system prompt specification to use with the chat provider. This sets the behavior and personality of the AI assistant.
Type: string | object | None
For the shared prompt format, see prompt.
This option is mutually exclusive with system_prompt_langfuse (deprecated). Requires the Langfuse integration to be enabled when using source = "langfuse".
Example:
[chat_providers.providers.main]
provider_kind = "openai"
model_name = "gpt-4"
system_prompt = "You are a helpful assistant that provides concise and accurate answers."
Using Langfuse:
[chat_providers.providers.main]
provider_kind = "openai"
model_name = "gpt-4"
system_prompt = { source = "langfuse", prompt_name = "assistant-prompt-v1", label = "production", fallback = "You are a helpful assistant." }
Using a static object:
[chat_providers.providers.main]
provider_kind = "openai"
model_name = "gpt-4"
system_prompt = { source = "static", prompt = "You are a helpful assistant that provides concise and accurate answers." }
chat_providers.providers.<provider-id>.system_prompt_langfuse
Configuration for using a system prompt from Langfuse prompt management instead of a static prompt.
Deprecated: Use chat_providers.providers.<provider-id>.system_prompt with a prompt source specification instead.
Note: This option is mutually exclusive with system_prompt. Requires the Langfuse integration to be enabled.
chat_providers.providers.<provider-id>.hallucination_suppression.enabled
Controls whether hallucination suppression is enabled for this provider.
When enabled, the model output is monitored for suspicious whitespace-only churn that can indicate degraded generation behavior.
Type: boolean
Default value: false
Example: true
chat_providers.providers.<provider-id>.hallucination_suppression.whitespace_delta_threshold
The number of successive whitespace-only text deltas after which generation is aborted to avoid low-quality or stalled output.
Lower values are more aggressive and may truncate responses earlier; higher values tolerate more whitespace-only deltas before suppression triggers.
Type: number
Default value: 20
Example: 15
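Put together, the two hallucination_suppression keys could be set as in the sketch below. The provider ID "main" is a placeholder, and the table layout is inferred from the key paths above.

```toml
[chat_providers.providers.main.hallucination_suppression]
enabled = true
# Abort after 15 successive whitespace-only deltas instead of the default 20.
whitespace_delta_threshold = 15
```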
chat_providers.providers.<provider-id>.guardrails.filter_input_prompt_injection
Provider-specific prompt-injection filtering settings.
When this block is configured for a provider, it completely overrides chat_providers.all_providers.guardrails for that provider. Matching runs after prompt composition and before each request turn is sent to the model. Text content parts are scanned, and tool-call outputs are scanned field-by-field when the tool output is JSON.
If a match is found, generation is aborted with a content-filter-style error that includes the offending pattern ID and the matched text in structured error details.
Type: object | None
Fields:
- enabled (default: false) - Enables prompt-injection filtering for this provider.
- filter_pattern_ids (default: []) - Explicit pattern IDs from guardrails.prompt_patterns to apply.
- filter_pattern_tags (default: []) - Pattern tags to apply.
Example:
[guardrails.prompt_patterns.ignore_previous_instructions]
type = "fixed"
pattern = "ignore all previous instructions"
language = "en"
tags = ["input", "prompt_injection"]
[chat_providers.providers.main.guardrails.filter_input_prompt_injection]
enabled = true
filter_pattern_ids = ["ignore_previous_instructions"]
filter_pattern_tags = []
chat_providers.providers.<provider-id>.model_capabilities
The model_capabilities field allows you to configure the capabilities and limitations of each chat provider’s model. This information is used for token usage estimation, cost calculation, and determining which features are available.
Type: object | None
Example:
[chat_providers.providers.main.model_capabilities]
context_size_tokens = 128000
supports_image_understanding = true
supports_reasoning = false
supports_reasoning_summary = true
supports_encrypted_reasoning_content = true
supports_audio_input = false
supports_verbosity = false
cost_input_tokens_per_1m = 5.0
cost_output_tokens_per_1m = 15.0
Fields:
- context_size_tokens (default: 1000000) - Maximum number of tokens that may be provided to the model including input messages, system prompt, and files
- supports_image_understanding (default: false) - Whether the model supports being provided with images for understanding
- supports_audio_input (default: false) - Whether the model supports binary audio input for the audio transcription, dictation, and conversational audio modes
- supports_reasoning (default: false) - Whether the model supports reasoning mode (e.g., OpenAI o1 family)
- supports_reasoning_summary (default: true) - Whether the model supports returning reasoning summaries when reasoning is enabled
- supports_encrypted_reasoning_content (default: true) - Whether the model supports requesting encrypted reasoning content for stateless reasoning replay
- supports_verbosity (default: false) - Whether the model supports providing a verbosity parameter (for future support of advanced models)
- cost_input_tokens_per_1m (default: 0.0) - Price per 1 million input tokens (unit-less, for cost estimation)
- cost_output_tokens_per_1m (default: 0.0) - Price per 1 million output tokens (unit-less, for cost estimation)
If not explicitly configured, all capabilities default to conservative values (large context window, no special features, no cost tracking).
chat_providers.providers.<provider-id>.model_settings
The model_settings field allows you to configure how the model should behave during generation. While model_capabilities describes what a model can do, model_settings controls what the model should do.
Type: object | None
Example:
[chat_providers.providers.image-gen.model_settings]
generate_images = true
temperature = 0.2
top_p = 0.9
reasoning_effort = "minimal"
verbosity = "high"
compat_omit_strict = false
Fields:
- generate_images (default: false) - When set to true, the model will generate images instead of text. This is useful for image generation models like DALL-E. The user’s message text will be used as the prompt for image generation. The generated image is automatically downloaded, stored as a file upload, and added to the chat history as an image file pointer.
- compat_omit_strict (default: false) - When set to true, requests will omit explicit tool schema strict fields for MCP tools. This can help when provider gateways reject strict-mode tool metadata.
- temperature (default: None) - Optional sampling temperature for generation (higher values increase randomness).
- top_p (default: None) - Optional nucleus sampling parameter to control diversity.
- reasoning_effort (default: None) - Optional reasoning effort for supported models. Values: "none", "minimal", "low", "medium", "high".
- verbosity (default: None) - Optional verbosity for supported models. Values: "low", "medium", "high".
Note: Image generation requires a model that supports the image generation API (currently only OpenAI’s DALL-E models are supported). The chat history is not sent when generating images - only the most recent user message text is used as the prompt.
Complete Image Generation Example:
[chat_providers]
priority_order = ["text-model", "image-gen"]
# Standard text generation model
[chat_providers.providers.text-model]
provider_kind = "openai"
model_name = "gpt-4"
api_key = "sk-..."
# Image generation model
[chat_providers.providers.image-gen]
provider_kind = "openai"
model_name = "dall-e-3"
model_display_name = "DALL-E 3 (Image Generator)"
api_key = "sk-..."
[chat_providers.providers.image-gen.model_settings]
generate_images = true
# Allow users to access the image generation model
[model_permissions.rules.allow-image-gen]
rule_type = "allow-all"
chat_provider_ids = ["image-gen"]
chat_providers.summary
Configuration for chat summary generation. This allows you to specify which provider and settings to use when generating chat summaries.
Type: object | None
Example:
[chat_providers.summary]
summary_chat_provider_id = "summarizer"
max_tokens = 150
system_prompt = "Generate a summary for the topic of the following chat, based on the first message to the chat. The summary should be a short single sentence description."
chat_providers.summary.summary_chat_provider_id
The ID of the chat provider to use specifically for generating chat summaries. This provider must exist in the chat_providers.providers map.
If not specified, Erato will use the highest priority provider from priority_order for summary generation.
Type: string | None
Default value: None (uses highest priority provider)
Example: "summarizer"
chat_providers.summary.system_prompt
Optional prompt source specification for summary generation.
This prompt is used as the instruction prompt for generating chat summaries and supports the same prompt format as prompt.
In practice, the chat_providers.summary.system_prompt structure matches chat_providers.providers.<provider-id>.system_prompt, including all of its nested fields.
Type: string | object | None
Example (string):
system_prompt = "Generate a short, one-sentence summary for the topic of the following chat, based on the first message."
Example (Langfuse):
system_prompt = { source = "langfuse", prompt_name = "summary-instructions", label = "production", fallback = "Generate a short, one-sentence summary..." }
Default value: Built-in hard-coded summary instruction from previous behavior.
chat_providers.summary.max_tokens
The maximum number of output tokens to allow when generating chat summaries. This helps control the length and cost of summary generation.
Type: number | None
Default value: 300
Example: 150
Complete Example:
[chat_providers]
priority_order = ["main"]
[chat_providers.summary]
summary_chat_provider_id = "summarizer"
max_tokens = 150
system_prompt = "Generate a short, one-sentence summary for the topic of the following chat, based on the first message to the chat."
[chat_providers.providers.main]
provider_kind = "openai"
model_name = "gpt-4"
api_key = "sk-main-key"
[chat_providers.providers.summarizer]
provider_kind = "openai"
model_name = "gpt-4o-mini"
model_display_name = "GPT-4o Mini (Summarizer)"
api_key = "sk-summarizer-key"
In this example:
- Regular chat interactions use the “main” provider (GPT-4)
- Chat summaries are generated using the “summarizer” provider (GPT-4o Mini)
- Summary generation is limited to 150 output tokens
- Summary prompts can be customized with chat_providers.summary.system_prompt and rendered with template variables
- This allows using a faster, cheaper model for summaries while keeping a more powerful model for main interactions
Migration from chat_provider
If you’re using the old chat_provider configuration, it will be automatically migrated to the new format:
- The old provider will be given the ID "default"
- The priority order will be set to ["default"]
- A deprecation warning will be logged
Before (deprecated):
[chat_provider]
provider_kind = "openai"
model_name = "gpt-4"
api_key = "sk-key"
After (recommended):
[chat_providers]
priority_order = ["main"]
[chat_providers.providers.main]
provider_kind = "openai"
model_name = "gpt-4"
api_key = "sk-key"
file_storage_providers
Configuration for file storage providers that handle uploaded files and documents in Erato. You can configure multiple storage providers with different backends (S3-compatible or Azure Blob Storage) and designate one as the default.
Type: object<string, FileStorageProviderConfig>
Required: At least one file storage provider must be configured.
Example:
# Set which provider to use as default.
# In TOML, top-level keys must appear before the first table header.
default_file_storage_provider = "primary"
[file_storage_providers.primary]
provider_kind = "s3"
display_name = "Primary Storage"
max_upload_size_kb = 102400 # 100 MB
config = { bucket = "erato-files", region = "us-east-1", access_key_id = "AKIAIOSFODNN7EXAMPLE", secret_access_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" }
file_storage_providers.<provider-id>.provider_kind
The type of file storage backend to use.
Type: string
Required: Yes
Supported values:
- "s3" - Amazon S3 or services that expose an S3-compatible API (e.g., MinIO, DigitalOcean Spaces, Cloudflare R2)
- "azblob" - Azure Blob Storage
Example: "s3" or "azblob"
file_storage_providers.<provider-id>.display_name
A human-readable name to display in the UI for this storage provider.
Type: string | None
Default: Falls back to the provider ID if not specified
Example: "Primary Storage" or "Azure Backup Storage"
file_storage_providers.<provider-id>.max_upload_size_kb
The maximum file size that may be uploaded in kilobytes. This limit is enforced by the backend.
Type: number | None
Default: No limit (system default applies)
Example: 102400 (100 MB), 10240 (10 MB)
file_storage_providers.<provider-id>.config
Provider-specific configuration object. The available fields depend on the provider_kind.
Type: object
Required: Yes
S3 Provider Configuration
When provider_kind = "s3", the following configuration fields are available:
bucket (required)
The name of the S3 bucket to store files in.
Type: string
Example: "erato-files", "my-company-documents"
endpoint (optional)
The S3-compatible endpoint URL. Use this when connecting to S3-compatible services other than AWS S3 (e.g., MinIO, DigitalOcean Spaces).
If not specified, defaults to AWS S3.
Type: string | None
Example: "https://nyc3.digitaloceanspaces.com", "http://localhost:9000"
region (optional)
The AWS region where the bucket is located.
Type: string | None
Example: "us-east-1", "eu-central-1"
root (optional)
A prefix path within the bucket where files should be stored. Useful for organizing files or sharing buckets across multiple applications.
Type: string | None
Example: "erato-uploads/", "production/files/"
access_key_id (optional)
The AWS access key ID for authentication. If not provided, the SDK will attempt to use environment variables or instance credentials.
Type: string | None
Example: "AKIAIOSFODNN7EXAMPLE"
secret_access_key (optional)
The AWS secret access key for authentication. If not provided, the SDK will attempt to use environment variables or instance credentials.
Type: string | None
Example: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
Complete S3 Example:
# In TOML, top-level keys must appear before the first table header.
default_file_storage_provider = "s3_primary"
[file_storage_providers.s3_primary]
provider_kind = "s3"
display_name = "AWS S3 Primary"
max_upload_size_kb = 204800 # 200 MB
config = { bucket = "erato-production-files", region = "us-west-2", root = "uploads/", access_key_id = "AKIAIOSFODNN7EXAMPLE", secret_access_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" }
MinIO Example:
# Top-level key: in TOML it must appear before any table header
default_file_storage_provider = "minio"
[file_storage_providers.minio]
provider_kind = "s3"
display_name = "MinIO Local Storage"
max_upload_size_kb = 102400 # 100 MB
config = { bucket = "erato-storage", endpoint = "http://localhost:9000", access_key_id = "minioadmin", secret_access_key = "minioadmin" }
Azure Blob Storage Configuration
When provider_kind = "azblob", the following configuration fields are available:
container (required)
The name of the Azure Blob Storage container to store files in.
Type: string
Example: "erato-files", "documents"
endpoint (required)
The Azure Blob Storage endpoint URL for your storage account.
Type: string
Example: "https://mystorageaccount.blob.core.windows.net"
root (optional)
A prefix path within the container where files should be stored. Useful for organizing files or sharing containers across multiple applications.
Type: string | None
Example: "erato-uploads/", "production/files/"
account_name (optional)
The Azure storage account name for authentication. If not provided, the SDK will attempt to use environment variables or managed identity.
Type: string | None
Example: "mystorageaccount"
account_key (optional)
The Azure storage account key for authentication. If not provided, the SDK will attempt to use environment variables or managed identity.
Type: string | None
Example: "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw=="
Complete Azure Blob Storage Example:
# Top-level key: in TOML it must appear before any table header
default_file_storage_provider = "azure"
[file_storage_providers.azure]
provider_kind = "azblob"
display_name = "Azure Blob Storage"
max_upload_size_kb = 153600 # 150 MB
config = { endpoint = "https://mycompany.blob.core.windows.net", container = "erato-files", root = "uploads/", account_name = "mycompany", account_key = "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==" }
Azurite (Azure Blob Storage Emulator) Example:
# Top-level key: in TOML it must appear before any table header
default_file_storage_provider = "azurite"
[file_storage_providers.azurite]
provider_kind = "azblob"
display_name = "Azurite Local Storage"
max_upload_size_kb = 102400 # 100 MB
config = { endpoint = "http://127.0.0.1:10000/devstoreaccount1", container = "erato-storage", account_name = "devstoreaccount1", account_key = "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==" }
default_file_storage_provider
Specifies which file storage provider to use as the default for new file uploads. This must match one of the keys in the file_storage_providers map.
When multiple file storage providers are configured, this setting is required and determines where new files will be written. All configured providers remain available for reading existing files, which is useful for migration scenarios.
Type: string | None
Required when: Multiple file storage providers are configured
Example: "primary", "s3_main", "azure"
Note: This is a top-level configuration key, not nested under file_storage_providers.
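The two requirements described here (the value must match a configured provider key, and it is required once more than one provider exists) can be sketched as a small check over an already-parsed configuration. This is only an illustration: the function name check_default_provider and its error messages are hypothetical, and the sketch does not claim to mirror how Erato validates its configuration internally.

```python
def check_default_provider(config: dict) -> list[str]:
    """Collect problems with default_file_storage_provider in a parsed config."""
    providers = config.get("file_storage_providers", {})
    default = config.get("default_file_storage_provider")
    errors = []
    if default is None:
        # Only required once more than one provider is configured.
        if len(providers) > 1:
            errors.append("default_file_storage_provider is required with multiple providers")
    elif default not in providers:
        errors.append(f"default_file_storage_provider {default!r} matches no configured provider")
    return errors


config = {
    "default_file_storage_provider": "primary",
    "file_storage_providers": {"primary": {}, "backup": {}},
}
print(check_default_provider(config))
```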
Configuration Example:
# Specify which one is default (top-level key, so it must come before any table header)
default_file_storage_provider = "primary"
# Define multiple providers
[file_storage_providers.primary]
provider_kind = "s3"
display_name = "Primary Storage"
max_upload_size_kb = 102400
config = { bucket = "erato-files", region = "us-east-1" }
[file_storage_providers.backup]
provider_kind = "azblob"
display_name = "Backup Storage"
max_upload_size_kb = 102400
config = { endpoint = "https://backup.blob.core.windows.net", container = "erato-backup" }
In this example:
- New files will be uploaded to the “primary” S3 provider
- Existing files can still be read from both “primary” and “backup” providers
- This allows migrating from one storage provider to another without data loss
Multiple Providers Example
You can configure multiple file storage providers. This is currently useful mainly for migration scenarios, where new uploads are written to and read from a new file storage provider while previously uploaded files remain readable from an older one.
# Set default provider (top-level key, so it must come before any table header)
default_file_storage_provider = "s3_primary"
# Primary S3 storage
[file_storage_providers.s3_primary]
provider_kind = "s3"
display_name = "S3 Primary (US)"
max_upload_size_kb = 204800 # 200 MB
config = { bucket = "erato-us-files", region = "us-east-1", access_key_id = "AKIAIOSFODNN7EXAMPLE", secret_access_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" }
# Backup Azure storage
[file_storage_providers.azure_backup]
provider_kind = "azblob"
display_name = "Azure Backup (EU)"
max_upload_size_kb = 204800 # 200 MB
config = { endpoint = "https://eratoeu.blob.core.windows.net", container = "erato-eu-files", account_name = "eratoeu", account_key = "..." }
file_processor
Configuration for the file processing engine that extracts text content from uploaded documents. Erato supports multiple file processor backends with different capabilities.
Type: object | None
Default behavior: If not configured, Erato uses the kreuzberg processor by default.
file_processor.processor
The file processor backend to use for extracting text from uploaded files.
Type: string
Default: "kreuzberg"
Supported values:
Since the removal of the parser-core file processor, kreuzberg is the only supported option.
"kreuzberg" - Advanced file processor with page-aware markdown extraction. Extracts content with page boundary markers, making it easy to identify which page content comes from.
Example:
[file_processor]
processor = "kreuzberg"
Processor Comparison
| Feature | kreuzberg |
|---|---|
| Text extraction | ✅ |
| Page awareness | ✅ |
| Page markers | ✅ (<page number="N">) |
| Output format | Markdown |
| Supported formats | Word, PDF, Excel, PowerPoint, text files |
Page-Aware Extraction
When using kreuzberg, extracted content includes XML-style page markers that clearly indicate page boundaries:
<page number="1">
First page content here...
<page number="2">
Second page content here...
This is particularly useful when working with:
- Multi-page legal documents where page references matter
- Academic papers with citations to specific pages
- Technical documentation with page-specific diagrams
- Any scenario where knowing the source page of content is important
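To show how the page markers could be consumed downstream, here is a minimal sketch that splits extracted text into a mapping from page number to content. It assumes the markers look exactly like the sample above (an opening marker with no closing tag); split_pages is a hypothetical helper, not an Erato API.

```python
import re

# Matches markers such as <page number="3"> and captures the page number.
PAGE_MARKER = re.compile(r'<page number="(\d+)">')


def split_pages(text: str) -> dict[int, str]:
    """Map each page number to the text that follows its marker."""
    # With one capture group, re.split returns
    # [text_before_first_marker, number_1, content_1, number_2, content_2, ...]
    parts = PAGE_MARKER.split(text)
    pages = {}
    for number, content in zip(parts[1::2], parts[2::2]):
        pages[int(number)] = content.strip()
    return pages


sample = '<page number="1">\nFirst page content here...\n<page number="2">\nSecond page content here...'
for number, content in split_pages(sample).items():
    print(number, content)
```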
Example configuration:
[file_processor]
# Use kreuzberg for page-aware extraction
processor = "kreuzberg"
Testing page-aware extraction:
You can generate test PDFs with distinct content per page using the provided script:
cd e2e-tests
./scripts/generate_multipage_test_pdf.py test.pdf 20
This creates a PDF with 20 pages, each containing unique identifiers like PAGE-001, PAGE-002, etc.
Audio modes
Erato has three independently configurable audio modes:
- audio_dictation enables microphone dictation that writes transcribed text into the chat input without creating an attachment.
- audio_transcription enables recording or uploading audio as a file attachment with transcription metadata.
- audio_conversational enables hands-free conversational audio mode with voice activity detection, auto-send, and restart behavior.
Each enabled audio mode needs an audio-capable chat provider. Either set chat_provider_id on the audio mode, or omit it to let Erato pick the first provider in chat_providers.priority_order with model_capabilities.supports_audio_input = true. Provider kinds using the OpenAI Responses adapters do not currently support binary audio input for these modes.
audio_dictation
Controls the microphone dictation flow that populates the chat input without creating an attachment.
Type: object | None
Default behavior: audio_dictation is disabled.
audio_dictation.enabled
Whether this audio mode is available.
Type: boolean
Default: false
audio_dictation.chat_provider_id
The provider ID to use for this audio mode. If omitted, Erato uses the first compatible audio-capable provider from the configured provider priority order.
Type: string | None
Default: None
The corresponding provider needs to have the supports_audio_input capability:
Provider example:
[chat_providers.providers.audio-model]
provider_kind = "openai"
model_name = "gpt-4o-audio-preview"
api_key = "sk-..."
[chat_providers.providers.audio-model.model_capabilities]
supports_audio_input = true
audio_dictation.max_recording_duration_seconds
Maximum recording duration in seconds.
Type: number
Default: 1200
audio_dictation.chunk_duration_seconds
Audio chunk duration in seconds. Must be greater than 0 and no larger than max_recording_duration_seconds.
Type: number
Default: 30
audio_dictation.max_attempts
Maximum provider attempts per chunk, including retries for loop-detection.
Type: number
Default: 3
audio_dictation.initial_backoff_ms
Initial retry backoff in milliseconds.
Type: number
Default: 250
audio_dictation.max_backoff_ms
Maximum retry backoff in milliseconds.
Type: number
Default: 4000
audio_dictation.min_words_for_loop_check
Minimum transcript word count before loop detection runs.
Type: number
Default: 80
audio_dictation.min_unique_word_ratio
Minimum unique-word ratio accepted for non-loop transcripts.
Type: number
Default: 0.25
Constraint: Must be between 0.0 and 1.0.
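One way to read min_words_for_loop_check and min_unique_word_ratio together is sketched below. It is an assumption-laden illustration, not Erato's actual detector: words are taken to be whitespace-separated and compared case-insensitively, and looks_like_loop is a hypothetical name.

```python
def looks_like_loop(
    transcript: str,
    min_words_for_loop_check: int = 80,
    min_unique_word_ratio: float = 0.25,
) -> bool:
    """Flag a transcript whose share of distinct words is suspiciously low."""
    # Assumption: whitespace tokenization, case-insensitive comparison.
    words = transcript.lower().split()
    if len(words) < min_words_for_loop_check:
        # Too short for the check to run at all.
        return False
    return len(set(words)) / len(words) < min_unique_word_ratio


print(looks_like_loop("thank you " * 60))   # 120 words, 2 distinct
print(looks_like_loop("thank you " * 10))   # only 20 words
```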
audio_dictation.max_words_per_minute
Assumed maximum speech rate used to derive per-chunk output token budgets.
Type: number
Default: 200
audio_dictation.tokens_per_word
Token-per-word multiplier used to derive per-chunk output token budgets.
Type: number
Default: 1.5
audio_dictation.output_token_buffer_factor
Safety multiplier used to derive per-chunk output token budgets.
Type: number
Default: 2.0
audio_dictation.fixed_output_token_budget
Fixed output token budget added to the per-chunk heuristic.
Type: number
Default: 500
audio_dictation.min_output_tokens
Optional minimum output token budget for transcription chunks.
Type: number | None
audio_dictation.max_output_tokens
Optional maximum output token cap for transcription chunks.
Type: number | None
If both minimum and maximum are configured, min_output_tokens must be less than or equal to max_output_tokens.
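The constraints stated for the audio mode fields can be collected into a single check, sketched below over a parsed audio mode table. The function name check_audio_mode is hypothetical, the defaults are the ones listed above, and treating the 0.0 and 1.0 bounds as inclusive is an assumption.

```python
def check_audio_mode(cfg: dict) -> list[str]:
    """Collect violations of the documented audio mode constraints."""
    errors = []
    max_recording = cfg.get("max_recording_duration_seconds", 1200)
    chunk = cfg.get("chunk_duration_seconds", 30)
    if not 0 < chunk <= max_recording:
        errors.append("chunk_duration_seconds must be > 0 and <= max_recording_duration_seconds")
    ratio = cfg.get("min_unique_word_ratio", 0.25)
    # Assumption: both bounds are inclusive.
    if not 0.0 <= ratio <= 1.0:
        errors.append("min_unique_word_ratio must be between 0.0 and 1.0")
    min_tokens = cfg.get("min_output_tokens")
    max_tokens = cfg.get("max_output_tokens")
    if min_tokens is not None and max_tokens is not None and min_tokens > max_tokens:
        errors.append("min_output_tokens must be <= max_output_tokens")
    return errors


print(check_audio_mode({"chunk_duration_seconds": 30, "min_output_tokens": 32, "max_output_tokens": 2048}))
print(check_audio_mode({"chunk_duration_seconds": 0}))
```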
audio_dictation example
[audio_dictation]
enabled = true
chat_provider_id = "audio-model"
max_recording_duration_seconds = 1200
chunk_duration_seconds = 30
max_attempts = 3
initial_backoff_ms = 250
max_backoff_ms = 4000
min_words_for_loop_check = 80
min_unique_word_ratio = 0.25
max_words_per_minute = 200
tokens_per_word = 1.5
output_token_buffer_factor = 2.0
fixed_output_token_budget = 500
min_output_tokens = 32
max_output_tokens = 2048
audio_transcription
Controls the recording or upload-to-attachment transcription flow.
All subkeys share the same structure and meaning as audio_dictation:
[audio_transcription]
enabled = true
chat_provider_id = "audio-model"
max_recording_duration_seconds = 1200
chunk_duration_seconds = 30
max_attempts = 3
initial_backoff_ms = 250
max_backoff_ms = 4000
min_words_for_loop_check = 80
min_unique_word_ratio = 0.25
max_words_per_minute = 200
tokens_per_word = 1.5
output_token_buffer_factor = 2.0
fixed_output_token_budget = 500
min_output_tokens = 32
max_output_tokens = 2048
audio_conversational
Controls hands-free conversational dictation with voice activity detection, auto-send, and automatic restart after responses.
All subkeys share the same structure and meaning as audio_dictation:
[audio_conversational]
enabled = true
chat_provider_id = "audio-model"
max_recording_duration_seconds = 1200
chunk_duration_seconds = 30
max_attempts = 3
initial_backoff_ms = 250
max_backoff_ms = 4000
min_words_for_loop_check = 80
min_unique_word_ratio = 0.25
max_words_per_minute = 200
tokens_per_word = 1.5
output_token_buffer_factor = 2.0
fixed_output_token_budget = 500
min_output_tokens = 32
max_output_tokens = 2048
model_permissions
Configuration for controlling user access to chat providers based on user attributes like group membership. This allows you to implement fine-grained access control, such as restricting premium models to certain user groups or providing different models to different teams.
Type: object | None
Default behavior: If no model permissions are configured, all users have access to all configured chat providers.
model_permissions.rules
A map of permission rules, where each key is a unique rule name and the value is a rule configuration object. Rules are evaluated independently, and if any rule grants access to a chat provider, the user is allowed to use it.
Type: object<string, ModelPermissionRule>
Example:
[model_permissions.rules.basic_access]
rule_type = "allow-all"
chat_provider_ids = ["gpt-4o-mini"]
[model_permissions.rules.premium_access]
rule_type = "allow-for-group-members"
chat_provider_ids = ["gpt-4", "claude-3-opus"]
groups = ["premium-users", "administrators"]
[model_permissions.rules.admin_only]
rule_type = "allow-for-group-members"
chat_provider_ids = ["gpt-4-turbo", "experimental-model"]
groups = ["administrators"]
Rule Types
There are two types of permission rules:
allow-all
Grants access to specified chat providers for all users, regardless of their group membership.
Fields:
- rule_type - Must be "allow-all"
- chat_provider_ids - Array of chat provider IDs that this rule grants access to
Example:
[model_permissions.rules.public_models]
rule_type = "allow-all"
chat_provider_ids = ["gpt-4o-mini", "claude-3-haiku"]
allow-for-group-members
Grants access to specified chat providers only for users who belong to at least one of the specified groups. User groups are derived from the groups claim in the user’s ID token (see SSO/OIDC configuration).
Fields:
- rule_type - Must be "allow-for-group-members"
- chat_provider_ids - Array of chat provider IDs that this rule grants access to
- groups - Array of group names/identifiers that are allowed access
Example:
[model_permissions.rules.premium_models]
rule_type = "allow-for-group-members"
chat_provider_ids = ["gpt-4", "claude-3-opus"]
groups = ["premium-subscribers", "enterprise-users"]
How Permission Evaluation Works
- No rules configured: All users have access to all chat providers
- Rules configured: For each chat provider, check if any rule grants access:
  - allow-all rules always grant access
  - allow-for-group-members rules grant access only if the user belongs to at least one specified group
- Access granted: If any rule grants access to a chat provider, the user can use it
- Access denied: If no rules grant access to a chat provider, it’s hidden from the user
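The evaluation steps above can be sketched as a small function. This is an illustrative model of the documented behavior, not Erato's policy engine: allowed_chat_providers is a hypothetical name, and rules are represented as plain dictionaries with the same field names as the TOML configuration.

```python
def allowed_chat_providers(rules: dict, provider_ids: list, user_groups: list) -> list:
    """Return the provider IDs visible to a user with the given groups."""
    if not rules:
        # No rules configured: every provider is available to everyone.
        return list(provider_ids)
    groups = set(user_groups)
    allowed = []
    for provider_id in provider_ids:
        for rule in rules.values():
            if provider_id not in rule["chat_provider_ids"]:
                continue
            if rule["rule_type"] == "allow-all":
                allowed.append(provider_id)
                break
            if rule["rule_type"] == "allow-for-group-members" and groups & set(rule["groups"]):
                allowed.append(provider_id)
                break
    return allowed


rules = {
    "basic_access": {"rule_type": "allow-all", "chat_provider_ids": ["basic"]},
    "premium_access": {
        "rule_type": "allow-for-group-members",
        "chat_provider_ids": ["premium"],
        "groups": ["premium-users"],
    },
}
print(allowed_chat_providers(rules, ["basic", "premium"], []))
print(allowed_chat_providers(rules, ["basic", "premium"], ["premium-users"]))
```

Because any single matching rule is enough, adding a rule can only widen access; there is no way to express a deny rule in this model.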
Complete Example
# Chat provider configuration
[chat_providers]
priority_order = ["basic", "premium", "experimental"]
[chat_providers.providers.basic]
provider_kind = "openai"
model_name = "gpt-4o-mini"
model_display_name = "GPT-4o Mini (Basic)"
api_key = "sk-basic-key"
[chat_providers.providers.premium]
provider_kind = "openai"
model_name = "gpt-4"
model_display_name = "GPT-4 (Premium)"
api_key = "sk-premium-key"
[chat_providers.providers.experimental]
provider_kind = "openai"
model_name = "gpt-4-turbo"
model_display_name = "GPT-4 Turbo (Experimental)"
api_key = "sk-experimental-key"
# Model permissions configuration
[model_permissions.rules.basic_access]
rule_type = "allow-all"
chat_provider_ids = ["basic"]
[model_permissions.rules.premium_access]
rule_type = "allow-for-group-members"
chat_provider_ids = ["premium"]
groups = ["premium-users", "enterprise"]
[model_permissions.rules.experimental_access]
rule_type = "allow-for-group-members"
chat_provider_ids = ["experimental"]
groups = ["beta-testers", "administrators"]
In this example:
- All users can access the “basic” model (GPT-4o Mini)
- Premium users and enterprise users can access both “basic” and “premium” models
- Beta testers and administrators can access “basic”, “premium” (if they’re also in premium groups), and “experimental” models
- Regular users (not in any special groups) can only access the “basic” model
Integration with SSO/OIDC
Model permissions rely on group information from the user’s ID token. The groups claim in the JWT token is parsed and used for permission evaluation. See the SSO/OIDC documentation for information on how to configure your identity provider to include group information in ID tokens.
Common group claim formats:
- Array of strings: ["group1", "group2", "group3"]
- Single string: "group1" (automatically converted to array)
If no groups claim is present in the ID token, the user is treated as having no group memberships, and only allow-all rules will grant them access.
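The handling of the two claim formats and of a missing claim can be sketched as follows. The helper name normalize_groups is hypothetical, and the input is assumed to be the already-decoded ID token claims as a dictionary.

```python
def normalize_groups(claims: dict) -> list[str]:
    """Return the user's groups as a list, whatever shape the claim has."""
    value = claims.get("groups")
    if value is None:
        # No groups claim: the user has no group memberships.
        return []
    if isinstance(value, str):
        # A single string is converted to a one-element list.
        return [value]
    return [str(group) for group in value]


print(normalize_groups({"groups": ["group1", "group2"]}))
print(normalize_groups({"groups": "group1"}))
print(normalize_groups({"sub": "user-123"}))
```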
mcp_server_permissions
Configuration for controlling user access to configured MCP servers based on group membership. This is evaluated through the same policy engine as model_permissions, and affects which MCP servers can be attached to assistants or used during generation.
Type: object | None
Default behavior: If no MCP server permissions are configured, all users have access to all configured MCP servers.
mcp_server_permissions.rules
A map of permission rules, where each key is a unique rule name and the value is a rule configuration object. Rules are evaluated independently, and if any rule grants access to an MCP server, the user is allowed to use it.
Type: object<string, McpServerPermissionRule>
Example:
[mcp_server_permissions.rules.public_servers]
rule_type = "allow-all"
mcp_server_ids = ["filesystem", "company-search"]
[mcp_server_permissions.rules.premium_servers]
rule_type = "allow-for-group-members"
mcp_server_ids = ["browser-automation", "code-execution"]
groups = ["premium-users", "administrators"]
Rule Types
mcp_server_permissions supports the same rule types as model_permissions:
- allow-all
- allow-for-group-members
The only difference is the resource field name:
- mcp_server_ids instead of chat_provider_ids
Example:
[mcp_server_permissions.rules.restricted_server]
rule_type = "allow-for-group-members"
mcp_server_ids = ["browser-automation"]
groups = ["security-reviewed-users"]
facet_permissions
Configuration for controlling which facets are visible and usable for different users. This is evaluated through the same policy engine as model_permissions, and affects facet visibility in /me/facets, starter prompts, and message generation.
Type: object | None
Default behavior: If no facet permissions are configured, all users have access to all configured facets.
facet_permissions.rules
A map of permission rules, where each key is a unique rule name and the value is a rule configuration object. Rules are evaluated independently, and if any rule grants access to a facet, the user is allowed to use it.
Type: object<string, FacetPermissionRule>
Example:
[facet_permissions.rules.default_facets]
rule_type = "allow-all"
facet_ids = ["extended_thinking"]
[facet_permissions.rules.web_search]
rule_type = "allow-for-group-members"
facet_ids = ["web_search"]
groups = ["premium-users", "research-team"]
Rule Types
facet_permissions supports the same rule types as model_permissions:
- allow-all
- allow-for-group-members
The only difference is the resource field name:
- facet_ids instead of chat_provider_ids
Example:
[facet_permissions.rules.image_generation]
rule_type = "allow-for-group-members"
facet_ids = ["image_generation"]
groups = ["design-team"]
Integration with SSO/OIDC
Like model_permissions, both mcp_server_permissions and facet_permissions use the groups claim from the user’s ID token for allow-for-group-members rules. If no groups claim is present, only allow-all rules grant access.
integrations
Configuration for external service integrations.
integrations.langfuse
Configuration for the Langfuse observability and prompt management integration.
integrations.langfuse.enabled
Whether the Langfuse integration is enabled.
Default value: false
Type: boolean
integrations.langfuse.base_url
The base URL for your Langfuse instance. Use https://cloud.langfuse.com for Langfuse Cloud or your self-hosted URL.
Required when enabled: Yes
Type: string
Example: "https://cloud.langfuse.com" or "https://langfuse.yourcompany.com"
integrations.langfuse.public_key
Your Langfuse project’s public key. You can find this in your Langfuse project settings.
Required when enabled: Yes
Type: string
Example: "pk-lf-1234567890abcdef"
integrations.langfuse.secret_key
Your Langfuse project’s secret key. You can find this in your Langfuse project settings.
Required when enabled: Yes
Type: string
Example: "sk-lf-abcdef1234567890"
integrations.langfuse.tracing_enabled
Whether to enable detailed tracing of LLM interactions. When enabled, all chat completions will be logged to Langfuse.
Default value: false
Type: boolean
integrations.langfuse.summary_tracing_enabled
Whether to enable tracing of generated chat summaries. Set to false to skip Langfuse traces for summaries while keeping normal chat completion tracing unchanged.
Default value: true
Type: boolean
integrations.langfuse.enable_feedback
Whether to forward user feedback (thumbs up/down with optional comments) to Langfuse as scores. When enabled, feedback submitted by users will be sent to Langfuse and associated with the corresponding message trace for observability and quality tracking.
Default value: false
Type: boolean
Note: This requires frontend.enable_message_feedback to be enabled for users to submit feedback through the UI.
integrations.langfuse.use_otel
Whether to send Langfuse tracing data via Langfuse’s OpenTelemetry ingestion endpoint instead of the legacy ingestion endpoint. When enabled, Erato sends trace and observation data to ${base_url}/api/public/otel/v1/traces using OTLP over HTTP.
Default value: false
Type: boolean
Note: User feedback forwarding still uses Langfuse score ingestion and is controlled separately by integrations.langfuse.enable_feedback.
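For reference, the OTLP traces URL described above is derived from base_url as in the sketch below. The helper name is hypothetical, and dropping a trailing slash from base_url is an assumption made here to avoid a doubled slash; it is not documented behavior.

```python
def langfuse_otel_traces_url(base_url: str) -> str:
    """Build the Langfuse OTLP/HTTP traces URL from the configured base_url."""
    # Assumption: a trailing slash on base_url is dropped to avoid "//".
    return base_url.rstrip("/") + "/api/public/otel/v1/traces"


print(langfuse_otel_traces_url("https://cloud.langfuse.com"))
```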
See the Langfuse Integration documentation for detailed usage instructions and feature descriptions.
integrations.sentry
Configuration for the Sentry error reporting and performance monitoring integration.
integrations.sentry.sentry_dsn
The Sentry DSN (Data Source Name) for your Sentry project. This enables error reporting and performance monitoring.
Default value: None
Type: string | None
Example:
[integrations.sentry]
sentry_dsn = "https://public@sentry.example.com/1"
See the Sentry Integration documentation for detailed setup instructions.
integrations.otel
Configuration for OpenTelemetry tracing.
integrations.otel.enabled
Whether OpenTelemetry tracing is enabled.
Default value: false
Type: boolean
integrations.otel.endpoint
The endpoint URL of the OpenTelemetry collector.
Default value: "http://localhost:4318"
Type: string
Example: "http://localhost:4318" or "http://otel-collector:4317"
integrations.otel.protocol
The protocol to use for exporting traces.
Default value: "http"
Type: string
Supported values: "http", "grpc"
integrations.otel.service_name
The service name to identify this application in traces.
Default value: "erato-backend"
Type: string
integrations.ms_office
Configuration for Microsoft Office integrations.
integrations.ms_office.ews_api_endpoint
Optional Exchange EWS API endpoint used by the backend EWS proxy.
When configured, Erato exposes POST /api/v1beta/integrations/ms-office/ews and forwards the incoming Authorization header to this endpoint.
Type: string | None
integrations.ms_office.ews_skip_tls_validation
Whether the backend EWS proxy skips TLS certificate validation when calling the configured EWS endpoint.
This is unsafe and should only be enabled for development or controlled internal environments with non-compliant certificates.
Default value: false
Type: boolean
integrations.ms_office.addin.enabled
Whether the Office add-in frontend and manifest endpoint are enabled.
When enabled, Erato serves the add-in frontend bundle and exposes GET /office-addin/manifest.xml.
Default value: false
Type: boolean
integrations.ms_office.addin.msal_client_id
The Microsoft identity platform client (application) ID used by the Office add-in for OAuth-based authentication.
Required for add-in authentication flows: Yes
Type: string | None
integrations.ms_office.addin.addin_id
The Office add-in manifest ID used when rendering the add-in manifest.
Override this ID to install multiple add-in instances in the same tenant.
Default value: "ee94d041-bd77-446c-8854-421648f50e7c"
Type: string
integrations.ms_office.addin.msal_authority
The Microsoft identity authority URL used for MSAL authentication.
Default value: "https://login.microsoftonline.com/common"
Type: string
integrations.ms_office.addin.serve_bundle_legacy_path
Whether to also serve the Office add-in bundle under /office-addin in addition to /public/platform-office-addin.
This is useful for deployments that still reference legacy asset URLs.
Default value: true
Type: boolean
integrations.ms_office.addin.frontend_bundle_path
Filesystem path to the Office add-in frontend bundle.
The path must contain the Office add-in frontend artifacts including manifest.xml.
Default value: "./public/platform-office-addin"
Type: string
integrations.ms_office.addin.manifest
Branding values used when rendering the Office add-in manifest.
The string fields control the Office provider name, add-in display name, add-in description, support URL, ribbon group label, ribbon button label, and ribbon button description. Icon path fields control the top-level manifest icons and the 16px, 32px, and 80px ribbon icons.
Icon paths can be absolute http/https URLs, deployment-root-relative paths such as /public/common/custom-theme/contoso/icon.png, or paths relative to the Office add-in bundle such as assets/color-icon-192x192.png.
Type: object
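The three accepted icon path forms can be told apart by their prefix, as the sketch below illustrates. The function name and the returned labels are hypothetical and only serve to make the distinction concrete; how Erato resolves each form internally is not shown.

```python
def classify_icon_path(path: str) -> str:
    """Classify an icon path into one of the three documented forms."""
    if path.startswith(("http://", "https://")):
        return "absolute-url"
    if path.startswith("/"):
        return "deployment-root-relative"
    # Anything else is taken to be relative to the Office add-in bundle.
    return "bundle-relative"


for example in (
    "https://cdn.example.com/contoso/icon-80.png",
    "/public/common/custom-theme/contoso/icon.png",
    "assets/color-icon-192x192.png",
):
    print(example, "->", classify_icon_path(example))
```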
Example:
[integrations.ms_office]
ews_api_endpoint = "https://outlook.office365.com/EWS/Exchange.asmx"
ews_skip_tls_validation = false
[integrations.ms_office.addin]
enabled = true
msal_client_id = "00000000-0000-0000-0000-000000000000"
addin_id = "11111111-2222-3333-4444-555555555555"
msal_authority = "https://login.microsoftonline.com/consumers"
serve_bundle_legacy_path = true
frontend_bundle_path = "./public/platform-office-addin"
[integrations.ms_office.addin.manifest]
provider_name = "Contoso"
display_name = "Contoso Office Extension"
description = "Contoso AI assistant for Outlook"
support_url = "https://contoso.example.com/support"
group_label = "Contoso"
button_label = "Open Contoso"
button_description = "Open the Contoso AI assistant in a task pane"
icon_path = "/public/common/custom-theme/contoso/color-icon.png"
high_resolution_icon_path = "/public/common/custom-theme/contoso/color-icon-hires.png"
icon_16_path = "assets/contoso-outline-16.png"
icon_32_path = "assets/contoso-outline-32.png"
icon_80_path = "https://cdn.example.com/contoso/icon-80.png"
integrations.prometheus
Configuration for Prometheus metrics export.
integrations.prometheus.enabled
Whether the Prometheus metrics endpoint is enabled.
Default value: false
Type: boolean
integrations.prometheus.host
Host interface for the Prometheus metrics listener.
Default value: "127.0.0.1"
Type: string
integrations.prometheus.port
Port for the Prometheus metrics listener. This must be different from http_port.
Default value: 3131
Type: integer
integrations.experimental_sharepoint
Configuration for the experimental SharePoint/OneDrive integration. This allows users to attach files from their Microsoft cloud storage directly to chats.
Note: This is an experimental feature. The configuration key will be renamed to integrations.sharepoint once the feature has stabilized.
integrations.experimental_sharepoint.enabled
Whether the SharePoint/OneDrive integration is enabled.
Default value: false
Type: boolean
integrations.experimental_sharepoint.file_upload_enabled
Whether file linking from SharePoint is enabled. When disabled, users can browse their drives but cannot link files to chats.
Default value: true
Type: boolean
integrations.experimental_sharepoint.auth_via_access_token
Whether to use the user’s forwarded access token for Microsoft Graph API authentication. Currently, this is the only supported authentication method.
Default value: true
Type: boolean
integrations.experimental_sharepoint.show_disclaimer
Whether to show a disclaimer in the SharePoint/OneDrive file picker explaining that users may see directories they do not expect because a SharePoint site or OneDrive directory may have been shared more broadly, such as with the whole organization.
Default value: false
Type: boolean
integrations.experimental_sharepoint.all_drives_sources
Which Microsoft Graph discovery surfaces Erato should query for /integrations/sharepoint/all-drives.
If omitted or set to an empty list, Erato queries all supported sources.
Supported values:
- me_drive
- me_drives
- joined_teams
- group_drives
- site_search
- site_drives
- shared_with_me
- shared_drive_details
Default value: [] (interpreted as all sources)
Type: array[string]
Example:
[integrations.experimental_sharepoint]
enabled = true
file_upload_enabled = true
auth_via_access_token = true
all_drives_sources = ["me_drive", "me_drives", "shared_with_me", "shared_drive_details"]
Prerequisites:
- Your OIDC provider (e.g., Microsoft Entra ID) must issue access tokens with Microsoft Graph API permissions (Files.Read or Files.ReadWrite)
- Your authentication proxy (e.g., oauth2-proxy) must forward the access token via the X-Forwarded-Access-Token header
See the SharePoint/OneDrive Integration documentation for detailed setup instructions.
integrations.experimental_entra_id
Configuration for the experimental Entra ID integration. This allows exposing organization users and groups to the frontend for sharing features.
Note: This is an experimental feature. The configuration key will be renamed to integrations.entra_id once the feature has stabilized.
integrations.experimental_entra_id.enabled
Whether the Entra ID integration is enabled. When enabled, the /me/organization/users and /me/organization/groups endpoints will query the Microsoft Graph API to list organization users and groups. When disabled, these endpoints will return empty lists.
Default value: false
Type: boolean
integrations.experimental_entra_id.auth_via_access_token
Whether to use the user’s forwarded access token for Microsoft Graph API authentication. Currently, this is the only supported authentication method.
Default value: true
Type: boolean
Example:
[integrations.experimental_entra_id]
enabled = true
auth_via_access_token = true
Prerequisites:
- Your OIDC provider (e.g., Microsoft Entra ID) must issue access tokens with Microsoft Graph API permissions (User.Read.All and Group.Read.All)
- Your authentication proxy (e.g., oauth2-proxy) must forward the access token via the X-Forwarded-Access-Token header
- The ID token must include the oid claim for organization user IDs and the groups claim for organization group memberships
Related Features:
When the Entra ID integration is enabled, the user profile will include additional fields:
- organization_user_id - Extracted from the oid claim (Entra ID specific)
- organization_group_ids - Extracted from the groups claim
These fields enable sharing assistants with organization groups via share grants with subject_type = "organization_group" and subject_id_type = "organization_group_id".
logging
Configuration for backend log output formatting.
logging.format
Controls how application logs are formatted on stdout/stderr.
Use "plain" for human-readable local development logs, or "json" for structured logs that are easier to ingest in production log pipelines.
When "json" is enabled, startup messages that are emitted before the tracing subscriber is initialized are also re-emitted as structured log events once logging is ready.
Default value: "plain"
Type: string
Supported values: "plain", "json"
Example:
[logging]
format = "json"
mcp_servers
Configuration for Model Context Protocol (MCP) servers that extend Erato’s capabilities through custom tools and integrations.
Type: object<string, McpServerConfig>
mcp_servers_global
Global configuration for MCP server behavior across all configured servers.
Type: McpServersGlobalConfig
mcp_servers_global.max_session_idle_seconds
Global default maximum idle time for MCP sessions, in seconds. If a session is idle longer than this threshold, it is evicted by the backend cleanup task.
Server-specific settings can override this value via mcp_servers.<server-id>.max_session_idle_seconds.
Type: integer | None
Default value: 3600 (1 hour)
Example: 900
Operational note: For MCP servers where each session allocates substantial resources (e.g. memory-heavy tools, expensive backend handles), use a lower idle timeout to reclaim server capacity faster.
mcp_servers_global.show_frontend_tab
Whether the MCP servers tab should be shown in the frontend preferences dialog.
This only controls frontend visibility of the tab. It does not change backend MCP server availability, authorization, or permissions.
Type: boolean
Default value: false
Example (per-server entries under mcp_servers):
[mcp_servers.file_provider]
transport_type = "sse"
url = "http://127.0.0.1:63490/sse"
[mcp_servers.filesystem]
transport_type = "sse"
url = "https://my-mcp-server.example.com/sse"
# Keep this key above the nested tables; placed after them, TOML would assign it to the last opened table.
max_session_idle_seconds = 1800
[mcp_servers.filesystem.authentication]
mode = "fixed"
[mcp_servers.filesystem.authentication.fixed]
api_key = "token123"
header_name = "Authorization"
prefix = "Bearer "
mcp_servers.<server-id>.transport_type
The type of transport protocol used to communicate with the MCP server.
Type: string
Supported values:
- "sse" (Server-Sent Events)
- "streamable_http" (Streamable HTTP)
Example: "sse" or "streamable_http"
mcp_servers.<server-id>.url
The URL endpoint for the MCP server.
- For transport_type = "sse", this conventionally ends with /sse
- For transport_type = "streamable_http", this should be the base HTTP endpoint
Type: string
Example: "http://127.0.0.1:63490/sse", "https://my-mcp-server.example.com/sse", "https://api.example.com/mcp"
mcp_servers.<server-id>.http_headers
Optional static HTTP headers to include with every request to the MCP server. Use this for non-authentication headers. Prefer mcp_servers.<server-id>.authentication for credentials.
Type: object<string, string> | None
Example:
[mcp_servers.custom_headers_server]
transport_type = "sse"
url = "https://mcp-server.example.com/sse"
http_headers = { "X-Tenant-ID" = "tenant-123", "X-Environment" = "production" }
mcp_servers.<server-id>.authentication
Authentication settings for the MCP server.
Type: McpServerAuthenticationConfig
Supported values:
- mode = "none": No credentials are attached to MCP requests.
- mode = "forwarded": Forward credentials from the current user session.
- mode = "fixed": Attach a configured fixed API key using the configured header and prefix.
- mode = "oauth2": Use an OAuth 2.0 authorization flow managed by Erato and persist credentials per user.
Examples:
[mcp_servers.public_tools.authentication]
mode = "none"
[mcp_servers.intranet.authentication]
mode = "forwarded"
[mcp_servers.intranet.authentication.forwarded]
credential = "oidc_id_token"
[mcp_servers.partner_api.authentication]
mode = "fixed"
[mcp_servers.partner_api.authentication.fixed]
api_key = "your-api-key"
header_name = "Authorization"
prefix = "Bearer "
[mcp_servers.jira.authentication]
mode = "oauth2"
[mcp_servers.jira.authentication.oauth2]
client_name = "Erato Jira MCP"
scopes = ["read:jira-work"]
mcp_servers.<server-id>.authentication.forwarded.credential
Selects which user credential to forward when authentication.mode = "forwarded".
Type: string
Supported values:
- "access_token"
- "oidc_id_token"
mcp_servers.<server-id>.authentication.fixed.api_key
The fixed API key to attach when authentication.mode = "fixed".
Type: string
mcp_servers.<server-id>.authentication.fixed.header_name
The HTTP header to use when authentication.mode = "fixed".
Type: string
Default value: "Authorization"
mcp_servers.<server-id>.authentication.fixed.prefix
The prefix prepended to authentication.fixed.api_key when authentication.mode = "fixed".
This defaults to bearer-token formatting, so the default fixed auth header is Authorization: Bearer <api_key>.
Type: string
Default value: "Bearer "
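To make the interplay of header_name, prefix, and api_key concrete, the following Python sketch assembles the header the way the text above describes. The function name build_fixed_auth_header is hypothetical and not part of Erato; only the two default values are taken from this reference.

```python
from typing import Tuple


def build_fixed_auth_header(
    api_key: str,
    header_name: str = "Authorization",  # documented default
    prefix: str = "Bearer ",             # documented default
) -> Tuple[str, str]:
    """Return (header name, header value) for mode = "fixed" (hypothetical helper)."""
    # The prefix is prepended verbatim, so the trailing space in "Bearer " matters.
    return header_name, f"{prefix}{api_key}"


name, value = build_fixed_auth_header("token123")
print(f"{name}: {value}")
```

With the defaults this yields the Authorization: Bearer <api_key> shape mentioned above. Passing a different header_name and prefix changes only the two string parts.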
mcp_servers.<server-id>.authentication.oauth2.client_id
Optional statically configured OAuth client ID for authentication.mode = "oauth2".
If omitted, Erato attempts dynamic client registration and persists the returned client identity for later reuse.
Type: string | None
mcp_servers.<server-id>.authentication.oauth2.client_secret
Optional statically configured OAuth client secret for authentication.mode = "oauth2".
Use this for confidential clients that are pre-registered with the MCP server’s authorization server.
Type: string | None
mcp_servers.<server-id>.authentication.oauth2.client_name
Optional display name used during dynamic client registration when authentication.mode = "oauth2" and no static client_id is configured.
Type: string | None
Default behavior: Falls back to "Erato MCP OAuth Client"
mcp_servers.<server-id>.authentication.oauth2.scopes
Optional list of OAuth scopes requested during authorization for authentication.mode = "oauth2".
If omitted, Erato relies on the scope guidance advertised by the MCP server and its authorization metadata.
Type: array<string>
Example:
[mcp_servers.jira.authentication]
mode = "oauth2"
[mcp_servers.jira.authentication.oauth2]
client_id = "erato-jira"
client_secret = "super-secret"
scopes = ["read:jira-work", "offline_access"]
Operational note: oauth2 authentication requires server.encryption_key to be configured because Erato stores registered client secrets, authorization state, and user credentials encrypted at rest.
mcp_servers.<server-id>.max_session_idle_seconds
Optional per-server override for maximum MCP session idle time, in seconds.
If set, this value takes precedence over mcp_servers_global.max_session_idle_seconds for sessions created against this server.
Type: integer | None
Default behavior: Uses mcp_servers_global.max_session_idle_seconds (or 3600 when not set)
Example: 300
Operational note: Prefer low values for resource-intensive MCP servers where long-lived idle sessions can consume significant server resources.
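The precedence between the per-server value, the global value, and the built-in default can be sketched in Python as follows. This is an illustration of the documented rule only; effective_idle_seconds is a hypothetical helper, not Erato's implementation.

```python
from typing import Optional

# Documented default of mcp_servers_global.max_session_idle_seconds.
BUILTIN_DEFAULT_IDLE_SECONDS = 3600


def effective_idle_seconds(
    server_value: Optional[int],
    global_value: Optional[int],
) -> int:
    """Pick the idle timeout that applies to one server (hypothetical helper)."""
    if server_value is not None:
        return server_value  # per-server override takes precedence
    if global_value is not None:
        return global_value  # otherwise the global setting applies
    return BUILTIN_DEFAULT_IDLE_SECONDS  # neither is configured


print(effective_idle_seconds(300, 900))    # per-server override
print(effective_idle_seconds(None, 900))   # global setting
print(effective_idle_seconds(None, None))  # built-in default
```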
See the MCP Servers documentation for more information about Model Context Protocol integration.
prompt_optimizer
Configuration for the prompt optimizer. When enabled, the backend exposes the POST /prompt-optimizer endpoint to optimize user prompts using a dedicated chat provider and system prompt.
prompt_optimizer.enabled
Whether to enable the prompt optimizer feature.
Default value: false
Type: boolean
prompt_optimizer.chat_provider_id
The chat provider ID to use for prompt optimization. Must match a provider configured in chat_providers.
Type: string
prompt_optimizer.prompt
System prompt used by the prompt optimizer. If omitted, a built-in non-interactive default prompt is used. Uses the common prompt format.
Type: string | object | None
Example:
[prompt_optimizer]
enabled = true
chat_provider_id = "gpt-4o-mini"
# Optional override:
# prompt = "Rewrite the user's prompt to be clearer, more specific, and action-oriented. Return only the improved prompt."
Using Langfuse:
[prompt_optimizer]
enabled = true
chat_provider_id = "gpt-4o-mini"
prompt = { source = "langfuse", prompt_name = "prompt-optimizer", label = "production", fallback = "Rewrite the user's prompt to be clearer, more specific, and action-oriented. Return only the improved prompt." }
Using a static object:
[prompt_optimizer]
enabled = true
chat_provider_id = "gpt-4o-mini"
prompt = { source = "static", prompt = "Rewrite the user's prompt to be clearer, more specific, and action-oriented. Return only the improved prompt." }
user_preferences
Configuration for the user preferences feature.
user_preferences.enabled
Whether to enable the user preferences feature.
When disabled, the frontend hides the Personalization tab in the preferences dialog. Other preferences tabs can still be shown.
Default value: true
Type: boolean
Example:
[user_preferences]
enabled = false
user_preferences.data_tab_enabled
Whether the Data tab should be shown in the frontend preferences dialog.
This only controls frontend visibility of the tab. It does not change backend availability of account data actions.
Default value: true
Type: boolean
Example:
[user_preferences]
data_tab_enabled = false
assistants
Configuration for the assistants feature. This allows enabling the assistants functionality in the frontend, which provides users with pre-configured chat assistants tuned for specific use-cases.
Deprecated compatibility: The previous [experimental_assistants] section is still accepted when parsing configuration, but it is deprecated. Use [assistants] instead. The deprecated section is planned to be removed in Erato 0.7.0.
assistants.enabled
Whether to enable the assistants feature in the frontend. When enabled, the frontend will expose assistant-related functionality and UI components to users.
Default value: false
Type: boolean
Example:
[assistants]
enabled = true
assistants.show_recent_items
Whether to show recent assistant items in the chat sidebar.
Default value: false
Type: boolean
Example:
[assistants]
show_recent_items = true
assistants.context_warning_threshold
Threshold between 0.0 and 1.0 at or above which the assistant editor shows the overall context warning block.
Default value: 0.5
Type: number
Example:
[assistants]
context_warning_threshold = 0.6
assistants.context_file_contributor_threshold
Threshold between 0.0 and 1.0 at or above which a file is included in the assistant editor’s “largest file context contributors” list.
Default value: 0.05
Type: number
Example:
[assistants]
context_file_contributor_threshold = 0.01
assistants.max_system_prompt_length
Maximum number of characters allowed in an assistant system prompt. When not set, no server-side length limit is enforced and the frontend falls back to its built-in default for UI validation.
Default value: unset
Type: integer
Example:
[assistants]
max_system_prompt_length = 5000
starter_prompts
Configuration for welcome-screen starter prompts. Starter prompts prefill the chat input, optional facet selections, and an optional chat provider when clicked, but do not submit the message automatically.
Starter prompt titles and subtitles can be translated in the frontend with Lingui message IDs derived from the starter prompt id:
- starter_prompts.<starter-prompt-id>.title
- starter_prompts.<starter-prompt-id>.subtitle
If no translation exists for a given starter prompt id, the frontend falls back to the configured title and subtitle.
starter_prompts.enabled
Whether to enable starter prompts globally.
Default value: false
Type: boolean
Example:
[starter_prompts]
enabled = true
starter_prompts.prompts
Map of starter prompt configurations, keyed by starter prompt id.
Type: object
Example:
[starter_prompts.prompts.research_topic]
title = "Research a topic"
subtitle = "Prefill a research request with web search selected"
icon = "iconoir-globe"
prompt = "Research this topic and summarize the most important findings."
localized_prompts = { de = { source = "static", prompt = "Recherchiere dieses Thema und fasse die wichtigsten Erkenntnisse zusammen." } }
selected_facets = ["web_search"]
chat_provider = "mock-llm"
[starter_prompts.prompts.draft_email]
title = "Draft an email"
subtitle = "Write a concise and professional reply"
icon = "iconoir-mail"
prompt = { source = "static", prompt = "Draft a concise and professional reply to this email." }
Using Langfuse:
[starter_prompts.prompts.summarize_notes]
title = "Summarize notes"
subtitle = "Turn rough notes into a structured summary"
icon = "iconoir-page"
prompt = { source = "langfuse", prompt_name = "starter-summarize-notes", label = "production", fallback = "Summarize these notes into clear action items and decisions." }
Each starter prompt supports these fields:
- title - Fallback title shown when no frontend translation exists
- subtitle - Fallback subtitle shown when no frontend translation exists
- icon - Optional icon id
- prompt - Prompt source specification (uses common prompt format)
- localized_prompts - Optional map of locale code to prompt source specification (each value uses the common prompt format)
- selected_facets - Optional list of facet ids to preselect
- chat_provider - Optional chat provider id to preselect
localized_prompts is resolved in the backend against the authenticated user’s profile locale. Resolution order is:
- Exact locale match, for example de-DE
- Base language match, for example de
- Fallback to prompt
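The resolution order above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: prompts are plain strings instead of prompt source specifications, the base language is taken as the part before the first hyphen, and resolve_prompt is a hypothetical name, not an Erato function.

```python
from typing import Dict, Optional


def resolve_prompt(
    prompt: str,
    localized_prompts: Dict[str, str],
    locale: Optional[str],
) -> str:
    """Resolve a starter prompt for a user's profile locale (hypothetical helper)."""
    if locale:
        # 1. Exact locale match, e.g. "de-DE".
        if locale in localized_prompts:
            return localized_prompts[locale]
        # 2. Base language match, e.g. "de" for "de-DE" (assumed split rule).
        base_language = locale.split("-", 1)[0]
        if base_language in localized_prompts:
            return localized_prompts[base_language]
    # 3. Fallback to the default prompt.
    return prompt


localized = {"de": "Recherchiere dieses Thema.", "de-AT": "Recherchiere dieses Thema (AT)."}
print(resolve_prompt("Research this topic.", localized, "de-AT"))
print(resolve_prompt("Research this topic.", localized, "de-DE"))
print(resolve_prompt("Research this topic.", localized, "fr-FR"))
```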
starter_prompts.priority_order
List of starter prompt IDs in the order they should be displayed.
Type: list<string>
Default value: []
Example:
[starter_prompts]
priority_order = ["research_topic", "draft_email", "summarize_notes"]
experimental_facets
Configuration for the experimental facets feature. Facets allow users to select tool-focused modes (e.g. web search) with optional tool allowlists, model setting overrides, and system prompt additions.
Note: This is an experimental feature. The configuration key may change once the feature has stabilized.
experimental_facets.facets
Map of facet configurations, keyed by facet id. Each facet defines display properties and optional behavior.
Type: object
Example:
[experimental_facets.facets.extended_thinking]
display_name = "Extended thinking"
icon = "iconoir-lightbulb"
disable_facet_prompt_template = true
model_settings = { reasoning_effort = "high" }
[experimental_facets.facets.web_search]
display_name = "Web search"
icon = "iconoir-globe"
tool_call_allowlist = ["web-search-mcp/*", "web-access-mcp/*"]
additional_system_prompt = "Please execute one or multiple web searches to answer the user's question."
Using Langfuse:
[experimental_facets.facets.web_search]
display_name = "Web search"
icon = "iconoir-globe"
tool_call_allowlist = ["web-search-mcp/*", "web-access-mcp/*"]
additional_system_prompt = { source = "langfuse", prompt_name = "facet-web-search", label = "production", fallback = "Please execute one or multiple web searches to answer the user's question." }
Using a static object:
[experimental_facets.facets.web_search]
display_name = "Web search"
icon = "iconoir-globe"
tool_call_allowlist = ["web-search-mcp/*", "web-access-mcp/*"]
additional_system_prompt = { source = "static", prompt = "Please execute one or multiple web searches to answer the user's question." }
experimental_facets.priority_order
List of facet IDs in the order they should be displayed.
Type: list<string>
Default value: []
Example:
[experimental_facets]
priority_order = ["extended_thinking", "web_search"]
experimental_facets.tool_call_allowlist
Global tool allowlist that applies regardless of selected facets.
Type: list<string>
Default value: []
Example:
[experimental_facets]
tool_call_allowlist = ["web-search-mcp/*"]
experimental_facets.facet_prompt_template
Optional global prompt template injected when facets are selected. Set to an empty string to disable. Uses the common prompt format when set to an object.
Type: string | object | None
Default value: The user has requested the use of the "{{facet_display_name}}" feature.\n\nPrioritize the use of the following tools:\n{{facet_tools_list}}
Example:
[experimental_facets]
facet_prompt_template = ""
Using Langfuse:
[experimental_facets]
facet_prompt_template = { source = "langfuse", prompt_name = "facet-template", label = "production", fallback = "The user has requested the use of the \"{{facet_display_name}}\" feature.\n\nPrioritize the use of the following tools:\n{{facet_tools_list}}" }
Using a static object:
[experimental_facets]
facet_prompt_template = { source = "static", prompt = "The user has requested the use of the \"{{facet_display_name}}\" feature.\n\nPrioritize the use of the following tools:\n{{facet_tools_list}}" }
experimental_facets.only_single_facet
Whether only a single facet can be selected at the same time.
Type: boolean
Default value: false
experimental_facets.show_facet_indicator_with_display_name
Whether the chat box indicator should include the facet display name.
Type: boolean
Default value: false
experimental_facets.default_selected_facets
Facets that should be selected by default in the frontend (no backend logic is applied).
Type: list<string>
Default value: []
Example:
[experimental_facets]
default_selected_facets = ["web_search"]
action_facets
Configuration for action facets. Action facets are non-user-visible, parameterized extensions of the facets system that inject context-aware additional system prompts based on client-supplied action ID and arguments. Unlike regular facets, action facets are not selectable by the user — they are resolved from client context at request time.
Clients send an action_facet object (containing id and args) alongside the user’s message on submit, edit, and regenerate requests. The backend resolves the configured prompt template, validates the arguments against the allowed list, and injects the rendered prompt as an additional system prompt before generation.
Note: Action facet prompts are per-request and ephemeral. They do not replace or suppress any other prompt in the composition pipeline — the base system prompt is always preserved regardless of whether an action facet is active.
action_facets.enable_builtin_ms_office_addin
Whether to inject the builtin Outlook action facets used by the Microsoft Office add-in.
Type: boolean
Default value: false
When action_facets.enable_builtin_ms_office_addin = true, the backend injects these builtin Outlook action facets unless the same IDs are explicitly configured under action_facets.facets:
- outlook_rewrite_selection
- outlook_review_draft
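The "unless the same IDs are explicitly configured" rule amounts to a merge in which configured entries win over builtin ones. The Python sketch below illustrates that reading. The builtin templates are not given in this reference, so the values here are placeholders, and effective_action_facets is a hypothetical helper.

```python
from typing import Dict

# Placeholder bodies: the real builtin templates are not documented here.
BUILTIN_OUTLOOK_FACETS: Dict[str, dict] = {
    "outlook_rewrite_selection": {"template": "<builtin template>"},
    "outlook_review_draft": {"template": "<builtin template>"},
}


def effective_action_facets(
    configured: Dict[str, dict],
    enable_builtin_ms_office_addin: bool,
) -> Dict[str, dict]:
    """Combine builtin and configured action facets (hypothetical helper)."""
    facets: Dict[str, dict] = {}
    if enable_builtin_ms_office_addin:
        facets.update(BUILTIN_OUTLOOK_FACETS)
    # Explicitly configured IDs replace builtin facets with the same ID.
    facets.update(configured)
    return facets


custom = {"outlook_review_draft": {"template": "Review: {{draft}}"}}
print(sorted(effective_action_facets(custom, True)))
```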
action_facets.facets.<id>
Map of action facet configurations, keyed by action facet id. Each action facet defines a prompt template with declared argument variables.
Type: object
Example:
[action_facets.facets.excel_rewrite_selection]
display_name = "Excel Rewrite Selection"
platform = "excel"
template = """
The user is working in Excel on sheet "{{sheet_name}}".
The current selection type is "{{selection_type}}".
Rewrite the following selected content:
{{selected_text}}
"""
allowed_args = ["sheet_name", "selection_type", "selected_text"]
Each action facet supports these fields:
- display_name - Human-readable name for the action facet
- platform - Optional platform identifier (e.g. "excel", "outlook")
- template - Prompt template with {{variable}} placeholders
- allowed_args - List of allowed argument names for the template
Template Rendering
Templates use {{variable}} placeholders that are replaced with argument values supplied in the request. Rendering uses single-pass left-to-right scanning — substituted values are never re-scanned for further {{…}} patterns, preventing injection via argument values.
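A minimal Python sketch of single-pass rendering is shown here to make the injection argument concrete. It is not Erato's implementation: the placeholder grammar (word characters only) and the choice to leave unknown placeholders untouched are assumptions, and render_template is a hypothetical name.

```python
import re
from typing import Mapping

# Assumed placeholder grammar: {{name}} with word characters only.
_PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_template(template: str, args: Mapping[str, str]) -> str:
    """Replace {{name}} placeholders in one left-to-right pass (hypothetical helper)."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        # Assumption: placeholders without a supplied value stay as written.
        return args.get(name, match.group(0))

    # re.sub scans the template once; text returned by substitute() is
    # copied to the output and never scanned again.
    return _PLACEHOLDER.sub(substitute, template)


rendered = render_template(
    'Sheet "{{sheet_name}}": {{selected_text}}',
    {"sheet_name": "Q1 Plan", "selected_text": "{{sheet_name}}"},
)
print(rendered)
```

In this call the value of selected_text itself looks like a placeholder, yet it appears literally in the output because substituted text is not re-scanned.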
Validation
The backend validates each request against the action facet configuration:
- The action_facet.id must match a configured action facet; unknown IDs are rejected
- All keys in args must be present in allowed_args; unexpected keys are rejected
- Argument values are subject to a maximum size limit (10 KB per value)
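The three checks above can be sketched in Python as follows. This is illustrative only: validate_action_facet is a hypothetical helper, facets are reduced to their allowed_args lists, and interpreting "10 KB" as 10 * 1024 bytes of UTF-8 is an assumption.

```python
from typing import Dict, List

# Assumption: "10 KB" is measured as 10 * 1024 bytes of UTF-8 text.
MAX_ARG_BYTES = 10 * 1024


def validate_action_facet(
    allowed_args_by_id: Dict[str, List[str]],
    facet_id: str,
    args: Dict[str, str],
) -> List[str]:
    """Return validation errors; an empty list means the request is accepted."""
    if facet_id not in allowed_args_by_id:
        return [f"unknown action facet id: {facet_id}"]
    allowed = set(allowed_args_by_id[facet_id])
    errors: List[str] = []
    for key, value in args.items():
        if key not in allowed:
            errors.append(f"unexpected argument: {key}")
        elif len(value.encode("utf-8")) > MAX_ARG_BYTES:
            errors.append(f"argument too large: {key}")
    return errors


config = {"excel_rewrite_selection": ["sheet_name", "selection_type", "selected_text"]}
print(validate_action_facet(config, "excel_rewrite_selection", {"sheet_name": "Q1 Plan"}))
print(validate_action_facet(config, "excel_rewrite_selection", {"cell": "A1"}))
```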
Request Shape
The action_facet field is accepted on submit, edit, and regenerate endpoints. It is not accepted on resume, since the prompt was already composed when the generation started.
{
"action_facet": {
"id": "excel_rewrite_selection",
"args": {
"selection_type": "cell_text",
"sheet_name": "Q1 Plan",
"selected_text": "badly phrased content here"
}
}
}
Differences from Regular Facets
| Aspect | Regular Facets | Action Facets |
|---|---|---|
| User-visible | Yes, selectable in UI | No, resolved from client context |
| Parameterized | No, behavior is config-time only | Yes, runtime args from request |
| Request field | selected_facet_ids | action_facet |
| Prompt lifecycle | Toggle-state based, persistent across messages | Per-request, ephemeral |
| Base system prompt | May suppress historical replay on newly-enabled turn | Never suppresses — always additive |
budget
Configuration for budget tracking and display functionality. This feature allows you to track and display per-user spending based on token usage and model pricing.
Type: object
Default behavior: Budget tracking is disabled by default.
Example:
[budget]
enabled = true
max_budget = 100.0
budget_currency = "USD"
warn_threshold = 0.8
budget_period_days = 30
budget.enabled
Whether budget tracking and display is enabled. When disabled, no budget information will be calculated or displayed to users.
Type: boolean
Default value: false
Example: true
budget.max_budget
The maximum budget amount per budget period (unit-less). This value is used for calculating budget usage percentages and determining when warning thresholds are exceeded.
The budget amount is treated as unit-less for flexibility - the actual monetary value depends on your model pricing configuration and the configured display currency.
Type: number | None
Required when: budget.enabled = true
Example: 100.0
budget.budget_currency
The currency to use for display purposes in the frontend. This is purely cosmetic and doesn’t affect calculations - all budget amounts and costs are treated as unit-less values.
Type: string
Supported values: "USD", "EUR"
Default value: "USD"
Example: "EUR"
budget.warn_threshold
The threshold (between 0.0 and 1.0) at which to warn users about budget usage. When a user’s spending exceeds this fraction of max_budget, they will receive warnings in the UI.
For example, a value of 0.7 means users will be warned when they’ve used 70% of their budget.
Type: number
Default value: 0.7
Valid range: 0.0 to 1.0
Example: 0.8
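As a worked illustration of how warn_threshold relates to max_budget, the Python sketch below computes the used fraction and a warning flag. budget_status is a hypothetical helper. Whether the boundary itself triggers the warning is not specified here, so the inclusive comparison is an assumption.

```python
def budget_status(spent: float, max_budget: float, warn_threshold: float = 0.7) -> dict:
    """Compute budget usage for one period (hypothetical helper)."""
    if max_budget <= 0:
        raise ValueError("max_budget must be positive")
    if not 0.0 <= warn_threshold <= 1.0:
        raise ValueError("warn_threshold must be between 0.0 and 1.0")
    used_fraction = spent / max_budget
    # Assumption: reaching the threshold exactly already counts as a warning.
    return {"used_fraction": used_fraction, "warn": used_fraction >= warn_threshold}


# With max_budget = 100.0 and warn_threshold = 0.8, spending 85.0 triggers a warning.
print(budget_status(85.0, 100.0, 0.8))
print(budget_status(50.0, 100.0, 0.8))
```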
budget.budget_period_days
Number of days that counts as one budget period. Budget usage is calculated and reset based on this period length.
Type: number
Default value: 30
Example: 7 (for weekly budgets), 30 (for monthly budgets)
Complete Budget Configuration Example
[budget]
enabled = true
max_budget = 50.0
budget_currency = "EUR"
warn_threshold = 0.75
budget_period_days = 7
# You also need to configure model pricing for accurate cost calculations
[chat_providers]
priority_order = ["main"]
[chat_providers.providers.main]
provider_kind = "openai"
model_name = "gpt-4o"
api_key = "sk-your-key"
[chat_providers.providers.main.model_capabilities]
cost_input_tokens_per_1m = 5.0
cost_output_tokens_per_1m = 15.0
Note: Budget tracking requires that you configure accurate pricing information in your chat provider’s model_capabilities section. The budget calculations are based on actual token usage multiplied by the configured token prices.
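The note above describes cost as token usage multiplied by the configured prices. The Python sketch below shows that arithmetic for a single request, using the prices from the example. request_cost is a hypothetical helper, and the real accounting may include details not covered in this reference.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    cost_input_tokens_per_1m: float,
    cost_output_tokens_per_1m: float,
) -> float:
    """Unit-less cost of one request from per-million-token prices (hypothetical helper)."""
    input_cost = (input_tokens / 1_000_000) * cost_input_tokens_per_1m
    output_cost = (output_tokens / 1_000_000) * cost_output_tokens_per_1m
    return input_cost + output_cost


# 200,000 input tokens at 5.0 per 1M plus 100,000 output tokens at 15.0 per 1M.
cost = request_cost(200_000, 100_000, 5.0, 15.0)
print(round(cost, 4))
```

Summing such per-request costs over a budget period and dividing by max_budget gives the usage fraction that warn_threshold is compared against.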
caches
Configuration for in-memory caches that improve performance by reducing redundant file processing and token counting operations. Caches use memory-based size limits (in megabytes) rather than entry counts.
Type: object
Default behavior: file_contents_cache_mb, file_bytes_cache_mb, and token_count_cache_mb default to 100MB each. file_processing_parallelism defaults to 4.
Example:
[caches]
file_contents_cache_mb = 100
file_bytes_cache_mb = 100
token_count_cache_mb = 100
file_processing_parallelism = 4
caches.file_contents_cache_mb
Maximum memory size in megabytes for the file contents cache. This cache stores parsed file contents indexed by file ID to avoid repeatedly reading and parsing the same files.
When the cache reaches this size limit, the least recently used entries will be automatically evicted to make room for new entries.
Type: number
Default value: 100
Example: 200 (for 200MB cache), 50 (for 50MB cache)
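To illustrate what a memory-bounded cache with least-recently-used eviction looks like, here is a small Python sketch. It is not Erato's cache: the class name, the use of raw byte length as the entry size, and the rule of skipping entries larger than the whole cache are all assumptions.

```python
from collections import OrderedDict
from typing import Optional


class SizeBoundedLruCache:
    """LRU cache limited by total payload size instead of entry count (sketch)."""

    def __init__(self, max_mb: float) -> None:
        self.max_bytes = int(max_mb * 1024 * 1024)
        self.used_bytes = 0
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        value = self._entries.get(key)
        if value is not None:
            self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: bytes) -> None:
        old = self._entries.pop(key, None)
        if old is not None:
            self.used_bytes -= len(old)
        if len(value) > self.max_bytes:
            return  # assumption: oversized entries are not cached at all
        # Evict least recently used entries until the new value fits.
        while self._entries and self.used_bytes + len(value) > self.max_bytes:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)
        self._entries[key] = value
        self.used_bytes += len(value)


cache = SizeBoundedLruCache(max_mb=1)
cache.put("a", b"x" * 400 * 1024)
cache.put("b", b"x" * 400 * 1024)
cache.get("a")                     # "a" becomes most recently used
cache.put("c", b"x" * 400 * 1024)  # does not fit, so "b" is evicted
print(cache.get("b") is None, cache.get("a") is not None)
```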
caches.file_bytes_cache_mb
Maximum memory size in megabytes for the raw file bytes cache. This cache stores original file bytes (text and images) before extraction and parsing.
This cache helps avoid repeated file reads for frequently reused files and can be useful when documents are large or re-opened often.
Type: number
Default value: 100
Example: 300 (for 300MB cache), 75 (for 75MB cache)
caches.token_count_cache_mb
Maximum memory size in megabytes for the token count cache. This cache stores token counts indexed by file content to avoid repeatedly tokenizing the same content.
When the cache reaches this size limit, the least recently used entries will be automatically evicted to make room for new entries.
Type: number
Default value: 100
Example: 200 (for 200MB cache), 50 (for 50MB cache)
caches.file_processing_parallelism
Maximum number of file processing tasks that may run in parallel when cache misses require extraction, token counting, or related preprocessing.
Higher values can improve throughput for concurrent requests that trigger expensive file processing, while lower values reduce CPU and memory pressure.
Type: number
Default value: 4
Example: 8 for higher throughput, 2 for more conservative resource usage
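The effect of a parallelism limit can be sketched in Python with a semaphore that caps how many processing tasks run at once. This only illustrates the concept. Erato's backend is not written this way as far as this reference states, and process_files and the process callback are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List


async def process_files(
    file_ids: List[str],
    process: Callable[[str], Awaitable[str]],
    parallelism: int = 4,  # mirrors the documented default
) -> List[str]:
    """Run process() for every file, at most `parallelism` at a time (sketch)."""
    semaphore = asyncio.Semaphore(parallelism)

    async def bounded(file_id: str) -> str:
        async with semaphore:  # waits while `parallelism` tasks are running
            return await process(file_id)

    # gather() returns results in the same order as file_ids.
    return list(await asyncio.gather(*(bounded(f) for f in file_ids)))


async def fake_extract(file_id: str) -> str:
    await asyncio.sleep(0.01)  # stands in for expensive extraction work
    return file_id.upper()


print(asyncio.run(process_files(["a", "b", "c"], fake_extract, parallelism=2)))
```

A higher limit lets more cache misses be processed concurrently, while a lower limit keeps peak CPU and memory use down, which matches the trade-off described above.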
Complete Cache Configuration Example
# Configure caches for optimal performance
[caches]
# Allocate 200MB for file contents caching
file_contents_cache_mb = 200
# Allocate 150MB for raw file bytes caching
file_bytes_cache_mb = 150
# Allocate 100MB for token count caching
token_count_cache_mb = 100
# Allow up to 6 parallel file processing tasks
file_processing_parallelism = 6
Performance Considerations:
- Increase cache sizes if you have many users frequently accessing the same files
- Decrease cache sizes in memory-constrained environments
- The file contents cache is typically more beneficial than the token count cache
- Cache efficiency depends on file reuse patterns in your workflow