# Models

> Configure the Frontier and Workhorse AI tiers, provider by provider.

Models is where you wire Runwita to one or more AI providers. Two tiers, four providers each, fully independent. See [AI tiers](/concepts/ai-tiers) for what each tier does; this page is the configuration reference.

## The shape of the panel

You'll see two cards: **Frontier** (intelligence) and **Workhorse** (extraction and chat). Each has the same controls:

* **Provider toggle**, four buttons: Claude, OpenAI, Ollama, Custom.
* **API key**, password-masked input.
* **Base URL**, only visible for Ollama and Custom.
* **Model**, a dropdown populated from the provider's model-list endpoint (`/v1/models`, or `/api/tags` for Ollama) when you click *Load models*.

The Settings page auto-saves only the theme picker. Everything else stays in the form until you click **Save all** at the bottom.

## Provider: Claude (Anthropic)

The default for the Frontier tier. Recommended whenever output quality matters more than cost.

<Steps>
  <Step title="Get an API key">
    [console.anthropic.com](https://console.anthropic.com) → API Keys → Create key. Copy it.
  </Step>

  <Step title="Paste it into the API key field">
    On either or both tiers, depending on what you want Claude to handle.
  </Step>

  <Step title="Click Load models">
    The dropdown populates with every Claude model your key has access to. Newer models appear at the top.
  </Step>

  <Step title="Pick a model">
    Suggested defaults: **claude-haiku-4-5** for both tiers (cheap, fast, surprisingly capable). **claude-opus-4-5** for Frontier when budget allows. **claude-sonnet-4-5** for Workhorse when you want a step up from Haiku without going to Opus.
  </Step>

  <Step title="Save all">
    The bottom save bar commits your changes.
  </Step>
</Steps>

The base URL for Claude isn't user-editable; it's always `https://api.anthropic.com`.
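
To check a key outside Runwita, the sketch below builds the kind of request that lists Claude models against the fixed base URL. It is an illustration, not Runwita's own code: the function names are hypothetical, and it assumes Anthropic's public `GET /v1/models` endpoint with the `x-api-key` and `anthropic-version` headers.

```python
import json
import urllib.request

ANTHROPIC_BASE_URL = "https://api.anthropic.com"  # fixed, not user-editable


def build_claude_models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the Anthropic model list."""
    return urllib.request.Request(
        f"{ANTHROPIC_BASE_URL}/v1/models",
        headers={
            "x-api-key": api_key,               # Anthropic uses this header, not Bearer auth
            "anthropic-version": "2023-06-01",  # required API version header
        },
        method="GET",
    )


def claude_model_ids(response_body: str) -> list[str]:
    """Extract model identifiers from a model-list response, keeping API order."""
    payload = json.loads(response_body)
    return [entry["id"] for entry in payload.get("data", [])]
```

Passing the request to `urllib.request.urlopen` sends it. A `401` response usually means the key is wrong or revoked.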

## Provider: OpenAI

<Steps>
  <Step title="Get an API key">
    [platform.openai.com](https://platform.openai.com) → API keys → Create new secret key.
  </Step>

  <Step title="Paste it in">
    Same field. Same flow.
  </Step>

  <Step title="Click Load models">
    The dropdown filters to chat-completion-capable models only. Embeddings, audio, image, and reasoning-only models are stripped out, so you don't have to wade through 60+ entries.
  </Step>

  <Step title="Pick a model">
    Recommended defaults: **gpt-5.4-nano** for Workhorse (the cheapest credible option). **gpt-5** for Frontier, or for Workhorse if Workhorse is struggling. **gpt-4.1** if you need a larger output budget.
  </Step>
</Steps>

OpenAI's GPT-5 and o-series models take a different parameter shape than older models (`max_completion_tokens` instead of `max_tokens`, and no custom temperature). Runwita detects this automatically from the model name, so you don't have to configure anything.
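
As a rough illustration of name-based detection, the sketch below picks the parameter shape from the model identifier. The prefix rule and both function names are assumptions made for this example. Runwita's actual detection logic isn't documented here and may differ.

```python
def uses_new_parameter_shape(model: str) -> bool:
    """Guess whether a model expects max_completion_tokens (hypothetical rule)."""
    name = model.lower()
    is_gpt5 = name.startswith("gpt-5")
    # o-series names look like "o1", "o3-mini": the letter "o" followed by a digit.
    is_o_series = len(name) >= 2 and name[0] == "o" and name[1].isdigit()
    return is_gpt5 or is_o_series


def build_chat_params(model: str, messages: list, output_budget: int,
                      temperature: float = 0.2) -> dict:
    """Assemble chat-completion parameters in the shape the model accepts."""
    params = {"model": model, "messages": messages}
    if uses_new_parameter_shape(model):
        # Newer models: different token parameter, and no custom temperature.
        params["max_completion_tokens"] = output_budget
    else:
        params["max_tokens"] = output_budget
        params["temperature"] = temperature
    return params
```

For example, `build_chat_params("gpt-5", messages, 16384)` yields a payload with `max_completion_tokens` and no `temperature` key, while the same call with `"gpt-4.1"` uses `max_tokens`.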

## Provider: Ollama (local)

Ollama runs a model on your own machine. Zero API cost, zero data leaves your laptop. It's slower and less capable than the cloud options, but for privacy or offline use it's the answer.

<Steps>
  <Step title="Install Ollama">
    [ollama.com/download](https://ollama.com/download). Runs as a background service.
  </Step>

  <Step title="Pull a model">
    Open Terminal and run `ollama pull qwen3:8b`. (Qwen3-8B is the current recommended default for Runwita-style extraction tasks. Llama 3 and Phi-3 also work.)
  </Step>

  <Step title="Configure the tier">
    Provider: Ollama. Base URL: `http://localhost:11434/v1` (the default; only change it if you've moved Ollama somewhere else). Click *Load models*; the dropdown shows every model you have pulled.
  </Step>

  <Step title="Pick the model">
    `qwen3:8b` is the safe default. Smaller models (`qwen3:4b`, `phi3:mini`) are faster, but output quality drops off sharply for topic matching.
  </Step>
</Steps>

The API key field is unused for Ollama (Ollama doesn't require auth), but the form keeps it visible for consistency. Leave it blank.

## Provider: Custom (any OpenAI-compatible endpoint)

For LiteLLM, vLLM, Together, OpenRouter, your own proxy, or anything else that speaks the OpenAI chat completions API.

<Steps>
  <Step title="Set the base URL">
    The full URL up to but not including `/chat/completions`. For LiteLLM that's typically `https://your-litellm-host.example/v1`. For OpenRouter it's `https://openrouter.ai/api/v1`.
  </Step>

  <Step title="Set the API key">
    Whatever your endpoint expects. Paste it in.
  </Step>

  <Step title="Click Load models">
    Hits `${base_url}/models`. If your endpoint exposes a model list, the dropdown populates.
  </Step>

  <Step title="Pick the model">
    The exact model identifier your endpoint uses. *"claude-3-5-sonnet"* on a LiteLLM proxy is different from *"anthropic/claude-3-5-sonnet"* on OpenRouter, so watch the names.
  </Step>
</Steps>

If your custom endpoint doesn't expose a model list at `${base_url}/models`, you can still use Custom. The *Load models* button errors out, but you can pick from previously saved models or type the model name in; the model field is editable text.
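
The base URL rule ("up to but not including `/chat/completions`") can be sketched as a small helper that derives both endpoints from one base URL. The function name and the error message are hypothetical. The two paths are the standard ones for OpenAI-compatible servers.

```python
def endpoint_urls(base_url: str) -> dict[str, str]:
    """Derive the chat and model-list URLs from an OpenAI-compatible base URL."""
    root = base_url.strip().rstrip("/")
    if root.endswith("/chat/completions"):
        # A common mistake: pasting the full chat URL instead of the base.
        raise ValueError("base URL must stop before /chat/completions")
    return {
        "chat": f"{root}/chat/completions",
        "models": f"{root}/models",
    }


urls = endpoint_urls("https://openrouter.ai/api/v1/")
print(urls["chat"])    # https://openrouter.ai/api/v1/chat/completions
print(urls["models"])  # https://openrouter.ai/api/v1/models
```

Stripping the trailing slash first keeps a base URL pasted with or without it from producing a double slash.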

## Token budgets

Runwita sends generous output token budgets to give the AI room to produce thorough notes:

* Single-engagement extraction: **16,384 tokens**.
* Merged extractions (multiple inbox items combined): **32,768 tokens**.
* Frontier intelligence calls: **6,144 tokens** (these are short outputs).

Smaller models may not fill these budgets, and that's fine. Bigger models with smaller native output caps (some open models cap at 4K or 8K) might hit a truncation error. If you see *"AI response was truncated"*, switch to a model with a higher output cap. The error message tells you which model truncated.
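
The sketch below shows how a client could map these budgets to call types and spot a truncated reply. The dictionary keys, function names, and error wording are hypothetical. The truncation signals are the ones the provider APIs report: `finish_reason: "length"` in OpenAI-style responses and `stop_reason: "max_tokens"` in Anthropic responses.

```python
# Output token budgets from the list above, keyed by hypothetical call-type names.
OUTPUT_BUDGETS = {
    "single_extraction": 16_384,
    "merged_extraction": 32_768,
    "frontier_intelligence": 6_144,
}


def was_truncated(response: dict) -> bool:
    """Return True if the provider stopped because the output cap was reached."""
    if "choices" in response:  # OpenAI-compatible response shape
        choices = response["choices"]
        return bool(choices) and choices[0].get("finish_reason") == "length"
    return response.get("stop_reason") == "max_tokens"  # Anthropic response shape


def check_response(model: str, response: dict) -> None:
    """Raise an error naming the model when its reply was cut off."""
    if was_truncated(response):
        raise RuntimeError(f"AI response was truncated (model: {model})")
```

A response that ends normally (`finish_reason: "stop"` or `stop_reason: "end_turn"`) passes through `check_response` without raising.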

## What "Load models" actually does

Clicking *Load models* makes a real HTTP request from your machine to the provider's `/v1/models` endpoint (or `/api/tags` for Ollama). It uses your API key. Two implications:

1. If the request fails (bad key, wrong endpoint, network issue), you'll see the exact error inline. That's diagnostic info, not just *"something's wrong"*.
2. The model list reflects what your account actually has access to *right now*. If OpenAI rolls out a new model and your key is on the right tier to use it, it appears in the dropdown automatically.

## Switching providers preserves your settings

Each provider has its own API-key slot per tier, so you can configure both Claude and OpenAI, switch back and forth, and keep your keys. The active provider is what gets used for extractions; the others sit dormant.
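
One way to picture this is a per-tier record holding one key slot per provider plus a pointer to the active one. The class and field names below are hypothetical and only model the behavior described here. They are not Runwita's storage format.

```python
from dataclasses import dataclass, field

PROVIDERS = ("claude", "openai", "ollama", "custom")


@dataclass
class TierSettings:
    """Settings for one tier: an active provider and a key slot per provider."""
    active_provider: str = "claude"
    api_keys: dict = field(default_factory=lambda: {name: "" for name in PROVIDERS})

    def switch_to(self, provider: str) -> None:
        """Change the active provider without touching any stored key."""
        if provider not in PROVIDERS:
            raise ValueError(f"unknown provider: {provider}")
        self.active_provider = provider

    def active_key(self) -> str:
        """Return the key for whichever provider is currently active."""
        return self.api_keys[self.active_provider]


# Two independent tiers, each with its own slots.
frontier, workhorse = TierSettings(), TierSettings()
frontier.api_keys["claude"] = "key-for-claude"
frontier.api_keys["openai"] = "key-for-openai"
frontier.switch_to("openai")  # the Claude key stays stored, just dormant
```

Because switching only moves the pointer, going back to Claude later finds its key still in place.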

## What's next

<CardGroup cols={2}>
  <Card title="Connections" icon="link" href="/settings/connections">
    TickTick sync, MeetingScribe folder, transcription cleanup.
  </Card>

  <Card title="AI tiers (concept)" icon="brain" href="/concepts/ai-tiers">
    What each tier does, when to upgrade which.
  </Card>
</CardGroup>
