Model Clients

Paddles routes action selection and final rendering through HTTP model clients. Local-first setups should run a service such as Ollama and select models with the ollama:<model> provider form. Model weights, residency, batching, and hardware placement stay inside that HTTP service rather than the Paddles turn runtime.
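As an illustration of the ollama:<model> form, here is a minimal sketch of splitting a selection string into its provider and model parts. The names ModelSelection and parse_selection are hypothetical and are not part of Paddles. Splitting on the first colon only is also an assumption, made so that Ollama-style model tags survive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSelection:
    """A parsed '<provider>:<model>' selection (hypothetical type)."""
    provider: str
    model: str


def parse_selection(spec: str) -> ModelSelection:
    """Split a selection such as 'ollama:qwen3' into provider and model.

    Assumption: only the first colon separates the two parts, so a model
    name that carries its own tag (for example 'ollama:qwen3:8b') is kept
    intact as 'qwen3:8b'.
    """
    provider, sep, model = spec.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected '<provider>:<model>', got {spec!r}")
    return ModelSelection(provider=provider, model=model)


if __name__ == "__main__":
    print(parse_selection("ollama:qwen3"))
```

Rejecting a bare model name keeps the provider explicit instead of guessing at one.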

Turn Runtime Roles

| Role | What it does | Optimized for |
|---|---|---|
| Action selection | Recursive investigation: search, read, refine, branch | Tool use, multi-step reasoning, evidence gathering |
| Final rendering | User-facing answer from evidence and trace context | Answer quality, grounding, citation |
| Retrieval | Evidence gathering for search and refine actions | Local workspace indexing and ranking |

By default, action selection and final rendering use the same HTTP model client. The --action-selection-provider and --action-selection-model flags override the client used for action selection without changing the final renderer. The legacy --planner-* spellings are still accepted as migration aliases.
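The defaulting rule can be sketched as a small resolver. This is an illustration only: resolve_roles and ClientChoice are hypothetical names, and the fallback used when only one of the two override values is supplied is an assumption, not documented Paddles behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClientChoice:
    """Provider and model chosen for one runtime role (hypothetical type)."""
    provider: str
    model: str


def resolve_roles(
    model: str,
    action_selection_provider: Optional[str] = None,
    action_selection_model: Optional[str] = None,
) -> dict:
    """Map each role to a client, mirroring the CLI flags shown on this page."""
    provider, _, name = model.partition(":")
    final = ClientChoice(provider, name)
    # Assumption: a missing override falls back to the --model value.
    action = ClientChoice(
        action_selection_provider or final.provider,
        action_selection_model or final.model,
    )
    return {"action_selection": action, "final_rendering": final}


if __name__ == "__main__":
    print(resolve_roles("openai:gpt-5.4", "ollama", "qwen3"))
```

With no overrides, both roles resolve to the same client, which matches the default described above.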

Current Runtime Choices

| Selection | Example | Notes |
|---|---|---|
| Local HTTP model client | ollama:qwen3 | OpenAI-compatible local service; model process is outside Paddles |
| Remote HTTP model client | openai:gpt-5.4 | Requires provider credentials |
| Retrieval provider | sift-direct | Direct local retrieval for search/refine actions |
| Experimental retrieval boundary | context-1 | Fail-closed until the harness is ready |

Search Contract Parameters

search and refine actions share a retrieval contract that controls retrieval depth and strategy:

  • mode: linear or graph
  • strategy: lexical (BM25) or hybrid (BM25 + vector + RRF)
  • step_limit: per-call retrieve budget
  • max_items / max_snippet_chars / max_summary_chars: evidence bundle caps
  • retained_artifact_limit: number of retained artifacts preserved for the action trace
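The parameters above can be pictured as one validated record. The sketch below is hypothetical: SearchContract is not a Paddles type, no defaults are supplied because this page does not state them, and the numbers in the usage example are placeholders rather than recommended values.

```python
from dataclasses import dataclass

MODES = {"linear", "graph"}
STRATEGIES = {"lexical", "hybrid"}
LIMIT_FIELDS = (
    "step_limit",
    "max_items",
    "max_snippet_chars",
    "max_summary_chars",
    "retained_artifact_limit",
)


@dataclass(frozen=True)
class SearchContract:
    """Retrieval contract shared by search and refine (hypothetical type)."""
    mode: str
    strategy: str
    step_limit: int
    max_items: int
    max_snippet_chars: int
    max_summary_chars: int
    retained_artifact_limit: int

    def __post_init__(self) -> None:
        if self.mode not in MODES:
            raise ValueError(f"mode must be one of {sorted(MODES)}")
        if self.strategy not in STRATEGIES:
            raise ValueError(f"strategy must be one of {sorted(STRATEGIES)}")
        # Assumption: negative limits are never meaningful.
        for name in LIMIT_FIELDS:
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must not be negative")


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(SearchContract("graph", "hybrid", 4, 8, 400, 800, 5))
```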

Current defaults are documented on the Search and Retrieval page.
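The hybrid strategy combines a BM25 ranking and a vector ranking with RRF (reciprocal rank fusion). The following is a generic sketch of RRF, not the Paddles implementation. The constant k = 60 is a commonly used value and is an assumption here, as is breaking ties by document ID.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain
    it, with ranks starting at 1. Higher fused scores sort first, and
    ties are broken by document ID so the output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    bm25 = ["a.rs", "b.rs", "c.rs"]    # lexical ranking (made-up IDs)
    vector = ["b.rs", "c.rs", "a.rs"]  # vector ranking (made-up IDs)
    print(reciprocal_rank_fusion([bm25, vector]))  # ['b.rs', 'a.rs', 'c.rs']
```

RRF uses only rank positions, so BM25 scores and vector similarities do not need to be normalized onto a shared scale before fusion.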

Multi-Provider Support

Paddles supports multiple HTTP API providers:

  • Ollama for local HTTP model services
  • OpenAI-compatible APIs
  • Anthropic API
  • Gemini API

Each provider is accessed through the same model-client contract. Credentials are managed per-provider through the TUI login flow or environment variables.
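A shared contract with per-provider credentials might look like the sketch below. Everything in it is an assumption made for illustration: the ModelClient protocol, its complete method, and the environment variable names are hypothetical and may not match what Paddles actually reads. The TUI login flow is not modelled.

```python
import os
from typing import Mapping, Optional, Protocol


class ModelClient(Protocol):
    """Hypothetical shared contract: every provider exposes the same call."""

    def complete(self, prompt: str) -> str: ...


# Hypothetical variable names; the real names may differ.
CREDENTIAL_ENV = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def credential_for(
    provider: str, env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Return the credential for a provider, or None if it needs none.

    Assumption: a provider with no entry in CREDENTIAL_ENV, such as a
    local Ollama service, runs without a credential.
    """
    var = CREDENTIAL_ENV.get(provider)
    if var is None:
        return None
    value = env.get(var)
    if not value:
        raise RuntimeError(f"set {var} to use provider {provider!r}")
    return value


if __name__ == "__main__":
    print(credential_for("ollama"))
```

Passing the environment in as a mapping keeps the lookup testable without touching real credentials.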

Routing In Practice

A local HTTP model for both action selection and final rendering:

paddles --model ollama:qwen3

A remote final renderer with a different action-selection client:

paddles --model openai:gpt-5.4 --action-selection-provider ollama --action-selection-model qwen3

The harness does not fall back silently: if a remote provider is unavailable, the turn fails closed with a clear error rather than producing an ungrounded answer.
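Fail-closed handling can be sketched as follows. ProviderUnavailable and render_turn are hypothetical names, and treating ConnectionError as the unavailability signal is an assumption. The point of the sketch is that no fallback answer is produced when the provider call fails.

```python
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Raised when a turn cannot reach its model provider (hypothetical)."""


def render_turn(
    evidence: Sequence[str],
    render: Callable[[Sequence[str]], str],
    provider: str,
) -> str:
    """Render an answer from evidence, or fail closed with a clear error."""
    try:
        return render(evidence)
    except ConnectionError as exc:
        # No fallback text is returned: the caller sees the failure.
        raise ProviderUnavailable(
            f"provider {provider!r} is unavailable; turn aborted"
        ) from exc


if __name__ == "__main__":
    def offline(_evidence):
        raise ConnectionError("connection refused")

    try:
        render_turn(["src/main.rs: fn main()"], offline, "openai")
    except ProviderUnavailable as err:
        print(err)
```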