LiteLLM Proxy

tracer never points the agent directly at your upstream model API. Instead, each job starts its own LiteLLM proxy, which normalizes the endpoint, exposes both OpenAI- and Anthropic-compatible protocols, and attaches the trajectory logger.

Why a proxy

Protocol bridging — different agent scaffolds expect OpenAI or Anthropic message formats. The proxy serves both from one upstream model.
Trajectory capture — the proxy's logger writes each request/response to litellm-trajectory.jsonl, which is what makes the run reproducible and convertible to SFT data.
Isolation — each job gets a fresh, per-job config so concurrent or sequential jobs don't share state.

Configuration

The upstream model is declared in config.yaml under runtime_info.input.llm_api:

llm_api:
  api_key: dummy-key
  api_base_url: "http://<your-production-endpoint>/v1"
  model: "openai/MiniMax-M2.7"
  protocols: [openai_compatible, anthropic_compatible]
  served_via: per_job_litellm_proxy

The proxy itself is configured under runtime_info.input.litellm_proxy:

litellm_proxy:
  config_template: scripts/serve_llm/litellm_config.example.yaml
  port: 4001
  master_key: dummy-key-cf

Lifecycle

scripts/start.sh handles the proxy automatically:

Renders a per-job config from the template into artifacts/litellm/<job>/litellm_config_tracer.yaml.
Starts the proxy on litellm_proxy.port, passing the config to Harbor's LiteLLM serve script via LITELLM_CONFIG.
Runs the Harbor job against the proxy.

Stop the proxy after the job

The proxy is a long-running process. When a job's inference is finished but the proxy is still up, stop the process started for this job — and only that one. Leaving it running holds the port and the upstream connection.

LiteLLM Proxy

Why a proxy

Configuration

Lifecycle

On this page