trajgen

Reference

Inputs & Outputs

The config-driven input/output contract

trajgen is configured entirely through config.yaml. It uses config.yaml rather than a separate inputs.yaml because the file describes a full Harbor run profile, not just upstream inputs. Treat config.yaml as the source of truth before launching a job.

Inputs

NameDescriptionSource
repositoriesGit remote/ref/path/read-only policy for harbor and swe_data_processexternal
environmentHarbor uv env, LiteLLM venv, and swe_data_process uv envexternal
llm_apiRaw upstream model API used by the per-job LiteLLM proxyexternal
litellm_proxyLiteLLM config template, port, and master keyexternal
task_sourceSWE task source (wired from swegen)dependency
harbor_jobjobs dir, concurrency, retries, timeout multiplier, task capexternal
agentHarbor agent scaffold, version, runtime image, sampling controlsexternal
sft_conversionOptional post-job conversion stepexternal
HARBOR_EXCLUDE_TASKSSpace-separated task IDs Harbor must skipderived

The only external value you normally fill is runtime_info.input.llm_api; the task source is wired from swegen via meta_info.dependencies.

Active runtime values

Excerpted from config.yaml:

llm_api:
  api_key: dummy-key
  api_base_url: "http://llm10.jierungogogo.com/v1"
  model: "openai/MiniMax-M2.7"
  protocols: [openai_compatible, anthropic_compatible]
  served_via: per_job_litellm_proxy
litellm_proxy:
  config_template: scripts/serve_llm/litellm_config.example.yaml
  port: 4001
  master_key: dummy-key-cf
task_source:
  provider: local
  dataset_name: ../swegen/artifacts/swe_tasks/py-cc
harbor_job:
  jobs_dir: artifacts/jobs
  n_concurrent: 8
  n_tasks: 16
  max_retries: 2
  timeout_multiplier: 2
agent:
  name: custom-claude-code
  version: 2.1.118
  runtime_image: docker.io/jierun/c-cc-2.1.118:v0.1
  max_turns: 80
  temperature: 0.7
sft_conversion:
  enabled: false
  scaffold: auto
  out_dir: artifacts/sft_data

Outputs

trajgen declares its handoff contract in config.yamlruntime_info.output:

OutputPathFormatConsumer
raw_trajectories_dirartifacts/jobs/artifacts/jobs/<job>/<task>/agent/litellm-trajectory.jsonlsft
sft_data_dirartifacts/sft_data/artifacts/sft_data/<job>/lf.json (LLaMA-Factory)sft

See Results & Artifacts for the full layout, and Config Variants for running experiments without disturbing the active profile.

On this page