Shell Commands

lint

TPV config files can be checked for linting errors using the tpv lint command.

tpv lint <url_or_path_to_config_file>

If linting succeeds, a lint successful message is displayed and the command exits with code 0. If linting fails, a lint failed message with the relevant error is displayed and the command exits with code 1. For example:

$ cat >good.yml <<EOF
tools:
  default:
    cores: 1
    mem: cores * 3.9
    context:
      partition: normal
    params:
      native_specification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={round(mem*1024)} --partition={partition}"
    scheduling:
      reject:
        - offline
    rules: []
EOF
$ tpv lint good.yml
INFO : tpv.commands.shell: lint successful.
$ echo $?
0

$ cat >bad.yml <<EOF
tools:
  - default:
      cores: 1
EOF
$ tpv lint bad.yml
INFO : tpv.commands.shell: lint failed.
$ echo $?
1

To display the reasons for the failure, use the -v option to increase verbosity. Each additional v (for example, -vv) raises the log level further.
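
Because tpv lint reports its result through the exit code, it can be wired into a pre-deployment check. The following is a minimal sketch under stated assumptions, not part of TPV itself: the helper names build_lint_command and lint_all are hypothetical, and the only TPV behavior relied on is the exit code (0 or 1) and the `tpv -v lint <file>` invocation form used on this page.

```python
import subprocess
from typing import Any, Callable, Iterable


def build_lint_command(config: str, verbosity: int = 0) -> list[str]:
    """Build the argv for `tpv lint` (hypothetical helper, not a TPV API).

    The verbosity flag is placed before the subcommand, mirroring the
    `tpv -vv lint <file>` form shown in this document.
    """
    cmd = ["tpv"]
    if verbosity > 0:
        cmd.append("-" + "v" * verbosity)
    cmd += ["lint", config]
    return cmd


def lint_all(
    configs: Iterable[str],
    run: Callable[[list[str]], Any] = subprocess.run,
) -> list[str]:
    """Lint each config and return the ones whose exit code was non-zero."""
    failed = []
    for config in configs:
        # -v is used so that the reason for a failure appears in the log.
        result = run(build_lint_command(config, verbosity=1))
        if result.returncode != 0:
            failed.append(config)
    return failed
```

A caller could run `lint_all(["good.yml", "bad.yml"])` and abort a deployment when the returned list is non-empty. The run parameter exists only so that the subprocess call can be swapped out in tests.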

Type-checking and unrecognized fields

The TPV linter can perform type-checking on embedded code and can issue warnings when unrecognized fields (fields not defined in TPV's pydantic schema) are encountered. Type-checking is particularly valuable for detecting potential syntax errors, inadequately guarded code, and similar problems before they turn into runtime errors. TPV follows mypy's strict mode, and we recommend correcting all type-related warnings before deploying your TPV configuration. To view these warnings, repeat the verbosity flag.

tpv -vv lint <url_or_path_to_config_file>

$ tpv -vv lint --preserve-temp-code tests/fixtures/linter/linter-types-undefined-variable.yml
WARNING: tpv.commands.linter: T103: /var/folders/0j/l_xj28m94cb7fq5nnk03m7p80000gs/T/tmpym33_1uz.py:64: error: Incompatible return value type (got "set[Any]", expected "int | float | str | None")  [return-value]
WARNING: tpv.commands.linter: T103: /var/folders/0j/l_xj28m94cb7fq5nnk03m7p80000gs/T/tmpym33_1uz.py:64: error: Name "something" is not defined  [name-defined]
WARNING: tpv.commands.linter: T103: /var/folders/0j/l_xj28m94cb7fq5nnk03m7p80000gs/T/tmpym33_1uz.py:68: error: Name "mem2" is not defined  [name-defined]
WARNING: tpv.commands.linter: T104: Unexpected field '.destinations.local.unknown' - make sure the field is nested correctly or manually silence warning
WARNING: tpv.commands.linter: T104: Unexpected field '.destinations.local.if' - make sure the field is nested correctly or manually silence warning
INFO : tpv.commands.shell: lint successful.

Note the use of the -vv flag, which has a single hyphen; each repeated occurrence of v increases the verbosity level. Note also the use of the --preserve-temp-code flag. This flag retains the generated Python code so that the line numbers indicated in the errors can be inspected. In the generated file, each code block is enclosed in a dedicated function, and the function name can be used to infer the affected code block.

$ cat /var/folders/0j/l_xj28m94cb7fq5nnk03m7p80000gs/T/tmpym33_1uz.py

This reveals the code that was type-checked, where you can find the line numbers pinpointed by mypy:

# This file was autogenerated by TPVConfigLinter for mypy checks.
import logging
from typing import Annotated, Any, ClassVar, Iterable

from galaxy.app import UniverseApplication
from galaxy.jobs import JobWrapper
from galaxy.model import Job, User
from galaxy.tools import Tool as GalaxyTool

from tpv.core import helpers
from tpv.core.entities import Destination, Entity, SchedulingTags
from tpv.core.mapper import EntityToDestinationMapper

log = logging.getLogger(__name__)

# --- 1. Declare global "context" variables ---
app: UniverseApplication
tool: GalaxyTool
user: User | None
job: Job
job_wrapper: JobWrapper | None
resource_params: dict[str, Any] | None
workflow_invocation_uuid: str | None
mapper: EntityToDestinationMapper
entity: Entity

# --- 2. Declare evaluation time "context" variables ---
cores: int | float | str | None
mem: int | float | str | None
gpus: int | str | None
min_cores: int | float | str | None
min_mem: int | float | str | None
min_gpus: int | str | None
max_cores: int | float | str | None
max_mem: int | float | str | None
max_gpus: int | str | None
max_accepted_cores: int | float | None
max_accepted_mem: int | float | None
max_accepted_gpus: int | None
min_accepted_cores: int | float | None
min_accepted_mem: int | float | None
min_accepted_gpus: int | None
env: list[dict[str, str]] | None
params: dict[str, Any] | None
resubmit: dict[str, dict[str, str | int | float | None]] | None
rank: str | None
context: dict[str, Any] | None
handler_tags: SchedulingTags | None
candidate_destinations: list[Destination]
dest_name: str | None
input_size: float

# --- 3. Declare user defined "context" variables ---


# --- 4. User defined, evaluable entity fields ---

def tool_default_cores() -> int | float | str | None:
    return 2

def tool_default_mem() -> int | float | str | None:
    return {something}

def tool_default_params_native_spec() -> str:
    return f'''--mem {mem2}'''
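
The linter warnings shown earlier in this section have a regular shape: a WARNING prefix, a code such as T103 or T104, and a message that, for type errors, embeds a mypy location of the form `<file>.py:<line>: error: ...`. The sketch below groups such lines into structured records so they can be counted or filtered. It is an illustration only: the regular expressions are inferred from the sample output on this page and are an assumption about the log format, and the names LintWarning and parse_warnings are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Patterns inferred from the sample output above; the real log format may differ.
WARNING_RE = re.compile(
    r"^WARNING: tpv\.commands\.linter: (?P<code>T\d+): (?P<message>.*)$"
)
MYPY_RE = re.compile(r"^(?P<path>.+?\.py):(?P<line>\d+): error: (?P<detail>.*)$")


class LintWarning(NamedTuple):
    code: str             # e.g. "T103" or "T104"
    message: str          # mypy detail, or the full message for other warnings
    line: Optional[int]   # line in the generated file, if mypy reported one


def parse_warnings(log_text: str) -> list[LintWarning]:
    """Extract linter warnings from captured `tpv -vv lint` log text."""
    warnings = []
    for raw in log_text.splitlines():
        match = WARNING_RE.match(raw.strip())
        if not match:
            continue  # skip INFO lines and anything unrecognized
        message = match.group("message")
        location = MYPY_RE.match(message)
        if location:
            warnings.append(
                LintWarning(
                    match.group("code"),
                    location.group("detail"),
                    int(location.group("line")),
                )
            )
        else:
            warnings.append(LintWarning(match.group("code"), message, None))
    return warnings
```

The reported line numbers refer to the generated Python file, so they are only useful together with --preserve-temp-code.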

dry-run

You can test that your TPV configuration returns the expected destination for a given tool and/or user using the tpv dry-run command.

tpv dry-run --job-conf <path_to_galaxy_job_conf_file> [--tool <tool_id>] \
    [--user <user_name_or_email>] [--input-size <size_in_gb>] \
    [tpv_config_file ...]

If no TPV config files are specified on the command line, they will be read from the tpv_dispatcher execution environment (destination) definition in the specified Galaxy job configuration file.

For example:

$ tpv dry-run --job-conf /srv/galaxy/config/job_conf.yml
!!python/object:galaxy.jobs.JobDestination
converted: false
env:
- {name: LC_ALL, value: C}
id: slurm
legacy: false
params: {native_specification: --nodes=1 --ntasks=1 --ntasks-per-node=1 --mem=3994
    --partition=normal, outputs_to_working_directory: true, tmp_dir: true}
resubmit: []
runner: slurm
shell: null
tags: null
url: null
$ tpv dry-run --job-conf /srv/galaxy/config/job_conf.yml --tool trinity --input-size 40 *.yml
!!python/object:galaxy.jobs.JobDestination
converted: false
env:
- {name: LC_ALL, value: C}
- {name: TERM, value: vt100}
- {execute: ulimit -c 0}
- {execute: ulimit -u 16384}
id: pulsar
legacy: false
params:
  default_file_action: remote_transfer
  dependency_resolution: remote
  jobs_directory: /scratch/pulsar/staging
  outputs_to_working_directory: false
  remote_metadata: false
  rewrite_parameters: true
  submit_native_specification: --nodes=1 --ntasks=20 --ntasks-per-node=20 --partition=xlarge
  transport: curl
resubmit: []
runner: pulsar
shell: null
tags: null
url: null
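
When the same dry run needs to be repeated for several tools or input sizes, it can help to assemble the command line programmatically. The sketch below only builds the argument list from the options documented above (--job-conf, --tool, --user, --input-size, and trailing config files). The function name build_dry_run_command is hypothetical, and nothing here is part of TPV's own API.

```python
from typing import Optional, Sequence, Union


def build_dry_run_command(
    job_conf: str,
    tool: Optional[str] = None,
    user: Optional[str] = None,
    input_size: Optional[Union[int, float]] = None,
    configs: Sequence[str] = (),
) -> list[str]:
    """Build the argv for `tpv dry-run` (hypothetical helper).

    Optional flags are emitted only when a value is given. Config files
    go last, matching the usage line shown in this section.
    """
    cmd = ["tpv", "dry-run", "--job-conf", job_conf]
    if tool is not None:
        cmd += ["--tool", tool]
    if user is not None:
        cmd += ["--user", user]
    if input_size is not None:
        cmd += ["--input-size", str(input_size)]  # size in GB
    cmd += list(configs)
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`, for example inside a loop over tool ids, to collect the destination YAML for each case.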

Explain mode

The --explain flag provides a detailed decision trace showing how TPV arrived at its scheduling result. This is useful for debugging why a particular tool was routed to a specific destination, or why a mapping failed.

tpv dry-run --job-conf <path_to_galaxy_job_conf_file> --tool <tool_id> --explain \
    [--user <user_name_or_email>] [--input-size <size_in_gb>] \
    [--output-format text|yaml] [tpv_config_file ...]

The trace is written to stderr (so it can be separated from the destination YAML on stdout) and includes the following phases:

  • Configuration Loading – which config files were loaded

  • Entity Matching – which tool, user, and role entities were matched

  • Entity Combining – how matched entities were merged

  • Rule Evaluation – which rules matched or did not match, and how they modified the entity

  • Destination Matching – which candidate destinations matched or were rejected

  • Destination Ranking – how candidates were scored and ordered

  • Destination Evaluation – final evaluation of the selected destination

  • Final Result – the chosen destination, or the reason for failure

For example:

$ tpv dry-run --job-conf /srv/galaxy/config/job_conf.yml --tool bwa --input-size 40 --explain
========================================================================
TPV SCHEDULING DECISION TRACE
========================================================================

--- Configuration Loading ---
  [1] Loaded config: /srv/galaxy/config/tpv_rules.yml

--- Entity Matching ---
  [2] Tool 'bwa': matched entity 'bwa'
  ...

--- Final Result ---
  [10] Selected destination: pulsar (score=2)

========================================================================

If the mapping fails (e.g. no destinations can accept the job), the trace still shows all steps up to the failure, making it easy to diagnose the issue.

To get the trace as YAML instead of text, use --output-format yaml:

$ tpv dry-run --job-conf /srv/galaxy/config/job_conf.yml --tool bwa --explain --output-format yaml

dump

The tpv dump command loads one or more TPV configuration files, merges them, and outputs the fully resolved configuration. This is useful for inspecting the final merged state when multiple config files override each other.

tpv dump [--job-conf <path_to_galaxy_job_conf_file>] \
    [--output-format text|yaml] [tpv_config_file ...]

If no TPV config files are specified on the command line, they will be discovered from the tpv_dispatcher destination definition in the Galaxy job configuration file specified via --job-conf.

The text output shows all non-default fields for each entity, including resource requirements, scheduling tags, environment variables, parameters, and full rule details. Simple rule conditions appear inline, while multiline code blocks are rendered with YAML-style | block syntax.

For example:

$ tpv dump config1.yml config2.yml
========================================================================
TPV MERGED CONFIGURATION
Sources (in load order):
  1. config1.yml
  2. config2.yml
========================================================================

--- Global ---
  default_inherits: default

--- Tools ---
  default:
    cores: 2
    mem: cores * 3
    env: [{'name': 'GALAXY_SLOTS', 'value': '{cores}'}]
    params: {'native_spec': '--mem {mem} --cores {cores}'}
    scheduling: prefer=['general'], reject=['pulsar']
    rules:
      fail_small_data [if: input_size < 5]
        fail: Data size too small

  bwa:
    scheduling: require=['pulsar']
    rules:
      medium_resources [if: input_size <= 10]
        cores: 4
        mem: cores * 4
      large_resources [if: input_size > 10 and input_size < 20]
        scheduling: {'require': ['highmem']}
      reject_huge [if: input_size >= 20]
        fail: Too much data, shouldn't run

--- Users ---
  default:
    rules:
      training_destination_rule
        if: |
          any([r for r in user.all_roles()
               if (not r.deleted and r.name.startswith('training'))])
        scheduling: {'require': ['training']}

--- Destinations ---
  local:
    runner: local
    max_accepted_cores: 4
    max_accepted_mem: 16
    scheduling: prefer=['general']

  k8s_environment:
    runner: k8s
    max_accepted_cores: 16
    max_accepted_mem: 64
    max_accepted_gpus: 2
    scheduling: prefer=['pulsar']

========================================================================

To dump the merged configuration as YAML:

$ tpv dump --output-format yaml config1.yml config2.yml

To discover config files from a Galaxy job configuration:

$ tpv dump --job-conf /srv/galaxy/config/job_conf.yml
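
One way to see what an overriding file changes is to dump the configuration with and without it and compare the two outputs. The sketch below does the comparison with Python's standard difflib module. It works on already captured text, so how the two dumps are obtained (for example `tpv dump config1.yml` and `tpv dump config1.yml config2.yml`) is left to the caller. The function name diff_dumps is hypothetical.

```python
import difflib


def diff_dumps(
    before: str,
    after: str,
    before_label: str = "before",
    after_label: str = "after",
) -> list[str]:
    """Return a unified diff between two captured `tpv dump` outputs."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile=before_label,
            tofile=after_label,
            lineterm="",  # inputs have no trailing newlines after splitlines()
        )
    )
```

Printing the result with `print("\n".join(diff_dumps(a, b)))` shows removed lines prefixed with - and added lines prefixed with +. In text mode the Sources header will also differ between the two dumps, which is expected.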