TPV by example

Simple configuration

The simplest possible example of a useful TPV config might look like the following:

tools:
  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/.*:
    cores: 12
    mem: cores * 4
    gpus: 1

destinations:
  slurm:
    runner: slurm
    max_accepted_cores: 16
    max_accepted_mem: 64
    max_accepted_gpus: 2
  general_pulsar_1:
    runner: pulsar_1
    max_accepted_cores: 8
    max_accepted_mem: 32
    max_accepted_gpus: 1

Here, we define one tool and its resource requirements, the destinations available, and (optionally) the total resources available at each destination. Tools are matched by tool id, which can be a regular expression. Note how resource requirements can also be computed as Python expressions. If resource requirements are defined at the destination, TPV checks whether the job will fit; for example, hisat2 will not schedule on general_pulsar_1, as that destination has insufficient cores. If resource requirements are omitted from the tool or destination, it is considered a match. Note that TPV only considers destinations defined in its own config file, and ignores destinations in job_conf.yml.
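
For reference, TPV itself is typically attached to Galaxy as a dynamic destination in job_conf.yml, which delegates scheduling decisions to the TPV config above. The following is only a minimal sketch: the destination id tpv_dispatcher matches the fuller example in the "Using the shared database" section below, and the config path config/tpv_rules_local.yml is a placeholder for wherever your TPV config actually lives.

tpv_dispatcher:
  runner: dynamic
  type: python
  function: map_tool_to_destination
  rules_module: tpv.rules
  tpv_config_files:
    - config/tpv_rules_local.yml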

Default inheritance

Inheritance provides a mechanism for an entity to inherit properties from another entity, reducing repetition.

global:
  default_inherits: default

tools:
  default:
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"
  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
    cores: 12
    mem: cores * 4
    gpus: 1

The global section is used to define global TPV properties. The default_inherits property defines a “base class” for all tools to inherit from.

In this example, if the bwa tool is executed, it will match the default tool, as there are no other matches, thus inheriting its resource requirements. The hisat2 tool will also inherit these defaults, but is explicitly overriding cores, mem and gpus. It will inherit the nativeSpecification param.
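
To make the effect concrete, the hisat2 entry above effectively resolves to something like the following once the defaults are inherited and the expressions are evaluated (a sketch, not literal TPV output):

toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
  cores: 12
  mem: 48             # cores * 4, evaluated with the overridden cores value
  gpus: 1
  params:
    # inherited from default; renders as
    # "--nodes=1 --ntasks=12 --ntasks-per-node=12 --mem=49152"
    nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"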

Explicit inheritance

Explicit inheritance provides a mechanism for exerting greater control over the inheritance chain.

global:
  default_inherits: default

tools:
  default:
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"
  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/.*:
    cores: 12
    mem: cores * 4
    gpus: 1
  .*minimap2.*:
    inherits: toolshed.g2.bx.psu.edu/repos/iuc/hisat2/.*
    cores: 8
    gpus: 0

In this example, the minimap2 tool explicitly inherits requirements from the hisat2 tool, which in turn inherits the default tool. There is no limit to how deep the inheritance hierarchy can be.
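
Assuming the mem expression is evaluated after the inheritance chain is resolved (TPV evaluates expressions lazily at dispatch time), the minimap2 entry would effectively resolve to something like the following sketch:

.*minimap2.*:
  cores: 8
  mem: 32             # mem: cores * 4, inherited from hisat2 and evaluated with cores = 8
  gpus: 0
  params:
    nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"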

Scheduling tags

Scheduling tags provide a means by which to control how entities match up, and can be used to route jobs to preferred destinations, or to explicitly control which users can execute which tools, and where.

tools:
  default:
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"
    scheduling:
      reject:
        - offline
  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/.*:
    cores: 4
    mem: cores * 4
    gpus: 1
    scheduling:
      require:
      prefer:
        - highmem
      accept:
      reject:
  toolshed.g2.bx.psu.edu/repos/iuc/minimap2/.*:
    cores: 4
    mem: cores * 4
    gpus: 1
    scheduling:
      require:
        - highmem

destinations:
  slurm:
    runner: slurm
    max_accepted_cores: 16
    max_accepted_mem: 64
    max_accepted_gpus: 2
    scheduling:
      prefer:
        - general

  general_pulsar_1:
    runner: pulsar_1
    max_accepted_cores: 8
    max_accepted_mem: 32
    max_accepted_gpus: 1
    scheduling:
      prefer:
        - highmem
      reject:
        - offline

In this example, all tools reject destinations marked as offline. The hisat2 tool expresses a preference for highmem, and inherits the rejection of offline tags. Inheritance can be used to override scheduling tags. For example, the minimap2 tool inherits hisat2, but now requires a highmem tag, instead of merely preferring it.

The destinations themselves can be tagged in similar ways. In this case, the general_pulsar_1 destination also prefers the highmem tag, and thus the hisat2 tool would preferentially schedule there. However, general_pulsar_1 also rejects the offline tag, and therefore the hisat2 tool cannot schedule there after all; it instead schedules on the only remaining destination, slurm.

The minimap2 tool, meanwhile, requires highmem but rejects the offline tag, which leaves it with no destination to schedule on. This results in a JobMappingException being thrown.

A full table of how scheduling tags match up can be found in the Scheduling section.

These TPV-defined scheduling tags should not be confused with Galaxy's destination-level handler tags (see https://github.com/galaxyproject/galaxy/blob/0a0d68b7feed5e303ed762f6586ea9757219c6f7/lib/galaxy/config/sample/job_conf.sample.yml#L1037). Galaxy handler tags can be defined simply as tags on the destination.

Rules

Rules provide a means by which to conditionally change entity requirements.

tools:
  default:
    cores: 2
    mem: cores * 3
    rules:
      - id: my_overridable_rule
        if: input_size < 5
        fail: We don't run piddling datasets of {input_size}GB
  bwa:
    scheduling:
      require:
        - pulsar
    rules:
      - id: my_overridable_rule
        if: input_size < 1
        fail: We don't run piddling datasets
      - if: input_size <= 10
        cores: 4
        mem: cores * 4
        execute: |
          from galaxy.jobs.mapper import JobNotReadyException
          raise JobNotReadyException()
      - if: input_size > 10 and input_size < 20
        scheduling:
          require:
            - highmem
      - if: input_size >= 20
        fail: "Input size: {input_size} is too large, shouldn't run"

The if clause can contain arbitrary python code, including multi-line python code. The only requirement is that the last statement in the code block must evaluate to a boolean value. In this example, the input_size variable is an automatically available contextual variable which is computed by totalling the sizes of all inputs to the job. Additional available variables include app, job, tool, and user.
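
As an illustrative sketch of a multi-line if clause, the following rule only admits jobs on weekdays. The rule id and the weekend check are invented for this example; only the requirement that the final statement evaluates to a boolean comes from TPV.

tools:
  default:
    rules:
      - id: weekend_maintenance_rule
        if: |
          # arbitrary Python; the last statement must evaluate to a boolean
          from datetime import datetime
          datetime.today().weekday() >= 5
        fail: This tool does not run on weekends while maintenance is in progress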

If the rule matches, the properties of the rule override the properties of the tool. For example, if the input_size is 15, the bwa tool will require both pulsar and highmem tags.

Rules can be overridden by giving them an id. For example, the default for all tools is to fail any job with an input size below 5GB, via the my_overridable_rule rule. We override that for the bwa tool by referring to the inherited rule by its id. If no id is specified, an id is auto-generated, and the rule can no longer be overridden.

Note the use of the {input_size} variable in the fail message. The general rule is that all non-string expressions are evaluated as Python code blocks, while string values are evaluated as Python f-strings.

The execute block can be used to create arbitrary side-effects if a rule matches. The return value of an execute block is ignored.

User and Role Handling

Scheduling rules can also be expressed for users and roles.

tools:
  default:
    scheduling:
      require: []
      prefer:
        - general
      accept:
      reject:
        - pulsar
    rules: []
  dangerous_interactive_tool:
    cores: 8
    mem: 8
    scheduling:
      require:
        - authorize_dangerous_tool
users:
  default:
    scheduling:
      reject:
        - authorize_dangerous_tool
  fairycake@vortex.org:
    cores: 4
    mem: 16
    scheduling:
      accept:
        - authorize_dangerous_tool
      prefer:
        - highmem

roles:
  training.*:
    cores: 5
    mem: 7
    scheduling:
      reject:
        - pulsar

In this example, if user fairycake@vortex.org attempts to dispatch a dangerous_interactive_tool job, the requirements for both entities would be combined. Most requirements are simply merged, such as env vars and job params. However, when combining gpus, cores and mem, the lower of the two values is used. In this case, the combined entity would have a core value of 4 and a mem value of 8. This allows training users, for example, to be forced onto a lower number of cores than usual.

In addition, for these entities to be combined, their scheduling tags must also be compatible. In this instance, the dangerous_interactive_tool requires the authorize_dangerous_tool tag, which all users reject by default. Therefore, most users cannot run this tool. However, fairycake@vortex.org overrides that default and accepts the authorize_dangerous_tool tag, allowing only that user to run the dangerous tool.

Roles can be matched in exactly the same way, and rules can also be defined at the user and role level, as sketched below.
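
For instance, a rule could be attached to a role entity. The following is a hypothetical sketch; the size limit and fail message are invented for illustration, and only the rule syntax itself comes from the earlier Rules section.

roles:
  training.*:
    rules:
      - if: input_size > 10
        fail: Training users cannot run jobs with inputs larger than 10GB (got {input_size}GB)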

Metascheduling

Custom rank functions can be used to implement metascheduling capabilities. A rank function is used to select the best matching destination from a list of matching destinations. If no rank function is provided, the default rank function simply chooses the most preferred destination out of the available destinations.

When more sophisticated control over scheduling is required, a rank function can be implemented through custom python code.

tools:
  default:
    cores: 2
    mem: 8
    rank: |
      import requests

      params = {
        'pretty': 'true',
        'db': 'pulsar-test',
        'q': 'SELECT last("percent_allocated") from "sinfo" group by "host"'
      }

      try:
        response = requests.get('http://stats.genome.edu.au:8086/query', params=params)
        data = response.json()
        cpu_by_destination = {s['tags']['host']: s['values'][0][1] for s in data.get('results')[0].get('series', [])}
        # sort by destination preference, and then by cpu usage
        candidate_destinations.sort(key=lambda d: (-1 * d.score(entity), cpu_by_destination.get(d.dest_name)))
        final_destinations = candidate_destinations
      except Exception:
        log.exception("An error occurred while querying influxdb. Using a weighted random candidate destination")
        final_destinations = helpers.weighted_random_sampling(candidate_destinations)
      final_destinations

In this example, the rank function queries a remote InfluxDB database to find the least loaded destination. The matching destinations are available to the rank function through the candidate_destinations contextual variable. The candidate destinations are therefore first sorted by how well they match the entity (score is the default ranking function), and then by per-destination CPU usage obtained from the InfluxDB query.

Note that the final statement in the rank function must be the list of sorted destinations.
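
If all you need is a small tweak to the default behaviour, the rank function can be much simpler. The following minimal sketch just orders candidates by TPV's own match score, using the same d.score(entity) call and candidate_destinations variable seen in the example above:

tools:
  default:
    rank: |
      # minimal sketch: order candidates purely by TPV's match score
      candidate_destinations.sort(key=lambda d: -1 * d.score(entity))
      candidate_destinations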

Custom contexts

In addition to the automatically provided context variables (see Concepts and Organisation), TPV allows you to define arbitrary custom variables, which are then available whenever an expression is evaluated. Contexts can be defined both globally or at the level of each entity, with entity level context variables overriding global ones.

global:
  default_inherits: default
  context:
    ABSOLUTE_FILE_SIZE_LIMIT: 100
    large_file_size: 10
    _a_protected_var: "some value"

tools:
  default:
    context:
      additional_spec: --my-custom-param
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024} {additional_spec}"
    rules:
      - if: input_size >= ABSOLUTE_FILE_SIZE_LIMIT
        fail: "Job input: {input_size} exceeds absolute limit of: {ABSOLUTE_FILE_SIZE_LIMIT}"
      - if: input_size > large_file_size
        cores: 10

  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
    context:
      large_file_size: 20
      additional_spec: --overridden-param
    mem: cores * 4
    gpus: 1

In this example, three global context variables are defined, which are made available to all entities. Variable names follow Python conventions, where all uppercase variables indicate constants that cannot be overridden. Lower case indicates a public variable that can be overridden and changed, even across multiple TPV config files. An underscore indicates a protected variable that can be overridden within the same file, but not across files.

Additionally, the tool defaults section defines a context variable named additional_spec, which is only available to inheriting tools.

If we were to dispatch a job, say bwa, with an input_size of 15, the large file rule in the defaults section would kick in, and the number of cores would be set to 10. If we were to dispatch a hisat2 job with the same input size however, the large_file_size rule would not kick in, as it has been overridden to 20. The main takeaway from this example is that variables are bound late, and therefore, rules and params can be crafted to allow inheriting tools to conveniently override values, even across files. While this capability can be powerful, it needs to be treated with the same care as any global variable in a programming language.

Multiple matches

If multiple regular expressions match, the matches are applied in order of appearance. Therefore, the convention is to specify more general rule matches first, and more specific matches later. This matching also applies across multiple TPV config files, again based on order of appearance.

tools:
  default:
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"

  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/.*:
    mem: cores * 4
    gpus: 1

  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
    env:
      MY_ADDITIONAL_FLAG: "test"

In this example, dispatching a hisat2 job would result in a mem value of 8, with 1 gpu. However, dispatching the specific version of 2.1.0+galaxy7 would result in the additional env variable, with mem remaining at 8.
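
Spelled out, the entry for version 2.1.0+galaxy7 would resolve to roughly the following merged result, with the expressions evaluated (a sketch, not literal TPV output):

toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
  cores: 2            # from default
  mem: 8              # cores * 4, from the version-agnostic hisat2 entry
  gpus: 1             # from the version-agnostic hisat2 entry
  params:
    nativeSpecification: "--nodes=1 --ntasks=2 --ntasks-per-node=2 --mem=8192"
  env:
    MY_ADDITIONAL_FLAG: "test"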

Job Environment

As seen in the previous example, it is possible to specify environment variables that will be set in the job’s executing environment. It is also possible to source environment files and execute commands, using the same syntax as in Galaxy’s job_conf.yml, by specifying env as a list instead of a dictionary.

tools:
  default:
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"
    env:
      - execute: echo "Don't Panic!"

  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/.*:
    mem: cores * 4
    gpus: 1
    env:
      - name: MY_ADDITIONAL_FLAG
        value: "arthur"
      - file: /galaxy/tools/hisat2.env

  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
    inherits: toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/.*
    env:
      MY_ADDITIONAL_FLAG: "zaphod"

In this example, all jobs will execute the command echo "Don't Panic!". All versions of hisat2 will have $MY_ADDITIONAL_FLAG set and will source the file /galaxy/tools/hisat2.env, but version 2.1.0+galaxy7 will have the value zaphod set for $MY_ADDITIONAL_FLAG instead of the hisat2 default of arthur.

Job Resubmission

TPV has explicit support for job resubmissions, so that advanced control over job resubmission is possible.

tools:
  default:
    cores: 2
    mem: 4 * int(job.destination_params.get('SCALING_FACTOR', 1)) if job.destination_params else 1
    params:
      SCALING_FACTOR: "{2 * int(job.destination_params.get('SCALING_FACTOR', 2)) if job.destination_params else 2}"
    resubmit:
      with_more_mem_on_failure:
        condition: memory_limit_reached and attempt <= 3
        destination: tpv_dispatcher

In this example, we have defined a resubmission handler that resubmits the job if the memory limit is reached. Note that the resubmit section looks exactly the same as Galaxy's, except that it follows a dictionary structure instead of being a list. Refer to the Galaxy job configuration docs for more information on resubmit handlers. One twist in this example is that we automatically increase the amount of memory provided to the job on each resubmission. This is done through the SCALING_FACTOR param, a custom parameter we have chosen for this example, which is increased on each resubmission. Since each resubmission is dispatched back through TPV, the param is re-evaluated and scaled on each attempt, and the memory allocation, which is computed from the scaling factor, scales with it.
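
Because the conditions follow Galaxy's resubmission expression language, other standard condition variables such as walltime_reached can be used alongside memory_limit_reached and attempt. A hypothetical sketch with a second handler (the handler name longer_walltime_on_timeout is arbitrary) might look like this:

tools:
  default:
    resubmit:
      with_more_mem_on_failure:
        condition: memory_limit_reached and attempt <= 3
        destination: tpv_dispatcher
      longer_walltime_on_timeout:
        condition: walltime_reached and attempt <= 2
        destination: tpv_dispatcher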

Using the shared database

A shared database of resource requirements and rules is maintained at:

https://github.com/galaxyproject/tpv-shared-database/

This shared database relieves you of the burden of figuring out what resources are typically required by tools, with recommended settings based on those used in the usegalaxy.* federation. You can override these settings based on local resource availability. The shared database can be integrated through your local job_conf.yml as follows:

tpv_dispatcher:
  runner: dynamic
  type: python
  function: map_tool_to_destination
  rules_module: tpv.rules
  tpv_config_files:
    - https://gxy.io/tpv/db.yml
    - config/my_local_overrides.yml  # optional

Clamping resources

Entities can define min_{cores|gpus|mem} and max_{cores|gpus|mem} as a means of clamping the resources that will be allocated to a tool, even if it requests a higher amount. For example, if a tool requests 16 cores, but a user is defined with max_cores: 4, then the tool's resource requirement is clamped down to that maximum. This can be useful for allocating lower resources to training users, for example, who only use toy datasets that do not require the full core allocation. Conversely, some users can be allocated more resources by using min_cores. A sketch of user-level clamping follows this paragraph.
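
A hypothetical sketch of such user-level clamping; the email pattern and the limits below are invented purely for illustration:

users:
  .*@training.example.org:
    max_cores: 4
    max_mem: 16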

In addition, clamping resources can also be useful when using the TPV shared database. For example, the canu tool has a recommended memory requirement of 96GB, which your local cluster may not have. However, you may still want to allow the tool to run, albeit with lower resources. You can, of course, locally override the canu tool and allocate fewer resources, but this is tedious to do for a large number of tools. All you may really want is to restrict all tools to the maximum your cluster can support. You can achieve that effect as follows:

destinations:
  slurm:
    runner: slurm
    max_accepted_cores: 32
    max_accepted_mem: 196
    max_accepted_gpus: 2
    max_cores: 16
    max_mem: 64
    max_gpus: 1

In the example above, we mark the slurm destination as accepting jobs of up to 196GB, and therefore the canu tool, which requests 96GB, would successfully schedule there. However, we forcibly clamp the job's mem to a maximum of 64GB, which is what your cluster can actually support. In this way, all tools in the shared database can still run, provided they do not exceed the specified max_accepted values.

Giving a parameterized, custom name to a destination

If you need to provide a parameterized name for a destination, you can do so by using the destination_name_override property.

destinations:
  slurm:
    runner: slurm
    destination_name_override: "my-dest-with-{cores}-cores-{mem}-mem"
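
Since string properties are evaluated as Python f-strings, a job that resolves to, say, 2 cores and 4GB of mem would be dispatched to a destination named my-dest-with-2-cores-4-mem.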