TPV by example

Simple configuration

The simplest possible example of a useful TPV config might look like the following:

tools:
  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/.*:
    cores: 12
    mem: cores * 4
    gpus: 1

destinations:
  slurm:
    runner: slurm
    max_accepted_cores: 16
    max_accepted_mem: 64
    max_accepted_gpus: 2
  general_pulsar_1:
    runner: pulsar_1
    max_accepted_cores: 8
    max_accepted_mem: 32
    max_accepted_gpus: 1

Here, we define one tool and its resource requirements, the destinations available, and (optionally) the total resources available at each destination. Tools are matched by tool id, which can be a regular expression. Note how resource requirements can also be computed as Python expressions. If resource requirements are defined at the destination, TPV checks whether the job will fit; for example, hisat2 will not schedule on general_pulsar_1, as that destination has insufficient cores. If resource requirements are omitted from the tool or destination, it is considered a match. Note that TPV only considers destinations defined in its own config file, and ignores destinations in job_conf.yml.
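
For reference, TPV itself is typically attached to Galaxy as a dynamic destination in job_conf.yml, which delegates scheduling decisions to the TPV config above. The following is only a minimal sketch: the destination id tpv_dispatcher matches the fuller example in the "Using the shared database" section below, and the config path config/tpv_rules_local.yml is a placeholder for wherever your TPV config actually lives.

tpv_dispatcher:
  runner: dynamic
  type: python
  function: map_tool_to_destination
  rules_module: tpv.rules
  tpv_config_files:
    - config/tpv_rules_local.yml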

Default inheritance

Inheritance provides a mechanism for an entity to inherit properties from another entity, reducing repetition.

global:
  default_inherits: default

tools:
  default:
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"
  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
    cores: 12
    mem: cores * 4
    gpus: 1

The global section is used to define global TPV properties. The default_inherits property defines a “base class” for all tools to inherit from.

In this example, if the bwa tool is executed, it will match the default tool, as there are no other matches, thus inheriting its resource requirements. The hisat2 tool will also inherit these defaults, but is explicitly overriding cores, mem and gpus. It will inherit the nativeSpecification param.
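
To make the effect concrete, the hisat2 entry above effectively resolves to something like the following once the defaults are inherited and the expressions are evaluated (a sketch, not literal TPV output):

toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
  cores: 12
  mem: 48             # cores * 4, evaluated with the overridden cores value
  gpus: 1
  params:
    # inherited from default; renders as
    # "--nodes=1 --ntasks=12 --ntasks-per-node=12 --mem=49152"
    nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"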

Explicit inheritance

Explicit inheritance provides a mechanism for exerting greater control over the inheritance chain.

global:
  default_inherits: default

tools:
  default:
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"
  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/.*:
    cores: 12
    mem: cores * 4
    gpus: 1
  .*minimap2.*:
    inherits: toolshed.g2.bx.psu.edu/repos/iuc/hisat2/.*
    cores: 8
    gpus: 0

In this example, the minimap2 tool explicitly inherits requirements from the hisat2 tool, which in turn inherits the default tool. There is no limit to how deep the inheritance hierarchy can be.
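
Assuming the mem expression is evaluated after the inheritance chain is resolved (TPV evaluates expressions lazily at dispatch time), the minimap2 entry would effectively resolve to something like the following sketch:

.*minimap2.*:
  cores: 8
  mem: 32             # mem: cores * 4, inherited from hisat2 and evaluated with cores = 8
  gpus: 0
  params:
    nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"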

Scheduling tags

Scheduling tags provide a means by which to control how entities match up, and can be used to route jobs to preferred destinations, or to explicitly control which users can execute which tools, and where.

tools:
  default:
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"
    scheduling:
      reject:
        - offline
  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/.*:
    cores: 4
    mem: cores * 4
    gpus: 1
    scheduling:
      require:
      prefer:
        - highmem
      accept:
      reject:
  toolshed.g2.bx.psu.edu/repos/iuc/minimap2/.*:
    cores: 4
    mem: cores * 4
    gpus: 1
    scheduling:
      require:
        - highmem

destinations:
  slurm:
    runner: slurm
    max_accepted_cores: 16
    max_accepted_mem: 64
    max_accepted_gpus: 2
    scheduling:
      prefer:
        - general

  general_pulsar_1:
    runner: pulsar_1
    max_accepted_cores: 8
    max_accepted_mem: 32
    max_accepted_gpus: 1
    scheduling:
      prefer:
        - highmem
      reject:
        - offline

In this example, all tools reject destinations marked as offline. The hisat2 tool expresses a preference for highmem, and inherits the rejection of offline tags. Inheritance can be used to override scheduling tags. For example, the minimap2 tool inherits hisat2, but now requires a highmem tag, instead of merely preferring it.

The destinations themselves can be tagged in similar ways. In this case, the general_pulsar_1 destination also prefers the highmem tag, and thus the hisat2 tool would preferentially schedule there. However, general_pulsar_1 also rejects the offline tag, and therefore the hisat2 tool cannot schedule there after all; it instead schedules on the only remaining destination, slurm.

The minimap2 tool, meanwhile, requires highmem but rejects the offline tag, which leaves it with no destination to schedule on. This results in a JobMappingException being thrown.

A full table of how scheduling tags match up can be found in the Scheduling section.

These TPV-defined scheduling tags should not be confused with Galaxy's destination-level handler tags (see https://github.com/galaxyproject/galaxy/blob/0a0d68b7feed5e303ed762f6586ea9757219c6f7/lib/galaxy/config/sample/job_conf.sample.yml#L1037). Galaxy handler tags can be defined simply as tags on the destination.

Rules

Rules provide a means by which to conditionally change entity requirements.

tools:
  default:
    cores: 2
    mem: cores * 3
    rules:
      - id: my_overridable_rule
        if: input_size < 5
        fail: We don't run piddling datasets of {input_size}GB
  bwa:
    scheduling:
      require:
        - pulsar
    rules:
      - id: my_overridable_rule
        if: input_size < 1
        fail: We don't run piddling datasets
      - if: input_size <= 10
        cores: 4
        mem: cores * 4
        execute: |
          from galaxy.jobs.mapper import JobNotReadyException
          raise JobNotReadyException()
      - if: input_size > 10 and input_size < 20
        scheduling:
          require:
            - highmem
      - if: input_size >= 20
        fail: "Input size: {input_size} is too large, shouldn't run"

The if clause can contain arbitrary python code, including multi-line python code. The only requirement is that the last statement in the code block must evaluate to a boolean value. In this example, the input_size variable is an automatically available contextual variable which is computed by totalling the sizes of all inputs to the job. Additional available variables include app, job, tool, and user.
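
As an illustrative sketch of a multi-line if clause, the following rule only admits jobs on weekdays. The rule id and the weekend check are invented for this example; only the requirement that the final statement evaluates to a boolean comes from TPV.

tools:
  default:
    rules:
      - id: weekend_maintenance_rule
        if: |
          # arbitrary Python; the last statement must evaluate to a boolean
          from datetime import datetime
          datetime.today().weekday() >= 5
        fail: This tool does not run on weekends while maintenance is in progress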

If the rule matches, the properties of the rule override the properties of the tool. For example, if the input_size is 15, the bwa tool will require both pulsar and highmem tags.

Rules can be overridden by giving them an id. For example, the default for all tools is to fail any job with an input size below 5GB, via the my_overridable_rule rule. We override that for the bwa tool by referring to the inherited rule by its id. If no id is specified, an id is auto-generated, and the rule can no longer be overridden.

Note the use of the {input_size} variable in the fail message. The general rule is that all non-string expressions are evaluated as Python code blocks, while string values are evaluated as Python f-strings.

The execute block can be used to create arbitrary side-effects if a rule matches. The return value of an execute block is ignored.

User and Role Handling

Scheduling rules can also be expressed for users and roles.

tools:
  default:
    scheduling:
      require: []
      prefer:
        - general
      accept:
      reject:
        - pulsar
    rules: []
  dangerous_interactive_tool:
    cores: 8
    mem: 8
    scheduling:
      require:
        - authorize_dangerous_tool
users:
  default:
    scheduling:
      reject:
        - authorize_dangerous_tool
  fairycake@vortex.org:
    cores: 4
    mem: 16
    scheduling:
      accept:
        - authorize_dangerous_tool
      prefer:
        - highmem

roles:
  training.*:
    cores: 5
    mem: 7
    scheduling:
      reject:
        - pulsar

In this example, if user fairycake@vortex.org attempts to dispatch a dangerous_interactive_tool job, the requirements for both entities would be combined. Most requirements are simply merged, such as env vars and job params. However, when combining gpus, cores and mem, the lower of the two values is used. In this case, the combined entity would have a core value of 4 and a mem value of 8. This allows training users, for example, to be forced onto a lower number of cores than usual.

In addition, for these entities to be combined, their scheduling tags must also be compatible. In this instance, the dangerous_interactive_tool requires the authorize_dangerous_tool tag, which all users reject by default. Therefore, most users cannot run this tool. However, fairycake@vortex.org overrides that default and accepts the authorize_dangerous_tool tag, allowing only that user to run the dangerous tool.

Roles can be matched in exactly the same way, and rules can also be defined at the user and role level, as sketched below.
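
For instance, a rule could be attached to a role entity. The following is a hypothetical sketch; the size limit and fail message are invented for illustration, and only the rule syntax itself comes from the earlier Rules section.

roles:
  training.*:
    rules:
      - if: input_size > 10
        fail: Training users cannot run jobs with inputs larger than 10GB (got {input_size}GB)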

Metascheduling

Custom rank functions can be used to implement metascheduling capabilities. A rank function is used to select the best matching destination from a list of matching destinations. If no rank function is provided, the default rank function simply chooses the most preferred destination out of the available destinations.

When more sophisticated control over scheduling is required, a rank function can be implemented through custom python code.

tools:
  default:
    cores: 2
    mem: 8
    rank: |
      import requests

      params = {
        'pretty': 'true',
        'db': 'pulsar-test',
        'q': 'SELECT last("percent_allocated") from "sinfo" group by "host"'
      }

      try:
        response = requests.get('http://stats.genome.edu.au:8086/query', params=params)
        data = response.json()
        cpu_by_destination = {s['tags']['host']: s['values'][0][1] for s in data.get('results')[0].get('series', [])}
        # sort by destination preference, and then by cpu usage
        candidate_destinations.sort(key=lambda d: (-1 * d.score(entity), cpu_by_destination.get(d.dest_name)))
        final_destinations = candidate_destinations
      except Exception:
        log.exception("An error occurred while querying influxdb. Using a weighted random candidate destination")
        final_destinations = helpers.weighted_random_sampling(candidate_destinations)
      final_destinations

In this example, the rank function queries a remote InfluxDB database to find the least loaded destination. The matching destinations are available to the rank function through the candidate_destinations contextual variable. The candidate destinations are therefore first sorted by how well they match the entity (score is the default ranking function), and then by per-destination CPU usage obtained from the InfluxDB query.

Note that the final statement in the rank function must be the list of sorted destinations.
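
If all you need is a small tweak to the default behaviour, the rank function can be much simpler. The following minimal sketch just orders candidates by TPV's own match score, using the same d.score(entity) call and candidate_destinations variable seen in the example above:

tools:
  default:
    rank: |
      # minimal sketch: order candidates purely by TPV's match score
      candidate_destinations.sort(key=lambda d: -1 * d.score(entity))
      candidate_destinations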

Custom contexts

In addition to the automatically provided context variables (see Concepts and Organisation), TPV allows you to define arbitrary custom variables, which are then available whenever an expression is evaluated. Contexts can be defined both globally or at the level of each entity, with entity level context variables overriding global ones.

global:
  default_inherits: default
  context:
    ABSOLUTE_FILE_SIZE_LIMIT: 100
    large_file_size: 10
    _a_protected_var: "some value"

tools:
  default:
    context:
      additional_spec: --my-custom-param
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024} {additional_spec}"
    rules:
      - if: input_size >= ABSOLUTE_FILE_SIZE_LIMIT
        fail: "Job input: {input_size} exceeds absolute limit of: {ABSOLUTE_FILE_SIZE_LIMIT}"
      - if: input_size > large_file_size
        cores: 10

  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
    context:
      large_file_size: 20
      additional_spec: --overridden-param
    mem: cores * 4
    gpus: 1

In this example, three global context variables are defined, which are made available to all entities. Variable names follow Python conventions, where all uppercase variables indicate constants that cannot be overridden. Lower case indicates a public variable that can be overridden and changed, even across multiple TPV config files. An underscore indicates a protected variable that can be overridden within the same file, but not across files.

Additionally, the tool defaults section defines a context variable named additional_spec, which is only available to inheriting tools.

If we were to dispatch a job, say bwa, with an input_size of 15, the large file rule in the defaults section would kick in, and the number of cores would be set to 10. If we were to dispatch a hisat2 job with the same input size however, the large_file_size rule would not kick in, as it has been overridden to 20. The main takeaway from this example is that variables are bound late, and therefore, rules and params can be crafted to allow inheriting tools to conveniently override values, even across files. While this capability can be powerful, it needs to be treated with the same care as any global variable in a programming language.

Multiple matches

If multiple regular expressions match, the matches are applied in order of appearance. Therefore, the convention is to specify more general rule matches first, and more specific matches later. This matching also applies across multiple TPV config files, again based on order of appearance.

tools:
  default:
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"

  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/.*:
    mem: cores * 4
    gpus: 1

  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
    env:
      MY_ADDITIONAL_FLAG: "test"

In this example, dispatching a hisat2 job would result in a mem value of 8, with 1 gpu. However, dispatching the specific version of 2.1.0+galaxy7 would result in the additional env variable, with mem remaining at 8.
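
Spelled out, the entry for version 2.1.0+galaxy7 would resolve to roughly the following merged result, with the expressions evaluated (a sketch, not literal TPV output):

toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
  cores: 2            # from default
  mem: 8              # cores * 4, from the version-agnostic hisat2 entry
  gpus: 1             # from the version-agnostic hisat2 entry
  params:
    nativeSpecification: "--nodes=1 --ntasks=2 --ntasks-per-node=2 --mem=8192"
  env:
    MY_ADDITIONAL_FLAG: "test"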

Job Environment

As seen in the previous example, it is possible to specify environment variables that will be set in the job’s executing environment. It is also possible to source environment files and execute commands, using the same syntax as in Galaxy’s job_conf.yml, by specifying env as a list instead of a dictionary.

tools:
  default:
    cores: 2
    mem: 4
    params:
      nativeSpecification: "--nodes=1 --ntasks={cores} --ntasks-per-node={cores} --mem={mem*1024}"
    env:
      - execute: echo "Don't Panic!"

  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/.*:
    mem: cores * 4
    gpus: 1
    env:
      - name: MY_ADDITIONAL_FLAG
        value: "arthur"
      - file: /galaxy/tools/hisat2.env

  toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7:
    inherits: toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/.*
    env:
      MY_ADDITIONAL_FLAG: "zaphod"

In this example, all jobs will execute the command echo "Don't Panic!". All versions of hisat2 will have $MY_ADDITIONAL_FLAG set and will source the file /galaxy/tools/hisat2.env, but version 2.1.0+galaxy7 will have the value zaphod set for $MY_ADDITIONAL_FLAG instead of the hisat2 default of arthur.

Job Resubmission

TPV has explicit support for job resubmissions, so that advanced control over job resubmission is possible.

tools:
  default:
    cores: 2
    mem: 4 * int(job.destination_params.get('SCALING_FACTOR', 1)) if job.destination_params else 1
    params:
      SCALING_FACTOR: "{2 * int(job.destination_params.get('SCALING_FACTOR', 2)) if job.destination_params else 2}"
    resubmit:
      with_more_mem_on_failure:
        condition: memory_limit_reached and attempt <= 3
        destination: tpv_dispatcher

In this example, we have defined a resubmission handler that resubmits the job if the memory limit is reached. Note that the resubmit section looks exactly the same as Galaxy's, except that it follows a dictionary structure instead of being a list. Refer to the Galaxy job configuration docs for more information on resubmit handlers. One twist in this example is that we automatically increase the amount of memory provided to the job on each resubmission. This is done through the SCALING_FACTOR param, a custom parameter we have chosen for this example, which is increased on each resubmission. Since each resubmission is dispatched back through TPV, the param is re-evaluated and scaled on each attempt, and the memory allocation, which is computed from the scaling factor, scales with it.
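
Because the conditions follow Galaxy's resubmission expression language, other standard condition variables such as walltime_reached can be used alongside memory_limit_reached and attempt. A hypothetical sketch with a second handler (the handler name longer_walltime_on_timeout is arbitrary) might look like this:

tools:
  default:
    resubmit:
      with_more_mem_on_failure:
        condition: memory_limit_reached and attempt <= 3
        destination: tpv_dispatcher
      longer_walltime_on_timeout:
        condition: walltime_reached and attempt <= 2
        destination: tpv_dispatcher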

Using the shared database

A shared database of resource requirements and rules is maintained at:

https://github.com/galaxyproject/tpv-shared-database/

This shared database relieves you of the burden of figuring out what resources are typically required by tools, with recommended settings based on those used in the usegalaxy.* federation. You can override these settings based on local resource availability. The shared database can be integrated through your local job_conf.yml as follows:

tpv_dispatcher:
  runner: dynamic
  type: python
  function: map_tool_to_destination
  rules_module: tpv.rules
  tpv_config_files:
    - https://gxy.io/tpv/db.yml
    - config/my_local_overrides.yml  # optional

Clamping resources

Entities can define min_{cores|gpus|mem} and max_{cores|gpus|mem} as a means of clamping the resources that will be allocated to a tool, even if it requests a higher amount. For example, if a tool requests 16 cores, but a user is defined with max_cores: 4, then the tool's resource requirement is clamped down to that maximum. This can be useful for allocating lower resources to training users, for example, who only use toy datasets that do not require the full core allocation. Conversely, some users can be allocated more resources by using min_cores. A sketch of user-level clamping follows this paragraph.
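
A hypothetical sketch of such user-level clamping; the email pattern and the limits below are invented purely for illustration:

users:
  .*@training.example.org:
    max_cores: 4
    max_mem: 16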

In addition, clamping resources can also be useful when using the TPV shared database. For example, the canu tool has a recommended memory requirement of 96GB, which your local cluster may not have. However, you may still want to allow the tool to run, albeit with lower resources. You can, of course, locally override the canu tool and allocate fewer resources, but this is tedious to do for a large number of tools. All you may really want is to restrict all tools to the maximum your cluster can support. You can achieve that effect as follows:

destinations:
  slurm:
    runner: slurm
    max_accepted_cores: 32
    max_accepted_mem: 196
    max_accepted_gpus: 2
    max_cores: 16
    max_mem: 64
    max_gpus: 1

In the example above, we mark the slurm destination as accepting jobs of up to 196GB, and therefore the canu tool, which requests 96GB, would successfully schedule there. However, we forcibly clamp the job's mem to a maximum of 64GB, which is what your cluster can actually support. In this way, all tools in the shared database can still run, provided they do not exceed the specified max_accepted values.

Giving a parameterized, custom name to a destination

If you need to provide a parameterized name for a destination, you can do so by using the destination_name_override property.

destinations:
  slurm:
    runner: slurm
    destination_name_override: "my-dest-with-{cores}-cores-{mem}-mem"
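
Since string properties are evaluated as Python f-strings, a job that resolves to, say, 2 cores and 4GB of mem would be dispatched to a destination named my-dest-with-2-cores-4-mem.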