Task Topology Plugin User Guide
Environment setup
Install volcano
Refer to the Install Guide to install Volcano.
Update scheduler configmap
After installation, update the scheduler configuration:
kubectl edit configmap -n volcano-system volcano-scheduler-configmap
Register the task-topology plugin in the configmap
kind: ConfigMap
apiVersion: v1
metadata:
  name: volcano-scheduler-configmap
  namespace: volcano-system
data:
  volcano-scheduler.conf: |
    actions: "enqueue, allocate, backfill"
    tiers:
    - plugins:
      - name: priority
      - name: gang
      - name: conformance
    - plugins:
      - name: drf
      - name: predicates
      - name: task-topology
        arguments:
          task-topology.weight: 10
      - name: proportion
      - name: nodeorder
      - name: binpack
Running Jobs
The following steps use a TensorFlow job as an example.
Install kubeflow/tf-operator
Refer to the Install Guide to install Kubeflow, which includes tf-operator.
Edit yaml of tfjob
- Add annotations to the Volcano job or TensorFlow job in the format below.
  - The affinity annotation indicates that the listed tasks are connected to each other, so they should be placed on the same node;
  - The anti-affinity annotation indicates that the listed tasks are not connected to each other, so they should be placed on different nodes;
  - The task-order annotation indicates the order in which tasks should be allocated. For example, "ps,worker" means the scheduler should schedule ps tasks first; once all ps tasks have been allocated, the scheduler starts scheduling worker tasks. This annotation is optional.

  volcano.sh/task-topology-affinity: "ps,worker;ps,evaluator"
  volcano.sh/task-topology-anti-affinity: "ps;worker,chief;chief,evaluator"
  volcano.sh/task-topology-task-order: "ps,worker,chief,evaluator"
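
  To show where these annotations might sit, here is a minimal sketch of a Volcano job manifest. It assumes the annotations go under the job's metadata.annotations and that the names in the annotation values match the task names under spec.tasks. The job name, image, and replica counts are hypothetical placeholders, not values from this guide.

  ```yaml
  apiVersion: batch.volcano.sh/v1alpha1
  kind: Job
  metadata:
    name: tf-topology-example            # hypothetical job name
    annotations:
      # Groups are separated by ";" and task names within a group by ","
      volcano.sh/task-topology-affinity: "ps,worker;ps,evaluator"
      volcano.sh/task-topology-anti-affinity: "ps;worker,chief;chief,evaluator"
      volcano.sh/task-topology-task-order: "ps,worker,chief,evaluator"
  spec:
    schedulerName: volcano
    minAvailable: 6                      # assumed: sum of the replicas below
    tasks:
      - name: ps                         # assumed to match "ps" in the annotations
        replicas: 2
        template:
          spec:
            restartPolicy: Never
            containers:
              - name: tensorflow
                image: example.com/tf-sample:latest   # placeholder image
      - name: worker
        replicas: 2
        template:
          spec:
            restartPolicy: Never
            containers:
              - name: tensorflow
                image: example.com/tf-sample:latest   # placeholder image
      - name: chief
        replicas: 1
        template:
          spec:
            restartPolicy: Never
            containers:
              - name: tensorflow
                image: example.com/tf-sample:latest   # placeholder image
      - name: evaluator
        replicas: 1
        template:
          spec:
            restartPolicy: Never
            containers:
              - name: tensorflow
                image: example.com/tf-sample:latest   # placeholder image
  ```

  In this sketch, the first affinity group asks the scheduler to prefer placing ps and worker tasks together, while the task-order annotation asks it to allocate ps tasks before the others.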
-