Environment setup
Install volcano
Refer to Install Guide to install volcano.
Update scheduler configmap
After installed, update the scheduler configuration:
kubectl edit configmap -n volcano-system volcano-scheduler-configmap
Register task-topology plugin in configmap
kind: ConfigMap
apiVersion: v1
metadata:
name: volcano-scheduler-configmap
namespace: volcano-system
data:
volcano-scheduler.conf: |
actions: "enqueue, allocate, backfill"
tiers:
- plugins:
- name: priority
- name: gang
- name: conformance
- plugins:
- name: drf
- name: predicates
- name: task-topology
arguments:
task-topology.weight: 10
- name: proportion
- name: nodeorder
- name: binpack
Running Jobs
Take tensorflow job as sample.
Install kubeflow/tf-operator
Refer to Install Guide to install kubeflow, tf-operator included.
Edit yaml of tfjob
add annotations in volcano job or tensorflow job in format below.
affinityannotation indicates that tasks have connections between each other, so they should be set on same nodes;anti-affinityannotation indicates that tasks do not have connections between each other, so they should be set on different nodes;task-orderannotation indicates the order that tasks should be allocated. For example,ps,workermeans scheduler should schedulepstasks first. After allpstasks were allocated, scheduler started to scheduleworkertasks. This annotation is not a required field.volcano.sh/task-topology-affinity: "ps,worker;ps,evaluator" volcano.sh/task-topology-anti-affinity: "ps;worker,chief;chief,evaluator" volcano.sh/task-topology-task-order: "ps,worker,chief,evaluator"