Concept

Volcano: A Kubernetes-based system for high performance workloads (HPW)

Domain Frameworks:

  • Deployment/installation of the framework in Kubernetes
  • Map the framework's terms/concepts onto common concepts, e.g. Job, Queue
  • Enable related features for frameworks, e.g. gang-scheduling for TensorFlow training

Common Services for high performance workloads:

  • Batch scheduling, e.g. fair-share, gang-scheduling
  • Enhanced job management, e.g. multiple pod templates, job dependencies, job lifecycle management
  • Alternative container runtimes, e.g. Singularity
  • Enhancements for heterogeneous computing
  • Enhancements for high performance workloads, e.g. performance, throughput
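
To make the gang-scheduling item above concrete, here is a minimal Python sketch of the all-or-nothing idea: a job is placed only if all of its required pods fit at once; otherwise nothing is placed. This is an illustration under simplifying assumptions (CPU is the only resource, first-fit placement), not Volcano's actual scheduler. The names Job, min_available, cpu_per_pod and gang_schedule are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    """Hypothetical job description used only for this sketch."""
    name: str
    min_available: int  # number of pods that must start together
    cpu_per_pod: int    # CPU units requested by each pod


def gang_schedule(job: Job, free_cpu: Dict[str, int]) -> Optional[List[str]]:
    """Place all pods of the job, or none.

    Returns one node name per pod if the whole gang fits, else None.
    free_cpu is updated only when the whole gang has been placed.
    """
    remaining = dict(free_cpu)  # tentative copy; the real state is untouched
    placement: List[str] = []
    for _ in range(job.min_available):
        # First-fit: pick the first node with enough free CPU for one pod.
        node = next(
            (n for n, cpu in remaining.items() if cpu >= job.cpu_per_pod),
            None,
        )
        if node is None:
            return None  # one pod does not fit, so the gang is rejected
        remaining[node] -= job.cpu_per_pod
        placement.append(node)
    free_cpu.update(remaining)  # commit the tentative allocation
    return placement


if __name__ == "__main__":
    nodes = {"node-a": 4, "node-b": 2}
    print(gang_schedule(Job("train", min_available=3, cpu_per_pod=2), nodes))
    print(gang_schedule(Job("too-big", min_available=1, cpu_per_pod=2), nodes))
```

Working on a tentative copy and committing only at the end is what keeps a partially placed job from holding resources that it cannot use until the rest of its pods start.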