Cloud native batch scheduling system for compute-intensive workloads
Computing jobs can be converted to Kubernetes workloads and scheduled in batches to deliver optimal performance.
Co-scheduling, fair-share scheduling, gang scheduling, topologies, reservation/backfill, data-aware scheduling, and more
Managing jobs with multiple templates
Singularity and GPU Accelerators
Logging, metrics, and dashboard
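Gang scheduling, one of the policies listed above, starts a job only when all of the tasks it needs can be placed at the same time. The sketch below illustrates that all-or-nothing rule on a toy cluster. It is not Volcano's implementation: the `Job` class, the `gang_schedule` function, the first-fit placement, and the CPU-only resource model are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job description used only for this illustration."""
    name: str
    min_available: int   # number of tasks that must start together
    cpus_per_task: int


def gang_schedule(jobs, free_cpus_per_node):
    """Admit each job only if all of its min_available tasks fit at once."""
    free = list(free_cpus_per_node)       # committed free capacity per node
    placements = {}
    for job in jobs:
        trial = list(free)                # tentative allocation for this job
        nodes = []
        for _ in range(job.min_available):
            # First-fit (an assumption): take the first node with enough CPUs.
            idx = next(
                (i for i, cpus in enumerate(trial) if cpus >= job.cpus_per_task),
                None,
            )
            if idx is None:
                break                     # one task does not fit, stop trying
            trial[idx] -= job.cpus_per_task
            nodes.append(idx)
        if len(nodes) == job.min_available:
            free = trial                  # commit: the whole gang fits
            placements[job.name] = nodes
        else:
            placements[job.name] = None   # all-or-nothing: place no tasks
    return placements


jobs = [Job("mpi-train", 4, 2), Job("spark-etl", 3, 4), Job("tf-small", 2, 1)]
result = gang_schedule(jobs, [4, 4, 4])
for name, nodes in result.items():
    print(name, "->", nodes if nodes is not None else "pending")
```

In this toy run, the second job stays pending without holding any CPUs, so the smaller third job can still be placed on the remaining capacity.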
A cloud native system for high-performance workloads
Volcano is a system for running high-performance workloads on Kubernetes. It provides powerful batch scheduling capabilities that Kubernetes does not offer natively but that many classes of high-performance workloads commonly require.
These workloads typically run on general-purpose frameworks such as TensorFlow, Spark, PyTorch, and MPI. Volcano integrates with these frameworks, so you can run your applications without adapting them while benefiting from its batch scheduling.
A powerful batch scheduler that allows you to run multi-architecture, compute-intensive jobs as Kubernetes workloads
Apache Spark™ is a unified analytics engine for large-scale data processing.
Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams.
TensorFlow is an end-to-end open source machine learning platform.
PyTorch is an open source machine learning framework that accelerates the path from research prototyping to production deployment.
Argo Workflows is an open source container-native workflow engine for orchestrating parallel jobs on Kubernetes. Argo Workflows is implemented as a Kubernetes CRD.
MindSpore is the all-scenario deep learning framework developed by Huawei.
PaddlePaddle is an open source deep learning platform, initiated by Baidu and derived from industrial practice.
The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable, and scalable.
The Open MPI Project is an open source Message Passing Interface implementation that is developed and maintained by a consortium of academic, research, and industry partners.
Horovod is a distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Apache MXNet is a truly open source deep learning framework suited for flexible research prototyping and production.
KubeGene is dedicated to making the genome sequencing process simple, portable, and scalable.