Introduction
The Ray plugin is designed to simplify the experience of deploying a Ray cluster: users write less YAML, and the plugin wires up the head and worker nodes into a working Ray cluster automatically.
How the Ray Plugin Works
The Ray plugin does three things:
- Configures the start commands of the head and worker nodes in a Ray cluster.
- Opens the three ports used by the Ray head node (GCS, Ray dashboard, and client server).
- Creates a Service mapped to the head node's container ports, so you can submit Ray jobs and reach the Ray dashboard and client server.

Note:
- This plugin is based on the Ray CLI (command line interface), and this guide uses the official Ray Docker image.
- The svc plugin is required when you use the Ray plugin.
Parameters of the Ray Plugin
Arguments
| ID | Name | Type | Default Value | Required | Description | Example |
|---|---|---|---|---|---|---|
| 1 | head | string | head | No | Name of the head task in the Volcano Job | --head=head |
| 2 | worker | string | worker | No | Name of the worker task in the Volcano Job | --worker=worker |
| 3 | headContainer | string | head | No | Name of the main container in the head task | --headContainer=head |
| 4 | workerContainer | string | worker | No | Name of the main container in the worker task | --workerContainer=worker |
| 5 | port | string | 6379 | No | The port to open for the GCS | --port=6379 |
| 6 | dashboardPort | string | 8265 | No | The port to open for the Ray dashboard | --dashboardPort=8265 |
| 7 | clientServerPort | string | 10001 | No | The port to open for the client server | --clientServerPort=10001 |
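All of these arguments are optional and shown above with their defaults. If your Volcano Job uses non-default task or container names, pass the matching arguments to the plugin. A minimal sketch, assuming hypothetical task names `ray-head` and `ray-worker` (these names are illustrative, not required by the plugin):

```yaml
spec:
  plugins:
    ray: ["--head=ray-head", "--worker=ray-worker", "--dashboardPort=8265"]
    svc: []
```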
Examples
This guide is based on the instructions provided in the RayCluster Quick Start.
First, create a Ray cluster using the YAML manifest shown below.
- For more details about Ray clusters, see the Ray Cluster Key Concepts documentation.
- For more details about how to compose a Ray cluster, see Launching an On-Premise Cluster.
```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: ray-cluster-job
spec:
  minAvailable: 3
  schedulerName: volcano
  plugins:
    ray: []
    svc: []
  policies:
    - event: PodEvicted
      action: RestartJob
  queue: default
  tasks:
    - replicas: 1
      name: head
      template:
        spec:
          containers:
            - name: head
              image: rayproject/ray:latest-py311-cpu
              resources: {}
          restartPolicy: OnFailure
    - replicas: 2
      name: worker
      template:
        spec:
          containers:
            - name: worker
              image: rayproject/ray:latest-py311-cpu
              resources: {}
          restartPolicy: OnFailure
```
Once applied, a Ray cluster consisting of a head node and two worker nodes is provisioned.
```shell
kubectl get pod
NAME                       READY   STATUS    RESTARTS   AGE
ray-cluster-job-head-0     1/1     Running   0          106s
ray-cluster-job-worker-0   1/1     Running   0          106s
ray-cluster-job-worker-1   1/1     Running   0          106s
```
Along with the cluster, a ray-cluster-job-head-svc Kubernetes service resource is also created.
(ray-cluster-job service is created by svc plugin.)
```shell
kubectl get service
NAME                       TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                       AGE
ray-cluster-job            ClusterIP   None           <none>        <none>                        3s
ray-cluster-job-head-svc   ClusterIP   10.96.184.65   <none>        6379/TCP,8265/TCP,10001/TCP   3s
```
Now that the service name is available, use port-forwarding to access the Ray Dashboard port which is 8265 by default.
```shell
# Execute this in a separate shell.
kubectl port-forward service/ray-cluster-job-head-svc 8265:8265 > /dev/null &
```
Now that the Dashboard port is accessible, submit a job to the Ray cluster:

```shell
# The job's logs will print the Ray cluster's total resource capacity.
ray job submit --address http://localhost:8265 -- python -c "import ray; ray.init(); print(ray.cluster_resources())"
```
```
Job submission server address: http://localhost:8265

-------------------------------------------------------
Job 'raysubmit_W8nYZjW4HEFG6Mqa' submitted successfully
-------------------------------------------------------

Next steps
  Query the logs of the job:
    ray job logs raysubmit_W8nYZjW4HEFG6Mqa
  Query the status of the job:
    ray job status raysubmit_W8nYZjW4HEFG6Mqa
  Request the job to be stopped:
    ray job stop raysubmit_W8nYZjW4HEFG6Mqa

Tailing logs until the job exits (disable with --no-wait):
2025-09-23 14:58:49,442 INFO job_manager.py:531 -- Runtime env is setting up.
2025-09-23 14:59:00,106 INFO worker.py:1630 -- Using address 10.244.2.42:6379 set in the environment variable RAY_ADDRESS
2025-09-23 14:59:00,144 INFO worker.py:1771 -- Connecting to existing Ray cluster at address: 10.244.2.42:6379...
2025-09-23 14:59:00,161 INFO worker.py:1942 -- Connected to Ray cluster. View the dashboard at http://10.244.2.42:8265
{'memory': 16277940225.0, 'node:10.244.4.41': 1.0, 'object_store_memory': 6976260095.0, 'CPU': 30.0, 'node:10.244.3.42': 1.0, 'node:10.244.2.42': 1.0, 'node:__internal_head__': 1.0}

------------------------------------------
Job 'raysubmit_W8nYZjW4HEFG6Mqa' succeeded
------------------------------------------
```
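The `ray job submit` CLI talks to the dashboard address over HTTP, so jobs can also be submitted programmatically. A minimal sketch, assuming the Ray Jobs REST API is served at `/api/jobs/` on the dashboard address port-forwarded above (the `DASHBOARD` constant and both helper functions are illustrative, not part of the plugin):

```python
import json
import urllib.request

# Assumption: the dashboard is reachable at localhost:8265 via the
# kubectl port-forward started earlier.
DASHBOARD = "http://localhost:8265"

def build_submit_request(entrypoint: str) -> urllib.request.Request:
    # The entrypoint is the shell command the Ray job server runs on the
    # cluster, mirroring what `ray job submit -- <cmd>` does.
    body = json.dumps({"entrypoint": entrypoint}).encode("utf-8")
    return urllib.request.Request(
        f"{DASHBOARD}/api/jobs/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def submit(entrypoint: str) -> dict:
    # Requires a reachable dashboard; returns the server's JSON response.
    with urllib.request.urlopen(build_submit_request(entrypoint)) as resp:
        return json.load(resp)
```

With the port-forward active, `submit('python -c "import ray; ray.init(); print(ray.cluster_resources())"')` would submit the same job as the CLI invocation above.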
Visit ${YOUR_IP}:8265 in your browser for the Dashboard (for example, 127.0.0.1:8265). The job you submitted above appears in the Recent jobs pane.
