Description
We are currently running the Kubeflow spark-operator on Linode LKE Kubernetes with auto-scaling enabled.
I have noticed that when a job is large enough to trigger a big scale-up (i.e. one that needs all possible nodes in a node pool), the Volcano-scheduled pods are stuck in the Pending phase and the PodGroup is stuck in Pending as well.
If I halve the RAM and CPU requests, the scale-up is triggered and the PodGroup succeeds. I am not sure why, or how the calculation is done on the Volcano side.
My volcano config is:
actions: "enqueue, allocate, preempt, backfill"
tiers:
- plugins:
  - name: priority
  - name: conformance
- plugins:
  - name: overcommit
    arguments:
      overcommit-factor: 15.0
  - name: drf
    enablePreemptable: false
  - name: predicates
  - name: capacity
  - name: nodeorder
  - name: binpack
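(For context, and this is just my understanding of the scheduler, not something I have confirmed in the code: the overcommit plugin lets the enqueue action admit PodGroups against roughly overcommit-factor × the cluster's current allocatable resources, so with even one 16C/300GB node up that check should pass here; the capacity plugin, however, enforces each queue's capability as a hard quota that the overcommit factor does not lift.)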
My queue:
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: myqueue
spec:
  reclaimable: true
  weight: 1
  capability:
    cpu: "200"
    memory: "1200G"
status:
  state: Open
My K8s cluster has auto-scaling enabled on the node pool, with min 1 and max 10 nodes (16 cores, 300GB RAM each).
My spark job conf:
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-sleep-2
  namespace: spark
spec:
  type: Python
  mode: cluster
  image: "redacted-private-repo/spark:k8s-3.5.1"
  imagePullSecrets:
    - "regcred"
  imagePullPolicy: Always
  mainApplicationFile: "local:///opt/spark/work-dir/sleep_forever.py"
  sparkVersion: "3.5.1"
  batchScheduler: volcano
  batchSchedulerOptions:
    priorityClassName: urgent
    queue: myqueue
  restartPolicy:
    type: Never
  driver:
    cores: 2
    memory: "8G"
    labels:
      version: 3.5.1
    serviceAccount: spark
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: "node.kubernetes.io/instance-type"
                  operator: In
                  values:
                    - "g6-dedicated-32"
  executor:
    cores: 15
    instances: 8
    memory: "200G"
    labels:
      version: 3.5.1
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: "node.kubernetes.io/instance-type"
                  operator: In
                  values:
                    - "g7-highmem-16"
I have tried various volcano-scheduler.conf options but the same error persists.
PodGroup reports:
1/1 tasks in gang unschedulable: pod group is not ready, 1 Pending, 1 minAvailable; Pending: 1 Unschedulable
queue resource quota insufficient
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  name: spark-spark-sleep-2-pg
  namespace: spark
...
status:
  phase: Pending
spec:
  minMember: 1
  minResources:
    cpu: '122'
    memory: 1608G
  priorityClassName: urgent
  queue: myqueue
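If I am reading this right, the "queue resource quota insufficient" event is the capacity plugin rejecting the PodGroup at enqueue time: minResources.memory (1608G) exceeds the queue's capability (1200G), while the halved request (roughly 800G) fits under the cap, which would explain why halving the requests unblocks the job. A minimal sketch of a queue that should admit the full-size job, assuming the capability really is the binding check (1700G is just an illustrative value above 1608G):

apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: myqueue
spec:
  reclaimable: true
  weight: 1
  capability:
    cpu: "200"
    memory: "1700G"  # illustrative: anything >= the PodGroup's 1608G minResources

The node pool at max scale (10 × 16C/300GB ≈ 160 CPU / 3000GB) has room for the 8 × 200G executors, so the queue capability, not node capacity, looks like the limiting factor.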
Is anyone aware of how to fix this issue? I removed the gang scheduling plugin as per #2558, but that did not work.
Describe the results you received and expected
Received: the PodGroup stays stuck in Pending and the scale-up never happens. Expected: the autoscaler provisions the nodes and the job runs.
What version of Volcano are you using?
1.10
Any other relevant information
k8s 1.29