nutanix upgrade 4.18 -> 4.19 #65740
/pj-rehearse pull-ci-openshift-eng-ocp-qe-perfscale-ci-main-nutanix-4.19-nightly-x86-loaded-upgrade-from-4.18-loaded-upgrade-418to419-24nodes
@skordas: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
Could you try re-running with the suggested changes, and we will check if the error persists? I think the same worked for an IBM profile, so let's see how it goes with a Nutanix profile.
...enshift-eng-ocp-qe-perfscale-ci-main__nutanix-4.19-nightly-x86-loaded-upgrade-from-4.18.yaml
      memory: 200Mi
tests:
- as: loaded-upgrade-418to419-24nodes
  cluster: build01
I think we don't need this `cluster` field.
...enshift-eng-ocp-qe-perfscale-ci-main__nutanix-4.19-nightly-x86-loaded-upgrade-from-4.18.yaml
/pj-rehearse pull-ci-openshift-eng-ocp-qe-perfscale-ci-main-nutanix-4.19-nightly-x86-loaded-upgrade-from-4.18-loaded-upgrade-418to419-24nodes
@skordas: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
env:
  COMPUTE_CPU: "8"
  COMPUTE_MEMORY: "32000"
  COMPUTE_REPLICAS: "3"
Please add `GC: "false"`, as we want to keep the workload during the upgrade.
Also, maybe you can add these params and try again:
OPENSHIFT_INFRA_NODE_INSTANCE_MEMORYSIZE: 64Gi
OPENSHIFT_INFRA_NODE_INSTANCE_VCPU: "16"
SET_ENV_BY_PLATFORM: custom
I'm suggesting this because I see the following error:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 136m nutanixcontroller ci-op-gmvftkmm-53c18-vs8np-worker-1-77gnn: reconciler failed to Create machine: failed to create VM: ci-op-gmvftkmm-53c18-vs8np-worker-1-77gnn failed to create the vm: error_detail: INVALID_ARGUMENT: Invalid Argument: 6
:No host has enough available resources for VM bd737d9f-14b3-47d2-9528-7ec3eaa8622f., progress_message: create_vm
Warning FailedUpdate 98s (x273 over 135m) nutanixcontroller ci-op-gmvftkmm-53c18-vs8np-worker-1-77gnn: reconciler failed to Update machine: The retrieved VM "ci-op-gmvftkmm-53c18-vs8np-worker-1-77gnn" has ERROR state. error: [{"message": "Invalid Argument: 6\n :No host has enough available resources for VM bd737d9f-14b3-47d2-9528-7ec3eaa8622f.", "reason": "INVALID_ARGUMENT"}]
error: all 24 nodes didn't become READY in time, failing
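For reference, here is a rough sketch of how the `env` block from the diff might look with the suggestions in this comment applied. The keys and values are copied from the diff context and from this comment; the key order and indentation are assumptions, and the last three entries are the optional infra-node parameters.

```yaml
env:
  COMPUTE_CPU: "8"
  COMPUTE_MEMORY: "32000"
  COMPUTE_REPLICAS: "3"
  # Suggested: keep the workload in place during the upgrade.
  GC: "false"
  # Suggested as something to try; values as given in this comment.
  OPENSHIFT_INFRA_NODE_INSTANCE_MEMORYSIZE: 64Gi
  OPENSHIFT_INFRA_NODE_INSTANCE_VCPU: "16"
  SET_ENV_BY_PLATFORM: custom
```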
@mehabhalodiya
This test is very unstable: every time it fails on a different step. The last failure came from not having enough resources to scale from 3 to 24 nodes, which is a resource problem, not a problem with the step.
I added the `GC` variable, but the rest are for additional infra nodes, so I'm not adding them for this test (the test configuration would not pass `make jobs` / `make update` verification).
- ref: workers-scale
- chain: openshift-qe-cluster-density-v2
- chain: openshift-upgrade-qe-sanity
- ref: openshift-qe-connectivity-check
You can remove `- ref: openshift-qe-connectivity-check`, as it is not required; this step is typically used in an IPsec cluster.
Removed this step as suggested.
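With that step dropped, the tail of the step list quoted in this thread would look roughly like the sketch below. It is based only on the four lines shown in the diff context; the enclosing key and indentation are not shown in the thread and are left out rather than guessed.

```yaml
- ref: workers-scale
- chain: openshift-qe-cluster-density-v2
- chain: openshift-upgrade-qe-sanity
# "- ref: openshift-qe-connectivity-check" removed per the review comment
```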
/pj-rehearse pull-ci-openshift-eng-ocp-qe-perfscale-ci-main-nutanix-4.19-nightly-x86-loaded-upgrade-from-4.18-loaded-upgrade-418to419-24nodes
@skordas: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/lgtm
Thank you!
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mehabhalodiya, skordas
/pj-rehearse ack
@mehabhalodiya: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
@skordas: all tests passed!
* nutanix upgrade 4.18 -> 4.19
* Mulitzone istallation
* New flow for upgrade
* Debugging - something is wrong here
* Revert "Debugging - something is wrong here"
  This reverts commit bad7647.
* adding GC variable to test, removing step
* fix for metadata
https://issues.redhat.com/browse/OCPQE-28934