
add MachineHealthCheck #146


Draft: 3deep5me wants to merge 5 commits into k8s-proxmox:main from 3deep5me:add-MachineHealthCheck

Conversation

3deep5me (Contributor) commented Nov 14, 2023 (edited)

Hi @sp-yduck,

this adds a MachineHealthCheck for all Machines of a new cluster. It can help when a node does not reach the Running state, e.g. because of #145, network problems during startup, or other reasons.
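For context, a minimal sketch of what such a MachineHealthCheck manifest could look like; the name, timeouts, and condition values below are illustrative, not necessarily what the PR adds:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: "${CLUSTER_NAME}-mhc"
spec:
  clusterName: "${CLUSTER_NAME}"
  # Remediate regardless of how many Machines are unhealthy
  maxUnhealthy: 100%
  # How long to wait for a Machine's Node to appear before remediating
  nodeStartupTimeout: 10m
  selector:
    matchLabels:
      # One MHC covering every Machine of the cluster
      cluster.x-k8s.io/cluster-name: "${CLUSTER_NAME}"
  unhealthyConditions:
    - type: Ready
      status: Unknown
      timeout: 300s
    - type: Ready
      status: "False"
      timeout: 300s
```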

sp-yduck (Collaborator):

Thank you for the PR!
As you may know, this template file is used for the Quick Start. I want to keep the Quick Start a minimal setup, so if you want to include this, one option is:

  1. create a new template for it so that users can choose a specific template to try specific features
    ref: https://cluster-api.sigs.k8s.io/clusterctl/commands/generate-cluster.html?highlight=flavor#flavors
    ref: https://cluster-api.sigs.k8s.io/clusterctl/commands/generate-cluster.html?highlight=flavor#alternative-source-for-cluster-templates
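For reference, flavors are selected when generating a cluster manifest. Assuming the new template were published as a release asset named cluster-template-mhc.yaml (the flavor name and version values here are assumptions), usage would look roughly like:

```sh
# Pick the hypothetical "mhc" flavor instead of the default template
clusterctl generate cluster my-cluster \
  --infrastructure proxmox \
  --flavor mhc \
  --kubernetes-version v1.27.3 \
  --control-plane-machine-count 1 \
  --worker-machine-count 3 > my-cluster.yaml
```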

3deep5me (Contributor, author):

I also changed the location of the quick start; I hope this is fine for you.
I don't know how clusterctl knows where to find the cluster templates.
Do I have to change anything else to make this work?

sp-yduck (Collaborator):

clusterctl checks the assets of the release, so the file changes are OK. The thing is, I am using make release to output these assets for each release.

  1. make release-templates
    https://github.com/sp-yduck/cluster-api-provider-proxmox/blob/bbcdd56993d21f9e3581eed558806fc909b71cec/Makefile#L219-L220

  2. make generate-e2e-templates
    https://github.com/sp-yduck/cluster-api-provider-proxmox/blob/bbcdd56993d21f9e3581eed558806fc909b71cec/Makefile#L111-L113

I think you can use kustomize build template/base or something similar for both of them.
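A sketch of that idea; the output path is an assumption, and the overlay directory is the template/base mentioned above:

```sh
# Render the kustomization into a single template file that can be
# shipped as a release asset (output location is illustrative)
kustomize build template/base > out/cluster-template.yaml
```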


```yaml
maxUnhealthy: 100%
selector:
  matchLabels:
    cluster.x-k8s.io/cluster-name: '${CLUSTER_NAME}'
```
sp-yduck (Collaborator):

Is it a good idea to have a single MHC for all the nodes in a cluster? Maybe it is better to have different MHCs for the control plane and the others. What do you think?

3deep5me (Contributor, author):

I think it's more common to have two MHCs, but in this simple setup I see no reason for that. But I could be wrong; do you have an idea how we could utilize two MHCs?

3deep5me (Contributor, author):

To be precise, I don't know how the two MHCs should differ.

sp-yduck (Collaborator):

https://cluster-api.sigs.k8s.io/tasks/automated-machine-management/healthchecking.html#creating-a-machinehealthcheck
There is an example MHC for both workers (node-unhealthy-5m) and the control plane (kcp-unhealthy-5m).
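For illustration, the main way the two would differ is the label selector (and potentially the timeouts); a sketch, with the names and the MachineDeployment label value assumed:

```yaml
# Worker MHC: selects only Machines owned by a MachineDeployment
# (the deployment name "${CLUSTER_NAME}-md-0" is an assumption)
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: "${CLUSTER_NAME}-node-unhealthy-5m"
spec:
  clusterName: "${CLUSTER_NAME}"
  selector:
    matchLabels:
      cluster.x-k8s.io/deployment-name: "${CLUSTER_NAME}-md-0"
---
# Control-plane MHC: selects Machines carrying the control-plane label
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: "${CLUSTER_NAME}-kcp-unhealthy-5m"
spec:
  clusterName: "${CLUSTER_NAME}"
  selector:
    matchLabels:
      cluster.x-k8s.io/control-plane: ""
```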

3deep5me (Contributor, author):

Hmm, okay. But the examples do not differ. We can definitely do two MHCs if you want to be more Cluster API conformant.

sp-yduck (Collaborator) commented Nov 20, 2023 (edited)

By the way, I believe MHC does not help for

> E.g. #145 or network problems during startup or other reasons.

since MHC checks the Machine and Node objects to confirm whether the Node (in the workload cluster) is ready. So, as in issue #145, if the VM becomes unhealthy before it joins the Kubernetes cluster, MHC cannot find the Node associated with that unhealthy VM and cannot remediate it.

3deep5me (Contributor, author) commented Nov 23, 2023 (edited)

I'm not sure about the detailed mechanics.
But I can confirm that if a VM does not boot, it is deleted and recreated.
(But at the moment no VM boots, because I get the error every time 😢)

I think MHC by default also checks the status.conditions[].type: Ready field. At least, that is the only way I can explain the behavior.
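The recreate-on-failed-boot behavior described above is also consistent with MachineHealthCheck's nodeStartupTimeout: a Machine whose Node never appears within that window is considered failed and remediated, even though no Node object exists yet. A hedged sketch of the relevant spec fields (values are illustrative):

```yaml
spec:
  # If a Machine's Node never joins within this window, the Machine is
  # marked unhealthy and remediated (deleted and recreated by its owner)
  nodeStartupTimeout: 10m
  # Conditions checked once the Node exists; unhealthyConditions must be
  # set explicitly, Ready/Unknown and Ready/False are the common choice
  unhealthyConditions:
    - type: Ready
      status: Unknown
      timeout: 300s
    - type: Ready
      status: "False"
      timeout: 300s
```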


Reviewers: @sp-yduck (left review comments)
