AIO-interface: add Unhealthy container state #5307

I really like that you refactored the container state logic from an interface to an enum. However, I think the two concerns could have been split into separate PRs: one to refactor the interface and another to implement the new health states. But the code is already there, so why not.

Looks good otherwise, but I did not test it.

php/src/Container/Container.php (outdated, resolved)

php/src/Controller/DockerController.php (outdated, resolved)

php/src/Docker/DockerActionManager.php (outdated, resolved)

php/README.md (resolved)

php/src/Container/IContainerState.php (outdated, resolved)

@docjyJ

Collaborator Author

docjyJ commented Oct 1, 2024

Hi,

Thanks for the reply. I'll look into it as soon as I have time.

I will try to split this into several PRs.

Take care,

@docjyJ docjyJ mentioned this pull request

Oct 2, 2024

[PHP] Continuous Integration / Clean code #5368

Open

8 tasks

@szaimen szaimen mentioned this pull request

Oct 2, 2024

Update Configuration management #5328

Closed

@docjyJ docjyJ added 2. developing blocked and removed 3. to review labels

Oct 2, 2024

@docjyJ docjyJ marked this pull request as draft

October 2, 2024 13:11

@docjyJ

Collaborator Author

docjyJ commented Oct 2, 2024

I'm keeping this PR open for the unhealthy state.
See #5372 for the enum.

@docjyJ docjyJ force-pushed the ench/noid/heathcheck branch from 1f3f231 to b9ca83a Compare

October 7, 2024 07:45

@docjyJ docjyJ changed the title ~~AIO: add IContainerState~~ AIO: add Unhealthy container state

Oct 7, 2024

@docjyJ docjyJ mentioned this pull request

Oct 7, 2024

AIO-interface: Use enum instead of interface for state #5372

Merged

@szaimen

Collaborator

szaimen commented Oct 8, 2024

Conflicts :/

@docjyJ

Collaborator Author

docjyJ commented Oct 8, 2024

Don't worry, I'll handle it.

@docjyJ docjyJ force-pushed the ench/noid/heathcheck branch from b9ca83a to 6a3c340 Compare

October 8, 2024 09:23

@docjyJ

Collaborator Author

docjyJ commented Oct 8, 2024

Conflicts solved, branch up to date and ready (still to be tested, anyway...)

@docjyJ docjyJ marked this pull request as ready for review

October 8, 2024 09:24

@docjyJ docjyJ added 3. to review and removed 2. developing blocked labels

Oct 8, 2024

@szaimen szaimen changed the title ~~AIO: add Unhealthy container state~~ AIO-interface: add Unhealthy container state

Oct 8, 2024

@docjyJ

Collaborator Author

docjyJ commented Oct 8, 2024

I have podman on my machine and I can't launch the container... I can't test it and I don't have time to debug...

@szaimen

Collaborator

szaimen commented Oct 8, 2024

> I have podman on my machine and I can't launch the container...

Why, may I ask? Did you run into an issue here?

st3iny

st3iny approved these changes

Oct 15, 2024

Member

@st3iny st3iny left a comment

Awesome! Looks good to me now!

@docjyJ

Collaborator Author

docjyJ commented Oct 15, 2024 (edited)

Health checks may need to be adjusted, probably by reducing the interval during startup to speed up the startup of all containers.

> start period provides initialization time for containers that need time to bootstrap. Probe failure during that period will not be counted towards the maximum number of retries. However, if a health check succeeds during the start period, the container is considered started and all consecutive failures will be counted towards the maximum number of retries.

> start interval is the time between health checks during the start period. This option requires Docker Engine version 25.0 or later.

See: https://docs.docker.com/reference/dockerfile/#healthcheck
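To make the quoted rules concrete, here is a small sketch that replays a list of probe results and prints the resulting health status. It is a simplified model of the documented behaviour, not Docker's implementation; the function name `simulate_health` and the `SECONDS:RESULT` argument format are made up for illustration.

```bash
#!/usr/bin/env bash
# Simplified model of the start-period rules quoted above (not Docker's code).
#
# Usage: simulate_health START_PERIOD RETRIES "SECONDS:RESULT" ...
#   SECONDS - time since container start at which the probe ran
#   RESULT  - "ok" or "fail"
simulate_health() {
    local start_period=$1 retries=$2
    shift 2
    local status=starting started=0 failures=0 probe at result
    for probe in "$@"; do
        at=${probe%%:*}
        result=${probe##*:}
        if [ "$result" = ok ]; then
            # Any success marks the container as started and resets the streak.
            started=1
            failures=0
            status=healthy
        elif [ "$started" -eq 1 ] || [ "$at" -ge "$start_period" ]; then
            # Failures only count after a first success or once the
            # start period is over.
            failures=$((failures + 1))
            if [ "$failures" -ge "$retries" ]; then
                status=unhealthy
            fi
        fi
    done
    echo "$status"
}
```

With a 600 second start period and 3 retries, `simulate_health 600 3 30:fail 60:fail 90:fail 120:fail` stays at `starting`, while `simulate_health 600 3 30:ok 60:fail 90:fail 120:fail` ends up `unhealthy`, because the early success ends the grace period.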

@docjyJ

Collaborator Author

docjyJ commented Oct 15, 2024

This should allow for better dependency management.
It could also be used for the Helm chart, I guess.

@szaimen

Collaborator

szaimen commented Oct 15, 2024 (edited)

Thanks for the idea! However, our health checks are currently built in a way that they never fail after a specific time. See for example https://github.com/nextcloud/all-in-one/blob/main/Containers/apache/healthcheck.sh. For the rest, the defaults are good enough imho.

@docjyJ

Collaborator Author

docjyJ commented Oct 15, 2024

> Thanks for the idea! However, our health checks are currently built in a way that they never fail after a specific time. See for example https://github.com/nextcloud/all-in-one/blob/main/Containers/apache/healthcheck.sh. For the rest, the defaults are good enough imho.

The problem with doing this is that Docker then considers the container ready... A passing check should immediately take the container out of the starting state.

szaimen

szaimen reviewed

Oct 15, 2024

php/src/Docker/DockerActionManager.php (resolved)

szaimen

szaimen requested changes

Oct 15, 2024

Collaborator

@szaimen szaimen left a comment

Blocking, see above

@szaimen

Collaborator

szaimen commented Oct 15, 2024

This is the only reliable way to check if the container is really up and running.

@docjyJ

Collaborator Author

docjyJ commented Oct 15, 2024

A fix would be to fail the health check if nextcloud:9000 is not reachable and to add a start period (e.g. 10 minutes):

#!/bin/bash
- nc -z "$NEXTCLOUD_HOST" 9000 || exit 0
+ nc -z "$NEXTCLOUD_HOST" 9000 || exit 1
nc -z 127.0.0.1 8000 || exit 1
nc -z 127.0.0.1 "$APACHE_PORT" || exit 1
if ! nc -z "$NC_DOMAIN" 443; then
 echo "Could not reach $NC_DOMAIN on port 443."
 exit 1
fi

- HEALTHCHECK CMD /healthcheck.sh
+ HEALTHCHECK --start-period=10m CMD /healthcheck.sh

@szaimen

Collaborator

szaimen commented Oct 15, 2024 (edited)

> A fix would be to fail the container if nextcloud:9000 is not reachable and add a startup period (e.g. 10 minutes)

This is the problem: we cannot specify a time here, as it depends on the overall setup (installed apps and features, number of users, the given hardware and so on), especially during upgrades.

@docjyJ

Collaborator Author

docjyJ commented Oct 15, 2024 (edited)

> This is the problem: we cannot specify a time here, as it depends on the overall setup (installed apps and features, number of users, the given hardware and so on), especially during upgrades.

Yes, I see...

Would it be better to manage dependencies the way Docker Compose does?

> The solution for detecting the ready state of a service is to use the condition attribute with one of the following options:
>
> - service_started
> - service_healthy. This specifies that a dependency is expected to be "healthy", which is defined with healthcheck, before starting a dependent service.

https://docs.docker.com/compose/how-tos/startup-order/
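For illustration, a Compose-style "wait until healthy" gate could be sketched in shell roughly as follows. This is only a sketch, not how AIO or Compose implements it: `get_health` and `wait_healthy` are hypothetical helper names, and the container is assumed to define a healthcheck (otherwise `.State.Health` is missing from the inspect output).

```bash
#!/usr/bin/env bash
# get_health NAME: prints starting, healthy or unhealthy for a container.
# Uses the Docker CLI; redefine this function for other runtimes or for tests.
get_health() {
    docker inspect --format '{{.State.Health.Status}}' "$1"
}

# wait_healthy NAME MAX_TRIES [DELAY_SECONDS]
# Returns 0 once NAME reports "healthy". Returns 1 if it reports "unhealthy",
# if the status cannot be read, or if MAX_TRIES polls pass without success.
wait_healthy() {
    local name=$1 max_tries=$2 delay=${3:-1} i status
    for ((i = 0; i < max_tries; i++)); do
        status=$(get_health "$name") || return 1
        case $status in
            healthy) return 0 ;;
            unhealthy) return 1 ;;
        esac
        sleep "$delay"
    done
    return 1
}
```

Note that `MAX_TRIES` times `DELAY_SECONDS` is still an upper bound on the startup time, so this approach needs a time limit just like a fixed start period does.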

@szaimen

Collaborator

szaimen commented Oct 15, 2024

I fear this is going to have the same problem afaics, no?

@docjyJ docjyJ requested a review from szaimen

October 18, 2024 21:40

@szaimen

Collaborator

szaimen commented Oct 21, 2024

@docjyJ LGTM now 😊

Can you please rebase the PR and squash the commits?
Afterwards we can merge this :)

@szaimen szaimen mentioned this pull request

Oct 18, 2024

publish new images after recent PRs are merged and tested #5427

Closed

20 tasks

@docjyJ docjyJ force-pushed the ench/noid/heathcheck branch from 730f4f4 to 9482e43 Compare

October 21, 2024 09:18

@docjyJ


 Improuve Container States

Signed-off-by: Jean-Yves <7360784+docjyJ@users.noreply.github.com>

@docjyJ docjyJ force-pushed the ench/noid/heathcheck branch from 9482e43 to 4798489 Compare

October 21, 2024 09:19

@docjyJ

Collaborator Author

docjyJ commented Oct 21, 2024

Done

szaimen

szaimen reviewed

Oct 21, 2024

php/src/Container/ContainerState.php (resolved)

szaimen

szaimen reviewed

Oct 21, 2024

php/src/Docker/DockerActionManager.php (resolved)

docjyJ

docjyJ commented

Oct 21, 2024

php/src/Container/ContainerState.php

Comment on lines +17 to +35

    public function isStarting(): bool {
        return $this == self::Starting;
    }

    public function isRestarting(): bool {
        return $this == self::Restarting;
    }

    public function isHealthy(): bool {
        return $this == self::Healthy;
    }

    public function isUnhealthy(): bool {
        return $this == self::Unhealthy;
    }

    public function isRunning(): bool {
        return $this->isHealthy() || $this->isUnhealthy() || $this->isStarting() || $this->isRestarting();
    }

Collaborator Author

@docjyJ docjyJ Oct 21, 2024

The Running state corresponds to the old GetRunningContainerState state.
Maybe there is a clearer name for it. Healthy means that the container is running without any problem detected by AIO.

php/src/Controller/DockerController.php

  // Don't start if container is already running
  // This is expected to happen if a container is defined in depends_on of multiple containers
- if ($container->GetRunningState() === ContainerState::Running) {
+ if ($container->GetContainerState()->isRunning()) {

Collaborator Author

@docjyJ docjyJ Oct 21, 2024

$container->GetRunningState() does not include Starting and Restarting.

Collaborator Author

@docjyJ docjyJ Oct 21, 2024

To keep this easy to understand, I went through the isSomething() functions.

php/src/Docker/DockerActionManager.php

Comment on lines -52 to -56

- if ($responseBody['State']['Running'] === true) {
-     return ContainerState::Running;
- } else {
-     return ContainerState::Stopped;
- }

Collaborator Author

@docjyJ docjyJ Oct 21, 2024

$responseBody['State']['Running'] is true regardless of whether the state is Healthy, Starting or Running.
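As a rough sketch of the distinction discussed in these review comments, the mapping from the container inspect fields to the states named in this PR could look like the following. It is written in shell only for illustration and is not the PR's PHP code. The helper names `container_state` and `is_running` are hypothetical, and treating a container without a healthcheck as Healthy is an assumption.

```bash
#!/usr/bin/env bash
# container_state RUNNING RESTARTING [HEALTH]
#   RUNNING, RESTARTING - "true"/"false", as in State.Running / State.Restarting
#   HEALTH              - State.Health.Status (starting, healthy, unhealthy);
#                         omitted when the container has no healthcheck
container_state() {
    local running=$1 restarting=$2 health=${3:-none}
    if [ "$restarting" = true ]; then
        echo Restarting
    elif [ "$running" != true ]; then
        echo Stopped
    else
        case $health in
            starting) echo Starting ;;
            unhealthy) echo Unhealthy ;;
            *) echo Healthy ;;  # assumption: no healthcheck counts as Healthy
        esac
    fi
}

# is_running STATE: mirrors isRunning() from the enum quoted above, which is
# true for every state except a stopped container.
is_running() {
    case $1 in
        Healthy | Unhealthy | Starting | Restarting) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_running "$(container_state true false starting)"` succeeds, so a dependency that is still starting would not be started a second time.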

szaimen

szaimen requested changes

Oct 24, 2024

Collaborator

@szaimen szaimen left a comment (edited)

@docjyJ I discussed the latest changes with @st3iny, and we agree that they do not make much sense, or at least are not straightforward enough to merge. I am very sorry for that.

I would like to ask you if you could restore the changes from b20e2a8 by running git checkout <hash> and then push the old changes to a new branch like so: git checkout -b ench/noid/heathcheck-restored. On top of that commit, I think only a check like https://github.com/nextcloud/all-in-one/pull/5307/files#diff-502ad3fbc5a2714763795f78cf314d4a76ada0cb3746530f2fb86fa471dd897bR91-R105 is missing (best added in a new commit).

Can you please do the above? I would do it myself, but unfortunately I no longer have your changes available locally.

@szaimen szaimen added 2. developing and removed 3. to review labels

Oct 27, 2024

@szaimen szaimen modified the milestones: v9.8.0, next

Oct 30, 2024

@szaimen szaimen marked this pull request as draft

November 4, 2024 09:24

@szaimen szaimen removed this from the next milestone

Nov 4, 2024

@docjyJ docjyJ mentioned this pull request

Dec 4, 2024

containers-schema.json: add healtchecks and adjust containers accordingly #5689

Merged

2 tasks

Labels

2. developing enhancement

3 participants

@docjyJ @szaimen @st3iny


@docjyJ docjyJ commented Sep 21, 2024
