doc: add serverless doc with keda and activator. #499
X1aoZEOuO
commented
Sep 28, 2025
/kind feature
X1aoZEOuO
commented
Sep 28, 2025
/kind documentation
This doc should be merged last: #500 adds the Makefile task.
/hold
until #500 is merged
Thanks!
kerthcet
left a comment
Please explain the relationship between activator and keda at the very beginning. Thanks!
I think we should capture more than just this service, right? Let's add some explanation here.
Thanks! I've expanded both the Prometheus and KEDA configuration sections with detailed explanations.
X1aoZEOuO
commented
Oct 29, 2025
Please explain the relationship between activator and keda at the very beginning. Thanks!
@kerthcet Thank you for the feedback! I've added a new section "Relationship Between Activator and KEDA". This section now clearly explains:
- How KEDA handles dynamic scaling based on metrics (monitoring and adjusting replicas)
- How the Activator serves as a request interceptor for scale-from-zero scenarios
- How these two components work together to enable true serverless behavior
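To make the division of labor in the three points above concrete, here is a toy Python sketch. It is not llmaz, KEDA, or activator code: `Backend`, `Activator`, `desired_replicas`, and the `per_replica` threshold are all hypothetical names made up for illustration, and the way the real components signal each other is not modeled. It only shows the idea that one piece holds requests while replicas are at zero and another piece picks a replica count from a metric.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Backend:
    """Hypothetical stand-in for a model-serving workload with a replica count."""
    replicas: int = 0
    served: List[str] = field(default_factory=list)

    def handle(self, request: str) -> None:
        if self.replicas == 0:
            raise RuntimeError("no replicas available")
        self.served.append(request)


def desired_replicas(pending: int, per_replica: int, max_replicas: int) -> int:
    """Metric-driven scaling rule (the role KEDA plays in this sketch)."""
    if pending <= 0:
        return 0  # nothing to do: scale to zero
    needed = -(-pending // per_replica)  # ceiling division
    return min(needed, max_replicas)


class Activator:
    """Holds requests while the backend is at zero, then replays them."""

    def __init__(self, backend: Backend) -> None:
        self.backend = backend
        self.buffer: List[str] = []

    def receive(self, request: str) -> None:
        if self.backend.replicas == 0:
            self.buffer.append(request)   # intercept: nothing can serve it yet
        else:
            self.backend.handle(request)  # replicas exist: pass straight through

    def flush(self) -> None:
        """Replay buffered requests in arrival order once a replica is up."""
        if self.backend.replicas > 0:
            while self.buffer:
                self.backend.handle(self.buffer.pop(0))


# Five requests arrive while the workload is scaled to zero.
backend = Backend()
activator = Activator(backend)
for i in range(5):
    activator.receive(f"req-{i}")

# The scaler reacts to the pending-request metric, then the buffer is drained.
backend.replicas = desired_replicas(len(activator.buffer), per_replica=2, max_replicas=10)
activator.flush()
```

In this sketch no request is dropped during the scale-from-zero window, which is the behavior the new doc section attributes to combining the two components.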
Signed-off-by: X1aoZEOuO <nizefeng2002@outlook.com>
Signed-off-by: X1aoZEOuO <nizefeng2002@outlook.com>
354e4ee to 56fb1ac
@pacoxu @kerthcet The Helm CI job seems to have failed because of a network error. Can we disable or ignore it for now?
https://github.com/InftyAI/llmaz/actions/runs/18917273800/job/54003859685?pr=499
kerthcet
commented
Oct 30, 2025
/retest
/retest
@kerthcet Hello, it seems the e2e test ran out of disk space ("No space left on device"). https://github.com/InftyAI/llmaz/actions/runs/18917278030/job/54003874165?pr=500
[FAILED] in [It] - /home/runner/work/llmaz/llmaz/test/util/validation/validate_playground.go:219 @ 10/29/25 18:02:24.453
[FAILED] in [AfterEach] - /home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:50 @ 10/29/25 18:02:24.453
• [FAILED] [335.923 seconds]
playground e2e tests [It] SpeculativeDecoding with llama.cpp
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:145
[FAILED] Timed out after 335.612s.
Expected success, but got an error:
<*url.Error | 0xc000712900>:
Error: No space left on device : '/home/runner/actions-runner/cached/_diag/pages/7ae5050e-5137-471d-b700-9b1bd0d8553b_338ff102-8e76-46a1-a5ae-f669195390f6_1.log'
X1aoZEOuO
commented
Oct 30, 2025
/retest
@kerthcet Also, the Helm install step did not complete. https://github.com/InftyAI/llmaz/actions/runs/18917273800/job/54003859685?pr=499
Installing v3.17.3
Downloading 'v3.17.3' from 'https://get.helm.sh/'
Request timeout: /helm-v3.17.3-linux-amd64.tar.gz
Waiting 20 seconds before trying again
Request timeout: /helm-v3.17.3-linux-amd64.tar.gz
Waiting 14 seconds before trying again
Error: Error: Failed to download Helm from location https://get.helm.sh/helm-v3.17.3-linux-amd64.tar.gz
kerthcet
left a comment
/approve
/lgtm
kerthcet
commented
Oct 30, 2025
/retest
kerthcet
commented
Oct 30, 2025
/retest
pacoxu
commented
Oct 31, 2025
/retest
pacoxu
commented
Oct 31, 2025
/retest
X1aoZEOuO
commented
Oct 31, 2025
/retest
@pacoxu The CI issue appears to have popped up over the last couple of weeks, likely as a result of recent changes to the GitHub environment. I've noted the issue and will take a look at it in the next PR. #498 (comment)
pacoxu
commented
Oct 31, 2025
/retest
/retest
@pacoxu The CI issue appears to have popped up over the last couple of weeks, likely as a result of recent changes to the GitHub environment. I've noted the issue and will take a look at it in the next PR. #498 (comment)
#508: see https://github.com/InftyAI/llmaz/actions/runs/18964011867/job/54157010420
Adding the Free Disk Space step (https://github.com/InftyAI/llmaz/pull/508/files#diff-2976cc01ce3201f4a03e4021cdefdc1cd95b67efed6798930c222c59d1771f9aR73-R88) can fix the CI failure.
/retest
I opened kerthcet/github-workflow-as-kube#15 to fix the CI. (After that, llmaz should bump the workflow version.)
What this PR does / why we need it
This PR adds a guide for configuring serverless environments on Kubernetes, integrating Prometheus for monitoring and KEDA for autoscaling. The guide aims to improve resource efficiency through event-driven scaling while maintaining observability and resilience for AI/ML workloads and other latency-sensitive applications. It includes YAML configurations, step-by-step installation instructions, and performance benchmarks.
Which issue(s) this PR fixes
Fixes #362
Special notes for your reviewer
Does this PR introduce a user-facing change?
cc @pacoxu @kerthcet