-
Notifications
You must be signed in to change notification settings - Fork 83
Conversation
@coderzc
coderzc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution. I reviewed this PR and found two issues that should be addressed before merge:
- [P1] New integration test is not executed in CI
- File: .github/workflows/hugegraph-llm.yml (Run integration tests step)
- test_flows_integration.py is added in this PR, but the workflow still runs only three explicit integration files. So the new tests are not part of PR checks and provide no regression protection.
- Suggestion: include the new file in the pytest command, or switch to directory-level collection with markers.
- [P2] Missing external-service skip guard in new integration tests
- File: hugegraph-llm/src/tests/integration/test_flows_integration.py
- The tests call real flows directly (BUILD_VECTOR_INDEX / GRAPH_EXTRACT / RAG_* / TEXT2GREMLIN) without should_skip_external()-style gating or mocks. Once enabled in CI/local integration runs, they can become flaky and environment-dependent.
- Suggestion: add explicit skip conditions for external dependencies, or stabilize with mocks/stubs for non-deterministic external calls.
Please address these and I can re-review quickly.
Thanks for the contribution. I reviewed this PR and found two issues that should be addressed before merge:
- [P1] New integration test is not executed in CI
- File: .github/workflows/hugegraph-llm.yml (Run integration tests step)
- test_flows_integration.py is added in this PR, but the workflow still runs only three explicit integration files. So the new tests are not part of PR checks and provide no regression protection.
- Suggestion: include the new file in the pytest command, or switch to directory-level collection with markers.
- [P2] Missing external-service skip guard in new integration tests
- File: hugegraph-llm/src/tests/integration/test_flows_integration.py
- The tests call real flows directly (BUILD_VECTOR_INDEX / GRAPH_EXTRACT / RAG_* / TEXT2GREMLIN) without should_skip_external()-style gating or mocks. Once enabled in CI/local integration runs, they can become flaky and environment-dependent.
- Suggestion: add explicit skip conditions for external dependencies, or stabilize with mocks/stubs for non-deterministic external calls.
Please address these and I can re-review quickly.
Thank u for your suggestions.
For P1, I plan to add a end-to-end test for all the test cases in the app demo, which need api-key to launch the basic test. So I don't add the test scripts to the CI. I wanna add a contributed.md for this project and call for all the contributor to check the basic functionality of this project. What about your opinion?
For P2, I don't simulate the external api because of the complexity of the workflow (caused by the nondeterminism of llm), so this test scripts is not added to the CI procedure.
I'm sorry I didn't reply to you in a timely manner.
... into flow_test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
下面 8 条评论按优先级给出。其中 3 条阻塞项必须修:
(1) test_rag 实际只跑了 raw 模式、未真正覆盖 4 种 RAG 模式;
(2) update_ui_configs 在测试中会写回 config_prompt.yaml 污染 contributor 工作区;
(3) 缺 should_skip_external 守卫导致默认 pytest 命令必爆。
另:PR 标题建议从 feat: 改为 test:,372 行新增中仅 1 行是生产代码 refactor。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_rag 实际只测了 raw 模式 + 测试会写回 config_prompt.yaml
问题 1(死代码 → 名义 4 模式实际只测 1 种)
update_ui_configs(...)(line 213-222)只调用一次,传入的 vector_only_answer / graph_only_answer / graph_vector_answer 都是 False,因此返回 vector_search=False, graph_search=False。
后续四段 schedule_flow 始终复用这同一组标记(见 line 227-228 / 251-252 / 275-276 / 299-300),而对局部 bool 的反复赋值(line 244-247 / 268-271 / 292-295)根本没传给 flow,是死代码。
=> 四次调用实际全部以"无检索"模式运行;RAG_VECTOR_ONLY / RAG_GRAPH_ONLY / RAG_GRAPH_VECTOR 这三条核心路径没被实际验证。CONTRIBUTING.md 中的 "All RAG modes" 描述不成立。
问题 2(副作用:测试会改 contributor 的 yaml)
update_ui_configs 在 prompt 内容变化时会触发 prompt.update_yaml_file()(见 rag_block.py:141-147)。Contributor 跑完该测试后,工作区里的 config_prompt.yaml 会被悄悄改写为测试中的 EN prompt 与 default_question="梁漱溟和梁济的关系是什么?",下次 git status 会冒出无关 diff,极易误提交。
修法:参数化覆盖 4 种模式 + 直接构造 vector_search / graph_search,绕开 demo helper 副作用:
@pytest.mark.parametrize( ("flow_name", "raw", "vec_only", "graph_only", "graph_vec"), [ (FlowName.RAG_RAW, True, False, False, False), (FlowName.RAG_VECTOR_ONLY, False, True, False, False), (FlowName.RAG_GRAPH_ONLY, False, False, True, False), (FlowName.RAG_GRAPH_VECTOR, False, False, False, True), ], ) def test_rag(self, flow_name, raw, vec_only, graph_only, graph_vec): query = "梁漱溟和梁济的关系是什么?" graph_search = graph_only or graph_vec vector_search = vec_only or graph_vec res = self.scheduler.schedule_flow( flow_name, query=query, vector_search=vector_search, graph_search=graph_search, raw_answer=raw, vector_only_answer=vec_only, graph_only_answer=graph_only, graph_vector_answer=graph_vec, graph_ratio=0.6, rerank_method="bleu", near_neighbor_first=False, custom_related_information="", answer_prompt=PromptConfig.answer_prompt_EN, keywords_extract_prompt=PromptConfig.keywords_extract_prompt_EN, gremlin_tmpl_num=-1, gremlin_prompt=PromptConfig.gremlin_generate_prompt_EN, ) assert isinstance(res, dict) and res.get("answer"), f"{flow_name} returned empty answer"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should_skip_external() 守卫,本地默认命令必爆
hugegraph-llm/src/tests/conftest.py:45 强制设置 SKIP_EXTERNAL_SERVICES=true,所有现存 integration test(如 test_kg_construction.py:126、test_rag_pipeline.py:113)都会调用 should_skip_external() 在此前提下自动 skip。
新文件没有这个守卫 → contributor 跑 pytest src/tests/ 或 pytest src/tests/integration/ 时,其它 integration test 全部跳过,只有这套新 test 不跳过且必爆(因为它会真实调 LLM/HugeGraph 但默认 SKIP=true 时基础设施不一定齐全),对 contributor 来说是最糟糕的体验。
@coderzc 已在 P2 提出过一次,作者回复"故意不进 CI"——但进不进 CI 与本地默认 pytest 行为是两回事。至少二选一:
方案 A:在 setup 中加守卫,与现有 integration test 一致:
from src.tests.test_utils import should_skip_external @pytest.fixture(autouse=True) def setup(self): if should_skip_external(): pytest.skip("Skipping tests that require external services") self.index_text = "..." self.scheduler = SchedulerSingleton.get_instance()
方案 B:用自定义 mark + CONTRIBUTING.md 引导:
# pytest.ini 或 pyproject.toml [tool.pytest.ini_options] markers = ["local_e2e: tests requiring HugeGraph + LLM API key (skipped by default)"] # 测试文件 @pytest.mark.local_e2e class TestFlowsIntegration: ...
并在 CONTRIBUTING.md 写明 pytest -m local_e2e 才会跑这套测试。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try/except + pytest.fail 反模式
本行(line 131)以及 line 189、line 199 三处 pytest.fail(...) 全部写成 "BUILD_VECTOR_INDEX flow failed: {e}",但实际抛出的可能是 BUILD_SCHEMA / PROMPT_GENERATE / IMPORT_GRAPH_DATA 等任意一个 flow。对于一个面向 contributor 本地自查的 smoke test,错误信息直接误导新人去排查错误的 flow。
更深层的问题是 try/except + pytest.fail(str(e)) 本身就是反模式:
- pytest 默认遇到异常就会 fail 并展示完整 traceback;这层包装反而吞掉异常类型与 cause chain;
- 多余的 try 块降低了可读性。
最简修法:直接删除 try/except 让异常自然冒泡。例如本测试(test_build_knowledge_graph):
def test_build_knowledge_graph(self): res = self.scheduler.schedule_flow(FlowName.BUILD_VECTOR_INDEX, [self.index_text]) assert "chunks" in res log.info("BUILD_VECTOR_INDEX flow executed successfully") # ...后续步骤同样直接调用,不要包 try/except
如果非要保留上下文,至少把 flow 名称变量化:pytest.fail(f"{current_flow} failed: {e}")。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_build_knowledge_graph 串行执行 BUILD_VECTOR_INDEX → GRAPH_EXTRACT → IMPORT_GRAPH_DATA → UPDATE_VID_EMBEDDINGS 四个 flow(line 41-129)。若 IMPORT_GRAPH_DATA 挂掉,配合上面那条复制粘贴 bug,contributor 看到的错误是 "BUILD_VECTOR_INDEX flow failed",会被严重误导。
另外 CONTRIBUTING.md 表格把它标为 1 项测试,与实际行为不符。
修法:拆成 4 个独立测试,用 fixture 共享上一步的产物;或至少改用 pytest-subtests:
def test_build_kg__vector_index(self): res = self.scheduler.schedule_flow(FlowName.BUILD_VECTOR_INDEX, [self.index_text]) assert "chunks" in res and len(res["chunks"]) > 0 def test_build_kg__graph_extract(self): data = self.scheduler.schedule_flow(FlowName.GRAPH_EXTRACT, ...) assert data["vertices"] and data["edges"] # 以此类推
好处:失败定位精确到 flow;CONTRIBUTING.md 表格与代码对齐;contributor 跑挂时不需要从头再来。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_schema_generator 无任何断言
本行 self.scheduler.schedule_flow(FlowName.BUILD_SCHEMA, ...) 没有任何断言,等价于"只要不抛异常就算通过"。schema 生成器返回空 list、错误结构、或 LLM 回了一句 "抱歉,我无法生成" 都会被判通过。
同样的弱断言遍布全文件:
- line 197 / 242 / 266 / 290 / 314 / 329:仅
assert res is not None - line 125:
assert res is not None(IMPORT_GRAPH_DATA)
对于面向 contributor 的本地自查,"通过"应该真正校验返回结构。
最低限度修法:
res = self.scheduler.schedule_flow(FlowName.BUILD_SCHEMA, ...) assert isinstance(res, dict) assert res.get("schema", {}).get("vertexlabels"), "BUILD_SCHEMA returned empty vertexlabels" assert res["schema"].get("edgelabels"), "BUILD_SCHEMA returned empty edgelabels"
理想做法:把期望产物存到 src/tests/data/flows/expected_schema.json(仓库已有 src/tests/data/ 目录),用结构 diff 校验。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SchedulerSingleton 是进程级单例,跨测试共享 pipeline_pool
SchedulerSingleton.get_instance()(见 flows/scheduler.py:181-191)是进程级单例,其 pipeline_pool 内每个 GPipelineManager 都会缓存 pipeline。这意味着:
- 顺序耦合:
test_build_knowledge_graph写入的图数据 / 向量索引会被后续test_rag看到。如果用户用pytest -k test_rag单独跑,test_rag是否通过取决于此前是否跑过test_build_knowledge_graph,行为不可复现。 - 脏 pipeline:某测试中途异常未走到
manager.release/add时,下一个测试可能拿到错误状态的 pipeline。
对 contributor 在反复修复-重试的场景下尤其糟糕:上一次失败留下的脏状态会让下一次的失败原因看起来与代码无关。
修法:teardown 中重置;或为每个测试用唯一的 graph_name(避免共享数据)。例如:
@pytest.fixture(autouse=True) def setup(self): if should_skip_external(): pytest.skip("...") self.index_text = "..." huge_settings.graph_name = f"test_{uuid.uuid4().hex[:8]}" self.scheduler = SchedulerSingleton.get_instance() yield # teardown:清空该 graph,或释放 pipeline
至少在文件 docstring 写明"测试间存在隐式数据依赖",让维护者心里有数。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uv 管理依赖,CONTRIBUTING.md 主命令应统一为 uv run pytest
hugegraph-llm/AGENTS.md 与 .github/workflows/hugegraph-llm.yml:82 都使用 uv run pytest。本文件 line 23 的 python -m pytest 在 contributor 没正确激活 .venv、或 .venv 缺 --extra llm 时会跑到系统 Python,得到莫名其妙的 import error。
CONTRIBUTING.md 是 contributor 的入口,命令必须与 CI 保持一致。
# Run the end-to-end integration tests cd hugegraph-llm uv run pytest src/tests/integration/test_flows_integration.py -v
另外,由于 conftest.py 默认 SKIP_EXTERNAL_SERVICES=true,contributor 直接跑命令时这套 test 可能被全部 skip。需要在文档里告知:
# 显式开启外部服务测试
SKIP_EXTERNAL_SERVICES=false uv run pytest src/tests/integration/test_flows_integration.py -vThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Minor:CONTRIBUTING.md 选址 / 范围 / 强制性
-
选址错位:标题 "Contributing to HugeGraph-AI" 但内容只覆盖
hugegraph-llm子模块。HugeGraph-AI 是 monorepo(含hugegraph-ml、hugegraph-python-client、vermeer-python-client)。建议:- 方案 A:把当前内容移动到
hugegraph-llm/TESTING.md(与hugegraph-llm/AGENTS.md同级),把根 CONTRIBUTING.md 作为通用入口; - 方案 B:保留根级 CONTRIBUTING.md,但把当前内容降级为 "### Module: hugegraph-llm" 章节,并补充 ASF 通用部分(DCO/sign-off、commit message 规范、ICLA 链接、Issue/PR 模板)。
- 方案 A:把当前内容移动到
-
强制性过度:line 26 "All 6 tests must pass before you submit your code" 强制要求 contributor 本地启动 HugeGraph + 配置 LLM API key 才能贡献。对纯文档 / UI / 非 flow PR 是过度要求。建议改为:
如果你的修改涉及 flow / 索引 / RAG 相关代码,请在本地通过下列 6 个测试。
-
额外缺失:当前文档没列出 LLM API key 与 embedding provider 等前置条件,但作者表态"测试需要 api-key"——这条信息应当出现在 Prerequisites。
-
数字硬编码:"All 6 tests must pass" / "
6 passed"(line 39)会随测试增删而过时,建议改成"全部通过"。
No description provided.